Read about this project here
Join AMD Users when you install the application and create your username / handle.
This is similar to Majestic-12, but I'm still trying to figure out the crawler's options.
I think I did something wrong here, because the system got very slow for a while even though Boitho was using almost no CPU. After I shut down Boitho, everything started working normally again.
2009 MULTI CORES CONTEST STATS
http://neogen.amdusers.com/contest2009/contestoverall.htm
CHRISTMAS QUEST CONTEST STATS
http://neogen.amdusers.com/contest2008/xmascontest.htm
http://neogen.amdusers.com/contest2008/stats.png (snapshot)
FEBRUARY '08 RACE STATS
http://neogen.amdusers.com/contest2008/racefeb08b.htm
http://neogen.amdusers.com/contest2008/racefeb08nb.htm
I took a look at this before it was announced on distributedcomputing.info. IMO it's a bit crap, lol. All it did for me was get robots.txt. Not exactly exciting, is it...
Btw, we lost a team member to x grubbers yesterday!
Does anyone know how often the stats are updated?
Hi
So sad that we lost Peyoti. I wonder why he changed teams. If people disappear, we will have difficulty competing against other teams. Let's hope it is the last time such a thing happens.
Lagu :shock:
Once an AMDuser always an AMD user
Hi
I have downloaded the agent but have not executed it yet. What is the difference between MJ-12 and Boitho?
Lagu
Lagu,
Ultimately, I think they both do the same thing: crawling and/or validating the web.
Boitho makes thumbnail images of the web pages it crawls to show alongside the search engine results.
I guess that translates into more CPU usage than MJ-12. On the other hand, it seems to crawl much more slowly than MJ-12.
There's something in Boitho that slows my system almost to a halt when it's running, but I can't pin down anything in particular.
The main processes have normal priority and get almost no CPU, but I'm starting to suspect that the crawler threads are getting above-normal or even high priority. Those are not shown in the Task Manager, though.
I also don't know why I don't have any points. I know I haven't run it much, but I should have at least something there.
Hi
I am one of the people behind Boitho.
NeoGen, sorry to hear about your problem with the crawler.
In the folder where you installed the Boitho client there should be a file called "ErrorLog.txt". This is a log of all errors. Can you send it to me at runarb [at] boitho dot com so I can take a look?
The crawler threads run at idle priority. The crawler uses two processes, BGui.exe and BCrawler.exe, and the main threads of both run at normal priority. BCrawler.exe is responsible for crawling and creates new threads at idle priority as necessary.
The crawler isn't really suitable for running alongside other work while you are using the computer. By default it is configured to run downloads only when the computer hasn't been used for more than 5 minutes. See Tools -> Options -> Crawling Mode.
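For anyone wondering what that default crawling mode amounts to, here is a minimal sketch of the idle check in Python. It only illustrates the 5-minute rule described above and is not taken from the Boitho client; the function name `may_crawl` and the idea of passing timestamps in seconds are assumptions made for the example.

```python
# Hypothetical illustration of an "only crawl when idle" rule.
# The 5-minute figure comes from the post; everything else is assumed.
IDLE_THRESHOLD_SECONDS = 5 * 60


def may_crawl(last_input_at, now, threshold=IDLE_THRESHOLD_SECONDS):
    """Return True when the machine has been idle for more than `threshold` seconds.

    `last_input_at` and `now` are timestamps in seconds (for example from
    time.monotonic()); how the last user input is detected is platform
    specific and left out of this sketch.
    """
    return (now - last_input_at) > threshold
```

With this rule, `may_crawl(0, 301)` allows crawling, while `may_crawl(0, 300)` does not, since the idle time must exceed 5 minutes.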
"Does anyone know how often the stats are updated?"
The statistics are live and are updated every time your client sends us the pages it has crawled. Pages are sent in once you have crawled 500.
The graphs are updated every 5 minutes.
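To make the "points only show up after 500 pages" behaviour concrete, here is a rough Python sketch of batching. It is an assumption-laden illustration, not the client's real code: `PageBatcher` and its methods are hypothetical names, and only the batch size of 500 comes from the post.

```python
BATCH_SIZE = 500  # figure stated in the post


class PageBatcher:
    """Hypothetical buffer that releases crawled pages in fixed-size batches."""

    def __init__(self, batch_size=BATCH_SIZE):
        self.batch_size = batch_size
        self.pending = []

    def add(self, page):
        """Queue one crawled page; return a full batch when one is ready, else None."""
        self.pending.append(page)
        if len(self.pending) >= self.batch_size:
            batch, self.pending = self.pending, []
            return batch
        return None
```

Under this scheme nothing is submitted until the 500th page has been crawled, so a client that has only run briefly would not show any points yet.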
"Boitho makes thumbnail images of the web pages it crawls to show alongside the search engine results. I guess that translates into more CPU usage than MJ-12."
To make the thumbnail it also has to download all the images from the site, not only the HTML. The bandwidth and CPU needed to make a thumbnail of a web page are about 10 times the resources needed just to download the HTML.
"All it did for me was get robots.txt. Not exactly exciting, is it..."
It happens that we crawl a lot of robots.txt pages from time to time.
Boitho caches the robots.txt files locally, and therefore has to get a site's robots.txt file before we can issue a URL from that site to a node for crawling.
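As a rough illustration of why a batch of new domains produces mostly robots.txt fetches, here is a small Python sketch using the standard library's `urllib.robotparser`. It is not Boitho's implementation: the `RobotsCache` class, its method names, and the `example-crawler` user agent are all made up for the example.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


class RobotsCache:
    """Hypothetical cache of parsed robots.txt files, keyed by host."""

    def __init__(self, user_agent="example-crawler"):
        self.user_agent = user_agent  # assumed name, not Boitho's real agent string
        self.parsers = {}

    def store(self, host, robots_txt):
        """Parse and remember the robots.txt body fetched for `host`."""
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self.parsers[host] = parser

    def next_job(self, url):
        """Return the URL a node should fetch next for `url`.

        If the host's robots.txt is not cached yet, that file must be
        fetched first. Otherwise return `url` when crawling it is allowed,
        or None when robots.txt forbids it.
        """
        parts = urlsplit(url)
        host = parts.netloc
        if host not in self.parsers:
            return f"{parts.scheme}://{host}/robots.txt"
        if self.parsers[host].can_fetch(self.user_agent, url):
            return url
        return None
```

For a host that has never been seen, `next_job` hands out the robots.txt URL first; only after `store` has been called for that host are ordinary page URLs issued.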
Around 13 Dec we added a lot of new pages from domains we hadn't crawled before (from the .com, .net and .edu lists from Verisign). Because of this, if one looks at the crawler statistics page, one can see that we mostly crawled robots.txt pages from 13 to 21 Dec: http://dcsetup.boitho.com/cgi-bin/dc/topCrawlers.cgi