BOINC 3.20 available - all standard platforms



BC
07-09-2004, 09:18 AM
BOINC 3.20 is now available for all platforms. All -110s are fixed, etc., as promised.

If you wish, you may now also compile it yourself and install it as a service.
(The service "CLI" is not pre-compiled... I asked that it be, for Windows.)
Installing as a service eliminates lots of problems.

If anyone has Visual Studio .NET and is willing to compile it for us, I would appreciate it... I don't have the ".NET" software... just the regular.

3.20 is doing well so far. I've only had one failure, and that was MY fault.
3.20 performance is better... the benchmark appears more stable/reliable.
BOINC 3.20 upgrades all existing BOINCs, including Predictor,
and runs on Win98 and up.

It is available at my site ( http://www.tumoeng.com/boinc_320.zip )
if the main site at Berkeley is down until all sites get it.

I would make a backup of your existing BOINC directory as a failsafe just in case you uncover a weird error (standard procedure) and then delete it after 2-3 days of normal ops...
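
A minimal sketch of the backup step described above, assuming Python is available on the machine. The function name and the timestamped folder naming are illustrative choices, not part of BOINC; pass in whatever your real BOINC directory is, and stop the client before copying.

```python
import shutil
import time
from pathlib import Path


def backup_boinc_dir(boinc_dir, backup_root):
    """Copy the whole BOINC directory into a new timestamped folder.

    Both arguments are paths chosen by the caller; nothing here is a
    BOINC default.
    """
    src = Path(boinc_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"BOINC directory not found: {src}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-backup-{stamp}"
    # copytree refuses to write into an existing folder, so an older
    # backup is never silently overwritten.
    shutil.copytree(src, dest)
    return dest
```

After 2-3 days of normal operation the returned folder can simply be deleted by hand.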



BC

Ototero
07-09-2004, 12:03 PM
BOINC 3.20 now running. Easy peasy. Thanks BC.

It's running seti 3.08. I've been running seti for some years now. Just joined AMD Users, my id is MOLE but I'm changing that to OTOTERO, that's a bad word in Tagalog ;)

What position is our team in? The SETI site keeps having database errors, so I can't update my ID.

BC
07-09-2004, 12:19 PM
The Seti site is going nuts... and quite unstable. We are pushing (I believe) more on Predictor. http://predictor.scripps.edu

(Don't download their released version of CC.) Just sign up and get the URL and key. Then go to the web page and set your prefs. After that... attach and join the fun.

On Predictor, last I saw, from 9th up to solid 8th, knocking on 7th's door...

but I did hardware damage here last night... or munged the BIOS
and will be forced to clear it and start over and have not paid much attention to predictor otherwise.

I've been busy with the machine... it LOOKS like a BIOS bug, but I cannot go back to 'auto' mode now from manual settings... I will fuss more later.
I'm starting to get sleep deprived... pushing only 1.5G out of a 1.6G memory bus and not keeping it stable... obvious sign of needing sleep... LOL


BC

PS: my BF taught me both good and bad words, hope I don't ever slip up! ..... LOL :)

Ototero
07-09-2004, 12:41 PM
Cheers BC,

Joined and started. Should I move from D2OL to Predictor? Where are our excess parity bits being used?

I'll follow the leader.

BC
07-09-2004, 12:49 PM
I'm honestly not sure.... I think we are being a bit dynamic at the moment. One of the more senior members is probably better qualified to answer the D2OL vs Predictor question. I am spearheading Predictor for us because I'm also now on the development team.

Personally, if you had two processors, I would run 1 on each for now.. BOINC has that capability and it works very nicely.

I simply know that predictor is a front end (of sorts) to D2OL (which actually helps find the protease inhibitor).

Would one of the 'real' scientists please speak up here? (my foot is terribly close to my mouth!!!! HAHA)

BC.

chaz
07-09-2004, 01:16 PM
I am running about 75% Predictor now, mainly to get a jump on the stats. What's left is on SOB, since we fell a rung or two.

AMDave
07-09-2004, 01:48 PM
Thanks BC.
Always keen to give it a go.
With backup done, installed 3.20 over 3.19 successfully and it picked up the Predictor work unit that was half way through and has started moving on.
Benchmarks and time estimates as you say.
Looks like a success.
Will report if anything weird happens.

I still have a foot in a couple of projects, but it looks like we lost a lot of grunt in D2OL over the last few days. I had trouble with 2 of my machines due to me messing around with my network, but the difference is way more than my contribution.

** edit ** Sorry, forgot to add that it refreshed my Predictor stats OK. Seti handler not responding again, but it looks like the comms are OK.

BC
07-09-2004, 08:34 PM
Thanks Dave,
I think Seti is down or still in the 'overloaded' state.
I also think the help we've got on Predictor will slide back over to D2OL
once we're caught back up... but that's their decision.

BC

bwhite
07-09-2004, 10:43 PM
I have Predictor on a couple of machines just to get my foot in the door and see how it goes :thumbleft: Major effort is still towards D2OL for me, at least for a little while longer.

BC
07-09-2004, 11:30 PM
First: Bwhite: Welcome aboard.

Second: To all who have Seti/boinc and Predictor/boinc... please remember to balance your loading for each project.. it helps keep the point count up if the scheduler knows how you are running, which will minimize your 'pending' list and the loss of points due to aging.

Lastly: I'm trying to figure out how to manage this, but with two slower machines about to come online this weekend from me, I will need to make a decision on how best to complement the support Chaz is giving us on Predictor.

I miss having both 3200's here & online full time, but will have to live with it for the moment.

Would someone suggest what would be the most effective split between D2OL and Predictor? I'm thinking of doing a simple 'Mips count' and split the Mips 50/50. Chaz being 75% on Predictor is providing more help than my other machines would. Maybe I can at least back fill a little bit until he's ready to move back, or he & I trade off.
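
One way to picture the "split the Mips 50/50" idea, as a rough sketch only: hand each machine, largest first, to whichever project currently has the smaller MIPS total. The function name and the machine names/figures passed to it are hypothetical, and this greedy approach only approximates an even split.

```python
def split_by_mips(machines, projects=("D2OL", "Predictor")):
    """Greedy split of machines across projects by benchmark MIPS.

    machines: dict of machine name -> MIPS figure (caller-supplied).
    Returns (assignment, totals), both keyed by project name.
    """
    totals = {p: 0.0 for p in projects}
    assignment = {p: [] for p in projects}
    # Largest machines first, each going to the project that is behind.
    for name, mips in sorted(machines.items(), key=lambda kv: kv[1], reverse=True):
        target = min(projects, key=lambda p: totals[p])
        assignment[target].append(name)
        totals[target] += mips
    return assignment, totals
```

With only a handful of machines the totals will rarely come out exactly equal, so treat the result as a starting point rather than a rule.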

Chaz? Anyone? .... Comments, Suggestions?


BC

chaz
07-09-2004, 11:37 PM
If/when I switch over it won't be to D2OL :rightfighter5: Never been too fond of that one, but team rallies are fun.

I'll probably head off to SOB or Lifemapper in time.
Lots of projects, too few puters.

Ototero
07-10-2004, 10:21 AM
Hooray, I'm off the mark in Predictor, straight in at number 9.

I'm away for the weekend so my laptop (D2OL) will be offline. Be back Sunday midnight though. I don't want to lose too much CPU time.

Monday it's bios upgrade time, wish me luck.


Stu

Keith75
07-10-2004, 11:07 AM
Not sure how reliable this is, but here are some results from BOINC's benchmarks. Both computers are running exactly the same OS version:

Pentium 4 3.4 GHz
_________________________

Measured floating point speed 1,812.69 million ops/sec
Measured integer speed 2,132.18 million ops/sec
Measured memory bandwidth 953.67 MB/sec

Athlon 64 3200+ @ 2.24 GHz
_________________________

Measured floating point speed 2,804.78 million ops/sec
Measured integer speed 5,759.60 million ops/sec
Measured memory bandwidth 953.67 MB/sec


Sure makes me feel good if these figures are accurate.

Keith
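
For a quick side-by-side of the two machines, here is a small sketch that turns the figures posted above into speed-up ratios. The numbers are copied from the post; the dictionary layout and function name are just illustrative. Memory bandwidth is left out because both machines report the same value.

```python
# Benchmark figures from the post above, in million ops/sec.
P4_3400 = {"float": 1812.69, "integer": 2132.18}
ATHLON64_3200 = {"float": 2804.78, "integer": 5759.60}


def speedup(candidate, baseline):
    """Return candidate/baseline for every benchmark the two share."""
    return {key: candidate[key] / baseline[key] for key in baseline if key in candidate}


if __name__ == "__main__":
    for name, ratio in speedup(ATHLON64_3200, P4_3400).items():
        print(f"{name}: Athlon 64 is {ratio:.2f}x the Pentium 4")
```

These ratios only say something about the synthetic benchmark, not about how fast a given work unit will finish.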

BC
07-10-2004, 11:40 AM
The numbers are much more accurate, except memory... the mem bandwidth tests had trouble with caches, so everyone gets 953.67 for now until dev figures out the problem.


Do you (or anyone) have the CLI version compiled yet for 3.20 on XP ?

I want to install it as a service and run the new socket interface instead of that slow mutex.

I unfortunately don't have a good copy of VS .NET... just plain VS 6 here to compile with.

BC

jlangner
07-10-2004, 12:42 PM
I'll try it out as well. I was getting tired of D2OL. Will keep some on SOB though. Trying to give Chaz a run. lol.

jlangner
07-10-2004, 12:55 PM
Keith - your 3200 is ahead of my 3500, but I see you are overclocked about 10% above me - hehe. Looks like I'll have to overclock to 2.2 GHz to compare. lol

Measured floating point speed 2641.27 million ops/sec
Measured integer speed 5378.72 million ops/sec
Measured memory bandwidth 953.67 MB/sec

BC
07-10-2004, 04:20 PM
Not to sound like a pushy old goat...... (Snickering)

But would you guys get crunching? LOL

I have some 900+ (soon 1000+) pending credits that I would like
to have posted.. Somebody needs to get them done... :)
(...... as he laughs sipping coffee..... )


Seriously, GREAT, and I mean... REALLY REALLY GREAT performance and teamwork, guys... My compliments and thank you!!!


BC

jlangner
07-10-2004, 05:17 PM
What is your average time per application? Mine looks to be around 47 minutes. BTW I have 26 pending. :cry:

WienerDog
07-10-2004, 05:53 PM
Heh, I've got almost 1200 credits pending :cry: That's about the only thing I really don't like about BOINC.

Keith75
07-10-2004, 07:17 PM
jlangner,

Mine is only overclocked by 44 MHz. :)

Keith

jlangner
07-10-2004, 09:10 PM
Yeah, you are right. I was thinking mine was stock 2200 MHz and thought yours was 2000 MHz.

BC
07-10-2004, 09:24 PM
What is your average time per application? Mine looks to be around 47 minutes. BTW I have 26 pending. :cry:


I am running about 48 minutes +/- per data block. Some are bigger than others.

It usually estimates 58 min, and then around 48-49 it jumps and is done.

BC
07-10-2004, 09:26 PM
heh i got almost 1200 credits pending :cry: thats about the only thing i really dont like about boinc

Wiener, agreed... but it does make better science, as all 3 have to agree, otherwise the result is sent out again.

The other part... it stops the cheating that happened when people sent in the same 8000+ result files every other day.

bwhite
07-10-2004, 09:48 PM
I have 250 credits pending and have only been running for less than a day on 3 older slower boxes. Have taken one computer off Predictor and put back on D2OL as "Predictor@home - 2004-07-10 17:36:43 - No schedulers responded" and it is out of work to do. I'll be gone 'til late (Fishing) so will see if it is back online then and switch back to Predictor if it is.

BC
07-10-2004, 09:52 PM
I just got that a couple of times as well and sent off an email...

Something is wrong, or you guys drained out all the AMD-assigned work
and it has to come up with more for us!!!!!!

It's being looked at.

BC
07-10-2004, 10:01 PM
I have 250 credits pending and have only been running for less than a day on 3 older slower boxes. Have taken one computer off Predictor and put back on D2OL as "Predictor@home - 2004-07-10 17:36:43 - No schedulers responded" and it is out of work to do. I'll be gone 'til late (Fishing) so will see if it is back online then and switch back to Predictor if it is.

Some of my pending credits are from 'Charmms'... which aren't being sent out right now... There is a problem with the application... when it's fixed, I'll get those credits... I think (hard to say without a 1-by-1 exam) I have about 300 pending credits. I'm also doing WUs that were both 'cleanup' and had to be turned around in 72 hours, as well as some that aren't due until the Jul 21-22-23 range.

BC
07-10-2004, 10:07 PM
Database / project reboot in progress... ... scheduler just gave a parse() error.

if you get errors, please let 3.20 handle it... this is a good test of
making sure 3.20 handles failure and recovery cleanly.

It should be back up shortly.

BC
07-10-2004, 10:20 PM
Project back online.... reboot complete.


Work Units available.....

chaz
07-10-2004, 11:22 PM
res0r9lm and his diskless nodes just appeared in my rear view mirror.........
Getting smaller.....
And smaller.....
Everybody wave....... bye bye :hello2:

BC
07-10-2004, 11:42 PM
Don't go by him too fast, you might blow out the windows!!! :hello2:

At this rate though, you will be #1 in Predictor within the week!

(OMG, I've invited a monster to dinner!!!! )

ROFLMAO!!

chaz
07-10-2004, 11:51 PM
res0r9lm is Bionic Redneck, the one who switched teams and took with him some hardware that was donated to build the last node (supercomputer).
Just wanted everyone to have the opportunity to rub his nose in it a little, that's all.
It doesn't look like we're going to pull enough to prevent SETI.Germany from overtaking us. I may move some more to this, but resources are getting scarce: 3 reserves, 2 Intel and 1 AMD.

chaz
07-10-2004, 11:54 PM
Don't go by him too fast, you might blow out the windows!!!

Wouldn't have hurt my feelings a bit :twisted:

Oops, almost forgot 1 thing......BYE BYE:bootyshake:

BC
07-11-2004, 12:14 AM
Chaz,
Thanks for the update on the situation with Redneck. I understand and fully agree... It would be a shame if any stones flew up now, wouldn't it!

Seti Germany did a bulk update... Their average is already eroding, so I would not worry too much... they will slip back again. As for the other team... the last person to blow by is that other team's lead @ 11896 points... only 14 RAC separate you two... then you will definitely be gaining.

How much do you run in CLI mode and how much in GUI... also, what is your disk-write rate? I got better perf by bumping up the write time to 120 seconds.



I hope those database 'burps' of earlier today don't damage anyone's scores. maybe I should check... how does that esql go?
"((select * from .... where 'user == .......... ' ) if (score = 0) )" ???

chaz
07-11-2004, 12:31 AM
I'm not so worried about how well I do, I just wanted to help keep the team in the top 10. It looks like this project is getting some serious attention lately, and I'm sure it won't be long before the big guns are on it and blowing the doors off everyone, lol.

I have my prefs set to write to disk every 60 secs, and I'm using the CLI on all but HER machine. She has this thing about DOS boxes, she can't seem to keep her mouse off the little 'x' :-( So for the time being she gets the GUI. I was running 100% CLI today, fired up hers about 15 min ago, so now 6 CLI vs 1 GUI.

Difficult to track client performance; it would be nice if the WUs were the same size. I have completed units with CPU time ranging from 21 min to 2 hrs.

BC
07-11-2004, 12:40 AM
Unfortunately the different sizes come mostly from the different targets (as I am told)... and if you have any Charmms (ct.....) those will run in nothing flat...

I'm switching to CLI now, tired of erratic performance because I'm not paying attention. Anything special you can recommend?

I did one thing outside the scope of this forum, but it may help you to tell me anything special I need to do for max perf on predictor based on your experience with CLI.

Fresh Diagnose - System Report


CPU Bench Result
WhetStone FPU 7,916 MWIPS
DhryStone ALU 9,522 MDIPS
Speed 2261 MHz

Last Benchmark Saturday, Jul 10, 2004 04:20 pm



Memory Bench Result
Integer Assignment 12,346
Real Assignment 12,555
Integer Split 59,738
Real Split 80,967

Last Benchmark Saturday, Jul 10, 2004 04:22 pm


.... It seems my Asus MB in 'auto' mode is doing both 1T and 2T as it wants based on the interleaving to memory. That's the only way I can get the math to work out ... Does that make sense to you (and Keith, if you are reading along...... does it fit with what you've seen?)

BC

chaz
07-11-2004, 01:00 AM
It seems that benchmarks and CPU speed aren't going to help a lot, based on what I've read here.
http://predictor.scripps.edu/forum_thread.php?id=211


I know I have some Charmms, but if I remember correctly the deadline for them is September or October. I believe the client will run work units in the order of the deadline, so getting credit for those may try your patience.
I have only been using the CLI for about 1 full day now, but I seriously can't tell if it helps performance, although it seems to use less memory. It appears to be better: the GUI makes a couple of my machines hang when I bring them up to check progress, and the CLI does not. But I have no real numbers to verify any increase in performance.

I don't know much about memory timings really, I stick the modules in my PC and I get what I get. I guess I look at it this way: for what it costs to buy low latency, high performance RAM, I could get more of the cheap stuff, or the same amount of the cheap stuff and use the extra cash to upgrade an existing machine or start ordering components for another.
I got this PC3700 the other day simply to push the limits on this P4, I know it will do more than my current PC3200 will allow.

Does anyone know what the commands are for the CLI to force updates?

BC
07-11-2004, 01:03 AM
The Big Guns are making their way here..

We are technically 'alpha' only because 1 application is having trouble.
However, the mfoldB (which is 90% of the work now) is production.

So... "we ARE live"... it's just not official until both executables are 100% solid and we've shaken down 3.20..


PS: I emailed with the project lead (Predictor), and she was grateful that you all helped catch the DB error... Her thanks are extended to all...

BC
07-11-2004, 01:08 AM
I'll find that CLI force update... it's in my doc here somewhere... If the GUI is working right, it SHOULD come up and see the CLI and be the 'remote console'... there is a lot of email traffic on that in boinc_dev... I will go search there too.



As for the plain GUI mode... the GUI runs on the mutex against the background program... and it is MUCH slower (per the developer himself) when you have the window open.


So.. recapping... I owe you two answers:

1. GUI command interface verification....
2. Forced update...


BRB.


BC

chaz
07-11-2004, 01:24 AM
Yeah I tried using the gui to force the cli to update, but the gui crashes.

I noticed that with the gui, if you have the window open, the time to completion for the current wu's starts to increase.

I've been looking for the commands on the site, and in the boinc directory but no luck yet.

BC
07-11-2004, 01:30 AM
Exactly... Just got an email from Rom confirming that... the email is:
============================================
If the cli is in the Windows service mode it'll log to the Windows event
log.

The GUI currently has to be shut down, they cannot work together.

When I finish the new GUI you'll be able to use them together...

There are already a few tools out there that will query the service to find
out the status of it...

----- Rom
=============================================

Gotta find those tools.... BoincSpy??? that ring a bell?

BC
07-11-2004, 01:38 AM
here's a good one for ya,

Predictor runs perfect as a service on machine 1.... machine 2 it starts and terminates immediately...


can't figure out what the deal is... just the standard windows 1067 error


ideas?

chaz
07-11-2004, 01:43 AM
BoincSpy, I have that around here somewhere........

I'm not running as a service though, but maybe it'll work.
Checking their status by other means would help, rather than scrolling through line after line to look for WU completion times.
I just thought there might be a command to force it to update immediately when I've made changes to my preferences, rather than waiting for it to communicate on its own, but no biggie.

I have never run any project as a service.


(psssst.....SETIGermany's recent average is still climbing)

BC
07-11-2004, 01:52 AM
OK... then when Rom gets the GUI done (3.21) we will use the two together as intended...


As for Germany.... .................................. Verdampt !!!!

Now, if I can get the service running on my other machine, all would be OK... the CLI will NOT run... period... turn right around... and it kicks off the GUI just fine... Go figure.

chaz
07-11-2004, 02:08 AM
Funny that it works on one machine and not the other. Are they using the same OS? Is one running as a server?
Beats me... sorry, I can't really help on that one.

BC
07-11-2004, 02:28 AM
I FOUND IT.... don't know why, but the registry is messed up somehow.....

"NT Authority\Network Service" will not work.... put it under my username and all is fine..


Now... I just saw a bunch of stuff get uploaded and don't see the results as posted.

Like you.... I don't see the 'update'... right?

chaz
07-11-2004, 02:44 AM
What I mean is, if I change prefs on my account, I want to update the client immediately. Update cache size, disk write time, etc.
I changed my disk write time to see if 120 secs would help out like you stated earlier, that's why I wanted to force the client to update. Not a big thing, it will update next time it contacts the server, an hour or so won't kill me, lol.

Your results, they probably got sent off to a K6-333, you'll probably get credit for them next week, lol.

BC
07-11-2004, 02:47 AM
Ok,
I just got off the Alpha / Beta / Live Message boards.

The short version is this:

a) As I said before, The GUI will simply sit on top of the CLI (which will be the service actually doing the work)

b) The GUI will command, control, and monitor the CLI. (per my discussions with Rom and what's on the boards)

c) To get a faster update time, set your min/max values close together. You get auto-update (the only update) at min-fill time. So setting Min and Max about 0.25 days apart means an update every 6 WUs if you do 1 per hour.

d) The GUI will be able to command the update when finished and sitting on top in the final config. "line b - above"


Hope this helps.

For now, I would go set your min & max close to each other... within a few WUs of work of each other...

Over the next few days... slowly sneak that down to a shorter queue length... (I'm at 3 & 4 right now... so I will not update for quite a while)

Sorry it has to be this way... but that is what the Alpha team is recommending.
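
The arithmetic behind point (c) can be sketched as below. This assumes the behaviour described in the post (the client contacts the scheduler when the queue drains to the min setting and refills to the max); the function name is made up for illustration.

```python
def wus_between_updates(min_days, max_days, minutes_per_wu):
    """Rough count of work units crunched between two scheduler contacts.

    Assumes the queue is refilled to max_days of work whenever it has
    drained down to min_days, as described in the post.
    """
    if max_days < min_days:
        raise ValueError("max_days must be >= min_days")
    window_minutes = (max_days - min_days) * 24 * 60
    return window_minutes / minutes_per_wu
```

For example, `wus_between_updates(2.0, 2.25, 60)` corresponds to the 0.25-day gap at one WU per hour mentioned in (c), and `wus_between_updates(3, 4, 48)` to a 3 & 4 day setting at roughly 48 minutes per WU.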


BC

BC
07-11-2004, 02:50 AM
Your results, they probably got sent off to a K6-333, you'll probably get credit for them next week,lol.


:shock: Not if I can help it!!! LOL

chaz
07-11-2004, 03:02 AM
I wonder if that is the whole problem with the WU verification process. It seems to me that if everyone running this project had a smaller cache, results would turn around much faster. Don't you think?
For instance, if I had a work cache of 4 days, I would not begin to verify a result that you turned in until Wednesday. Or, do they assign different deadlines for verification units than they do new work units?

Follow me?

BC
07-11-2004, 03:15 AM
Yep, and we should all have the same shorter turnaround times... which (if the benchmarks were working right) would / should compute the proper amount. People started using bigger caches when the servers at Berkeley (Seti-alpha_boinc) became unstable... that began the whole database avalanche.


If we all (team AMD) have the same cache time... our results will be much more consistent too.


The actual deadline for a WU is preset when dispatched. The user is allowed N days. the value of N is set in each WU. If the quorum is met (everyone sends back before the deadline) then it auto-closes.

If anyone fails, the 'active' count is decremented, pushed back in the outbound queue for the next user to grab and get to work on.
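
To make the bookkeeping above a bit more concrete, here is a toy model of it. This is only a sketch of the idea as described in this thread, not the actual server code: the class, its field names, and the quorum of 3 (taken from the "all 3 have to agree" remark earlier) are assumptions.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class WorkUnit:
    name: str
    quorum: int = 3   # assumption: three agreeing results close the WU
    active: int = 0   # copies currently out with users
    results: list = field(default_factory=list)

    def dispatch(self):
        """A user grabs a copy from the outbound queue."""
        self.active += 1

    def report(self, result):
        """A user sends a result back before the deadline."""
        self.active -= 1
        self.results.append(result)

    def expire(self):
        """A user missed the deadline: decrement the active count."""
        self.active -= 1

    def agreeing(self):
        """Size of the largest group of identical results so far."""
        return max(Counter(self.results).values(), default=0)

    def is_closed(self):
        return self.agreeing() >= self.quorum

    def needs_reissue(self):
        """True if the copies still out cannot complete the quorum."""
        return not self.is_closed() and self.active + self.agreeing() < self.quorum
```

A scheduler loop would call `needs_reissue()` after each `report()` or `expire()` and push the WU back into the outbound queue when it returns True.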

There is also, somewhere, a total 'drop-dead' date... but i would have to ask where that is stored... I think it's in the database and is kept with the whole block of data that the batch of WUs are cut up from.....

I don't have a good grasp of the operation that far back behind the splitter, but do know it's got a lot of parameters they can set.... unfortunately all the values are set by hand in the database and then applied to / stuffed into the files when the WU binary files are created.


Does that answer it and give some insight as to how it works behind the scenes?

BC


PS: Network_Service did not have file write permission to the predictor folder.... that was the problem on machine 2.
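
A quick way to spot this kind of permission problem, sketched under the assumption that Python is installed: try to create a throwaway file in the project folder. Note that this tests the account running the script, so it only tells you about the service account if it is run under that same account. The function name is illustrative.

```python
import tempfile


def can_write(directory):
    """Return True if the current account can create a file in directory."""
    try:
        # The temporary file is removed again as soon as the block exits.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers a missing directory as well as a permission refusal.
        return False
```

If this returns False for the Predictor folder, the service will most likely start and stop immediately, as seen on machine 2.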

chaz
07-11-2004, 03:27 AM
Yes I understand, I think the work unit cache should revolve around the benchmark and not be able to be changed by the user.

Also I think the server should(if it's possible) monitor incoming and outgoing wu's and be able to differentiate between brand new wu's and wu's currently undergoing the verification process. Work units that have returned and have been reassigned for verification should have a shorter deadline, or higher priority. I think this would help keep the stats more consistent also.

BC
07-11-2004, 03:54 AM
WU's which are close to their deadlines are indeed pushed out first...
Also you will notice that BOINC runs the jobs in order of due-date.

I have had some WUs come and, because of the 'drop-dead' date, had only 48 hours for the turnaround... fortunately I only need 48 minutes or so per WU and those WUs got handled right away.
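
The "runs the jobs in order of due-date" behaviour amounts to an earliest-deadline-first sort. A tiny sketch of that idea follows; the dictionary layout and function name are illustrative and not taken from the BOINC client.

```python
from datetime import datetime


def run_order(work_units):
    """Return the work units sorted so the earliest deadline runs first.

    work_units: list of dicts with a "name" and a datetime "deadline".
    """
    return sorted(work_units, key=lambda wu: wu["deadline"])


if __name__ == "__main__":
    # Hypothetical queue: one short-turnaround WU and one due later.
    queue = [
        {"name": "later", "deadline": datetime(2004, 7, 21)},
        {"name": "cleanup", "deadline": datetime(2004, 7, 13)},
    ]
    print([wu["name"] for wu in run_order(queue)])
```

This is why a short-deadline 'cleanup' unit jumps ahead of work that arrived earlier but is due later.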


************************************************************
I would like to propose that we, as a team, consider setting our BOINC settings to a) ensure Network_Service has full_control of the BOINC/Predictor directory (haha), b) set our queues to be no more than 2 days long, and c) have (if possible) no more than a 1/4 to 1/2 day min/max delta.

Just make sure you have enough WUs extra to handle a short interruption in the database / splitter... (which won't be more than a couple WU's per machine average)

******************************



I've seen the queue min-fill work very very nicely.. that code is solid. I have had 3 WU's go up, and it downloaded 4.... it gets very very stable and predictable after a few empty-fill cycles.


Suggestions / Comments?


BC

WienerDog
07-11-2004, 04:18 AM
Good idea... I've already got my queue set that way, seems to work fine for me [-o<

BC
07-11-2004, 05:13 AM
My queue is a bit longer.... and am working on pulling it down with each update...

BC
07-11-2004, 08:45 AM
Attention all:

At approx 12:30 AM PST (0730 UTC) SUN 11-JUL-04
The database got hit with a bunch of client errors (cough .. sorry .. cough ... from someone ... cough cough... make sure nobody notices.. LOL ) and stopped responding.

The appropriate log files were sent out to the DBA and given her tendency to check the DB all the time, I suspect it will be up and these error catching triggers (which I think are the cause of the hiccup) will be corrected.

For historical knowledge:

We (but mostly Seti, who was plagued) used to have the problem that client errors would get past the front end and into the core of the database mechanism which, once there, would foul up the entire database.... corrupting and invalidating everything associated with it.

The recent triggers now catch these errors as a user does an 'update' at the front end database where uploads are stored until 'pending' or 'validated'. This is new code to protect the core science AND our credits. It also is supposed to stop someone from sucking up hundreds of WUs and returning 0 time / 0 credit type conditions.

It is my belief that a syntax error (typo) exists and I just caused the DB to flip out since it happened within seconds of my error event. It had been 100% solid until then.

I apologize... whether it be me or not, it could be coincidental and/or contributory... I'm sure the DBA will have things back up early AM (about 5-6 hours from now) when she logs in and gets my email.

If any issues, Please let me know.





On a good note: I have two more machines (a 1P and a 2P) both ready to go as 'services' once the DB is running. They are not the fastest, but since they do nothing but hardware raid servicing, they have CPU time to burn.... approximate mips = 3600, and I think FPU is about 2000. Not fast, just good solid workhorses.

Email or PM me with questions or comments.

Thanks for your patience...
BC.

nickth
07-11-2004, 12:57 PM
well db is still down :evil:

WienerDog
07-11-2004, 12:58 PM
ya and all my machines are outta work :evil:

BC
07-11-2004, 01:26 PM
Hang in there please....

it's already being worked on....


(Yes, I am still awake... and it's 6:36am PST). I will sleep for a few
while the 'boss takes the helm'.

suggestion: when 'user page' comes back up.... set prefs to 2 and 2.1 days. I think that will keep us all busy and within 2 days we will all be within 2 hours sync of each other.

if 1 and 1.1 are more preferable, someone please post.... I will follow the group consensus


When the DB comes back up, we will be all caught up....

Also learned... to force update prefs......

a) stop the service (if using CLI) or exit GUI.
b) edit prefs on web page
c) restart.

works much cleaner for all.


BC

jlangner
07-11-2004, 01:58 PM
Yall get off the predictor site. I can't view my stats!! lol.

"Warning: mysql_pconnect(): Too many connections in /home/predictor/packages/boinc-3.19/projects/predictor/html/inc/db.inc on line 15
Unable to connect to database - please try again later Error: 1040Too many connections"

BC
07-11-2004, 02:03 PM
Yall get off the predictor site. I can't view my stats!! lol.

"Warning: mysql_pconnect(): Too many connections in /home/predictor/packages/boinc-3.19/projects/predictor/html/inc/db.inc on line 15
Unable to connect to database - please try again later Error: 1040Too many connections"

Jeff, that is the error.. the db is hung... it's being fixed right now.
please be patient... I apologize for the delay.


BC

PS: this 'BC' is going 'offline' for a few hours... when the DB is back, restart your GUI's and CLI's and force the update and all will be fine.... the other servers are sitting there ready to go.

GAWD, I can't wait for the 3.21 (M2) GUI.... SOOOO much easier.

jlangner
07-11-2004, 02:16 PM
What is the best way to exit the program? Just kill it or suspend then exit?

One of my wu has 100% done by it but status has "Computation error"

BC
07-11-2004, 02:24 PM
What is the best way to exit the program? Just kill it or suspend then exit?

One of my wu has 100% done by it but status has "Computation error"


Best way is to exit...... it is a clean terminate and writes the log files and sets you up for a clean restart.... suspend could catch you in any state and write out something that is not restartable.

CLI == STOP service, and put on manual until DB back online.

GUI == FILE - Exit from pop-up window.

( please remember... BOINC is multi threaded... most other DC's aren't.. that is why the clean exit is better... threads are handled properly)



BC


*EDIT* PS: IF you run CLI and start the GUI, it will cause this kind of error... please make sure the GUI is not in the 'all users' startup or your personal 'startup' list if you are using the CLI... it will corrupt a WU.... This will be finished in 3.21 (M2) as scheduled. *END EDIT *

PPS: Ni Ni for now

nickth
07-11-2004, 07:37 PM
Here is the link to Boinc View, so that you can view your progress on multiple boxes: http://boincview.amanheis.de/

chaz
07-11-2004, 09:46 PM
Hey that's nice.... lots of info

Shows a lot of info about each client, benches, wu times, etc

Thanks

BC
07-12-2004, 12:41 AM
Hey that's nice.... lots of info

Shows a lot of info about each client, benches, wu times, etc

Thanks


I think it's great!!!! Now, to get the '-allow_remote_gui_rpc' to work
ok would be super.... I'm sure I'm doing something stupid wrong or forgot to make another required entry somewhere.

Chaz,
are you getting it to work, or using file-sharing?



BC

chaz
07-12-2004, 01:13 AM
It works pretty good with the rpc switch, but I do have 1 that's being stubborn, so I just use file share for that. I turned down inquiries to 1 in 30 minutes too.
The benchmarks aren't realistic, according to them my fastest AMD machine is a 2400, but work completion times tell a different tale..

BC
07-12-2004, 01:26 AM
It works pretty good with the rpc switch, but I do have 1 that's being stubborn, so I just use file share for that. I turned down inquiries to 1 in 30 minutes too.
The benchmarks aren't realistic, according to them my fastest AMD machine is a 2400, but work completion times tell a different tale..

Ok,
PLEASE tell this Linux weenie where it goes? In the registry?

as in X:\Predictor -win_service -allow_remote_gui_rpc


---- or ----


the other order???

-------

where did you put it... I am getting errors from the CLI constantly.



BC

chaz
07-12-2004, 03:36 AM
Go into your Boinc directory, create a shortcut to the cli, add the switch to the end of the Target Line in the shortcut's Properties.

Then start the client with the shortcut to enable the rpc. Voila!

Note: there should be 1 space between the end of your target directory (") and the start of your switch (-)

Example:
"C:\DC Projects\BOINC\boinc_cli.exe" -allow_remote_gui_rpc

BC
07-12-2004, 03:56 AM
Go into your Boinc directory, create a shortcut to the cli, add the switch to the end of the Target Line in the shortcut's Properties.

Then start the client with the shortcut to enable the rpc. Voila!

Note: there should be 1 space between the end of your target directory (") and the start of your switch (-)

Example:
"C:\DC Projects\BOINC\boinc_cli.exe" -allow_remote_gui_rpc

Since I'm using the service mode....... should it be?

"C:\DC Projects\BOINC\boinc_cli.exe" -win_service -allow_remote_gui_rpc

and how should the 'service start' in the registry entry be (which is where the actual service is started)? Should it be the same as the shortcut?

My registry entry for BOINC currently (Image Path key):

G:\Predictor\boinc_cli.exe -win_service -allow_remote_gui_rpc

but this does not work.


BC

BC
07-12-2004, 04:11 AM
Guys,
On a side note.....

For the next few days, while my 'service mode' and 'gui mode' machines get settled with the new tighter profiles, as well as get established with the service mode and bring the queue length down (currently at 58 jobs), I am seeing jobs building up in the 'ready to post' queue... They are uploaded and will auto-update when i hit the low-water mark.

During this settling, this will temporarily decrease my performance rate and I may fall in standings (in spite of the built up pending credits), but will jump back up to normal with a lot of points at my next update.... which should put me in the 1.5 days queue w/ 2 hour update rate.

This is just an FYI... nothing to be alarmed about.

BC

PS: I also added 4 processors to my list, compliments of a friend. They are dual CPU machines which I will upgrade when I determine which parts to put in them... Does anyone know, from personal experience, what the best AMD pin-for-pin replacement is (I can get tons of older processors) for the P2's and P-Pro's that are socket 7? Both of the machines (currently dual CPU but can run as quads) have the arbitration logic on the MB to permit uni-processor CPUs to run in an SMP config... (aka, pre Athlon MP on-chip arbitration).. They are true 'servers' with redundant power supplies and split buses.

Will an Athlon 1200 or a bunch of K6's run in this type of config? I honestly have forgotten.

chaz
07-12-2004, 10:53 AM
I don't install services, so I'm afraid I won't be of much help there. But I see in the forums a guy who installed his by typing from the command prompt: boinc_cli -install

You think that might work?

chaz
07-12-2004, 11:53 AM
I also added 4 processors to my list, compliments of a friend. They are dual CPU machines which I will upgrade when I determine which parts to put in them... Does anyone know, from personal experience, what the best AMD pin-for-pin replacement is (I can get tons of older processors) for the P2's and P-Pro's that are socket 7? Both of the machines (currently dual CPU but can run as quads) have the arbitration logic on the MB to permit uni-processor CPUs to run in an SMP config... (aka, pre Athlon MP on-chip arbitration).. They are true 'servers' with redundant power supplies and split buses.

Will an Athlon 1200 or a bunch of K6's run in this type of config? I honestly have forgotten.

I don't know if you'll find anything to replace the P2's; if I remember right those are the little Nintendo-cartridge-looking buggers. You could go PIII to replace the PII since the PIII has more cache and higher fsb. Here's what I found:

Pentium II:

32 kB L1 cache
512 kB L2 cache running at half the CPU speed
FSB either 66 or 100 MHz (450 MHz model has 100 MHz)
MMX instructions


Pentium III:

64 kB L1 cache
512 kB L2 cache running at half the CPU speed
FSB either 100 or 133 MHz (450 MHz model has 100 MHz)
MMX and SSE instructions
Then I guess it's whether or not the board will support it also.

About the Socket 7's, need to know if it's a true Socket 7 (66 fsb) or Super Socket 7 (100 fsb). If memory serves, some older Socket 7 boards will not support a K6-2/3 because of different voltage requirements.
If you're really lucky maybe you can find a K6-III 450, those were the screamers. I used to have one, it gave my Mom's PIII 600 a run for its money.
P-Pro's are another story they are socket 8.

jlangner
07-12-2004, 03:01 PM
Is there a way to sort the stats on the AMD Users Team page by total work and not averages?

Anonymous
07-14-2004, 07:13 AM
Is there a way to sort the stats on the AMD Users Team page by total work and not averages?

Jeff,
sorry for being gone so long.... I made a big mistake (OOOOOPS) last night.... accidentally pushed the Vcore up to the max... *POOF*

As for the sorting by stats in the team page... I passed on the request. It will be added. Since this machine doesn't have my account key, I will have to wait until the main is back and I can grab the cookies/email.


I'm breaking in a new CPU now... ..I decided to bust my piggy bank and go for broke.. Only thing I need now is more registered DRAM. and I'll be able to bring it up fully in Quad channel mode.. For now, she sits here in BIOS.

I hate the factory supplied coolers... they were running at 115F... I don't even have this AS-5 broken in yet and it's running at 101F. The joys of copper over aluminum!!

I also need to find my DVM and calibrate the PS voltages... I replaced that as well. I want them up to snuff... they appear to be off a bit, but the DVM will tell the tale.

When all said & done, I'll have my 3200+ back AND the new machine.

I won't keep up with you 5-10 cpu power users, but I should stay a solid 3rd place in the team from now on... :)

How *IS* Predictor (mfold + charmms) all running for you (and everyone)?


BC

BC
07-14-2004, 07:15 AM
...sorry forgot to login first.

jlangner
07-15-2004, 03:01 AM
I have only had one problem. I have a file that is in the transfer tab but it never transfers (3 days now). It keeps saying temporarily failed transfer. Will try again in X... minutes.

"Predictor@home - 2004-07-14 22:01:03 - Temporarily failed upload of t0212C_1_3121_1_0

Predictor@home - 2004-07-14 22:01:03 - Backing off 1 hours, 2 minutes, and 49 seconds on transfer of file t0212C_1_3121_1_0"

Any suggestions?

Also, I received 5 work units with less than a 24-hour deadline. That is too short! I had to go out of town yesterday and with storms around I shut the pc down while I was gone, so they were late getting back. All others give almost a week.

Ototero
07-15-2004, 06:52 AM
The team is in position 7 now.

I have 6 machines, 2 running Ubero, 2 running Predictor and 2 running D2OL.

There isn't one place to collate all the team's positions so I'll stick them here. You can chastise me later.

Pos....DC Project
0003 Ubero
0005 Chessbrain...(edited)
0007 Predictor......(edited)
0009 EON
0011 Lifemapper
0014 D2OL
0016 DPAD
0017 TSC
0018 Seventeen
0039 Distributed Folding
0054 Find a Drug
0577 F@H
1420 DNET

em99010pepe
07-15-2004, 07:17 AM
The team is in 5th place in Chessbrain.

http://mag.chessbrainstats.com/allteams.php?start=0&limit=100

Beerknurd
07-16-2004, 03:28 AM
D2OL is really boring. I need to run something that a P4 will kick ass at. "Like Ubero"

BC
07-16-2004, 06:52 AM
Hey all,

** Predictor was down for about 20 minutes an hour ago (as of
this post ***.


FINALLY, after a hardware failure, corrupted file systems, and a
power supply failure that undervolted the CPU and MB (which
are being repaired under warranty because they were not
overclocked or overvolted).....

I am just waiting for the last piece... Waiting for my DSL router,
which *conveniently* went out at the same time... (I suspect a
'California brownout').... I should be back up at full speed tomorrow
when the router is reconfigured... It arrived today and I didn't have
time to get it done.

As for predictor.... I have some good news..... :)

I will be catching up & contributing at a much higher rate now....
While waiting for my A64 3200+ to be repaired/replaced and
then brought back on line, I am enjoying the first of a block of
Sledgehammers (clipped) to emulate an FX-53. It's programmed to
report itself as a FX-53, but is a true Sledge, just unable to run
in SMP mode (the cpu-id pins are clipped)

This is my learning machine and is taking some getting used to in
this configuration. It's 1/2 way between both worlds... You know
this as the Opteron 800-series and the FX-53... It was cut to
make my life easier.

So, learning and recovering aside...


I posted a hundred or so results to Predictor tonight (FINALLY).


Also, Predictor now works ok in 'Windows XP service mode'...

The details are easy (as I think I said before)... and the next update,
which is about to hit test, will give us full distributed computer
RPC capability w/ the GUI fully detached from the 'service'.

Holler w/ questions....


I would also like to cross-post & go off topic for a sec... and welcome
the new Predictor team members as well as the new AMD Users to
the group!


I'm heading for dinner now (yes it is late here in PST USA land), but
everything is stable and I'm HUNGRY!!! LOL


Best of regards,
It's good to be back..

I'll share what I know about BOINC projects and Predictor ....
Just ask away.


Thanks to all...
BC (Chuck)



PS: Jeff, While AS-5 is curing, is 55C under full load at 2400 MHz
a normal temp if the chassis is at 30-31C (84F)? It seems a bit high,
but I also know it takes time for things to cure.


Anyone with an SK8N MB can answer..... I'm running MB drivers 3.77 and BIOS 1007, Corsair CMX512R-3200PT C2.... it's dual-channelling just fine. And I do have the mod in place for quad-channel, if needed.



Again, Catch ya all shortly,
Chuck (BC)

Ototero
07-16-2004, 06:53 AM
Beerknurd,

If you keep catching me on Ubero at the same rate, I'll have to put some more resources on it :lol: :lol:

Can't have you nicking 5th place from me ;)


Stu

em99010pepe
07-16-2004, 07:11 AM
D2OL is really boring. I need to run something that a P4 will kick ass at. "Like Ubero"

Run 20 instances of chessbrain!

BC
07-16-2004, 07:24 AM
The team is in position 7 now.

I have 6 machines, 2 running Ubero, 2 running Predictor and 2 running D2OL.

There isn't one place to collate all the team's positions so I'll stick them here. You can chastise me later.

Pos....DC Project
0003 Ubero
0005 Chessbrain...(edited)
0007 Predictor......(edited)
0009 EON
0011 Lifemapper
0014 D2OL
0016 DPAD
0017 TSC
0018 Seventeen
0039 Distributed Folding
0054 Find a Drug
0577 F@H
1420 DNET


FYI guys... we are now in 6th place in production again.... it won't be long before we own 6th. :)


And what are those *OTHER* DC projects doing in this thread??? ROFL
*cough* *cough*

BC

chaz
07-16-2004, 05:44 PM
Got a dead 2500, hard disk failure. Can't decide if I should tear down a RAID array to get it back up, or just wait for a new one to arrive. I've been trying to run it diskless, but no luck. It is a recent board with PXE compatible bios, but I can't get a server to run correctly. I expect to start dropping off soon :cry:

Beerknurd
07-16-2004, 09:36 PM
Beerknurd,

If you keep catching me on Ubero at the same rate, I'll have to put some more resources on it :lol: :lol:

Can't have you nicking 5th place from me ;)


Stu


Take your time.... LOL.

bwhite
07-16-2004, 10:45 PM
FYI guys... we are now in 6th place in production again.... it won't be long before we own 6th.

I say let's go for it. I only have 1 computer not on Predictor and am thinking about switching it over. I have had minimal trouble with it so far. I had one client corrupt all its results when the database was being fixed a few days ago, so I killed that client and reinstalled it. Other than that it has been smooth sailing.

vaughan
07-17-2004, 03:29 AM
Hey Bruce,
What happened to all your computers? You used to have quite a farm.

BC or anyone,
How long after the message says "ready to report" does BOINC check in? I'm on cable so I assumed it would send on completing the job.

BC
07-17-2004, 03:40 AM
Hey Bruce,
What happened to all your computers? You used to have quite a farm.

BC or anyone,
How long after the message says "ready to report" does BOINC check in? I'm on cable so I assumed it would send on completing the job.

Vaughan,

Boinc 'auto-reports' when it contacts the server again. This can be
done two ways.. either a) by hitting 'action - update' or b) automatically
when you reach your 'low-water mark' (minimum work) level.

I set my min & max to be ".1" days apart.... This gives me an
auto-update every 2 hours.... about every 4-5 jobs for Mfold, and
about every 15-20 for Charmms.

The settings are in your Predictor 'General' preferences.
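
For anyone who wants to check the arithmetic behind the min/max spread, here is a small Python sketch. It assumes the client simply refills once it has worked through the gap between the two settings, which is a simplification of whatever BOINC really does; the function names and the 48-minutes-per-WU figure are only illustrative.

```python
def hours_between_refills(min_days, max_days):
    """Hours of work represented by the gap between the min and max queue settings."""
    return (max_days - min_days) * 24.0


def jobs_per_refill(min_days, max_days, minutes_per_wu):
    """Rough number of WUs one CPU finishes between refills (assumed constant WU time)."""
    return hours_between_refills(min_days, max_days) * 60.0 / minutes_per_wu


gap_hours = hours_between_refills(2.0, 2.1)   # settings 0.1 days apart
per_cpu = jobs_per_refill(2.0, 2.1, 48.0)     # 48 min/WU is an example figure

print(round(gap_hours, 2), "hours between updates")
print(round(per_cpu, 1), "WUs per CPU per update")
```

A 0.1 day gap works out to 2.4 hours, so "every 2 hours" is a round figure. Widening the gap means fewer contacts with the server but results sit in 'ready to report' longer.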


Chuck

jlangner
07-17-2004, 04:30 AM
Chuck - I idle at about 33C with C-N-Q, and while running full load (BOINC) I am running at 56C.

Also, how do you get a file out of the transfer tab? It's been there for days!!! and won't upload.

vaughan
07-17-2004, 04:33 AM
Thanks Chuck, I feel so silly. I was using 3.19 and have now upgraded to 3.20
Also found I had to detach from Seti and gave P@H 100% usage. Have adjusted preference to 0.1 and 0.1 from 0.1 and 1.0
I'll see how it performs.

BC
07-17-2004, 05:20 AM
Chuck - I idle at about 33C with C-N-Q, and while running full load (BOINC) I am running at 56C.

Also, how do you get a file out of the transfer tab? It's been there for days!!! and won't upload.

right click on the job to upload itself... and select 'retry now'

that should do it.

Let me know if you have any problems.

Chuck

BC
07-17-2004, 06:01 AM
PS: Don't forget to force the update afterwards.... saves time in getting credits and reduces risk of having timed out.

BC
07-17-2004, 06:10 AM
Chuck - I idle at about 33C with C-N-Q, and while running full load (BOINC) I am running at 56C.

Also, how do you get a file out of the transfer tab? It's been there for days!!! and won't upload.

Jeff... thanks... I have CNQ as well... am not overclocking at all... pure stock now. I am back at full 12x and 200 MHz, 64 bit + dual channel (quad total capable) and stable at 53C CPU, 40C MB.

I did a COMPLETE rewiring of the chassis and improved airflow. It's much cleaner now... when I idle the cpu, the temp falls like a rock... so I think things are going well. I have one more T-take fan to install tomorrow. That will be when I give this a temperature break. I have to give it a temperature break soon... The AS-5 needs that.

I'm trying to help out as much as possible, getting us up as far as possible with 'pending credits', since Chaz lost a major machine, I will be down for about 2-3 hours, and my 3200+ isn't back up yet.


Chuck

BC
07-17-2004, 06:39 AM
Attention all Predictor participants... PLEASE UPGRADE TO 3.20 ASAP.
See below please.

The 'current progress' link on the Predictor site reads....



7/16/04 - Updates for today:

The scheduler on the server has submitted a lot of work to Mac hosts and is now waiting for more Mac's to provide duplicate wu's for validation purposes (remember we are using Homogeneous Redundancy which means that wu's are only sent out to similar hosts because the calculated results are different on different CPU's/OS's). In case you are wondering why so much work has been submitted to Mac's if there are not so many around, you are right; we do so too and have mentioned this to the boinc-development team as a problem (we'll start thinking about this ourselves too of course).

Michela has finished work on the Linux version of the Charmm application and we'll release it on the grid today. We'll only generate a small number of wu's for now though since we like to find out how and if results for this platform are coming back correctly. We have implemented some minor bugfixes and improvements in the Mfold version for Linux as well. Both science applications will be upgraded to version 3.10.

The Mfold and Charmm applications for Windows have also been optimized a bit more and some minor bugfixes have been made: Charmm shouldn't crash anymore at the start of a new workunit, because a test is now made if files are already in the slot directories and Mfold provides us with some useful information and an error code in case something goes wrong. Their respective versions will be upgraded to 3.10 as well.

We are trying to optimize a lot of queries in the server code since we (like the seti project) are starting to notice some performance problems on the server now that we are getting more and more users (more than 4000) and computers (more than 10,000!). The boinc development team is working on this as well and will soon come out with a new version 4 of the boinc software where they have addressed a lot of performance questions.

Everybody should update to version 3.20 of the boinc core client (CC) as soon as possible. This will fix a lot of problems we currently have with results returning with cpu time 0 (and accordingly 0 credits are granted if this is the first result in a set). It will also make the benchmark numbers based on the same algorithms for all hosts; this also will solve some problems with credit allocation. Actually, in a couple of days we will set the minimum version required to 3.20. People who have not upgraded yet will not get any work anymore that way. Thanks!


(personal comment)
** I asked for the 3.20 minimum..... We beat intels and they should NOT be getting more credits than us! Time to play fair! ****

Also, there will be a new FAQ coming up for Predictor shortly.... Anyone with emails can email me directly or PM me and I will put ANYTHING you have in the FAQ, since that's my primary contribution.

Please feel free to PM me references to recent issues we've solved, or personal experiences you've solved on your own... it's all valuable material.



Last but by FAR not least, Thank you all.. We've made a *BIG* impact on Predictor and are helping a good cause as well as making some HUGE strides in standings.... Anyone who's bored elsewhere or has cycles to burn is welcome to join.


Chuck

jlangner
07-17-2004, 07:35 AM
Same as last 6 days:

"Predictor@home - 2004-07-17 02:33:58 - Temporarily failed upload of t0212C_1_3121_1_0

Predictor@home - 2004-07-17 02:33:58 - Backing off 33 minutes and 8 seconds on transfer of file t0212C_1_3121_1_0"

How do i delete it. It is stuck!

BC
07-17-2004, 07:59 AM
Same as last 6 days:

"Predictor@home - 2004-07-17 02:33:58 - Temporarily failed upload of t0212C_1_3121_1_0

Predictor@home - 2004-07-17 02:33:58 - Backing off 33 minutes and 8 seconds on transfer of file t0212C_1_3121_1_0"

How do i delete it. It is stuck!

First check... but i assume that Boinc is running and other jobs are fine, right? (Upload and download?)


Then, please check the web page and see if that stuck WU is marked as timed out (no reply) or with some other error... The easiest way to find it is to go to your computers list from your 'User Page', then look down through the list for that WU number....

If expired or 'client error' or whatever it will have to come out. If not, hammer it again and again (6 times max, 1 per minute) until you get an error in the status on the web page.

If Removal required....To do this by hand is easy, just get it all in 1 shot.

1) Shut down BOINC/Predictor and edit the client.xml file (in the top level predictor directory) to remove that job from the list.....(Edit using an editor that supports LONG lines and formatting... like Wordpad in Windows). You will find that there are multiple files which comprise that WU, you must get them all. Just watch the XML tags (like HTML tags), deleting each file element from start tag to 'slashed' end tag.

2) You must write down the WU name and # and then go to your predictor\projects\-url- folder and remove (delete) all components of the WU. Most likely, you will find just the .res file (result).

3) Once all are gone, you may restart predictor and it will be gone.
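
The by-hand edit in step 1 can also be sketched in Python. This is only a sketch under assumptions: it assumes the state file is plain XML in which each file, workunit and result entry carries a <name> child holding the WU name. The sample document below is invented to show the idea, not copied from a real client file, and the function name is hypothetical. Shut BOINC down and back up the file first, exactly as with the manual edit.

```python
import xml.etree.ElementTree as ET


def remove_work_unit(root, wu_name):
    """Remove every element whose direct <name> child contains wu_name.

    Returns the number of elements removed.
    """
    removed = 0
    for parent in list(root.iter()):
        for child in list(parent):
            name = child.find("name")
            if name is not None and name.text and wu_name in name.text:
                parent.remove(child)
                removed += 1
    return removed


# Invented sample: the tag names are assumptions about the state file layout.
SAMPLE = """<client_state>
  <file_info><name>t0212C_1_3121_1_0</name></file_info>
  <file_info><name>other_wu_0</name></file_info>
  <workunit><name>t0212C_1_3121</name></workunit>
  <result><name>t0212C_1_3121_1</name></result>
  <result><name>other_wu</name></result>
</client_state>"""

root = ET.fromstring(SAMPLE)
removed = remove_work_unit(root, "t0212C_1_3121")
remaining = [entry.findtext("name") for entry in root]
print(removed, remaining)
```

To try it on a real file you would load it with ET.parse(path), call remove_work_unit on tree.getroot(), and save with tree.write(path). Whether the real file parses cleanly as a single XML document is something to check on a backup copy first, and the files in the projects folder (step 2) still have to be deleted separately.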


I *ASSUME* you did not move the job from one CPU to another, did you?
If so, it will never upload.

Also, If this was a job running at time of a crash, you must wait until ALL jobs are gone from your run queue (disable network access is the easiest) and the 'slots' folder(s) should be empty.

Make sense?



Chuck

BC
07-17-2004, 08:18 AM
I am working on the FAQ and forgot to tell you all about this little, non-obvious tweak to predictor performance.

Mfold writes a checkpoint to disk at that 'write to disk' interval in your general profile. The jobs we are running generate about a 50MB file.

You must decide whether it's safer and quicker to recover and run overall by writing out the checkpoint file more frequently (e.g. 30-60 seconds) -or- hold it in memory longer (which I do... 180 seconds).

In the case of Charmms jobs... you may never write the file at all! LOL

The impact this has is simple... if you crash... the job will recover from the last checkpoint. SO, which is better for your PC? more frequent disk writes of 50MB or to simply recompute those extra couple minutes?
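
To put rough numbers on that trade-off, here is a small Python sketch. The 50MB checkpoint size is the figure from above; the assumption that a crash is equally likely at any moment between two checkpoints (so the average loss is half the interval) is mine, and the function name is just for illustration.

```python
def checkpoint_tradeoff(interval_s, checkpoint_mb=50.0):
    """Disk traffic versus recompute cost for one 'write to disk' interval."""
    writes_per_hour = 3600.0 / interval_s
    return {
        "mb_written_per_hour": writes_per_hour * checkpoint_mb,
        # Assumes a crash is equally likely anywhere inside the interval.
        "avg_lost_seconds_per_crash": interval_s / 2.0,
        "max_lost_seconds_per_crash": float(interval_s),
    }


for interval in (30, 60, 180):
    stats = checkpoint_tradeoff(interval)
    print(interval, stats)
```

Going from 30 to 180 seconds cuts the disk writes by a factor of six, at the cost of recomputing at most three minutes of work after a crash. On a machine that rarely crashes, the longer interval is the cheaper choice.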


Chuck

bwhite
07-17-2004, 01:11 PM
I'm trying to help out as much as possible, getting us up as far as possible with 'pending credits', since Chaz lost a major machine, I will be down for about 2-3 hours, and my 3200+ isn't back up yet.


Moved 2700XP from D2OL to Predictor. Pretty much have my max effort on Predictor now.

chaz
07-17-2004, 01:27 PM
ITS ALIVE !!!!!

Had to steal ram from another machine to make a large enough ramdisk. It didn't benchmark for crap, but at least it'll run until the new HDD shows up.

jlangner
07-17-2004, 08:29 PM
Well I killed that one. Edited file wrong or something. :twisted: :evil: Oh well, lost 3 results.

BC
07-17-2004, 10:05 PM
:(

I think it's time to ask boinc dev for a 'removal / cleanup' tool.....
especially since 4.0 is about to come out... 3.21 (M2) is now 4.0.

we'll have the GUI fully detached from the running service... and (if i read it right), it will handle multiple hosts...

I will inform as I hear.

BC

BC
07-18-2004, 05:41 AM
Ok guys,
I have been watching your scores. Since I'm new to the DC competition world, how are you tuning your upload / download, min/max settings to get such great results?

I am making progress, but have to struggle for all that I get. Is it my virus scanner on the main machine that is killing me?

FYI... I have an average of 1000-1100 pts in 'pending'. It tends to hover there. My job spread keeps picking up jobs that are due in short order (today got one due 7/24).

What's the trick?



user = BCCL

Thanks,
Chuck

chaz
07-18-2004, 10:27 AM
Kill any unnecessary services and/or applications that will steal priority from Predictor. Antivirus in autoprotect mode isn't good, but it would depend on your personal activity.
My pending credits have gone from 2514 4 days ago, to 3919 today, the only things I have changed is disk write time to 600 secs, cache to .1-.3 days, and 1 machine running Linux now instead of Windows.

WienerDog
07-18-2004, 12:27 PM
I have turned all my antivirus and stuff off... I turn it on when I surf the net on my 2800 but don't leave it on when I crunch.
I'm behind a NAT so I don't run any software firewalls.
About all I have running besides M$ services is the overclocking utilities of the board... if it has one. Some of mine don't.

I have mine set to write to disk every 120 seconds
cache is set at .1 to .3 days

jlangner
07-18-2004, 02:14 PM
You think that is short order. On 7/12 I picked up 5 that were due on 7/13!

Ototero
07-18-2004, 06:46 PM
I agree with everything said here about speeding Predictor up.

One thing I've noticed is that if you look at the process in the task manager you'll find that Mfold's priority is low. I set mine to normal.

It could help :cool: