BOINC problems



Jeff
06-12-2007, 11:52 AM
I received this email this morning from a fellow AMD User. Let's see if we can help him out!


I am suffering a small problem! I have tried to reset all my projects, and have the system send new data to work, but for over a week I have not been able to get any data from any of the projects. I have been at each site and checked the status, and they all say they were up and running, except for malariacontrol, which was down for a few days.
Can you give me any ideas on what to do or what the problem might be?

AMDave
06-12-2007, 12:12 PM
Yikes!
That's not a small problem!
That's HUGE!

I'd really like to see a copy of the 1st 50 log messages from the client after a restart, but in the absence of that we can still have a go.

From the text of the message ("I have been at each site") I see that our troubled fellow still has working internet access, so I deduce that it is not the internet connection that is at issue.

My hypothesis is that the BOINC client has stopped communicating with the outside world.

Let's address the basics...

1st we want to make sure that the client is set to access the network.
In "advanced mode" I would go to the {Activity} menu item and make sure that "Network activity always available" is checked, as opposed to "Network activity suspended" which will actually stop the client from sending and receiving data to and from the projects.

If that is not the issue then, still in the client, I would want to make sure that the proxy settings have not become fouled, so
2nd select the {Advanced} menu item and click on "Options". This displays the options dialog box, where I would select the 3rd tab, "HTTP proxy". If you are not operating through a proxy then the tick box at the top, "Connect via HTTP proxy server", should be un-ticked. If you are operating through a proxy, then ensure the box is ticked and the fields below are completed with the same details as your web browser.

Well, that covers the two main reasons why the client may stop communicating with the internet due to internal/local settings. There are other possibilities that could arise, such as:
general preferences - an accidental change to a general preferences setting may be so far in error that it stops WUs from downloading (e.g. "Do work only between the hours of" might be set to such a narrow or nil gap that nothing can be done)
hardware - a memory DIMM failure leaving insufficient RAM for BOINC to download a WU
etc. etc.
but these would all show up in the log messages.

I hope that scenarios 1 or 2 above should cover the issue for our fellow AMD User, but I'm quite prepared to get to the bottom of this. A copy of the log after a BOINC restart would really help.

Best of luck fellow AMD User!

Standing by.....

PoorBoy
06-12-2007, 12:13 PM
Just to add to what AMDave said: make sure he even has Allow New Work turned on for the projects. And in his Preferences under the Activity Tab, set Run Always & Network Activity Always Available ...

If that doesn't work then the next thing I would do is Un-Install BOINC, then go into the Registry and Delete everything related to BOINC, then Re-Install BOINC and Attach to whatever Projects he wants to run & see if he can get WUs.

He's not going to lose any WUs since he already said he's tried resetting the Projects, so he must be out already ... So it's best to try & start with a fresh install of BOINC if he's having problems & nothing else works.

Jeff
06-12-2007, 12:29 PM
I have sent an email back asking for the log information. Hopefully we can figure this out for him


Jeff

AMDave
06-12-2007, 12:29 PM
"Allow New Work for the projects turned on"
Thanks PoorBoy. I overlooked that.

Something else just occurred to me.
What is the client version? {Help} > About > Version: X.X.X ?
If the client is too old it may have stopped communicating with the newer versions on the project servers.

I have to admit that this is something I have not tested.
/ed - regression testing. what a good idea. doh! -ed/
If this were the case then an upgrade-install to the same folder would be in order.

Back to standing by...

PoorBoy
06-12-2007, 12:41 PM
I was just thinking the same thing Dave, his BOINC Client may be too old. A lot of the Projects require a certain Version or they won't send work to you ...

vaughan
06-12-2007, 12:47 PM
When that happens I've seen a message to the effect that you need such & such a version to run this project.

AMDave
06-12-2007, 12:58 PM
"When that happens I've seen a message to the effect that you need such & such a version to run this project."
A pop-up dialog, or do you see it in the log?

vaughan
06-12-2007, 01:51 PM
It's in the messages tab.

PoorBoy
06-12-2007, 02:46 PM
"It's in the messages tab."

Yup, I've seen that myself at a few Projects, it's what got me to finally get off v5.4.11 & install v5.8.16 across my PCs. May have to go even higher shortly ...

AMDave
06-12-2007, 03:13 PM
I was pleasantly surprised by my 5.9.4 deployment for the competition.
I was even happier with BoincView 1.5 Beta 8 deployment.
(Thanks to Jason1478963 for the nudge :) there)
No troubles in either camp since deployment.
I think these two work very well together. Highly recommended.

Jeff
06-13-2007, 12:44 PM
Here is the log:


6/13/2007 12:42:45 PM||Starting BOINC client version 5.8.16 for windows_intelx86
6/13/2007 12:42:45 PM||log flags: task, file_xfer, sched_ops
6/13/2007 12:42:45 PM||Libraries: libcurl/7.16.0 OpenSSL/0.9.8a zlib/1.2.3
6/13/2007 12:42:45 PM||Data directory: C:\Program Files\BOINC
6/13/2007 12:42:46 PM||Processor: 1 AuthenticAMD AMD Sempron(tm) Processor 2500+ [x86 Family 15 Model 44 Stepping 2] [fpu tsc pae nx sse sse2 3dnow mmx]
6/13/2007 12:42:46 PM||Memory: 767.23 MB physical, 1.83 GB virtual
6/13/2007 12:42:46 PM||Disk: 74.53 GB total, 50.08 GB free
6/13/2007 12:42:46 PM|ABC@home|URL: http://abcathome.com/; Computer ID: 16382; location: (none); project prefs: default
6/13/2007 12:42:46 PM|proteins@home|URL: http://biology.polytechnique.fr/proteinsathome/; Computer ID: 17708; location: home; project prefs: default
6/13/2007 12:42:46 PM|rosetta@home|URL: http://boinc.bakerlab.org/rosetta/; Computer ID: 466108; location: home; project prefs: default
6/13/2007 12:42:46 PM|boincsimap|URL: http://boinc.bio.wzw.tum.de/boincsimap/; Computer ID: 71611; location: home; project prefs: default
6/13/2007 12:42:46 PM|Leiden Classical|URL: http://boinc.gorlaeus.net/; Computer ID: 24112; location: (none); project prefs: default
6/13/2007 12:42:46 PM|climateprediction.net|URL: http://climateprediction.net/; Computer ID: 678640; location: (none); project prefs: default
6/13/2007 12:42:46 PM|Einstein@Home |URL: http://einstein.phys.uwm.edu/; Computer ID: 913634; location: (none); project prefs: default
6/13/2007 12:42:46 PM|Predictor @ Home|URL: http://predictor.scripps.edu/; Computer ID: 340087; location: (none); project prefs: default
6/13/2007 12:42:46 PM|QMC@HOME|URL: http://qah.uni-muenster.de/; Computer ID: 52864; location: (none); project prefs: default
6/13/2007 12:42:46 PM|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 3257773; location: home; project prefs: default
6/13/2007 12:42:46 PM|Spinhenge@home|URL: http://spin.fh-bielefeld.de/; Computer ID: 37391; location: home; project prefs: default
6/13/2007 12:42:46 PM|malariacontrol.net beta|URL: http://www.malariacontrol.net/; Computer ID: 41913; location: home; project prefs: default
6/13/2007 12:42:46 PM|World Community Grid|URL: http://www.worldcommunitygrid.org/; Computer ID: 197138; location: (none); project prefs: default
6/13/2007 12:42:46 PM||General prefs: from proteins@home (last modified 2007-06-12 21:39:36)
6/13/2007 12:42:46 PM||Host location: home
6/13/2007 12:42:46 PM||General prefs: no separate prefs for home; using your defaults
6/13/2007 12:42:46 PM||Reading preferences override file
6/13/2007 12:42:47 PM|QMC@HOME|Restarting task two_099_peptidexp.1950_1 using Amolqc-preRC1exp version 501
6/13/2007 12:43:22 PM|proteins@home|Sending scheduler request: Requested by user
6/13/2007 12:43:22 PM|proteins@home|(not requesting new work or reporting completed tasks)
6/13/2007 12:43:31 PM|proteins@home|Scheduler RPC succeeded [server version 509]
6/13/2007 12:43:31 PM|proteins@home|Deferring communication for 31 sec
6/13/2007 12:43:31 PM|proteins@home|Reason: requested by project
6/13/2007 12:43:36 PM|rosetta@home|Sending scheduler request: Requested by user
6/13/2007 12:43:36 PM|rosetta@home|(not requesting new work or reporting completed tasks)
6/13/2007 12:43:41 PM|rosetta@home|Scheduler RPC succeeded [server version 509]
6/13/2007 12:43:41 PM|rosetta@home|Deferring communication for 4 min 2 sec
6/13/2007 12:43:41 PM|rosetta@home|Reason: requested by project
6/13/2007 12:43:46 PM|boincsimap|Sending scheduler request: Requested by user
6/13/2007 12:43:46 PM|boincsimap|(not requesting new work or reporting completed tasks)
6/13/2007 12:43:54 PM|boincsimap|Scheduler RPC succeeded [server version 509]
6/13/2007 12:43:54 PM|boincsimap|Deferring communication for 7 sec
6/13/2007 12:43:54 PM|boincsimap|Reason: requested by project
6/13/2007 12:43:57 PM|Einstein@Home|Sending scheduler request: Requested by user
6/13/2007 12:43:57 PM|Einstein@Home|(not requesting new work or reporting completed tasks)
6/13/2007 12:44:03 PM|Einstein@Home|Scheduler RPC succeeded [server version 509]
6/13/2007 12:44:03 PM|Einstein@Home|Deferring communication for 1 min 0 sec
6/13/2007 12:44:03 PM|Einstein@Home|Reason: requested by project
6/13/2007 12:44:08 PM|Predictor @ Home|Sending scheduler request: Requested by user
6/13/2007 12:44:08 PM|Predictor @ Home|(not requesting new work or reporting completed tasks)
6/13/2007 12:44:14 PM|Predictor @ Home|Scheduler RPC succeeded [server version 510]
6/13/2007 12:44:14 PM|Predictor @ Home|Deferring communication for 7 sec
6/13/2007 12:44:14 PM|Predictor @ Home|Reason: requested by project
6/13/2007 12:44:25 PM|Spinhenge@home|Sending scheduler request: Requested by user
6/13/2007 12:44:25 PM|Spinhenge@home|(not requesting new work or reporting completed tasks)
6/13/2007 12:44:31 PM|Spinhenge@home|Scheduler RPC succeeded [server version 509]
6/13/2007 12:44:31 PM|Spinhenge@home|Deferring communication for 2 min 1 sec
6/13/2007 12:44:31 PM|Spinhenge@home|Reason: requested by project
6/13/2007 12:44:36 PM|malariacontrol.net beta|Sending scheduler request: Requested by user
6/13/2007 12:44:36 PM|malariacontrol.net beta|(not requesting new work or reporting completed tasks)
6/13/2007 12:44:42 PM|malariacontrol.net beta|Scheduler RPC succeeded [server version 509]
6/13/2007 12:44:42 PM|malariacontrol.net beta|Deferring communication for 11 sec
6/13/2007 12:44:42 PM|malariacontrol.net beta|Reason: requested by project
6/13/2007 12:44:47 PM|World Community Grid|Sending scheduler request: Requested by user
6/13/2007 12:44:47 PM|World Community Grid|(not requesting new work or reporting completed tasks)
6/13/2007 12:44:53 PM|World Community Grid|Scheduler RPC succeeded [server version 509]
6/13/2007 12:44:53 PM|World Community Grid|Deferring communication for 5 min 3 sec
6/13/2007 12:44:53 PM|World Community Grid|Reason: requested by project
6/13/2007 12:51:38 PM|proteins@home|Sending scheduler request: Requested by user
6/13/2007 12:51:38 PM|proteins@home|(not requesting new work or reporting completed tasks)
6/13/2007 12:51:43 PM|proteins@home|Scheduler RPC succeeded [server version 509]
6/13/2007 12:51:43 PM|proteins@home|Deferring communication for 31 sec
6/13/2007 12:51:43 PM|proteins@home|Reason: requested by project
6/13/2007 12:51:49 PM|rosetta@home|Sending scheduler request: Requested by user
6/13/2007 12:51:49 PM|rosetta@home|(not requesting new work or reporting completed tasks)
6/13/2007 12:51:54 PM|rosetta@home|Scheduler RPC succeeded [server version 509]
6/13/2007 12:51:54 PM|rosetta@home|Deferring communication for 4 min 2 sec
6/13/2007 12:51:54 PM|rosetta@home|Reason: requested by project
6/13/2007 12:51:59 PM|boincsimap|Sending scheduler request: Requested by user
6/13/2007 12:51:59 PM|boincsimap|(not requesting new work or reporting completed tasks)
6/13/2007 12:52:04 PM|boincsimap|Scheduler RPC succeeded [server version 509]
6/13/2007 12:52:04 PM|boincsimap|Deferring communication for 7 sec
6/13/2007 12:52:04 PM|boincsimap|Reason: requested by project
6/13/2007 12:52:09 PM|Einstein@Home|Sending scheduler request: Requested by user
6/13/2007 12:52:09 PM|Einstein@Home|(not requesting new work or reporting completed tasks)
6/13/2007 12:52:14 PM|Einstein@Home|Scheduler RPC succeeded [server version 509]
6/13/2007 12:52:14 PM|Einstein@Home|Deferring communication for 1 min 0 sec
6/13/2007 12:52:14 PM|Einstein@Home|Reason: requested by project
6/13/2007 12:52:20 PM|Predictor @ Home|Sending scheduler request: Requested by user
6/13/2007 12:52:20 PM|Predictor @ Home|(not requesting new work or reporting completed tasks)
6/13/2007 12:52:25 PM|Predictor @ Home|Scheduler RPC succeeded [server version 510]
6/13/2007 12:52:25 PM|Predictor @ Home|Deferring communication for 7 sec
6/13/2007 12:52:25 PM|Predictor @ Home|Reason: requested by project
6/13/2007 12:52:30 PM|Spinhenge@home|Sending scheduler request: Requested by user
6/13/2007 12:52:30 PM|Spinhenge@home|(not requesting new work or reporting completed tasks)
6/13/2007 12:52:35 PM|Spinhenge@home|Scheduler RPC succeeded [server version 509]
6/13/2007 12:52:35 PM|Spinhenge@home|Deferring communication for 2 min 1 sec
6/13/2007 12:52:35 PM|Spinhenge@home|Reason: requested by project
6/13/2007 12:52:40 PM|malariacontrol.net beta|Sending scheduler request: Requested by user
6/13/2007 12:52:40 PM|malariacontrol.net beta|(not requesting new work or reporting completed tasks)
6/13/2007 12:52:45 PM|malariacontrol.net beta|Scheduler RPC succeeded [server version 509]
6/13/2007 12:52:45 PM|malariacontrol.net beta|Deferring communication for 11 sec
6/13/2007 12:52:45 PM|malariacontrol.net beta|Reason: requested by project
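
A log like this is easier to read once it is tallied per project. The following is only a minimal Python sketch, assuming the `timestamp|project|message` layout seen in the lines above; the function names `parse_line` and `summarize` are made up for illustration and are not part of BOINC. It counts, for each project, how many scheduler RPCs succeeded and how many requests carried the "(not requesting new work or reporting completed tasks)" note.

```python
from collections import Counter

# A few lines copied from the log above, used as sample input.
SAMPLE_LOG = """\
6/13/2007 12:43:22 PM|proteins@home|Sending scheduler request: Requested by user
6/13/2007 12:43:22 PM|proteins@home|(not requesting new work or reporting completed tasks)
6/13/2007 12:43:31 PM|proteins@home|Scheduler RPC succeeded [server version 509]
6/13/2007 12:43:36 PM|rosetta@home|Sending scheduler request: Requested by user
6/13/2007 12:43:36 PM|rosetta@home|(not requesting new work or reporting completed tasks)
6/13/2007 12:43:41 PM|rosetta@home|Scheduler RPC succeeded [server version 509]
"""


def parse_line(line):
    """Split 'timestamp|project|message' into a tuple, or return None if malformed."""
    parts = line.rstrip("\n").split("|", 2)
    if len(parts) != 3:
        return None
    timestamp, project, message = parts
    return timestamp, project.strip(), message


def summarize(log_text):
    """Count, per project, successful RPCs and requests that asked for no work."""
    succeeded = Counter()
    no_work_requested = Counter()
    for line in log_text.splitlines():
        parsed = parse_line(line)
        if parsed is None:
            continue
        _, project, message = parsed
        if not project:
            continue  # client-wide lines have an empty project field
        if message.startswith("Scheduler RPC succeeded"):
            succeeded[project] += 1
        elif message.startswith("(not requesting new work"):
            no_work_requested[project] += 1
    return succeeded, no_work_requested


if __name__ == "__main__":
    ok, idle = summarize(SAMPLE_LOG)
    for project in sorted(ok):
        print(f"{project}: {ok[project]} RPC ok, {idle[project]} without a work request")
```

If every contact succeeds but none of them asks for work, as in the sample, that points at the client's own decision not to fetch work and away from a network or server problem.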

Lagu
06-13-2007, 04:16 PM
Here are my messages from the log after restarting BOINC, due to 5 updates that required a restart:

2007-06-13 17:09:54||Starting BOINC client version 5.8.16 for windows_intelx86
2007-06-13 17:09:54||log flags: task, file_xfer, sched_ops
2007-06-13 17:09:54||Libraries: libcurl/7.16.0 OpenSSL/0.9.8a zlib/1.2.3
2007-06-13 17:09:54||Data directory: C:\Program\BOINC
2007-06-13 17:09:54||Processor: 1 AuthenticAMD AMD Athlon(tm) 64 Processor 3200+ [x86 Family 15 Model 12 Stepping 0] [fpu tsc pae nx sse sse2 3dnow mmx]
2007-06-13 17:09:54||Memory: 1023.48 MB physical, 2.40 GB virtual
2007-06-13 17:09:54||Disk: 76.32 GB total, 69.74 GB free
2007-06-13 17:09:54|Riesel Sieve Project|URL: http://boinc.rieselsieve.com/; Computer ID: 16098; location: home; project prefs: home
2007-06-13 17:09:54|Spinhenge@home|URL: http://spin.fh-bielefeld.de/; Computer ID: 47199; location: home; project prefs: default
2007-06-13 17:09:54|NanoHive@Home|URL: http://www.nanohive-1.org/atHome/; Computer ID: 10545; location: (none); project prefs: default
2007-06-13 17:09:54|PrimeGrid|URL: http://www.primegrid.com/; Computer ID: 32522; location: home; project prefs: default
2007-06-13 17:09:54|World Community Grid|URL: http://www.worldcommunitygrid.org/; Computer ID: 185335; location: (none); project prefs: default
2007-06-13 17:09:54|Zivis|URL: http://zivis.bifi.unizar.es/; Computer ID: 2642; location: (none); project prefs: default
2007-06-13 17:09:54||General prefs: from Spinhenge@home (last modified 2007-06-06 18:14:23)
2007-06-13 17:09:54||Host location: home
2007-06-13 17:09:54||General prefs: using separate prefs for home
2007-06-13 17:09:54||Reading preferences override file
2007-06-13 17:09:54|Spinhenge@home|Restarting task 1_Fe30_map_137_98_2 using metropolis version 242
2007-06-13 17:09:55|Spinhenge@home|Sending scheduler request: To report completed tasks
2007-06-13 17:09:55|Spinhenge@home|Reporting 1 tasks
2007-06-13 17:10:00|Spinhenge@home|Scheduler RPC succeeded [server version 509]
2007-06-13 17:10:00|Spinhenge@home|Deferring communication for 2 min 1 sec
2007-06-13 17:10:00|Spinhenge@home|Reason: requested by project
2007-06-13 17:52:31|Spinhenge@home|Computation for task 1_Fe30_map_137_98_2 finished
2007-06-13 17:52:31|Spinhenge@home|Starting 1_Fe30_map_137_633_1
2007-06-13 17:52:31|Spinhenge@home|Starting task 1_Fe30_map_137_633_1 using metropolis version 242
2007-06-13 17:52:33|Spinhenge@home|[file_xfer] Started upload of file 1_Fe30_map_137_98_2_0
2007-06-13 17:52:35|Spinhenge@home|[file_xfer] Finished upload of file 1_Fe30_map_137_98_2_0
2007-06-13 17:52:35|Spinhenge@home|[file_xfer] Throughput 6716 bytes/sec

It seems your teammate has not chosen prefs for home, or has different settings for general and home? ;)

BobCat13
06-14-2007, 03:43 AM
If he has a cc_config.xml file in the BOINC data directory, have him activate the work_fetch_debug flag to see if the client states why it isn't requesting work.

If he doesn't have a cc_config.xml file, then he can create it by doing the following:

1. Create a new text file and rename it cc_config.xml.
2. Open cc_config.xml with Notepad.
3. Paste the following lines into the file:

<cc_config>
<log_flags>
<work_fetch_debug>1</work_fetch_debug>
</log_flags>
</cc_config>

4. Save the file.
5. In the Advanced view of BOINC Manager, click Advanced and choose "Read config file". If his version doesn't have that option under the Advanced section, just exit BOINC and start it again.

This should create a message log of what the work fetch is doing. Run it for a few minutes to get several entries from the debug data, and then change the 1 back to 0 to stop the debug logging, as the list of messages will add up in a hurry.
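
For anyone who would rather not hand-edit the file, the same XML can be generated and checked with a short script. This is only a sketch in Python using the standard library; `build_cc_config` and `read_flag` are hypothetical helper names, and where the result gets saved (the BOINC data directory) is left to the user.

```python
import xml.etree.ElementTree as ET


def build_cc_config(work_fetch_debug):
    """Return cc_config.xml text with the work_fetch_debug log flag set to 1 or 0."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    flag = ET.SubElement(log_flags, "work_fetch_debug")
    flag.text = "1" if work_fetch_debug else "0"
    return ET.tostring(root, encoding="unicode")


def read_flag(xml_text):
    """Return True if the given cc_config text has work_fetch_debug set to 1."""
    root = ET.fromstring(xml_text)
    node = root.find("./log_flags/work_fetch_debug")
    return node is not None and node.text is not None and node.text.strip() == "1"


if __name__ == "__main__":
    # Print the text to paste into cc_config.xml; rerun with False to turn logging off.
    print(build_cc_config(True))
```

Running `read_flag` on the text of an existing cc_config.xml is a quick way to confirm the file is well-formed and that the flag was switched back to 0 afterwards.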

AMDave
06-14-2007, 09:33 AM
Excellent.
Thanks for the log details.
Something caught my eye in the restart:

6/13/2007 12:42:47 PM|QMC@HOME|Restarting task two_099_peptidexp.1950_1 using Amolqc-preRC1exp version 501

I suspect we have a stuck WU from QMC

I have done some drill downs and found many error results from the QMC test set "two_099_peptidexp"

Also, some of these WUs are taking up to 50,000 seconds.

So the client may have this huge WU stuck in it, blocking out any other WUs, as it may be deciding there is no "time" capacity for them until this WU is finished.

Abort the WU.
If you cannot abort the WU, detach the QMC project from the project list.
If everything runs normally again then fine, otherwise I'd go with BobCat13's suggestion to get more log info so we can see what is going wrong - in this case the extra log information would be very useful.

Otherwise if patience is wearing thin, go back to PoorBoy's suggestion: uninstall, ensure the directory is deleted and then re-install. If it happens again after that then we REALLY want that extra log information as BobCat13 has described.

Bonne chance!