FreeHAL problem



NeoGen
07-06-2009, 11:51 AM
It seems they have a new server running now, but FreeHAL has stopped working for me.
Is anybody else getting this message lately?


06/07/2009 12:48:05 FreeHAL@home Sending scheduler request: To report completed tasks.
06/07/2009 12:48:05 FreeHAL@home Reporting 1 completed tasks, requesting new tasks
06/07/2009 12:48:10 FreeHAL@home Scheduler request completed: got 0 new tasks
06/07/2009 12:48:10 FreeHAL@home Message from server: Error in request message: no start tag

liuqyn
07-06-2009, 11:55 AM
I'll have to re-attach and let you know.

liuqyn
07-06-2009, 01:46 PM
It would seem you need to detach and reattach to connect properly to the new server.
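
For anyone doing the detach and reattach from a command line rather than the BOINC Manager, here is a minimal sketch that builds the two `boinccmd` invocations. The project URL and account key below are placeholders (the thread gives neither), and the helper names are made up for illustration; the script only prints the commands, it does not run them.

```python
from typing import List

# Hypothetical placeholders: substitute the real project URL and your own account key.
PROJECT_URL = "http://example.invalid/freehal/"
ACCOUNT_KEY = "YOUR_ACCOUNT_KEY"


def detach_command(project_url: str) -> List[str]:
    """Argument list for detaching the client from a project."""
    return ["boinccmd", "--project", project_url, "detach"]


def attach_command(project_url: str, account_key: str) -> List[str]:
    """Argument list for attaching the client to a project again."""
    return ["boinccmd", "--project_attach", project_url, account_key]


if __name__ == "__main__":
    # Print the commands in the order they would be run: detach first, then attach.
    for cmd in (detach_command(PROJECT_URL), attach_command(PROJECT_URL, ACCOUNT_KEY)):
        print(" ".join(cmd))
```

To actually execute them, each list could be passed to `subprocess.run(cmd, check=True)` on a machine where `boinccmd` is on the PATH. Note that detaching discards any tasks in progress for that project.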

liuqyn
07-06-2009, 01:59 PM
Although I was only getting one task at a time, along with this error:

7/6/2009 8:27:59 AM FreeHAL@home Started upload of data.pro
7/6/2009 8:28:01 AM FreeHAL@home [error] Error reported by file upload server: can't open log file '../log_web1/file_upload_handler.log' (errno: 9)
7/6/2009 8:28:01 AM FreeHAL@home Temporarily failed upload of data.pro: transient upload error
7/6/2009 8:28:01 AM FreeHAL@home Backing off 2 hr 3 min 13 sec on upload of data.pro


After I cancelled that transfer, it let me report completed tasks and downloaded 10 new ones (my setting in preferences).
Still waiting to see if I get credit for the first few (it seems the validator isn't running).
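
If you want to see how long the client intends to wait before retrying an upload, the "Backing off" line can be picked apart with a short script. This is only a sketch: the pattern is inferred from the single log line quoted above, so other client versions may word the message differently, and `parse_backoff` is a made-up helper name.

```python
import re
from datetime import timedelta
from typing import Optional

# Pattern inferred from the quoted log line; the hour and minute parts are
# treated as optional on the assumption that short delays omit them.
_BACKOFF = re.compile(r"Backing off (?:(\d+) hr )?(?:(\d+) min )?(\d+) sec")


def parse_backoff(line: str) -> Optional[timedelta]:
    """Return the backoff delay found in a message-log line, or None if absent."""
    match = _BACKOFF.search(line)
    if match is None:
        return None
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


if __name__ == "__main__":
    sample = ("7/6/2009 8:28:01 AM FreeHAL@home "
              "Backing off 2 hr 3 min 13 sec on upload of data.pro")
    print(parse_backoff(sample))
```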

liuqyn
07-06-2009, 02:24 PM
Changed preferences, downloaded the full 25 tasks, and now crunching. Hope that lasts.

Nflight
07-06-2009, 03:10 PM
Me too. After I aborted the data.pro file, I have been whistling Dixie with the new set of 25 Work Units per box. Woo Hoo :blob3:

NeoGen
07-06-2009, 10:11 PM
I detached and reattached and am getting a new error now.

06/07/2009 23:09:06 FreeHAL@home Scheduler request failed: Error 403
I believe this may be a server problem now... or is anyone able to get WUs?

liuqyn
07-06-2009, 10:26 PM
Yup, it's new. No response yet from the admin.

AMDave
07-09-2009, 09:48 AM
:icon_mad: ouch

I ran into hard-disk hell with some of these WUs (only 5 at a time) on 2 machines this afternoon.

The I/O wait suddenly went through the roof while the HDDs were going bonkers, and that locked up the desktops on both Windows and Linux.

Sadly, detaching the project was the only thing that would fix it. :(

NeoGen
07-09-2009, 10:30 AM
Ouch indeed...
How many WUs were you running in parallel? I've tried running only 5 at a time for now to see how it goes, and haven't had problems.

Nflight
07-09-2009, 10:34 AM
I went all night without communicating with the project. This morning I updated, and all 30 WUs were recalled by the server and more were dispensed. I am back in business. I set the total from 25 WUs down to 18 for each box. The overwhelming load on the CPUs had my systems lurching instead of working like they should. Too many WUs for once; 18 works much better. :blob3:

vaughan
07-09-2009, 01:01 PM
An interesting observation is that FreeHAL seems to take some CPU cycles now, and I often see a message pop up for a few seconds saying 'Communicating with BOINC client. Please wait...'. The HDD light seems to stay on a lot more too. These systems are Win XP 32-bit with 2 GB RAM.

liuqyn
07-09-2009, 01:19 PM
Looks like all the activity is in the last 10-15% of the run, and then it goes crazy.

Nflight
07-15-2009, 01:06 PM
The server is back online ! Woot Woot :blob3:

Nflight
07-18-2009, 04:43 PM
Work Units are taking considerably longer now; I am seeing about 30% longer durations to complete the tasks. And there is still that last-minute shiver of the system that Liuqyn mentions. It is like the brain freeze you get when you suck down a cold milkshake too fast and it chills your head so much it gives you a headache!

liuqyn
07-18-2009, 05:40 PM
And to add to that, these seemingly nightly downtimes are killing my RAC. I'm not home to tell my boxes to "try again" when the server comes back up, so BOINC gets tired of trying and "waits" 24 hrs, at which time the server is down again.

Now that I've made it to 100k, I'm going to detach until it is more stable.
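
For the "not home to tell my boxes to try again" problem, one possible workaround is to schedule the retry instead of clicking it by hand. The sketch below only builds the `boinccmd` calls that ask the client to retry deferred network communication and to contact the project's scheduler; the URL is a placeholder, `retry_commands` is a made-up helper name, and whether this fully overrides the long wait described above is an assumption that has not been tested against FreeHAL.

```python
from typing import List

# Hypothetical placeholder: the thread does not give the project URL.
PROJECT_URL = "http://example.invalid/freehal/"


def retry_commands(project_url: str) -> List[List[str]]:
    """Argument lists that nudge the BOINC client to try the project again."""
    return [
        # Ask the client to retry deferred network communication.
        ["boinccmd", "--network_available"],
        # Ask the client to contact this project's scheduler now.
        ["boinccmd", "--project", project_url, "update"],
    ]


if __name__ == "__main__":
    # Print the commands; a scheduled job (cron, Task Scheduler) could run them
    # every hour or so via subprocess.run(cmd, check=False).
    for cmd in retry_commands(PROJECT_URL):
        print(" ".join(cmd))
```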