Riesel Sieve Down Now



LeBo
09-12-2006, 11:47 AM
Riesel Sieve has been down since last night and I'm completely out of WUs, and that sucks big time. Had my queue set at .3 :-( :-( :-(

vaughan
09-12-2006, 12:16 PM
I noticed too. Most systems will run dry in the next few hours. I changed my cache from about 4 or 5 days to 0.5 day only recently. Might have to set it at 10 when the work gets released again so that I can survive the droughts.

/Edit: I have left mine on 0.5 day as Bryan has requested that we don't hoard work units, as big dumps cause the server to have problems. /edit.

Nflight
09-12-2006, 01:15 PM
One system is dry and the other is dwindling as we speak. I had a good run there, but these dry spells also mean B@A might lose interest as well and take this as a reason to move out of here. Just a thought~!
Vaughan, I gave it my all the last couple of days as you requested, and I think I made a tremendous dent. I have been looking over the numbers of everyone who jumped in, and there has been tremendous support to keep us at number one.

Congratulations Team Effort :D

vaughan
09-12-2006, 02:01 PM
Yeah I agree Nflight. Thanks for your support and to the rest of the team too.


Well done crew :!:

LeBo
09-12-2006, 02:03 PM
I will also change my cache to something like 3 to 4 days

gatekeeper53
09-12-2006, 02:25 PM
I still have 2 machines chugging along. It should be funny when they come back up. With all the finished WUs uploading it might shut them right back down.

Lagu
09-12-2006, 02:32 PM
Has anyone read the forums of Riesel Sieve?
They are planning to double the size of the WU to 500, which I think means that if a WU takes 25 min now it will take 50 minutes.

I don't know if I have understood it right, but I will change the connect time from 10.0 to 0.1 to see how long it takes.

Right now they have server issues.

Lagu :)

Empty_5oul
09-12-2006, 04:53 PM
yes lagu i saw that, it's on the front page as well.

double the size of each WU means double the time to crunch. So each person should not bombard the server as much, though they will send/receive more data each time.

changing the connection time will increase/decrease the amount of work you receive. Setting it to 0.1 means you would contact the server a lot of times; 10, the other extreme, means you are likely to connect infrequently (once every 10 days). They point out on the front page: don't stockpile too many WUs and send them all at once, as their servers can't handle it.
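As a rough illustration of the point above, here is a minimal sketch of how the connect interval and the WU length together set the size of a host's cache. The 25 and 50 minute figures come from Lagu's post; the function name and the one-core default are assumptions, and BOINC's real work-fetch logic uses its own run-time estimates, so treat this as back-of-envelope arithmetic only.

```python
import math


def cached_work_units(connect_interval_days: float,
                      minutes_per_wu: float,
                      cores: int = 1) -> int:
    """Roughly how many WUs a host needs to stay busy for one interval.

    Hypothetical helper for illustration; not BOINC's actual algorithm.
    """
    minutes_of_work = connect_interval_days * 24 * 60 * cores
    return math.ceil(minutes_of_work / minutes_per_wu)


# Compare the intervals mentioned in the thread, before and after doubling.
for interval in (0.1, 0.5, 10.0):
    now = cached_work_units(interval, 25)
    doubled = cached_work_units(interval, 50)
    print(f"{interval:>4} days: {now} WUs at 25 min, {doubled} WUs at 50 min")
```

A 10-day setting means hundreds of results landing on the server in one dump, which is the behaviour the project asked people to avoid.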

LeBo
09-12-2006, 05:28 PM
Can't connect, must still be down..:(

Murray
09-12-2006, 05:59 PM
Check the announcements forum for updates...
http://www.rieselsieve.com/forum/viewforum.php?f=28&sid=c6e75d1463ed7132f0cc78dd6493c936

dAVE
09-12-2006, 06:50 PM
A little trick with the way BOINC seems to work. Each time it has a file ready to upload it contacts the server and tries to upload it before requesting new work. Because of the restrictions that Bryan has put on uploads to the server, the RPC gets bounced back and BOINC backs off, apparently without getting to the request-new-work bit. If you can hit the update project button when no attempted uploads are being tried, then you make a simple request for work. This has priority, so you can be lucky and see this:
12/09/2006 19:23:34|Riesel Sieve Project|Started download of file wu_341659.txt
12/09/2006 19:23:34|Riesel Sieve Project|Started download of file wu_341660.txt
12/09/2006 19:23:36|Riesel Sieve Project|Finished download of file wu_341659.txt
12/09/2006 19:23:36|Riesel Sieve Project|Throughput 130 bytes/sec

Couple of things, don’t keep hitting the update button, it might take several minutes before the downloaded WUs start and don’t tell Bryan about this, it will add to his server load and he will cut me off, or some vital part of me if he gets near enough! :rightfighter5:
dAVE

LeBo
09-12-2006, 07:06 PM
"don’t tell Bryan about this"


But, but he visits our forum all the time. I just went in to the My Account Options and upped the Connect to network and started receiving more work after a while.

dAVE
09-12-2006, 07:16 PM
Yep LeBo, that has the same result: a simple request for work without trying to upload files. I was hoping that by the time Bryan sorts his servers out AND moves house, these posts would be buried so deep that he wouldn't notice. dAVE

Nflight
09-12-2006, 08:07 PM
Worked for me, I now have work to crunch, Yippee :!: :!: :!:

Empty_5oul
09-12-2006, 08:14 PM
i moved over to spinhenge while this is down, rather than idle my cpu! When everyone can upload work i will re-join the team effort here.

LeBo
09-12-2006, 08:35 PM
The only problem I now have is nothing is uploading. Will his servers ever catch up, with our machines just caching more to upload? In fact I have over 130 WUs in my queue to upload now.

Nflight
09-12-2006, 10:09 PM
ETA till we are up is 1/2 hour, according to the message boards where Bryan is posting, found here:

http://www.rieselsieve.com/forum/viewtopic.php?t=892&sid=828c4ae78fb5c9b5dd9251daf23749c2

dAVE
09-12-2006, 10:24 PM
Bryan has disabled the uploading of all results for 2 hours to try to recover the situation. You should still be able to download WUs if you can get in between all the clients trying to upload results. LeBo can you check your PMs please. dAVE :-(

gatekeeper53
09-12-2006, 10:43 PM
When he gets it up maybe I can move the box I built today over from Einstein. Tried to put it on here since everyone was getting WUs but it needs to set up something I guess. Is there any way to release the finished WUs slowly so as not to overwhelm the server again? I have between 200 and 300 of them.

LeBo
09-12-2006, 10:46 PM
With what everyone has back loaded, Bryan will need 3 servers like he has to catch up.

Empty_5oul
09-12-2006, 11:20 PM
gatekeeper53 as far as i know it's not possible. Boinc tries to do it all itself, so don't keep clicking update or refresh or anything. If it times out for xx time let it, as it makes people back off for different periods so the server isn't overloaded.

for the moment i would crunch another project for a little while. Then when this has stabilised try it again.

Nflight
09-12-2006, 11:35 PM
I have just received 2 full days of work units to crunch; I put in the request almost three hours ago. To lessen the strain on Bryan's operations now that I have received these work units, I am going to turn off my network activity completely. This will reduce the stress on his server, and this will all pan out in the end when I upload my super splash of points.

If anyone else receives a hoard of WUs I suggest you do the same. Thanks in advance. :)

dAVE
09-12-2006, 11:43 PM
The latest from Bryan:-
OK, we're operating at 20% of requests on file uploads, 25% of requests on regular scheduler requests - and staying around or below a load of 20, which is where things start to time out/go haywire. The stats, validation, and other DB scripts are not running right now, and the DB-driven web pages are turned off. As long as things keep moving along slowly, I'll gradually increase those percentages, hopefully 100% on both by morning. Just be patient.
_________________
Bryan
Stats Administrator
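Letting through only a fraction of requests can be pictured as probabilistic admission control. The sketch below is an assumption about the general technique, not Bryan's actual server configuration (his later update mentions Apache controls); only the 20% and 25% figures come from his message, and the names are hypothetical.

```python
import random

# Fractions quoted in Bryan's update; everything else here is illustrative.
ADMIT_FRACTION = {"upload": 0.20, "scheduler": 0.25}


def admit(kind: str, rng: random.Random) -> bool:
    """Accept a request of the given kind with its configured probability."""
    return rng.random() < ADMIT_FRACTION[kind]


# Simulate 10,000 upload attempts and count how many get through.
rng = random.Random(2006)
accepted = sum(admit("upload", rng) for _ in range(10_000))
print(f"accepted {accepted} of 10000 upload requests")
```

Rejected clients back off and retry later, so raising the fractions gradually drains the backlog without pushing the load past the point where things time out.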

Strongbow
09-13-2006, 08:56 AM
Ouch, I have several thousand uploads pending across multiple systems and they're rapidly crunching many more!

The joys of beta projects! ...I'm sure it was actually really useful that the Aussie assault helped to over stress their environment as it highlighted these issues now rather than later on in the project and so it really shows the limitations of the configurations. I suspect there will be significant changes to the work units over the next few weeks, over and above doubling the size of the WUs.

The debate now should be around either suspending new work or carrying on as normal and hope that they can handle the onslaught of uploads?

LeBo
09-13-2006, 02:38 PM
Guys I think I will stop crunching BRS for a few days until Bryan can get his servers fixed. The big increase in membership was just a little more than he expected.

Lagu
09-13-2006, 04:36 PM
Strongbow wrote

Ouch, I have several thousand uploads pending across multiple systems and they're rapidly crunching many more!

The joys of beta projects! ...I'm sure it was actually really useful that the Aussie assault helped to over stress their environment as it highlighted these issues now rather than later on in the project and so it really shows the limitations of the configurations. I suspect there will be significant changes to the work units over the next few weeks, over and above doubling the size of the WUs.

The debate now should be around either suspending new work or carrying on as normal and hope that they can handle the onslaught of uploads?

Nice avatar blackheath!
Lagu :)

Strongbow
09-13-2006, 04:50 PM
Lagu wrote

Nice avatar blackheath!
Lagu :)

:D

Ototero still has by far the best avatar! :shock: :cool: :lol:

gatekeeper53
09-13-2006, 05:41 PM
Black yours is nice but I couldn't agree with you more.

Strongbow
09-13-2006, 08:48 PM
Where are we and where are the Aussies???

http://boinc.rieselsieve.com/orig/top_teams.php

:?

Nflight
09-13-2006, 08:54 PM
Place #153 and Place #155 respectively.

I would say there is a varmint in the gear box ;) :!:

gatekeeper53
09-13-2006, 09:26 PM
This message started popping up a little while ago:

9/13/2006 4:24:06 PM|Riesel Sieve Project|Error on file upload: Project uploads temporarily disabled for load balancing. Uploads will resume around 22:00 UTC.

Lagu
09-13-2006, 11:01 PM
2006-09-14 00:46:43||Using earliest-deadline-first scheduling because computer is overcommitted.
2006-09-14 00:47:25||Rescheduling CPU: project resumed by user
2006-09-14 00:47:39||Resuming network activity
2006-09-14 00:47:39|Riesel Sieve Project|Started upload of file riesel-sieve_206464_1_fact.out
2006-09-14 00:47:39|Riesel Sieve Project|Started upload of file riesel-sieve_206464_1_excl.out
2006-09-14 00:47:41|Riesel Sieve Project|Sending scheduler request to http://boinc.rieselsieve.com/cgi/index.cgi
2006-09-14 00:47:41|Riesel Sieve Project|Reason: Requested by user
2006-09-14 00:47:41|Riesel Sieve Project|(not requesting new work or reporting completed tasks)
2006-09-14 00:47:45|Riesel Sieve Project|Scheduler request succeeded
2006-09-14 00:47:45|Riesel Sieve Project|No start tag in scheduler reply
2006-09-14 00:47:45|Riesel Sieve Project|Can't parse scheduler reply
2006-09-14 00:47:45|Riesel Sieve Project|Deferring scheduler requests for 30 minutes and 48 seconds

It seems to be hopeless. I ran out of work yesterday, and my Athlon 1001 has 1 task left to crunch; then it will stop. I have 250 WUs in total to upload and work to download. I run Einstein on my AMD 64 but every work unit takes 9 hours.

Lagu

Nflight
09-13-2006, 11:11 PM
The 22:00 UTC time passed one hour and 10 minutes ago. Has anyone seen movement?

Lagu
09-13-2006, 11:40 PM
My time is GMT and it is difficult to know when 22.00 UTC is.
A question I have:

Say a WU must be done at the latest by 10:30:25 on Sept 14, and because of the issue I cannot upload it until Sept 15. Will this WU be too late, and will I lose points?

Lagu :?
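For anyone else unsure when 22:00 UTC falls locally, here is a small sketch of the conversion using fixed offsets. The EDT offset of UTC-4 matches the "6:45EDT / 10:45 UTC" pairing Bryan gives in a later post; the UTC+2 reader offset is purely a hypothetical example.

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4), "EDT")      # consistent with Bryan's updates
LOCAL = timezone(timedelta(hours=2), "UTC+2")   # hypothetical reader offset

# The announced resume time for uploads.
resume_utc = datetime(2006, 9, 13, 22, 0, tzinfo=timezone.utc)

for tz in (timezone.utc, EDT, LOCAL):
    print(resume_utc.astimezone(tz).strftime("%Y-%m-%d %H:%M %Z"))
```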

gatekeeper53
09-13-2006, 11:53 PM
Been almost 2 hours and not even a twitch. These things hitting the server so often is making my connection seem like the old dial-up days. I wonder if he could up the multiplier on how long the WUs back off after a failure to send, so they only try about every 10 hours or so? Is there any way to ask? I can't even get to the site anymore.

dAVE
09-14-2006, 10:05 AM
Lagu, Bryan has turned off the regenerator so your WUs won’t be sent out again when they pass the deadline. This is so that everyone gets their credits no matter how long it takes to send them in. (Edit: Bryan has now said that he intends to keep the regenerator turned off for only "2 or 3 days" until things smooth out.)
I see that the project is not sending out any more work until they sort out the backlog, and he asks us not to stay connected and hammer his servers if possible. I’ve gone over to Einstein and just allow BOINC to connect twice a day to report results to Einstein and get new work from them. I wish BOINC had a feature to allow me to suspend all the RS uploads. dAVE

bryanRS
09-14-2006, 10:59 AM
dAVE - don't worry, it will be 2 or 3 days after the backlog is cleared, so I figure not until Monday or Tuesday of next week.

Some general (related) info:

OK everyone, here's your 6:45EDT / 10:45 UTC update:

WU file uploads trickled in overnight, and I'm pleased with how things went. The server seemed to stabilize due to the apache controls I put into place. I've made some more changes, and basically, things are starting to catch up much faster. We had about 4,000 files sent in from 04:00-08:00 UTC, and about 5,000 files from 08:00-08:45. Since I'm heading off to work, I'm going to leave the scheduler turned off still, as I won't be here to watch it. I have upped the apache MaxClients to 35, from 20, so that should also help the catchup pace.

I haven't turned on the scheduler yet, although I do have a script set to update the configuration around 15:00 UTC to open up the scheduler again - my hope is that this will work. If the scheduler is still disabled, I'll manually turn it on when I get home.

As long as things are still running when I get home (22:00 UTC), I'll turn on the validator, sit back, and watch things again. We're much closer, and things are stable.
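To put the upload figures in Bryan's update on a common footing, this short calculation converts both windows to files per hour. The counts and time windows are taken from his post; the helper name is just for illustration.

```python
def files_per_hour(files: int, minutes: float) -> float:
    """Average upload rate over a window, in files per hour."""
    return files / (minutes / 60)


overnight = files_per_hour(4000, 4 * 60)  # 04:00-08:00 UTC
recent = files_per_hour(5000, 45)         # 08:00-08:45 UTC

print(f"overnight: {overnight:.0f} files/h")
print(f"recent:    {recent:.0f} files/h")
print(f"speed-up:  {recent / overnight:.1f}x")
```

The jump in rate is what "starting to catch up much faster" amounts to in numbers.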

Empty_5oul
09-14-2006, 11:39 AM
thanks for the update bryan.

i see some WU are finally getting through as the retry status hits 0. :D

Strongbow
09-14-2006, 01:34 PM
Thanks Bryan,

The uploads are working fine!

I know it's sad but I'm looking forward to dumping all those WUs to see how much of a points jump I get!

bryanRS
09-14-2006, 10:06 PM
blackheath - it's not sad, I'm quite interested myself, as I have no clue how much work is really stuck out there - my stats/monitoring scripts have all been off to save load.

--------Latest Update--------
OK - my script kind of worked - I set it to check the DB server first, and since the DB server was not communicating with the BOINC server properly, it stopped the changeover. I've fixed the link between the two and they are now communicating properly.

As of 22:00 UTC, WU reservation & reporting reenabled, it seems that over the course of the day everything was uploaded as needed.

We'll see how things go from here....
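The behaviour Bryan describes, where the script checks the DB server first and stops the changeover if the check fails, follows a simple guard pattern. The sketch below only illustrates that pattern with stand-in callables; the function and parameter names are hypothetical and it is not his actual script.

```python
from typing import Callable


def reenable_scheduler(db_ok: Callable[[], bool],
                       enable: Callable[[], None]) -> bool:
    """Turn the scheduler back on only if the database check passes."""
    if not db_ok():
        return False  # abort the changeover and leave the scheduler off
    enable()
    return True


# Stand-in callables; no real servers are contacted.
log: list[str] = []
print(reenable_scheduler(lambda: False, lambda: log.append("enabled")), log)
print(reenable_scheduler(lambda: True, lambda: log.append("enabled")), log)
```

Failing closed like this meant the scheduler stayed off until the link between the two servers was fixed, rather than coming up against a database it could not reach.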

Lagu
09-15-2006, 12:19 AM
At last I could both upload my work and receive new work, and my score has gone up a lot.

Thanks Bryan for your hard work. Now I have forbidden any network contact this week.

Lagu :D

Lagu
09-15-2006, 12:27 AM
And we are number ONE as of Sept. 15 at 02:25 GMT.

Well done all participants in RS.

Lagu ;)

Strongbow
09-15-2006, 08:22 AM
Ouch! :cry: :cry:

http://img1.putfile.com/thumb/9/25704181411.jpg (http://www.putfile.com/pic.php?img=3399097)

vaughan
09-15-2006, 11:29 AM
Bryan: Thanks for all your hard work and for keeping us informed. Communication is essential to keep the crunchers happy so well done. Thanks also from all of us at AMD Users to B2 and anyone else working behind the scenes. In fact we don't know who all the coders and contributors are.

I'm sure the last few days have been a very useful learning experience for all involved.

Would the new server have helped the situation or just postponed it? How are the donations going for the new hardware? Do you need more?

Lagu: Try this link for a UTC clock (http://www.time.gov/timezone.cgi?UTC/s/0/java).

Strongbow
09-15-2006, 03:33 PM
vaughan wrote

Bryan: Thanks for all your hard work and for keeping us informed. Communication is essential to keep the crunchers happy so well done. Thanks also from all of us at AMD Users to B2 and anyone else working behind the scenes. In fact we don't know who all the coders and contributors are.


I have to 2nd that, well done Bryan and have you completed your house move yet? ...much beer needed after this week's effort I'm sure!!! :shock: :D

LeBo
09-15-2006, 04:14 PM
I can't get "ready to report" to go up or download WUs to work on. Anyone else having that problem?

vaughan
09-15-2006, 04:19 PM
yes

dAVE
09-15-2006, 04:21 PM
LeBo, I’ve just hit update and the WUs were reported OK. dAVE :?

dAVE
09-15-2006, 04:32 PM
Try setting RS to get no work then hit update so that only a request to report results goes in. If Bryan is throttling the downloads of WUs then that may work, dAVE.

LeBo
09-15-2006, 05:26 PM
It finally went to work.

Steve Lux
10-03-2006, 01:52 AM
Looks like they are down again. I have a whole page full of work to upload and the web site seems to be down.

Starting Einstein back up again.

vaughan
10-03-2006, 02:40 AM
Yes, they are re-locating the file servers so will be on and off the air for a day or so. I'll post here when things are back on-line.

Even my (Lisa's :oops: ) XPS has run out of work and it had lots of cached tasks. Got it playing Chess960 and then that project says "no work". Trying its hand at Tanpaku now. 8)

Lagu
10-05-2006, 09:25 AM
Good Vaughan!

I'm running Tanpaku on my AMD64, and my Athlon 1001 is so slow. Every work unit takes over 2 hours (Riesel Sieve) and I still have many WUs to crunch.

Evil-Dragon
10-05-2006, 09:30 AM
It says it should be up by 19:00 EDT later today. There is a workaround on the forums for reporting/getting new workunits.

dAVE
10-05-2006, 10:06 AM
Evil-Dragon wrote

It says it should be up by 19:00 EDT later today. There is a workaround on the forums for reporting/getting new workunits.

Could you post the workaround here because I cannot get anything from RS, even through Google. dAVE :(

DMMc
10-05-2006, 10:43 AM
dAVE wrote

Could you post the workaround here because I cannot get anything from RS, even through Google. dAVE :(

Same here...Have a TON of WUs and nowhere to go with them:confused:

Evil-Dragon
10-05-2006, 11:00 AM
In BOINC Manager, click Advanced, Options. Click on the HTTP Proxy tab, tick "connect via HTTP Proxy server" and enter: http://boinc.rieselsieve.com (http://boincf.rieselsieve.com) and enter 8000 as the port number.

Retry your file transfers, get them all sent back and at the same time get your BOINC client to request more work. Afterwards make sure you untick "connect via HTTP Proxy server" otherwise none of your other projects will be able to connect to request more work or report finished WU's.

RieselSieve's DNS should change some time today so i doubt you'll need to do this for much longer.
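For readers wondering what the proxy setting actually does: it routes the client's HTTP traffic through the given host and port instead of resolving the project's address directly. The sketch below shows the same idea with Python's standard library; it is only an illustration of proxy routing, not how BOINC Manager is implemented, and it builds the opener without sending any request. The host and port are the ones from the post above.

```python
import urllib.request

# Host and port as given in the workaround (without the stray "f").
PROXY = "http://boinc.rieselsieve.com:8000"


def opener_via_proxy(proxy_url: str = PROXY) -> urllib.request.OpenerDirector:
    """Build an opener whose plain-HTTP requests go through `proxy_url`."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


opener = opener_via_proxy()
print(type(opener).__name__)
```

Because every HTTP request from the client goes through the proxy while the box is ticked, other projects cannot be reached until it is unticked again.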

DMMc
10-05-2006, 11:22 AM
Nope....:confused: Now getting http error SIGH....:(

dAVE
10-05-2006, 11:27 AM
DMMc Take out the "f" after "boinc" in the address. I could then upload finished WUs but not download new work. :sigh: dAVE

Evil-Dragon
10-05-2006, 11:48 AM
Sorry about that, never noticed that "f" crept into there.

DMMc
10-05-2006, 03:07 PM
That was the ticket....Thanks:)

Murray
10-20-2006, 09:52 AM
It looks like Riesel is down again...

Evil-Dragon
10-20-2006, 12:55 PM
Yup down for me too. I have no WU's left to crunch.

Crunching proteins@home instead on both machines.

dAVE
10-20-2006, 01:41 PM
Evil-Dragon wrote


Crunching proteins@home instead on both machines.

Bang goes my first place :qright1:
Oh well at least we will at last get above Team USA :icon_salut:
dAVE :icon_cry:

Evil-Dragon
10-20-2006, 03:04 PM
Depends on how fast i can crunch. You might still keep first place because i hear on the grapevine that NanoHive@Home is going to have a lot of WUs next week. So i might be crunching those instead :)

Lagu
10-20-2006, 03:24 PM
Why not Tanpaku for a while and help us move to the first place?

Evil-Dragon
10-20-2006, 03:56 PM
I'll run both and keep both sides of the fence happy :)

Lagu
10-20-2006, 04:06 PM
Thank you for keeping an eye on the projects:)

Steve Lux
10-20-2006, 04:17 PM
Earlier this week, for 2 days, I had tried to set mine to 10 days and couldn't get any work to do at all. I was told to not set it over 7 days. At present my system is set to .5

If they aren't back up by the time I get home from work I'll be turning Einstein back on.

bryanRS
10-20-2006, 08:40 PM
From our forums...Basically, one problem got combined with about 3 other unfortunately-timed problems...Sorry about the troubles!



Outage from about 22:00 EDT (02:00 UTC) last night ran until about 15:50 EDT (19:50 UTC) this afternoon. Let's go over what I have now as to why:

Some storms moved through the Cincinnati area last night, the connection to the server went out around 21:30-22:00. Service came back up at about 23:00 EDT. However, there was not access to the server itself - no known cause.

When Lee arrived home at about 01:00, he rebooted the servers, things seemed to be running. By about 02:00, things had gone out again.

At 10:00 EDT this morning, I tried getting to the servers, but everything was hung up. I contacted Lee to get things running.

At about 11:30 EDT, a major fiber line was cut in Central Ohio, and I lost my ability to access the internet.

At about 12:00, Lee restarted the web server again and brought the website back on line. However, the DB server had blocked the server because of too many errors.

At about 15:45, I contacted AT&T and found out about the fiber cut (I had talked to them earlier about my connection problems, they had no known cause at the time, but filed a trouble ticket). They informed me how to use (brace yourselves if you're not sitting down) dial-up to access the 'net, as there was no ETA for the repair.

I restored DB access for the server, changed some load settings for apache, and brought things back on line. As of the time of this post (16:00/20:00 UTC), RS is back to normal. Sorry about the much longer delay than was anticipated - but someone other than me caused it ;)
_________________
Bryan
Stats Administrator
BOINC Developer
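The outage window in Bryan's summary can be checked with a little date arithmetic. The times and the EDT/UTC pairing come from his post; the calendar dates are an assumption based on the post being dated 10-20-2006, with "last night" taken as Oct 19.

```python
from datetime import datetime, timedelta, timezone

EDT = timezone(timedelta(hours=-4), "EDT")

# Assumed dates: outage began the evening of Oct 19 and ended Oct 20.
start = datetime(2006, 10, 19, 22, 0, tzinfo=EDT)
end = datetime(2006, 10, 20, 15, 50, tzinfo=EDT)

outage = end - start
print("outage length:", outage)
print("start UTC:", start.astimezone(timezone.utc).strftime("%H:%M"))
print("end UTC:  ", end.astimezone(timezone.utc).strftime("%H:%M"))
```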

PoorBoy
10-20-2006, 09:05 PM
Anyone know how to contact somebody from Riesel Sieve by E-Mail? I can't make a Post at the Project because it keeps rejecting either my User name and/or my password when I know they're correct...???

Errrgg ... Never mind, I didn't realize I had to Register to Post in the Forums, I thought If I was running the project I could automatically just Post. I registered and can now Post ...

bryanRS
10-20-2006, 09:48 PM
PoorBoy,

Send an email to bryan (at) rieselsieve dot com that has your forum username & your email you signed up with in it, I'll reset the account and send you a new password.

PoorBoy
10-20-2006, 09:52 PM
bryanRS wrote

PoorBoy,

Send an email to bryan (at) rieselsieve dot com that has your forum username & your email you signed up with in it, I'll reset the account and send you a new password.

I finally got in Bryan after a trying time with the Code, see my 1'st post in the Suggestions part of the forum, it's a doozy & the right way to get off on a good foot I figure ... hahaha

vaughan
01-03-2007, 10:18 PM
I'm unable to reach the BOINC Riesel Sieve server today. Anyone else? :(

dAVE
01-03-2007, 10:46 PM
Me too Vaughan, also Einstein has been down for a couple of days and now Proteins has a lot of dodgy WUs that won't run. At this rate I will soon have to go and chase little green men on Seti! :sad5: dAVE

Brucifer
01-03-2007, 10:48 PM
ahhhhhhhhh you don't wanna do that.......... they'll just send you another email and ask you for more money. :icon_twisted:

Evil-Dragon
01-03-2007, 11:04 PM
I'm sure i saw somewhere in their forums about a server changeover. This might be what's causing the downtime. Don't hold me to that though.

vaughan
01-03-2007, 11:09 PM
Yeah I too recall something about some server hardware changes now you mention it. I think it was a new router or something to do with the network.

Dave I've also noticed a bunch of VTU tasks that like to process far in excess of the nominal 45 minutes. I aborted some yesterday that had run for 7 hours; gee I hate wasting CPU cycles on dud work units. :(

bryanRS
01-04-2007, 01:22 AM
There's some kind of external network problem - I wish it was our changeover to the new router/switch & webserver, but it's not. I can't do anything until B2 gets home at about 23:30 ET, as it could be something like a power outage that's knocked things out. I'll try and post when things are back up. Thanks for patience :)

bryanRS
01-04-2007, 04:57 AM
I didn't know, but our router went up in smoke at about 12:30 ET yesterday (Weds.). Lee put in the WRAP box running m0n0wall that will serve as our new (and much improved) router, but things weren't working right. Service was restored at about 00:30 ET, but a massive load spike occurred (kind of expected). It's working its way through as best as possible right now. Expect load restrictions (and building Pending Credit as a result) for several hours.