Arrgh...



SB2
04-19-2004, 02:59 AM
from distributed.net

:: 18-Apr-2004 15:40 CDT (Sunday) ::
Ooops. Looks like we forgot to restore an incremental backup from just before
the crash. In order to restore that backup and have it take effect, all changes
participants have made since fritz went online will unfortunately be lost. We
will also have to re-run all stats for the past two months or so, which will
take a few days.

Sorry for the inconvenience.

Bionic_Redneck
04-19-2004, 08:10 AM
I was wondering what was up with the stats, and now I know.

em99010pepe
04-19-2004, 09:02 AM
I lost all my work. I have here 500 RC-52 blocks to send but I don't know what I should do.

Bionic_Redneck
04-19-2004, 09:32 AM
You just have to wait until the server comes back online.

SB2
04-19-2004, 10:35 AM
A quick follow-up:

:: 18-Apr-2004 22:28 CDT (Sunday) ::
The restore is done and fritz is now churning through log files. I'm about to
turn apache back on. Something to keep in mind is that many user changes take
effect on the log date that the change was made. So if you retired an account
or joined a team on March 3rd, you won't see the change take effect until the
daily stats run for March 3rd.

Looks like the database is up to Feb. 11th so far. Perhaps by tomorrow I'll be able to join the team. :)

NeoGen
04-19-2004, 06:32 PM
I've just seen the stats on distributed.net. :shock:
But from what I understood from the posts previous to mine, it will be updated, right?

em99010pepe
04-19-2004, 06:37 PM
I hope so.

SB2
04-19-2004, 11:13 PM
The stats are now up to Feb 21st, so one can hope all is not lost, yet.

SB2
04-23-2004, 11:23 AM
Statistics are now current (http://stats.distributed.net/team/tmsummary.php?project_id=8&team=940186047).

Also finally received my password and should (fingers crossed) show in tonight's stats run. :)

Ototero
11-18-2005, 02:21 PM
More AAAARRRRRHHHHHHHHH


Thanks to poor driver support, we had been running for who knows how long with
3 failing drives in the raid10 array that housed the database. But that wasn't
actually what caused the outage... if a machine with an 8500 in it goes down
unexpectedly (think power failure), the controller can't trust the data on the
drives to be in-sync, so it needs to rebuild the array. Unfortunately, one
of the drives it picked to be authoritative was failing, and decided that it
wasn't going to give up its data.

Unfortunately we've been unable to recover the array. We tried using spinrite
as a last resort, but at the rate it was going it would have taken something
like a week to recover the drive. This means that when we get back online,
we'll be running from a stats backup taken Nov. 6, about 4 days before the
failure. Any changes made to participant accounts or teams in the meantime will
have been lost.

In an ironic twist of fate, we've been working on getting a new machine in
production that would have allowed replicating user-modifiable tables (i.e.,
participant accounts and teams) to another machine. Had that been in place we
would have lost very little, if any, of this data.

The current situation is that we've bought 3 new drives and used them to
rebuild the array. We've also taken this opportunity to upgrade to FreeBSD 6.0.
But now any time we try to access the array, the machine reboots.

Once someone is on-site to investigate we'll hopefully know more.


What a waste of my crunching. I'm off to Folding@Home until the cows come home ;)

Brucifer
11-18-2005, 06:02 PM
Well as much as this all sux, I guess it's really about the first big foobar for distributed.net. But still..... will just have to see how things pan out the next few days. If much is lost, I would imagine that it will impact the number of people participating. :(

Sometimes, things just seem to get a little bit discouraging in this crunching game...

Brucifer
11-19-2005, 08:00 AM
Well crapolina, here it is, the wee hours of the morning here, 1:00am PST, and the dnet stats are back down again due to another hardware failure according to their web site. Not good. :(

They musta hired some boinc people or something........

NeoGen
11-28-2005, 05:10 PM
DNet stats back up again!

They seem to be a few days behind yet, but catching up quickly.

Brucifer
05-22-2006, 11:33 PM
Well it looks like things are rocking along for dnet again. :) I've got a little effort going on OGR-P2. A little break from the monotony. :)

Might even have to do a minor run on this project for a little bit, sorta see if it brings forth anymore life around this team.... heh heh he says as he rabidly digs through his closet for his "stomping boots." bwahahahaha :twisted: