MiniRosetta problems

**mitchellds** · 04-30-2009, 12:26 AM

Looks like there are some problems at the moment: I'm having a lot of WUs die. Other forum users are having the same problems. Right before our race to...

**liuqyn** · 04-30-2009, 12:36 AM

Same here, and when they crash they seem to also take GPUGrid WUs with them (on my boxes that run both).

**Nflight** · 04-30-2009, 12:40 AM

Just as you have mentioned, I am starting to receive Ralph work units. A horde of them came today to clog my computers' workload. I have had the Ralph project open to receive work for weeks and got nothing, then today they hit with force. They are long work units, but I have not had any fail like you're suggesting.

More comments are needed to figure out if your race is on or extended a week to work out the kinks and hurdles that lie ahead of you all in your desire to RACE for the great equipment. Good Luck Everyone!

**mitchellds** · 04-30-2009, 12:41 AM

Yes, I'm seeing similar problems with GPUGrid WUs.

**AMDave** · 04-30-2009, 01:04 AM

That happened to me last night too.
Now that I check the Ralph tasks for the machine the GPU is on, I see that the Ralph task succeeded, but it coincides with the time the GPUGrid task failed.
I stopped Ralph on that machine for the moment.

Unless it is a coincidence and we have a batch of dodgy GPUGrid WUs at the same time as the Ralph release.
I don't have the debug log for the GPUGrid WU, so I can't correlate the problem with the reports here: http://ralph.bakerlab.org/forum_thread.php?id=446
It would take some concerted effort to prove this.

The best I can give is:

<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9600 GT"
# Clock rate: 1600000 kilohertz
# Total amount of global memory: 536150016 bytes
# Number of multiprocessors: 8
# Number of cores: 64
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# Device 0: "GeForce 9600 GT"
# Clock rate: 1600000 kilohertz
# Total amount of global memory: 536150016 bytes
# Number of multiprocessors: 8
# Number of cores: 64
# Using CUDA device 0
# Device 0: "GeForce 9600 GT"
# Clock rate: 1600000 kilohertz
# Total amount of global memory: 536150016 bytes
# Number of multiprocessors: 8
# Number of cores: 64
Cuda error: Kernel [fft_data_swizzle_out] failed in file 'CPME_cufft.cu' in line 94 : unspecified launch failure.

</stderr_txt>

How the Ralph WU could cause that is not clear, but the coincidence leaves me wondering.
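
For anyone who wants to check the coincidence on their own boxes, here is a minimal sketch of how the comparison could be done. It assumes you have copied the Ralph task finish times and the GPUGrid failure times out of your task lists by hand; the timestamps, the `coinciding` helper and the 5-minute window are all made up for illustration and are not taken from any project log.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format, adjust to your own notes


def parse(ts):
    """Turn a timestamp string into a datetime object."""
    return datetime.strptime(ts, TIME_FORMAT)


def coinciding(ralph_end_times, gpugrid_fail_times, window_minutes=5):
    """Return (ralph_end, gpugrid_fail) pairs that fall within the window.

    The window size is a guess; a match only hints at a link and
    does not prove that one task caused the other to fail.
    """
    window = timedelta(minutes=window_minutes)
    pairs = []
    for fail in gpugrid_fail_times:
        for end in ralph_end_times:
            if abs(parse(fail) - parse(end)) <= window:
                pairs.append((end, fail))
    return pairs


# Hypothetical example data, not real task times.
ralph = ["2009-04-29 22:14:05", "2009-04-29 23:40:00"]
gpugrid = ["2009-04-29 22:15:30"]

matches = coinciding(ralph, gpugrid)
print(matches)
```

If several of us ran something like this and kept seeing matches, that would be a stronger hint than my single case, though it still would not explain the mechanism.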

**NeoGen** · 04-30-2009, 09:10 AM

Originally Posted by Nflight

More comments are needed to figure out if your race is on or extended a week to work out the kinks and hurdles that lie ahead of you all in your desire to RACE for the great equipment.

You're right Nflight.
I'm disappointed that Rosetta@Home has problems like this when they have a side project specifically created for beta testing.
If there is no solution I'll have to propose another project for the first race, and maybe delay the contest start one more week.

**NeoGen** · 05-01-2009, 06:23 PM

Guys, are the problems still around, or have they fixed it?

If they're still around I'll have to change the initial project I guess...
