Thread: MiniRosetta problems

  1. #1
    Join Date
    Feb 2006
    Location
    Virginia, USA
    Posts
    969

    Looks like there are some problems at the moment: I'm having a lot of WUs die. Other forum users are reporting the same problems. Right before our race to...

  2. #2
    Join Date
    Jun 2007
    Location
    Mid-Michigan
    Posts
    756
    Same here, and when they crash they also seem to take GPUGrid WUs with them (on my boxes that run both).


  3. #3
    Join Date
    Nov 2005
    Location
    Central Pennsylvania
    Posts
    4,333

    Just as you mentioned, I am starting to receive Ralph work units. A horde of them came today to clog my computers' workload. I have had the Ralph project open to receive work for weeks with nothing, and then today they hit with force. They are long work units, but I have not had any fail like you're suggesting.

    More comments are needed to figure out if your race is on or extended a week to work out the kinks and hurdles that lie ahead of you all in your desire to RACE for the great equipment. Good Luck Everyone!





    Challenge me, or correct me, but don't ask me to die quietly.

    …Pursuit is always hard, capturing is really not the focus, it’s the hunt ...

  4. #4
    Join Date
    Feb 2006
    Location
    Virginia, USA
    Posts
    969
    Yes, I'm seeing similar problems with GPUGrid WUs.

  5. #5
    AMDave (Seeker of the exit clause, Moderator)
    Site Admin
    Join Date
    Jun 2004
    Location
    Deep in a while loop
    Posts
    9,612
    That happened to me last night too.
    Now that I check the Ralph tasks for the machine the GPU is on, I see that the Ralph task succeeded, but it coincides with the time the GPUGrid task failed.
    I have stopped Ralph on that machine for the moment.

    Unless it is a coincidence, and we just have a batch of dodgy GPUGrid WUs at the same time as the Ralph release.
    I don't have the debug log for the GPUGrid WU, so I can't correlate the problem with the one discussed here: http://ralph.bakerlab.org/forum_thread.php?id=446
    It would take some concerted effort to prove this.

    The best I can give is:
    <stderr_txt>
    # Using CUDA device 0
    # Device 0: "GeForce 9600 GT"
    # Clock rate: 1600000 kilohertz
    # Total amount of global memory: 536150016 bytes
    # Number of multiprocessors: 8
    # Number of cores: 64
    MDIO ERROR: cannot open file "restart.coor"
    # Using CUDA device 0
    # Device 0: "GeForce 9600 GT"
    # Clock rate: 1600000 kilohertz
    # Total amount of global memory: 536150016 bytes
    # Number of multiprocessors: 8
    # Number of cores: 64
    # Using CUDA device 0
    # Device 0: "GeForce 9600 GT"
    # Clock rate: 1600000 kilohertz
    # Total amount of global memory: 536150016 bytes
    # Number of multiprocessors: 8
    # Number of cores: 64
    Cuda error: Kernel [fft_data_swizzle_out] failed in file 'CPME_cufft.cu' in line 94 : unspecified launch failure.

    </stderr_txt>
    How the Ralph WU could cause that is not clear, but the coincidence leaves me wondering.
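
A minimal sketch of how a stderr dump like the one above could be triaged before comparing it against another project's task times. The function names `extract_errors` and `count_starts` are hypothetical, the two error prefixes are simply the ones that appear in the log above, and treating each "# Using CUDA device" line as one fresh start of the application is an assumption, not documented GPUGrid behaviour.

```python
import re

# Error prefixes taken from the log above; other prefixes may exist (assumption).
ERROR_PATTERN = re.compile(r"^\s*(MDIO ERROR|Cuda error):\s*(.*)$", re.IGNORECASE)


def extract_errors(stderr_text):
    """Return a list of (kind, message) tuples for each error line found."""
    errors = []
    for line in stderr_text.splitlines():
        match = ERROR_PATTERN.match(line)
        if match:
            errors.append((match.group(1), match.group(2).strip()))
    return errors


def count_starts(stderr_text):
    """Count device-banner lines, assuming one banner per application start."""
    return sum(
        1
        for line in stderr_text.splitlines()
        if line.strip().startswith("# Using CUDA device")
    )
```

Run against the log above, this would pick out the `restart.coor` open failure and the final kernel launch failure, and count three device banners. That is consistent with the task starting more than once before it died, but it does not by itself show what interrupted it.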
    Last edited by AMDave; 04-30-2009 at 01:15 AM.
    . . . . . ___
    . . . . . . .\___/\______
    . . . . . . . \__AMD___\\__
    ---------------------------------------------

  6. #6
    NeoGen (AMD Users Alchemist, Moderator)
    Site Admin
    Join Date
    Oct 2003
    Location
    North Little Rock, AR (USA)
    Posts
    8,451
    Originally posted by Nflight:
    More comments are needed to figure out if your race is on or extended a week to work out the kinks and hurdles that lie ahead of you all in your desire to RACE for the great equipment.
    You're right, Nflight.
    I'm disappointed that Rosetta@Home has problems like this when they have a side project created specifically for beta testing.
    If there is no solution, I'll have to propose another project for the first race, and maybe delay the contest start by one more week.

  7. #7
    NeoGen (AMD Users Alchemist, Moderator)
    Site Admin
    Join Date
    Oct 2003
    Location
    North Little Rock, AR (USA)
    Posts
    8,451
    Guys, are the problems still around, or have they been fixed?

    If they're still around, I'll have to change the initial project, I guess...
