
Thread: LHC probs

  1. #1
    Join Date
    Jul 2004
    Location
    Sussex, UK
    Posts
    3,734

    LHC probs

    Just checking our stats and stuff; on the LHC home page I came across the following:

    Server Status

    Up,
    Warning: Too many connections in /shift/lxfsrk429/data01/boinc/projects/lhcathome/html/inc/db_ops.inc on line 11

    Warning: MySQL Connection Failed: Too many connections in /shift/lxfsrk429/data01/boinc/projects/lhcathome/html/inc/db_ops.inc on line 11
    Unable to connect to database - please try again laterToo many connections

    2.10.2004 15:00 UTC
    LHC@Home database is having bad performance. We should get another server soon and get things running better.
    Seems a shame. Too many connections, meaning they have too many users?

  2. #2
    Join Date
    Jul 2003
    Location
    Sydney, Australia
    Posts
    5,662
    I couldn't get it to connect; tried several times this arvo and gave up in disgust. Bloody BOINC, fri%&in useless system!

  3. #3
    AMDave is offline Seeker of the exit clause Moderator
    Site Admin
    Join Date
    Jun 2004
    Location
    Deep in a while loop
    Posts
    9,658

    LHC Web Loading

    Yup.

    I saw that too a couple of hours ago when my client was not getting any WUs. Then about 2 hours ago it resolved and the main page data came up. It said there were just over 19,000 tasks available, but I still could not get any of them. Now we both get the same message again.

    I expect that because the tasks are so short to process, the task server is getting overloaded with requests for new WUs.
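    As a rough back-of-envelope sketch of why short tasks hurt: if every CPU asks for one new WU each time it finishes one, the request rate is simply CPUs divided by WU duration. The CPU count below is a made-up illustrative figure, not a measured project number, and `requests_per_hour` is a hypothetical helper name; the two durations echo the 45-minute and 6-hour LHC run times mentioned elsewhere in this thread.

```python
def requests_per_hour(active_cpus: int, hours_per_wu: float) -> float:
    """Average new-WU requests per hour, assuming each CPU requests
    exactly one new WU every time it finishes its current one."""
    if hours_per_wu <= 0:
        raise ValueError("hours_per_wu must be positive")
    return active_cpus / hours_per_wu


# Hypothetical CPU count, for illustration only.
cpus = 10_000

short = requests_per_hour(cpus, 0.75)  # 45-minute WUs
long_ = requests_per_hour(cpus, 6.0)   # 6-hour WUs

print(f"45-min WUs: {short:.0f} requests/hour")
print(f"6-hour WUs: {long_:.0f} requests/hour")
```

    With the same number of CPUs, dropping from 6-hour to 45-minute WUs multiplies the request rate by eight, which is the kind of jump that could plausibly exhaust a database connection limit.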

    Interesting thing is that when the user increase happened from 2000 to 5000 users there were over 94,000 tasks available. I think they are going to run out of WUs before they get the new server in place.

    Predictor and Seti both went through the same issues, but at least LHC have mitigated the issue by limiting the user base so they can expand in a controlled way. I think they may have underestimated the number of CPUs those new 3000 users would put on the project.

    You can be sure that although some of us are getting error messages it is because some others are getting the WUs that are available. It may not seem very fair from a Team point of view, but for the project at least the work is still getting done.

    Either way, the demand is now clearly greater than the supply, and they will have to increase the power of the splitter to feed the WU demand if they want to keep the user base happy and involved (and you can be VERY sure that they do).

    As DMMc said in another thread... CERN don't hang about mulling things over for long. We should see something happen pretty promptly and powerfully, unlike the gradual recovery that Berkeley had to go through with Seti.

    At least with CPDN the WUs take a long time to crunch, so the demand peaks are more varied. It will be interesting to see what happens, though, when their user base finishes the next couple of WU sets and starts hitting their server for WUs. They have a popular base too.

    In the meantime I have thrown a CPU back onto GRID. For some reason the SoB client on that machine is not registering when it sends work back. I'll figure that out later. GRID is one of my old favourites and currently our lowest placing project. Feel free to give it a spin if you have a CPU for it. (Windows GUI only, WUs vary from 4 hours to 36 hours.)

    Post Script:
    I must say that I still find it odd that the IT pros in these projects would not have calculated the required production rates and bandwidth demands at the servers well enough to avoid these issues. There is sufficient data available on the existing projects in the world to calculate these things very precisely. Perhaps they did, and were only able to deploy equipment within restricted budget limits, and then had to demonstrate the issue to management when it occurred, like an "I told you so".

    Q - When's the last time you heard "whatever you need to get it done" between Mgmt and IT?
    A - Right after it all F***d up. That's when.

    (Please s'cuse the "F". This happened to me this week. So I know.)

  4. #4
    Join Date
    May 2004
    Location
    Kent, UK
    Posts
    3,511
    Trouble is... lots of DC projects are moving to BOINC.

    If we decide to ignore BOINC, what's left to concentrate on?

  5. #5
    AMDave is offline Seeker of the exit clause Moderator
    Site Admin
    Join Date
    Jun 2004
    Location
    Deep in a while loop
    Posts
    9,658
    Stuart.

    I didn't say "Ignore BOINC" !!!!
    I would never say that.

    The philosophy behind BOINC is great.
    Especially from a DC team point of view.

    I only put my CPU onto GRID until I can actually GET some LHC WUs.

    I hope that my post did not come across like that.

    I suppose that in my mind I was thinking of all the previous posts in which we have discussed BOINC's development.

    In that way I may have omitted to say what I was thinking, given that I have already said it several times.

    BOINC is coming along in steps.
    It is a very powerful medium for scientific (and otherwise) projects.
    Conversion to BOINC is not yet a well-mapped process.
    Problems will happen.
    DC Teams and members must try to be patient with them.
    They will come good.

    I did not mean to give that "other" impression at all.

  6. #6
    AMDave is offline Seeker of the exit clause Moderator
    Site Admin
    Join Date
    Jun 2004
    Location
    Deep in a while loop
    Posts
    9,658

    my posts

    Oh. I think you meant vaughan's post.
    I was still typing mine when he posted.
    I left the room, came back, and then I saw yours.

    LOL.

    I gotta stop typing at the speed of thought and keep my posts shorter.

    Sorry if I gave the wrong impression.

    Dave.

  7. #7
    AMDave is offline Seeker of the exit clause Moderator
    Site Admin
    Join Date
    Jun 2004
    Location
    Deep in a while loop
    Posts
    9,658

    LHC Problems thread

    Here's the URL to the LHC server problem thread, for what it's worth:
    http://lhcathome.cern.ch/forum_thread.php?id=623

    Be patient, it may take up to a minute to load at the moment.

  8. #8
    Join Date
    May 2004
    Location
    Kent, UK
    Posts
    3,511
    Dave,

    I wasn't having a pop. It's just that all teams are complaining about the reliability of BOINC; levels are always underestimated.

    Screwed in Predictor, Seti points/people missing, LHC no work.....

    Ubero, although effectively wasted CPU time, has never had a problem.
    SoB is always working.
    Grid is reliable.

    Grid has more value because it's medical.

    There have been a few views on what our priorities should be, top 10 in all DCs being one of them.

    I'm all for taking 15th in SoB, then having a vote.


    Rant over.

  9. #9
    AMDave is offline Seeker of the exit clause Moderator
    Site Admin
    Join Date
    Jun 2004
    Location
    Deep in a while loop
    Posts
    9,658
    :D
    No rant there.
    Just facts.

    It *would* be nice to see the problems ironed out real fast.

    Then I think that there should be an "All projects" cobblestone site.
    That way, once the BOINC project implementations are sorted out, people will sign up to more projects, and it won't matter if a single project goes down for a few days, as the users would still be punching out WUs on the other projects and still accumulating their overall cobblestones.

    SetiSYNERGY is sooo very close to it already, but they have just stopped short of doing the combined rankings. It seems like the next logical step though.
    http://www.setisynergy.com/stats/index.php

    Ahh. One day perhaps.

    In the meantime, yes, it would be rude not to capitalise on David's contribution to SoB and make the next rank.

  10. #10
    Join Date
    Jul 2004
    Location
    Sussex, UK
    Posts
    3,734
    If they do an overall score, they would need to make sure workunits across the different projects score accordingly, as in LHC a WU usually takes about 6 hours but occasionally 45 mins, whereas in Predictor it is always about 2 hours.
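    One simple way to picture "scoring accordingly" is to award credit per CPU-hour rather than per workunit, so that a long WU and a short WU are worth the same per hour crunched. This is only a sketch under that assumption, not how any project actually calculates credit; `normalised_credit` and the rate of 1.0 credit per CPU-hour are made up for illustration.

```python
def normalised_credit(cpu_hours: float, credit_per_cpu_hour: float = 1.0) -> float:
    """Credit proportional to CPU time spent, independent of project.
    The default rate is an arbitrary illustrative value."""
    if cpu_hours < 0:
        raise ValueError("cpu_hours must be non-negative")
    return cpu_hours * credit_per_cpu_hour


# WU run times as quoted in this thread (in hours).
workunits = {
    "LHC (typical)": 6.0,
    "LHC (short)": 0.75,
    "Predictor": 2.0,
}

for name, hours in workunits.items():
    print(f"{name}: {normalised_credit(hours):.2f} credits per WU")
```

    Under this scheme one typical LHC WU earns the same as three Predictor WUs, so a combined ranking would not favour whichever project happens to hand out the shortest workunits.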
    I don't know how long SoB will be around for, but they have solved a few of the 17 calculations.

    ** For now we'll stick with SoB, and when we comfortably have 15th place we will vote again and decide where to go. Nobody has mentioned D2OL for a while; I dunno what people feel about that?
