Crunching hardware for Einstein



Steve Lux
05-29-2011, 02:44 PM
If size isn't an issue, does anyone have a recommended GPU for crunching Einstein? I have a half-built Asus KGPE-D16 12-core (expandable to 24-core) motherboard with 4 PCIe 2.0 x16 slots, 1 PCIe x8 slot and 1 PCI slot. I figure some folks out there know a good bit more than I do about spec'ing out GPU cards. I'm also wanting to go with an SSD.

- Thanks

Dirk Broer
05-30-2011, 03:33 PM
If Einstein is your thing, you need the best nVidia card you can buy, the GTX 590. If you can set them up in SLI, do it (interesting question: will it even work - four Fermi GPUs at once in one system?). If it does: big credits just ahead. Further still: an expensive power bill. Two GTX 560s might be a cheaper and almost as good a solution.

velociraptor
05-31-2011, 08:56 AM
I just got myself two ASUS ENGTX 560 Ti TOP cards (on an ASUS Crosshair II Formula / Phenom II X4 940 BE)..
I would try them on Einstein for a reference value, if it weren't for the heat.
The cards stay in an acceptable temperature range, but their heat kills my southbridge, because my average room temperature is a little high (30-35°C, and no chance of properly cooling the room...). Maybe I should give water cooling a try.
At PrimeGrid they easily managed an astonishing output until I turned them off during the day...
All in all I am very satisfied with their performance, and the only thing I can think is that the summer is already lasting too long ;).

Dirk Broer
06-01-2011, 09:18 AM
30-35°C, now, in Vienna? What will it be in July-August? Try relocating to the Alps....
(A room without windows but with air conditioning might do as well.)

velociraptor
06-01-2011, 02:37 PM
:icon_lol:, no, it is not 30-35°C in Vienna, but in my room. I am still living in a students' hall of residence.

It has now cooled down to 25°C outside, and with the window wide open it is still 29°C inside, because I live on the top floor and the sun heats up the roof. That heat has nothing better to do than pay me a visit (bad to no insulation), and on top of that I am not allowed to install air conditioning (I may not make holes in the walls).
So all in all I have to live with about 5°C more than outside.
I just hope it won't stay at 30°C outside for too long, because living in a 35°C room isn't a pleasant thing (imho).
You can't imagine how much I am already looking forward to the end of summer so I can fire up my two GTX 560s; for now there is no budget for proper water cooling, so I will have to wait...

Dirk Broer
06-02-2011, 12:45 PM
Yes, I understood that it was just your room, and not the whole of Vienna...
So do not make holes in the walls; just hang the exhaust out of the window, or set the unit in front of an open window, depending on the model (http://www.coolmaestro.co.uk/portable.html).
My son, studying in Amsterdam, lives in a converted sea container (http://walker.photoshelter.com/gallery-image/The-Netherlands/G0000CMltZjvlywg/I0000l9UXdZQS7nk) in pretty much the same conditions (living on the top floor and crunching on my account).
He has to keep his balcony door open almost all the time to keep the temperature below 30°C (about 86°F).

Dirk Broer
06-03-2011, 04:02 PM
I have a half-built Asus KGPE-D16 12-core (Expandable to 24-core) motherboard with 4 PCIE 2.0 x16 slots and 1 PCIE x8 and 1 PCI slot.

:icon_eek:DROOL (http://ic.tweakimg.net/ext/i/imagelarge/1274088594.jpeg):5eek:

The cheapest 12-core CPU (Opteron 6168) comes in at a little more than 600 EUR, the mobo is 380 EUR, and then you want at least 2 GB of RAM per core = 24 GB with one CPU, and double that when you have two CPUs. But you have 2x8 = 16 slots for RAM, so filling it for now with eight 4 GB modules is just fine. You will soon be the team's fastest climber, provided that you have a good nVidia card (Einstein does not yet support ATI/AMD). Money seems to be no object - ever thought of a Tesla installation (http://www.nvidia.com/object/personal-supercomputing.html)?
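To spell out that RAM arithmetic, here is a minimal sketch (Python, purely illustrative; the 2 GB/core figure is Dirk's rule of thumb, not a hard requirement):

# Dirk's rule of thumb: at least 2 GB of RAM per core.
gb_per_core = 2
cores_per_cpu = 12           # Opteron 6168
dimm_size_gb = 4
dimms_installed = 8          # of the 16 slots on the KGPE-D16

needed_one_cpu = gb_per_core * cores_per_cpu    # 24 GB
needed_two_cpus = 2 * needed_one_cpu            # 48 GB
installed = dimm_size_gb * dimms_installed      # 32 GB

print(f"One CPU wants {needed_one_cpu} GB, two CPUs want {needed_two_cpus} GB; "
      f"{dimms_installed} x {dimm_size_gb} GB = {installed} GB installed")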

Steve Lux
06-05-2011, 08:24 PM
Oh, money is an issue. I was getting good overtime and bought the case, board, power supply and one 12-core CPU. I was spec'ing out the RAM, SSD and video card when my overtime at work went away, so I put the project on hold. I'm getting a bit more overtime again, and I hate to see a potential 30k/day box just sitting there, so I want to see if I can finish building this monster before the summer and the overtime run out. It seems the most important factor for Einstein productivity is the GPUs.

Are you sure 2 GB per core is required? I'm running Einstein on 4 cores on my home system and only using 2.16 GB (along with Norton's and several small applications) on Windoz 7 Home Premium. I'm not exactly an expert at building systems, but my understanding is that the more RAM you have, the more latency from addressing all that RAM. Now, I don't want a system that is only capable of running Einstein, but I also don't want to slow down my system with too much RAM.

Tesla? Only in my dreams.

NeoGen
06-05-2011, 09:27 PM
I maxed out my X3 with 16 GB of RAM to be able to crunch the biggest and baddest BURP workunits that have been out this year.
I have no way to really measure it, but I don't think I am getting any higher latency because of the larger amount of RAM.
I do know that an application like BURP, which allocates gigabytes of RAM at a time, does spend a little time filling it all up and/or emptying it. I have done some programming experiments in the past, trying to allocate really big arrays of random numbers into RAM, and Windows does take some time to fill it all up before starting to work on it. It's not something that happens instantly, and it can easily be seen in Task Manager as the used-memory graph grows. That might be what you mean by latency?
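Something along these lines reproduces that effect (a minimal sketch in Python; the original experiments may well have been in another language). Filling a few gigabytes is visibly not instant, and Task Manager's memory graph climbs while the first loop runs:

import random
import time

SIZE = 100_000_000  # 100 million Python floats - a few GB once object overhead is counted

start = time.time()
data = [random.random() for _ in range(SIZE)]   # allocate and fill the array
print(f"Filled {SIZE:,} values in {time.time() - start:.1f} s")

start = time.time()
total = sum(data)                               # now actually work on it
print(f"Summed them in {time.time() - start:.1f} s (total = {total:.3e})")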

But so far I have not noticed any normal apps lagging or any latency issues. Since I put in 16 GB of RAM I have even disabled the page file, so I believe I am running 100% from RAM, and it's doing great. 2 GB/core would be the "ideal" value, but hardly any DC projects out there use 2 GB per running instance. Right now you would probably do great with 1 GB/core. But if you can, try to use higher-capacity modules to leave memory slots open for future expansion. :)

Dirk Broer
06-07-2011, 07:28 PM
I run into problems (freezing up) running 4 Malaria WUs at a time with 4 GB on a quad, so that's why I advise at least 2 GB per core. I do, however, also maintain a big pagefile.
Einstein might be able to run on less, but you will probably want to use your system for more than just Einstein, while still letting it crunch no matter what else you are doing.

vaughan
06-08-2011, 01:03 PM
You need lots of RAM for BURP and NFS too. EvoAtHome (when it has work) is memory intensive.

Steve Lux
06-30-2011, 09:44 PM
Well, apparently I bought the 8-core rather than the 12-core (shrug). Anyhow, I just now got it crunching Einstein. I'm letting it crunch at reduced capacity for a while, trying to keep the cores under 60°C for a week or so. Still haven't purchased the GPUs yet, but they will be the main workhorses compared to the CPUs. Bought 2 GB per core - four 4 GB sticks. I have an older nVidia card installed, but the system doesn't seem to recognize it, so it's probably not supported. Running Ubuntu on an SSD, but through a USB port; I need to see if I can hook it up through SATA for faster throughput. It's presently hooked up to my son's flat-screen TV in the living room and needs to be moved back into my office where the AC and the hub are located. I have an unused KVM hiding around here somewhere, so I shouldn't need to buy another monitor, keyboard or mouse.

I understand that within a few months AMD is supposed to be coming out with 16-core CPUs for this board. 32 cores and 4 GPUs (2 cards) could be quite productive, but that will take a monster power supply, and I'm not certain the extra cores will be worth the expense.

Steve Lux
06-30-2011, 09:55 PM
velociraptor:

If you have a window that opens, a portable AC unit could be helpful:

http://www1.mscdirect.com/cgi/NNPDFF?PMPAGE=4181&PMITEM=51371755&PMCTLG=00

I'm sure you have a local supplier with the correct voltages and plug style.

Dirk Broer
07-02-2011, 12:46 AM
Hi Steve,

What older nVidia card are you using? It has to be a GeForce 8 series or higher to be able to do anything, and you might need a more recent video driver (or BOINC client) to get things working.
Here in the Netherlands I presently have a choice between 4 different 12-core AMD Opterons (6168, 6172, 6174, 6176) ranging between 615 and 1,052 euros, the cheapest (6168) needing a mere 80 watts. If you then have two GT 430 cards (which only need 50 watts apiece), you can run the Einstein system with an "average" 600 watt PSU and still have power left over for the rest of the system. I hope the 16-cores come at an affordable price and consume power at the same rate per core as the present 12-cores, so they will have a good MIPS/watt and FLOPS/watt trade-off.
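To make that power arithmetic explicit, here is a rough sketch (Python; the CPU and GPU figures are the ones quoted above, while the allowance for motherboard, RAM, drives and fans is an assumed ballpark, not a measured value):

# Rough power budget for the build described above.
components = {
    "Opteron 6168 (12-core)": 80,            # watts, as quoted in the post
    "GT 430 #1": 50,
    "GT 430 #2": 50,
    "Mobo/RAM/drives/fans (assumed)": 100,
}

psu_watts = 600
total = sum(components.values())

for name, watts in components.items():
    print(f"{name:32s} {watts:4d} W")
print(f"{'Total':32s} {total:4d} W  -> {psu_watts - total} W headroom on a {psu_watts} W PSU")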

Steve Lux
09-12-2011, 04:13 AM
My 8-core system died several days ago. I had bought an nVidia GTX 580 for it, and when I went to install the card that's when I found out that the system wasn't running. No video, nothing. Tried installing the new card (in case it was the old video card) - still nothing. Tried three different monitors, still didn't work. The LEDs on the board and the fans come up, but no video. Anyhow, so tonight I took the no-name OEM GPU out of my old system and put the new GTX 580 in, along with the 750 W power supply from the 8-core system. It took a little creative sheet-metal work, but I got them squeezed in. It's not pretty, but I also put in an extra fan to help with the cooling.

Let's see how this card works on Einstein. It's a bit late here (midnight) so I'm not going to stay up watching it any longer. My 4-core CPU runs these work units in just under 6 hours each per core. With only a few completed, the GPU seems to be popping them out at about 35 minutes per work unit. BOINC/Einstein seems confident also: about 100 work units were downloaded for the GPU - all due by the 25th. Still downloading new work units....

Steve Lux
09-12-2011, 04:19 AM
Dirk, that old nVidia card is a GeForce 7800 GTX.

Dirk Broer
09-12-2011, 07:38 AM
The 7800 GTX is unfortunately useless for BOINC; one generation later (8800 GTX) and you would have a good performer...

Dirk Broer
09-12-2011, 07:43 AM
You haven't by any chance also replaced a PS/2 mouse with a USB mouse? I did that a couple of weeks ago and the system would not come up either. I switched back to the PS/2 mouse and the system came back to life.

Steve Lux
09-12-2011, 01:47 PM
Both the mouse and keyboard on the 8-core Linux Ubuntu system were wireless via a USB port. They had been running that way since the system was first started up.

I stopped monitoring the system after a few weeks, as the temps seemed to stay under 50°C and Einstein stabilized with the work units (sometimes Einstein assigns more than a system can do, so you lose productivity working on units you won't get credit for because they were completed late).

I looked at the Einstein messages this morning before leaving for work and learned that there is apparently a limit of 128 assigned work units per core/GPU. I suppose the reason is to limit the disruption caused by any system that fails or is shut down. Pokey old CPUs never see this limit, even when set to maintain a backlog of 10 days' worth of work.

Steve Lux
02-02-2012, 04:21 PM
Now both of my systems are down. $$ is tight, so I don't know when I can get them back up again. I'm not sure exactly what died in either of them, as I can't get them to boot up even in safe mode. The power supply seems to be OK. Sometimes it seems like a memory or HDD problem, sometimes a video card issue. I'll poke around to see what spare parts I have to work with. I've donated many of my spare components to charity over the years, so I'm not sure what I have left. Alas for the days of BIOS POST beep codes...

Dirk Broer
02-04-2012, 10:06 AM
Take out all cards and use only one stick of memory; the BIOS still has to beep.
The beep you get now means there is no video output (duh).
Power down, install an old PCI video card and try to reboot. If it works, power down and install the full memory.
Try again. If it works, power down and re-install your AGP or PCIe card.

If it goes wrong in step one you may have had a power surge and the PSU may have died. Also watch for blown caps on the mobo.

Steve Lux
02-09-2012, 08:02 PM
Pulled, cleaned and re-seated my GTX 580 (GF110 rev A1) video card and my system started back up again. Downloaded and installed GPU-Z 0.5.8 and DebugDiag 1.2. It's not an over-temp issue (60°C with the GPU running Einstein at 48% load, and the fan only running at 30%). The card is pulling just under 30 amps on the 0.994 V VDDC rail, so my 750 W power supply is more than sufficient, and it's putting out a very stable 12.11 V DC.

I did see one error twice, though; I didn't record it, but it indicated a video driver error. I'm using driver nvlddmkm 8.17.12.8562, which is the newest driver, and BIOS version 70.10.17.00.01, which seems to be the correct version. It's running fairly stably, but my system reset again today while I was at work. Whichever utility reported the events didn't record them, so something isn't set up right.

Also ran Norton and PC Doctor utilities - no issues found.

Steve Lux
04-05-2012, 12:53 AM
Found out that my BIOS was overheating - even after cleaning. The heat pipes and the aluminum fins on top of the BIOS stay hot. My system is not overclocked. So, I got my system running again by sticking a fan right over my BIOS chip. However, now I can't seem to get GPU work:

4/4/2012 8:43:01 PM Einstein@Home Sending scheduler request: To fetch work.
4/4/2012 8:43:01 PM Einstein@Home Requesting new tasks for GPU
4/4/2012 8:43:06 PM Einstein@Home Scheduler request completed: got 0 new tasks
4/4/2012 8:43:06 PM Einstein@Home Message from server: No work sent
4/4/2012 8:43:06 PM Einstein@Home Message from server: see scheduler log messages on http://einstein.phys.uwm.edu//host_sched_logs/2326/2326176


So, for now it's just CPU work only.

Dirk Broer
04-05-2012, 02:49 PM
I get that Einstein message on my systems as well; it has nothing to do with overheating chips!
It's a very dumb-ass way of saying "Sorry, we've run out of GPU work and your settings do not allow for CPU work."

But seriously, heat pipes and aluminium fins sound to me like the cooling of either the CPU or the GPU.
BIOS chips are small and black and sit somewhere on your mobo, without heat pipes and fins - at least on my mobos.

Or do you mean the BIOS on your video card? http://benchmarkreviews.com/images/reviews/video_cards/GeForce-GTX580/NVIDIA-GeForce-GTX580-PCB-Top.jpg
Can you point it out for me? And if it turns out to be the big chip marked '580' - that is your GF110 - then it's the video card cooler that is not up to its task. Consider buying an aftermarket cooler like this one: http://cdn.mos.techradar.com//Review%20images/PC%20Format/PCF%20252/PCF252.overclocking.artic-580-75.jpg

Steve Lux
04-05-2012, 10:10 PM
The chip is on the motherboard. Looks like you have the same brand of GTX 580 video card that I have - at least the fan package looks the same. I suspected the video card initially, as this problem started not long after I installed it, but I installed the power supply at the same time, so it could have been either. The problem didn't become a real issue as long as my house was cold during the winter. (I live alone, so 55°F is just fine by me, and I leave the heat off when I'm not home.) Things really became an issue as the house heated up in springtime.

To diagnose the issue I basically let the system run until it shut down and stuck my fingers in there to find out what was hot. I put a fan on it and have been running stable now for several days.

If my eldest son ever gets fully employed he can start paying his college bill and I can start investing in my systems again. Until that point I'll just have to try to patch stuff up to keep it running.

Dirk Broer
04-06-2012, 08:30 AM
Are you sure it is the BIOS chip overheating? When I Google it I get hits for overheating northbridges...
Northbridges tend - on more expensive mobos - to be equipped with fins and/or coolers, but I've never seen a cooled BIOS chip yet.

What mobo is it, and what BIOS version is it running?
Have you ever looked on your mobo manufacturer's site for a more recent BIOS?
Maybe your BIOS version is known to have temperature issues and can be flashed into one with more decent behaviour.
A bit far-fetched, but you won't know if you haven't looked.

If it is the dual-CPU-capable G34 board, have you enabled two CPUs while actually having installed just one?
Have you placed memory in the lanes for the unused CPU?
Does the northbridge (or BIOS) chip also overheat with an old PCI video card?
It just might be that a stressed-out GTX 580 running Einstein is more than your northbridge can handle.
Try running SETI@home for a while and see whether that causes a crash as well.
If crashes only occur doing BOINC GPU work, then there is something wrong with the system:
not enough cooling (obvious, as there was no problem during winter)?
not enough memory / wrongly placed memory?
not enough CPUs (if two are enabled)?
PSU can't handle it when the temperature gets too high? (You do have a 600+ watt PSU, I hope? That's the minimum nVidia recommends.)

Steve Lux
06-24-2012, 01:59 PM
After some financial issues I finally got my big 8-core Ubuntu system back up and running again after getting a 1475 W PSU. Had it running a few days just on the CPU. Started up my GTX 580 card today. This thing just runs off a 100 GB SSD - it's a no-frills cruncher system. Eventually I plan to put the second CPU on the motherboard and install a second GPU. I don't know if SLI will benefit Einstein or slow it down.

I want to try to get my main 4-core Windows home system back up and running soon, but I still need to do some more troubleshooting. I need to verify the 750 W power supply, but I know I'll need a new motherboard. I bought a GTX 670 for it to run after I get it up and going.

Additionally, my son has put a few systems onto Einstein, also crediting my account. :) We should start seeing some better progress on this project.

vaughan
06-24-2012, 03:02 PM
Steve: re your SSD query, I have bought a couple of these in the 120 GB size for around the AUD 190 mark and they seem pretty quick and reliable (touch wood).

http://www.intel.com.au/content/www/au/en/solid-state-drives/ssd-520-brief.html?wapkw=ssd

Dirk Broer
06-24-2012, 03:24 PM
Speaking of Einstein: you know you can put your GPUs to work harder through the preferences under 'Your Account'? It says:
"GPU utilization factor of BRP apps DANGEROUS! Only touch this if you are absolutely sure of what you are doing! Wrong setting might even damage your computer! Use solely on your own risk! Min: 0.0 / Max: 1.0 / Default: 1.0"
So I set it to 0.5, and as a result my F1A75 Llano system is doing four Einstein WUs at a time, using my HD 6670 and my HD 6550D (read: my A8-3870K), which together in CrossFireX form an HD 6690D2. If SLI works the same way, you need not fear any performance loss!
The HD 6530D - the GPU part of my A6-3500 - even went from processing one Einstein WU in 12 hours (tragic) to processing two Einstein WUs in six hours (good!).
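For what it's worth, newer BOINC clients (around 7.0.40 and later) expose a similar knob locally through an app_config.xml in the project's data directory. This is only a sketch; the application name below is an assumption and should be checked against the names shown in your own BOINC event log:

<app_config>
  <app>
    <!-- app name is assumed; verify it in your BOINC event log -->
    <name>einsteinbinary_BRP4</name>
    <gpu_versions>
      <!-- 0.5 GPU per task means two tasks share one GPU -->
      <gpu_usage>0.5</gpu_usage>
      <!-- fraction of a CPU core reserved for each GPU task -->
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>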

gamer007
06-25-2012, 02:52 AM
I haven't really played around with that option. My 7870 crunches a WU in about 40-45 minutes. I should see how running 2 at once goes.

Steve Lux
07-07-2012, 01:10 AM
Thanks for the tip, Dirk. I set my GTX 580 from 1 to 0.5 and went from 1 GPU work unit completing in about 60 minutes to 2 GPU work units completing in about 90 minutes. A nice increase.

So, just the other day I changed the setting from 0.5 to 0.25, and I am now doing 4 GPU work units in about 150 minutes. Another nice increase.

The GTX 580 is running stable at 52°C. I'll let it run for a week or so to ensure it remains stable before tweaking it even more.
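To put numbers on those increases, a quick back-of-the-envelope calculation from the run times reported above (a sketch, nothing more):

# Throughput implied by the reported run times: (concurrent WUs, minutes per batch)
settings = {
    "factor 1.0 (1 WU at a time)": (1, 60),
    "factor 0.5 (2 WUs at a time)": (2, 90),
    "factor 0.25 (4 WUs at a time)": (4, 150),
}

for label, (wus, minutes) in settings.items():
    print(f"{label:30s} ~{wus / (minutes / 60):.2f} WUs/hour")
# Roughly 1.0 -> 1.33 -> 1.6 WUs per hour: diminishing, but real, gains.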

Steve Lux
12-02-2012, 05:20 AM
I've been having problems with my main cruncher system for several weeks now, and my main user system has been dead since April. Of the two, I decided to get the main cruncher back up first. I finally have it up and running with a new GTX 670 card and HDD, and I got rid of Ubuntu - too many compatibility issues. I'm thinking of sticking a second video card in it to see if that improves system output. That may require installation of a second CPU.

Anyhow, we should start seeing the results in a few days when the credits start rolling out. Einstein@Home wouldn't let me merge the hosts for some reason.