Anyone plan on trying this project



Bionic_Redneck
04-14-2004, 03:29 AM
http://www.people.iup.edu/vmrf/

I started running this earlier today. This will be a good project for dial-up: you just email results back once a month. It should be good for no-net machines too.

vaughan
04-14-2004, 04:22 AM
I'll try it with you, but how do you set it up? I created an output.txt file in the subdirectory I made for it. Now it says to enter a value for k. What do I put in here? The forum is ULTRA confusing, as there is no mention of the kn2plus1 project, just lots of prime stuff I don't understand. The site doesn't explain what k is either.
Any help appreciated.

vaughan
04-14-2004, 04:39 AM
Update:
I entered k min=5000000 (5 million)
and k max=10000000 (10 million)
and used exp=1 and it started. I don't know where to enter any Team related information. Maybe this will happen after 60 minutes when it writes to the output file.

Edit:
Scratch this - see William Garnett's instructions in the next thread please.

Bionic_Redneck
04-14-2004, 05:03 AM
I emailed vmrf@iup.edu to get started. Once I got a reply, I downloaded prp.tgz and input.txt as per the instructions in the email. Then I started the client with the menu ("./prp -m") and went to Test/Input Data:
Your choice: 1

Input file (from NewPgen): : input.txt
Output file (for proth.exe): : output.txt
Line number (1): ****

I also got the line numbers for my machines in the email. Then I went to Test/Exit and started the client again with "./prp -d", but I think you can go straight to Test/Continue.

wfgarnett3
04-14-2004, 06:40 AM
Hi Ralph (Bionic) and Vaughan,

I am starting a new thread now with instructions for all.

regards,
william

vaughan
04-17-2004, 03:43 AM
Just want you to know it takes 11.11 hrs per task on an Athlon XP 3200+ with 1 GB DDR400 RAM under WinXP-PE SP1.

Bionic_Redneck
04-17-2004, 04:45 AM
I did the last one in 10 hrs. I have slowed down the CPU a little; don't want to show up that 3200+ too much with a little ole 1800+ :shock:

numibesi
04-17-2004, 08:47 AM
I'm at 5 hours and something and already 55% complete, running on idle and using the PC at the same time, on my AMD Athlon 1900+ with 512 MB RAM (SDRAM) ;)

Bionic_Redneck
04-17-2004, 08:51 PM
Just finished another one. Time on the server: 10.02 hrs. I built a few packages, and the workstations are doing one every 15-16 hrs or so. I think speed depends on more than just how fast the CPU is.

Anonymous
04-18-2004, 05:14 AM
I followed the link William posted on the Mersenne Forum (http://www.mersenneforum.org/forumdisplay.php?f=13) and although I do not run this project "atm" I do try to keep an eye on the progress of the project. Being involved in several other "organized" prime searches presently leaves little in way of resources to contribute to this one at this time. However that should change in the near future.
I hope that all who join in William's project will enjoy the search and find a prime worthy of the Top 5000 (http://primes.utm.edu/primes/home.php)



'On a different note, I have been known to do a little Seti@Home from time to time and being without a "team" I've dropped my Seti@Home account into your team.'

Bionic_Redneck
04-20-2004, 03:13 PM
I did some tweaking in the BIOS and it shows a 10% increase in speed just from disabling video and BIOS cache and lowering RAM timings.

wfgarnett3
04-21-2004, 05:45 AM
PRP runs fastest on Pentium 4s, due to SSE2 instructions.

However, PRP isn't quite as SSE2 optimized as Prime95. I think it takes slightly under 8 hours to test these exponents on a P4 2.8 GHz with an 800 MHz bus, and your Athlons at around 11 hours and above get close. If this were Prime95 there would be a huge gap rather than a close one. I am going to put a post on mersenneforum.org, probably tomorrow, to see if someone can improve the SSE2 code for PRP.

I love SSE2, it basically changed the situation from a slower clocked Athlon COMPLETELY CRUSHING a faster clocked P4 in x87 FPU code to a slower clock Pentium 4 using SSE2 COMPLETELY CRUSHING a faster clocked Athlon using x87 for Prime95. For prime numbers, SSE2 is the greatest thing since sliced bread :)

I am extremely disappointed in AMD for having poor SSE2 in their new Athlon FX/Opteron/64. I was looking forward to them putting SSE2 in these processors and being equivalent to Pentium 4s for SSE2 at equivalent clock speeds, but the end result is that there isn't really any difference in performance compared with leaving it out. At the same clock speed, a P4 crushes an Athlon FX. I was impressed with the original Athlon when it came out, it was such a well designed processor, and now AMD disappoints us. Bionic, some others, and I were talking about SSE2 over at the hardware section of mersenneforum.org. I am assuming from what I read on the web that these processors have problems with vectorized SSE2 code. Bionic, can AMD do something to fix this? Should they put the SSE2 part of the chip away from the x87 FPU so they don't share it? Is AMD going to do something? Or will all future releases disappoint in 32-bit SSE2 code?

regards,
william

Keith75
04-21-2004, 06:37 AM
My XP has SSE2 and the FX55 will have SSE3. I am not saying the SSE2 in the XPs is necessarily as good as the P4s, I just don't know, but it does have it. ;)

Keith

Bionic_Redneck
04-21-2004, 06:58 AM
I don't share the same opinion as some of the others in that thread when it comes to SSE2 and Athlon 64s. Everyone I talked to who has one of the CPUs says a client with SSE2 enabled showed a decrease in speed compared with running without SSE2. It seems kind of odd to me that Intel and AMD have an agreement to share CPU technology when it comes to instructions like 3DNow!, SSE, SSE2, etc., that the only bug was with Intel's SSE2, and that all of a sudden Intel has decided to make its own x86-64 CPU. It's no secret that Intel has had most of the CPU market up until now, and AMD just had its first year out of the red. Could Intel have put some bug in the code to hinder AMD?

Bionic_Redneck
04-21-2004, 07:44 AM
Keith75 wrote:

My XP has SSE2 and the FX55 will have SSE3. I am not saying the SSE2 in the XPs is necessarily as good as the P4s, I just don't know, but it does have it. ;)

Keith

No Keith, XPs have SSE, not SSE2. I can't remember what SSE3 does offhand, but I don't believe it has anything to do with the FPU. I think it has to do with improving hyperthreading and caching.

wfgarnett3
04-21-2004, 08:38 AM
First, a faster clocked AMD Athlon should obviously perform faster than a slower clocked one. Earlier in the thread, someone said their 1800+ outperformed a 3200+? Can't be. Yeah, front side and memory bus speeds can maybe make a difference, and I don't know if both processors have the same bus speeds, but I don't believe an 1800+ beats a 3200+ unless some overclocking is done. Make sure no other programs or processes are running in the background.

In response to SSE2, here are benchmarks George sent me (lower ms is better). The first one is the Opteron with SSE2 disabled, and the second one is its natural state with SSE2 enabled.
------------------------------
Compare your results to other computers at http://www.mersenne.org/bench.htm
That web page also contains instructions on how your results can be included.

AMD Opteron(tm) Processor 140
CPU speed: 1395.99 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version 23.7, RdtscTiming=1
Best time for 384K FFT length: 39.760 ms.
Best time for 448K FFT length: 41.586 ms.
Best time for 512K FFT length: 45.334 ms.
Best time for 640K FFT length: 61.293 ms.
Best time for 768K FFT length: 72.643 ms.
Best time for 896K FFT length: 88.261 ms.
Best time for 1024K FFT length: 97.124 ms.
Best time for 1280K FFT length: 125.646 ms.
Best time for 1536K FFT length: 152.502 ms.
Best time for 1792K FFT length: 182.080 ms.
Best time for 2048K FFT length: 204.119 ms.
[Tue Apr 13 09:53:55 2004]
Compare your results to other computers at http://www.mersenne.org/bench.htm
That web page also contains instructions on how your results can be included.

AMD Opteron(tm) Processor 140
CPU speed: 1395.99 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 64 KB
L2 cache size: 1024 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 512
Prime95 version 23.7, RdtscTiming=1
Best time for 384K FFT length: 32.687 ms.
Best time for 448K FFT length: 39.195 ms.
Best time for 512K FFT length: 44.306 ms.
Best time for 640K FFT length: 55.108 ms.
Best time for 768K FFT length: 66.948 ms.
Best time for 896K FFT length: 80.943 ms.
Best time for 1024K FFT length: 91.412 ms.
Best time for 1280K FFT length: 122.026 ms.
Best time for 1536K FFT length: 149.502 ms.
Best time for 1792K FFT length: 179.271 ms.
Best time for 2048K FFT length: 201.410 ms.
-------------------------------

As you see, for Prime95, SSE2 provides a very, very small performance gain compared to old x87 mode. It's almost worthless. Looking at the benchmark page at the link given, you see for instance that with the 1024K FFT length, an old Pentium 4 1.6 GHz has an iteration time of 64 ms, completely crushing this Opteron 1.4 GHz. Slow the Pentium 4 down to 1.0 GHz, or even use a Celeron with SSE2 and 128 KB cache, and the Opteron still can't compete.
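
To put numbers on the gain, here is a quick sketch that computes the percentage reduction in iteration time for each FFT length. The timings are copied from the two Opteron 140 runs above; the function and variable names are just for illustration. On these figures the reduction ranges from roughly 1% at 2048K to roughly 18% at 384K.

```python
# Timings (ms per iteration) copied from the two Opteron 140 runs posted above.
fft_sizes_k = [384, 448, 512, 640, 768, 896, 1024, 1280, 1536, 1792, 2048]
x87_ms = [39.760, 41.586, 45.334, 61.293, 72.643, 88.261,
          97.124, 125.646, 152.502, 182.080, 204.119]
sse2_ms = [32.687, 39.195, 44.306, 55.108, 66.948, 80.943,
           91.412, 122.026, 149.502, 179.271, 201.410]


def percent_gain(old_ms, new_ms):
    """Percentage reduction in iteration time going from old_ms to new_ms."""
    return (old_ms - new_ms) / old_ms * 100.0


# Map each FFT length (in K) to the SSE2 gain over x87 mode.
gains = {size: percent_gain(old, new)
         for size, old, new in zip(fft_sizes_k, x87_ms, sse2_ms)}

for size, gain in gains.items():
    print(f"{size:>5}K FFT: {gain:5.1f}% less time per iteration with SSE2")
```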

Here is the thread where me and Bionic discuss SSE2:

http://www.mersenneforum.org/showthread.php?t=2362

I wonder why this is the case Bionic; I wonder if it is really a bug, or if they just put SSE2 in there to run SSE2 programs or whatever. I have a bad feeling it won't get fixed in a future release.

By the way, as you know, AMD has its strengths and Intel has its strengths. For instance AMD has hardware rotate:
http://n0cgi.distributed.net/faq/cache/55.html
so I think it performs better than the P4 there, since the P4 has slow rotate except for the new Prescott version.

AMD has a superb x87 FPU. A Pentium 4 can't compete. Intel traded that off to make an excellent SSE2 FPU.

etc. etc. etc. Each CPU has its strengths.

Oh by the way, SSE3 (aka Prescott New Instructions, PNI) was introduced with the Pentium 4 Prescott; most P4s don't have it. I hear rumors AMD might get it in future.

Also, Keith, your Athlon XP has SSE, not SSE2. If I remember correctly old Athlons didn't have SSE. Doesn't matter, Prime95 doesn't use SSE.

regards,
william

Bionic_Redneck
04-21-2004, 09:42 AM
Yeah, the Athlon Thunderbirds have only extended 3DNow!, as do the Duron Spitfires. The Duron Morgans, which are the XP Palominos' counterpart, had SSE but it was disabled. Not sure if SSE is disabled in the Duron Applebred (the Barton counterpart). If you go further back, the K6 series had 3DNow!+.

I got to thinking about it, and SSE2 is just a 64-bit extension for a 32-bit CPU, so really what's going on is you're running a 64-bit CPU in 32-bit mode to use 64-bit extensions.

AMD Athlon(tm)
CPU speed: 1998.84 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE
L1 cache size: 64 KB
L2 cache size: 256 KB
L1 cache line size: 64 bytes
L2 cache line size: 64 bytes
L1 TLBS: 32
L2 TLBS: 256
Prime95 version 23.5, RdtscTiming=1
Best time for 384K FFT length: 37.277 ms.
Best time for 448K FFT length: 44.150 ms.
Best time for 512K FFT length: 47.700 ms.
Best time for 640K FFT length: 60.681 ms.
Best time for 768K FFT length: 73.697 ms.
Best time for 896K FFT length: 87.510 ms.
Best time for 1024K FFT length: 99.817 ms.
Best time for 1280K FFT length: 134.836 ms.
Best time for 1536K FFT length: 158.563 ms.
Best time for 1792K FFT length: 202.566 ms.
Best time for 2048K FFT length: 230.655 ms.

Never said it was stock ;)

SB2
04-21-2004, 10:06 AM
On a completely different note, is PRP faster than LLR? Can LLR be used to test the candidates in this project?

wfgarnett3
04-21-2004, 12:37 PM
My project uses k*2^n+1. Really, LLR includes PRP in it. LLR does deterministic primality tests for k*2^n-1, while it defaults to normal PRP mode for k*2^n+1 (which is a probabilistic probable prime test). Iteration times are roughly the same. However, LLR is buggy while Jean updates it to be fast for certain -1 cases, so please continue using PRP. Shouldn't make a difference, but I wouldn't want to jeopardize things. Thanks
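
For anyone wondering what a "probable prime test" on k*2^n+1 looks like, here is a minimal sketch using a Fermat test to base 3. This is only an illustration on tiny numbers: the base, the function names, and the use of Python's built-in pow are assumptions for the example, not a description of how PRP itself is coded (PRP works on huge numbers with FFT multiplication).

```python
def proth_number(k, n):
    """Return N = k*2^n + 1."""
    return k * (1 << n) + 1


def is_fermat_prp(N, base=3):
    """Fermat probable-prime test: True if base^(N-1) == 1 (mod N).

    A True result means "probably prime" only; some composites
    (pseudoprimes to this base) also pass. False means composite.
    """
    return pow(base, N - 1, N) == 1


# Tiny examples: 3*2^5+1 = 97 (prime) and 3*2^7+1 = 385 = 5*7*11 (composite).
for k, n in [(3, 5), (3, 7)]:
    N = proth_number(k, n)
    verdict = "probable prime" if is_fermat_prp(N) else "composite"
    print(f"{k}*2^{n}+1 = {N}: {verdict}")
```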
regards,
william

Bionic_Redneck
04-21-2004, 04:51 PM
wfgarnett3, looking at the benchmarks you posted, it looks like some of the other DC projects aren't as well written for SSE2. I'm mainly talking about "Seventeen or Bust". That's the project where Athlon 64s were slower with SSE2. Not too long ago Folding@home released another core, I think it's fahcore_79, and it's for SSE2. Folding@home is different from most, as each WU determines which core you get: some use MMX, 3DNow! or SSE, and some SSE2. It's kind of pot luck, but from asking around it doesn't appear that the Athlon 64 does very well with the new core.

SB2
04-21-2004, 09:26 PM
wfgarnett3 wrote:

My project uses k*2^n+1. Really, LLR includes PRP in it. LLR does deterministic primality tests for k*2^n-1, while it defaults to normal PRP mode for k*2^n+1 (which is a probabilistic probable prime test). Iteration times are roughly the same. However, LLR is buggy while Jean updates it to be fast for certain -1 cases, so please continue using PRP. Shouldn't make a difference, but I wouldn't want to jeopardize things. Thanks
regards,
william

I have been active in Paul Underwood's 321 Search and the 15k Search using LLR exclusively, and have experienced no problems with it. I believe that Jean's newest build, which took advantage of some faster code (for small k values), had a few problems; earlier releases are not affected. I've only searched for primes of the form k*2^n-1, other than when running GIMPS, hence the question about the difference between the two programs.

Thanks for the reply, PRP it is then.

wfgarnett3
04-22-2004, 02:09 AM
Hi SB2,

Just so I don't give wrong information, are you in my project (i.e. does your name appear on the webpage)? If so, yeah, continue using PRP, since we are doing k*2^n+1. If you are not in my project and would like to join, email me as per the instruction thread so I can give you a line number.

If you are not in my search but test numbers of the form k*2^n-1, then stick with LLR all the time; don't use PRP. As you know, LLR has a huge speedup for small k, so you want to use that. Plus, even if you aren't using small k mode, stick with LLR because "normal" mode isn't buggy. Plus I like what Jean did with LLR: he used the Lucas-Lehmer-Riesel algorithm, so all the numbers you test for k*2^n-1 always report back as composite or prime. When it says prime, it is 100% prime, because LLR always does a deterministic primality test, so it is always correct and proves outright primality. Older PRP uses a probabilistic probable prime test (I think), which is 99% accurate and doesn't prove primality, requiring an extra step of using OpenPFGW or Proth.exe to prove primality.
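
The "extra step" for k*2^n+1 can be illustrated with Proth's theorem: if N = k*2^n+1 with k odd and k < 2^n, and some base a gives a^((N-1)/2) = -1 (mod N), then N is proven prime. Below is a small sketch of that check. The function name and the list of trial bases are assumptions for the example, and this is not a claim about how Proth.exe or OpenPFGW are implemented.

```python
def proth_proves_prime(k, n, bases=(2, 3, 5, 7, 11, 13, 17, 19, 23)):
    """Try to prove N = k*2^n + 1 prime using Proth's theorem.

    Returns True if some trial base a satisfies a^((N-1)/2) == -1 (mod N),
    which proves N prime. Returns False if no trial base works, meaning
    N is composite or the chosen bases were simply inconclusive.
    """
    if k <= 0 or k % 2 == 0 or k >= (1 << n):
        raise ValueError("Proth's theorem needs odd k with 0 < k < 2^n")
    N = k * (1 << n) + 1
    for a in bases:
        if pow(a, (N - 1) // 2, N) == N - 1:
            return True
    return False


# Tiny examples: 3*2^5+1 = 97 is proven prime; 3*2^7+1 = 385 is not.
print(proth_proves_prime(3, 5))
print(proth_proves_prime(3, 7))
```

Unlike the probable prime test, a True result here is a proof, not just strong evidence.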

Hope that helps. :)

regards,
william

Keith75
04-22-2004, 02:13 AM
Yeah, I looked around on the web after posting that and saw how wrong I was. Funny, I would have bet the farm that I saw mine say it had SSE2.
I just love how Intel blasted the 64-bit AMDs, saying they were pointless and not necessary. A few months later they say they are working on them and how wonderful they will be. Sounds like something you would hear in an election.

Keith

SB2
04-22-2004, 11:01 AM
wfgarnett3 wrote:

Hi SB2,

Just so I don't give wrong information, are you in my project (i.e. does your name appear on the webpage)? If so, yeah, continue using PRP, since we are doing k*2^n+1. If you are not in my project and would like to join, email me as per the instruction thread so I can give you a line number.

If you are not in my search but test numbers of the form k*2^n-1, then stick with LLR all the time; don't use PRP. As you know, LLR has a huge speedup for small k, so you want to use that. Plus, even if you aren't using small k mode, stick with LLR because "normal" mode isn't buggy. Plus I like what Jean did with LLR: he used the Lucas-Lehmer-Riesel algorithm, so all the numbers you test for k*2^n-1 always report back as composite or prime. When it says prime, it is 100% prime, because LLR always does a deterministic primality test, so it is always correct and proves outright primality. Older PRP uses a probabilistic probable prime test (I think), which is 99% accurate and doesn't prove primality, requiring an extra step of using OpenPFGW or Proth.exe to prove primality.

Hope that helps. :)

regards,
william

Thanks for the informative reply. I am not involved with your project yet, but as soon as we can finish off 321 Search to n=1M I will move a couple of the P4s over to your project.