Linux Shootout: Opteron 150 vs. Xeon 3.6 Nocona
by Kristopher Kubicki on August 12, 2004 2:35 PM EST- Posted in
- Linux
Opstone
Since our use of Ubench in the previous article clearly infuriated many people, we are going to kick that benchmark to the side for the time being until we can decide a better way to implement it.
In the meantime, a reader suggested we give Blue Sail Software's Opstone benchmarks a try. In this portion of the review, we will use their precompiled optimized binaries of the Scalar Product (SP) and Sparce Scalar Product (SSP) benchmark. The SP benchmark is explained by the author:
"The 'SP' benchmark calculates the scalar product (dot product) of 2 vectors ranging in size from 16 elements to 1048576 elements for both single and double-precision floats. Although the Gflops/sec. for every vector length is recorded (in the resulting output log file), the average of all these values is reported. This benchmark is indicative of the performance of many raw floating-point data processing apps (movie format conversion, MP3 extraction, etc.)"
Note that we ran the P4 optimized binaries on the Nocona, which did not provide x86-64 enhancements. Running the AMD64 binaries on the Xeon yielded poor results. The P4 Opstone binaries are the only 32-bit binaries used in this analysis.
Below is the SSP benchmark, as explained by the author:
"The 'ssp' benchmark also calculates the scalar product of 2 vectors, except that these vectors are sparsely populated (only the non-zero value elements are stored) ranging from a 'loading factor' (non-zero/zero elements) of 0.000001 to 0.01 for both single and double-precision floats. Since the data is not contiguous in memory, the performance is much lower than regular 'sp' and is measured in Mflops/sec. There is not much difference in performance between different loading factors as this benchmark really challenges the ability of the processor to perform short bursts of calculations coupled with lots of conditional testing. It is this reason that the P4 with its longer pipeline does not generally perform as well as the Athlon64. This benchmark is indicative of the performance of many 3D games as the processing is similar (short bursts of calculations with numerous conditional testing)"
There is a general distrust of synthetic benchmarks, so take this portion of the analysis only with a grain of salt. We see a tale of two processors in these graphs; generally the Xeon performs better in the raw operation SP benchmark, while the Opteron performs better in the condition testing SSP benchmark. We would be lead to believe the Intel processor does content integer content creation better than the Opteron, and visa versa with floating point applications. However as we see in the rest of the review, this is not always the case.
92 Comments
View All Comments
menads - Thursday, August 12, 2004 - link
Now all I want to say is big thanks for listening to your readers. Unlike other site which I would not mention that claims they are never wrong I think it is very nice of Anandtech editors to accept the criticizm and feedback from their readers and to get back retesting/reviewing.It is not about scores or brands - it is about the trust of the people reading these arcticles - a misleading review in most cases is worse than non-review.
Also Kristoper please do not take criticizm of the previous article personally - by criticizing your article most people were hoping you will do better next time.
KristopherKubicki - Thursday, August 12, 2004 - link
Hi Tau,>Handcoding ASM for specific tasks is NOT ancient
No youre correct. The context of the sentence though its the hand coded ASM used in 3.6 "stable" is ancient. Someone pointed out to me it doesnt even have the original MMX optimizations in it (i think).
Kristopher
Jeff7181 - Thursday, August 12, 2004 - link
What's that sound? I hear heavy footsteps and heavy breathing... oh... wait... it's the Xeon trying to keep pace with the Opteron :DSDA - Thursday, August 12, 2004 - link
Yeah, gj Kris, and yeah, I'd say you deserve a vacation after all that... thanks for listening to people, that's a lot more than certain editors at certain sites COUGHCOUGHTHGCOUGH would do.Pirox - Thursday, August 12, 2004 - link
Lmao...i got hand to kriz though ..you sure are one tough guy! Nice article...and to think that the guy remains calm...what gives?KristopherKubicki - Thursday, August 12, 2004 - link
Lynx516:Parts of 3.4.1 are backported into 3.3.3. Please check the SuSE 9.1 man pages.
Kristopher
TauCeti - Thursday, August 12, 2004 - link
Hi Kris,First: I appreciate the work you put into this review. But i cannot restrain to offer one (hopefully constructive) remark:
you write: "We are using John the Ripper 1.6.37 in this portion of the benchmark. As a few extremely knowledgeable readers pointed out, the "stable" 1.6 branch of code relies heavily on hand coded ASM which by today's standards is fairly ancient anyway."
Handcoding ASM for specific tasks is NOT ancient. Handcoded ASM allows you to utilize the execution units and the cache-latency distribution of a given core architecture to fullest extend.
That is of uttermost importance to widespread library functions used in scientific calculations. Even the popular GIMPS client is handcoded in ASM for every CPU-variation (there are even different codepaths for different cache sizes). The GIMPS developers are fighting for every single clock that can be saved in a inner loop for different architectures.
That aside...
Have a nice vacation. I guess you ned it ;)
If you - for yourself - agree that you could have done better, swallow your pride and try to convert the substantial complaints into positive energy. Ingnore the personal bullshit from wannabe-i-know-betters. Never waste a minute of your life for that. It's not worth is.
Regards,
Tau
Lynx516 - Thursday, August 12, 2004 - link
Hang on You said that you used -march=nocona and -march=k8 with gcc3.3.3. However those compile options are NOT IN gcc3.3.3! There is a serious problem if you use non existant optimisations as it casts a shadow of doubt on the competence of the author as it shows they dont know what they are doing.If this is the case read up on Linux before doing articles! If I am being overly harsh then correct the error
Lynx516 - Thursday, August 12, 2004 - link
Much better. Your compiler flags arnt the best as things like "-funroll-loops" tends to do nothing but bloat the binarys. Also your config page is not working in Firebird. Its nice to see realistic results. From the last version it looked as if all x86-64 cpus got owned by intel's offering because that was the only data you where presented with.However this shows a price for price comparison which is much better.
One point I have to make is why the first article was ever published in the first place as it was of little value as you had nothign realistic to compare it wiht.
syadnom - Thursday, August 12, 2004 - link
nice to see comparable processors benched against each other, the 164 in the old review justs isn't in the same category of processors.that said. i'm dissapointed to see the Xeon look so weak. I expected the benches to flop back and forth on which proc was faster because of their different designs. I know the Opt150 is one hell of a chip, but I think intel can do better.