The SKUs

 

The Opteron 6176 SE looks a bit ridiculous: compared with the 6174, it delivers only about 4% more performance at roughly 30% higher power and 20% higher price. The real reason behind this CPU is to battle another tanker, the Nehalem EX that Intel is going to launch tomorrow. The TDP and clockspeeds of that huge chip are very similar. If your application scales poorly and you don't care about power consumption, the Xeon X5677 is your champion; it is probably the fastest chip on the market for applications with low thread counts.
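
The percentages above can be checked against the SKU table. The sketch below is only a back-of-the-envelope calculation: it assumes the comparison is against the Opteron 6174 and that the performance gain simply tracks the clock speed difference, neither of which the text states explicitly. The dictionary keys and the `pct_increase` helper are names made up for this illustration.

```python
# Figures copied from the 2-socket SKU comparison table in this article.
# Assumption: the 6176 SE is being compared with the 6174.
opteron_6174 = {"ghz": 2.2, "acp_w": 80, "tdp_w": 115, "price_usd": 1165}
opteron_6176_se = {"ghz": 2.3, "acp_w": 105, "tdp_w": 137, "price_usd": 1386}


def pct_increase(new, old):
    """Return the percentage by which `new` exceeds `old`."""
    return (new / old - 1.0) * 100.0


# Percentage increase of the 6176 SE over the 6174 for every table column.
deltas = {
    key: pct_increase(opteron_6176_se[key], opteron_6174[key])
    for key in opteron_6174
}

for key, value in deltas.items():
    print(f"{key}: +{value:.1f}%")
```

Under these assumptions the clock gain is about 4.5%, the first power figure rises by about 31%, and the list price by about 19%, which is in line with the rounded numbers quoted above.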

AMD vs. Intel 2-socket SKU Comparison

| Intel Xeon Model | Cores | TDP | Speed (GHz) | Price | AMD Opteron Model | Cores | ACP/TDP | Speed (GHz) | Price |
|---|---|---|---|---|---|---|---|---|---|
| W5680 | 6 | 130W | 3.30 | $1663 | 6176 SE | 12 | 105/137W | 2.3 | $1386 |
| X5670 | 6 | 95W | 2.93 | $1440 | | | | | |
| X5660 | 6 | 95W | 2.80 | $1219 | 6174 | 12 | 80/115W | 2.2 | $1165 |
| X5650 | 6 | 95W | 2.66 | $996 | 6172 | 12 | 80/115W | 2.1 | $989 |
| | | | | | | | | | |
| X5677 | 4 | 130W | 3.46 | $1663 | 2439SE | 6 | 105/137W | 2.8 | ? |
| X5667 | 4 | 95W | 3.06 | $1440 | | | | | |
| | | | | | 6168 | 12 | 80/115W | 1.9 | $744 |
| E5640 | 4 | 80W | 2.66 | $744 | 6136 | 8 | 80/115W | 2.4 | $744 |
| E5630 | 4 | 80W | 2.53 | $551 | 6134 | 8 | 80/115W | 2.3 | $523 |
| E5620 | 4 | 80W | 2.40 | $387 | 6128 | 8 | 80/115W | 2.0 | $266 |
| | | | | | | | | | |
| L5640 | 6 | 60W | 2.26 | $996 | 6164 HE | 12 | 65/?W | 1.7 | $744 |
| | | | | | 6128 HE | 8 | 65/?W | 2.0 | $523 |
| | | | | | 6124 HE | 8 | 65/?W | 1.8 | $455 |
| L5630 | 4 | 40W | 2.13 | $551 | | | | | |
| L5620 | 4 | 40W | 1.86 | $440 | | | | | |

 

The most interesting parts that AMD offers are the dodeca-core 6174 (2.2GHz), the octal-core 6136 (2.4GHz) and the octal-core low-power 6128 (2.0GHz). The 6174 targets those with well-scaling multi-threaded applications such as huge databases and virtualized loads. The 8-core 6136 might even be the better choice, as most schedulers find it easier to distribute threads and processes over a power-of-two number of cores. Lots of applications also don't scale beyond 16 cores, and the chip comes with a 200MHz clockspeed bonus over the 6174 and a very reasonable price.
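
To make the 6136 versus 6174 trade-off concrete, here is a deliberately crude sketch. It assumes an application scales perfectly up to some thread limit and not at all beyond it, and uses "usable cores times clock" as a stand-in for throughput. Real workloads will not behave this neatly, and the `usable_core_ghz` function and `scaling_limit` parameter are invented for this illustration rather than taken from any benchmark in the article.

```python
SOCKETS = 2  # the article compares 2-socket servers

# (cores per socket, clock in GHz, list price in USD) from the SKU table.
skus = {
    "Opteron 6174": (12, 2.2, 1165),
    "Opteron 6136": (8, 2.4, 744),
}


def usable_core_ghz(cores_per_socket, ghz, scaling_limit):
    """Crude throughput proxy: cores the application can use, times clock.

    Assumes perfect scaling up to `scaling_limit` threads and none beyond it.
    """
    total_cores = cores_per_socket * SOCKETS
    return min(total_cores, scaling_limit) * ghz


for limit in (16, 24):
    for name, (cores, ghz, price) in skus.items():
        score = usable_core_ghz(cores, ghz, limit)
        print(f"limit={limit:2d}  {name}: {score:.1f} core-GHz, "
              f"${price * SOCKETS} for both CPUs")
```

With a 16-thread limit, the two 6136 chips come out ahead in this toy model despite costing less; once the application can use all 24 cores, the 6174 pulls clearly in front.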

The 6128 HE is also interesting: it might be a good way to reconcile low response times with low power, but we'll have to find that out later.

58 Comments

  • zarjad - Friday, April 2, 2010 - link

    I understand that HT can be disabled in BIOS and that some benchmarks don't like HT.
  • elnexus - Wednesday, April 21, 2010 - link

    I can report that one of my customers, performing intensive image processing, found that DISABLING hyper-threading on a Nehalem-based workstation actually IMPROVED performance considerably.

    It seems that certain applications don't like hyper-threading, while others do. I always recommend that my customers perform sensitivity analyses on their computing tasks with HT on and off, and then use whichever is best.
  • tracerburnout - Wednesday, March 31, 2010 - link

    How is it possible that Intel's Xeon X5670 rig returns 19k+ for a score while AMD's magny-cours returns only 2k+?? I only question the results of this benchmark chart because Intel's Xeon X5570 rig returns only around 1k. How can a X5670 be 19x faster than a X5570?? And I doubt the same is true for the magny-cours by being just 10.5% of what the X5670 can do.

    (is there an extra '0' by accident in there?)



    tracerburnout
    proud supporter of AMD, with a few Intel rigs for Linux only
  • JohanAnandtech - Thursday, April 1, 2010 - link

    No, it is just that Sisoft uses the new AES instructions of Westmere. It is a forward-looking benchmark which tests only a small part of a larger website code base. So that 19x faster will probably result in 10 to 20% of the complete website being 19x faster. So the real performance impact will be a lot smaller. It is interesting though to see how much faster these dedicated SIMD instructions are on these kinds of workloads.
  • alpha754293 - Thursday, April 1, 2010 - link

    If you guys need help with setting up or running the Fluent/LS-DYNA benchmarks let me know.

    I see that you don't really spend as much time writing or tweaking it as you do with some of the other programs, and that to me is a little concerning only because I don't think that it is showing the true potential of these processors if you run it straight out-of-the-box (especially with Fluent).

    Fluent tends to have a LOT of iterations, but it also tends to short-stroke the CPU (i.e. the time required to complete all of the calculations necessary is less than 1 second, and it therefore doesn't make full use of the computational ability).

    Also, the parallelization method (MPICH2 vs. HP MPI) makes a difference in the results.

    You want to make sure that the CPUs are fully loaded for a period of time such that at each iteration, there should be a noticeable dwell time AT 100% CPU load. Otherwise, it won't really demonstrate the computational ability.

    With LS-DYNA, it also makes a difference whether it's SMP parallelization or MPP parallelization as well.
  • k_sarnath - Friday, April 2, 2010 - link

    The most baffling part is how Linux could engage 12 CPUs much better than Windows. I am obviously curious about the OS platform for other tests. Similarly, MS SQL was able to scale well on multi-cores. In this context, I am not sure how we can look at the performance numbers. A badly scaling app or OS could show the 12-core one in a bad light.
  • OneEng - Saturday, April 3, 2010 - link

    Hi Johan,

    I have followed your articles from the early days at Ace's and have a good respect for the technical accuracy of your articles.

    It appears that the X5570 gains very little when scaling from 4 to 8 cores in the Oracle Calling Circle benchmark. Furthermore, the 24 cores of MC at 2.2GHz are way behind. Westmere appears to do quite well, but really should not be able to best 8 cores in the X5570 with all else being equal.

    I have heard some state that the benchmark is thread bound to a low number of threads (don't know if I am buying this), but surely something fishy is going on here.

    It appears that there is either a real world application limit to core scaling on certain types of Oracle database applications (if there are, could you please explain what features an app has when these limits appear), or that the benchmark is flawed in some way.

    I have a good amount of experience in Oracle applications and have usually found that more cores and more memory make Oracle happy. My experience seems at odds with your latest benchmarks.

    Any feedback would be appreciated .... Thanks!
  • JohanAnandtech - Tuesday, April 6, 2010 - link

    I am starting to suspect the same. I am going to dissect the benchmark soon to see what is up. It is not disk related, or at least that is surely not our biggest problem. Our benchmark might not be far from the truth though; I think Oracle really likes the big L3 cache of the Westmere CPU.

    If you have other ideas, mail at johanATthiswebsiteP
  • heliosblitz2 - Wednesday, April 7, 2010 - link

    You wrote
    Test-Setup:
    Xeon Server 1: ASUS RS700-E6/RS4 barebone
    Dual Intel Xeon "Gainestown" X5570 2.93GHz, Dual Intel Xeon “Westmere” X5670 2.93 GHz
    6x4GB (24GB) ECC Registered DDR3-1333

    "Also notice that the new Xeon 5600 handles DDR3-1333 a lot more efficiently. We measured 15% higher bandwidth from exactly the same DDR3-1333 DIMMs compared to the older Xeon 5570."

    That is not exactly the reason, I think.
    The reason is that you populated the second memory bank in both setups.
    Intel specification:
    Westmere 1333MHz CPUs run at 1333MHz with the second bank populated, while
    Nehalem 1333MHz CPUs run at 1066MHz with the second bank populated.

    That could be updated.

    Compare tech docs on Intel site: datasheet Xeon 5500 Part 2 and datasheet Xeon 5600 Part 2

    Arnold.
  • gonerogue - Saturday, April 10, 2010 - link

    The Viper is a V10 and most certainly not a traditional muscle car ;)
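
JohanAnandtech's reply above about the Sisoft AES result is essentially an Amdahl's law argument: a 19x speedup on 10 to 20% of the work translates into a much smaller gain for the whole site. The sketch below is only an illustration of that reasoning; the `overall_speedup` function is a name made up here, and the 10% and 20% fractions are the rough estimates from the comment, not measured values.

```python
def overall_speedup(accelerated_fraction, local_speedup):
    """Amdahl's law: whole-workload speedup when only a fraction gets faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / local_speedup)


# Fractions taken from the "10 to 20% of the complete website" estimate.
for fraction in (0.10, 0.20):
    speedup = overall_speedup(fraction, 19.0)
    print(f"{fraction:.0%} of the work at 19x -> {speedup:.2f}x overall")
```

Under these assumptions the complete workload ends up roughly 1.10x to 1.23x faster, nowhere near 19x.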
