Intel Xeon E5-2697 v2 and Xeon E5-2687W v2 Review: 12 and 8 Cores
by Ian Cutress on March 17, 2014 11:59 AM EST- Posted in
- CPUs
- Intel
- Xeon
- Enterprise
The Mac Pro (Late 2013)
When Anand reviewed the Mac Pro late last year, he received the full fat 12 core edition, using the E5-2697 v2 CPU with a 2.7 GHz rating. The CPU choices for the Mac Pro include 4, 6 and 8 core models, all with HyperThreading. Interestingly enough, the 4/6/8 core models all come from the E5-16xx line, meaning the CPUs are designed with single processor systems in mind. But to get to the 12 core/24 thread model at the high end, Apple used the E5-2697 v2, a processor optimized for dual CPU situations. Based on the die shots on the previous page, this has repercussions, but as Anand pointed out, it all comes down to power usage and turbo performance.
Mac Pro (Late 2013) CPU Options | ||||||
Intel CPU | E5-1620 v2 | E5-1650 v2 | E5-1680 v2 | E5-2697 v2 | ||
Cores / Threads | 4 / 8 | 6 / 12 | 8 / 16 | 12 / 24 | ||
CPU Base Clock | 3.7GHz | 3.5GHz | 3.0GHz | 2.7GHz | ||
Max Turbo (1C) | 3.9GHz | 3.9GHz | 3.9GHz | 3.5GHz | ||
L3 Cache | 10MB | 12MB | 25MB | 30MB | ||
TDP | 130W | 130W | 130W | 130W | ||
Intel SRP | $294 | $583 | ? | $2614 |
The Mac Pro is designed within a peak 450W envelope, and Intel has options with its CPUs. For the same TDP limit, Intel can create many cores as low frequency, or fewer cores at higher frequency. This is seen in the options on the Mac Pro – all the CPU choices have the same 130W TDP, but the CPU base clocks change as we rise up the core count. Moving from 4 cores to 8 cores keeps the maximum turbo (single core performance) at 3.9 GHz, but the base clock decreases the more cores are available. Finally at the 12-core model, the base frequency is at its lowest of the set, as well as the maximum turbo.
This has repercussions on workloads, especially for workstations. For the most part, the types of applications used on workstations are highly professional, and have big budgets with plenty of engineers designed to extract performance. That should bode well for the systems with more cores, despite the frequency per core being lower. However, it is not always that simple – the mathematics for the problem has to be able to take advantage of parallel computing. Simple programs run solely on one core because that is the easiest to develop, but if the mathematics wholly linear, then even enterprise software is restricted. This would lend a positive note to the higher turbo frequency CPUs. Intel attempts to keep the turbo frequency similar as long as it can while retaining the maximum TDP to avoid this issue; however at the 12-core model this is not possible. Quantifying your workload before making a purchase is a key area that users have to consider.
Benchmark Configuration
I talk about the Mac Pro a little because the processors we have for a ‘regular’ test today are 8-core and 12-core models. The 12-core is the same model that Anand tested in the Mac Pro – the Xeon E5-2697v2. The 8-core model we are testing today is different to the one offered in the Mac Pro, in terms of frequency and TDP:
Intel SKU Comparison | |||
Core i7-4960X | Xeon E5-2687W v2 | Xeon E5-2697 v2 | |
Release Date | September 10, 2013 | September 10, 2013 | September 10, 2013 |
Cores | 6 | 8 | 12 |
Threads | 12 | 16 | 24 |
Base Frequency | 3600 | 3400 | 2700 |
Turbo Frequency | 4000 | 4000 | 3500 |
L3 Cache | 15 MB | 25 MB | 30 MB |
Max TDP | 130 W | 150 W | 130 W |
Max Memory Size | 64 GB | 256 GB | 768 GB |
Memory Channels | 4 | 4 | 4 |
Memory Frequency | DDR3-1866 | DDR3-1866 | DDR3-1866 |
PCIe Revision | 3.0 | 3.0 | 3.0 |
PCIe Lanes | 40 | 40 | 40 |
Multi-Processor | 1P | 2P | 2P |
VT-x | Yes | Yes | Yes |
VT-d | Yes | Yes | Yes |
vPro | No | Yes | Yes |
Memory Bandwidth | 59.7 GB/s | 59.7 GB/s | 59.7 GB/s |
Price | $1059 | $2112 | $2618 |
The reason for this review is to put these enterprise class processors through the normal (rather than server) benchmarks I run at AnandTech for processors. Before I started writing about technology, as an enthusiast, it was always interesting to hear of the faster Xeons and how much that actually made a difference to my normal computing. I luckily have that opportunity and would like to share it with our readers.
The system set up is as follows:
Test Setup | |
Motherboards |
GIGABYTE GA-6PXSV3 MSI X79A-GD45 Plus for 3x GPU Configurations |
Memory | 8x4 GB Kingston DDR3-1600 11-11-11 ECC |
Storage | OCZ Vertex 3 256 GB |
Power Supply | OCZ 1250 ZX Series |
CPU Cooler | Corsair H80i |
NVIDIA GPU | MSI GTX 770 Lightning 2GB |
AMD GPU | ASUS HD 7970 3GB |
Many thanks to...
We must thank the following companies for kindly providing hardware for our test bed:
Thank you to GIGABYTE Server for providing us with the Motherboard and CPUs
Thank you to OCZ for providing us with 1250W Gold Power Supplies and SSDs.
Thank you to Kingston for the ECC Memory kit
Thank you to ASUS for providing us with the AMD HD7970 GPUs and some IO Testing kit.
Thank you to MSI for providing us with the NVIDIA GTX 770 Lightning GPUs.
Power Consumption
Power consumption was tested on the system as a whole with a wall meter connected to the OCZ 1250W power supply, while in a single MSI GTX 770 Lightning GPU configuration. This power supply is Gold rated, and as I am in the UK on a 230-240 V supply, leads to ~75% efficiency > 50W, and 90%+ efficiency at 250W, which is suitable for both idle and multi-GPU loading. This method of power reading allows us to compare the power management of the UEFI and the board to supply components with power under load, and includes typical PSU losses due to efficiency. These are the real world values that consumers may expect from a typical system (minus the monitor) using this motherboard.
While this method for power measurement may not be ideal, and you feel these numbers are not representative due to the high wattage power supply being used (we use the same PSU to remain consistent over a series of reviews, and the fact that some boards on our test bed get tested with three or four high powered GPUs), the important point to take away is the relationship between the numbers. These boards are all under the same conditions, and thus the differences between them should be easy to spot.
At idle, the Xeons are on par with the Core i7-4960X for power consumption in the GIGABYTE motherboard. At load the extra TDP of the E5-2687W v2 can be seen.
DPC Latency
Deferred Procedure Call latency is a way in which Windows handles interrupt servicing. In order to wait for a processor to acknowledge the request, the system will queue all interrupt requests by priority. Critical interrupts will be handled as soon as possible, whereas lesser priority requests, such as audio, will be further down the line. So if the audio device requires data, it will have to wait until the request is processed before the buffer is filled. If the device drivers of higher priority components in a system are poorly implemented, this can cause delays in request scheduling and process time, resulting in an empty audio buffer – this leads to characteristic audible pauses, pops and clicks. Having a bigger buffer and correctly implemented system drivers obviously helps in this regard. The DPC latency checker measures how much time is processing DPCs from driver invocation – the lower the value will result in better audio transfer at smaller buffer sizes. Results are measured in microseconds and taken as the peak latency while cycling through a series of short HD videos - less than 500 microseconds usually gets the green light, but the lower the better.
The DPC latency of the Xeons is closer to the 100 mark, which we saw during Sandy Bridge. Newer systems seem to be increasing the DPC latency - so far all Haswell consumer CPUs are at the 140+ line.
71 Comments
View All Comments
mrnuxi - Tuesday, March 18, 2014 - link
I know you were benchmarking these Xeons for home use, thus the selection of rendering and gaming benchmarks. But there are lots of us doing home virtualization (VMWare ESXi all-in-one servers using PCI-passthrough ZFS virtual SAN and multiple VMs). It would be great so see some virtualization benchmarks. For further reference see: http://www.napp-it.org/index_en.htmlIan Cutress - Tuesday, March 18, 2014 - link
Johan covered the server side in his article -http://www.anandtech.com/show/7285/intel-xeon-e5-2...
alpha754293 - Wednesday, March 19, 2014 - link
No LS-DYNA or other HPC benchmark results??? Talk to Johan.colonelclaw - Wednesday, March 19, 2014 - link
Hi Ian, Thanks for a great review.Do you think there's any possibility of adding V-Ray to your workstation benchmarks? It's an incredibly popular renderer that is multi-platform and also works in pretty much any decent 3D software (Max, Maya, C4D etc). It also sucks the life out of any computer when it's running, so would be perfect for your tests.
Kougar - Wednesday, March 19, 2014 - link
Question. Since you discuss turbo bins at length and the article revolves around them, how does Windows Server handle load balancing in regards to the turbo bins. On a 2P E5-2697 will the OS balance all the threads on a single CPU first? Spread evenly across both CPUs? Max out all physical cores before assigning threads to logical cores?Is the OS capable of spreading 3 threads to each processor to ensure they both run at the max turbo frequency for as long as possible? Or would it instead max out one processor to attempt to let the other retain a lower power state? For that matter is any of this even configurable under Windows Server?
Ytterbium - Saturday, May 3, 2014 - link
Ian, does MCE work with Xeon?Ytterbium - Tuesday, May 6, 2014 - link
MCE doesn't seem to work with Xeon 2687w.RadamanthysBe - Sunday, May 4, 2014 - link
Interresting article, but I dont agree with your conclusion on the 2667 vs 2687w :You say 2667 id cheaper, ok but 50 $ difference in list price on cpu costing 2100 $, that is less than 2%.
You also say 2687w v2 use more energy than 2667 v2, do you have proof of that ? For me the fact that 2687w v2 has a 150w tdp only mean it can keep its turbo frequency under higher load than 2667, with situations where 2667 turbo mode would drop because of the power usage while not on the 2687w v2, making it in fine a faster cpu than 2667 under heavy loads. If the two cpu run the same computation at the same frequency, there is no reason 2687w v2 uses more power, it would be like saying that i5 and i7 consume the same because they have the same tdp, while everybody knows it is not the case.
Ytterbium - Monday, May 5, 2014 - link
I just bought a 2687w v2, it ended up being $3 difference between them, I have i7-3970X so I have 150W chip anyway, so TDP wasn't really much of a factor to me.It would be interesting to do a head to head of them and see how they perform, in thermal load/power.
Following Ian's logic he's be super interested in the E5-2673 v2, this is the same as the 2667 but with 110W TDP.
If the 2690 had a little higher turbo, it would be great, 10/20 with say 3 stock and 3.8 turbo
Ytterbium - Tuesday, May 6, 2014 - link
The 2687 I got seems to run a bit cooler than my 3970X, even though there rated for the same.