Memory Scaling on Haswell CPU, IGP and dGPU: DDR3-1333 to DDR3-3000 Tested with G.Skill
by Ian Cutress on September 26, 2013 4:00 PM EST
‘How much does memory speed matter?’ is a question often asked when dealing with mainstream processor lines. Depending on the platform, the answers might very well be different. Similar to our comparisons with Ivy Bridge, today we publish our results for 26 different memory timings across 45 benchmarks, all using a G.Skill memory kit.
In our previous memory scaling article with an Ivy Bridge CPU, the results of memory testing from DDR3-1333 to DDR3-2400 afforded two main results – (a) the high end memory kit offered up to a 20% improvement, but (b) this improvement was restricted to certain memory limited tests. In order to be more thorough, our tests in this article take a single memory kit, the G.Skill 2x4GB DDR3-3000 12-14-14 1.65V kit, through 26 different combinations of memory speed and CAS latency to see if it is better to choose one set of timings over another. Benchmarks chosen include my standard array of real world benchmarks, some of which are memory limited, as well as several gaming titles on IGP, single GPU and multi-GPU setups, recording both average and minimum frame rates.
The Problem with Memory Speed
As mentioned in the Ivy Bridge memory scaling article, one of the main issues with reporting memory speeds is the exclusion of the CAS Latency, or tCL. When a user purchases memory, the kit comes with a certain number of sticks, and each stick has a certain size, memory speed, set of subtimings and voltage. In fact, the order of importance is:
1. Amount of memory
2. Number of sticks of memory
3. Placement of those sticks in the motherboard
4. The MHz of the memory
5. If XMP/AMP is enabled
6. The subtimings of the memory
I use this order on the basis that point 1 is more important than point 3:
- A system will be slow due to lack of memory before the speed of the memory is an issue (point 1)
- In order to take advantage of the number of memory channels of the CPU, the number of sticks must be a multiple of the number of memory channels (point 2); this is known as dual channel/tri channel/quad channel operation.
- In order to ensure that we have dual (or tri/quad) channel operation these sticks need to be in the right slots of the motherboard – most motherboards support two DIMM slots per channel and we need at least one memory stick for each channel
- If the MHz of the memory is more than the CPU is rated for (1333, 1600, 1866+), then the user needs to apply XMP/AMP in order to benefit from the additional speed. Otherwise the system will run at the CPU defaults.
- Subtimings, such as tCL, are used in conjunction with the MHz to provide the overall picture when it comes to performance.
A user can go out and buy two memory kits, both DDR3-2400, but in reality (as shown in this review), they can perform differently and have different prices. The reason for this will be in the sub-timings of each memory kit: one might be 9-11-10 (2400 C9), and the other 11-11-11 (2400 C11). So whenever someone boasts about a particular memory speed, ask for subtimings.
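To make the difference between two kits of the same speed concrete, here is a minimal sketch that converts a CAS latency from clock cycles into nanoseconds. It assumes the usual DDR relationship (two transfers per clock, so the memory clock is half the DDR3-xxxx rating); the function name `cas_latency_ns` is purely illustrative, and the calculation covers tCL only, not the other subtimings.

```python
def cas_latency_ns(data_rate: float, cas_cycles: int) -> float:
    """Approximate CAS latency in nanoseconds.

    Assumption: DDR memory transfers twice per clock, so the memory clock
    in MHz is data_rate / 2 and one cycle lasts 2000 / data_rate ns.
    """
    return cas_cycles * 2000.0 / data_rate


# The two DDR3-2400 kits from the text, plus the review kit at its XMP setting.
for label, rate, cl in [
    ("DDR3-2400 C9", 2400, 9),
    ("DDR3-2400 C11", 2400, 11),
    ("DDR3-3000 C12", 3000, 12),
]:
    print(f"{label}: {cas_latency_ns(rate, cl):.2f} ns")
```

Under this assumption the C9 kit answers a column request noticeably sooner than the C11 kit despite the identical DDR3-2400 label, which is why the speed rating alone is not enough to compare kits.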
G.Skill DDR3-3000 C12 2x4GB Memory Kit: F3-3000C12D-8GTXDG
For this review, G.Skill supplied us with a pair of DDR3 modules from their TridentX range, rated at DDR3-3000. This is at the absolute high end of memory kits, with very few memory kits going faster in terms of MHz. Of course, in this MHz race, it comes at a price premium: $690 for 8 GB. This memory kit uses single-sided Hynix MFR ICs, known for their high MHz numbers, and while there are large heat-spreaders on each stick, these can be removed, reducing the height from 5.4 cm to 3.9 cm.
Hynix MFR based memory kits are used by extreme overclockers to hit the high MHz numbers. Recently YoungPro from Australia took one of these memory sticks and hit DDR3-4400 MHz (13-31-31 sub-timings) to reach #1 in the world in pure MHz.
|Processor||Intel Core i7-4770K Retail @ 4.0 GHz, 4 Cores, 8 Threads, 3.5 GHz (3.9 GHz Turbo)|
|Motherboards||ASRock Z87 OC Formula/AC|
|Cooling||Intel Stock Cooler (pre-testing)|
|Power Supply||Corsair AX1200i Platinum PSU|
|Memory||G.Skill TridentX 2x4 GB DDR3-3000 12-14-14 Kit|
|Memory Settings||1333 C7 to XMP (3000 12-14-14)|
|Discrete Video Cards||
|Video Drivers||Catalyst 13.6|
|Hard Drive||OCZ Vertex 3 256GB|
|Optical Drive||LG GH22NS50|
|Case||Open Test Bed|
|Operating System||Windows 7 64-bit|
|USB 3 Testing||OCZ Vertex 3 240GB with SATA->USB Adaptor|
With this test setup, we are using the BIOS to set the following combinations of MHz and subtimings:
Almost all of these combinations are available for purchase. For any combination of MHz and CAS, we attempt that CAS for all sub-timings, e.g. 2400 9-9-9 1T at 1.65 volts. If this setting is unstable, we move to 9-10-9, 9-10-10 then 9-11-10 and so on until the combination is stable.
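The search for a stable setting described above can be sketched as a small loop. This is only an illustration of the procedure, not the tooling used for the review: `is_stable` is a hypothetical callback standing in for a manual stability test, and the alternating tRCD/tRP relaxation simply extends the 9-9-9, 9-10-9, 9-10-10, 9-11-10 pattern given in the text.

```python
from itertools import islice
from typing import Callable, Iterator, Optional, Tuple

Timings = Tuple[int, int, int]  # (tCL, tRCD, tRP)


def relaxed_timings(cas: int) -> Iterator[Timings]:
    """Yield cas-cas-cas, then loosen tRCD and tRP alternately with tCL fixed."""
    trcd = trp = cas
    yield (cas, trcd, trp)
    while True:
        trcd += 1
        yield (cas, trcd, trp)
        trp += 1
        yield (cas, trcd, trp)


def first_stable(
    cas: int,
    is_stable: Callable[[Timings], bool],  # hypothetical stability check
    max_tries: int = 12,
) -> Optional[Timings]:
    """Return the tightest candidate that passes is_stable, or None."""
    for timings in islice(relaxed_timings(cas), max_tries):
        if is_stable(timings):
            return timings
    return None


# Example: pretend the kit only becomes stable once tRCD reaches 11.
print(first_stable(9, lambda t: t[1] >= 11))
```

The `max_tries` cap is an assumption added so the search gives up rather than loosening the timings indefinitely.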
There is an odd twist when dealing with DDR3-3000. In order to reach 3000 MHz, as Haswell does not accept the DDR3-3000 memory strap, we actually have to use the DDR3-2933 strap and raise the base clock (BCLK) from 100 MHz to 102.3 MHz. Because the CPU frequency is also derived from BCLK, this leads to a slight advantage in terms of CPU throughput when using DDR3-3000, which does come through in several benchmarks. In order to keep things even, our 4.0 GHz CPU has the multiplier reduced for 3000 C12 in order to keep the overall system speed the same, albeit with a slight BCLK advantage.
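The arithmetic behind the DDR3-3000 setting can be checked with a short sketch. It assumes the memory strap is rated at a 100 MHz base clock (BCLK) and that both memory and CPU speeds scale linearly with BCLK; the 39x multiplier is an illustrative assumption, since the text only says the multiplier was reduced.

```python
def effective_speeds(strap: float, bclk_mhz: float, cpu_multiplier: int,
                     base_bclk_mhz: float = 100.0):
    """Return (memory speed, CPU frequency in MHz) for a given base clock.

    Assumption: the memory strap is specified at base_bclk_mhz, and the CPU
    frequency is simply multiplier * BCLK.
    """
    memory = strap * bclk_mhz / base_bclk_mhz
    cpu = cpu_multiplier * bclk_mhz
    return memory, cpu


# DDR3-2933 strap at 102.3 MHz; 39x is a hypothetical reduced multiplier.
memory, cpu = effective_speeds(2933, 102.3, 39)
print(f"memory ~ DDR3-{memory:.0f}, CPU ~ {cpu:.0f} MHz")

# Leaving the multiplier at 40x would push the CPU above 4.0 GHz instead.
_, cpu_40x = effective_speeds(2933, 102.3, 40)
print(f"CPU at 40x ~ {cpu_40x:.0f} MHz")
```

With these assumed values the memory lands at roughly 3000 while the CPU sits just under 4.0 GHz, which matches the intent of keeping overall system speed comparable.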
At the time of testing, DDR3-3000 C12 was the highest MHz memory kit available, but since then 3100 C12 memory kits have come onto the market, taking price margins even higher at $1000 for 8 GB. The problem at this speed is that the CPU overclock needed to reach it will skew the performance results in favor of the high end kit.
For this test, we use the following real world and compute benchmarks:
CPU Real World:
- WinRAR 4.2
- FastStone Image Viewer
- Xilisoft Video Converter
- x264 HD Benchmark 4.0
- TrueCrypt v7.1a AES
- USB 3.0 MaxCPU Copy Test
CPU Compute:
- 3D Particle Movement, Single Threaded and MultiThreaded
- SystemCompute ‘2D Explicit’
- SystemCompute ‘3D Explicit’
- SystemCompute nBody
- SystemCompute 2D Implicit
IGP Compute:
- SystemCompute ‘2D Explicit’
- SystemCompute ‘3D Explicit’
- SystemCompute nBody
- SystemCompute MatrixMultiplication
- SystemCompute 3D Particle Movement
For what should be obvious reasons, there is no point in running synthetic tests when dealing with memory. A synthetic test will tell you if the peak speed or latency is higher or lower – that is not a number that necessarily translates into the real world unless you can detect the type and size of all the memory accesses used within a real world environment. The real world is more complex than a simple boost in memory read/write peak speeds.
For each of the 3D benchmarks we use an ASUS HD 6950 (flashed to HD6970) for the single GPU tests, the HD 4600 in the CPU for IGP, and a HD 5970+5870 for a lopsided tri-GPU test.
- Dirt 3, Avg and Min FPS, 1360x768
- Bioshock Infinite, Avg and Min FPS, 1360x768
- Tomb Raider, Avg and Min FPS, 1360x768
- Sleeping Dogs, Avg and Min FPS, 1360x768
Firstly, I want to go through enabling XMP in the BIOS of all the major vendors.
HerrKaLeun - Saturday, September 28, 2013
This was a good review. But I see one major problem for practical applications:
Whoever cares about performance doesn't use 8 GB of memory in the year 2013.
Even for a cheap home-built (no gaming, no CAD etc.) I used 16 GB a year ago, which cost only ~$70. When I run multiple applications in parallel (who doesn't?) W7/8 easily uses all memory for cache. Even with an SSD this is a speed advantage.
So for real world applications (running virus scan in parallel to work, 18 browser windows, watching movies etc) 8 GB are easily used up.
I would imagine a 16 GB PC (let's say ~$100) runs circles around the $700 8 GB PC in the real world.
Right now I run MSE and Malwarebytes while just using IE for browsing and I have none of my 16 GB left. The computer is not sluggish at all. I'm not sure how 8 GB RAM would work out.
One could argue most applications don't require that much memory, but running virusscan frequently should be done by all users.
I think this test should be repeated with either 16 GB or 24 GB for triple-channel platform. People interested in a few % more, also need more RAM.
Wwhat - Sunday, September 29, 2013
@HerrKaLeun you say who doesn't use more than 8GB? And you say you got 16GB for about 70 dollars, but this article covers a lot of extremely highly spec'd RAM that, as stated, is quite expensive, and if you bought 8GB for several hundred dollars you obviously aren't going to supplement it with cheap high-latency low speed off-the-shelf stuff.
malphadour - Sunday, September 29, 2013
HerrKaleun you are talking rubbish!! I have an X58 running 6gb ram and I never get anywhere near flooding it. 8GB is more than ample for 99% of users out there. I recently built a 16gb ram rig for one of our engineers because he demanded it. To prove a point I benchmarked all our software (which includes a juicy construction CAD package) and recorded no more than a 3% performance increase going to 16gb, and I put most of that down to going from a single channel 8gb stick to dual channel for the 16gb. We tested render times, large drawing copies plus program open and close times with every piece of software on the machine running. His argument was the same as yours, and incorrect. Hardware is way ahead of the curve at the moment vs software and it will be a while before the everyday user "needs" more than 8gb.
Wwhat - Monday, September 30, 2013
To be fair, I hear Battlefield 4 has at least 8GB as its suggested setup.
Like always the more RAM people on average have the more software starts to require.
ShieTar - Monday, September 30, 2013
"So for real world applications (running virus scan in parallel to work, 18 browser windows, watching movies etc) 8 GB are easily used up."
Because Windows will fill up all the Memory it has before even starting any garbage collection algorithms. Even today, you should be able to do all those trivial applications on 2GB of memory.
And anybody doing serious work or gaming will probably not run two major software packages at the same time. A few background programs (depending on how paranoid your company's IT department is), and a few trivial programs like browser, word processor, excel, PDF may run on the side and use up 1GB to 2GB. But nobody in his right mind will start processing of huge images in Photoshop while keeping his CAD models open in CATIA. A few nutjobs out there may run 16 installations of WoW on 16 screens with the same PC, but that's not really relevant to a general review.
So if you go and have a look again at what is tested in this review, and once you understand that any reviewer worth his salary will not go and run a dozen pieces of software parallel to the one software he is benchmarking at that moment, it should be clear at the very least that repeating above benchmarks with 16GB will give you absolutely no difference in the benchmark results whatsoever.
Chrispy_ - Sunday, September 29, 2013
So the three common scenarios are:
--- 1. You want an IGP ---
Get the cheapest RAM. If you buy significantly better RAM, the cost of APU + RAM becomes more than the cost of a normal CPU + dGPU + cheap RAM, which is obviously much higher performance.
--- 2. You want a single graphics card ---
Spend the money you're *thinking* about spending on better RAM on a better graphics card. If you want a decent dGPU then you're most likely a gamer and even 1600MHz CL9 is fine, but you'll see a big improvement if you move from a $200 GTX660 to a $250 660Ti
--- 3. You want more than one graphics card ---
Divide RAM Frequency by CAS Latency to get the actual speed, I've been doing this for years and I'm glad Ian has finally mentioned this in an article.
ShieTar - Monday, September 30, 2013
I don't think anybody would disagree with the general direction of your comment, but you seem to overestimate the exact differences in cost for 8GB of RAM these days. A quick check (for Germany) gives me the following price differences for RAM frequency (relative to 1333):
1600 : -0.50€ (No-Brainer)
1866 : +1€
2000 : +20€
2133 : +10€
2400 : + 8€
2666 : +50€
2933 : +170€
So, for 8€ you can pick 2400 instead of 1600, which would give you a significant increase in performance should you ever find a piece of software that heavily depends on memory transfer rates. You are very unlikely to step up your CPU or GPU model for that kind of price difference.
Latencies can be similar. For DDR3-1600, going from CL11 to CL9 will cost you about 2€ to 3€. Of course, at that point you still have a higher latency than DDR3-2400 with a CL11, so that seems to make the most sense right now for price to value ratio.
rootheday3 - Sunday, September 29, 2013
Hd4600 is likely not memory bottlenecked with 20 eus at stock igp frequencies. There is a reason that intel didn't add the EDram to skus other than the 47w+ gt3e skus with 40 eus, 4 samplers and 2 pixel backends. For a gt2 with half the assets, memory is not the issue: 1600mhz in dual channel is plenty. For people who were asking earlier in the thread, dual channel vs single channel is ~15-30% impact on gt2.
If you want to see more sensitivity/ scaling with memory, you would need to OC the igp first.
Or, as others said, test on skus that are more likely to stress memory - like gt3e (iris pro 5200) Note that hd5000 (15w package tdp) and iris 5100 (28w tdp) may be tdp bound on most workloads, so even there you may not see scaling with memory beyond ~1600-1866 dual channel.
Note that Trinity/Richland are more sensitive to memory (especially on 65-100w desktop skus) because they don't have the LLC to buffer some of the bandwidth demands.
malphadour - Sunday, September 29, 2013
I have mushkin 6-8-6-21 1600mhz which seems to be almost unique (don't think I have seen anyone else make cl6 at this speed) - would be interested to see if CL6 at 1600mhz was a match for much higher mhz
malphadour - Sunday, September 29, 2013
I think the comment that 1600mhz is bad can be taken with a pinch of salt here. Depends on who the PC is for. If it is normal use then 1600mhz cl9 is going to be fine all day long. Ian's point is, I think, aimed at the enthusiast who is benchmark chasing, in which case bigger is always better. It would be nice if the price of ram had not doubled. I was buying 8gb 1600mhz cl9 for £29.99 not too long ago; for two recent builds it was £54.99, nearly twice the price in the UK :(