For over two years the collective AMD vs Intel personal computer battle has been sitting on the edge of its seat. Back in 2014 when AMD first announced it was pursuing an all-new microarchitecture, old hands recalled the days when the battle between AMD and Intel was fun to be a part of, and users were happy that the competition led to innovation: not soon after, the Core microarchitecture became the dominant force in modern personal computing today. Through the various press release cycles from AMD stemming from that original Zen announcement, the industry is in a whipped frenzy waiting to see if AMD, through rehiring guru Jim Keller and laying the foundations of a wide and deep processor team for the next decade, can hold the incumbent to account. With AMD’s first use of a 14nm FinFET node on CPUs, today is the day Zen hits the shelves and benchmark results can be published: Game On!

[A side note for anyone reading this on March 2nd. Most of the review is done, bar an edit pass or two. Most of the results pages have graphs, but no text. Please have patience while we finish.]

Launching a CPU is like an Onion: There are Layers

As AnandTech moves into its 20th anniversary, it seems poignant to note that one of our first big review pieces that gained notoriety, back in 1997, was on a new AMD processor launch. Our style and testing regime since then has expanded and diversified through performance, gaming, microarchitecture, design, feature-set and debugging marketing strategy. (Fair play to Anand, he was only 14 years old back then…!). Our review for Ryzen aims to encompass many of the areas we feel pertinent to the scale of today’s launch.

An Interview with Dr Lisa Su, CEO of AMD
The Product Launch: A Small Stack of CPUs
Microarchitecture: Front-End, Back-End, micro-op cache, GloFo 14nm LP
Chipsets and Motherboards: AM4 and 300-series
Stock Coolers and Memory: Wraith and DDR4
Benchmarking Performance: Our New 2017 CPU Benchmark Suite
Conclusions

Before the official launch, AMD held a traditional ‘Tech Day’ for key press and analysts in the past week over in San Francisco. The goal of the Tech Day was to explain Zen to the press, before the press explain it to everyone else. AMD likes doing Tech Days, with recent events in memory covering Polaris and Carrizo, and I don’t blame them: alongside the usual slide presentations recapping what is already known or ISSCC/Hot Chips papers, we also get a chance to sit down with the VPs and C-Level executives as well as the key senior engineers involved in the products. Some of the information in this piece we would have published before: AMD’s machine of tickle-out information means that we know a good deal behind the design already, but the Tech Day helps us ask those final questions, along with the official launch details.

So, What Is Zen? Or Ryzen?

A Brief Backstory and Context 

The world of main system processors, or CPUs in this case, is governed at a high level by architecture (x86, ARM), microarchitecture (how the silicon is laid out), and process node (such as 28HPM and 16FF+ from TSMC, or 14LP from GloFo). Beyond these we have issues of die area, power consumption, transistor leakage, cost, compatibility, dark silicon, frequency, supported features and time to market.

Over the past decade, both AMD and their main competitor Intel have been competing for the desktop and server computer markets with processors based on the x86 CPU architecture. Intel moved to its leading Core microarchitecture on 32nm around 2010, and is now currently in its seventh generation of it at 14nm, with each generation implementing various front-end/back-end changes and design adjustments to caches and/or modularity (it’s actually a lot more complex than this). By comparison, AMD is running its fourth/fifth generation of Bulldozer microarchitecture, split between its high-end CPUs (running an older 2nd Generation on 32nm), its mainstream graphics focused APUs (running 4th/5th Generation on 28nm), and laptop-based APUs (on 5th Generation).

High-Performance and Mainstream
x86 Microarchitectures
AMD Year Intel
Zen 14FF 2017   ?
Excavator v2 GF28A 2016 14+FF Kaby Lake
Excavator GF28A 2015 14FF Skylake
Steamroller 28SHP 2014 14FF Broadwell
- - 2013 22FF Haswell
Piledriver 32 SOI 2012 22FF Ivy Bridge
Bulldozer 32nm 2011 32nm Sandy Bridge
- - 2010 32nm Westmere
K10 45nm Other 45nm Nehalem

AMD’s Bulldozer-based design was an ambitious goal: typically a processor core will have a single instruction entry point leading to a decode then dispatch to separate integer or floating point schedulers. Bulldozer doubled the integer scheduler and throughput capabilities, and allowed the module to accept two threads of instructions - this meant that integer heavy workloads could perform well, other bottlenecks notwithstanding. The perception was that most OS-based workloads are invariably integer/whole number based, such as loop iterations or true/false Boolean comparisons or predicates. Unfortunately the design ended up with a couple of key issues: OS-based workloads ended up getting more floating-point heavy, traditional OSes didn’t understand how to dispatch work given a dual-thread module leading to overloaded modules (this was fixed with an update), general throughput of a single thread was a small (if any) upgrade over the previous microarchitecture design, and power consumption was high. A number of leading analysts pointed to design philosophies in the Bulldozer design, such as a write-through L1 cache and a shared L2 cache within a module lead to higher latencies and increased power consumption. A good chunk of the deficit to the competition was the lack of a micro-op cache that can help bypass instruction decode power and latency, as well as a process node disadvantage and the incumbent’s strong memory pre-fetch/branch prediction capabilities.

The latest generation of Bulldozer, using ‘Excavator’ cores under the Carrizo name for the chip as a whole, is actually quite a large jump from the original Bulldozer design. We have had extensive reviews of Carrizo for laptops, as well as a review of the sole Carrizo-based desktop CPU, and the final variant performed much better for a Bulldozer design than expected. The fundamental base philosophy was unchanged, however the use of new GPUs, a new metal stack in the silicon design, new monitoring technology, new threading/dispatch algorithms and new internal analysis techniques led to a lower power, higher performance version. This was at the expense of super high-end performance above35W, and so the chip was engineered to focus at more mainstream prices, but many of the technologies helped pave the way for the new Zen floorplan.

Zen is AMD’s new microarchitecture based on x86. Building a new base microarchitecture is not easy – it typically means a large paradigm shift in the way of thinking between the old design and new design in key areas. In recent years, Intel alternates key microarchitecture changes between two separate CPU design teams in the US and Israel. For Zen, AMD made strides to build a team suitable to return to the high-end. Perhaps the most prominent key hire was CPU Architect and all-around praised design guru Jim Keller. Jim has a prominent history in chip design, starting with the likes of Freescale, to developing chips for AMD when they fighting tooth and nail with Intel in the early 2000s, then moving on to Apple to work on the A5/A6 designs. AMD were able to re-hire Jim and put a fire in his belly with the phrase: ‘Make a new high-performance x86 CPU’ (not in those exact simple words…. I hope).


From left to right: Mark Papermaster (CTO AMD), Dr Lisa Su (CEO AMD),
Simon Segars (CEO ARM) and Jim Keller (former AMD, now Tesla)

Jim’s goal was to build the team and lay the groundwork for the new microarchitecture, which for the long-term will revolve around AMD Processor Architect Michael (Mike) Clark. Former Intel engineer Sam Naffziger, who was already working with AMD when the Zen team was put together, worked in tandem with the Carrizo and Zen teams on building internal metrics to assist with power as well. Mark Papermaster, CTO, has stated that over 300 dedicated engineers and two million man-hours have been put into Zen since its inception.

As mentioned, the goal was to lay down a framework. Jim Keller spent three years at AMD on Zen, before leaving to take a position under Elon Musk at Tesla. When we reported that Jim had left AMD, a number of people in the industry seemed confused: Zen wasn’t due for another year at best, so why had he left? The answers we had from AMD were simple – Jim and others had built the team, and laid the groundwork for Zen. With all the major building blocks in place, and simulations showing good results, all that was needed was fine tuning. Fine tuning is more complex than it sounds: getting caches to behave properly, moving data around the fabric at higher speeds, getting the desired memory and latency performance, getting power under control, working with Microsoft to ensure OS compatibility, and working with the semiconductor fabs (in this case, GlobalFoundries) to improve yields. None of this is typically influenced by the man at the top, so Jim’s job was done.

This means that in the past year or so, AMD has been working on that fine tuning. This is why we’ve slowly seen more and more information coming out of AMD regarding microarchitecture and new features as the fine-tuning slots into place. All this culminates in Ryzen, the official name for today’s launch.

Along with the new microarchitecture, Zen is the first CPU from AMD to be launched on GlobalFoundries’ 14nm process, which is semi-licenced from Samsung. At a base overview, the process should offer 30% better efficiency over the 28nm HKMG (high-k metal gate) process used at TSMC for previous products. One of the issues facing AMD these past few years has been Intel’s prowess in manufacturing, first at 22nm and then at 14nm - both using iterative FinFET generations. This gave an efficiency and die-size deficit to AMD through no real fault of their own: redesigning older Bulldozer-derived products for a smaller process is both difficult and gives a lot of waste, depending on how the microarchitecture as designed. Moving to GloFo’ 14nm on FinFET, along with a new microarchitecture designed for this specific node, is one stepping stone to playing the game of high-end CPU performance.

But returning to high-end CPU performance is difficult. There are two crude ways to measure performance: overall throughput, or per-core throughput. Having a low per-core throughput but using lots and lots of cores to increase overall throughput already has a prime example in graphics cards, which are mainly many small dedicated vector processors in a single chip. For general purpose computing, we see designs such as ThunderX putting 64+ ARM cores in a single chip, or Intel’s own Xeon Phi that has up to 72 x86 cores as a host processor (among others). All of these have relatively low general purpose single-core performance, either due to the fact that they focus on specific workloads, their architecture, or they have to run at super-low frequency to be within a particular power budget. They’re still high-performance, but the true crown lies with a good per-core performance. Per-core is where Intel has been winning, and the latest Skylake/Kaby Lake microarchitecture holds the lead due to the way the core is designed, tuned, and produced on arguably the most competitive manufacturing process node geared towards high performance and efficiency.

With the Zen microarchitecture, AMD’s goal was to return to high-end CPU performance, or specifically having a competitive per-core performance again. Trying to compete with high frequency while blowing the power budget, as seen with the FX-9000 series running at 220W, was not going to cut it. The base design had to be efficient, low latency, be able to operate at high frequency, and scale performance with frequency. Typically per-core performance is measured in terms of ‘IPC’, or instructions per clock: a processor that can perform more operations or instructions per Hz or MHz has a higher performance when all other factors are equal. In AMD’s initial announcements on Zen, the goal of a 40% gain in IPC was put into the ecosystem, with no increase in power. It is pertinent to say that there were doubts, and many enthusiasts/analysts were reluctant to claim that AMD had the resources or nous to both increase IPC by 40% and maintain power consumption. In normal circumstances, without a significant paradigm shift in the design philosophy, a 40% gain in efficiency can be a wild goose chase. Then it was realised that AMD were suggesting a +40% gain compared to the best version of Bulldozer, which raised even more question marks. At the formal launch last week, AMD stated that the end goal was achieved with +52% in industry standard benchmarks such as SPEC from Piledriver cores (with an L3 cache) or +64% from Excavator cores (no L3 cache).

Moving nearer to the launch, and with more details under our belts, it was clear that AMD were not joking but actually being realistic. Analyzing what we were told about the core design (we weren’t told everything to begin with) would seem to suggest that AMD, through this long-term CPU team and core design plan, have all the blocks in place necessary to do the deed. Aside from getting an actual chip on hand to test, analysts were still concerned about AMD’s ability to execute both on time and sufficiently to make a dent and recreate a highly competitive x86 market.

Throughout the lifetime of the Zen concept at AMD, they have seen a couple of corporate restructurings, changes at the high C-level executives, time when the company value dipped below the company asset value, restructuring of manufacturing and wafer deals with GlobalFoundries (which was spun out of AMD before Zen), sale and re-lease of major corporate buildings, and the integration/de-integration of the graphics team within the company. Within all that, Bulldozer didn’t make the impact it needed to, in part due to pre-launch hype and underwhelming results. Sales volumes have been declining (partly due to lower global PC demand), and server-based revenues have dropped to almost nil as we’ve not seen a new Opteron in five years. In short, revenue has been decreasing. AMD’s ability to get their finances under control, and then focus and execute, has been under question.

In the past few months, that outlook has changed. The top four executives involved in Zen, Dr. Lisa Su (CEO), Mark Papermaster (CTO), Jim Anderson (CVP Client) and Forrest Norrod (CVP Server), have been stable in the company for a few years, and all four have that air of quiet confidence. It is rather surprising to see three traditional processor engineers at the top of the chain for a CPU launch, rather than a sales or marketing focused executive – typically an engineer releasing a product with a grin on their face is often a good sign. Apart from that, AMD’s semi-custom revenues have increased through securing deals with the leading consoles, the GPU side of the company (Radeon Technology Group, headed under Raja Koduri) successfully executed at 14nm with mainstream parts in 2016, and for that confidence has AMD now trading at over $12/share, just under their 2004/2005 peak. At the minute a large number of individuals and companies are invested in AMD succeeding, both consumers and financials alike, and recent performance on execution has been a positive factor.

Since August, AMD has been previewing the Zen microarchitecture in actual pre-production silicon, running Windows and running benchmarks. With the 40%+ gain in IPC constantly being quoted, and now a +52% being presented, most users and interested parties want to know exactly where it sits compared to an equivalent Intel product. Back in August AMD performed a single, almost entirely dismissible, benchmark showing equivalent IPC of Zen with Broadwell-E (read our write up on that here, and reasons why it’s difficult to interpret a single benchmark). AMD had a ‘New Horizons’ (Ho-Ryzen, get it?) event in December and announcements at CES in January showing performance in gaming and encoding, in particular Handbrake. It is worth noting that Intel, as you would expect any competitor, went through each open source project AMD had used an implemented updates to assist dead/bad code (e.g. coalescing read/writes to comply with AMD/Intel design guidelines) or offer improvements (adding in 256-bit vector codes to combine consecutive 128-bit compute*), which is why some of the results from the launch today are different from what AMD has shown. (If anyone thinks this is ‘unfair’, it begs a bigger question as to how IPC is a good measure of performance if the code is IPC limiting itself, which is a topic for another day.) At the Tech Day, AMD went into further detail regarding various internal benchmarks, both CPU and gaming related, showing either near parity or a fair distribution of win/loss metrics, but ultimately giving better price/performance.

*In this instance, having 4 x 128-bit vector math usually creates 4 ops. Arranging code to 256-bit can give 2 x 256-bit for two ops, which takes two cycles on Zen per core but only one cycle on Intel’s latest.

Recently AMD held a talk at ISSCC (the International Solid State Circuits Conference), which we’ll go into as part of this review, but one thing is worth mentioning here. Along with AMD requiring Zen to have good single thread performance, they also wanted something small. At ISSCC, AMD showed a slide comparing a Zen die shot to a ‘competitor’ (probably Skylake) die area:

Here AMD is comparing a four-core part to a four-core part, showing that while AMD’s L2 is bigger, the fin pitch is larger, the metal pitch is larger, and its SRAM cells are larger, the quad core design on 14nm GloFo is smaller than 14nm Intel. This is practically due to the way the Zen cores are designed vs. Intel cores – the simple explanation is that the Intel cores have wider execution ports (such as processing a 256-bit vector in one micro-op rather than two) and, we suspect, a larger area dedicated to pre-fetch logic. There’s also the consideration of dark silicon, to help assist the thermal environment around the logic units, which is undisclosed.

I mention this because one of AMD’s goals, aside from the main one of 40%+ IPC, is to have small cores. As a result, as we’ll see in this review, various trade-offs are made. Rather than dedicating a larger die area to execution ports that support AVX in one go, AMD decided to use 2x128-bit vector support (as 1x256) but increase the size of the L2 cache. A larger L2 will ensure fewer cache misses in varied code, perhaps at the expense of high-end compute which can take advantage of larger vectors. The larger L2 will help with single thread performance (all other factors being equal), and AMD knows that Intel’s cache is high performance, so we will see other trade-offs like this in the design. This is obviously combined with AMD’s recent work on having high-density libraries for cache and compute.

AMD has already set out a roadmap for different markets with products based on the Zen microarchitecture. Today’s launch is the first product line, namely desktop-class processors under the Ryzen 7 family name. Following Ryzen 7 will be Ryzen 5 (Q2) and Ryzen 3 (2H17) for the desktop. There is also AMD’s new server platform, Naples, featuring up to 32 cores on a single CPU (we’ll discuss Naples later in the review). We expect Naples to be in the Q2/Q3 timeframe. In the second half of this year (Q3/Q4), AMD is planning to launch mobile processors based on Zen, called Raven Ridge. This processor line-up will most likely be with integrated graphics, and it will be interesting to see how well the microarchitecture scales down into low power. As Ryzen CPUs are geared towards a more performance focused audience, and the fact that they lack integrated graphics, AMD will also maintain its line of ‘Bristol Ridge’ platform (Excavator based processors that also support the 300-series socket). The goal of Bristol Ridge is to fill in at the low end, or for systems that want the option of discrete graphics. Given what we’ve seen in benchmarks, despite recent statements from AMD to the contrary, the FX-line of AM3+ CPUs seems to be coming to an end.

For users looking to go buy parts today, here’s the pertinent information you need to know. The Ryzen CPUs will form part of the ‘Summit Ridge’ platform – ‘Summit Ridge’ indicates a Ryzen CPU with a 300-series chipset (X370, B350, A320). Both Bristol Ridge and Summit Ridge, and thus Ryzen, mean that AMD makes the jump to a desktop platform that supports DDR4 and a high-end desktop platform that supports PCIe 3.0 natively. These CPUs use the AM4 socket on the 300-series motherboards, which is not compatible with any of AMD’s previous socket designs, and also has slightly different socket hole distances for CPU coolers that don’t use AMD’s latch mechanism. Despite the socket hole size change, AMD has kept the latch mechanism in the same place, but coolers that have their own backplates will need new supporting plates to be able to be used (we’ve got a bit on this later in the review as well).

An Interview with Dr. Lisa Su, CEO of AMD
Comments Locked

574 Comments

View All Comments

  • RotJ2017 - Thursday, March 2, 2017 - link

    Why wasn't the test bed set up with a 960 EVO or Pro in the Turbo M2 slot which has direct lanes to the CPU? The SATA drive goes through the Chipset. This would have given a better look at the CPU rather than variances with the chipset.
  • Aerodrifting - Thursday, March 2, 2017 - link

    Come on, Where is the overclocking review that we actually care about! Everything else is pretty much the same from what we knew before NDA.
  • ThreeDee912 - Friday, March 3, 2017 - link

    Well at least we know AMD wasn't lying. Whatever they claimed before release was basically the same as the actual result.
  • HighTech4US - Friday, March 3, 2017 - link

    AMD wasn't lying - Really?

    gamersnexus outlined the "tricks?" used by AMD in benchmarking Ryzen before the launch:

    AMD inflated their numbers by doing a few things:

    In the Sniper Elite demo, AMD frequently looked at the skybox when reloading, and often kept more of the skybox in the frustum than on the side-by-side Intel processor. A skybox has no geometry, which is what loads a CPU with draw calls, and so it’ll inflate the framerate by nature of testing with chaotically conducted methodology. As for the Battlefield 1 benchmarks, AMD also conducted using chaotic methods wherein the AMD CPU would zoom / look at different intervals than the Intel CPU, making it effectively impossible to compare the two head-to-head.

    And, most importantly, all of these demos were run at 4K resolution. That creates a GPU bottleneck, meaning we are no longer observing true CPU performance. The analog would be to benchmark all GPUs at 720p, then declare they are equal (by way of tester-created CPU bottlenecks). There’s an argument to be made that low-end performance doesn’t matter if you’re stuck on the GPU, but that’s a bad argument: You don’t buy a worse-performing product for more money, especially when GPU upgrades will eventually out those limitations as bottlenecks external to the CPU vanish.

    As for Blender benchmarking, AMD’s demonstrated Blender benchmarks used different settings than what we would recommend. The values were deltas, so the presentation of data is sort of OK, but we prefer a more real-world render. In its Blender testing, AMD executes renders using just 150 samples per pixel, or what we consider to be “preview” quality (GN employs a 3D animator), and AMD runs slightly unoptimized 32x32 tile sizes, rendering out at 800x800. In our benchmark, we render using 400 samples per pixel for release candidate quality, 16x16 tiles, which is much faster for CPU rendering, and a 4K resolution. This means that our benchmarks are not comparable to AMD’s, but they are comparable against all the other CPUs we’ve tested. We also believe firmly that our benchmarks are a better representation of the real world. AMD still holds a lead in price-to-performance in our Blender benchmark, even when considering Intel’s significant overclocking capabilities (which do put the 6900K ahead, but don’t change its price).

    As for Cinebench, AMD ran those tests with the 6900K platform using memory in dual-channel, rather than its full quad-channel capabilities. That’s not to say that the results would drastically change, but it’s also not representative of how anyone would use an X99 platform.

    Conclusion:
    Regardless, Cinebench isn’t everything, and neither is core count. As software developers move to support more threads, if they ever do, perhaps AMD will pick up some steam – but the 1800X is not a good buy for gaming in today’s market, and is arguable in production workloads where the GPU is faster. Our Premiere benchmarks complete approximately 3x faster when pushed to a GPU, even when compared against the $1000 Intel 6900K. If you’re doing something truly software accelerated and cannot push to the GPU, then AMD is better at the price versus its Intel competition. AMD has done well with its 1800X strictly in this regard. You’ll just have to determine if you ever use software rendering, considering the workhorse that a modern GPU is when OpenCL/CUDA are present. If you know specific in stances where CPU acceleration is beneficial to your workflow or pipeline, consider the 1800X.

    For gaming, it’s a hard pass. We absolutely do not recommend the 1800X for gaming-focused users or builds, given i5-level performance at two times the price. An R7 1700 might make more sense, and we’ll soon be testing that.

    http://www.gamersnexus.net/hwreview...review-premi...
  • Meteor2 - Saturday, March 4, 2017 - link

    Now *that's* a good post.
  • Cooe - Monday, March 1, 2021 - link

    The i7-6900K was running wuad-
  • Cooe - Monday, March 1, 2021 - link

    The i9-6900K WAS running quad-channel in the Cinebench benchmark. It also has 2x as much RAM as the Ryzen machine (32GB vs 16GB). This post is nonsense.
  • webdoctors - Thursday, March 2, 2017 - link

    This looks like a great product, if all the AMD SKUs are 50% MSRP of equivalent Intel SKUs, its going to do amazing. The trolls here complaining about a 10% perf difference with a 50% difference in price are just plain nuts.

    The extra $$$ you save here will go into the GPU directly translating to a greater than 10% improvement in games, and for other workloads its not interesting at the consumer level.
  • lllllllllllll - Thursday, March 2, 2017 - link

    @IanCutress
    Can you elaborate on HEVC encoding results?
    Why 7700k is better then 1800x specifically in that benchmark?
  • Makaveli - Thursday, March 2, 2017 - link

    AVX?

Log in

Don't have an account? Sign up now