Radeon Instinct Hardware: Polaris, Fiji, Vega

Diving deeper into matters, let’s talk about the Radeon Instinct cards themselves. The Instinct cards are for all practical purposes a successor (or spin-off) to AMD’s current FirePro S series cards, so if you are familiar with AMD’s hardware there, then you know what to expect: passively cooled cards geared for large-scale server installations, offered across a range of power and performance options.

As this is a new product line, the Instinct cards don’t have any immediate predecessors in AMD’s FirePro S lineup, but unsurprisingly, AMD has structured their new family of server cards similarly to how NVIDIA has structured their P4/P40/P100 lineup of deep learning cards. All told, AMD is announcing 3 cards today, all 3 of which tap different AMD GPUs and are (roughly) named after their expected performance levels.

AMD Radeon Instinct

                        | Instinct MI6 | Instinct MI8  | Instinct MI25
Memory Type             | 16GB GDDR5   | 4GB HBM       | "High Bandwidth Cache and Controller"
Memory Bandwidth        | 224GB/sec    | 512GB/sec     | ?
Single Precision (FP32) | 5.7 TFLOPS   | 8.2 TFLOPS    | 12.5 TFLOPS
Half Precision (FP16)   | 5.7 TFLOPS   | 8.2 TFLOPS    | 25 TFLOPS
TDP                     | <150W        | <175W         | <300W
Cooling                 | Passive      | Passive (SFF) | Passive
GPU                     | Polaris 10   | Fiji          | Vega
Manufacturing Process   | GloFo 14nm   | TSMC 28nm     | ?

Starting things off, we have the Radeon Instinct MI6. This is a Polaris 10 card analogous to the consumer RX 480. As Polaris doesn’t have much in the way of special capabilities for deep learning (more on this in a second), AMD is pitching the card as their baseline card for neural network inference (execution). At 5.7 TFLOPS (FP16 or FP32) it will draw under 150W, and while pricing for the family hasn’t been announced, I believe it’s a safe bet that as the baseline card the MI6 will offer the best performance per dollar across the Instinct family.

Meanwhile, in an unexpected move, AMD will be keeping their 2015 Fiji GPU around for the second card, the Instinct MI8. This card is for all intents and purposes a rebranded Radeon R9 Nano, AMD’s power-tuned Fiji card that has proven quite popular with their server customers. Within the Instinct lineup, it is essentially an unusual alternative to the MI6, offering higher throughput and greatly increased memory bandwidth for only a small increase in power consumption, with the drawback of Fiji’s 4GB VRAM limitation. Since it offers better performance than the MI6 and is smaller to boot, I expect we’ll see AMD pitch the MI8 as a premium alternative for inference.
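To put rough numbers on that trade-off, the short Python sketch below divides the MI8's figures by the MI6's, using only the values in the spec table above. The TDP values are AMD's stated upper bounds ("<150W", "<175W"), so the power ratio is an approximation rather than a measurement, and the dictionary keys are names chosen here for illustration.

```python
# Figures copied from the spec table; TDPs are upper bounds, not measured draw.
mi6 = {"fp32_tflops": 5.7, "bandwidth_gbs": 224, "tdp_w": 150}
mi8 = {"fp32_tflops": 8.2, "bandwidth_gbs": 512, "tdp_w": 175}


def ratio(key):
    """Return how many times larger the MI8 figure is than the MI6 figure."""
    return mi8[key] / mi6[key]


for key in mi6:
    print(f"{key}: MI8 is {ratio(key):.2f}x the MI6")
```

On these paper specifications, the MI8 offers roughly 1.44x the throughput and 2.29x the memory bandwidth for about 1.17x the rated power, with a quarter of the memory capacity.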

The MI6 and MI8 will be going up against NVIDIA’s P4 and P40 accelerators. AMD’s cards don’t directly line up against the NVIDIA cards in power consumption or expected performance, so the competitive landscape is somewhat broad, but those are the cards AMD will need to dethrone in the inference landscape. One potential issue here, and one I’m waiting to see if and how AMD addresses closer to the launch of the Instinct family, is the lack of high-speed modes for lower precision operations. The competing Tesla cards can process 8-bit integer (INT8) operations at up to 4x speed, something the MI6 and MI8 Instinct cards can’t do. INT8 is something of a special case, but if NVIDIA’s expectations for inferencing with INT8 come to pass, then it means AMD has to compete more strongly on price than performance.
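For readers unfamiliar with what INT8 inference involves, here is a minimal Python sketch of symmetric quantization, one common textbook way to map floating-point weights onto 8-bit integers. It is an illustration only: the scheme, the function names, and the example weights are assumptions made here, not a description of how NVIDIA's or AMD's software handles INT8.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats onto integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0  # one float scale shared by all values
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


weights = [0.82, -0.41, 0.05, -1.27]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
print(q, restored)
```

Each value now fits in a single byte, a quarter the size of an FP32 value, which is why hardware with dedicated INT8 paths can process several such operations in the slot of one 32-bit operation.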

Last, but certainly not least, in the Instinct family is the most powerful card of them all, and arguably the cornerstone of what the family is meant to become: the MI25. This is based on AMD’s forthcoming Vega GPU family, and while AMD is not sharing much in the way of new details on Vega today, they are leaving no doubts that this is going to be a high performance card. The passively cooled card is rated for sub-300W operation, and based on AMD performance projections elsewhere, AMD makes it clear that they’re targeting 25 TFLOPS FP16 (12.5 TFLOPS FP32) performance.

Significantly, one of the few things AMD is confirming about Vega right now is that it supports packed math formats for FP16 operations. This capability first appeared in Sony’s PlayStation 4 Pro, with a strong hint that it was a feature of a future AMD architecture, and now this has been confirmed.
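As a rough illustration of what "packed" means, the Python sketch below stores two FP16 values in one 32-bit word and adds both lanes in a single call, which is the idea behind the 2:1 FP16-to-FP32 rate in the MI25's specifications. AMD has not disclosed how Vega implements this, so the lane layout and the helper names (`pack_half2`, `unpack_half2`, `packed_add`) are assumptions for illustration; the code uses the half-precision `e` format of Python's standard `struct` module.

```python
import struct


def pack_half2(a, b):
    """Store two FP16 values side by side in one 32-bit word."""
    return struct.unpack("<I", struct.pack("<2e", a, b))[0]


def unpack_half2(word):
    """Split a 32-bit word back into its two FP16 lanes."""
    return struct.unpack("<2e", struct.pack("<I", word))


def packed_add(x, y):
    """Add both FP16 lanes of two 32-bit words, emulating one packed instruction."""
    a0, a1 = unpack_half2(x)
    b0, b1 = unpack_half2(y)
    return pack_half2(a0 + b0, a1 + b1)


x = pack_half2(1.5, 2.0)
y = pack_half2(0.25, 4.0)
result = unpack_half2(packed_add(x, y))
print(result)

# Two FP16 operations per 32-bit lane doubles the quoted FP32 rate.
fp16_tflops = 2 * 12.5
print(fp16_tflops)
```

Because each 32-bit register slot carries two values, the same ALU width retires twice as many FP16 operations per clock, which matches the 12.5 TFLOPS FP32 and 25 TFLOPS FP16 targets.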

With AMD pitching the MI25 as a training accelerator, offering a packed math mode for FP16 is critical to the product. Neural network training very rarely requires higher precision FP32 math, which is otherwise the default for GPUs. Instead, FP16 is suitably precise for a process that is inherently imprecise, and as a result offering a fast FP16 mode makes the card significantly faster at its intended task. Coupled with the already high throughput rates of GPUs due to their wide arrays of ALUs, this is what makes GPUs so potent at neural network training.
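To give a sense of how much precision FP16 actually gives up, this small Python sketch rounds a value to half precision with the standard `struct` module and measures the relative error. The weight value is an arbitrary example and `to_fp16` is a helper defined here, not part of any AMD or NVIDIA toolkit.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest FP16 value and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


weight = 0.1                      # arbitrary example weight
stored = to_fp16(weight)          # what an FP16 register would actually hold
rel_error = abs(stored - weight) / weight

print(stored, rel_error)
```

FP16 keeps 10 explicit mantissa bits, so for values in its normal range the relative rounding error stays below 2^-11, or about 0.05%, which is the scale of imprecision the paragraph above argues training can tolerate.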

As AMD’s sole training card, the MI25 will be going up against NVIDIA’s flagship accelerator, the Tesla P100. And as opposed to the inference cards, this has the potential to be a much closer fight. AMD has parity on packed instructions, with performance that on paper would exceed the P100. AMD has yet to fully unveil what Vega can do – we have no idea what “NCU” stands for or what AMD’s “high bandwidth cache and controller” are all about – but on the surface there’s the potential for the kind of knock-down fight at the top that makes for an interesting spectacle. And for AMD the stakes are huge; even if they can’t necessarily win, being able to price the MI25 even remotely close to the P100 would give them huge margins. More practically speaking, it means they could afford to significantly undercut NVIDIA in this space to capture market share while still making a tidy profit.

On a final note, while AMD isn’t commenting on the future of FirePro S or other server GPU products – so it’s not clear if Instinct will be their entire server GPU backbone or only part of it – it’s interesting to note that they are pointing out that one of the ways they intend to stand out from NVIDIA is to not restrict their virtualization support to certain cards.

In other words, if Instinct does end up being AMD’s sole line of server cards, then these cards will be fully capable of serving the virtualization market just as well as the deep learning markets.


39 Comments


  • Yojimbo - Monday, December 12, 2016 - link

    Well AMD's biggest problem is the software stack. But that issue aside, only the MI25 looks promising to me. I'm not sure why we should be too confident in AMD's ability to get the ball rolling with machine learning when they've had HPC offerings all along and barely had success. Guess we gotta wait and see.
  • Dribble - Tuesday, December 13, 2016 - link

    The way AMD does best right now is bidding for custom hardware for specific customers, combined with their willingness to accept lower margins than the opposition, so they win the deal. They can then do something general purpose based off that and sell some more, but the core funding is done by the big customer. See console deals, or the Apple GPU deal, for examples.

    Because that customer knows exactly what they want and AMD are so cheap, the customer does most of the software; AMD just provides hardware. That I suspect will be the real aim here - provide Google/Amazon/someone big with some cheap custom hardware.
  • webdoctors - Tuesday, December 13, 2016 - link

    Considering how the AMD ARM server initiative crashed and burned, I think this is going to be a pretty rough uphill battle. It seems the company has all the internal knowledge to create an end to end solution with their own CPU/GPU/motherboard/interconnect for HPCs, but somehow are drastically falling short against Intel and Nvidia whenever it's time to execute.

    The prices are going to have to be very competitive to get a foothold into this market, but this is a market that's also not as price conscious as the consumer segment, when you consider bad software or tools can lead to man-months wasted (which is easily tens of thousands of dollars when discussing Silicon Valley engineering salary time).

    Looks like 2017 should be interesting.
  • TheinsanegamerN - Tuesday, December 13, 2016 - link

    It would help if they could deliver on time and on budget. They always seem to get products out months after they are supposed to.
  • IntoGraphics - Wednesday, December 14, 2016 - link

    That's typical of AMD.
    Here I have a Gigabyte RX 480 8GB. And they haven't even got drivers for other Linux distros than Ubuntu and RHEL. (I'm on Arch Linux.) The drivers they have for Ubuntu and RHEL are buggy, and there is no Vulkan support and spotty OpenCL support.
    The Open Source drivers I'm using give me all kinds of artifacts and glitches in Blender as soon as its window is displayed. The Blender UI is constantly corrupted. I uninstalled.
    And now they are already busy with other cards.

    I'm also waiting for Zen to be released. To compare with what Intel has on offer. But it's highly unlikely that I'm going to stick into the meat grinder again.

    It's probably never ever AMD again.
    The current Linux driver situation is just unforgivable.
    The green camp introduced GTX 1060 1 month after RX 480 and they have Linux drivers for all Linux distros. Same old. Same old.
  • IntoGraphics - Wednesday, December 14, 2016 - link

    "I'm going to stick into the meat grinder again." should be "I'm not going to stick my d.ck in the meat grinder again.".
  • appsforsys - Saturday, December 17, 2016 - link

    Thank you for sharing your wonderful experience, I found it really very helpful and interesting.
    These tips are to be kept in mind for sure while writing.
    Good one!
  • IntoGraphics - Tuesday, January 3, 2017 - link

    I wish these incapable cu|\|ts would release stable, bug free and fast Linux drivers for Radeon RX for all Linux distros first. Motherfuckers, get your priorities right. Don't take money first and then ignore.
  • IntoGraphics - Wednesday, January 4, 2017 - link

    Radeon Itstinks.
