Battlefield 4 Mantle Performance Preview
by Ryan Smith on February 1, 2014 12:40 PM EST
After a false start or two, AMD is finally getting the first beta of Mantle out the door. With EA DICE having shipped their Mantle patch for Battlefield 4 and developer Oxide having released their Star Swarm technical demo, the first Mantle-enabled applications have landed. Meanwhile AMD for their part is still hammering out an installation issue on their new Mantle-enabled Catalyst drivers, which has led to them missing their previously scheduled January release date.
In the interim, AMD has released a slightly finickier set of drivers to the press for us to play around with ahead of the public Mantle driver release. These drivers should be identical to the public drivers in both functionality and performance; they just have an outstanding installation bug that requires a workaround, something that AMD doesn’t want in the shipping version. AMD hasn’t provided a public release date for these drivers – at this point it’s in their best interest to avoid providing release dates they don’t know if they can keep – but given the fact that this is the sole showstopper issue in our press drivers, we certainly don’t expect they’ll take much longer.
In any case, we’re hard at work at the moment putting together our full evaluation of this first version of Mantle. That article won’t be ready until next week, but in the meantime given the immense interest in Mantle, we wanted to quickly publish our first batch of numbers for Battlefield 4. We will have a much wider selection of benchmarks for our full article, including many more video cards and results for Star Swarm, but we wanted to quickly bring you what’s almost certainly going to be the most interesting set of data: Mantle performance with a high-end video card.
For that we’re turning to AMD’s Radeon R9 290X, testing the performance of that card under both Direct3D and Mantle in EA’s Battlefield 4. Battlefield 4 is Mantle’s showcase title and accordingly the first real world use case for AMD’s new API, making it the best place to start. As an application retrofitted with Mantle support we don’t expect Battlefield 4 to tap the complete potential of Mantle right out of the door – certainly not when the Mantle SDK and driver stack itself is still in development – but it can give us an idea of what kind of performance gains we can expect if developers chase the low-hanging fruit offered by Mantle.
What is that low-hanging fruit? For the most part that is going to be CPU bottlenecks, specifically bottlenecking in issuing draw calls. Of all of the bottlenecks that can impact a high performance GPU, keeping it fed can be the biggest bottleneck, and in turn bottlenecking in the draw call submission phase can be the biggest culprit. In the long term Mantle will also benefit GPU performance more directly by optimizing workflows within a GPU, and we already see a small bit of that today in Battlefield 4, but the bulk of the optimizations for these earliest titles have been made around the draw call bottleneck.
For our Mantle preview we’re taking a look at two sections of the Battlefield 4 single player game, the first being from the Tashgar mission and the second being from the South China Sea mission. As was the case with Battlefield 3 the use of single player is less than ideal, but as Battlefield 4 lacks a formal benchmark or for that matter the ability to record multiplayer matches, we’re left with single player if we want to have reasonably repeatable benchmarks. And we’ll definitely want a high degree of repeatability if we’re to be able to distinguish Mantle gains from variability in GPU bound scenarios.
Meanwhile to cover a wider spectrum of possibilities, we’re running our 290X against 3 CPU configurations on our GPU testbed. The first is our standard configuration: our i7-4960X with all cores and Hyper-Threading enabled (6C/12T), running at 4.2GHz. Our second configuration drops that down to 4C/4T at 2GHz, to test for the benefits of Mantle on a still relatively large core count at lower clockspeeds. Our final configuration takes the core count down further, to 2C/4T at 3GHz, so that we can see what performance is like for processors with fewer cores but higher clockspeeds.
Finally, on a quick note, for measuring Battlefield 4's performance we're using the game's newly built in PerfOverlay.FrameFileLogEnable feature, which replaces FRAPS in this game due to the fact that FRAPS only works with Direct3D and OpenGL. FrameFileLogEnable logs frame times for later analysis, and from this we can reconstruct the minimum and average framerates, and even the full frame pacing performance of the game (but only from the perspective of the game, not the video card). Today we'll be looking at just the average framerates, but be sure to come back next week for our full evaluation, where we'll have frame pacing data and minimum framerates ready to go.
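As a rough illustration of how average and minimum framerates fall out of a frame time log, here is a minimal Python sketch. It assumes the log has already been reduced to a plain list of per-frame times in milliseconds; the actual file layout written by PerfOverlay.FrameFileLogEnable is not covered here, and the function name and sample values are hypothetical.

```python
def summarize_frame_times(frame_times_ms):
    """Return (average_fps, minimum_fps) for a list of frame times in ms.

    Hypothetical helper: assumes one frame time per entry, in milliseconds.
    """
    if not frame_times_ms:
        raise ValueError("need at least one frame time")
    total_ms = sum(frame_times_ms)
    # Average framerate: frames rendered divided by total elapsed seconds.
    average_fps = len(frame_times_ms) / (total_ms / 1000.0)
    # Minimum framerate: the instantaneous rate of the single slowest frame.
    minimum_fps = 1000.0 / max(frame_times_ms)
    return average_fps, minimum_fps


# Made-up sample: three 10 ms frames and one 20 ms frame.
sample = [10.0, 10.0, 20.0, 10.0]
avg, minimum = summarize_frame_times(sample)
print(f"average: {avg:.1f} fps, minimum: {minimum:.1f} fps")
```

Note that the average is computed from total frames over total time rather than by averaging per-frame framerates, which would overweight the fast frames.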
CPU: Intel Core i7-4960X @ 4.2GHz
Motherboard: ASRock Fatal1ty X79 Professional
Power Supply: Corsair AX1200i
Hard Disk: Samsung SSD 840 EVO (750GB)
Memory: G.Skill RipjawZ DDR3-1866 4 x 8GB (9-10-9-26)
Case: NZXT Phantom 630 Windowed Edition
Monitor: Sharp PN-K321
Video Cards: AMD Radeon R9 290X (Uber)
Video Drivers: AMD Catalyst 14.1 Beta
OS: Windows 8.1 Pro
SP-Tashgar
Our first test comes from the Tashgar mission, and is the benchmark we will be using for day-to-day GPU benchmarking. This benchmark takes place immediately at the start of the mission, with our character driving out of the mountains and into the city of Tashgar. This benchmark has a limited CPU load and is GPU-bound in most situations, which potentially limits the benefits of Mantle in alleviating CPU bottlenecks, but gives us an idea of what kind of performance benefits we can expect in GPU-bound scenarios.
Battlefield 4 Tashgar: Mantle Performance Gains
CPU Configuration | Ultra | High | Low
i7-4960X 6C/12T @ 4.2GHz | 8% | 10% | -14%
i7-4960X 4C/4T @ 2GHz | 8% | 13% | 26%
i7-4960X 2C/4T @ 3GHz | 8% | 13% | 28%
Even at 1080p Ultra, where the Radeon R9 290X is clearly GPU-bound, we can see that switching to Mantle offers some performance improvements. With our i7-4960X fully powered up, this leads to an 8% performance increase, and we see similar performance increases even with other CPU configurations. Since we don’t appear to be CPU-bound in any appreciable way, this gives us a decent idea of what kind of GPU performance benefits Mantle can offer.
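For readers who want to work out gains like these from their own runs, here is a small Python sketch. It assumes the gain is the relative change in average framerate going from Direct3D to Mantle; the function name and the sample framerates are hypothetical and are not taken from our results.

```python
def mantle_gain_percent(d3d_fps, mantle_fps):
    """Relative change in framerate from Direct3D to Mantle, in percent.

    Assumption: gain = (Mantle - Direct3D) / Direct3D. Positive means
    Mantle is faster, negative means a regression.
    """
    if d3d_fps <= 0:
        raise ValueError("Direct3D framerate must be positive")
    return (mantle_fps - d3d_fps) / d3d_fps * 100.0


# Hypothetical framerates, chosen only to illustrate the arithmetic.
gain = mantle_gain_percent(100.0, 108.0)
regression = mantle_gain_percent(100.0, 90.0)
print(f"gain: {gain:.0f}%, regression: {regression:.0f}%")
```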
Meanwhile if we switch to High and Low settings, the higher framerates are able to tease out the CPU benefits of Mantle. With BF4’s High settings this is 10-13% depending on the CPU configuration, which indicates we’re still significantly GPU bound here.
Using Low quality settings on the other hand significantly widens the gap in both directions, with the minimum gain being -14%, and the maximum gain being 28%. In the case of our 6C/12T CPU configuration, Mantle actually has a detrimental impact on performance, bringing down our framerate from a positively absurd 216fps to a slightly less absurd 181fps. This was unexpected to say the least, and while we’re not particularly concerned about it given the fact that we have little reason to use this setting in day-to-day gaming, it does point to a weakness in the current builds of BF4 and the Mantle drivers.
Otherwise if we move to our slower CPU configurations, the benefits are 26% and 28% for 4C/4T and 2C/4T respectively. Despite the fact that the 4C/4T setup has more real cores to work with, which under normal circumstances would be the stronger setup for a highly threaded application, it’s the 2C/4T setup that technically benefits the most. The difference is quite small, but it’s an interesting outcome nonetheless.
SP-South China Sea
Our second test comes from the South China Sea mission of Battlefield 4, where our character and his squad are on the quickly disintegrating USS Titan. Whereas our first test is rather uniformly GPU-bound, the breakup of the USS Titan offers us the chance to look at a more CPU-bound scenario. Even this scene isn’t exclusively CPU-bound, but with ship parts and other debris flying around everywhere, it’s going to be one of the more strenuous CPU workloads in the single player game.
Battlefield 4 South China Sea: Mantle Performance Gains
CPU Configuration | Ultra | High | Low
i7-4960X 6C/12T @ 4.2GHz | 7% | 8% | 7%
i7-4960X 4C/4T @ 2GHz | 10% | 26% | 17%
i7-4960X 2C/4T @ 3GHz | 10% | 30% | 28%
Starting once again at 1080p Ultra, even with the greater CPU workload presented by this test, we are unsurprisingly still GPU-bound on Ultra settings. The benefits aren’t as uniform as last time – they now range from 7% to 10% – but it’s safe to say that we’re once again seeing what are mostly the GPU performance benefits of Mantle.
However shifting to High quality shows much greater performance gains, indicating that we’re at least partially (if not fully) CPU-bound here. Once we reduce our CPU performance from 6C/12T to 4C/4T, the performance gains from using Mantle jump from 8% to 26%, and then to 30% when using our 2C/4T configuration. For a game that’s not immensely CPU bound in the first place and has been retrofitted for Mantle, this is towards the upper bound of what we would expect.
Finally switching over to our Low quality settings causes our performance gains to actually taper off some. We’re still CPU-bound on our 4C/4T setup leading to a 17% performance gain, but we’re not as CPU-bound as we were at High quality settings, apparently. Meanwhile the performance gains for 2C/4T remain similar to last time, at 28%. Battlefield 4 has multiple CPU tasks going on here, not the least of which is the simulation itself, so in the case of our 4C/4T setups it’s likely we’ve stumbled onto a situation where the game is more strongly CPU-bound by the simulation and other aspects of the game than it is the submission of draw calls.
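To make that reasoning concrete, here is a deliberately simplified toy model in Python. It is our own illustration, not a description of how Battlefield 4 or Mantle actually schedule work: it assumes CPU and GPU work fully overlap so the slower side sets the frame time, that CPU time is just simulation plus draw call submission, and that a lower-overhead API only shrinks the draw call portion. All names and millisecond values are hypothetical.

```python
def frame_time_ms(sim_ms, draw_ms, gpu_ms):
    """Toy model: the slower of the CPU side and the GPU side sets the pace."""
    return max(sim_ms + draw_ms, gpu_ms)


def gain_percent(sim_ms, draw_ms, gpu_ms, draw_scale):
    """Framerate gain (%) when draw call cost is multiplied by draw_scale."""
    before = frame_time_ms(sim_ms, draw_ms, gpu_ms)
    after = frame_time_ms(sim_ms, draw_ms * draw_scale, gpu_ms)
    # Framerate is the inverse of frame time, so the gain is before/after - 1.
    return (before / after - 1.0) * 100.0


# Hypothetical workloads, all with draw call cost halved (draw_scale=0.5).
draw_bound = gain_percent(sim_ms=4.0, draw_ms=8.0, gpu_ms=6.0, draw_scale=0.5)
sim_bound = gain_percent(sim_ms=10.0, draw_ms=2.0, gpu_ms=6.0, draw_scale=0.5)
gpu_bound = gain_percent(sim_ms=2.0, draw_ms=2.0, gpu_ms=10.0, draw_scale=0.5)
print(f"draw-bound: {draw_bound:.0f}%, sim-bound: {sim_bound:.0f}%, "
      f"GPU-bound: {gpu_bound:.0f}%")
```

In this model the same draw call savings produce a large gain when submission dominates the CPU side, a small one when simulation dominates, and none at all when the GPU is the limit. The model ignores any GPU-side optimizations, which is why it shows zero gain in the GPU-bound case where our Ultra results show a modest one.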
First Thoughts
As this is only a brief preview of our results we don’t intend to read too much into this limited data set, but even just looking at the 290X does provide us with some interesting data. For the pure high-end scenario – a 290X or similar GPU with a high-end CPU – Mantle can still offer performance benefits from the GPU workflow optimizations it provides. A 7-10% performance increase is not a dramatic difference, but it is 7-10% better performance than AMD had yesterday.
Meanwhile it comes as little surprise that the greatest performance benefits in our limited BF4 testing come in the mixed performance scenarios, pairing up a high-end GPU with slower CPUs. Since the lowest hanging fruit for Mantle optimizations is going to be CPU draw call bottlenecks, it’s going to be the weaker CPUs that have the most to gain here. In this case we still need to go out of our way to create CPU-bound scenarios – the 290X is rarely held back by the CPU on Ultra quality settings – but when we do create them we can see some of the potential that Mantle can offer. At High and Low quality settings, and excluding our one Mantle performance regression, we see performance gains anywhere between 7% and 30%. This shows (if nothing else) that even a retrofit game with a highly optimized Direct3D rendering path, like Battlefield, can still be bottlenecked by draw call performance, and that consequently some of Mantle’s CPU overhead reduction capabilities do in fact pan out.
Whether all of this is worth the costs and tradeoffs of Mantle, from both a consumer perspective and a developer perspective, is a longer discussion that we’ll be having next week, alongside our expanded benchmark results. But at first glance it looks like AMD has cleared the first hurdle, which is showcasing that there are tangible benefits to having a low-level graphics API. Now AMD just needs to further hammer out their Mantle drivers and get them into a publicly consumable state, so that the wider community of end-users can test and evaluate AMD’s Mantle offering. Outside of the known installation issue we have not encountered any issues with Mantle thus far – this being despite the fact that AMD is being very explicit about the beta nature of the Mantle stack – so hopefully this is a good omen for the company after the delays leading up to this point.
AMD's Official Performance Data
Finally, we’ll quickly close with some of AMD’s performance numbers, which they’ve published in their reviewer’s guide. We feel that vendor-provided numbers should always be taken with a grain of salt, but they do serve their purpose, especially for getting an idea of what performance is like under a best case scenario. To that end we can quickly see that AMD was able to top out at a 41% performance improvement on a 290X paired with an A10-7700K. This is a greater performance gain than the peak gain of 30% we’ve seen in our own results, but not immensely so. More importantly it can give us a good idea of what to reasonably expect for performance under Battlefield 4. If AMD’s results are accurate, then a 40% performance improvement is the most we should be expecting out of Battlefield 4’s Mantle renderer.
135 Comments
Omegaclawe - Sunday, February 2, 2014 - link
I think you might have a few misconceptions, here. First of all, AMD isn't going to pay per game. That would just be silly. The developers aren't going to, either. Instead, AMD pays the people who make the engines (e.g. EPIC for UE3/4, DICE for Frostbite). They do the work once, and then the vast majority of developers don't have to touch it. That's kinda the whole point of an Engine, so you don't have to do that work and can focus on the game.
Incidentally, changing just the rendering backend really isn't difficult at all. It's only about 6000 lines of code. Once you throw in cross platform methods for input, sound, multitasking, etc, it becomes a significant amount of work, but mantle by itself is something like a man-week's worth of work. Maybe a man-month, if significant debugging is required.
On top of that, for heavy CPU-based games (like what oxide studio's trying to do) the CPU resources freed by Mantle enable you to do things you couldn't with DirectX / OpenGL. It also can vastly improve the play experience by preventing CPU spikes from interfering with the rendering and can help with frame pacing. Ultimately, the games probably feel like they're getting more than an 8% FPS boost... which is still more than the difference between an R9 290 and a Titan, but people were willing to pay an extra $600 for that.
Mantle isn't perfect, though. The CPU usage boost is nice, but that 8% or so GPU increase absolutely locks AMD to the GCN architecture, which means they can't make significant strides, like Nvidia has between, say, Fermi and Kepler, without losing the mantle advantage over Nvidia.
Honestly, what needs to be done, probably, is just taking OpenGL and take the heavy lifting off the driver's hands and shifting it to the program itself. You'd get the same CPU boost but with wider hardware support.
As far as WebGL, HTML5, and (ugh) Java go, these platforms tend to offer less security and tend to be absolutely terrible at multithreading. Java especially, the buggy piece of junk it is. Frankly, you could not make Crysis 3 in Java. If you wanted to make it run on Java, you'd probably need a high-end machine from circa 2020. Do you want all your games to look like they're from 2008 right now? Nooo? Then don't develop for Java!
I also seriously doubt Nvidia is developing a "Mantle killer" in any large capacity. No doubt they've looked into it, but considering how much better their drivers tend to be at managing the CPU and wot, they would not get as large of boost from Mantle, Nvidia almost never does open type designs willingly, and they're not the first to this market. Mantle would need to provide a consistent 20%+ bonus before I can see Nvidia really trying to tackle it.
As far as Free-Sync vs G-Sync is concerned, I imagine it'll be something along the lines of CUDA, where the Nvidia-specific solution is actually inferior, but popular because of marketing and because it was "first". Still, Nvidia cards will undoubtedly be able to support both, whereas G-Sync monitors will only work with Nvidia cards, so they'll make some money that way... and I wouldn't exactly call G-Sync market ready, either.
djentleman - Sunday, February 2, 2014 - link
Amds PR has brain washed you. Beyond fixing
djentleman - Sunday, February 2, 2014 - link
Tl;dr
Have you heard of the maxwell arm chip?
rarson - Sunday, February 2, 2014 - link
Nvidia ARM anything makes me laugh.
Kaleid - Sunday, February 2, 2014 - link
"hardocp says this chip wasn't really designed for 28nm and needs 20) and also enhancing the CPU's. "
It's fine at 28nm, but needs a better default cooler.
rvalencia - Sunday, February 2, 2014 - link
PS4 uses GNM API/PSSL which claimed to be similar to Mantle.
djentleman - Sunday, February 2, 2014 - link
By amd right?
And just because it is similar, doesn't mean it is easy to program for.
That's like saying I wrote a program in c; so it must be like copy and paste to java?
They're similar, but not the same.
rarson - Sunday, February 2, 2014 - link
Every driver rev produces a 7-10% performance increase? And I've been replacing my hardware all these years! Silly me!
Developers have been asking for a lower-level API for years. OpenGL isn't any lower than DirectX. I fail to see what kind of major advantage SteamOS is going to have over DirectX or Mantle. In fact, the stuff that you're saying about Mantle actually applies MORE to SteamOS: developers have to waste time and resources porting their code over Linux and OpenGL in order to support a software platform that is fragmenting their PC userbase. OpenGL has been around for years and still isn't commonly used for PC games. Ever wonder why that is?
Dice releases a patch for a DirectX game that improves performance anywhere from 7-30%, or possibly more for slower CPUs. Seems like a no-brainer. Especially when considering that the paradigm that this benefits matches the new consoles exactly. To me, it seems that SteamOS has much more to overcome than Mantle.
If Mantle gains support, then there won't be any reason to buy expensive Intel CPUs for gaming computers that don't need them. Not that there is a good reason as it is, since most games aren't really CPU-bound with modern CPUs anyway, as these benchmarks clearly indicate.
mikato - Wednesday, February 5, 2014 - link
@TheJian - uhhhh did you notice the performance increase? Because your comment made it sound like there is none. That could be a reason why developers would want it. It's a competitive advantage, not something like hypothetically trying to charge more for AMD users vs everyone else. And you probably could charge more if your game runs better, and you can make the graphics look better... or you could keep it the same price and just enjoy the broader gamer base it enables.
chizow - Saturday, February 1, 2014 - link
How could you possibly think that when EA launched Mantle update on time, while AMD has repeatedly delayed their driver launch the last few days, constantly changing support levels and acknowledging "nasty bugs" that prevented their Mantle driver from releasing?