At some point, probably soon, AMD is going to release a beta driver that will enable support for its new Mantle API on video cards and systems-on-chips using its Graphics Core Next (GCN) architecture. That spans Radeon HD 7000, HD 8000, R7, and R9 series devices, as well as APUs such as the recently launched Kaveri line.

Awkwardly, the drivers have been delayed. The first application to use Mantle is, however, available: the very latest update to Battlefield 4 will use AMD's new API if it can. The driver release was meant to happen on Thursday morning; we're still waiting.

AMD claims that by reducing overheads and letting developers get closer to the metal, Mantle can provide substantial frame rate improvements on CPU-bound systems.

For example, AMD says an AMD A10-7700K paired with a Radeon R9 290X with Ultra settings and 4×AA showed a 40.9 percent improvement in frame rates at 1920×1080 and a 40.1 percent improvement at 2560×1600. This is a system with a very fast video card, but a mid- to low-end CPU. With dual video cards, even relatively fast processors showed improvements: Electronic Arts claims that an Intel Core i7-3970X with a pair of Radeon R9 290X GPUs saw frame rates go up by 58 percent at 1920×1080.

On GPU-bottlenecked systems, the gains are much more modest: an Intel Core i7-4960X paired with a Radeon R7 260X showed improvements of only 2.7 percent and 1.4 percent at those same resolutions, again per AMD's own figures.
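To see why the gains depend so heavily on where the bottleneck sits, here is a minimal sketch of an idealized model in which the CPU and GPU work in parallel and the slower of the two sets the frame rate. The function names and all of the timings are our own illustrative assumptions, not AMD's or EA's measurements, and real systems are messier: the GPU-bound figures above are small but not zero.

```python
def frame_rate(cpu_ms: float, gpu_ms: float) -> float:
    """Frames per second when CPU and GPU work overlap and the slower side sets the pace."""
    return 1000.0 / max(cpu_ms, gpu_ms)


def improvement_percent(cpu_ms: float, gpu_ms: float, cpu_saving: float) -> float:
    """Percent frame rate gain after cutting CPU time per frame by the fraction cpu_saving."""
    before = frame_rate(cpu_ms, gpu_ms)
    after = frame_rate(cpu_ms * (1.0 - cpu_saving), gpu_ms)
    return (after / before - 1.0) * 100.0


# Illustrative timings only (assumed, not measured).
cpu_bound = improvement_percent(cpu_ms=20.0, gpu_ms=10.0, cpu_saving=0.3)
gpu_bound = improvement_percent(cpu_ms=8.0, gpu_ms=20.0, cpu_saving=0.3)

print(f"CPU-bound system: {cpu_bound:.1f} percent faster")
print(f"GPU-bound system: {gpu_bound:.1f} percent faster")
```

In this model, the same 30 percent cut in CPU time gives a large gain when the CPU is the slower side and none at all when the GPU is.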

Even fairly lowly systems can benefit. EA cited a 14 percent frame rate improvement on an AMD A10-7850K using the on-chip integrated GPU at 1280×720 and medium graphics settings. Such a system is far from a gaming powerhouse, but improvements like this may be enough to push it into the "acceptable" category for a wide range of buyers.

With AMD's most competitive processors being mid-range parts, a graphics API that enhances performance in CPU-bound situations and allows even mid-range processors to take full advantage of high-end video cards appears to be an important development for the company.

How does it actually go faster?

AMD describes Mantle as a lower-level API than Direct3D and OpenGL. While it supports broadly the same concepts as these APIs, it is designed for modern, highly programmable GPUs.

The first GPUs were regimented, restricted devices. You fed them with geometry (the lines and triangles that make up the objects you're displaying) and textures to apply to that geometry, specified some lights, and out came the result. The GPU did a bunch of important things (handling perspective, removing invisible pieces of geometry, applying anti-aliasing algorithms), but the basic structure was very limited. The various stages in the process were organized into a fixed pipeline, taking in the geometry and textures at one end, and spitting out pixels at the other.

Over time, GPUs became more capable, with shader programs. These shader programs were invoked at various points in the GPU's pipeline. Geometry shaders would operate on the geometry; pixel shaders would operate on the pixels, and so on.

As GPUs evolved, these shader programs went from being additional features of the pipeline to pretty much its entire point. Modern GPUs are processors for running shader programs. These shader programs have become so important that sometimes the GPU's primary or even sole purpose is to run them, without any of the trappings of the graphics pipeline.

Modern graphics APIs, such as Direct3D 11, have been engineered around this shader-based model, but AMD would argue that they don't go far enough. That's where Mantle comes in. The Mantle API is designed for executing shader programs in a lightweight, efficient way.

Running these shader programs is what GPUs are designed for. And as long as they're fed with data (the inputs and outputs for the shader programs) quickly enough, they can crunch numbers incredibly fast. The problem faced by many games is keeping them fed.

Commands are sent to the GPU as batches. Each draw command that tells the GPU to draw a particular piece of geometry will be a batch. So will each command to change the GPU's state, each command to do some GPU-based computation, and anything else that makes the GPU do work. The job of the CPU is to ensure that the GPU has enough batches to process to keep busy.

APIs such as Direct3D and OpenGL can impose a relatively large overhead for these batches. The instructions in each batch are organized into command buffers that are then sent to the GPU, and in many video drivers, this process isn't effectively multithreaded. As a result, the APIs add CPU overhead to every batch and limit how well performance scales across CPU cores.
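A bit of back-of-the-envelope arithmetic shows why per-batch overhead matters. The sketch below assumes a single CPU thread that does nothing but submit batches, and the per-batch costs are made-up round numbers rather than measurements of any real API or driver.

```python
def max_batches_per_frame(target_fps: float, overhead_us_per_batch: float) -> int:
    """Batches one CPU thread can submit per frame if submission is all it does."""
    frame_budget_us = 1_000_000.0 / target_fps
    return int(frame_budget_us // overhead_us_per_batch)


# Hypothetical per-batch CPU costs in microseconds, at a 60 frames-per-second target.
for overhead_us in (10.0, 5.0, 2.0):
    batches = max_batches_per_frame(60.0, overhead_us)
    print(f"{overhead_us:>4.1f} us per batch: up to {batches} batches per frame")
```

Halving the cost of each batch roughly doubles the number of batches that fit into a frame, which is the kind of headroom a lower-overhead API is meant to provide.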

Mantle reduces the overheads imposed by these APIs. For example, the APIs perform various kinds of validation of each batch as it's sent. This can be superfluous—there's no need to revalidate an object that's drawn every single frame, for example—and so Mantle performs the validation once, when the object is created, rather than once per frame. It also reduces the strictness of some of that validation, on the basis that developers will do the work at development time, rather than having to do it on end-user machines at run time.
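The difference between the two validation strategies can be sketched in a few lines. This is a toy model, not Mantle's or Direct3D's actual interface: the `Validator`, `PerDrawMesh`, and `CreateTimeMesh` classes and the vertex-count check are hypothetical names and rules invented for illustration.

```python
class Validator:
    """Stand-in for the checks an API or driver runs on a draw's inputs."""

    def __init__(self) -> None:
        self.calls = 0

    def check(self, vertex_count: int) -> None:
        self.calls += 1
        if vertex_count <= 0 or vertex_count % 3 != 0:
            raise ValueError("a triangle list needs a positive multiple of 3 vertices")


class PerDrawMesh:
    """Revalidates its data every time it is drawn."""

    def __init__(self, vertex_count: int, validator: Validator) -> None:
        self.vertex_count = vertex_count
        self.validator = validator

    def draw(self) -> int:
        self.validator.check(self.vertex_count)
        return self.vertex_count // 3  # triangles submitted


class CreateTimeMesh:
    """Validates its data once, when the object is created."""

    def __init__(self, vertex_count: int, validator: Validator) -> None:
        validator.check(vertex_count)
        self.vertex_count = vertex_count

    def draw(self) -> int:
        return self.vertex_count // 3  # triangles submitted


def validations_over(frames: int, mesh_cls) -> int:
    """Draw one 36-vertex mesh for the given number of frames and count the checks."""
    validator = Validator()
    mesh = mesh_cls(36, validator)
    for _ in range(frames):
        mesh.draw()
    return validator.calls


print(validations_over(1000, PerDrawMesh))     # 1000 checks
print(validations_over(1000, CreateTimeMesh))  # 1 check
```

The trade-off is the one described above: a bad object is still caught, but only at creation, so anything that goes wrong later is the developer's problem to find during development.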

Mantle also supports filling command buffers in multiple threads.
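Structurally, that looks something like the following sketch: each worker thread records its share of the frame's draws into its own private command buffer, and the finished buffers are then handed over in a fixed order. The function names and the tuple-based "commands" are hypothetical, and because of Python's global interpreter lock this shows the shape of the approach rather than any real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def record_command_buffer(draws):
    """Record one chunk of the frame's draws into its own private command buffer."""
    return [("draw", name) for name in draws]


def build_frame(draws, workers):
    """Split the draws across worker threads, then submit the buffers in order."""
    chunk = max(1, (len(draws) + workers - 1) // workers)
    chunks = [draws[i:i + chunk] for i in range(0, len(draws), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so submission order is deterministic.
        buffers = list(pool.map(record_command_buffer, chunks))
    # Stand-in for the GPU queue: buffers are consumed one after another.
    return [command for buffer in buffers for command in buffer]


draws = [f"object{i}" for i in range(10)]
frame = build_frame(draws, workers=4)
print(len(frame), frame[0], frame[-1])
```

Because no two threads write to the same buffer, recording needs no locking; only the final submission step is serialized.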

Along with some other improvements, such as reducing the overhead of compiling shader programs, the result is that Mantle cuts the CPU time needed to get graphics onto the screen and gives the GPU more batches to process. When the CPU is the bottleneck, the gains can, if AMD and EA's figures are anything to go by, be substantial.

We asked AMD if the techniques could be used to provide gains to existing OpenGL and Direct3D programs. For example, Direct3D 11 permits command buffer generation to be done in parallel, through features called deferred contexts and multithreaded rendering. However, some video driver developers, AMD included, have not implemented multithreaded rendering support, so while the API supports parallelism, the work is done serially anyway, and sometimes more slowly than if no multithreading were used. As a result of this poor driver support, some game developers have removed multithreaded rendering support from their game engines.

In response, AMD told us that developers have tried to do it but haven't had much success with using such techniques with existing APIs, and that it requires Mantle to do the job properly. It's not immediately clear to us if this is because of AMD's refusal to implement the support in its drivers, or if there really is a problem with using Direct3D in this way; there is clearly something of the chicken and the egg at work here.

What's the future for Mantle?

Until independent testing can take place—which won't happen until the driver is finally made available—the benefits of Mantle are uncertain, but assuming AMD's numbers are replicated by third parties, it does seem that the API can provide significant gains in CPU-bottlenecked systems.

This could make it attractive to game developers. The problem, of course, is that Mantle is an AMD-specific API. While Nvidia and Intel GPUs are, at a high level, comparable to AMD GPUs—they're all processors for crunching shader programs—their implementations and drivers are, of course, significantly different.

As long as these other GPUs need to be supported, developers will at the very least have to create an OpenGL and/or Direct3D implementation alongside the Mantle implementation of their engines. EA DICE has done this for Battlefield 4, and it's certainly possible that other developers will follow suit.

Mantle may also have repercussions for the Xbox One and PlayStation 4. Both of these consoles are built around AMD APUs, albeit somewhat customized, and so in principle they too could support a Mantle-like API. If this were to materialize it would make Mantle support in PC versions something of a no-brainer. However, it's not immediately clear if either Sony or Microsoft is willing to go down this route, especially as they are already able to do things like reduce validation and enforce multithreading support in their consoles' existing APIs anyway.

Long term, it's hard to see a vendor-specific API like Mantle being the one that the industry settles on. Since the dim and distant past of the 1990s and the birth of the GPU, such APIs have been a feature of the computing landscape, with 3Dfx's Glide, PowerVR MiniGL, and Rendition's Speed3D and RRedline, but cross-vendor compatibility has prevailed, with OpenGL and Direct3D.

We can probably expect the same thing to happen this time around. Mantle may inform future developments of OpenGL and Direct3D and encourage those APIs to reduce their overheads and become friendlier to multithreading, but it's hard to see it displacing either.