Over the past few months, Nvidia has made a number of high-profile announcements regarding game development and new gaming technologies. One of the most significant is a new developer support program, called GameWorks. The GameWorks program offers access to Nvidia’s CUDA development tools, GPU profiling software, and other developer resources. One of the features of GameWorks is a set of optimized libraries that developers can use to implement certain effects in-game. Unfortunately, these same libraries also tilt the performance landscape in Nvidia’s favor in a way that neither developers nor AMD can prevent.

Update (1/3/2014): According to Nvidia, developers can, under certain licensing circumstances, gain access to (and optimize) the GameWorks code, but cannot share that code with AMD for optimization purposes. While we apologize for the error, the net impact remains substantially identical. Game developers are not driver authors and much of the performance optimization for any given title is handled by rapid-fire beta driver releases from AMD or Nvidia in the weeks immediately following a title’s launch. When developers do patch GPU performance directly, it’s often after working with AMD or Nvidia to create the relevant code paths.

Understanding libraries

Simply put, a library is a collection of implemented behaviors. Libraries are not application-specific — they are designed to be called by multiple programs in order to simplify development. Instead of implementing a GPU feature five times in five different games, you can just point the same five titles at one library. Game engines like Unreal Engine 3 are typically capable of integrating with third-party libraries to ensure maximum compatibility and flexibility. Nvidia’s GameWorks contains libraries that tell the GPU how to render shadows, implement ambient occlusion, or illuminate objects.
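The share-one-implementation idea can be sketched in a few lines. This is purely illustrative — the function name `ambient_occlusion`, its cost model, and the game titles are all hypothetical, not taken from GameWorks or any real engine:

```python
# Toy sketch of the shared-library idea: one implementation, many callers.
# All names here are hypothetical, for illustration only.

def ambient_occlusion(samples: int) -> float:
    """Stand-in for a GPU effect implemented once in a shared library."""
    # Fake cost model: more samples approach full-quality shading (1.0).
    return round(1.0 - 1.0 / (1 + samples), 3)

# Five different "games" all call the same library routine
# instead of each shipping its own implementation of the effect.
games = ["TitleA", "TitleB", "TitleC", "TitleD", "TitleE"]
results = {g: ambient_occlusion(samples=16) for g in games}
```

Every caller gets identical behavior from the one shared routine — which is exactly why whoever controls that routine's code controls how it performs for everyone.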

In Nvidia’s GameWorks program, though, the libraries are effectively black boxes. Nvidia has clarified that developers can see the code under certain licensing restrictions, but they cannot share that code with AMD — which means AMD can’t optimize its own drivers to optimally run the functions or make suggestions to the developer that would improve the library’s performance on GCN hardware. This is fundamentally different from how most optimization is done today, where Nvidia and AMD might both work with a developer to optimize HLSL code for their respective products.

Is GameWorks distorting problems in today’s shipping games?

To answer this question, I’ve spent several weeks testing Arkham Origins, Assassin’s Creed IV, and Splinter Cell: Blacklist. Blacklist appears to use GameWorks libraries only for its HBAO+ implementation, and early benchmarks of that game showed a distinct advantage for Nvidia hardware when running in that mode. Later driver updates and a massive set of game patches appear to have resolved these issues; the R9 290X is about 16% faster than the GTX 770 at Ultra detail with FXAA enabled. Assassin’s Creed IV is more difficult to test — its engine is hard-locked to 63 FPS — but it also showed the R9 290X as being 22% faster than the GTX 770, roughly in line with expectations.

Arkham Origins’ performance is substantially different.

Like its predecessor, Arkham City, it takes place in an open-world version of Gotham, is built on the Unreal 3 engine, and uses DirectX 11. Both games are TWIMTBP titles. I’ve played both games all the way through — many of the animations, attacks, and visual effects of Arkham City carry over to Arkham Origins. Because the two games are so similar, we’re going to start with a comparison of the two games side-by-side in their respective benchmarks; first in DX11, then in DX9.

Previous Arkham titles favored Nvidia, but never to this degree. In Arkham City, the R9 290X has a 24% advantage over the GTX 770 in DX11, and a 14% advantage in DX9. In Arkham Origins, the two cards tie. Can this be traced directly back to GameWorks? Technically, no it can’t — all of our feature-specific tests showed the GTX 770 and the R9 290X taking near-identical performance hits with GameWorks features set to various detail levels. If DX11 Enhanced Ambient Occlusion costs the GTX 770 10% of its performance, it costs the R9 290X 10% of its performance.

The problem with that “no,” though, is twofold. First, because AMD can’t examine or optimize the shader code, there’s no way of knowing what its performance could look like after proper optimization. Arkham Origins may impose an equal performance hit on the GTX 770 and the R9 290X, but control of AMD’s performance in these features no longer rests with AMD’s driver team — it sits with Nvidia.

There’s a second reason to be dubious of Arkham Origins: it pulls the same tricks with tessellation that Nvidia has been playing since Fermi launched. One of the differences between AMD and Nvidia hardware is that Nvidia has a stronger tessellation engine. In most games, this doesn’t matter, but Nvidia has periodically backed games and benchmarks that include huge amounts of tessellation to no discernible purpose. Arkham Origins is one such title.

Next page: Arkham shenanigans and an unequal playing field…