GPU rendering in Blender v2.79b - TITAN V | GTX 1080(Ti) | GTX 980(Ti) | GTX 1060 and more ...

Compute, by Tibor Nyers

BoostClock recently put a large number of NVIDIA dGPUs through their paces in several GPU rendering applications. To help Blender users (and to drive home some interesting findings), I thought it would be a good idea to recap the results in the traditional minute:second format so that you can easily compare your results with ours. Many readers prefer a "bigger is better" scale, which is why the Blender results were converted to samples / sec in our previous articles.
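For readers who want to move between the two scales, here is a minimal sketch of the conversion. It assumes that samples / sec means total samples (image width x height x samples per pixel) divided by render time in seconds; the article does not spell out the exact formula, and the function names are hypothetical.

```python
def parse_render_time(text):
    """Convert 'MM:SS.ss' (or 'HH:MM:SS.ss') into seconds."""
    seconds = 0.0
    for part in text.strip().split(":"):
        # Each colon-separated field is worth 60x the one after it.
        seconds = seconds * 60.0 + float(part)
    return seconds


def format_render_time(seconds):
    """Convert seconds back into the 'MM:SS.ss' format."""
    # Round first so that 59.999 becomes 01:00.00 rather than 00:60.00.
    seconds = round(seconds, 2)
    minutes, secs = divmod(seconds, 60.0)
    return "%02d:%05.2f" % (minutes, secs)


def samples_per_second(width, height, samples_per_pixel, render_seconds):
    """ASSUMPTION: throughput = total samples / render time."""
    return width * height * samples_per_pixel / render_seconds
```

With these helpers a time such as 01:30.5 becomes 90.5 seconds, and a shorter render time always maps to a higher samples / sec figure for the same scene settings.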

Test methodology

The Cycles render engine in Blender was assessed with the help of the benchmark pack prepared by the Blender Institute, plus the recent Barbershop Interior scene from the Agent 327 animated feature film. The reported render time covers only pure path tracing (pure dGPU performance): kernel compilation, scene loading, CPU-side BVH construction and final compositing are excluded.
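One way to isolate the path tracing time is to work from the status lines Blender prints while rendering. The sketch below is not necessarily how our script does it: it assumes each status line carries a "Time:MM:SS.ss" field and that the tracing phase is marked by the text "Path Tracing Tile", both of which should be checked against your own logs. The function names are hypothetical.

```python
import re

# ASSUMPTION: status lines contain an elapsed-time field like "Time:00:03.00".
TIME_RE = re.compile(r"Time:((?:\d+:)+\d+(?:\.\d+)?)")


def to_seconds(stamp):
    """Convert 'MM:SS.ss' or 'HH:MM:SS.ss' into seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60.0 + float(part)
    return seconds


def path_tracing_seconds(log_lines, marker="Path Tracing Tile"):
    """Elapsed time between the first and last line containing `marker`.

    Everything logged before the first marker line (scene loading,
    BVH build, kernel loading) is left out of the result.
    """
    stamps = []
    for line in log_lines:
        if marker not in line:
            continue
        match = TIME_RE.search(line)
        if match:
            stamps.append(to_seconds(match.group(1)))
    if len(stamps) < 2:
        raise ValueError("not enough path tracing status lines in log")
    return stamps[-1] - stamps[0]
```

The difference between the first and last tile timestamps is only as precise as the timestamps in the log, so very short renders will carry a proportionally larger error.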

Blender is launched headless (no GUI) with a Python script that sets everything up and starts the rendering process. Every render job is repeated three times so that any anomaly can be spotted and investigated. Blender produced rock-stable results with minimal (near-zero) variance.
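A headless, repeated run can be sketched roughly as follows. This is an illustration, not our actual harness: the helper names and the setup script are hypothetical, and the wall-clock time measured here includes scene loading and kernel compilation, unlike the pure path tracing time reported in this article. The Blender flags used are --background, --python and --render-frame.

```python
import subprocess
import time


def build_command(blender_path, blend_file, setup_script, frame=1):
    """Build a headless Blender command line.

    Blender processes arguments in order: load the scene, run the
    setup script, then render the requested frame.
    """
    return [
        blender_path, "--background", blend_file,
        "--python", setup_script,
        "--render-frame", str(frame),
    ]


def run_repeats(command, repeats=3, runner=subprocess.run):
    """Run the same job several times and return wall-clock durations."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        runner(command, check=True)  # raises if Blender exits with an error
        durations.append(time.perf_counter() - start)
    return durations


def relative_spread(durations):
    """(max - min) / min; a large value flags a run worth investigating."""
    return (max(durations) - min(durations)) / min(durations)
```

Calling run_repeats(build_command("blender", "bmw27.blend", "setup.py")) would render the scene three times; the paths here are placeholders. A relative spread close to zero corresponds to the near-zero variance mentioned above.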

bmw27

classroom

pabellon_barcelona

fishy_cat

koro

barbershop_interior

Conclusion

The main takeaway is that one can arrive at very different results by benchmarking one scene only - let's disregard the Barbershop Interior scene as it has some issues on Volta (reported to the devs). For this very reason it is crucial to have multiple scenes for benchmarking with different workload scenarios, if possible. Scene complexity and geometric detail can have a major effect on the scores as well.

In most of the scenes the operating system didn't have much of an effect on the final render time (again, Barbershop Interior is the odd one out). Slower render times on Windows 10 are usually attributed to overhead in the Windows Display Driver Model (WDDM). A quick look at the log files also reveals that BVH construction takes a lot more time on Windows, mainly because of the ancient compiler (MSVC 2013).

Hardware setup

PSU: Cooler Master 1000W VANGUARD

MOTHERBOARD: Gigabyte GA-AB350-Gaming 3

CPU: AMD Ryzen 5 1600

GPU Maxwell: MSI GTX 960 GAMING 4G / MSI GTX 970 GAMING 4G / MSI GTX 980 GAMING 4G / MSI GTX 980Ti GAMING 6G

GPU Pascal: MSI GTX 1050Ti GAMING X 4G / MSI GTX 1060 GAMING X 6G / MSI GTX 1080 GAMING X+ 8G / MSI GTX 1080Ti GAMING X 11G

GPU Volta: NVIDIA TITAN V

OS-LNX: Ubuntu 16.04.4 LTS x86_64 - Linux 4.13.0-45-generic

DRIVER-LNX: CUDA 9.2.106 - NV 396.26

OS-WIN: Microsoft Windows 10 (10.0) Home 64-bit - Version 1803/RS4 (17134.137)

DRIVER-WIN: CUDA 9.2.156 - NV 398.36

RAM: G.Skill FlareX 16GB (2X8GB) DDR4 3200MHz

STORAGE: Samsung m.2 SATA 500GB SSD 850 EVO

COOLER: AMD Wraith Spire Cooler