As machine learning (ML) researchers and practitioners continue to explore the bounds of deep learning, the need for powerful GPUs to both train and run these models grows ever stronger. New models for object detection, image segmentation, and speech transcription continue to be refined, finding use in a variety of industries ranging from autonomous driving to home assistants.

To satisfy this demand for GPU compute, both Amazon and Google recently added next-generation Nvidia Volta and P100 GPUs to their instance types. Paperspace¹, another cloud GPU vendor, has also added Volta GPUs to its offerings. These P100 and Volta GPUs are the most powerful GPUs currently on the market and sit at the cutting edge of performance for ML applications. Not only do they outperform the older K80 GPUs, they also come with 16GB of memory, enabling more expressive ML models and larger training minibatch sizes.

Modern object detection pipelines require GPUs for efficient training

To test how these modern GPUs perform on typical ML tasks, I trained a Faster R-CNN/ResNet-101 object detection model on Nvidia’s most recent GPUs. The model was implemented in TensorFlow and operated on 300x300px input images, with training minibatch sizes of 10, 15, and 20 images.
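A benchmark of this kind usually comes down to timing a fixed number of training steps at each minibatch size and converting the result to images per second. The sketch below shows one way to do that; it is not the harness used for this post. The names `benchmark_throughput` and `dummy_train_step` are hypothetical, and the dummy step merely sleeps in place of a real TensorFlow training iteration.

```python
import time
from typing import Callable, Dict, Iterable


def benchmark_throughput(
    train_step: Callable[[int], None],
    batch_sizes: Iterable[int] = (10, 15, 20),
    warmup_steps: int = 5,
    timed_steps: int = 20,
) -> Dict[int, float]:
    """Return the measured images/second for each minibatch size."""
    results: Dict[int, float] = {}
    for batch_size in batch_sizes:
        # Warm-up steps are run but not timed, so one-time setup costs
        # (graph construction, memory allocation) do not skew the result.
        for _ in range(warmup_steps):
            train_step(batch_size)

        start = time.perf_counter()
        for _ in range(timed_steps):
            train_step(batch_size)
        elapsed = time.perf_counter() - start

        # Images processed during the timed window divided by wall-clock time.
        results[batch_size] = batch_size * timed_steps / elapsed
    return results


def dummy_train_step(batch_size: int) -> None:
    # Hypothetical placeholder: a real benchmark would run one forward and
    # backward pass of the detection model on `batch_size` images here.
    time.sleep(0.0005 * batch_size)


if __name__ == "__main__":
    for size, images_per_sec in benchmark_throughput(dummy_train_step).items():
        print(f"batch size {size}: {images_per_sec:.1f} images/sec")
```

To benchmark a real model, `dummy_train_step` would be swapped for a function that executes one training iteration and blocks until the GPU has finished it; otherwise asynchronous execution can make the timings look faster than they are.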

The GPUs that were benchmarked:

Note: This benchmark focuses specifically on newer GPUs and thus excludes the older K80 and Quadro GPUs, which were benchmarked last April.

Results

From a performance standpoint, the Volta is unsurprisingly the most powerful GPU available today, outperforming both the Nvidia 1080Ti (~1.1-1.3x) and the P100 (~1.2-1.5x) by significant margins, even though the 1080Ti is only around nine months old. This continues Nvidia’s rapid cadence of releasing increasingly powerful GPU architectures.
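Ratios like the ones quoted above are typically computed by dividing one GPU's training throughput by another's at each minibatch size, which is why they appear as ranges. The snippet below sketches that arithmetic, assuming the ratios are throughput ratios. The function name `relative_speedup` is hypothetical, and the throughput values are made-up placeholders chosen to keep the arithmetic easy to follow, not measurements from this benchmark.

```python
from typing import Dict


def relative_speedup(
    candidate: Dict[int, float], baseline: Dict[int, float]
) -> Dict[int, float]:
    """Speedup of `candidate` over `baseline` at each shared minibatch size."""
    return {
        batch_size: candidate[batch_size] / baseline[batch_size]
        for batch_size in candidate
        if batch_size in baseline
    }


# Placeholder throughputs in images/sec, keyed by minibatch size.
# These are NOT measured results; they only illustrate the calculation.
gpu_a = {10: 120.0, 15: 150.0, 20: 160.0}
gpu_b = {10: 100.0, 15: 120.0, 20: 125.0}

speedups = relative_speedup(gpu_a, gpu_b)
low, high = min(speedups.values()), max(speedups.values())
print(f"GPU A is ~{low:.2f}-{high:.2f}x faster than GPU B")
```

Reporting the minimum and maximum across minibatch sizes gives a range in the same form as the figures in the text.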