Best Deals in Deep Learning Cloud Providers

AWS, Google, Paperspace, vast.ai and more

I wanted to figure out where I should train my deep learning models online for the lowest cost and least hassle. I wasn’t able to find a good comparison of GPU cloud service providers, so I decided to make my own.

Feel free to skip to the pretty charts if you know all about GPUs and TPUs and just want the results.

I’m not looking at serving models in this article, but I might in the future. Follow me to make sure you don’t miss out.

Deep Learning Chip Options

Let’s briefly look at the types of chips available for deep learning. I’ll simplify the major offerings by comparing them to Ford cars.

CPUs alone are really slow for deep learning. You do not want to use them. They are fine for many machine learning tasks, just not deep learning. The CPU is the horse and buggy of deep learning.

Horse and Buggy

GPUs are much faster than CPUs for most deep learning computations. NVIDIA makes most of the GPUs used for deep learning. The next few chips we’ll discuss are NVIDIA GPUs.

One NVIDIA K80 is about the minimum you need to get started with deep learning and not have excruciatingly slow training times. The K80 is like the Ford Model A — a whole new way to get around.

Ford Model A

NVIDIA P4s are faster than K80s. They are like the Ford Fiesta. Definitely an improvement over a Model A. They aren’t super common.

Ford Fiesta. Credit: Rudolf Stricker, CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0/), via Wikimedia Commons

The NVIDIA P100 is a step up from the Fiesta — it’s the Ford Taurus of the lineup. It’s a pretty fast chip and totally fine for most deep learning applications.

Ford Taurus. Credit: Ford.com

NVIDIA also makes a number of consumer-grade GPUs often used for gaming or cryptocurrency mining. Those GPUs generally work fine for deep learning, but they aren’t often offered by cloud service providers.

The fastest NVIDIA GPU on the market today is the Tesla V100 (no relation to the Tesla car company). The V100 is about 3x faster than the P100.

The V100 is like the Ford Mustang: fast. It’s your best option if you are using PyTorch right now, since PyTorch can’t yet use TPUs.

Ford Mustang. Credit: Ford.com
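
If you want to confirm which GPU a cloud instance actually gave you, a quick PyTorch check like the sketch below works (assuming PyTorch is installed; the device name in the comment is just an example of what you might see):

```python
import torch

# Check whether PyTorch can see a CUDA GPU, and print which one it is.
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))  # e.g. "Tesla V100-SXM2-16GB"
else:
    print("No GPU found; training will fall back to the (much slower) CPU.")
```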

If you are using TensorFlow/Keras, you can also use Tensor Processing Units (TPUs). Google Cloud, Google Colab, and Paperspace (which uses Google Cloud’s machines) have TPU v2s available. They are like the Ford GT race car for matrix computations.

Ford GT. Credit: Ford.com

TPU v3s are available to the public only on Google Cloud. They are the fastest chips you can find for deep learning today and are great for training Keras/TensorFlow models. They are like a jet car.
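
To give a feel for what TPU training looks like, here’s a minimal sketch using recent TensorFlow 2 versions and the TPUStrategy API (the tiny Keras model is just a placeholder; in a Colab TPU runtime the empty tpu="" argument resolves to the attached TPU):

```python
import tensorflow as tf

# Connect to the TPU provided by the runtime (e.g. a Colab TPU runtime).
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="")
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

# Build and compile the Keras model inside the strategy scope
# so its variables and training steps are placed on the TPU cores.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# model.fit(...) then trains on the TPU just like it would on a GPU.
```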