In 2015, Google began running its first TPUs in its data centers to power products like Search, Translate, Photos, and Gmail. To make this technology accessible to all data scientists and developers, it later released the Cloud TPU, meant to provide an easy-to-use, scalable, and powerful cloud-based processing unit for running cutting-edge models in the cloud.

According to the Google team behind Colab's free TPU, TPUs run production neural-network workloads 15 to 30 times faster than contemporary CPUs and GPUs.
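If you'd like to try the free TPU that Colab provides, attaching TensorFlow to it typically takes only a few lines. Below is a minimal sketch, assuming TensorFlow 2.x; it falls back to the default (CPU/GPU) strategy when no TPU is attached:

```python
import tensorflow as tf

try:
    # In Colab, the resolver picks up the TPU's address automatically.
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)
except (ValueError, tf.errors.NotFoundError):
    # No TPU attached: fall back to the default strategy (CPU or GPU).
    strategy = tf.distribute.get_strategy()

print("Replicas:", strategy.num_replicas_in_sync)
```

Any model built inside `strategy.scope()` is then replicated across all of the TPU's cores.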

But before we jump into a comparison of TPUs vs CPUs and GPUs and an implementation, let’s define the TPU a bit more specifically.

What is a TPU?

TPU stands for Tensor Processing Unit. A Cloud TPU device consists of four independent chips, each containing two compute cores called Tensor Cores. Each Tensor Core includes scalar and vector units and one or more matrix multiply units (MXUs).

Google Cloud TPU

In addition, each Tensor Core has 8 GB of attached high-bandwidth memory (HBM). Each of the TPU's 8 cores can execute user computations (XLA ops) independently, and high-bandwidth interconnects allow the chips to communicate directly with each other.
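To make the "matrix units" concrete: the bulk of a neural network's compute is matrix multiplication, which is exactly what the MXUs accelerate. The forward pass of a dense layer, sketched here in plain NumPy, is one matmul plus a bias add:

```python
import numpy as np

# One dense (fully connected) layer: y = x @ W + b.
# On a TPU, the x @ W matmul is the operation the MXU executes in hardware.
batch, in_features, out_features = 32, 128, 64
x = np.random.randn(batch, in_features).astype(np.float32)
W = np.random.randn(in_features, out_features).astype(np.float32)
b = np.zeros(out_features, dtype=np.float32)

y = x @ W + b
print(y.shape)  # (32, 64)
```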

XLA is an experimental JIT (just-in-time) compiler backend for TensorFlow. The most important difference from CPUs (Central Processing Units) and GPUs (Graphics Processing Units) is that the TPU's hardware is designed specifically for linear algebra, the building block of deep learning; for this reason it is sometimes called a matrix or tensor machine.
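You don't need a TPU to try XLA: TensorFlow can JIT-compile a function through XLA on CPU or GPU as well. A minimal sketch, assuming TensorFlow 2.x and its `jit_compile` flag:

```python
import tensorflow as tf

# jit_compile=True asks TensorFlow to lower this function through XLA,
# fusing the matmul, bias add, and ReLU into one compiled kernel.
@tf.function(jit_compile=True)
def dense_relu(x, w, b):
    return tf.nn.relu(tf.matmul(x, w) + b)

x = tf.random.normal([8, 4])
w = tf.random.normal([4, 2])
b = tf.zeros([2])

y = dense_relu(x, w, b)
print(y.shape)  # (8, 2)
```

On a Cloud TPU, every op goes through this same XLA compilation path.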

Now that you have a bit better idea of what the TPU actually is, let’s take a look at how it compares to other common processing units.

Comparing CPU, GPU, and TPU: When should each be used?

🔮 CPU:

Rapid prototyping requiring maximum flexibility

Simple models that don’t take long to train

Small models with small effective batch sizes

Models dominated by custom TensorFlow operations written in C++

Models limited by the available I/O or the network bandwidth of the host system

🔮 GPU:

TensorFlow models that require high processing power

Models whose source code is unavailable or too onerous to change

Models with a significant number of custom TensorFlow operations that must run at least partially on CPUs

Models with TensorFlow ops that are not available on Cloud TPU (see the list of available TensorFlow ops)

Medium-to-large models with larger effective batch sizes

🔮 TPU: