The Tensor Processing Unit (TPU) is a custom ASIC chip—designed from the ground up by Google for machine learning workloads—that powers several of Google's major products, including Translate, Photos, Search, Assistant, and Gmail. Cloud TPU provides the benefit of the TPU as a scalable and easy-to-use cloud computing resource to all developers and data scientists running cutting-edge ML models on Google Cloud. At Google Next '18, the most recent installment of our annual conference, we announced that Cloud TPU v2 is now generally available (GA) for all users, including free trial accounts, and that Cloud TPU v3 is available in alpha.

But many people ask me, "What's the difference between a CPU, a GPU, and a TPU?" So we've created a demo site with a presentation and animation that answer this question.



In this post, I'd like to highlight some specific parts of the site’s content.

How neural networks work

Before we start comparing CPU, GPU, and TPU, let's see what kind of calculation is required for machine learning—specifically, neural networks.

For example, imagine that we're using a single-layer neural network to recognize a handwritten digit, as shown in the following diagram:

If an image is a grid of 28 x 28 grayscale pixels, it can be converted into a vector of 784 values (dimensions). The neuron that recognizes the digit "8" takes those values and multiplies each one by the corresponding parameter value (the red lines above).
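As a rough sketch of that multiply step in plain Python with NumPy (the image and parameter values below are random placeholders, not an actual trained model), the computation looks like this:

```python
import numpy as np

# Placeholder data: a random 28 x 28 grayscale image and random parameters.
image = np.random.rand(28, 28)   # 28 x 28 grid of pixel values
pixels = image.reshape(784)      # flatten into a vector of 784 values

params = np.random.rand(784)     # one parameter per pixel for the "8" neuron

weighted = pixels * params       # multiply each pixel by its parameter
score = weighted.sum()           # add them all up: the neuron's raw output
```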

The parameters work as a "filter," extracting a feature from the data that tells you how similar the image is to the shape of an "8", just like this:
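One way to see this "filter" intuition in code: the multiply-and-add above is just a dot product, and the closer the pixel pattern is to the pattern stored in the parameters, the larger the result. Here's a toy sketch where the tiny 3 x 3 "images" and filter values are invented purely for illustration:

```python
import numpy as np

# A toy "filter" whose parameters roughly trace a target pattern
# (values are made up for illustration, not from a real model).
filter_8 = np.array([0.0, 1.0, 0.0,
                     1.0, 0.0, 1.0,
                     0.0, 1.0, 0.0])

image_like_8   = np.array([0.1, 0.9, 0.1, 0.8, 0.1, 0.9, 0.1, 0.8, 0.1])
image_unlike_8 = np.array([0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.9])

# The dot product (multiply-and-add) scores how well each image matches the filter.
print(np.dot(filter_8, image_like_8))    # larger score: looks more like the pattern
print(np.dot(filter_8, image_unlike_8))  # smaller score: looks less like the pattern
```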