Animated RNN, LSTM and GRU

Recurrent neural network cells in GIFs

Changelogs:

4 Jul 2020: Removed “output gate” label for GRU.

Recurrent neural networks (RNNs) are a class of artificial neural networks which are often used with sequential data. The 3 most common types of recurrent neural networks are

1. vanilla RNN,
2. long short-term memory (LSTM), proposed by Hochreiter and Schmidhuber in 1997, and
3. gated recurrent units (GRU), proposed by Cho et al. in 2014.
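If you would like to poke at these cells in code, all three have built-in counterparts in PyTorch. Here is a minimal sketch (assuming you have PyTorch installed; the sizes match the animations below: input size 3, hidden size 2, batch size 1):

```python
import torch
import torch.nn as nn

input_size, hidden_size, batch = 3, 2, 1

x = torch.randn(batch, input_size)   # input at one time step
h = torch.zeros(batch, hidden_size)  # previous hidden state
c = torch.zeros(batch, hidden_size)  # previous cell state (LSTM only)

h_rnn = nn.RNNCell(input_size, hidden_size)(x, h)                 # vanilla RNN
h_lstm, c_lstm = nn.LSTMCell(input_size, hidden_size)(x, (h, c))  # LSTM
h_gru = nn.GRUCell(input_size, hidden_size)(x, h)                 # GRU

print(h_rnn.shape, h_lstm.shape, h_gru.shape)  # all torch.Size([1, 2])
```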

Note that I will use “RNNs” to collectively refer to neural network architectures that are inherently recurrent, and “vanilla RNN” to refer to the simplest recurrent neural network architecture as shown in Fig. 1.

There are many illustrated diagrams for recurrent neural networks out there. My personal favourite is the one by Michael Nguyen in this article published in Towards Data Science, because he provides intuition on these models and, more importantly, beautiful illustrations that make them easy to understand. But the motivation behind my post is to better visualise what happens in these cells: how the nodes are shared, and how they transform to give the output nodes. I was also inspired by Michael's nice animations.

This article looks into vanilla RNN, LSTM and GRU cells. It is a short read and is meant for those who have read up on these topics. (I recommend reading Michael's article before reading this post.) It is important to note that the following animations are sequential to guide the human eye, but do not reflect the chronological order of operations in vectorised machine computation.

Here is the legend that I have used for the illustrations.

Fig. 0: Legend for animations

Note that the animations show the mathematical operations happening in one time step (indexed by t). Also, I have used an input size of 3 (green) and 2 hidden units (red) with a batch size of 1.
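To make these shapes concrete, here is a rough NumPy sketch of one vanilla RNN time step with exactly these dimensions. The weights here are random placeholders (not the values in the animations), and the update shown is the standard vanilla RNN equation h_t = tanh(x_t·W_xh + h_{t-1}·W_hh + b):

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, batch = 3, 2, 1  # green nodes, red nodes, batch size

x_t = rng.standard_normal((batch, input_size))  # input at time step t (green)
h_prev = np.zeros((batch, hidden_size))         # hidden state from t-1 (red)

# random placeholder weights; a trained network would have learned these
W_xh = rng.standard_normal((input_size, hidden_size))
W_hh = rng.standard_normal((hidden_size, hidden_size))
b = np.zeros(hidden_size)

# one vanilla RNN time step
h_t = np.tanh(x_t @ W_xh + h_prev @ W_hh + b)
print(h_t.shape)  # (1, 2) — batch size 1, 2 hidden units
```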

Let’s begin!