The Evolution of RNNs

Recurrent neural networks are everywhere today. Giants like Google and Baidu use them extensively in production for machine translation, speech recognition, and a number of other tasks. Practically all state-of-the-art results in NLP-related tasks are achieved by exploiting RNNs.

With the rise of fantastic deep learning frameworks like TensorFlow, it’s easier than ever before to build LSTMs and other types of recurrent networks. And there is a strong temptation to treat them as black boxes.

We feel that the intuition behind modern RNNs is crucial. It’s really hard to get a good understanding of GRU and LSTM networks just by looking at the equations. In fact, the LSTM network is the result of fighting particular problems that the vanilla RNN has — above all, vanishing gradients over long sequences. So, hopefully, understanding those problems and the ways to fix them will make the GRU and LSTM equations much more transparent and intuitive.
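As a rough preview of the contrast the lecture unpacks, here is a minimal NumPy sketch (the weight names and shapes are our own illustrative choices, not taken from the lecture) of a single vanilla RNN step next to a single LSTM step. The key difference is the LSTM's additive cell update `c = f * c_prev + i * g`, which gives gradients a gated, near-linear path through time:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def vanilla_rnn_step(x, h_prev, W_xh, W_hh, b):
    # h_t = tanh(W_xh x_t + W_hh h_{t-1} + b): the state is fully
    # overwritten each step, so gradients repeatedly pass through
    # tanh and W_hh and tend to vanish over long sequences.
    return np.tanh(x @ W_xh + h_prev @ W_hh + b)

def lstm_step(x, h_prev, c_prev, W, b):
    # W maps the concatenated [x, h_{t-1}] to all four gates at once.
    z = np.concatenate([x, h_prev]) @ W + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input/forget/output gates
    g = np.tanh(g)                                # candidate cell values
    c = f * c_prev + i * g   # additive update: old memory is kept or erased
    h = o * np.tanh(c)       # hidden state is a gated read of the cell
    return h, c

# Tiny shape check with random weights.
rng = np.random.default_rng(0)
n_in, n_h = 3, 4
h = vanilla_rnn_step(rng.standard_normal(n_in), np.zeros(n_h),
                     rng.standard_normal((n_in, n_h)),
                     rng.standard_normal((n_h, n_h)), np.zeros(n_h))
h2, c2 = lstm_step(rng.standard_normal(n_in), np.zeros(n_h), np.zeros(n_h),
                   rng.standard_normal((n_in + n_h, 4 * n_h)), np.zeros(4 * n_h))
```

This is only a sketch of the forward pass; the lecture derives where these gates come from and why they help.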

The ideas behind modern RNNs are really beautiful. In today’s lecture, “Evolution: from vanilla RNN to GRU & LSTMs”, we will discuss them!

Here is the link to the slides.

Huge respect to R2RT for their blog post “Written Memories: Understanding, Deriving and Extending the LSTM”. Our lecture was strongly inspired by their work.