$\begingroup$

So, a Layer Normalization paper came out recently. There's also an implementation of it in Keras.

But I remember there are also papers titled Recurrent Batch Normalization (Cooijmans, 2016) and Batch Normalized Recurrent Neural Networks (Laurent, 2015). What's the difference between these three?

There is this Related Work section I don't understand: