On the Pathology of Mass Concentration in Deep Neural Networks

It is argued in [1] that the outputs of deep neural networks tend to concentrate on one-dimensional manifolds as the network gets deeper. This notebook reproduces the pathology in the context of parametric deep nets, whereas [1] focused on deep Gaussian processes.
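To make the claim concrete, here is a minimal sketch (not the notebook's actual code) of the effect: random 2-D inputs are pushed through a deep tanh net with large random weights, and the singular values of the output cloud reveal that almost all variance collapses onto a single direction. The layer count and weight scale below are illustrative choices, not values from [1].

```python
import numpy as np

rng = np.random.default_rng(0)

n_layers = 20   # depth of the net
n_units = 2     # 2-D so the collapse is easy to visualise
sigma = 5.0     # large weight scale drives the tanh units into saturation

X = rng.uniform(-1, 1, size=(1000, n_units))

H = X
for _ in range(n_layers):
    W = rng.normal(0, sigma, size=(n_units, n_units))
    b = rng.normal(0, sigma, size=n_units)
    H = np.tanh(H @ W + b)

# With saturated units the spectrum of the output cloud becomes extremely
# lopsided: nearly all variance lies along one direction, i.e. the outputs
# concentrate on a (roughly) one-dimensional manifold.
print(np.linalg.svd(H - H.mean(0), compute_uv=False))
```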

We also investigate whether this really is a problem for plain neural networks.

We show that the problem only arises for bad initialisations, where the units are driven into saturation. Even starting from such a bad initialisation, the Adadelta and RMSProp optimisers are able to undo the damage and learn an identity mapping.
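A hedged sketch of that experiment follows, written with PyTorch as an assumed framework (the notebook itself may use a different library): a deep tanh net is deliberately initialised with overly large weights, then trained with RMSprop to recover the identity mapping x → x.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

depth, width = 10, 2
layers = []
for _ in range(depth):
    lin = nn.Linear(width, width)
    nn.init.normal_(lin.weight, std=5.0)   # deliberately bad initialisation
    nn.init.normal_(lin.bias, std=5.0)     # drives units into saturation
    layers += [lin, nn.Tanh()]
net = nn.Sequential(*layers)

opt = torch.optim.RMSprop(net.parameters(), lr=1e-3)
X = torch.rand(1024, width) * 1.8 - 0.9    # inputs in [-0.9, 0.9]

# Train on the identity-mapping objective and watch the loss fall despite
# the saturated starting point.
for step in range(5001):
    opt.zero_grad()
    loss = ((net(X) - X) ** 2).mean()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        print(step, loss.item())
```

Swapping `torch.optim.RMSprop` for `torch.optim.Adadelta` exercises the other optimiser mentioned above.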

[1] Duvenaud, David, et al. "Avoiding pathologies in very deep networks." arXiv preprint arXiv:1402.5836 (2014).