Here is the list of top deep learning papers prepared by our staff.

1. Deep Learning, by Yann L., Yoshua B. & Geoffrey H. (2015)

Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics.

2. Visualizing and Understanding Convolutional Networks, by Matt Zeiler, Rob Fergus

The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery.

3. TensorFlow: a system for large-scale machine learning, by Martín A., Paul B., Jianmin C., Zhifeng C., Andy D. et al. (2016)

TensorFlow supports a variety of applications, with a focus on training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research.

4. Generative Adversarial Nets , by Ian J. Goodfellow∗, Jean Pouget-Abadie†, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair‡, Aaron Courville, Yoshua Bengio

This Paper proposes a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

5. Deep learning in neural networks, by Juergen Schmidhuber (2015)

This historical survey compactly summarises relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

6. Human-level control through deep reinforcement learning, by Volodymyr M., Koray K., David S., Andrei A. R., Joel V et al (2015)

Here we use recent advances in training deep neural networks to develop a novel artificial agent, termed a deep Q-network, that can learn successful policies directly from high-dimensional sensory inputs using end-to-end reinforcement learning. We tested this agent on the challenging domain of classic Atari 2600 games.

7. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, by Shaoqing R., Kaiming H., Ross B. G. & Jian S. (2015)

In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position.

8. Long-term recurrent convolutional networks for visual recognition and description, by Jeff D., Lisa Anne H., Sergio G., Marcus R., Subhashini V. et al. (2015)

In contrast to current models which assume a fixed spatio-temporal receptive field or simple temporal averaging for sequential processing, recurrent convolutional models are “doubly deep” in that they can be compositional in spatial and temporal “layers”.

9. MatConvNet: Convolutional Neural Networks for MATLAB, by Andrea Vedaldi & Karel Lenc (2015) (Cited: 1,148)

It exposes the building blocks of CNNs as easy-to-use MATLAB functions, providing routines for computing linear convolutions with filter banks, feature pooling, and many more. This document provides an overview of CNNs and how they are implemented in MatConvNet and gives the technical details of each computational block in the toolbox.

10. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, by Alec R., Luke M. & Soumith C. (2015)

In this work, we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning.

11. U-Net: Convolutional Networks for Biomedical Image Segmentation, by Olaf R., Philipp F. &Thomas B. (2015)

There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently.

12. Conditional Random Fields as Recurrent Neural Networks, by Shuai Z., Sadeep J., Bernardino R., Vibhav V. et al (2015)

We introduce a new form of convolutional neural network that combines the strengths of Convolutional Neural Networks (CNNs) and Conditional Random Fields (CRFs)-based probabilistic graphical modelling. To this end, we formulate mean-field approximate inference for the Conditional Random Fields with Gaussian pairwise potentials as Recurrent Neural Networks.

13. Image Super-Resolution Using Deep Convolutional Networks, by Chao D., Chen C., Kaiming H. & Xiaoou T. (2014)

Our method directly learns an end-to-end mapping between the low/high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one

14. Beyond short snippets: Deep networks for video classification, by Joe Y. Ng, Matthew J. H., Sudheendra V., Oriol V., Rajat M. & George T. (2015)

In this work, we propose and evaluate several deep neural network architectures to combine image information across a video over longer time periods than previously attempted.

15. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, by Christian S., Sergey I., Vincent V. & Alexander A A. (2017)

Very deep convolutional networks have been central to the largest advances in image recognition performance in recent years. With an ensemble of three residual and one Inception-v4, we achieve 3.08% top-5 error on the test set of the ImageNet classification (CLS) challenge.

16. Salient Object Detection: A Discriminative Regional Feature Integration Approach, by Huaizu J., Jingdong W., Zejian Y., Yang W., Nanning Z. & Shipeng Li. (2013)

In this paper, we formulate saliency map computation as a regression problem. Our method, which is based on multi-level image segmentation, utilizes the supervised learning approach to map the regional feature vector to a saliency score.

17. Visual Madlibs: Fill in the Blank Description Generation and Question Answering, by Licheng Y., Eunbyung P., Alexander C. B. & Tamara L. B. (2015)

In this paper, we introduce a new dataset consisting of 360,001 focused natural language descriptions for 10,738 images. This dataset, the Visual Madlibs dataset, is collected using automatically produced fill-in-the-blank templates designed to gather targeted descriptions about: people and objects, their appearances, activities, and interactions, as well as inferences about the general scene or its broader context.

18. Asynchronous methods for deep reinforcement learning, by Volodymyr M., Adrià P. B., Mehdi M., Alex G., Tim H. et al. (2016)

The best performing method, an asynchronous variant of actor-critic, surpasses the current state-of-the-art on the Atari domain while training for half the time on a single multi-core CPU instead of a GPU. Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.

19. Theano: A Python framework for fast computation of mathematical expressions., by by Rami A., Guillaume A., Amjad A., Christof A. et al (2016)

Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers especially in the machine learning community and has shown steady performance improvements.

20. Deep Learning Face Attributes in the Wild, by Ziwei L., Ping L., Xiaogang W. & Xiaoou T. (2015)

This framework not only outperforms the state-of-the-art with a large margin, but also reveals valuable facts on learning face representation. (1) It shows how the performances of face localization (LNet) and attribute prediction (ANet) can be improved by different pre-training strategies. (2) It reveals that although the filters of LNet are fine-tuned only with imagelevel attribute tags, their response maps over entire images have strong indication of face locations.

This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several largescale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results.

22. A Few Useful Things to Know about Machine Learning , by Pedro Domingos

This article summarizes twelve key lessons that machine learning researchers and practitioners have learned. These include pitfalls to avoid, important issues to focus on, and answers to common questions.

The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery.