An alternative way to deal with the difficult optimization problem is to develop a more sophisticated optimizer that works really well for artificial neural networks – something that the optimization community has often suggested but never done. Ilya Sutskever has recently used an excellent “Hessian-free” optimizer developed by James Martens to learn a recurrent neural network that predicts the next character in a string. “He was elected President during the Revolutionary War and forgave Opus Paul at Rome” is an example of what this neural net generates after being trained on character strings from Wikipedia.
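The character-prediction task above can be illustrated with a toy sketch. This is a minimal vanilla recurrent net trained by plain gradient descent on a tiny repeating string – the network size, training data, and optimizer here are illustrative assumptions, not the Hessian-free setup or Wikipedia-scale training described above:

```python
import numpy as np

rng = np.random.default_rng(0)
text = "hello hello hello hello "
chars = sorted(set(text))
idx = {c: i for i, c in enumerate(chars)}
V, H = len(chars), 16  # vocabulary size, hidden units

Wxh = rng.normal(0, 0.1, (H, V))  # input-to-hidden weights
Whh = rng.normal(0, 0.1, (H, H))  # hidden-to-hidden (recurrent) weights
Why = rng.normal(0, 0.1, (V, H))  # hidden-to-output weights

def loss_and_grads(seq):
    """Forward pass over the string, then backprop through time."""
    xs, hs, ps = {}, {-1: np.zeros(H)}, {}
    loss = 0.0
    for t in range(len(seq) - 1):
        x = np.zeros(V); x[idx[seq[t]]] = 1       # one-hot input char
        xs[t] = x
        hs[t] = np.tanh(Wxh @ x + Whh @ hs[t - 1])
        y = Why @ hs[t]
        p = np.exp(y - y.max()); p /= p.sum()     # softmax over next char
        ps[t] = p
        loss -= np.log(p[idx[seq[t + 1]]])        # cross-entropy loss
    dWxh, dWhh, dWhy = map(np.zeros_like, (Wxh, Whh, Why))
    dh_next = np.zeros(H)
    for t in reversed(range(len(seq) - 1)):
        dy = ps[t].copy(); dy[idx[seq[t + 1]]] -= 1
        dWhy += np.outer(dy, hs[t])
        dh = Why.T @ dy + dh_next
        draw = (1 - hs[t] ** 2) * dh              # through the tanh
        dWxh += np.outer(draw, xs[t])
        dWhh += np.outer(draw, hs[t - 1])
        dh_next = Whh.T @ draw
    return loss, dWxh, dWhh, dWhy

loss0 = loss_and_grads(text)[0]
for _ in range(200):
    loss, dWxh, dWhh, dWhy = loss_and_grads(text)
    for W, dW in ((Wxh, dWxh), (Whh, dWhh), (Why, dWhy)):
        W -= 0.1 * np.clip(dW, -1, 1)             # clipped gradient step
print(loss0, loss)  # training loss should fall substantially
```

First-order gradient descent works on a toy string like this; the point of the Hessian-free approach is precisely that such simple optimizers break down on the long sequences and deep credit-assignment paths of real recurrent nets.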

Deep neural networks that contain many layers of non-linear feature detectors fell out of favor because it was hard to get enough labeled data to train many millions of parameters and difficult to optimize the connection weights really well. Both of these problems can be overcome by first training a multi-layer belief net to form a top-down generative model of unlabeled input data and then using the features discovered by the belief net to initialize a bottom-up neural net. The neural net can then be discriminatively fine-tuned on a smaller set of labeled data. Marc’Aurelio Ranzato has recently used this deep learning method to create a very good generative model of natural images, and Navdeep Jaitly has used it to learn features that are derived directly from the raw sound wave and outperform the features that are usually used for phoneme recognition.
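One stage of the pretraining recipe above can be sketched in a few lines: a single restricted Boltzmann machine trained with one-step contrastive divergence (CD-1) on binary data. A real deep belief net stacks several such layers, treating each layer's hidden activities as data for the next, and then fine-tunes the whole stack with backpropagation; the layer sizes and dataset here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden, lr = 8, 4, 0.1
W = rng.normal(0, 0.01, (n_visible, n_hidden))
b_v = np.zeros(n_visible)   # visible biases
b_h = np.zeros(n_hidden)    # hidden biases

# Tiny synthetic dataset: two repeated binary patterns.
data = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                 [0, 0, 0, 0, 1, 1, 1, 1]] * 10, dtype=float)

def cd1_step(v0):
    """One CD-1 update on a batch of visible vectors."""
    ph0 = sigmoid(v0 @ W + b_h)                       # hidden probs given data
    h0 = (rng.random(ph0.shape) < ph0).astype(float)  # sample hidden states
    pv1 = sigmoid(h0 @ W.T + b_v)                     # reconstruct visibles
    ph1 = sigmoid(pv1 @ W + b_h)                      # hidden probs given recon
    n = len(v0)
    return ((v0.T @ ph0 - pv1.T @ ph1) / n,           # positive - negative phase
            (v0 - pv1).mean(axis=0),
            (ph0 - ph1).mean(axis=0),
            np.mean((v0 - pv1) ** 2))                 # reconstruction error

err0 = cd1_step(data)[3]
for _ in range(500):
    dW, db_v, db_h, err = cd1_step(data)
    W += lr * dW; b_v += lr * db_v; b_h += lr * db_h
print(err0, err)  # reconstruction error should fall

# The learned W and b_h would then initialize one layer of a bottom-up
# neural net, which is discriminatively fine-tuned on labeled data.
```

Reconstruction error is only a rough progress indicator for CD training, but on a toy problem like this it falls quickly as the RBM learns the two patterns.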

Speakers

Geoffrey Hinton

Geoffrey Hinton received his PhD in Artificial Intelligence from Edinburgh in 1978. He did postdoctoral work at Sussex University and the University of California, San Diego and spent five years as a faculty member in the Computer Science department at Carnegie Mellon University. He then became a fellow of the Canadian Institute for Advanced Research and moved to the Department of Computer Science at the University of Toronto. He spent three years from 1998 until 2001 setting up the Gatsby Computational Neuroscience Unit at University College London and then returned to the University of Toronto where he is a University Professor. He is the director of the program on “Neural Computation and Adaptive Perception” which is funded by the Canadian Institute for Advanced Research.

Geoffrey Hinton is a fellow of the Royal Society, the Royal Society of Canada, and the Association for the Advancement of Artificial Intelligence. He is an honorary foreign member of the American Academy of Arts and Sciences, and a former president of the Cognitive Science Society. He received an honorary doctorate from the University of Edinburgh in 2001. He was awarded the first David E. Rumelhart prize (2001), the IJCAI award for research excellence (2005), the IEEE Neural Network Pioneer award (1998) and the ITAC/NSERC award for contributions to information technology (1992).

Geoffrey Hinton investigates ways of using neural networks for learning, memory, perception and symbol processing and has over 200 publications in these areas. He was one of the researchers who introduced the back-propagation algorithm that has been widely used for practical applications. His other contributions to neural network research include Boltzmann machines, distributed representations, time-delay neural nets, mixtures of experts, variational learning, and products of experts. His current main interest is in unsupervised learning procedures for multi-layer neural networks with rich sensory input.