Deep Learning: Practice and Trends

Nando de Freitas, Scott Reed, Oriol Vinyals

Chart comparing a regular program (purple) with neural versions from the RobustFill paper

This talk gave a solid overview of the current state and recent advances in Deep Learning. Convolutional Neural Networks (CNNs) and autoregressive models are starting to see ubiquitous use in production, showing a fast transition from research to industry. These models have taught us that introducing inductive biases such as translation invariance (CNNs) or time recurrence (Recurrent Neural Networks) can be extremely useful. We’ve also found that simple “tricks” such as Residual Networks or Attention can lead to tremendous leaps in performance, and there are good reasons to believe we will find more such “tricks”. Looking ahead, a few exciting research areas were mentioned:

Weakly supervised domain mappings: learning to translate from one domain to another, without explicit input/output pairs. Particular examples include auto-encoders or recent variants of Generative Adversarial Networks (GANs), such as CycleGAN or DiscoGAN.
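The key idea behind CycleGAN-style training is a cycle-consistency loss: two mappings are learned so that translating a sample to the other domain and back recovers the original, which removes the need for paired examples. Below is a minimal NumPy sketch of that loss; the mappings `G` and `F` are placeholder linear maps standing in for learned networks, and all names are illustrative, not from the paper's code.

```python
import numpy as np

# Toy illustration of the cycle-consistency idea behind CycleGAN:
# mappings G: X -> Y and F: Y -> X are trained so that
# F(G(x)) ~= x and G(F(y)) ~= y, with no paired (x, y) examples.
# G and F here are fixed linear maps, not learned networks.

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A_inv = np.linalg.inv(A)

def G(x):  # hypothetical forward mapping X -> Y
    return x @ A

def F(y):  # hypothetical reverse mapping Y -> X
    return y @ A_inv

def cycle_consistency_loss(x_batch, y_batch):
    # L1 reconstruction error in both directions
    loss_x = np.abs(F(G(x_batch)) - x_batch).mean()
    loss_y = np.abs(G(F(y_batch)) - y_batch).mean()
    return loss_x + loss_y

x = rng.standard_normal((8, 4))
y = rng.standard_normal((8, 4))
print(cycle_consistency_loss(x, y))  # near zero, since F inverts G exactly
```

In the real models this loss is added to the adversarial GAN losses; here the exact inverse makes the loss vanish, which is the fixed point the training objective pushes toward.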

Deep Learning for graphs: a lot of input data, such as friend networks, product recommendations, or representations of molecules in chemistry, can be thought of in graph form. Graphs as a data type are a generalization of sequences, which makes them widely applicable but leads to problems similar to those of sequence data, such as inefficient batching. In addition, it is hard to find a tractable tensor representation of most graphs. Message Passing Neural Networks are one of the proposed frameworks to tackle learning from graphs.
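To make the message-passing idea concrete, here is a minimal NumPy sketch of a single step in the general Message Passing Neural Network pattern: each node aggregates messages from its neighbors, then updates its own state. The graph, feature sizes, and weight matrices are all invented for illustration and are untrained.

```python
import numpy as np

# One message-passing step on a small graph: nodes sum transformed
# neighbor features ("messages"), then update their own state from
# the concatenation of old state and incoming messages.

rng = np.random.default_rng(0)

# 4-node undirected graph as an adjacency matrix (no self-loops)
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 1],
                [1, 0, 0, 1],
                [0, 1, 1, 0]], dtype=float)

h = rng.standard_normal((4, 8))        # node features: 4 nodes, 8 dims
W_msg = rng.standard_normal((8, 8))    # message weights (hypothetical)
W_upd = rng.standard_normal((16, 8))   # update weights (hypothetical)

def mp_step(adj, h):
    messages = adj @ (h @ W_msg)              # sum messages from neighbors
    combined = np.concatenate([h, messages], axis=1)
    return np.tanh(combined @ W_upd)          # updated node states

h_next = mp_step(adj, h)
print(h_next.shape)  # (4, 8): same nodes, new features
```

Note how the adjacency matrix does the batching work: one matrix multiply aggregates every node's neighborhood at once, which is exactly where irregular graph structure makes efficient batching across *different* graphs harder.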

Neural Programming consists of having a neural network generate functioning source code directly (program synthesis), or generate program outputs using a latent program representation (program induction). One of the first potential applications is creating programs that are more robust to noisy inputs. RobustFill introduces a neural version of Excel’s FlashFill feature that manages to still perform well when noise is added to the input data, contrary to the traditional program, which quickly breaks down.
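The synthesis/induction distinction can be illustrated with a toy string task (invented here, not from the RobustFill paper): in synthesis the model emits explicit source code; in induction there is no explicit program, only a learned input-to-output mapping, crudely faked below by a lookup table.

```python
# Toy contrast between program synthesis and program induction,
# on a made-up task: extract the first name from a full name.
examples = [("Jane Smith", "Jane"), ("Ada Lovelace", "Ada")]

# Program synthesis: the model outputs explicit, runnable source code.
synthesized = "lambda s: s.split(' ')[0]"
program = eval(synthesized)
assert all(program(x) == y for x, y in examples)

# Program induction: outputs are produced directly from a latent
# representation of the program (faked here by a dict); no source
# code is ever materialized.
induced = {x: y for x, y in examples}
print(induced["Jane Smith"])  # Jane
```

The practical difference: a synthesized program can be inspected and reused outside the model, while an induced one generalizes only through the network that learned it.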

Engineering and Reverse-Engineering Intelligence Using Probabilistic Programs, Program Induction, and Deep Learning

Josh Tenenbaum, Vikash K. Mansinghka

Illustration of intuitive physics models from Battaglia et al., 2013

Many AI technologies today rely on pattern recognition. However, the authors argue that intelligence is more than that, and that it consists of explaining the world, imagining possible consequences, and establishing a plan to solve problems. To bridge this gap, they propose addressing two issues:

Common sense in scene understanding: looking at a real estate photo and understanding the structure of a building, or guessing the goals of characters in a drawing.

Learning as model building: Understanding things by building a mental model of how the world works. For example, predicting which way a tower of blocks will fall from a still picture of it.

The authors describe an intuitive physics engine that successfully predicts the direction in which blocks will fall. These models, which work by building a simulation of the world, generalize very well and can easily be extended to answer other questions, such as what would happen if some of the blocks were heavier than others.
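A drastically simplified sketch of that simulation-based approach: predict a tower's fate from where its center of mass sits relative to its support base. The geometry, masses, and function names below are all invented for illustration, but they show how the same model answers the "what if some blocks were heavier?" question for free.

```python
# Toy "intuitive physics" judgment: a stack tips in the direction
# its center of mass overhangs the support base; otherwise it is
# stable. All numbers are made up for illustration.

def fall_direction(block_centers, masses, base=(-0.5, 0.5)):
    # Weighted average of block x-positions = center of mass.
    com = sum(x * m for x, m in zip(block_centers, masses)) / sum(masses)
    if com < base[0]:
        return "left"
    if com > base[1]:
        return "right"
    return "stable"

tower = [0.0, 0.3, 0.7]  # each block shifted a bit further right

# Equal masses: center of mass ~0.33, still over the base.
print(fall_direction(tower, [1, 1, 1]))  # stable

# Make the top block much heavier and the same model now
# predicts a fall, with no retraining or new machinery.
print(fall_direction(tower, [1, 1, 5]))  # right
```

This is the appeal of model-building over pattern recognition that the talk emphasizes: changing a physical assumption is a one-line change to the simulation, not a new dataset.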

This “common sense” based approach could allow learning from fewer examples than modern deep learning, and reduce edge cases.

Fairness in Machine Learning

Solon Barocas, Moritz Hardt

Caption of the loan simulator detailed below

Ethical dilemmas abound in machine learning but often get little attention compared to the breathless coverage of autonomous autos and human-level artificial gamers. This tutorial featured an excellent critique of the many ways machine learning algorithms can discriminate against protected classes: training on imbalanced sampling due to institutional policies like stop and frisk, training on biased historical data, or simply not accounting for the natural imbalance in literal minority classes. Following this overview, the speakers presented formal definitions that put these problems in a tractable format. See here for examples and details.
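As a flavor of what those formal definitions look like, here is a minimal sketch of one common criterion, demographic parity, which asks that the rate of positive decisions be equal across groups. The data and names are invented; the tutorial covers this and several competing criteria in much more depth.

```python
import numpy as np

# Demographic parity gap: difference in positive-decision rates
# between two groups defined by a protected attribute.
# All data below is made up for illustration.

y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])  # model decisions
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # protected attribute

def demographic_parity_gap(y_pred, group):
    rate_0 = y_pred[group == 0].mean()  # positive rate, group 0
    rate_1 = y_pred[group == 1].mean()  # positive rate, group 1
    return abs(rate_0 - rate_1)

print(demographic_parity_gap(y_pred, group))  # 0.5: 75% vs 25%
```

A gap of zero means both groups receive positive decisions at the same rate; note that satisfying this criterion can conflict with others (such as equal error rates), which is part of what makes the formalization valuable.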

Causal inference gives another tool to analyze discrimination, but turnkey solutions rarely exist for understanding causation. Perhaps the most interesting observation was that measurements are models themselves. We can’t measure “intelligence”, but we can measure IQ. We can’t measure “trustworthiness”, but we can measure credit score. These measures are only proxies, and have built-in biases and assumptions that are rarely given the attention they deserve.

Ultimately, we need to keep humans in the loop if we want to be sure we don’t let discrimination creep into our models. When AlphaGo beat Lee Sedol, commentators noted it made several distinctly inhuman moves. When we rely on algorithms to make critical decisions for us in vital domains, we shouldn’t be surprised when the results are inhumane.