Last November Synced ran an interview with Yoshua Bengio, in which the deep learning maverick, Université de Montréal Professor and MILA Scientific Director discussed his research and commented on the current state of deep learning and AI.

In this follow-up piece we look at the talk Bengio gave late last year at Tsinghua University in Beijing. Titled “Challenges for Deep Learning towards Human-Level AI,” the talk addressed difficulties Bengio and his collaborators are facing and the efforts they have made to improve deep learning for the development of human-like AI.

Research over the last decade has given us a much improved understanding of AI, such as why certain methods are helpful for model optimization and why deep learning is so useful. Researchers are showing great interest in deep learning and its potential for application across many different fields. However, despite advancements in deep learning, most AI successes still depend heavily on supervised learning, as machines alone still struggle to discover crucial high-level concepts in data.

Like many AI experts, Bengio believes machine performance remains far from human-level AI. For example, even an advanced image recognition classifier can be fooled into making stupid mistakes, such as identifying a dog as an ostrich, if just a few pixels are modified. Understanding pixels is not sufficient for fully understanding image content.

Image recognition model tricked by pixel-level manipulation
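
To make the idea concrete, here is a minimal sketch of how a small input change can flip a classifier's decision. It is not the model or the attack from the talk: it assumes a toy logistic-regression "classifier" on a 1,000-dimensional input and applies a fast-gradient-sign style step that nudges every coordinate slightly, rather than a handful of pixels. The names `adversarial_step`, `epsilon`, `w` and `x` are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adversarial_step(x, w, b, y, epsilon):
    """Move x by at most epsilon per coordinate in the direction that
    increases the cross-entropy loss of a logistic-regression model."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w          # gradient of the loss with respect to the input
    return x + epsilon * np.sign(grad_x)

# Toy setup (an assumption for illustration, not a real image model).
rng = np.random.default_rng(0)
w = rng.normal(size=1000)         # fixed "trained" weights
b = 0.0
x = 0.01 * np.sign(w)             # an input the model confidently labels as class 1
y = 1.0                           # true label

x_adv = adversarial_step(x, w, b, y, epsilon=0.02)

p_clean = sigmoid(w @ x + b)      # probability of class 1 before the change
p_adv = sigmoid(w @ x_adv + b)    # probability of class 1 after the change
print(f"clean: {p_clean:.4f}  perturbed: {p_adv:.4f}")
```

Each coordinate moves by no more than 0.02, yet the many small changes add up across the input dimensions and push the prediction to the other class. This is one way to see why a model that relies on surface statistics can be fragile.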

Because current deep learning networks tend to learn only superficial statistical regularities in a dataset, Bengio argues it is necessary to discover high-level representations that let machines identify and understand the underlying factors and mechanisms related to the data, and that make it possible to disentangle those factors. To encourage the discovery of the right kind of disentangled representations, Bengio proposed in his 2013 paper “Deep Learning of Representations: Looking Forward” that it could be very helpful to introduce and exploit “priors,” meaning embedded knowledge such as mathematical or physical assumptions. Priors can “enhance the discovery of representations which disentangle the underlying and unknown factors of variation.”

Examples of prior knowledge include: 1) different underlying factors exist at different temporal and spatial scales; 2) dependencies between factors become simple when data is mapped into the right high-dimensional space; and 3) some factors correspond to independently controllable aspects of the world.

In his presentation Bengio also discussed a prior, or constraint, for representation learning that he first proposed in his 2017 paper “The Consciousness Prior,” inspired by cognitive psychology and previous work on attention and consciousness. Based on the idea of a consciousness prior, Bengio proposes an objective function in abstract space that exploits the soft attention mechanism to derive, combine, and transfer a few elements from the high-dimensional unconscious state representation into low-dimensional conscious state vectors. These vectors correspond to useful statements that can be used to make plans or predict future situations, which are key abilities machines should have to become human-level AIs.

Consciousness Prior structure
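
The selection step can be illustrated with a short sketch. It is not the architecture from the paper: it assumes the unconscious state is a plain vector `h`, that each element has a hypothetical key vector, and that a query vector scores the elements through softmax attention, after which the `k` highest-weighted elements are kept as the low-dimensional "conscious" state. The function name `attend_top_k` and all shapes are illustrative.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)             # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def attend_top_k(h, keys, query, k=3):
    """Score every element of h with soft attention and keep the k
    elements that receive the largest attention weights.

    h:     (n,)   high-dimensional state
    keys:  (n, d) one key vector per element of h (hypothetical)
    query: (d,)   what the attention mechanism is looking for
    """
    weights = softmax(keys @ query)           # soft attention over all n elements
    idx = np.argsort(weights)[-k:][::-1]      # indices of the k largest weights
    return idx, h[idx], weights

rng = np.random.default_rng(0)
n, d = 64, 8
h = rng.normal(size=n)
keys = rng.normal(size=(n, d))
query = rng.normal(size=d)

idx, values, weights = attend_top_k(h, keys, query, k=3)
print("selected elements:", idx)
print("their values:", values)
```

The result pairs a few indices (which factors are attended to) with their values, a far smaller object than the full state `h`. How the query is produced and how the selected elements are used for prediction are left out of this sketch.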

Bengio’s proposed consciousness prior fuses unconscious and conscious cognition: the former is intuitive, fast and non-linguistic, while the latter is slow, logical, sequential, linguistic and algorithmic, and is what classical AIs attempt with symbolic systems.

Bengio believes the two kinds of cognition need to be handled together, because something important may be missing otherwise. This is especially relevant to another main theme of his Tsinghua talk, grounded language learning, which requires a combination of language learning and world modeling.

Inspired by babies who build up models of intuitive physics and psychology by observing and interacting with the world before they master languages, Bengio proposes that machines would be more powerful if they could better understand causal effects. To learn causality, it is better to move away from the traditional machine learning frameworks of passive observation of data and move toward deep learning for agents in simulated virtual environments.

Bengio and his collaborators launched the ongoing project “BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop” to investigate grounded language learning. They asked an artificial agent tutored by a simulated human expert to complete 19 levels of tasks formulated in a synthetic “Baby” language comprising some 25 quintillion instructions, in a gridworld environment with partial observability (MiniGrid).

Three “BabyAI” levels built using the MiniGrid environment.
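
Partial observability means the agent sees only part of the grid at each step. The sketch below illustrates the idea with a self-contained toy grid. It does not use the MiniGrid or BabyAI APIs, and it assumes a square window centered on the agent, which may differ from how MiniGrid defines the agent's view. The cell codes and the `local_view` helper are hypothetical.

```python
import numpy as np

# Hypothetical cell codes for a toy grid (not MiniGrid's encoding).
EMPTY, WALL, GOAL, UNSEEN = 0, 1, 2, -1

def local_view(grid, pos, radius=1):
    """Return the (2*radius+1) x (2*radius+1) window around pos.
    Cells that fall outside the grid are marked UNSEEN."""
    size = 2 * radius + 1
    view = np.full((size, size), UNSEEN, dtype=int)
    r0, c0 = pos
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            r, c = r0 + dr, c0 + dc
            if 0 <= r < grid.shape[0] and 0 <= c < grid.shape[1]:
                view[dr + radius, dc + radius] = grid[r, c]
    return view

# A 5x5 room surrounded by walls, with a goal in the lower-right corner.
grid = np.zeros((5, 5), dtype=int)
grid[0, :] = grid[-1, :] = grid[:, 0] = grid[:, -1] = WALL
grid[3, 3] = GOAL

view_far = local_view(grid, (1, 1))    # agent in the upper-left corner of the room
view_near = local_view(grid, (2, 2))   # agent in the center of the room
print(view_far)
print(view_near)
```

From the corner the goal lies outside the window, so the agent has to explore before it can act on an instruction that refers to it. From the center the goal becomes visible.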

Developers have tested data efficiency for language learning on the BabyAI platform. Preliminary results suggest that data efficiency still needs improvement, as an enormous amount of data and hundreds of thousands of demonstrations are currently required to achieve a 99 percent success rate even for very simple tasks.

BabyAI levels and tasks can be easily extended, and interested scientists can use the environment without requiring huge compute resources. The BabyAI code is open-sourced on GitHub.

For more regarding the Tsinghua University talk, readers can download the slides for Bengio’s presentation.

Source: Synced China