Sometimes, advances in AI are a result of the combined effect of particular avenues of research that have made progress over the years.

Such is the case for a very interesting recent bit of work from the artificial intelligence lab at the University of California, Berkeley, specifically professor Sergey Levine and colleague Dr. Chelsea Finn, together with Sham Kakade, a leading machine learning theory expert, and his student Aravind Rajeswaran at the University of Washington.

You may be familiar with Levine from many projects he's done with robots over the years. Levine has been working to move robotics more and more toward a kind of comprehensive approach to "learning," whereby a robot, or its counterpart in a computer simulation, an "agent," learns how to learn, so to speak. The goal is to have the training of a computer system lead to good performance on new, never-before-seen tasks. (Some background on the approach can be found in a blog post on the official Nvidia corporate blog.)

The challenge of the latest work can be summed up as how to give neural networks an ability not just to generalize from one learned task to another, but to continually sharpen that ability to generalize over time, with exposure to new tasks. And, to do so with a minimum of data required as examples, given that many new tasks a neural network confronts over time may not have a lot of training data available, or, at least, not a lot of "labeled" training data.

The result is described in a paper out last week, "Online Meta-Learning," posted on the arXiv pre-print server.

The current research has echoes in Levine's other work that's closer to robotics per se. ZDNet back in October related how Levine trains robot simulations -- agents -- to infer movement from multiple frames of video from YouTube. There's a parallel with online meta-learning, in that the computer is learning how to extend its understanding across examples in time, sharpening its ability to understand, in a sense.

The approach that lead authors Finn and Rajeswaran pursue is to combine two different approaches that the teams have explored extensively in recent years: meta-learning and online learning.

In meta-learning, a neural network is in a sense pre-trained on some tasks, which then allows it to achieve a kind of transfer of skills as it is tested with new types of challenges that are different from what it was trained on. Levine and his team developed an extensive system for this back in 2017, called "model-agnostic meta-learning," or "MAML," a strategy that can be applied across any number of different neural networks, from classic "feed-forward" networks to "convolutional neural networks."
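To make the idea concrete, here is a minimal sketch of the two-level structure behind MAML, not the authors' implementation. It assumes a deliberately tiny setting: a single scalar parameter `w` and hypothetical tasks whose loss is `0.5 * (w - c)**2`, so the gradient through the adaptation step can be written by hand. The names `adapt`, `meta_grad`, and `maml`, and the step sizes `alpha` and `beta`, are illustrative choices.

```python
# Toy MAML sketch: each task has loss f(w) = 0.5 * (w - c)**2,
# where c is that task's (hypothetical) optimum.

def task_grad(w, c):
    """Gradient of f(w) = 0.5 * (w - c)**2."""
    return w - c

def adapt(w, c, alpha):
    """Inner loop: one gradient step on a task, starting from the shared w."""
    return w - alpha * task_grad(w, c)

def meta_grad(w, c, alpha):
    """Gradient of the post-adaptation loss f(adapt(w)) with respect to w.

    Chain rule: d adapt / dw = (1 - alpha), so the result is
    (1 - alpha) * f'(adapt(w)).
    """
    return (1.0 - alpha) * task_grad(adapt(w, c, alpha), c)

def maml(task_optima, alpha=0.1, beta=0.5, steps=200):
    """Outer loop: gradient descent on the average post-adaptation loss."""
    w = 0.0
    for _ in range(steps):
        g = sum(meta_grad(w, c, alpha) for c in task_optima) / len(task_optima)
        w -= beta * g
    return w

# Shared starting point learned from three toy tasks.
w_meta = maml([1.0, 2.0, 6.0])
```

In this toy case the learned starting point settles at the average of the task optima; a real network would instead rely on automatic differentiation to pass gradients through the inner step.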

The authors built upon that MAML approach, but sought to solve one of its weak points: its ability to generalize essentially stops improving after the initial pre-training, and it doesn't adapt over time. To fix that, the authors drew upon another long line of research, online learning. In online learning, a neural network keeps improving by comparing how different possible configurations of its parameters performed over time on each new task. The network seeks in this way to find a setting of its parameters that minimizes "regret," the cumulative gap between its actual performance on the tasks seen so far and the best performance a single fixed setting could have achieved in hindsight.
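As a rough illustration of regret, the sketch below runs the classic "follow the leader" rule on a toy problem. It assumes scalar quadratic losses `0.5 * (w - c)**2`, for which the best parameter on past rounds is simply the mean of the targets seen so far; the function names and the sample targets are hypothetical.

```python
def loss(w, c):
    """Toy per-round loss with optimum at c."""
    return 0.5 * (w - c) ** 2

def follow_the_leader(targets, w0=0.0):
    """Each round, play the parameter that was best on all past rounds.

    For these quadratic losses that parameter is the mean of past targets.
    """
    plays, seen = [], []
    for c in targets:
        w = sum(seen) / len(seen) if seen else w0
        plays.append(w)
        seen.append(c)  # the round's loss is revealed only after playing
    return plays

def regret(plays, targets):
    """Cumulative loss of the plays minus that of the best fixed parameter."""
    best = sum(targets) / len(targets)
    incurred = sum(loss(w, c) for w, c in zip(plays, targets))
    optimal = sum(loss(best, c) for c in targets)
    return incurred - optimal

targets = [1.0, 3.0, 2.0, 2.0]
plays = follow_the_leader(targets)   # [0.0, 1.0, 2.0, 2.0]
total_regret = regret(plays, targets)  # 1.5
```

A good online learner keeps this total from growing as fast as the number of rounds, so the average regret per round shrinks over time.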

The authors devised an algorithm called "follow the meta-leader," a play on words combining the term meta-learning with the name of one of the most successful online learning algorithms, "follow the leader," first developed back in the 1950s by Jim Hannan for the domain of game theory.

The agent is presented with round after round of tasks drawn from a family of tasks, here things such as transforming images of numbers in the classic MNIST data set, performing "pose prediction" of objects in a scene, or classifying objects. After each round, the agent tries to minimize that regret function by fine-tuning the weights, or parameters, that it has developed over time. All of this happens via the classic neural net optimization approach, stochastic gradient descent.
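The round-by-round procedure described above can be sketched loosely as follows. This is a simplified toy under stated assumptions, not the paper's algorithm as published: tasks are hypothetical scalar quadratics with optimum `c`, the agent keeps every past task in a buffer, and after each round it takes a few stochastic gradient steps on the post-adaptation loss of tasks sampled from that buffer. The name `ftml_sketch` and all step sizes are illustrative.

```python
import random

def grad(w, c):
    """Gradient of the toy task loss 0.5 * (w - c)**2."""
    return w - c

def adapt(w, c, alpha):
    """One fine-tuning step on a task, starting from the meta-parameter w."""
    return w - alpha * grad(w, c)

def ftml_sketch(task_stream, alpha=0.1, beta=0.2, meta_steps=50, seed=0):
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    w = 0.0       # meta-parameter carried across rounds
    buffer = []   # every task seen so far
    losses = []   # loss on each new task after one adaptation step
    for c in task_stream:
        # A new task arrives: adapt from the current w and record the loss.
        w_task = adapt(w, c, alpha)
        losses.append(0.5 * (w_task - c) ** 2)
        buffer.append(c)
        # Meta-update: SGD on the post-adaptation loss of sampled past tasks.
        for _ in range(meta_steps):
            k = rng.choice(buffer)
            w -= beta * (1.0 - alpha) * grad(adapt(w, k, alpha), k)
    return w, losses

w_final, losses = ftml_sketch([2.0] * 5)
```

Because the buffer holds every past task, memory grows with the number of rounds, which is the scalability concern the authors raise.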

The authors show some impressive benchmark results on those tasks versus prior approaches, including one called "Train on Everything," or "TOE."

The paper concludes with the observation that the approach is "in some sense, a more natural perspective on the ideal real-world learning procedure." As the authors argue, "an intelligent agent interacting with a constantly changing environment" is one that "should utilize streaming experience to both master the task at hand, and become more proficient at learning new tasks in the future."

There are some limits, however. One big one is computation. Future refinements will be needed to manage the stored data of past tasks and to come up with "computationally cheaper" algorithms.

"While we showed that our method can effectively learn nearly 100 tasks in sequence without significant burdens on compute or memory, scalability remains a concern," they write. "Can a more streaming algorithm like mirror descent that does not store all the past experiences be successful as well?"
