2nd February 2019

Game-playing AI is 10 times faster than Google's DeepMind

A new AI developed by RMIT University in Melbourne and trained to play the 1980s video game Montezuma's Revenge is reported to identify sub-goals 10 times faster than Google DeepMind's system and is able to finish the game.

In 2015, a famous study showed Google DeepMind's AI autonomously playing Atari video games such as Video Pinball at human level, but notoriously failing to find a path to the first key in Montezuma's Revenge because of the game's complexity. A new algorithm developed at RMIT University in Melbourne, Australia, allows computers to learn from their mistakes and identify sub-goals 10 times faster than DeepMind's system, and so go on to finish the game.

This breakthrough combines "carrot-and-stick" reinforcement learning with an intrinsic motivation approach that "rewards" the AI for being curious and exploring its environment.
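As a rough illustration of the idea, and not the RMIT team's actual algorithm, the Python sketch below adds a simple curiosity bonus to the reward a game already provides. It assumes a count-based bonus, one common way to reward exploration, in which a state earns less extra reward each time it is revisited. The class name, the `beta` weight and the state labels are hypothetical and chosen only for this example.

```python
import math
from collections import defaultdict


class CuriosityReward:
    """Adds a count-based novelty bonus to the game's own (extrinsic) reward.

    Illustrative only: the bonus formula and the beta weight are assumptions,
    not details taken from the RMIT paper.
    """

    def __init__(self, beta=1.0):
        self.beta = beta                # weight of the curiosity bonus
        self.visits = defaultdict(int)  # how often each state has been seen

    def __call__(self, state, extrinsic_reward):
        self.visits[state] += 1
        # The bonus shrinks as a state becomes familiar: beta / sqrt(visits).
        intrinsic = self.beta / math.sqrt(self.visits[state])
        return extrinsic_reward + intrinsic


reward_fn = CuriosityReward(beta=0.5)

first = reward_fn("ladder_top", 0.0)   # novel state, no game reward yet
for _ in range(2):
    reward_fn("ladder_top", 0.0)       # second and third visits
fourth = reward_fn("ladder_top", 0.0)  # same state, now familiar
key = reward_fn("key_room", 100.0)     # novel state plus a game reward

print(first, fourth, key)
```

In this sketch the agent still receives a small reward for reaching somewhere new even when the game itself pays nothing, which is the situation an agent faces for long stretches of Montezuma's Revenge.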

"Truly intelligent AI needs to be able to learn to complete tasks autonomously in ambiguous environments," says Associate Professor Fabio Zambetta from the Computer Science and Software Engineering Department at RMIT. "We've shown that the right kind of algorithms can improve results using a smarter approach, rather than purely 'brute forcing' a problem end-to-end on very powerful computers. Our results show how much closer we're getting to autonomous AI and could be a key line of inquiry if we want to keep making substantial progress in this field."

Zambetta's method rewards the system for autonomously exploring useful sub-goals such as 'climb that ladder' or 'jump over that pit', which may not be obvious to a computer, within the context of completing a larger mission. Previous state-of-the-art systems have required human input to identify these sub-goals or else decided what to do next randomly.

"Not only did our algorithms autonomously identify relevant tasks roughly 10 times faster than Google DeepMind while playing Montezuma's Revenge, they also exhibited relatively human-like behaviour while doing so," Zambetta says. "For example, before you can get to the second screen of the game you need to identify sub-tasks such as climbing ladders, jumping over an enemy and then finally picking up a key, roughly in that order. This would eventually happen randomly after a huge amount of time, but to happen so naturally in our testing shows some sort of intent. This makes ours the first fully autonomous sub-goal-oriented agent to be truly competitive with state-of-the-art agents on these games."

When supplied with raw visual inputs, the system could work outside of video games in a wide range of tasks, according to Zambetta.

"Creating an algorithm that can complete video games may sound trivial – but the fact we've designed one that can cope with ambiguity while choosing from an arbitrary number of possible actions is a critical advance," he adds. "It means that, with time, this technology will be valuable to achieve goals in the real world, whether in self-driving cars, or as useful robotic assistants with natural language recognition."

A paper on the breakthrough, Deriving Subgoals Autonomously to Accelerate Learning in Sparse Reward Domains, was presented at the 33rd AAAI Conference on Artificial Intelligence in Honolulu, Hawaii, yesterday.
