Google-owned startup DeepMind Technologies has built an artificial intelligence agent that can learn to successfully play 49 classic Atari games by itself, with minimal input.

Cofounder and WIRED2014 speaker Demis Hassabis called the move, detailed in a paper published in Nature, "the first significant rung on the ladder to proving general learning systems can work". "It's the first time that anyone has built a single general learning system that can learn directly from experience," he told journalists ahead of the announcement. "The ultimate goal is to build general purpose smart machines -- that's many decades away. But this is going from pixels to actions, and it can work on a challenging task even humans find difficult. It's a baby step, but an important one."



Google acquired the London-based startup for a reported sum of £300 million in January 2014, following rumours that Facebook was also interested. Later that year, speaking at WIRED2014, former child chess prodigy Hassabis spoke about how DeepMind's AI -- or agent, as it is referred to internally -- had developed a perfect Breakout strategy (engineering a tunnel so the ball hits the top of the screen) after being left to play the Atari game overnight. "It's now better at playing the game than any human. It has perfectly modelled this complex stream," Hassabis said at the time.

In the Nature paper published today (25 February), however, Hassabis and his coauthors reveal how the deep Q-network (DQN) combines a very human type of learning, known as reinforcement learning, with deep learning -- the method Google used back in 2012 to teach its AI to recognise images of cats in YouTube videos. Hassabis noted that this is the first time an open system has combined the two approaches.
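
In broad strokes -- this follows the standard formulation in the Nature paper rather than extra detail from the briefing -- DQN trains a deep convolutional network with weights \(\theta\) to estimate how good each joystick action is in a given screen state, and repeatedly nudges those estimates towards a target built from the game score:

\[
L(\theta) = \mathbb{E}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta^{-}) - Q(s, a; \theta)\right)^{2}\right]
\]

Here \(s\) is the current screen, \(a\) the action taken, \(r\) the resulting change in score, \(s'\) the next screen, \(\gamma\) a discount factor, and \(\theta^{-}\) the weights of a periodically refreshed copy of the network that keeps the targets stable.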


Hassabis, who also has a PhD in cognitive neuroscience from University College London, believes focusing on the biological functions of learning could be the key to cracking AI. "We learn through things like memory replay through the hippocampus, so there are crossovers between neuroscience and this," he told journalists.
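
The computational counterpart of that memory replay is an experience-replay buffer: the agent records the transitions it lives through and trains on random samples of them, rather than only on the most recent frame. A minimal sketch of the idea in Python -- the class and parameter names here are illustrative, not DeepMind's code:

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores (state, action, reward, next_state, done) transitions and
    serves random minibatches, breaking the correlation between consecutive
    frames -- a software analogue of the hippocampal replay Hassabis describes."""

    def __init__(self, capacity=100_000):
        self.memory = deque(maxlen=capacity)  # oldest experience is dropped first

    def store(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniform random sample over everything the agent has seen so far
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)
```

During training, the network is updated on these sampled batches rather than on each new frame as it arrives, which makes learning far more stable.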

DQN was given only pixel and score information, and was otherwise left to its own devices to devise strategies and play the 49 Atari games. This contrasts with much-publicised AI systems such as IBM's Watson or Deep Blue, which rely on pre-programmed information to hone their skills. "With Deep Blue there were chess grandmasters on the development team distilling their chess knowledge into the programme, and it executed it without learning anything," said Hassabis. "Ours learns from the ground up. We give it a perceptual experience and it learns from that directly. It learns and adapts from unexpected things, and programme designers don't have to know the solution themselves." He added: "The interesting and cool thing about AI tech is that it can actually teach you, as the creator, something new. I can't think of many other technologies that can do that."
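
Concretely, all the agent receives at each step is the screen and the change in score; which button to press is its own problem. A simplified interaction loop, with hypothetical `env` and `q_network` objects standing in for the emulator interface and the trained network:

```python
import random

import numpy as np

def play_episode(env, q_network, n_actions=18, epsilon=0.05):
    """Play one game from raw pixels and score alone.

    `env` is assumed to expose reset()/step() returning screen frames and the
    score change; `q_network` maps a frame to one estimated value per joystick
    action. Both are placeholders, not DeepMind's actual interfaces.
    """
    frame = env.reset()                          # raw pixel array
    total_score, done = 0, False
    while not done:
        if random.random() < epsilon:            # occasionally explore at random
            action = random.randrange(n_actions)
        else:                                    # otherwise act on learned values
            action = int(np.argmax(q_network(frame)))
        frame, reward, done = env.step(action)   # reward is simply the score change
        total_score += reward
    return total_score
```

No game-specific rules appear anywhere in the loop: the same procedure is pointed at each of the 49 games in turn.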



As a result of this approach, DQN -- which trained on each game for two weeks -- achieved more than 75 percent of the human score on more than half of the games, and outperformed AIs that rely on reinforcement learning alone. It even found loopholes in the games that the team did not know about. The computations were run without a supercomputer, although the team suggests one would make progress even faster. "It is worth noting that the games in which DQN excels are extremely varied in their nature, from side-scrolling shooters (River Raid) to boxing games (Boxing) and three-dimensional car-racing games (Enduro)," the team writes in the Nature paper.

This variety matters because DeepMind believes its AI is on the road to becoming a general AI that can be applied to any decision-making situation. Any kind of information could serve as the input for such general applications; in this instance the team simply chose to provide only pixel and score feedback.

Before being acquired by Google, the company was hopeful its technology could one day be applied to climate science or disease modelling. For now, though, the team will be moving from Atari games to games of the '90s -- including 3D and racing games "where the challenge is much greater". The long-term goal is then to apply what is learned to Google's own products, including Search, Translate and, presumably, its driverless car tech.