RL Weekly 14: OpenAI Five and Berkeley Blue

by Seungjae Ryan Lee

OpenAI Five wins 2-0 against OG eSports

What it is

OpenAI Five is a Dota 2 artificial intelligence developed by OpenAI. Previously, at The International 2018 (TI8), the most prestigious annual Dota 2 tournament, OpenAI Five lost against paiN Gaming and an all-star team of former Chinese players. On April 13th, OpenAI Five played against OG eSports, the team that won TI8, and under a restricted set of rules, it won 2-0.

OpenAI also showcased a demonstration match between two mixed teams, each pairing two human players with OpenAI Five, and announced “OpenAI Five Arena,” where players around the world can register to play with or against OpenAI Five.

Why it matters

There have been a few substantial “AI vs. Human” events. These include Deep Blue vs. Kasparov (chess), AlphaGo vs. Lee Sedol (Go), and AlphaStar vs. MaNa (StarCraft 2). StarCraft 2 and Dota 2 pose challenges that chess and Go do not. First, each game requires a few tens of thousands of actions, whereas a game of chess or Go ends after a few hundred moves. Second, the action space and the observation space are far larger than in chess or Go. Finally, most of the map is covered by “fog of war,” so the state of the game is only partially observable to the agent. Despite such challenges, OpenAI Five showed great prowess in Dota 2.

It is worth noting how much computation was needed to train OpenAI Five. OpenAI estimated that OpenAI Five has played 45,000 years of Dota 2, which is not an amount of computation most companies can afford. Even at this scale, the results come with some caveats. Dota 2 has 117 “heroes” that players can choose from, but in this match only 17 heroes were allowed. Because each hero is unique and a different combination of heroes can result in a very different game, an unrestricted pool of heroes would have drastically increased the amount of training needed.
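To get a rough feel for how much the hero restriction shrinks the problem, here is a minimal sketch that counts the five-hero lineups a single team could field from each pool. This is only an illustration: the helper name `team_lineups` is made up for this example, the count ignores draft rules such as bans and the opposing team's picks, and it is not a measure OpenAI reported or a direct estimate of training cost.

```python
from math import comb

TEAM_SIZE = 5  # a Dota 2 team fields five distinct heroes


def team_lineups(pool_size: int, team_size: int = TEAM_SIZE) -> int:
    """Count unordered sets of `team_size` distinct heroes drawn from the pool.

    Hypothetical helper for illustration only; it ignores bans and the
    opposing team's picks.
    """
    return comb(pool_size, team_size)


restricted = team_lineups(17)     # hero pool allowed in the OG match
unrestricted = team_lineups(117)  # full hero roster cited in the text

print(f"17-hero pool:  {restricted:,} possible lineups")
print(f"117-hero pool: {unrestricted:,} possible lineups")
print(f"ratio: about {unrestricted / restricted:,.0f}x")
```

Under these simplifying assumptions, the full roster allows tens of thousands of times more single-team lineups than the 17-hero pool, before even considering how two lineups interact.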

Read more

External Resources

Berkeley releases BLUE robot

What it is

The Robot Learning Lab at UC Berkeley released BLUE: Berkeley robot for Learning in Unstructured Environments. BLUE is a human-scale robot arm with 7 degrees of freedom (DOF) and a 2 kg payload that costs less than $5,000 if produced in volumes of over 1,500.

Why it matters

Human-scale robots with a medium-to-high number of degrees of freedom are extremely expensive: similar robotic arms cost more than $30,000. By defining “useful” bandwidth and payload metrics for a set of outlined tasks, the team was able to trade away performance that these tasks do not need and thereby reduce costs.

Read more

Some more exciting news in RL: