Three weeks ago we announced the release of the Obstacle Tower Environment, a new benchmark for Artificial Intelligence (AI) research built using our ML-Agents toolkit. One week ago we followed that up with the launch of the Obstacle Tower Challenge, a contest that offers researchers and developers the chance to compete to train the best-performing agents on this new task. The reception so far from the community has been great. I wanted to take the time to talk a little more about our motivation for the challenge, and what we hope it will promote.

The idea for the Obstacle Tower came from looking at the benchmarks currently used in Artificial Intelligence research. Despite the great theoretical and engineering work being put into developing new algorithms, many researchers were still focused on decades-old home console games such as Pong, Breakout, or Ms. Pac-Man. Aside from having crude graphics and gameplay mechanics, these games are also completely deterministic, meaning that a player (or computer) could memorize a series of button presses and solve them blindfolded. Given these drawbacks, we wanted to start from scratch and build a procedurally generated environment that we believe can serve as a benchmark that pushes modern AI algorithms to their limits. Specifically, we wanted to focus on AI agents' vision, control, planning, and generalization abilities.
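To make the point about determinism concrete, here is a minimal toy sketch in Python. It is not the Obstacle Tower Environment or its API: the door-and-button game, the helper names (`make_layout`, `doors_opened`), and the constants are all hypothetical, chosen only to show why a memorized sequence of button presses succeeds on a fixed level but breaks down once levels are procedurally generated.

```python
import random

# Hypothetical toy values, not taken from Obstacle Tower.
N_DOORS = 5      # doors the player must open in one episode
N_BUTTONS = 4    # buttons the player can press at each door


def make_layout(seed):
    """Return, for each door, the one button that opens it.

    A fixed seed always yields the same layout (a deterministic game);
    a new seed per episode stands in for procedural generation.
    """
    rng = random.Random(seed)
    return [rng.randrange(N_BUTTONS) for _ in range(N_DOORS)]


def doors_opened(layout, actions):
    """Count the doors opened before the first wrong button press."""
    opened = 0
    for correct, pressed in zip(layout, actions):
        if pressed != correct:
            break
        opened += 1
    return opened


# The "blindfolded" player: memorize the solution to one fixed layout
# and replay it without ever looking at the level.
memorized = make_layout(seed=0)

# Deterministic game: every episode reuses the layout from seed 0.
fixed_scores = [doors_opened(make_layout(0), memorized) for _ in range(100)]

# Procedurally generated game: every episode draws a different layout.
varied_scores = [doors_opened(make_layout(s), memorized) for s in range(1, 101)]

fixed_mean = sum(fixed_scores) / len(fixed_scores)
varied_mean = sum(varied_scores) / len(varied_scores)

print(f"fixed layout:      {fixed_mean:.2f} of {N_DOORS} doors on average")
print(f"generated layouts: {varied_mean:.2f} of {N_DOORS} doors on average")
```

On the fixed layout the memorized sequence opens every door in every episode. On generated layouts it gets through a door only when the new layout happens to match the memorized button, so the average drops sharply. An agent has to actually look at the level to do better, which is the kind of ability a procedurally generated benchmark is meant to measure.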

We believe that the Obstacle Tower has the potential to contribute to research into AI, specifically a sub-field called Deep Reinforcement Learning (Deep RL), which focuses on agents that learn from trial-and-error experience. Our own internal tests have shown that even the current state-of-the-art algorithms in Deep RL are able to solve only a few test floors of Obstacle Tower on average. The graph below is taken from our paper, and shows that the top Deep RL algorithms (PPO and Rainbow) are still nowhere near the average human player when it comes to learning to play a deterministic version of the game (No Generalization), let alone a version where things look and play differently from what the agents were trained on (Weak and Strong Generalization).

At Unity, we think that the research being conducted on AI benefits not only the broader technology community but also game developers and players. Smarter AI means better NPCs, more thorough playtesting, and ultimately more engaging player experiences. That is why we decided to launch the Obstacle Tower Challenge: to invite the best minds in Deep RL research and beyond to try to solve the tower, and to have those insights contribute to a wider world.

To help us evaluate entries, we have teamed up with AICrowd, a platform for hosting Machine Learning challenges. The challenge is under way, with a Round 1 submission deadline of March 31st. Participants in the contest will submit trained agents, which will be evaluated on a special test set of Obstacle Tower levels. To enter the contest, learn more about the process, and get started, go here.

We are happy to share that Google Cloud Platform (GCP) is a prize sponsor of the contest. On top of the cash prizes and travel grants provided by Unity, winning participants will also receive GCP credits. These prizes are collectively valued at over $100K! Using GCP, it is possible to train agents remotely in the cloud rather than on desktop resources. This can both speed up training and make it simpler to run multiple concurrent experiments. Users who sign up for a new GCP account get $300 in free credit. On top of this, the first 50 participants who pass Round 1 of the Obstacle Tower Challenge will receive an additional $1100 in credits. The top three winners from Round 2 will receive an additional $5000 in credits.

For those new to training agents, or those wanting an easy way to get started, we have written a guide on training an agent on Google Cloud Platform. The guide walks through setting up a cloud computing instance and using a state-of-the-art algorithm provided by Google Dopamine to train an agent to progress in the Obstacle Tower. You can read the guide here.

If you have any questions about the contest, including support on submitting entries, please see the discussion forum here. For general issues or discussion of the environment itself, see our GitHub repo here. To learn more about the environment, read our research paper. We look forward to seeing the creative solutions the community comes up with for the challenge!