Tencent created AI agents that can beat StarCraft 2’s Cheater AI

Researchers from Chinese tech giant Tencent recently developed a pair of AI agents capable of defeating StarCraft II’s (SC2) “AI” on the highest difficulty levels in full matches — making them the first to do so.

In a recently published white paper, the researchers explain the development of the two agents, dubbed TSTARBOT1 and TSTARBOT2. The first is a macro-level controller agent that oversees several specific algorithms designed to handle lower-level functions. TSTARBOT2, the more robust of the two, is a macro-micro controller consisting of several modules that handle entire facets of the gameplay independently.

Like all neural network-based AI agents, the TSTARBOTS were designed to imitate the human thought process.

Playing StarCraft 2 isn’t like playing Go or Chess, where all the pieces are on the table and in plain sight. SC2 players often can’t see each other’s units until they’ve “scouted” the map, which in Tencent’s experiments has “fog of war” turned on. And even then, there’s an insane amount of information for players to observe and process.

The AI agents were both trained to play a 1v1 Zerg versus Zerg match on Abyssal Reef, a map that has traditionally stymied neural networks trying to win against the CPU. In just a couple of days, both agents could defeat the computer on the hardest setting: level 10.

The cool part: The agents are trained on a single GPU. The not-so-cool part: It takes a crazy number of processors to churn through the ridiculous amount of data needed to train the bots on billions of frames of video. According to the researchers:

We currently take 1920 parallel actors (with 3840 CPUs across 80 machines) to generate the replay transitions, at the speed of about 16,000 frames per second. This significantly reduces the training time (from weeks to days), and also improves the learning stability thanks to the increased diversity of the explored trajectories.
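
For a rough sense of scale, the figures in that quote can be run through some back-of-the-envelope arithmetic. The sketch below uses only the numbers quoted above. The variable and function names are just for illustration, and the per-actor and per-day figures are derived estimates that assume the quoted throughput stays constant, not numbers reported by the researchers.

```python
# Figures taken from the researchers' quote.
ACTORS = 1920
CPUS = 3840
MACHINES = 80
FRAMES_PER_SECOND = 16_000  # quoted as "about 16,000", so results are approximate

SECONDS_PER_DAY = 24 * 60 * 60


def scale_summary():
    """Derive per-actor, per-machine and per-day figures from the quoted totals."""
    return {
        "cpus_per_actor": CPUS / ACTORS,
        "cpus_per_machine": CPUS / MACHINES,
        "actors_per_machine": ACTORS / MACHINES,
        "frames_per_actor_per_second": FRAMES_PER_SECOND / ACTORS,
        # Assumes the cluster sustains the quoted rate around the clock.
        "frames_per_day": FRAMES_PER_SECOND * SECONDS_PER_DAY,
    }


if __name__ == "__main__":
    for name, value in scale_summary().items():
        print(f"{name}: {value:,.2f}")
```

At that rate the setup would produce roughly 1.4 billion frames a day, which squares with training on billions of frames in a matter of days.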

One of the reasons the task is so difficult is that StarCraft 2’s three highest difficulty settings feature “AI” that cheats. No, I’m not just saying that because I can’t beat it – it’s actually called Cheater AI. At the highest level, the computer knows where all resources are, has no fog of war, and can always see every unit on the map. It’s a clearly unfair advantage designed to be incredibly challenging for opponents to overcome.

The TSTARBOTs don’t have any advantages that a human doesn’t: they have to interface with the game in terms of mouse clicks and macros, and they “see” the exact same thing a person would. Though, since algorithms don’t have eyes, they interpret video output frame-by-frame and translate visual information into data they can work with.

To revisit the Chess and Go comparison from above, there’s also the fact that those games are turn-based. In SC2, players all act in real time. Couple that with the fact that there are thousands of units in play, and it quickly becomes nigh-unmanageable for all but the most talented and skilled players, human or otherwise.

Thanks to the high-level commander paradigm Tencent developed, which keeps track of the overall strategy while depending on middle and low-level algorithms (or modules, in the case of TSTARBOT2) for unit-level management, the bots are far more human-like than the computer opponent.
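
That layered setup can be pictured with a toy sketch like the one below. To be clear, this is not Tencent’s code: the class names, the `idle_larvae` and `army_size` fields, the command strings and the threshold rule are all made up for illustration, and the real bots are far richer. It only shows the general shape of a commander that picks a high-level goal and hands the unit-level work to a lower-level module.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    # Hypothetical, heavily simplified observation of the game.
    idle_larvae: int
    army_size: int


class ProductionModule:
    """Low-level module: turns a 'build' goal into per-larva commands."""

    def act(self, state):
        return [f"train_zergling(larva={i})" for i in range(state.idle_larvae)]


class CombatModule:
    """Low-level module: turns an 'attack' goal into per-unit commands."""

    def act(self, state):
        return [f"attack_move(unit={i})" for i in range(state.army_size)]


class Commander:
    """High-level controller: picks the goal, then delegates the details."""

    def __init__(self, attack_threshold=6):
        # Illustrative rule only: attack once the army is "big enough".
        self.attack_threshold = attack_threshold
        self.modules = {"build": ProductionModule(), "attack": CombatModule()}

    def choose_goal(self, state):
        if state.army_size >= self.attack_threshold:
            return "attack"
        return "build"

    def step(self, state):
        goal = self.choose_goal(state)
        return goal, self.modules[goal].act(state)


if __name__ == "__main__":
    commander = Commander()
    print(commander.step(GameState(idle_larvae=2, army_size=0)))
    print(commander.step(GameState(idle_larvae=0, army_size=6)))
```

The point of the split is that the commander only ever reasons about a handful of goals, while each module worries about individual units.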

And that means they aren’t just capable of beating the top difficulty level, they dominate it. Bot number two wins over 90 percent of the time while number one isn’t quite as effective at 71 percent. Perhaps more interesting, both of them showed the ability to beat platinum and diamond-level humans, but the humans won more games than they lost.

And, if you’re wondering what happens when TSTARBOT1 and TSTARBOT2 go head-to-head, you might be surprised. Bot one kicks Bot two’s butt every time. Despite the fact that TSTARBOT2 is likely better suited for competition against humans (or, with further development, could be), and beats the built-in SC2 AI with a higher win rate, it can’t defend itself against TSTARBOT1’s attack strategy.

According to the researchers:

It is worthy (sic) noting that although TStarBot1 can successfully learn and acquire strategies to defeat all the builtin AIs and TStarBot2, it lacks strategy diversity in order to consistently beat human players. In the aforementioned test with human players, TStarBot1 will be unable to win once the human player starts to know TStarBot1’s preference for Zergling Rush.

So, humans are still top dog in the world of StarCraft 2, but it’s pretty safe to say that won’t last long. Tencent plans to release the TSTARBOTS as open-source code, and you can bet your bottom dollar they’ll get better.

If you want to dive a little deeper into how the bots work, you can view the white paper here. It’s a surprisingly fun read.