Lately I’ve been really obsessed with reinforcement learning, so I was excited to find that DeepMind had open-sourced its StarCraft II reinforcement learning environment.

https://deepmind.com/blog/deepmind-and-blizzard-release-starcraft-ii-ai-research-environment/

I’m a big fan of Blizzard games, especially StarCraft II, so I saw the RL environment as a great opportunity to learn and also just have a lot of fun.

After talking to some friends about this, I decided to write up an intro tutorial on setting up the environment and training some models.

Prerequisites

IntelliJ (or PyCharm)

Python3

StarCraft II (even the Starter Pack works)

Git

This tutorial is based on a Mac environment.

In today’s article, we’ll run a training script that solves the CollectMineralShards mini-game using a Deep Q-Network.

When you run the training script, you can watch the agent train inside the game.

Tutorial Outline

1) Install pysc2

2) Star & Fork pysc2-examples

3) Clone pysc2-examples repository

4) Download mini-games StarCraft II Maps

5) Install TensorFlow and baselines libraries

6) Open the project with IntelliJ (or PyCharm)

7) Run the training script

8) Run the pre-trained model

Let's start!

1) Install pysc2

First of all, we’ll install the pysc2 library.

Just type the command below in the terminal.

(Since we are using Python 3, you have to type pip3.)

pip3 install pysc2

Now pysc2 is installed.

2) Star & Fork pysc2-examples

Next, open the GitHub link below.

https://github.com/chris-chris/pysc2-examples

It’s the most important step! Star my repository ;)

And fork it!

3) Clone pysc2-examples repository

Okay, let’s clone the project.

You can clone this repository with this simple command.

git clone https://github.com/chris-chris/pysc2-examples

Then you will see a ‘pysc2-examples’ directory on your computer.

4) Download mini-games StarCraft II Maps

Before running the training script, we have to download the mini-game maps and save them to the StarCraft II/Maps directory.

Download mini-games maps

I’m a Mac user, and this is my StarCraft II maps location:

/Applications/StarCraft II/Maps/mini_games

If you are a Windows user, you can save the maps in the StarCraft II/Maps/mini_games directory.

For Linux users, save the maps in ~/StarCraft II/Maps/mini_games directory.
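If you want to double-check that the maps ended up in the right place, here is a minimal sketch. It only encodes the three locations mentioned above. The helper names and the map file name CollectMineralShards.SC2Map are my assumptions, and on Windows you have to pass your own StarCraft II install directory because it varies between machines.

```python
import platform
from pathlib import Path

# Assumed file name of the mini-game map used in this tutorial.
MAP_FILE = "CollectMineralShards.SC2Map"


def mini_games_dir(system=None, windows_install_dir=None):
    """Return the mini_games directory described in this tutorial."""
    system = system or platform.system()
    if system == "Darwin":  # macOS
        return Path("/Applications/StarCraft II/Maps/mini_games")
    if system == "Linux":
        return Path.home() / "StarCraft II" / "Maps" / "mini_games"
    if system == "Windows":
        if windows_install_dir is None:
            raise ValueError("pass your StarCraft II install directory on Windows")
        return Path(windows_install_dir) / "Maps" / "mini_games"
    raise ValueError(f"unsupported platform: {system}")


def map_is_installed(maps_dir, map_file=MAP_FILE):
    """True if the map file exists inside maps_dir."""
    return (Path(maps_dir) / map_file).is_file()


if __name__ == "__main__":
    try:
        maps_dir = mini_games_dir()
        print(maps_dir, "->", "found" if map_is_installed(maps_dir) else "missing")
    except ValueError as err:
        print("Could not determine the maps directory:", err)
```

If it prints "missing", check that the mini_games folder sits directly under Maps and that the folder name is spelled exactly as shown.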

5) Install TensorFlow and baselines libraries

We need some more libraries! We need Google’s TensorFlow and OpenAI’s baselines libraries.

You can install these libraries by typing the commands below.

pip3 install tensorflow

pip3 install baselines

I implemented the reinforcement learning model using OpenAI’s baselines library. Since baselines depends on TensorFlow, we need to install TensorFlow too. I think OpenAI’s baselines is the most beautiful implementation of Deep Q-Network, which is why I’m using it!

I expect most of you reading this article have already installed the TensorFlow library :)
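Before moving on, you can check that Python actually sees the three libraries we installed so far. This is just a small sketch using the standard library. The function name is mine, and it only checks that the packages can be found, not that their versions work together.

```python
import importlib.util

# The three packages installed in this tutorial.
REQUIRED = ("pysc2", "tensorflow", "baselines")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", ", ".join(missing), "(install them with pip3)")
    else:
        print("All required packages can be found.")
```

Run it with python3 so that it checks the same interpreter that pip3 installed into.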

6) Open the project with IntelliJ (or PyCharm)

You can start the training from the terminal by typing the command below.

python3 train_mineral_shards.py

Quick note! I strongly recommend that you develop your reinforcement learning projects in an IDE (Integrated Development Environment), because I’m going to explain the environment variables in detail using Debug mode :) I’m currently running this project in IntelliJ.

Launch IntelliJ or PyCharm and open the project folder that we cloned.

Next, let’s set the Project Structure.

Select the [File > Project Structure] menu.

Then select the Python 3 SDK under Module SDK. If you cannot find the SDK, click the [New...] button and add your python3 binary.

7) Run the training script

Now let’s run the training script.

Right-click train_mineral_shards.py and select the [Run 'train_mineral_shards'] menu item.

Then you will see logs on the console while StarCraft II is running.

Here is a brief explanation of the console logs.

steps : The number of commands that we sent to marines.

episodes : The number of games that we played.

mean 100 episode reward : Mean reward of the last 100 episodes.

mean 100 episode min… : Mean minerals of the last 100 episodes.

% time spent exploring : The percentage of time spent exploring (exploration vs. exploitation).
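To make these counters concrete, here is a minimal sketch of how steps, episodes, and a 100-episode mean reward can be tracked. It is not the logging code from baselines or from train_mineral_shards.py. The TrainingLog class and its method names are made up for illustration.

```python
from collections import deque


class TrainingLog:
    """Tracks steps, episodes, and the mean reward over a sliding window."""

    def __init__(self, window=100):
        self.steps = 0
        self.episodes = 0
        # Only the most recent `window` episode rewards are kept.
        self.recent_rewards = deque(maxlen=window)
        self._current_reward = 0.0

    def record_step(self, reward, done):
        """Call once per command sent to the game."""
        self.steps += 1
        self._current_reward += reward
        if done:  # the game ended, so close the episode
            self.episodes += 1
            self.recent_rewards.append(self._current_reward)
            self._current_reward = 0.0

    def mean_reward(self):
        """Mean total reward of the episodes currently in the window."""
        if not self.recent_rewards:
            return 0.0
        return sum(self.recent_rewards) / len(self.recent_rewards)


if __name__ == "__main__":
    log = TrainingLog(window=100)
    for reward, done in [(1, False), (1, True), (0, False), (1, True)]:
        log.record_step(reward, done)
    print(log.steps, log.episodes, log.mean_reward())
```

Because the window only holds the latest episodes, the mean moves with recent performance instead of being dragged down by the early random games.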

Currently the training script is set to run 20,000,000 steps.

(It takes soooooo much time, so if you want to run it on your laptop, I’d recommend setting the training steps to something like 500,000.)
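One thing to keep in mind when you shorten the run: in many DQN setups the exploration rate decays linearly over a fraction of the total training steps, so a smaller step count also means the agent stops exploring sooner. The sketch below assumes that kind of schedule. The function name, exploration_fraction, and final_eps are illustrative values of mine, not the settings in train_mineral_shards.py.

```python
def exploration_rate(step, total_steps, exploration_fraction=0.1, final_eps=0.02):
    """Linearly decay epsilon from 1.0 to final_eps, then hold it there.

    The decay takes place over the first exploration_fraction * total_steps steps.
    """
    decay_steps = max(1, int(exploration_fraction * total_steps))
    progress = min(step / decay_steps, 1.0)
    return 1.0 + progress * (final_eps - 1.0)


if __name__ == "__main__":
    # Compare the exploration percentage at the same step for both run lengths.
    for total in (20_000_000, 500_000):
        eps = exploration_rate(100_000, total)
        print(f"total_steps={total}: {int(100 * eps)}% exploring at step 100000")
```

If the agent looks like it stops trying new moves too early on a short run, raising the exploration fraction is one thing to try.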

8) Run the pre-trained model

I coded the program to save the trained model to the mineral_shards.pkl file after all the training steps.

act.save("mineral_shards.pkl")

If you want to use this pre-trained model, you can just run the enjoy script :)

Right-click enjoy_mineral_shards.py and select the [Run 'enjoy_mineral_shards'] menu item.

Then you can watch the pre-trained agent play the CollectMineralShards map.

Conclusion

In this article, I just introduced how to set up your environment and train the model.

Future Tutorials