How to win over 70% of matches in Rock Paper Scissors

An elementary AI approach to a popular game

Have you ever wondered how algorithms play chess? How to build a Go-playing program? Why an AI bot was able to beat you in your favorite game? Well, you’re not going to read about that in this article. The game I’m going to write about is both easier to play and easier to implement.

Although rock-paper-scissors (RPS) may seem like a trivial game, it actually involves the hard computational problem of temporal pattern recognition. This problem is fundamental to the fields of machine learning, artificial intelligence, and data compression. In fact, it might even be essential to understanding how human intelligence works.

The above text comes from the great Rock Paper Scissors Programming Competition page. It hosts a free and open competition where anyone can submit a playing algorithm. Submissions are written in Python 2, and they are visible to everybody, which gives you an opportunity to study the details of the best solutions. They’re not always easy to understand, though.

A submitted algorithm plays one thousand rounds against another program; this is called a match. The algorithm that wins more rounds takes the match. Players are listed on a leaderboard based on ranking points gained or lost by playing matches.

Let’s implement the most straightforward playing algorithm. It always plays Rock.
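A minimal sketch of such a player, following the contest API described below (the script’s only job is to assign a move to the global output variable):

```python
# The simplest possible player: ignore the opponent and always play Rock.
# On the contest site, the script just assigns 'R', 'P' or 'S'
# to the global `output` variable each round.
output = 'R'
```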

Note that the only thing your program has to do is assign one of the values ‘R’, ‘P’, ‘S’ to the global output variable. Your opponent’s algorithm assigns its move to the global input variable. Each round you can see the opponent’s input from the previous round; in the first round the variable is an empty string.

Well, this first strategy would be quite easy to counter for any algorithm that learns from data. Now let’s take the opposite approach and play completely at random.

How do you think it will go? It’s quite easy to predict: on average, one-third of rounds lost, one-third drawn, and one-third won. This strategy gives you a 50% chance of winning every match. To achieve a higher win rate you have to take a risk. Would it be fun to watch only random models competing? Rather not, so please don’t submit entirely random solutions. We’re going to use randomness in certain situations, though.

Note that you can test your model offline. Just download rpsrunner.py from the site. You could post your models online to check how they’re doing, but there is not much computation power there. I made a mistake in the beginning and posted some code that was unintentionally random. It’s easy to test: just let your code play against a constant input (like the first model implemented). If it loses around 50% of the time, your model is probably random.

The model we’re going to implement is a discrete Markov chain. It’s built on a straightforward idea. Let’s say a process has two possible states, A and E, and we’re now in state A. What’s the chance we will stay in state A? And what’s the chance of moving to state E? In a Markov chain with two possible states, these two probabilities have to sum up to 1. Similarly, there are two such probabilities for the current state E. You can see the mechanism in the picture below.
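As a concrete illustration, the two-state chain can be written down as a transition table (the probabilities below are made up for the example):

```python
# Transition probabilities of a two-state Markov chain.
# Keys are the current state; values are the chances of each next state.
# The numbers are hypothetical, chosen only for illustration.
transitions = {
    'A': {'A': 0.7, 'E': 0.3},  # from A: stay with 0.7, move to E with 0.3
    'E': {'A': 0.4, 'E': 0.6},  # from E: move to A with 0.4, stay with 0.6
}

# The probabilities out of each state must sum to 1.
for state, row in transitions.items():
    assert abs(sum(row.values()) - 1.0) < 1e-9
```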

There is an excellent visualization of how Markov Models work here.

How can we use the model in the context of the RPS contest? A natural way is to analyze the output and the input from the latest round and try to predict the next input. Then make the move which beats it. This setup means that our current state is an output-input pair, like “RP”, and the state we want to predict is the opponent’s next input. It should look like this:
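A sketch of that state space, assuming the pair notation above (first letter our move, second the opponent’s):

```python
# Each state is the (our move, opponent's move) pair from the latest
# round, e.g. 'RP' means "we played Rock, the opponent played Paper".
states = [ours + theirs for ours in 'RPS' for theirs in 'RPS']
# Nine states in total; from each one we estimate the probabilities
# of the opponent's next move.
```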

Here is the implementation. I also added an n_obs key to keep track of the number of past observations. We’re going to use it in the learning process.
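A sketch of such a model, assuming the structure described in the text (the class and method names are mine; n_obs and decay follow the article):

```python
class MarkovChain:
    """Discrete Markov chain over 'R'/'P'/'S', with a decay parameter."""

    def __init__(self, decay=1.0):
        self.decay = decay
        # One row per output-input pair from the previous round; each row
        # keeps weighted counts of the opponent's following move.
        self.matrix = {
            mine + theirs: {'R': 0.0, 'P': 0.0, 'S': 0.0, 'n_obs': 0.0}
            for mine in 'RPS' for theirs in 'RPS'
        }

    def update(self, pair, opponent_move):
        """Record that `opponent_move` followed the state `pair`."""
        row = self.matrix[pair]
        for move in 'RPS':
            row[move] *= self.decay      # fade earlier observations
        row[opponent_move] += 1.0
        row['n_obs'] = row['n_obs'] * self.decay + 1.0

    def predict(self, pair):
        """Predict the opponent's next move after the state `pair`."""
        row = self.matrix[pair]
        return max('RPS', key=lambda move: row[move])
```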

The decay parameter represents the memory of the model. A value of one means the model has a perfect memory. A value between zero and one makes the model forget earlier observations and therefore adapt faster to changes in the opponent’s behavior.

As the input prediction, the model chooses the move with the highest probability, given the latest output-input pair.

Below you can find the whole playing program:
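A self-contained sketch of such a player, written as a play() function so it can be run and tested offline; on the contest site the same logic would read the global input and assign to output instead. The decay value and all names here are my assumptions:

```python
import random

DECAY = 0.9                              # assumed value; tune it offline
BEATS = {'R': 'P', 'P': 'S', 'S': 'R'}   # the move that beats each move

# Weighted counts of the opponent's next move, per output-input pair.
matrix = {mine + theirs: {'R': 0.0, 'P': 0.0, 'S': 0.0}
          for mine in 'RPS' for theirs in 'RPS'}
last_pair = None   # the state observed in the previous round
my_last = None     # our previous move


def play(opponent_last):
    """Return our move, given the opponent's previous move ('' in round one)."""
    global last_pair, my_last
    if opponent_last == '':
        my_last = random.choice('RPS')            # round one: no data yet
        return my_last
    if last_pair is not None:
        row = matrix[last_pair]
        for move in 'RPS':                        # fade old observations
            row[move] *= DECAY
        row[opponent_last] += 1.0                 # learn the transition
    pair = my_last + opponent_last
    row = matrix[pair]
    if sum(row.values()) == 0.0:
        my_last = random.choice('RPS')            # unseen state: play randomly
    else:
        predicted = max('RPS', key=lambda m: row[m])
        my_last = BEATS[predicted]                # beat the predicted move
    last_pair = pair
    return my_last
```

Falling back to a random move in unseen states is exactly the situation, mentioned earlier, where a bit of randomness is still useful.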

You can play with the algorithm and experiment with it yourself.

I have to warn you: this model would not perform well in the contest. It’s just too elementary. Most of the risk-taking algorithms should be able to counter this strategy. You can see its performance under this link. It’s just a starting point for your adventure in the Rock-Paper-Scissors contest. You can modify parameters, train many models and choose the best one to play, or create ensembles. It’s only up to your imagination. The best algorithms win around 80% of their matches. You can check my over-70% algorithm here. Can you do better?