The first thing we need to settle is what counts as a blunder. A human will tell you that a blunder is a move that substantially decreases the player’s chances of winning. Good players can classify a move as a blunder with just a few seconds’ thought, but even that’s too slow for our purposes. Instead, we’ll use a computer chess player, or “chess engine,” called Crafty.

Crafty is far from the strongest engine on the market right now, but it has one very appealing feature called “annotation mode.” This mode takes an already-completed game and highlights moves that Crafty believes to be suboptimal. It doesn’t just classify moves as blunders or non-blunders; it also quantifies how bad each blunder is in units of pawns.

Crafty computed that Carlsen’s move “26. Kd2” was 2.11 pawns worse than his best move, “26. Rg3”.

For example, applying Crafty to the Carlsen-Anand game shows that the players hurt their positions by approximately two pawns with each of their blunders. This might not seem like a lot, but in high-level chess, a two-pawn deficit is almost always a loss.
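One way to read a figure like “2.11 pawns worse” is as the gap between the engine’s evaluation of the best move and its evaluation of the move actually played. The sketch below illustrates that arithmetic only; it does not call Crafty, and the function names and the one-pawn `BLUNDER_THRESHOLD` are hypothetical choices, not values from Crafty or from our pipeline.

```python
# Hypothetical cutoff, in pawns. Neither Crafty nor the pipeline is known to use this value.
BLUNDER_THRESHOLD = 1.0


def pawn_loss(best_eval, played_eval):
    """Evaluation given up by the played move, in pawns.

    Both evaluations are assumed to be from the moving player's
    point of view, so a larger number is better for the mover.
    """
    return max(0.0, best_eval - played_eval)


def is_blunder(best_eval, played_eval, threshold=BLUNDER_THRESHOLD):
    """Treat a move as a blunder when it gives up at least `threshold` pawns."""
    return pawn_loss(best_eval, played_eval) >= threshold
```

Clamping the loss at zero keeps a move that happens to score slightly better than the engine’s stated best from registering as a negative blunder.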

Now that we have a way to classify blunders, we’ll need to bundle Crafty up in a Docker image so we can use it in Pachyderm. The source for our image is available on GitHub, or it can be pulled directly from the Docker registry. The image contains two HTTP servers: a map server, which takes chess games in .pgn format and returns the ratings of the players along with a bucketed count of Crafty’s scores for the moves, and a reduce server, which takes the results from the map server and aggregates them into buckets based on the player’s rating.
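The logic of the two servers can be sketched in a few lines of Python. This is a minimal illustration rather than the code in our image: it leaves out the HTTP layer and the call to Crafty, and takes the per-move losses as an input instead. The bucket widths, the number of buckets, and all function names are assumptions made for the example. The `WhiteElo` and `BlackElo` tags are standard PGN headers.

```python
import re
from collections import defaultdict

# Hypothetical bucket layout; the real boundaries are not stated in this post.
LOSS_BUCKET_WIDTH = 0.5     # pawns per blunder bucket
NUM_LOSS_BUCKETS = 6        # the last bucket collects every loss >= 2.5 pawns
RATING_BUCKET_WIDTH = 100   # Elo points per rating bucket

# Matches PGN tag pairs such as [WhiteElo "2500"].
ELO_TAG = re.compile(r'\[(White|Black)Elo\s+"(\d+)"\]')


def parse_ratings(pgn_text):
    """Read the WhiteElo and BlackElo tags from a single PGN game."""
    return {color: int(elo) for color, elo in ELO_TAG.findall(pgn_text)}


def bucket_losses(losses):
    """Count per-move losses (in pawns) into fixed-width buckets."""
    counts = [0] * NUM_LOSS_BUCKETS
    for loss in losses:
        index = int(loss / LOSS_BUCKET_WIDTH)
        index = max(0, min(index, NUM_LOSS_BUCKETS - 1))  # clamp to a valid bucket
        counts[index] += 1
    return counts


def map_game(pgn_text, losses_by_color):
    """Map step: emit one (rating, blunder vector) pair per rated player."""
    ratings = parse_ratings(pgn_text)
    return [(ratings[color], bucket_losses(losses))
            for color, losses in losses_by_color.items()
            if color in ratings]


def reduce_results(mapped):
    """Reduce step: sum the blunder vectors within each rating bucket."""
    totals = defaultdict(lambda: [0] * NUM_LOSS_BUCKETS)
    for rating, counts in mapped:
        key = rating // RATING_BUCKET_WIDTH * RATING_BUCKET_WIDTH
        totals[key] = [a + b for a, b in zip(totals[key], counts)]
    return dict(totals)
```

Because the reduce step only adds vectors, the map results can be combined in any order, which is what lets the map work be spread across many containers.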

Our MapReduce job gives us a mapping from rating to a vector of blunders.

Next we’ll need to get a Pachyderm cluster up and running and filled with data. We took game data from chessgames.com and wrote a simple script to upload it to Pachyderm’s file system (pfs) and kick off the pipeline. The script and data are available in the repo, along with more detailed instructions on how to reproduce the results yourself.

Extend the analysis by forking the repo.