Boo. Almost no significance. That is, at the Bonferroni-corrected p = 0.01 level, only two countries’ win/loss averages were significantly different from 50%. And even without correcting for multiple comparisons, only two countries have p < 0.01. One of these countries is actually “International”, a tag that the online chess website throws up when people decline to list their nationality. These sans-country players won 2842/5072 of their games, for a win average of 56%, which is significantly greater than the 50% expected by chance (p < 0.00001, chi-square proportion test). Aside from India, no other country showed anything interesting.
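
For anyone who wants to check the "International" number, here is a minimal sketch of a one-sample chi-square proportion test using only the Python standard library. It assumes no continuity correction (some stats packages apply one by default, so their p-values will differ slightly), and `n_countries` is a made-up placeholder, since the number of countries tested isn't restated here.

```python
import math

def proportion_test(wins, games, p0=0.5):
    """One-sample chi-square proportion test (1 df, no continuity correction)."""
    losses = games - wins
    expected_wins = games * p0
    expected_losses = games * (1 - p0)
    chi2 = ((wins - expected_wins) ** 2 / expected_wins
            + (losses - expected_losses) ** 2 / expected_losses)
    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return wins / games, chi2, p_value

# The "International" record quoted above.
rate, chi2, p = proportion_test(2842, 5072)
print(f"win rate = {rate:.3f}, chi2 = {chi2:.2f}, p = {p:.3g}")

# Bonferroni: divide the significance level by the number of tests.
n_countries = 50  # HYPOTHETICAL count, for illustration only
alpha = 0.01
print("survives Bonferroni:", p < alpha / n_countries)
```

The Bonferroni step is just a stricter threshold: with more countries tested, each individual p-value has to clear a lower bar.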

(To confirm this, I tried to predict whether a game was won or lost using only the nationality of the players involved. The logic here is that if, say, Iceland really does have a winning advantage, then armed only with the knowledge that Iceland is playing I should be able to predict the outcome at above-chance levels. However, consistent with the results of the chi-square proportion test, a naive Bayes classifier (leave-one-out cross-validation, trained on the country labels) failed to predict the game outcome: I couldn’t push the classifier past a meagre 50-51% success rate. This damned-by-failure analysis isn’t great, though, since it's possible I just failed at building a good classifier. To make sure, I trained the same classifier on something that should influence game outcome - player ratings, a numerical expression of player skill. Here the classifier could guess the winner of the game with 67% accuracy. So again, everything else being equal, it looks like a player's country has little to no impact on whether that player will win or lose a game.)
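
The sanity check above can be sketched roughly as follows. This is not the original analysis code: it is a small categorical naive Bayes with leave-one-out cross-validation, run on SYNTHETIC games invented for illustration, where the outcome depends only on rating (through the standard Elo expected-score curve, with draws ignored). The country labels, the rating distribution, and the 100-point rating bins are all assumptions.

```python
import math
import random
from collections import Counter

def loo_naive_bayes_accuracy(features, labels, smoothing=1.0):
    """Leave-one-out accuracy of a categorical naive Bayes classifier."""
    n = len(labels)
    n_feats = len(features[0])
    classes = sorted(set(labels))
    class_counts = Counter(labels)
    # feat_counts[j][(value, cls)] = games with that feature value and outcome
    feat_counts = [Counter() for _ in range(n_feats)]
    for row, y in zip(features, labels):
        for j, v in enumerate(row):
            feat_counts[j][(v, y)] += 1
    n_values = [len({row[j] for row in features}) for j in range(n_feats)]

    correct = 0
    for row, y in zip(features, labels):
        best_cls, best_score = None, -math.inf
        for c in classes:
            held = 1 if c == y else 0  # take the held-out game out of the counts
            n_c = class_counts[c] - held
            score = math.log((n_c + smoothing) / (n - 1 + smoothing * len(classes)))
            for j, v in enumerate(row):
                count = feat_counts[j][(v, c)] - held
                score += math.log((count + smoothing) / (n_c + smoothing * n_values[j]))
            if score > best_score:
                best_cls, best_score = c, score
        correct += best_cls == y
    return correct / n

# SYNTHETIC data: nothing below comes from the real games.
rng = random.Random(0)
countries = ["A", "B", "C", "D", "E", "F", "G", "H"]  # hypothetical labels
country_rows, rating_rows, outcomes = [], [], []
for _ in range(2000):
    diff = rng.gauss(1500, 300) - rng.gauss(1500, 300)  # white minus black rating
    p_white = 1 / (1 + 10 ** (-diff / 400))  # Elo expected score, draws ignored
    outcomes.append("white" if rng.random() < p_white else "black")
    country_rows.append((rng.choice(countries), rng.choice(countries)))
    rating_rows.append((int(diff // 100),))  # rating gap in 100-point bins

country_acc = loo_naive_bayes_accuracy(country_rows, outcomes)
rating_acc = loo_naive_bayes_accuracy(rating_rows, outcomes)
print(f"country labels only: {country_acc:.2f}")
print(f"rating gap only:     {rating_acc:.2f}")
```

Because country is pure noise in this toy setup, the country-only classifier should hover around chance while the rating-based one does clearly better. The exact numbers say nothing about the real dataset; the point is only that the same classifier can succeed when it is handed a feature that actually matters.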

From a stats-y point of view, there is just not enough evidence to reject the boring option that, for the majority of countries, the numbers in Figure 1 are nothing but random chance. That is, the neat structure in Figure 1 ("Europe wins!") could be totally different when looking at another 78,000 games.

It's also entirely possible that I just don't have enough data to show significance for win averages close to chance. For example, if I wanted to detect a one-percentage-point difference from chance (a 51% win average instead of 50%), I'd need around 20,000 games for a country, which is on the very upper limit for this dataset. However, chasing significance by collecting more data and re-testing is generally not a good idea, since every extra look at the data is another comparison and another chance at a false positive. Instead, better to leave it here. Interesting trends may yet be hiding in the data, but the dramatic trends in Figure 1 are probably noise.
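
As a rough check on that 20,000 figure, here is a sketch of the usual normal-approximation sample size formula for a one-sample proportion test. The significance level (two-sided 0.05) and power (80%) are assumptions, picked because they are the conventional defaults; stricter settings would push the number higher.

```python
import math
from statistics import NormalDist

def games_needed(p_alt, p0=0.5, alpha=0.05, power=0.8):
    """Approximate games needed to detect a true win rate p_alt against p0."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)
    numerator = (z_alpha * math.sqrt(p0 * (1 - p0))
                 + z_power * math.sqrt(p_alt * (1 - p_alt)))
    return math.ceil((numerator / (p_alt - p0)) ** 2)

n_games = games_needed(0.51)  # 51% true win rate vs. 50% chance
print(n_games)
```

Under these assumptions the answer lands a little under 20,000 games. The required sample grows with the inverse square of the effect, so halving the difference to half a percentage point would roughly quadruple it.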

Anyways, this is all just to say:

1. Question pretty figures.

2. When playing online chess, don’t judge an opponent by their nationality (unless they don’t have a nationality - then maybe it's alright to be scared).