Hans Rosling has done a lot to popularize statistical thinking about human development, and that’s a very good thing, but yesterday he did something that drives me crazy. After word spread of an apparent coup attempt in Eritrea (more on that later), Rosling tweeted this:

If you create a Democracy x Income score Eritrea the lowest in the world! See the graphic predicting the coup.

And here’s a shot of that graphic:

Brilliant, right? Just cross-tabulate a couple of commonly used measures of economic and political development and you get an index that accurately predicts this attempted coup in Eritrea that seemed to catch so many people by surprise.

Well, modelers have a name for this strategy, and it’s “overfitting.” “Cherry picking” works, too. After the fact, it’s easy to construct a predictive index that does very well at spotlighting any single event. If you poke around enough in the data, you can usually find some combination of measures under which the case of the moment rises to the top. If yesterday’s coup attempt had happened in China, for example (the big red ball in the bottom middle of Rosling’s chart), Rosling could have treated population size as a third dimension in his index, and China would have occupied the bottom corner of the resulting cube. We saw a lot of this right after the uprisings in Tunisia and Egypt in early 2011, too, when, for example, New York Times columnist Charles Blow found a handful of factors that seemed to differentiate those two countries from many of their regional neighbors.

What those after-the-fact snapshots won’t tell you, however, is how reliable that forecasting strategy would be over time. Most of us don’t need an index that’s optimized to predict a specific event, and even if we did, we would still need to build it before the event happened in order for it to be useful. To build a good predictive model, we need to find things that consistently help separate the situations where events of interest will happen from the ones where they won’t. Going back to Rosling’s chart, we see that his index also puts North Korea, Myanmar, Togo, the Gambia, and Cameroon in the lower left-hand corner, yet none of those countries has suffered any coup attempts for many years. Meanwhile, the two countries that saw successful coups d’etat in 2012—Mali and Guinea-Bissau—are both in the upper left, poor but democratic. Dig a little deeper, and that scatterplot’s not looking quite as useful.

So how reliable is Rosling’s two-dimensional index as a device for forecasting coups? To get an empirical answer to that question, I used the two variables Rosling picked, GDP per capita and degree of democracy, to estimate a simple logistic regression model in a training data set covering the period 1960-1994. I then applied that model to data from the period 1995-2010 to see how well it worked on cases it hadn’t already “seen.” The thing this model is trying to predict is the occurrence of any coup attempts (successful or failed) in a country during a particular calendar year, based on the value of the two predictors at the end of the previous year. Data on coup attempts come from the Center for Systemic Peace, and data for the two risk factors come from the World Bank’s World Development Indicators and the Polity project, respectively.
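For readers who want to see the mechanics of that train-then-test design, here is a minimal sketch in Python. It is not the code behind the results in this post. The country-years are synthetic, the "true" coefficients used to generate them are invented, and the names (fit_logistic, gdp_lag, dem_lag) are placeholders I made up for illustration. The fitting routine is plain gradient ascent, chosen only to keep the example free of outside libraries.

```python
import math
import random

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(coefs, x):
    # coefs[0] is the intercept; the rest line up with the predictors in x.
    return inv_logit(coefs[0] + sum(c * v for c, v in zip(coefs[1:], x)))

def fit_logistic(rows, labels, steps=500, rate=1.0):
    # Plain batch gradient ascent on the mean log-likelihood.
    coefs = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grads = [0.0] * len(coefs)
        for x, y in zip(rows, labels):
            err = y - predict(coefs, x)
            grads[0] += err
            for j, v in enumerate(x):
                grads[j + 1] += err * v
        coefs = [c + rate * g / len(rows) for c, g in zip(coefs, grads)]
    return coefs

# Synthetic country-years, NOT real data. Each row stands for one country in
# one year, with (standardized) predictors measured at the end of the prior year.
rng = random.Random(0)
data = []
for year in range(1960, 2011):
    for country in range(30):
        gdp_lag = rng.gauss(0.0, 1.0)       # stand-in for lagged income
        dem_lag = rng.uniform(-1.0, 1.0)    # stand-in for lagged democracy score
        # Invented data-generating process: poorer, less democratic -> riskier.
        p_true = inv_logit(-3.0 - 0.8 * gdp_lag - 0.5 * dem_lag)
        coup = 1 if rng.random() < p_true else 0
        data.append((year, gdp_lag, dem_lag, coup))

# Split by time, not at random: estimate on 1960-1994, forecast 1995-2010.
train = [r for r in data if r[0] <= 1994]
test = [r for r in data if r[0] >= 1995]

coefs = fit_logistic([r[1:3] for r in train], [r[3] for r in train])

# Out-of-sample predicted probabilities for cases the model has not "seen."
test_probs = [predict(coefs, r[1:3]) for r in test]
```

The key design choice is the split by year. A random split would let the model peek at the same era it is being tested on, which is a gentler test than forecasting forward in time.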

Before seeing how that model fared, it’s important to note that, just by modeling, we’ve already added some valuable information to the mix that isn’t in Rosling’s scatterplot. First, the logistic regression model includes an intercept that captures information about the historical base rate of coup attempts worldwide, and most forecasters can tell you that the base rate is a powerful predictor in its own right. Second, where Rosling’s scatterplot implicitly gives its two elements equal weight in its predictions, the statistical model estimates parameters for those two variables that incorporate historical evidence about the strength and direction of their association with coup risk. Ideally we would use Rosling’s two variables on their own, but we need a model to convert values of those variables into predicted probabilities, and the process of modeling itself already carries us a couple of steps beyond the two-dimensional plot.
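To make those two additions concrete, here is a small sketch with made-up numbers. The 5 percent base rate and the two weights are hypothetical, picked only for illustration, and the predictors are assumed to be centered so that zero means "average." The first part shows that a logistic model with nothing but an intercept predicts the base rate for every case. The second shows how unequal weights turn two predictor values into a probability.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical base rate (not a figure from the post): 5 country-years in 100
# see at least one coup attempt.
base_rate = 0.05

# In an intercept-only logistic model, the fitted intercept is the log-odds
# of the base rate, so every case gets the base rate as its prediction.
intercept = logit(base_rate)
baseline_prediction = inv_logit(intercept)

def predicted_probability(intercept, weights, values):
    # Linear predictor on the log-odds scale, then squashed to a probability.
    z = intercept + sum(w * v for w, v in zip(weights, values))
    return inv_logit(z)

# Hypothetical, unequal weights for (income, democracy), both centered.
weights = (-0.5, -0.1)

average_case = predicted_probability(intercept, weights, (0.0, 0.0))
poor_autocracy = predicted_probability(intercept, weights, (-1.0, -1.0))
```

With these invented weights, a one-unit drop in income moves the log-odds five times as far as a one-unit drop in democracy, which is exactly the kind of information a scatterplot with two equally weighted axes cannot express.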

Now, the results. Area under the ROC curve (AUC) is commonly used as a measure of predictive power for classification models like this one. AUC represents the probability that a randomly selected positive case (here, a country-year with any coup attempts) will have a higher predicted probability than a randomly selected negative case (a country-year with no coup attempts). In principle it can range from 0 to 1, but 0.5 is what you get from guessing at random, so useful models score between 0.5 and 1, with higher values indicating better discrimination. The bar chart below plots AUC for three models: 1) one with Rosling’s two variables as linear predictors of coup risk; 2) another with nonlinear versions of Rosling’s variables (logged GDP and a quadratic term for the degree of democracy); and 3) a more complex model that adds information about recent coup activity, the age of a country’s political institutions, and participation in international human-rights treaty regimes, among other things.
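That pairwise definition of AUC can be computed directly, as in the sketch below. The six predicted probabilities and outcomes are toy values I invented, not output from any of the models discussed here. Counting ties as half a win is a common convention that I am assuming.

```python
def auc(scores, labels):
    # Share of (positive, negative) pairs in which the positive case gets the
    # higher score. Ties count as half a win.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy predicted probabilities for six country-years; 1 = any coup attempt.
toy_scores = [0.02, 0.05, 0.10, 0.20, 0.30, 0.40]
toy_labels = [0, 0, 1, 0, 1, 1]

toy_auc = auc(toy_scores, toy_labels)   # 8 of the 9 pairs are ranked correctly
```

Comparing every pair is slow on large data sets, where rank-based formulas do the same job faster, but the brute-force version maps one-to-one onto the definition.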

As the chart shows, a model with linear versions of Rosling’s axes does reasonably well at forecasting coup attempts, with an AUC of about 0.75. Transforming those variables to capture nonlinearities in those associations improves the predictive accuracy, but only a smidgen, to 0.76. Finally, the model that includes several other risk factors produces a bigger bump, pushing the AUC up to 0.80.

Based on those results, I think it’s fair to say that Rosling’s scatterplot is on the right track, but we can do a lot better by a) estimating a model instead of just using a scatterplot and b) including other useful predictors in that model. The fact that a modeled version of Rosling’s index did okay won’t surprise anyone who’s done quantitative analysis of political instability. If you want to assess the relative risk of various forms of domestic political crisis across many countries, you can get a pretty good handle on the problem just by seeing how poor and authoritarian each country is. Still, a scatterplot alone doesn’t get us very far, and adding a few more things to the model that are specifically indicative of coup risk helps us do even better.

I’ll close this post on an ironic note: at this point, it’s not even clear that yesterday’s tumult in Eritrea was really an attempted coup. According to an initial report from Reuters, the soldiers who occupied the Ministry of Information demanded the release of political prisoners but did not threaten to topple the government. Political scientists generally reserve the term “coup” for situations where challengers use or threaten violence to capture state power, and they call cases where disobedient soldiers demand policy changes “mutinies.” This might seem like hair splitting, but the latter is more common and usually less consequential than the former, and we wouldn’t necessarily expect a predictive model designed for the one to work well for the other.