So now that we have a model that does a reasonable job of predicting the score, a natural question to ask is: what about the factors that can’t be represented as mechanics or categories? In other words, how much ‘magic sauce’ does a game have that makes its rating better or worse than expected? To find out, we take each game’s actual score, subtract its predicted score, and graph the difference against the rank of the game. As one would expect, higher-ranked games have more magic sauce than lower-ranked games. Notice that roughly the top 1000 games tend to do better than expected, with the top 200 or so doing even better.
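As a rough illustration of that subtraction, here is a minimal Python sketch. The `Game` class, the field names, and the three sample games are hypothetical stand-ins, not the actual data or code behind the graph.

```python
from dataclasses import dataclass


@dataclass
class Game:
    name: str
    rank: int          # BGG rank (1 is best)
    actual: float      # observed average rating
    predicted: float   # rating predicted from mechanics and categories


def magic_sauce(game):
    """Residual: how far the game lands above (+) or below (-) its prediction."""
    return game.actual - game.predicted


def sauce_by_rank(games):
    """Return (rank, magic sauce) pairs ordered by rank, ready for plotting."""
    return [(g.rank, magic_sauce(g)) for g in sorted(games, key=lambda g: g.rank)]


# Made-up games, for illustration only.
games = [
    Game("Hypothetical A", 3, 8.4, 7.6),
    Game("Hypothetical B", 4200, 6.1, 6.7),
    Game("Hypothetical C", 650, 7.3, 7.1),
]
points = sauce_by_rank(games)
```

A positive value means the game outperforms what its mechanics and categories alone would suggest; a negative value means it underperforms.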

Once we get to the top 500, we have a strong tendency to be about half a point better than expected. Somewhat interesting is that newer games actually tend to fall below this mark, indicating that while newer games choose popular categories and mechanics, many fall slightly short of what their fundamentals would suggest.

Older games display the opposite phenomenon, but it’s notable that older games which fail to be exceptional tend to fall down in rank over time as they’re replaced by the latest and greatest. This phenomenon has led some to call board gaming as a hobby “the cult of the new”.

It’s worth noting, however, that because rankings are based on the geek rating, which has Bayesian ballasting, a new game requires a higher average rating to achieve the same rank as an old game, simply because it generally will not have as many votes yet.
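To make the ballasting effect concrete, here is a small sketch of a generic Bayesian average in Python. The exact formula BGG uses is not given here, so `prior_mean` and `prior_votes` are assumed values chosen only for illustration, as are the vote counts.

```python
def bayesian_average(average, n_votes, prior_mean=5.5, prior_votes=1000):
    """Blend the real votes with `prior_votes` dummy votes at `prior_mean`.

    Both prior parameters are assumptions, not BGG's published values.
    """
    return (prior_mean * prior_votes + average * n_votes) / (prior_votes + n_votes)


def required_average(target, n_votes, prior_mean=5.5, prior_votes=1000):
    """Raw average needed to reach `target` after ballasting with n_votes votes."""
    return (target * (prior_votes + n_votes) - prior_mean * prior_votes) / n_votes


# Same raw average, very different vote counts (hypothetical numbers).
old = bayesian_average(7.5, 20000)
new = bayesian_average(7.5, 2000)

# Raw average the newer game would need to match the older game's ballasted score.
needed = required_average(old, 2000)
```

Under these assumed numbers the older game ends up at about 7.40 and the newer one at about 6.83, so the newer game would need a raw average of roughly 8.36 to draw level.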

We see here that the magic sauce has a noticeably smaller variance than the average score does, since we captured about half the variance in our model.

While the amount of ‘magic sauce’ changes from about -0.7 at rank 5000 to about +0.8 at rank 1 (a difference of 1.5 points), notice that the average rating actually increases from about 6.0 to about 8.5, a difference of 2.5 points, indicating that quite a bit of the average rating is less about the quality of each individual game than about the popularity of its categories and mechanics.
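The arithmetic behind that comparison can be spelled out in a few lines of Python. It uses only the approximate figures quoted above and assumes that actual rating equals predicted rating plus magic sauce, so whatever part of the spread is not magic sauce is attributed to the prediction.

```python
# Approximate endpoints read off the trend, as quoted in the text.
sauce_at_rank_5000, sauce_at_rank_1 = -0.7, 0.8
rating_at_rank_5000, rating_at_rank_1 = 6.0, 8.5

sauce_span = sauce_at_rank_1 - sauce_at_rank_5000      # about 1.5 points
rating_span = rating_at_rank_1 - rating_at_rank_5000   # about 2.5 points

# Since actual = predicted + magic sauce, the remainder of the spread
# comes from the predicted (mechanics and categories) part.
predicted_span = rating_span - sauce_span               # about 1.0 point
predicted_share = predicted_span / rating_span          # about 0.4
```

By this rough reckoning, about 1.0 of the 2.5 points separating rank 5000 from rank 1 comes from the predicted part of the rating.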

Games in the Top 500 with highest magic sauce scores

When we sort by the amount of magic sauce each top 500 game has, and take the top 50 entries, we end up with the list on the left. Many of the games on the list are also top 100 games in BGG, but many of them are not.
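The filter-and-sort step described here might look something like the following Python sketch. The function name, dictionary keys, and sample games are hypothetical; the real list was built from the full BGG data set.

```python
def top_magic_sauce(games, max_rank=500, n=50):
    """Keep games ranked within `max_rank`, then return the `n` with the
    largest residual (actual rating minus predicted rating)."""
    eligible = [g for g in games if g["rank"] <= max_rank]
    ordered = sorted(
        eligible,
        key=lambda g: g["actual"] - g["predicted"],
        reverse=True,
    )
    return ordered[:n]


# Made-up games, for illustration only.
sample = [
    {"name": "Hypothetical A", "rank": 12, "actual": 8.3, "predicted": 7.4},
    {"name": "Hypothetical B", "rank": 480, "actual": 7.6, "predicted": 6.5},
    {"name": "Hypothetical C", "rank": 90, "actual": 7.8, "predicted": 7.7},
    {"name": "Hypothetical D", "rank": 900, "actual": 7.9, "predicted": 6.0},
]
shortlist = top_magic_sauce(sample, max_rank=500, n=2)
names = [g["name"] for g in shortlist]
```

Note that a game outside the rank cutoff is dropped before sorting, even if its residual is the largest of all, which is why the resulting list overlaps with the BGG top 100 without matching it.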

Some of the choices definitely stand out as games that have something going for them that can’t be explained merely via categories and mechanics. For example, Gloomhaven is widely renowned as one of the best, if not the best, dungeon crawlers ever made, and yet judged only on its dry characteristics it wouldn’t appear very different from other entries such as Descent. In reality, its popularity speaks for itself. If we look at games that aren’t ranked as highly but still make the list, we see some abstracts like YINSH, which continues to have a following even 16 years after release, as well as some perennial favorites like Codenames and Resistance.

Of course, no discussion of this list would be complete without talking about Pandemic Legacy. The curious thing about Pandemic Legacy is that its rating is much higher than that of Pandemic or any of its other variants, probably because the legacy factor adds something intangible but favorable to the game. There may be some sampling bias here, since it’s likely that Pandemic Legacy players tend to be those who enjoyed Pandemic. But this effect likely applies to many games, as you wouldn’t expect someone to buy a heavy euro as one of the first board games they own.

We also see the emergence of some classics like Crokinole, Go, and MTG.

Games in the Top 1000 with lowest magic sauce scores

On the other end of the spectrum, what about overrated games? I actually own a couple of these, namely XCOM and Tiny Epic Kingdoms, and can confirm that they collect dust on the shelf. For example, Tiny Epic Kingdoms would be expected to achieve a decent rating (7.19) based on being an action point, area control, area movement, variable player powers, worker placement game, but it only achieves an average rating of 6.67, about half a point short. Based on having played it, this isn’t surprising at all: I think I’ve played it once and have never since had a desire to take it off the shelf.

So what does this all mean?

So what’s the takeaway? BGG ratings have long been used as an invaluable tool in evaluating the merit of board games, and the effect a game’s BGG ranking has on its sales can’t be denied. And yet, it seems that a lot of a game’s rating can be predicted without knowing anything about it except which mechanisms it uses.

Hobbyists have long argued about the usefulness of BGG ratings, and opinions span the entire spectrum, from those who view rating as everything to those who view it as totally useless. As it turns out, both camps have a point. While a lot of rating is priced into the mechanisms of the game, the really good games do have an extra bit of magic that you have to play them to experience.

If we account for the predictable, we end up with a list of top games that is somehow both familiar and novel. I’m definitely looking forward to trying some of them, whether that means finding inspiration in a totally new game or recapturing the magic in an old but forgotten favorite.

As one final thought, here are some histograms of average score that compare games with and without each of the properties.