Evaluating Splatoon’s Ranking System

By Evan Miller

July 20, 2015

Update, Feb. 2016: After this article was written, Nintendo introduced the S and S+ ranks to Splatoon. I analyze their effects in a follow-up article, Splatoon’s Ranking System Is Still Broken. Read on for the original analysis.

Recently I’ve been playing a lot of Splatoon, a new paint-’em-up game from Nintendo. The game features Inklings, children who use an assortment of paint guns and oversized rollers to cover the ground with paint, and who then turn into squids to swim through the paint in order to engage enemies. In Battle Mode, each team has its own paint color, and you can only swim through your own color paint. The game is quite a bit of fun, even for people who don’t normally like shooters (or painting).

Splatoon has an online “Ranked Battle” mode, wherein each player has a graded rank from C− to A+. If you are part of a team that wins a battle, you earn a certain number of points, and if you collect enough points, your rank increases. If you are part of a losing team, you lose a certain number of points, and if you lose too many points, your rank decreases. The idea is that if you tend to lead your team to victory, your rank will increase over time. (To keep things engaging, the online matching system attempts to assemble teams of members who have similar rank.)

The ranking system seems to work pretty well in practice — I am a “B” player who is regularly crushed by roller-wielding “A” players who I happen to encounter in the game — but I was curious about how it works in theory. That is, if all of the battles were decided purely by chance, what percent of players would be C players, B players, and A players?

It’s a fun little problem to solve, and my solution, which I’ll describe below, involves some non-trivial matrix math. The solution also has an interesting surprise: If all battles are decided purely by chance, then in the long run, nearly 75% of Splatoon players will end up with the rank of A−, A, or A+, with over 36% of players having a rank of A+. I didn’t expect this result at all, but it made sense once I realized there’s a subtle flaw in the Splatoon ranking system, a flaw that the game designers probably didn’t realize was present when they shipped the game. I won’t tell you what the flaw is just now, but see if you can find it as we work through the math.

The Splatoon Ranking System

There are nine possible ranks in Splatoon: C−, C, C+, B−, B, B+, A−, A, and A+. Within each rank, a player has a score, ranging between 0 and 99. If the score exceeds 99, the player is promoted to a higher rank (unless the rank is already A+, in which case the score remains 99); if the score dips below zero, the player is demoted to a lower rank (unless the rank is already C−, in which case the score remains zero). When a promotion occurs, the score is reset to 30; when a demotion occurs, the score is set to 70. This “buffer zone” of 30 points on either side prevents ranks from flip-flopping too much.

The size of rewards and penalties for losses and victories in battle depends on the player’s current rank. The point rewards are summarized in the table below.

| Rank | Reward for Victory | Penalty for Defeat |
| --- | --- | --- |
| C− | +20 | −10 |
| C | +15 | −10 |
| C+ | +12 | −10 |
| B− | +12 | −10 |
| B or higher | +10 | −10 |

As you can see, there is a bias in place for players at lower ranks, which tends to push players up from the starting rank, which is C−. That bias disappears once you become a “B” player, at which point the reward for victory equals the penalty for defeat. The system is designed so that, starting with a rank of B, you’ll need more victories than defeats to keep getting promoted.

(The reward or penalty is sometimes adjusted when lopsided battles occur, for example, rewarding an extra 2 points to an underdog team. For purposes of this analysis, we’ll ignore those adjustments.)
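The rules above can be sketched as a small update function. This is only an illustration of the scoring rules as described in this article, not Nintendo's actual code; the names `RANKS`, `REWARD`, and `apply_battle` are made up for the example, and the lopsided-battle adjustments are ignored.

```python
# Ranks from lowest to highest, written with ASCII minus signs.
RANKS = ["C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]
# Reward for a victory; every rank not listed here earns +10.
REWARD = {"C-": 20, "C": 15, "C+": 12, "B-": 12}
PENALTY = 10  # every rank loses 10 points for a defeat


def apply_battle(rank, score, won):
    """Return the (rank, score) pair after one battle."""
    i = RANKS.index(rank)
    score += REWARD.get(rank, 10) if won else -PENALTY
    if score > 99:
        if rank == "A+":
            return rank, 99        # top rank: the score is capped
        return RANKS[i + 1], 30    # promotion resets the score to 30
    if score < 0:
        if rank == "C-":
            return rank, 0         # bottom rank: the score is floored
        return RANKS[i - 1], 70    # demotion resets the score to 70
    return rank, score


print(apply_battle("B", 90, won=True))    # ('B+', 30)
print(apply_battle("B+", 0, won=False))   # ('B', 70)
print(apply_battle("C-", 0, won=False))   # ('C-', 0)
```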

Modeling Score Transitions

To analyze the behavior of this ranking system, I’m going to assume that everyone starts with a rank of C−, and that battles are decided purely by chance. You might be able to intuit that, because of the reward/penalty bias for lower ranks, players will tend to be promoted up to “B”. Beyond that, perhaps a few of them will drift up to “A” just by dumb luck, but most will stay “B” players, right?

The actual answer is a bit more interesting. To see it, we’ll need to use some stochastic matrix (Markov) theory. The idea is that if we can construct a matrix that describes the probabilities of going from one state (rank and score) to every other state (rank and score), then we can apply some useful theorems to learn about the long-run, steady-state behavior of the system.

Unfortunately, the full Markov matrix for this problem is a large and hairy beast. Because there are nine possible ranks, and 100 possible scores within each rank, that means there are 9 × 100 = 900 possible states, and the transition matrix will have 900 × 900 entries. Most of those entries will be zero, but still, that’s a much bigger matrix than I would like to work with.

So to make things a bit easier, I’m going to break the problem down into two kinds of transition matrices. First there’s the transition matrix within each rank, call it the “score transition matrix”, which will characterize the probability of going from one score to another (say, from 50 to 60) within each rank. There will be nine score matrices, each no larger than 100 × 100, and we’ll see in a minute that most will be much smaller.

Second, there will be the “rank transition matrix”, a 9 × 9 matrix which will characterize the transitions between ranks. We’ll analyze the nine score matrices to produce the rank matrix, then analyze the rank matrix to produce a final answer.

The C− Score Transition Matrix

To kick things off, let’s construct a transition matrix for players with a rank of C−. Because the reward/penalty structure is +20/−10, and because the only entrances into the C− rank are divisible by ten (either playing for the first time, when the score is zero, or suffering a demotion from C, which resets the score to 70), we only need to analyze ten possible scores, that is, scores divisible by ten: 0, 10, 20, etc. If you’re familiar with ergodic theory, those ten scores represent an “ergodic set”.

We can deduce the transition probabilities directly from the reward/penalty structure, and from assuming that each battle is decided purely by chance. For example, a player with a score of 50 has a 50% chance of being knocked down to 40 (after suffering a penalty of −10), and a 50% chance of being bumped up to 70 (after receiving a reward of +20).

If we index starting scores by row, and ending scores by column, we can come up with a transition matrix (whose rows sum to 1) describing every possible score transition within the C− rank:

\[ C^- = \left[ \begin{array}{cccccccccc}
0.5 & 0 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.5 & 0 & 0 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.5 & 0 & 0 & 0.5 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.5 & 0 & 0 & 0.5 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.5 & 0 & 0 & 0.5 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.5 & 0 & 0 & 0.5 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0.5 & 0 & 0 & 0.5 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0 & 0 & 0.5 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0 \\
\end{array}\right] \]

The player with starting score of 50, for example, is represented by the sixth row of the matrix.

There’s a problem, unfortunately, with the last two rows of the matrix: they don’t add up to 1. These rows represent a starting score of 80 and 90, respectively. If the player wins, she is promoted to a rank of C and leaves the matrix.
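To make the row-sum problem concrete, here is a minimal sketch that builds the same matrix from the +20/−10 rule and prints each row's total. The helper name `raw_c_minus_matrix` is made up for this example; the only assumptions are the ones already stated (coin-flip battles, scores restricted to multiples of ten, a floor at zero).

```python
SCORES = list(range(0, 100, 10))   # 0, 10, ..., 90
N = len(SCORES)


def raw_c_minus_matrix():
    """Rows are starting scores, columns are ending scores."""
    m = [[0.0] * N for _ in range(N)]
    for i, score in enumerate(SCORES):
        lose = max(score - 10, 0)      # C- cannot be demoted, so 0 is a floor
        m[i][SCORES.index(lose)] += 0.5
        win = score + 20
        if win <= 99:                  # otherwise the player is promoted out
            m[i][SCORES.index(win)] += 0.5
    return m


C_MINUS_RAW = raw_c_minus_matrix()
for score, row in zip(SCORES, C_MINUS_RAW):
    print(f"score {score:2d}: row sum = {sum(row)}")
```

Only the rows for scores 80 and 90 come up short, because half of the players in those rows leave for rank C.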

So to make this a proper stochastic matrix, I’m going to do something sneaky. In the steady state, we know that after each battle, the number of players being promoted from C− to C must be the same as the number of players demoted from C to C−. So I’m going to pretend that the new C− players (the recently demoted ones, who have their scores reset to 70) are the same as the ones who were just promoted out. It’s a lie, but for purposes of this analysis, players are interchangeable, and the charade lets us solve for a steady state within the C− rank.

The modified transition matrix, with 0.5 added to the eighth column of the last two rows, is then:

\[ C^- = \left[ \begin{array}{cccccccccc}
0.5 & 0 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.5 & 0 & 0 & 0.5 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.5 & 0 & 0 & 0.5 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.5 & 0 & 0 & 0.5 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.5 & 0 & 0 & 0.5 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.5 & 0 & 0 & 0.5 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0.5 & 0 & 0 & 0.5 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0 & 0 & 0.5 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1.0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.5 & 0.5 & 0 \\
\end{array}\right] \]

That 1.0 means that 100% of players with a score of 80 end up with a score of 70. It’s true for the players who lose the battle (and lose 10 points), and we’re pretending it’s true for the players who win the battle (who are promoted to C, but who are replaced by demoted players who start back at 70).

We can simulate a round of battles by multiplying a starting distribution of players, call it \(p = (p_0, p_{10}, p_{20}, \ldots)\), by the transition matrix, producing an ending distribution of players, call it \(p'\):

\[ p' = pC^- \]

We can solve for the steady state by setting \(p'=p\):

\[ p = pC^- \]

You can solve that system of ten equations in ten unknowns by substitution, or just realize that the solution will be the left eigenvector of \(C^-\) with an eigenvalue equal to 1. Normalizing the eigenvector so that its entries sum to 1, I get:

\[ p = \left( \begin{array}{l} 0.0131 \\ 0.0131 \\ 0.0262 \\ 0.0393 \\ 0.0656 \\ 0.1049 \\ 0.1705 \\ 0.2754 \\ 0.1541 \\ 0.1377 \end{array} \right) \]

Or in tabular form:

| Score | Steady state |
| --- | --- |
| 0 | 1.31% |
| 10 | 1.31% |
| 20 | 2.62% |
| 30 | 3.93% |
| 40 | 6.56% |
| 50 | 10.49% |
| 60 | 17.05% |
| 70 | 27.54% |
| 80 | 15.41% |
| 90 | 13.77% |
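As a cross-check on these figures, the steady state can also be approximated without an eigenvector routine by simply playing many rounds, that is, by repeatedly applying \(p' = pC^-\). The sketch below swaps in this power-iteration approach in place of the eigenvector calculation; the function names and the choice of 5,000 rounds are assumptions made for the example.

```python
SCORES = list(range(0, 100, 10))   # 0, 10, ..., 90
N = len(SCORES)


def c_minus_matrix():
    """The modified C- matrix: every row sums to 1."""
    m = [[0.0] * N for _ in range(N)]
    for i, score in enumerate(SCORES):
        lose = max(score - 10, 0)  # floor at zero: C- cannot be demoted
        win = score + 20
        if win > 99:
            win = 70   # the promoted player is "replaced" by a demoted one
        m[i][SCORES.index(lose)] += 0.5
        m[i][SCORES.index(win)] += 0.5
    return m


def step(p, m):
    """One round of battles: the row vector p times the matrix m."""
    return [sum(p[i] * m[i][j] for i in range(N)) for j in range(N)]


def steady_state(m, rounds=5000):
    p = [1.0] + [0.0] * (N - 1)    # everyone starts with a score of 0
    for _ in range(rounds):
        p = step(p, m)
    return p


C_MINUS_STEADY = steady_state(c_minus_matrix())
for score, share in zip(SCORES, C_MINUS_STEADY):
    print(f"{score:2d}  {share:.2%}")
```

The printed shares should agree with the table above to within rounding.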

The above table is interesting because it very clearly reflects the upward bias in the reward/penalty structure. It also clearly shows the “landing point” of 70, where all of the recent demotions from the C rank will first arrive.

More Steady States

You can repeat the above analysis, with some modifications, to compute the steady-state distribution of scores within each rank. The reward for victory at the C rank is +15, and so you’ll need a separate state for all scores divisible by five — twenty scores in total (0, 5, 10, 15, etc.) — and a transition matrix that is 20 × 20. Likewise, because the reward for victory at the C+/B− rank is +12, you’ll need fifty scores in total (0, 2, 4, 6, etc.), and a transition matrix that is 50 × 50.

The transition matrices for B and above will be only 10 × 10 in size, and those will be identical except for the A+ matrix (the A+ transition matrix differs because the A+ rank does not receive any demotions from a higher rank). I cheated a little bit and assumed that the A+ score tops out at 100, rather than 99, but this little bit of fudging shouldn’t affect the core analysis.
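The same construction can be parameterized so that one function covers every rank. The sketch below is a guess at how such a helper might look, not the author's actual code: `score_chain` and its arguments are made up for the example, and it applies the replacement trick in both directions (a player promoted out is replaced by a demoted player entering at 70, and a player demoted out is replaced by a promoted player entering at 30).

```python
def score_chain(reward, step, top=False, bottom=False, penalty=10):
    """Within-rank transition matrix on scores 0, step, 2*step, ..."""
    hi = 100 if top else 99            # A+ is allowed to top out at 100 here
    scores = list(range(0, hi + 1, step))
    idx = {s: i for i, s in enumerate(scores)}
    m = [[0.0] * len(scores) for _ in scores]
    for s in scores:
        win, lose = s + reward, s - penalty
        if win > hi:
            win = hi if top else 70    # capped at A+, else replaced at 70
        if lose < 0:
            lose = 0 if bottom else 30  # floored at C-, else replaced at 30
        m[idx[s]][idx[win]] += 0.5
        m[idx[s]][idx[lose]] += 0.5
    return scores, m


def steady_state(m, rounds=5000):
    """Approximate the steady state by playing many rounds of battles."""
    n = len(m)
    p = [1.0 / n] * n
    for _ in range(rounds):
        p = [sum(p[i] * m[i][j] for i in range(n)) for j in range(n)]
    return p


B_SCORES, B_MATRIX = score_chain(reward=10, step=10)
B_STEADY = steady_state(B_MATRIX)
A_SCORES, A_MATRIX = score_chain(reward=10, step=10, top=True)
A_STEADY = steady_state(A_MATRIX)

for s, share in zip(B_SCORES, B_STEADY):
    print(f"B..A  {s:3d}  {share:.2%}")
for s, share in zip(A_SCORES, A_STEADY):
    print(f"A+    {s:3d}  {share:.2%}")
```

Under the same assumptions, `score_chain(reward=20, step=10, bottom=True)` would describe C−, `score_chain(reward=15, step=5)` the 20 × 20 matrix for C, and `score_chain(reward=12, step=2)` the 50 × 50 matrix for C+/B−.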

For the sake of completeness, I am including the figures I calculated below, but you can skip the rest of this section if you want to get straight to the coup de grâce.

For rank C:

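| Score | Steady state | Score | Steady state |
| --- | --- | --- | --- |
| 0 | 0.52% | 50 | 6.50% |
| 5 | 0.69% | 55 | 6.09% |
| 10 | 1.04% | 60 | 9.56% |
| 15 | 1.38% | 65 | 7.81% |
| 20 | 2.08% | 70 | 14.33% |
| 25 | 2.24% | 75 | 9.13% |
| 30 | 3.48% | 80 | 6.19% |
| 35 | 3.43% | 85 | 8.71% |
| 40 | 4.37% | 90 | 4.57% |
| 45 | 4.78% | 95 | 3.10% |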

For rank C+/B−:

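| Score | % | Score | % | Score | % | Score | % | Score | % |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0.38% | 20 | 1.53% | 40 | 2.11% | 60 | 4.07% | 80 | 1.76% |
| 2 | 0.45% | 22 | 1.43% | 42 | 2.67% | 62 | 3.33% | 82 | 3.47% |
| 4 | 0.47% | 24 | 1.42% | 44 | 2.44% | 64 | 2.88% | 84 | 2.75% |
| 6 | 0.46% | 26 | 1.38% | 46 | 2.24% | 66 | 2.66% | 86 | 2.21% |
| 8 | 0.45% | 28 | 1.33% | 48 | 2.08% | 68 | 2.4% | 88 | 1.83% |
| 10 | 0.77% | 30 | 2.61% | 50 | 2.9% | 70 | 6.06% | 90 | 1.12% |
| 12 | 0.91% | 32 | 2.1% | 52 | 2.72% | 72 | 3.77% | 92 | 0.88% |
| 14 | 0.94% | 34 | 1.94% | 54 | 2.78% | 74 | 3.04% | 94 | 1.74% |
| 16 | 0.93% | 36 | 1.83% | 56 | 2.55% | 76 | 2.55% | 96 | 1.38% |
| 18 | 0.9% | 38 | 1.73% | 58 | 2.32% | 78 | 2.24% | 98 | 1.1% |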

For rank B/B+/A−/A:

| Score | Steady state |
| --- | --- |
| 0 | 3.33% |
| 10 | 6.67% |
| 20 | 10.00% |
| 30 | 13.33% |
| 40 | 13.33% |
| 50 | 13.33% |
| 60 | 13.33% |
| 70 | 13.33% |
| 80 | 8.89% |
| 90 | 4.44% |

For rank A+:

| Score | Steady state |
| --- | --- |
| 0 | 2.63% |
| 10 | 5.26% |
| 20 | 7.89% |
| 30 | 10.53% |
| 40 | 10.53% |
| 50 | 10.53% |
| 60 | 10.53% |
| 70 | 10.53% |
| 80 | 10.53% |
| 90 | 10.53% |
| 100 | 10.53% |

There’s a clear upward bias within each rank, which makes sense for the reasons discussed, until you reach rank B. At rank B and above, things seem to flatten out, at least between scores of 30 and 70.

Next we’ll use the above tables to construct a final transition matrix that characterizes player movements between the nine possible ranks, then solve for the steady-state to get the final distribution of player ranks that we crave.

The Rank Transition Matrix

What fraction of players within a given rank will be promoted or demoted at the end of a battle? Because we now know the distribution of scores within each rank, we can answer that question precisely.

For the C− rank, we know that after a battle, half of the players with a score of 80 will be promoted, as will half of the players with a score of 90. We also know that 15.41% of players have a score of 80 in the steady state, and that 13.77% of players have a score of 90 in the steady state. So the C− to C transition probability is \(\frac{1}{2} (15.41\%) + \frac{1}{2}(13.77\%) = 14.59\%\), and the C− to C− transition probability is one minus that quantity, or 85.41%.
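The same arithmetic applies to the other ranks. As a small sketch, using steady-state shares copied from the tables in the previous section (the variable names are made up for this example):

```python
# Steady-state shares of the scores that sit one battle away from a
# promotion or a demotion, copied from the tables above as fractions.
c_minus = {80: 0.1541, 90: 0.1377}   # C-: a +20 win from 80 or 90 promotes
b_rank = {0: 0.0333, 90: 0.0444}     # B through A: +10 / -10
a_plus = {0: 0.0263}                 # A+: can only be demoted

# Battles are coin flips, so half of the players at those scores move.
c_minus_up = 0.5 * (c_minus[80] + c_minus[90])
b_up = 0.5 * b_rank[90]
b_down = 0.5 * b_rank[0]
a_plus_down = 0.5 * a_plus[0]

print(f"C- promoted per battle: {c_minus_up:.2%}")   # 14.59%
print(f"B  promoted per battle: {b_up:.2%}")         # 2.22%
print(f"B  demoted per battle:  {b_down:.4f}")
print(f"A+ demoted per battle:  {a_plus_down:.4f}")
```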

Proceeding in a like manner, we can construct a full rank transition matrix from the steady states of within-rank scores, indexing starting ranks on the rows, and ending ranks in the columns:

\[ R = \left(\begin{array}{ccccccccc}
0.854 & 0.146 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0.00605 & 0.912 & 0.0817 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0.0111 & 0.9487 & 0.0402 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.0111 & 0.9487 & 0.0402 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0.0167 & 0.961 & 0.0222 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.0167 & 0.961 & 0.0222 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0.0167 & 0.961 & 0.0222 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0.0167 & 0.961 & 0.0222 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0.0132 & 0.987 \\
\end{array}\right) \]

Interestingly, the matrix implies that while almost 15% of C− players are promoted each round, less than 2% of A+ players are demoted — and this assuming battles have absolutely nothing to do with skill!

There’s also a clue in this matrix that something is not quite symmetric with B ranks and above. Between the ranks of B and A, after each battle, 2.22% of players are promoted, but only 1.67% of players are demoted. That should be an alert that the ranking system has an upward bias even in ranks where the reward/penalty structure is symmetric. We’ll see in a moment how this small promotion/demotion asymmetry plays out.

When I solve the equation \(p=pR\), I get:

\[ p = \left( \begin{array}{c} 0.00005817 \\ 0.001402 \\ 0.01036 \\ 0.03758 \\ 0.09067 \\ 0.1209 \\ 0.1612 \\ 0.2149 \\ 0.3630 \end{array} \right) \]

Or in tabular form:

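| Rank | Steady state | Group sum |
| --- | --- | --- |
| C− | 0.0058% | 1.18% |
| C | 0.14% | |
| C+ | 1.04% | |
| B− | 3.76% | 24.9% |
| B | 9.07% | |
| B+ | 12.09% | |
| A− | 16.12% | 73.9% |
| A | 21.49% | |
| A+ | 36.30% | |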
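Because players only ever move to an adjacent rank, the steady state of the rank chain can also be recovered with a shortcut: in the long run, the flow of promotions from each rank must equal the flow of demotions back from the rank above it. The sketch below uses that balance condition rather than an eigenvector solver. The probabilities are the rounded per-battle values derived from the within-rank steady states, with B− assumed to share C+'s values since both use the +12/−10 score distribution; the output therefore matches the table only to about two decimal places.

```python
RANKS = ["C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]

# Rounded per-battle promotion and demotion probabilities for each rank.
# C+ and B- share one score distribution, so they share the same values.
UP = [0.146, 0.0817, 0.0402, 0.0402, 0.0222, 0.0222, 0.0222, 0.0222, 0.0]
DOWN = [0.0, 0.00605, 0.0111, 0.0111, 0.0167, 0.0167, 0.0167, 0.0167, 0.0132]


def rank_steady_state(up, down):
    """Balance flows between neighbors: p[i] * up[i] == p[i+1] * down[i+1]."""
    weights = [1.0]
    for i in range(1, len(up)):
        weights.append(weights[-1] * up[i - 1] / down[i])
    total = sum(weights)
    return [w / total for w in weights]


RANK_STEADY = rank_steady_state(UP, DOWN)
for rank, share in zip(RANKS, RANK_STEADY):
    print(f"{rank:2s}  {share:.2%}")
print(f"A- or higher: {sum(RANK_STEADY[6:]):.1%}")
```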

It should be clear from the table that the Splatoon ranking system has a strong upward bias: with randomly decided battles, almost three-quarters of players will end up with a rank of A− or higher, and a staggering 36.3% of players will be ranked as A+. And yet within each rank above B−, we saw that the scores are evenly distributed. What could be going on here?

The Source of Grade Inflation

There’s a subtle, but very important asymmetry in the Splatoon point system. When a player is promoted, she starts in the new rank with 30 points. When a player is demoted, the player starts the lower rank with 70 points. Nothing fishy going on there, right?

Consider a player with a rank of B who has zero points. First suppose the player wins five battles (earning 10 points for each), then loses five battles (losing 10 points for each). She still has a rank of B, and still has zero points.

Now suppose the player wins ten battles (earning a promotion to the rank of B+), then loses ten battles. After the third loss, the player has a rank of B+ and a score of zero. After the fourth loss, the player is demoted back to B, with a score of 70 points. After six more losses, the player has a rank of B, but with ten points instead of zero points!
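Both scenarios are easy to replay in code. The sketch below restates the scoring rules from earlier in the article; `apply_battle` and `play` are illustrative names, not part of any real Splatoon API.

```python
RANKS = ["C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]
REWARD = {"C-": 20, "C": 15, "C+": 12, "B-": 12}   # all other ranks: +10


def apply_battle(rank, score, won):
    """Return the (rank, score) pair after one battle."""
    i = RANKS.index(rank)
    score += REWARD.get(rank, 10) if won else -10
    if score > 99:
        return (rank, 99) if rank == "A+" else (RANKS[i + 1], 30)
    if score < 0:
        return (rank, 0) if rank == "C-" else (RANKS[i - 1], 70)
    return rank, score


def play(rank, score, results):
    """Apply a sequence of wins (True) and losses (False)."""
    for won in results:
        rank, score = apply_battle(rank, score, won)
    return rank, score


print(play("B", 0, [True] * 5 + [False] * 5))     # ('B', 0)
print(play("B", 0, [True] * 10 + [False] * 10))   # ('B', 10)
```

The second player has the same number of wins and losses as the first, relative to where she started, yet ends up ten points ahead.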

In other words, there’s a hidden bonus lurking in the rank system. On average, a promotion gives you thirty points for “free”, but a demotion takes away only twenty points, not thirty!

The asymmetry is hard to see because demoted players are bumped back to 70, which is 100 minus 30 (obviously). But you’re only demoted when the score drops below zero — if your score starts at zero, then losing a battle should knock you down to 90 at the previous rank. A demotion penalty of 30 would put the player at 60, not at 70 in that case.

The effect might be attenuated somewhat because of the small score adjustments that take place in lopsided battles, meaning not all scores are perfectly divisible by ten, but I believe this is a fundamental flaw in the Splatoon scoring algorithm. It seems like the designers wanted to provide an explicit upward bias at low ranks, but then level the playing field once players hit the “B” rank. But the way that promotions and demotions are designed, the ranking system is giving out more points to players it promotes than it is taking away from players it demotes. The result is that at higher levels, Splatoon ranks are not a zero-sum game. As a consequence, there will be grade inflation simply as more battles are played, and higher ranks will become less and less meaningful over time.

There’s a very simple fix to the design flaw: demote players when their score reaches zero, rather than dips below zero. That small tweak smooths out the hidden asymmetry in the reward system. Re-running the steady-state analysis with this tiny change, I get:

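| Rank | Steady state | Group sum |
| --- | --- | --- |
| C− | 0.023% | 3.39% |
| C | 0.44% | |
| C+ | 2.93% | |
| B− | 9.65% | 42.6% |
| B | 16.45% | |
| B+ | 16.45% | |
| A− | 16.45% | 54.1% |
| A | 16.45% | |
| A+ | 21.15% | |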

The above table shows that such a tweak eliminates the upward skew of ranks. In the steady state, ranks will be more or less evenly distributed between B and A+, which is consistent with the uniform distribution of scores within each rank, and is probably closer to the intentions of the Splatoon designers.
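The effect of the proposed tweak on a single player can be sketched by adding a flag to the scoring rule. As before, this is an illustration rather than real game code, and the `demote_at_zero` parameter is a made-up name. A player who starts at B with ten points, wins nine battles, and then loses nine keeps a ten-point bonus under the shipped rule but returns exactly to her starting score under the tweaked rule.

```python
RANKS = ["C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]
REWARD = {"C-": 20, "C": 15, "C+": 12, "B-": 12}   # all other ranks: +10


def apply_battle(rank, score, won, demote_at_zero=False):
    """One battle; with demote_at_zero, a score of exactly 0 also demotes."""
    i = RANKS.index(rank)
    score += REWARD.get(rank, 10) if won else -10
    if score > 99:
        return (rank, 99) if rank == "A+" else (RANKS[i + 1], 30)
    if score < 0 or (demote_at_zero and score == 0):
        return (rank, 0) if rank == "C-" else (RANKS[i - 1], 70)
    return rank, score


def play(rank, score, results, demote_at_zero=False):
    """Apply a sequence of wins (True) and losses (False)."""
    for won in results:
        rank, score = apply_battle(rank, score, won, demote_at_zero)
    return rank, score


battles = [True] * 9 + [False] * 9
print(play("B", 10, battles))                        # ('B', 20)
print(play("B", 10, battles, demote_at_zero=True))   # ('B', 10)
```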

Conclusion

Probability theory is a powerful tool, and it can be used to understand the properties of ranking systems in video games (and elsewhere). It’s particularly useful for evaluating specific changes in ranking mechanics, such as the introduction of weights or asymmetries designed to achieve specific social goals.

It can also uncover unintended systemic behavior. Because of the slight asymmetry in the Splatoon promotion system, I predict there will be a slow upward drift in Splatoon player rankings, and that Nintendo will observe many more A+ players on their Splatoon servers than they originally anticipated. This is not because “all the children are above average”, but because of a design flaw in the ranking system, which should probably be addressed before the inflation gets out of hand. Other game makers would do well to consider formal analysis methods, such as the methods developed in this article, before releasing their own ranking systems into the wild.

Now if you’ll excuse me, I have some painting to do.

Updates/Corrections

7/22/2015: A previous version of the article stated there are 99 possible scores within each rank. Actually, there are 100 possible scores (0–99), so the full transition matrix would be 900 × 900, not 891 × 891.

7/22/2015: “Kid aSquid” has implemented a Monte Carlo simulation of the analysis, with results very similar to the values predicted above.

7/21/2015: A previous version of the article assumed that players’ starting rank is C. In fact, the starting rank is C−. The text has been updated to reflect this fact, but the analysis is unaffected. (Steady states do not depend on initial conditions.)

7/21/2015: The number-crunching code is now available on GitHub.
