Interpretation, representativeness, confidence interval, and more

The analyst desk during ESL One Cologne 2016.

The eye test's downside is statistics' nemesis

The verdict

About the author

Tomi Kovanen, more commonly known as "lurppis", is one of Finland's most prominent Counter-Strike experts. Kovanen started his career as a player back in 2004, retiring in early 2012. During his active years, Kovanen represented teams such as hoorai, Team ROCCAT, 4Kings and Evil Geniuses.



Following his retirement, Kovanen has continued to be an influential member of the scene, sharing his expertise as a columnist, analyst, commentator and a frequent user of Twitter (@lurppis_).

Statistics are not perfect, and they will only improve gradually. HLTV.org’s Rating 2.0 is a step in the right direction, and it will get even better when economic data is included. Still, that will not be enough on its own. And frankly, there will never be a god-statistic that once and for all tells you how good a player is. As such, the most important thing about statistics is how you interpret them. The most common misconception is assuming that because someone interprets statistics one way, so does everyone else. That could not be further from the truth, and is where you can tell the number-fluent analysts from the rest.

Another misconception regards sample size, which the community and even analysts at times seem to believe is the holy grail of statistics. Sample size does not diminish the meaning of a statistic; it only determines whether the statistic is representative of a larger data-set. If the sample in question happens to be ELEAGUE’s Clash for Cash – a single best-of-three – it is what you would sensibly use to gauge who did best in Atlanta. You cannot generalize the figures to a larger population (i.e. expect an overperformer to keep that up over months of play), but the figures describe who did well in a tiny sample with incredible meaning, to the tune of $250,000.

In fact, the statistics that are not representative of a larger data-set can even be more interesting – few people are excited about WorldEdit’s career 1.08 offline rating, but many remember his record-setting 3.42 rating in the thrashing of Echo Fox at ELEAGUE Season 1. flusha’s career offline rating of 1.09 is very good, but his 1.38 rating performance at ESL One Cologne 2015 was spectacular. And the fact is, he is probably better remembered for the latter – and rightfully so.

One can draw conclusions from the most simplistic stats, and most of the time they will be right. A key statistical term that is rarely if ever thrown around in Counter-Strike circles is confidence interval.
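To make the term concrete, here is a minimal sketch of a confidence interval for a player's average rating. It assumes per-map ratings can be treated as independent draws and uses a normal approximation; the ratings below are invented for illustration and are not real HLTV data, and `mean_confidence_interval` is a hypothetical helper name.

```python
import math
import statistics


def mean_confidence_interval(ratings, confidence=0.95):
    """Return (mean, low, high) for the average rating.

    Uses a normal approximation. For very small samples a t-distribution
    would give a wider interval; the normal quantile is used here only to
    keep the sketch within the standard library.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings to estimate spread")
    mean = statistics.mean(ratings)
    # Standard error: sample standard deviation shrunk by sqrt(sample size).
    std_err = statistics.stdev(ratings) / math.sqrt(n)
    # Two-sided quantile, roughly 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, mean - z * std_err, mean + z * std_err


# Hypothetical per-map ratings, invented for illustration only.
one_series = [1.45, 0.92, 1.31]
many_maps = one_series * 20  # same values repeated: 60 maps instead of 3

for label, sample in (("3 maps", one_series), ("60 maps", many_maps)):
    mean, low, high = mean_confidence_interval(sample)
    print(f"{label}: mean {mean:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

The average is the same in both cases, but the interval around it is much narrower for the larger sample. The small sample still describes what happened in that series; it just says less about what to expect next.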
If you do not hold the opinions you form from a few figures to the grave, there is nothing wrong with basing your opinions on simple statistics. They allow you to form an opinion in seconds that you can later modify as you take in more data. In addition, simple figures are great for sense-checking what you already believe.

Right before CS:GO took over the scene in 2012, tournaments started streaming more of the matches that we were used to watching on HLTV (the equivalent of GOTV in CS 1.6), with the ability to roam around the map, swap between players as you wish, and otherwise control your viewing experience. By the time CS:GO rolled around, most tournaments were streamed, and it was rare to get to watch games on GOTV. By the time Valve launched their majors in late 2013, they were practically the only events to feature open GOTVs for spectators.

Newer fans may not remember, but at the time when streams started taking over, fans of Counter-Strike by and large agreed it was frustrating to watch games on stream, because you had such a poor idea of what was going on. You were at the mercy of whoever was directing the stream, and issues such as being unable to quickly swap between players, the inability to look at the mini-map or scoreboard, and even missing kills caused concern. Today all the mentioned issues persist to varying degrees. But barely any of them are talked about, for streams are such an integral part of the community – without a viable alternative.

At its best, when GOTV is available to everyone, you can catch a reasonable amount of the action. There are 10 players, and you see the kill-feed for everyone, so for simplicity’s sake, let us say you catch 20% of what is going on, or double what you see take place (10% of 10 players). How confident do you feel in determining who did well based on 20% of eye-data? The issue is compounded by lesser players getting less airtime on streams, and key firefights taking place simultaneously around a map.
And most of the time, that is without the ability to rewind.

If we assume 40 maps are played in a tournament and a single map lasts 45 minutes, it would take you roughly 30 hours to watch the entire tournament in VODs. But if we required you to see all 10 players’ points of view, that balloons to 300 hours of demo viewing. I am 100% confident that no one in Counter-Strike history has watched an entire tournament from start to finish from every player’s point of view. Yet that is never mentioned, let alone seen as a problem, when comparing the eye test to statistics.

This issue compounds the larger the sample size – a popular term few understand, but one that everyone throws around as a common buzzword – happens to be. By the time you have gone through three tournaments of stream VODs, you might have an entirely different picture than the statistics point to. This happens even within single matches – you just need to recall the last time you were playing with a friend (or watching a game), only for you to notice (or for a caster to point out) how many kills your friend (or a player) had racked up by that time. We are constantly surprised by this, because it is hard to keep track of.

Statistics stand no chance against the eye test if we compare watching a single player’s POV demo versus looking at his stats. Today’s numbers do not even come close – though I would argue that tomorrow’s will, and you normally at least land in the right ballpark (which is where confidence interval comes in). But where numbers, or data more generally, reign supreme is large sample sizes. Forget about the 300 hours to watch through a single tournament – what about going through ten events? That is 3,000 hours, or roughly six months of non-stop 16-hour days.
And ten events do not even cover a single year of play.

To make it worse, you can tell yourself you remember whether the fourth-best player of a quarter-final team did well at a given event six months ago, but the human brain is not ideal for storing that kind of information. The more time passes, the weaker your memories are likely to become. And as explained before, the basis for your opinion could itself be false. That is not an issue with numbers, which stay the same through time – or can sometimes even be improved later (e.g. Rating 2.0) as more variations are derived from the same data-set.

Finally, the eye test remains an entirely subjective metric. Two players watching the same match on the same stream can come up with wildly differing opinions – just listen to analysts or read Twitter during matches – and there is no way to verify any of that without going back and watching the game, which is very time-consuming and something few will do. Complicated statistics have an inherent bias in the weighting, but it remains constant over time for everyone, and is something the simplest stats do not possess.

Statistics are not perfect – or even close – but for anything larger than a single tournament, I do not think the eye test stands a chance in comparison. Interpreting numbers in a sensible way is not easy by a long shot, but the old cliché is true – numbers never lie.
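For readers who want to check the viewing-time figures used in the eye-test argument, the arithmetic can be reproduced in a few lines. This is only a sketch: the map count, map length and 16-hour day come from the text, while the 30-day month and the helper name `viewing_hours` are assumptions made here for illustration.

```python
MAPS_PER_TOURNAMENT = 40  # assumption stated in the article
MINUTES_PER_MAP = 45      # assumption stated in the article
PLAYERS_PER_MAP = 10
HOURS_PER_DAY = 16        # the "non-stop 16-hour days" from the article
DAYS_PER_MONTH = 30       # rough convention, assumed for this sketch


def viewing_hours(events, all_povs):
    """Hours needed to watch `events` tournaments, once or from every POV."""
    minutes = events * MAPS_PER_TOURNAMENT * MINUTES_PER_MAP
    if all_povs:
        # Watching each of the ten players separately multiplies the time.
        minutes *= PLAYERS_PER_MAP
    return minutes / 60


vod_hours = viewing_hours(1, all_povs=False)   # 30.0
demo_hours = viewing_hours(1, all_povs=True)   # 300.0
ten_events = viewing_hours(10, all_povs=True)  # 3000.0

days = ten_events / HOURS_PER_DAY              # 187.5
months = days / DAYS_PER_MONTH                 # 6.25

print(f"One event, stream VODs: {vod_hours:.0f} h")
print(f"One event, every POV:   {demo_hours:.0f} h")
print(f"Ten events, every POV:  {ten_events:.0f} h, about {months:.1f} months")
```

Changing the constants shows how quickly the cost grows: the total scales linearly with the number of events, maps and points of view, which is why the eye test cannot keep up with large samples.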