Nothing engages the community more than discussion of balance. Yet while everyone talks about balance one way or another, it seems to me few give much thought to the act of talking about balance. This article is not about the state of balance per se; rather, I want to discuss how we think when we talk about balance.

Introduction

In this article, I explore the hidden side of balance discussions, highlighting how our intrinsic biases can blind us in our debates.

Readers messaged me and asked whether I am writing an article about the current state of balance. The short answer is no. However, I do want to write about how we approach the topic of balance itself. A few years ago I wrote an article about what good balance arguments are not. In it, I identified some common arguments people use when they discuss balance, and explained why they are not good arguments. I strongly recommend reading it before you continue with this article.

One thing that stood out from that article is that I never said what a good balance argument actually is. That was intentional, as I wanted to tiptoe around the idea that there is a specific “framework” for a good balance argument. This is like doing science: there is no pre-determined framework for making an argument about something, and you simply have to put forward the best argument the evidence allows. This led me to ask myself (and others): what is the best evidence you can realistically provide in a balance argument? Give yourself a minute to answer this question as honestly as possible before you continue.

Questioning ourselves

Now, consider a second question: what evidence regarding balance would convince you?

Are the answers to the two questions different? Chances are you gave an honest answer to the first one, yet you are not exactly convinced when you apply that same answer to the second. This exercise demonstrates two psychological biases we have.

First, despite our experience as both the convincer and the convincee, we struggle to transfer our knowledge from one role to the other. This relates to the concept of perspective taking, the act of perceiving a situation or understanding a concept from an alternative point of view, such as that of another individual [academic references at the end – 1, 2]. The bias shows up in many common social activities [3, 4]. For example, in the context of expressing gratitude, expressers underestimate how surprised recipients would be about why the expressers were grateful, overestimate how awkward recipients would feel, and underestimate how positive recipients would feel [4]. We have all expressed and received gratitude in our lives, yet we still find it extremely difficult to shift away from our current perspective. This egocentric bias underlies many of the heated balance debates we see on Starcraft forums. More importantly, the exercise implies that you can never make a good enough argument: the best evidence realistically possible is still not good enough for doubters. Why do doubters refute even the best possible evidence? That brings us to the second bias.

Second, people evaluate information through a tainted lens. My last article about the TvP metagame is a good example. I merely described the metagame trends and made sense of the games I watched; I did not go down the path of discussing balance. Still, I received messages and mails letting me know how “bad” I am. One side basically screams “you Terran whiner”, while the other calls me out for not recognising that Terran is in a shit spot in the match up. If you want to accuse me of making a statement about balance in that article, I was being ambiguous about it at worst. I would pay to have these two groups of zealots in a room to debate what I wrote. This is a classic example of motivated reasoning, the phenomenon whereby people use reasoning strategies that allow them to draw the conclusions they want to draw [5]. Such bias has real life consequences, as we often see supporters of two opposing sides accuse judges, journalists, and academics of being biased against their ideals. Motivated reasoning makes us more critical of information that serves as evidence against our predispositions [6]. Thus, we often brush away the best reasonably possible evidence, the very evidence we ourselves would have offered.

The purpose of this simple exercise is to highlight our motivated cognitive biases and lay the groundwork for other assertions I make later. Above all, I want readers to recognise these human biases, so that you can read on with an open mind.

Defining balance

When I first conducted exploratory research for this article by asking some friends individually what they think is the best evidence for a balance argument, they almost unanimously asked how balance is defined. It quickly became clear to me that no one could precisely define it. The answers I got ranged from “there is no such thing as balance” to “both sides having a close to equal chance of winning, all else being equal”.

Based on the countless balance discussions I see online, the general perception of balance can be vaguely described as both sides having a fair shot at winning after certain contexts are taken into account. A fair shot refers to the perception that both sides are given equally powerful tools in the match up, which implies a statistical inference of a close to 50% win rate if things are indeed fair. The “certain contexts” are factors considered noise in the data for any meaningful win rate comparison. For example, a game between Serral and me would not count as a data point when we want to provide evidence on a balance question. While this general description seems fair and reasonable, it carries underlying assumptions that blind us from looking at balance more clearly.

Motivated contextual effects

The above example of Serral versus me should be considered objectively irrelevant to a balance discussion, because I’m trash and Serral is the best player. But we make many assumptions when deciding whether a contextual factor is relevant, and more importantly, we are incapable of isolating our biases during the interpretation process (refer to the two biases mentioned above).

A good example of a motivated contextual effect is the recent meme that “player X did not play well today” (examples: Maru, INnoVation, TY). Jokes aside, at its core it reflects the two-sided debate over whether the player in question failed to perform to his usual standard or the race he played is underpowered. We are so entrenched in arguing for one of the two sides that we overlook the underlying problem with this attribution debate. The best way to illustrate the problem is with some simple survey questions.

The above single-item measure represents how most people frame the problem: player factors and race factors sit at the two ends of a semantic differential scale. In other words, they are treated as a direct trade-off. In reality, that is not the case; the two factors can each affect the result on their own, even if they are inversely correlated (related to the concept of multicollinearity [7]). Going back to the Serral versus me example, it is logically plausible both that Serral is better than me and that Terran has an advantage over Zerg. The two questions below capture this phenomenon much better.
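To make the distinction concrete, here is a minimal simulation sketch with invented numbers (pure Python, nothing from the article itself): player skill and race strength are negatively correlated, yet a two-predictor regression recovers a separate effect for each, which a single trade-off scale cannot express.

```python
import random

random.seed(42)
n = 20000

# Hypothetical setup: "player form" and "race strength" are negatively
# correlated (r = -0.5), yet each has its OWN effect on the outcome.
skill = [random.gauss(0, 1) for _ in range(n)]
race = [-0.5 * s + random.gauss(0, 0.866) for s in skill]
outcome = [1.0 * s + 0.6 * r + random.gauss(0, 0.5)
           for s, r in zip(skill, race)]

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

x1, x2, y = center(skill), center(race), center(outcome)

# Two-predictor OLS via the normal equations (closed form for 2 predictors).
S11 = sum(a * a for a in x1)
S22 = sum(a * a for a in x2)
S12 = sum(a * b for a, b in zip(x1, x2))
S1y = sum(a * b for a, b in zip(x1, y))
S2y = sum(a * b for a, b in zip(x2, y))
det = S11 * S22 - S12 ** 2

b_skill = (S22 * S1y - S12 * S2y) / det
b_race = (S11 * S2y - S12 * S1y) / det
print(round(b_skill, 2), round(b_race, 2))  # close to the true 1.0 and 0.6
```

Both coefficients are recovered despite the correlation between the predictors, which is exactly what the two-item measure allows and the single bipolar scale forbids.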

By measuring the perceived influence of the two factors separately, we drop the assumption that they are a direct trade-off. Importantly, this two-item measure does not change your answer itself; it simply captures your real sentiment better. This conceptualisation of player-race attribution reduces our biases and allows us to show proper appreciation to the winner (see the tweet below). In fact, I believe many who think racial advantage plays a huge role probably also think the winner played well in his or her own right. When we adopt the single-item measure of attribution, we perceive those who suggest race plays a role as wanting to take credit away from the winner, which further polarises our predispositions. Therefore, consciously switching perspective between convincer and convincee can help us engage in more constructive balance debates, even though perspective taking alone does not guarantee an accurate understanding of what the other party thinks [8].

“Balance being discussed as the reason for TY losing that series is a shame to see, Solar played incredibly well against a player that's a monster in TvZ, don't take it away from him. This message from one of TY's biggest fans #IEMKatowice2019” — Maynarde (@MaynardeSC2), March 2, 2019

Statistical arguments

Most evidence for balance arguments is based on some kind of statistical inference. Racial representation is the most common one. People argue there is a balance concern when a race’s representation at a certain stage of a tournament is above or below one-third. I already discussed why racial representation is not great evidence in the previous balance argument article (link here again), so here I want to focus on the misunderstanding of fundamental statistics.

The underlying notion of racial representation (or any other statistical argument) normally hinges on the basic idea of expected versus observed values. For example, people perceive that the expected racial representation should be close to 4-4-4 in a ro12 if the game is balanced. For the sake of this argument, let us say racial representation is perfect data for balance. Do we consider 3-4-5 balanced? Most would say it is just plus or minus one, so it is not abnormal. I can keep going: what about 2-4-6? 1-4-7? At some point you will say, look, that is clearly skewed from the expected 4-4-4, so it is imbalanced. The selected threshold itself is hugely affected by the two biases I mentioned, and having only one Terran in the ro12 of IEM Katowice 2019 is a recent example. Using the same exercise I put forward at the start of this article: are people who argue that a 1-5-6 split with one Terran means Terran is underpowered equally convinced that a 6-5-1 split with six Terrans means Terran is overpowered?
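Under the simplifying assumption that each of the twelve slots is an independent draw with a 1/3 chance per race (an assumption the article questions later), the exact probability of any given split is easy to compute, and it is worth noticing that even the “expected” 4-4-4 split itself occurs only about 6.5% of the time:

```python
from math import comb

def split_prob(counts):
    """Probability of an exact race split across n slots, assuming each
    slot is an independent 1/3 draw per race (a deliberate simplification)."""
    n = sum(counts)
    ways = comb(n, counts[0]) * comb(n - counts[0], counts[1])
    return ways * (1 / 3) ** n

for split in [(4, 4, 4), (3, 4, 5), (2, 4, 6), (1, 4, 7)]:
    print(split, round(split_prob(split), 4))
# (4,4,4) ≈ 0.0652, (3,4,5) ≈ 0.0522, (2,4,6) ≈ 0.0261, (1,4,7) ≈ 0.0075
```

The probabilities fall off gradually rather than at some obvious cliff, which is why the “that is clearly skewed” threshold ends up being supplied by the observer rather than by the data.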

This expected versus observed logic lays the foundation of hypothesis testing in statistics. The classic example is flipping a coin (let’s ignore the frequentist versus Bayesian paradigm). Flipping a fair coin should result in 50% heads and 50% tails, or at least close to it. If you make 100 flips, how many heads (or tails) would allow you to conclude that the coin is not fair? I can start from 51 (or 49) and keep asking the same question, increasing (or decreasing) the number by one. You will reach a point where you say “that feels off” and conclude the coin is not fair. But I will get different answers from different people, so who is right? Logically, even if you observe 100 heads from 100 flips, it is possible that the coin is fair; you are just very lucky (or unlucky) to observe a result with extremely low odds. Given that you can never be conclusive about the result, you are essentially making a judgment of likelihood: how likely is the observed result if the coin is fair? That is logically and conceptually distinct from asking whether the coin is fair.
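The threshold exercise above can be sketched in a few lines of Python: the one-tailed probability of seeing at least k heads in 100 fair flips declines smoothly as k grows, so any cut-off for “this coin is unfair” is a judgment call rather than something the data hands you.

```python
from math import comb

def p_at_least(k, n=100):
    """One-tailed probability of k or more heads in n flips of a fair coin."""
    return sum(comb(n, i) for i in range(k, n + 1)) * 0.5 ** n

# The probability shrinks gradually; there is no natural break where
# "fair" obviously turns into "unfair".
for k in [51, 55, 58, 60, 65]:
    print(k, round(p_at_least(k), 4))
```

Note that 60 heads already falls below the conventional 5% threshold discussed next, while 55 does not, even though the difference between the two outcomes hardly feels dramatic.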

Then, how do we make statistical judgments about a coin flip or a racial distribution? Ironically, social scientists have long used (and still use) statistical methods based on likelihood to make categorical judgments of whether X is related to Y (e.g., the p-value) [9]. Long story short, there is a pseudo-arbitrary threshold in many scientific fields: if the observed value would occur 5% or less of the time under the expected value, we conclude it is unlikely that what we observe is simply happening by chance. It is pseudo-arbitrary because the threshold sits roughly two standard deviations from the mean; in other words, the observed value is pretty far from the expected average. I know statisticians wouldn’t like the way I phrase this, but I just want to get the logic across in layman’s terms without going in depth. Applied to coin flipping: if the observed number of heads or tails would occur 5% or less of the time with a fair coin, you make a judgment call that the coin is unlikely to be fair.

Isn’t having one Terran in the ro12 below that threshold? This is a lot more complex than the coin flip example, for numerous reasons. Coin flipping is the ideal example for understanding hypothesis testing because we know each flip is independent and the expected value is 50%. Tournament slots are different: each race does not have an equal, independent one-third chance of occupying a spot the way heads versus tails has a clean one-half. For example, having one Terran and five Protoss in group A of IEM Katowice does not provide the same set-up as a coin flip. Selection bias is another obvious issue, as some tournaments are not open to all equally (e.g., Koreans are basically not allowed to participate in WCS). In short, there is a lot of noise in tournament data. We also need to consider how confident we can be in drawing a conclusion from the data, and one key factor is sample size. Simply put, the greater the sample size, the more confident we can be. But how big is big enough [10]? Again, as highlighted with the two biases, our answer is hugely shaped by our desire to make certain arguments. To those who hold the position that their race is not overpowered, no effect size or sample size is ever big enough for a discussion. People’s information processing and judgment are shaped by intrinsic motives they are not aware of, such that they reach conclusions that align with their desires rather than with the evidence [5, 6, 11]. Specifically regarding scientific claims based on statistics, we must not forget that the absence of evidence is not evidence of absence [12].
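The sample-size point can also be made concrete under the same fair-coin simplification (illustrative numbers only, ignoring the dependence and selection problems just described): a 60% win rate is statistically unremarkable over 10 games but overwhelming over 1000, even though the proportion is identical.

```python
from math import comb

def two_sided_p(wins, n):
    """Two-sided tail probability of a result at least as extreme as
    `wins` out of `n` games, under a fair 50% null. Illustrative only."""
    k = max(wins, n - wins)
    tail = sum(comb(n, i) for i in range(k, n + 1)) * 0.5 ** n
    return min(1.0, 2 * tail)

print(two_sided_p(6, 10))      # ≈ 0.75: a 60% win rate over 10 games proves nothing
print(two_sided_p(600, 1000))  # vanishingly small: the same 60% over 1000 games is decisive
```

This is also why "how big is big enough" has no motivated-reasoning-proof answer: the same observed proportion swings from inconclusive to conclusive purely as a function of how much data one is willing to accept as relevant.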

Justification asymmetry

When you combine what I wrote in the previous balance argument article with this article so far, it seems blindingly obvious that it is much easier to show a balance argument is poor than to provide a good one. Yet people demand strong evidence from those who suggest there is an imbalance, while those who argue against such suggestions do not receive the same scrutiny. Why?

One explanation is our psychological tendency to justify the status quo. People are motivated to justify and rationalise the way things are, so that existing arrangements (e.g., social, economic, and political) tend to be perceived as fair and legitimate [13]. Hence, players by default hold the naive belief that the existing state of the game is balanced and fair. When this assumption is perceived as justified and goes unchallenged, the pressure falls on those who argue against the status quo.

This assumption is nicely illustrated by statements like “just play like Maru/TY” and “Terran whiners”. Those who believe the existing state is fair and legitimate rarely counter-argue with evidence from game play. Rather, they presume the existing state is so obviously balanced that such evidence is unnecessary, and they shift the argument from situational factors (i.e., game play) to dispositional factors (i.e., players). Ironically, these are the same people who demand that the other side provide evidence from game play, and then brush it away because no evidence is good enough to warrant debate in their eyes.

Ending words

We always talk about balance, but we rarely, if ever, take a step back and think about the assumptions we make in the process. The psychological biases I mentioned are not exhaustive; there are undoubtedly other forces that push us to behave in certain ways in a balance debate. One important and worrying takeaway is that even the best possible evidence for a balance argument is likely to be insufficient to convince others. It is always psychologically easier and more convenient to justify the status quo than to argue against it.

On a more personal note, there are some reddit comments about my articles that demonstrate the biases I mentioned. Here are some examples.

Below is a comment regarding my article on the TvP proxy strategy last year:

When one has already formed a conclusion, no argument that leads to a plausible alternative is good enough. I never claim to be objective and unbiased. Am I biased in favour of Terran? Sure. Here is the bottom line: I put forward arguments and justify them with evidence, and by no means are they a perfect reflection of the complete truth. I merely adopt a positivist approach to scientific thinking. Feel free to criticise, but your dislike of the articles is not a justification.

The comment below is about my recent TvP article on the metagame stabilising:

The two quotes do not exist in the article. The closest sentences I can find are:

“I argue that it is a result of Terran not having better ways to get value out of the early Starport.”

“Terran do not have a good early game attacking/harassment option.”

It is plausible that this person’s memory of the article is biased by motivated reasoning [5, 14]. He or she polarised what I actually wrote, which effectively changes the meaning. This is a good example of how our motivation shapes our memory of stimuli toward what we want them to be. Of course, I’m open to the possibility that this person simply could not read.

I shortened this article by removing some parts, such as a discussion of perceived balance versus actual balance. I may post them in the future.

If you enjoyed this article, I’d love you to share it with one friend. You can follow me on Twitter and Facebook. If you really like my work, you can help to sustain the site by contributing via PayPal and Patreon. You can also support me and enjoy quality tea with a 15% discount at AFKTea by using the “TERRAN” code. See you in the next article!

Academic references

[1] Galinsky, A. D., Maddux, W. W., Gilin, D., & White, J. B. (2008). Why it Pays to get Inside the Head of your Opponent: The Differential Effects of Perspective Taking and Empathy in Negotiations. Psychological Science, 19(4), 378-384.

[2] Galinsky, A. D., & Moskowitz, G. B. (2000). Perspective-Taking: Decreasing Stereotype Expression, Stereotype Accessibility, and In-Group Favoritism. Journal of Personality and Social Psychology, 78(4), 708.

[3] Flynn, F. J., & Adams, G. S. (2009). Money can’t buy love: Asymmetric beliefs about gift price and feelings of appreciation. Journal of Experimental Social Psychology, 45(2), 404-409.

[4] Kumar, A., & Epley, N. (2018). Undervaluing Gratitude: Expressers Misunderstand the Consequences of Showing Appreciation. Psychological Science, 29(9), 1423-1435.

[5] Kunda, Z. (1990). The Case for Motivated Reasoning. Psychological Bulletin, 108(3), 480-498.

[6] Ditto, P. H., & Lopez, D. F. (1992). Motivated Skepticism: Use of Differential Decision Criteria for Preferred and Nonpreferred Conclusions. Journal of Personality and Social Psychology, 63(4), 568-584.

[7] Farrar, D. E., & Glauber, R. R. (1967). Multicollinearity in Regression Analysis: The Problem Revisited. The Review of Economics and Statistics, 92-107.

[8] Eyal, T., Steffel, M., & Epley, N. (2018). Perspective Mistaking: Accurately Understanding the Mind of Another Requires Getting Perspective, not Taking Perspective. Journal of Personality and Social Psychology, 114(4), 547-571.

[9] Wasserstein, R. L., & Lazar, N. A. (2016). The ASA’s Statement on p-values: Context, Process, and Purpose. The American Statistician, 70(2), 129-133.

[10] Barlett, J. E., Kotrlik, J. W., & Higgins, C. C. (2001). Organizational Research: Determining Appropriate Sample Size in Survey Research. Information Technology, Learning, and Performance Journal, 19(1), 43-50.

[11] Lord, C. G., Ross, L., & Lepper, M. R. (1979). Biased Assimilation and Attitude Polarization: The Effects of Prior Theories on Subsequently Considered Evidence. Journal of Personality and Social Psychology, 37(11), 2098-2109.

[12] Altman, D. G., & Bland, J. M. (1995). Statistics Notes: Absence of Evidence is not Evidence of Absence. BMJ, 311(7003), 485.

[13] Jost, J. T., & Hunyady, O. (2005). Antecedents and Consequences of System-Justifying Ideologies. Current Directions in Psychological Science, 14(5), 260-265.

[14] Pyszczynski, T., & Greenberg, J. (1987). Toward an Integration of Cognitive and Motivational Perspectives on Social Inference: A Biased Hypothesis-Testing Model. In Advances in Experimental Social Psychology (Vol. 20, pp. 297-340). Academic Press.