
Hello friends.

On National Video Games Day, a lovely little present landed in my Twitter mentions: a research study from 2014 which claimed that video games not only reinforce racist stereotypes, but also make people act more violently because of those stereotypes. I have previous experience in breaking down research studies on video games, so I thought I would jump back into the saddle and see how much merit there is in a study that gets its own Huffington Post article.

In the interest of transparency, please see ‘Disclaimers and Disclosures’ at the bottom of the page if you have doubts about my analysis. This includes information on my work experience and obtaining the article for yourself.

Now, let’s begin!

Introduction

One line from the abstract of the paper immediately stood out to me:

This is especially true in video games, in which Black male characters are virtually always violent.

When I examined this claim in the introduction, I came across this:

This stereotype may be more prevalent in video games than in any other form of media because being a Black character in a video game is almost synonymous with being a violent character (e.g., Burgess, Dill, Stermer, Burgess, & Brown, 2011)

The study that is referenced is a content analysis of video game magazine covers from January 2006, and an analysis of 125 video game covers from 2005. I would like to think that both the AAA and indie game scenes have flourished in terms of minority representation since this data was collected. I also find the citation slightly disingenuous: the referenced study was published in 2011 using data that was already one console generation out of date, and that data is two console generations out of date by the time the current study cites it.

The authors of the current study also refer to a previous study of theirs where they found that playing as a black avatar ‘increases accessibility to violent constructs’. This sounds suspiciously unclear and pretentious to me, so I may check this out in another article.

The theoretical basis provided in the literature review isn’t too radical. It presents previous research which found that participants associate violence more strongly with black people than with white people. The authors then argue through behavioural conditioning and schema theory (a ‘schema’ is our cluster of thoughts surrounding a topic) that if we see black people committing violent acts, it will lend support to our previous notions that black people are violent. The main hypothesis that is derived from this reasoning can be summarised as follows: ‘as you play a black avatar, you are remembering violent black people and will thus be more violent yourself’*. Because the theoretical structure is built around violent black people, the overall hypothesis is that this should hold for black avatars, but not for white avatars.

*Please note that ‘you are remembering violent black people’ is not a slur on my part; this reasoning is made in the paper using the example of Mike Tyson.

Methodology

There are two miniature studies contained in this paper. To avoid repetition, I will state up front that there is one major flaw, and perhaps a social injustice, running through both: every participant is white. I cannot state clearly enough that nowhere in the introduction is there a sound basis for leaving out black participants. The authors explain why black avatars may carry prejudice with them, and their basis for conducting the study is as quoted:

“No previous research, however, has examined the effects of avatar race in violent video games on aggression”.

It does not state ‘…video games on aggression in white participants’. I find this wholly unfair and I think the inclusion of black participants would have made for an interesting read into activating the ‘black people = violent’ schema for black participants.

Experiment 1

This experiment aimed to test the hypothesis that playing Saints Row 2 as a black avatar increased negative attitudes towards black people. The sample was 60% white males, although no gender differences were found in this experiment. White avatars were given ‘short, conservative’ hairstyles, while black avatars had cornrows and spoke with an inner city dialect to remain ‘stereotype-consistent’. Participants either had a violent objective (breaking out of a prison and killing guards) or a peaceful objective (finding landmarks on the map).

Racist attitudes were measured both implicitly (unknowingly) and explicitly (knowingly). Implicit racist attitudes were measured using reaction time tests to pictures of black and white faces paired with good words and bad words. Faster reaction times to positive words while white faces were shown and slower times to positive words with black faces are taken as indicative of racist beliefs. Reaction-based implicit attitude tests should always be approached with caution as they are vulnerable to individual differences, such as eyesight problems or naturally slow reactions. The researchers running this test did not specify how they controlled for this in participants and it doesn’t look like they even assessed it beforehand.
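To illustrate the individual-differences worry, here is a minimal Python sketch of one common way to stop a participant's general speed from leaking into a reaction-time score: divide the difference between the two pairing conditions by that participant's own spread of reaction times. This is only an illustration. The function name `d_score` and the reaction times are made up, and nothing here is taken from the paper, which (as noted above) does not say how, or whether, this was handled.

```python
from statistics import mean, stdev

def d_score(compatible_ms, incompatible_ms):
    """Standardised difference between two blocks of reaction times (ms).

    The mean difference is divided by the standard deviation of ALL of
    this participant's trials, so a participant who is simply slow
    overall does not get a larger score for that reason alone.
    """
    pooled_sd = stdev(list(compatible_ms) + list(incompatible_ms))
    return (mean(incompatible_ms) - mean(compatible_ms)) / pooled_sd

# Hypothetical participant (illustrative numbers only).
fast_compatible = [600, 650, 700]
fast_incompatible = [700, 750, 800]

# The same pattern from a participant who is twice as slow on every trial.
slow_compatible = [2 * rt for rt in fast_compatible]
slow_incompatible = [2 * rt for rt in fast_incompatible]

raw_fast = mean(fast_incompatible) - mean(fast_compatible)
raw_slow = mean(slow_incompatible) - mean(slow_compatible)
print("raw differences:", raw_fast, raw_slow)
print("standardised:", d_score(fast_compatible, fast_incompatible),
      d_score(slow_compatible, slow_incompatible))
```

The raw difference doubles for the slower participant even though the pattern is identical, while the standardised score stays the same. Standardising does nothing for problems like poor eyesight on particular stimuli, so it is a partial fix at best.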

The explicit racism test, however, is what I have more issues with. Racist attitudes were measured using the eight-question ‘Symbolic Racism 2000’ scale. Questions from this scale include ‘It’s really a matter of some people not trying hard enough’. The Cronbach’s alpha for the scale in this study is recorded as .66.

Cronbach’s alpha is a statistic which assesses the reliability of a questionnaire. In this context, ‘reliability’ refers to internal consistency: how closely the answers to the different questions agree with one another, as they should if they are all measuring the same thing. For example, let’s say a questionnaire is made to assess happiness. If the questionnaire is filled out, a Cronbach’s alpha analysis is conducted and it comes out as .90, that means that the questions hang together very well as a measure of one thing. (Whether that one thing really is happiness is a separate question of validity.)

In the world of Cronbach’s alpha, .66 is not a good score for a questionnaire; .70 is typically the lowest score at which a questionnaire is deemed acceptable. This means that there is a lot of ‘noise’ in the questionnaire and it is not measuring racist attitudes as consistently as it wants to. From looking at the example questions provided, it seems that there are a lot of questions surrounding trying and putting in effort. This extra ‘noise’ could perhaps be an additional variable such as belief in meritocracy.
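For anyone who wants to see what goes into that number, here is a minimal Python sketch of the standard Cronbach's alpha formula: the number of questions k, the variance of each question, and the variance of respondents' total scores. The function name and the tiny response tables are invented for illustration; they are not the study's data.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a table of answers.

    responses: one row per respondent, one column per question.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(responses[0])
    item_columns = list(zip(*responses))
    item_variance_sum = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

# Made-up example 1: every question gets the same answer from each person,
# so the questions agree perfectly.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]

# Made-up example 2: two questions that only partly agree.
noisy = [[1, 2], [2, 1], [3, 3]]

alpha_consistent = cronbach_alpha(consistent)
alpha_noisy = cronbach_alpha(noisy)
print(round(alpha_consistent, 2), round(alpha_noisy, 2))
```

The perfectly consistent table gives an alpha of 1, while the partly disagreeing one drops to about .67, which happens to sit right beside the .66 reported in the paper.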

The results for this section are where I wanted to reach for a bottle of vodka. To quote directly from the article itself:

More specifically, in the violent gameplay condition, those who played as a Black avatar had stronger explicit negative attitudes toward Blacks than did those who played as a White avatar (M = 17.8, SD = 3.3) t(122) = 1.95, p < .054, d = 0.35.

This, ladies and gentlemen, is what statistical misconduct looks like.

In statistics, you will see the letter ‘p’ a lot. ‘p’ really is no big secret; it simply stands for ‘probability’. In this context, it refers to the probability of seeing results at least as extreme as ours if there were really no effect at all, i.e. if chance alone were at work. The ‘p’ value that we should stick to in statistics has been subject to decades and decades of debate, but the convention that came out of that debate has remained consistent across decades of research: p must be below 0.05 to be statistically significant. Quite simply, this means that we accept no more than a 5% risk of declaring an effect when there is none. You will sometimes see academics grasping at straws where they have a value above 0.05. In these instances, they will say that their value is ‘nearing significance’ or ‘approaching significance’.

These authors are claiming that playing as a black avatar leads to significantly more explicit negative attitudes towards black people when it is not significant. Being between 0.05 and 0.054 is not statistically significant. This is quite simply fraudulence.
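The reported figures can be sanity-checked with nothing but the Python standard library. The sketch below takes t = 1.95 with 122 degrees of freedom from the quote above and integrates the Student's t density numerically to get a p-value. Two things are my assumptions rather than anything the paper states: that the test was two-tailed, and that the two groups were of equal size (needed for the quick effect-size conversion d = 2t / sqrt(df)). The function names are my own.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: 1 minus twice the area under the density
    between 0 and |t|, with the area found by Simpson's rule
    (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 1 - 2 * area

# Values quoted from the paper: t(122) = 1.95.
p_value = two_tailed_p(1.95, 122)

# Assumption: equal group sizes, so d is roughly 2t / sqrt(df).
cohens_d = 2 * 1.95 / math.sqrt(122)

print(f"p = {p_value:.4f}, d = {cohens_d:.2f}")
```

Under those assumptions the p-value lands between .05 and .054, which matches the ‘p < .054’ in the quote and sits on the wrong side of the .05 line. The effect size also comes out at about 0.35, so the reported numbers are at least consistent with each other.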

For the implicit attitude test, those playing with black avatars associated black faces more with negative words on the reaction test. I can’t find much fault with the analysis of this, but I am confident that the p value for this fell between .04 and .05 as the authors have previously specified the p value that their significance level falls under (e.g. p < .04 for a value between .03 and .04).

In the discussion, the authors boast that they found support for increased negative implicit and explicit attitudes after playing as black avatars. The ‘explicit attitudes’ part is a bald-faced lie and the ‘implicit’ part is on the cusp of significance.

Experiment 2

This experiment follows some similar protocols to Experiment 1. However, they hypothesise that participants themselves will behave more violently after playing as a violent black avatar. The sample for this study is a new sample and now consists of 65% white females, 35% white males.

The video games have now been changed so that all groups play a violent game; these games are WWE Smackdown vs RAW and Fight Night Round 4. The reaction time test for implicit racist attitudes is still being used, but has been changed so that races are associated with weapons/non-weapons pictures rather than good/bad words. The explicit violence assessment, however, is beyond absurd.

In a nutshell, violence and aggression were measured…by having participants measure out a portion of hot sauce for someone who didn’t like spicy food. This test is host to an absolute plethora of individual differences. How much sauce does the participant themselves typically take? What is their spice threshold like? Does it account for someone who says they take ‘a little’ sauce and actually smothers it on? Of course, none of this was assessed or controlled for in advance. It is also worth noting that this method does not include a way to measure the volume of hot sauce given at baseline (before the experiment). We have no way of knowing if people would have given the same quantities before having any racist beliefs they hold primed.

It goes without saying that actual violence against someone cannot be part of research due to ethical considerations. However, there are so many better ways to measure this (such as measuring the force of punching a bag before and after a game) that I’m beyond baffled.

Playing as a black avatar led to more people associating black faces with weapons. I can’t really argue that this is a priming effect of having just played a black character with weapons as neither game features weapons. I can’t really fault this.

The hot sauce test was significant and I can’t find much wrong with the statistics beyond laughing at the methodology.

A mediation analysis was then conducted to understand if there was a relationship between playing the black avatar, having more implicitly racist attitudes and ‘acting aggressively’. This mediation analysis was interesting, albeit suspicious.

The mediation analysis was run in the following way: black avatars x implicit attitudes x ‘acting aggressively’. This interaction was indeed significant, but then something interesting was reported. The avatar variable was ‘freed’ so that all avatars were included, meaning that the following interaction was run: playing violent games x implicit attitudes x ‘acting aggressively’. When both races were allowed into the analysis, the influence of implicit attitudes decreased.

This actually downplays the importance of racist attitudes in ‘acting violently’ which I credit them for keeping in. However, they downplay this finding immediately and it is not included in the abstract or discussion.
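For readers who have not met mediation analysis before, here is a rough Python sketch of the textbook regression version of the idea: how much of the effect of X (avatar) on Y (aggression) travels through a mediator M (implicit attitudes). I do not know exactly which procedure the authors used, so this is a generic illustration and not a reconstruction of their analysis. All function names and numbers are invented toy data.

```python
from statistics import mean

def cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def simple_slope(x, y):
    """OLS slope of y regressed on x alone."""
    return cov(x, y) / cov(x, x)

def two_predictor_slopes(x, m, y):
    """OLS slopes (for x, for m) when y is regressed on x and m together."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    return (sxy * smm - smy * sxm) / det, (smy * sxx - sxy * sxm) / det

# Toy data, invented for illustration only.
avatar = [0, 0, 0, 0, 1, 1, 1, 1]                    # 0 = white, 1 = black
implicit = [0.1, 0.3, 0.2, 0.4, 0.5, 0.7, 0.6, 0.8]  # mediator score
hot_sauce = [10, 14, 11, 15, 16, 21, 18, 22]         # grams given

a = simple_slope(avatar, implicit)            # avatar -> mediator
total = simple_slope(avatar, hot_sauce)       # avatar -> outcome, overall
direct, b = two_predictor_slopes(avatar, implicit, hot_sauce)
indirect = a * b                              # the part routed via the mediator

print(f"total = {total:.2f}, direct = {direct:.2f}, indirect = {indirect:.2f}")
```

In this kind of model the total effect always splits exactly into the direct part plus the indirect part. The point to notice is that the size of the indirect path depends heavily on which participants are included when the slopes are estimated.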

Discussion

The discussion section of this article quite simply makes me uncomfortable. It states that this finding is harmful to black people as it reinforces the idea that black people are violent. It also talks about how it is a risk for the general population as access to black characters may prime racist beliefs and violent behaviour, something which again makes me uncomfortable. The ‘limitations and future research’ section is also a joke. The authors do not acknowledge how weak their Cronbach’s alpha was, or how future research could attempt to replicate the findings with improved methodology for explicit attitudes. Instead, the section is simply ‘We need to understand racists more, we need to assess knowledge of black history beforehand, and we need to see if people will give black people more hot sauce than white people as a way to punish them’. Replicating this study with black people included in the sample is only given a tiny acknowledgement at the end.

My Own Conclusion

This is a study that disappoints me beyond my daily line of work. This is a study that blatantly lies about statistical significance, involves questionable methodology and actively excludes the sampling population which they refer to as ‘the guilty innocent’. This field of research and these researchers in particular can absolutely do better if they truly want to make the world a better and less racist place to live in.

Disclaimers and Disclosures

I work full-time as a Researcher/Statistician and am studying part-time for a PhD in Epidemiology (with a particular focus on the interaction between adolescent physical and mental health). I am currently employed to audit the mental health system where I live and I am the sole statistician working on the project. I am in the process of analysing the data and writing the results section of the government report. TL;DR: I know my shit.

I feel it is appropriate to make a note here for any statisticians who may read this. Although I talked about the importance of p < .05, I am well aware that some tests are constructed so that p > .05 is preferable, typically tests of assumptions for further data analysis. I was pointing out the p > .05 finding in a two-way ANOVA, a test where p < .05 is preferred.

I do not like keeping the original article away from everyone, but I am afraid that I cannot give you a copy. My copy includes information about my employer that I would not like to be made public. Sorry!

Thank you very much for reading and happy gaming! ♥

Note

I am the author of an ongoing series titled ‘The Psychology of Video Games’, a series which aims to bridge the gap between gaming and academia. I do not profit from this series and all work will remain free forever, but if you like the idea of keeping me fed and caffeinated then I would very much appreciate it!