Source: Adrian Askew via Wikimedia Commons

Over the years, researchers have documented differences between men and women in performance in domains like math, as well as racial differences in performance on a variety of tasks. The core question raised by these observations is whether they reflect gender- or race-based differences in the ability to perform the task, or whether they reflect some other factor (like a difference in the opportunity to learn the task or a difference in a motivational factor that affects task performance).

One motivational factor that has been studied extensively is stereotype threat. The idea behind stereotype threat is that if you are a member of a group for which there is a negative stereotype related to a task, you may underperform on tests because of your awareness of the stereotype. For example, there are demonstrations in the lab that women underperform on tests of their math ability relative to men in situations in which gender is made salient.

Stereotype threat has been studied largely in laboratory situations. Reviews of the literature that look across studies have been mixed in whether they find compelling evidence for stereotype threat effects, though many meta-analyses find at least a small stereotype threat effect.

Another way to assess stereotype threat is to look at real-world data sets. Many years ago, for example, we did such an analysis for choking under pressure in basketball, finding that professional ballplayers shoot worse than their season average when they have the opportunity to tie the game with a free throw toward the end of the game.

A paper by Tom Stafford in the March 2018 issue of Psychological Science analyzed data from over a half million games of tournament chess.

Chess is an interesting domain to study for two reasons. First, most chess players are men, and the best chess players in the world are overwhelmingly men. So, this sets up conditions under which stereotype threat is possible. Second, tournament players are given a numerical measure of their performance under the Elo system, so there is a numerical way to compare the strength of the players in a game.

To assess the baseline likelihood that a player would win a tournament game, the author analyzed games played only by men. He looked at how likely a player was to win a game based both on whether they were playing the white or the black pieces (there is an advantage to playing white because that player moves first) and on the difference in Elo scores of the players. As you might expect, as the difference in Elo scores of the players goes up, there is a greater likelihood that the player with the higher score will win.
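To give a feel for how a difference in Elo scores translates into an expected result, here is a minimal sketch of the standard Elo expected-score formula. This is only an illustration: the paper's baseline was estimated from the men-only games themselves rather than from this formula, the formula ignores the advantage of playing white, and the function name is my own.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for player A against player B.

    Uses the standard Elo logistic formula, where a win counts as 1,
    a draw as 0.5, and a loss as 0. This is an illustrative stand-in,
    not the empirical baseline used in the paper.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Two equally rated players are each expected to score half a point.
print(elo_expected_score(1500, 1500))

# A 400-point favorite is expected to score about 0.91 points per game.
print(round(elo_expected_score(1900, 1500), 2))
```

The expected scores of the two players always sum to 1, so a larger gap in Elo scores pushes the stronger player's expected score toward 1 and the weaker player's toward 0.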

Next, the author looked at games between a man and a woman. He compared the percentage of games won by women, based on the difference in Elo scores between the players, to the expected percentage from the baseline I described earlier. If there is a stereotype threat effect in chess, then you might expect that women would win fewer games against men than the baseline would predict. This effect might be strongest when the difference between the man's score and the woman's score is largest.

In fact, the opposite is true. On average, women win more games than the baseline when playing against men. If anything, then, there isn’t a stereotype threat effect. There might even be a small stereotype lift.
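The logic of comparing observed results to a baseline can be sketched in a few lines. This is a simplified illustration under my own assumptions, not the paper's actual analysis: games are grouped into 100-point bins of Elo difference, the color of the pieces is ignored, and all function names and the toy numbers are hypothetical.

```python
from collections import defaultdict


def bin_of(diff, width=100):
    """Assign an Elo difference to a bin (assumed width: 100 points)."""
    return int(diff // width)


def baseline_by_bin(games, width=100):
    """Average score per Elo-difference bin.

    `games` is a list of (elo_difference, score) pairs from the
    reference games, with score 1 for a win, 0.5 for a draw, 0 for a loss.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for diff, score in games:
        entry = totals[bin_of(diff, width)]
        entry[0] += score
        entry[1] += 1
    return {b: total / count for b, (total, count) in totals.items()}


def mean_gap(games, baseline, width=100):
    """Mean of (observed score - baseline score) over the given games.

    Games whose bin has no baseline estimate are skipped.
    Returns None if no game can be compared.
    """
    gaps = [
        score - baseline[bin_of(diff, width)]
        for diff, score in games
        if bin_of(diff, width) in baseline
    ]
    return sum(gaps) / len(gaps) if gaps else None


# Toy, made-up numbers purely to show the mechanics.
reference_games = [(50, 1), (50, 0), (-50, 0), (-50, 1)]
mixed_games = [(50, 1), (-50, 1)]
print(mean_gap(mixed_games, baseline_by_bin(reference_games)))
```

A negative gap would mean the players of interest score below the baseline, which is what stereotype threat would predict. A positive gap corresponds to the pattern reported here, in which women win more often than the baseline predicts.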

Obviously, there are many reasons why a stereotype threat effect was not found for chess. Expert chess players who play in tournaments may have so much practice playing games that motivational changes induced by who they play may not have much of an impact. In addition, since only about 12% of the rated players are women, the women playing tournament chess have a lot of practice playing against men, so the effects of stereotypes may have weakened. And, of course, it is possible that stereotype threat effects are observable in controlled laboratory conditions, but not in real-world situations.