University of Washington social psychologist Anthony Greenwald, one of the co-creators of the implicit association test. Photo: Harvard University News Office

At the moment, you may have heard, the field of psychology is grappling with a so-called “replication crisis.” That is, certain findings that everyone had assumed to be true can’t be replicated in follow-up experiments, suggesting the original findings were the result not of actual psychological phenomena, but of various flawed methodologies and biases that have crept into the scientific process.

One of the major contributing factors to the replication crisis, which is centered mostly on social psychology, is human nature. Humans, being humans, do not like hearing that ideas they’ve worked on for a long time might have to get tossed in the bin, or at the very least revised significantly. That’s why some researchers — though by no means all of them — have responded to good-faith critiques of their work by attempting to derail the conversation, calling their critics crazy or mean or attributing to them dark ulterior motives. The researchers who attempt such derailings tend to be established, well-respected ones who have benefited from the old regime — the regime that led the field into its current, precarious situation, and which is now threatened by a growing reform movement.

The implicit association test, co-created by Harvard University psychology chair Mahzarin Banaji and University of Washington researcher Anthony Greenwald, is an excellent example. Banaji and Greenwald claim that the IAT, a brief exercise in which one sits down at a computer and responds to various stimuli, measures unconscious bias and therefore predicts real-world behavior. If you score highly on a so-called black-white IAT, for example, that suggests you will act in a more biased manner toward a black person than a white person. Many social psychologists view the IAT, which you can take on Harvard University’s website, as a revolutionary achievement, and in the 20 years since its introduction it has become both the focal point of an entire subfield of research and a mainstay of diversity trainings all over the country. That’s partly because Banaji, Greenwald, and the test’s other proponents have made a series of outsize claims about its importance for fighting racism and inequality.

The problem, as I showed in a lengthy rundown of the many, many problems with the test published this past January, is that there’s very little evidence to support the claim that the IAT meaningfully predicts anything. In fact, the test is riddled with statistical problems — problems severe enough that it’s fair to ask whether it is effectively “misdiagnosing” the millions of people who have taken it, the vast majority of whom are likely unaware of its very serious shortcomings. There’s now solid research published in a top journal strongly suggesting the test cannot even meaningfully predict individual behavior. And if the test can’t predict individual behavior, it’s unclear exactly what it does do or why it should be the center of so many conversations and programs geared at fighting racism.

One striking thing about the process of reporting that article was the extent to which Banaji tried to smear her critics, suggesting to me in an email she believed that critiques of the test could be explained by the fact that the IAT “scares people who say things like ‘Look, the water fountains are desegregated, what’s your problem.’” She also accused the test’s critics of having a “pathological focus” on black-white race relations and the black-white IAT for reasons that “will need to be dealt with by them in the presence of their psychotherapists or church leaders.”

This is the definition of a derailing tactic — shift the focus from critiques of the IAT itself, some of which in this case appeared in a flagship social-psych journal, to the ostensible moral and psychological failings of the critiquers.

A couple days ago, Quartz published its own article on the IAT, by Olivia Goldhill. The article covers similar ground and comes to similar conclusions as mine, and adds some new insights and analysis: The headline, “The world is relying on a flawed psychological test to fight racism,” captures things pithily. Goldhill’s piece clearly shows that Banaji and Greenwald are still trying to deflect and derail rather than fully engage with the process of evaluating their test:

It’s highly plausible that the scientists who created the IAT, and now ardently defend it, believe their work will change the world for the better. Banaji sent me an email from a former student that compared her to Ta-Nehisi Coates, Bryan Stevenson, and Michelle Alexander “in elucidating the corrosive and terrifying vestiges of white supremacy in America.”

Greenwald explicitly discouraged me from writing this article. “Debates about scientific interpretation belong in scientific journals, not popular press,” he wrote. Banaji, Greenwald, and Nosek all declined to talk on the phone about their work, but answered most of my questions by email.

The idea that journalists shouldn’t write about scientific controversies would have been highly questionable even before the replication crisis exploded onto the scene, but it’s hard to fathom why anyone would take this argument seriously in 2017. After all, the replication crisis was spurred in part by opaque research and peer-review processes, by people not sharing data, by social and professional structures that sometimes had the effect of short-circuiting real debate about the merits of ideas — particularly popular ones of the sort that often get glowing write-ups in, well, the “popular press” (Greenwald, of course, doesn’t appear to have any problems with positive coverage of the IAT). Journalism, when it’s done well, can serve as a useful check on all these tendencies. To be fair, Greenwald isn’t the only one who thinks that science should only be critiqued by those very close to a given controversy — this is an idea that seems to sometimes pop up among defenders of the old, deeply flawed social-psychological ways — but that isn’t how things should work.

Even more surprising, though, is an email Greenwald wrote to Goldhill which read, “The IAT can be used to select people who would be less likely than others to engage in discriminatory behavior.” This might come across as a fairly banal defense of his research project, but it isn’t: It’s the continuation of a very slippery pattern I identified in my article.

As I noted, in their 2013 best seller Blindspot, which helped the IAT carve out an even bigger place in the public imagination than it had already achieved, Banaji and Greenwald wrote that the test “predicts discriminatory behavior even among research participants who earnestly (and, we believe, honestly) espouse egalitarian beliefs,” and “has been shown, reliably and repeatedly” to do so. In fact, this is a “clearly … established” “empirical truth.” But then, just two years later, they argued in an academic paper unlikely to be read by the general public that due to the test’s methodological weaknesses, it is “problematic to use [it] to classify persons as likely to engage in discrimination,” and “attempts to diagnostically use such measures for individuals risk undesirably high rates of erroneous classifications.”

I referred to this as a “Schrödinger’s test” situation in which the test both does and doesn’t predict behavior at the same time. When the test’s creators are addressing lay audiences unfamiliar with its problems, it does predict behavior; when they’re addressing academic audiences familiar with what is now a years-long controversy, they acknowledge that it doesn’t. Greenwald’s quote to Goldhill just marks the latest example.

In other words:

Banaji and Greenwald in 2013, to the public: Our test has been shown, reliably and repeatedly, to predict behavior.

Banaji and Greenwald in 2015, to academics: Our test doesn’t predict behavior.

Greenwald in 2017, to the public: Our test predicts behavior.

So, once more: I disagree with Greenwald. Society desperately needs more open scrutiny of scientific claims, not less, whether in scientific journals, the media, or anywhere else. Especially when it comes to claims that seem to change every two years.