You see it all the time in studies. "We controlled for..." And then the list starts. The longer the better. Income. Age. Race. Religion. Height. Hair color. Sexual preference. Crossfit attendance. Love of parents. Coke or Pepsi. The more things you can control for, the stronger your study is — or, at least, the stronger your study seems. Controls give the feeling of specificity, of precision. But sometimes, you can control for too much. Sometimes you end up controlling for the thing you're trying to measure.

I was thinking about this while reading an interesting piece by Scott Alexander summing up the evidence for racial bias in law enforcement. He rips through dozens of studies and literature reviews and comes to the conclusion that "there seems to be a strong racial bias in capital punishment and a moderate racial bias in sentence length and decision to jail...There seems to be little or no racial bias in arrests for serious violent crime, police shootings in most jurisdictions, prosecutions, or convictions."

It's a fascinating review of the evidence, and I urge you to read the whole thing. But what I noticed, picking through the citations, was how much the researchers were controlling for. For instance, the first paper cited, which looked at race and traffic stops, is a festival of controls. Indeed, its primary contribution to the literature is controls. The authors write that they're trying to correct for the fact that "statistical analyses of poststop outcomes often do not control for relevant legal and extra legal factors that may explain racial differences."

And so the controls begin. Income. Prior record. Neighborhood. Nature of the stop. The controls cascade through the rest of the studies, too. There are efforts to control for community ties, family structure, type of crime, and even "dangerousness." The papers brag about their controls. They dismiss past research because it had too few controls.

The problem with controls

Don't get me wrong: Statistical controls are great! Except when they're not.

The problem with controls is that it's often hard to tell the difference between a variable that's obscuring the thing you're studying and a variable that is the thing you're studying.

An example is research around the gender wage gap, which tries to control for so many things that it ends up controlling for the thing it's trying to measure. As my colleague Matt Yglesias wrote:

The commonly cited statistic that American women suffer from a 23 percent wage gap through which they make just 77 cents for every dollar a man earns is much too simplistic. On the other hand, the frequently heard conservative counterargument that we should subject this raw wage gap to a massive list of statistical controls until it nearly vanishes is an enormous oversimplification in the opposite direction. After all, for many purposes gender is itself a standard demographic control to add to studies — and when you control for gender the wage gap disappears entirely!

"The question to ask about the various statistical controls that can be applied to shrink the gender gap is what are they actually telling us," he continued. "The answer, I think, is that it's telling how the wage gap works."

Take hours worked, which is a standard control in some of the more sophisticated wage gap studies. Women tend to work fewer hours than men. If you control for hours worked, then some of the gender wage gap vanishes. As Yglesias wrote, it's "silly to act like this is just some crazy coincidence. Women work shorter hours because as a society we hold women to a higher standard of housekeeping, and because they tend to be assigned the bulk of childcare responsibilities."

Controlling for hours worked, in other words, is at least partly controlling for how gender works in our society. It's controlling for the thing that you're trying to isolate.

Researchers know this. There's just nothing they can do about it.
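To see the mechanics, here is a minimal simulation sketch. Everything in it is invented for illustration: the group labels, the 8-hour difference in weekly hours, and the $25 hourly rate are assumptions, not figures from any study mentioned here. In this made-up world, group membership affects pay only by way of hours worked. The raw comparison picks up the full gap, while the comparison that "controls for hours" reports almost nothing, even though the group is the reason the hours differ.

```python
import random
from statistics import fmean

def slope(x, y):
    """Least-squares slope of y on a single predictor x (with intercept)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """The part of y that a straight-line fit on x does not explain."""
    b = slope(x, y)
    mx, my = fmean(x), fmean(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

def simulate(n=20000, seed=0):
    """Hypothetical world: group -> hours -> pay, with no other path."""
    rng = random.Random(seed)
    group, hours, pay = [], [], []
    for _ in range(n):
        g = rng.randint(0, 1)               # 1 = hypothetical group that works fewer hours
        h = 40 - 8 * g + rng.gauss(0, 4)    # assumed: group shifts weekly hours by 8
        p = 25 * h + rng.gauss(0, 50)       # assumed: weekly pay depends on hours only
        group.append(g)
        hours.append(h)
        pay.append(p)
    return group, hours, pay

group, hours, pay = simulate()

# Raw gap: difference in average weekly pay between the two groups.
raw_gap = slope(group, pay)

# "Controlled" gap: the group coefficient once hours is held fixed.
# Computed by removing the hours trend from both variables first,
# which matches the coefficient from regressing pay on group and hours.
controlled_gap = slope(residuals(hours, group), residuals(hours, pay))

print(f"raw weekly pay gap:        {raw_gap:8.1f}")
print(f"gap controlling for hours: {controlled_gap:8.1f}")
```

By construction, the raw gap comes out near 200 dollars a week (8 hours times $25) and the controlled gap lands near zero. The control has not shown that the group doesn't matter. It has absorbed the channel through which the group matters.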

Some of the crime studies are vulnerable to a similar critique. Income isn't independent from race. Nor is neighborhood. Nor is type of crime. The list goes on.

The traffic-stop study cited in the Alexander piece, which is coauthored by Robin Shepard Engel and Jennifer Calnon, tries to control for different kinds of traffic stops. People pulled over for speeding tickets are searched much less often than people pulled over for non-speeding violations like broken taillights. The researchers find that minority drivers are much likelier than white drivers to be pulled over for those non-speeding offenses.

The result is that when you control for type of stop, some of the effect of race disappears. But by controlling for type of stop, you might actually just be controlling for the effect of race, which is the thing you're trying to measure in the first place. The researchers know this. "The findings do not address the question of why minorities are more likely to be stopped for nonspeeding offenses," they write. The italics, by the way, are in the original paper. The researchers are frustrated, too.

Or take the Department of Justice study that controls for type of drug use, frequency of drug use, and location of drug use. Once you account for all that, black people go from being four times as likely as white people to get arrested for drug crimes to twice as likely.

Frequency of drug use deserves its control status here. Someone who uses drugs weekly is likelier to get arrested than someone who uses them monthly. But the other two controls don't look like controls at all.

For instance, the study finds that "large metropolitan areas are where 60% of blacks live but where 41% of whites live. Moreover, large metropolitan areas are where 63% of black drug use occurs compared to 45% of white drug use." And large metropolitan areas are policed more heavily.

Similarly, the study finds that "among blacks who reported using illicit drugs during the year, 20% said the drug was heroin or cocaine, the type with the greatest risk of arrest. For white drug users, the figure was lower — 16%. The type of drug with the lowest risk of arrest — psychotherapeutics/hallucinogens — had a high use rate among whites and a low use rate among blacks."

In other words, we police black communities more heavily and we are more aggressive about enforcing drug laws against drugs that black people use more frequently. Controlling for those facts isn't helping you isolate the role racial discrimination plays in drug enforcement. Those facts are the role that racial discrimination plays in drug enforcement.

My sense is that's true for a lot of the controls in these studies — they're telling us how racial discrimination in law enforcement actually works. And, like racism everywhere, the answer is complicated. It mingles with money and class, and where people live and labor and worship. It influences what kinds of crimes society has decided to fear. It echoes in what kinds of cars make police officers consider a search.

Society, controlled

Imagine applying these controls to society itself. We still have race, but people of all races have the same amount of money, and they live in the same kinds of neighborhoods, and they do the same kinds of drugs, and they even drive the same kinds of cars. That society would be a lot less racist. But part of the reason we're so far from that society is racism. Discrimination perpetuates itself.

In some ways, what's amazing about many of these studies is that they show a racial effect even after controlling for so much of racism's work. They show that racism exists even in our control society — the one with equality of income, and education, and neighborhood, and car choices. The one where we've wiped out most every difference but pigment. The one where we've left ourselves no excuses for our prejudice. It is remarkable how much discrimination can survive.

One of the fascinating documents that emerged during Ferguson was a 1992 op-ed written by New Jersey Sen. Cory Booker when he was a master's student at Stanford. "In the jewelry store, they lock the case when I walk in," he wrote. "In the shoe store, they help the white man who walked in after me. In the shopping mall they follow me — the Stanford shopping mall."

Booker, at this point, was already a Stanford graduate. He lived in a rich community. He had won a Rhodes scholarship. He was, by any measure, a victor in America's class war. But still, walking the streets of Palo Alto one night, "the police car slowed down while passing me." Booker had mastered every variable we might control for, every pursuit where race might have held him back. But it wasn't enough. It's easy to imagine a terribly racist society where Booker gets a pass. But we're not even there.

It's testimonies like Booker's that make me wonder whether these controls tell us that racism in law enforcement isn't as bad as we think — or whether its persistence even after all these controls shows that it's much, much worse.

Some final thoughts, courtesy of Harold Pollack

To go back to the original question — how much the criminal justice system discriminates based on race — the answer is I don't know. I haven't studied the topic nearly thoroughly enough to come up with a plausible answer. But I'm skeptical of any answer that works through controlling for variables that are, themselves, affected by race, like income, type of stop, neighborhood, and even prior record.

Harold Pollack, co-director of the University of Chicago's Crime Lab, has studied the topic enough to have an informed answer. So I e-mailed him and asked for his thoughts. I'll end with his response, which was, like everything Pollack writes, wise, humane, and deeply informed by the data: