Sunday’s New York Times Magazine contained a long feature by Clive Thompson on the history of women in tech and the industry’s more recent efforts to grapple with software engineering’s gender gap.

One statistic in the piece jumped out at me:

In a 2016 experiment conducted by the tech recruiting firm Speak With a Geek, 5,000 résumés with identical information were submitted to firms. When identifying details were removed from the résumés, 54 percent of the women received interview offers; when gendered names and other biographical information were given, only 5 percent of them did.

That’s an astounding statistic, so astounding that it seemed literally incredible. Research into “blinded” résumés has sometimes found that blinding reduces bias, but never by a margin like that. Yes, blinded auditions for orchestras saw the share of women in the top orchestras grow from 5 percent to 25 percent, but that was over the course of 40 years. This claimed effect was more pronounced than that, and it happened overnight.

If these statistics were right, a relatively simple change to job applications could close the whole gender gap in tech, if not reverse it. I dug into it and found that the statistic was fishy: It was based on a shadowy study that no one seems to have actually seen.

In 2016, Speak With a Geek, a tech recruiting company that has since closed its doors, announced that it had conducted an experiment with blind auditions. CNET was the first to report on the experiment (everyone else who reported on it cited CNET, rather than seeing the original experiment):

On two different occasions, Speak With a Geek presented the same 5,000 candidates to the same group of employers. The first time around, details like names, experience and background were provided. Five percent selected for interviews were women. You can guess what happened next, right? When identifying details were suppressed, that figure jumped to 54 percent.

This claimed effect is much larger than the largest disparities ever found in reputable studies about the effects of résumé blinding. Speak With a Geek never published this experiment anywhere — including on its own website or social media. People who asked to see the results at the time were ignored. And some things about the study are confusing even aside from the shocking size of the effect: Who would select any candidate for an interview when “names, experience and background” weren’t provided? Companies were shown the exact same candidates twice? These are questions we could answer by looking at the methodology in more detail, if it were available. But so far, it doesn’t seem to exist.

Thompson, whose article was excerpted from his forthcoming book, Coders: The Making of a New Tribe and the Remaking of the World, confirmed that he didn’t see the original study and is now trying to find it. I, meanwhile, looked into other research on the effectiveness of résumé blinding to combat disparities in tech, and became convinced that there’s a bigger problem than one fishy statistic in one article. The hunger to fix a real problem in the field has fed a cottage industry of companies posting dubious statistics that are shared uncritically by news organizations.

Recruiting companies want to establish that their approach leads to more hiring of women and other underrepresented groups, something that companies want and often struggle to achieve. (I used to work at a technical recruiting company that was probably a competitor of the two companies I’m writing about here, though I’d never heard of either before I started digging into this question.)

To do that, these companies release studies that are pretty much just press releases, and which tout results out of line with the academic research literature in the area. It’s hard to tell what they’re doing wrong because the studies themselves are hard to find. But they get quoted — and cited — in mainstream publications, and readers get the impression there are nigh-magical “quick fixes” to problems like diversity in tech.

Where these implausible statistics come from — and how they spread

The implausible statistic from Speak With a Geek was first published on CNET, then shared by CBS News, Bustle, Melinda Gates, and eventually the Times. Along the way, people asked questions, including asking to see the study. They were ignored. Speak With a Geek shut down, but the statistic kept bouncing around the internet — and it likely still will.

This has happened before. A different New York Times Magazine article on disparities in tech from 2016 cited a different small recruiting firm on the power of résumé blinding. “GapJumpers has conducted more than 1,400 auditions for companies like Bloomberg and Dolby Laboratories,” the article reported. “According to the company’s numbers, using conventional résumé screening, about a fifth of applicants who were not white, male, able-bodied people from elite schools made it to a first-round interview. Using blind auditions, 60 percent did.”

This statistic was embedded in the section of the article discussing gender gaps in tech, so you might assume that these 1,400 auditions were for tech roles. But it’s not clear if that’s actually true. The vast majority of the job listings on GapJumpers are not for technical roles. Right now there are 14 job openings listed on the site, three of them software engineering jobs. It seems likely that not all of the 1,400 “auditions” were for tech jobs — which hasn’t stopped the statistic from being cited as evidence that gender equity in tech, in particular, can be achieved with résumé blinding.

That’s not the only question about GapJumpers’ research. Once again, there’s no study to be found. The research doesn’t seem to be on the recruiting firm’s website. They compare their approach to “using conventional résumé screening,” but it’s not clear what that means. Did they ask the same companies they worked with to look at résumés for the same roles? Why use the category “white, male, able-bodied people from elite schools,” and what share of their applicants meets all of those descriptions? (I reached out to GapJumpers to ask about their methodology; I haven’t heard back.)

Conducting informal research like this isn’t a bad thing. Companies do research like this to figure out whether their product is working, and they publicize it to attract companies that care about a diverse hiring pipeline.

The problem is that since these recruiting companies rarely publish their methodology, their results are much harder for readers to evaluate independently. If there’s an error or a deceptive practice, it’s much less likely to get caught. Reporters need to be doing a lot more due diligence if they’re going to use studies like these, and my digging this week suggests they aren’t. That means any recruiting company can make a big splash, and get good PR, if it produces shocking results. Unsurprisingly, shocking results have followed, and there’s no way to tell if they’re accurate.

What we know about background-blind recruiting

The benefits of background-blind recruiting aren’t all invented; far from it. Background-blind recruiting really does appear to be a powerful tool to ensure that candidates get evaluated based on their skills, not their name, gender, or skin color.

There’s abundant evidence from other industries that US employers exhibit a pronounced racial bias against candidates with “black-sounding” names like Jamal or Latisha. Such candidates are 50 percent less likely to get callbacks, and they don’t benefit as much as white candidates do from additional qualifications or a stronger résumé. (Note that this effect, while it’s quite large and can easily change the lives of the people it affects, is five times smaller than the effect Speak With a Geek claimed to find for women in tech.)
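The “five times smaller” comparison can be checked with quick arithmetic. The sketch below is one reading of it, not a calculation taken from any of the studies: it assumes each effect is expressed as a ratio between two rates, using the Speak With a Geek figures (54 percent versus 5 percent) and the callback gap (a 50 percent lower callback rate). The function name and the baseline callback rate are illustrative assumptions.

```python
def rate_ratio(higher_rate: float, lower_rate: float) -> float:
    """Return how many times larger one rate is than another."""
    return higher_rate / lower_rate


# Speak With a Geek's claim: 54 percent with blinding vs. 5 percent without.
claimed_effect = rate_ratio(0.54, 0.05)

# Callback research: candidates with "black-sounding" names are 50 percent
# less likely to get a callback. The baseline rate here is an arbitrary
# assumption; it cancels out of the ratio.
baseline = 0.10
callback_effect = rate_ratio(baseline, baseline * 0.5)

print(f"Claimed blinding effect: {claimed_effect:.1f}x")
print(f"Callback gap: {callback_effect:.1f}x")
print(f"Claimed effect is {claimed_effect / callback_effect:.1f}x as large")
```

Under these assumptions, the claimed blinding effect comes out a little over five times the size of the callback gap.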

Blinding orchestra auditions really did open up a world of new opportunities for women in top orchestras. And even in cases where the effects of bias are smaller, it still seems obvious that we should be taking the easy steps to combat it.

But it’s easy to oversell background-blind recruiting. In particular, it doesn’t look like it’ll do much to fix gender disparities in tech. A peer-reviewed study of gender bias in the hiring of project managers, which sent fake résumés for male and female candidates to employers, found complex results. Among résumés designed to suggest less technical competence, women did worse than men, but among résumés designed to suggest technical competence, women did better than men.

The real source of disparities in tech starts far earlier. Only around 18 percent of computer science majors are women, and the share of women applicants for many jobs reflects that. Eliminating what gender bias remains might still be great, but it wouldn’t address many of the disparities in the industry so long as 82 percent of applicants are still men. In fact, a focus on blinding the process to gender arguably misses the point. If you’re trying to correct past disparities that you expect have left disadvantaged candidates with worse jobs, fewer promotions, and less impressive résumés today, then blinding the résumés gives an advantage to whoever has historically been advantaged in the industry.
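To see why removing bias at the résumé stage cannot close the gap on its own, consider a minimal sketch. It assumes, purely for illustration, that the applicant pool mirrors the roughly 18 percent share of women among computer science majors; the function and its `relative_rate` parameter are hypothetical and not drawn from any study.

```python
def expected_share_of_women(applicant_share: float, relative_rate: float = 1.0) -> float:
    """Expected share of women among the candidates selected.

    applicant_share: fraction of applicants who are women.
    relative_rate: women's selection rate divided by men's
                   (1.0 means a perfectly unbiased process).
    """
    women = applicant_share * relative_rate
    men = 1.0 - applicant_share
    return women / (women + men)


# Unbiased selection from a pool assumed to be 18 percent women.
print(f"{expected_share_of_women(0.18):.0%}")
```

With equal selection rates, the selected group is about 18 percent women, the same as the pool it was drawn from.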

“Of course, that does not mean tech is fair! If forces systematically make it harder for women to get each credential - less likely to do a CS class in high school, a CS program in college, less likely to get promoted, less likely to change jobs, likelier to be a primary caretaker - what results are systematic disparities, and we do have those. Companies are aware of this and most are actively trying to address it - targeting candidates who haven't been 'coding since high school', who took time off to raise kids. Gender-blinding at these cos hurts women.” (Kelsey Piper, @KelseyTuoc, February 18, 2019)

In other words, blinding seems to be solving a different problem than the problem tech actually has — at least when it comes to gender. A tech recruiting company, Interviewing.io, looked into this in a more sophisticated way a few years ago, using voice modulation software to see whether interviewers were prejudiced when a candidate sounded like a man or sounded like a woman:

“masking gender had no effect on interview performance with respect to any of the scoring criteria (would advance to next round, technical ability, problem solving ability). If anything, we started to notice some trends in the opposite direction of what we expected: for technical ability, it appeared that men who were modulated to sound like women did a bit better than unmodulated men and that women who were modulated to sound like men did a bit worse than unmodulated women.”

(Why trust this recruiter-sponsored study and not the other two? Well, for one thing, we can read it — the company published a detailed methodology, voice samples, and an explanation of its results. While the study shouldn’t settle the question — its sample size was fairly small — it’s appropriate to use as part of building a larger picture.)

Changing names on résumés and modulating voices won’t save us here. Addressing disparities in tech is likely to instead take two forms: addressing the reasons that women who start their careers in the industry decide to leave it at much higher rates than men, and, more controversially, addressing the reasons women are less likely than men to choose tech careers in the first place. Achieving change on either of those fronts is complicated — it requires extended effort and energy. Bad research suggesting that gender disparities in tech are the result of aggressive, widespread, unprecedented discrimination at the résumé stage gets in the way of that conversation. We have to face the problems we actually have if we’re going to fix them.
