Modern neuroscience would be impossible without functional magnetic resonance imaging, or fMRI. The technique is barely 25 years old, but thousands of studies that use it are published each year. When you see headlines such as “Vegetative state patients can respond to questions” or “This is your brain on writing,” you can be sure that fMRI was involved. Last week a new map of the brain based on fMRI scans was greeted as a “scientific breakthrough.”

However, earlier this month, Anders Eklund, of Sweden’s Linköping University, published the latest in a series of papers showing a deep flaw in how researchers have been using fMRI. This flaw, Eklund and his colleagues believe, could ruin the results of as many as 16,500 neuroscience studies over the last 20 years.

The findings have prompted debate and discussion among scientists. But, surprisingly, none of them is freaking out about the fact that two decades’ worth of understanding could be overturned. In fact, it turns out, the flaws in fMRI are a good example of how scientists are tackling one of the biggest problems the discipline is currently facing: that of making experiments reproducible.

The dead fish that thought

fMRI is a specialized form of MRI, an imaging technique that enables you to look inside the body without having to cut it open. The “functional” bit of fMRI is that it measures changes in blood flow, while ordinary MRI just maps the shapes of tissue. The more active a part of the brain is, the more blood flows to it. By watching which bits are active when someone performs certain tasks or experiences certain stimuli, neuroscientists make deductions about how the brain works.

If you put a dead creature in an fMRI scanner, therefore, you should see nothing: Dead things don’t have any blood flow. But in 2009, some researchers put a dead salmon into an fMRI scanner, just to see what would happen. To their surprise, parts of the brain lit up, as if the dead fish were “thinking.”

A dead salmon scanned using fMRI by Dartmouth College researchers. The t-value is a test statistic: the higher it is, the less likely the detected activity is down to chance. (Image: Bennett et al.)

The reason is that MRI measurements aren’t straightforward to interpret. The signals are “noisy,” in the same way a distant radio station sounds noisy or fuzzy when you try to tune in. In fMRI, which looks for very subtle changes in the signals, the noise can nearly obscure the effect you’re looking for. So fMRI scanners rely heavily on software and statistical tests to eliminate background noise—the standard level of activity you’d see when nothing is happening.

The trouble is, what is “standard” activity can vary from one object to another, or even from person to person. So these software packages and statistical tests have to make a lot of assumptions, and sometimes use shortcuts, in separating real activity from background noise.

Because of this, scientists expect a 5% rate of false positives, that is, of the scanner showing something as brain activity when it is not. The dead salmon paper grabbed headlines and even won an Ig Nobel prize (a jokey annual award for “improbable research”), but it was really just one case of a false positive: “a funny illustration of what can go wrong if you don’t check your assumptions,” says Sam Schwarzkopf, an experimental psychologist at University College London.

What Eklund’s studies have shown, however, is that the real rate of false positives can often be far higher than 5%.
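To get a feel for what a 5% false positive rate means in practice, here is a minimal sketch in Python. It is not the salmon study's analysis or any fMRI package's method: it assumes each "voxel" (a small cube of the scan) yields one test statistic that is pure Gaussian noise, and the voxel count is a made-up number chosen for illustration.

```python
import random
from statistics import NormalDist


def count_false_positives(n_voxels, alpha=0.05, seed=0):
    """Count noise-only voxels whose test statistic crosses the threshold."""
    rng = random.Random(seed)
    # One-sided cutoff on a standard normal statistic for the chosen alpha.
    threshold = NormalDist().inv_cdf(1 - alpha)
    # With no real activity, every voxel's statistic is just noise.
    noise_stats = [rng.gauss(0.0, 1.0) for _ in range(n_voxels)]
    return sum(1 for z in noise_stats if z > threshold)


n_voxels = 100_000  # hypothetical voxel count, not a figure from the article
hits = count_false_positives(n_voxels)
print(f"{hits} of {n_voxels} noise-only voxels look 'active' ({hits / n_voxels:.1%})")
```

Even with nothing happening, roughly one voxel in 20 crosses the line, which is why a scan with many thousands of voxels needs some correction before a bright spot can be trusted.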

Seeing through the fog

In their latest paper, Eklund and his colleagues studied data from 499 people in a resting state—scanned when they were doing nothing—and analyzed the readings using three commonly used software packages and settings. For one loose yet commonly used setting, the false positive rate was as much as 90%. Some brain regions were more prone to false positives than others; the posterior cingulate cortex, which is linked to emotion and memory, turned up more of them than any other region, no matter the program used.
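The logic of measuring a false positive rate on "nothing is happening" data can be sketched with simulated noise. This is only an illustration under simplifying assumptions, not Eklund's procedure: his team used real resting-state scans and the packages' own settings, whereas the sketch below uses independent Gaussian tests, a made-up number of tests per experiment, and a simple Bonferroni correction swapped in as the "stricter" setting.

```python
import random
from statistics import NormalDist


def familywise_error_rate(n_tests, alpha=0.05, n_experiments=2000,
                          seed=0, corrected=False):
    """Fraction of noise-only experiments reporting at least one 'finding'."""
    rng = random.Random(seed)
    # Bonferroni: divide alpha by the number of tests; otherwise leave it alone.
    per_test_alpha = alpha / n_tests if corrected else alpha
    threshold = NormalDist().inv_cdf(1 - per_test_alpha)
    experiments_with_hit = 0
    for _ in range(n_experiments):
        # An experiment counts as a false positive if any single test fires.
        if any(rng.gauss(0.0, 1.0) > threshold for _ in range(n_tests)):
            experiments_with_hit += 1
    return experiments_with_hit / n_experiments


n_tests = 20  # hypothetical number of tests per experiment
uncorrected = familywise_error_rate(n_tests)
bonferroni = familywise_error_rate(n_tests, corrected=True)
# For independent tests, theory predicts 1 - (1 - alpha) ** n_tests.
predicted = 1 - (1 - 0.05) ** n_tests

print(f"uncorrected: {uncorrected:.1%} (theory: {predicted:.1%})")
print(f"Bonferroni:  {bonferroni:.1%}")
```

The point of the comparison is that a setting which looks like "5%" for a single test can produce a far higher rate once an analysis makes many tests, unless the threshold is adjusted to account for them.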

Eklund’s team estimate that anywhere between 3,500 and 16,500 papers using flawed fMRI methods have been published. You’d be forgiven for thinking neuroscience is facing a crisis.

Yet when I contacted various scientists, there was no hint of panic. In fact, it seemed like this was quite normal.

In part this is because the flaws in fMRI are not new. Studies that have found excessive false positives go back several years, and some of the problems have been known ever since fMRI was first developed 25 years ago. Each software package has a slightly different way of correcting for errors.

Moreover, software can always be updated to improve it. Eklund says that in the time since he and his colleagues first released their findings last December on arXiv (a site where scientists publish versions of papers awaiting peer review), one of the three packages they tested has been amended via a software update, while the team behind another published a comment agreeing with some of Eklund’s points but stating that the “flawed” methods are still useful to scientists.

However, the Eklund study does point at a much bigger problem facing science: how it deals with the fact that a certain proportion of studies are always flawed.

Rinse, repeat

Science depends on the idea of reproducibility: that other scientists should be able to do the same experiment as you and get the same results. In principle, a finding doesn’t become part of the canon of scientific knowledge until it’s been reproduced several times. If it can’t be, it is weeded out.

That’s the theory. But in fact science as a whole is facing a reproducibility crisis. Scientists don’t have much funding or professional incentive to repeat previous studies, and when they do, many studies have proven impossible to replicate. That means a lot of published findings may be wrong but remain unchallenged.

This is true of fMRI studies in particular. Though the cost of fMRI has fallen, a scan can still cost at least $600 an hour to run, and funding for repeating previous studies can be hard to come by.

This problem might be alleviated by letting neuroscientists see raw data from other studies, so they can check the results without the cost of doing their own scans. The trouble is that researchers, in neuroscience as in many other disciplines, also tend to keep their data to themselves. They publish just the brain images but not the underlying measurements that made them, and don’t disclose the version of software they used. (This isn’t deliberate secrecy; it’s just the way things have always been done.)

Nor are there standard protocols for how long researchers should keep their original datasets. In the early days of fMRI, storage was expensive, so it’s unlikely data were kept. That means past studies can’t be reanalyzed even if someone could get the funding to do it.

The good news is that neuroscience is also leading the way in fixing the problem of reproducibility. After the dead salmon paper came out, scientists corrected for the flaws it showed up: in their Ig Nobel speech, the researchers said that the number of people using the incorrect methods had gone from 40% down to 10%. “In many ways fMRI scientists lead the field in the application of new statistical methods and best practices,” says Micah Allen, a neuroscientist at University College London, adding that websites like Neurovault, which allow easy sharing of data, are rapidly growing in popularity.

And Schwarzkopf says that over so many years, any really key findings from fMRI studies have likely been re-tested, some using newer and more accurate methods. False findings would have crumbled and been swept away, as they are supposed to.

This is why neuroscientists are not in despair. Eklund’s discovery doesn’t mean fMRI is useless, just that it needs to be used better. And that is what they are now striving to do.