Testing a new drug on human subjects is expensive, risky and ethically complex, so the vast majority of potential treatments are first tried out on non-human animals. Unfortunately similar issues also constrain the size of animal studies, meaning that they have limited statistical power, and the scientific literature is littered with studies that are either uncertain in their outcomes or which appear to flatly contradict each other.

Luckily statisticians have developed a workaround for this – the “meta-analysis”. To do this, scientists combine the data from a large number of published studies on the same treatment, ending up with a much more certain answer that’s tantamount to a super-study that uses the total number of animals from all the individual studies. Bingo! A much more solid basis for deciding on the merits of pursuing these treatments further.
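To make the "super-study" idea concrete, here is a minimal sketch of one common way to pool studies: a fixed-effect, inverse-variance meta-analysis. The effect sizes and standard errors are invented for illustration, and the function name is hypothetical; real meta-analyses often use more elaborate (for example random-effects) models.

```python
import math

def fixed_effect_pool(effects, std_errors):
    """Pool study effects, weighting each study by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    # The pooled standard error shrinks as more studies are added.
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

# Four hypothetical small studies, each with the same (large) uncertainty.
effects = [0.8, 0.1, 0.5, 0.3]
std_errors = [0.4, 0.4, 0.4, 0.4]

pooled, pooled_se = fixed_effect_pool(effects, std_errors)
print(f"pooled effect = {pooled:.3f}, pooled SE = {pooled_se:.3f}")
```

With four equally precise studies the pooled standard error is half that of any single study, which is where the extra certainty comes from. The pooled estimate is simply a weighted average of what was published, so it inherits any bias in that set of studies.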

But a study just published in PLOS Biology by Konstantinos Tsilidis, John Ioannidis and colleagues at Stanford University shows that a meta-analysis is only as good as the scientific literature that it uses. That literature seems to be compromised by substantial bias in the reporting of animal studies and may be giving us a misleading picture of the chances that potential treatments will work in humans. You can read the excellent Synopsis by Jon Chase for how the authors set about doing their study, but the key take-home is that more than twice as many studies as expected appeared to have statistically significant conclusions – something known as excess significance bias.
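The logic behind an excess significance check can be sketched in a few lines: count how many studies report a significant result, then compare that with the number you would expect given each study's statistical power. This is an illustrative sketch rather than the authors' exact procedure; the data, the function names and the choice of "plausible" true effect are all assumptions made here, and each study is treated as a simple two-sided z-test.

```python
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution

def power_two_sided(true_effect, se, alpha=0.05):
    """Power of a two-sided z-test for a study with standard error `se`."""
    z_crit = _norm.inv_cdf(1 - alpha / 2)
    shift = true_effect / se
    return (1 - _norm.cdf(z_crit - shift)) + _norm.cdf(-z_crit - shift)

def observed_vs_expected(effects, std_errors, plausible_effect, alpha=0.05):
    """Return (observed, expected) counts of significant studies."""
    z_crit = _norm.inv_cdf(1 - alpha / 2)
    observed = sum(
        1 for e, se in zip(effects, std_errors) if abs(e / se) > z_crit
    )
    # Expected count is the sum of each study's power to detect the
    # assumed true effect.
    expected = sum(
        power_two_sided(plausible_effect, se, alpha) for se in std_errors
    )
    return observed, expected

# Hypothetical published studies (made-up numbers).
effects = [0.9, 1.0, 0.2, 0.85]
std_errors = [0.4, 0.4, 0.4, 0.4]

observed, expected = observed_vs_expected(
    effects, std_errors, plausible_effect=0.4
)
print(f"observed significant: {observed}, expected: {expected:.2f}")
```

If the observed count sits well above the expected one, as it does for these invented numbers, the literature contains more "positive" results than the studies' power can account for. A formal analysis would go on to test whether that gap is larger than chance allows.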

What’s the explanation for this anomaly? Rather than wilful fraud, the authors of the PLOS Biology study suggest that this excess significance comes from two main sources. The first is that scientists conducting an animal study might analyse their data in several different ways, but ultimately tend to pick the method that gives them the “better” result. The second arises because scientists usually want to publish in higher profile journals, which tend to strongly prefer studies with positive, rather than negative, results. For a study with negative results, this preference can delay or even prevent publication, or relegate the study to a low-visibility journal, all of which reduce its chances of inclusion in a meta-analysis.

The new work raises important questions about the way in which the scientific literature works, and it’s possible that the types of bias reported in the PLOS Biology paper have been responsible for the inappropriate movement of treatments from animal studies into human clinical trials. What do we do about it? Here are the authors’ suggestions:

Animal studies should adhere to strict guidelines (such as the ARRIVE guidelines) as to study design and analysis.

Availability of methodological details and raw data would make it easier for other scientists to verify published studies.

Animal studies (like human clinical trials) should be pre-registered so that publication of the outcome, however negative, is ensured.

Well, these are all excellent, but most people would also say that there are problems elsewhere in the system – in the high-profile journals’ desire to have a cute story with well-defined conclusions, and in the forces exerted on authors by institutions and funding bodies to publish in those high-profile journals. So whose fault is it?

The Institutions. Institutions (and funding bodies) feel driven to assess the “quality” of their employees’ work, and frankly, using the journal in which the work is published as a proxy for “quality” is an easy option – the peer reviewers and editors have already done the assessment job for them, and all they have to do is note down the impact factor. In this context, negative-result papers aren’t going to help the authors’ case. Even if they end up with the rosiest article-level metrics, they will end up tarred with a low impact factor.

The Authors. Authors may have the best intentions, but ultimately writing and submitting (and re-submitting…) a paper takes a substantial number of person-hours. These are hours that could be better spent doing experiments, writing grant applications, teaching, etc. How can writing a negative-result paper that will end up in a “low-profile” journal compete? Ioannidis and colleagues point out that many negative studies end their days at this stage – a collection of data on a hard-drive with no prospect of seeing the light of day. And those that do get submitted may have had to wait until the authors have done umpteen other more pressing tasks.

The Journals. There are some special journals out there, like PLOS ONE, that don’t make a judgement as to the importance of a study. In their eyes, positive- and negative-result papers should be published with equal probability. However, most journals, including PLOS Biology, make a call as to the perceived importance of a study. Some might disagree, but I would say that this is key to the “discoverability” of a paper, and that some flagging of important papers needs to occur, whether it’s pre-publication (as has happened traditionally) or post-publication (as some propose). Positive papers will almost always be seen as more important (with a few interesting exceptions), but what is essential is that the publication of negative results in a readily accessible journal is made as easy as possible; that publishers aren’t the barrier.

Yes, the pre-registration of animal studies should help incentivise authors to write and submit negative papers, but:

Institutions and funding bodies need to release themselves from the tyranny of the impact factor and view positive and negative results as equally valid contributions to the literature.

Authors need to recognise that negative studies can contribute substantially to scientific knowledge, both via meta-analyses and by more informal means, and it is their duty to ensure that failure to submit these studies doesn’t bias the literature.

Publishers need to ensure that there is an accessible home for sound papers that have negative results.

Now here’s a question – would PLOS Biology have published this study if it had shown that there was in fact no excess significance in animal studies? I doubt it, but at least we would’ve cordially pointed the authors in the direction of PLOS ONE, where their study could be published without substantial delay.

Declaration of potential conflict of interest: I’m a PLOS Biology editor and an employee of PLOS. That said, these views are my own and don’t necessarily reflect those of PLOS.



Konstantinos K. Tsilidis, Orestis A. Panagiotou, Emily S. Sena, Eleni Aretouli, Evangelos Evangelou, David W. Howells, Rustam Al-Shahi Salman, Malcolm R. Macleod, John P. A. Ioannidis (2013). Evaluation of Excess Significance Bias in Animal Studies of Neurological Diseases. PLoS Biology, 11 (7). DOI: 10.1371/journal.pbio.1001609