Albert Einstein is said to have noted that theories should be as simple as possible, but no simpler. By the same token, biomedical researchers doing in vivo experiments should use as few animals as possible, but no fewer. On page 271, Nature reports a move by UK government funding agencies to require grant applicants to show how they calculated the number of animals needed to make the results of an experiment statistically robust. In recent years there have been concerns that sample sizes in individual experiments can be too low, especially in preclinical research that attempts to determine whether a drug is worth pursuing in human studies.

Sample sizes that are too small can cause promising drugs to be discarded because a real effect goes undetected, and can also produce false positives. They raise ethical issues too, if animals are used in studies that are too small to provide reliable results.
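To make the idea of a sample-size calculation concrete, here is a minimal sketch of one common textbook approach, not the method any funder or journal prescribes. It assumes a two-group comparison of means, a two-sided test and a normal approximation. The function name `animals_per_group` and its default values (a 5% significance level and 80% power) are illustrative assumptions, not taken from the report.

```python
import math
from statistics import NormalDist


def animals_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate animals needed per group (hypothetical helper).

    effect_size: expected difference between group means divided by the
                 common standard deviation (a standardized effect size).
    alpha:       two-sided significance level (assumed default 0.05).
    power:       desired probability of detecting a true effect
                 (assumed default 0.80).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power

    # Normal-approximation formula: n = 2 * ((z_alpha + z_power) / d)^2
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)  # round up to a whole animal


if __name__ == "__main__":
    for d in (1.0, 0.5):
        print(f"effect size {d}: {animals_per_group(d)} animals per group")
```

Halving the expected effect size roughly quadruples the number of animals required, which is why an experiment sized by habit rather than by calculation can easily be underpowered. The normal approximation tends to give slightly smaller numbers than a calculation based on the t distribution, so a real application should be checked with a statistician.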

The UK research councils’ move is to be applauded. And Britain is not alone in pursuing such improvements: the US National Institutes of Health has been testing the use of a grant-review checklist that includes features such as experimental design, to improve the reproducibility of preclinical research in animals.

The burden for this should not fall on funding bodies alone. Institutions must also increase the amount of support offered to researchers in designing the statistical aspects of an experiment. Such support is too often limited or ad hoc: study design is complex and needs careful consideration by people who truly understand the issues (see Nature 506, 131–132; 2014).

Journals are also responsible for ensuring that the research they publish is reported in sufficient detail for readers to fully appreciate key details of experimental and analytical design. Many publications — including Nature — have endorsed the ARRIVE guidelines for reporting animal research (C. Kilkenny et al. PLoS Biol. 8, e1000412; 2010). These are, however, hugely detailed, and compliance at this level is difficult for early, exploratory research.

Journals published by Nature Publishing Group nevertheless encourage the use of ARRIVE. In 2013, we implemented a reporting checklist that demands that authors supply key details of study design. For animal studies, these include the methods of sample-size determination, randomization and study blinding, as well as exclusion criteria (see Nature 496, 398; 2013). An impact analysis on the effectiveness of the changes introduced in 2013 is currently under way.

Sample size is just one of a suite of issues that need to be addressed if poor reproducibility is to be tackled. Journals have a key part to play in dealing with this problem, but so do others. Credit to those academies that take a lead. This month, for example, the UK Academy of Medical Sciences held a meeting in London at which researchers, funders and representatives from research institutions and universities attempted to provide recommendations for improving reproducibility by examining case studies in disciplines from epidemiology to particle physics, and by exploring the role of culture and incentives. There are no magic bullets — all parts of the research community need to chip away at the problem.

Undoubtedly, part of the challenge is the culture that pushes investigators in many parts of the world to produce more and more with the same resources. The drive to maximize the number of papers and the impact of findings is pervasive.

In a commentary published in Nature Biotechnology last year, experimental psychologist Marcus Munafò and his colleagues compared modern biomedical research with the 1970s automobile industry (M. Munafò et al. Nature Biotechnol. 32, 871–873; 2014). The fast-moving but error-prone car production lines of the United States found themselves losing ground to Japanese manufacturers that stressed the importance of quality control at every step in their factories.

The moral of the story: quality assurance adds a burden, but it is worth the effort for a longer-term gain in public confidence. Making sure that the power of an animal experiment suits its purpose is an important way for funders and researchers to contribute.