The microbiome needs no introduction – it has been several years since you could pick up a biomedical research journal and not run into an article on possible connections between human gut bacteria and disease. There were thousands of such papers last year alone. But it’s a very hard field to work in. You can establish correlations between certain gut profiles and some diseases, but causality is another matter. Do the bacteria cause or exacerbate the disease, or does the disease give you an altered gut bacteria profile? Or neither? There’s no particular reason that they should be connected at all (which is an option that our human brains are not always ready to consider, frankly). It’s worth remembering that we don’t even really have a baseline: there is no agreement on what a healthy human gut microbiome looks like, how that might vary depending on diet and environment, how many such microbial states might be considered healthy, and how much deviation from these would be considered acceptable.

Attempts to answer these questions can involve human microbiota-associated (HMA) mice, which are gnotobiotic (germ-free) animals that have had human microbiota transplanted into them. That sounds like a pretty useful experiment, although it has to be said that gnotobiotic rodents are not only expensive but a bit weird (and certainly not normal). It would seem like the sudden-microbiome-from-a-standing-start experience that they get is not exactly like what goes on in a human gut – you hope, anyway – but one could also hope that looking at differences and disease states could be useful.

But this new paper casts doubt on that hope. The authors have found 38 such studies in the literature over the last few years, and 36 of them apparently report a transfer of disease phenotype. That, they say, is a little too useful: it is implausible that so many of these experiments should work. (The two exceptions, by the way, were on possible microbiome effects in colorectal cancer.) The paper suggests that a combination of loose experimental design (not enough donors, small n on the mice, etc.), lack of rigor in determining the disease state in both the humans and animals, and flat-out wishful thinking have skewed the literature. A whole range of diseases, many known to be multifactorial, should not be able to simply transfer into mice with the human bacteria with a 95% success rate. A big factor, which is also a silent one, is publication bias. The studies that show an effect get published, and the studies that don’t go into the drawer (or its electronic equivalent).
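To put a rough number on "implausible", here is a back-of-the-envelope sketch. It assumes, purely for illustration and not as anything taken from the paper, that the 38 studies are independent and that each one has the same chance `p` of showing a phenotype transfer. The value `p = 0.5` and the function name `prob_at_least` are hypothetical choices made for this example.

```python
from math import comb

def prob_at_least(k, n, p):
    """Binomial tail: probability of k or more successes in n independent
    trials, each succeeding with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Figures from the post: 38 studies, 36 reporting a transferred phenotype.
n_studies = 38
n_positive = 36
observed_rate = n_positive / n_studies  # roughly the 95% quoted above

# ASSUMPTION (hypothetical): each study has a coin-flip chance of "working".
p_assumed = 0.5
tail = prob_at_least(n_positive, n_studies, p_assumed)

print(f"observed success rate: {observed_rate:.1%}")
print(f"P(at least {n_positive} of {n_studies} | p={p_assumed}): {tail:.2e}")
```

Under those assumptions, a run of 36 or more positives out of 38 would be extremely unlikely unless `p` is very close to 1. That leaves two readings: either nearly every one of these diseases really does transfer into mice, or something like publication bias is filtering what we get to see.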

At this point you’d be hard-pressed to find a prominent human disease that hasn’t had some connection made to the microbiome. Does this indicate its extraordinary importance, or is this a sign of people getting ahead of the evidence? The only way to be sure about this is to run more rigorous experiments – put these hypotheses to stricter tests and see if they survive. That’s not to say that the whole field is hype, because it certainly isn’t. But we need to get the hype out of it in order to work on the real stuff. This new paper has a number of specific recommendations for improving the quality of HMA mouse studies, and it’s hard to argue that we wouldn’t be better off if people followed them.

These include (a) compare donor samples from individuals with a specific disease to samples from donors without it, and try to determine the microbiome alterations associated with pathology, (b) use enough donors to account for biological variation, keeping effect size in mind, (c) avoid “pseudoreplication” – that is, use the number of donors as the N for statistics, rather than the number of animals you transferred their samples into, (d) use more rigorous animal handling techniques to prevent the spread of microbes between animals, (e) don’t pool donor samples, because you’re eliminating important differences between individuals, (f) run confirmatory assays to make sure that the microbial transplants were actually successful, and do this with a time course because some of these things may take a while to get established, and (g) be honest about the power of your experiments and their ability to explain causality.
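Recommendation (c) is the easiest of these to show concretely. The sketch below is a minimal illustration, not the authors' analysis pipeline: it collapses per-mouse measurements down to one value per donor, so that the N going into any later statistics is the donor count. The data, the donor IDs, and the function name `donor_means` are all made up for the example.

```python
from collections import defaultdict
from statistics import mean

def donor_means(measurements):
    """Collapse per-mouse values to one mean per donor.

    measurements: iterable of (donor_id, group, value), one tuple per mouse.
    Returns {group: [mean for each donor in that group]}.
    """
    per_donor = defaultdict(list)
    for donor_id, group, value in measurements:
        per_donor[(donor_id, group)].append(value)

    by_group = defaultdict(list)
    for (donor_id, group), values in sorted(per_donor.items()):
        by_group[group].append(mean(values))
    return dict(by_group)

# HYPOTHETICAL data: 2 donors per group, 3 recipient mice per donor.
mice = [
    ("D1", "disease", 5.0), ("D1", "disease", 6.0), ("D1", "disease", 7.0),
    ("D2", "disease", 3.0), ("D2", "disease", 4.0), ("D2", "disease", 5.0),
    ("H1", "healthy", 2.0), ("H1", "healthy", 3.0), ("H1", "healthy", 4.0),
    ("H2", "healthy", 4.0), ("H2", "healthy", 5.0), ("H2", "healthy", 6.0),
]

groups = donor_means(mice)
n_mice = len(mice)                                 # the pseudoreplicated N
n_donors = sum(len(v) for v in groups.values())    # the N to use for statistics

print(f"mice: {n_mice}, donors: {n_donors}")
print(groups)
```

Twelve mice look like a respectable sample, but they came from only four donors, and mice sharing a donor are not independent observations. Any test comparing the two groups should be run on the donor-level values (or with a model that accounts for the donor explicitly), which is a much less flattering N.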

Long-term, we’re going to need complex experiments to really nail down causality and mechanism, but we can’t even get those off the ground (not and have them mean anything) unless the foundations are more solid than they are now. Basically, the authors are saying that it’s time for this field to grow up, for there to be fewer statistically thin publications that claim connections to human diseases that are poorly modeled in mice to start with. There’s real science to be done in this field, and it could lead to some real advances in medical treatment, but we’re not going to get there with the sort of stuff that’s cluttering up the literature now.