Francis has previously written similar statistical reviews of psychology papers. His reviews are based on the theory that experiments, particularly those with relatively small sample sizes, should produce "unsuccessful" findings, such as results that do not reach statistical significance (conventionally, a p-value below 0.05), at least some of the time, even when the experiments are measuring a real phenomenon. Taking into account the researchers' reports of the strength of the phenomena...

A commentary published in GENETICS this week (October 15) questions the results of a December 2013 Nature Neuroscience paper reporting that mice conditioned to fear odors pass those fears on to their pups, as well as to their pups' offspring, presumably by an epigenetic mechanism. Gregory Francis, the critique's author and a professor of psychological sciences at Purdue University, suggests that the original paper's statistical results are "too good to be true."

“If the statistical power of a set of experiments is relatively low, then the absence of unsuccessful results implies that something is amiss with data collection, data analysis, or reporting,” he wrote in his GENETICS critique. One explanation could be that the researchers consciously or subconsciously omitted some experiments from the paper, Francis suggested.
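The logic behind this kind of "excess significance" critique can be illustrated with a toy calculation. This is only a sketch of the general idea, not Francis's actual procedure, and the power values below are hypothetical, not figures from his analysis:

```python
# Toy illustration of the "excess significance" argument: if each of
# several independent experiments has modest statistical power, the
# chance that ALL of them reach significance is small, even when the
# effect being measured is real. Power values here are invented.
import math

def prob_all_significant(powers):
    """Probability that every independent experiment reaches
    statistical significance, given each experiment's power."""
    return math.prod(powers)

# Ten experiments, each with a (hypothetical) 70% chance of success:
powers = [0.7] * 10
p_all = prob_all_significant(powers)
print(f"P(all 10 significant) = {p_all:.3f}")  # 0.028
# A probability this low suggests that a run of uniformly significant
# results would be surprising even for a genuine phenomenon.
```

A very small combined probability is what prompts the suspicion that unsuccessful experiments were omitted, rather than evidence that any individual result is wrong.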

Francis said that he submitted his critique to Nature Neuroscience but that “they wouldn’t send it out for review.” Nature senior press officer Neda Afsarmanesh wrote in an e-mail to The Scientist that Nature “cannot discuss specific internal dialog about our papers,” but noted that editors do review all critiques sent to them.

Brian Dias and Kerry Ressler of Emory University, the authors of the Nature Neuroscience paper, responded to Francis’s criticisms in another GENETICS article published this week, stating that they stand by their results and that they have reported all data they collected. They also said that Francis did not mention experiments they included in the supplemental materials that did not yield statistically significant results.

“[Dias] has replicated the main effect that was originally published last year in Nature Neuroscience several times in the lab along with several other people in the lab who have been blinded observers and counters of the data,” Ressler said in an interview. He added that they are working to understand how smell sensitivity might be passed from generation to generation, as it is still unclear how a smell processed in the brain would lead to changes in germ cells.

Gary Churchill, a senior editor of GENETICS and a scientist at The Jackson Laboratory in Bar Harbor, Maine, argued in his own commentary that surprising findings backed by abundant statistically significant results should be expected to appear in high-profile journals, since authors are more likely to submit their most positive results and editors pluck out the papers most likely to make a splash. Steven Goodman, a professor of medicine at Stanford University and head of a new center for improving the validity of biomedical research, agreed, noting that the improbabilities Francis cites may be as much a result of the peer review and selection process as of anything Dias did.

“Even if Francis is correct that there is a surplus of significance here . . . it doesn’t necessarily invalidate the findings,” Goodman said, “unless he can provide evidence that the effect sizes are implausibly large, that there are internal inconsistencies, that data have been fudged, or that the allegedly omitted experiments are likely to undermine inferences from the experiments presented.”

Others expressed frustration that GENETICS published Francis’s critique. “I’m concerned that the criticism was not based on an apparent experimental error,” said Andy Feinberg, an epigeneticist at Johns Hopkins University. “The danger is that if we allow endless post hoc reanalysis of published work, editors will be even more conservative than they already are in accepting out-of-the-box findings.”

Goodman argues that there are more important factors that should lead people to harbor skepticism toward the Nature Neuroscience paper. For instance, he points out a PubMed Commons comment saying that the paper did not note which mouse pups were siblings in the behavioral and neuroanatomical experiments. Treating related pups as independent from each other can inflate sample sizes.
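The sibling problem Goodman points to is a standard clustering issue: littermates tend to resemble one another, so counting them as independent observations overstates the information in the data. A minimal sketch of the usual design-effect correction, with invented numbers rather than anything from the paper:

```python
# Hypothetical illustration of how clustering (e.g. littermates)
# shrinks the effective sample size. Numbers are invented; this is
# the standard design-effect formula, not an analysis of the paper.
def effective_sample_size(n_total, cluster_size, icc):
    """Effective N under the design-effect correction
    n / (1 + (m - 1) * ICC), where m is the cluster (litter) size
    and ICC is the intraclass correlation among cluster members."""
    design_effect = 1 + (cluster_size - 1) * icc
    return n_total / design_effect

# 40 pups in litters of 5, with a within-litter correlation of 0.5:
print(effective_sample_size(40, 5, 0.5))  # ~13.3, far fewer than 40
```

Ignoring the correction and analyzing all 40 pups as independent would make standard errors too small and p-values too optimistic.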

Dias and Ressler do say that they have made some changes to the way they operate due to statistical criticism. They now track which mice sire which pups, a practice they think more labs should follow. “We just want people to understand that we’re very devoted to finding the truth and to [being] as transparent with the data as we can,” Ressler said.

G. Francis, “Too much success for recent groundbreaking epigenetic experiments,” GENETICS, 198:449-51, 2014.

B.G. Dias and K.J. Ressler, “Reply to Gregory Francis,” GENETICS, 198:453, 2014.

G.A. Churchill, "When are results too good to be true?" GENETICS, 198:447-48, 2014.