A replication crisis has called into question results from behavioral (and other) sciences. Complaints have focused on poor statistical methods, the burying of negative results, and other “questionable research practices” that undermine the quality of individual studies.

But methods are only part of the problem, as Michael Muthukrishna and Joseph Henrich argue in a paper in Nature Human Behaviour this week. It’s not just that individual puzzle pieces are low in quality; it’s also that there’s not enough effort to fit those pieces into a coherent picture. “Without an overarching theoretical framework,” write Muthukrishna and Henrich, “empirical programs spawn and grow from personal intuitions and culturally biased folk theories.”

Doing research in a way that emphasizes joining the dots constrains the questions you can ask in your research, says Muthukrishna. Without a theoretical framework, “the number of questions that you can ask is infinite.” This makes for a scattered, disconnected body of research. It also feeds into the statistical problems that are widely considered the source of the replication crisis. Having too many questions leads to a large number of small experiments—and the researchers doing them don't always lay out a strong hypothesis and its predictions before they start gathering data.

This isn’t the first time someone's argued that better theory makes for better science. It’s a conversation that’s been going on for some time among the people agitating for more robust research. But this is a particularly loud klaxon, in one of the biggest journals in the field, meaning that it might make people sit up and take notice—and possibly spark concrete initiatives to improve theory alongside the current efforts to improve statistical rigor.

Theories about theory

Paul Smaldino, a cognitive scientist who has also been vocal about the need for better theory, points to an infamous psychology paper as a perfect example of what happens when experimental work is divorced from theoretical scientific frameworks. The paper, published in 2011, reported finding evidence of precognition. But that, says Smaldino, “is not a psychology finding. That’s a physics finding. That is everything we know about the laws of physics and causality and how time works, all being wrong.”

The problem with the paper wasn’t just that the methods were bad, he argues, but also that “theory [in psychology] is so weak that something that completely contradicts hundreds of years of science was evaluated without that context.”

Science, he explains, is about accumulating sets of observations that occur reliably—the Sun appears at different places in the sky depending on the season and time of day; finches have differently shaped beaks depending on what they eat. “That’s the raw ingredients,” he says. “To make sense of it requires a framework to say, this is how all these different facts fit together, and this is why.” We explain these observations by developing theoretical models—of how the Earth, tilted on its axis, orbits the Sun; of natural selection.

Having a good theoretical framework makes it possible to make sense of sets of disconnected facts, and to explain why things happen sometimes and not at other times. Perhaps most importantly, it allows for predictions of what will be found in the data: if our model of human evolution is true, we should find huge similarities between the genomes of humans and other great apes—and that’s exactly what we do find. If we made a prediction like this and found it to be false—say, our genomes turned out to be more similar to those of birds than to those of other great apes—it would undermine the theoretical framework.

Doing scientific research within a theoretical framework can help to highlight results that are surprising—like finding precognition—and that might therefore need a closer look and plenty of replications to test whether the finding holds up. By drawing connections between behavioral science findings and findings across other fields, “overarching theoretical frameworks pave the way toward a more general theory of human behavior,” write Muthukrishna and Henrich.

More math for the behavioral sciences

Part of what Muthukrishna and Henrich are advocating is a greater use of formal models—a way of setting down ideas about something in cold, hard math. For instance, you might think that, when children are learning whether it’s “toh-may-toe” or “toh-mah-toe” (or many of the other myriad arbitrary cultural variants that humans swim in), they’ll copy the majority of speakers they hear around them.

Putting this in a formal model means pinning numbers and exact relationships to it, forcing precision about your ideas. Perhaps you think that, for example, if 70 percent of adults say “toh-may-toe,” then 95 percent of children will acquire that variant—what happens if you build a computer simulation with little bots learning arbitrary variations in their environment? How does tweaking the numbers lead to different outcomes? And how does that match up with the real world? “I don’t want to say that all theories have to be written in math,” says Muthukrishna. “But most should be.”
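The kind of simulation Muthukrishna describes can be sketched in a few lines. Below is a minimal, hypothetical version of such a model (the specific numbers, function names, and the majority-copying rule are illustrative assumptions, not taken from the paper): each child samples a handful of adult “models” and adopts whichever variant the majority of that sample uses. Running it shows how a copying rule turns an input frequency (70 percent of adults) into a precise predicted output frequency among learners.

```python
import random

def learn_variant(adults, n_models, rng):
    """One child samples n_models adults and copies the majority variant.

    This majority-copying rule is one illustrative assumption; a formal
    model forces you to commit to some such rule explicitly.
    """
    sample = [rng.choice(adults) for _ in range(n_models)]
    # With an odd sample size and two variants, there is always a strict majority.
    return max(set(sample), key=sample.count)

def simulate(p_adult=0.70, n_children=10_000, n_models=5, seed=42):
    """Return the fraction of children who acquire the majority variant."""
    rng = random.Random(seed)
    # Adult population: 70% say "toh-may-toe", 30% say "toh-mah-toe".
    n_majority = int(p_adult * 100)
    adults = ["toh-may-toe"] * n_majority + ["toh-mah-toe"] * (100 - n_majority)
    children = [learn_variant(adults, n_models, rng) for _ in range(n_children)]
    return children.count("toh-may-toe") / n_children

if __name__ == "__main__":
    frac = simulate()
    print(f"Fraction of children acquiring the majority variant: {frac:.3f}")
```

Tweaking the numbers changes the prediction in ways you can check against real acquisition data: with children sampling five adults, majority copying amplifies a 70 percent adult majority to roughly 84 percent among learners (the binomial probability of a 70-percent variant winning at least 3 of 5 draws), whereas copying a single random adult would leave the frequency at 70 percent. That gap between rules is exactly the sort of precise, testable difference that writing the theory in math exposes.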

Tal Yarkoni, a vocal critic of poor behavioral science, agrees with the prescription of more formal modeling, but he thinks that a greater focus on theory could otherwise be a terrible idea. "Many of our problems actually stem from far too much concern with elegant theoretical frameworks,” Yarkoni argues. Muthukrishna and Henrich draw on the analogy of natural selection in biology, an analogy Yarkoni considers apt—but one that cuts the other way. While it’s true that all of biology hangs on the principles of natural selection, for many areas of active biological research, he argues, “the distance between the ‘overarching theoretical framework’ and the concrete mechanisms under investigation is so vast that it's usually pointless to consider the former at all.”

Instead, he suggests, the best way forward is to “accept that the world is really complicated. That in most domains even our best theories can only hope to explain a small fraction of the variation in the behaviors we're interested in, and that we should probably place much more emphasis than we do on large-scale description and prediction (and less on causal explanation).”

Muthukrishna, however, sees that kind of large-scale description as part of a more collaborative way of doing science: some researchers work at the coalface, getting data about the world. Others work on fitting that data into the overarching frameworks. Sometimes, that will mean questioning the reliability of the data and insisting on replications. And other times, when the data seem solid, it could mean having to revise or even overturn parts of the theory.

If data are the bricks and the theoretical framework is the house, says Muthukrishna, “you might have the blueprint for a house, but without the bricks it’s not going to work.” Data is vital, he says, and the advances under discussion in the replication movement are crucial too—but “with better theory, it would have been clearer to see when things are nonsense.”

Nature Human Behaviour, 2018. DOI: 10.1038/s41562-018-0522-1 (About DOIs).