In South Korea, people under the age of 16 can’t play online games between midnight and 6am. The UK Parliament has launched an official inquiry into “the impact of social media and screen use on young people’s health.” Meanwhile in the United States, the Wait Until 8th campaign asks parents to delay giving their children a smartphone until they’re in eighth grade. Worry about kids and technology is rampant—so have smartphones, in fact, destroyed a generation?

A paper published in Nature Human Behaviour this week answers that question, often differently, thousands and thousands of times. Researchers Amy Orben and Andrew Przybylski took three huge datasets and threw every possible meaningful question at them. In part, their analysis is an illustration of how different researchers can get wildly different answers from the same data. But cumulatively, the answers they came up with indicate that tech use correlates with a teeny-tiny dent in adolescent well-being—and that there’s a big problem with big data.

High numbers don’t necessarily mean high quality

Studying small numbers of people, or rats, or trees can be a problem for scientists. Comparisons between small groups of subjects might miss a real finding or luck out and find something that looks like a pattern but is actually just noise. And it’s always tricky to generalize from a small group to a whole population. Sometimes small is the only sort of data that’s available, but some research disciplines have had the recent(-ish) boon of gigantic, rich datasets to work with.

That takes care of the sample-size issue, but it opens up new problems instead. If you want to know whether technology use affects teenagers’ well-being, what precisely do you plan to measure? Is the technology you’re worried about gaming, social media, cellphone use, or computer use? Are you worried about the time spent using devices or the times of day at which young people use them? Do you measure well-being with population-level depression rates, questionnaires issued to teenagers, or by asking parents?

Large datasets tend to have a belt-and-braces approach to big questions, getting at them by probing in a few different ways. That means researchers have choices in how they analyze things. This leads to a problem nicknamed the “garden of forking paths”: at each point in an analysis, a researcher has to make one choice rather than another, which sends them down a particular path. Then, at the next junction, they face a new choice. At the end of these choices, the researchers end up at a particular point in the garden, holding an answer—but all those other paths would have given them different answers.

Ask ALL the questions

Orben and Przybylski were concerned about how this problem could be affecting research into the effects of technology on children and teenagers. So, they adopted a relatively new statistical technique that involves trying out every possible question you could ask a dataset—or at least, all the questions that are theoretically reasonable to ask. For one dataset they used, that meant 372 possible ways to analyze the data; for another, nearly 41,000; and for a third, more than 600 million.

That might seem insane, but numbers like this explode quickly when quite a few options are involved, because every additional choice multiplies the number of possible analyses. And these astronomical numbers are actually on the low side, because Orben and Przybylski opted for the simplest statistical test they could. Even then, they had to cut down to around 20,000 analyses for the last dataset because computational time became an issue.
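
To see how quickly the count can grow, here is a rough Python sketch. The three kinds of choices it multiplies together (which questionnaire items to average into a well-being score, which technology measure to use, and which set of controls to include) are assumptions chosen for illustration, as are the function names and the example numbers. None of it is taken from the paper itself.

```python
from itertools import combinations, product


def count_specifications(n_outcome_items, n_tech_measures, n_control_options):
    """Count analyses when any non-empty subset of outcome items is allowed.

    Hypothetical setup: the well-being score can be built from any
    non-empty subset of questionnaire items (2**n - 1 subsets), paired
    with one technology measure and one set of controls.
    """
    outcome_choices = 2 ** n_outcome_items - 1
    return outcome_choices * n_tech_measures * n_control_options


def enumerate_specifications(outcome_items, tech_measures, control_options):
    """Brute-force listing of the same choices, to check the formula."""
    outcome_sets = [
        subset
        for size in range(1, len(outcome_items) + 1)
        for subset in combinations(outcome_items, size)
    ]
    return list(product(outcome_sets, tech_measures, control_options))


print(count_specifications(3, 2, 2))    # 28
print(count_specifications(10, 5, 2))   # 10230
print(count_specifications(20, 5, 4))   # 20971500

# The formula agrees with an explicit enumeration on a small example.
specs = enumerate_specifications(
    ["happy", "satisfied", "confident"],    # made-up questionnaire items
    ["tv_hours", "social_media_hours"],     # made-up technology measures
    ["no_controls", "with_controls"],       # made-up control options
)
print(len(specs))  # 28
```

Going from three questionnaire items to twenty takes the count from 28 to about 21 million in this toy setup, which is why trimming the list of analyses for computational reasons is not surprising.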

What they found was that the data could tell very different stories depending on the precise question asked. Some analyses resulted in positive correlations—higher levels of technology use appearing alongside higher well-being. Some had negative correlations, with more tech use linked to lower well-being. And some found no real relationship at all.
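
The general approach, often called specification curve analysis, can be sketched on made-up data. The snippet below is an illustration under stated assumptions, not the authors' code. The column names, the sample size, the random survey generator, and the tiny negative link built into one item are all invented. It runs one simple correlation for every combination of technology measure and well-being score, then sorts the results so the spread of answers is visible.

```python
import random
from itertools import combinations
from statistics import mean, median


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def make_fake_survey(n=500, seed=0):
    """Generate a hypothetical survey; every value here is synthetic."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {
            "tv_hours": rng.uniform(0, 6),
            "social_media_hours": rng.uniform(0, 6),
        }
        # Assumed for illustration: the items are mostly noise, with a
        # tiny negative link between one item and social media hours.
        row["happy"] = rng.gauss(0, 1) - 0.05 * row["social_media_hours"]
        row["satisfied"] = rng.gauss(0, 1)
        row["confident"] = rng.gauss(0, 1)
        rows.append(row)
    return rows


def specification_curve(rows, tech_measures, outcome_items):
    """Correlate each tech measure with each possible well-being score.

    A well-being score is the mean of any non-empty subset of items.
    Returns (correlation, tech_measure, items) tuples sorted by correlation.
    """
    results = []
    for tech in tech_measures:
        xs = [row[tech] for row in rows]
        for size in range(1, len(outcome_items) + 1):
            for items in combinations(outcome_items, size):
                ys = [mean(row[i] for i in items) for row in rows]
                results.append((pearson(xs, ys), tech, items))
    return sorted(results, key=lambda result: result[0])


rows = make_fake_survey()
curve = specification_curve(
    rows,
    ["tv_hours", "social_media_hours"],
    ["happy", "satisfied", "confident"],
)
for r, tech, items in curve:
    print(f"{r:+.3f}  {tech:<20} {'+'.join(items)}")
print("median correlation:", round(median(r for r, _, _ in curve), 3))
```

Even with only 14 specifications and a single dataset, the signs and sizes of the correlations will generally not all agree. The summary that matters is the whole sorted curve and its median, not any single line of it.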

These findings also differed depending on whether the analysis factored in information like socioeconomic status, which could obviously play a role in teenagers’ well-being. When controls like this weren’t built in, more of the analyses showed a negative relationship between tech and well-being. Overall, they found a slight negative association, explaining just 0.4 percent of the differences between teenagers’ levels of well-being.
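
For a sense of scale, the 0.4 percent figure can be turned into a correlation coefficient. The sketch below assumes that "0.4 percent of the differences" means the share of variance explained, and that the relationship is a simple one-predictor correlation, in which case that share equals the square of the correlation. The function name is made up for this example.

```python
import math


def correlation_from_variance_explained(share, negative=True):
    """Convert a share of variance explained (r squared) back into r.

    Assumes a single-predictor linear relationship, where the share of
    variance explained equals the squared correlation coefficient.
    """
    r = math.sqrt(share)
    return -r if negative else r


r = correlation_from_variance_explained(0.004)
print(round(r, 3))  # -0.063
```

Under those assumptions the correlation is about -0.06, on a scale where 0 means no linear relationship and -1 means a perfect negative one.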

Tech: As bad as binge-drinking, being arrested, and... potatoes?

There’s a lot that correlations can’t tell us. They can’t tell us about causation, obviously: a relationship between tech and well-being only tells us that the two move together. Tech could cause problems, but there are other ways to interpret things. For instance, depressed teenagers might lean on their cellphones and games more, or an unhappy household might make for a depressed teenager holing up in their room playing games so they don’t have to engage.

A correlation on its own might also not point to very meaningful changes in the real world. To get a better handle on this, Orben and Przybylski compared the small effect they found to other relationships that came out of the data—for instance, how does the effect of technology use compare to the effect of bullying, asthma, or getting plenty of sleep? In one dataset from the US, technology use was overall linked to a slight decline in well-being. But in that data, the consumption of potatoes had an effect that was roughly the same size.

Even here, results varied wildly across datasets. In a sample of teenagers from the UK, technology was about as strongly linked to lower well-being as binge drinking and being arrested, which seems pretty worrying. But in one of the two samples from the US, the effect of binge drinking was eight times larger than that of technology use. Asking different questions of the same data can produce dramatically different answers, and so can asking the same question of different data.

The research is only just getting started

So where does this leave us? More or less back at square one, says Patrick Markey, who researches the effects of video games and wasn’t involved in this work: “Maybe cellphones are terrible! We just don’t know right now.” What’s so exciting about this paper is that it really highlights just how extreme that uncertainty is. With all the problems plaguing large-scale studies linking screen time to negative outcomes (like tiny effect sizes, the garden of forking paths, and bad self-reported data), “I just don’t know what they’re measuring at all,” he says.

For Orben, the low quality of research is also a result of asking broad, sweeping questions. This analysis suggests that, in general, technology use has at most a weak relationship with well-being. But is that true for all kinds of technology and all teenagers? What about a depressed teenage girl who spends hours looking at supermodels on Instagram?

Asking more specific questions is an essential next step, Orben says. “We don’t know yet what different sorts of technology do to people. It’s like asking whether eating sugar is bad for you: are we talking about chocolate cake every hour, or an apple every couple of days? What children are we talking about—diabetics, or athletes?”

Markey, who has been vocal in his criticism of research that pins negative outcomes to technology use, says that he’s not a cheerleader for technology use either—he just doesn’t think the current evidence warrants enough certainty to do things like set policy or advise parenting practices.

“It isn’t that cellphones are good,” he says. “It’s just that we don’t know yet. The research hasn’t been good—and I think we’re only just getting to the point where our measurements might be getting better.”

Nature Human Behaviour, 2018. DOI: 10.1038/s41562-018-0506-1