The notion that digital-screen engagement decreases adolescent well-being has become a recurring feature in public, political, and scientific conversation. The current level of psychological evidence, however, is far removed from the certainty voiced by many commentators. There is little clear-cut evidence that screen time decreases adolescent well-being, and most psychological results are based on single-country, exploratory studies that rely on inaccurate but popular self-report measures of digital-screen engagement. In this study, which encompassed three nationally representative large-scale data sets from Ireland, the United States, and the United Kingdom (N = 17,247 after data exclusions) and included time-use-diary measures of digital-screen engagement, we used both exploratory and confirmatory study designs to introduce methodological and analytical improvements to a growing psychological research area. We found little evidence for substantial negative associations between digital-screen engagement—measured throughout the day or particularly before bedtime—and adolescent well-being.

As digital screens become an increasingly integral part of daily life for many, concerns about their use have become common (see Bell, Bishop, & Przybylski, 2015, for a review). Scientists, practitioners, and policymakers are now looking for evidence that could inform possible large-scale interventions designed to curb the suspected negative effects of excessive adolescent digital engagement (UK Commons Select Committee, 2017). Yet there is still little consensus as to whether and, if so, how digital-screen engagement affects psychological well-being; results of studies have been mixed and inconclusive, and associations—when found—are often small (Etchells, Gage, Rutherford, & Munafò, 2016; Orben & Przybylski, 2019; Parkes, Sweeting, Wight, & Henderson, 2013; Przybylski & Weinstein, 2017; Smith, Ferguson, & Beaver, 2018).

In most previous work, researchers considered the amount of time spent using digital devices, or screen time, as the primary determinant of positive or negative technology effects (Neuman, 1988; Przybylski & Weinstein, 2017). It is therefore imperative that such work incorporate high-quality assessments of screen time. Yet the vast majority of studies rely on retrospective self-report scales, and research indicates that there is good reason to believe that current screen-time measurements are lacking in quality (Scharkow, 2016). People are not skilled at perceiving the time they spend engaging in specific activities (Grondin, 2010), and there are a myriad of additional reasons why people fail to give accurate retrospective self-report judgments (e.g., Boase & Ling, 2013; Schwarz & Oyserman, 2001).

Recent work has demonstrated that only one third of participants provide accurate judgments when asked about their weekly Internet use, while 42% overestimate and 26% underestimate their usage (Scharkow, 2016). Inaccuracies vary systematically as a function of actual digital engagement (Vanden Abeele, Beullens, & Roe, 2013; Wonneberger & Irazoqui, 2017): Heavy Internet users tend to underestimate the amount of time they spend online, while infrequent users overreport this behavior (Scharkow, 2016). Both these trends have been replicated in subsequent studies (Araujo, Wonneberger, Neijens, & de Vreese, 2017). There are therefore substantial and endemic issues regarding the majority of current research investigating digital-technology use and its effects.

Direct tracking of screen time and digital activities on the device level is a promising approach for addressing this measurement problem (Andrews, Ellis, Shaw, & Piwek, 2015; David, Roberts, & Christenson, 2018), yet the method comes with technical issues (Miller, 2012) and is still limited to small samples (Junco, 2013). Given the importance of rapidly gauging the impact of screen time on well-being, other approaches for measuring the phenomena—approaches that can be implemented more widely—are needed for psychological science to progress.

To this end, a handful of recent studies have applied experience-sampling methodology, asking participants specific technology-related questions throughout the day (Verduyn et al., 2015) or after specific bouts of digital engagement (Masur, 2018). This method is complemented by studies using time-use diaries, which require participants to recall what activities they were engaged in during prespecified days; this approach builds a detailed picture of the participants’ daily life (Hanson, Drumheller, Mallard, McKee, & Schlegel, 2010). Because most time-use diaries ask participants to recount small time windows (e.g., every 10 min), they facilitate the summation of total time spent engaging with digital screens and allow for investigation into the time of day that these activities occur. Time-use diaries could therefore extend and complement the more commonly used self-report measurement methodology. Yet work using these promising time-use-diary measures has focused mainly on single smaller data sets, has not been preregistered, and has not examined the effect of digital engagement on psychological well-being.
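To make this aggregation logic concrete, the following is a minimal Python sketch. The 10-min slot length follows the example above; the activity labels and the set of screen activities are hypothetical illustrations, not the coding scheme of any survey discussed here.

```python
# Minimal sketch: summing time-use-diary slots into a daily screen-time total.
# Activity labels and the screen-activity set are illustrative assumptions.

SLOT_MINUTES = 10
SCREEN_ACTIVITIES = {"internet", "video_games", "tv", "phone"}  # hypothetical labels

def daily_screen_minutes(diary):
    """diary: one activity label per consecutive 10-min slot of the day."""
    return SLOT_MINUTES * sum(1 for activity in diary if activity in SCREEN_ACTIVITIES)

# 144 slots = one 24-hr diary day
day = ["sleep"] * 54 + ["school"] * 40 + ["tv"] * 6 + ["homework"] * 29 + ["internet"] * 15
print(daily_screen_minutes(day))  # 210 min of screen engagement
```

Because each slot is time-stamped, the same representation also supports the time-of-day questions discussed below, such as whether screen use occurred shortly before bedtime.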

Specifically, time-use diaries allow us to examine how digital-technology use before bedtime affects both sleep quality and duration. Researchers have postulated that by promoting users’ continued availability and fear of missing out, social media platforms can decrease the amount of time adolescents sleep (Scott, Biello, & Cleland, 2018). Previous research found negative effects when adolescents engaged with digital screens 30 min (Levenson, Shensa, Sidani, Colditz, & Primack, 2017), 1 hr (Harbard, Allen, Trinder, & Bei, 2016), and 2 hr (Orzech, Grandner, Roane, & Carskadon, 2016) before bedtime. These effects could be attributable to delayed bedtimes (Cain & Gradisar, 2010; Orzech et al., 2016) or to difficulties in relaxing after engaging in stimulating technology use (Harbard et al., 2016).

The Present Research

In this research, we focused on the relations between digital engagement and psychological well-being using both time-use diaries and retrospective self-report data obtained from adolescents in three different countries—Ireland, the United States, and the United Kingdom. Across all data sets, our aim was to determine the direction, magnitude, and statistical significance of these relations, with a particular focus on the effects of digital engagement before bedtime. In order to clarify the mixed literature and provide high generalizability and transparency, we used the first two studies to refine a general research question concerning the link between screen time and well-being into specific hypotheses, which were then tested in a third, confirmatory study. More specifically, we used specification-curve analysis (SCA) to identify promising links in the two exploratory studies, generating informed data- and theory-based hypotheses. The robustness of these hypotheses was then evaluated in the third study using a preregistered confirmatory design. By subjecting the results from the first two studies to the highest methodological standards of testing, we aimed to shed further light on whether digital engagement has reliable, measurable, and substantial associations with the psychological well-being of young people.

Exploratory Studies

Method

Data sets and participants
Data from two nationally representative data sets collected in Ireland and the United States were used to explore plausible links between psychological well-being and digital engagement, generating hypotheses for subsequent testing. We selected both data sets because they were large in comparison with typical social-psychological-research data sets (total N = 5,363; Ireland: n = 4,573, United States: n = 790 after data exclusions); they were also nationally representative and had open and harmonized well-being and time-use-diary measurements. Because technology use changes so rapidly, only the most recent wave of time-use diaries was analyzed so that the data would reflect the current state of digital engagement.

The first data set under analysis was Growing Up in Ireland (GUI; Williams et al., 2009). We focused on the GUI child cohort, which tracked 5,023 nine-year-olds recruited via random sampling of primary schools. The wave of interest took place between August 2011 and March 2012 and included 2,514 boys and 2,509 girls, mostly aged 13 (4,943 thirteen-year-olds, 24 twelve-year-olds, and 56 fourteen-year-olds). The time-use diaries were completed on a day individually designated by the head office (either a weekend day or a weekday) after the primary interview of both children and their caretakers. After data exclusions, 4,573 adolescents were included in the study.

Collected between 2014 and 2015, the second data set of interest was the United States Panel Study of Income Dynamics (PSID; Survey Research Center, Institute for Social Research, University of Michigan, 2018), which included 741 girls and 767 boys. It encompassed participants from a variety of age groups: 108 eight-year-olds, 100 nine-year-olds, 110 ten-year-olds, 89 eleven-year-olds, 201 twelve-year-olds, 213 thirteen-year-olds, 190 fourteen-year-olds, 186 fifteen-year-olds, 165 sixteen-year-olds, 127 seventeen-year-olds, and 19 who did not provide an age. We selected only the 790 participants who were between the ages of 12 and 15, to match the age ranges in the other data sets used. The sample was collected by involving all children in households already interviewed by the PSID who were descended from either the original families recruited in 1968 or the new-immigrant family sample added in 1997. Participants in the child supplement who were selected to receive an in-home visit were asked to complete two time-use diaries on randomly assigned days (one on a weekday and one on a weekend day).

Ethical review
The Research Ethics Committee of the Health Research Board in Ireland gave ethical approval to the GUI study. The University of Michigan Health Sciences and Behavioral Sciences Institutional Review Board reviews the PSID annually to ensure its compliance with ethical standards.

Measures
This research examined a variety of well-being and digital-screen-engagement measures. While each data set included a range of well-being questionnaires, we considered only those measures present in at least one of the exploratory data sets (i.e., the GUI and the PSID) and in the data set used for our confirmatory study (detailed below).
Thus, measures included the popular Strengths and Difficulties Questionnaire (SDQ), completed by caretakers (part of the GUI and the confirmatory study), and two well-being questionnaires filled out by adolescents—the Short Mood and Feelings Questionnaire (in the GUI and the confirmatory study) and the Children’s Depression Inventory (in the PSID). In addition, we relied on the Rosenberg Self-Esteem Scale (part of the PSID and the confirmatory study).

Adolescent well-being
The first measure of adolescent well-being considered was the SDQ, completed by the Irish participants’ primary caretakers (Goodman, Ford, Simmons, Gatward, & Meltzer, 2000). This measure of psychosocial functioning has been widely used and validated in school, home, and clinical contexts. It includes 25 questions, 5 each about prosocial behavior, hyperactivity or inattention, emotional symptoms, conduct problems, and peer-relationship problems (0 = not true, 1 = somewhat true, 2 = certainly true; prosocial behavior was not included in our analyses, and the scale was subsequently reverse scored; see the Supplemental Material available online).

The second measure of adolescent well-being was an abbreviated version of the Rosenberg Self-Esteem Scale, completed by U.S. participants (Robins, Hendin, & Trzesniewski, 2001). This five-item measure asked, “How much do you agree or disagree with the following statement?” The statements were “On the whole, I am satisfied with myself”; “I feel like I have a number of good qualities”; “I am able to do things as well as most other people”; “I am a person of value”; and “I feel good about myself.” Participants answered on a 4-point Likert scale that ranged from strongly disagree (1) to strongly agree (4).

Third, for the Irish data set, we included the Short Mood and Feelings Questionnaire as an indicator of well-being. The adolescent participants answered questions about how they had felt or acted in the past 2 weeks using a 3-point scale that ranged from true to not true. Items included “I felt miserable or unhappy,” “I didn’t enjoy anything at all,” “I felt so tired I just sat around and did nothing,” “I was very restless,” “I felt I was no good any more,” “I cried a lot,” “I found it hard to think properly or concentrate,” “I hated myself,” “I was a bad person,” “I felt lonely,” “I thought nobody really loved me,” “I thought I could never be as good as other kids,” and “I did everything wrong.” We subsequently reverse-scored these items so that they instead measured adolescent well-being.

Finally, for the U.S. sample, we included the Children’s Depression Inventory as a measure of adolescent well-being. Participants were asked to think about the last 2 weeks and select the sentence that best described their feelings (see the Supplemental Material for the sentences used). The choices were very similar to those in the 12 questions about subjective affective states and general mood asked in the confirmatory data set detailed later.
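As a concrete illustration of the scoring steps described above, the sketch below reverse-scores symptom-style Likert items so that higher values indicate greater well-being and averages the abbreviated Rosenberg items. The numeric codings are illustrative assumptions, not the surveys' exact response codes.

```python
# Illustrative sketch of the composite scoring described above; the numeric
# codings are assumptions, not the surveys' actual response codes.

def reverse_score(item: int, scale_min: int, scale_max: int) -> int:
    """Flip a Likert item so that its direction is reversed."""
    return scale_min + scale_max - item

# A symptom item coded 0-2 (higher = more symptoms) becomes a well-being
# item (higher = better) after reversing:
symptom_items = [0, 1, 2, 0, 1]
wellbeing_items = [reverse_score(i, 0, 2) for i in symptom_items]  # [2, 1, 0, 2, 1]

# The abbreviated Rosenberg scale: mean of five items coded
# 1 (strongly disagree) to 4 (strongly agree).
rosenberg = [4, 3, 3, 4, 4]
self_esteem = sum(rosenberg) / len(rosenberg)  # 3.6
```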
Adolescent digital engagement
The study included two varieties of digital-engagement measures: retrospective self-report measures of digital engagement and estimates derived from time-use diaries. Details regarding these measures varied for each data set because of differences in the questionnaires and time diaries used. For all data sets, we removed participants who filled out a time-use diary on a weekday that was not during term or school time, and if participants went to bed after midnight (after the time-use diary had concluded), we coded them as going to bed at midnight.

The Irish data set included three questions asking participants to think of a normal weekday during term time and estimate, “How many hours do you spend watching television, videos or DVDs?” “How much time do you spend using the computer (do not include time spent using computers in school)?” and “How much time do you spend playing video games such as PlayStation, Xbox, Nintendo, etc.?” Participants could answer in hours and minutes, but responses were recoded by the survey administrator onto a scale from 0 to 13 (see Note 1). We took the mean of these measures to obtain a general digital-engagement measure. In the U.S. data set, adolescents were asked, “In the past 30 days, how often did you use a computer or other electronic device (such as a tablet or smartphone)” for any of the following: “for school work done at school or at home,” “for these types of online activities (visiting a newspaper or news-related website; watch or listen to music, videos, TV shows, or movies; follow topics or people that interest you on websites, blogs, or social media sites (like Facebook, Instagram or Twitter), not including following or interacting with friends or family online),” “to play games,” and “for interacting with others.” Participants answered using a 5-point Likert scale ranging from never (1) to every day (5). For the U.S. data, we took the mean of these four items to obtain a general digital-engagement measure.

The study focused on five discrete measures from the participants’ self-completed time-use diaries: (a) whether the participants reported engaging with any digital screens, (b) how much time they spent doing so, and whether they did so (c) 2 hr, (d) 1 hr, and (e) 30 min before going to bed. We separated these measures for weekend days and weekdays, resulting in a total of 10 different variables. Each time-use diary, although harmonized by study administrators, was administered and coded slightly differently. The Irish data set contained 21 precoded activities that participants could select for each 15-min period. These included the four categories we then aggregated into our digital-engagement measure: “using the internet/emailing (including social networking, browsing etc.),” “playing computer games (e.g., PlayStation, PSP, X-Box or Wii),” “talking on the phone or texting,” and “watching TV, films, videos or DVDs.” In the U.S. data set, participants (or their caretakers) could report their activities freely, including primary and secondary activities, duration, and where the activity occurred. Research assistants coded these activities afterward. Thirteen codes were aggregated into our digital-engagement measure, including lessons in using a computer or other electronic device, playing electronic games, other technology-based recreational activities, communication using technology or social media, texting, uploading or creating Internet content, nonspecific work with technology such as installing software or hardware, photographic processing, and other activities involving a computer or electronic device. We aggregated these measures rather than analyzing them separately because too few people scored on any one coded variable.
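A hedged sketch of how the before-bedtime indicators and the midnight cap can be derived from a diary is given below. Representing screen use as (start, end) minute intervals within the diary day is a simplifying assumption, not the surveys' actual data layout.

```python
# Sketch of the bedtime-window indicators described above. Representing
# screen use as (start, end) minute intervals is a simplifying assumption.

DIARY_END = 24 * 60  # the diaries end at midnight

def cap_bedtime(bedtime):
    """Code participants who went to bed after the diary ended as going
    to bed at midnight, mirroring the rule described above."""
    return min(bedtime, DIARY_END)

def used_screens_within(window, bedtime, screen_intervals):
    """True if any screen interval overlaps the window before bedtime."""
    bedtime = cap_bedtime(bedtime)
    window_start = bedtime - window
    return any(start < bedtime and end > window_start
               for start, end in screen_intervals)

screen_use = [(17 * 60, 18 * 60), (21 * 60, 21 * 60 + 20)]  # 5-6 p.m., 9:00-9:20 p.m.
bedtime = 22 * 60  # 10 p.m.
print([used_screens_within(w, bedtime, screen_use) for w in (120, 60, 30)])
# [True, True, False]
```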
Time-use-diary measures commonly have high positive skew: Many participants do not note down the activity at all, while only a few report spending much time on it. It is common practice to address this by splitting the time-use variable into two measures, the first reflecting participation and the second reflecting amount of participation (i.e., the time spent doing the activity; Hammer, 2012; Rohrer & Lucas, 2018). Participation is a dichotomous variable representing whether a participant reported engaging in the activity on a given day; time spent is a continuous variable that represents the amount of engagement for participants who reported doing the activity. In addition to including these two measures—separately for weekend days and weekdays—we also created six measures to assess technology use before bedtime. These measures were dichotomous, simply indicating whether the participant had used technology in the specified time interval. These time intervals were 30 min, 1 hr, and 2 hr before bedtime, assessed separately on a weekend day and a weekday.

Covariate and confounding variables
Minimal covariates were incorporated in these exploratory analyses—just gender and age for both the Irish and U.S. data sets—to prevent spurious correlations or conditional associations from complicating our hypothesis-generating process.

Analytic approach
To examine the correlation between technology use and well-being, we used an SCA approach proposed by Simonsohn, Simmons, and Nelson (2015) and applied in recent articles by Rohrer, Egloff, and Schmukle (2017) and Orben and Przybylski (2019). SCA enables researchers to implement many possible analytical pathways and interpret them as one entity, respecting that the “garden of forking paths” allows for many different data-analysis options, which should be taken into account in scientific reporting (Gelman & Loken, 2013). Because the aim of these analyses was to generate informed data- and theory-driven hypotheses to test in a later confirmatory study, the analyses consisted of four steps.

Correlations between retrospective reports and time-use-diary estimates
The first analytical step was to examine the correlations between retrospective self-report and time-use-diary measures of digital engagement, to gauge whether they were measuring similar or distinct concepts. This was done to inform later interpretations of the SCA and to give researchers valuable insights about such widely used measures.

Identifying specifications
We then decided which theoretically defensible specifications to include in the SCA. While this was done a priori for all studies, it was specifically preregistered only for the confirmatory study. The three main analytical choices addressed in the SCA were how to measure well-being, how to measure digital engagement, and whether or not to include statistical controls (see Table 1). Three different possible measures of well-being were included in the exploratory data sets: the SDQ, the reversed Children’s Depression Inventory or Short Mood and Feelings Questionnaire, and the Rosenberg Self-Esteem Scale. There were 11 possible measures of digital engagement, including the retrospective self-report measure and the time-use-diary measures separated by weekend day or weekday (participation, time spent, and engagement at < 2 hr, < 1 hr, and < 30 min before bedtime). Lastly, there was a choice of whether or not to include controls in the subsequent analyses.
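The resulting grid of analytical choices can be enumerated mechanically. The sketch below does so for the two well-being measures available in a single exploratory data set, reproducing the 44 specifications per data set reported in the Results; the measure labels are shorthand, not the data sets' variable names.

```python
# Enumerating the specification grid described above: 2 well-being measures
# x 11 digital-engagement measures x 2 covariate choices = 44 specifications
# per data set. Labels are shorthand, not actual variable names.
from itertools import product

wellbeing = ["sdq_reversed", "smfq_reversed"]  # the two measures in the Irish data
engagement = ["self_report"] + [
    f"{measure}_{day}"
    for day in ("weekday", "weekend")
    for measure in ("participation", "time_spent",
                    "before_bed_2h", "before_bed_1h", "before_bed_30m")
]
controls = [True, False]

specifications = list(product(wellbeing, engagement, controls))
print(len(specifications))  # 44
```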
Table 1. Specifications Tested in the Irish, American, and British Data Sets

Implementing specifications
Taking each specification in turn, we ran a linear regression to obtain the standardized regression coefficient linking the digital-engagement measure to the well-being outcome. To do so, we used the various digital-engagement measures to predict the specific well-being questionnaires identified in the study. The regression either did or did not include covariates, depending on the specification. We noted the standardized regression coefficient, the corresponding p value, and the partial r2. We also ran 500 bootstrapped models of each SCA to obtain the 95% confidence intervals (CIs) around the standardized regression coefficient and the effect-size measure. The specifications were then ranked by their regression coefficients and plotted in a specification curve, where the spread of the associations is most clearly visualized. The bottom panel of the specification-curve plot illustrates which analytical decisions lead to which results, creating a tool for mapping out the too-often-invisible garden of forking paths (for an example, see Fig. 1).

Statistical inferences
Bootstrapped models were implemented to examine whether the associations evident in the calculated specifications were significant (Orben & Przybylski, 2019; Simonsohn et al., 2015). Because we were particularly interested in the different measures and the timing of digital-technology use, we ran a separate significance test for each technology-use measure. A bootstrapped approach was necessary because the specifications do not meet the independence assumption of conventional statistical testing. We created data sets in which we knew the null hypothesis was true and examined the median point estimate (measured using the median regression coefficient) and the number of significant specifications in the dominant direction (the sign of the majority of the specifications) they produced. We used these two significance measures proposed by Simonsohn and colleagues; we do not report a third measure also proposed by the authors, the total number of specifications in the dominant direction, because the nature of the data meant that this test did not give an accurate overview of the data (see the Supplemental Material). It was then possible to calculate whether the number of significant specifications or the size of the median point estimate found in the original data set was surprising—that is, whether fewer than 5% of the null-hypothesis data sets had more significant specifications in the dominant direction, or more extreme median point estimates, than the original data set.

To create the data sets in which the null hypothesis was true, we extracted the regression coefficient of interest (b), multiplied it by the technology-use measure, and subtracted the result from the well-being measure. We then used these values as our dependent well-being variable in a data set in which the effect of interest was known to be absent. We then ran 500 bootstrapped SCAs using these data. Because the bootstrapping operation was repeated 500 times, it was possible to examine whether each bootstrapped data set (in which the null hypothesis was known to be true) had more significant specifications or more extreme median point estimates than the original data set. To obtain the p value of the bootstrapping test, we divided the number of bootstraps with more significant specifications in the dominant direction, or more extreme median point estimates, than the original data set by the overall number of bootstraps.
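A condensed sketch of this bootstrapping procedure follows. It assumes a pandas data frame with illustrative column names (well-being outcomes, one technology-use predictor, age, and gender) and uses unstandardized coefficients for brevity; it is meant to convey the logic, not to reproduce the exact preregistered pipeline.

```python
# Hedged sketch of the bootstrapped null test described above. Column names
# are illustrative; coefficients are unstandardized for brevity.
import numpy as np
import statsmodels.formula.api as smf

def sca(df, outcomes, predictor):
    """Run every specification for one predictor; return (coef, p) pairs."""
    results = []
    for y in outcomes:
        for controls in (True, False):
            formula = f"{y} ~ {predictor}" + (" + age + gender" if controls else "")
            fit = smf.ols(formula, data=df).fit()
            results.append((fit.params[predictor], fit.pvalues[predictor]))
    return results

def summarize(results):
    """Median coefficient and count of significant same-sign specifications."""
    coefs = np.array([c for c, _ in results])
    dominant = np.sign(np.median(coefs))
    n_sig = sum(p < .05 and np.sign(c) == dominant for c, p in results)
    return np.median(coefs), n_sig

def bootstrap_test(df, outcomes, predictor, n_boot=500, seed=1):
    rng = np.random.default_rng(seed)
    obs_median, obs_nsig = summarize(sca(df, outcomes, predictor))
    null = df.copy()
    for y in outcomes:  # force the null: strip the estimated effect from y
        b = smf.ols(f"{y} ~ {predictor}", data=df).fit().params[predictor]
        null[y] = null[y] - b * null[predictor]
    extreme_median = extreme_nsig = 0
    for _ in range(n_boot):
        sample = null.sample(len(null), replace=True, random_state=rng)
        med, n_sig = summarize(sca(sample, outcomes, predictor))
        extreme_median += abs(med) >= abs(obs_median)
        extreme_nsig += n_sig >= obs_nsig
    # One p value per significance measure: the share of null bootstraps at
    # least as extreme as the observed data.
    return extreme_median / n_boot, extreme_nsig / n_boot
```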
Results

Correlations between retrospective reports and time-use-diary estimates
For Irish adolescents, the correlation between digital engagement operationalized using the time-use-diary estimate (prior to dichotomization into participation and time spent) and the retrospective self-report measure was small (r = .18). For American adolescents, the correlations relating time-use-diary measures on a weekday and a weekend day to self-reported digital engagement were small as well (r = .08 and r = .05, respectively).

Identifying and implementing specifications
We identified 44 specifications each for the Irish and U.S. data sets. For details about these specifications, see the columns in Table 1 for these two data sets. After all analytical pathways specified in the previous step were implemented, it was evident that there were significant specifications present in both data sets (Fig. 1, left and middle panels). Some specifications showed significant negative associations (k = 16), though there was a larger proportion of nonsignificant specifications (k = 72). No statistically significant specifications were positive. Specifications using retrospective self-report digital-engagement measures resulted in the largest negative associations in the Irish data. We did not find this trend in the U.S. data, possibly because of the restricted range of response anchors connected to its self-report digital-engagement measures.

Statistical inferences
Using bootstrapped null models, we found significant associations between digital engagement and psychological well-being in both the Irish and American data sets (Table 2). We counted an association as significant only if both tests (the median point estimate and the number of significant specifications in the dominant direction) were significant.

Table 2. Results of the Specification-Curve Analysis Bootstrapping Tests for the Irish, U.S., and U.K. Data Sets

There was a significant association between retrospective self-reported digital engagement (median β = −0.15, p < .001; number of significant results in dominant direction = 4/4, p < .001) and adolescent well-being in the Irish data set. There were also negative associations for some of the time-use-diary measures, notably time spent using digital screens on a weekend day (median β = −0.07, p < .001; number of significant results = 4/4, p < .001) and on a weekday (median β = −0.06, p < .001; number of significant results = 4/4, p < .001). In the American data set, we found significant associations only for digital engagement 1 hr before bedtime on a weekend day (median β = −0.13, p < .001; number of significant results = 2/4, p = .010). There were no significant associations for retrospective self-reported digital engagement. Taking this pattern of results as a whole, we derived a series of promising data- and theory-driven hypotheses to test in a confirmatory study.

Discussion

Because technologies are embedded in our social and professional lives, research concerning digital-screen use and its effects on adolescent well-being is under increasingly intense scientific, public, and policy scrutiny. It is therefore essential that the psychological evidence contributing to the available literature be of the highest possible standard. There are, however, considerable problems, including measurement issues, lack of transparency, little confirmatory work, and overinterpretation of minuscule effect sizes (Orben & Przybylski, 2019). Only a few studies regarding technology effects have used a preregistered confirmatory framework (Elson & Przybylski, 2017; Przybylski & Weinstein, 2017). No large-scale, cross-national work has tried to move away from retrospective self-report measures to gauge time spent engaged with digital screens, yet it has been evident for years that such self-report measures are inherently problematic (Scharkow, 2016; Schwarz & Oyserman, 2001). Until these three shortcomings are addressed in the literature, exploratory studies wholly dependent on retrospective accounts will command an outsized share of public attention (Cavanagh, 2017).

This study marks a novel contribution to the psychological study of technology in a variety of ways. First, we introduced a new measurement of screen time, implemented rigorous and transparent approaches to statistical testing, and explicitly separated hypothesis generation from hypothesis testing. Given the practical and reputational stakes for psychological science, we argue that this approach should be the new baseline for researchers wanting to make scientific claims about the effects of digital engagement on human behavior, development, and well-being.

Second, the study found few substantive, statistically significant negative associations between digital-screen engagement and well-being in adolescents. The most negative associations were found when both self-reported technology use and self-reported well-being measures were used, which could be a result of common method variance or noise in such large-scale questionnaire data. Where statistically significant, associations were smaller than our preregistered cutoff for a practically significant effect, though it bears mention that the upper bound of some of the 95% CIs equaled or exceeded this threshold. In other words, the point estimate was below the smallest effect size of interest (SESOI) of a correlation coefficient (r) of .10, but because the CI overlapped with the SESOI, we cannot confidently rule out the possibility that digital engagement accounts for about 1% of the covariance in the well-being outcomes. This is in line with results from previous research showing that the association between digital-technology use and well-being often falls below or near this threshold (Ferguson, 2009; Orben & Przybylski, 2019; Twenge, Joiner, Rogers, & Martin, 2017; Twenge, Martin, & Campbell, 2018). We argue that these effects are therefore too small to merit substantial scientific discussion (Lakens et al., 2017). This supports previous research showing that there is a small significant negative association between technology use and well-being, which—when compared with other activities in an adolescent’s life—is minuscule (Orben & Przybylski, 2019).

Extrapolating from the median effects found in the MCS data set, adolescents who reported technology use would need to report 63 hr and 31 min more of technology use a day in their time-use diaries to decrease their well-being by 0.50 standard deviations, a magnitude often seen as a cutoff for effects that participants would be subjectively aware of (Norman, Sloan, & Wyrwich, 2003; calculations are included in the Supplemental Material). Whether smaller effects, even when not noticeable, are important is up for debate, as technology use affects a large majority of the population (Rose, 1992). The above calculation is based on the median of the calculated effect sizes; if we instead consider only the specification with the maximum effect size, the time an adolescent would need to spend using technology to experience the relevant decline in well-being decreases to 11 hr and 14 min per day.
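The arithmetic behind such extrapolations follows directly from the standardized coefficients; the exact MCS inputs are reported in the Supplemental Material, so the symbols below are kept generic. With a standardized coefficient beta, a 1-SD increase in diary-reported use predicts a change of beta standard deviations in well-being, so the additional use required for a 0.50-SD decrease is

$$
\Delta_{\text{hours}} \;=\; \frac{0.50 \times \sigma_{\text{tech}}}{\lvert \beta \rvert},
$$

where sigma_tech is the standard deviation, in hours, of diary-reported technology use. Smaller values of |beta| (e.g., the median specification rather than the maximum) therefore imply implausibly large values of Delta_hours, which is what the 63-hr figure above illustrates.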
Third, this study was also one of the first to examine whether digital-screen engagement before bedtime is especially detrimental to adolescent psychological well-being. Public opinion seems to be that using digital screens immediately before bed may be more harmful for teens than screen time spread throughout the day. Our exploratory and confirmatory analyses yielded very mixed effects: Some were negative, while others were positive or inconclusive. Our study therefore suggests that technology use before bedtime might not be inherently harmful to psychological well-being, even though this is a well-worn idea both in the media and in public debates.

Limitations

While we aimed to implement the best possible analyses of the research questions posed in this article, there are issues intrinsic to the data that must be noted. First, time-use diaries as a method for measuring technology use are not inherently problem free. It is possible that reflexive or brief uses of technology concurrent with other activities are not properly recorded by these methods. Likewise, we cannot ensure that all days under analysis were representative. To address both issues, one would need to holistically track technology use across multiple devices over multiple days, though doing this with a population-representative cohort would be extremely resource intensive (Wilcockson, Ellis, & Shaw, 2018). Second, it is important to note that the time-use-diary and well-being measures were not collected on the same occasion. Because the well-being measures inquired about feelings in general, not simply about feelings on the specific day of questioning, the study assumed that the correlation between the two measures still holds, reflecting links between exemplar days and general experiences. Finally, it bears mentioning that the study is correlational and that the directionality of effects cannot, and should not, be inferred from the data.

Conclusion

Until they are displaced by a new technological innovation, digital screens will remain a fixture of human experience. Psychological science can be a powerful tool for quantifying the association between screen use and adolescent well-being, yet it routinely fails to supply the robust, objective, and replicable evidence necessary to support its hypotheses. As the influence of psychological science on policy and public opinion increases, so must our standards of evidence. This article proposes and applies multiple methodological and analytical innovations to set a new standard of quality for psychological research on digital contexts.
Granular technology-engagement metrics, large-scale data, use of SCA to generate hypotheses, and preregistration for hypothesis testing should all form the basis of future work. To retain the influence and trust we often take for granted as a psychological research community, robust and transparent research practices will need to become the norm—not the exception.

Acknowledgements

The Centre for Longitudinal Studies, UCL Institute of Education, collected the Millennium Cohort Study data; the UK Data Archive/UK Data Service provided the data. They bear no responsibility for our analysis or interpretation. We thank J. Rohrer for providing the open-access code on which parts of our analyses are based.

Action Editor

Brent W. Roberts served as action editor for this article.

Author Contributions

A. Orben conceptualized the study with regular guidance from A. K. Przybylski. A. Orben completed the statistical analyses and drafted the manuscript; A. K. Przybylski gave integral feedback. Both authors approved the final manuscript for publication.

Declaration of Conflicting Interests

The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.

Funding

The National Institutes of Health (R01-HD069609/R01-AG040213) and the National Science Foundation (SES-1157698/1623684) supported the Panel Study of Income Dynamics. The Department of Children and Youth Affairs funded Growing Up in Ireland, which was carried out by the Economic and Social Research Institute and Trinity College Dublin. A. Orben was supported by a European Union Horizon 2020 IBSEN grant; A. K. Przybylski was supported by an Understanding Society Fellowship funded by the Economic and Social Research Council. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Supplemental Material

Additional supporting information can be found at http://journals.sagepub.com/doi/10.1177/0956797619830329

ORCID iDs

Amy Orben https://orcid.org/0000-0002-2937-4183
Andrew K. Przybylski https://orcid.org/0000-0001-5547-2185

Open Practices

The data can be accessed using the following links, which require the completion of a request or registration form—Growing Up in Ireland: http://www.ucd.ie/issda/data/guichild/; Panel Study of Income Dynamics: https://simba.isr.umich.edu/U/Login.aspx?TabID=1; Millennium Cohort Study: https://beta.ukdataservice.ac.uk/datacatalogue/series/series?id=2000031#!/access. The analysis code for this study has been made publicly available via the Open Science Framework and can be accessed at https://osf.io/rkb96/. The design and analysis plans were preregistered at https://osf.io/wrh4x/. Changes were made after preregistration because improved methods became available. The complete Open Practices Disclosure for this article can be found at http://journals.sagepub.com/doi/suppl/10.1177/0956797619830329. This article has received the badges for Open Materials and Preregistration. More information about the Open Practices badges can be found at http://www.psychologicalscience.org/publications/badges.

Notes

1. The values on the scale were 0 (0 min), 1 (1–30 min), 2 (31–60 min), 3 (61–90 min), 4 (91–120 min), 5 (121–150 min), 6 (151–180 min), 7 (181–210 min), 8 (211–240 min), 9 (241–270 min), 10 (271–300 min), 11 (301–330 min), 12 (331–360 min), and 13 (361 min or more).