Naive contagion estimates

We estimated social contagion in the exercise behaviours of runners worldwide in a data set that precisely records the geographic locations, social network ties and daily running patterns of ∼1.1M individuals, who ran ∼359M km in a global social network of runners over 5 years. Following Aral12, we define the magnitude of peer effects or contagion in exercise behaviour (which we also refer to as social influence, social contagion, behavioural contagion and network contagion) as the degree to which the exercise behaviours of one’s peers change the likelihood that or extent to which one engages in those behaviours. The data contain the daily distance, duration and pace of, as well as calories burned during, runs undertaken by these individuals, as recorded by a suite of digital fitness-tracking devices. The data also track ∼3.4M social network ties formed among runners to connect and keep track of each other’s running behaviours. We analyse the ∼2.1M ties in the network for which we can geographically locate and find weather information for both nodes connected by a tie. Ties in this network link runners who follow each other’s running habits. Running information was not self-reported. When a run was completed, it was immediately digitally shared with a runner’s friends. Runners could not choose which runs they shared but rather comprehensively shared all new running information with their friends upon connecting their device to the platform.

These data give us unique insight into the daily, coevolving running and social network patterns of these individuals over 5 years. For example, we examined progressively more sophisticated models of the correlations between the running behaviour of an individual (also called ego) and that of his or her friends (also called peers; we use the terms friends and peers interchangeably throughout the paper). Both model-free correlations and ordinary least squares (OLS) models that control for time-invariant and time-varying characteristics of individuals and their peers, including gender, height, weight, degree, device type and country, strongly suggested the possibility of social contagion in running behaviours. In the OLS models, an additional kilometre run by peers was associated with an additional six-tenths of a kilometre run by ego, and an additional 10 min run by peers was associated with an additional 5.3 min run by ego (see ‘Comparison of IV Estimates with an OLS Model’ in Supplementary Note 3 for more detail).

Unfortunately, these estimates are only suggestive because they are subject to the well-known endogeneity biases created by homophily, confounding effects, simultaneity and other factors. We therefore focus our analysis on a natural experiment created by exogenous variation in global weather patterns across geographies. Our approach leverages an inference technique called the instrumental variables (IV) framework, which disentangles endogeneity by using exogenous variation created by natural events as a shock to one endogenous variable to estimate its causal effect on another variable (see the Methods section for more detail).
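The logic of the IV framework can be illustrated with a minimal two-stage least squares sketch on simulated data. This is not the specification used in the paper: the variable names (rain_at_peer, shared_trait), the single instrument, the absence of controls and the simulated effect size of 0.3 are all assumptions chosen for illustration. The sketch shows how an unobserved trait shared by ego and peer inflates the OLS slope, while the instrumented estimate recovers the simulated effect.

```python
import numpy as np


def ols_slope(y, x):
    """Slope from a simple OLS regression of y on x with an intercept."""
    design = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]


def two_stage_least_squares(y, x_endog, z):
    """Point estimate of the effect of x_endog on y using one instrument z."""
    # Stage 1: predict the endogenous regressor from the instrument.
    z_design = np.column_stack([np.ones(len(z)), z])
    gamma, *_ = np.linalg.lstsq(z_design, x_endog, rcond=None)
    x_hat = z_design @ gamma
    # Stage 2: regress the outcome on the predicted regressor.
    return ols_slope(y, x_hat)


def simulate(n=20000, true_effect=0.3, seed=0):
    """Hypothetical data: weather shifts peer running; a shared trait confounds."""
    rng = np.random.default_rng(seed)
    rain_at_peer = rng.normal(size=n)   # instrument (assumed exogenous)
    shared_trait = rng.normal(size=n)   # unobserved confounder, e.g. homophily
    peer_km = 5.0 - rain_at_peer + shared_trait + rng.normal(size=n)
    ego_km = 4.0 + true_effect * peer_km + shared_trait + rng.normal(size=n)
    return ego_km, peer_km, rain_at_peer


ego_km, peer_km, rain_at_peer = simulate()
naive_estimate = ols_slope(ego_km, peer_km)
iv_estimate = two_stage_least_squares(ego_km, peer_km, rain_at_peer)
print(f"OLS: {naive_estimate:.2f}  IV: {iv_estimate:.2f}")
```

The sketch returns point estimates only. Standard errors taken directly from the second-stage regression would be incorrect, so inference would require a dedicated 2SLS routine.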

IV estimation

The results of our IV analysis revealed strong contagion effects: on the same day, on average, an additional kilometre run by friends influences ego to run an additional three-tenths of a kilometre (Fig. 1a); an additional kilometre per minute in friends’ pace influences ego to run three-tenths of a kilometre per minute faster (Fig. 1b); an additional 10 min run by friends influences ego to run 3 min longer (Fig. 1c); and an additional 10 calories burned by friends influences ego to burn an additional 3.5 calories (Fig. 1d). This peer influence diminishes over time, with friends’ running today influencing ego less tomorrow and the day after for every measure.

Figure 1: Peer effects in global running behaviours. The panels display social influence coefficients from second-stage regressions in the two-stage least squares specification for friends’ behaviour at time t influencing ego at time t, t+1 and t+2 for (a) distance run in kilometres (km), (b) pace in km per minute, (c) running duration in minutes and (d) calories burned. Bars are 95% confidence intervals. (e) The table at the bottom of the figure compares social influence coefficients and s.e. from the IV models to those from the OLS models and provides the OLS overestimates of social influence as a percentage of the IV estimates.

Peer effects in exercise behaviours are both statistically and socially significant. Suppose, for example, that a runner (A) usually runs 6 km at a pace of 7 min km⁻¹ (0.143 km min⁻¹) and their friend (B) usually runs 6 km at a pace of 8 min km⁻¹ (0.125 km min⁻¹). An extra kilometre run by B (an increase from 6 to 7 km) causes A to increase their running distance by 0.3 km (from 6 to 6.3 km). Also, a 0.01 km min⁻¹ increase in runner B’s pace (from 0.125 to 0.135 km min⁻¹) causes runner A to increase their pace by 0.003 km min⁻¹ (from 0.143 to 0.146 km min⁻¹).
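The arithmetic in this example can be reproduced with a short sketch. The function and constant names are hypothetical, and the coefficients are the rounded same-day values quoted in the text (0.3 for both distance and pace), not the exact estimates from Fig. 1.

```python
DISTANCE_EFFECT = 0.3  # km run by ego per additional km run by friends (rounded)
PACE_EFFECT = 0.3      # km/min change in ego's pace per km/min change in friends' pace (rounded)


def pace_km_per_min(minutes_per_km):
    """Convert a pace in min per km into km per min."""
    return 1.0 / minutes_per_km


def predicted_ego_value(ego_baseline, peer_change, effect):
    """Ego's baseline plus the peer's change scaled by the influence coefficient."""
    return ego_baseline + effect * peer_change


pace_a = pace_km_per_min(7)  # about 0.143 km/min
pace_b = pace_km_per_min(8)  # 0.125 km/min

# B runs 1 km further than usual; A's usual distance is 6 km.
new_distance_a = predicted_ego_value(6.0, 1.0, DISTANCE_EFFECT)

# B speeds up by 0.01 km/min, from 0.125 to 0.135 km/min.
new_pace_a = predicted_ego_value(pace_a, 0.01, PACE_EFFECT)

print(f"A's distance: {new_distance_a:.1f} km, A's pace: {new_pace_a:.3f} km/min")
```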

The results in Fig. 1 also summarize the dangers of model misspecification in the estimation of peer effects. Naive models that do not account for endogeneity biases created by homophily, confounding effects, simultaneity and other factors dramatically overestimate social spillovers. As the table in Fig. 1e shows, OLS models that control for ego’s (X_it) and peers’ time-varying and time-invariant characteristics (including age, gender, height, weight, degree, device type and country) but that do not implement the IV identification strategy overestimate social influence by between 72% and 81%.
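An overestimate expressed as a percentage of the IV estimate can be computed as in the sketch below. The function name is hypothetical and the input values are illustrative placeholders, not the coefficients reported in Fig. 1e.

```python
def ols_overestimate_pct(ols_coef, iv_coef):
    """Return the OLS overestimate as a percentage of the IV estimate."""
    if iv_coef == 0:
        raise ValueError("IV coefficient must be non-zero")
    return 100.0 * (ols_coef - iv_coef) / iv_coef


# Illustrative values only: an OLS coefficient of 0.9 against an IV coefficient of 0.5.
example_pct = ols_overestimate_pct(0.9, 0.5)
print(f"OLS overestimates the IV coefficient by {example_pct:.0f}%")
```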

Contagion heterogeneity

Peer effects in running are also heterogeneous across relationship types. For example, runners are more influenced by peers whose performance is slightly worse, but not far worse, than their own as well as by those who perform slightly better, but not far better, than they do (Fig. 2a). Moreover, less active runners influence more active runners more than more active runners influence less active runners (Fig. 2b). These results are corroborated by heterogeneity across consistent and inconsistent runners. Inconsistent runners influence consistent runners more than consistent runners influence inconsistent runners (Fig. 2c). Social comparisons may provide an explanation for these results. Festinger’s social comparison theory proposes that we self-evaluate by comparing ourselves to others27. But, in the context of exercise, a debate exists about whether we make upward comparisons to those performing better than ourselves28 or downward comparisons to those performing worse than ourselves29. Comparisons to those ahead of us may motivate our own self-improvement, while comparisons to those behind us may create ‘competitive behaviour to protect one’s superiority’ (27, p. 126). Our findings are consistent with both arguments, but the effects are much larger for downward comparisons than for upward comparisons.

Figure 2: Heterogeneity in social influence effects across relationships. The panels display social influence coefficients across dyadic relationships in which ego is (a,b) a more or less active runner than their friends, (c) a more or less consistent runner than their friends and (d) either the same or a different gender than their friends. Bars are 95% confidence intervals.

We also found strong evidence that social influence depends on gender relations. Influence among same sex pairs is strong, while influence among mixed sex pairs is statistically significantly weaker (Fig. 2d inset). Men strongly influence men, and women moderately influence both men and women, but men do not influence women at all (Fig. 2d). This may be due to gender differences in the motivations for exercise and competition. For example, men report receiving social support, and being more influenced by it, in their decision to adopt exercise behaviours, while women report being more motivated by self-regulation and individual planning30. Moreover, men may be more competitive and specifically more competitive with each other. Experimental evidence suggests that women perform less well than men in mixed gender competition, even though they perform equally well in non-competitive or single sex competitive settings31.

Testing structural theories of contagion

Finally, three theories describe how social network structure may shape behavioural contagions. Centola and Macy32 argue that complex contagions, involving costly behaviours, require multiple reinforcing signals of adoption from different peers to induce behaviour change and suggest that clustered social networks are therefore more likely to spread a complex contagion from one neighborhood to another. Centola16 goes on to predict that in real-world health behaviours such as exercise, which are more costly in terms of ‘time, deprivation, or even physical pain’, the need for social reinforcement should be greater than in his own study of less costly online health behaviours. In contrast, Ugander et al.33 suggest that structural diversity, measured by the number of unconnected clusters (called ‘components’) with at least one adopter, not the number of distinct peers, is the critical structural factor moderating influence. Aral and Walker34, on the other hand, suggest that embeddedness (the number of mutual connections), rather than the number of unconnected clusters, is what drives behavioural contagions. We tested these three structural theories of social contagion by examining how contagion in running varied across different network structures (see ‘Testing Structural Theories of Social Contagion’ section in Supplementary Note 2 and ‘Structural Theories of Social Contagion’ in Supplementary Note 3 for details).
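The first two structural measures can be made concrete with a small sketch. It counts ego's active friends (the quantity emphasized by Complex Contagion) and the number of connected components those active friends form among themselves, with ego removed (the quantity emphasized by Structural Diversity). The function names, the undirected edge-list input and the toy network are assumptions for illustration and are not taken from the paper's supplementary methods.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def active_friends(adjacency, ego, active):
    """Friends of ego who exhibit the behaviour (here, who ran)."""
    return adjacency.get(ego, set()) & set(active)


def active_components(adjacency, ego, active):
    """Count connected components among ego's active friends, excluding ego."""
    unseen = set(active_friends(adjacency, ego, active))
    components = 0
    while unseen:
        components += 1
        stack = [unseen.pop()]
        while stack:
            node = stack.pop()
            # Only follow ties that stay inside the set of active friends.
            reachable = adjacency.get(node, set()) & unseen
            unseen -= reachable
            stack.extend(reachable)
    return components


# Toy network: ego has four friends; a and b know each other, c and d do not.
edges = [("ego", "a"), ("ego", "b"), ("ego", "c"), ("ego", "d"), ("a", "b")]
adjacency = build_adjacency(edges)
ran_today = {"a", "b", "c"}

n_friends = len(active_friends(adjacency, "ego", ran_today))
n_components = active_components(adjacency, "ego", ran_today)
print(n_friends, n_components)
```

In this toy network three friends ran, but they form only two unconnected components, because a and b are tied to each other.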

We found strong evidence confirming both the Structural Diversity and Embeddedness theories of social contagion, but the evidence for Complex Contagion was mixed. Social influence coefficients under the Complex Contagion theory (which argues that the number of active friends is the key driver of diffusion for complex contagions) and the Structural Diversity theory (which argues that the number of active network components is the key driver of diffusion) are statistically significantly different (t-statistic=15.9, N=9.9M). The number of distinct friends who run is positively correlated with social influence when analysed alone (Fig. 3a), but this correlation disappears and becomes negative when we control for the structural diversity of the behaviourally active peer group (Fig. 3b). At the same time, the structural diversity of peer group activation (the number of unconnected network components that exhibit running) strongly predicts greater positive social contagion effects, even when we control for the number of distinct friends who run (Fig. 3b). This replicates the results of Ugander et al.33, who found that, for the social diffusion of Facebook, the number of active friends predicts Facebook adoption but that this correlation disappears and becomes negative when controlling for the structural diversity of Facebook adopting friends. We describe the evidence for Complex Contagion as mixed because the theory defines a complex contagion as one that exhibits adoption thresholds greater than one, meaning more than one adopter friend is required for transmission, and suggests that clustering in behavioural adoption is more conducive to the spread of complex contagions. Our findings show that contagion occurs even with only one adopter friend and that unconnected adopter friends, rather than connected adopter friends, are more likely to transmit exercise behaviours. These results suggest that exercise is not a complex contagion, but they do not invalidate Complex Contagion theory as other behaviours may indeed exhibit complex contagion dynamics.

Figure 3: Testing structural theories of networked contagion. The panels describe the structural correlates of social influence in the distance run (in km). Panel (a) estimates the social influence effects of the number of distinct friends that run and the number of distinct components of friends that run independently, in separate regressions (separated by the dotted line). Panel (b) directly compares, in the same regression, the number of distinct friends that run (supporting Complex Contagion theory) and the number of distinct network components of friends that run (supporting Structural Diversity theory) as structural moderators of social influence effects. The positive estimate for the number of distinct network components of friends that run and the negative estimate for the number of distinct friends that run, when both are analysed together in (b), support the Structural Diversity theory. Panel (c) tests whether embedded dyadic relationships with mutual friends transmit influence more effectively than relationships with no mutual friends (supporting Embeddedness theory). The social influence coefficient estimated for embedded relationships (Regression 2) is statistically significantly greater than the social influence coefficient estimated for non-embedded relationships (Regression 1) (t-statistic=2.45, N=10.7M). Bars are 95% confidence intervals.

The data also confirm that the embeddedness of a relationship (the number of mutual friends between contacts) strongly moderates social influence and contagion in running behaviours (Fig. 3c), confirming the Embeddedness theory. Unlike Complex Contagion and Structural Diversity, the Embeddedness theory does not make predictions about the social structure of adopting friends but rather about the social structure surrounding a transmission, whether or not that structure contains other adopting friends. The embeddedness of a relationship, measured by the number of mutual friends a dyad shares, can promote behavioural contagion because of the social monitoring that embedded relationships facilitate. When two people have many mutual friends, there are greater opportunities for social sanctions, reputational consequences for misbehaviour and social rewards for positive behaviours. Mutual friends may therefore provide an added incentive to keep up with running buddies because shirking is widely observed in a set of mutually reinforcing relationships.
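Embeddedness as defined here, the number of mutual friends a dyad shares, can be computed directly from the tie list. The sketch below assumes an undirected network stored as (u, v) pairs; the function names and the toy network are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def embeddedness(adjacency, u, v):
    """Number of mutual friends shared by u and v."""
    shared = adjacency.get(u, set()) & adjacency.get(v, set())
    return len(shared - {u, v})


# Toy network: ego and peer share two mutual friends; ego and loner share none.
edges = [
    ("ego", "peer"), ("ego", "m1"), ("ego", "m2"),
    ("peer", "m1"), ("peer", "m2"), ("ego", "loner"),
]
adjacency = build_adjacency(edges)

embedded_tie = embeddedness(adjacency, "ego", "peer")
non_embedded_tie = embeddedness(adjacency, "ego", "loner")
print(embedded_tie, non_embedded_tie)
```

Under this definition the count depends only on the ties surrounding the dyad, not on whether the mutual friends themselves run, which matches the distinction drawn between Embeddedness and the other two theories.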