Source Data

The Framingham Heart Study was initiated in 1948, when 5209 people were enrolled in the original cohort.16 The Framingham Offspring Study began in 1971, when most of the children of members of the original cohort and their spouses were enrolled in the offspring cohort.17 There has been almost no loss to follow-up other than death in this cohort of 5124 people; only 10 people left the study. In 2002, the third-generation cohort, consisting of 4095 children of the offspring cohort, was initiated. All participants undergo physical examinations (including measurements of height and weight) and complete written questionnaires at regular intervals.

Network Ascertainment

For our study, we used the offspring cohort as the source of 5124 key subjects, or “egos,” as they are called in social-network analysis. Any persons to whom the egos are linked — in any of the Framingham Heart Study cohorts — can, however, serve as “alters.” Overall, 12,067 living egos and alters were connected at some point during the study period (1971 to 2003).

To create the network data set, we entered information about the offspring cohort into a computer. This information was derived from archived, handwritten administrative tracking sheets that had been used since 1971 to identify people close to the study participants to facilitate follow-up. These sheets contain valuable, previously unused social-network information because they systematically and comprehensively identify relatives and friends named by the ego. The tracking sheets provide complete information about all first-order relatives (parents, spouses, siblings, and children), whether they are alive or dead, and at least one “close friend” at each of seven examinations between 1971 and 2003. The examinations took place during 3-year periods centered in 1973, 1981, 1985, 1989, 1992, 1997, and 1999. Detailed home addresses were also recorded at each time point; we used this information to calculate the geographic distance between people.
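The text does not say how geographic distance was computed from the recorded addresses. As a minimal sketch, assuming each address has already been geocoded to a latitude and longitude (a step not described here), one common choice is the great-circle (haversine) distance. The function name and the Earth-radius constant are illustrative assumptions, not the study's method.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant for this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1 for near-antipodal points.
    a = min(1.0, a)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

Straight-line distance is only one possible definition; travel distance or travel time would need different inputs.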

Many of the named alters on these sheets also were members of Framingham Heart Study cohorts. This newly computerized database thus identifies the network links among participants at each examination and longitudinally from one examination to the next. As a person's family changed because of birth, death, marriage, or divorce, and as contacts changed because of residential moves or new friendships, this information was recorded. Furthermore, dates of birth and death were available from separate Framingham Heart Study files.

Overall, there were 38,611 observed social and family ties to the 5124 egos, yielding an average of 7.5 ties per ego (not including neighbors). For example, 83% of the spouses of egos were directly and repeatedly observed at the time of examination, and 87% of egos with siblings had at least one sibling in the network. For 10% of the egos, an immediate neighbor also participated in the study; more expansive definitions of neighbors yielded similar results.

A total of 45% of the 5124 egos were connected through friendship to another person in the network. There were 3604 unique, observed friendships, for an average of 0.7 friendship tie per ego. Because friendship identifications are directional, we studied three different kinds of friendships: an “ego-perceived friendship,” in which an ego identifies an alter as a friend; an “alter-perceived friendship,” in which an alter identifies an ego as a friend; and a “mutual friendship,” in which the identification is reciprocal. We hypothesized that a friend's social influence on an ego would be affected by the type of friendship, with the strongest effects occurring in mutual friendships, followed by ego-perceived friendships, followed by alter-perceived friendships. Our reasoning was that the person making the identification esteems the other person and may wish to emulate him or her.
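The three friendship types follow mechanically from who named whom. The sketch below assumes a hypothetical representation in which each nomination is stored as a (namer, named) pair; the function name and the participant identifiers are illustrative only.

```python
def classify_friendship(ego, alter, nominations):
    """Classify the ego-alter friendship from a set of (namer, named) pairs.

    Returns "mutual", "ego-perceived", "alter-perceived", or None when
    neither person named the other.
    """
    ego_names_alter = (ego, alter) in nominations
    alter_names_ego = (alter, ego) in nominations
    if ego_names_alter and alter_names_ego:
        return "mutual"
    if ego_names_alter:
        return "ego-perceived"
    if alter_names_ego:
        return "alter-perceived"
    return None


# Hypothetical nominations: A and B name each other, A names C, D names A.
nominations = {("A", "B"), ("B", "A"), ("A", "C"), ("D", "A")}
labels = {alter: classify_friendship("A", alter, nominations) for alter in "BCD"}
```

Here the label depends on whose perspective is taken: the tie between A and C is ego-perceived when A is the ego and alter-perceived when C is the ego.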

Persons were included from the first observation point at which they were older than 21 years of age and at all subsequent points. At the inception of the study, 53% of the egos were women, the mean age of the egos was 38 years (range, 21 to 70), and their mean educational level was 13.6 years (range, no education to ≥17 years of education).

The study data are available from the Framingham Heart Study. The study was approved by the institutional review board at Harvard Medical School; all subjects provided written informed consent.

Statistical Analysis

We graphed the network with the use of the Kamada–Kawai18 algorithm in Pajek software.19 We generated videos of the network by means of the Social Network Image Animator (known as SoNIA).20 We examined whether our data conformed to theoretical network models such as the small-world,10 scale-free,21 and hierarchical types22 (see the Supplementary Appendix, available with the full text of this article at www.nejm.org).

We defined obesity as a body-mass index (the weight in kilograms divided by the square of the height in meters) of 30 or more. Analyses in which the body-mass index was a continuous variable did not yield different results.
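The definition above translates directly into code. This is a minimal sketch; the function and constant names are illustrative.

```python
OBESITY_THRESHOLD = 30.0  # body-mass index of 30 or more, per the definition above


def body_mass_index(weight_kg, height_m):
    """Weight in kilograms divided by the square of the height in meters."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_obese(weight_kg, height_m):
    """True when the body-mass index is 30 or more."""
    return body_mass_index(weight_kg, height_m) >= OBESITY_THRESHOLD
```

For example, a person weighing 90 kg at a height of 1.70 m has a body-mass index of about 31.1 and is classified as obese, whereas 70 kg at 1.75 m gives about 22.9.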

We considered three explanations for the clustering of obese people. First, egos might choose to associate with like alters (“homophily”).21,23,24 Second, egos and alters might share attributes or jointly experience unobserved contemporaneous events that cause their weight to vary at the same time (confounding). Third, alters might exert social influence or peer effects on egos (“induction”). Distinguishing the interpersonal induction of obesity from homophily requires dynamic, longitudinal network information about the emergence of ties between people (“nodes”) in a network and also about the attributes of nodes (i.e., repeated measures of the body-mass index).25

The basic statistical analysis involved the specification of longitudinal logistic-regression models in which the ego's obesity status at any given examination or time point (t+1) was a function of various attributes, such as the ego's age, sex, and educational level; the ego's obesity status at the previous time point (t); and most pertinent, the alter's obesity status at times t and t+1.25 We used generalized estimating equations to account for multiple observations of the same ego across examinations and across ego–alter pairs.26 We assumed an independent working correlation structure for the clusters.26,27
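One way to write down the model described above is shown here as a sketch. The notation is an assumption introduced for illustration: Y denotes obesity status (0 or 1), i indexes the ego, j the alter, and x collects the ego's covariates such as age, sex, and educational level.

```latex
\operatorname{logit}\,\Pr\!\left(Y^{\text{ego}}_{i,t+1} = 1\right)
  = \beta_0
  + \beta_1\, Y^{\text{ego}}_{i,t}
  + \beta_2\, Y^{\text{alter}}_{j,t+1}
  + \beta_3\, Y^{\text{alter}}_{j,t}
  + \boldsymbol{\gamma}^{\top} \mathbf{x}_{i,t}
```

In this notation, \beta_1 is the coefficient on the ego's lagged obesity status, \beta_3 the coefficient on the alter's lagged status, and \beta_2 the coefficient on the alter's contemporaneous status.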

The use of a time-lagged dependent variable (lagged to the previous examination) eliminated serial correlation in the errors (evaluated with a Lagrange multiplier test28) and also substantially controlled for the ego's genetic endowment and any intrinsic, stable predisposition to obesity. The use of a lagged independent variable for an alter's weight status controlled for homophily.25 The key variable of interest was an alter's obesity at time t+1. A significant coefficient for this variable would suggest either that an alter's weight affected an ego's weight or that an ego and an alter experienced contemporaneous events affecting both their weights. We estimated these models for various types of ego–alter pairs.
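The lagged and contemporaneous variables can be assembled as one row per ego–alter pair per pair of consecutive examinations. The sketch below uses a hypothetical data layout (one list of obesity indicators per person, ordered by examination, with None for a missed examination) and makes the simplifying assumption that each pair is linked at every examination, which the actual time-varying network data would not satisfy.

```python
def build_lagged_rows(obesity, pairs):
    """Build regression rows from per-person obesity series.

    obesity: dict mapping person id -> list of 0/1/None, one entry per examination.
    pairs:   iterable of (ego, alter) ids, assumed linked at every examination.
    """
    rows = []
    for ego, alter in pairs:
        ego_series, alter_series = obesity[ego], obesity[alter]
        n_exams = min(len(ego_series), len(alter_series))
        for t in range(n_exams - 1):
            values = (ego_series[t], ego_series[t + 1],
                      alter_series[t], alter_series[t + 1])
            if any(v is None for v in values):
                continue  # skip intervals with a missing measurement
            rows.append({
                "ego": ego,
                "alter": alter,
                "t": t,
                "ego_obese_t": ego_series[t],            # lagged dependent variable
                "ego_obese_next": ego_series[t + 1],     # outcome at t+1
                "alter_obese_t": alter_series[t],        # lagged alter status
                "alter_obese_next": alter_series[t + 1], # key variable of interest
            })
    return rows


# Hypothetical example: three examinations for one ego-alter pair.
example_rows = build_lagged_rows({"e": [0, 0, 1], "a": [0, 1, 1]}, [("e", "a")])
```

Because the same ego contributes several rows, the rows are not independent, which is why the analysis uses generalized estimating equations clustered on the ego.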

To evaluate the possibility that omitted variables or unobserved events might explain the associations, we examined how the type or direction of the social relationship between the ego and the alter affected the association between the ego's obesity and the alter's obesity. For example, if unobserved factors drove the association between the ego's obesity and the alter's obesity, then the directionality of friendship should not have been relevant.

We evaluated the role of a possible spread in smoking-cessation behavior as a contributor to the spread of obesity by adding variables for the smoking status of egos and alters at times t and t+1 to the foregoing models. We also analyzed the role of geographic distance between egos and alters by adding such a variable.

We calculated 95% confidence intervals by simulating the first difference in the alter's contemporaneous obesity (changing from 0 to 1), using 1000 randomly drawn sets of estimates from the coefficient covariance matrix and assuming mean values for all other variables.29 All tests were two-tailed. The sensitivity of the results was assessed with multiple additional analyses (see the Supplementary Appendix).
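The simulation procedure can be sketched as follows, using only the Python standard library. The coefficient values, covariance matrix, and covariate means in the usage example are invented for illustration and are not study estimates. The sketch also assumes that the draws come from a multivariate normal distribution and that the interval is read off as empirical percentiles; the text does not spell out either detail.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def first_difference_ci(beta_hat, cov, x_mean, alter_index, n_draws=1000, seed=0):
    """Approximate 95% interval for the change in predicted probability when the
    alter's contemporaneous obesity indicator moves from 0 to 1, with all other
    variables held at the values in x_mean."""
    rng = random.Random(seed)
    chol = cholesky(cov)
    x0, x1 = list(x_mean), list(x_mean)
    x0[alter_index], x1[alter_index] = 0.0, 1.0
    diffs = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in beta_hat]
        # One draw of the coefficients: beta_hat + L z, where L L^T = cov.
        beta = [b + sum(chol[i][k] * z[k] for k in range(i + 1))
                for i, b in enumerate(beta_hat)]
        p0 = inv_logit(sum(b * x for b, x in zip(beta, x0)))
        p1 = inv_logit(sum(b * x for b, x in zip(beta, x1)))
        diffs.append(p1 - p0)
    diffs.sort()
    return diffs[int(0.025 * n_draws)], diffs[int(0.975 * n_draws) - 1]


# Hypothetical two-coefficient model: intercept and alter's obesity at t+1.
lower, upper = first_difference_ci(
    beta_hat=[-1.0, 0.5],
    cov=[[0.04, 0.0], [0.0, 0.01]],
    x_mean=[1.0, 0.3],
    alter_index=1,
)
```

Simulating from the estimated coefficient distribution propagates the uncertainty in all coefficients into the predicted-probability scale, which a confidence interval on the single logit coefficient would not do directly.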