Significance Knee osteoarthritis is a highly prevalent, disabling joint disease with causes that remain poorly understood but are commonly attributed to aging and obesity. To gain insight into the etiology of knee osteoarthritis, this study traces long-term trends in the disease in the United States using large skeletal samples spanning from prehistoric times to the present. We show that knee osteoarthritis long existed at low frequencies, but since the mid-20th century, the disease has doubled in prevalence. Our analyses contradict the view that the recent surge in knee osteoarthritis occurred simply because people live longer and are more commonly obese. Instead, our results highlight the need to study additional, likely preventable risk factors that have become ubiquitous within the last half-century.

Abstract Knee osteoarthritis (OA) is believed to be highly prevalent today because of recent increases in life expectancy and body mass index (BMI), but this assumption has not been tested using long-term historical or evolutionary data. We analyzed long-term trends in knee OA prevalence in the United States using cadaver-derived skeletons of people aged ≥50 y whose BMI at death was documented and who lived during the early industrial era (1800s to early 1900s; n = 1,581) and the modern postindustrial era (late 1900s to early 2000s; n = 819). Knee OA among individuals estimated to be ≥50 y old was also assessed in archeologically derived skeletons of prehistoric hunter-gatherers and early farmers (6000–300 B.P.; n = 176). OA was diagnosed based on the presence of eburnation (polish from bone-on-bone contact). Overall, knee OA prevalence was found to be 16% among the postindustrial sample but only 6% and 8% among the early industrial and prehistoric samples, respectively. After controlling for age, BMI, and other variables, knee OA prevalence was 2.1-fold higher (95% confidence interval, 1.5–3.1) in the postindustrial sample than in the early industrial sample. Our results indicate that increases in longevity and BMI are insufficient to explain the approximate doubling of knee OA prevalence that has occurred in the United States since the mid-20th century. Knee OA is thus more preventable than is commonly assumed, but prevention will require research on additional independent risk factors that either arose or have become amplified in the postindustrial era.

Osteoarthritis (OA) is the most prevalent joint disease and a leading source of chronic pain and disability in the United States (1) and other developed nations (2). Knee OA accounts for more than 80% of the disease’s total burden (2) and affects at least 19% of American adults aged 45 y and older (3). Substantial evidence indicates that knee OA is proximately caused by the breakdown of joint tissues from mechanical loading (4) and inflammation (5), but the deeper underlying causes of knee OA’s high prevalence remain unclear and poorly tested, hindering efforts to prevent and treat the disease. Two recent public health trends, however, are commonly assumed to be dominant factors (6, 7). First, because knee OA’s prevalence increases with age (8), the rise in life expectancy in the United States since the early 20th century is thought to have led to high knee OA levels among the elderly, with the presumption that, as people age, their senescing joint tissues accumulate more wear and tear from loading (9). Second, high body mass index (BMI) has become epidemic in the United States in recent decades and is a well-known risk factor for knee OA (8), probably because of the combined effects of joint overloading and adiposity-induced inflammation (10). Whether increases in longevity and BMI are responsible for current knee OA levels has never been tested, but this assumption has led many to view the disease’s high prevalence as effectively unpreventable, since aging is untreatable, and the high BMI epidemic is intractable (8, 11).

One underused yet potentially powerful way to identify and assess the risk factors responsible for current knee OA levels is to examine long-term changes in the disease’s prevalence by comparing contemporary with historic and prehistoric populations (12). Epidemiological studies of present day populations are valuable but are limited in their ability to analyze risk factors that are now pervasive but used to be less common. It is difficult to find large samples of living Americans whose lifestyles, including physical activity levels and diet, resemble those of past generations. Although many variables cannot be measured and thus controlled in epidemiological studies of people living in the past, a major benefit of analyzing populations over historical and evolutionary time is to assess known risk factors under different environmental conditions and thus bring to light the effects of risk factors that might not be apparent or testable in modern populations alone. Furthermore, although knee OA is known to be ancient (12), we know very little about changes in its prevalence over time. Low levels of knee OA have been reported for some historic and prehistoric populations (13⇓⇓⇓–17), suggesting that the disease’s prevalence has recently increased, but these studies used different diagnostic criteria than those used to diagnose knee OA in living patients, used samples composed mostly of younger individuals, and did not account for BMI, complicating comparisons with modern epidemiological data.

Here, we investigate long-term trends in knee OA prevalence in the United States and evaluate the effects of longevity and BMI on levels of the disease by comparing the prevalence of knee OA among people who lived during the early industrial era (19th to early 20th centuries) with that of people from the modern postindustrial era (late 20th to early 21st centuries). We studied knee OA in the largest available collections of cadaver-derived skeletal remains of people of documented age, BMI, sex, and ethnicity. To further consider knee OA levels from an evolutionary perspective, we also analyzed knee OA in a large sample of archeological skeletons of prehistoric Native American hunter-gatherers (6000–300 B.P.) and early farmers (900–300 B.P.). Although BMI is undocumented for prehistoric skeletons, the age at death and sex can be estimated, allowing us to assess the prevalence of knee OA among older individuals in these populations. The skeletal collections used in this study are, by necessity, samples composed of individuals who could not be randomly selected and for whom we lack comprehensive demographic and contextual information. Despite these limitations, these samples constitute the best available evidence for knee OA levels in the United States during earlier time periods to test if prevalence of the disease is higher today than in the past.

Materials and Methods Study Samples. The early industrial and postindustrial samples studied included complete skeletons of people aged 50 y and older who lived in major urban areas in the United States (Table S1). All individuals were documented as being of either European-American or African-American ancestry. Early industrial individuals (n = 1,581) were inhabitants of Cleveland, Ohio and St. Louis, Missouri who died between 1905 and 1940. BMI at death was recorded for 84% of these skeletons (n = 1,334). Postindustrial individuals (n = 819) lived in Albuquerque, New Mexico and Knoxville, Tennessee and died between 1976 and 2015. BMI at death was recorded for 64% of these skeletons (n = 525). All cadavers were acquired by academic institutions for the purposes of medical and anatomical education and research. Early industrial cadavers were of individuals whose bodies were unclaimed at local morgues or became property of the state; postindustrial cadavers were gathered through body donation programs. Occupation was documented for only 23% of individuals (n = 544), but records indicate that differences between samples reflect shifts in the US workforce between early industrial and postindustrial times, with the early industrial sample comprising primarily highly physically active laborers and the postindustrial sample including more service sector workers with less physically demanding jobs (SI Text). Cause of death was documented for 80% of individuals (n = 1,918), and differences between samples evince the epidemiological transition between early industrial and postindustrial times, with most deaths among early industrial individuals caused by infectious diseases, such as pneumonia and tuberculosis, whereas most deaths among postindustrial individuals were caused by noninfectious diseases, such as cancer and atherosclerotic heart disease (SI Text). Skeletons with knee joint articular surfaces that were severely damaged postmortem were excluded from the study as were individuals with lower limb amputations. Table S1. Osteological collections analyzed The prehistoric sample included skeletons from eight archeological sites (in Alaska, California, New Mexico, Kentucky, and Ohio) of people estimated to be aged 50 y and older who were hunter-gatherers (n = 116) and early farmers (n = 60) (Table S1). Only skeletons sufficiently preserved to examine both the right and left knees were included. Sex assignment was based on dimorphic characteristics of the pubis that have been shown to be 96% accurate (18). Individuals were estimated to be ≥50 y old based on age-related changes in the configuration of the auricular surface of the ilium (19). This method has been shown to correctly exclude individuals younger than 50 y of age with 100% accuracy (20). Unfortunately, estimating age precisely beyond 50 y is not possible with available osteological methods, and it is not possible to estimate accurately BMI at death from skeletal remains. Knee OA Diagnosis. Diagnosis of knee OA was based on visual identification of the presence of eburnation on the articular surfaces of the right or left distal femur, proximal tibia, or patella. Eburnation is a sclerotic, ivory-like reaction of subchondral bone that occurs from bone-on-bone contact at sites exposed by advanced cartilage erosion (12, 21). In pathology studies of skeletal remains, eburnation is considered pathognomonic for moderate to severe OA (12, 22⇓–24). Although it was not possible to assess knee OA blinded to the skeleton’s collection of origin, eburnation can be identified with negligible interobserver variation (SI Text). To avoid false-positive diagnoses, individuals exhibiting knee eburnation but also osteological signs of non-OA arthritides, such as rheumatoid arthritis, calcium pyrophosphate deposition disease, and spondyloarthropathy, were excluded. Osteophytes, bone spurs that often form at the margins of osteoarthritic joints (22⇓⇓–25), were generally large and expansive on eburnated knees but were not used as a diagnostic criterion for knee OA because interobserver variation in identifying osteophytes in skeletal samples is high (26), and they can lead to false-positive diagnoses (12); also, arthroplasty prostheses were not used to diagnose the disease among postindustrial individuals. Knee OA prevalence estimates for our samples are therefore underestimates of total disease prevalence, because they do not include mild (or early) cases of knee OA [e.g., cases that would be classified as two on the Kellgren–Lawrence scale (25)] or cases of the disease where arthroplasty prostheses obscure underlying eburnation. Statistical Analyses. Log-binomial generalized linear models (GLMs) were used to estimate adjusted prevalence and prevalence ratios for knee OA, which are reported with 95% confidence intervals (95% CIs). Prevalence is a measure of effect size that varies as a function of the values of predictors. Here, we predict prevalence over a range of values of the predictor of interest while holding all other covariates constant at the sample mean. A prevalence ratio is a measure of effect size that is constant over the range of the predictor of interest while controlling for other covariates. Since prevalence ratios are multiplicative, they denote a rate of change (percentage change) of the response per unit increase in the predictor of interest. Model goodness of fit was assessed using the Hosmer–Lemeshow χ2 test, with significance of individual estimates determined through two-sided Wald tests with an alpha level of 0.05 (Table S2). Table S2. Hosmer–Lemeshow goodness of fit test results Three separate GLMs were performed with a binary response variable indicating presence or absence of knee OA for each individual but including different explanatory variables. The first analysis included the prehistoric, early industrial, and postindustrial samples and controlled only for sex effects, since age and BMI were undocumented for prehistoric individuals. The second and third analyses included only the early industrial and postindustrial samples and additionally controlled for age, BMI, and ethnicity. The second analysis used all available individuals weighted equally, whereas the third analysis incorporated a subset of individuals who were differentially weighted based on optimal matching of covariate values between the early industrial and postindustrial samples. The analysis of matched data was performed as a sensitivity check to assess whether inferences were robust to sampling bias between the early industrial and postindustrial samples. This bias was evidenced by the large differences in average covariate values between these two samples in the unmatched data (Table 1). The purpose of matching is to approximate an experimental template, where the matching procedure mimics blocking before random group assignment to balance average covariate values between “target” and “comparator” groups. Separation of the estimation procedure into two steps simulates the research design of an experiment where no information on outcomes is known at the point of experimental design and randomization. The nonparametric matching procedure is therefore a data preprocessing step that replicates a randomized experiment with respect to observed covariates (27). Preprocessing was achieved by matching individuals from the early industrial and postindustrial samples that had a similar propensity to be included in the postindustrial sample based on covariate values (SI Text). Pruning nonmatches increased similarity in average covariate values between the early industrial and postindustrial samples (Table 1) and reduced model dependency and bias (28). Weights for each individual were constructed to estimate the average effect of interest on the postindustrial sample, with the early industrial sample weighted to look like the postindustrial sample. Analyses were conducted using R, version 3.3.2 (29). Table 1. Sample composition and covariate balance before and after matching

Results Long-Term Change in Knee OA Prevalence. Across all individuals analyzed (n = 2,576), the prevalence of knee OA was markedly higher among individuals from the postindustrial era compared with individuals from early industrial and prehistoric times, with females more affected than males (Fig. 1 A and C). After controlling for sex, knee OA prevalence in the postindustrial sample (16%; 95% CI, 14–19%) was 2.6 times higher (95% CI, 2.1–3.4; P < 0.001) than in the early industrial sample (6%; 95% CI, 5–7%) and 2 times higher (95% CI, 1.3–3.3; P = 0.003) than in the prehistoric sample (8%; 95% CI, 5–13%). Among postindustrial individuals with knee OA, 42% (64/151) had the disease in both knees, a 2.5-fold higher proportion (Fisher’s exact test: P = 0.042) of bilateral cases of knee OA than among the diseased individuals in the prehistoric sample (17%; 3/18) and 1.4-fold higher (Fisher’s exact test: P = 0.058) compared with the early industrial sample (30%; 28/94). Fig. 1. Knee OA prevalence during different time periods. (A and B) Knee OA prevalence from regression models controlling for sex (A) as well as age, BMI, sex, and ethnicity (B). Dark and light gray bars are from unmatched and matched analyses, respectively (B). (C and D) Knee OA prevalence ratios from regression models including sex (C) as well as age, BMI, sex, and ethnicity (D) as predictor variables. Black and light gray dots are from unmatched and matched analyses, respectively (D). Age and BMI were entered into models as continuous variables, but effects are reported for 10-y and 5-U intervals, respectively (D). Whiskers represent 95% CIs. Ethnicity effects are reported in Table S3. Table S3. Knee OA prevalence ratios Temporal Change in Knee OA Prevalence Controlling for Age and BMI. To test whether the higher levels of knee OA in the postindustrial era are attributable to greater longevity and higher BMIs, we analyzed the subset of individuals in our samples for whom age and BMI were documented (n = 1,859). Individuals from the postindustrial group were, on average, 6 y older and had 41% higher BMIs than their early industrial counterparts (Welch’s t test: P < 0.001 for both the age and BMI comparisons) (Table 1). Only 1% (13/1,334) of early industrial individuals were obese (BMI ≥ 30) and 6% (74/1,334) were overweight (25 ≤ BMI < 30) compared with 25% (132/525) and 24% (126/525) of postindustrial individuals who were obese and overweight, respectively (Fisher’s exact test: P < 0.001 for both the obese and overweight comparisons). Nevertheless, in a model controlling for age, BMI, and other variables, knee OA prevalence in the postindustrial sample (11%; 95% CI, 8–14%) remained 2.1 times higher (95% CI, 1.5–3.1; P < 0.001) than in the early industrial sample (5%; 95% CI, 4–7%) (Fig. 1 B and D). Age and BMI were positively associated with knee OA prevalence (P < 0.001 for both variables) (Fig. 1D), but at all ages, knee OA prevalence was at least twice as high in the postindustrial sample than in the early industrial sample, even after controlling for BMI (Fig. 2). Fig. 2. Age-related change in knee OA prevalence controlling for BMI, sex, and ethnicity. Shading represents 95% CIs. Temporal Change in Knee OA Prevalence Assessed Using Matched Samples. Matching individuals from the early industrial and postindustrial samples by propensity score increased covariate balance by 99% for age, 100% for BMI, 71% for sex, and 95% for ethnicity (Table 1). In a model using these matched samples and additionally controlling for age, BMI, and other variables, knee OA prevalence in the postindustrial sample (15%; 95% CI, 12–19%) remained approximately twice as high (prevalence ratio: 1.9; 95% CI, 1.1–3.5; P < 0.029) compared with the early industrial sample (8%; 95% CI, 5–13%) (Fig. 1 B and D).

Discussion To gain insight into the current high prevalence of OA in the United States and other developed nations, this study examined long-term trends in knee OA levels in the United States from prehistoric times through the early industrial era to the modern postindustrial era. These data show that knee OA long existed at low frequencies, but since the mid-20th century, knee OA has approximately doubled in prevalence, even after accounting for the effects of age and BMI. Our analyses therefore indicate that, although knee OA prevalence has increased over time, today’s high levels of the disease are not, as commonly assumed, simply an inevitable consequence of people living longer and more often having a high BMI. Instead, our analyses indicate the presence of additional independent risk factors that seem to be either unique to or amplified in the postindustrial era. Retrospective studies cannot directly test causation, but the dramatic increase in knee OA prevalence in recent times raises the question of what these additional risk factors might be. Alleles of genes, such as GDF5, have been shown to influence knee OA susceptibility (30), but the approximate doubling of knee OA prevalence in just the last few generations proves that recent environmental changes have played a principal role. The results of this study are thus clinically significant because they indicate that knee OA may be more preventable than is currently supposed. Given evidence that nearly all knee OA is associated with loading-induced damage to joint tissues (4), either because the loads are abnormal or the tissues are structurally weak, one especially important source of environmental change that warrants greater attention is whether and how joint loading has altered. Trauma has presumably always predisposed some individuals to knee OA (8), as suggested by the predominance of unilateral knee OA since prehistoric times (31), and while joint overloading from high BMI has become common only recently, our results indicate that the majority of knee OA today is not caused by high BMI per se. Although altered loads generated by walking more frequently on hard pavements (32) or with certain forms of footwear (33) might be factors, another possibility that merits more study is physical inactivity, which has become epidemic during the postindustrial era. Less physically active individuals who load their joints less develop thinner cartilage with lower proteoglycan content (34, 35) as well as weaker muscles responsible for protecting joints by stabilizing them and limiting joint reaction forces (36). Chronic low-grade inflammation, which is exacerbated by physical inactivity (37), modern diets rich in highly refined carbohydrates (38), and excessive adiposity (10), can further magnify and accelerate loading-induced damage to joint tissues and may also directly affect knee OA pathogenesis (5). Evaluating which of these or additional features of modern environments are responsible for today’s high knee OA levels is necessary. This study has important limitations that need to be considered. First, the samples analyzed, although large for their kind, were constrained by the availability of well-curated skeletal collections in the United States, and it is plausible that these collections exhibit levels of knee OA that differ from the actual US population prevalence. Second, BMI recorded at death is likely to underestimate average lifetime BMI, especially for individuals whose cause of death was associated with somatic wasting. While discrepancy between lifetime and postmortem BMI introduces error into the relationship between BMI and knee OA, such error is likely to have been systematic rather than specific to a particular time period. Third, although it is reasonable to infer that the postindustrial individuals studied here were, on average, less physically active and consumed more proinflammatory diets than those from earlier periods (39), direct data on these and other potential risk factors are not available for the individuals studied. Fourth, although socioeconomic status was undocumented for individuals in this study, the early industrial group likely included more relatively low-income individuals than the postindustrial group. This difference, however, partly reflects important sociodemographic shifts that occurred across the epidemiological transition between time periods (40). Fifth, BMI was unknown for prehistoric individuals, and although sex is reliably determined, age estimates beyond 50 y old are imprecise. Thus, the prehistoric samples could not be included in regression models that used age and BMI as predictor variables, and although the modal age of adult death in living hunter-gatherers is 68–78 y old (41), we cannot reject the hypothesis that knee OA levels are lower among prehistoric individuals than among postindustrial individuals, partly because prehistoric individuals were, on average, younger or had lower BMIs. Although the causes of OA in general and knee OA in particular are still not fully understood, the most important conclusion of this study is that the recent increase in knee OA levels cannot simply be considered an inevitable consequence of people living longer, but instead is the result of modifiable risk factors, including but not limited to high BMI, that have become more common since the mid-20th century. From an evolutionary perspective, knee OA thus fits the criteria of a “mismatch disease” that is more prevalent or severe because our bodies are inadequately or imperfectly adapted to modern environments (39). Intriguingly, other well-studied mismatch diseases, such as hypertension, atherosclerotic heart disease, and type 2 diabetes (39), that also have become epidemic during the last few decades are strongly associated with knee OA (42), suggesting common causes and risk factors. Susceptibility to knee OA and other mismatch diseases is undoubtedly influenced by intrinsic factors, including age, sex, and genes, but the historical and evolutionary perspective afforded by our data underscores that many modern cases of knee OA may be preventable. Prevention, however, will require a reappraisal of potential risk factors that have emerged or intensified only very recently. As with other mismatch diseases, it is likely that any effective prevention strategy will involve adjusting physical activity patterns and diets to approximate more closely the lifestyle conditions under which our species evolved.

SI Text Occupation-Related Physical Activity Among Early Industrial and Postindustrial Individuals. Since the mid-20th century in the United States, there has been a dramatic and progressive decline in the percentage of individuals employed in physically demanding occupations, such as manufacturing and agriculture, along with a steady increase in the percentage of individuals employed in less physically demanding, more sedentary occupations, such as those in the service sector (43). To assess whether the jobs of individuals in the early industrial and postindustrial samples reflect this temporal trend in occupation-related physical activity, we assigned each individual with a documented occupation (n = 544) an estimate of occupational physical activity intensity [metabolic equivalent (MET)] based on a previously published classification scheme (44). Jobs assigned MET values of less than three are sedentary or require light amounts of physical activity, whereas jobs given MET values greater than or equal to three involve moderate to large amounts of physical activity (43). Among early industrial individuals, 76% (34/45) were employed in occupations assigned MET values greater than or equal to three, a 2.2-fold higher proportion (Fisher’s exact test: P < 0.0001) of physically demanding jobs than among individuals in the postindustrial sample (34%; 169/499). Therefore, based on the limited data available, occupation-related physical activity among individuals in the early industrial and postindustrial samples seems to fit the trend of a shift toward less physically demanding jobs over the past half-century (43). Cause of Death Among Early Industrial and Postindustrial Individuals. It is well known that a major shift has occurred in the United States since the late 19th century in the types of diseases responsible for the majority of mortalities. During this shift, known as the epidemiological transition, improvements in sanitation and medicine lowered rates of deaths caused by infectious diseases and malnutrition, while other changes in diet and physical activity spurred a concomitant rise in mortality caused by chronic noninfectious diseases (45, 46). To determine whether individuals in the early industrial and postindustrial samples evince such changes in mortality patterns, we analyzed available death records (n = 1,918). Among early industrial individuals, 94% (1,316/1,402) of deaths were related to diseases as opposed to accidents or other external factors. Of the deaths resulting from diseases, the most common causes were heart disease (34%; primarily infectious or valvular), pneumonia or influenza (18%), and tuberculosis or syphilis (15%). In contrast, among postindustrial individuals who died from diseases rather than external factors (82%; 424/516), the most common causes of mortality were cancer (39%), heart disease (30%; primarily atherosclerotic), and noninfectious respiratory diseases (11%). These mortality patterns closely match US population statistics reported previously by more comprehensive analyses of temporal change in causes of death (47, 48), and they reflect the transition between early industrial and postindustrial times in the threat to life posed by infectious diseases relative to chronic noninfectious diseases (45, 46). Interobserver Agreement for Eburnation Identification. Interobserver agreement for identifying eburnation on skeletal material has been shown to be high (26). To confirm this previous finding and further establish the precision of discerning the presence and absence of eburnation, we compared our diagnoses for a subset of the postindustrial sample (n = 123 skeletons) with the diagnoses of a forensic anthropologist who previously analyzed the skeletons independently of our study (49). Our diagnoses were found to match those of the independent researcher in 122 cases (99% agreement), with 29 and 93 skeletons being classified in both analyses as displaying and not displaying eburnation, respectively. The single discrepancy between the two analyses was that eburnation was identified on a skeleton in our analysis that was not recognized by the independent researcher. Agreement between the two analyses measured using the kappa statistic (50) yields a coefficient value of 0.98 (95% CI, 0.93–1.00), indicating near-perfect interobserver agreement (51). Log-Binomial GLMs. Effect sizes (prevalence and prevalence ratios) were estimated using log-binomial models with the following basic composition. The response y is binary, and there are k observations y 1 ,…, y k , which are assumed to be independent: y i = { 1 if individual i exhibits OA 0 if individual i does not exhibit OA . Each ith observation of y is treated as a realization of a random variable Y i that can take the values one and zero with probabilities π i and 1 − π i , respectively. We assume that Y i has a Bernoulli distribution Y i ∼ Β ( π i ) with probability π i . This defines the stochastic structure of the model. Furthermore, we assume that the natural log of the underlying probability of success (P[Y = 1]; i.e., the proportion of diseased individuals in a group) is a linear function of the predictors ln ⁡ P [ Y = 1 | X ] = X β , where X is a design matrix that relates the predictors (x 1 …x n ) to the data, and β is a vector of regression coefficients. This defines the systematic structure of the model. The prevalence ratio, which is the relative proportion of diseased individuals in one group compared with another group, is defined for a given predictor x i as e β i . Reported P values are for tests performed on the log scale of the link function. In the first analysis, when all three samples (prehistoric, early industrial, and postindustrial) are compared, P values for the three pairwise mean differences in log prevalence are adjusted for multiple comparisons using the sequential Bonferroni method. Effect Size Interpretation in GLMs. The interpretation of effect sizes in GLMs depends on the scale in which they are reported. Several scales are available, including the link function, the response, and for some distribution families, a scale intermediate between the link function and response. In log-binomial GLMs, the link function is the natural logarithm, so effect sizes are estimated on the log prevalence scale. On this scale, the design matrix of predictors is linearly related to the expectation of the response variable; thus, the model is linear in the parameters. Effect sizes are rarely reported on the link function scale, however, as interpretation is unintuitive. Effect sizes are typically back-transformed via exponentiation either to prevalence ratios between two groups (or two values of a continuous predictor) or to prevalence estimates for a single group. Prevalence ratios represent a constant effect over the range of the predictor of interest while controlling for other covariates. Since the prevalence ratio is multiplicative, it denotes a rate of change (percentage change) of the response per unit increase in the predictor of interest. Predicted prevalence is an effect that varies as a function of the values of predictors. In practice, prevalence is therefore predicted over a range of values of the predictor of interest while holding all other covariates constant (typically at the sample mean), and we follow this convention (52). Doubly Robust, Weighted GLM. For the first and second analyses, parametric (log-binomial) models were used to estimate adjusted prevalence and prevalence ratios, with all observations weighted equally, conditioning on available covariates (i.e., sex for the first analysis and age, BMI, sex, and ethnicity for the second analysis). For the third analysis, we used a method of adjusting for as much information in control variables as possible without parametric assumptions. This was achieved by preprocessing the dataset with a matching method, so that the early industrial sample was as similar as possible to the postindustrial sample across the observed control variables. In this preprocessed dataset, the time period variable (the quantity of interest) is closer to being independent of background covariates. By reducing the link between the time period variable and control variables, preprocessing makes estimates based on subsequent parametric analyses far less dependent on modeling choices and specifications (28). In addition, since most of the adjustment for potentially confounding variables is done nonparametrically, the potential for bias is greatly reduced compared with parametric analyses based on raw data (27). In the third analysis, we therefore used a doubly robust estimator that used the conditioning variables two times: first in a nonparametric matching analysis that increased covariate balance (the similarity between average covariate values in the early industrial and postindustrial samples) by matching individuals from the early industrial and postindustrial time periods, and second in a parametric log-binomial model to adjust for remaining imbalance that was not eliminated by the matching procedure (53). This method had the following steps. i) Estimate a distance measure for the similarity between two individuals. For every individual in the analysis, a distance measure was calculated to assess the similarity between individuals in the early industrial and postindustrial samples. Propensity scores were used as the distance measure. These scores summarize all covariates into one scalar: the probability of being in the postindustrial sample. Propensity scores are balancing scores: at each value, the distribution of the covariates defining the propensity score is the same in the early industrial and postindustrial samples. Therefore, grouping individuals with similar propensity scores replicates a randomized experiment, although only with respect to observed covariates. The scores were estimated using logistic regression with time period as the binary response and age, BMI, sex, and ethnicity as predictors. The postindustrial sample was chosen as the target sample for matching, since it comprised only 28% (n = 525) of the total sample for the two time periods (n = 1,859).

ii) Exclude observations outside the region of common support. The region of common support was defined as the range of propensity scores exhibiting overlap between the early industrial and postindustrial samples. Observations not inside the region of overlap were excluded before matching.

iii) Match individuals from the early industrial period to those from the postindustrial period by minimizing the propensity score distance between the two samples and generating observation-level weights. Matching methods can be thought of as creating groups of individuals with at least one target and one comparator individual in each. In the “full matching” procedure used here, these groups are referred to as subclasses. There are as many subclasses as individuals in the target group (the postindustrial sample), while the number of comparator individuals varies across subclasses. This is because one-to-many matches are allowed within a subclass (i.e., a single individual from the target group can be matched to multiple individuals from the comparator group), thus making use of more observations than pair-matching techniques. The full matching method is optimal in terms of minimizing a weighted average of the estimated propensity score between each early industrial individual and each postindustrial individual within each subclass (54). Weights are constructed to estimate the average effect of interest on the target group, with the comparator group weighted to look like the target group. Unmatched individuals receive a weight of zero, while all matched target individuals receive a weight of one. Weights for matched comparator individuals are calculated in three steps. First, within each subclass, each comparator individual is given a preliminary weight of n ti /n ci , where n ti and n ci are the numbers of target and comparator individuals in subclass i, respectively. Second, each comparator individual’s weight is added up across the subclasses in which it was matched. Third, the comparator group weights are scaled to sum to the number of uniquely matched comparator individuals. Observation-level weights are then included in any subsequent parametric analyses. Full matching was performed using the R package MatchIt (55).

iv) Perform parametric modeling. Log-binomial models were used with weights from the matching analysis to increase covariate balance and additionally included the observed covariates to control for any residual imbalance between the early industrial and postindustrial samples.

Acknowledgments We thank the curatorial staffs of institutions housing the skeletal collections analyzed, including the American Museum of Natural History, the Cleveland Museum of Natural History, the Department of Anthropology at San Jose State University, the Forensic Anthropology Center at the University of Tennessee, the Maxwell Museum of Anthropology at the University of New Mexico, the National Museum of Natural History, the Peabody Museum at Harvard University, and the W. S. Webb Museum of Anthropology at the University of Kentucky. We also thank Michèle Morgan for providing age estimates for the prehistoric skeletons from New Mexico, and Ashley Brennaman for providing data used to assess interobserver agreement for eburnation identification. This work was supported by the Hintze Family Charitable Foundation and the American School of Prehistoric Research (Harvard University).

Footnotes Author contributions: I.J.W., S.W., D.T.F., and D.E.L. designed research; I.J.W., R.D.J., K.T.W., H.M., and R.J.W. performed research; I.J.W. and S.W. analyzed data; and I.J.W., S.W., D.T.F., and D.E.L. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission. O.P. is a guest editor invited by the Editorial Board.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1703856114/-/DCSupplemental.