Significance Agriculture changed not only human culture and lifeways, but human biology as well. Previous studies indicate that softer agricultural diets may have resulted in a less robust craniofacial morphology in early farmers. However, obtaining reliable estimates of worldwide subsistence effects has proved challenging. Here, we quantify changes in human skull shape and form across the agricultural transition at a global scale. Although modest, the effects are often reliably directional and most pronounced in craniofacial features that are directly involved in mastication.

Abstract Agricultural foods and technologies are thought to have eased the mechanical demands of diet—how often or how hard one had to chew—in human populations worldwide. Some evidence suggests correspondingly worldwide changes in skull shape and form across the agricultural transition, although these changes have proved difficult to characterize at a global scale. Here, adapting a quantitative genetics mixed model for complex phenotypes, we quantify the influence of diet on global human skull shape and form. We detect modest directional differences between foragers and farmers. The effects are consistent with softer diets in preindustrial farming groups and are most pronounced and reliably directional when the farming class is limited to dairying populations. Diet effect magnitudes are relatively small, affirming the primary role of neutral evolutionary processes—genetic drift, mutation, and gene flow structured by population history and migrations—in shaping diversity in the human skull. The results also bring an additional perspective to the paradox of why Homo sapiens, particularly agriculturalists, appear to be relatively well suited to efficient (high-leverage) chewing.

The emergence and spread of agriculture are among the more remarkable developments in the evolutionary history of Homo sapiens. This change in lifeway appears to be associated with changes in human skull shape and form. Although global cranial diversity is generally well explained by neutral evolutionary processes (1⇓⇓–4), early farmers tend to have a chewing architecture that is, at least in some dimensions, less massive than that of their hunter-gatherer counterparts (refs. 5⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓–20 and 21, chap. 7). Explanatory scenarios cohere around the idea that softer agricultural foods reduce masticatory demands, resulting in less robust craniofacial skeletons and reduced and repositioned chewing muscles.

This is the essence of the “masticatory-functional hypothesis” Carlson and Van Gerven (5) posited four decades ago to explain morphological differences among a chronological series of ancient Nubian populations—from Mesolithic hunter-gatherers to Christian agriculturalists. Subsequent forager–farmer comparisons for European, Asian, and American samples support a trend of craniofacial reduction with agriculture (7⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓⇓–19). Most of these studies sample a small number of geographically local populations (but see refs. 22 and 23). Local comparisons are valuable because they often provide a detailed picture of the cultural, dietary, and chronological context for the morphological differences between closely related groups. In some cases, cultural and other evidence supports a hypothesis of biological continuity between the foragers and descendant farmers (5⇓–7, 10, 12, 14, 16⇓⇓–19). However, with few sampled groups, it can be difficult to separate diet effects from other factors differentiating the populations. Moreover, the major dimensions of reduction can vary from study to study, and some farmer masticatory dimensions are larger in some comparisons (8⇓–10, 18).

An alternative approach samples many populations, globally or regionally, to assess the extent to which deviations from a population genetic, neutral model of diversification correspond to differences in mode of subsistence (22, 23). Due to the complexities of characterizing high-dimensional phenotypes in structured samples, each observed variable (shape, diet, genetic data, or a proxy for it) is typically transformed from its original units to a matrix of pairwise distances between populations. The correlation between shape and diet distances, after accounting for population history and structure, becomes the focus of the inquiry.

However, the essential units of morphology are shape, form, and size, not pairwise distances. The beauty of statistical shape analysis is its potential to quantify and concretely represent morphological differences in morphological units. The loss of this potential when evaluating directional effects (diet, climate, etc.) in distance units is especially unfortunate: A distance analysis quantifies the correlation between morphological and diet distances, but not what the morphological response to subsistence practice looks like. It is the latter objective that motivates geometric morphometrics (24, 25) and is central to evaluating functional and evolutionary hypotheses. Insights at this level require methods that permit direct analysis of morphological observations in structured samples.

Here, we provide estimates of the influence of agriculture on human skull shape, form, and size at a global scale. The morphological observations are three-dimensional landmark data (Fig. 1 and SI Appendix, Fig. S1). The skeletal sample (Fig. 2 and SI Appendix, Document S1) is a large collection of preindustrial forager and farmer crania (n = 559 from 25 groups) and mandibles (n = 534 from 24 groups). Although bilateral landmarks were collected for most specimens, we average the sides and evaluate hemiforms so that somewhat fragmentary remains can be included in the sample.

Fig. 2. Sample map. The sample consists of n = 599 crania (25 groups) and n = 534 mandibles (24 groups). Population names, subsistence profiles, and additional details are in SI Appendix, Document S1.

We adapt a Bayesian, quantitative genetics mixed model for high-dimensional phenotypes (26) to these data. For each skeletal element (cranium, mandible), we fit three models, each with a different diet predictor. The diet predictors identify whether a specific agricultural subsistence staple is present/regular or absent/rare in a population: dairy (“Milk”); maize, wheat, rice, or other cereals (“Mush”); and “Soft,” which groups together all Milk and Mush populations. Diet assignments were made based on published archaeological, isotopic, and enthnographic reports (SI Appendix, Document S1). We focus on dairy and cereals because their association with reduced oral processing demands is relatively uncontroversial. All models also include fixed effects for sex and mean annual temperature, a random effect for population structure, and residual error. Temperature is known to influence human cranial diversity (3, 4, 27⇓–29). Without controlling for temperature, the absence of agriculture in extremely cold climates could confound diet effects.

For each fixed-effect predictor, the model estimates regression coefficients for each landmark coordinate and for size (log centroid size). Visualizing these coefficients as shape and form transformations across the agricultural transition yields intuitive, biologically meaningful representations of how skull morphology varies with subsistence conditions.

Discussion Although the Milk, Soft, and Mush binary predictors are coarse proxies for overall diet differences, the order of effect size magnitudes (Milk > Soft > Mush) suggests these predictors capture relevant variation in biomechanical demands among farming classes. Dairy items are generally softer and require less oral processing than cereals. Liquid dairy consumption requires no bite force whatsoever. It is therefore reasonable, and consistent with the logic of the masticatory-functional hypothesis, that masticatory reduction would be most noticeable in dairying groups. The specific differences in morphology are also consistent with a hypothesis of reduced biomechanical demands in farming (particularly dairying) groups. All else equal, a smaller bony masticatory apparatus would be less able to withstand bite force and muscle action forces. A reduced superior temporal line outlines a smaller anterior temporalis muscle. A narrower mandibular ramus indirectly suggests a reduced attachment area for, and hence reduced cross-sectional area of, the masseter and medial pterygoid muscles. These three muscles are the primary elevators of the jaw during chewing cycles. Reductions to these muscles imply reduced capacity to generate high or repetitive bite forces. A more inferiorly located superior temporal line (Figs. 3 and 4) and relatively taller coronoid process also imply a smaller (shorter) temporalis muscle, but not necessarily in a dimension that relates to bite force capacity. Finally, if the taller palatal vault in farmers indicates a thinner bony palate, then this morphological difference may reflect reduced loads as well (30, 31). The posterior shift of the maxillary and mandibular tooth rows in farmers does not as obviously fit a hypothesis of reduced masticatory performance. All else equal, more posteriorly located cheek teeth should increase the mechanical advantage of the masticatory system by shortening the external moment arm to the food bolus. This tooth row position paradox—increased leverage in an environment of reduced performance demands—has been noted elsewhere as a surprising feature of H. sapiens masticatory morphology (32⇓⇓–35). A more integrated view of the masticatory apparatus partially tempers the efficiency implications: Because of the potential for working-side temporomandibular joint (TMJ) dislocation, posteriorly positioned cheek teeth constrain muscle recruitment during chewing (34, 35). Nevertheless, the apparent efficiency of the human masticatory apparatus calls for an explanation. One proposal is that high leverage reflects selection for the ability to generate powerful bite forces (32). Alternatively, it has been argued that the shortened distance from TMJ to bite point is part of a generally flatter human facial morphology and thus that nonmasticatory explanations are more likely (34, 36). Our results point to a different explanation. First, we do not detect a strong association between agricultural diets and overall facial flatness; general orthognathy is more clearly associated with environmental temperature (37). More importantly, the morphological reductions that suggest increased mechanical advantage can actually be attributed to reduced oral processing. Consider the mandibular ramus, narrower in farming groups (Fig. 6C). Ramal growth displaces the tooth row anteriorly (ref. 38, chap. 4). All else equal, a narrow ramus results in more posteriorly positioned mandibular cheek teeth. All else equal, a shorter maxilla (Fig. 6B) has the same implications for the cranium. We therefore suggest that the mechanical advantage of farmers relative to foragers, and perhaps of H. sapiens relative to other taxa, may be incidental to reduced masticatory performance demands. Finally, it is important to put diet effect magnitudes in perspective. The vast majority of human genetic diversity is within group diversity, and much of the genetic variation that differentiates populations is consistent with neutral evolutionary processes (39⇓⇓–42). Variation in the human cranium is patterned similarly (1, 2, 43). As a simple accounting matter, one would therefore expect the influence of diet to be comparatively small. The relative ranking of effects in Fig. 7 and SI Appendix, Fig. S8 is consistent with this expectation: Subsistence differences are a fraction of typical differences among individuals and smaller than typical differences between groups of average relatedness. Nevertheless, for closely related groups, diet effects and group-level differences are of similar magnitude. This latter result may explain some of the inconsistency among studies that contrast local samples of foragers and farmers. In such samples, diet effects may be substantially obscured or magnified, depending on the extent to which the direction of subsistence and structured effects align.

Conclusion Explaining human cranial diversity has long occupied a central place in biological anthropology (44⇓⇓–47). Here, we isolated the influence of neutral evolutionary processes on global diversity to quantify changes in skull shape and form across the agricultural transition. The changes fit well with a hypothesis of reduced masticatory performance demands in farming groups. Due to some combination of food material properties and food processing/preparation (e.g., ceramic ware cooking), agricultural staples were likely easier to chew than foods typically consumed by foragers. Increased prevalence of dental malocclusion and tooth crowding in agricultural groups (21, 48) provides added support for this inference. However, morphological change need not be massive to have functional resonance. The changes in human skull shape and form and masticatory muscle size we identify are relatively small. Small diet effect magnitudes are consistent with studies quantifying the major variance components of global human genetic and cranial diversity, where most variation is found within groups. Small effects are also consistent with a long view of hominin cultural and morphological coevolution. The technologies for cooking, cutting, grinding, and pounding food all precede the emergence of agriculture. Each would have eased oral processing demands in hunter-gatherers as well as early farmers. Finally, inferences concerning the biological mechanism of subsistence-driven differences in skull morphology tend to favor phenotypic plasticity over natural selection (9, 10, 48⇓–50). A substantial body of experimental feeding and muscle function studies demonstrates the feasibility of a plastic response (31, 50⇓⇓⇓⇓–55). Comparative analyses of dental malocclusion tend to support the inference of plasticity as well (48). Nevertheless, in some forager–farmer ontogenetic comparisons, craniofacial differences consistent with variation in diet functional demands are evident before (15) or very shortly after (56) weaning age. These results in young individuals are consistent with a genetic mechanism. Dietary specializations have also been shown to produce some of the most discernable patterns of genetic divergence among living human groups (57⇓–59). We therefore think genetic mechanisms should not be wholly discounted in studies of the effects of agriculture on skull morphology.

Materials and Methods Landmark Data. Landmarks were recorded by D.C.K. For all but two populations, landmark data were collected directly using a Microscribe 3DX digitizer. For the Pampa Grande and Chubut samples from Argentina, D.C.K. used Avizo Lite (FEI Co., v. 9.0.1) to create surface models and record landmarks from computed tomography (CT) scans. We made two concessions to increase sample size in several populations. First, in some archaeological samples, we found the bones of the basicranium, particularly the occipital, were often fragmentary or displaced. The cranial landmark set therefore includes few basicranial landmarks. Second, although bilateral landmarks were collected for most specimens, the mixed model is fitted to cranial and mandibular hemiforms after averaging the right and left sides. This allows us to include true hemiform mandible fragments (symphysis plus landmarks from one side). An alternative would reflect mandible hemifragments, creating a bilaterally symmetric jaw by fitting the symphysis landmarks of the original and reflected forms to each other. However, we found that small amounts of measurement error along the symphysis occasionally result in large, clearly nonbiological variation in mandibular width at more distal landmarks. Imputation. For a missing right or left bilateral landmark where the antimere is present, we impute coordinates using a reflected relabeling procedure that substitutes the position of the reflected antimere for the missing point (60). Missing midline landmarks, and bilateral landmarks absent on both sides, were inferred using two-block partial least-squares (PLS) imputation (60). A total of 51 crania and 34 mandibles (respectively, 9.1% and 6.4% of specimens) required PLS imputation. No specimen required PLS imputation of more than two landmarks. Superimposition. After imputation, landmark configurations were aligned using generalized Procrustes analysis (GPA: 24). GPA is a multistep procedure that removes location and centroid size differences between configurations and then rotates each configuration to minimize its squared distance from the sample mean shape. Mixed Model. The mixed-effect model was developed to estimate effect sizes in structured samples (61) and has a long history in quantitative genetic studies of pedigreed observations (62⇓–64). Recent innovations extend the mixed model to interspecies samples (65, 66) and to highly multivariate data such as observations of shape or form (26, 28, 67). The mixed model for the matrix of shape and size observations Y is Y = X B + Z U + E . The observations (Y) are thus reconstructed as the outcome of contributions from fixed effects (XB), random effects of population history and structure (ZU), and residual error (E). In our implementation, all individuals from a population share the same temperature, diet, and structured contributions, and all males share a common sex effect. The error term characterizes idiosyncratic, individual-level variation. To minimize abstraction, we describe a mixed model for cranial observations (n = 559). The reference categories for the binary predictors are female and the presumably harder-textured (forager) diet. In Y, each cranium is characterized by a vector of 112 traits: the log centroid size and Procrustes residuals for a hemiform of 37 anatomical landmarks in three dimensions. B is a 4 × 112 matrix, each column containing an intercept and coefficients for sex, temperature, and diet (say, dairy domestication) for one cranial trait. U is a 25 × 112 matrix of structured random effects, each row of U corresponding to a sampled population. X and Z are known design matrices for B and U, respectively. E is a 559 × 112 error matrix. U has a matrix normal distribution with mean 0, column covariance matrix A (the 25 × 25 population relationship matrix), and row covariance matrix H (the 112 × 112 dispersion matrix for traits, analogous to G for pedigreed observations). E has a matrix normal distribution with mean 0, column covariance matrix I (the 559 × 559 identity matrix), and row covariance matrix R (a 112 × 112 residual covariance matrix). Thus, the covariance matrices for U and E in vectorized form are H ⊗ A and R ⊗ I, respectively, where ⊗ indicates the Kronecker product. For p traits, H contains p × (p + 1)/2 parameters to be estimated. This quadratic scaling can result in a rapid loss of precision and produce unstable parameter estimates in small to moderate samples if p is large (68). Runcie and Mukherjee (26) propose a Bayesian solution to this problem, whereby H is estimated with an underlying factor model (also ref. 28). The rationale for the factor approach is grounded in a model of biological development. In essence, if k modular, developmental processes contribute to covariance in a p-dimensional phenotype, with k < p, a factor model capturing the modules provides a lower-dimensional solution to H. The method is implemented in the Bayesian Sparse Factor Analysis of Genetic Covariance Matrices (BSFG) software package. BSFG uses an adaptive Gibbs sampler to estimate posterior densities of model parameters. After a 1,000,000-iteration burn in, we generated 1,000,000 realizations from a single Markov chain, thinning at a rate of 1,000 to obtain 1,000 posterior samples for inference. We examined time-series graphs of model parameters over Gibbs iterations to confirm that mixing was adequate. Relationship Matrix. Fitting a mixed model requires a relationship matrix (A; SI Appendix, Table S1), which encodes pairwise evolutionary correlations between sample populations. Because genetic data are not available for most groups in our sample, we relied on the close correspondence between geographic and genetic distances among human groups (29, 40) to estimate relatedness. Geographic distances were estimated using the haversine (69), with migration routes computed over landmasses using reasonable waypoints for passage between continents and over bodies of water. We fitted a linear regression of genetic distance [δμ2 microsatellite distance (70)] on geographic distance for the Human Genome Diversity Project-Centre d'Étude du Polymorphisme Humain microsatellite diversity panel (375 loci, 2,112 samples, 52 populations) (71) and then used the coefficients to predict δμ2 distances for our sample. For populations i and j having an estimated δμ2 distance D ij , the pairwise relatedness due to structure is A i j = D m a x − D i j D m a x , where D max is the maximum δμ2 between sampled pairs. Shape and Form Contrasts. Shape estimates require only a summation of landmark coordinate coefficients. The expected shape of a female forager is c + β 0 ¯ , where c is the consensus configuration (in vector form), and β 0 ¯ is the mean vector of intercept coefficients, averaged over Gibbs realizations. The expected shape for a (female) dairy agriculturalist is simply c + β 0 ¯ + β M i l k ¯ . Rescaling shape estimates by centroid size coefficients renders the model’s predictions in Procrustes form space. Statistical uncertainty in the diet effect estimates is visualized by plotting the baseline (female forager) configuration, along with landmark displacement vectors for stochastically varying realizations of β M i l k . The β M i l k realizations are sampled with replacement from the posterior distribution generated by the Gibbs sampler. Fixed-Effect Predictors. The Milk, Mush, and Soft binary predictor structures are coarse relative to actual variation in subsistence (72, 73). However, subsistence data are limited for several sample populations (SI Appendix, Document S1), mandating the use of broad, simple categories. If the food items that define the subsistence classes were to account for the entire diet of a farming population, average chewing stress is likely to be highest for the Mush farming class (cereals), lower for Soft (cereals and/or dairy), and lowest for Milk (dairy). If these rankings are correct, the Mush model contrasts cereal agriculturalists with a poorly defined harder diet category—one that includes populations expected to have both the highest and the lowest masticatory demands (foragers and dairy consumers, respectively). Nevertheless, we fitted separate models for all three diet predictors because we did not have enough prior information about the total diets of the populations to make an exclusion. As an alternative, we considered incorporating Milk and Mush predictors in the same model. However, with relatively few dairying populations that are not also cereal domesticators, we found this approach resulted in coefficients with very high levels of uncertainty for both subsistence categories. We considered inclusion of a sample age (chronology) predictor, but its incorporation is problematic for several reasons. Sample dating quality varies substantially; some samples accumulated over centuries whereas others are more temporally constrained; within regions, sample age effects are potentially useful if the populations are related by direct biological descent, but are otherwise misleading to some unknown degree. We have no means to assess whether the sampled populations are related by biological descent. For these reasons, the sample age predictor was not incorporated. Computing. BSFG is implemented in Matlab (Mathworks). Geometric morphometrics and posterior analysis of model coefficients were carried out in R (74) with scripts written by D.C.K. Three-dimensional plots were generated in R, using the rgl package (75), and converted to u3d format in Meshlab (Visual Computing Lab-ISTI-CNR). Scripts for several procedures are available at GitHubGist (https://gist.github.com/davidckatz). Data and additional code are available from the authors.

Acknowledgments We thank D. Runcie and S. Mukherjee (guidance with BSFG), Andre Strauss (CT scans), the University of California, Davis (UC Davis) Paleoanthropology Group, Michael Berthaume, and the PNAS editors and anonymous reviewers (manuscript comments). We also thank the institutions who graciously granted access to the skeletal materials in their care: Institut de Paléontologie Humaine (Stéphanie Renault, Amélie Vialet); Musée de l’Homme (Alain Froment, Philippe Mennecier, Martin Friess, Aurélie Fort, Véronique Laborde); University of Vienna (Katrin Schäfer); Naturhistorische Museum Wien (Maria Teschler-Nicola, Karin Wiltschke); University of Copenhagen (Niels Lynnerup); The British Museum (Daniel Antoine); Duckworth Laboratory (Marta Mirazón Lahr); University of Kyoto (Masato Nakatsukasa); Tokyo Natural Science Museum collections at Tsukuba University (Yousuke Kaifu, Kazuhiro Sakaue); Vietnamese Institute of Archaeology (Marc Oxenham, Trinh Hoang Hiep, Nguyen Anh Tuan); American Museum of Natural History (AMNH) (Gisselle Garcia); Harvard Peabody Museum (Michele Morgan, Olivia Herschensohn); University of Pennsylvania Museum of Archaeology and Anthropology (Janet Monge); Phoebe Hearst Museum of Anthropology (Natasha Johnson); San Jose State University (Elizabeth Weiss); and William S. Webb Museum (George Crothers). This material is based upon work supported by the National Science Foundation under Grant BCS-1232590, the Wenner-Gren Foundation, the UC Davis Department of Anthropology, and the AMNH.