The exact timing, route, and process of the initial peopling of the Americas remain uncertain despite much research. Archaeological evidence indicates the presence of humans as far as southern Chile by 14.6 thousand years ago (ka), shortly after the Pleistocene ice sheets blocking access from eastern Beringia began to retreat. Genetic estimates of the timing and route of entry have been constrained by the lack of suitable calibration points and low genetic diversity of Native Americans. We sequenced 92 whole mitochondrial genomes from pre-Columbian South American skeletons dating from 8.6 to 0.5 ka, allowing a detailed, temporally calibrated reconstruction of the peopling of the Americas in a Bayesian coalescent analysis. The data suggest that a small population entered the Americas via a coastal route around 16.0 ka, following previous isolation in eastern Beringia for ~2.4 to 9 thousand years after separation from eastern Siberian populations. Following a rapid movement throughout the Americas, limited gene flow in South America resulted in a marked phylogeographic structure of populations, which persisted through time. All of the ancient mitochondrial lineages detected in this study were absent from modern data sets, suggesting a high extinction rate. To investigate this further, we applied a novel principal components multiple logistic regression test to Bayesian serial coalescent simulations. The analysis supported a scenario in which European colonization caused a substantial loss of pre-Columbian lineages.

Molecular clocks highly depend on the quality of calibration points to accurately estimate rates of molecular evolution ( 19 ). In the Americas, the scarce evidence of early human occupation and the absence of sites in eastern Beringia for most of the late Pleistocene hinder reliable calibration. An additional major challenge is the temporal dependence of molecular rate estimates, whereby molecular evolution appears more rapid when measured over short time intervals ( 19 ). In humans, this problem is most apparent when recent time scales (for example, the human settlement of the Americas) are analyzed using deep fossil calibrations such as the human-chimpanzee split ~6 to 7 million years ago ( 20 ). Accurate molecular rate estimates require a distribution of calibration points close to the age of events under study ( 21 , 22 ); in this regard, ancient DNA sequences from dated skeletons provide suitable tip calibrations for studying recent evolutionary events ( 23 ).
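The idea behind tip calibration can be illustrated with a root-to-tip regression, a much simpler technique than the Bayesian coalescent analysis used in this study. The sketch below is illustrative only: the function name, the substitution rate, the root age, and the sample ages are all hypothetical, and the distances are synthetic and noise-free.

```python
# Root-to-tip regression: a simple stand-in for tip calibration.
# All numbers below are synthetic and chosen only for illustration.

def fit_rate(sample_ages_ka, root_to_tip):
    """Least-squares fit of root-to-tip distance against sampling age.

    Older samples had less time to accumulate substitutions, so the
    distance falls as age rises. The rate is the negated slope
    (substitutions/site/ky), and the root age is the age at which the
    fitted distance reaches zero.
    """
    n = len(sample_ages_ka)
    mean_x = sum(sample_ages_ka) / n
    mean_y = sum(root_to_tip) / n
    sxx = sum((x - mean_x) ** 2 for x in sample_ages_ka)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(sample_ages_ka, root_to_tip))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rate = -slope
    root_age_ka = intercept / rate  # from distance = rate * (root_age - age)
    return rate, root_age_ka

TRUE_RATE = 2.5e-5   # hypothetical substitutions/site/ky
TRUE_ROOT_KA = 20.0  # hypothetical root age in ka
ages = [0.0, 0.5, 2.0, 4.5, 8.6]  # hypothetical sample ages in ka
distances = [TRUE_RATE * (TRUE_ROOT_KA - a) for a in ages]

rate, root_age_ka = fit_rate(ages, distances)
print(f"rate = {rate:.2e} subs/site/ky, root age = {root_age_ka:.1f} ka")
```

Because the dated samples lie close in time to the events of interest, the fitted rate applies to that time scale, which is the property that deep fossil calibrations lack.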

( A ) Mean age (symbols) and 95% highest posterior density (HPD) (error bars) for the TMRCA of each of the Native American haplogroups. Shading indicates the period between the oldest lower bound of any 95% HPD and the youngest upper bound of any 95% HPD for each data set. The purple dotted lines show the TMRCA bounds based on tip calibration; the blue dotted lines show the extreme TMRCA bounds from previous publications (26.3 to 9.7 ka) ( 20 , 25 ). ( B ) The isolation of Native American populations estimated to have occurred after the last observable divergence between Siberian and Native American lineages (24.9 ka based on the lowest 95% HPD upper bound) and before the oldest date at which all Native American founder haplogroups formed (18.4 ka based on the lowest 95% HPD upper bound). See section S5 for detailed methods.

Unfortunately, the precision of molecular clock studies in the Americas to date has been limited by the low genetic diversity and lack of appropriate calibration points to accurately estimate rates of molecular evolution. As a result, current mitochondrial molecular clock estimates of the initial entry into the Americas, which assume that the event corresponds to the initial diversification of Native American genetic lineages, range from 26.3 to 9.7 ka ( Fig. 2A ). This broad range spans most of the time frame over which the Bering Land Bridge route was feasible. Given the narrow temporal span of the actual diversification and migration into the Americas, much greater precision is needed to distinguish between different migration routes and hypotheses.

Genetic studies of Native American populations are complicated by the demographic collapse and presumed major loss of genetic diversity following European colonization at the end of the 15th century ( 7 ). However, geographically widespread signals of low diversity and shared ancestry ( 8 – 13 ), particularly striking in maternally inherited mitochondrial and paternally inherited Y-chromosome sequence data, suggest that small founding groups may have initially entered the Americas in a single migration event that gave rise to most of the ancestry of Native Americans today ( 9 , 12 , 14 ). In contrast, the distribution of some of the rare founding mitochondrial haplogroups (D4h3a along the Pacific coast of North and South America, and X2a in northwestern North America) suggests that distinct migrations along the coastal route and the ice-free corridor occurred within less than 2000 years ( 15 ). Recent studies have identified a weak Australasian genomic signature in several Native American groups from the Amazon, compatible with two founding migrations ( 16 ), although the Australasian gene flow may have occurred after the initial peopling ( 17 ). Irrespective of the number of migration waves, the founding population appears to have rapidly grown and expanded southward ( 8 , 14 , 18 ), with low levels of gene flow between areas following initial dispersion ( 12 , 14 ).

( A ) Exposed land when sea levels were lowest (light green), modern-day landmass (dark green), and ice sheets (white). At the height of the LGM, the Laurentide and Cordilleran ice sheets blocked access to the Americas from eastern Beringia (that is, the Bering Land Bridge and Alaska/Yukon) ( 30 ). Populations west of the Bering Land Bridge were able to migrate southward during the LGM, but those on the Bering Land Bridge were unable to retreat farther than the Aleutian ice belt (arrows). The last point of detectable gene flow between Siberian and Native American ancestral populations (24.9 ka) and the geographic isolation marked by the formation of Native American founder lineages (18.4 ka) are shown (see Fig. 2B for details). The Yana Rhinoceros Horn site (32 ka) and the Swan Point site (14 ka) illustrate the temporal and geographic gaps in the Beringian archaeological record. ( B ) The ice sheets that began to retreat ~17 ka, opening a potential Pacific coastal route by ~15 ka (arrow). The rapid population expansion (16.0 ka) likely marks the movement south of the ice (see Fig. 3C for details).

The geographic isolation of the Americas delayed human settlement until the end of the Pleistocene [20 to 10 thousand years ago (ka)]; however, despite this relatively recent date, the specific time, place, and route of entry remain uncertain. It is likely that the first peoples moved from Asia across the Bering Land Bridge ( 1 , 2 ), the landmass between Eurasia and America exposed by lowered sea levels during the Last Glacial Maximum (LGM). However, at this time, much of northern North America was covered by the Cordilleran and Laurentide ice sheets, which blocked access from eastern Beringia (Bering Land Bridge and Alaska/Yukon) southward to the rest of the Americas ( Fig. 1A ). Shortly after the Cordilleran ice sheet began to retreat ~17 ka ( 3 ), a potential Pacific coastal route became available ~15 ka ( Fig. 1B ) ( 3 , 4 ), whereas an alternative route through an inland ice-free corridor along the eastern side of the Rocky Mountains opened around ~11.5 to 11 ka ( 4 – 6 ). The timing and route used in the migration event are important in understanding the size, number, and speed of the first migratory wave(s). Timing and route are also pivotal in resolving contentious issues such as the nature of peoples before Clovis—the first widespread archaeologically recognized culture in North America (13.2 to 12.8 ka) ( 1 ).

RESULTS AND DISCUSSION

Novel genetic diversity in pre-Columbian times

The 92 pre-Columbian mitogenomes were sequenced to an average coverage depth of 112× (5.6× to 854.2×; table S2). Sequences were assigned to 84 distinct haplotypes, which fell within the expected overall mitochondrial diversity of Native South Americans (13), that is, haplogroups A2, B2, C1b, C1c, C1d, and D1 (figs. S2 to S5). The Native South American haplogroup D4h3a was not observed in our ancient data set, although we sampled the South American southern cone (Arroyo Seco 2, Argentina) where this lineage is common today (15). None of the 84 haplotypes identified from ancient samples are represented in the existing genealogy of global human mitochondrial diversity [that is, PhyloTree mt; (24)] (figs. S2 to S5) or in the literature (fig. S6). Although modern Native American genetic diversity is not well characterized, this result clearly illustrates the importance of sampling pre-Columbian specimens to fully measure the past genetic diversity and to reconstruct the process of the peopling of the Americas.

Marked synchronicity of the Native South American haplogroup times to most recent common ancestor

The estimated times to most recent common ancestor (TMRCA) for haplogroups A2, B2, C1, D1, and D4h3a were highly synchronous (Fig. 3 and fig. S8), confirming previous interpretations that all five haplogroups were part of one initial population (25). The TMRCA fell within the range of previous molecular date estimates, although the narrower 95% credible intervals considerably increased the precision (Fig. 2A). Older dates for the initial diversification within each haplogroup have been previously calculated using the human-chimpanzee calibration (25, 26), whereas much younger dates resulted from calibrations using non–Native American mitochondrial lineages associated with biogeographic events (20). This clearly illustrates the impact of time-dependent rate estimates and the critical influence of the calibration framework (27). Our use of a large number of temporally and phylogenetically distributed tip dates provides an accurate calibration of the molecular rates relevant to Native American early history (28), allowing a uniquely precise timeline for the peopling of the Americas.

Fig. 3. Dated Bayesian mitogenomic tree and reconstruction of past effective female population size. The mitogenomic tree and the demographic plot are based on replicate data set 1, which is representative of the three replicate data sets (fig. S7). (A) Complete tree showing the relationships between the main Native American haplogroups A, B, C, and D, as well as their TMRCA (colored circles). Black circles show the divergences between Siberian and Native American lineages. Siberian clades are shown in black and Native American clades are shown in gray. (B) Detailed tree with Siberian clades (black), modern Native Americans (blue), and ancient Native Americans (red). Colored and black circles as in (A). Gray shadings and empty black circles highlight shared ancestry for individuals from the same geographic location or from the same cultural background. The filled black triangle (haplogroup A2) is the most recent common ancestor between an ancient haplotype and a modern haplotype at ~9 ka. (C) Extended Bayesian skyline plot of female effective population size, based on a generation time of 25 years.

Separation from Siberian populations during the LGM

The most recent genetic divergence observed between the ancestors of Siberians and Native Americans (24.9 ka; section S5 and Fig. 2B) is the last point at which we can detect apparent gene flow (that is, a shared lineage) between the Siberian population and the ancestral Native American population. We can assume that the real population divergence occurred after this point. In addition, if we accept that the estimated TMRCA of each of the five Native American haplogroups provides an independent estimate of the timing of the same small population’s isolation, we can use the 95% credible intervals to constrain the lower bound (section S5). The resulting estimate that the two populations became fully isolated between 24.9 and 18.4 ka is in accordance with calculations from modern complete genomes which indicate that Siberians and Native Americans split no later than ~23 ka (17). Gene flow to and from east Siberia certainly appears to have ceased by the height of the LGM (18.4 ka; Fig. 2B).

Eastern Beringia as a sustainable refugium for ancestral Native Americans

Our data cannot determine whether the separation between Siberian and Native American ancestral populations occurred in Siberia or Beringia. However, the start of isolation (24.9 to 18.4 ka) closely coincides with the LGM. We hypothesize that cold arid conditions drove populations on the western (that is, Siberian) margins of the Bering Land Bridge to migrate to southern refugia (Fig. 1A), as suggested by the absence of megafauna kill sites younger than the far north Yana Rhinoceros Horn site 32 ka (1). In contrast, any populations east of the Kamchatka and Chukotka Peninsulas would not have been able to retreat farther south than the Aleutian ice belt and would thus remain isolated in eastern Beringia (Fig. 1A). We cannot accurately estimate the size of this founding population, but the effective female population that subsequently entered the Americas appears to be ~2000, which accords well with previous studies (9, 10, 25). Although this number cannot be directly translated into census population size, it suggests that the human population isolated in eastern Beringia was relatively small, probably not exceeding a few tens of thousands of people (section S6). The presence of large numbers of megafauna in eastern Beringia during the late Pleistocene, including the LGM, indicates an ice-free region dominated by shrub tundra (29), which would have been more than capable of sustaining such a population size (section S6). Thus, our observations are consistent with the idea that the founding Native American population used the exposed Bering Land Bridge and adjacent regions in Alaska/Yukon as a refugium during the height of the LGM, before climatic change and the retreat of the ice sheets allowed access to the remainder of the Americas. Unfortunately, the large temporal and geographic gaps in the archaeological record between the Yana Rhinoceros Horn site (~32 ka, western Beringia) and the Swan Point site (~14 ka, eastern Beringia) provide little additional information about this process (Fig. 1A) (1) or how the ancestral Native Americans were isolated from their Asian counterparts.

The Beringian Standstill (~2.4 to 9 thousand years)

The scenario of an Eastern Beringia refugium is consistent with the Beringian standstill hypothesis, which suggests that the ancestral Native Americans were isolated in the area for up to 15 thousand years (ky) (9, 10, 29). Our large data set of dated mitogenomes provides tight estimates for the duration of the standstill and the subsequent movement out of the area. The mitogenomic tree shows a sudden burst of lineage diversification starting ~16.0 to 13.0 ka (Fig. 3B). This is followed by a steep increase in the mean female effective population size (>10%) between adjacent time bins starting 16.0 ka (Fig. 3C). Overall, the population underwent a 60-fold increase between 16.0 and 13.0 ka, suggesting that 16.0 ka represents the initial entry into the Americas, where population size significantly increased in a more favorable environment. Considering the time between isolation (24.9 to 18.4 ka) and entry (16.0 ka), the improved temporal resolution provided by our data suggests that the Beringian Standstill could have been as short as ~2.4 ky, and no longer than ~9 ky, consistent with recent estimates based on autosomal data from complete modern-day genomes (17).
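The standstill bounds follow from subtracting the entry date from the two ends of the isolation window. As a minimal sketch of that arithmetic, the code below also derives an average per-generation growth rate implied by the 60-fold increase. That second figure rests on an assumption not made in the text, namely constant exponential growth over the whole interval, and the function and constant names are hypothetical.

```python
# Dates (ka) and the fold increase are those quoted in the text.
ISOLATION_EARLIEST_KA = 24.9
ISOLATION_LATEST_KA = 18.4
ENTRY_KA = 16.0
EXPANSION_END_KA = 13.0
FOLD_INCREASE = 60.0
GENERATION_YEARS = 25.0  # generation time used for the skyline plot


def standstill_bounds(earliest_ka, latest_ka, entry_ka):
    """Return (shortest, longest) standstill duration in ky."""
    return latest_ka - entry_ka, earliest_ka - entry_ka


def growth_per_generation(fold, start_ka, end_ka, generation_years):
    """Average growth per generation, ASSUMING constant exponential growth.

    The assumption is for illustration only; the skyline plot itself
    does not impose a constant rate.
    """
    generations = (start_ka - end_ka) * 1000.0 / generation_years
    return fold ** (1.0 / generations) - 1.0


shortest_ky, longest_ky = standstill_bounds(
    ISOLATION_EARLIEST_KA, ISOLATION_LATEST_KA, ENTRY_KA
)
growth = growth_per_generation(
    FOLD_INCREASE, ENTRY_KA, EXPANSION_END_KA, GENERATION_YEARS
)

print(f"standstill: {shortest_ky:.1f} to {longest_ky:.1f} ky")
print(f"implied mean growth per generation: {growth:.1%}")
```

The longest bound evaluates to 8.9 ky, which the text rounds to ~9 ky. Under the constant-growth assumption the 60-fold increase spans 120 generations, which works out to roughly 3 to 4% growth per generation.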

A coastal entry route

The population burst at 16.0 ka is contemporaneous with the rapid retreat of coastal glaciers along the northwest Pacific coast associated with a phase of stepwise ocean warming (2° to 3°C) in the region (3). This date considerably predates the opening of the inland ice-free corridor ~11.5 to 11.0 ka (4–6) and indicates that the initial entry into the Americas took place via a southward expansion along recently emerged northwest Pacific coastal land (Fig. 1B) (3, 17, 28, 30, 31). Given the early archaeological sites in Monte Verde in southern Chile at 14.6 ka (32), the mitogenome data indicate that the transit of the full length of the Americas took around 1.4 ky.
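The two comparisons in this argument are simple differences between dates. The sketch below spells them out using the values quoted in the text; the helper name `elapsed_ky` is hypothetical, and the dates are treated as point estimates without their uncertainty.

```python
# Dates in ka, as quoted in the text (point estimates only).
ENTRY_KA = 16.0              # population burst, taken as entry into the Americas
MONTE_VERDE_KA = 14.6        # early occupation in southern Chile
CORRIDOR_OPENING_KA = 11.5   # earliest quoted opening of the ice-free corridor


def elapsed_ky(older_ka, younger_ka):
    """Time in ky between two dates given in ka (older first)."""
    if older_ka < younger_ka:
        raise ValueError("older_ka must not be younger than younger_ka")
    return older_ka - younger_ka


transit_ky = elapsed_ky(ENTRY_KA, MONTE_VERDE_KA)
lead_over_corridor_ky = elapsed_ky(ENTRY_KA, CORRIDOR_OPENING_KA)

print(f"transit of the Americas: ~{transit_ky:.1f} ky")
print(f"entry predates the corridor by ~{lead_over_corridor_ky:.1f} ky")
```

The gap of several thousand years between entry and the corridor opening is what rules out the inland route for the initial entry.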

Early geographic structure after entry into the Americas

The phylogenetic trees feature multiple long branches stemming from the initial expansion period (Fig. 3B), irrespective of whether the lineages lead to pre-Columbian (ancient) or modern-day individuals. This topology appears to reflect the swift migration and expansion of a population, which contained each major haplogroup, into and throughout the Americas (8, 14, 18). Subsequent lineage diversification within each haplogroup appears to be constrained to within specific geographic regions or shared cultural backgrounds (Fig. 3B, gray shadings), which is consistent with suggestions that geographic structure was rapidly established after colonization and was thereafter followed by limited gene flow between populations from diverse regions (14, 33).