We tested for contamination in mtDNA using schmutzi (parameters: --notusepredC --uselength) (), which iteratively determines the endogenous mitochondrial genome while also estimating human mitochondrial contamination given a database of potential contaminant mitochondrial genomes. For males we estimated contamination on the X chromosome with ANGSD (), which creates an estimate based on the rate of heterozygosity observed on the X chromosome. We used the parameters minimum base quality = 20, minimum mapping quality = 30, bases to clip for damage = 2, and set all other parameters to the default. Finally, we measured autosomal contamination using a recently developed tool based on breakdown of linkage disequilibrium that works for both males and females (N.N., Éadaoin Harney, S.M., N.P., and D.R., unpublished data). We report but do not include in our main analyses samples with evidence of contamination greater than 5% by any of the contamination estimation methods (only sample CP26 was excluded). Due to high contamination levels (the non-damage restricted samples skewed toward West Eurasians on global PCA; not shown), sequences of all Brazil_Jabuticabeira2_2000BP samples were filtered with PMDtools () to retain only fragments with a typical ancient DNA signature and then trimmed 10 bp on either end before analysis. All contamination estimates are reported in Table S3.
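The exclusion rule above (more than 5% contamination by any method) can be illustrated with a minimal sketch. The sample names, the dictionary layout, and the use of point estimates alone are assumptions made for illustration; they are not taken from the study's pipeline.

```python
# Hypothetical per-sample contamination estimates, expressed as fractions.
# None marks a method that does not apply (e.g. the X-chromosome estimate
# for a female individual).
CONTAMINATION_THRESHOLD = 0.05  # 5%, as stated in the text


def exceeds_threshold(estimates, threshold=CONTAMINATION_THRESHOLD):
    """Return True if any available estimate is greater than the threshold."""
    return any(v is not None and v > threshold for v in estimates.values())


samples = {
    "sample_A": {"mtDNA": 0.01, "X_chromosome": 0.02, "autosomal": 0.00},
    "sample_B": {"mtDNA": 0.02, "X_chromosome": None, "autosomal": 0.08},
}

# Samples flagged here would be reported but left out of the main analyses.
flagged = sorted(name for name, est in samples.items() if exceeds_threshold(est))
print(flagged)
```

A fuller treatment would presumably also take the uncertainty of each estimate into account, which this sketch ignores.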

We used present-day human data from the Simons Genome Diversity Project (), which included 26 Native American individuals from 13 groups with high coverage full genome sequencing. We also included data from 48 Native American individuals from 9 different populations genotyped on the Affymetrix Human Origins array () as well as 493 Native American individuals genotyped on Illumina arrays either unmasked or masked to remove segments of possible European and African ancestry ().

The maximum parsimony tree is also striking in showing that the lineage leading to haplogroup D4h3a has a much longer branch than all other Native American-specific mtDNA haplogroups. The diversification of haplogroup D4h3a dates to ∼16000 BP which temporally overlaps with the coalescence time of A2, B2, C1, and D1 haplogroups (). This suggests that a rate acceleration took place on the lineage leading to the radiation of D4h3a, similar to what has been observed among African L2 lineages ().

We used the newly reconstructed mtDNA combined with previously published present-day and ancient sequences (Table S3) to generate a maximum parsimony tree (Figures S7A and S7B). This tree recapitulates the star-like phylogeny of the founding Southern Native American mtDNA haplogroups A2, B2, C1b, C1c, C1d, D1 and D4h3a reported previously (). We report five new Central and South American individuals belonging to the rare haplogroup D4h3a (3 Brazil, 1 Chile, 1 Belize), which among ancient individuals has been identified so far only in two individuals from the North American Northwest Coast () and in the Anzick-1 individual () but not in Southern Ontario, ancient Californians (), or Western South America () where it has the highest frequency today (). Previously this haplogroup was hypothesized to be a possible marker of human dispersal along the Pacific coast, but its presence in early individuals from Belize and Brazil (as well as in the inland Anzick-1 genome from Montana in the U.S.A.) suggests an ancient spread toward the Atlantic coast as well, with its lower frequency there today being due to population replacement or to genetic drift.

For Y chromosome haplogroup calling, we used the original BAM files and performed an independent processing procedure. We filtered reads with mapping quality < 30 and bases with base quality < 30, and for UDG-half treated libraries we trimmed the first and last 2-3 bp of each sequence to remove potential damage-induced mutations. We determined the most derived mutation for each sample using the tree of the International Society of Genetic Genealogy (ISOGG) and confirmed the presence of upstream mutations consistent with the assigned Y chromosome haplogroup using Yfitter (). For mtDNA haplogroup assignment, we used Haplofind () on the consensus sequences reconstructed with schmutzi (parameters: --notusepredC --uselength) () after applying a quality filter of ≥ 10 (or ≥ 11 for LapaDoSanto_Burial28, LapaDoSanto_Burial17 and ArroyoSeco2_AS6) for a total of 48 newly reported sequences, including samples for which no nuclear data were obtained (Table S3). We produced a multiple genome alignment of our newly reconstructed sequences (excluding LagunaChica_SC50_L763 because of low coverage) along with 17 previously published ancient sequences older than ∼4000 BP (Table S3) and 230 present-day sequences () using MUSCLE (parameter: -maxiters 2) (). We thus analyzed a total of 295 mtDNAs and used an African sequence as outgroup. We used the program MEGA6 () to build a Maximum Parsimony tree with 98% partial deletion (16518 positions) and 500 bootstrap iterations, and visualized it in FigTree (http://tree.bio.ed.ac.uk/software/) (Figure S7).
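As a rough illustration of the per-read filtering described for Y chromosome calling, the sketch below trims a fixed number of bases from both ends of a read and masks low-quality bases. The function name, the choice to mask with "N" rather than drop bases, and the default of 2 trimmed bases are assumptions; the study does not state how the filtering was implemented.

```python
def trim_and_mask(seq, quals, trim=2, min_base_quality=30):
    """Drop `trim` bases from each end, then mask low-quality bases as 'N'.

    seq   -- read sequence as a string
    quals -- per-base Phred qualities, one integer per base
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and qualities must have the same length")
    if len(seq) <= 2 * trim:
        return ""  # nothing is left after trimming both ends
    core_seq = seq[trim:len(seq) - trim]
    core_quals = quals[trim:len(quals) - trim]
    return "".join(
        base if q >= min_base_quality else "N"
        for base, q in zip(core_seq, core_quals)
    )


print(trim_and_mask("ACGTACGT", [40, 40, 40, 10, 40, 40, 40, 40]))
```

Mapping-quality filtering acts on whole reads and would be applied before a step like this one.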

We used smartpca from EIGENSOFT and default settings () to compute principal components using present-day populations. We projected ancient individuals with at least ∼10,000 overlapping SNPs using the option lsqproject: YES, on eigenvectors computed using the present-day populations genotyped on the Illumina array (we restricted our analysis to the subset of Native Americans without evidence of post-colonial mixture ()).
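The idea behind least-squares projection of a sample with missing genotypes can be sketched as follows for two principal components. This is not smartpca's code: the function name, the restriction to two components, and the assumption that genotypes are already normalized in the same way as the data used to compute the eigenvectors are all simplifications made for illustration.

```python
def lsq_project(genotypes, loadings):
    """Least-squares coordinates on two eigenvectors, skipping missing SNPs.

    genotypes -- list of normalized genotype values, None where missing
    loadings  -- list of (v1, v2) eigenvector loadings, one pair per SNP
    """
    s11 = s12 = s22 = b1 = b2 = 0.0
    for g, (v1, v2) in zip(genotypes, loadings):
        if g is None:
            continue  # missing SNPs contribute nothing to the fit
        s11 += v1 * v1
        s12 += v1 * v2
        s22 += v2 * v2
        b1 += v1 * g
        b2 += v2 * g
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("observed SNPs do not determine both coordinates")
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (s11 * b2 - s12 * b1) / det
    return a1, a2


# Toy example: the sample lies exactly at coordinates (2, -1); one SNP is missing.
toy_loadings = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]
toy_genotypes = [2.0, -1.0, None, 3.0]
print(lsq_project(toy_genotypes, toy_loadings))
```

Because only the observed SNPs enter the normal equations, a low-coverage sample is not pulled toward the origin in the way a plain orthogonal projection with missing values set to zero would pull it.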

We computed D-statistics, f4-statistics and f3-statistics with ADMIXTOOLS () using the programs qp3Pop and qpDstat with default parameters and “f4mode: YES.” We computed standard errors with a weighted block jackknife over 5-Mb blocks. For f3-statistics we set the “inbreed: YES” parameter to account for the fact that we are representing the ancient samples by a randomly chosen allele at each position rather than using their full diploid genotype, which we do not have enough data to discern. The details of the inbreeding correction, which computes the expected value of statistics taking into account this random sampling, are presented in section 1.1 of the Appendix of (). We computed “outgroup” f3-statistics of the form f3(Mbuti; Pop1, Pop2), which measure the shared genetic drift between population 1 and population 2. Where relevant we plot the statistics on a heatmap using R ( https://github.com/pontussk/point_heatmap/blob/master/heatmap_Pontus_colors.R ). We also created a matrix of the outgroup f3 values between all pairs of populations. We converted these values to proxies for distances by subtracting them from 1 and generated a multidimensional scaling (MDS) plot with a custom-made R script. Separately, we converted the original values to distances by taking their inverse and generated a neighbor-joining tree using the “neighbor” function of PHYLIP version 3.696 (), setting USA_USR1_11400BP as the outgroup (default settings were used for the rest of the analysis). We displayed the tree using iTOL ().
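The two conversions from outgroup f3 values to distance proxies (1 minus the value for MDS, the inverse of the value for the neighbor-joining tree) can be sketched as below. The function name, the mode labels, and setting the diagonal to zero are assumptions for illustration; the toy f3 values are invented.

```python
def f3_to_distances(f3, mode):
    """Convert a symmetric matrix of outgroup f3 values into distance proxies.

    mode -- "one_minus" gives 1 - f3 (the MDS input in the text);
            "inverse" gives 1 / f3 (the neighbor-joining input in the text)
    """
    n = len(f3)
    dist = [[0.0] * n for _ in range(n)]  # diagonal left at zero (assumption)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if mode == "one_minus":
                dist[i][j] = 1.0 - f3[i][j]
            elif mode == "inverse":
                dist[i][j] = 1.0 / f3[i][j]
            else:
                raise ValueError("mode must be 'one_minus' or 'inverse'")
    return dist


# Invented outgroup f3 values for three populations.
toy_f3 = [
    [0.0, 0.25, 0.20],
    [0.25, 0.0, 0.10],
    [0.20, 0.10, 0.0],
]
print(f3_to_distances(toy_f3, "one_minus"))
print(f3_to_distances(toy_f3, "inverse"))
```

In both conversions, pairs sharing more drift (larger f3) end up closer together; the inverse stretches the differences among distantly related pairs more strongly.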

To determine the minimum number of streams of ancestry contributing to Central and South American populations, we used the software qpWave (), which assesses whether the set of f4-statistics of the form f4(A = South American 1, B = South American 2; X = outgroup 1, Y = outgroup 2), each of which is proportional to the product of allele frequency differences summed over all SNPs, (pA − pB)(pX − pY), forms a matrix that is consistent with different ranks (rank 0 would mean consistency with a single stream of ancestry relative to the outgroups; rank 1 would mean 2 streams of ancestry, and so on). The significance of the statistic is assessed using a Hotelling T² test that appropriately corrects for the correlation structure of f4-statistics (and thus multiple hypothesis testing). For most analyses, we used ancient California individuals from () (USA_MainlandChumash_1400BP, USA_SanFranciscoBay_300BP, USA_SanNicolas_4900BP, and USA_SanClemente-SantaCatalina_800BP), Chipewyan, Russia_MA1_24000BP (MA1), Anzick-1, Han, Papuan, Karelia Hunter Gatherer, and modern Mexican groups (Zapotec, Mixtec, Mixe, and Mayan) as outgroups. We also performed the analyses with different outgroups to determine the effect of outgroups on the results (for a detailed list, see Table S5). We used all possible pairs, triplets, and quadruplets of South American groups as test populations. We also tried different combinations of South American groups (up to 15 different groups together) as test populations. For qpWave analyses we used the default settings except that we set allsnps: YES.
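The two ingredients of the qpWave description, the f4-statistic itself and the rank of a matrix of such statistics, can be illustrated with a small sketch. The function names and toy numbers are invented, and the rank here is plain numerical rank on noise-free values; qpWave instead tests rank statistically against sampling noise, which this sketch does not attempt.

```python
def f4(p_a, p_b, p_x, p_y):
    """Mean over SNPs of (pA - pB) * (pX - pY), given allele frequencies."""
    if not p_a or not (len(p_a) == len(p_b) == len(p_x) == len(p_y)):
        raise ValueError("need four non-empty frequency lists of equal length")
    total = sum((a - b) * (x - y) for a, b, x, y in zip(p_a, p_b, p_x, p_y))
    return total / len(p_a)  # the mean is proportional to the sum over SNPs


def matrix_rank(rows, tol=1e-12):
    """Numerical rank by Gaussian elimination with partial pivoting."""
    m = [[float(v) for v in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) <= tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


print(f4([0.1, 0.5], [0.3, 0.4], [0.2, 0.6], [0.6, 0.1]))
# Rows: test-population pairs; columns: outgroup pairs (invented values).
print(matrix_rank([[0.0, 0.0], [0.0, 0.0]]) + 1)  # streams implied by rank 0
print(matrix_rank([[1.0, 2.0], [2.0, 4.0]]) + 1)  # streams implied by rank 1
```

Under the reading given in the text, the minimum number of ancestry streams is the rank plus one, which is what the last two lines print.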

Admixture graph modeling

We used qpGraph (Patterson et al., 2012) to model the relationships between diverse samples. This software assesses the fit of admixture graph models to allele frequency correlation patterns as measured by f2, f3, and f4-statistics. We started with a skeleton phylogenetic tree consisting of Mbuti, Russia_MA1_24000BP (MA1), Onge, and Han from prior publications (Lipson and Reich, 2017; Skoglund et al., 2015). We added the ancient South American populations in different combinations and retained only the graph solutions that provided no individual f4-statistics with |Z| > 3.5 between empirical and predicted statistics (except for the case of adding Surui due to the difficulties of modeling in the Population Y signal). We created the graphs with all overlapping SNPs among the included groups. We used the default settings of qpGraph for all runs except for the options “outpop: NULL” instead of setting an outgroup population and “allsnps: YES” to compute each f-statistic on the common SNPs present in the populations involved in the statistic, rather than the intersection of all SNPs present in the dataset. To reduce the impact of damage-induced substitutions in UDG-minus data of the Anzick-1 individual, we restricted the analysis to a version of this sample where sequences were trimmed by 10 bp on both sides before genotyping. In addition, we performed all analyses with the transitions at CpG sites removed, and we also report the maximum Z-scores of many of the analyses with all transition sites removed. Lastly, for the graphs in Figures S6A-D we computed standard errors for the lengths of different graph edges by performing a block jackknife, dropping each of 100 contiguous blocks (with an equal number of SNPs) in turn (Lazaridis et al., 2016).
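The block jackknife over 100 equal-size contiguous blocks can be sketched as below. As a stand-in for a graph edge length, which in the study would require refitting the graph with each block dropped, the estimator here is simply the mean of per-SNP values; the function name and that substitution are assumptions for illustration.

```python
import math


def block_jackknife(values, n_blocks=100):
    """Delete-one-block jackknife for the mean of per-SNP values.

    Returns (estimate, standard_error). SNPs are assumed to be in genomic
    order so that consecutive values form contiguous blocks.
    """
    if n_blocks < 2:
        raise ValueError("need at least two blocks")
    block_size = len(values) // n_blocks
    if block_size == 0:
        raise ValueError("fewer values than blocks")
    used = values[: block_size * n_blocks]  # drop the remainder: equal blocks
    total = sum(used)
    estimate = total / len(used)
    leave_one_out = []
    for b in range(n_blocks):
        block_sum = sum(used[b * block_size:(b + 1) * block_size])
        leave_one_out.append((total - block_sum) / (len(used) - block_size))
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_loo) ** 2 for x in leave_one_out
    )
    return estimate, math.sqrt(variance)


estimate, se = block_jackknife([1.0, 2.0, 3.0, 4.0], n_blocks=4)
print(estimate, se)
```

Dividing an estimate by its jackknife standard error gives the kind of Z-score used throughout to judge whether a statistic or an edge length differs from zero.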

Scheib et al. analyzed data from diverse Native American populations, ancient and modern, and proposed that in Central and South Americans today there is a history of widespread admixture between the two deepest branches of Native American genetic variation (ANC-A and ANC-B), with a minimum of ∼30% of each branch admixed into all populations (Scheib et al., 2018). They write “The summary of evidence presented here allows us to reject models of a panmictic “first wave” population from which the ASO [the Ancient South Ontario population] diverged after the population of South America or in which solely the ANC-A population contributed to modern southern branch populations.”

The evidence for the claim that Central and South Americans do not have entirely ANC-A ancestry is based on fitting the admixture graph model of Figure 2A in Scheib et al. (2018), which the authors show is a fit to the data jointly for Han, Anzick-1, USA_SanNicolas_4900BP (ESN), USA_SanNicolas_1400BP (LSN), Pima, Surui, and Canada_Lucier_4800BP-500BP (ASO). They then added a diverse set of other Native American populations into the graph as mixtures of the same two lineages, and report the mixture proportions in Table S8 of their study.

We began by replicating the finding of Scheib et al. (2018) that their proposed admixture graph was a fit to the data (maximum mismatch between observed and expected f-statistics of |Z| = 1.1) (Figure S6A). However, when we added to the admixture graph additional non-American populations whose phylogenetic relationship to Native American populations has been well worked out (Russia_MA1_24000BP, Onge, and Mbuti), the model is a poor fit (maximum mismatch of observed and expected f-statistics of |Z| = 4.8) (Figure S6B). This implies that the model of Scheib et al. (2018) does not capture some important features of the history relating these populations, and suggests that we may not be able to rely on the inferred proportions of ancestry.

If Scheib et al. (2018) were correct that there was widespread ANC-B ancestry in Central and South America, then Canada_Lucier_4800BP-500BP would not be an outgroup to Anzick-1 and all Central and South Americans; that is, statistics of the form f4(USR1, Canada_Lucier_4800BP-500BP; Anzick-1, Test Central or South America) would often be positive. In fact, Canada_Lucier_4800BP-500BP is consistent with being an outgroup to all Central and South America in our analysis, as statistics of the form f4(USR1, Canada_Lucier_4800BP-500BP; Anzick-1, Test Central or South America) are all consistent with zero except for the special Late Central Andes individuals (as we describe elsewhere, this signal could be explained either by less than 2% Canada_Lucier_4800BP-500BP admixture into the Late Central Andes groups, or alternatively USA_SanNicolas_1400BP-related admixture into Canada_Lucier_4800BP-500BP); modern South Americans such as Piapoco and Quechua had statistics consistent with zero as well (Table S4). This is in line with Figure S13 of Scheib et al. (2018), where Canada_Lucier_4800BP-500BP is also fit as an outgroup to Central and South Americans; the fit of Figure S13 of their study is reasonable, with the maximum mismatch between observed and expected f-statistics being |Z| = 2.0, which is not surprising after correcting for the number of hypotheses tested.

To obtain some insight into why models such as Figure 2A of Scheib et al. (2018) could fit the data even while statistics like f4(USR1, Canada_Lucier_4800BP-500BP; Anzick-1, Test Central or South America) are for the most part consistent with being zero, we estimated the genetic drift along the edge leading to Canada_Lucier_4800BP-500BP that mixed into South Americans in Figure 2A. We found that it is not significantly different from zero in any of the graphs that we analyzed (Figures S6A–S6D; STAR Methods), meaning the ancestry on the Canada_Lucier_4800BP-500BP branch that mixes into the South American groups does not share a significant amount of genetic drift with Canada_Lucier_4800BP-500BP and there is no need to propose widespread mixing between ANC-A and ANC-B.

A supporting piece of evidence cited by Scheib et al. (2018) in favor of mixture between ANC-A and ANC-B lineages in Central and South Americans is that they identify present-day Pima and Surui haplotypes that match Anzick-1 haplotypes (as a representative of ANC-A) more closely than CK-13 (as a representative of Canada_Lucier_4800BP-500BP), and vice versa. However, Native American populations (like all human populations) have a large proportion of shared ancestral haplotypes, and incomplete lineage sorting means that even if two populations are not most closely related, in some sections of the genome they will be most closely related on a haplotypic basis. Thus, it is not clear to us that this analysis demonstrates that Pima and Surui derive from ANC-A/ANC-B mixtures.

In conclusion, given that Canada_Lucier_4800BP-500BP is consistent with being an outgroup to nearly all Central and South Americans based on f-statistic analysis (with the exception of the special Late Central Andes populations), and that there is no compelling haplotype-based evidence for ANC-A and ANC-B admixture in the history of Central and South Americans, the genetic data are in fact consistent with the scenario in which an ANC-A population was the sole contributor to southern branch (Central and South American) populations. Thus, our results are consistent with the originally suggested null hypothesis of entirely ANC-A ancestry leading to Central and South Americans (Raghavan et al., 2015; Rasmussen et al., 2014; Reich et al., 2012).

To build the admixture graph shown in Figure 3, we used a skeleton graph from previous publications (Lipson and Reich, 2017; Skoglund et al., 2015). We added in groups based on previous findings (e.g., the Ancient Beringian USR1 as an outgroup [Moreno-Mayar et al., 2018a] and the split between ANC-A and ANC-B [Reich et al., 2012; Scheib et al., 2018]). We then added additional groups new to this study using guidance from other results such as the outgroup-f3 matrix-based neighbor-joining tree. We stopped building the admixture graph once we had fit as many representative ancient individuals as possible that could fit without strong evidence of mixture (worst Z-score outlier f4(Han, USA_SanNicolas_4900BP; Argentina_ArroyoSeco2_7700BP, Canada_Lucier_4800BP-500BP), Z = 2.9).

To build the complex admixture graphs shown in Figures 4 and 5, we used two approaches. For Figure 4, we started with the admixture graph of Figure 3, and then grafted onto it admixture events motivated by our qpWave results, namely mixture from an Anzick-1-related lineage into the earliest Chilean individual and some of the Brazil and Argentina groups, and mixture of USA_SanNicolas_4900BP-related ancestry into Late Central Andes groups. We compared models with and without admixture edges and used the model with an extra admixture edge if it decreased the maximum Z-score by more than 0.3.
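The rule for keeping an extra admixture edge can be written out as a one-line comparison. The function name and the use of absolute values are assumptions; the example Z-scores are invented.

```python
def prefer_admixed_model(max_z_without, max_z_with, min_improvement=0.3):
    """Keep the extra admixture edge only if the worst |Z| drops by > 0.3."""
    return (abs(max_z_without) - abs(max_z_with)) > min_improvement


print(prefer_admixed_model(3.4, 3.0))  # improvement of about 0.4
print(prefer_admixed_model(3.2, 3.0))  # improvement of about 0.2
```

A fixed threshold like this trades a slightly worse fit for a simpler graph, since each admixture edge adds free parameters.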

For Figure 5 we carried out a semi-automated search in which we began with a skeleton model including all non-Native Americans and USA_USR1_11400BP, and then iteratively added as many other populations as we could in a greedy approach, first as simple clades in order to minimize graph complexity, and then as 2-way mixtures if the simple clade approach did not fit. Thus, for N populations, we first fit graphs of m populations and then considered all remaining N-m populations as candidates to be grafted in all fitting models with m populations. Each grafted population was placed anywhere on the graph (or, in the case of mixture, its two components were each placed anywhere on the graph). This approach is described in more detail in Lazaridis et al. (2018).
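The greedy grafting procedure can be outlined in code under strong simplifications. Everything here is hypothetical scaffolding: a model is just a tuple of (population, branches) grafts, `branches_of` stands in for enumerating attachment points on a graph, and `fits` stands in for running qpGraph and checking the worst Z-score. None of these names come from the study or from ADMIXTOOLS.

```python
from itertools import combinations


def extend_models(models, populations, branches_of, fits):
    """Graft one more population onto every fitting model.

    A clade is a graft onto one branch; a 2-way mixture is a graft onto a
    pair of branches, tried only when no clade placement fits.
    """
    seen = set()
    extended = []
    for model in models:
        placed = {pop for pop, _ in model}
        for pop in populations:
            if pop in placed:
                continue
            branches = branches_of(model)
            candidates = [model + ((pop, (b,)),) for b in branches]
            fitting = [m for m in candidates if fits(m)]
            if not fitting:
                candidates = [
                    model + ((pop, pair),) for pair in combinations(branches, 2)
                ]
                fitting = [m for m in candidates if fits(m)]
            for m in fitting:
                key = tuple(sorted(m))  # same grafts in any order = same model
                if key not in seen:
                    seen.add(key)
                    extended.append(key)
    return extended


def greedy_search(populations, branches_of, fits):
    """Repeat grafting rounds until no model can be extended any further."""
    models = [()]  # the empty tuple stands for the bare skeleton graph
    while True:
        extended = extend_models(models, populations, branches_of, fits)
        if not extended:
            return models
        models = extended


# Toy stand-ins: two fixed branches; P1 fits only as a clade on b1,
# P2 fits only as a mixture of b1 and b2.
allowed = {("P1", ("b1",)), ("P2", ("b1", "b2"))}
result = greedy_search(
    ["P1", "P2"],
    branches_of=lambda model: ["b1", "b2"],
    fits=lambda model: all(graft in allowed for graft in model),
)
print(result)
```

The number of candidate models can grow quickly with each round, which is presumably one reason the search described in the text was semi-automated rather than exhaustive.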

The two admixture graphs shown in Figures 4 and 5 have many qualitative points of agreement, including: i) USA_USR1_11400BP as an outgroup to all other Native Americans, ii) a split of ANC-A and ANC-B such that ANC-B had minimal genetic influence on all South Americans, iii) a rapid radiation of the earliest South Americans, with very little drift on the lineages separating them, iv) distinctive shared ancestry between Brazil_LapaDoSanto_9600BP and Chile_LosRieles_10900BP on the one hand and USA_Anzick-1_12800BP on the other, v) distinctive shared ancestry between USA_Anzick-1_12800BP and USA_SanNicolas_4900BP, and vi) mixture of a source of ancestry with distinct relatedness to North Americans into Late Central Andes groups.

The primary disagreement between the admixture graphs concerns the question of whether or not USA_Anzick_12800BP is admixed.

In Figure 4 USA_Anzick_12800BP is modeled as unadmixed, and ancestry related to this group mixes into some of the Brazil, Chile, and Argentina groups as well as into USA_SanNicolas_4900BP. The ancestry sources can be interpreted as resulting from North to South America spreads in successive streams. There are two initial streams: one from an Anzick-1-related group, retained in Chile_LosRieles_10900BP, Brazil_LapaDoSanto_9600BP, and Argentina_ArroyoSeco2_7700BP, and another that is pervasive throughout ancient South America (we cannot resolve the order of these two streams). There is a third ancestry source contributing to Late Central Andes groups, and a fourth ancestry source that corresponds to the Population Y signal in Karitiana and Surui but that we do not specifically model in the graph.

In Figure 5, most South Americans can be modeled as a mixture of a lineage that split into regional branches in Peru (Lauricocha_8600BP and Cuncaicha_9000BP), the Southern Cone (Argentina_ArroyoSeco2_7700BP and Chile_LosRieles_5100BP), and Brazil_LapaDoSanto_9600BP, with the lineage more closely related to Brazil_LapaDoSanto_9600BP then mixing into the shared ancestors of USA_Anzick_12800BP and USA_SanNicolas_4900BP (possibly reflecting a back-flow from South to North America, although, alternatively, all the splits could have occurred in North America). The model also specifies more recent admixture into Late Central Andes populations of a lineage with a distinctive relatedness to North Americans (this model also included West Eurasian-related admixture in Canada_Lucier_4800BP-500BP that likely reflects a low level of contamination in these samples).

Both models shown in Figures 4 and 5 are reasonable statistical fits (maximum Z-scores of 3.4 and 2.9 with only transitions in CpG sites removed, and 3.0 and 2.9 when all transitions are removed), and we were unable to resolve which was better. Additional sampling of early North and South Americans could help to resolve the true model.

In Figure S5, we present various modifications of these models, including some that add Surui which has evidence of a fourth source of “Population Y” ancestry that bears a different relationship to Asians.