Abstract Previous research has shown that modern Eurasians interbred with their Neanderthal and Denisovan predecessors. We show here that hundreds of thousands of years earlier, the ancestors of Neanderthals and Denisovans interbred with their own Eurasian predecessors—members of a “superarchaic” population that separated from other humans about 2 million years ago. The superarchaic population was large, with an effective size between 20 and 50 thousand individuals. We confirm previous findings that (i) Denisovans also interbred with superarchaics, (ii) Neanderthals and Denisovans separated early in the middle Pleistocene, (iii) their ancestors endured a bottleneck of population size, and (iv) the Neanderthal population was large at first but then declined in size. We provide qualified support for the view that (v) Neanderthals interbred with the ancestors of modern humans.

INTRODUCTION During the past decade, we have learned about interbreeding among hominin populations after 50 thousand years (ka) ago, when modern humans expanded into Eurasia (1–3). Here, we focus farther back in time, on events that occurred more than a half million years ago. In this earlier time period, the ancestors of modern humans separated from those of Neanderthals and Denisovans. Somewhat later, Neanderthals and Denisovans separated from each other. The paleontology and archaeology of this period record important changes, as large-brained hominins appear in Europe and Asia and Acheulean tools appear in Europe (4, 5). It is not clear, however, how these large-brained hominins relate to other populations of archaic or modern humans (6–9). We studied this period using genetic data from modern Africans and Europeans and from two archaic populations, Neanderthals and Denisovans. Figure 1 illustrates our notation. Uppercase letters refer to populations, and combinations such as XY refer to the population ancestral to X and Y. X represents an African population (the Yorubans), Y is a European population, N is Neanderthals, and D is Denisovans. S is an unsampled “superarchaic” population that is distantly related to other humans. Lowercase letters at the bottom of Fig. 1 label “nucleotide site patterns.” A nucleotide site exhibits site pattern xyn if random nucleotides from populations X, Y, and N carry the derived allele, but those sampled from other populations are ancestral. Site pattern probabilities can be calculated from models of population history, and their frequencies can be estimated from data. Our Legofit (10) software estimates parameters by fitting models to these relative frequencies. Fig. 1 A population network including four episodes of gene flow, with an embedded gene genealogy. Upper case letters (X, Y, N, D, and S) represent populations (Africa, Europe, Neanderthal, Denisovan, and superarchaic). Greek letters label episodes of admixture. d and xyn illustrate two nucleotide site patterns, in which 0 and 1 represent the ancestral and derived alleles. A mutation on the red branch would generate site pattern d. One on the blue branch would generate xyn. For simplicity, this figure refers to Neanderthals with a single letter. Elsewhere, we use two letters to distinguish between the Altai and Vindija Neanderthals. Nucleotide site patterns contain only a portion of the information available in genome sequence data. This portion, however, is of particular relevance to the study of deep population history. Site pattern frequencies are unaffected by recent population history because they ignore the within-population component of variation (10). This reduces the number of parameters we must estimate and allows us to focus on the distant past. The current data include two high-coverage Neanderthal genomes: one from the Altai Mountains of Siberia and the other from Vindija Cave in Croatia (11). Rather than assigning the two Neanderthal fossils to separate populations, our model assumes that they inhabited the same population at different times. This implies that our estimates of Neanderthal population size will refer to the Neanderthal metapopulation rather than to any individual subpopulation. The Altai and Vindija Neanderthals appear in site pattern labels as “a” and “v”. Thus, av is the site pattern in which the derived allele appears only in nucleotides sampled from the two Neanderthal genomes. Figure 2 shows the site pattern frequencies studied here. In contrast to our previous analysis (12), the current analysis includes singleton site patterns, x, y, v, a, and d, as advocated by Mafessoni and Prüfer (13). A simpler tabulation, which excludes the Vindija genome, is included as fig. S2. Fig. 2 Observed site pattern frequencies. Horizontal axis shows the relative frequency of each site pattern in random samples consisting of a single haploid genome from each of X, Y, V, A, and D, representing Africa, Europe, Vindija Neanderthal, Altai Neanderthal, Denisovan, and superarchaic. Horizontal lines (which look like dots) are 95% confidence intervals estimated by a moving blocks bootstrap (35). Data: Simons Genome Diversity Project (SGDP) (14) and Max Planck Institute for Evolutionary Anthropology (11). Greek letters in Fig. 1 label episodes of admixture. We label models by concatenating Greek letters to indicate the episodes of admixture they include. For example, model “αβ” includes only episodes α and β. Our model does not include gene flow from Denisovans into moderns because there is little evidence of such gene flow into Europeans (14, 15). Two years ago, we studied a model that included only one episode of admixture: α, which refers to gene flow from Neanderthals into Europeans (12). The left panel of Fig. 3 shows the residuals from this model, using the new data. Several are far from zero, suggesting that something is missing from the model (16). Fig. 3 Residuals from models α and αβγδ. Key: red asterisks, real data; blue circles, 50 bootstrap replicates. Recent literature suggests some of what might be missing. There is evidence for admixture into Denisovans from a superarchaic population, which was distantly related to other humans (2, 11, 17–19), and also for admixture from early moderns into Neanderthals (19). These episodes of admixture appear as β and γ in Fig. 1. Adding β and/or γ to the model improved the fit, yet none of the resulting models were satisfactory. For example, model αβγ implied (implausibly) that superarchaics separated from other hominins 7 million years (Ma) ago. To understand what might still be missing, consider what we know about the early middle Pleistocene, around 600 ka ago. At this time, large-brained hominins appear in Europe, along with Acheulean stone tools (4, 5). They were probably African immigrants, because similar fossils and tools occur earlier in Africa. According to one hypothesis, these early Europeans were Neanderthal ancestors (6, 7). Somewhat earlier—perhaps 750 ka ago [(8), table S12.2]—the “neandersovan” ancestors of Neanderthals and Denisovans separated from the lineage leading to modern humans. Neandersovans may have separated from an African population and then expanded into Eurasia. If so, then they would not have been expanding into an empty continent, for Eurasia had been inhabited since 1.85 Ma ago (20). Neandersovan immigrants may have met the indigenous superarchaic population of Eurasia. This suggests a fourth episode of admixture, from superarchaics into neandersovans, which appears as δ in Fig. 1.

RESULTS We considered eight models, all of which include α, and including all combinations of β, γ, and/or δ. In choosing among complex models, it is important to avoid overfitting. Conventional methods such as Akaike’s information criterion (21) are not available because we do not have access to the full likelihood function. Instead, we use the bootstrap estimate of predictive error (bepe) (10, 22, 23). The best model is the one with the lowest value of bepe. When no model is clearly superior, it is better to average across several than to choose just one (24). For this purpose, we used bootstrap model averaging (booma) (10, 24). The booma weight of the ith model is the fraction of datasets (including the real data and 50 bootstrap replicates) in which that model “wins,” i.e., has the lowest value of bepe. The bepe values and booma weights of all models are in Table 1. Table 1 bepe values and booma weights. View this table: The best model is αβγδ, which includes all four episodes of admixture. It has smaller residuals (Fig. 3, right), the lowest bepe value, and the largest booma weight. One other model, αβδ, has a positive booma weight, but all others have zero weight. To understand what this means, recall that bootstrap replicates approximate repeated sampling from the process that generated the data. The models with zero weight lose in all replicates, implying that their disadvantage is large compared with variation in repeated sampling. On this basis, we can reject these models. Neither of the two remaining models can be rejected. These results provide strong support for two episodes of admixture (β and δ) and qualified support for a third (γ). Not only does this support previously reported episodes of gene flow but it also reveals a much older episode, in which neandersovans interbred with superarchaics. Model-averaged parameter estimates, which use the weights in Table 1, are graphed in Fig. 4 and listed in table S1. Fig. 4 Model-averaged parameter estimates with 95% confidence intervals estimated by moving blocks bootstrap ( 35 ). Key: m α , fraction of Y introgressed from N; m β , fraction of D introgressed from S; m γ , fraction of N introgressed from XY; m δ , fraction of ND introgressed from S; T XYNDS , superarchaic separation time; T XY , separation time of X and Y; T ND , separation time of N and D; T N 0 , end of early epoch of Neanderthal history; T A , age of Altai Neanderthal fossil; T V , age of Vindija Neanderthal fossil; T D , age of Denisovan fossil; N S , size of superarchaic population; N XYND , size of populations XYND and XYNDS; N XY , size of population XY; N ND , size of population ND; N N 0 , size of early Neanderthal population; N N 1 , size of late Neanderthal population. Parameters that exist in only one model are not averaged. Episode δ, which proposes gene flow from superarchaics into neandersovans, is a novel hypothesis. Before accepting it, we should ask whether the evidence in its favor could be artifactual, reflecting a bias in site pattern frequencies caused by sequencing error or somatic mutations. Sequencing error adds a positive bias to the frequency of each singleton site pattern proportional to the per-nucleotide error rate in the corresponding population (see the Supplementary Materials). Somatic mutations have a similar effect. These biases might explain evidence for episode δ, if it were true that larger values of m δ (the fraction of superarchaic admixture in neandersovans) imply larger frequencies of singleton site patterns. However, Table 2 shows that this is not the case. There is no consistent tendency for singleton frequencies to increase with m δ . Indeed, three of them decrease. Consequently, evidence that m δ > 0 cannot be the result of a positive bias in the frequencies of singleton site patterns. The evidence for δ admixture cannot be an artifact of sequencing error or somatic mutations. Table 2 Effect on singleton site pattern frequencies of gene flow (m δ ) from superarchaics into neandersovans. Column 2 shows expected frequencies of singleton site patterns in a model in which m δ = 0, and all other parameters are as fitted under model αβγδ. In column 3, all parameters including m δ are as fitted under this model. Column 4 is obtained by subtracting column 2 from column 3. Expected site pattern frequencies were estimated using legosim with 107 iterations. View this table: The superarchaic separation time, T XYNDS , has a point estimate of 2.3 Ma ago. This estimate may be biased upward because our molecular clock assumes a fairly low mutation rate of 0.38 × 10−9 per nucleotide site per year. Other authors prefer slightly higher rates (25). Although this rate is apparently insensitive to generation time among the great apes, it is sensitive to the age of male puberty. If the average age of puberty during the past 2 Ma were halfway between those of modern humans and chimpanzees, the yearly mutation rate would be close to 0.45 × 10−9 [(26), Fig. 2B], and our estimate of T XYNDS would drop to 1.9 Ma, just at the origin of the genus Homo. Under this clock, the 95% confidence interval is 1.8 to 2.2 Ma. If superarchaics separated from an African population, then this separation must have preceded the arrival of superarchaics in Eurasia. Nonetheless, our 1.8 to 2.2 Ma interval includes the 1.85 Ma date of the earliest Eurasian archaeological remains at Dmanisi (20). Thus, superarchaics may descend from the earliest human dispersal into Eurasia, as represented by the Dmanisi fossils. On the other hand, some authors prefer a higher mutation rate of 0.5 × 10−9 per year (2). Under this clock, the lower end of our confidence interval would be 1.6 Ma ago. Thus, our results are also consistent with the view that superarchaics entered Eurasia after the earliest remains at Dmanisi. Parameter N S is the effective size of the superarchaic population. This parameter can be estimated because there are two sources of superarchaic DNA in our sample (β and δ), and this implies that coalescence time within the superarchaic population affects site pattern frequencies. Although this parameter has a broad confidence interval, even the low end implies a fairly large population of about 20,000. This does not require large numbers of superarchaic humans, because effective size can be inflated by geographic population structure (27). Our large estimate may mean that neandersovans and Denisovans received gene flow from two different superarchaic populations. Parameter T ND is the separation time of Neanderthals and Denisovans. Our point estimate, 737 ka ago, is remarkably old. Furthermore, the neandersovan population that preceded this split was remarkably small: N ND ≈ 500. This supports our previous results, which indicated an early separation of Neanderthals and Denisovans and a bottleneck among their ancestors (12). Because our analysis includes two Neanderthal genomes, we can estimate the effective size of the Neanderthal population in two separate epochs. The early epoch extends from T N 0 = 455 ka to T ND = 737 ka, and within this epoch, the effective size was large: N N 0 ≈ 16,000. It was smaller during the later epoch: N N 1 ≈ 3400. These results support previous findings that the Neanderthal population was large at first but then declined in size (2, 11).

DISCUSSION This project began with a puzzle. We had argued in 2017 that Neanderthals and Denisovans separated early, that their neandersovan ancestors endured a bottleneck of population size, and that the postseparation Neanderthal population was large (12). That analysis omitted singleton site patterns. Mafessoni and Prüfer (13) pointed out that introducing singletons led to different results. In response, Rogers et al. (16) agreed, but also observed that the with-singleton analysis implied that the Denisovan fossil was only 4000 years old—a result that is plainly wrong. Furthermore, a residual analysis showed that neither of the models under discussion in 2017 fit the data very well (16). Something was apparently missing from both models—but what? The present paper provides an answer to that question. Our results shed light on the early portion of the middle Pleistocene, about 600 ka ago, when large-brained hominins appear in the fossil record of Europe along with Acheulean stone tools. There is disagreement about how these early Europeans should be interpreted. Some see them as the common ancestors of modern humans and Neanderthals (28), others as an evolutionary dead end, later replaced by immigrants from Africa (29, 30), and others as early representatives of the Neanderthal lineage (6, 7). Our estimates are most consistent with the last of these views. They imply that by 600 ka ago, Neanderthals were already a distinct lineage, separate not only from the modern lineage but also from Denisovans. These results resolve a discrepancy involving human fossils from Sima de los Huesos (SH). Those fossils had been dated to at least 350 ka ago and perhaps 400 to 500 ka ago (31). Genetic evidence showed that they were from a population ancestral to Neanderthals and therefore more recent than the separation of Neanderthals and Denisovans (9). However, genetic evidence also indicated that this split occurred about 381 ka ago [(2), table S12.2]. This was hard to reconcile with the estimated age of the SH fossils. To make matters worse, improved dating methods later showed that the SH fossils are even older, about 600 ka, and much older than the molecular date of the Neanderthal-Denisovan split (32). Our estimates resolve this conflict because they push the date of the split back well beyond the age of the SH fossils. Our estimate of the Neanderthal-Denisovan separation time conflicts with 381 ka ago estimate discussed above (2, 13). This discrepancy results, in part, from differing calibrations of the molecular clock. Under our clock, the 381-ka date becomes 502 ka (12), but this is still far from our own 737-ka estimate. The remaining discrepancy may reflect differences in our models of history. Misspecified models often generate biased parameter estimates. Our new results on Neanderthal population size differ from those we published in 2017 (12). At that time, we argued that the Neanderthal population was substantially larger than others had estimated. Our new estimates are more in line with those published by others (2, 11). The difference does not result from our new and more elaborate model because we get similar results from model α, which (as in our 2017 model) allows only one episode of gene flow (table S2). Instead, it was including the Vindija Neanderthal genome that made the difference. Without this genome, we still get a large estimate (N N1 ≈ 11,000), even using model αβγδ (table S3). This implies that the Neanderthals who contributed DNA to modern Europeans were more similar to the Vindija Neanderthal than to the Altai Neanderthal, as others have also shown (11). Our results revise the date at which superarchaics separated from other humans. One previous estimate put this date between 0.9 and 1.4 Ma [(2), p. 47], which implied that superarchaics arrived well after the initial human dispersal into Eurasia around 1.9 Ma. This required a complex series of population movements between Africa and Eurasia [(33), pp. 66 to 71]. Our new estimates do not refute this reconstruction, but they do allow a simpler one, which involves only three expansions of humans from Africa into Eurasia: an expansion of early Homo at about 1.9 Ma ago, an expansion of neandersovans at about 700 ka ago, and an expansion of modern humans at about 50 ka ago. Our results indicate that neandersovans interbred with superarchaics early in the middle Pleistocene, shortly after expanding into Eurasia. This is the earliest known admixture between hominin populations. Furthermore, the two populations involved were more distantly related than any pair of human populations previously known to interbreed. According to our estimates, neandersovans and superarchaics had been separate for about 1.2 Ma. Later, when superarchaics exchanged genes with Denisovans, the two populations had been separate even longer. By comparison, the Neanderthals and Denisovans who interbred with modern humans had been separate less than 0.7 Ma. It seems likely that superarchaics descend from the initial human settlement of Eurasia. As discussed above, the large effective size of the superarchaic population hints that it comprised at least two deeply divided subpopulations, of which one mixed with neandersovans and another with Denisovans. We suggest that around 700 ka ago, neandersovans expanded from Africa into Eurasia, endured a bottleneck of population size, interbred with indigenous Eurasians, largely replaced them, and separated into eastern and western subpopulations—Denisovans and Neanderthals. These same events unfolded once again around 50 ka ago as modern humans expanded out of Africa and into Eurasia, largely replacing the Neanderthals and Denisovans.

MATERIALS AND METHODS Study design Our sample of modern genomes includes Europeans but not other Eurasians. This allowed us to avoid modeling gene flow from Denisovans because there is no evidence of such gene flow into Europeans. The precision of our estimates depends largely on the number of nucleotides studied. For this reason, we used entire high-coverage genomes. The number of genomes sampled per population has little effect on our analyses, because of our focus on the between-population component of genetic variation, i.e., on site pattern frequencies. Nonetheless, our sample of modern genomes for the Yoruban, French, and English includes all those available from the Simons Genome Diversity Project (SGDP) (14), as detailed in the Supplementary Materials. We also included all available high-coverage archaic genomes (11). These data provide extremely accurate estimates of site pattern frequencies, as indicated by the tiny confidence intervals in Fig. 2. The large confidence intervals for some parameters in Fig. 4 reflect identifiability problems (discussed below) and would not be alleviated by an increase in sample size. Quality control Our quality control (QC) pipeline for the SGDP genomes excludes genotypes at which an FL value equals 0 or N. We also excluded sex chromosomes, normalized all variants at a given nucleotide site using the human reference genome, excluded sites within seven bases of the nearest insertion-deletion, and included sites only if they were monomorphic or were biallelic single-nucleotide polymorphisms. Further details are provided in the Supplementary Materials. All ancient genomes were also filtered against .bed files, which identify bases that pass the Max Planck QC filters. These .bed files are available at http://ftp.eva.mpg.de/neandertal/Vindija/FilterBed. Molecular clock calibration We assumed a mutation rate of 1.1 × 10−8 per site per generation (34) and a generation time of 29 years—a yearly rate of 0.38 × 10−9. To calibrate the molecular clock, we assumed that the modern and neandersovan lineages separated T XYND = 25,920 generations before the present (12). This is based on an average of several estimates published by Prüfer et al. [(2), table S12.2]. The average of their estimates is 570.25 ka, assuming a mutation rate 0.5 × 10−9/base pair/year. Under our clock, their separation time becomes 751.69 ka or 25,920 generations. Statistical analysis Because of our focus on deep history, we based statistical analyses on site pattern frequencies, using the Legofit statistical package (10). This method ignores the within-population component of genetic variation and is therefore unaffected by recent changes in population size. For example, the sizes of populations X, Y, and D (Fig. 1) have no effect, so we need not complicate our model with parameters describing the size histories of these populations. This allows us to focus on the distant past. Nonetheless, our models are quite complex. For example, model αβγδ has 17 free parameters. To choose among models of this complexity, we need methods of residual analysis, model selection, and model averaging. Legofit provides these methods, but alternative methods generally do not. These methods are described in detail elsewhere (10), so we summarize them only briefly here. We chose among models by minimizing the bepe (22, 23). This approach was needed because we could not use methods, such as Akaike’s information criterion (21), that depend on likelihood. Bepe is analogous to cross validation but uses bootstrap replicates instead of partitions of the data. The model is fit to each bootstrap replicate and then tested against the real data, after applying a correction for bootstrap bias. Bepe estimates the mean squared difference between observed and predicted site pattern frequencies, when the model is fit to one dataset and tested against another. We also used booma (24), which assigns weights to individual models, based on their bepe values. Parameters are estimated as the weighted average of estimates from individual models. The booma weight of the ith model is the fraction of replicates (including the real data and 50 bootstrap replicates) in which that model wins, i.e., has the lowest value of bepe. Because bootstrap replicates approximate repeated sampling from the process that generated the data, a model will receive zero weight if its disadvantage (as measured by bepe) is large compared with variation in repeated sampling. Figure S3 illustrates a problem of statistical identifiability. Several parameters are tightly correlated with others, indicating that our problem has fewer dimensions than parameters. This does not lead to incorrect estimates, but it broadens the confidence intervals of the parameters involved. Legofit addresses this problem using principal components analysis to remove dimensions that account for less than a fraction 0.001 of the total variance. This narrows confidence intervals and increases the accuracy of parameter estimates. Uncertainties are estimated by moving blocks bootstrap (35), using a block size of 500 single-nucleotide polymorphisms. Our statistical pipeline is detailed in the Supplementary Materials.

SUPPLEMENTARY MATERIALS Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/6/8/eaay5483/DC1 Supplementary Materials and Methods Fig. S1. Heterozygosity as a function of FL value for genome SS6004468 of the SGDP (14). Fig. S2. Observed site pattern frequencies excluding the Vindija genome. Fig. S3. Associations between estimates of several pairs of parameters after second stage in analysis of model αβγδ. Table S1. Model-averaged parameter estimates. Table S2. Estimates under model α. Table S3. Estimates under model αβγδ with a data set that excludes the Vindija Neanderthal genome. References (36–39)

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial license, which permits use, distribution, and reproduction in any medium, so long as the resultant use is not for commercial advantage and provided the original work is properly cited.