D-statistics and F4 ratio analysis

D-statistics and F4 ratio statistics, tests of frequent use in population genetics, have consistently demonstrated the existence of Neanderthal and Denisovan ancestry in out of African populations16. Using a common variant calling (see Methods and also Supplementary Methods), we found with D-statistics and F4 ratio test that all Asian populations (East Asians, Andamanese and Indian Tribal populations) have higher amounts of Neanderthal ( > 1%) and Denisovan ancestry ( > 1%) compared to Europeans (Table 1). Interestingly, when trying to calculate the excess of Neanderthal ancestry in Asian populations compared to Europeans using the F4 ratio test, different outgroups (ancestral alleles mainly constituted from the chimpanzee reference genomes33 and Denisovan12) gave different results (Fig. 1). This was unexpected, as the F4 ratio test and D-statistics should not be affected by changing the outgroup under the simplistic topology used in these tests. When using the ancestral alleles from the 1000 Genomes as outgroup33, the excess of Neanderthal ancestry in Asian populations is around 0.5% (s.e. 0.15%) and statistically significant (Z score > 3). In contrast, when using Denisovan as outgroup, this estimate diminishes and loses statistical significance (~0% ± 0.15%). We then used simulations considering different demographic models (Supplementary Figure 1), varying the percentage of introgression while keeping constant all the other parameters (Supplementary Table 3), to fit the observed D-statistics and F4 ratio statistics. The simulation results suggest that a simple model for the excess of Neanderthal and Denisova ancestry in Asian populations is not enough for fitting these statistics and other ancestral component is needed (Supplementary Figure 3). Moreover, our analyses suggest that D-statistics and F4 ratio are sensitive to the outgroup (Denisova or chimpanzee) used at least for archaic introgression model. In Oceania, results get more complex to interpret due to the derived allele sharing between Neanderthals and Denisovans. Given the possible biases due to the reference genome (mostly of European origin) used to calculate D-statistics, we re-mapped all the genomes to the chimpanzee reference and recalculated the results (Supplementary Table 2). Although most of the Neanderthal and Denisovan ancestry calculations remained the same, we note a decrease of African ancestry by changing the reference from human to chimpanzee in Indian Tribal and Oceanian populations. Overall, these results show a complex situation that needs a broader approach to distinguish among competing demographic models.

Table 1 D-statistics of Neanderthal and Denisova ancestry and dearth of African Ancestry in non-African populations Full size table

Fig. 1 Increase of Neanderthal ancestry in Asians compared to Europeans. The introgression amount is calculated by the F4 ratio test. F4 (Outgroup, Altai; Europeans, X)/F4(Outgroup, Altai; Africa, Vindjia) where Outgroup can be either Denisova or Ancestral alleles from 1000 Genome and X is either East Asian (ASN), Indian Tribal (IND) or Andamanese (AND). In the y axis to make it short we only mention F4(Outgroup, X) as the other populations remain constant for all the F4 ratio tests. In the x axis we have the amount of F4 ratio and the standard error is denoted by bars (standard error was calculated by Jackknife method using 555 blocks). The blue line signifies 0 value for F4 ratio test Full size image

ABC-DL analysis

In order to quantify the posterior probability of different demographic models given the observed genetic diversity in current populations and archaic individuals, which cannot be achieved by the D-statistics and F4 ratio analyses, we applied an ABC-DL analysis (see Methods and Supplementary Note 2).

We considered several plausible competing demographic models which describe scenarios that have been proposed by different studies to explain the observed genetic diversity of introgression patterns among current old world AMH populations, Neanderthals and Denisovans (see Fig. 2 and Supplementary Table 5). The implemented demographic models are variations from an initial model with many accepted features (Model A): OOA origin of modern humans, with a Eurasian split between Europeans and the group comprising two subgroups, East Asians, Indian and Andamanese on one hand, and Papuans and Australians on the other. Introgression of Neanderthals is found in all OOA populations, and that of Denisovans in the Oceanian populations. Introgression from an extinct group of distant hominins into Denisovans14,16 has been considered in all models, and this introgression does not affect directly our core results. From this basic model, several variations have been proposed to better fit the genetic data; here we consider the following. Model B: decrease of European Neanderthal ancestry due to admixture with a modern human, “basal Eurasian”, a ghost population non-introgressed by Neanderthals (Xn)34. Model C: two introgression events from Neanderthals, the first affecting all OOA populations and the second only affecting Asia35,36. Model D: two introgression events both from Neanderthals and Denisovans, with a Denisovan introgression on Asians before the split of Oceanian populations14,16. The remaining models have a single introgression of Neanderthals and Denisovans (as in Model A) but with an introgression to Asian and Oceanian populations from an extinct (Xe) hominin population that is a sister group of Neanderthals (Model E), or is a sister group of Denisova (Model F;20), or is an outgroup of both Neanderthals and Denisovans (Model G;6). Given the recently discovered Neanderthal–Denisovan hybrid37, we also considered a scenario in which this extinct hominin is an admixed population between Neanderthals and Denisovans (Model H)37, a similar model of a trichotomy as will be seen below. Furthermore, Model E and Model F can be considered particular cases of Model H when the percentage of Denisovan admixture is 0 or 100%, respectively. In all the models, the basic tree structure remained constant although all the demographic parameters—including the amount of introgression- were estimated.

Fig. 2 Demographic models implemented for explaining the genetic variation present in current African (A), European (E), East Asian (Ea), Andamanese (An), Indian (I), Papuan (P) and Australian (Au) populations, in relation with Altai Neanderthal (N) and Denisovan (D) archaic populations. The process of archaic introgression in Asian populations is modelled either by different introgression events from a anatomically modern human ghost population (Xn), b, c, d known archaic populations such as N and D and or e, f, g, h by an unknown archaic extinct (Xe) population. An archaic population (Xd) introgressing to Denisovans is also modelled. Turquoise arrows indicate single pulse events of archaic introgression. All models include recent continuous migrations between African and European, European and East Asian, and Papuan and Australian populations Full size image

We first evaluated the performance of the proposed ABC-DL methodology for distinguishing between the proposed eight models. We used simulated data as pseudo-observed data and run the full ABC-DL pipeline; we applied a ‘hard’ model classification to assign the simulated data to the model with the highest posterior probability (see Supplementary Note 2). The confusion matrix (Supplementary Table 6) summarizing the model classification shows that the proposed ABC approach identifies the model that generated the data (Model sim ) with posterior probability P(Model sim = Model abc | Data) > 50% in all the models. When comparing the eight considered models by ABC-DL with the observed data, the ABC model inference supports model H (P(Model = H|Data) = 0.46), namely the introgression of genes from an archaic Denisovan-Neanderthal admixed extinct population (Xe) into Asian AMH populations, before the split of Oceanian groups, closely followed by Model F (P(Model = F|Data) = 0.38), which models Xe as a sister population of Denisovans. Model H is slightly more likely than Model F (1.2 times more likely), and much more likely than models E (5 times more likely), G (7.6 times) and D (107 times). Models not considering the presence of an extinct archaic ‘ghost’ population introgressing AMH -A, B, C and D- obtained a posterior probability of ~0 (see Supplementary Figure 4). Therefore, only models with admixture of an extinct population in the Neanderthal–Denisovan clade, but with significant differences to each of them, are supported by the ABC-DL framework.

Next, we implemented ABC-DL to estimate the posterior distribution of each parameter for Model H (Fig. 3b and Supplementary Figure 5). Given that Model F is also similarly supported, for completion we also estimated the posterior distribution of the parameters in this model (Fig. 3a). As internal check of the performance of the DL in Model H, we first estimated the Pearson correlation between the parameter values used for the simulation and the predicted parameter value by the DL (Supplementary Table 7). For almost all the parameters from the Model H, we obtained a high correlation between the real parameter and the predicted one by the DL with the exception of the effective population sizes of the unknown ghost populations, which is a very acceptable expectation as we do not have direct information of the genetic variation in these populations. When applied to the observed data (Table 2 and Supplementary Table 8), our ABC-DL approach for estimating parameters of Model H suggests that (i) the introgression of Neanderthals with AMH out of Africa was 1.3% (CI from 0.18% to 2.5%) and occurred 69 Kya (CI from 56 Kya to 88 Kya; assuming a generation time of 29 years38), whereas the split between African and OOA populations took place 121 Kya (CI from 78 Kya to 167 Kya); (ii) the Denisovan introgression to Oceanian populations was 1.6% (CI from 0.4% to 2.5%) and took place 43 Kya (CI from 29 Kya to 50 Kya); finally, (iii) the estimated amount of introgression in the AMH ancestor of current Asian populations by the extinct archaic population Xe was 2.6% (CI from 0.7% to 4.6%) at 51 Kya (CI from 45 Kya to 58 Kya). According to our analyses, Xe appeared 304 Kya (CI from 211 Kya to 375 Kya) from an admixture event of Denisovans and Neanderthals that would have occurred only 14 Kya after the divergence of Denisovans and Neanderthals (314 Kya; CI from 300 Kya to 343 Kya). The estimated divergence between Denisovan and Neanderthals is relatively recent compared to the estimated in16 (381–473 Kya when considering a mutation rate of 0.5e-09 per bp per year or 1.45e-08 per generation assuming 29 years by generation) but higher than in16 (190–236 Kya when considering a mutation rate of 1e-09 per bp per year or 2.9e-08 per generation). Since the number of non-shared variants depends on the time of divergence, the genetic drift at each population and the mutation rate, a possible explanation for the differences in time of divergence could be the mutation rate, which substantially variate among studies. Demographic parameters estimated for Model F are similar in magnitude to the ones estimated in Model H, with credible intervals that overlap the ones obtained for Model H (Supplementary Table 9).

Fig. 3 Graphical representation of the two best supported model (Model F and Model H) with mean posterior distribution of each parameter. a Mean of the posterior distributions of each parameter calculated by means of ABC-DL in model F for Africans (A), Europeans (E), East Asians (Ea), Andamanese (An), Indians (I), Papuans (P), Aboriginal Australians (Au), Altai Neanderthal (N) and Denisovan (D). The model includes an extinct hominin population (Xe) descendent from the Denisovan population. b Mean of the posterior distributions of each parameter calculated by means of ABC-DL in model H. Xe population is an admixture between Neanderthals and Denisovans which interbreeds with Asian populations and a Homo erectus-like population (Xd) that interbreeds with Denisovan. In blue, some of the recent continuous migrations between populations which were taken into consideration in the model. In turquoise, the different introgressions Full size image

Table 2 Mean and 95% credible interval of the posterior distributions of time and introgression parameters for Model F and Model H Full size table

In order to see whether the proposed Model H fit observed D-statistics, we generated simulations using the mean of the estimated posterior probability of each parameter, and computed D-statistics assuming different population tree topologies for stressing the impact of archaic introgression in Asian populations. Simulation results from ABC-DL for Model H replicate the empirical results of the D-statistics (Table 1, last column) by reproducing the excess of Neanderthal and Denisova ancestry in Asian populations compared to Europeans.

Taken together, we have an overwhelming support for the existence of a third extinct branch of the Neanderthal–Denisovan clade; considering the credible interval of the time of split between the archaic populations, a model with a trichotomy is a good consensus for Model D-H as in all the models the Xe separates from Neanderthal or Denisova at a similar time depth of Neanderthal and Denisova split (Supplementary Table 10).