Abstract Hybridization between humans and Neanderthals has resulted in a low level of Neanderthal ancestry scattered across the genomes of many modern-day humans. After hybridization, on average, selection appears to have removed Neanderthal alleles from the human population. Quantifying the strength and causes of this selection against Neanderthal ancestry is key to understanding our relationship to Neanderthals and, more broadly, how populations remain distinct after secondary contact. Here, we develop a novel method for estimating the genome-wide average strength of selection and the density of selected sites using estimates of Neanderthal allele frequency along the genomes of modern-day humans. We confirm that East Asians had somewhat higher initial levels of Neanderthal ancestry than Europeans even after accounting for selection. We find that the bulk of purifying selection against Neanderthal ancestry is best understood as acting on many weakly deleterious alleles. We propose that the majority of these alleles were effectively neutral—and segregating at high frequency—in Neanderthals, but became selected against after entering human populations of much larger effective size. While individually of small effect, these alleles potentially imposed a heavy genetic load on the early-generation human–Neanderthal hybrids. This work suggests that differences in effective population size may play a far more important role in shaping levels of introgression than previously thought.

Author Summary A small percentage of Neanderthal DNA is present in the genomes of many contemporary human populations due to hybridization tens of thousands of years ago. Much of this Neanderthal DNA appears to be deleterious in humans, and natural selection is acting to remove it. One hypothesis is that the underlying alleles were not deleterious in Neanderthals, but rather represent genetic incompatibilities that became deleterious only once they were introduced to the human population. If so, reproductive barriers must have evolved rapidly between Neanderthals and humans after their split. Here, we show that observed patterns of Neanderthal ancestry in modern humans can be explained simply as a consequence of the difference in effective population size between Neanderthals and humans. Specifically, we find that on average, selection against individual Neanderthal alleles is very weak. This is consistent with the idea that Neanderthals over time accumulated many weakly deleterious alleles that in their small population were effectively neutral. However, after introgressing into larger human populations, those alleles became exposed to purifying selection. Thus, rather than being the result of hybrid incompatibilities, differences between human and Neanderthal effective population sizes appear to have played a key role in shaping our present-day shared ancestry.

Citation: Juric I, Aeschbacher S, Coop G (2016) The Strength of Selection against Neanderthal Introgression. PLoS Genet 12(11): e1006340. https://doi.org/10.1371/journal.pgen.1006340 Editor: David Reich, Broad Institute of MIT and Harvard, UNITED STATES Received: December 22, 2015; Accepted: September 6, 2016; Published: November 8, 2016 Copyright: © 2016 Juric et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Data Availability: All relevant data are within the paper and its Supporting Information files. Funding: This work was supported by an Advanced Postdoc.Mobility fellowship from the Swiss National Science Foundation P300P3_154613 to SA, and by grants from the National Science Foundation under Grant No. 1353380 to John Willis and GC, and the National Institute of General Medical Sciences of the National Institutes of Health under award numbers NIH R01 GM108779 to GC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing interests: The authors have declared that no competing interests exist.

Introduction The recent sequencing of ancient genomic DNA has greatly expanded our knowledge of our relationship to our closest evolutionary cousins, the Neanderthals [1–5]. Neanderthals, along with Denisovans, were a sister group to modern humans, having likely split from modern humans around 550,000–765,000 years ago [5]. Genome-wide evidence suggests that modern humans interbred with Neanderthals after humans spread out of Africa, such that nowadays 1.5–2.1% of the autosomal genome of non-African modern human populations derives from Neanderthals [2]. This admixture is estimated to date to 47,000–65,000 years ago [6, 7], with potentially a second pulse into the ancestors of populations now present in East Asia [2, 8–11]. While some introgressed archaic alleles appear to have been adaptive in anatomically modern human (AMH) populations [12–14], selection has been suggested to act, on average, against Neanderthal DNA in modern humans. This can be seen from the non-uniform distribution of Neanderthal alleles along the human genome [9, 13]. In particular, regions of high gene density or low recombination rate have low Neanderthal ancestry, which is consistent with selection removing Neanderthal ancestry more efficiently from these regions [13]. In addition, the X chromosome has lower levels of Neanderthal ancestry, and Neanderthal ancestry is absent from the Y chromosome and mitochondria [2, 4, 5, 9, 13, 15, 16]. The genome-wide fraction of Neanderthal introgression in Europeans has recently been shown to have decreased over the past forty thousand years, and, consistent with the action of selection, this decrease is stronger near genes [17]. Finally, a pattern of lower levels of Denisovan ancestry near genes and on the X chromosome in modern humans has also recently been reported [18, 19]. It is less clear why the bulk of Neanderthal alleles would be selected against.
Were early-generation hybrids between humans and Neanderthals selected against due to intrinsic genetic incompatibilities? Or was this selection mostly ecological or cultural in nature? If reproductive barriers had already begun to evolve between Neanderthals and AMH, then these two hominids may have been on their way to becoming separate species before they met again [13, 20, 21]. Or, as we propose here, did differences in effective population size and resulting genetic load between humans and Neanderthals shape levels of Neanderthal admixture along the genome? We set out to estimate the average strength of selection against Neanderthal alleles in AMH. Due to the relatively short divergence time of Neanderthals and AMH, we still share much of our genetic variation with Neanderthals. However, we can recognize alleles of Neanderthal ancestry in humans by aggregating information along the genome using statistical methods [9, 13]. Here, we develop theory to predict the frequency of Neanderthal-derived alleles as a function of the strength of purifying selection at linked exonic sites, recombination rate, initial introgression proportion, and split time. We fit these predictions to recently published estimates of the frequency of Neanderthal ancestry in modern humans [13]. Our results enhance our understanding of how selection shaped the contribution of Neanderthals to our genomes, and shed light on the nature of Neanderthal–human hybridization.

Discussion There is growing evidence that selection has on average acted against autosomal Neanderthal alleles in anatomically modern humans (AMH). Our approach represents one of the first attempts to estimate the strength of genome-wide selection against introgression between populations. The method we use is inspired by previous efforts to infer the strength of background selection and selective sweeps from their footprint on linked neutral variation on a genomic scale [36–39]. We have also developed an approach to estimate selection against ongoing maladaptive gene flow using diversity within and among populations [40] that will be useful in extending these findings to a range of taxa. Building on these approaches, more refined models of selection against Neanderthal introgression could be developed. These could extend our results by estimating a distribution of selective effects against Neanderthal alleles, or by estimating parameters separately for various categories of sequence, such as non-coding DNA, functional genes, and other types of polymorphism (e.g. structural variation) [41]. Here, we have shown that observed patterns of Neanderthal ancestry in modern human populations are consistent with genome-wide purifying selection against many weakly deleterious alleles. For simplicity, we allowed selection to act only on exonic sites. It is therefore likely that the effects of nearby functional non-coding regions are subsumed in our estimates of the density (μ) and average strength (s) of purifying selection. Our findings of weak selection are thus conservative in the sense that the true strength of selection per base pair may be even weaker. We argue that the bulk of selection against Neanderthal ancestry in humans may be best understood as being due to the accumulation of alleles that were effectively neutral in the Neanderthal population, which was of relatively small effective size.
However, these alleles started to be purged, by weak purifying selection, after introgressing into the human population, due to its larger effective population size. Thus, we have shown that it is not necessary to hypothesize many loci harboring intrinsic hybrid incompatibilities, or alleles involved in ecological differences, to explain the bulk of observed patterns of Neanderthal ancestry in AMH. Indeed, given a rather short divergence time between Neanderthals and AMH, it is a priori unlikely that strong hybrid incompatibilities had evolved at a large number of loci before the populations interbred. It often takes millions of years for hybrid incompatibilities to evolve in mammals [42, 43], although there are exceptions to this [44], and theoretical results suggest that such incompatibilities are expected to accumulate only slowly at first [45, 46]. While this is a subjective question, our results suggest that genomic data—although clearly showing a signal of selection against introgression—do not strongly support the view that Neanderthals and humans should be viewed as incipient species. Sankararaman et al. [13] found that genes expressed in the human testes showed a significant reduction in Neanderthal introgression, and interpreted this as being potentially consistent with a role of reproductive genes in speciation. However, this pattern could also be explained if testes genes were more likely to harbor weakly deleterious alleles, which could have accumulated in Neanderthals. These two hypotheses could be addressed by relating within-species estimates of the distribution of selective effects with estimates of selection against introgression at these testes genes. This is not to say that alleles of larger effect, in particular those underlying ecological or behavioral differences, did not exist, but rather that they are not needed to explain the observed relationship between gene density and Neanderthal ancestry. 
Alleles of large negative effect would have quickly been removed from admixed populations, and would likely have led to extended genomic regions showing a deficit of Neanderthal ancestry as described by [9, 13, 47]. Since our method allows us to model the expected amount of Neanderthal ancestry along the genome accounting for selection, it could serve as a better null model for finding regions that are unusually devoid of Neanderthal ancestry. We have ignored the possibility of adaptive introgressions from Neanderthals into humans. While a number of fascinating putatively adaptive introgressions have come to light [14], and more will doubtlessly be identified, they will likely make up a tiny fraction of all Neanderthal haplotypes. We therefore think that they can be safely ignored when assessing the long-term deleterious consequences of introgression. As our results imply, selection against deleterious Neanderthal alleles was very weak on average, such that, after tens of thousands of years since their introduction, these alleles will have only decreased in frequency by 56% on average. Thus, roughly seven thousand loci (≈ μ × 82 million exonic sites) still segregate for deleterious alleles introduced into Eurasian populations via interbreeding with Neanderthals. However, given that the initial frequency of the admixture was very low, we predict that a typical EUR or ASN individual today only carries roughly a hundred of these weak-effect alleles, which may have some impact on genetic load within these populations. Although selection against each deleterious Neanderthal allele is weak, the early-generation human–Neanderthal hybrids might have suffered a substantial genetic load due to the sheer number of such alleles. The cumulative contribution to fitness of many weakly deleterious alleles strongly depends on the form of fitness interaction among them, but we can still make some educated guesses (the caveats of which we discuss below). 
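The 56% figure above can be sanity-checked with a short calculation. The sketch below is only an illustration under stated assumptions, not the inference procedure used in this paper: it assumes that a rare deleterious allele declines geometrically, by a factor of (1 − s) per generation, and it uses the average selection coefficient of 4 × 10−4 that appears in the multiplicative-load calculation in this Discussion. The number of generations (2,000) is a hypothetical round value chosen to be broadly compatible with the admixture dates quoted in the Introduction; it is not taken from this section.

```python
import math


def fractional_decline(s, generations):
    """Fraction by which a rare deleterious allele has dropped in frequency.

    Assumes simple geometric decay, p_t = p_0 * (1 - s) ** t, which is an
    approximation for alleles that stay at low frequency.
    """
    return 1.0 - (1.0 - s) ** generations


def generations_for_decline(s, decline):
    """Generations needed for a given fractional decline under the same decay."""
    return math.log(1.0 - decline) / math.log(1.0 - s)


# Average selection coefficient quoted in the Discussion.
S_AVG = 4e-4
# Hypothetical generation count (an assumption, not a value from the text).
ASSUMED_GENERATIONS = 2000

decline = fractional_decline(S_AVG, ASSUMED_GENERATIONS)
needed = generations_for_decline(S_AVG, 0.56)

print(f"Decline after {ASSUMED_GENERATIONS} generations: {decline:.1%}")
print(f"Generations needed for a 56% decline: {needed:.0f}")
```

Under these assumptions the decline comes out in the mid-fifties of percent, which is in line with the value stated above; the exact number depends on the generation count and on the form of selection.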
If, for instance, the interaction was multiplicative, then an average F1 individual would have experienced a reduction in fitness of $1 - (1 - 4 \times 10^{-4})^{7000} \approx 94\%$ compared to modern humans, who lack all but roughly one hundred of these deleterious alleles. This would obviously imply a substantial reduction in fitness, which might even have been increased by a small number of deleterious mutations of larger effect that we have failed to capture. This potentially substantial genetic load has strong implications for the interpretation of our estimate of the effective initial admixture proportion ($p_0$), and, more broadly, for our understanding of those early hybrids and the Neanderthal population. We now discuss these topics in turn. Strictly, under our model, the estimate of $p_0$ reflects the initial admixture proportion in the absence of unlinked selected alleles. However, the large number of deleterious unlinked alleles present in the first generations after admixture violates that assumption, as each of these unlinked alleles also reduces the fitness of hybrids [23]. These unlinked deleterious alleles should cause a potentially rapid initial loss of Neanderthal ancestry following the hybridization. Harris and Nielsen [32] have recently independently conducted simulations of the dynamics of deleterious alleles during the initial period following Neanderthal admixture. They have shown that the frequency of Neanderthal-derived alleles indeed decreases rapidly in the initial generations due to the aggregate effects of many weakly deleterious loci. The reduction in neutral Neanderthal ancestry due to unlinked sites under selection is felt equally along the genome and, as such, our estimate of $p_0$ is an effective admixture proportion that incorporates the genome-wide effect of unlinked deleterious mutations, but not the localized effect of linked deleterious mutations (as formalized by Bengtsson [23]).
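The multiplicative-load figure at the start of this paragraph can be reproduced with a few lines of code. This is a minimal sketch assuming strictly multiplicative fitness effects and an identical selection coefficient at every locus, using the values quoted in the text (s = 4 × 10−4, roughly 7,000 loci, roughly 100 alleles in a present-day individual). The function name and the optional baseline argument are hypothetical and introduced here only for illustration.

```python
def multiplicative_load(s, n_loci, n_baseline=0):
    """Relative fitness reduction under multiplicative effects.

    Each deleterious allele multiplies fitness by (1 - s). The reduction is
    measured against a reference individual that carries `n_baseline` such
    alleles (0 means a reference free of these alleles).
    """
    if n_baseline > n_loci:
        raise ValueError("n_baseline must not exceed n_loci")
    return 1.0 - (1.0 - s) ** (n_loci - n_baseline)


# Values quoted in the Discussion.
S_AVG = 4e-4
N_LOCI = 7000
N_MODERN = 100  # approximate count carried by a present-day individual

# Reduction relative to a reference with none of these alleles.
load_vs_none = multiplicative_load(S_AVG, N_LOCI)
# Reduction relative to a reference that still carries about 100 of them.
load_vs_modern = multiplicative_load(S_AVG, N_LOCI, N_MODERN)

print(f"Load vs. allele-free reference: {load_vs_none:.1%}")
print(f"Load vs. modern-human reference: {load_vs_modern:.1%}")
```

Both variants give a reduction close to the 94% stated above, because the roughly one hundred alleles retained by present-day individuals change the exponent only slightly. Any departure from multiplicative interaction, such as the soft selection or non-multiplicative epistasis raised later in the Discussion, would change this number.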
In practice, segregation and recombination during meiosis in the early generations after admixture will have led to a rapid dissipation of the initial associations (statistical linkage disequilibrium) among any focal neutral site and unlinked deleterious alleles. Therefore, our estimates of $p_0$ can actually be interpreted as the admixture proportion at which the frequency of Neanderthal alleles settled after the first few generations of segregation away from unlinked deleterious alleles. As a consequence, the true initial admixture proportion may have been much higher than our current estimates of $p_0$. However, any attempt to correct for this potential bias in our estimates of $p_0$ is likely very sensitive to assumptions about the form of selection, as we discuss below. Conversely, our estimates of the strength and density of deleterious sites (s and μ) do not strongly change when we include multiple deleterious sites or consider large windows surrounding each focal neutral site (up to 10 cM) in our inference procedure (see S2 Text for details). This is likely because much of the information about s and μ comes from the localized dip in Neanderthal ancestry close to genes, and thus these estimates are not strongly affected by the inclusion of other weakly linked deleterious alleles (the effects of which are more uniform, and mostly affect $p_0$). If the predicted drop in hybrid fitness is due to the accumulation of many weakly deleterious alleles in Neanderthals, as supported by our simulations, it also suggests that Neanderthals may have had a very substantial genetic load (more than 94% reduction in fitness) compared to AMH (see also [28, 29, 32]). It is tempting to conclude that this high load strongly contributed to the low population densities, and the extinction (or at least absorption), of Neanderthals when faced with competition from modern humans. However, this ignores a number of factors.
First, selection against this genetic load may well have been soft, i.e. fitness is measured relative to the most fit individual in the local population, and epistasis among these many alleles may not have been multiplicative [48–50]. Therefore, Neanderthals, and potentially early-generation hybrids, may have been shielded from the predicted selective cost of their load. Second, Neanderthals may have evolved a range of compensatory adaptations to cope with this large deleterious load. Finally, Neanderthals may have had a suite of evolved adaptations and cultural practices that offered a range of fitness advantages over AMH at the cold Northern latitudes that they had long inhabited [51, 52]. These factors also mean that our estimates of the total genetic load of Neanderthals, and indeed the fitness of the early hybrids, are at best provisional. The increasing number of sequenced ancient Neanderthal and human genomes from close to the time of contact [7, 17, 53] will doubtlessly shed more light on these parameters. However, some of these questions may be fundamentally difficult to address from genomic data alone. Whether or not the many weakly deleterious alleles in Neanderthals were a cause, or a consequence, of the low Neanderthal effective population size, they have had a profound effect on patterning levels of Neanderthal introgression in our genomes. More generally, our results suggest that differences in effective population size and nearly neutral dynamics may be an important determinant of levels of introgression across species and along the genome. Species coming into secondary contact often have different demographic histories (e.g. as is the case of Drosophila yakuba and D. santomea [54, 55] or in Xiphophorus sister species [56]) and so the dynamics we have described may be common. 
We have here considered the case of introgression from a small population (Neanderthals) into a larger population (humans), where selection acts genome-wide against deleterious alleles introgressing. However, from the perspective of a small population with segregating or fixed deleterious alleles, introgression from a population lacking these alleles can be favoured [57]. This could be the case if the source population had a large effective size, and hence lacked a comparable load of deleterious alleles. Therefore, due to this effect, our results may also imply that Neanderthal populations would have received a substantial amount of adaptive introgression from modern humans.

Acknowledgments We would like to thank Nicolas Bierne, Jeremy Berg, Vince Buffalo, Gideon Bradburd, Yaniv Brandvain, Nancy Chen, Henry Coop, Kristin Lee, Samantha Price, Alisa Sedghifar, Guy Sella, Michael Turelli, Tim Weaver, Chenling Xu, and members of the Ross-Ibarra and Schmitt labs at UC Davis for helpful feedback on the work described in this paper. We thank David Reich, Molly Schumer, and two anonymous reviewers for feedback on an earlier version of the paper.

Author Contributions Conceptualization: IJ SA GC. Formal analysis: IJ SA GC. Investigation: IJ SA GC. Methodology: IJ SA GC. Software: IJ SA GC. Validation: IJ SA GC. Visualization: IJ SA GC. Writing – original draft: IJ SA GC. Writing – review & editing: IJ SA GC.