Samples, sequencing and authenticity

Recent excavations of Satsurblia cave in Western Georgia yielded a human right temporal bone, dated to the Late Upper Palaeolithic (13,132–13,380 cal. BP). Following the approach of Gamba et al.3, extractions from the dense part of the petrous bone yielded sequencing libraries comprising 13.8% alignable human sequence, which were used to generate 1.4-fold genome coverage. A molar tooth sampled from a later Mesolithic (9,529–9,895 cal. BP) burial in Kotias Klde, a rockshelter also in Western Georgia, showed excellent preservation, with endogenous human DNA content of 76.9%. This was sequenced to 15.4-fold genome coverage. Grotte du Bichon is a cave situated in the Swiss Jura Mountains where a skeleton of a young male of Cro-Magnon type was found and dated to the Late Upper Palaeolithic (13,560–13,770 cal. BP; for further details on the archaeological context see Supplementary Note 1). A petrous bone sample from this skeleton also gave excellent endogenous content, at 71.5%, and was sequenced to 9.5-fold coverage. The sequence data from each genome showed sequence length and nucleotide misincorporation patterns indicative of post-mortem damage, and contamination estimates based on X chromosome and mitochondrial DNA tests (Supplementary Note 2) were <1%, comparable to those found in other ancient genomes2,3,8.

Continuity across the Palaeolithic–Mesolithic boundary

Kotias and Satsurblia, the two CHG, are genetically different from all other early Holocene (that is, Mesolithic and Neolithic) ancient genomes1,2,3,4,5,6,8,9,10, while Bichon is similar to other younger WHG. The distinctness of CHG can be clearly seen on a principal component analysis (PCA) plot11 loaded on contemporary Eurasian populations1, where they fall between modern Caucasian and South Central Asian populations in a region of the graph separated from both other hunter-gatherer and EF samples (Fig. 1a). Clustering using ADMIXTURE software12 confirms this view, with CHG forming their own homogeneous cluster (Fig. 1b). The close genetic proximity between Satsurblia and Kotias is also formally supported by D-statistics13, indicating the two CHG genomes form a clade to the exclusion of other pre-Bronze Age ancient genomes (Supplementary Table 2; Supplementary Note 3), suggesting continuity across the Late Upper Palaeolithic and Mesolithic periods. This result is mirrored in western Europe, as Bichon is close to other WHG in PCA space (Fig. 1a) and outgroup f3 analysis (Supplementary Fig. 1), belongs to the same cluster as other WHG in ADMIXTURE analysis (Fig. 1b), and forms a clade with other WHG to the exclusion of other ancient genomes based on D-statistics (Supplementary Table 3; Supplementary Note 3). Thus, these new data indicate genomic persistence between the Late Upper Palaeolithic and Mesolithic both within western Europe and, separately, within the Caucasus.
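As an illustration of the clade test described above, the following minimal sketch computes a D-statistic from per-SNP allele frequencies. It assumes the frequency-based form of Patterson et al. (2012) and is not the authors' pipeline; the function name `d_statistic` and all toy frequencies are hypothetical. Sign conventions differ between software packages, so only the behaviour under this particular definition is shown: D is close to zero when Y and Z are symmetrically related to X.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-SNP allele frequencies (floats in [0, 1]).

    Assumed form: sum of (w - x)(y - z) over SNPs, divided by the sum of
    (w + x - 2wx)(y + z - 2yz) over the same SNPs.
    """
    sites = list(zip(w, x, y, z))
    num = sum((a - b) * (c - d) for a, b, c, d in sites)
    den = sum((a + b - 2 * a * b) * (c + d - 2 * c * d) for a, b, c, d in sites)
    if den == 0:
        raise ValueError("no informative SNPs")
    return num / den


# Toy frequencies (invented for illustration only).
outgroup = [0.0, 0.0, 0.0, 0.0]
pop_x = [1.0, 1.0, 0.0, 0.0]

# Y and Z share derived alleles with X equally often: D is zero.
symmetric = d_statistic(outgroup, pop_x, [1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 1.0, 0.0])

# Y shares derived alleles with X and Z does not: D departs from zero.
skewed = d_statistic(outgroup, pop_x, [1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.0])
```

In practice the statistic is computed over hundreds of thousands of SNPs, and its deviation from zero is judged against a standard error rather than read directly.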

Figure 1: Genetic structure of ancient Europe. (a). Principal component analysis. Ancient data from Bichon, Kotias and Satsurblia genomes were projected11 onto the first two principal components defined by selected Eurasians from the Human Origins data set1. The percentage of variance explained by each component accompanies the titles of the axes. For context we included data from published Eurasian ancient genomes sampled from the Late Pleistocene and Holocene where at least 200,000 SNPs were called1,2,3,4,5,6,7,9 (Supplementary Table 1). Among ancients, the early farmer and western hunter-gatherer (including Bichon) clusters are clearly identifiable, and the influence of ancient north Eurasians is discernible in the separation of eastern hunter-gatherers and the Upper Palaeolithic Siberian sample MA1. The two Caucasus hunter-gatherers occupy a distinct region of the plot, suggesting a Eurasian lineage distinct from previously described ancestral components. The Yamnaya are located in an intermediate position between CHG and EHG. (b). ADMIXTURE ancestry components12 for ancient genomes (K=17) showing a CHG component (Kotias, Satsurblia) which also segregates in the Yamnaya and later European populations.

Deep coalescence of early Holocene lineages

The geographical proximity of the Southern Caucasus to the Levant raises the question of whether CHG might be related to early Neolithic farmers with Near Eastern heritage. To address this question formally we reconstructed the relationship among WHG, CHG and EF using available high-quality ancient genomes1,3. We used outgroup f3-statistics14 to compare the three possible topologies, with the correct relationship being characterized by the largest amount of shared drift between the two groups that form a clade with respect to the outgroup (Fig. 2a; Supplementary Note 4). A scenario in which the population ancestral to both CHG and EF split from WHG receives the highest support, implying that CHG and EF form a clade with respect to WHG. We can reject a scenario in which CHG and WHG form a distinct clade with respect to EF. The known admixture of WHG with EF1,3,4,5 implies that some shared drift is found between WHG and EF with respect to CHG, but this is much smaller than the shared drift between CHG and EF. Thus, WHG split first, with CHG and EF separating only at a later stage.
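The topology comparison above can be sketched as follows. This is a simplified illustration, not the authors' analysis: it uses the plain product form of f3 without the finite-sample correction applied by standard software, and the function name `f3` and the toy allele frequencies are hypothetical. The pair of populations with the largest outgroup f3 value is taken as the pair that shares the most drift and therefore forms a clade.

```python
from itertools import combinations


def f3(target, a, b):
    """Plain f3(A, B; target): mean over SNPs of (t - a)(t - b).

    With an outgroup as the target, larger values mean that A and B
    share more drift since their split from the outgroup.
    """
    if not target:
        raise ValueError("no SNPs supplied")
    return sum((t - p) * (t - q) for t, p, q in zip(target, a, b)) / len(target)


# Toy allele frequencies (invented for illustration only).
outgroup = [0.0, 0.0, 0.0, 0.0]
freqs = {
    "WHG": [0.0, 0.0, 1.0, 0.0],
    "CHG": [1.0, 1.0, 0.0, 0.0],
    "EF": [1.0, 1.0, 0.0, 1.0],
}

# One outgroup f3 value per candidate pair, that is, per candidate topology.
shared_drift = {
    pair: f3(outgroup, freqs[pair[0]], freqs[pair[1]])
    for pair in combinations(sorted(freqs), 2)
}
best_pair = max(shared_drift, key=shared_drift.get)
```

With these toy values the CHG and EF pair shares the most drift, which mirrors the direction of the reported result but carries no evidential weight of its own.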

Figure 2: The relationship between Caucasus hunter-gatherers (CHG), western hunter-gatherers and early farmers. (a). Alternative phylogenies relating western hunter-gatherers (WHG), CHG and early farmers (EF, highlighted in orange), with the appropriate outgroup f3-statistics. (b). The best supported relationship among CHG (Kotias), WHG (Bichon, Loschbour), and EF (Stuttgart), with split time estimates using G-PhoCS15. Oxygen-18 values (per mil) from the NGRIP core provide the climatic context; the grey box shows the extent of the Last Glacial Maximum (LGM).

We next dated the splits among WHG, CHG and EF using a coalescent model implemented with G-PhoCS15 based on the high-coverage genomes in our data set (Fig. 2b for a model using the German farmer Stuttgart1 to represent EF; and Supplementary Table 5 for models using the Hungarian farmer NE1 (ref. 3)) and taking advantage of the mutation rate recently derived from Ust’-Ishim10. G-PhoCS dates the split between WHG and the population ancestral to CHG and EF at ∼40–50 kya (range of best estimates depending on which genomes are used; see Supplementary Table 5 and Supplementary Note 5 for details), implying that they diverged early on during the colonisation of Europe16, and well before the LGM. On the WHG branch, the split between Bichon and Loschbour1 is dated to ∼16–18 kya (just older than the age of Bichon), implying continuity in western Europe, which supports the conclusions from our previous analyses. The split between CHG and EF is dated at ∼20–30 kya, emerging from a common basal Eurasian lineage1 (Supplementary Fig. 2) and suggesting a possible link with the LGM, although the broad confidence intervals require some caution with this interpretation. In any case, the sharp genomic distinctions between these post-LGM populations contrast with the comparative lack of differentiation between the earlier Eurasian genomes, for example, as visualized in the ADMIXTURE analysis (Fig. 1b), and it seems likely that this structure emerged as a result of ice age habitat restriction. Like EF, but in contrast to WHG, CHG carry a variant of the SLC24A5 gene17 associated with light skin colour (rs1426654, see Supplementary Note 6). This trait, which is believed to have risen to high frequency during the Neolithic expansion18, may thus have a relatively long history in Eurasia, with its origin probably predating the LGM.

A partial genome from a 24,000-year-old individual (MA1) from Mal’ta, Siberia6 had been shown to be divergent from other ancient samples and was shown by Lazaridis et al.1, using f4 statistics, to have more shared alleles with nearly all modern Europeans than with an EF genome. This allowed inference of an ANE component in European ancestry, which was subsequently shown to have an influence in later eastern hunter-gatherers and to have spread into Europe via an incursion of Steppe herders beginning ∼4,500 years ago5,7. Several analyses indicate that CHG genomes are not a subset of this ANE lineage. First, MA1 and CHG plot in distinct regions of the PCA and also have very different profiles in the ADMIXTURE analysis (Fig. 1). Second, when we test if CHG shows any evidence of excess allele sharing with MA1 relative to WHG using tests of the form D(Yoruba, CHG; MA1, WHG), no combinations were significantly positive (Supplementary Table 6). Last, we also tested whether the ancestral component inferred in modern Europeans from MA1 was distinct from any that may have been donated from CHG using tests of the form D(Yoruba, MA1; CHG, modern North European population) (Supplementary Table 7). All northern Europeans showed a significant sharing of alleles with MA1 separate from any they shared with CHG.
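Whether a D-statistic is "significantly positive" is usually judged from a Z-score obtained by a block jackknife over the genome. The sketch below is a simplified, equal-weight delete-one version and should be read as an assumption-laden illustration: standard tools typically weight blocks by their SNP count, the function name `jackknife_z` and the toy block sums are hypothetical, and the |Z| > 3 threshold mentioned in the comment is a common convention rather than something stated in the text.

```python
import math


def jackknife_z(block_sums):
    """Delete-one block jackknife for a ratio statistic such as D.

    block_sums: list of (numerator, denominator) sums, one per genomic block.
    Returns (estimate, standard_error, z_score).
    """
    n = len(block_sums)
    if n < 2:
        raise ValueError("need at least two blocks")
    total_num = sum(num for num, _ in block_sums)
    total_den = sum(den for _, den in block_sums)
    estimate = total_num / total_den
    # Recompute the statistic with each block left out in turn.
    leave_one_out = [
        (total_num - num) / (total_den - den) for num, den in block_sums
    ]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    se = math.sqrt(variance)
    z = estimate / se if se > 0 else float("nan")
    return estimate, se, z


# Toy per-block sums (invented for illustration only).
d_hat, se, z = jackknife_z([(2.0, 4.0), (0.0, 4.0), (1.0, 4.0)])
# Under the assumed |Z| > 3 convention, this toy result would not be significant.
significant = abs(z) > 3
```

Blocks are used instead of single SNPs because neighbouring SNPs are correlated through linkage, so treating them as independent would understate the standard error.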

WHG and CHG are the descendants of two ancient populations that appear to have persisted in Europe since the mid Upper Palaeolithic and survived the LGM separately. We looked at runs of homozygosity (ROH: Fig. 3), which inform on past population size3,19,20. Both WHG and CHG have a high frequency of ROH and, in particular, the older CHG, Satsurblia, shows signs of recent consanguinity, with a high frequency of longer (>4 Mb) ROH. In contrast, EF are characterised by a lower frequency of ROH of all sizes, suggesting a less constricted population history20,21, perhaps associated with a more benign passage through the LGM than the more northern populations (see Supplementary Note 7 for further details).

Figure 3: Distribution of ROH. (a). The total length of short ROH (<1.6 Mb) plotted against the total length of long ROH (≥1.6 Mb) and (b) mean total ROH length for a range of length categories. ROH were calculated using a panel of 199,868 autosomal SNPs. For Kotias we analysed both high-coverage genotypes and genotypes imputed from downsampled data (marked in italics; see Supplementary Information). Diploid genotypes imputed from low-coverage variant calls were used for Satsurblia and high-coverage genotypes were used for all other samples. A clear distinction is visible between WHG and CHG, who display an excess of shorter ROH, akin to modern Oceanic and Onge populations, and EF, who resemble other populations with sustained larger ancestral population sizes.
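The short versus long ROH summary plotted in Fig. 3a can be sketched as below. This is a deliberately simplified scan, not the authors' method: it treats any heterozygous call as ending a run, ignores missing data and genotyping error, and the minimum-SNP and minimum-length parameters are assumed defaults. Only the 1.6 Mb boundary comes from the caption; function names and toy genotypes are hypothetical.

```python
LONG_ROH_BP = 1_600_000  # short/long boundary quoted in the Fig. 3 caption


def find_roh(positions, genotypes, min_snps=3, min_length_bp=500_000):
    """Return (start_bp, end_bp) for each run of consecutive homozygous SNPs.

    positions: sorted base-pair coordinates on one chromosome.
    genotypes: alternate-allele counts (0 or 2 homozygous, 1 heterozygous).
    min_snps and min_length_bp are assumed thresholds, not values from the text.
    """
    runs, current = [], []
    # A trailing heterozygous sentinel closes a run that reaches the last SNP.
    for pos, gt in list(zip(positions, genotypes)) + [(None, 1)]:
        if gt in (0, 2):
            current.append(pos)
            continue
        if (current and len(current) >= min_snps
                and current[-1] - current[0] >= min_length_bp):
            runs.append((current[0], current[-1]))
        current = []
    return runs


def summarise_roh(runs):
    """Total base pairs in short (<1.6 Mb) and long (>=1.6 Mb) runs."""
    short_total = sum(end - start for start, end in runs if end - start < LONG_ROH_BP)
    long_total = sum(end - start for start, end in runs if end - start >= LONG_ROH_BP)
    return short_total, long_total


# Toy chromosome (invented for illustration only).
positions = [0, 400_000, 800_000, 1_000_000, 1_200_000,
             2_000_000, 3_000_000, 3_500_000, 3_600_000, 4_000_000]
genotypes = [0, 2, 0, 1, 2, 2, 0, 2, 1, 0]
runs = find_roh(positions, genotypes)
short_total, long_total = summarise_roh(runs)
```

An excess of short runs points to a small ancestral population, whereas long runs point to recent relatedness between parents, which is the distinction the figure draws.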

Caucasus hunter-gatherer contribution to subsequent populations

We next explored the extent to which Bichon and CHG contributed to contemporary populations using outgroup f3(African; modern, ancient) statistics, which measure the shared genetic history between an ancient genome and a modern population since they diverged from an African outgroup. Bichon, like younger WHG, shows strongest affinity to northern Europeans (Supplementary Fig. 3), while contemporary southern Caucasus populations are the closest to CHG (Fig. 4a and Supplementary Fig. 3), thus implying a degree of continuity in both regions stretching back at least 13,000 years to the late Upper Palaeolithic. Continuity in the Caucasus is also supported by the mitochondrial and Y chromosomal haplogroups of Kotias (H13c and J2a, respectively) and Satsurblia (K3 and J), which are all found at high frequencies in Georgia today22,23,24 (Supplementary Note 8).

Figure 4: The relationship of Caucasus hunter-gatherers to modern populations. (a). Genomic affinity of modern populations1 to Kotias, quantified by the outgroup f3-statistics of the form f3(Kotias, modern population; Yoruba). Kotias shares the most genetic drift with populations from the Caucasus, with high values also found for northern Europe and central Asia. (b). Sources of admixture into modern populations: semicircles indicate those that provide the most negative outgroup f3 statistic for that population. Populations for which a significantly negative statistic could not be determined are marked in white. Populations for which the ancient Caucasus genomes are best ancestral approximations include those of the Southern Caucasus and, interestingly, South and Central Asia. Western Europe tends to be a mix of early farmers and western/eastern hunter-gatherers, while Middle Eastern genomes are described as a mix of early farmers and Africans.

EF share greater genetic affinity to populations from southern Europe than to those from northern Europe, with an inverted pattern for WHG1,2,3,4,5. Surprisingly, we find that CHG influence is stronger in northern than in southern Europe (Fig. 4a and Supplementary Fig. 3A) despite the closer relationship between CHG and EF compared with WHG, suggesting an increase of CHG ancestry in Western Europeans subsequent to the early Neolithic period. We investigated this further using D-statistics of the form D(Yoruba, Kotias; EF, modern Western European population), which confirmed a significant introgression from CHG into modern northern European genomes after the early Neolithic period (Supplementary Fig. 4).

CHG origins of migrating Early Bronze Age herders

We investigated the temporal stratigraphy of CHG influence by comparing these data to previously published ancient genomes. We find that CHG, or a population close to them, contributed to the genetic makeup of individuals from the Yamnaya culture, who have been implicated as vectors for the profound influx of Pontic steppe ancestry that spread westwards into Europe and east into central Asia with metallurgy, horse riding and probably Indo-European languages in the third millennium BC5,7. CHG ancestry in these groups is supported by ADMIXTURE analysis (Fig. 1b) and admixture f3-statistics14,25 (Fig. 5), which best describe the Yamnaya as a mix of CHG and Eastern European hunter-gatherers. The Yamnaya were semi-nomadic pastoralists, mainly dependent on stock-keeping but with some evidence for agriculture, including incorporation of a plow into one burial26. As such it is interesting that they lack an ancestral coefficient of the EF genome (Fig. 1b), which permeates through western European Neolithic and subsequent agricultural populations. During the Early Bronze Age, the Caucasus was in communication with the steppe, particularly via the Maikop culture27, which emerged in the first half of the fourth millennium BC. The Maikop culture predated and, possibly with earlier southern influences, contributed to the formation of the adjacent Yamnaya culture that emerged further to the north and may be a candidate for the transmission of CHG ancestry. In the ADMIXTURE analysis of later ancient genomes (Fig. 1b) the Caucasus component gives a marker for the extension of Yamnaya admixture, with substantial contribution to both western and eastern Bronze Age samples. However, this is not completely coincident with metallurgy; Copper Age genomes from Northern Italy and Hungary show no contribution; neither does the earlier of two Hungarian Bronze Age individuals.

Figure 5: Lowest admixture f3-statistics of the form f3(X, Y; Yamnaya). These statistics represent the Yamnaya as a mix of two populations with a more negative result signifying the more likely admixture event. (a). All negative statistics found for the test f3(X, Y; Yamnaya) with the most negative result f3(CHG, EHG; Yamnaya) highlighted in purple. Lines bisecting the points show the standard error. (b). The most significantly negative statistics which are highlighted by the yellow box in a. Greatest support is found for Yamnaya being a mix of Caucasus hunter-gatherers (CHG) and Russian hunter-gatherers who belong to an eastern extension of the WHG clade (EHG).
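The logic of the admixture test can be illustrated with a small sketch. It assumes the plain product form of f3 without the finite-sample correction used by standard software, and every name and frequency below is hypothetical: the target is built as an equal mix of two toy sources, so the statistic for that pair is negative by construction, and the scan over candidate pairs then recovers it as the most negative, in the manner of the ranking shown in Fig. 5.

```python
from itertools import combinations


def admixture_f3(x, y, target):
    """Plain f3(X, Y; target): mean over SNPs of (t - x)(t - y).

    A negative value indicates that target frequencies lie between those
    of X and Y, as expected if the target is a mixture of related groups.
    """
    if not target:
        raise ValueError("no SNPs supplied")
    return sum((t - a) * (t - b) for t, a, b in zip(target, x, y)) / len(target)


# Toy source frequencies (invented for illustration only).
sources = {
    "CHG": [1.0, 0.0, 1.0, 0.0],
    "EHG": [0.0, 1.0, 1.0, 0.0],
    "EF": [1.0, 1.0, 0.0, 0.0],
}

# Hypothetical target built as a 50:50 mix of the toy CHG and EHG frequencies.
target = [0.5 * c + 0.5 * e for c, e in zip(sources["CHG"], sources["EHG"])]

# Evaluate every pair of candidate sources and keep the most negative result.
results = {
    pair: admixture_f3(sources[pair[0]], sources[pair[1]], target)
    for pair in combinations(sorted(sources), 2)
}
most_negative = min(results, key=results.get)
```

For a target that is an exact mix with proportion α of X, each SNP contributes −α(1 − α)(x − y)², which is why the true source pair yields the most negative value in this toy setting.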

Modern impact of CHG ancestry