Over the past few years we have seen ancient DNA researchers “carve nature at its joints” when it comes to the paleohistory of Europe after the end of the last Ice Age. In relation to this historical reconstruction we aren’t at the end of the road, but I do think that the terminus is within sight. There are only so many populations one can sample, and so many statistical constructs one can posit, before one is on the plateau of diminishing marginal returns. For example, the model of Holocene Europe being a synthesis of two very distinctive populations which merged after the last Ice Age was too simple. A model with three populations is sufficient for the vast majority of European groups. Though in these sorts of situations more complex models may be consistent with the results, the bias is to go with parsimony, and attempt some alignment with linguistic and archaeological evidence.

A new paper in Nature Communications, Upper Palaeolithic genomes reveal deep roots of modern Eurasians, fills in some gaps in the broader picture. The figure at the top of this post illustrates the modification that these authors made to the schematic of Lazaridis et al. Iosif Lazaridis himself has weighed in on Twitter:

In Haak, Lazaridis et al. (2015) we modeled Yamnaya as EHG+Something from the Caucasus/Near East. Quite convinced that CHG is that something — Iosif Lazaridis (@iosif_lazaridis) November 16, 2015

CHG = Caucasus hunter-gatherers. More specifically, the authors of this paper analyze two subfossils from Georgia, Kotias and Satsurblia, dated to ~10 to ~13 thousand years ago. Kotias, at ~15x coverage (that is, each position is sampled ~15 times, so you have a good sense of variation at any given position), is particularly useful. What they found is as Lazaridis reports above: CHG seem to be one of the primordial groups that gave rise to the extant variation of modern Europeans, and Western Eurasians writ large.

The rough stylized history of the non-African populations runs as follows: a “basal Eurasian” (bEu) population separates off first, and then west and east Eurasians diverge, and then in the west there is a divergence between the ancestors of western hunter-gatherers (WHG) and ancient north Eurasians (ANE). The early European farmers (EEF) are compounds between WHG and bEu, with a slight bias toward WHG. The Anatolian farmers were also admixed, though biased toward bEu. The eastern hunter-gatherers (EHG) are a balanced mix between WHG and ANE, and this group fused with the CHG to give rise to Yamnaya. This brings up the question: are CHG the basal Eurasians? I doubt it. The paleodemography of the ancient Near East has been barely elucidated. It seems likely that CHG, like the Anatolian farmers, are a compound of some sort. Basal Eurasians may manifest as an allele frequency spectrum across the Middle East during this period, the remnants of a back migration from west Eurasian groups mixing with the ur-Basal Eurasians, who were the first to split off from the Out of Africa migration. In the colder and drier world of the Last Glacial Maximum (LGM) it seems likely that some of the northern hunter-gatherer populations would move into territory occupied by basal Middle Eastern groups. In addition, the Sahara had periods of extreme dryness during the Ice Age, so the ur-basal Eurasians wouldn’t necessarily have been able to withdraw to Africa.
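As a back-of-envelope check on the stylized history above, the nested mixtures can be expanded into their ultimate sources. This is only a sketch: the 50/50 splits are the round numbers used in this post (EHG as a "balanced mix", and CHG contributing about half of Yamnaya ancestry), CHG is treated as an unadmixed source even though it is probably a compound itself, and the `RECIPES` table and `expand` helper are hypothetical names invented for the illustration.

```python
# Hypothetical mixture table using the round proportions from this post.
# These are illustrative values, not estimates taken from the paper.
RECIPES = {
    "EHG": {"WHG": 0.5, "ANE": 0.5},
    "Yamnaya": {"EHG": 0.5, "CHG": 0.5},
}


def expand(population, recipes=RECIPES):
    """Return {source: fraction}, recursing through any admixed components."""
    if population not in recipes:
        # Anything without a recipe is treated as an unadmixed source.
        return {population: 1.0}
    totals = {}
    for component, weight in recipes[population].items():
        for source, fraction in expand(component, recipes).items():
            totals[source] = totals.get(source, 0.0) + weight * fraction
    return totals


yamnaya = expand("Yamnaya")
print(yamnaya)
```

Under these assumptions Yamnaya comes out as roughly one quarter WHG-related, one quarter ANE, and one half CHG, which is why a steppe-derived group can carry some WHG-like drift wherever it goes.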

Using G-PhoCS they inferred separation dates between the various populations. I have two issues with this. First, the mutation rate they assume seems plausible, but there is still some debate about the exact value and whether it is constant across a lineage (for this level of phylogenetic distance the assumption of constancy seems valid). Second, the confidence intervals from these results are huge. The authors report the results, and tentatively attempt to relate separation to the LGM ~20 thousand years before the present, but know that they can’t assert anything robustly. It strikes me that we know the sequence of separations between the groups better than the timing of those separations.

But one definite result is the pattern of ancestry (or shared drift) which is derived from CHG. It is high in the Caucasus, as one would expect, but also in South Asia. This is not surprising. Several papers have suggested that the West Eurasian admixture into South Asians seems to have an affinity with northern West Asians. Agriculture in South Asia began at Mehrgarh with a traditional West Asian cultural toolkit, and likely the character of the ANI-ASI admixture took root here. In Europe many researchers believe that the replacement of the hunter-gatherer populations in most areas was rather complete after the initial admixture event that occurred when farmers first entered the continent, and it seems possible that the same is true in South Asia as well. There are no “Ancestral South Indians” in pure form left, and the variation in ancestry between tribes and caste groups in many areas is not very large.

When you dig into the supplements, though, it all becomes much clearer. To the left you see a table of D-statistics, where the left column lists Indian populations, and the right column lists the top hits, X, for these groups in terms of inferred gene flow with the tree form (Yoruba, X; Onge, Indian population). The key thing to note is that while some Indian groups have the strongest hit from the Kotias CHG sample, for others, notably the North Indian Brahmin Tiwari community, the signal from the Afanasevo is strongest. The Afanasevo are genetically basically the eastern extension of the Yamnaya. In other words, the D-statistics are showing evidence of a migration from the steppe, and a migration from West Asia. This also makes sense of supplementary figure 3, which shows non-trivial shared drift among some South Asian groups with the Swiss Bichon WHG sample. The Afanasevo would have brought this via their EHG ancestry, which was about half similar to WHG.
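For readers who have not met D-statistics before, here is a minimal sketch of the standard allele-frequency form of D(W, X; Y, Z), with W = Yoruba, X = the candidate source, Y = Onge and Z = the Indian population. The frequencies below are made up purely for illustration, the `d_statistic` name is hypothetical, and this is not the authors' pipeline: real analyses use hundreds of thousands of sites and a block jackknife to attach a Z-score to each value.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-site allele frequencies in [0, 1].

    Numerator:   sum over sites of (w - x) * (y - z)
    Denominator: sum over sites of (w + x - 2wx) * (y + z - 2yz)
    """
    num = 0.0
    den = 0.0
    for pw, px, py, pz in zip(w, x, y, z):
        num += (pw - px) * (py - pz)
        den += (pw + px - 2 * pw * px) * (py + pz - 2 * py * pz)
    if den == 0:
        raise ValueError("no informative sites")
    return num / den


# Toy, invented frequencies at four sites (NOT real data).
yoruba = [0.1, 0.2, 0.1, 0.3]   # W: outgroup
source = [0.9, 0.8, 0.2, 0.3]   # X: e.g. an ancient sample being tested
onge   = [0.2, 0.3, 0.1, 0.3]   # Y: reference lacking the gene flow
indian = [0.6, 0.6, 0.15, 0.3]  # Z: population being tested

d = d_statistic(yoruba, source, onge, indian)
print(d > 0)
```

With this ordering a positive D means X shares more alleles with the Indian population than with the Onge, which is the sense in which the table ranks "top hits" for each group. Swapping Y and Z simply flips the sign.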

The evidence from uniparental (Y and mtDNA) and functional genes is also interesting. CHG carry mtDNA haplogroups H13 and K3, and Y chromosomal groups J and J2. It seems likely that the prevalence of haplogroup H is due to post-Neolithic population replacements. The CHG contribute about half the ancestry to Yamnaya, but these two did not have haplogroup R1a or R1b. Haplogroup J2 is particularly common among caste groups in South India. All this points to the likelihood that the Dravidian languages are probably derived from agriculturalists with West Asian roots, and gives a touch more plausibility to the idea that ancient Elamite in Khuzistan may have been a distant relative of Dravidian.

Additionally, the derived light-skin variant of SLC24A5 is found among the CHG, as it is among the Anatolian farmers. The haplotype is the common one found in West and South Eurasian populations. At SLC45A2, by contrast, Kotias is definitely homozygous for the ancestral variant. On the whole most South Asians do not share European light-skin variants except for SLC24A5. The exceptions tend to be groups in the Northwest, and upper castes. Exactly the same groups which likely have the strongest Afanasevo stamp.

One thing the authors note is that the Caucasus itself has been subject to great change. It is clear that a farmer group related to EEF has mixed with the CHG-descended groups. And today the Caucasus has very high fractions of ANE ancestry in some groups, but these samples had none at all. At ASHG a few years ago a prominent population geneticist offered to me that he thought ANE might not have been the best term, as there was no strong evidence that this group wasn’t more common elsewhere. But CHG did not have ANE ancestry, despite that being very salient in modern trans-Caucasian groups. This suggests a later expansion and mixing event. From what I know ANE drift is not evident in many Indian populations, pegging the arrival of ANE-bearing groups to a later period after agriculture. Gene flow into Amerindian groups, and high ANE fractions in Central Siberia and the Altai, do point to their locus of habitation in Northern Eurasia.

Finally, let’s remember that we’re constructing the past from the slim remains which we have on hand. Ten years ago we were using extant genetic variation, because that’s all we had, and that led us astray. In the broadest sketches the inferences were right, but in many details they were misleading. Similarly, we shouldn’t think that the ancient DNA yielding populations are necessarily the direct ancestors of any modern groups. We know, for example, that Mal’ta is actually not ancestral to the ANE population which contributed to both Europeans via the EHG and Native Americans. The ANE drift of these two groups has more in common with each other than with Mal’ta. As with the ancient Ethiopian genome, there are many interesting conclusions one can derive from novel results, but the first carving of nature’s joints is not always the best.