– Rig Veda

Five years ago I found out that my friend Daniel MacArthur and I are members of the same Y chromosomal haplogroup, R1a. Both of us thought it was rather cool, that ~5,000 years ago there lived a man who was ancestral to us both on the direct paternal line. Five years on, and both Dan and I have sons who continue this lineage. True, surely Dan and I share more than one lineage of connection over the past ~5,000 years, the Y chromosomal one is simply the one that is genetically irrefutable since recombination does not break apart the sequence of variants, the haplotype, allowing the inference to be as simple as taking candy from a baby. The common ancestral information is transmitted as a whole block, excepting the mutations which separate us from our common forefather. Additionally, since he has attested South Asian ancestry (< 200 years), we probably share many lines of descent over the past ~3,000 years (one of Dan’s ancestors was stationed in Bengal in the 19th century, so I think our genealogies intersect a decent amount for non-related individuals).

But there’s something special about R1a beyond the fact that it binds me paternally with a host of people who I know from all around the world. The figure to the right is from the supplements of a Genome Research paper, A recent bottleneck of Y chromosome diversity coincides with a global change in culture. You see that R1a1 diverges by very few mutational steps, and a rake-like pattern defines the phylogeny. That is in keeping with a history of relatively recent diversification, and rapid population expansion. The Genome Research paper found that R1a, along with a host of other Y chromosomal lineages, has undergone very rapid demographic expansion in the recent past when put through the sieve of phylogenomic inference. This is similar to what you see with the Genghis Khan haplotype. Remember, this is a very specific signature of direct male descent. It does not necessarily extrapolate well to the rest of the human genome. So, though Daniel MacArthur and I share a common Y chromosomal lineage, he is Northern European and I am South Asian, with all that implies for the set of genealogies which come together to contribute to the patterns of variation we see in our whole genomes.

But recently we’ve been gaining even more understanding of the phylogeography of R1a, and its likely history. To the left is a figure from the supplements of Reconstructing Genetic History of Siberian and Northeastern European Populations. You see in this chart a few important things. First, the sister to the haplogroup R, which includes R1b and R1a, and therefore huge numbers of European, West Asian, and South Asian men, is Q, an Amerindian one. The Mal’ta boy, who lived ~24,000 years ago, seems likely to have carried a basal R1 lineage. This is reasonable because most people peg the divergence of R1a and R1b ~20,000 years ago (or somewhat more recently). A major takeaway here is that the dominant lineages across much of western Eurasia today on the male side seem to derive from a group with central Eurasian affinities. The two R1 lineages are very rare in Europe before ~4,000 years ago, according to ancient DNA. This is also concomitant with the arrival of “Ancient North Eurasian” (ANE) ancestry, which is closer to that of Mesolithic European hunter-gatherers than East Eurasians, but still rather anciently diverged, on the order of ~30-40,000 years before the present. Amerindians also have substantial admixture from this group, as do many groups in the Caucasus, and South Asia.

The second major issue that is evident from this figure is that the Altay population in this paper are Turkic, but “trace approximately 37%…of their ancestry to another unknown population, which the model predicted to be related to modern Europeans.” And their R1a looks basal to the South Indian sample, which because it is from Singapore, is likely to be Tamil. Nearly 15 years ago in The Eurasian Heartland: A continental perspective on Y-chromosome diversity, Spencer Wells reported R1a at reasonable frequencies even among non-Brahmin South Indians. More recent work using more markers suggests that R1a has two very common major lineages in Eurasia, with one very common in Eastern Europe, and decreasing in frequency west, and another common in South Asia, with appreciable fractions in regions of Central Asia such as the Altai mountains. Going back to the earlier work, and connecting the dots, it looks like these two “brotherhoods” of R1a diverged on the order of ~4,000 years ago, both undergoing rapid expansion in different regions of Eurasia.

Oh, but there’s more! Eight thousand years of natural selection in Europe has been updated with new ancient DNA results from Iosif Lazaridis’ work. As you might know by now it seems likely that the Indo-European languages were brought into Europe by peoples related to (descended from?) the Yamna culture of the trans-Caspian steppe. The Yamna were genetically a compound population, with about half their ancestry being derived from “eastern hunter-gatherers” (EHG), who themselves were an equal compound between “western hunter-gatherers” (WHG), presumably descendants of the Pleistocene populations which had retreated to the habitable fringes of the continent, and the previously mentioned ANE group, with Siberian affinities. The other half of the Yamna peoples’ ancestry derives from something similar to that of the early European farmers (EEF), but somewhat different. In particular, rather than western Anatolian affinities, this ancestry seems more trans-Caucasian or eastern Anatolian, with Armenians and Kartvelian groups either being the source population, or related to the source populations.
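To make the nesting of these proportions concrete, here is a back-of-the-envelope sketch. It takes the rough fractions in the paragraph above at face value (half EHG, with EHG itself an even WHG/ANE mix), and the label NEAR_EAST is just my own placeholder for the trans-Caucasian, EEF-like source, not a term from the preprint. Real estimates come with error bars, so treat this as arithmetic, not inference.

```python
from fractions import Fraction

# Rough proportions taken from the text; illustrative only.
# EHG is treated as an equal mix of WHG-related and ANE-related ancestry.
EHG = {"WHG": Fraction(1, 2), "ANE": Fraction(1, 2)}

# NEAR_EAST is a hypothetical label for the trans-Caucasian / eastern
# Anatolian, EEF-like source. It is an assumption made for this sketch.
NEAR_EAST = {"NEAR_EAST": Fraction(1)}


def mix(sources):
    """Combine (weight, composition) pairs into a single composition."""
    total = {}
    for weight, composition in sources:
        for label, share in composition.items():
            total[label] = total.get(label, Fraction(0)) + weight * share
    return total


# Yamna modeled as half EHG and half the Near Eastern-like source.
yamna = mix([(Fraction(1, 2), EHG), (Fraction(1, 2), NEAR_EAST)])

for label, share in sorted(yamna.items()):
    print(f"{label}: {float(share):.0%}")
```

Under these assumptions the Yamna come out at about a quarter WHG-related, a quarter ANE-related, and half Near Eastern-like ancestry.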

Intriguingly, the Yamna carry the R1b haplogroup, today rather rare in Eastern Europe, but common, and modal, in Western Europe, with extremely high frequencies along the Atlantic fringe. The new version of the preprint now reports some ancient DNA results from the successor culture to the Yamna, the Srubna. There are two intriguing aspects to the new results. First, the Srubna have nearly 20% ancestry from a population related to the EEF. There are two possible options here. First, that there was back-migration from Europe after the initial migration west. Second, that an EEF-like migration occurred directly from the Middle East to the steppe. But now, from the preprint:

Srubnaya possess exclusively (n=6) R1a Y-chromosomes (Extended Data Table 1), and four of them (and one Poltavka male) belonged to haplogroup R1a-Z93 which is common in central/south Asians…very rare in present-day Europeans…and absent in all ancient central Europeans studied to date.

First things first. There are some “Out of India” theorists who posit that R1a derives from South Asia. If you take a very deep time perspective this may be true; recall that much of Eurasia was not habitable during the Last Glacial Maximum (LGM), so the distribution of populations was very different from what we see today. But, on the scale of ~4,000 years ago it seems that one can say that the very common variant of R1a found in the eastern Iranian world and South Asia likely derives from the steppe. The reasoning here is that while peoples in South Asia have elements of ancestry across their genome with affinities to the steppe people (e.g., ANE), there is little evidence for distinctively South Asian ancestry (e.g., ASI) in the steppe people. Additionally, the majority of South Asian mtDNA does not have a West Eurasian profile, but is closer to the lineages of eastern Eurasia. This is strongly suggestive of mostly male migrants. What we can say definitively is that it looks as if male lineages overturned each other multiple times on the steppe. First, R1b was dominant. Then in the same region one lineage of R1a came to the fore, only to later be marginalized by another lineage of the same haplogroup. Finally, in Central Asia more generally the Turkic migrations reshaped the whole ethnographic landscape within historical memory.

Though I begin this post with Y chromosomes, I will not end with them. My belief though is that the Y chromosomal story gives us a deep insight into the nature of social relations over the past ~5,000 years. More on this later. But, the constant turnover of the Y chromosomal record should clue us in to the fact that human demographic history exhibits punctuated turnover events, which reshape the genetic landscape radically over a few centuries. This is a far cry from a model of a set of serial founder events from Africa, dispersing outward as a phylogenetic tree overlain upon a spatial map over a time-scale of tens of thousands of years in Fisher waves.

Specifically, I’m referring here to the 2005 paper, Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. Currently, the best rejoinder to this model is probably Towards a new history and geography of human genes informed by ancient DNA, by Joe Pickrell and David Reich. In this review the authors show that though the serial founder bottleneck framework is consistent with the data at a certain level of granularity, it is not the only possibility. What ancient DNA in particular is telling us is that local geographic continuity of lineage is often very rare. This result then should make us skeptical of taking contemporary genetic variation, inferring phylogenies, and then overlaying those phylogenies upon the spatial distribution of particular ethno-linguistic groups. Of course, on a coarser scale of granularity the “Out of Africa” model inferred from older genetic work from the pre-ancient DNA era is probably correct. That is, African populations tend to harbor lots of genetic variation, and are basal in relation to non-African lineages. Or, put another way, non-Africans are a derived lineage of Africans. ~100,000 years ago almost all of the ancestors of non-Africans would have been in Africa (or perhaps the biogeographic extension of Africa in the Middle East).

But the story beyond that scale is more complex. At least some of the first settlers of Europe have no modern descendants in Europe. In fact, these populations are nearly as close to East Asians as they are to modern Europeans, suggesting that the modern east-west and north-south axes in Eurasia are products of events of the last few tens of thousands of years at most. In fact, the synthetic origins of Europeans and South Asians are strongly suggestive of the likelihood that inferences from modern genetic variation only have time depths back ~4-5,000 years or so in much of Eurasia. A recent paper in Science, Ancient Ethiopian genome reveals extensive Eurasian admixture throughout the African continent, suggests widespread back-migration to Africa itself from Eurasia! Though I disagree with the interpretation in some details (I don’t believe that this occurred ~3,000 years ago), the circumstantial evidence from this and other studies is strong that there have been several waves of migration of Eurasian groups back to Africa. Excepting the northern fringe of the continent, in no region is this preponderant, so that the status of Africa as the home of the original population of modern humans from which others derive remains unshaken. For now.

Nevertheless, both ancient DNA and whole genome sequencing are fleshing out surprising and enigmatic details in relation to how human genetic variation came to distribute itself around the world today. Here we can come back to Europe. Mostly because there has been a lot of genetic work on this continent, and the ancient DNA is probably thick enough that we won’t find any major new surprises. In short, the phylogenomic history of the continent over the past ~10,000 years has been “solved” more or less. What did we find out? What can it tell us about the more general human story?

We can start with the present. As noted in The History and Geography of Human Genes, Europe is a very genetically homogeneous continent. The distances as inferred from allele frequency differences between two given populations are very low, and Northern Europe between the Atlantic fringe and the great Eurasian plain in particular is very uniform in terms of the total genome. Today, we know why. As outlined in Massive migration from the steppe was a source for Indo-European languages in Europe, Northern Europe was demographically shaken ~4,000-5,000 years ago by population movements triggered by peoples which left the steppe. It was not a total replacement. But the world of the first farmers, who had issued out of the Middle East ~8,000 years ago, was rocked in the north. The male Y haplogroups associated with these old farming groups, such as G2a, are found at low, though relatively even, proportions all across Northern Europe today.

One interesting aspect of the story is the huge genetic distance between some of these ancient groups. For example, that between the first farmers from the Middle East and their nearby hunter-gatherer neighbors ~8,000 years ago was of the same order as between Europeans and East Asians! This is more than ten times the largest genetic distances you can find in Europe today, but this persisted for thousands of years, though it seems that hunter-gatherer ancestry increased over time among the farming populations, likely through admixture with the local substrate. The reason for this high genetic distance is that the early European farmers carried ancestry which has been termed “Basal Eurasian” (BEu). The name points to the fact that these people seem to have been the first to diverge from all other non-Africans among the Out-of-Africa populations. In other words, ~40% of the ancestry of early European farmers is from a population which is more genetically distant from European hunter-gatherers than Andaman Islanders are. It was the arrival of the steppe people which resulted in the leveling of the genetic distances across much of Europe, overwhelmingly so in the north, and to a non-trivial extent in the south.

So if Europe went through a great homogenization and leveling ~4,000 years ago, why does the “genetic map of Europe” exist? That is, why does geography predict variation in genes so well? There are three things one might say about this. First, PC 1, the largest dimension of variation, is north-south. This comports with the idea that the heritage of the early farmers persisted in the south to a far greater extent, and the Indo-European demographic impact was more modest, if not trivial. An earlier explanation I had seen floated around was that there was a north-south gradient due to expansion from the post-Pleistocene refugia, via the serial bottleneck effect. The real explanation for the north-south difference though seems more likely to be the differing proportions of Indo-European ancestry, overlain upon the early farmer and hunter-gatherer ancestry.

The second issue to consider is that the underlying genetic variation in Europe was absorbed into the expanding population. Even if the steppe invaders differed little from east to west, there were differing levels of absorption of the substrate, and after several thousand years there had likely been some divergence between the different early farmer groups, perhaps due to differing levels of admixture with hunter-gatherers. Basically, PC analysis could still pick up the signal of underlying variation even if that component was minor if the dominant element was not particularly structured (you can pick up indigenous structure in Mestizo populations in Mexico for this reason).
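A toy calculation shows why a minor but structured component can still leave a signal. Everything here is made up for illustration: the three loci, the allele frequencies, and the 20% substrate share are all assumptions, and this is plain arithmetic on allele frequencies, not a PC analysis. The point is only that if the incoming source is identical everywhere, any difference between two admixed populations comes entirely from the local substrates, scaled by their share.

```python
# Hypothetical allele frequencies at three loci (invented numbers).
steppe = [0.10, 0.50, 0.90]           # same incoming source everywhere
substrate_west = [0.80, 0.20, 0.40]   # local pre-existing population, west
substrate_east = [0.30, 0.60, 0.70]   # local pre-existing population, east
substrate_share = 0.2                 # assumed minor local contribution


def admix(source, substrate, share):
    """Allele frequencies after mixing `share` of substrate into source."""
    return [(1 - share) * s + share * l for s, l in zip(source, substrate)]


west = admix(steppe, substrate_west, substrate_share)
east = admix(steppe, substrate_east, substrate_share)

# Differences between the two admixed populations, locus by locus.
diffs = [w - e for w, e in zip(west, east)]

# The same differences predicted from the substrates alone: the
# unstructured steppe component cancels out completely.
expected = [substrate_share * (a - b)
            for a, b in zip(substrate_west, substrate_east)]

for d, x in zip(diffs, expected):
    print(f"observed {d:+.3f}  predicted from substrate {x:+.3f}")
```

The dominant component contributes nothing to the east-west contrast in this sketch, which is the intuition behind structure surviving a large but uniform influx.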

Finally, after the initial punctuated change, there was an equilibration as isolation by distance dynamics resulted in divergence across the North European plain. We have enough historical records to know that aside from the Slavic migrations there seems to have been little change in the population structure of Europe since the Roman period (the Saxon migrations were not trivial, but they were neither preponderant nor continent-wide in impact).

What general inferences can we glean from this specific European case? As Graham Coop’s group has noted, one must account both for continuous gene flow via isolation by distance dynamics, and pulse admixture events between very distant populations. Consider the metaphor of a forest expanding over the landscape. There will be local structure, accrued over generations, hundreds and thousands of years. But perhaps periodically a fire will sweep through the landscape and clear huge swaths of territory. Into this virgin landscape may expand forests which derive from isolated reservoirs which escape the flames. Over time geographic structuring will be evident again, and depending on the number of refuges the jigsaw puzzle of genetic islands expanding into the gaps will fade somewhat as migration smooths the edges.
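The forest-and-fire metaphor can be sketched as a toy stepping-stone model. This is a deliberately crude, deterministic cartoon under my own assumptions: ten demes in a line, a single allele frequency per deme, nearest-neighbour migration at a made-up rate, and no drift or selection. It is not a model from Coop’s group or anyone else.

```python
def migrate(freqs, m):
    """One generation of nearest-neighbour migration along a line of demes.

    Each deme keeps a fraction (1 - m) of its own allele frequency and
    takes a fraction m from the average of its neighbours. Edge demes
    treat the missing neighbour as a copy of themselves.
    """
    n = len(freqs)
    new = []
    for i, p in enumerate(freqs):
        left = freqs[i - 1] if i > 0 else p
        right = freqs[i + 1] if i < n - 1 else p
        new.append((1 - m) * p + m * (left + right) / 2)
    return new


def max_step(freqs):
    """Largest allele frequency jump between adjacent demes."""
    return max(abs(a - b) for a, b in zip(freqs, freqs[1:]))


# A smooth gradient, standing in for long-term isolation by distance.
demes = [i / 9 for i in range(10)]

# The "fire": demes 3 to 7 are replaced by settlers from the refuge at deme 9.
after_pulse = demes[:3] + [demes[9]] * 5 + demes[8:]

# Let migration smooth the jigsaw edges for 50 generations (m is assumed).
smoothed = after_pulse
for _ in range(50):
    smoothed = migrate(smoothed, m=0.1)

print("max step before pulse:", round(max_step(demes), 3))
print("max step after pulse: ", round(max_step(after_pulse), 3))
print("max step 50 gens on:  ", round(max_step(smoothed), 3))
```

The pulse creates a sharp discontinuity at the edge of the replaced region, and subsequent migration wears it down, which is the pattern of punctuation followed by equilibration described above.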

The reference to fire here is conscious, insofar as fire can immolate structure which has taken generations to develop. Before the steppe people arrived in Northern Europe the first farmers had established a long-standing cultural commonwealth of sorts. Their legacy had persisted for thousands of years. Then, in a period of centuries, it all changed. Why? Culture.

Outright genocide with weapons is a dangerous business. Societies which engage in endemic long-term warfare as a primary male vocation, such as highland New Guinea, have high mortality rates. But in the context of the Malthusian world, where villages persist on the knife’s edge of subsistence, marginalization and disturbance of long-held patterns is all that might be needed for cultures to descend into famine and starvation. In 1493 Charles C. Mann notes that the mass death triggered by the arrival of Europeans and Africans to the New World had as much to do with the destabilization of society by illness as with the illness itself. In a world where all hands were on deck to bring in the harvest, the loss of critical labor during those periods could result in starvation, and high death rates led to the rapid collapse of the institutions which served as scaffolds for the maintenance of everyday life.

The scenario then might be one where populations on the Eurasian steppe develop some of the basic elements which would lead to agro-pastoralism, and undergo population expansion. With numbers, and well fed on the agro-pastoralist diet, these tribes might have poured into the lands of the farmers as rapid mobile groups in their wagons. The pattern in antiquity down to the early modern period, from the Goths to the Mongols, was to extract rents and treat the farmers as cattle. There was no incentive for one to starve cattle, and so the demographic impact of conquests was relatively modest.

But what about a world with less institutional complexity? In a world where the basic levers of rent to extract from the conquered did not exist, the natural path would be to replace them. The story goes that Genghis Khan had hoped to turn North China into a vast pastureland by driving out the peasantry (and almost certainly killing most of them through starvation), but his sage Khitai adviser explained the wealth that could be gained by taxing humans rather than raising stock on land. But the Khitai themselves were a semi-civilized people with centuries of experience milking the Han peasantry, and were heirs to a tradition of pastoralist predation that went back to the Hsiung-Nu. And yet no doubt there was a time when the idea of collecting rents from a conquered people was an innovation in and of itself. The genocidal antics of the Israelites in the Hebrew Bible strike us as dark and atavistic, but they reflect a cultural mindset which is nearly contemporaneous* with the arrival of Indo-Europeans to Europe.

This plausible sketch puts into better perspective Steven Pinker’s thesis in Better Angels of Our Nature as well as Peter Turchin’s War and Peace and War. The emergence of state institutions and pacific ideologies in the past ~3,000 years may be a sort of response to the high-stakes inter-group competitions which would level societies and turn over populations on a regular basis in the human past.

And yet not all was sweetness and light. In terms of their total genome the differences between the Srubna and their predecessors were not very great. Conversely, the differences in the total genome between Slavic people and South Asians are legion. But interlaced more recently across the landscape of a more stable structuring of genetic variation, a great regrowth of the forest through isolation by distance equilibration if you will, has been the explosion of powerful patrilineages which trace out an intriguing skein across the landscape. The total genome signal of these men may quickly decay over the generations, as their female-line descendants lose the golden allure of their status, but their male-line descendants continue to accrue mating prowess by dint of their association with great kinship units which succeeded in a winner-take-all game with other such groups of men. On top of the story of migrations of whole peoples, and the extinction and absorption of others, is the story of bands of men operating as units, related either in truth or fictively, which extract rents across a thickly populated landscape of human cattle. Another way to state this is that the thuggish state which imposed a monopoly of violence on a chaotic world where small-scale conflict was becoming too expensive allowed for the emergence of patriarchy as we understand it in its customary form. Like so many hirelings, the men charged with protecting the people made the whole world their possession and left dreams of their people behind.

While the cultural and genetic affinities of folk wanderings were tightly coupled, I am not sure that the Y chromosomal lineages are so neat. The Hazara people of Afghanistan exhibit an Asiatic appearance in comparison to other Afghans, and their Y chromosomes suggest a close connection to the Khalkha Mongols, but they are Shia Muslims who speak Persian. It does seem that the R1 lineages ascendant in Europe and South Asia owe their success to the Indo-Europeans, but both R1b and R1a transcend a connection to Indo-European ethno-linguistic groups. In some cases, as in that of R1a in the Levant, one might see in that a submerged Indo-European element, from the Mitanni down to the later Persian and Kurdish peoples. But in other cases, such as R1b among the Basque and R1a among Dravidian-speaking tribal people in South India, what we are seeing is the long arm of the patriarchy reaching beyond bounds of cultural and genetic affinity. The great Cherokee chief John Ross was famously 7/8th Scottish in ancestry. But he was a voice for the Cherokee people nevertheless. In most places where the Mongol hordes washed over they assimilated to the cultural folkways of the people whom they conquered. Like modern corporations the patriarchies were only loosely associated with other units of human organization, even if they used them as their vehicles of choice.

And so the story ties back to the beginning. Many of us are the sons of Indra, Zeus, and Thor. The descendants of Herakles, and of Abraham who haggled with God himself. Of Ishmael, whose hand will be against everyone and everyone’s hand against him. Of Niall of the Nine Hostages, and Temujin. The interests of men like this know no nation; nations are but ends to their will. The tension we see in our modern world, between egocentric plutocratic elites jostling nation-states like playthings, might be simply the repetition of an old pattern. In the Bible Saul was rebuked for not destroying all the capital of the Amalekites, perhaps reflecting the tension of interests facing the leaders of a people, who must act for the collective good but have their own selfish needs and dreams of self-aggrandizement for their own very particular posterity.

Addendum: Ancient DNA will expand in its ability to discern various patterns in the past. But the general disturbances will fall in line with what I have outlined above, I believe. Rather, the move will be from phylogenomics, to population genomics. Phylogenomics leverages genomic methods to attempt to infer phylogenetic patterns. Population genomics explores the classical parameters which shape the change in allele frequencies in lineages, and ultimately, deep evolutionary questions. We now know from ancient DNA that in all likelihood the phenotype which we associate with modern Europeans is a novel configuration. To some extent this is to be expected, as the basic elements which combine to form the European genome, fusing together lineages which diverged at least ~50,000 years before the present (BEu vs. everyone else outside of Africa) and ~35,000 years before the present (ANE vs. WHG), only came together around ~4,000 years ago. But there is more, as natural selection seems to have changed allele frequencies after these elements came together. That is, selection may have been operating across the European landscape when Hannibal was skirting the Alps!

And again, this is likely a general story. Physical anthropologists have long wondered why classical East Asian skeletal morphology seems to be scarce in the prehistoric past. But what if the classical East Asian appearance is relatively new? The Ainu, who have long been considered a “Lost White Race”, turn out to be a basal Northeast Asian group. It may be that they retain more of the “ancestral” features of East Eurasians.

The first age of selection studies in the 2000s was fraught with confusion and false positives. To a great extent we still don’t know what to make of many of the signals, which are deposited in the middle of obscure open reading frames. But the real golden age of selection will probably begin when we have more temporal transects with whole genome sequencing of ancient DNA, and with the phylogenomic context relatively robust as an interpretative framework.

* I am aware that the Hebrew Bible coalesced thousands of years after the arrival of Indo-Europeans to Europe, but it no doubt distills very ancient folkways. This seems obvious for example in the recollection of the Sumerian flood story.