

One of the first posts on this blog relating to archaeogenetics was an essay of mine reflecting on the fact that a particular Y chromosomal haplogroup, N1c (N3a now), had a peculiar distribution which ranged from Siberia to Finland. The argument, at the time, was whether it was a lineage which moved east to west (as suggested by the decline of microsatellite diversity in that direction), or whether it moved west to east (as was suggested by the frequency, which was highest in parts of Uralic Europe).

Today we know the general outline of the answer. The N1c lineage seems to have moved westward along the forest-tundra fringe, along with Uralic peoples in general. Genome-wide evidence shows minor but significant affinities with Siberian people among many European Uralic groups, including the Finns, and to a lesser extent Estonians. Though the genome-wide fraction is small in Finns, 5% or less, because this minor component is so genetically different from the generic Northern European ancestry of this group, it shifts Finns off the normal dimensions of variation for Europeans (in addition to the fact that many Finns have been subject to bottlenecks). The fraction is higher in the Sami, and lower in the Estonians.

Additionally, ancient DNA suggests that this ‘eastern’ Uralic-mediated ancestry arrived in the early Iron Age. The hypothesis that the Finnic languages were primal to Baltic Europe is on shaky ground which has cracked open. Rather, the circumstantial evidence is that Finnic languages replaced Indo-European dialects.

A new paper from Estonia adds some more detail to the general outline, as well as highlighting some aspects of adaptation. The Arrival of Siberian Ancestry Connecting the Eastern Baltic to Uralic Speakers further East:

In this study, we compare the genetic ancestry of individuals from two as yet genetically unstudied cultural traditions in Estonia in the context of available modern and ancient datasets: 15 from the Late Bronze Age stone-cist graves (1200–400 BC) (EstBA) and 6 from the Pre-Roman Iron Age tarand cemeteries (800/500 BC–50 AD) (EstIA). We also included 5 Pre-Roman to Roman Iron Age Ingrian (500 BC–450 AD) (IngIA) and 7 Middle Age Estonian (1200–1600 AD) (EstMA) individuals to build a dataset for studying the demographic history of the northern parts of the Eastern Baltic from the earliest layer of Mesolithic to modern times. Our findings are consistent with EstBA receiving gene flow from regions with strong Western hunter-gatherer (WHG) affinities and EstIA from populations related to modern Siberians. The latter inference is in accordance with Y chromosome (chrY) distributions in present day populations of the Eastern Baltic, as well as patterns of autosomal variation in the majority of the westernmost Uralic speakers [1, 2, 3, 4, 5]. This ancestry reached the coasts of the Baltic Sea no later than the mid-first millennium BC; i.e., in the same time window as the diversification of west Uralic (Finnic) languages [6]. Furthermore, phenotypic traits often associated with modern Northern Europeans, like light eyes, hair, and skin, as well as lactose tolerance, can be traced back to the Bronze Age in the Eastern Baltic.



An admixture analysis shows how ancestry changed over time (from before 2000 BC down to the Middle Ages). The blue fraction associated with Mesolithic Western European hunter-gatherers (WHG) is the majority throughout (though remember that this is an artifactual construct, not necessarily some real ancestral source population). Estonia_CCC represents the Comb Ceramic Culture individuals, who were dominant in the Baltic after the Mesolithic hunter-gatherers (the latter seem to have affinities to populations in Western Europe). The CCC individuals seem very similar to Eastern European hunter-gatherers (EHG), whose ancestry had affinities with Paleo-Siberians (Ancient North Eurasians, or ANE), as well as Mesolithic Western Europeans.

There was once a model where CCC mediated the arrival of Finnic languages, but there are two issues now with this framework. First, CCC did not bring N1c. Second, they did not have any Neo-Siberian East Asian ancestry. Rather, their Paleo-Siberian ancestry is deeply nested within the broader West Eurasian set of populations (though some seem to detect reciprocal gene flow between ANE and some proto-East Asian groups). Genes don’t tell us what languages people spoke, but though N1c is not exclusive to Finnic peoples, it’s one of the diagnostic components, along with East Asian ancestry.

After the CCC, the East Baltic region was dominated by Corded Ware Culture (CWC). This culture is affinal in some way to the older Yamna culture. Whereas males in the Yamna culture tended to carry R1b, CWC males were R1a. The Y chromosomes in the paper show massive turnover, so that in the period between 2000 BC and 500 BC R1a was the overwhelming male lineage around the East Baltic. N1c only arrives in large numbers in the Iron Age. One thing that is interesting is to note that N1c is at very high frequencies among Indo-European speaking groups like Lithuanians (the Rurikid lineage is N1c, though Rurik was reputedly a Swede).

It seems clear to me that the Finnic-speaking populations were interacting extensively with Indo-European peoples as they expanded west. The Finnic-speaking regions of the Baltic were once inhabited by CWC, including southern Finland (the Sami probably absorbed relic hunter-gatherer populations in the north, seeing as they have some unique mtDNA lineages). This is still evident in the frequencies of R1a, which are higher in Finland than in most of Western Europe. In Finland and Estonia, language-shift occurred. The same cannot be said for the Latvians and Lithuanians, two Baltic-speaking groups who likely integrated Finnic men to such an extent that N1c is now as common as R1a.

In terms of ancestry, the patterns are starting to become so clear that I think we are near the end of the line. I’ve been pondering the genetic ancestry of Finnic peoples for 17 years on this weblog, but I think at this point there are only details to fill in (thanks to the good ancient DNA climate of Norden, and to Tartu’s genetics researchers).

So here is the sketch. The first people to settle the post-Pleistocene Baltic were hunter-gatherers with affinities to the west (WHG). These in their turn were marginalized by hunter-gatherers with affinities to the east (EHG). We don’t know what languages they spoke, but I wouldn’t be surprised if the CCC spoke some sort of Uralic language. At some point, computational methods of linguistic substrate analysis will get much better (there are some linguists who claim that toponyms in parts of Northern Europe give clues to pre-Neolithic languages). Then they, in turn, were replaced by the CWC people. This is important because the CWC introduced the full agricultural package into the eastern Baltic region. Unlike the rest of Europe, including Scandinavia, the European northeast was never colonized by Early European Farmers (EEF), who descended from Anatolians in the largest part. This was due to biogeography (I wrote a blog post on this issue years before we had ancient DNA because the climatic patterns were so clear, along with the archeology). The agricultural toolkit of the first farmers was not as well adapted to the climate of this region as elsewhere.

Even in “ancient-DNA-blind” admixture analyses from ~2010 that genome bloggers ran, something was different between Finns and Swedes even aside from the eastern ancestry. Swedes clearly had a “southern” component in much higher fraction than Finns or even Lithuanians. With hindsight, it is now clear that this southern component was the legacy of the EEF, who never settled en masse in the northeast. But, this component, which is light-green in the plot above, begins to show up among post-CWC populations in Estonia during the Bronze Age, and definitely by the Iron Age. This, in concert with an increase in the blue “hunter-gatherer” ancestry, indicates gene flow from the west.

What this illustrates is that during this period of cultural stability there was a large interaction zone in the Baltic, mediated by trade, which resulted in marriages across populations. As implied in this and other papers, for the CWC peoples this seems to involve the assimilation of women from other groups, as it was highly patrilineal. The women from the west brought both more WHG ancestry, and, EEF. The authors detect that the pattern continued into modern Estonians, who have slightly more EEF than Bronze and Iron Age samples. As agriculture and trade flourished around the Baltic, reciprocal gene flow homogenized what were earlier rather distinct population groups. This was a continuous and stable parameter over the past few thousand years, even without any considerations of perturbations such as the German settlement of the region during the medieval period.

But this is getting ahead of ourselves. In the centuries after 500 BC, it seems that post-CWC cultures gave way to Finnic ones. The language is the clearest sign, but the genetic signatures are also clear. But even the Sami are mostly of the same Northern European stock as the other peoples of Norden. The Uralic component was culturally critical, and, looking at the Y chromosomes, it looks to have been mediated by mobile paternal lineages. I am not one who knows enough about the paleoclimate or archaeology to hazard a guess as to what was happening here, but one speculation I have is that the phenomenon may have been rather similar to that which occurred in Greenland, where the Norse were simply unable to compete with the Thule people, whose lifestyle was better suited to the median climate in Greenland. The interactions cannot have been all hostile, as one can judge by the fact that both the Lithuanians and Latvians were extensively penetrated by N1c bearing men who must have arrived relatively late. There are nearly as many N1c lineages in Lithuania as R1a.

This has analogs elsewhere. As I have pointed out, R1a is found in almost all regions of South Asia, across all population classes. The main exceptions outside of the trans-Himalayan fringe are the Austro-Asiatic Munda, who carry high frequencies of Southeast Asian Y chromosomal lineages and notably lack steppe ancestry and R1a. Just as men of Finnic provenance assimilated to the Indo-European language among the Latvian and Lithuanian tribes, so men of Indo-Aryan origin were clearly integrated into non-Indo-European (Dravidian-speaking) societies across South Asia.

Archaeologists, historians, and folklorists have some work to do. But like Thanos, Mait Metspalu, who has spearheaded so much of this genetic work in Europe’s northeast, can now go farming! His job is done.

But wait! There’s another act here.

The bigger, more surprising, though not entirely so, implication of this paper is that the Nordic phenotype was not brought to the north by a new people, but that it developed in situ through the mixing of peoples. The evidence from this, and other, papers is that Northern Europeans in the Bronze Age were considerably darker in complexion and mien than they are today. Selection between the Bronze Age and the present has swept up the frequencies of derived alleles which are strongly correlated with lighter skin, along with selection in other traits considered typical of Northern Europeans, such as the ability to digest milk sugar.

I took one of the supplemental tables and turned it into a chart:

Compare this chart to the one at the top. Between the Bronze Age and the Estonian Middle Ages, and therefore the modern period, the genome-wide changes have been subtle. But for lactase persistence and many of the pigmentation loci, there has been a substantial change without substantial gene flow (and, the East Asian Finnic ancestry likely introduced “dark” alleles, as one can see ancestral copies of SLC24A5 in Finns).

Lactase persistence is interesting because cattle culture in Europe precedes this allele by thousands of years. Pre-Indo-European farmers seem to have utilized cheese (which has lower sugar content).

As for the pigmentation alleles, the standard caveats of predicting past populations on training-sets of the present apply here. But, the genetic character of the East Baltic region seems to have been overwhelmingly in place by the Late Bronze Age. On a genome-wide basis they would be subtly different from modern people in the region, but not substantially so. But, on these salient loci, they do seem quite different. Looking at the detailed SNPs, ancestral copies of both SLC24A5 and SLC45A2 variants, which are extremely rare in the area today, persist into the Iron Age.

This is not limited to just Estonia. We have two huge datasets from Western Europe, one covering the transition of Britain from Neolithic to Bronze Age, and another covering thousands of years in Iberia.

First, let’s look at some SNPs from Britain.

CA-BA represents “Copper Age/Bronze Age.” This is the period when British genetic variation basically resembles that of the present day, especially in the north and west of the island (where Germanic period migrations had a minimal impact). A particular haplotype of HERC2 is strongly predictive of having blue eyes. That SNP is an excellent tag. This is the locus where the blue eye haplotype was at very high frequency during the Mesolithic. The frequency of blue eyes predicted increases with the arrival of the Beaker people…but it continues to rise into the present day! The Central European Beaker samples resemble the British ones on their pigmentation loci, just as they do genome-wide.

Now let’s look at the Iberian data.

The selection here is not as strong, but still evident on lactase and SLC45A2. Again, the key is to focus on the Bronze Age, when most of the ancestry of modern Iberians came into focus, with the fusion of EEF, Central and Northern Europeans, and residual WHG ancestry.

We now have some serious temporal transects of phenotypic change inferred from SNPs in very local regions across Europe, in Iberia, in Britain, and now in Estonia. These are very disparate regions, at three points in Europe. But they all seem to suggest the same thing: European populations became depigmented in situ after their overall genome-wide ancestry was established.
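To get a feel for how strong selection would need to be to depigment a population in situ, here is a back-of-the-envelope sketch. It is not from the paper. It takes the SLC45A2 (rs16891982) frequencies quoted in the update at the end of this post (roughly 65% in Bell Beaker samples, ~98% in modern northern Europeans) and assumes a simple haploid-style selection model in which the odds of the derived allele are multiplied by (1 + s) each generation. The 4,000-year span and 28-year generation time are my assumptions, and the model ignores drift, migration, and the fact that the ancient samples need not be direct ancestors of modern ones.

```python
import math

def selection_coefficient(p_start, p_end, generations):
    """Per-generation s under a simple model where odds(p) grow by (1 + s) each generation."""
    odds_start = p_start / (1.0 - p_start)
    odds_end = p_end / (1.0 - p_end)
    # odds_end = odds_start * (1 + s) ** generations, solved for s
    return math.exp(math.log(odds_end / odds_start) / generations) - 1.0

# Frequencies of the derived allele quoted in the post's update.
P_BRONZE_AGE = 0.65   # Bell Beaker samples
P_MODERN = 0.98       # modern northern Europeans

# Assumptions, not figures from the post or the paper.
YEARS_ELAPSED = 4000
YEARS_PER_GENERATION = 28

generations = YEARS_ELAPSED / YEARS_PER_GENERATION
s = selection_coefficient(P_BRONZE_AGE, P_MODERN, generations)
print(f"implied selection coefficient: {s:.3f} per generation")
```

Under these assumptions the implied coefficient comes out at a couple of percent per generation. Changing the time span or generation length shifts the number, but not the order of magnitude.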

So I guess the question is why? I honestly have no idea. I doubt it’s Peter Frost’s old model because the timing is all off (the European steppe-tundra that plays a big role in his hypothesis also was gone by 4,000 years ago). My idea from nearly a decade ago that it had something to do with the arrival of agriculture and disease isn’t crazy, but I’m a lot less certain about the role of Vitamin D now than I was then.

And the issue around lactase persistence indicates that it’s not limited to the pigmentation genes. There’s lots of selection going on. Both the steppe populations and the Neolithic farming societies practiced cattle culture. In fact, Britain in the centuries before the arrival of the Beaker people had depopulated as it shifted away from intensive farming to agro-pastoralism. But, the steppe peoples do seem to have introduced the allele common in much of Eurasia associated with lactase persistence. It is the same allele found in Northern India. But, the frequency was initially quite low in the ancestral populations.

Overall, as one chapter (that of relationships between populations) closes, another opens. There are a lot of innings left in the adaptation game.

Update: A reader of this weblog has pointed out that there is strong evidence that Northern European pigmentation profiles were all over the steppe and forest-steppe by the Bronze Age. Some of the data support that. But when I look closely at steppe societies such as the Srubna, it is not clear at all that that was the case. For example, a derived SNP at SLC45A2, rs16891982, is close to fixed in modern northern Europeans (~98%). The frequency is lower in Southern Europeans, closer to ~90%. In the Srubna and related groups, it is closer to 75%. In the Bell Beaker samples from Britain and Central Europe, it is closer to 65%. The frequency is lower in European farmers, but I don’t see the math working proportion-wise for Corded Ware Culture type ancestry with a dilution of EEF leading to a drop from ~100% to 65%.

Additionally, the Estonian CWC samples in the Reich data are ancestral, not derived.

Basically, there was a lot of heterogeneity. Even amongst groups that were similar on genome-wide terms.

(also note that the derived allele was already present at ~25% among Neolithic farmers)
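The proportion argument above can be made explicit with a two-source mixing calculation. This is only a sketch of the arithmetic, using the frequencies quoted in this update: if a Corded Ware type source were fixed for the derived allele (~100%) and the diluting farmers carried it at ~25%, what share of ancestry would the source need to contribute to yield the ~65% seen in Bell Beaker samples? The function name is mine, and the linear mixing model assumes no selection or drift after admixture.

```python
def required_source_fraction(p_mix, p_source, p_diluter):
    """Solve p_mix = a * p_source + (1 - a) * p_diluter for the source share a."""
    if p_source == p_diluter:
        raise ValueError("source and diluter frequencies must differ")
    return (p_mix - p_diluter) / (p_source - p_diluter)

# Derived-allele frequencies at rs16891982 as quoted in the post.
P_BEAKER = 0.65      # Bell Beaker samples from Britain and Central Europe
P_FARMER = 0.25      # Neolithic farmers
P_CWC_FIXED = 1.00   # the "~100%" scenario being tested

cwc_share = required_source_fraction(P_BEAKER, P_CWC_FIXED, P_FARMER)
print(f"CWC-like share needed: {cwc_share:.0%}")
print(f"extra farmer-like share needed: {1 - cwc_share:.0%}")
```

With these inputs the fixed-source scenario requires a bit over half of Beaker ancestry from the Corded Ware type source, with nearly half coming from additional farmer-like dilution. That is the kind of proportion I find hard to square with the genome-wide picture.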

