Ole Magga, Norwegian politician

On this blog I regularly get questions about the Sami (Lapp*). That’s because I often talk about Finnish genetics, because I have readers such as Clark who are of part-Sami origin, and because the provenance and character of the Sami speak to broader questions about the emergence of the modern European gene pool. More precisely, questions about the Sami are relevant to the broader nature of the Finnic presence in Europe, and to the relationship of the Finnic peoples to other Baltic and northern populations. Are these people “indigenous” to Europe, or relative newcomers (a prehistoric counterpart to the Magyars or Turks)? These questions are prompted by the peculiarity of their languages (as well as the physical appearance of some of the Sami). Along with Basque, the Finnic languages are the only living non-Indo-European European languages whose origins are prehistoric (Magyar and Turkish were arrivals within the last 1,000 years).**

Because the Finnic languages have affinities to other Uralic languages, which are found in Central Siberia, it has often been conjectured that the Finns, Sami, and Estonians are relative newcomers to Norden from that region. This has some equivocal support from Y chromosomal lineages. On the other hand, there are those who argue that the Finnic peoples were present in the north of Europe before the arrival of Indo-European speakers (often these are Finnish nationalists). This has some support from maternal lineages. Naturally, some have been tempted to synthesize these two genetic lines of evidence, and the linguistic affinities, to argue that Finns are a hybrid population of Asiatic men and Paleolithic European women! But we need to go further than uniparental markers, which trace only the direct male and female ancestral lines. We need to look across the broader swath of the genome. It just so happens that a new paper on autosomal Sami affinities to other populations was published in The European Journal of Human Genetics, “A genome-wide analysis of population structure in the Finnish Saami with implications for genetic association studies”:

The understanding of patterns of genetic variation within and among human populations is a prerequisite for successful genetic association mapping studies of complex diseases and traits. Some populations are more favorable for association mapping studies than others. The Saami from northern Scandinavia and the Kola Peninsula represent a population isolate that, among European populations, has been less extensively sampled, despite some early interest for association mapping studies. In this paper, we report the results of a first genome-wide SNP-based study of genetic population structure in the Finnish Saami. Using data from the HapMap and the human genome diversity project (HGDP-CEPH) and recently developed statistical methods, we studied individual genetic ancestry. We quantified genetic differentiation between the Saami population and the HGDP-CEPH populations by calculating pair-wise FST statistics and by characterizing identity-by-state sharing for pair-wise population comparisons. This study affirms an east Asian contribution to the predominantly European-derived Saami gene pool. Using model-based individual ancestry analysis, the median estimated percentage of the genome with east Asian ancestry was 6% (first and third quartiles: 5 and 8%, respectively). We found that genetic similarity between population pairs roughly correlated with geographic distance. Among the European HGDP-CEPH populations, FST was smallest for the comparison with the Russians (FST=0.0098), and estimates for the other population comparisons ranged from 0.0129 to 0.0263. Our analysis also revealed fine-scale substructure within the Finnish Saami and warns against the confounding effects of both hidden population structure and undocumented relatedness in genetic association studies of isolated populations.
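To get a feel for what a pair-wise FST like 0.0098 means, here is a minimal sketch of one simple way to compute FST between two populations from per-SNP allele frequencies. It uses Wright's definition, FST = (HT − HS) / HT, summed across SNPs as a ratio of averages, and it ignores sample-size corrections. It is not necessarily the estimator the authors used; the function name `fst_two_pops` and the example frequencies are my own assumptions for illustration.

```python
def fst_two_pops(freqs_a, freqs_b):
    """Wright's FST between two equally weighted populations.

    freqs_a, freqs_b: per-SNP frequencies of the same allele in each
    population (hypothetical inputs; no sample-size correction applied).
    """
    if len(freqs_a) != len(freqs_b) or not freqs_a:
        raise ValueError("need two non-empty frequency lists of equal length")

    numerator = 0.0    # sum over SNPs of (H_T - H_S)
    denominator = 0.0  # sum over SNPs of H_T
    for p_a, p_b in zip(freqs_a, freqs_b):
        p_bar = (p_a + p_b) / 2
        # Expected heterozygosity if the two populations were pooled.
        h_total = 2 * p_bar * (1 - p_bar)
        # Mean expected heterozygosity within each population.
        h_within = (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2
        numerator += h_total - h_within
        denominator += h_total

    # If no SNP varies at all, there is no differentiation to measure.
    return numerator / denominator if denominator > 0 else 0.0


if __name__ == "__main__":
    # Made-up frequencies: the two populations differ by 0.1 at one SNP.
    print(fst_two_pops([0.5], [0.6]))
```

With these made-up numbers, a frequency difference of 0.5 versus 0.6 at a SNP works out to an FST of about 0.01, which is the same order as the Sami-Russian value in the abstract. Identical frequencies give 0, and fixed differences give 1.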

They had 352 Sami samples, and looked at ~38,000 SNPs. For the questions they’re focusing on, 38K SNPs seems fine. That’s enough to smoke out inter-population variation. In their paper they compared the Sami to the HGDP populations using standard techniques. Assuming 7 ancestral populations in the data set, this is what ADMIXTURE popped out:

[Figure: ADMIXTURE plot from the paper (image file ejhg2010179f2)]

There is a definite “eastern” affinity among the Sami. Interestingly, it is broken down into a major and minor component. The major one is what is found among the Han, while the minor one resembles Native Americans. The natural interpretation of this is that one is seeing the shadow of the circumpolar northern Eurasian populations which spanned eastern Europe to Siberia. In comparison with other European populations the Sami affinity with Russians is clear, though interestingly the Sami lack the “blue” component which peaks in northwest South Asian populations, which the Russians have and which the Sardinians and French Basque lack.

To the left you see a PCA which breaks out the top two components of genetic variation for the data set. The two axes seem to be roughly west-east and north-south. Whatever ancient affinities the Sami may have with Southern Europeans via mtDNA haplogroup U5, they are not evident in the total genome content. The position of the Sami between Russians and Orcadians (from north of Scotland) is probably attributable to the fact that the Sami share much genetically with other Scandinavians, who are closer to British populations than the Russians are.
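For those curious how a plot like this comes about, here is a rough sketch of the idea behind a genotype PCA: code each individual as allele counts (0, 1, or 2) per SNP, center each SNP, and find the direction of greatest variation. This is only an illustration under my own assumptions. The helper `first_pc_scores` is hypothetical, the six toy genotypes are invented, it finds only the first component by power iteration, and it skips the per-SNP variance scaling that dedicated tools apply. I don't know which software the authors ran.

```python
import math


def first_pc_scores(genotypes, iterations=100):
    """Return each individual's score on the first principal component.

    genotypes: list of individuals, each a list of allele counts
    (0, 1, or 2) per SNP. Hypothetical helper: SNPs are centered
    but not variance-scaled.
    """
    n, m = len(genotypes), len(genotypes[0])

    # Center each SNP (column) on its mean allele count.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    x = [[row[j] - means[j] for j in range(m)] for row in genotypes]

    # Power iteration for the top eigenvector of X^T X (the PC1 loadings),
    # starting from a deterministic, non-constant vector.
    v = [float(j + 1) for j in range(m)]
    for _ in range(iterations):
        scores = [sum(xi[j] * v[j] for j in range(m)) for xi in x]  # X v
        w = [sum(x[i][j] * scores[i] for i in range(n)) for j in range(m)]  # X^T (X v)
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:  # no variation along this direction
            break
        v = [c / norm for c in w]

    # Project every individual onto the PC1 loadings.
    return [sum(xi[j] * v[j] for j in range(m)) for xi in x]


if __name__ == "__main__":
    # Invented toy data: three individuals from each of two populations.
    toy = [
        [2, 2, 0, 0], [2, 1, 0, 0], [2, 2, 0, 1],
        [0, 0, 2, 2], [0, 1, 2, 2], [0, 0, 2, 1],
    ]
    for score in first_pc_scores(toy):
        print(round(score, 3))
```

In the toy data the first three individuals land on one side of zero and the last three on the other, which is the same logic by which real populations separate along an axis. The sign of a principal component is arbitrary, so only the relative positions matter.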

I’m not sure these analyses really shed any light on the questions I mentioned earlier. The authors themselves note that the “eastern” component of the ancestry in the Sami is probably very old, so the Sami may be an ancient stabilized hybrid population, mostly indigenous with a non-trivial exogenous element. That does not tell us whether Finnic languages are indigenous to Europe, or whether they are indigenous to Central Siberia (indigenous here is relative to the Indo-European languages). Additionally, there is the matter that for such fine-grained questions the HGDP samples are suboptimal as reference populations. Dienekes Pontikos points this out:

It is unfortunate that they included Native American HGDP populations, but did not include the most relevant published data on Siberians that I first used to study population structure across north Eurasia here and here and here. Hence, they discover a “Native American”-like component in Saami, which in all likelihood can be further resolved into Siberian-specific components utilizing the Rasmussen et al. dataset. The “closest approximation” to the East Eurasian component in Saami in the HGDP panel are the Yakuts, but finer-scale analysis (see my previous posts) reveals that the Yakuts are made up almost entirely of an Altaic-specific component tying them to Turkic, Mongol, and Tungusic populations, while the eastern component in European Finns, Vologda Russians and Chuvashs has relationships with Central Siberians such as Kets, Selkups, and Nganasans, all of which are missing in this paper.

Below is a re-edited ADMIXTURE plot from Dienekes:

Note: There are many ways to spell Sami. They used two a’s, but I find that confusing, so I just used one in my text.

Citation: Maki-Torkko, Elina, Aikio, Pekka, Sorri, Martti, Huentelman, Matthew J, & Camp, Guy Van (2010). A genome-wide analysis of population structure in the Finnish Saami with implications for genetic association studies. European Journal of Human Genetics. DOI: 10.1038/ejhg.2010.179

* Apparently “Lapp” is considered derogatory among Norwegians, though Finnish Sami refer to themselves as lappalainen. I will use Sami to avoid irritating Norwegian terminology police.

** I am implicitly excluding much of European Russia west of the Urals, but so be it.