Four major founder lineages within haplogroup K and N1b

Haplogroup K arose within haplogroup U8 ~36 ka, in Europe or the Near East, with the minor subclades K1b, K1c and K2 all most likely arising in Europe, between the last glacial period and the Neolithic (Fig. 1; Supplementary Note 1; Supplementary Data 1–3; Supplementary Figs S1–S3; Supplementary Tables S1–S3). K1a expanded from ~20 ka onwards, both in the Near East and Europe, with its major subclade, K1a1b1 (Fig. 2), mainly restricted to Europe (with a few instances in North Africa), arriving from the Near East by ~11.5 ka, the beginning of the Holocene (Supplementary Note 1).

Figure 1: Inferred ancestry of the main subclades within haplogroup U8. The timescale (ka) is based on ML estimations for mitogenomes. Inset: Bayesian skyline plot of 34 Ashkenazi haplogroup K lineages, showing growth in effective population size (Nef) over time.

Figure 2: Phylogenetic tree of haplogroup K1a1b1. Time scale (ka) based on ML estimations for mitogenome sequences.

Almost half of mtDNAs in west/central European Ashkenazi Jews belong to haplogroup K, declining to ~15% in east European Jews1,11, with almost all falling into three subclades: K1a1b1a, K1a9 and K2a2a12,25 (Figs 1, 2, 3, 4; Supplementary Fig. S4). These three founder clusters show a strong expansion signal beginning ~2.3 ka, with the overall effective population size for these lineages increasing 13-fold by 275 years ago (Fig. 1).

Figure 3: Phylogenetic tree of haplogroup K1a9 in the context of the putative clade K1a9′10′15′26′30. Time scale (ka) based on ML estimations for mitogenome sequences.

Figure 4: Phylogenetic tree of haplogroup K2a2. Time scale (ka) based on ML estimations for mitogenome sequences.

K1a1b1a (slightly re-defined, due to the improved resolution of the new tree) (Fig. 2) accounts for 63% of Ashkenazi K lineages (or ~20% of total Ashkenazi lineages) and dates to ~4.4 ka with maximum likelihood (ML); however, all of the samples within it, except for one, nest within a further subclade, K1a1b1a1, dating to ~2.3 ka (Supplementary Data 2). K1a1b1a1 is also present in non-Ashkenazi samples, mostly from central/east Europe. As they are nested by Ashkenazi lineages, these are likely due to gene flow from Ashkenazi communities into the wider population. The pattern of gene flow out into the neighbouring communities is seen in the other two major K founders, and also in haplogroups H and J; it is especially clear when the nesting and nested populations are more distinct, for example in the case of haplogroup HV1b, which has a deep ancestry in the Near East (Fig. 5; Supplementary Table S4).

Figure 5: Phylogenetic tree of haplogroup HV1b. Time scale (ka) based on ML estimations for mitogenome sequences.

The K1a1b1 lineages within which the K1a1b1a sequences nest (including 19 lineages of known ancestry) are solely European, pointing to an ancient European ancestry. The closest nesting lineages are from Italy, Germany and the British Isles, with other subclades of K1a1b1 including lineages from west and Mediterranean Europe and one Hutterite (Hutterites trace their ancestry to sixteenth-century Tyrol)26. Typing/HVS-I results have also indicated several from Northwest Africa, matching European HVS-I types2, likely the result of gene flow from Mediterranean Europe. K1a1b1a is also present at low frequencies in Spanish-exile Sephardic Jews, but absent from non-European Jews, including a database of 289 North African Jews2,25. Notably, it is not seen in Libyan Jews25, who are known to have a distinct Near Eastern ancestry, with no known influx from Spanish-exile immigrants (although Djerban Jews, with a similar history, have not been tested to date for mtDNA, they closely resemble Libyan Jews in autosomal analyses27). Thus the Ashkenazi subclade of K1a1b1 most likely had a west European source.
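The nesting argument used here can be illustrated with a toy example. The sketch below is not the analysis pipeline behind the trees; the tree shape, node names and geographic labels are all hypothetical, chosen only to show how the lineages surrounding a founder subclade are separated from the lineages inside it.

```python
# Toy illustration only: tree shape, node names and labels are hypothetical.
def tip_origins(node, skip=None):
    """Collect origin labels of all tips under node, ignoring the subtree named skip."""
    if node["name"] == skip:
        return set()
    children = node.get("children", [])
    if not children:
        return {node["origin"]}
    origins = set()
    for child in children:
        origins |= tip_origins(child, skip)
    return origins

toy_tree = {
    "name": "nesting_clade",
    "children": [
        {"name": "tip_1", "origin": "west Europe"},
        {"name": "tip_2", "origin": "west Europe"},
        {
            "name": "founder_subclade",
            "children": [
                {"name": "tip_3", "origin": "Ashkenazi"},
                {"name": "tip_4", "origin": "Ashkenazi"},
                {"name": "tip_5", "origin": "east Europe"},
            ],
        },
    ],
}

# Lineages surrounding the founder subclade (the "nesting" lineages)
nesting = tip_origins(toy_tree, skip="founder_subclade")
# Lineages inside the founder subclade (the "nested" lineages)
nested = tip_origins(toy_tree["children"][2])
print("nesting:", nesting)
print("nested:", nested)
```

In this toy case the nesting lineages carry a single label, which is the pattern the text reads as pointing to the source region, while the mixed labels inside the subclade correspond to the pattern read as later gene flow outwards.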

K1a9 (Fig. 3; Supplementary Fig. S4), accounting for another 20% of Ashkenazi K lineages (or 6% of total Ashkenazi lineages) and also dating to ~2.3 ka with ML (Supplementary Data 2), again includes both Ashkenazi and non-Ashkenazi lineages solely from east Europeans (again suggesting gene flow out into the wider communities). Like K1a1b1a, it is also found, at much lower frequencies, in Sephardim. Here the ancestral branching relationships are less clear (Supplementary Note 1 and Supplementary Fig. S4), but K1a9 is most plausibly nested within the putative clade K1a9′10′15′26′30, dating to ~9.8 ka, which otherwise includes solely west European (and one Tunisian) lineages, again pointing to a west European source.

K2a2 (Fig. 4) accounts for another 16% of Ashkenazi K lineages (or ~5% of total Ashkenazi lineages) and dates to ~8.4 ka (Supplementary Data 2). Ashkenazi lineages are once more found in a shallow subclade, K2a2a1, dating to ~1.5 ka, that otherwise again includes only east Europeans, suggesting gene flow from the Ashkenazim. Conversely, the nesting clades, K2a2 and K2a2a, although poorly sampled, include only French and German lineages. K2a2a is not found in non-European Jews25.
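The percentages quoted for the three K founders can be cross-checked with simple arithmetic: dividing each clade's share of all Ashkenazi lineages by its share of Ashkenazi K lineages gives the implied overall frequency of haplogroup K. The sketch below uses only the rounded figures from the text, so the results are approximate; the variable names are illustrative.

```python
# Rounded figures quoted in the text:
# (share of Ashkenazi K lineages, share of all Ashkenazi lineages)
founders = {
    "K1a1b1a": (0.63, 0.20),
    "K1a9": (0.20, 0.06),
    "K2a2": (0.16, 0.05),
}

# Implied overall frequency of haplogroup K, computed separately from each founder
implied_k = {name: total / within_k for name, (within_k, total) in founders.items()}

# Combined coverage of the three founders
share_of_k = sum(within_k for within_k, _ in founders.values())
share_of_total = sum(total for _, total in founders.values())

for name, value in implied_k.items():
    print(f"{name}: implied haplogroup K frequency ~{value:.0%}")
print(f"Three founders: {share_of_k:.0%} of K lineages, ~{share_of_total:.0%} of all lineages")
```

Each ratio comes out near 30%, which sits between the west/central European (almost half) and east European (~15%) frequencies of haplogroup K given earlier, and the three founders together cover about 99% of Ashkenazi K lineages.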

Haplogroup K is rarer in the North Caucasus than in Europe or the Near East (<4% (ref. 23)) and the three Ashkenazi founder clades have not been found there (Supplementary Note 2). We tested all eight K lineages out of 208 samples from the North Caucasus, and all belonged to the Near Eastern subclades K1a3, K1a4 and K1a12. Haplogroup K is more common in Chuvashia, but those sampled belong to K1a4, K1a5 and pre-K2a8.

The fourth major Ashkenazi founder mtDNA falls within haplogroup N1b (ref. 2). The distribution of N1b is much more focused on the Near East than that of haplogroup K (ref. 24), and the distinctive Ashkenazi N1b2 subclade has accordingly been assigned to a Levantine source2. N1b2 has until now been found exclusively in Ashkenazim, and although it dates to only ~2.3 ka, it diverged from other N1b lineages ~20 ka (ref. 24) (Supplementary Table S5). N1b2 can be recognized in the HVS-I database by the variant 16176A, but Behar et al.2 tested 14 Near Eastern samples (and some east Europeans) with this motif and identified it as a parallel mutation. Therefore, despite the long branch leading to N1b2, no Near Eastern samples are known to belong to it.

In our unpublished database of 6991 HVS-I sequences, however, we identified two Italian samples with the 16176A marker, which we completely sequenced. We confirmed that they belong to N1b2 but diverge before the Ashkenazi lineages ~5 ka, nesting the Ashkenazi cluster (Fig. 6; Supplementary Table S5). This striking result suggests that the Italian lineages may be relicts of a dispersal from the Near East into Europe before 5 ka, and that N1b2 was assimilated into the ancestral Ashkenazi population on the north Mediterranean ~2 ka. Although we found only two samples suggesting an Italian ancestry for N1b2, the control-region database available for inspection is very large (28,418 HVS-I sequences from Europe, the Near East and the Caucasus, of which 278, or ~1%, were N1b). Moreover, the conclusion is supported by our previous founder analysis of N1b HVS-I sequences, which dated the dispersal into Europe to the late Pleistocene/early Holocene24.

Figure 6: Phylogenetic tree of haplogroup N1b. Time scale (ka) based on ML estimations for mitogenome sequences.

Minor Ashkenazi mtDNA lineages

There is now a large number of mitogenomes from Europe, the Caucasus and the Near East (~3,500, with >70 Ashkenazim), and a substantial Ashkenazi mtDNA control-region database of 836 samples1,2,11 (Supplementary Table S6). We therefore endeavoured to cross-reference the two in order to pinpoint most of the control-region data within the mitogenome phylogeny.

Besides the four haplogroup K and N1b founders, the major haplogroup in Ashkenazi Jews is haplogroup H, at 23% of Ashkenazi lineages, which is also the major haplogroup in Europeans (40–50% in Europe, ~25% in the North Caucasus and ~19% in the Near East)28. There are 29 Ashkenazi H mitogenomes available (Supplementary Table S7), 26 (90%) of which nest comfortably within European subclades dating to the early Holocene (Supplementary Note 3, Figs 7 and 8; Supplementary Figs S5–S10; Supplementary Table S8). Most, in fact, nest more specifically within west/central European subclades, with closely matching sequences in east Europe, as with the pattern for the K founder clades. The Ashkenazi mitogenomes from haplogroup H include 39% belonging to H1 or H3, which are most frequent in west Europe and rare outside Europe. The nesting relationships in some cases point (albeit tentatively) to a central European source, but in many cases comparison with the HVS-I database indicates matches in west Europe. The phylogeographic conclusions based on the nesting relationships are strongly supported for haplogroup H by evidence from the study of prehistoric remains, showing in almost all cases that the lineages concerned were present in Europe since at least the early Bronze Age, ~3.5 ka (Supplementary Table S7)29. There is no suggestion of assimilation from the North Caucasus, where most H lineages differ from those of Europe23 (Supplementary Note 2).

Figure 7: Schematic phylogenetic tree of haplogroup H1. Only the Ashkenazi lineages are shown in full detail; the distribution of other lineages is indicated using small squares by the number present in the full tree for each subclade. Prehistoric European (all Neolithic, except for the H1aw lineage, which dates to the Iron Age) lineages are shown using red circles29.

Figure 8: Phylogenetic tree of Ashkenazi founders within haplogroup H6a1a. Time scale (ka) based on ML estimations for mitogenome sequences. A Late Neolithic Corded Ware lineage from central Europe29 is shown in red emerging directly from the root.

Haplogroup J comprises 7% of the Ashkenazi control-region database. Around 72% of these can be assigned to J1c, now thought to have arisen within Late Glacial Europe30, and 19% belong to J1b1a1, also restricted to Europe. Thus >90% of the Ashkenazi J lineages have a European origin, with ~7% (J1b and J2b) less clearly associated. Many have a probable west/central European source, despite (like H) being most frequent in eastern Ashkenazim. The four Ashkenazi J mitogenomes, in J1c5, J1c7a1a and J1c7d, once again show a striking pattern of Mediterranean, west and central European lineages enclosing Ashkenazi/east European ones (Fig. 9).

Figure 9: Schematic phylogenetic tree of haplogroup J1c. Only the Ashkenazi lineages are shown in full detail; the distribution of other lineages is indicated using small squares for each subclade with the number present in the full tree given in each case. For the full tree see Pala et al.30. Time scale (ka) based on ML estimations for mitogenome sequences.

Haplogroups U5, U4 and HV0 (6.3% between them overall) arose within Europe. Some of these lineages, which are again more frequent in eastern than in western Ashkenazim, may have been assimilated in central Europe. The haplogroup T lineages (5% overall) are more difficult to assign, but at least 60% (in T2a1b, T2b, T2e1 and T2e4) are likely of European origin and ~10% (T1b3 and T2a2) of Near Eastern origin30. The haplogroup I lineages have evidently been present in Europe at least since the Neolithic, as indicated by both phylogeographic and ancient DNA analyses31. Haplogroup W3 may have originated in the Near East but spread to Europe as early as the Late Glacial31. The M1a1b lineage is characteristic of the north Mediterranean and was most likely assimilated there32, but the U6a and L2a1l lineages are more difficult to pin down.

The main lineages with a potentially Near Eastern source include HV1, R0a1a and U7a5 (~8.3% in all). HV1b2 mitogenomes, in particular, date to ~2 ka and nest within a cluster of Near Eastern HV1b lineages dating to ~18 ka (Fig. 5; Supplementary Table S4). Others such as U1a and U1b have an ultimately Near Eastern origin but, like N1b, have been subsequently distributed around the north Mediterranean. In general, it is more difficult to assign lineages to a Near Eastern source with confidence, as the much larger control-region database indicates that (as with N1b2) many lineages with deep Near Eastern ancestry became widely dispersed along the north Mediterranean during the Holocene, and may alternatively have been assimilated there.

If we allow for the possibility that K1a9 and N1b2 might have a Near Eastern source, then we can estimate the overall fraction of European maternal ancestry at ~65%. Given the strength of the case for even these founders having a European source, however, our best estimate is to assign ~81% of Ashkenazi lineages to a European source, ~8% to the Near East and ~1% further to the east in Asia, with ~10% remaining ambiguous (Fig. 10; Supplementary Table S9). Thus at least two-thirds and most likely more than four-fifths of Ashkenazi maternal lineages have a European ancestry.
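As a rough consistency check on these figures, the sketch below sums the best-estimate categories and computes how many percentage points separate the best estimate from the more cautious one. It uses only the rounded percentages quoted above; reading the difference as the combined share of K1a9 and N1b2 is an inference from those rounded values, not a figure stated in the text.

```python
# Best-estimate apportionment quoted in the text (percent of Ashkenazi lineages)
best_estimate = {"European": 81, "Near Eastern": 8, "Asian": 1, "ambiguous": 10}

# European fraction if K1a9 and N1b2 are allowed a possible Near Eastern source
cautious_european = 65

total = sum(best_estimate.values())

# Percentage points that move out of the European category under the cautious
# scenario; by inference, the combined share of the K1a9 and N1b2 founders
reassigned = best_estimate["European"] - cautious_european

more_than_four_fifths = best_estimate["European"] / 100 > 4 / 5

print(f"Categories sum to {total}%")
print(f"Points reassigned under the cautious scenario: {reassigned}")
print(f"Best estimate exceeds four-fifths: {more_than_four_fifths}")
```

The four categories sum to 100%, and the 16-point gap is in line with the ~6% quoted for K1a9 plus a share of roughly 10% for N1b2, the latter being implied by the subtraction rather than reported in this section.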