If you are new to population genetics...

This page aims to provide detailed descriptions of each haplogroup and its history. If you are unfamiliar with haplogroups or population genetics, we recommend that you first familiarise yourself with the basics by viewing the Video Tutorials about genetics and reading our Frequently Asked Questions about DNA tests. Each haplogroup corresponds to a distinct ancestral lineage. Haplogroups are divided into numerous levels of subclades that form a phylogenetic tree, which is just a fancy word for a genealogical tree of genetic ancestry. You may also find it useful to visualise the modern geographic distribution of Y-DNA haplogroups to get a sense of what they represent.
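To make the idea of nested subclades more concrete, here is a minimal Python sketch that splits an alphanumeric Y-DNA haplogroup name into its chain of parent clades. It assumes the conventional naming scheme in which numbers and lowercase letters alternate at each level (I, I2, I2a, I2a1 and so on). The function name `lineage` is purely illustrative, and the approach does not cover SNP-based shorthand such as R1b-M269 or mtDNA names such as HV0.

```python
import re


def lineage(haplogroup: str) -> list[str]:
    """Return the chain of clades from the root letter down to `haplogroup`.

    Assumes a name made of one capital letter followed by alternating
    numbers and lowercase letters (an assumption about the naming scheme).
    """
    match = re.fullmatch(r"([A-Z])((?:\d+|[a-z])*)", haplogroup)
    if match is None:
        raise ValueError(f"not an alphanumeric haplogroup name: {haplogroup!r}")
    root, rest = match.groups()
    chain = [root]
    # Each run of digits, or each single lowercase letter, adds one level.
    for level in re.findall(r"\d+|[a-z]", rest):
        chain.append(chain[-1] + level)
    return chain


print(lineage("I2a1"))  # ['I', 'I2', 'I2a', 'I2a1']
```

Read from left to right, the list goes from the oldest common ancestor to the most recent subclade, which is how a branch of the phylogenetic tree is normally read.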

Disclaimer

The information about the origin and ethnic association of haplogroups on this website should not be read as hard facts, but, as is often the case in science, as a model in constant evolution based on the present knowledge and understanding of the author. Wherever genetics could not yet provide irrefutable answers, we have attempted to provide the most likely and logical hypothesis based on archaeological, historical and linguistic evidence. This page is updated regularly to keep up with recent studies that give additional insights or rectify possibly erroneous theories. Feel free to add comments or share your opinion on the forum.

DNA Facts

Nucleobases are the alphabet of DNA. There are four of them: adenine (A), thymine (T), guanine (G) and cytosine (C). They always pair up in the same way, A with T and G with C. Such pairs are called "base pairs".

The human genome is composed of about 3,000 million base pairs, counted over one set of 23 chromosomes. The full complement of 46 chromosomes holds roughly twice that number.

The Y chromosome possesses about 60 million nucleobases, against 153 million for the X chromosome.

Mitochondrial DNA is found outside the cell's nucleus, and therefore outside of the chromosomes. It consists only of 16,569 bases.

A SNP (single nucleotide polymorphism) is a mutation in a single base pair. At present, only a few hundred SNPs define all the human haplogroups for mtDNA or Y-DNA.
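As a rough illustration of what a SNP looks like in practice, the Python sketch below compares two aligned sequences base by base and reports every position where they differ. The sequences are made up for the example and the function name `snp_positions` is an assumption. Real genotyping works on aligned reads and quality scores rather than on plain strings.

```python
def snp_positions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """List (position, reference_base, sample_base) wherever two sequences differ.

    Positions are 1-based. Both sequences must already be aligned and of
    equal length (a simplifying assumption).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (position, ref_base, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != sample_base
    ]


# Made-up 12-base sequences, purely for illustration.
reference = "ACGTTGCAAGCT"
sample = "ACGTTACAAGCT"
print(snp_positions(reference, sample))  # [(6, 'G', 'A')]
```

Here the sample carries an A where the reference has a G at position 6, which is the kind of single-base difference that defines a haplogroup branch.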


Introduction to historical population genetics

DNA studies have made it possible to categorise all humans on Earth into genealogical groups, each sharing one common ancestor at a given point in prehistory. These groups are called haplogroups. There are two kinds of haplogroups: the paternally inherited Y-chromosome DNA (Y-DNA) haplogroups, and the maternally inherited mitochondrial DNA (mtDNA) haplogroups. They indicate the agnatic (or patrilineal) and enatic (or matrilineal) ancestry respectively.

Y-DNA haplogroups are useful for determining whether two apparently unrelated individuals sharing the same surname do indeed descend from a common ancestor in the not too distant past (3 to 20 generations). This is achieved by comparing their haplotypes through the STR markers. Deep SNP testing allows one to go back much farther in time, and to identify the ancient ethnic group to which one's ancestors belonged (e.g. Celtic, Germanic, Slavic, Greco-Roman, Basque, Iberian, Phoenician, Jewish, etc.).
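One simple way to compare two STR haplotypes is to add up, marker by marker, how far apart the repeat counts are. The Python sketch below does this for four commonly tested Y-STR markers. The repeat values are hypothetical, the function name `genetic_distance` is an assumption, and commercial tests use more markers and more elaborate counting rules than this.

```python
def genetic_distance(a: dict[str, int], b: dict[str, int]) -> int:
    """Sum of absolute repeat-count differences over the markers both haplotypes share."""
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("the two haplotypes have no markers in common")
    return sum(abs(a[marker] - b[marker]) for marker in shared)


# Hypothetical repeat counts, purely for illustration.
person_1 = {"DYS393": 13, "DYS390": 24, "DYS19": 14, "DYS391": 11}
person_2 = {"DYS393": 13, "DYS390": 25, "DYS19": 14, "DYS391": 10}
print(genetic_distance(person_1, person_2))  # 2
```

A small distance over many markers suggests a recent common paternal ancestor. Turning the distance into a number of generations needs marker-specific mutation rates, which this sketch does not attempt.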

In Europe, mtDNA haplogroups are quite evenly spread over the continent, and therefore cannot easily be associated with ancient ethnicities. However, they can sometimes reveal potential medical conditions (see diseases associated with mtDNA mutations). Some mtDNA subclades are associated with Jewish ancestry, notably K1a1b1a, K1a9, K2a2a and N1b.

The study of Y-chromosomes is far more interesting than that of mitochondrial DNA for two reasons.

Firstly, the Y chromosome is a sequence of 60 million "characters" (nucleobases), against only 16,569 for mtDNA. The Y chromosome therefore offers a much greater resolution, as mutations are more common and indeed happen every generation. In contrast, mtDNA mutations happen much less frequently. Since the time of Mitochondrial Eve, approximately 200,000 years ago, modern humans have acquired on average 20 mtDNA mutations in each lineage, or about one every ten thousand years. Even though the number of mutations has accelerated with the soaring of the human population over the last 10,000 years, the dating of lineages based on mtDNA alone remains very approximate, and practically useless for historical times. By sequencing the full Y chromosome, it is theoretically possible to map the entire patrilineal genealogy of humanity (or any other species) to within a few generations (or even within one generation). This is a colossal task, and an expensive one too, since full chromosome sequencing (reading every nucleobase one by one) remains very expensive compared to SNP genotyping (checking only for mutations already discovered in other individuals). The arrival of full Y-chromosome sequencing (and even whole genome sequencing) on the market has made it possible to achieve an optimal resolution, but the price of these tests remains well above that of standard commercial tests. This restricts their overall reach, so that the most common haplogroups in rich countries are at present much better studied than the others.

The second advantage of Y-DNA over mtDNA is that men have traditionally been less mobile than women (except during military invasions, like those of the Indo-Europeans, the Vikings or the Arabs). In almost every settled, agricultural society, men are the ones who inherit their parents' property, and therefore remain in the same location generation after generation. Women, on the other hand, were often sent away to marry in another village or town, so that their lineages spread more evenly over time, thus progressively erasing the traces of ancient settlement patterns.
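The mtDNA dating figures quoted earlier in this section can be checked with a few lines of arithmetic. The short Python sketch below uses the two numbers given in the text, plus a generation length of 25 years, which is an assumption added here for illustration and does not come from the text.

```python
YEARS_SINCE_MITOCHONDRIAL_EVE = 200_000  # figure quoted in the text
MTDNA_MUTATIONS_PER_LINEAGE = 20         # figure quoted in the text
YEARS_PER_GENERATION = 25                # assumption, not from the text

years_per_mutation = YEARS_SINCE_MITOCHONDRIAL_EVE / MTDNA_MUTATIONS_PER_LINEAGE
generations_per_mutation = years_per_mutation / YEARS_PER_GENERATION

print(f"One mtDNA mutation about every {years_per_mutation:,.0f} years")
print(f"That is roughly {generations_per_mutation:,.0f} generations per mutation")
```

Under these assumptions a maternal lineage goes through some 400 generations between two mtDNA mutations, whereas the Y chromosome, as noted above, picks up mutations every generation. This gap is why mtDNA alone is of little use for dating events in historical times.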

Paternal and maternal haplogroups in prehistoric Europe

Mesolithic Europe

Following the end of the last Ice Age approximately 12,000 years ago, European hunter-gatherers recolonised the continent from the Ice Age refugia in southern Europe. The vast majority of Mesolithic Europeans would have belonged to Y-haplogroup I. This included I* (the * means that no further subclade was identified), pre-I1, I1, I2*, I2a*, I2a2 and I2c, but the most widespread appears to have been I2a1, which was found in most parts of Europe. Northeast Europeans would have belonged mostly to haplogroup R1a, and to a lesser extent also to I2a2 and R1b. Other minor male lineages were certainly also present in parts of Europe, notably haplogroups A1a and C-V20, and possibly even Q1a.

The maternal lineages of Mesolithic Europeans appear to have been predominantly U4 and U5, but also included several H subclades (H1, H3, H17), T, U2 (U2d and U2e) and V. The presence of mt-haplogroups I and W in Eastern Europe or the North Caucasus is possible but hasn't been confirmed yet.

Based on their modern distributions, mtDNA haplogroups H10 and H11 might well have Mesolithic/Palaeolithic European origins.

There seem to have been several Palaeolithic and/or Mesolithic migrations from Northwest Africa to Iberia. The oldest might have brought West African paternal haplogroup A1a to Western and Northern Europe during the Palaeolithic. A1a has been found in modern populations as far north as Ireland, Scotland, Scandinavia and Finland. The presence of African maternal lineages (L2, L3 and possibly L1b1) has been attested in Neolithic Iberia. Northwest Africans would also have brought U6 and possibly HV0/V lineages to Europe.

A small percentage of sub-Saharan African admixture has been identified in Late Mesolithic Swedes from the Pitted Ware culture (2800-2000 BCE), which would imply that A1a was already present in northern Europe at the time. Another Mesolithic sample from Loschbour in Luxembourg had dark hair and considerably darker skin than modern Europeans.

Distribution map of Y-DNA and mtDNA haplogroups in and around Europe circa 8000 BCE

Neolithic and Chalcolithic Europe


Agriculture first developed in the Levant, then spread to Anatolia, Greece, the Balkans, Italy, and Central and Eastern Europe. These Neolithic farmers have been confirmed to have belonged primarily to Y-DNA haplogroup G2a, but also included minorities of C1a2, E1b1b, H2 (formerly F3), J1, J2 and T1a lineages, which could have been assimilated in Anatolia before entering Europe. As they advanced across Europe, Neolithic farmers also increasingly assimilated European lineages, notably I2a1 in Southeast Europe, I1 and I2a1 in Central Europe, I2a1 and I2a2a in Western Europe, and E-M78, I2a1 and I2a2a in Southwest Europe.

Hundreds of Neolithic samples from all over Europe (but especially Central Europe and Iberia) have been tested. The new lineages brought by these Near Eastern immigrants included mt-haplogroups HV, J1, J2, K1, K2, N*, N1, T1a, T2b, T2c, T2e, T2f, U3, W, X1, X2, and many subclades of H (including H2, H5, H7, H13 and H20). H4, H8 and H9 seem to have originated in the Near East as well, although no Neolithic sample has been identified in Europe yet.

However, due to the proximity of the Caucasus to the Indo-European homeland, many of these mt-haplogroups were almost certainly also transported by the Indo-Europeans themselves. This would notably be the case for H5, K1a, T2b, U3, W and X2.

The Bronze Age and the Indo-European migrations

The origin of the Indo-European peoples is a subject that has caused much ink to flow among archaeologists and historians. Their Urheimat (original homeland) has been speculated to lie in Anatolia, around the Caucasus, in Iran, in India, in Central Asia, in Russia, or even in Scandinavia. Thanks to palaeogenetics we now know that these people expanded during the Late Copper Age and Early Bronze Age from the Pontic Steppe, to the north of the Black Sea and the Caucasus. There seem to have been two distinct, though closely related, groups of tribes speaking the Proto-Indo-European language, from which almost all European languages today descend (apart from Basque, Hungarian, Estonian, Finnish and Sami), as well as Armenian, Kurdish, Persian and most North Indian languages. Tribes belonging mainly to the paternal haplogroup R1a reportedly occupied the north of the steppe (forest-steppe and tundra), while in the south (open steppe) were nomadic cattle herders belonging mainly to haplogroup R1b.

Their migration both westward to Europe and eastward to Central and South Asia makes it easy to infer which mtDNA haplogroups they carried (=> see also Identifying the original Indo-European mtDNA from isolated settlements). The best matches for R1a are C4a, H1b, H1c, H2a1, H6, H11, K1b1b, K1c, K2b, T1a1a1, T2a1b1, T2b2, T2b4, U2e, U4, U5a1a, W, and several I subclades.

The R1b branch would have originated in eastern Anatolia and/or northern Mesopotamia/Syria during the Early Neolithic period, where its members probably domesticated cattle and became primarily cattle herders. They would then have migrated to the western part of the Iranian plateau and crossed the Caucasus to the Pontic Steppe in search of pasture for their cattle, where they mixed to some extent with the I2a2 and R1a tribes that inhabited those lands. The maternal lineages of these Near Eastern R1b people would have included haplogroups H5a, H6, H8, H15, I1a1, J1b1a, K1a3, K2a6, U5, and some V subclades (like V15).

MtDNA haplogroup H4 has not been found in Europe before the Late Chalcolithic (Corded Ware culture) and the Early Bronze Age (Unetice culture), and might have been brought by the Indo-Europeans. Likewise, H6 is absent from all Mesolithic and Neolithic samples, and its strong presence in the North Caucasus and Central Asia supports an Indo-European connection.

Y-DNA Haplogroups

Chronological development of Y-DNA haplogroups

C => 66,000 years ago (in East Africa)

E => 62,500 years ago (in Africa)

G => 48,000 years ago (in the Middle East)

K => 46,000 years ago (between the Caucasus and India)

I => 43,000 years ago (around the Black Sea)

J => 43,000 years ago (in the Middle East or the Caucasus)

T => 42,000 years ago (around the Iranian Plateau)

C1a2 => 41,500 years ago (in the Middle East)

E1b1b => 35,000 years ago (in Northeast Africa)

Q & R => 32,000 years ago (in Central Asia or Siberia)

J1 => 31,000 years ago (in the Caucasus or Zagros mountains)

J2 => 31,000 years ago (in northern Mesopotamia or the Caucasus)

I1 & I2 => 27,500 years ago (in Europe)

T1a => 27,000 years ago (around the Iranian Plateau)

R1b => 23,000 years ago (around the Caspian Sea or in Russia)

R1a => 23,000 years ago (in Russia)

E-M78 => 20,000 years ago (in north-eastern Africa)

G2a => 20,000 years ago (in the Middle East)

I2a1a (M26) & I2a1b (M423) => 18,500 years ago (in southern Europe)

J2a1 => 18,500 years ago (in northern Mesopotamia or in the Caucasus)

E-M123 => 18,000 years ago (around the Red Sea or in the Levant)

I2a2a (M223) => 17,500 years ago (in southern Europe)

J2b1 & J2b2 => 16,000 years ago (around the Iranian Plateau or the Caucasus)

N1c1 => 15,500 years ago (in northern China)

E-M81 => 14,000 years ago (in North Africa)

R1b-M269 => 13,500 years ago (around the Caspian Sea)

I2a2b (L38) => 12,500 years ago (in central Europe)

J1-P58 => 11,500 years ago (in the Middle East or the Caucasus)

I2a2a-L801 => 9,500 years ago (in central or northern Europe)

R1a1a1 (M417) => 8,500 years ago (in Northeast Europe)

T1a-CTS2214 => 8,500 years ago (in the Middle East)

E-V13 => 7,500 years ago (in Central or Southeast Europe)

N1c1-L1026 => 6,500 years ago (in Northeast Europe)

Q1b1a-L245 => 6,500 years ago (in central Asia or in the Middle East)

R1b-L23 => 6,500 years ago (around the Caucasus)

J1-L858 => 5,500 years ago (in the Middle East)

R1b-U106 & R1b-P312 => 5,000 years ago (in Central Europe)

I1a (DF29) => 4,500 years ago (in Scandinavia)

Q1a2-Y4827 => 3,000 years ago (in Scandinavia)

Map of early Bronze Age cultures in Europe around 4,500 to 5,000 years ago

Main European & Middle Eastern haplogroups