Abstract The current human mitochondrial (mtDNA) phylogeny does not equally represent all human populations but is biased in favour of representatives originally from north and central Europe. This especially affects the phylogeny of some uncommon West Eurasian haplogroups, including I and W, whose southern European and Near Eastern components are very poorly represented, suggesting that extensive hidden phylogenetic substructure remains to be uncovered. This study expanded and re-analysed the available datasets of I and W complete mtDNA genomes, reaching a comprehensive 419 mitogenomes, and searched for precise correlations between the ages and geographical distributions of their numerous newly identified subclades with events of human dispersal which contributed to the genetic formation of modern Europeans. Our results showed that haplogroups I (within N1a1b) and W originated in the Near East during the Last Glacial Maximum or pre-warming period (the period of gradual warming between the end of the LGM, ∼19 ky ago, and the beginning of the first main warming phase, ∼15 ky ago) and, like the much more common haplogroups J and T, may have been involved in Late Glacial expansions starting from the Near East. Thus our data contribute to a better definition of the Late and postglacial re-peopling of Europe, providing further evidence for the scenario that major population expansions started after the Last Glacial Maximum but before Neolithic times, but also evidencing traces of diffusion events in several I and W subclades dating to the European Neolithic and restricted to Europe.

Citation: Olivieri A, Pala M, Gandini F, Kashani BH, Perego UA, Woodward SR, et al. (2013) Mitogenomes from Two Uncommon Haplogroups Mark Late Glacial/Postglacial Expansions from the Near East and Neolithic Dispersals within Europe. PLoS ONE 8(7): e70492. https://doi.org/10.1371/journal.pone.0070492
Editor: Luísa Maria Sousa Mesquita Pereira, IPATIMUP (Institute of Molecular Pathology and Immunology of the University of Porto), Portugal
Received: March 19, 2013; Accepted: June 20, 2013; Published: July 31, 2013
Copyright: © 2013 Olivieri et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This research received support from the Leverhulme Trust (research project grant 10 105/D) (to MBR), the Sorenson Molecular Genealogy Foundation (to UAP and SRW) and the Italian Ministry of Education, University and Research: Progetti Futuro in Ricerca 2008 (RBFR08U07M) and 2012 (RBFR126B8I) (to AA and AO) and Progetti Ricerca Interesse Nazionale 2009 and 2012 (to AA, AT and OS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: Author Scott R. Woodward is employed by the commercial company AncestryDNA, Provo, UT. After having carefully read the journal's policy, the authors confirm that this does not alter their adherence to all the PLOS ONE policies on sharing data and materials. They confirm that Alessandro Achilli is a PLOS ONE Editorial Board member. After having carefully read the journal's policy, the authors confirm that this does not alter their adherence to all the PLOS ONE policies on sharing data and materials.

Introduction Evidence from mitochondrial DNA (mtDNA) suggests that a southern dispersal from the Horn of Africa along the Indian Ocean coasts might have brought anatomically modern humans out of Africa ∼60–70 thousand years ago (kya) [1]–[3] although archaeological evidence from Southern Arabian and Indian sites has led some to propose an even earlier exit along the southern route [4]–[6]. ∼15–25 ky later, during the Early Upper Palaeolithic, the first modern Europeans arrived from the Levant [7]–[10]. Archaeologists, linguists, anthropologists and, more recently, geneticists have long debated the role of the major colonization and diffusion events in shaping the structure of modern Europeans. A fundamental question has concerned the relative amount of genetic input into modern Europeans from Palaeolithic versus Neolithic waves of settlement. Palaeolithic events include both the first entry to the continent and the re-settlement from southern refugia after the Last Glacial Maximum (LGM), starting from ∼19 kya, while Neolithic phases primarily coincide with the spread of agriculture and pastoralism that began in the Near East ∼10 kya and progressively reached the Balkans, Central Europe, the West Mediterranean, and the north, argued to have been accompanied by substantial increases in population size [11]. As for the relative extent of the genetic traces left by these key events, the debate has been inconclusive. Early analyses based on “classical” genetic markers were interpreted as supporting a Neolithic wave of advance that played a major role in shaping the genetic variability of Europeans [12], [13] with Mesolithic foragers contributing minimally to the present-day genetic background. 
Subsequently, the analysis of mitochondrial DNA variation based on the phylogeographic analysis of the mtDNA control-region sequence and coding-region RFLP markers [14]–[16] turned the tide, pointing to a more significant contribution from the indigenous hunter-gatherers estimated to at least ∼80%. This suggested that only small groups of Neolithic people settled Europe and a wide-scale adoption of agricultural technology by indigenous Mesolithic/Palaeolithic populations occurred [3]. With the advent of complete mtDNA genome sequencing, clear Palaeolithic and Mesolithic signals in Europe have been retrieved from various mtDNA clades tracing Late Glacial and postglacial expansions of populations from European refuge areas from ∼18 kya, albeit with the majority clustering in the postglacial ∼11.5 kya [1], [3], [17]–[21]. More recently, Pala et al. [22] have further shown that the widespread West Eurasian haplogroups J and T share a common origin in the Near East and expanded at the end of the Last Glacial Maximum from a Near Eastern glacial refuge. Lineages within these haplogroups had previously been identified as potentially accompanying the spread of the Neolithic, but the improved resolution of complete mtDNA genomes showed that the initial move to Europe had been much earlier. However, despite some criticisms [23], the picture of the peopling of Europe with limited, but not insignificant, Neolithic immigration into a mainly Palaeolithic/Mesolithic genetic background, was supported by the analysis of the other uniparental genetic system, the non-recombining, male-specific portion of the Y chromosome [24], which not only confirmed the presence of Mediterranean/Southern European refuges [25], but also traced the genetic legacy of other glacial refuge areas, such as the Balkans and the Periglacial areas of the Ukrainian plains [26], [27]. 
Ten years after the release of the whole human genome reference sequence [28], [29], previously inconceivable progress has been made in the field of population genetics, with the dual aim of correlating population structure with the genetic bases of common diseases and/or drug response and of understanding the past history and migrations of our species [30]. The future of population genetics will likely be dominated by personal genomics and the sequencing of complete genomes at the population level [31]; in the meantime, genome-wide SNP arrays have helped to outline continental and population genetic maps. For example, a close correlation between genes and geography (albeit for a tiny fraction of the variation) has been detected within the European continent [32], [33]. However, some of the major questions concerning the peopling of the world remain unanswered, mainly because of the lack of reliable chronologies for the detected genetic admixtures and structures [34]. In this respect, mtDNA remains at the forefront of the field, due both to a well-developed molecular clock [21] and to the recent accumulation of ancient DNA evidence. The Ancient DNA Perspective A key role in distinguishing the relative amount of Mesolithic versus Neolithic genetic traces retained within modern human populations is played by ancient DNA studies, despite problems such as contamination (with consequent misleading selection of rare variants) and small sample sizes, which contribute to potentially biased views of the ancient gene pool [35]. The earliest farming culture in Central Europe, the Linear Pottery Culture (LBK, from Linienbandkeramik), has been precisely dated to ∼7 kya, thanks to recently recalibrated radiocarbon dating, and it also represents the best genetically characterised trace of the Neolithic advent in Europe.
A first analysis of 24 Neolithic skeletons from Central Europe, dating back to the LBK period, found that 25% of the mtDNAs belonged to haplogroup N1a, which is detectable at a 150 times lower frequency (0.2%) in modern Europeans [36]. These controversial results [37]–[39] initially suggested that cultural diffusion was the major mechanism of spread for Neolithic technologies and that, at least for the maternal lineage, the first central European farmers did not significantly contribute to the genetic pool of Europeans, who appear by default to have been of mainly Mesolithic origin. On the other hand, mtDNA sequences obtained from a Spanish Neolithic site dating to 5,500 years BP were interpreted as supporting a demic diffusion model [40]. Despite the limited sample size (N = 11), the haplogroup composition of the Neolithic population suggested genetic continuity between ancient and modern Iberians. This raised the possibility of heterogeneous patterns of Neolithic dispersal between Central and Southern Europe. Bramanti and co-workers claimed to have resolved the issue [41] by comparing directly (and for the first time) ancient DNA from skeletons of pre-Neolithic European hunter-gatherers and early farmers (20 and 25 specimens, respectively). The mtDNAs fell into two profoundly distinct groups, thus suggesting genetic discontinuity between Palaeolithic/Mesolithic (mainly carrying haplogroup U) and Neolithic groups, and also between pre-Neolithic and modern populations. More recently, the LBK population sample was increased (to N = 42) and comparisons with modern mtDNA variation suggested a demic (not only cultural) diffusion model of genetic input from the Near East/Anatolia into Central Europe with the early Neolithic [42], as claimed by some anthropologists [43], [44].
However, despite some signs of Neolithic ancestry among modern Europeans, distinct patterns of haplogroup frequency distribution between ancient and modern samples suggest that further major demographic events shaped the genetic landscape of Europe [42]. In conclusion, to quote Rowley-Conwy, “the picture is more complex and, thus, more interesting than these simple scenarios suggest”, with many, maybe as yet undetected, local migratory pulses similar to leapfrog migrations [45]. The concept that the rise and expansion of farming were not uniform processes across Europe has been further corroborated by ancient nuclear genomics. The recent analysis of 5-ky-old skeletal remains from Scandinavia revealed a close genetic link between a Neolithic individual and modern Mediterranean Europeans, while Scandinavian hunter-gatherers clustered with modern northern Europeans (Finns in particular) [46]. Thus, while the classical scenarios envisioned expansions (cultural versus genetic) from the Near East towards Europe, emerging ancient genomic data are now deciphering internal routes spanning the Neolithic within Europe. A Methodological Revolution Continuous progress in automatic high-throughput sequencing technologies has contributed to a new methodological revolution within mtDNA population studies, allowing a large volume of complete mitogenome data to enter public web databases, including a recent addition of 8,216 modern mtDNA genomes [47]. This revolution has made available to the scientific community a constantly increasing amount of molecular data, raising the level of resolution of human mtDNA phylogeny in terms of haplogroup definition to unprecedented levels [47], [48] (PhyloTree Build 15). This newly available dataset, when constantly enlarged and analysed with emerging and/or well-established phylogenetic/phylogeographic methods, constitutes an extremely informative source of inferences on human evolution and population relationships.
The ideal phylogeny will span all worldwide modern human populations, thus including representatives of virtually all extant mtDNA haplotypes. The final aim is to reconstruct and trace, step by step, the journey that our ancestors took across the world, providing answers to pivotal questions that still remain unsolved. Here are some examples: the contribution of Palaeolithic glacial refugia to Late Glacial and postglacial re-peopling of Europe seems clear, but are there also some traces of European Neolithic migration events clearly marked by mtDNA lineages? Could they clarify major events of the peopling of Europe? In this context, how can we detect the genetic contribution to present-day variation of the numerous demographic events that have taken place in post-Neolithic Europe? Is there continuity or discontinuity between modern and both Palaeolithic and Neolithic ancient DNAs? When discontinuity is detected, is this due to poor dataset resolution and could the above-mentioned “ideal” worldwide phylogeny resolve this question? Some mtDNA haplogroups, sharing peculiar characteristics, are suitable candidates to potentially answer these and other fundamental questions, but often in the past, the level of resolution has been inadequate. The work presented here focuses on the phylogenetic and phylogeographic analysis of two West Eurasian haplogroups, namely I (within haplogroup N1a1b) and W, which are widely distributed over the entire European continent, the Near East and West Asia, but at low frequencies. Haplogroups I and W split directly from N1 and N2, respectively; thus they are both one step from the root of haplogroup N, the most ancient non-African (or better out-of-Africa) lineage that entered first Southwest Asia (∼60 kya) and then Europe (∼45 kya).
A recently published phylogenetic analysis of haplogroups N1 and N2 (including representatives of both haplogroups I and W), as well as haplogroup X, suggested that these clades did indeed represent ancient relicts of the first human dispersals out of Africa along the southern coastal route, localizing their putative origins in the Arabian peninsula [49]. The current human mtDNA phylogeny does not equally represent all human populations but is biased in favour of representatives originally from north and central Europe [47]. This affects the phylogeny of many West Eurasian haplogroups, including I and W, whose southern European and Near Eastern components are poorly represented. The aims of this work were to (i) expand and re-analyse the available datasets of I and W complete mtDNA genomes, reaching a comprehensive 419 mitogenomes (192 I, with the addition of four samples belonging to the poorly represented sister clade N1a1b1– former N1e, and 223 W), by adding 58 new complete sequences, mainly from southern European and Near Eastern individuals, and (ii) accurately define the phylogenetic relationships within subclades of limited geographic distribution and low frequencies, searching for precise correlations between mtDNA haplotypes/clades and events of human dispersal. Our results showed that haplogroups N1a1b1, I and W most probably originated in the Near East during the Last Glacial Maximum or pre-warming period (the period of gradual warming between the end of the LGM, ∼19 kya, and the beginning of the first main warming phase, ∼15 kya) and, like J and T, may have been involved in Late Glacial expansions starting from the Near East. Thus these data contribute to better defining the Late and postglacial re-peopling of Europe, providing further evidence for the scenario that major population expansions started after the Last Glacial Maximum but before Neolithic times.

Materials and Methods Sample Selection and Analysis of mtDNA Sequence Variation We searched our database of control-region sequences (and relative haplogroup classification based on coding-region markers) from almost 10,000 available subjects of various geographic origins (Africa, East and South Asia, the Near East, Caucasus and Europe) and selected 58 mtDNAs (31 W, 26 I and 1 N1a1b1) for complete mtDNA sequencing. Both control-region variation and geographic/ethnic origin were used as selection criteria, particularly focusing on samples from Mediterranean Europe and the Near East (following the same definition of this term as in [22]). For all subjects involved, appropriate written informed consent was obtained, and the study was approved by the Ethics Committee for Clinical Experimentation at the University of Pavia, Board minutes from October 5th, 2010. These 58 mitogenomes were analysed together with 225 (89 I, 3 N1a1b1 and 133 W) previously available from published data and public databases (i.e. NCBI and 1000 Genomes Project) and 136 (77 I and 59 W) made available by recent phylogenetic updates from Behar et al. [47] for a total of 419 (196 belonging to N1a1b and 223 belonging to W) mitogenomes used to build the corresponding phylogenies. Geographic and ethnic affiliations of the 419 mitogenomes are listed in Table S1 and Table S2 in File S1, together with their GenBank or 1000 Genomes Project accession numbers. We amplified and sequenced mitogenomes following well-established protocols, as reported elsewhere [50], and aligned, assembled, and compared them using Sequencher 5.0 (Gene Codes Corporation), relative to both the newly proposed Revised Sapiens Reference Sequence (RSRS) [47] and rCRS [51]. We performed phylogenetic construction using a maximum parsimony approach with the aid of the mtPhyl software (http://eltsov.org/mtphyl.aspx), correcting the trees by hand with reference to PhyloTree. 
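The composition of the dataset described above can be cross-checked with a few lines of arithmetic. The following is only a bookkeeping sketch in Python: the counts are those quoted in the text, while the dictionary layout and the function name are illustrative assumptions.

```python
# Counts of complete mitogenomes by source and haplogroup, as quoted in the text.
# The table layout and names below are illustrative, not part of the study.
counts = {
    "new (this study)": {"I": 26, "N1a1b1": 1, "W": 31},
    "published/public databases": {"I": 89, "N1a1b1": 3, "W": 133},
    "Behar et al. [47] update": {"I": 77, "N1a1b1": 0, "W": 59},
}


def totals_by_haplogroup(table):
    """Sum the per-source counts for each haplogroup."""
    totals = {}
    for per_source in table.values():
        for haplogroup, n in per_source.items():
            totals[haplogroup] = totals.get(haplogroup, 0) + n
    return totals


by_source = {source: sum(per_hg.values()) for source, per_hg in counts.items()}
by_haplogroup = totals_by_haplogroup(counts)

# Haplogroup I and its sister clade N1a1b1 together make up N1a1b.
n1a1b_total = by_haplogroup["I"] + by_haplogroup["N1a1b1"]
grand_total = n1a1b_total + by_haplogroup["W"]

print(by_source)
print(by_haplogroup, n1a1b_total, grand_total)
```

The per-source sums (58, 225 and 136) and the per-haplogroup sums (192 I, 4 N1a1b1 and 223 W) agree with the totals of 196 N1a1b and 419 mitogenomes given in the text.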
We assigned haplogroup labels following the nomenclature proposed by the PhyloTree database (at http://www.phylotree.org/) [48]. We obtained maximum likelihood (ML) molecular divergences with the same methodological approach reported in [22] and then directly compared them to the averaged distances (ρ) and corresponding heuristic estimate of the standard error (σ), using whole-mtDNA sequences (excluding the mutations 16182C, 16183C, and 16519). We converted both ML and ρ mutational distances into years using the corrected molecular clock of [21]. We analysed the same dataset used to build the phylogenetic trees (196 N1a1b mitogenomes and 223 W with the exclusion of highly drifted Finnish mitogenomes, as in [49]) with BEAST v1.7 [52] to obtain Bayesian skyline plots (BSPs) [53], [54] of haplogroups N1a1b and W. We ran the program under the HKY substitution model (gamma-distributed rates) with a relaxed molecular clock (lognormal in distribution across branches and uncorrelated between them) for 100,000,000 iterations, with samples drawn every 10,000 Markov chain Monte Carlo (MCMC) steps, after a discarded burn-in of 10,000,000 steps, as in [55]. We considered haplogroups N1a1b, I and W as a whole and their major subclades monophyletic in the analyses. We visualized the BSPs obtained in plots with Tracer v1.5 and then converted them to Excel graphs by using a generation time of 25 years, as in [56]. We evaluated geographic distributions of both haplogroups I and W in a large dataset of more than 40,000 (published and unpublished) control-region (mostly limited to HVS-I, the first hypervariable segment) data from ∼100 populations, and assessed their geographic origin, haplogroup classification and haplotypes. We built spatial frequency distribution plots with the program Surfer 9 (Golden Software). 
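As an illustration of the averaged distance (ρ) and its heuristic standard error (σ) mentioned above, the following Python sketch computes both from a rooted tree summarised as a list of branches. The text does not spell out the formulas, so the commonly used forms ρ = Σ(l·d)/n and σ = √(Σ l·d²)/n are assumed here, where l is the number of mutations on a branch, d is the number of sampled sequences descending from it and n is the total number of sequences. The toy tree is hypothetical, and the conversion to years with the corrected clock of [21] is deliberately not reproduced.

```python
from math import sqrt


def rho_sigma(branches, n_samples):
    """Return (rho, sigma) in mutation units for a rooted tree.

    branches  -- list of (n_mutations, n_descendant_samples), one per branch
    n_samples -- total number of sampled sequences under the root

    Assumed formulas (not stated in the text):
        rho   = sum(l * d) / n
        sigma = sqrt(sum(l * d**2)) / n
    Mutation counts are assumed to already exclude 16182C, 16183C and 16519.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    rho = sum(l * d for l, d in branches) / n_samples
    sigma = sqrt(sum(l * d * d for l, d in branches)) / n_samples
    return rho, sigma


# Hypothetical toy tree with four sequences:
#   one mutation shared by three sequences,
#   two further mutations private to one of those three,
#   one mutation private to the fourth sequence.
toy_branches = [(1, 3), (2, 1), (1, 1)]
rho, sigma = rho_sigma(toy_branches, n_samples=4)
print(f"rho = {rho:.3f}, sigma = {sigma:.3f}")
```

For the toy tree, ρ is (3 + 2 + 1)/4 = 1.5 mutations and σ is √12/4, roughly 0.87 mutations. Summarising the tree by branch keeps the sketch independent of any particular tree file format.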
We assigned the most likely source region for major clades in the whole-mtDNA tree on the basis of sample distribution among the subclades, following the same approach as in [22].
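The BEAST run settings reported above imply a fixed number of retained posterior samples, and the generation time determines how the skyline axis is rescaled. The Python sketch below works through both steps. It assumes that the burn-in is counted within the 100,000,000 iterations and that the skyline is reported as effective population size multiplied by generation time (the usual reading when the clock is calibrated in years); the function names and the example skyline value are hypothetical.

```python
# Run settings quoted in the text.
CHAIN_LENGTH = 100_000_000
SAMPLE_EVERY = 10_000
BURN_IN = 10_000_000
GENERATION_TIME_YEARS = 25


def sample_bookkeeping(chain_length, sample_every, burn_in):
    """Return (total, discarded, retained) logged samples.

    Assumes the burn-in is part of the chain and ignores the initial
    state-0 sample that MCMC loggers may also write.
    """
    total = chain_length // sample_every
    discarded = burn_in // sample_every
    return total, discarded, total - discarded


def effective_size(ne_times_tau, generation_time):
    """Convert a skyline value (Ne x generation time) to Ne (assumed scaling)."""
    return ne_times_tau / generation_time


total, discarded, retained = sample_bookkeeping(CHAIN_LENGTH, SAMPLE_EVERY, BURN_IN)
print(total, discarded, retained)

# Hypothetical skyline value, for illustration only.
print(effective_size(250_000, GENERATION_TIME_YEARS))
```

Under these assumptions the chain yields 10,000 logged samples, of which 1,000 fall within the burn-in and 9,000 are retained.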

Discussion In the last ten years, the availability of a growing number of complete mitogenomes (more than 18,000 in [47]) has dramatically improved the worldwide human mtDNA phylogeny [48]. Many of the novel subclades are characterized by more distinct geographical distributions than the deeper clades from which they derive, thus allowing inferences on demographic events that not only occurred more recently but at regional rather than continental level (e.g. [19], [22]). In this study, we aimed to define the internal variation of haplogroups N1a1b and W, which are rather uncommon and were not well-sampled in random population surveys. Moreover, despite their infrequent occurrence, both N1a1b and W have extremely wide distribution ranges encompassing the whole of Western Eurasia and North Africa, implying that extensive hidden substructure remained to be uncovered for both haplogroups. Our data confirm this scenario, bringing to light numerous novel subclades as well as improving the phylogenetic resolution of those already known. Our data confirm that N1a1b1 and I coalesce at very similar times (21.1 and 20.1 kya, respectively) and their common molecular ancestor, corresponding to the N1a1b node, arose 28.6±5.2 kya. Haplogroup N1a1b1, even if very rare, has been found only in Asia with a deep internal split at node N1a1b1a (dated at ∼19.3 kya) which divides one single mitogenome from Russian North Asia (Siberia) and the remaining three Near Eastern (Iranian) individuals. Haplogroup I has a more widespread distribution, but with peaks of frequency in the Near East. Therefore, the most parsimonious scenario is that both haplogroups N1a1b1 and I arose in the Near East during the LGM period. This conclusion is supported by the phylogeny of haplogroup I. All of the subclades of haplogroup I, and especially the Late Glacial subclades (I1, I4, I5, and I6), include mitogenomes from the Near East. 
Like the deeper subclades of I, haplogroup W also dates to the Late Glacial period, ∼17 kya, and most of its subclades (W3–6) differentiated during the warming period (12–15 kya). Moreover, the distribution of W, with frequency peaks in India, the Near East and the Caucasus, as well as the presence of numerous basal Near Eastern lineages in the W tree (also dating to the Late Glacial), might suggest an origin in the Near East as well, with a subsequent very rapid spread into Europe. Comparing phylogeographic data from other lineages of Near Eastern origin, the overall age estimates for N1a1b1, I, and W haplogroups appear very similar to those previously reported for major subclades of J and T, two among the most frequent haplogroups in Europe and the Near East [22]. These major subclades were recently identified as signals of dispersals into Europe from a Near Eastern refuge area, after the peak of the last glaciation, ∼19 kya [22]. This scenario may be paralleled in the history of haplogroups N1a1b1, I and W, with dispersals of haplogroups I and W into Europe during the Late Glacial period, ∼18–12 kya, signalled in particular by subclades I1, I2’3, I5, W3, W4 and W5, and by W1 in the immediate postglacial period, ∼10–11 kya. Thus important expansions of I and W occurred in parallel with Late Glacial and postglacial climatic improvements, several millennia before the European Neolithic. It seems likely that Late Glacial and postglacial improvements in climate were fundamental to the dispersal of numerous other mtDNA lineages not only in the Near East and Europe, but also in Africa, Asia, the Pacific and the Americas [3], [21], [54], [55], [58]–[64], and even some lineages previously thought to be markers for Neolithic expansions have now been recognized as signalling Late Palaeolithic and/or Mesolithic diffusion events [22]. 
This represents a significant step forward in the century-long debate concerning the relative genetic contribution of Palaeolithic versus Neolithic to the current gene pool of modern Europeans. Now, a still unresolved fundamental question in understanding the genetic makeup of modern Europeans is what exactly happened in the time span of several thousands of years between the Late Palaeolithic/Mesolithic expansions and the arrival of agriculture in the different parts of Europe. The first clear consequence of the scenario described above is that since the European genetic pool was largely defined before Neolithic times, major haplogroups already present in Europe during the Palaeolithic were most probably involved in subsequent gene flows linked to the advent and expansion of agriculture. Therefore, we need to distinguish between lineages that arrived from the Near East with agriculture – which appear to be few, in the extant mtDNA pool – and those which may have dispersed and expanded within Europe, carrying agriculture from one region to another. In the case of the haplogroup I and W phylogenies, signs of the diffusion of agriculture and pastoralism within Europe may be evident in those I and W subclades which date to the European Neolithic period and are restricted to Europe, particularly starlike examples such as I1a1 (as already suggested by [49]) and I2, and possibly also I1c1, I3 and W5a. These are reflected in the more recent of the two bursts of growth starting from ∼7 kya in the N1a1b BSP of Figure 3, while the major expansion of the entire haplogroup W started during the Late Glacial period, decreasing gradually during the Neolithic (Figure 3). It is worth noting that the autosomal STRUCTURE analyses for Europe carried out by Behar et al. [65] seem to suggest a very substantial indigenous (i.e. 
non-Near Eastern) component, along with two potentially Near Eastern components, which could perhaps correspond to distinct Late Glacial and Neolithic dispersals from the Near East. Recent simulation work attempting to interpret autosomal patterns also suggests that any Neolithic immigration is likely to have been very minor [66]. However, with extant evidence we can only estimate (at best) the degree of present-day impact of each dispersal, rather than the scale of the dispersal as it was at the time. The direct comparison of ancient and modern DNA samples, allowing a diachronic view of human history, can be an important test of inferences based on data from extant populations. Having improved the resolution of the N1a1b and W phylogenies, we were then able to re-evaluate the published I and W control-region haplotypes from ancient specimens (Table 3) in the context of the modern variation of I and W mitogenomes (Figure S1 and Figure S2). Since only information relative to the HVS-I is available, most of the ancient I and W mtDNAs bear basal and/or common haplotypes, which could not be further classified within any subclade. However, a few informative cases were identified. A Spanish middle Neolithic sample [40] bearing the haplogroup I control-region motif 16264-16270-16311-16319-16362 (from the root of I) (Table 3) can now be classified within I1c1. The identification of this sample has already been interpreted as indicating genetic continuity in the Iberian Peninsula since the Neolithic period and (more contentiously, given the paucity of Mesolithic evidence from Iberia) that the diffusion of agriculture followed a demic model in the Mediterranean area [40]. We found that the same I1c1 haplotype is shared by five mitogenomes in our phylogeny (#60-64 in Figure S1), of which four are of unknown geographic/ethnic origin but one (sample #64, sequenced in the present study) is of North Italian origin.
This result provides further confirmation of our findings, based on the analysis of the haplogroup I phylogeny, that (i) subclade I1c1 was likely a marker of Neolithic dispersal in Europe (rather than, for example, having been brought from the Near East much more recently by the ancestors of Ashkenazi Jews, some of whom carry this lineage [67], [68]) and (ii) the distribution and age might support a demic model of Neolithic diffusion in the Mediterranean area. Similarly, a probable member of haplogroup W3 in the same Spanish Neolithic sample [40], sharing the haplotype 16292-16295-16304 (against the root of N) with a mitogenome from Azerbaijan (sample #127 in our phylogeny), may point to Neolithic dispersal from the Near East into Europe.

Table 3. List of ancient specimens belonging to mtDNA haplogroups I and W and control-region haplotypes. https://doi.org/10.1371/journal.pone.0070492.t003
With the advent of reliable ancient DNA studies, attention is starting to focus on subsequent events in European prehistory. A German specimen associated with the Late Neolithic Bell Beaker culture bears the mtDNA control-region variants (from the root of N) 16129-16172-16311-16391-73-199-203-204-250-263 (Table 3). The mutational motif 16172-203 classifies this sample within I1a1, another potential marker of the agricultural expansion in Europe. Considering that haplogroup I1a (Figure 3), from which subclade I1a1 derives, is mainly concentrated in Europe, with frequency peaks in Eastern Europe, it is possible that subclade I1a1, dated to about 5 kya in our phylogeny (Table 1), might be a marker of a late Neolithic diffusion from Central/Eastern Europe, perhaps associated with the Corded Ware, into Bell Beaker territory. This would also be consistent with the lack of haplogroup I1 thus far (apart from I1c) in any western European Neolithic or pre-Neolithic remains [69], and would testify to the importance of dispersals later than the early Neolithic in prehistoric Europe. Similarly, the European Neolithic subclade W5a has been detected in one Late Neolithic sample of the German Bell Beaker culture (Table 3), even though an accurate classification for this sample would require the analysis of at least one of the W5a-specific coding-region markers. Indeed, a German individual belonging to the Corded Ware culture has been shown to carry a W6 lineage [70]. As with I1a1, the age and distribution again make an origin in the north-east European Neolithic, followed by dispersal westwards with the Late Neolithic, an attractive hypothesis.
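The motif-based assignment of the Bell Beaker specimen described above amounts to checking whether a diagnostic set of variants is contained in the observed control-region haplotype. The Python sketch below illustrates this subset test using only the I1a1 motif and the haplotype quoted in the text; the dictionary, the function names and the second (truncated) haplotype are illustrative assumptions. A real classification would draw on the full PhyloTree motif set and, as noted for W5a, may require coding-region markers.

```python
# Diagnostic control-region motif quoted in the text (variants from the root of N).
# The dictionary structure is illustrative; only the I1a1 motif comes from the text.
DIAGNOSTIC_MOTIFS = {
    "I1a1": {"16172", "203"},
}


def parse_haplotype(text):
    """Split a hyphen-separated list of variant positions into a set."""
    return {variant for variant in text.split("-") if variant}


def matching_subclades(haplotype, motifs):
    """Return the subclades whose full diagnostic motif is present in the haplotype."""
    return sorted(name for name, motif in motifs.items() if motif <= haplotype)


# Bell Beaker specimen haplotype as listed in the text.
bell_beaker = parse_haplotype("16129-16172-16311-16391-73-199-203-204-250-263")
print(matching_subclades(bell_beaker, DIAGNOSTIC_MOTIFS))

# Hypothetical HVS-I-only record lacking position 203: the motif cannot be confirmed.
hvs1_only = parse_haplotype("16129-16172-16311-16391")
print(matching_subclades(hvs1_only, DIAGNOSTIC_MOTIFS))
```

The second call returns an empty list, which mirrors the point made above that ancient samples typed only for HVS-I often cannot be placed within a subclade.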

Supporting Information
Figure S1. Maximum parsimony tree of 196 mitogenomes belonging to the sister haplogroups I and N1a1b1. https://doi.org/10.1371/journal.pone.0070492.s001 (XLSX)
Figure S2. Maximum parsimony tree of 223 mitogenomes belonging to haplogroup W. https://doi.org/10.1371/journal.pone.0070492.s002 (XLSX)
File S1. File containing Tables S1–S3. Table S1. Origin and subclade affiliation of haplogroup N1a1b1 and I mitogenomes considered in this study. Table S2. Origin and subclade affiliation of haplogroup W mitogenomes considered in this study. Table S3. Percentage frequency distribution of haplogroups I and W and the subclades I1a and W6. https://doi.org/10.1371/journal.pone.0070492.s003 (DOCX)

Acknowledgments The authors are grateful to all the donors for providing biological specimens.

Author Contributions Conceived and designed the experiments: AO AA AT. Performed the experiments: AO MP FG BHK UAP VG VB. Analyzed the data: AO MP OS AA MBR AT. Contributed reagents/materials/analysis tools: UAP SRW AA OS AT. Wrote the paper: AO MBR AT.