Significance The question of colonization of Europe by Neolithic people of the Near East and their contribution to the farming economy of Europe has been addressed with extensive archaeological studies and many genetic investigations of extant European and Near Eastern populations. Here, we use DNA polymorphisms of extant populations to investigate the patterns of gene flow from the Near East to Europe. Our data support the hypothesis that Near Eastern migrants reached Europe from Anatolia. A maritime route and island hopping was mainly used by these Near Eastern migrants to reach Southern Europe.

Abstract The Neolithic populations, which colonized Europe approximately 9,000 y ago, presumably migrated from the Near East to Anatolia and from there to Central Europe through Thrace and the Balkans. An alternative route would have been island hopping across the Southern European coast. To test this hypothesis, we analyzed genome-wide DNA polymorphisms in populations bordering the Mediterranean coast and from Anatolia and mainland Europe. We observe a striking structure correlating genes with geography around the Mediterranean Sea, with characteristic east to west clines of gene flow. Using population network analysis, we also find that the gene flow from Anatolia to Europe was through Dodecanese, Crete, and the Southern European coast, compatible with the hypothesis that a maritime coastal route was mainly used for the migration of Neolithic farmers to Europe.

Genotyping of extant and ancient populations has been used to address the question of the origins of the people of Europe. The genome of the present-day Europeans reflects merging of the Paleolithic settlers who colonized Europe 35,000–40,000 y before the present era (BPE) and the Neolithic people who started colonizing Europe approximately 9,000 y BPE. The Neolithic contribution to the gene pool of modern Europeans has been estimated with studies of extant European populations by using mitochondrial DNA, Y-chromosomal DNA, or nuclear DNA polymorphisms. Mitochondrial DNA studies estimate the Neolithic contribution to the maternal lineages of the modern Europeans to range between 10 and 20% (1). A contribution of approximately 22% was suggested by a study of Y-chromosome polymorphisms, which also found that the Neolithic contribution was more pronounced along the Mediterranean coast (2). Neolithic contributions of 50–70% were estimated with other methodologies (3–5), including highly polymorphic DNA markers (6). Clinal patterns of genetic diversity of autosomal (7–9) or Y-chromosomal (10) polymorphisms across Europe suggest that the Neolithic migrants originated from the Near East (7–9). It has been proposed that these Near Eastern migrants brought to Europe their new agricultural technologies (7–9, 11) and, perhaps, the Indo-European language (12). How did these Neolithic people reach Europe from the Near East?

The geographic focus of the transition from foraging to the Neolithic way of life was the Levantine corridor, which extended from the Fertile Crescent to the southeastern sections of the central Anatolian basin (13). The Neolithic farmers could have taken three migration routes to Europe. One was by land to North-Eastern Anatolia and from there, through Bosporus and the Dardanelles, to Thrace and the Balkans (14, 15). A second route was a maritime route from the Aegean Anatolian coast to the Mediterranean islands and the coast of Southern Europe (12, 14–18). The third was from the Levantine coast to the Aegean islands and Greece (19). Navigation across the Mediterranean was active during the Early Neolithic and Upper Paleolithic (16–18), as illustrated by the finding of obsidian from the island of Milos in Paleolithic sites of the Greek mainland (19, 20) and the early colonization of Sardinia, Corsica, and Cyprus (18, 21–23). If a maritime route was used by the Neolithic farmers who settled Europe, their first stepping stones into Europe were the islands of Dodecanese and Crete. The Dodecanese is very close to the Aegean coast of Anatolia, whereas the westernmost Dodecanesean islands are very close to Crete. Crete hosts one of the oldest Neolithic settlements of Europe in the site of Knossos, established ∼8,500–9,000 y BPE (24, 25), and the inhabitants of the island established the first advanced European civilization starting approximately 5,000 y BPE.

To obtain insights on the question of migrations to Europe, we analyzed genome-wide autosomal single nucleotide polymorphisms (SNPs) from a dataset of 32 populations. This dataset includes population samples from the islands of Crete and Dodecanese, one from Cappadocia in Central Anatolia, three subpopulations from different regions of mainland Greece, 14 other populations from Southern and Northern Europe, five populations from the Near East, and seven from North Africa. In addition to established methods for genetics analysis, we use a population genetics network approach that can define pathways of gene flow between populations. Our data are compatible with the hypothesis that a maritime route connecting Anatolia and Southern Europe through Dodecanese and Crete was the main route used by the Neolithic migrants to reach Europe.

Discussion In historical times, there have been three major invasions of South Eastern Europe from the direction of the Near East but no evidence of major migratory events and gene flow. The Persians dominated South Western Asia in the fifth century BC: They established satrapies in Asia Minor and invaded Europe, but they were stopped by the Greeks (32). The Arabs attempted multiple invasions during the seventh and eighth centuries AD, but they were stopped by the Byzantines (33). An Arab tribe originating from Andalusia established a pirate state in Crete in the ninth century, but they were exterminated by the Byzantines 140 y later, and they left no traces of settlement on the island other than the name of their seat of power in the town of Chandax (33). The Turks, whose tribes originated in Central Asia, invaded Asia Minor starting in the 11th century and occupied the Balkans in the subsequent three centuries, but any Turks and converts to Islam left Greek territories with the population exchanges that took place in the 20th century (34). Seljuk Turks settled in Anatolia in the 12th century AD; however, the Anatolian Cappadocians we included in this study belong to the population that has kept the religion and the language of the pre-Seljuk Cappadocians and, therefore, most likely carry the genetic makeup of the ancient Anatolians. The only important gene flows from the Near East to Europe must have occurred in prehistoric times and, as genetic evidence suggests, the most prominent migrations should have occurred during the Neolithic. The idea that the Neolithic was introduced to Europe through coastal routes of colonization has been proposed by several archaeologists (12, 16, 17, 19, 22, 35). The earliest Neolithic sites with developed agricultural economies in Europe, dated to 8,500–9,000 y BPE, are found in Greece (19, 36, 37).
The general features of material culture of the Greek Neolithic (14, 19, 36) and the genetic features of the preserved crops and associated weeds of the earliest Greek Neolithic sites point to Near Eastern origins (38). How these Near Eastern migrants reached Greece is a matter of speculation. One route of migration was by land from Central to Northeast Anatolia and from there to Southern Balkans through Bosporus, the Dardanelles, and Thrace (14, 15, 39). This migration route is less likely because archaeological evidence (19, 36, 40, 41), including 14C dating (19, 40, 41), suggests that the Neolithic sites in Thrace and Macedonia are younger than those of mainland Greece, an unexpected finding if the Neolithic migrants who colonized Greece arrived there from the north. Other models suggest that waves of the Near Eastern migrants reached Greece by sailing either from the Aegean Anatolian coast (12, 14, 16, 17, 22, 35) or from the Levantine coast (19, 36). Our data support the Anatolian rather than the Levantine route because they consistently show the Aegean islands to be connected to the Near East through Anatolia. Archaeological evidence from Greek, Near Eastern, and Anatolian Neolithic sites suggests that multiple waves of Neolithic migrants reached Greece and Southern Europe. Most likely multiple routes were used in these migrations but, as our data show, the maritime route with island hopping was prominent. Our findings also suggest that to the west of Greece, the Neolithic reached Sicily and Italy by sea, as has been suggested by archaeologists (12, 42). Studies of extant European and Near Eastern populations using multiple autosomal genetic polymorphisms have established the presence of clinal distributions of allelic frequencies (4–10, 43, 44).
These clines in gene frequencies have been attributed to the geographically gradual merging of the gene pools of the Neolithic Near East migrants with the gene pools of the existing Paleolithic population of Europe. The correlation of clinal gene frequencies with the archaeological record of the spread of agriculture in Europe led to the suggestion that it was the migration of Neolithic populations from the Near East that led to the spread of agriculture in Europe (7). The underlying hypothesis is that the development of agriculture triggered marked population growth and produced demographic pressures that resulted in dispersion of the Neolithic populations to new regions (7–9, 11). The rate of dispersion from the Near East to Western Europe has been estimated at approximately 0.6–1 km/y (44). A faster rate of dispersion is expected if maritime routes were used for the colonization of Southern Europe. Indeed, archaeological evidence suggests that farming spread faster in Southern Europe (12, 42, 45), and radiocarbon measurements in Neolithic sites are compatible with very rapid colonization of the west Mediterranean by Neolithic migrants (46, 47). Although the Southeastern Mediterranean islands seem to have acted as a bridge from Anatolia to Southern Europe, the relatively small degree of gene flow between the African and the European coasts shows that the Mediterranean Sea also had a barrier function, as also suggested by studies of mtDNA polymorphisms (48). Thus, the Mediterranean seems to have facilitated the migrations of Neolithic farmers along its Southern European coast, but it mostly acted as an isolating factor between its European and African coasts.

Materials and Methods Samples. We collected a total of 202 samples from nine populations that were genotyped on two different platforms (SI Appendix, Table S1). In our sample collection process from the Greek subpopulations, we extracted DNA from blood samples of individuals who were at least 70 y old and who self-reported that all four grandparents originated from the target population. We expect that because of our sample selection process, our data reflect the genetic structure of the Greek subpopulations four generations before present. We combined our data with four additional datasets to study population structure around the Mediterranean basin as well as Northern Europe. Thus, we produced a dataset of 964 samples from 32 populations, genotyped on 75,194 SNPs. More specifically, we used additional data from (i) the Human Genome Diversity Panel (49), (ii) the HapMap Phase III Project (50), (iii) publicly available data on Northern African populations that were first released by Henn and coworkers (51), and (iv) data from the Kidd Laboratory at Yale University (allele frequencies for these data are available via the ALFRED database) (SI Appendix, Table S2).

PCA and ADMIXTURE. We used our own MATLAB implementation of PCA (52, 53) (see SI Appendix for details). Before running ADMIXTURE, we pruned the SNPs to remove those in high linkage disequilibrium (LD) by using a windowed approach and an r² threshold of 0.8.

Correlation Between PCA and Geographic Coordinates. We estimated the correlation between geographic coordinates (SI Appendix, Table S3) and the top two eigenvectors emerging from PCA. For each population in our sample, we approximated its location of origin either by using information provided to us by the individuals who collected the respective sample or by using a capital city that is relatively close to the population under study.
The correlation between geographic coordinates and the eigenvectors was computed by converting both the geographic coordinates vector and the eigenvectors to z scores, and then computing the Pearson correlation coefficient. A Mantel test was run to estimate statistical significance.

Network Analysis. To better understand the connection between the populations included in our study, we performed a network analysis on the results of PCA and ADMIXTURE. To form the networks, we represented each sample by the top K coefficients returned by PCA or ADMIXTURE, computed the distance of each sample to all other samples, and identified the top few nearest neighbors of each sample, under the additional constraint that these neighbors should not belong to the same population of origin as the sample itself. Once the network was formed, with nodes corresponding to populations and edges corresponding to the connections between populations described above, we visualized it by using the Cytoscape software package (see SI Appendix for details).
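
The nearest-neighbor construction described under Network Analysis can be illustrated with a minimal Python sketch. It is not the authors' implementation (their details are in the SI Appendix). The Euclidean distance, the number of neighbors, the use of neighbor-link counts as edge weights, the function name population_network, and the toy data are all assumptions made here for illustration.

```python
from collections import Counter

import numpy as np


def population_network(coords, labels, n_neighbors=3):
    """Count cross-population nearest-neighbor links between populations.

    coords: (n_samples, K) array holding each sample's top-K PCA or
            ADMIXTURE coefficients.
    labels: population of origin for each sample.
    Returns a Counter mapping an unordered population pair to the number
    of nearest-neighbor links found between the two populations.
    Assumptions: Euclidean distance, link counts used as edge weights.
    """
    coords = np.asarray(coords, dtype=float)
    labels = np.asarray(labels)

    # Pairwise Euclidean distances between all samples.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    edges = Counter()
    for i in range(len(labels)):
        d = dist[i].copy()
        # Constraint from the text: neighbors must not come from the
        # sample's own population (this also excludes the sample itself).
        d[labels == labels[i]] = np.inf
        for j in np.argsort(d)[:n_neighbors]:
            if np.isfinite(d[j]):
                pair = tuple(sorted((str(labels[i]), str(labels[j]))))
                edges[pair] += 1
    return edges


# Toy data (hypothetical): three populations placed along one axis.
toy_coords = [[0.0, 0.0], [0.1, 0.0],   # population A
              [1.0, 0.0], [1.1, 0.0],   # population B
              [2.5, 0.0], [2.6, 0.0]]   # population C
toy_labels = ["A", "A", "B", "B", "C", "C"]
toy_edges = population_network(toy_coords, toy_labels, n_neighbors=1)
```

In the toy data, populations A and C are linked only through B, which is the kind of stepping-stone pattern a network of this type is meant to expose. The resulting edge list could then be exported to a visualization tool such as Cytoscape.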

Acknowledgments We thank F. Sakellaridi, E. Papadaki, I. Adamopoulos, K. Farmaki, M. Tsironis, A. Mariolis, and P. Kaloyannidis for their assistance during the field study; A. Papadopoulou, N. Psatha, and N. Zogas for technical assistance; and N. Patterson and I. Lazaridis for helpful discussions. This work was partially supported by National Institutes of Health grants (to G.S. and K.K.) and National Science Foundation grants (to P.D.) and cofunded by the European Union (European Social Fund) and National Resources under the Operational Programme “Education and Lifelong Learning” Action 4386 - GENOMAP.GR, ARISTEIA II Programme, NSRF 2007-2013 (to P.P.).

Footnotes Author contributions: G.S. designed research; P.P., P.D., E.Y., A.R., K.K., M.M., M.C.R., S.P., A.A., and K.K.K. performed research; E.Y., A.R., K.K., M.M., M.C.R., S.P., and A.A. performed population studies; K.K.K. contributed data; P.P., P.D., F.T., and S.S.P. analyzed data; and P.P., P.D., J.A.S., and G.S. wrote the paper.

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1320811111/-/DCSupplemental.