The predominantly African origin of all modern human populations is well established, but the route taken out of Africa is still unclear. Two alternative routes, via Egypt and Sinai or across the Bab el Mandeb strait into Arabia, have traditionally been proposed as feasible gateways in light of geographic, paleoclimatic, archaeological, and genetic evidence. Distinguishing among these alternatives has been difficult. We generated 225 whole-genome sequences (225 at 8× depth, of which 8 were increased to 30×; Illumina HiSeq 2000) from six modern Northeast African populations (100 Egyptians and five Ethiopian populations each represented by 25 individuals). West Eurasian components were masked out, and the remaining African haplotypes were compared with a panel of sub-Saharan African and non-African genomes. We showed that masked Northeast African haplotypes overall were more similar to non-African haplotypes and more frequently present outside Africa than were any sets of haplotypes derived from a West African population. Furthermore, the masked Egyptian haplotypes showed these properties more markedly than the masked Ethiopian haplotypes, pointing to Egypt as the more likely gateway in the exodus to the rest of the world. Using five Ethiopian and three Egyptian high-coverage masked genomes and the multiple sequentially Markovian coalescent (MSMC) approach, we estimated the genetic split times of Egyptians and Ethiopians from non-African populations at 55,000 and 65,000 years ago, respectively, whereas that of West Africans was estimated to be 75,000 years ago. Both the haplotype and MSMC analyses thus suggest a predominant northern route out of Africa via Egypt.

Main Text

The routes followed by fully modern humans as they expanded out of Africa 50,000–100,000 years ago into Eurasia have long been a central question of anthropology and have important implications for understanding the evolutionary history of all non-African populations. So far, neither fossil and archaeological nor genetic evidence has been able to distinguish between an exit through Egypt and Sinai (northern route) and one through Ethiopia, the Bab el Mandeb strait, and the Arabian Peninsula (southern route). Genetic evidence has more often been interpreted as favoring a southern route, although the Neandertal admixture present in all non-Africans is more readily explained by a northern route given that Neandertal fossils are currently known from the Levant, but not from the southern part of the Arabian Peninsula. Thus, the available evidence remains inconclusive. Information to discriminate between the northern and southern routes might still be present in Africa within the full genomes of the populations inhabiting modern Egypt and the Horn of Africa, and thus further investigation is warranted. Although this information might not be easy to extract because of the past and recent genetic introgression experienced by these populations, full sequences of Northeast African genomes would provide the best starting point for these and other analyses.

To improve our understanding of the African gene pool that might have been ancestral to the out-of-Africa (OOA) dispersal, we sequenced the genomes of a random sample of 100 Egyptians and 125 individuals from five Ethiopian populations (25 each from Amhara, Oromo, Ethiopian Somali, Wolayta, and Gumuz) to an average depth of 8× by using an Illumina HiSeq 2000, and we analyzed these data within the context of similar data generated by the 1000 Genomes Project. Sample collection, export, and analysis were approved by University College London research ethics committee 0489/002, Ethiopian Ministry of Science and Technology approval no. 310/538/04, and Lebanese American University institutional review board SMPZ121307-2 (see Supplemental Data for additional information). The overall genetic landscape emerging from the sequencing data (Table S1) refines current knowledge of the high diversity in the Ethiopian region. Sequence data avoid the effect of ascertainment bias that one encounters when dealing with SNP arrays from the same populations (Figure S1).

If the northern route was the predominant path followed by the ancestors of the OOA populations, and modern African populations are representative of those at the time of the exit, Egyptians should be genetically more similar to modern non-Africans. Conversely, if the southern route was the main way out of Africa, Ethiopians should be closest to the OOA populations. However, extensive historical and genetic data show that recent gene flow has drastically influenced the genomes of present-day Egyptians and Ethiopians. To minimize the confounding effect of this gene flow back to Africa while testing this hypothesis, we first identified and then masked the recent non-African ancestry in the Ethiopian and Egyptian genomes.

Figure 1. PCA and ADMIXTURE Analysis
PCA (A) and ADMIXTURE analysis (B) of the newly sequenced samples (Egyptian, pink; Amhara, yellow; Oromo and Ethiopian Somali, light orange; Wolayta, red; and Gumuz, blue) and a subset of 1000 Genomes samples (CHB, dark gray; TSI, light gray; ASW [African ancestry in Southwest USA], green; and LWK [Luhya in Webuye, Kenya] and YRI, light green). ADMIXTURE was run with different values of K (K = 5 gave the smallest cross-validation error). The top ADMIXTURE plot shows five ancestral components tentatively describable as West African (green), East African (orange), European (light gray), East Asian (dark gray), and putatively Middle Eastern (pink). The phased and imputed genotypes from the low-coverage sequences were processed with PLINK for the removal of variants with a minor allele frequency < 1% (--maf 0.01 --geno 0.01) and pairwise linkage disequilibrium above 0.1 (--indep-pairwise 50 10 0.1). The pruned dataset was then analyzed by ADMIXTURE with the --cv option for assessing the most plausible value of K and also by PCA. The proportion of the total variance explained by each principal component is reported as a percentage next to each axis label.

Using ADMIXTURE and principal-component analysis (PCA) (Figure 1A), we estimated the average proportion of non-African ancestry in the Egyptians to be 80% and dated the midpoint of the admixture event by using ALDER to around 750 years ago (Table S2), consistent with the Islamic expansion and dates reported previously. The Ethiopian populations showed, as expected, a more variable spectrum of genetic introgression (Figure 1B). Consistent with previous reports, the Amhara and Oromo were shown to have around 50% of their genome derived from non-Africans, the introgressed proportion in the Somali and Wolayta amounted to 30%–40%, and the Gumuz showed negligible amounts of non-African admixture.

The date of the midpoint of these admixture events was 2,500–3,000 years ago (Table S2), although one notable exception was the Oromo, who have shown evidence of multiple admixture events. These conclusions are consistent with previous reports and fit with linguistic records. Furthermore, the distribution of maternal (mtDNA) and paternal (Ychr) lineages revealed sex-biased admixture patterns in Ethiopians (Figure S2), such that there was less male-mediated than female-mediated Middle Eastern backflow. The affinity of the Egyptian African component with the modern East and West African populations (green component in Figure 1B, K = 5) could be due to either a continuity of human presence in the area or recent gene flow from neighboring African regions resulting from demographic processes and slave trade over the last two millennia.
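
As an illustration of the minor-allele-frequency filter mentioned in the Figure 1 caption (PLINK's --maf 0.01), the following is a minimal Python sketch and not the pipeline used in the study. The function names, the toy variant names, and the genotype encoding (alternate-allele counts of 0, 1, or 2 per diploid individual) are assumptions introduced here for illustration only.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency of one biallelic variant.

    `genotypes` holds alternate-allele counts per diploid individual
    (0, 1, or 2); None marks a missing call and is skipped.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(variants, threshold=0.01):
    """Keep variants whose minor allele frequency is at least `threshold`."""
    return {name: g for name, g in variants.items()
            if minor_allele_frequency(g) >= threshold}


# Toy data (hypothetical). A threshold of 0.1 is used here only because
# six individuals cannot resolve a 1% frequency.
variants = {
    "var_common": [0, 1, 2, 1, 0, 1],
    "var_monomorphic": [0, 0, 0, 0, 0, 0],
    "var_high_alt": [2, 2, 2, 2, 2, 1],
}
kept = filter_by_maf(variants, threshold=0.1)
print(sorted(kept))
```

The filter works on the minor allele rather than the alternate allele, so a variant at which the alternate allele is nearly fixed is treated the same way as one at which it is nearly absent.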

Figure 2. Haplotype Sharing between African and Non-African Populations
The 41,141 African haplotypes retrieved from 18,114 LD regions outside Africa were grouped according to the population of discovery (A). The haplotype composition of African and non-African (CHB + TSI) populations (B) showed more Egyptian′ (pink) and Egyptian′|Ethiopian′ (blue)-specific haplotypes in the OOA samples (relative increases from the general African population are provided for each colored section) than did the haplotype composition of the combined African populations. Non-significant (χ²) comparisons are labeled “NS.” Of the haplotypes specific to a single African population, the Egyptian′ haplotypes (pink) showed the highest population frequency outside Africa (C), whereas the Egyptian′|Ethiopian′ haplotypes (blue) were the most frequent of those shared by two African populations (D). Bars not significantly different (tested with χ²) from the Egyptian′ (C) or Ethiopian′|Egyptian′ (D) ones are labeled “NS.” The first bin in (C) and (D) shows the proportion of African haplotypes not present outside Africa.

In order to filter out, through masking, the Eurasian portion identified in this way, we phased the samples by using ShapeIT and processed them with PCAdmix. In the masking process, Europeans (CEU [Utah residents with ancestry from northern and western Europe from the CEPH collection]) were used as a proxy for the non-African component, and the Gumuz (the Ethiopian population showing minimal introgression) were used as a proxy for the African component. Pairwise FST was calculated before and after the masking process (Table S3), highlighting the expected trend of increased distance of the admixed populations from non-Africans when we retained only their African component.

After we excluded the Gumuz themselves from the subsequent analyses, we compared the African components of the masked Ethiopian and Egyptian genomes (hereafter referred to as the Ethiopian′ and Egyptian′ genomes, respectively) with a set of West African (YRI [Yoruba in Ibadan, Nigeria]) and OOA populations spanning Eurasia (East Asian CHB [Han Chinese in Beijing, China], European TSI [Toscani in Italia] and CEU [Figure 2], and South Asian GIH [Gujarati Indians in Houston, Texas] [Figure S6]) in order to look for a signature of the OOA migration. Such a signature was defined as a higher similarity between the Ethiopian′ or Egyptian′ genomes and the non-Africans than between the latter and the YRI. If we assume a stepwise differentiation out of Africa, and if the preferential route followed was the northern one, Egyptian′ samples should share the highest number of haplotypes with the Eurasian samples even after recent events of introgression are controlled for. Conversely, Ethiopian′ samples would show the highest haplotype sharing with the Eurasian samples if the southern route was preferentially followed during the OOA migration.

We restricted this comparison to 18,114 genomic regions (spanning a total length of 7.1 Mb; Figure S5) containing haplotypes shared by Europeans and Asians because these were likely to predate the split between these populations. Given the broad occurrence of these regions outside Africa, we could rule out positive selection as a plausible driver of the observed linkage-disequilibrium (LD) pattern. We identified these regions by calculating LD blocks in a set of 457 non-African samples. We retrieved 41,141 haplotypes at these loci in the Egyptian′, Ethiopian′, or YRI samples (Figure 2A) and used them to estimate the genetic similarity between OOA populations CHB and TSI and each of the three African populations. Of these haplotypes, 85% were present in all three African populations and were discarded as non-informative. The remaining 15% were observed in only one or two African populations. For these haplotypes that could discriminate between the African populations, the combined CHB and TSI samples showed more Egyptian′-specific (1.25-fold, p = 2 × 10⁻⁶) and Ethiopian′- and Egyptian′-specific (hereafter Ethiopian′|Egyptian′-specific) (1.15-fold, p = 9 × 10⁻⁶) haplotypes than did any of the other African haplotype sets (Figure 2B). We further explored the observed enrichment of Egyptian′ haplotypes in the CHB and TSI samples by investigating the frequency of each class of haplotype in the combined CHB and TSI samples, and again, the frequencies of Egyptian′-specific and Egyptian′|Ethiopian′-specific haplotypes were highest (Figures 2C and 2D). The enrichment of Egyptian′ haplotypes in the genetic pool of the CHB and TSI samples points to a northern migration as the greater contributor to populations outside Africa.
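
The grouping of haplotypes by population of discovery, and the fold enrichment of each group outside Africa, can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the procedure used in the study: it counts distinct haplotype IDs rather than weighting by frequency, it omits the χ² test, and all function names, haplotype IDs, and toy sets are hypothetical.

```python
from collections import Counter


def classify_haplotypes(by_population):
    """Label each haplotype by the set of African populations carrying it.

    `by_population` maps a population name to the set of haplotype IDs
    observed in it. Haplotypes seen in every population are labeled
    "all" so they can be discarded as non-informative.
    """
    n_pops = len(by_population)
    carriers = {}
    for pop, haps in by_population.items():
        for hap in haps:
            carriers.setdefault(hap, set()).add(pop)
    return {hap: "all" if len(pops) == n_pops else "|".join(sorted(pops))
            for hap, pops in carriers.items()}


def class_proportions(labels, observed):
    """Proportion of each informative class among haplotypes in `observed`."""
    counts = Counter(labels[h] for h in observed
                     if h in labels and labels[h] != "all")
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()} if total else {}


def fold_enrichment(target, background):
    """Ratio of class proportions (target / background) per class."""
    return {cls: target[cls] / background[cls]
            for cls in target if background.get(cls)}


# Toy data (hypothetical haplotype IDs).
african = {
    "Egyptian": {"h1", "h2", "h3", "h6"},
    "Ethiopian": {"h1", "h3", "h4"},
    "YRI": {"h1", "h5", "h6"},
}
labels = classify_haplotypes(african)
background = class_proportions(labels, set(labels))
outside_africa = class_proportions(labels, {"h1", "h2", "h3"})
enrichment = fold_enrichment(outside_africa, background)
print(enrichment)
```

A value above 1 for a class means that class makes up a larger share of the informative haplotypes outside Africa than within the combined African set, which is the sense in which the text reports 1.25-fold and 1.15-fold increases.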

This finding was robust to a wide range of potential artifacts stemming from uncertainties in the masking process (Figures S3, S4, and S6A; Table S4; note particularly the false-positive rate displayed in column 8) and was replicated in a South Asian population (GIH; Figure S6B). Furthermore, we showed with simulations that the error rate present in the masking process (Table S4) was unlikely to affect our findings (Figures S4 and S6). Even when we added a 10% misclassification error to the Ethiopians, the Egyptians remained the African population showing the highest affinity to non-Africans. We also considered alternative scenarios in which early back-to-Africa migrations were the source of the haplotype sharing between Egyptian′ and non-African samples. However, such confounding backflow would need to have taken place prior to the split between East Asians and Europeans (ca. 40,000 years ago) and, if this genetic component originated from the main OOA founding event, is likely to have been removed by the non-African masking procedure, which was designed for this purpose.

Figure 3. Inferred Split Times between Pairs of High-Coverage Genomes
MSMC-inferred genetic split times of a set of five Ethiopian, three Egyptian, one Maasai, one European (CEU), and one West African (YRI) randomly chosen genomes from Europeans, West Africans, and East Africans (Gumuz). One Egyptian (Egypt1) and one Ethiopian (Wolayta) genome were also analyzed after their non-African component was masked out. The split time between two genomes is defined as the time when the cross-coalescence rate dropped to 50%. Cross-coalescence rates of 75% and 25% are shown by the top and bottom bars, respectively, providing references for the putative beginning and end, respectively, of the population split event. The space covered by each vertical line is therefore intended to provide a “time range” when the population split might have occurred, thus showing the split between populations as a slow rather than an instantaneous phenomenon.

To provide an independent test of our finding, we analyzed three Egyptian and five Ethiopian high-coverage genomes with the multiple sequentially Markovian coalescent (MSMC) approach before and after masking and compared them with a set of publicly available high-coverage genomes. MSMC, an extension of the PSMC method to two or four genomes, estimates the split time between pairs of genomes. Consistent with their admixed nature, the split times of the non-masked Egyptians and the mixed Ethiopians from Europeans (CEU) and West Africans (YRI) were much closer to each other than to the same split times measured in the non-admixed Ethiopian population (Gumuz) (Figure 3; Figure S7). If we consider the genetic split between two populations as a process gradually occurring over thousands of years, two independent splits might show partial overlaps when their midpoints are less than a few thousand years apart.

Keeping in mind this potential confounder, the Ethiopian′ and Egyptian′ genomes showed different patterns. In particular, the Egyptian′ genomes displayed a more recent split from both the West African (21,000 years ago) and the non-African (55,000 years ago) genomes than did the Ethiopian′ genomes (37,000 and 65,000 years ago, respectively). This suggests a higher similarity between non-African and Egyptian′ components than between non-African and Ethiopian′ components, which is consistent with Egypt having been the last stop on the way out of Africa. Such split dates also hint at a recent interaction between Egyptians and West Africans (Figure 3).
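
The split-time definition used in Figure 3 (the time at which the cross-coalescence rate reaches 50%, bracketed by the 25% and 75% levels) can be illustrated with a short Python sketch. This is a simplification under stated assumptions and not MSMC itself: it treats the curve as a list of (time, rate) points and interpolates linearly between them. The function name and the toy curve are hypothetical and do not reproduce any value reported here.

```python
def crossing_time(times, rates, level):
    """Return the time at which `rates` first reaches `level`.

    `times` are in years ago, sorted from recent to ancient; `rates` are
    relative cross-coalescence rates, expected to rise from near 0
    (recent, populations separated) toward 1 (ancient, populations
    merged). Values between points are linearly interpolated.
    Returns None if the curve never reaches `level`.
    """
    if not times or len(times) != len(rates):
        raise ValueError("times and rates must be non-empty and equal length")
    if rates[0] >= level:
        return times[0]
    for i in range(1, len(times)):
        r0, r1 = rates[i - 1], rates[i]
        if r0 < level <= r1:
            frac = (level - r0) / (r1 - r0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None


# Toy curve (hypothetical values).
times = [10_000, 20_000, 40_000, 80_000, 160_000]
rates = [0.1, 0.3, 0.6, 0.9, 1.0]

split_midpoint = crossing_time(times, rates, 0.50)
range_recent = crossing_time(times, rates, 0.25)
range_ancient = crossing_time(times, rates, 0.75)
print(range_recent, split_midpoint, range_ancient)
```

Reporting the 25% and 75% crossings alongside the midpoint reflects the caption's point that a split is a gradual process, so two splits whose ranges overlap cannot be ordered confidently from their midpoints alone.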