Near Eastern genomes from Iran

The genetic composition of populations in Europe changed during the Neolithic transition from hunting and gathering to farming. To better understand the origin of modern populations, Broushaki et al. sequenced ancient DNA from four individuals from the Zagros region of present-day Iran, representing the early Neolithic Fertile Crescent. These individuals unexpectedly were not ancestral to early European farmers, and their genetic structures did not contribute significantly to those of present-day Europeans. These data indicate that a parallel Neolithic transition probably resulted from structured farming populations across southwest Asia.

Science, this issue p. 499

Abstract

We sequenced Early Neolithic genomes from the Zagros region of Iran (eastern Fertile Crescent), where some of the earliest evidence for farming is found, and identified a previously uncharacterized population that is not ancestral to the first European farmers and has not contributed substantially to the ancestry of modern Europeans. These people are estimated to have separated from Early Neolithic farmers in Anatolia some 46,000 to 77,000 years ago and show affinities to modern-day Pakistani and Afghan populations, but particularly to Iranian Zoroastrians. We conclude that multiple, genetically differentiated hunter-gatherer populations adopted farming in southwestern Asia, that components of pre-Neolithic population structure were preserved as farming spread into neighboring regions, and that the Zagros region was the cradle of eastward expansion.

The earliest evidence for cultivation and stock-keeping is found in the Neolithic core zone of the Fertile Crescent (1, 2), a region stretching north from the southern Levant through eastern Anatolia and northern Mesopotamia, then east into the Zagros Mountains on the border of modern-day Iran and Iraq (Fig. 1). From there, farming spread into surrounding regions, including Anatolia and, later, Europe, southern Asia, and parts of Arabia and North Africa. Whether the transition to agriculture was a homogeneous process across the core zone, or a mosaic of localized domestications, is unknown. Likewise, the extent to which core zone farming populations were genetically homogeneous, or exhibited structure that may have been preserved as agriculture spread into surrounding regions, is undetermined.

Fig. 1 Map of prehistoric Neolithic and Iron Age Zagros genome locations. Colors indicate isochrones, with numbers giving approximate arrival times of the Neolithic culture (in years BCE).

Ancient DNA (aDNA) studies indicate that early Aegean farmers dating to ~6500 to 6000 BCE are the main ancestors of early European farmers (3, 4), although it is not known if they were predominantly descended from core zone farming populations. We sequenced four Early Neolithic (EN) genomes from Zagros, Iran, including one to 10× mean coverage from a well-preserved male sample from the central Zagros site of Wezmeh Cave [WC1, 7455 to 7082 calibrated years (cal) BCE]. The three other individuals were from Tepe Abdul Hosein and were less well preserved (genome coverage between 0.6 and 1.2×) but are around 10,000 years old, and therefore are among the earliest Neolithic human remains in the world (tables S1 and S3).

Despite the lack of a clear Neolithic context, the radiocarbon-inferred chronological age and palaeodietary data support WC1 being an early farmer (tables S1 to S3 and fig. S7). WC1 bone collagen δ13C and δ15N values are indistinguishable from those of a securely assigned Neolithic individual from Abdul Hosein and consistent with a diet rich in cultivated C3 cereals rather than animal protein. Specifically, collagen from WC1 and Abdul Hosein is 13C depleted compared to that from contemporaneous wild and domestic fauna from this region (5), which consumed C4 plants. Crucially, WC1 and the Abdul Hosein farmers exhibit very similar genomic signatures.

The four EN Zagros genomes form a distinct cluster in the first two dimensions of a principal components analysis (PCA; Fig. 2); they plot closest to modern-day Pakistanis and Afghans and are well separated from European hunter-gatherers (HG) and other Neolithic farmers. In an outgroup f3-test (6, 7) (figs. S17 to S20), all four Neolithic Iranian individuals are genetically more similar to each other than to any other prehistoric genome except a Chalcolithic genome from northwestern Anatolia (see below). Despite 14C dates spanning around 1200 years, these data are consistent with all four genomes being sampled from a single eastern Fertile Crescent EN population.
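The logic of the outgroup f3 comparison above can be illustrated with a minimal sketch. It assumes population allele frequencies are already available and uses the plain estimator, the mean of (o − a)(o − b) across SNPs, without the sample-size correction or standard errors that a full analysis would include. The function name and all frequencies are hypothetical and serve only as illustration.

```python
import numpy as np

def outgroup_f3(freq_out, freq_a, freq_b):
    """Mean of (o - a) * (o - b) across SNPs.

    Larger values indicate more genetic drift shared by A and B
    since their split from the outgroup O.
    """
    o = np.asarray(freq_out, dtype=float)
    a = np.asarray(freq_a, dtype=float)
    b = np.asarray(freq_b, dtype=float)
    return float(np.mean((o - a) * (o - b)))

# Hypothetical allele frequencies at five SNPs (not real data).
outgroup = [0.0, 0.0, 1.0, 0.0, 1.0]
pop_a = [0.8, 0.6, 0.2, 0.0, 1.0]
pop_b = [0.7, 0.5, 0.3, 0.1, 0.9]   # drifts together with pop_a
pop_c = [0.1, 0.0, 0.9, 0.6, 0.4]   # drifts mostly independently of pop_a

f3_ab = outgroup_f3(outgroup, pop_a, pop_b)
f3_ac = outgroup_f3(outgroup, pop_a, pop_c)
print(f3_ab > f3_ac)  # A is more similar to B than to C in this toy example
```

In this toy setting, the pair with the larger statistic is the pair read as genetically more similar, which is how the comparison among the four Neolithic Iranian genomes is interpreted.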

Fig. 2 PCA plot of Zagros, European, and Near and Middle Eastern ancient genomes. Comparison of ancient and modern genomes shows that Neolithic Zagros genomes form a discrete genetic cluster close to modern Pakistani and Afghan genomes but distinct from the genomes of other Neolithic farmers and European hunter-gatherers. See animation S1 for an interactive three-dimensional version of the PCA, including the third principal component.

Examination of runs of homozygosity (ROH) above 500 kb in length in WC1 demonstrated that he shared a similar ROH distribution with European and Aegean Neolithics, as well as modern-day Europeans (Fig. 3, A and B). However, of all ancient samples considered, WC1 displays the lowest total length of short ROH, suggesting that he was descended from a relatively large HG population. In contrast, the ROH distributions of the HG Kotias from Georgia, and Loschbour from Luxembourg, indicate prolonged periods of small ancestral population size (8).
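As a rough illustration of how ROH totals of this kind can be tallied, the sketch below sums segment lengths into short and long classes. It assumes ROH have already been called as (start, end) coordinates in base pairs, keeps only segments above 500 kb, and uses the 1.6-Mb boundary between short and long ROH given in the Fig. 3 legend. The function name and the example segments are hypothetical.

```python
MIN_ROH_BP = 500_000                # only ROH above 500 kb are considered
SHORT_LONG_BOUNDARY_BP = 1_600_000  # short: < 1.6 Mb, long: >= 1.6 Mb

def summarize_roh(segments):
    """Return (total_short_bp, total_long_bp) for (start, end) segments in bp."""
    total_short = 0
    total_long = 0
    for start, end in segments:
        length = end - start
        if length <= MIN_ROH_BP:
            continue  # too short to be counted as ROH here
        if length < SHORT_LONG_BOUNDARY_BP:
            total_short += length
        else:
            total_long += length
    return total_short, total_long

# Hypothetical segments (not real calls).
segments = [
    (0, 400_000),              # below the 500-kb threshold, ignored
    (1_000_000, 1_700_000),    # 0.7 Mb, short
    (5_000_000, 7_000_000),    # 2 Mb, long
    (10_000_000, 20_000_000),  # 10 Mb, long
]
short_bp, long_bp = summarize_roh(segments)
print(short_bp, long_bp)
```

A low short-ROH total combined with a high long-ROH total is the pattern the text attributes to a large ancestral population with recent inbreeding or a recent bottleneck.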

Fig. 3 Level and structure of ancient genomic diversity. (A) Total length of the genome in different ROH classes; shades indicate the range observed among modern samples from different populations, and lines indicate the distributions for ancient samples. (B) The total length of short (<1.6 Mb) versus long (≥1.6 Mb) ROH. (C) Distribution of heterozygosity (θ) inferred in 1-Mb windows along a portion of chromosome 3 showing the longest ROH segment in WC1. Solid lines represent the MLE estimate, shades indicate the 95% confidence intervals, and dashed lines represent the genome-wide median for each sample. (D) Distribution of heterozygosity (θ) estimated in 1-Mb windows across the autosomes for modern and ancient samples. (E) Similarity in the pattern of heterozygosity (θ) along the genome as obtained by a PCA on centered Spearman correlations. Ancient—Bich: Bichon, Upper Palaeolithic forager from Switzerland; KK1: Kotias, Mesolithic forager from Georgia; WC1: Wezmeh Cave, Early Neolithic farmer from Zagros; Mota: 4500-year-old individual from Ethiopia; BR2: Ludas-Varjú-dúló, Late Bronze Age individual from Hungary. Modern—YRI: Yoruban, West Africa; TSI: Tuscan, Italy; PJL: Punjabi, Pakistan; GBR: British.

We also developed a method to estimate heterozygosity (θ) in 1-Mb windows that takes into account postmortem damage and is unbiased even at low coverage (9) (Fig. 3, C and D). The mean θ in WC1 was higher than in HG individuals (Bichon and Kotias), similar to that in Bronze Age individuals from Hungary and modern Europeans, and lower than in ancient (10) and modern Africans. Multidimensional scaling on a matrix of centered Spearman correlations of local θ across the whole genome again puts WC1 closer to modern populations than to ancient foragers, indicating that both the mean and distribution of diversity over the genome are more similar to those of modern populations (Fig. 3E). However, WC1 does have an excess of long ROH segments (>1.6 Mb), relative to Aegean and European Neolithics (Fig. 3B). This includes several very long (7 to 16 Mb) ROH segments (Fig. 3A), confirmed by low θ estimates in those regions (Fig. 3C). These regions do not show reduced coverage in WC1 nor a reduction in diversity in other samples, with the exception of the longest such segment where we find reduced diversity in modern and HG individuals, although less extended than in WC1 (7) (Fig. 3B). This observed excess of long segments of reduced heterozygosity could be the result of cultural practices such as consanguinity and endogamy, or demographic constraints such as a recent or ongoing bottleneck (11).
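One possible reading of the scaling step above is sketched below: compute Spearman correlations between samples from their windowed heterozygosity values, double-center the correlation matrix, and take the leading eigenvectors. This is classical multidimensional scaling under the assumed distance d² = 2(1 − r); it is not necessarily the exact procedure used in the study. The function names and the simulated values are hypothetical, and the rank step assumes no tied values.

```python
import numpy as np

def spearman_matrix(theta):
    """theta: samples x windows array. Returns the sample-by-sample
    Spearman correlation matrix (ranks via double argsort, no tie handling)."""
    ranks = np.argsort(np.argsort(theta, axis=1), axis=1).astype(float)
    return np.corrcoef(ranks)

def principal_coordinates(corr, n_components=2):
    """Double-center the matrix and project onto the leading eigenvectors."""
    n = corr.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    centered = centering @ corr @ centering
    eigvals, eigvecs = np.linalg.eigh(centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Clip tiny negative eigenvalues caused by floating-point error.
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0, None))

# Simulated per-window values for four samples (not real data):
# samples 0 and 1 share a diversity landscape, samples 2 and 3 do not.
rng = np.random.default_rng(0)
shared = rng.gamma(shape=2.0, scale=1.0, size=200)
theta = np.vstack([
    shared + rng.normal(0, 0.1, 200),
    shared + rng.normal(0, 0.1, 200),
    rng.gamma(2.0, 1.0, 200),
    rng.gamma(2.0, 1.0, 200),
])
corr = spearman_matrix(theta)
coords = principal_coordinates(corr)
print(coords.shape)
```

Samples whose diversity rises and falls in the same genomic windows end up close together in the resulting coordinates, regardless of their mean heterozygosity.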

The extent of population genetic structure in Neolithic southwestern Asia has important implications for the origins of farming. High levels of structuring would be expected under a scenario of localized independent domestication processes by distinct populations, whereas low structure would be more consistent with a single population origin of farming or a diffuse homogeneous domestication process, perhaps involving high rates of gene flow across the entire Neolithic core zone. The ancient Zagros individuals show stronger affinities to Caucasus HGs (table S17.1), whereas Neolithic Aegeans showed closer affinities to other European HGs (tables S17.2 and S17.3). Formal tests of admixture of the form f3(Neo_Iranian, HG; Anatolia_Neolithic) were all positive with Z-scores above 15.78 (table S17.6), indicating that Neolithic northwestern Anatolians did not descend from a population formed by the mixing of Zagros Neolithics and known HG groups. These results suggest that Neolithic populations from northwestern Anatolia and the Zagros descended from distinct ancestral populations. Furthermore, although the Caucasus HGs are genetically closest to EN Zagros individuals, they also share unique ancestry with eastern, western, and Scandinavian European HGs (table S16.1), indicating that they are not the direct ancestors of Zagros Neolithics.
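The reasoning behind the admixture f3 test can be sketched as follows, assuming allele frequencies are known without sampling error. A significantly negative f3(target; A, B) indicates that the target is a mixture of populations related to A and B, whereas a positive value, as reported above, gives no such evidence. The Z-score here comes from a simple delete-one block jackknife over contiguous SNP blocks. The function name, block count, and simulated frequencies are hypothetical, and the sample-size correction used in practice is omitted.

```python
import numpy as np

def f3_with_z(freq_target, freq_a, freq_b, n_blocks=10):
    """Return (f3, Z) for f3(target; A, B) with a block-jackknife Z-score."""
    c, a, b = (np.asarray(x, dtype=float) for x in (freq_target, freq_a, freq_b))
    per_snp = (c - a) * (c - b)
    estimate = per_snp.mean()
    blocks = np.array_split(per_snp, n_blocks)
    total, count = per_snp.sum(), per_snp.size
    # Leave-one-block-out estimates of f3.
    loo = np.array([(total - blk.sum()) / (count - blk.size) for blk in blocks])
    se = np.sqrt((n_blocks - 1) / n_blocks * np.sum((loo - loo.mean()) ** 2))
    return float(estimate), float(estimate / se)

# Simulated frequencies (not real data).
rng = np.random.default_rng(1)
n_snps = 5000
ancestral = rng.uniform(0.1, 0.9, n_snps)
pop_a = np.clip(ancestral + rng.normal(0, 0.05, n_snps), 0, 1)
pop_b = np.clip(ancestral + rng.normal(0, 0.05, n_snps), 0, 1)
mixed = 0.5 * pop_a + 0.5 * pop_b                                   # 50:50 mixture of A and B
drifted = np.clip(ancestral + rng.normal(0, 0.15, n_snps), 0, 1)    # own drift, no mixture

f3_mixed, z_mixed = f3_with_z(mixed, pop_a, pop_b)
f3_drifted, z_drifted = f3_with_z(drifted, pop_a, pop_b)
print(f3_mixed < 0, f3_drifted > 0)
```

The mixed target yields a negative statistic because its frequencies lie between those of A and B at every SNP, while the independently drifted target yields a positive one.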

The significant differences between ancient Iranians, Anatolian/European farmers, and European HGs suggest a pre-Neolithic separation. Assuming a mutation rate of 5 × 10⁻¹⁰ per site per year (12), the inferred mean split time for Anatolian/European farmers (as represented by Bar8, 4) and European HGs (Loschbour) ranged from 33,000 to 39,000 years ago [combined 95% confidence interval (CI) 15,000 to 61,000 years ago], whereas the preceding divergence of the ancestors of Neolithic Iranians (WC1) occurred 46,000 to 77,000 years ago (combined 95% CI 38,000 to 104,000 years ago) (13) (fig. S48 and tables S34 and S35). Furthermore, the European HGs were inferred to have an effective population size (Ne) that was ~10 to 20% of either Neolithic farming group, consistent with the ROH and θ analyses.
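The dependence of these dates on the assumed mutation rate can be made explicit with a small sketch. It assumes that a split time is first expressed in mutation units (expected mutations per site, here called tau) and then divided by the per-year rate. That parameterization is an assumption for illustration and may differ from the inference model actually used. The tau value below is back-calculated from the reported lower bound of 46,000 years and is not taken from the study.

```python
MUTATION_RATE_PER_SITE_PER_YEAR = 5e-10  # rate assumed in the text

def split_time_years(tau_per_site, mutation_rate=MUTATION_RATE_PER_SITE_PER_YEAR):
    """Convert a split time in expected mutations per site into years."""
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive")
    return tau_per_site / mutation_rate

# Hypothetical tau, back-calculated from the reported 46,000-year bound.
tau_lower = 46_000 * MUTATION_RATE_PER_SITE_PER_YEAR

years_default = split_time_years(tau_lower)
years_faster = split_time_years(tau_lower, mutation_rate=1e-9)
print(round(years_default), round(years_faster))
```

Doubling the assumed mutation rate halves every date, so the reported split times and their confidence intervals are conditional on the rate of 5 × 10⁻¹⁰ per site per year.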

Levels of inferred Neanderthal ancestry in WC1 are low (fig. S22 and table S21), but fall within the general trend described recently in Fu et al. (14). Fu et al. (14) also inferred a basal Eurasian ancestry component in the Caucasus HG sample Satsurblia when examined within the context of a “base model” for various ancient Eurasian genomes dated from ~45,000 to 7,000 years ago. We examined this base model using ADMIXTUREGRAPH (6) and inferred almost twice as much basal Eurasian ancestry for WC1 as for Satsurblia (62 versus 32%) (fig. S52), with the remaining ancestry derived from a population most similar to ancient north Eurasians such as Mal'ta1 (15). Thus, Neolithic Iranians appear to derive predominantly from the earliest known Eurasian population branching event (7).

“Chromosome painting” and an analysis of recent haplotype sharing using a Bayesian mixture model (7) revealed that, when compared to 160 to 220 modern groups, WC1 shared a high proportion (>95%) of recent ancestry with individuals from the Middle East, Caucasus, and India. We also compared WC1’s haplotype-sharing profile to that of three high-coverage Neolithic genomes from northwestern Anatolia (Bar8; Barcın, Fig. 4), Germany (LBK; Stuttgart), and Hungary (NE1; Polgár-Ferenci-hát). Unlike WC1, these Anatolian and European Neolithics shared ~60 to 100% of recent ancestry with modern groups sampled from southern Europe (figs. S24, S30, and S32 to S37; table S22).

Fig. 4 Modern-day peoples with affinity to WC1. Modern groups with an increasingly higher (respectively lower) inferred proportion of haplotype sharing with the Iranian Neolithic Wezmeh Cave (WC1, 7455 to 7082 cal BCE, blue triangle) compared to the Anatolian Neolithic Barcın genome (Bar8; 6212 to 6030 cal BCE, red triangle) are depicted with an increasingly stronger blue or red color, respectively. Circle sizes illustrate the relative absolute proportion of this difference between WC1 versus Bar8. The key for the modern group labels is provided in table S24.

We also examined recent haplotype sharing between each modern group and ancient Neolithic genomes from Iran (WC1) and Europe (LBK, NE1), HG genomes sampled from Luxembourg (Loschbour) and the Caucasus (KK1; Kotias), a 4500-year-old genome from Ethiopia (Mota), and Ust’-Ishim, a 45,000-year-old genome from Siberia. Modern groups from south, central, and northwestern Europe shared haplotypes predominantly with European Neolithic samples LBK and NE1, and European HGs, whereas modern Near and Middle Eastern, as well as southern Asian samples, had higher sharing with WC1 (figs. S28 and S29). Modern Pakistani, Iranian, Armenian, Tajikistani, Uzbekistani, and Yemeni samples were inferred to share >10% of haplotypes with WC1. This was true even when modern groups from neighboring geographic regions were added as potential ancestry surrogates (figs. S26 and S27 and table S23). Iranian Zoroastrians had the highest inferred sharing with WC1 out of all modern groups (table S23). Consistent with this, outgroup f3 statistics indicate that Iranian Zoroastrians are the most genetically similar to all four Neolithic Iranians, followed by other modern Iranians (Fars), Balochi (southeastern Iran, Pakistan, and Afghanistan), Brahui (Pakistan and Afghanistan), Kalash (Pakistan), and Georgians (figs. S12 to S15). Interestingly, WC1 most likely had brown eyes, relatively dark skin, and black hair, although Neolithic Iranians carried reduced pigmentation-associated alleles in several genes and derived alleles at 7 of the 12 loci showing the strongest signatures of selection in ancient Eurasians (3) (tables S29 to S33). Although there is a strong Neolithic component in these modern south Asian populations, simulation of allele sharing rejected full population continuity under plausible ancestral population sizes, indicating some population turnover in Iran since the Neolithic (7).

Although Early Neolithic samples from eastern and western southwest Asia differ conspicuously, comparisons to genomes from Chalcolithic Anatolia and Iron Age Iran indicate a degree of subsequent homogenization. Kumtepe6, a ~6750-year-old genome from northwestern Anatolia (16), was more similar to Neolithic Iranians than to any other non-Iranian ancient genome (figs. S17 to S20 and table S18.1). Furthermore, our male Iron Age genome (F38; 971 to 832 BCE; sequenced to 1.9×) from Tepe Hasanlu in northwestern Iran shares greatest similarity with Kumtepe6 (fig. S21) even when compared to Neolithic Iranians (table S20). We inferred additional non-Iranian or non-Anatolian ancestry in F38 from sources such as European Neolithics and even post-Neolithic Steppe populations (table S20). Consistent with this, F38 carried an N1a subclade mitochondrial DNA (mtDNA), which is common in early European and northwestern Anatolian farmers (3). In contrast, his Y chromosome belongs to subhaplogroup R1b1a2a2, also found in five Yamnaya individuals (17) and in two individuals from the Poltavka culture (3). These patterns indicate that post-Neolithic homogenization in southwestern Asia involved substantial bidirectional gene flow between the east and west of the region, as well as possible gene flow from the Steppe.

Migration of people associated with the Yamnaya culture has been implicated in the spread of Indo-European languages (17, 18), and some level of Near Eastern ancestry was previously inferred in southern Russian pre-Yamnaya populations (3). However, our analyses suggest that Neolithic Iranians were unlikely to be the main source of Near Eastern ancestry in the Steppe population (table S20) and that this ancestry in pre-Yamnaya populations originated primarily in the west of southwest Asia.

We also inferred shared ancestry between Steppe and Hasanlu Iron Age genomes that was distinct from EN Iranians (table S20) (7). In addition, modern Middle Easterners and South Asians appear to possess mixed ancestry from ancient Iranian and Steppe populations (tables S19 and S20). However, Steppe-related ancestry may also have been acquired indirectly from other sources (7), and it is not clear if this is sufficient to explain the spread of Indo-European languages from a hypothesized Steppe homeland to the region where Indo-Iranian languages are spoken today. Yet, the affinities of Zagros Neolithic individuals to modern populations of Pakistan, Afghanistan, Iran, and India are consistent with a spread of Indo-Iranian languages, or of Dravidian languages (which include Brahui), from the Zagros into southern Asia, in association with farming (19).

The Neolithic transition in southwest Asia involved the appearance of different domestic species, particularly crops, in different parts of the Neolithic core zone, with no single center (20). Early evidence of plant cultivation and goat management between the 10th and the 8th millennium BCE highlights the Zagros as a key region in the Neolithization process (1). Given the evidence of domestic species movement from east to west across southwest Asia (21), it is surprising that EN human genomes from the Zagros are not closely related to those from northwestern Anatolia and Europe. Instead, they represent a previously undescribed Neolithic population. Our data show that the chain of Neolithic migration into Europe does not reach back to the eastern Fertile Crescent, which also raises the question of whether intermediate populations in southeastern and central Anatolia formed part of this expansion. Nevertheless, it seems probable that the Zagros region was the source of an eastern expansion of the southwestern Asian domestic plant and animal economy. Our inferred persistence of ancient Zagros genetic components in modern-day south Asians lends weight to a strong demic component to this expansion.