Tracing our ancestors in cave sediments

Analysis of DNA from archaic hominids has illuminated human evolution. However, sites where thousand-year-old bones and other remains can be found are relatively rare. Slon et al. wanted to exploit any trace remains that our ancestors left behind. They looked for ancient DNA of hominids and other mammals in cave sediments, even those lacking skeletal remains. They identified mitochondrial DNA from Neandertal and Denisovan individuals in cave sediments at multiple sites.

Science, this issue p. 605

Abstract

Although a rich record of Pleistocene human-associated archaeological assemblages exists, the scarcity of hominin fossils often impedes the understanding of which hominins occupied a site. Using targeted enrichment of mitochondrial DNA, we show that cave sediments represent a rich source of ancient mammalian DNA that often includes traces of hominin DNA, even at sites and in layers where no hominin remains have been discovered. By automation-assisted screening of numerous sediment samples, we detected Neandertal DNA in eight archaeological layers from four caves in Eurasia. In Denisova Cave, we retrieved Denisovan DNA in a Middle Pleistocene layer near the bottom of the stratigraphy. Our work opens the possibility of detecting the presence of hominin groups at sites and in areas where no skeletal remains are found.

DNA recovered from ancient hominin remains enriches our understanding of human evolution and dispersal [e.g., (1)] and has, for example, resulted in the discovery of the Denisovans, a previously unknown group of archaic hominins in Asia who were distantly related to Neandertals (2–4). However, hominin fossils are rare. We therefore decided to investigate whether hominin DNA may survive in sediments at archaeological sites in the absence of macroscopically visible skeletal remains.

Mineral and organic components in sediments can bind DNA [e.g., (5–8)] (figs. S1 to S3), and the amplification of short stretches of mitochondrial (mt) or chloroplast DNA from sediments by polymerase chain reaction (PCR) has been used to demonstrate the past presence of animals and plants at several sites [e.g., (9–14)]. More recently, DNA extracted from sediments has been converted to DNA libraries, from which DNA fragments were sequenced directly (“shotgun” sequencing) (15, 16). This approach is preferable to PCR, as it allows the entire sequence of DNA fragments to be determined. This is important, as it makes it possible to detect cytosine (C) to thymine (T) substitutions near the ends of DNA fragments, which are caused by the deamination of cytosine bases (17) and indicate that the DNA is of ancient origin (18–20). However, the abundance of bacterial DNA in sediments and the difficulty in assigning short nuclear DNA sequences to mammalian taxa limit the utility of shotgun sequencing for analyzing DNA from sediments.
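The damage signal described above can be illustrated with a minimal Python sketch. It assumes that each fragment has already been aligned to a reference without gaps and is supplied as a (reference, read) pair of equal-length strings. The function name `terminal_c_to_t_rates` and the toy sequences are hypothetical and are not taken from the pipeline used in this study.

```python
def terminal_c_to_t_rates(alignments, n_positions=3):
    """Frequency of C->T mismatches at the first and last n_positions of fragments.

    alignments: list of (reference, read) strings of equal length (gap-free).
    Returns two lists (5' end, 3' end); an entry is None when no reference C
    was observed at that distance from the end.
    """
    five = [[0, 0] for _ in range(n_positions)]   # [C->T count, reference C count]
    three = [[0, 0] for _ in range(n_positions)]
    for ref, read in alignments:
        if len(ref) != len(read):
            raise ValueError("reference and read must have the same length")
        for i in range(min(n_positions, len(ref))):
            # Tally the i-th base from the 5' end and the i-th base from the 3' end.
            for tally, idx in ((five, i), (three, len(ref) - 1 - i)):
                if ref[idx] == "C":
                    tally[i][1] += 1
                    if read[idx] == "T":
                        tally[i][0] += 1

    def rate(pair):
        return pair[0] / pair[1] if pair[1] else None

    return [rate(p) for p in five], [rate(p) for p in three]


# Toy alignments (invented for illustration only).
toy = [
    ("CATGGC", "TATGGT"),  # C->T at both ends
    ("CCTGAC", "CCTGAC"),  # no substitutions
    ("CGTTAC", "TGTTAC"),  # C->T at the 5' end only
    ("GATCCA", "GATCCA"),  # no terminal C in the reference
]
five_prime, three_prime = terminal_c_to_t_rates(toy)
print(five_prime, three_prime)
```

In real data, a C to T frequency that is elevated at the outermost positions relative to the fragment interior is what would be read as a damage signal.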

Isolating DNA from Pleistocene cave sediments

To investigate whether ancient mammalian DNA, especially of archaic humans, may be preserved in Pleistocene cave sediments, we collected 85 samples from seven archaeological sites with known hominin occupation, varying in age between ~14 thousand years ago (ka) and ≥550 ka (data file S1) (8). Some samples were collected specifically for the purpose of this study: 4 from Les Cottés (France), 5 from Trou Al’Wesse (Belgium), 1 from El Sidrón (Spain), 1 from Vindija Cave (Croatia), 3 from Denisova Cave (Russia), and 13 from Caune de l’Arago (France). The other samples, 49 from Denisova Cave and 9 from Chagyrskaya Cave (Russia), had been collected previously for luminescence dating. The latter two sites are located in the Altai Mountains, where remains of both Neandertals and Denisovans have been uncovered (3, 21). We extracted DNA from between 38 and 160 mg of each sample and converted aliquots of the DNA to single-stranded DNA libraries (8, 22, 23). All libraries were shotgun sequenced and analyzed with a taxonomic-binning approach (8). Whereas most of the DNA sequences (79.1 to 96.1%) remained unidentified, most of those that could be identified were assigned to microorganisms and between 0.05 and 10% to mammals (figs. S7 to S15).

Enrichment of mammalian mtDNA

To determine the taxonomic composition of the mammalian DNA in the sediments, we isolated DNA fragments bearing similarities to mammalian mtDNAs by hybridization capture using probes for 242 mitochondrial genomes, including human mtDNA (8, 24). MtDNA is useful for this purpose because it is present in higher copy numbers than nuclear DNA in most eukaryotic cells and is phylogenetically informative despite its small size because of its fast rate of evolution in mammals. Between 3535 and 3.2 million DNA fragments were sequenced per library (data file S2), of which between 14 and 50,114 could be assigned to mammalian families with a strategy for taxonomic identification of short and damaged DNA fragments (8) (fig. S18). To assess whether the sequences were of ancient origin, we evaluated them for the presence of C to T substitutions at their 5′ and 3′ ends (17, 18) (see fig. S19 for an example). Additionally, we computed the variance of coverage across the mitochondrial genome for each taxon to test whether sequences mapped randomly across the reference genome (fig. S20), as would be expected for sequences that are genuinely derived from the taxon to which they are assigned. With the exception of 46 sequences from a single sample from Les Cottés, which were originally attributed to procaviids but mapped only to one restricted region of the genome (fig. S21), this analysis lent support to the correct taxonomic classification of the sequences we obtained. Of the 52 sediment samples from the Late Pleistocene, 47 contained mtDNA fragments from at least one family showing evidence of ancient DNA-like damage, whereas 14 out of 33 Middle Pleistocene samples did so (Fig. 1 and fig. S22). Overall, we detected ancient mtDNA fragments from 12 mammalian families, of which the most common were hyaenids, bovids, equids, cervids, and canids (data file S3 and figs. S23 to S32). These taxa are all present in the zooarchaeological records of the sites, as reconstructed from faunal remains (fig. S33).

Fig. 1 Ancient taxa detected in Late Pleistocene (LP) and Middle Pleistocene (MP) sediment samples from seven sites. For each time period, the fraction of samples containing DNA fragments that could be assigned to a mammalian family and authenticated to be of ancient origin is indicated. The shaded symbols representing each family are not to scale.

We exploited the known genetic variation within these families to determine the affinity of the sequences we obtained to specific species (8) (data file S3). In all libraries containing elephantid DNA, the majority (71 to 100%) of sequences matched variants found in the mtDNAs of woolly mammoths, a species that became extinct in Eurasia during the Holocene (25), but not in other elephantids. Likewise, sequences attributed to rhinocerotids most often carried variants specific to the woolly rhinoceros branch (54 to 100% support), thought to have become extinct at the end of the Late Pleistocene (25), and showed little support (0 to 6%) for other rhinoceros lineages. In ~70% of libraries containing hyaenid mtDNA, the sequences matched variants of the extinct cave hyena and/or the spotted hyena, which exists today only in Africa (26). Lastly, 90% of ursid mtDNA sequences retrieved from Vindija Cave carried variants matching Ursus ingressus, an Eastern European cave bear lineage that became extinct ~25 ka (27, 28).

Extraction and DNA library preparation negative controls contained between 32 and 359 mammalian mtDNA sequences. These sequences do not exhibit damage patterns typical of ancient DNA, and they originate from common contaminants (24, 29–31), predominantly human DNA, as well as DNA of bovids, canids, and suids (fig. S34).
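The lineage support values reported above (for example, the percentage of sequences matching woolly mammoth variants) can be pictured as a simple tally over lineage-specific positions. The sketch below is an illustration under stated assumptions, not the procedure of (8): the function `branch_support`, the input layout, and all positions and bases in the toy data are invented.

```python
def branch_support(observations, diagnostic):
    """Fraction of observed bases that match each lineage's specific variants.

    observations: list of (position, base) pairs taken from aligned fragments.
    diagnostic: dict mapping lineage name -> {position: lineage-specific base}.
    Returns {lineage: (matching, overlapping, fraction)}; fraction is None
    when no observation overlaps a diagnostic position for that lineage.
    """
    result = {}
    for lineage, sites in diagnostic.items():
        overlapping = [(pos, base) for pos, base in observations if pos in sites]
        matching = sum(1 for pos, base in overlapping if base == sites[pos])
        fraction = matching / len(overlapping) if overlapping else None
        result[lineage] = (matching, len(overlapping), fraction)
    return result


# Hypothetical diagnostic positions and observed bases.
toy_diagnostic = {
    "woolly_mammoth": {100: "T", 250: "G"},
    "other_elephantid": {400: "A"},
}
toy_observations = [
    (100, "T"), (100, "T"), (250, "G"), (250, "A"), (400, "G"), (999, "C"),
]
support = branch_support(toy_observations, toy_diagnostic)
print(support)
```

Each observation is counted once per overlapping diagnostic position, so a single fragment covering several such positions would contribute several observations in this simplified layout.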

Targeting hominin DNA

Among the samples analyzed, the only site that yielded sequences from putatively deaminated DNA fragments that could be assigned to hominids (or hominins, assuming that no other great apes were present at the sites analyzed here) was El Sidrón. This site differs from the others in that no ancient faunal DNA was identified there (Fig. 1), consistent with the almost complete absence of animal remains at the site (32). To test whether animal mtDNA was too abundant at other sites to detect small traces of hominin mtDNA, we repeated the hybridization capture for all DNA libraries using probes targeting exclusively human mtDNA (8). Between 4915 and 2.8 million DNA fragments were sequenced per library, out of which between 0 and 8822 were unique hominin sequences that passed our filtering scheme (8). Between 10 and 165 hominin mtDNA sequences showing substitutions typical of ancient DNA were obtained from 15 sediment samples from four sites (data file S4). To generate sufficient data for phylogenetic analyses, we prepared DNA extracts from additional subsamples of 10 of these samples and used automated liquid handling to generate 102 DNA libraries from these as well as the original extracts (data file S1 and fig. S22). After enriching for human mtDNA and merging all sequences from a given sediment sample, nine samples yielded a sufficient number of deaminated hominin mtDNA fragments (between 168 and 13,207) for further analyses (data file S4).

Identifying Neandertal and Denisovan mtDNA

We identified “diagnostic” positions in the mtDNA genome that are inferred to have changed on each branch of a phylogenetic tree relating modern humans, Neandertals, Denisovans, and a ~430,000-year-old hominin from Sima de los Huesos (8, 33). For eight sediment samples from El Sidrón, Trou Al’Wesse, Chagyrskaya Cave, and Denisova Cave, the Neandertal state is shared by 87 to 98% of sequences overlapping positions diagnostic for Neandertal mtDNA, whereas the modern human, Denisovan, and Sima de los Huesos branches are supported by 4 to 11%, 0 to 2%, and 0 to 2% of sequences, respectively. In the ninth sample, collected in layer 15 of the East Gallery in Denisova Cave, 84% (16/19) of sequences carry Denisovan-specific variants, compared to 0% (0/10), 5% (1/19), and 0% (0/23) for the modern human, Neandertal, and Sima de los Huesos variants, respectively, pointing to a Denisovan origin for these mtDNA fragments (data file S4 and fig. S40). Notably, none of the hominin sequences present in the extraction or library preparation negative controls carry variants specific to the Neandertal, Denisovan, or Sima de los Huesos branches (data file S4).

The average sequence coverage of the mitochondrial genome varied between 0.4- and 44-fold among the nine samples. To be able to reconstruct phylogenetic trees using these sequences, we called a consensus base at positions covered by at least two deaminated fragments and required more than two-thirds of fragments to carry an identical base (34). These relatively permissive parameters were chosen to avoid discarding samples that produced very small numbers of hominin sequences and allowed us to reconstruct between 8 and 99% of the mtDNA genome (table S3). Phylogenetic trees relating each of the reconstructed mtDNA genomes to those of modern and ancient individuals (8) (table S5) show that they all fall within, or close to, the genetic variation of known mtDNA genomes of Neandertals or Denisovans (Fig. 2 and figs. S41 to S49).

Fig. 2 Cladogram relating mtDNA genomes reconstructed from sediment samples to those of modern and ancient individuals. The branches leading to mtDNA genomes reconstructed from sediments (dashed lines) were superimposed on a neighbor-joining tree relating the previously determined mtDNA genomes of ancient and present-day humans (purple), Neandertals (orange), the Sima de los Huesos hominin (blue), and Denisovans (green) (table S5). Discrete phylogenetic trees relating each of the mtDNAs reconstructed here and the comparative data are shown in figs. S41 to S49.
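The consensus rule stated above (at least two deaminated fragments, with more than two-thirds carrying the same base) can be written out as a short sketch. The pileup layout, the function name `call_consensus`, and the use of "N" for uncalled positions are assumptions made for illustration; they are not taken from (34).

```python
from collections import Counter


def call_consensus(pileup, min_depth=2):
    """Call a consensus base per position from deaminated fragments.

    pileup: dict mapping position -> list of bases observed at that position.
    A base is called only if depth >= min_depth and strictly more than
    two-thirds of the fragments agree; otherwise "N" is returned.
    """
    consensus = {}
    for pos, bases in pileup.items():
        if len(bases) < min_depth:
            consensus[pos] = "N"
            continue
        base, count = Counter(bases).most_common(1)[0]
        # Integer form of count / depth > 2/3, which avoids float comparison.
        consensus[pos] = base if count * 3 > len(bases) * 2 else "N"
    return consensus


# Toy pileup (invented values).
toy_pileup = {
    1: ["A", "A"],            # 2/2 agree -> called
    2: ["A"],                 # depth 1 -> not called
    3: ["C", "C", "T"],       # exactly 2/3 agree -> not called
    4: ["G", "G", "G", "T"],  # 3/4 agree -> called
    5: ["A", "T"],            # 1/2 agree -> not called
}
calls = call_consensus(toy_pileup)
covered = sum(1 for b in calls.values() if b != "N") / len(calls)
print(calls, covered)
```

The fraction of positions with a called base corresponds, in spirit, to the proportion of the mtDNA genome that could be reconstructed for a given sample.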

Single versus multiple sources of hominin mtDNA

We next aimed to assess whether mtDNA fragments from more than one individual are present in a given sediment sample. For this purpose, we identified positions in the mitochondrial genome that are covered by at least 10 sequences exhibiting evidence of deamination. Three samples had sufficient data for this analysis (fig. S50). At each of these positions, nearly all sequences from a sample collected in the Main Gallery of Denisova Cave carry the same base, suggesting that the DNA may derive from a single individual. In contrast, sequences from the El Sidrón sample support two different bases at a single position, as is the case for a second sample from Denisova Cave. Thus, at least two mtDNA genomes seem to be present in both these samples (fig. S51). That the variable position in the latter sample is a known variant among Neandertal mtDNAs supports the conclusion that the sample contains DNA from more than one Neandertal (table S7). We then developed a maximum-likelihood approach to infer the number of mtDNA components in low-coverage data as well (8) (fig. S52), allowing us to investigate this issue in four additional samples from two sites. We detected only one ancient mtDNA type in the sample from Chagyrskaya Cave and in two samples from Denisova Cave, whereas another sample from Denisova Cave contains mtDNA from at least two ancient individuals (table S9).
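The first, coverage-based check described in this section can be sketched as a scan for well-covered positions at which two bases are each supported by several sequences. This is a simplified illustration only and does not reproduce the maximum-likelihood approach of (8). The depth threshold of 10 follows the text, whereas the minimum count for the second base (`min_minor_count=3`) is an arbitrary assumption introduced here to tolerate isolated errors; the function name and data are hypothetical.

```python
from collections import Counter


def variable_positions(pileup, min_depth=10, min_minor_count=3):
    """Flag positions where two different bases are each well supported.

    pileup: dict mapping position -> list of bases from deaminated sequences.
    Positions with fewer than min_depth sequences are ignored. The
    min_minor_count threshold is an assumption of this sketch.
    Returns {position: {base: count}} for the two most frequent bases.
    """
    flagged = {}
    for pos, bases in pileup.items():
        if len(bases) < min_depth:
            continue
        counts = Counter(bases).most_common()
        if len(counts) >= 2 and counts[1][1] >= min_minor_count:
            flagged[pos] = dict(counts[:2])
    return flagged


# Toy pileup (invented values).
toy_sites = {
    10: ["A"] * 12,              # one base only
    20: ["C"] * 7 + ["T"] * 5,   # two well-supported bases
    30: ["G"] * 9 + ["A"] * 1,   # a single discordant sequence
    40: ["C"] * 4 + ["T"] * 4,   # depth below 10, ignored
}
print(variable_positions(toy_sites))
```

Under these assumptions, a sample with no flagged positions would be compatible with a single mtDNA source, whereas one or more flagged positions would suggest at least two.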

DNA yields from sediments

To assess how much DNA can be recovered from sediment compared to skeletal elements, we counted the number of mtDNA fragments retrieved per milligram of bone (2, 21, 35–38) or sediment originating from the same layers at three archaeological sites. The number of hominin mtDNA fragments retrieved from bone ranges from 28 to 9142 per milligram, compared to between 34 and 4490 mammalian mtDNA fragments per milligram of sediment (table S10). Thus, surprisingly large quantities of DNA can survive in cave sediments. Notably, most of the ancient taxa we identified are middle- to large-sized (Fig. 1), consistent with larger animals leaving more of their DNA in sediments. The hominin DNA is present in similar concentrations among subsamples of sediment removed from larger samples (fig. S53). This suggests that, in most cases, the DNA is not concentrated in larger spots but is spread relatively evenly within the sediment, which is compatible with the DNA originating from excreta or the decay of soft tissue (9, 39, 40). One exception is a sample from the Main Gallery of Denisova Cave, from which one subsample contains more than 500 times as many hominin mtDNA fragments as others. As the mtDNA retrieved from it may originate from a single Neandertal (tables S7 and S9), we hypothesize that this is due to an unrecognized small bone or tooth fragment in the subsample. Despite its high content of hominin DNA, the library remains dominated by DNA from other mammals, as only ~7.5% of sequences were attributed to hominins after its enrichment with the mammalian mtDNA probes. Nonetheless, if such microscopic fragments can be identified and isolated, they may represent a source of hominin DNA sufficiently devoid of other mammalian DNA to allow for analyses of the nuclear genome.

DNA movement across layers

Postdepositional mixing of particles or a saturation of the sediments by large amounts of DNA can potentially lead to movements of DNA between layers in a stratigraphy (40–42). At the sites investigated here, the overall consistency between the taxa identified from DNA and the archaeological records (fig. S33) suggests the integrity of the spatial distribution of DNA. In Chagyrskaya Cave, for example, we recovered abundant mammalian mtDNA fragments showing degradation patterns typical of ancient DNA in layers rich in osseous and lithic assemblages, whereas no ancient mammalian DNA was identified in an archaeologically sterile layer underneath (43). Additionally, mtDNA sequences attributed to the woolly mammoth and woolly rhinoceros were identified in Late Pleistocene layers, yet they are absent from the layer that postdates the presumed time of extinction of these taxa (25) (data file S3 and fig. S24). This implies that little or no movement of mtDNA fragments occurred downward or upward in Chagyrskaya Cave. However, as local conditions may affect the extent to which DNA can move in a stratigraphy, these conditions need to be assessed at each archaeological site before the DNA recovered can be linked to a specific layer. This may be best achieved by dense sampling in and around layers of interest.

Conclusions

We show that mtDNA can be efficiently retrieved from many Late and some Middle Pleistocene cave sediments by using hybridization capture (Fig. 1). Encouragingly, this is possible also for samples that were stored at room temperature for several years (8). Sediment samples collected for dating, site-formation analyses, or the reconstruction of ancient environments at sites where excavations are now completed can thus be used for genetic studies.

The mtDNA genomes reconstructed from sediments of four archaeological sites recapitulate a large part of the mitochondrial diversity of Pleistocene hominins hitherto reconstructed from skeletal remains (Fig. 2). The recovery of Neandertal mtDNA from El Sidrón, Chagyrskaya Cave, and layer 11.4 of the East Gallery of Denisova Cave is in agreement with previous findings of Neandertal remains at those sites and in those layers (21, 32, 44). At Trou Al’Wesse, where we find Neandertal mtDNA, no hominin remains have been found in the Pleistocene layers. However, Late Mousterian artifacts and animal bones with cut-marks support the use of the site by Neandertals (45). In Denisova Cave, we detected Neandertal mtDNA in layers with Middle Paleolithic stone tools in the Main Gallery (46), in which no Neandertal remains have been found. In the East Gallery, we identified Denisovan as well as Neandertal mtDNA lower in the stratigraphy than where skeletal remains of archaic humans have been discovered (Fig. 3), indicating the repeated presence of both groups in the region.

Fig. 3 Hominin mtDNAs along the stratigraphy of the East Gallery in Denisova Cave. Layer numbers are noted in gray. The layers of origin for sediment samples and skeletal remains yielding Neandertal (orange) and Denisovan (green) mtDNA genomes are indicated. For details on these and other hominin skeletal remains from other parts of the cave, see (8).

The absence of identifiable ancient DNA in Middle Pleistocene layers in Caune de l’Arago and Chagyrskaya Cave is not surprising given their age (>300 ka). Although hominins constitute a rare taxon at most sites compared to other animals, we were able to detect Neandertal DNA in the sediments of four of the six sites containing Late Pleistocene layers. For the remaining two sites, Vindija Cave and Les Cottés, only one and four samples, respectively, were available for this study, suggesting that extensive sampling is necessary at each site to ensure that hominin DNA is detected if present. Fortunately, the automation of laboratory procedures to generate DNA libraries and isolate DNA by hybridization capture (8) now makes it possible to undertake large-scale studies of DNA in sediments. This is likely to shed light on the genetic affiliations of the occupants of large numbers of archaeological sites where no human remains are found.

Supplementary Materials

www.sciencemag.org/content/356/6338/605/suppl/DC1
Materials and Methods
Figs. S1 to S53
Tables S1 to S10
References (47–159)
Data Files S1 to S4