Authentication

This study validated its results through the application of fifteen criteria. The DNA was extracted from tooth powders and DNA libraries were prepared in two independent laboratories in different locations at the University of Washington by different personnel. All DNA extractions and amplification preparations were carried out in physically isolated work areas in flow hoods exclusively dedicated to the study of ancient DNA. The extraction of DNA from teeth for next-generation genome sequencing was performed at the University of Washington in a separate building and by different personnel from the HVS-1 analysis. The samples were processed in a newly built laboratory facility that was restricted for use solely for the Minoan tooth materials. The DNA extracts were maintained in a dedicated, bleach-treated freezer in a separate wing of the building from the PCR or the Illumina machines. Multiple blank extractions were processed in parallel and negative controls were included in all reactions. Positive controls were excluded from extractions and amplifications to avoid the introduction of modern competitor DNA. The DNA samples were tested for appropriate molecular behaviour27. HVS-1 results were confirmed on a second tooth from the same individual. Small overlapping targets were amplified. PCR products were cloned to determine the ratio of endogenous-to-exogenous sequences. Amino-acid racemization28 and concentration ratios29 were determined in duplicate on a large subset of the samples. PCR copy number was estimated using real-time PCR methods. Protective surgical clothing and mask were worn during the handling and extraction of materials. Equipment, sand paper and tubes were illuminated with UV for 3 hours before each use. All commercial reagents (Taq Polymerase, primers, water and buffers) were screened for modern DNA before use.

DNA extracts were screened with primers L16055-H16379, using the parameters outlined above to assess appropriate molecular behaviour. None of the ancient DNA samples reported here amplified when screened with the L16055-H16379 primer pair, indicating the absence of intact modern competitor DNA20,27. Biochemical preservation of teeth was determined using amino-acid analysis by MicroAnalytica LLC on 15 of the original 52 samples from Ayios Charalambos that amplified through HVS-1 primers and 39 of 39 samples from Odigitria. Racemization results for aspartic acid ranged from 0.057 to 0.103 (average=0.08) and for alanine from 0.004 to 0.011 (average=0.0076) for individuals from Ayios Charalambos. Concentration ratios were proportional to published and modern reference standards. For Asp/Glu, they ranged from 0.65 to 0.79 (average=0.71), Ser/Glu 0.44–0.47 (average=0.45) and Ala/Glu 1.56–1.71 (average=1.63). Racemic results were consistent with specimens from which ancient DNA has been successfully recovered and indicate that the cave of Ayios Charalambos contains skeletal remains with excellent biomolecular preservation. In comparison, racemization results obtained from specimens from Odigitria suggested poor preservational history. Aspartic acid ratios for Odigitria ranged from 0.092 to 0.226 (average=0.135) and for alanine from 0.007 to 0.043 (average=0.015). Concentration ratios from Odigitria materials were proportional to published and modern reference standards, but showed a greater range compared with the samples from Ayios Charalambos. For Asp/Glu, they ranged from 0.66 to 1.12 (average=0.75), Ser/Glu 0.38–0.48 (average=0.42) and Ala/Glu 1.31–1.86 (average=1.59) Quantification of target molecules was performed on specimens from Ayios Charalambos using primers L16055-H16155 and SYBR Green (Qiagen) dye on a DNA Engine Opticon 2 Real-Time PCR Detection System (MJ Research). All DNA extracts were shown to contain high copy numbers, ranging from 6,250–13,125 copies (average=10,500) per PCR reaction30.

DNA extraction, PCR cloning and sequencing

Teeth were decontaminated by removing the outer layer with sand paper, soaking in 100% bleach for 15 s, rinsing 8 times with DNA-free water and UV treating on all sides for 3 h. They were then pulverized with a Spex CertiPrep 6750 Freezer/Mill for 2 min at a setting of 4. Four-hundred milligram of the resulting powder was decalcified and digested following Krings et al.31, using Ultra reagents (Fluka BioChemika). For the HVS-1 analysis, samples were centrifuged for 1 min at 4,000g and the supernatant removed and extracted with an equal volume of UltraPure phenol, chloroform, isoamyl alcohol (25:24:1) (Invitrogen). Supernatant was concentrated to 100 μl using Microcon MW-30 columns (Millipore). DNA from concentrate was isolated using the MinElute Qiagen PCR Purification Kit32 and eluted with 70 μl of DNA-Free Elution Solution (QBIOgene). Six microlitres of the DNA extract was added to each 25 μl reaction containing HotStart Taq DNA Polymerase (Qiagen) following the manufacturers protocol. Four or five overlapping primer pairs were used to amplify 16055–16379 of the mitochondrial HVS-1 region. Primers followed previous publications31,33 with these noted modifications L16022-H16155 (5′-ATGTGGATTGGGTTTTTATG-3′) or L16055-H16155, L16122-H16223 (5′-CAGTTGATGTGTGATAGTTGAG-3′), L16209 (5′-CCCCATGCTTACAAGCAAG-3′)-H16331, and L16271-H16379. Reactions were cycled in a PTC-150HB PCR MiniCycler (MJ Research) using the parameters: 95 °C for 15 min, 42 cycles of 94 °C for 30 s, 55 °C for 60 s, 72 °C for 60 s and 72 °C for 7 min. PCR products were cloned using the 2.1-TOPO TA Cloning Kit (Invitrogen). Eight to twelve clones per amplicon were sequenced, representing ~80 clones per individual. For sequencing by next-generation Illumina GAII analyzer, DNA was extracted according to the protocol of Rohland and Hofreiter34, and processed for sequencing according to the specifications of the manufacturer. The DNA ends were repaired by a Taq polymerase-based protocol and TruSeq adaptors or bar-coded adaptors (single-end) ligated to synthesize the DNA-sequencing libraries. The Truseq adapter libraries were loaded in a single flowcell, while the bar-coded libraries were pooled in sets of six libraries and loaded in a single flowcell.

Sequence analysis and statistics

Consensus sequences were determined from manually aligned amplicons. Sequences were typed following Richards et al.35, where motifs containing 16304 were typed as haplogroup H rather than F. All analyses were performed treating cytosine deamination-induced artifacts as ambiguous characters (N). The Surfer 9.0 application (Golden Software Inc., Golden, Colorado) applying the Kriging method was used to graphically represent shared lineages on geographic maps.

Comparison data set of extant and ancient populations

For comparison to the Minoan haplotypes, we mined the GenBank sequence database, and compiled a data set of previously published HVS-1 haplotypes from 135 different population samples (total of 14,267 individuals) (Supplementary Table S4). For our analysis, samples were grouped into 71 population groups from modern populations and 11 ancient populations (Supplementary Table S4).

Population distance matrix based on allele frequencies

For each population, we computed the frequencies of the four different possible nucleotides (A,C,G,T) and missing entries for each of the 413 genotyped mtDNA loci of the HVS-1 region. Thus, each population was summarized by a vector of frequencies. To compute the distance between two populations, we ignored loci with >10% missing entries in either population. Then, for each locus, we computed the city-block (L1) distance between the frequency vectors at that locus. (Recall that the L1 distance between two probability distributions is simply the sum of the absolute values of the element-wise differences.) The distance between the two populations is equal to the average of all L1 distances in all retained loci. This distance definition is symmetric, and for populations that have similar allele frequencies in all genotyped loci, this distance will be small. The above computation was run for all pairs of available populations, thus forming a pairwise distance matrix for all populations.

Principal component analysis

PCA was performed on a pairwise population distance matrix, which was computed using the allele frequencies at each genotyped locus. PCA was evaluated on various subsets of the available populations. Towards that end, we applied the singular value decomposition on the aforementioned pairwise distance matrix, to compute its singular vectors and values. The singular values were used to measure the significance of the top two principal components, and nearest neighbours to the Minoan population were computed by projecting each population on the top two singular vectors and then scaling by the corresponding singular values.