Laboratory locations

All DNA work involving the modern relatives was carried out at the University of Leicester. Male-line relatives were typed using Promega PowerPlex Y23 and for SNPs defining the main European Y-haplogroups in Leicester with a subset of the typing being confirmed at the Université Paul Sabatier. Female-line relatives were sequenced for the entire mitochondrial genome at the University of Leicester. DNA was extracted from ancient teeth and bone at the University of York and the Université Paul Sabatier, Toulouse. Library preparation and target enrichment were done at the University of York. Single-end 100-bp sequencing using a HiSeq 2000 (Illumina, CA, USA) was performed at the Copenhagen Sequencing Facility. Targeted sequencing of both modern and ancient DNA was also carried out at genomic technical platform PlaGe (Genopole, Toulouse, France) and at the Protein Nucleic Acid Chemistry Laboratory at the University of Leicester. Below we provide a condensed version of the methods used. For each step, full information can be found in the Supplementary Methods. Details surrounding the extensive genealogical research carried out for this project can be found in the Supplementary Note 2.

Sample collection

DNA was extracted from saliva samples of the modern relatives of Richard III and all participants were recruited with informed consent following project review by the University of Leicester Research Ethics Committee.

Skeleton 1 was excavated and samples taken under clean conditions37. Everyone involved in the excavation at the Grey Friars site, the clean laboratory in Leicester and those involved in the laboratories and labwork had their mitochondrial and, for males, Y-chromosomes typed. DNA was extracted from saliva samples and all participants were recruited with informed consent.

DNA extraction of ancient samples

DNA was extracted from teeth and bone (femur) samples. All procedures were performed in dedicated ancient DNA laboratories at the University of York and the Université Paul Sabatier, Toulouse with appropriate contamination precautions in place. Two extraction blanks were included and treated exactly as if they were extracts throughout the whole process. PCRs and library experiments also included further blank controls.

Sex-typing assay performed on Skeleton 1

A newly designed sex-typing assay comprising of PCR primers for co-amplification of the SRY fragment with UTX and UTY homologous regions was used. This assay was designed to enable relatively small sized fragments of SRY, UTX and UTY to be co-amplified from samples likely to contain degraded DNA10.

Mitochondrial control region analysis of Skeleton 1

Analysis of the hypervariable segments (HV1, HV2 and HV3) of the mtDNA control region was carried out by amplifying and directly sequencing multiple overlapping fragments ranging from 153 to 250 bp in size (http://forensic.yonsei.ac.kr/protocol/mtDNA-midi-mini.pdf)22. A selection of amplicons was used for cloning the PCR products in the lab in Toulouse. Sequencing was carried out using the Big-Dye Terminator V3.1 cycle sequencing kit (Applied Biosystems) and by capillary electrophoresis on an ABI Prism 3730 Genetic Analyser (Applied Biosystems) at the Protein Nucleic Acid Chemistry Laboratory at the University of Leicester or at the genomic technical platform PlaGe (Genopole).

Mitochondrial genome/Y-SNP/HIrisplex typing of Skeleton 1

Libraries were built following Meyer and Kircher16, with the exception that the first filtration step between the blunt end repair and the adapter ligation was substituted by heat inactivation of the enzymes38,39. Two microarrays were designed, one for the mtDNA enrichment and another one for nuclear SNP enrichment. DNA enrichment was performed by hybridization capture using the Agilent 244k DNA SureSelect microarray (Agilent, Böblingen, Germany). For the nuclear capture, Y-chromosome probes were designed to cover the SNPs relevant to the major European lineages14. Further probes were designed to cover the SNPs relevant to the HIrisplex31 markers. These two sets of probes (mitochondrial and SNPs) were used separately to fill the two different microarray designs of a 1 × 244K format. For each microarray, the capture protocol was performed following Hodges et al.15 with the modifications proposed by Zhang et al.40 and Fortes and Paijmans38. The libraries were pooled in equimolar quantities and sequenced on two lanes of the Illumina HiSeq 2000 platform in 100 SE mode at the sequencing facility of the University of Copenhagen, Denmark.

The raw reads from each library were sorted based on the six-nucleotide index used during library preparation. Only reads with a 100% match to the index were selected for further analyses. Reads shorter than 25 nucleotides were discarded from further analysis. The trimmed reads were mapped to autosomes and sex chromosomes from the human reference genome build 37 (GRCh37) and to the rCRS (NC_012920.1) using bwa 0.7.5a-r405 (ref. 41). In each alignment, the output bam files were merged using SAMtools 0.1.19 (ref. 41) and PCR duplicates were removed subsequently. The mapped reads were filtered based on a mapping quality >29 and their alignment to unique positions along the reference sequence.

Polymorphic positions were identified using SAMtools (SAMtools 0.1.19) and bcftools. Finally, vcfutils.pl was used to filter the list of variants according to a Phred-scaled genotype posterior probability quality >20 and a read depth higher than 10. To avoid miscalling because of the deamination pattern of ancient DNA molecules, all the polymorphic positions reported in the vcf output file were checked by eye. In the case of the mitochondrial genome, the assembly to the reference was visualized in Tablet42, while the alignment of the reads containing the SNPs to the reference chromosomes was visualized using IGV43.

SNP typing by PCR

The capture approach yielded insufficient coverage for all HIrisPlex and Y-chromosome SNPs and therefore primers were designed to generate amplicons containing these SNPs as well as two SNPs, which further define Y-chromosome haplogroup G: M285 (G1) and P287 (G2) (ref. 14). These were amplified as part of multiplex reactions following Römpler et al.44 or singleplex reactions (using 40 cycles and with no secondary amplification) and sequenced on the Ion Torrent following library preparation using Ion PGM 200 Xpress Template Kit and PGM 200 Sequencing Kit. To increase coverage, singleplex PCR and sequencing of one marker (rs28777) was carried out according to Binladen et al.45

Typing of the haplogroup G defining SNPs (M201, M285 and P287) was repeated in Toulouse using singleplex PCRs. Sequencing of these PCR products was carried out using Big-Dye Terminator V3.1 cycle sequencing kit (Applied Biosystems) analysed by capillary electrophoresis on an ABI Prism 3730 Genetic Analyser (Applied Biosystems) at the genomic technical platform PlaGe (Genopole).

Y-chromosomal haplotype analysis

Ancient and modern samples' Y-chromosomal haplotypes were obtained using the PowerPlex Y23 System (Promega) and analysed by capillary electrophoresis on an ABI Prism 3730 Genetic Analyser (Applied Biosystems) at the genomic technical platform PlaGe (Genopole) and on an ABI Prism 3130xl Genetic Analyser (Applied Biosystems) at the University of Leicester. For Skeleton 1, this was carried out on three separate extracts (RM2, LM1 and LM3) in two different ancient DNA laboratories (York and Toulouse). For the modern relatives, this was carried out on two different extracts in two different modern laboratories (Leicester and Toulouse).

Y-chromosomal SNP analysis of modern samples

Following determination of the Y-haplotype for the modern male-line samples, the predicted haplogroup was determined using Whit Athey’s haplogroup predictor (http://www.hprg.com/hapest5/hapest5a/hapest5.htm?order=num). Binary markers covering these and related lineages were typed in two multiplexes by the SNaPshot minisequencing procedure (Applied Biosystems) and an ABI3130xl Genetic Analyzer (Applied Biosystems) followed by confirmation using Sanger sequencing. Somerset 3 was determined to be Hg I (M170+ M223−, M253−)14 derived, further confirmed by the lab in Toulouse. Somersets 1,2,4 and 5 were determined to be derived for R1b-U152. Somersets 1,2,4 and 5 were tested for SNPs subdividing this clade13 (Z56, M126, Z36, Z192, M160 and L2) using Sanger sequencing in both labs.

Modern mtDNA analysis

Both samples were replicated twice.

Samples were taken using Oragene DNA Collection kits (DNA Genotek) and DNA extracted using two different methods: the Qiacube Blood and Body Fluid protocol (200 μl with 200 μl elution) and the Oragene protocol. To analyse the control region, samples were sequenced twice in both the forward and reverse direction using two overlapping primer sets (15973-296 and 16524-614) using Big-Dye Terminator V 3.1 (Applied Biosystems). No differences were found between replicates or between samples.

Samples were amplified for the complete mitochondrial genome from both extractions following Meyer et al.26 PCR amplicons were sequenced on an Ion Torrent PGM Sequencer on an Ion314 Chip. Libraries were prepared using the Ion Xpress Plus gDNA Fragment Library Preparation kit, while the template preparation and the sequencing were carried out using the Ion PGM 200 Xpress Template Kit and the Ion PGM 200 Sequencing Kit, respectively. Raw reads were mapped back to the rCRS (NC_012920.1) using TMAP software included in the Ion Alignment plugin 3.2.1 (Torrent Suite Software 3.2.1) on the Ion Torrent server. Duplicate reads removal and variant calling were performed using SAMtools 0.1.19 (ref. 41) and local realigning was carried out with the Genome Analysis Tool Kit46. Variant sites were filtered for Base Quality 20, Mapping Quality 50 and Depth of Coverage 30 following which 33 polymorphic sites were retained. All these sites have been manually checked and confirmed by Sanger sequencing in both directions and replicated twice.

Contamination control and quantification

Modern DNA contamination of the ancient remains was controlled for by the following methods:

1 Excavation was carried out under clean conditions (see Supplementary Methods) 2 Samples were stored in clean labs and ancient DNA work carried out only in dedicated ancient DNA facilities. 3 Separate ancient samples were processed in separate labs to replicate results. 4 All lab members and excavation participants had their mtDNA typed and Y-chromosome typing was carried out on all men involved. None had a matching mtDNA or Y-chromosome type.

As evidence against significant contamination, DNA analysis of Skeleton 1 shows a perfect mtDNA match to ML1 and a single-base difference with ML2. It also shows a clear Y-STR haplotype, which has been replicated using a number of extracts generated and tested in two separate labs. Finally, an examination (see Supplementary Methods) of the substitution pattern in our reads also supports this.

Statistical analysis

Taking a conservative approach at each step, we computed a likelihood for each item of observed evidence under each of two opposing hypotheses: Hypothesis 1 (H1): Skeleton 1 is Richard III, and Hypothesis 2 (H2): Skeleton 1 is not Richard III.

As it was reasonable to assume that all the different lines of evidence were independent, the joint likelihood of all the evidence was obtained by multiplication of the individual likelihoods under each hypothesis. The weight of evidence for H1, called the likelihood ratio (LR), was then given by the ratio of the likelihood under H1 to that under H2. We say that an assumption is ‘conservative’ if it reduces the LR.

The LR can be converted into a probability that H1 is true, given a prior probability. We took as starting point the moment that Skeleton 1 was first observed and recognized as a human skeleton, but before any assessments of age, sex, state of health and cause of death were made. At that point, there was substantial evidence that a skeleton found in what is believed to have been the location of Leicester Grey Friars choir could be that of Richard III. All of the information available at the time that Skeleton 1 was unearthed, including its precise location and the nature of the grave, was regarded for this analysis as background information that can inform the prior probability. On the basis of that information, we believe that a sceptical observer could not reasonably have assigned a prior probability less than 1 in 40. This value was proposed in a previous analysis (http://rationalgareth.com/), based on what we judge to be sceptical assessments. The highest probability that could be justified by the prior evidence might be 1 in 2.

We have used relevant, available data where possible. Inevitably subjective judgments are required, for example, the relevant reference populations and about the probabilities of error in reported facts. As far as seemed possible and reasonable, we strived to be conservative in our approach, for example, using a pseudocount method to bias the LR towards a neutral value of 1, thus tending to avoid spurious large values from low observed frequencies. Details of the data and methods used in the statistical analysis of the radiocarbon data, age and sex of skeleton, presence of scoliosis, presence of perimortem wounds, Y-chromosome and mtDNA frequency data can be found in the Supplementary Methods. To summarize the results: the radiocarbon data yielded a likelihood ratio of 1.84 representing limited support for H1. The age and sex data yielded a likelihood ration of 5.25, again representing limited support for H1. The presence of one shoulder higher than the other, reported during Richard’s lifetime, could be attributed to scoliosis (Skeleton 1 had severe idiopathic adolescent-onset scoliosis) or two other known conditions, Erb’s Palsy and Sprengel’s deformity, both of which are very rare. Under H1, the above rates give an estimated probability of 0.90 of observing scoliosis given the description of Richard III’s physical appearance (=the scoliosis rate divided by the sum of the three rates), which we multiplied by 0.95 to allow for the possibility that the recorded description was incorrect. This lead to a LR of 212, providing moderately strong support for H1. The presence of perimortem injuries gave a LR of 42, and so moderate support for H1. The Y-chromosome of Skeleton 1 did not match that of genealogically determined patrilineal relatives of Richard III. This could be explained by a false-paternity event in one or more of the 19 putative father–son links between Richard III and Henry Somerset, fifth Duke of Beaufort. The Y-chromosome results also indicate one further false-paternity event between Henry Somerset and his five contemporary, presumed patrilinear descendants. To be conservative, we selected a published false paternity rate that was (1) lower than any other published rate that we considered17,47 and (2) based on genealogical data18. To this we add the false-paternity event in the 19 putative father–son links between Henry Somerset and five contemporary male Somersets. This gives a probability of at least one false paternity event in the 19 putative father–son links between Richard III and Henry Somerset of 0.16. Given that a false-paternity event must have occurred under H1, the population frequency of Skeleton 1’s Y-haplotype is the same under H1 and H2 and cancels out in the LR calculation. Thus, the LR is 0.16, representing limited evidence against H1.

The mtDNA sequences of Skeleton 1 and the presumed 19-meiosis matrilinear relative of Richard III, Michael Ibsen, matched completely. A 21-meiosis relative also matched except at one base (8994). The latter observation is equally likely under H1 and H2 given the observed sequence of Michael Ibsen, and so cancels out in the LR. Thus, we only need likelihoods for the observation of the sequence shared by Michael Ibsen and Skeleton 1.

To obtain the likelihood under H1, we require the mtDNA mutation rate, and in this case high estimates are conservative. Parsons et al.28 report 10 control region mutations in 327 generations using genealogical data. Because this suggests a higher rate than other published estimates, and is based on genealogical data, we used it to derive a probability of 0.52 for no mutation in 19 meioses.

For the likelihood under H2, we require the population fraction of the Skeleton 1 haplotype. Although we obtained the complete mtDNA genome sequence from Skeleton 1, we identified little published whole-genome comparison data from England. Thus, for the statistical analysis, we used only the mtDNA control regions between positions 16,093 and 16,320 and between 00073 and 00188, for which we obtained suitable English comparison data from an update of Röhl et al.30, supplemented with mtDNA sequences supplied by Roots for Real (Genetic Ancestor Ltd., Clare, Suffolk, UK). Using only these short sections of the control region under H2 is conservative, since the population fraction of the observed control region sequences cannot be less than that of the full mtDNA genome. The relevant reference population is over 500 years in the past, but due to the large population size over the period considered, we expect population frequencies to have changed little over the last five centuries. We found the frequency of the Skeleton 1 haplotype to be 0 among 1823 in the database, to which we add the one instance observed in Michael Ibsen. This approach is, again, conservative as Michael was sampled due to his known genealogical relationship to Richard III. This leads to an LR of 478 representing moderately strong evidence for H1.

We also noted that there were no matches in a database of 26,127 European mitochondrial control region haplotypes (www.empop.org)29. We do not rely on this database because it is Europe-wide rather than specific to England and because of ascertainment issues, but it suggests that the Skeleton 1 haplotype may be much rarer than can be inferred from our smaller English database. We also note that female mobility among the European nobility is likely to have been much higher than for the general population, because of marriage practices relating to political alliance formation. Such practices would provide some justification for using the European mtDNA database, and so for considering the haplotype found in Skeleton 1 and Michael Ibsen to be extremely rare. In Supplementary Table 10, we show some illustrative results using the European database to demonstrate the implications of establishing that the Skeleton 1 haplotype is as rare as suggested by that database.

The LRs for different combinations of the evidence, and two posterior probabilities, are shown in Supplementary Table 10. Using all the evidence, the support for H1 is extremely strong with an LR of 6.7 million, so that our sceptic would be driven to the conclusion that the probability that Skeleton 1 is not Richard III is less than 1 in 100,000, while for those taking a 1 in 2 starting position that probability is much less than 1 in a million. Taking into account the conservative assumptions underlying our calculation described above, we regard this as establishing the truth of H1 beyond reasonable doubt.