The genome of Lonesome George was sequenced using a combination of Illumina and PacBio platforms (Supplementary Section 1.1). The assembled genome (CheloAbing 1.0) has a genomic size of 2.3 gigabases and contains 10,623 scaffolds with an N50 of 1.27 megabases (Supplementary Section 1.1 and Supplementary Tables 1–3). We also sequenced, with the Illumina platform, the closely related tortoise A. gigantea at an average read depth of 28×. These genomic sequences were aligned to CheloAbing 1.0.
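As a reminder of what the scaffold N50 reported above measures, the following minimal sketch computes it for a list of scaffold lengths: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions, not the CheloAbing 1.0 data or the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bases)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running total to at least
        # half of the assembly defines the N50.
        if running >= half_total:
            return length


# Toy scaffold lengths (hypothetical, arbitrary units), not real assembly data.
toy_scaffolds = [8, 6, 5, 3, 2, 1]
toy_n50 = n50(toy_scaffolds)
```

With these toy values the total is 25, so the threshold of 12.5 is crossed by the second-longest scaffold.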

TimeTree database estimations (http://www.timetree.org) indicate that Galapagos and Aldabra giant tortoises shared a last common ancestor about 40 million years ago, while both diverged from the human lineage more than 300 million years ago (Supplementary Section 1.4). A preliminary analysis of demographic history using the pairwise sequentially Markovian coalescent (PSMC)5 model showed that while the effective population size of C. abingdonii has been steadily declining for the past million years, with a slight uptick about 90,000 years ago, the population of Aldabra giant tortoises experienced substantial fluctuations over this period (Fig. 1b). Effective population size reconstructions for C. abingdonii lose statistical power at the million-year time frame, probably due to complete coalescence. In turn, this suggests that overall diversity in these giant tortoises must have been low throughout many generations. Together, these results prompt us to propose that the populations of these insular giant tortoises were vulnerable at the time of human discovery of the Galapagos Islands, probably elevating their extinction risk.

Using homology searches with known gene sets from humans and Pelodiscus sinensis (the Chinese soft-shell turtle), along with RNA sequencing (RNA-Seq) data from C. abingdonii blood and an A. gigantea granuloma, we automatically predicted a primary set of 27,208 genes from the genome assembly using the MAKER2 algorithm6. We then performed pairwise alignments between each of the primary predicted protein sequences and the UniProt databases for humans and P. sinensis, whose annotated sequences show relatively high quality when compared with data available for other turtles7. Using alignments spanning at least 80% of the longest protein and showing more than 60% identity, we constructed sets of protein families shared among these species. This preliminary analysis singled out several protein families that seem to have undergone moderate expansion in a common ancestor of C. abingdonii and A. gigantea. Almost all of these expansions were also confirmed in the genome of the related, long-lived tortoise Gopherus agassizii (Supplementary Section 1.2 and Supplementary Table 4). Most of these genes have been linked to exosome formation, suggesting that this process may have been important in tortoise evolution.
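The two alignment thresholds described above (coverage of at least 80% of the longest protein and more than 60% identity) can be sketched as a simple filter. This is only an illustration under stated assumptions: the `PairwiseHit` record, its field names and the definition of coverage as aligned length divided by the longer sequence length are hypothetical, and the subsequent grouping of accepted pairs into families is not shown because the text does not specify it.

```python
from dataclasses import dataclass

MIN_COVERAGE = 0.80   # alignment must span at least 80% of the longest protein
MIN_IDENTITY = 60.0   # percent identity must be strictly greater than 60


@dataclass(frozen=True)
class PairwiseHit:
    """Hypothetical summary of one pairwise protein alignment."""
    query_len: int       # residues in the predicted protein
    subject_len: int     # residues in the UniProt protein
    aligned_len: int     # residues covered by the alignment (assumed definition)
    identity_pct: float  # percent identity over the aligned region


def passes_family_filter(hit):
    """Apply the coverage and identity thresholds quoted in the text."""
    longest = max(hit.query_len, hit.subject_len)
    coverage = hit.aligned_len / longest
    return coverage >= MIN_COVERAGE and hit.identity_pct > MIN_IDENTITY


# Toy hits (made-up numbers) showing each outcome.
accepted = passes_family_filter(PairwiseHit(500, 400, 410, 72.5))
too_short = passes_family_filter(PairwiseHit(500, 400, 390, 90.0))
too_divergent = passes_family_filter(PairwiseHit(300, 300, 300, 60.0))
```

Measuring coverage against the longer of the two proteins keeps a short fragment from being grouped with a much longer protein on the strength of a local match.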

We also interrogated the predicted gene set for evidence of positive selection in giant tortoises. This analysis singled out 43 genes with evidence of giant-tortoise-specific positive selection (Supplementary Section 1.2, Supplementary Table 5 and Supplementary Fig. 1). This list includes genes with known roles in the dynamics of the tubulin cytoskeleton (TUBE1 and TUBG1) and intracellular vesicle trafficking (VPS35). Importantly, the list of genes showing evidence of positive selection also includes AHSG and FGF19, whose expression levels have been linked to successful ageing in humans8. The role of both factors in metabolism regulation9,10 (another hallmark of ageing11,12) suggests that the specific changes observed in these proteins may have arisen to accommodate the challenges that longevity poses to this system. The list of genes with signatures of positive selection also features TDO2, whose inhibition has been proposed to protect against age-related diseases through regulation of tryptophan-mediated proteostasis13. In addition, we found evidence for positive selection affecting several genes involved in immune system modulation, such as MVK, IRAK1BP1 and IL1R2. Taken together, these results identify proteostasis, metabolism regulation and immune response as key processes during the evolution of giant tortoises via effects on longevity and resistance to infection.

Parallel to this automatic analysis, we used manually supervised annotation on more than 3,000 genes selected a priori for a series of hypothesis-driven studies on development, physiology, immunity, metabolism, stress response, cancer susceptibility and longevity (Supplementary Section 1.3 and Supplementary Fig. 2). We searched for truncating variants, variants affecting known motifs and variants whose human counterparts are related to known genetic diseases (Supplementary Section 1.3 and Supplementary Table 6). These variants were first confirmed with the RNA-Seq data. Then, more than 100 of the most interesting variants in terms of putative functional relevance were also validated by PCR amplification followed by Sanger sequencing. To this end, we used a panel of genomic DNA samples of 11 different species of giant tortoises endemic to different islands from the Galapagos Archipelago (Supplementary Section 1, Supplementary Table 7 and Supplementary Fig. 3).

The manually supervised annotation of development-related genes showed the complete conservation of the Hox gene set among giant tortoises, with the exception of HOXC3, which seems to have been lost in the radiation of Archelosauria14,15 (Supplementary Section 2, Supplementary Table 8 and Supplementary Fig. 4). BMP and GDF gene families were also found to be conserved, although the duplication event that gave rise to GDF1 and GDF3 in mammals did not occur in turtles, birds and crocodiles. In contrast, we found a duplication of the ParaHox gene CDX4 in giant tortoises, which is also present in other reptiles and in birds. This annotation also showed the duplication of WNT11 in turtles and chickens (but not in the lizard Anolis carolinensis), and the specific duplication of WNT4 in turtles. Given the roles of these duplicated genes and their conservation in most vertebrate species, they could prove to be useful candidates to study the morphological development of turtles, particularly in relation to shell formation. Of note, KDSR (one of the genes possibly under positive selection in giant tortoises) has been linked to hyperkeratinization disorders16. Also, in this regard, we annotated 30 β-keratins in C. abingdonii, 26 of which seem to be functional. These numbers are lower than those previously reported for β-keratins in other turtles17. Finally, we did not find in C. abingdonii or A. gigantea any functional orthologues of genes specifically involved in tooth development (such as ENAM, AMEL, AMBN, DSPP, KLK4 and MMP20). This finding confirms a pattern in the evolutionary molecular mechanisms for tooth loss, which seems to have been followed consistently and independently across vertebrates. Taken together, these results offer multiple candidates to study developmental traits in tortoises (Supplementary Section 2 and Supplementary Figs. 5–8).

In most species, the immune function is an evolutionary driver that is under strong selective pressure and has important implications in ageing and disease18. The specific components and functionality of the immune system in Reptilia, however, have not been extensively characterized beyond the major histocompatibility complex (MHC)19,20. Our detailed analysis of 891 genes involved in immune function consistently found duplications affecting immunity genes in giant tortoises compared with mammals (Supplementary Section 3, Supplementary Table 9 and Supplementary Figs. 9–13). We found a genomic expansion of PRF1 (encoding perforin) in giant tortoises and other turtles, compared with chickens (one copy), A. carolinensis (two copies) and most mammals (one copy). Both C. abingdonii and A. gigantea possess 12 copies of this gene (validated by Sanger sequencing), although three of them have been pseudogenized in C. abingdonii. In addition, we detected and validated, by Sanger sequencing, an expansion of the chymase locus, containing granzymes, in giant tortoises (Supplementary Section 3.1 and Supplementary Fig. 10). Both expansions are expected to affect cytotoxic T lymphocyte and natural killer functions, which play important roles in defence against both pathogens and cancer21,22. Other concurrent expansions involve APOBEC1, CAMP, CHIA and NLRP genes, which participate in viral, microbial, fungal and parasite defence, respectively. These results suggest that the innate immune system in turtles, and especially in giant tortoises, may play a more relevant role than in mammals, consistent with the less important role that adaptive immunity seems to play19. We found that class I and II MHC genes probably underwent a duplication event in a common ancestor of giant tortoises and painted turtles (Chrysemys picta bellii). We also annotated 40 class III MHC genes, thus confirming the conservation of this cluster in giant tortoises. The large number of MHC genes in giant tortoises is consistent with the suggestion that ancestors of archosaurs and chelonians did not possess a minimal essential MHC as found in the chicken genome20 (Supplementary Section 3.3, Supplementary Table 10 and Supplementary Figs. 14–16).

Giant tortoises are at the upper end of the size scale for extant Chelonii, and have often been used as an example of gigantism23. We analysed a series of genes involved in size regulation in vertebrates, most notably dogs (Supplementary Section 2, Supplementary Table 8 and Supplementary Fig. 6). Our results on genes related to growth hormone, the insulin-like growth factor (IGF) system and stanniocalcins suggest that these genes are well conserved; therefore, additional size determinants may exist in giant tortoises. As a complex phenotype, gigantism in tortoises is expected to be caused by interactions between different genetic and environmental factors. An interesting finding in this regard is the presence of several gene variants in tortoises (including G. agassizii) probably affecting the activities of glucose metabolism genes, such as MIF (p.N111C; expected to yield a locked trimer) and GSK3A (p.R272Q in the activation loop). Given the roles of these positions in the mammalian orthologues of these genes, tortoise-specific changes could point to differences in the regulation of glucose intake and tolerance (Supplementary Section 4, Supplementary Table 11, and Supplementary Figs. 17 and 18). We also found expansions and inactivations in other genes involved in energy metabolism. Thus, glyceraldehyde-3-phosphate dehydrogenase (GAPDH)—a glycolytic enzyme with a key role in energy production, as well as in DNA repair and apoptosis24—is expanded in giant tortoises. Conversely, the NLN gene encoding neurolysin is pseudogenized in tortoises. The loss of this gene in mice has been related to improved glucose uptake and insulin sensitivity25. Taken together, these results led us to hypothesize that genomic variants affecting glucose metabolism may have been a factor in the development of tortoises.

The analysis of genes related to the stress response has also highlighted several putative variants in giant tortoises affecting globins and DNA repair factors (Supplementary Section 5, Supplementary Tables 12 and 13, and Supplementary Figs. 19–22, 32 and 33). We found that, despite living terrestrially, giant tortoises conserve the hypoxia-related globin GbX26. Together with coelacanths, turtles, including giant tortoises, are the only organisms known to possess all eight different types of globins27. Consistent with this, we found in both giant tortoise genomes a variant in the transcription factor TP53 (p.S106E) that has been linked to hypoxia resistance in some mammals and fishes28. The presence of the same residue in Testudines strongly suggests a process of convergent evolution in the adaptation to hypoxia, probably driven by an ancestral aquatic environment, which left this footprint in the genomes of terrestrial giant tortoises.

An important trait of large, long-lived vertebrates is their need for tighter cancer protection mechanisms, as illustrated by Peto’s paradox29,30. In turn, this need for additional protection illustrates the deep relationship and interdependence between cancer and longevity (Fig. 2). Notably, tumours are believed to be very rare in turtles31. Therefore, we analysed more than 400 genes classified in a well-established census of cancer genes as oncogenes and tumour suppressors32. Although most presented a highly conserved amino acid sequence when compared with the sequences of other organisms, we uncovered alterations in several tumourigenesis-related genes (Fig. 2a, Supplementary Section 6, Supplementary Table 14 and Supplementary Figs. 23–29). First, we found that several putative tumour suppressors are expanded in turtles compared with other vertebrates, including duplications in SMAD4, NF2, PML, PTPN11 and P2RY8. In addition, the aforementioned expansion of PRF1, together with the tortoise-specific duplication of PRDM1, suggests that immunosurveillance may be enhanced in turtles. Likewise, we found giant-tortoise-specific duplications affecting two putative proto-oncogenes—MYCN and SET. Notably, the SET complex mediates oxidative stress responses induced by mitochondrial damage through the action of PRF1 and GZMA in cytotoxic T lymphocyte- and natural killer-mediated cytotoxicity33. Taken together, these results suggest that multiple gene copy-number alterations may have influenced the mechanisms of spontaneous tumour growth. Nevertheless, further studies are needed to evaluate the genomic determinants of putative giant-tortoise-specific cancer mechanisms.

Fig. 2: Genomic basis of longevity and cancer in giant tortoises. a, Genes potentially implicated in C. abingdonii and A. gigantea longevity extension and cancer resistance, classified according to their putative role in the different hallmarks. Tables indicate copy-number variations and relevant variants of age-related genes and tumour suppressors found in C. abingdonii, A. gigantea and other species. Within these tables, numbers indicate gene copy numbers, and asterisks represent pseudogenization events. Dots in colours relating to each hallmark represent presence of the variant. b, Venn diagrams showing the relationships between cancer-, ageing- and immunity-related genes, as classified before annotation. Top, all of the genes related to each category that have been manually annotated, including the number of genes in each group. Bottom, those genes showing potentially interesting variations after annotation.

Finally, we selected, for manually supervised annotation, a set of 500 genes that may be involved in ageing modulation (Supplementary Section 7 and Supplementary Table 15). The extreme longevity of giant tortoises is expected to involve multiple genes affecting different hallmarks of ageing11. We found several alterations in the genomes of giant tortoises that may play a direct role in six of them, and impinge on other ageing hallmarks and processes, such as cancer progression34 (Fig. 2b). First, we identified changes in three candidate factors (NEIL1, RMI2 and XRCC6) related to the maintenance of genome integrity, a primary hallmark of ageing11 (Fig. 3a). Thus, we found and validated a duplication affecting NEIL1, a key protein involved in the base-excision repair process whose expression has been linked to extended lifespans in several species35. Likewise, RMI2 is duplicated in tortoises, suggesting an enhanced ability to resolve homologous recombination intermediates to limit DNA crossover formation in cells36. In a preliminary exploration of this hypothesis, we overexpressed NEIL1 and RMI2 in HEK-293T cells and exposed the infected cells to a sublethal dosage of H2O2 or ultraviolet light, monitoring DNA damage by western blot analysis at 24 and 48 h after treatment. As shown in Supplementary Figs. 22, 32 and 33, the expression of both genes results in reduced levels of phosphorylated histone H2AX and cleaved poly(ADP-ribose) polymerase (PARP), suggesting reduced levels of DNA damage37. In turn, this result is consistent with the hypothesis that NEIL1 and RMI2 levels may regulate the strength of DNA repair mechanisms. Also in relation to DNA repair mechanisms, we identified and validated a variant affecting XRCC6 (encoding a helicase involved in non-homologous end joining of double-strand DNA breaks), which may affect a known sumoylation site (p.K556R). This lysine is conserved in diverse vertebrates but, notably, is changed in giant tortoises, and also in the naked mole rat (p.K556N), the longest-lived rodent, which suggests a putative process of convergent evolution (Fig. 3b). Since sumoylation is induced following DNA damage and plays a key role in DNA repair response and multiple regulatory processes38, this variant may reflect selective pressures acting on the regulation of the repair of double-strand DNA breaks in long-lived organisms (Supplementary Section 5.5).
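The protein variants in this text are written in a compact one-letter form such as p.K556R (reference residue, position, replacement residue), with an asterisk marking a premature stop, as in the p.R990* variant reported for ITGA1. The sketch below parses that simplified form only; the `ProteinVariant` type and `parse_variant` helper are hypothetical names introduced for illustration, and the pattern is not a full implementation of any variant nomenclature standard.

```python
import re
from typing import NamedTuple


class ProteinVariant(NamedTuple):
    """Hypothetical container for a one-letter protein substitution."""
    ref: str       # reference amino acid (one-letter code)
    position: int  # residue number in the reference protein
    alt: str       # replacement amino acid, or "*" for a stop codon

    @property
    def is_truncating(self):
        return self.alt == "*"


# Accepts only the simple "p.<ref><position><alt>" form used in the text.
_PATTERN = re.compile(r"^p\.([A-Z])(\d+)([A-Z*])$")


def parse_variant(text):
    """Split a variant such as 'p.K556R' into its three components."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognised variant notation: {text!r}")
    ref, position, alt = match.groups()
    return ProteinVariant(ref, int(position), alt)


tortoise = parse_variant("p.K556R")       # XRCC6 change in giant tortoises
mole_rat = parse_variant("p.K556N")       # XRCC6 change in the naked mole rat
same_site = tortoise.position == mole_rat.position and tortoise.ref == mole_rat.ref
```

Comparing the parsed position and reference residue is one simple way to see that both lineages alter the same conserved lysine while differing in the replacement residue.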

Fig. 3: DNA repair response in giant tortoises. a, Copy-number variations and putative function-altering point variants found in C. abingdonii, A. gigantea and closely related species. b, Alignments showing the variants highlighted in XRCC6 and DCLRE1B.

Regarding telomere attrition—another primary hallmark of ageing11—we uncovered in giant tortoises one variant in DCLRE1B (p.R498C) potentially affecting its binding interface with telomeric repeat binding factor 2 (TERF2) (Fig. 3b and Supplementary Section 7.2). This change, together with the aforementioned variants affecting DNA repair genes that may also impinge on telomere dynamics39,40,41, highlights the relevance of telomere maintenance as a regulatory mechanism of longevity in tortoises. Moreover, we found changes potentially affecting proteostasis (Fig. 2a). We independently found specific expansions of the elongation factor gene EEF1A1 in C. abingdonii, A. gigantea and G. agassizii, as described with the automatic annotation. Importantly, overexpression of EEF1A1 homologues in Drosophila melanogaster has been linked to an increased lifespan in this species42.

Over time, nutrient sensing deregulation—another hallmark of ageing—can result from alterations in metabolic control mechanisms and signalling pathways12. The aforementioned variant affecting the activation loop of GSK3A (Supplementary Section 4.1), which is present in C. abingdonii and all tested tortoises from the Galapagos Islands and Aldabra Atoll, as well as their continental outgroups, G. agassizii and C. picta bellii, may be involved in the maintenance of glucose homoeostasis. Interestingly, the inhibition of GSK3 can extend lifespan in D. melanogaster43. Likewise, the identified alterations in other giant tortoise genes implicated in glucose metabolism, such as the aforementioned inactivation of NLN, may provide interesting candidates to study nutrient sensing in these long-lived species (Supplementary Section 7.4).

Regarding the mitochondrial function, we found two variants (p.Q366M and p.M487T) potentially affecting the function of ALDH2, a mitochondrial aldehyde dehydrogenase involved in alcohol metabolism and lipid peroxidation, among other detoxification processes44. Notably, the p.Q366M variant, which may alter the NAD-binding site of ALDH2, is exclusively found in Galapagos giant tortoises, but not in their continental close relative Chelonoidis chilensis, nor in the more distantly related Aldabra or Agassiz’s tortoises. Thus, these changes could also alter the detoxification process and contribute to pro-longevity mechanisms. Together with the above described specific alterations in other genes of giant tortoises, such as NLN and GAPDH, which encode enzymes associated with mitochondrial functions45,46, these variants may also impinge on mitochondrial dysfunction, an antagonistic hallmark of ageing11 (Supplementary Section 7.5).

We have also found evidence in tortoises of some variants related to altered intercellular communication (Supplementary Section 7.6 and Supplementary Fig. 30), an integrative hallmark of ageing11. Thus, we have detected exclusively in C. abingdonii a premature stop codon affecting ITGA1 (p.R990*), an essential integrin involved in cell–matrix and cell–cell interactions. In addition, the aforementioned variant affecting MIF is also expected to cause the formation of inactivating interchain disulfide bonds, inhibiting intracellular signalling cascades47. Moreover, MIF deficiency reduces chronic inflammation in white adipose tissue and extends lifespan, especially in response to caloric restriction48,49. Finally, we have annotated a specific variant in IGF1R that is expected to affect the interaction between this receptor and the IGF1/2 growth factors50. Notably, a homology model of this region in IGF1R in C. abingdonii suggests that position 724 is located at the surface of the protein, and the presence of an aspartic acid residue changes the local electrostatic field (Fig. 4a). The extended lifespan in different species correlates with IGF signalling decrease51,52, which suggests that this unique change in IGF1R may provide an attractive target to study the cellular mechanisms underlying the exceptional lifespan of these animals. To explore the functional consequences of differential IGF1 signalling caused by the p.N724D variant found in the IGF1 receptor (IGF1R), we infected HEK-293T cells with pCDH, pCDH-IGF1RWT and pCDH-IGF1RN724D plasmids. Cells expressing the mutant receptor showed an attenuation of IGF1 signalling, compared with those expressing the wild-type protein, measured as a significant reduction in the phosphorylation levels of IGF1R at 5 min (95% confidence interval of difference: 0.1119–1.5330, t = 2.454, P = 0.026) and 10 min (95% confidence interval of difference: 0.1991–1.6200, t = 2.714, P = 0.0153) after IGF1 treatment (Fig. 4b, Supplementary Section 7.6.2 and Supplementary Fig. 31). According to a two-way analysis of variance, the exogenous IGF1R form accounted for 16.07% of total variation (F(1,4) = 20.91, P = 0.0102), while time accounted for 44.23% of total variation (F(3,12) = 6.57, P = 0.0071). Interestingly, we also found in tortoises a short deletion in the coding region of IGF2R that results in the loss of two amino acids. The fact that IGF2R variants have been associated with human longevity53 opens the possibility that the variant found in tortoises could also contribute to increasing the lifespan of these long-lived animals.
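The confidence intervals and t statistics reported above for the 5 and 10 min comparisons can be cross-checked with simple arithmetic. The sketch below assumes that each interval is symmetric around the mean difference and that t equals the mean difference divided by its standard error; under those assumptions, the interval half-width divided by the standard error gives the critical t value implied by the reported numbers. The helper name `implied_critical_t` is hypothetical, and this is a consistency check on the published figures, not a reanalysis of the underlying data.

```python
def implied_critical_t(ci_low, ci_high, t_stat):
    """Recover the mean difference, its standard error and the critical t
    implied by a symmetric confidence interval and a reported t statistic."""
    mean_diff = (ci_low + ci_high) / 2   # centre of the interval
    half_width = (ci_high - ci_low) / 2  # critical t times standard error
    std_err = mean_diff / t_stat         # assumes t = mean_diff / std_err
    return mean_diff, std_err, half_width / std_err


# Values quoted in the text for the two time points.
five_min = implied_critical_t(0.1119, 1.5330, 2.454)
ten_min = implied_critical_t(0.1991, 1.6200, 2.714)

for label, (diff, se, crit) in (("5 min", five_min), ("10 min", ten_min)):
    print(f"{label}: difference={diff:.4f}, SE={se:.4f}, critical t={crit:.3f}")
```

Both time points give nearly the same standard error and an implied critical value close to 2.12, which is what a two-sided 95% interval from Student's t distribution with 16 degrees of freedom would use. The degrees of freedom are not stated in this passage, so that figure is an inference rather than a reported value.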

Fig. 4: Functional relevance of IGF1RN724D in the IGF1 signalling pathway. a, Alignment of IGF1R around residue p.N724 in C. abingdonii, A. gigantea and other representative species. The predicted electrostatic surfaces of human (top right) and modelled C. abingdonii (bottom right) IGF1R around the same residue are shown for comparison. Negatively charged areas are depicted in red, while positively charged areas are depicted in blue. b, Western blot analysis and densitometry quantification of the phospho-IGF1R (pIGF1R)/total IGF1R ratio at 5, 10 and 20 min intervals after IGF1 addition in HEK-293T cells infected with pCDH, pCDH-IGF1RWT and pCDH-IGF1RN724D plasmids. Bars indicate means ± s.e.m. *P < 0.05, Fisher’s least significant difference test (n = 3 independent experiments).

In summary, in this work, we report the preliminary characterization of giant tortoise genomes. We complemented the automatic annotation of genomes from two giant tortoise species with a hypothesis-driven strategy using manually supervised annotation of a large set of genes. The analysis of the resulting sequences offers candidate genes and pathways that may underlie the extraordinary characteristics of these iconic species, including their development, gigantism and longevity. A better understanding of the processes that we have studied may help to further elucidate the biology of these species and therefore aid the ongoing efforts to conserve these dwindling lineages. Lonesome George—the last representative of C. abingdonii, and a renowned emblem of the plight of endangered species—left a legacy including a story written in his genome whose unveiling has just started.