The transition from land to water in whales and dolphins (cetaceans) was accompanied by remarkable adaptations. To reveal genomic changes that occurred during this transition, we screened for protein-coding genes that were inactivated in the ancestral cetacean lineage. We found 85 gene losses. Some of these were likely beneficial for cetaceans, for example, by reducing the risk of thrombus formation during diving (F12 and KLKB1), erroneous DNA damage repair (POLM), and oxidative stress–induced lung inflammation (MAP3K19). Additional gene losses may reflect other diving-related adaptations, such as enhanced vasoconstriction during the diving response (mediated by SLC6A18) and altered pulmonary surfactant composition (SEC14L3), while loss of SLC4A9 relates to a reduced need for saliva. Last, loss of melatonin synthesis and receptor genes (AANAT, ASMT, and MTNR1A/B) may have been a precondition for adopting unihemispheric sleep. Our findings suggest that some genes lost in ancestral cetaceans were likely involved in adapting to a fully aquatic lifestyle.

The ancestors of modern cetaceans (whales, dolphins, and porpoises) transitioned from a terrestrial to a fully aquatic lifestyle during the Eocene about 50 million years ago (1). This process constitutes one of the most marked macroevolutionary transitions in mammalian history and was accompanied by profound anatomical, physiological, and behavioral transformations that allowed cetaceans to adapt and thrive in the novel habitat (2). Remarkable changes in cetacean anatomy include streamlined bodies and loss of body hair to reduce drag during swimming, a much thicker skin that lacks sweat and sebaceous glands and has enhanced physical barrier properties, a thick layer of blubber for insulation, the loss of hindlimbs after propulsion by the tail flukes evolved, and reduced olfactory and gustatory systems, which became less important in water (3). To efficiently store and conserve oxygen for prolonged breath-hold diving, cetaceans developed a variety of adaptations. These adaptations include increased oxygen stores that result from large blood volumes and elevated concentrations of hemoglobin, myoglobin, and neuroglobin in blood, muscle, and brain tissue, respectively; a high-performance respiratory system that allows rapid turnover of gases at the surface; and a flexible ribcage that allows the lung to collapse at high ambient pressure (4).

Comparative analysis of cetacean genomes has provided important insights into the genomic determinants of cetacean traits and aquatic specializations. Several studies revealed patterns of positive selection in genes with roles in the nervous system, osmoregulation, oxygen transport, blood circulation, or bone microstructure (5–7). An adaptive increase in myoglobin surface charge likely permitted a high concentration of this oxygen transport and storage protein in cetacean muscles (8). In addition to patterns of positive selection, the loss (inactivation) of protein-coding genes is associated with derived cetacean traits. For example, cetaceans have lost a large number of olfactory receptors, taste receptors, and hair keratin genes (9–12). Furthermore, all or individual cetacean lineages lost the ketone body–synthesizing enzyme HMGCS2 (13), the nonshivering thermogenesis gene UCP1 (14), the protease KLK8 that plays distinct roles in the skin and hippocampus (15), and short wave– and long wave–sensitive opsin genes (16). During evolution, gene loss not only can be a consequence of relaxed selection on a function that became obsolete but also can be a mechanism for adaptation (17). For example, the loss of the erythrocyte-expressed AMPD3 gene in the sperm whale, one of the longest- and deepest-diving cetacean species, is likely beneficial by enhancing oxygen transport (18). The loss of the elastin-degrading protease MMP12 may have contributed to “explosive exhalation,” and the loss of several epidermal genes (GSDMA, DSG4, DSC1, and TGM5) likely contributed to hair loss and the remodeling of the cetacean epidermal morphology (18).

Because of an extensive series of intermediate fossils, the shift from a terrestrial to a fully aquatic environment is one of the best-characterized macroevolutionary transitions in mammalian evolution (5). However, the important genomic changes that occurred during this transformation remain incompletely understood. Because recent work has shown that the loss of ancestral protein-coding genes is an important evolutionary force (17–19), we conducted a systematic screen for genes that were inactivated on the stem Cetacea branch, i.e., after the split between Cetacea and Hippopotamidae but before the split between Odontoceti (toothed whales) and Mysticeti (baleen whales). This revealed a number of gene losses that are associated with the evolution of adaptations to a fully aquatic environment.

RESULTS AND DISCUSSION

Screen for coding genes that were inactivated in the cetacean stem lineage

To investigate the contribution of gene inactivation to the evolution of adaptations to a fully aquatic environment in cetaceans, we systematically searched for protein-coding genes that were inactivated in the cetacean stem lineage (a flowchart of the screen is shown in fig. S1). Briefly, we considered 19,769 genes annotated in the human genome and searched for gene-inactivating mutations throughout a phylogeny of 62 mammalian species, comprising four cetaceans, two pinnipeds, a manatee, and 55 terrestrial mammals (table S1). To detect gene-inactivating mutations, we used a comparative approach that makes use of genome alignments to search for mutations that disrupt the protein’s reading frame (stop codon mutations, frameshifting insertions or deletions, and deletions of entire exons) and mutations that disrupt splice sites (18). Excluding members of the large olfactory receptor and keratin-associated gene families, whose losses have been studied in detail before [for example, in (9, 11, 12)], we identified 236 genes that do not have an intact ortholog in cetaceans and are inactivated in at most 3 of the 55 terrestrial mammal species. Of these 236 genes, 110 exhibit inactivating mutations that are shared between the two extant cetacean clades, odontocetes and mysticetes (Fig. 1A). Odontocetes were represented in our screen by the bottlenose dolphin, killer whale, and sperm whale (20–22), and mysticetes were represented by the common minke whale (6). The most parsimonious hypothesis for inactivating mutations shared between odontocetes and mysticetes is that they occurred before the split of these two clades in the common ancestral branch of Cetacea.
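The mutation classes named in this paragraph can be illustrated with a minimal Python sketch. It is not the pipeline used in the screen: the function names, the toy sequences, and the assumption of a simple gapped pairwise alignment of one coding sequence (reference in frame, starting at the first codon position) are all hypothetical, and deletions of entire exons are not handled.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def frameshifting_indels(ref_aln, query_aln):
    """List gap runs whose length is not a multiple of 3 as (kind, column, length)."""
    events = []
    # A gap run in the reference row is an insertion in the query;
    # a gap run in the query row is a deletion.
    for kind, row in (("insertion", ref_aln), ("deletion", query_aln)):
        for run in re.finditer(r"-+", row):
            length = run.end() - run.start()
            if length % 3 != 0:
                events.append((kind, run.start(), length))
    return sorted(events, key=lambda e: e[1])


def premature_stops(ref_aln, query_aln):
    """Return 0-based reference codon indices where the query has an in-frame stop."""
    # Drop columns where the reference has a gap so codons follow the reference frame.
    query_in_ref_frame = [q for r, q in zip(ref_aln, query_aln) if r != "-"]
    n_codons = len(query_in_ref_frame) // 3
    hits = []
    for i in range(n_codons - 1):  # skip the final codon, the regular stop
        codon = "".join(query_in_ref_frame[3 * i:3 * i + 3]).upper()
        if codon in STOP_CODONS:
            hits.append(i)
    return hits


def splice_site_disrupted(donor, acceptor):
    """True if the intron does not start with GT or does not end with AG."""
    return donor.upper() != "GT" or acceptor.upper() != "AG"


# Toy sequences, invented for illustration (not real gene data).
ref = "ATGGCTCAGTTTGGATAA"
print(premature_stops(ref, "ATGGCTTAGTTTGGATAA"))       # CAG -> TAG in the third codon
print(frameshifting_indels(ref, "ATGGC-CAGTTTGGATAA"))  # 1-bp deletion shifts the frame
print(frameshifting_indels(ref, "ATG---CAGTTTGGATAA"))  # 3-bp deletion keeps the frame
print(splice_site_disrupted("at", "ag"))                # donor site changed from gt to at
```

Checking stop codons in the reference frame is a deliberate simplification; a realistic screen would also have to account for assembly gaps, alignment errors, and compensating frameshifts.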
To precisely identify genes that were inactivated during the transition from land to water in the cetacean stem lineage, we made use of the recently sequenced genome of the common hippopotamus (23), a semi-aquatic mammal that, along with the pygmy hippopotamus, is the closest living relative to cetaceans, and considered only genes with no detected inactivating mutations in the hippopotamus. This resulted in a set of 85 lost genes that exhibit shared inactivating mutations in odontocetes and mysticetes, 62 (73%) of which have not been reported before (table S2).

Fig. 1. Key coagulation factors that promote thrombosis were lost in the cetacean stem lineage. (A) F12 (coagulation factor XII) and KLKB1 (kallikrein B1) were lost in the cetacean stem lineage, consistent with previous findings (6, 22, 25). Boxes illustrate coding exons superimposed with those gene-inactivating mutations that are shared among odontocetes and mysticetes (both lineages are labeled in the phylogenetic tree) and thus likely occurred before the split of these lineages. The inset shows one representative inactivating mutation. Shared breakpoints imply that the deletion of KLKB1 coding exons 6 to 12 occurred in the cetacean stem lineage (intronic bases adjacent to exons 5 and 13 are in lowercase letters). All inactivating mutations in both genes are shown in figs. S4 and S5. (B) Left: F12 encodes a zymogen that autoactivates by contact with a variety of surfaces, which likely include nitrogen microbubbles that form during breath-hold diving (27, 29). KLKB1 encodes another zymogen that can be activated to plasma kallikrein (PK) by either activated F12 or by the endothelial membrane–associated endopeptidase prolylcarboxypeptidase (PRCP) (26). PK, in turn, can activate F12. Both activated F12 and PK promote thrombus formation (26). Right: Gene knockouts in mice suggest that loss of F12 and KLKB1 has no major effect on wound sealing but protects from thrombus formation via different mechanisms. While loss of KLKB1 protects from thrombosis by reducing the expression of F3 (coagulation factor III, also called tissue factor) (30), loss of F12 prevents activation on nitrogen microbubbles during diving. Because a vasoconstriction-induced reduction in blood vessel diameters and nitrogen microbubble formation increase the risk of thrombosis for frequent divers, the loss of both genes was likely beneficial for cetaceans.

For these 85 genes, we performed additional analyses to confirm evolutionary loss in the cetacean stem lineage. First, inactivating mutations shared between the four cetaceans used in the genomic screen imply that other species that descended from their common ancestor should share these mutations. We tested this by aligning the genomes of two additional odontocetes [Yangtze River dolphin (11) and beluga whale (24)] and an additional mysticete [bowhead whale (25)]. Manually inspecting the gene loci in these additional species confirmed the presence of shared inactivating mutations. Second, the manual inspection of genome alignments also revealed no evidence for an undetected functional copy of these genes in cetaceans. Together, these analyses show that these genes were inactivated on the stem Cetacea branch, i.e., after the split between Cetacea and Hippopotamidae but before the split between Odontoceti and Mysticeti. We intersected the 85 genes with functional annotations of their human and mouse orthologs (table S2) and performed a literature search.
This revealed a number of genes that we hypothesize to be related to aquatic adaptations [F12 (coagulation factor XII), KLKB1 (kallikrein B1), POLM (DNA polymerase mu), MAP3K19 (mitogen-activated protein kinase 19), SEC14L3 (SEC14-like lipid binding 3), SLC6A18 (solute carrier family 6 member 18), SLC4A9 (solute carrier family 4 member 9), and AANAT (aralkylamine N-acetyltransferase)] by being involved in thrombosis, repair of oxidative DNA damage, oxidative stress–induced lung inflammation, renal amino acid transport, saliva secretion, and melatonin synthesis. For these eight genes, we further verified that they have an intact reading frame not only in the common hippopotamus but also in the pygmy hippopotamus, the only other extant species in the family Hippopotamidae. Furthermore, we validated the correctness of all inactivating mutations with raw DNA sequencing reads that were used to assemble the cetacean genomes. We found that the vast majority of inactivating mutations (248 of 251; 98.8%) are confirmed by DNA sequencing reads (fig. S2 and table S3). We further estimated that the remnants of the coding regions of genes evolve under relaxed selection in cetaceans (highly significant for all genes except MAP3K19 with P = 0.08; table S4). Last, we analyzed available expression data of the bottlenose dolphin and minke whale, which revealed that the remnants of these genes either are not expressed anymore or do not produce full-length and properly spliced transcripts (fig. S3). With the exception of POLM and AANAT, which are also lost in the pangolin, these genes are either exclusively lost in the cetacean stem lineage (F12, KLKB1, MAP3K19, SEC14L3, and SLC6A18) or convergently lost in the aquatic manatee (SLC4A9 and AANAT). In the following, we describe how the loss of these eight genes likely relates to adaptations to a fully aquatic environment.
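The filtering logic described in this section (no intact ortholog in any cetacean, inactivation in at most 3 of the 55 terrestrial mammals, at least one mutation shared between odontocetes and mysticetes, and no detected inactivating mutation in the hippopotamus) can be sketched in Python as follows. The data structure, gene names, and mutation labels are hypothetical placeholders, and the sketch omits the manual inspection and read-level validation steps.

```python
from dataclasses import dataclass

# Clade membership of the four cetaceans used in the screen.
ODONTOCETES = {"bottlenose_dolphin", "killer_whale", "sperm_whale"}
MYSTICETES = {"minke_whale"}
CETACEANS = ODONTOCETES | MYSTICETES


@dataclass
class GeneCall:
    gene: str
    mutations: dict          # mutation label -> set of cetacean species carrying it
    terrestrial_losses: int  # number of terrestrial mammals without an intact copy
    hippo_intact: bool       # no inactivating mutation detected in the hippopotamus


def lost_in_all_cetaceans(call):
    carriers = set().union(*call.mutations.values())
    return CETACEANS <= carriers


def has_shared_mutation(call):
    """At least one identical mutation occurs in both an odontocete and a mysticete."""
    return any(s & ODONTOCETES and s & MYSTICETES for s in call.mutations.values())


def is_stem_lineage_loss(call, max_terrestrial_losses=3):
    return (
        lost_in_all_cetaceans(call)
        and call.terrestrial_losses <= max_terrestrial_losses
        and has_shared_mutation(call)
        and call.hippo_intact
    )


# Hypothetical example calls (invented, not results from the screen).
calls = [
    GeneCall("GENE_A", {"stop_exon3": set(CETACEANS)}, 0, True),
    GeneCall("GENE_B", {"frameshift_exon2": set(ODONTOCETES),
                        "stop_exon5": set(MYSTICETES)}, 1, True),
    GeneCall("GENE_C", {"splice_exon4": set(CETACEANS)}, 0, False),
    GeneCall("GENE_D", {"stop_exon1": set(CETACEANS)}, 7, True),
]
stem_losses = [c.gene for c in calls if is_stem_lineage_loss(c)]
print(stem_losses)
```

In this toy input, GENE_B is inactivated in every cetacean but by different mutations in the two clades, so parsimony does not place its loss on the stem branch; GENE_C is excluded because the hippopotamus copy is not intact, and GENE_D because it is lost in too many terrestrial mammals.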

Loss of coagulation-associated factors and reduced thrombus formation

Diving results in a systemic response, consisting of a decrease in heart rate (bradycardia) and reduced peripheral blood flow, which is achieved by contraction of endothelial smooth muscle cells (peripheral vasoconstriction) (4). A frequent vasoconstriction-induced reduction in blood vessel diameter during diving increases the risk of thrombus (blood clot) formation. Our screen detected two blood coagulation-associated factors, F12 and KLKB1, that are specifically lost in cetaceans and no other analyzed mammal. Several shared inactivating mutations show that both genes were lost in the cetacean stem lineage (Fig. 1A and figs. S4 and S5). While the loss of these genes in various cetacean species was noted before (6, 22, 25), the mechanisms by which these two gene losses likely protect from thrombus formation during diving have not been described.

F12 initiates thrombus formation via the contact activation system (CAS) (26). F12 encodes a zymogen that autoactivates upon encountering a variety of foreign or biological surfaces (27). The activated zymogen functions as a serine protease that engages in a reciprocal activation cycle with the serine protease encoded by KLKB1, resulting in platelet activation and the formation of a blood clot (26). Consistently, knockout or knockdown of F12 protects various mammals from induced thrombosis but, importantly, did not impair wound sealing after blood vessel injury (hemostasis) (28). Eliminating CAS-based coagulation by inactivating F12 may have been especially advantageous for cetaceans, as nitrogen microbubbles, which readily form in the blood upon repeated breath-hold diving, may act as foreign F12-activating surfaces entailing harmful thrombus formation (Fig. 1B) (29).

The KLKB1-encoded zymogen prekallikrein is activated by proteolytic cleavage to form the serine protease plasma kallikrein (PK).
Similar to the knockout of F12, the knockout of KLKB1 in mice also granted protection from induced thrombosis while only slightly prolonging wound sealing (28). Thrombosis protection in KLKB1 knockout mice is mediated by a CAS-independent mechanism (30). KLKB1 knockout leads to reduced levels of bradykinin, the main target of PK, which, in turn, leads to reduced expression of coagulation factor III (also called tissue factor, F3) (30), a key initiator of the blood coagulation cascade. The reduction of coagulation factor III alone is sufficient to reduce the risk of thrombosis. In addition to being activated by F12, prekallikrein can be activated by the endothelial membrane–associated endopeptidase prolylcarboxypeptidase (PRCP) (31). Presumably, this type of activation happens more frequently in a diving cetacean, where constricted blood vessels increase the proximity of prekallikrein and PRCP (Fig. 1B). Moreover, the activity of PRCP is pH dependent and peaks in slightly acidified plasma (31), a condition found in diving cetaceans. In summary, all cetaceans are deficient in two key factors that promote thrombosis but largely do not affect wound sealing. In support of the hypothesis that wound sealing mechanisms remain intact, we found that the key coagulation factors facilitating hemostasis upon tissue damage (encoded by the genes F2, F3, F7, and F10) are intact in the cetaceans and other mammals included in our screen. The risk of thrombus-induced occlusion of blood vessels is higher for frequent divers, as smaller blood vessel diameters and nitrogen microbubble formation during diving both increase the likelihood of F12 or prekallikrein activation (Fig. 1B). Because inactivating F12 or KLKB1 reduces the risk of thrombus formation via different and likely additive mechanisms, both gene losses were potentially advantageous for stem cetaceans.
Consistent with this, previous studies found that several genes involved in blood clotting evolved under positive selection in cetaceans (21).

Loss of a DNA repair gene and improved tolerance of oxidative DNA damage

The pronounced peripheral vasoconstriction evoked by the diving response restricts blood supply to peripheral tissues of the diving mammal, causing an oxygen shortage (ischemia). Restoration of blood flow (reperfusion) to these tissues causes the production of reactive oxygen species (ROS), which can damage DNA. Diving mammals are better adapted to tolerate frequent ischemia/reperfusion-induced ROS generation by having high levels of antioxidants (32). In addition to these increased antioxidant levels, we detected the inactivation of POLM in the cetacean stem lineage (Fig. 2A and fig. S6). POLM lacks inactivating mutations in any other mammal with the exception of the Chinese pangolin, a burrowing mammal that inhabits higher elevations.

Fig. 2. Loss of an error-prone DNA repair polymerase could have improved tolerance of oxidative DNA damage in cetaceans. (A) POLM (DNA polymerase mu) was lost in the cetacean stem lineage, as shown by shared gene-inactivating mutations. Visualization as in Fig. 1A. All inactivating mutations are shown in fig. S6. (B) ROS (reactive oxygen species) induce DNA damage, which includes oxidation of guanine (8-oxodG) as one of the most frequent lesions. POLM encodes the DNA repair polymerase Polμ, which often does not perform a correct translesion synthesis (left) but instead introduces errors (right). In particular, Polμ typically deletes bases (35) or erroneously incorporates deoxy-adenosine opposite to 8-oxodG (instead of the correct deoxy-cytosine), which results in a C:G to A:T transversion mutation (34). In contrast to Polμ, another DNA repair polymerase Polλ is much less error prone (36). Loss of POLM in cetaceans may have reduced the mutagenic potential of diving-induced oxidative stress by increasing the utilization of the more precise Polλ and accurate homology-directed DNA repair.

Loss of POLM has implications for improved tolerance of oxidative DNA lesions.
POLM encodes the DNA polymerase Polμ, which plays an integral role in DNA damage repair (33). The most severe type of DNA damage caused by ROS is a DNA double-strand break. One mechanism to repair double-strand breaks is nonhomologous end joining (NHEJ), a process that ligates DNA strands without requiring a homologous template and resynthesizes missing DNA bases by DNA polymerases. Polμ is able to direct synthesis across a variety of broken DNA backbone types, including ends that lack any complementarity (33). This high flexibility comes at the cost of making Polμ more error prone than Polλ, the second DNA polymerase that participates in NHEJ (33). One of the most frequent types of DNA damage caused by ROS is the oxidation of guanine, creating 8-oxo-7-hydrodeoxyguanosine (8-oxodG) (34). 8-oxodG is highly mutagenic, as the bypassing Polμ resolves this lesion either by deleting bases (35) or by creating a transversion mutation (Fig. 2B) (34). In contrast to Polμ, Polλ performs translesion synthesis with a much lower error rate (36). Outside the context of DNA double-strand breaks, Polβ is the main polymerase facilitating 8-oxodG repair (37). Our screen detected no inactivating mutations in cetaceans (and other mammals) in the genes encoding Polλ (POLL) and Polβ (POLB), suggesting that other DNA repair polymerases remain functional. However, in a regime of frequent oxidative stress, as experienced by diving cetaceans, the error-prone DNA repair polymerase Polμ likely constitutes a mutagenic risk factor. Inactivation of Polμ in the cetacean stem lineage may have enhanced the fidelity of bypassing 8-oxodG lesions and repairing double-stranded breaks by increased utilization of the more precise Polλ, which is supported by mouse experiments. Compared with wild-type mice, POLM knockout mice showed significantly reduced mutagenic 8-oxodG translesion synthesis and exhibited a higher endurance when challenged with severe oxidative stress (38, 39). 
POLM knockout mice also displayed improved learning abilities and greater liver regenerative capacity at high age (38, 39) but were found to suffer from reduced hematopoiesis and impaired adaptive immunity (40). Consequently, the benefits of losing POLM may only predominate under frequent exposure to oxidative stress, where its loss reduces the mutagenic potential of ROS, which readily form during repeated ischemia/reperfusion processes in diving mammals.
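The way a single misincorporation opposite 8-oxodG becomes a fixed transversion can be traced with a small Python sketch. It is a deliberately deterministic toy model: the function names are hypothetical, real polymerases make such errors only with some probability, and the base-deletion outcome mentioned above is not modeled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
LESION = "oxoG"  # placeholder symbol for 8-oxodG, an oxidized guanine


def base_inserted_opposite(template_base, error_prone):
    """Base placed opposite the template base during synthesis."""
    if template_base == LESION:
        # Error-prone bypass inserts A; accurate bypass inserts the correct C.
        return "A" if error_prone else "C"
    return COMPLEMENT[template_base]


def base_pair_after_two_rounds(template_base, error_prone):
    """Round 1 copies the (possibly damaged) template; round 2 copies the new strand.

    Returns (base at the original template position, base on the opposite strand).
    """
    new_strand_base = base_inserted_opposite(template_base, error_prone)
    return COMPLEMENT[new_strand_base], new_strand_base


print(base_pair_after_two_rounds(LESION, error_prone=True))   # G:C has become T:A
print(base_pair_after_two_rounds(LESION, error_prone=False))  # G:C is restored
```

Read from the opposite strand, the T:A outcome is the C:G to A:T transversion described in the text; accurate bypass leaves the original base pair unchanged.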

Loss of lung-related genes and a high-performance respiratory system

During diving, the cetacean lung collapses and reinflates during ascent. While lung collapse would represent a severe clinical problem for humans, it serves to reduce both buoyancy and the risk of developing decompression sickness in cetaceans (41). Our screen revealed two genes that are exclusively lost in cetaceans and have specific expression patterns in the lung, MAP3K19 and SEC14L3 (Fig. 3, A and B, and figs. S7 and S8).

Fig. 3. Loss of lung-related and renal transporter genes in the cetacean stem lineage. (A and B) The loss of MAP3K19 (mitogen-activated protein kinase 19) and SEC14L3 (SEC14-like lipid binding 3), which are specifically expressed in cell types of the lung, may relate to the high-performance respiratory system of cetaceans. (C) The loss of the renal amino acid transporter SLC6A18 (solute carrier family 6 member 18) offers an explanation for the low plasma arginine levels in cetaceans and may have contributed to stronger vasoconstriction during the diving response. Visualization as in Fig. 1A. A shared donor (gt ➔ at) and acceptor (ag ➔ aa) splice site disrupting mutation is indicated in (B). All inactivating mutations are shown in figs. S7 to S9.

MAP3K19 is expressed in bronchial epithelial cells, type II pneumocytes, and pulmonary macrophages (42). Overexpression of MAP3K19 was detected in pulmonary macrophages of human patients suffering from idiopathic pulmonary fibrosis (42). This disease is believed to be caused by aberrant wound healing in response to injuries of the lung epithelium, leading to the abnormal accumulation of fibroblasts (fibrosis), excessive collagen secretion, and severely impaired lung function (42). Consistent with a fibrosis-promoting function of MAP3K19, inhibition of MAP3K19 in mice protects from induced pulmonary fibrosis by significantly reducing fibrosis and collagen deposition (42).
In a similar manner, MAP3K19 loss may also have a protective effect in cetaceans where repeated lung collapse/reinflation events during deep dives cause shear forces that could increase the incidence of pulmonary microinjuries. Furthermore, overexpression of MAP3K19 was also detected in human patients suffering from chronic obstructive pulmonary disease (COPD) (43), a disease associated with cigarette smoking–induced oxidative stress. MAP3K19 is up-regulated in cells in response to oxidative and other types of environmental stress and promotes the expression of pro-inflammatory chemokines (43). Further supporting a role of MAP3K19 in the pathogenesis of COPD, inhibition of MAP3K19 in mouse COPD models strongly reduced pulmonary inflammation and airway destruction (43). A hallmark of COPD is a reduction of alveolar elasticity caused by elastin degradation, which contributes to an incomplete emptying of the lung. Cetaceans exhibit the opposite phenotype and have extensive elastic tissue in their lungs (41), which contributes to “explosive exhalation,” a breathing adaptation that allows renewal of ~90% of the air in the lung in a single breath (3). Therefore, similar to the previously described loss of the elastin-degrading and COPD-overexpressed MMP12 in aquatic mammals (18), the loss of MAP3K19 may also be involved in the evolution of this breathing adaptation. More generally, the frequent oxidative stress faced by diving cetaceans, especially upon reoxygenation of the reinflated hypoxic lung, would increase the risk for MAP3K19-mediated chronic pulmonary inflammation and compromised respiratory function, which could have contributed to MAP3K19 loss. The second lung-expressed gene SEC14L3 is expressed in airway ciliated cells and in alveolar type II cells that secrete pulmonary surfactant, the lipid-protein complex that prevents alveoli collapse (44, 45). 
Similar to other surfactant-associated genes, SEC14L3 expression is highly induced in the lungs before birth (45). SEC14L3 functions as a sensor of liposomal lipid-packing defects and may affect surfactant composition (45). Alterations in surfactant composition may be relevant for cetaceans and other diving mammals. A study in seals suggested that pulmonary surfactants with anti-adhesive properties are important for diving mammals by facilitating alveolar reinflation after collapse (46). Because cetacean surfactants have not been characterized, it remains to be investigated whether the cetacean-specific loss of the surfactant-related SEC14L3 is associated with changes in the composition and anti-adhesive properties of cetacean surfactants.

Loss of a renal transporter gene and enhanced vasoconstriction during the diving response

Our screen revealed the cetacean-specific loss of SLC6A18 (Fig. 3C and fig. S9), which encodes a renal amino acid transporter that participates in reabsorption of arginine and other amino acids in the kidney proximal tubules. Knockout of SLC6A18 in mice resulted in reduced plasma arginine levels (47). Thus, the loss of SLC6A18 and its renal arginine reabsorbing activity provides one possible explanation for why cetaceans exhibit considerably lower plasma arginine levels in comparison to mice (48). In addition, SLC6A18 knockout in mice resulted in stress-induced hypertension (47), a condition that involves vasoconstriction. This hypertension phenotype likely arises because lower arginine levels reduce the main substrate for the production of nitric oxide, a highly diffusible vasodilating substance (47). Consistently, SLC6A18 inactivation caused persistent hypertension in a different mouse strain that is more susceptible to perturbations of nitric oxide production (49). This raises the possibility that the evolutionary loss of SLC6A18 in the cetacean stem lineage may have contributed to an increased diving capacity by indirectly enhancing vasoconstriction during the diving response.

Loss of an ion transporter gene and feeding in an aquatic environment

Saliva plays a role in lubricating the oral mucosa, in providing starch-degrading enzymes, and in the perception of taste. All these functions became less important in an aquatic environment, where the abundance of water sufficiently lubricates food and dilutes salivary digestive enzymes. In addition, the hyperosmotic marine environment necessitates strict housekeeping of freshwater resources in marine species (50); thus, freshwater loss via saliva secretion may be detrimental. We found that SLC4A9, a gene participating in saliva secretion, was lost in the cetacean stem lineage (Fig. 4A and fig. S10). Moreover, we found a convergent inactivation of this gene in the manatee, representing the only other fully aquatic mammalian lineage (fig. S10).

Fig. 4. Loss of a pleiotropic ion transporter in cetaceans relates to the dispensability of saliva secretion. (A) Several shared inactivating mutations indicate that SLC4A9 (solute carrier family 4 member 9) was lost in the cetacean stem lineage. Visualization as in Fig. 1A. All inactivating mutations are shown in fig. S10. (B) Simplified illustration of saliva secretion. SLC4A9 encodes an ion transporter. (1) In the submandibular salivary gland, SLC4A9 participates in creating a transepithelial chloride anion flux into the acinar lumen, together with another transporter SLC12A2 and chloride channels (52). (2) This first evokes a passive movement of cations across the tight junctions into the acinar lumen. (3) The resulting osmotic gradient induces a flow of water, which constitutes fluid secretion into the acinar lumen, the initial site of saliva secretion. SLC4A9 knockout in mice leads to a 35% reduction in saliva secretion (52). The remaining saliva secretion potential in SLC4A9 knockout mice is maintained by SLC12A2 (52). However, SLC12A2 lacks inactivating mutations in cetaceans; such mutations lead to severe phenotypes in humans and mice (55), suggesting that gene essentiality maintained this gene in cetaceans. In addition to saliva secretion, SLC4A9 is also involved in transepithelial sodium ion flux in the kidney (not shown here) and participates in sodium chloride reabsorption (56), a process that is less important in hyperosmotic marine environments.

SLC4A9 encodes an electroneutral ion exchange protein (51), which is expressed in the submandibular salivary gland. SLC4A9 is restricted to the basolateral membrane of acinar cells, where it participates in saliva secretion (Fig. 4B) (52). SLC4A9 knockout mice displayed a 35% reduction in saliva secreted from the submandibular gland (52). This suggests that loss of SLC4A9 in cetaceans could contribute to a reduction of saliva secretion, which is in agreement with morphological observations that salivary glands are absent or atrophied in cetaceans (53). A second gene decisively contributing to saliva secretion is SLC12A2. Knockout of this gene in mice reduces saliva secretion by more than 60% (54). However, SLC12A2 is involved in a multitude of other physiological processes, and mutations in SLC12A2 entail severely detrimental phenotypes (55). Pleiotropy therefore likely explains why SLC12A2 lacks inactivating mutations in cetaceans and all other analyzed mammals. In addition to the submandibular salivary gland, SLC4A9 is also expressed at the basolateral membrane of β-intercalated cells of the kidney, where it contributes to sodium chloride reabsorption (56). For species living in a hyperosmotic environment, where they incidentally ingest seawater with their prey, salt reabsorption by the kidney is probably less important (or even harmful) relative to efficient salt excretion.
Thus, the loss of the salt reabsorbing factor SLC4A9 may contribute to the high urinary concentrations of sodium and chloride in cetaceans as compared to cows (57). In summary, the pleiotropic SLC4A9 gene was likely lost because both of its physiological processes, secretion of saliva and salt reabsorption, became dispensable in marine aquatic environments.

Loss of melatonin biosynthesis/reception and the evolution of unihemispheric sleep

Commitment to a fully aquatic lifestyle also required distinct behavioral adaptations in stem cetaceans. Specifically, prolonged periods of sleep are obstructed by the needs to surface regularly to breathe and to constantly produce heat in the thermally challenging environment of the ocean. Cetaceans are the only mammals thought to sleep exclusively unihemispherically, a type of sleep that allows one brain hemisphere to sleep while the awake hemisphere coordinates movement for surfacing and heat generation (58). Our screen uncovered that AANAT was lost in the stem cetacean lineage (Fig. 5A and fig. S11). AANAT is a key gene required for synthesis of melatonin, the sleep hormone that influences wakefulness and circadian rhythms. Because genes that are functionally linked in a pathway tend to be co-eliminated (17, 59), we inspected ASMT (acetylserotonin O-methyltransferase), encoding the second enzyme required for melatonin synthesis, and MTNR1A and MTNR1B (melatonin receptors 1A and 1B), encoding the two membrane-bound melatonin receptors. Supporting a pattern of co-elimination of melatonin-related genes, we found that all three genes were lost in all analyzed cetaceans, with MTNR1B being inactivated in the cetacean stem lineage (Fig. 5A and fig. S12), while ASMT and MTNR1A were probably inactivated independently after the split of odontocetes and mysticetes (figs. S13 and S14). Thus, cetaceans have lost all genes required for melatonin biosynthesis and reception (Fig. 5B). In line with these findings, cetaceans exhibit low levels of circulating melatonin, which does not follow a circadian pattern (60, 61). Because dietary melatonin is readily transported into the blood stream, our finding that the melatonin-synthesizing enzymes AANAT and ASMT are lost in cetaceans further indicates that previously measured melatonin levels in cetaceans are not endogenous, but rather of dietary origin.
Furthermore, the loss of the ASMT gene suggests that the previously reported immunohistochemistry signal of ASMT protein in the retina, Harderian gland, and gut of bottlenose dolphin (61) may be attributed to antibody cross-reactivity.

Fig. 5. Complete loss of melatonin synthesis and reception may have been a precondition to exclusively adopt unihemispheric sleep in cetaceans. (A) Shared inactivating mutations indicate that AANAT (aralkylamine N-acetyltransferase), the first enzyme required to synthesize melatonin, and MTNR1B, one of the two melatonin receptors, were lost in the cetacean stem lineage. Subsequently, the second enzyme ASMT (acetylserotonin O-methyltransferase) and the second receptor MTNR1A were probably independently lost in cetaceans after the split of odontocetes and mysticetes; however, overlapping deletions of the last ASMT coding exon and MTNR1A exon 2 do not exclude the possibility of ancestral gene losses. Visualization as in Fig. 1A. All inactivating mutations are shown in figs. S11 to S14. (B) Pathway to synthesize melatonin from serotonin and the main sites of expression of the two melatonin transmembrane receptors.

Melatonin is synthesized in the pineal gland in the absence of light (i.e., at night) by serial action of the enzymes AANAT and ASMT and thereby relays information on daytime and season. Polymorphisms in AANAT or ASMT affect sleep patterns in humans (62). Furthermore, knockout of AANAT in zebrafish decreased the length of sleep bouts, causing an ~50% reduction in nightly sleep time (63). It has been suggested that melatonin influences sleep-wake cycles mainly by binding the receptors encoded by MTNR1A and MTNR1B on cells of the suprachiasmatic nucleus. Accordingly, elimination of these two receptors significantly increased the time spent awake in mutant mice (64). Furthermore, a polymorphism in the promoter region of MTNR1A was linked to insomnia symptoms (65).
In addition to influencing sleep, melatonin has also been shown to regulate core body temperature in a circadian manner, and high circulating melatonin levels evoke a reduction of core body temperature through increased distal heat loss (66). Therefore, the potential benefits of abolishing melatonin production and reception for cetaceans were likely twofold. First, by helping to decouple sleep-wake patterns from daytime, the loss of circadian melatonin production may have been a precondition to adopt unihemispheric sleep as the exclusive sleep pattern. Consistently, sleep in several cetacean species was observed to be equally distributed between day- and nighttime and is thought to be primarily influenced by prey availability (58). Second, mechanisms that reduce core body temperature appear detrimental for species inhabiting a thermally challenging environment. When we examined the melatonin biosynthesis/reception genes in the manatee, we found inactivating mutations in three of the four genes (AANAT, ASMT, and MTNR1B; figs. S11 to S13). Similar to unihemispheric sleep, manatees also display considerable interhemispheric asymmetry during slow-wave sleep (58). In addition, manatees seem to lack a pineal gland. The pangolin, the only terrestrial mammal in our dataset that exhibits an inactivated AANAT gene, also lacks a pineal gland. The results of our genomic analysis also have implications for the conflicting results on the morphological presence of the pineal gland in cetaceans. This gland has been reported to be absent or rudimentary in several cetaceans (but its absence can sometimes be variable between different individuals), while other species such as beluga, harbour porpoise, and sperm whale appear to have a fully developed pineal gland (61). 
Even if a pineal gland is present in some cetacean individuals or species, inactivating mutations in melatonin synthesis and receptor genes in all cetaceans, including beluga and sperm whale, preclude a role for this gland in melatonin-mediated circadian rhythms.
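
The reasoning used in this section to date the melatonin-related gene losses (an inactivating mutation shared between odontocetes and mysticetes points to a loss in the stem lineage, whereas lineage-specific mutations are compatible with later, independent losses) can be illustrated with a minimal sketch. This is not the pipeline used in the study. The function name `classify_loss`, the species labels, and the mutation labels are all hypothetical, and the sketch assumes that "shared" means present in every analyzed species of both clades.

```python
from typing import Dict, List, Set


def classify_loss(
    mutations: Dict[str, Set[str]],
    odontocetes: List[str],
    mysticetes: List[str],
) -> str:
    """Classify a gene loss from per-species inactivating mutations.

    `mutations` maps a species label to the set of inactivating mutations
    (e.g. frameshifts, stop codons) found in one gene. All labels are
    hypothetical placeholders. Both clade lists must be non-empty.
    """
    odo = [mutations.get(sp, set()) for sp in odontocetes]
    mys = [mutations.get(sp, set()) for sp in mysticetes]

    # The gene counts as lost in a species only if at least one
    # inactivating mutation was found there.
    if not all(odo) or not all(mys):
        return "not lost in all analyzed species"

    # Mutations present in every analyzed species of both clades.
    shared = set.intersection(*odo, *mys)
    if shared:
        return "stem-lineage loss"
    return "possibly independent losses"


# Toy data with made-up mutation labels (not taken from figs. S11 to S14).
gene_with_shared_mutation = {
    "odontocete_1": {"exon1_frameshift", "exon3_stop"},
    "odontocete_2": {"exon1_frameshift"},
    "mysticete_1": {"exon1_frameshift", "exon2_splice_site"},
}
gene_without_shared_mutation = {
    "odontocete_1": {"exon3_stop"},
    "odontocete_2": {"exon3_stop"},
    "mysticete_1": {"exon2_splice_site"},
}
clade_odontocetes = ["odontocete_1", "odontocete_2"]
clade_mysticetes = ["mysticete_1"]
```

As the caption of Fig. 5 notes, the absence of a shared mutation does not rule out an ancestral loss, for example when overlapping deletions have erased the original inactivating event, so the second category is deliberately labeled "possibly".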

Loss of genes involved in immune system, muscle function, metabolism, and development

Although cetacean phenotypes have been extensively studied, our genomic screen for genes lost in the cetacean stem lineage detected several gene losses that imply changes in phenotypes that have not been well characterized. For example, we found losses of genes involved in defense against infectious agents such as bacteria and viruses (TRIM14 and TREM1; figs. S15 and S16 and table S2). Furthermore, while mammals generally have four genes encoding peptidoglycan recognition proteins, which are receptors important for antimicrobial function and for maintaining a healthy gut microbiome, cetaceans have lost three of these four genes (PGLYRP1/3/4; figs. S17 to S19 and table S2). While the loss of these genes highlights differences in the cetacean immune system, it is not clear whether these losses are related to different pathogens encountered in a fully aquatic environment, changes in gut microbiome composition in these obligate carnivores, or other reasons. Another example is MSS51, a gene that is predominantly expressed in fast glycolytic fibers of the skeletal muscle. Inactivation of MSS51 in muscle cell lines directs muscle energy metabolism toward beta-oxidation of fatty acids. MSS51 was lost in the cetacean stem lineage (fig. S20 and table S2), suggesting that muscle metabolism may be largely fueled by fatty acids, which would be consistent with a high intramuscular lipid content in cetaceans. Cetaceans also lost ACSM3 (fig. S21 and table S2), a gene involved in oxidation of the short-chain fatty acid butyrate, but it is not clear whether this loss relates to their carbohydrate-poor diet. The loss of ADH4 (fig. S22 and table S2), a gene that metabolizes retinol and other substrates, suggests differences in vitamin A metabolism. Last, the cetacean loss of SPINK7 (fig. S23 and table S2), a gene involved in esophageal epithelium development, could be linked to the specific ontogeny of the cetacean esophagus, which is homologous to a ruminant’s forestomach. Overall, this highlights the need for further studies to investigate how the loss of these genes may affect immunity, metabolism, and development in cetaceans.

Loss of less well-characterized genes in cetaceans

Last, we detected losses of genes that have no experimentally characterized function (table S2). Some of these genes have tissue-specific expression patterns, exemplified by FABP12 (fig. S24 and table S2), a member of the fatty acid–binding protein family that is expressed in retina and testis of rats; ASIC5 (fig. S25 and table S2), an orphan acid-sensing ion channel specifically expressed in interneuron subtypes of the vestibulocerebellum that regulates balance and eye movement; or C10orf82 (fig. S26 and table S2), which is specifically expressed in the human testis. Natural losses of these uncharacterized genes provide intriguing candidates for future functional studies, which may help to relate evolutionary gene losses to particular cetacean phenotypes.

Convergent gene losses in other semi-aquatic and aquatic mammals

Several of the 85 genes lost on the stem Cetacea branch are also convergently inactivated in the fully aquatic manatee (including SLC4A9 and AANAT) or in semi-aquatic pinnipeds (table S2). We further tested whether these three lineages of (semi-) aquatic mammals have convergently lost more genes than their closest terrestrial relatives in our phylogenetic tree. To this end, we determined the number of genes that are convergently inactivated in at least two of the (semi-) aquatic mammals (represented by killer whale, Pacific walrus, and manatee) but intact in all their respective terrestrial sister species (represented by cow, polar bear, and elephant). For comparison, we determined the number of genes that are convergently inactivated in at least two of these three terrestrial mammals but intact in all their respective (semi-) aquatic sister mammals. Indeed, we found 20 genes that are convergently inactivated in at least two (semi-) aquatic mammals, whereas only two genes are convergently inactivated between at least two of their respective terrestrial sister species (fig. S27 and table S5). This finding is in contrast to a previous study showing that there are not more convergent amino acid substitutions among (semi-) aquatic mammals than there are among their terrestrial sister species (21), which might be related to the fact that the loss of a gene is generally a rarer and more radical genomic change than the substitution of an amino acid.
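
The counting scheme described above can be sketched in a few lines of Python. This is only an illustration of the comparison, not the code used for fig. S27 or table S5. The function name `convergent_losses`, the gene labels, and the toy loss calls are hypothetical, and the sketch assumes that "intact in all sister species" means intact in all three species of the comparison group.

```python
from typing import Dict, List, Set


def convergent_losses(
    lost: Dict[str, Set[str]],
    group: List[str],
    sisters: List[str],
    min_shared: int = 2,
) -> Set[str]:
    """Genes inactivated in at least `min_shared` species of `group`
    and intact in every species of `sisters`.

    `lost` maps a species to the set of genes called inactivated in it;
    a gene absent from that set is treated as intact.
    """
    candidates = set().union(*(lost[sp] for sp in group))
    result = set()
    for gene in candidates:
        n_lost = sum(gene in lost[sp] for sp in group)
        intact_in_sisters = all(gene not in lost[sp] for sp in sisters)
        if n_lost >= min_shared and intact_in_sisters:
            result.add(gene)
    return result


# Toy loss calls with placeholder gene labels (not results from the study).
lost_genes = {
    "killer_whale": {"geneA", "geneB", "geneC"},
    "pacific_walrus": {"geneA", "geneD"},
    "manatee": {"geneA", "geneB", "geneE"},
    "cow": {"geneD", "geneF"},
    "polar_bear": {"geneF"},
    "elephant": {"geneE"},
}
aquatic = ["killer_whale", "pacific_walrus", "manatee"]
terrestrial = ["cow", "polar_bear", "elephant"]

aquatic_convergent = convergent_losses(lost_genes, aquatic, terrestrial)
terrestrial_convergent = convergent_losses(lost_genes, terrestrial, aquatic)
print(len(aquatic_convergent), len(terrestrial_convergent))
```

Running the same function with the two groups swapped gives the terrestrial control count, which mirrors the symmetric design of the comparison: the same rule is applied in both directions, so a difference in counts is not an artifact of the counting rule itself.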