Significance Echolocation is a prime example of convergent evolution, the independent gain of similar features in species of different lineages. Is phenotypic convergence driven by underlying molecular convergence? If so, could molecular convergence include contributions from highly constrained, often-pleotropic, coding regions? We develop a generalizable test that offers a resounding “yes” to both extensively debated questions. Our test highlights molecular convergence in genes regulating the cochlear ganglion of echolocating bats and whales, the skin of aquatic mammals, and the lung of high-altitude mammals. Importantly, the approach correctly dismisses confounding convergence-like patterns, such as those from sequence decay of vision genes in blind subterranean species, and is readily applicable to the thousands of genomes sequenced across the tree of life.

Abstract Distantly related species entering similar biological niches often adapt by evolving similar morphological and physiological characters. How much genomic molecular convergence (particularly of highly constrained coding sequence) contributes to convergent phenotypic evolution, such as echolocation in bats and whales, is a long-standing fundamental question. Like others, we find that convergent amino acid substitutions are not more abundant in echolocating mammals compared to their outgroups. However, we also ask a more informative question about the genomic distribution of convergent substitutions by devising a test to determine which, if any, of more than 4,000 tissue-affecting gene sets is most statistically enriched with convergent substitutions. We find that the gene set most overrepresented (q-value = 2.2e-3) with convergent substitutions in echolocators, affecting 18 genes, regulates development of the cochlear ganglion, a structure with empirically supported relevance to echolocation. Conversely, when comparing to nonecholocating outgroups, no significant gene set enrichment exists. For aquatic and high-altitude mammals, our analysis highlights 15 and 16 genes from the gene sets most affected by molecular convergence which regulate skin and lung physiology, respectively. Importantly, our test requires that the most convergence-enriched set cannot also be enriched for divergent substitutions, such as in the pattern produced by inactivated vision genes in subterranean mammals. Showing a clear role for adaptive protein-coding molecular convergence, we discover nearly 2,600 convergent positions, highlight 77 of them in 3 organs, and provide code to investigate other clades across the tree of life.

The evolutionary gain of similar traits in distantly related species has been proposed to be encoded, in some cases, by identical sequence substitution paths in their respective genomes (1, 2). While convergent phenotypes are seldom, if ever, the result of only (parallel or strictly) convergent molecular adaptations, several cases of identical amino acid changes underlying phenotypic convergence in distant species have been documented in recent years (3). For example, parallel evolution of prestin (SLC26A5) (4⇓⇓–7) and 4 other auditory genes (8⇓–10) were identified in echolocating bats and toothed whales. However, these examples are few in number, limited in biological scope, and identified using candidate gene-based approaches.

Our recent ability to apply whole-genome sequencing to explore the genetic basis of natural complex adaptations (e.g., Fig. 1A) calls for the development of more systematic approaches for identifying adaptive molecular convergence patterns across genomes (11⇓⇓–14). Recent genome-wide screens illuminated sensory genes with sequence changes that segregate with echolocating mammals (11). Others highlighted positively selected genes that contain convergent amino acid substitutions (12) or identified parallel shifts in evolutionary rates in independent lineages of aquatic mammals (15). However, later analyses demonstrated that the genome-wide frequency of molecular convergence in echolocating mammals is similar to the frequency in nonecholocating control outgroups, suggesting that, in fact, there is no genome-wide protein sequence convergence in echolocation (16, 17).

Fig. 1. Screening for molecular convergence and divergence in the mammalian lineage. (A, Left) Simplified placental mammals phylogenetic tree. See SI Appendix, Fig. S1 for all 57 species used in the study. Colored rectangles highlight branches with independent phenotypic evolution of echolocation, aquatic, high-altitude, and subterranean lifestyles. (A, Right) Filled and empty rectangles represent target and outgroup species, respectively. We screened for (parallel or strictly) convergent and divergent amino acid substitutions along the branches leading from the last common ancestor of each target group and its outgroup to the target group itself. For example: (B) An N7T (asparagine to threonine in position 7) parallel convergent substitution in the hearing gene Prestin (SLC26A5) in echolocating mammals. (C) An E3840 divergent substitution in the vision/hearing gene Usherin (USH2A) in a pair of distantly related subterranean mammals.

Another key challenge in determining the molecular basis of phenotypic convergence is distinguishing adaptive convergent substitutions from molecular convergence-like patterns that do not contribute to the phenotypic convergence, but instead accumulate for other reasons (18). In particular, relaxation of evolutionary constraints (19, 20) or complete loss of a biological function (21) may lead to an accumulation of convergence-like (but in fact nonadaptive) substitutions through sequence relaxation in functionally related groups of genes.

Here, we demonstrate an unbiased approach that 1) identifies—in a genome-wide, function-agnostic manner—all convergent amino acid substitutions in target lineages relative to outgroups, 2) determines if these substitutions are statistically overrepresented in any (or none) of more than 4,000 tissues, and 3) checks for known functional congruence between the most-enriched tissues and the convergent phenotype.

In developing the approach, we considered molecular convergence (Fig. 1B) to include both parallel substitutions (i.e., identical substitutions in the 2 target lineages derived from the same ancestral amino acid) and strictly convergent substitutions (identical target substitutions derived from different ancestral amino acids). Following convention (22), we herein collectively refer to these both as convergent amino acids. Furthermore, we required that the exact same amino acid (the convergent amino acid) be present in every species from both target groups, and that no other amino acid identities be present in target group species. In target or outgroup clades containing 2 or more species (e.g., in the 3 echolocating bats), we allowed a missing amino acid alignment in some species, as long as the convergent amino acid was represented at least once and no other amino acids were present in the target clade.

Then, we devised a functional enrichment-based analysis that, rather than asking which individual genes contain (parallel or strictly) convergent amino acid substitutions, asked which (collection of genes, all affecting a) specific tissue or organ, if any, is most significantly affected by coding convergence. We emphasize that this approach is functionally agnostic (not candidate-based), as it considers convergent substitutions from all genes and tests for functional enrichment across thousands of tissues and organs (Methods and Fig. 2A). Furthermore, by screening for concurrent accumulation of divergent substitutions (Fig. 1C) in these same gene sets, we ensure that the test is specifically robust to otherwise-confounding sequence constraint relaxation.

Fig. 2. A molecular convergence test. Example application to echolocating mammals: (A, Top Left) We picked 2 target groups (TG) of species with a phenotypic convergence (echolocation here), and 2 outgroups (OG). (Top Middle) Our algorithm identified all cross-species conserved (and thus functionally important) amino acid positions in genes annotated for any of 4,300 specific MGI phenotype functions. (Top/Bottom Right) We then identified the subset of positions showing (parallel or strictly) convergent substitutions between our target groups and performed a hypergeometric test over positions to find the single most statistically enriched MGI function (if any) for amino acid convergence. We also tested for divergent substitutions between our target groups (red arrow). (Bottom Middle/Left) If the most-enriched convergent function is not also enriched for divergent substitutions, we declared an adaptive molecular convergence event and linked it back to the convergent species phenotypes. In this example, we discovered 25 convergence events in 18 genes regulating cochlear ganglion function in bats and whales. The cochlear ganglion prediction is particularly striking considering that 4,300 different functions were evaluated to arrive at a poster-child organ for phenotypic convergence between the 2 echolocating groups. (A and B) Comparable total number of convergent substitutions were observed in echolocating mammals (TG.I and TG.II in A, Top Left) as in 3 control sets formed by shuffling either or both target groups with their respective outgroups. (C) However, only the echolocating set of B showed a statistically significant enrichment for molecular convergence in the ontology term “cochlear ganglion degeneration.” In fact, none of the control experiments yielded any statistical enrichment across all 4,300 tested terms.

We applied our method to 3 different convergent trait gains (in echolocating, aquatic, and high-altitude mammals) and to 3 relaxation-based scenarios (vision loss in blind subterranean moles; Fig. 1A) and discovered coding molecular convergence affecting organs with clear functional convergence. We also provide the code underlying our approach so that others can use it to investigate the ever-increasing number of species now being sequenced.

Discussion Convergent phenotypic evolution is a striking and often-observed phenomenon. The extent to which phenotypic convergence is driven by molecular convergence is a fascinating question with implications on our understanding of the constraints and predictability of species evolution (13). This question is complicated by the fact that convergent lineages do not necessarily show an overall excess of convergent genomic events (Fig. 2B). Another highly debated issue relevant to this work concerns the contribution, if any, of often-pleiotropic protein-coding changes to convergent phenotypes (39, 57, 58). To address both questions, we devised a test to analyze distantly related pairs of echolocating, aquatic, and high-altitude mammals. The approach reveals excess accumulation of convergent coding substitutions in (gene sets regulating) organs with well-established roles in the context of the investigated phenotypic adaptation. As is the case in many evolutionary scenarios, when a surprising number of constrained amino acids rapidly change, one must carefully weigh a switch from purifying selection to either positive selection or neutral drift. The subterranean mammals provided an excellent case study to fortify our test against the latter scenario. Whereas all 3 combinations of distantly related moles yielded a statistically significant accumulation of convergent substitutions in vision genes, all 3 sets also yielded a much stronger statistical signature for accumulated divergent substitutions in these same gene sets. For instance, divergent substitutions outnumber convergent changes in USH2A (Fig. 3D), which—along with other genes enriched for both convergence and divergence (including CRB1, ABCA3, GUCY2F, and PDE6C)—has previously been flagged for molecular relaxation in these species (19, 28). Indeed, this observation of ours could potentially explain why echolocating lineages do not share more convergent amino acids with each other than they do with their outgroups (Fig. 2B). Varying degrees of constraint relaxation will allow amino acid substitutions to different extents in different directions, including in seemingly convergent patterns, such that a fraction of what is technically defined as convergent (Fig. 1B) is not actually the result of adaptive pressure. A key to our approach is the ability to distinguish molecular convergence driving adaptation versus superficially similar phenomena from constraint relaxation. Our results cannot, and should not, exclude additional noncoding, regulatory contribution to these same traits (59). Indeed, when a different high-altitude mammal, the Tibetan antelope, is analyzed as one target group with either Bactrian camel and alpaca (329 convergent coding substitutions) or pika (403 convergent coding substitutions) as the second target group, we do not observe any functional coding enrichment. As such, it is possible that the genomic basis of high-altitude adaptation in Tibetan antelope is mainly noncoding (60), or coding but not strictly convergent (61). By nature, life-style adaptations like the ones we identify here are complex, multitissue, polygenic affairs. In this first derivation of our test, we examine only the single functional term (where one exists) most enriched for convergent substitutions that also does not exhibit excess accumulation of divergent changes. The rationale for doing so is severalfold: Consideration of just the top term focuses attention on specific (and, possibly, the most pronounced) aspects of convergent physiological adaptations—providing discernment that is not possible when one simply collects all convergent events and considers the multiple, often-pleiotropic functions of each affected gene, or when taking a candidate-based approach. We also find plenty of compelling candidates that can motivate extensive experimental inquiry into their significance beyond the promising functional studies we identified. Recent advances in CRISPR-Cas–mediated DNA and RNA editing along with the development of in vitro organ models makes experimental evaluation of the top-term candidate substitutions easier and more powerful than ever before. Using just the 3 case studies presented here as starting examples, one could engineer and test—in a robust, isogenic manner—various combinations of our 77 candidate substitutions in human induced pluripotent stem cells (IPSCs). The modified IPSCs could be used to generate inner ear organoids containing functional hair cells (62) and determine whether their electrophysiological response to various stimuli is shifted to be more similar to that of echolocating mammals. IPSC-derived skin structures (63) from unmodified cells versus those engineered with convergent substitutions could test for changes in (ultra)structure, proliferation, or UV-induced DNA repair that more closely match the integument of fully aquatic mammals. Likewise, the mass, proliferation, and hypoxia tolerance (including DNA damage response) of 3D lung models (64) could be assayed to determine if high-altitude convergent substitutions affect cardiorespiratory characteristics beneficial for fitness at low-oxygen conditions. Like other biological tests of significance, where the true underlying statistical distribution is beyond reach, our test could be extended in numerous ways (65), such as attempting to consider additional convergence-enriched functions that rank below the top term in statistical significance. It is not trivial, however, to implement such an extension, as the enrichment of other functionally related terms would not occur in a statistically independent manner (SI Appendix, Table S4). We, therefore, leave these considerations to future work and share our codebase for this purpose. By analyzing a study that scored thousands of physiological and morphological traits for presence/absence in mammals (66), we estimate (Methods) that over a third (773/2,215, >34%) of the scored phenotypes are convergent, independently gained traits like echolocation (SI Appendix, Fig. S3B) and would be amenable to interrogation by our test. To deploy our test, one simply needs genome assemblies from a set of related species, a functionally annotated gene set for just 1 of the assemblies, and knowledge of phenotypic convergences among the species. Thousands of genomes across the tree of life are already publicly available at the National Center for Biotechnology Information alone (SI Appendix, Fig. S3A), and ongoing sequencing efforts at scales spanning large consortia to individual benchtops will generate many more. These genomes, combined with the increased ability to integrate phenotypic data into computational analyses (66⇓–68), are ripe for exploration using screens like ours to identify sequence changes underlying a myriad of convergent phenotypes across the tree of life.

Methods Code and Data Availability. The code developed for the project is provided, along with required input files and detailed usage documentation, at https://bitbucket.org/bejerano/convergentevolution. Species Set and Gene Set. In this study we used genome assemblies of 57 species (listed in SI Appendix, Table S1) and their substitutions per site-weighted phylogenetic tree (SI Appendix, Fig. S1) from UCSC (genome.ucsc.edu). We used human genome assembly GRCh38/hg38 and Ensembl (ensembl.org) release 86 human protein-coding gene set as reference. Finding Conserved Amino Acid Positions in Functionally Annotated Genes. We started by picking a pair of independent clades (with 1 or multiple species) to serve as target groups for our convergent evolution test, and their associated outgroups. The 6 sets used in this paper are shown in SI Appendix, Fig. S1. We mapped all human genes, except a small fraction overlapping segmental duplication regions (from UCSC), to each of the other 56 species using UCSC liftOver chains (69). We then excluded genes that were not mapped to at least 1 species from each of the selected target and outgroups, as well as genes lacking any functional annotations in the MGI Phenotype Ontology (more details below). In addition, to focus our analyses on pan-mammalian genes, we required that a gene is aligned in at least 40 species. To evaluate the robustness of our results, we later repeated the test with ≥30 or ≥35 species thresholds. We then used the alignments to determine the orthologous amino acid in each available species. Because cross-species exon boundaries sometimes shift for evolutionary or for alignment reasons, we excluded the first and last 2 amino acid positions in each exon from all downstream analyses. We then derived a set of testable amino acids from all remaining genes, where each amino acid is aligned in at least 1 species from each of the target and outgroups and is also conserved across all aligned species with Bayesian branch length score (BBLS) >0.9 (discussed below). We mapped only these genes back to the MGI ontology, removed ontology terms annotated by too few or too many genes (discussed below), and then excluded all genes and amino acids lacking functional annotations in the remaining set of ontology terms. The final number of ontology terms, genes, and amino acids used for testing each of the 6 species sets in the paper is described in SI Appendix, Table S2. Cross-Species Amino Acid Conservation Score. Using cross-species conservation as a hallmark of functional importance, we required that each amino acid position be aligned in at least 40 placental mammals. We then computed the total substitution per site branch length (BL) over which the dominant amino acid is conserved. This we converted to a BBLS which further takes into account the phylogenetic relatedness between species. We tested only amino acid positions conserved at BBLS above 0.9. BBLS was previously demonstrated to outperform BL scores (70) and is extensively discussed in Xie et al. (71). Mouse Phenotype Ontology. The MGI Phenotype Ontology catalogs spontaneous, induced, and genetically engineered mouse mutations and their associated phenotypes (24). Ontology data (containing 8,949 phenotypic terms; format version 1.2) was lifted over from mouse to human, resulting in 609,253 (canonicalized) gene–phenotype associations. To increase statistical power (by reducing the required multiple testing correction factor in our tests below) we ignored general ontology terms (those annotating more than 500 genes with at least 1 conserved position) or too-specific ontology terms (those annotating fewer than 10 such genes). Calling Convergent and Divergent Substitutions. Starting from an amino acid alignment at any tested position (derived above) we used PAML (V4.8) (72) to infer the most likely amino acid identity in the internal nodes of our 57 species eutherian phylogenetic tree (SI Appendix, Fig. S1). We scanned only for substitutions occurring along the branches leading from the last common ancestor of each target group and its outgroup to the target group itself. The procedure for calling convergent substitution is described in the introduction. For divergent substitutions (Fig. 1C) we used similar logic: A pair of substitutions resulting in different amino acids in the 2 target groups was called a divergent substitution. In groups of more than 1 species, we allowed missing alignments (in either target or outgroup) but also allowed different amino acids within target groups, as long as the criteria of amino acid substitution along the branch from the common ancestor with the outgroup to the target itself is satisfied. Enrichment Test over Amino Acids. Different genes contribute to our test a different number of conserved amino acids, depending on gene length and its cross-species conservation. Therefore, rather than test ontology term enrichment over genes, we tested over the conserved amino acid positions, associating each gene’s ontology annotations to all of its amino acid positions we intended to test for convergence/divergence events. This is equivalent to a gene-based test where each gene is weighted by the number of positions tested in it. For a given pair of target lineages, we started by determining the background set of conserved and MGI annotated amino acids as above (SI Appendix, Table S2). We used the hypergeometric test to calculate functional enrichment over this set. More specifically, the hypergeometric test was executed separately for each ontology term (π) with 4 parameters: 1) N, the total number of conserved and annotated amino acid positions (background); 2) K π , the total number of positions annotated by an ontology term (a subset of the background); 3) n, the total number of positions with convergent (or divergent) substitutions identified from the background; and 4) k π , the intersection between 2 and 3. We computed a standard hypergeometric P value of the observed enrichment for term π as the fraction of ways to choose n (converged or diverged) amino acid positions without replacement from the entire group of N positions such that at least k π of the n have ontology annotation π, using the formula below: P = ∑ i = k π min ( n , K π ) ( K π i ) ( N − K π n − i ) ( N n ) . [1] We corrected all P values for multiple testing by setting a Benjamini–Hochberg false discovery rate of 0.05 for significance. We also computed the fold enrichment for convergent (or divergent) substitutions labeled by a given term by dividing the observed fraction of convergent (or divergent) substitutions intersecting the tested term by the expected fraction of substitutions: (k π /n)/(K π /N). We kept only those terms meeting all of the following criteria: q-value convergence < 0.05; q-value divergence ≥ 0.05; fold enrichment convergence > 2; fold enrichment divergence ≤ 2. Finally, we ranked all remaining terms, if any, by convergent q-value and then by convergent fold enrichment. Enrichment Test Robustness to Top-Term Gene Omission. To assess the robustness of the convergent enrichment for the top term, we recomputed the statistical significance and magnitude of enrichment for the top term when omitting, one by one, each gene contributing to that term. Gene omission entailed ignoring all of its amino acid positions for all steps of the calculation (as if the gene did not exist in the genome). The statistical thresholds (q-value convergence < 0.05; q-value divergence ≥ 0.05; fold enrichment convergence > 2; fold enrichment divergence ≤ 2) remained identical for this analysis. Estimating the Abundance of Convergent Traits. To estimate the abundance of phenotypic convergence, we obtained phenotypic data (66) from a large scale mammalian study in MorphoBank (project 773; https://morphobank.org/index.php/Projects/ProjectOverview/project_id/773) that scores 2,215 morphological, physiological, and behavioral presence/absence traits (labeled as “1” and “0,” respectively) across 86 extant and extinct mammals with an existing phylogenetic tree. We labeled as “convergent” the subset of traits that 1) switch from 0 → 1 (i.e., gained) independently in the tree at least twice and 2) after reconstructing the presence/absence of all of the nodes in the tree, the trait is inferred as “absent” in the ancestral head node of the phylogeny (thus reflecting a phenotypic innovation in species with “1”) with probability larger than 0.9, using a maximum likelihood tool available in R (package ape, using SYMreconstruction for ancestral state reconstruction, https://cran.r-project.org/web/packages/ape/index.html). Protein Structural Models. Protein structure figures were created using PyMol (https://pymol.org/2).

Acknowledgments We thank the members of the G.B. laboratory, particularly A. M. Tseng for his help, and M. J. Berger, W. Heavner, H. M. Moots, J. H. Notwell, J. Birgmeier, K. A. Jagadeesh, B. Yoo, H. Guturu, and A.M. Wenger for technical advice and helpful discussions. We thank H. Clawson (UCSC) for assistance with mammalian genome alignments and the community at large for the availability of all the different genomes. We thank D. Cooper and P. Stenson from Human Gene Mutation Database for disease variant data. This work was funded in part by the National Science Foundation Graduate Research Fellowship Grant 1656518 (to H.I.C.) and NIH Grants U01MH105949 and R01HG008742, a Packard Foundation Fellowship, and a Microsoft Faculty Fellowship (to G.B.).

Footnotes Author contributions: A.M., Y.T., H.I.C., and G.B. designed research; A.M., Y.T., and H.I.C. performed research; A.M., M.G., and B.A.B. contributed new reagents/analytic tools; A.M., Y.T., H.I.C., H.W., and G.B. analyzed data; and A.M., Y.T., H.I.C., and G.B. wrote the paper.

The authors declare no competing interest.

This article is a PNAS Direct Submission.

Data deposition: The code developed for the project is provided, along with required input files and detailed usage documentation, at https://bitbucket.org/bejerano/convergentevolution.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1818532116/-/DCSupplemental.