Comparative Genome-wide Analysis of Accelerated Evolution Reveals Candidate Functional Genomic Elements for Distinctive Mammalian Traits

Sequence quality could impact AR discovery. We performed control experiments to test for reproducibility, particularly for the bats, which have large numbers of ARs. A comparison of microbat (Myotis lucifugus) ARs to ARs in the closely related species Myotis davidii found that 14,331 ARs (68%) are shared (Figures 1 and S1B). A comparison of the microbat to a more distantly related hibernating bat, the big brown bat, uncovered 13,665 shared ARs (58%) (Figures 1C and S1B). Therefore, the majority of bat ARs are reproducible between close lineages. The shared microbat-big brown bat ARs are candidate elements for shaping phenotypes in the hibernating bat lineage. We focus on these “hibernating bat” (Hib bat) ARs in the remainder of our study (Figure 1C). A large fraction of ARs (37%) are also reproducible between the closely related dolphin and orca (Figure 1C). Thus, more ARs are shared between more closely related species, which is expected only if the data are not dominated by the effects of poor genome sequences. A significant linear relationship (r = 0.9, p = 0.0041, linear regression) exists between the number of ARs in a species and the phylogenetic distance between that species and its closest relative in the background (Figures 1 and S1C). Significant relationships between AR number and assembly quality (N50) or assembly size were not observed (data not shown). Thus, our ARs are promising candidates for shaping species-specific traits.
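The shared-AR percentages above come down to counting how many ARs from one species overlap an AR from another. The following is a minimal sketch of that count, not the pipeline used in the study. It assumes each AR is a (chromosome, start, end) tuple with half-open coordinates in a common reference genome; the function name `shared_fraction` and this representation are hypothetical.

```python
from collections import defaultdict


def shared_fraction(ars_a, ars_b):
    """Count ARs in ars_a that overlap at least one AR in ars_b.

    Assumption: each AR is (chrom, start, end), half-open [start, end),
    with both lists expressed in the same reference coordinates.
    Returns (shared_count, shared_count / len(ars_a)).
    """
    # Group the second species' ARs by chromosome for lookup.
    by_chrom = defaultdict(list)
    for chrom, start, end in ars_b:
        by_chrom[chrom].append((start, end))

    shared = 0
    for chrom, start, end in ars_a:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(s < end and start < e for s, e in by_chrom.get(chrom, [])):
            shared += 1

    fraction = shared / len(ars_a) if ars_a else 0.0
    return shared, fraction
```

This version compares every pair of intervals on a chromosome, which is acceptable for illustration; for tens of thousands of ARs, a sorted sweep or an interval tree would be the more practical choice.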

Figure 2. Human and Mouse Homologs for ARs Are Enriched for Transcription-Factor-Binding Sites and Are Active Elements in Diverse Tissues and Cell Types
(A and B) The bar graphs show the number of different TFs significantly enriched for binding sites in the human (A) and mouse (B) homologs of the species’ ARs (FDR ≤ 5% in at least one cell type relative to random elements; in silico ChIP in ChIP-Atlas). (C and D) Heatmaps show enrichments for DNase-I-hypersensitive sites in the homologous elements for the species’ ARs in various human (C) and mouse (D) cell types (in silico ChIP). All shown enrichments greater than zero are significant (FDR < 5%).

We found that 18%–35% of the species’ ARs are located in annotated exons of the human genome, while the majority (65%–82%) are in noncoding regions (Figure 1D). The ARs lie in otherwise conserved elements that are predicted to be functional across mammalian species. We tested whether the human and mouse homologs of the ARs show evidence of functionality by comparing AR regions to data from 6,387 human and 5,827 mouse chromatin immunoprecipitation (ChIP) datasets in the ChIP-Atlas database (Table S2). The results show significant enrichments for binding sites of hundreds of transcription factors (TFs) and regulatory proteins (Figures 2A and 2B; FDR < 5% relative to random genomic elements). A comparison of the ARs to DNase I hypersensitivity site sequencing (DNase-seq) datasets for 19 human and 14 mouse cell types (Table S2) uncovered significant enrichments for DNase-I-hypersensitive regions in various human (Figure 2C) and mouse (Figure 2D) cell types (FDR < 5%).
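To make "enriched relative to random elements" concrete, one common approach is a one-sided Fisher's exact test on overlap counts: how many AR homologs overlap a peak set versus how many random background elements do. The sketch below illustrates that idea only. Whether the ChIP-Atlas analysis used here applies exactly this test is an assumption, and the function name and arguments are hypothetical.

```python
from math import comb


def fisher_enrichment_p(ar_hits, ar_total, bg_hits, bg_total):
    """One-sided Fisher's exact p-value for ARs overlapping peaks
    more often than background elements.

    Assumed inputs (hypothetical names):
      ar_hits / ar_total: AR homologs overlapping a peak / all AR homologs
      bg_hits / bg_total: random elements overlapping a peak / all random elements
    """
    total_hits = ar_hits + bg_hits
    total = ar_total + bg_total
    denom = comb(total, ar_total)
    max_k = min(ar_total, total_hits)
    # Sum hypergeometric probabilities of seeing ar_hits or more overlaps
    # among the ARs, given the fixed row and column totals.
    numer = sum(
        comb(total_hits, k) * comb(total - total_hits, ar_total - k)
        for k in range(ar_hits, max_k + 1)
    )
    return numer / denom
```

Exact integer arithmetic keeps the sketch dependency-free, although for counts in the tens of thousands a log-space or library implementation would be faster. Each p-value obtained this way would still need multiple-testing correction before an FDR threshold is applied.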

Figure 3. ARs from Different Species Are Biochemically Active Elements in Humans and Mice and Are Differentially Enriched for Specific Epigenetic Marks
(A and B) Heatmaps show the enrichment for different biochemical marks in the homologous elements for the species’ ARs in human (A) and mouse (B) ChIP experiments. All enrichments greater than zero (white squares) are statistically significant compared to random elements (FDR < 5%, in silico ChIP, ChIP-Atlas).

It is estimated that ∼30% of human ARs are enhancers (Capra et al., 2013) and some are noncoding RNAs, but most remain uncharacterized (Hubisz and Pollard, 2014). We investigated the nature of the human and mouse homologs of the species’ ARs by comparing these regions to available ChIP sequencing (ChIP-seq) and genomics datasets for 68 and 76 epigenetic marks in human and mouse cells, respectively, including various histone modifications, 5-mC methylated DNA, 5-hmC hydroxymethylated DNA, CTCF (CCCTC-binding factor)-binding sites, and EP300 (E1A-binding protein p300)-binding sites (Table S2, ChIP-Atlas). Human (Figure 3A) and mouse (Figure 3B) homologs of the species’ ARs are significantly enriched for 34 and 44 different epigenetic marks, respectively (FDR < 5% relative to random elements). Significant enrichments for markers of active enhancers (H3K27ac, H3K4me2, H3K4me1, and EP300), active promoters (H3K4me3 and H3K4me2), transcribed elements (H3K36me3), and repressed elements (H3K27me3) were observed for the human and mouse homologs of all six species’ ARs (Figures 3A and 3B; Table S2), suggesting that ARs affect functional elements across different species. We do not know whether these results extend beyond the mouse and human homologs.
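Counting how many marks pass "FDR < 5%" requires adjusting the per-mark p-values for multiple testing. A minimal sketch of the Benjamini-Hochberg adjustment is given here as one standard way to obtain such q-values; the text does not state which correction procedure was used, so treating it as Benjamini-Hochberg is an assumption, and the function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(p_values)
    # Indices of p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def count_significant(p_values, fdr=0.05):
    """Number of tests whose adjusted q-value falls below the FDR threshold."""
    return sum(q < fdr for q in benjamini_hochberg(p_values))
```

Applied to one p-value per epigenetic mark, `count_significant` would yield a tally comparable in kind to the 34 and 44 enriched marks reported above, given the same inputs and correction method.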