Sex chromosomes originated from ordinary autosomes, and their evolution is characterized by continuous gene loss from the ancestral Y chromosome. Here, we document a new feature of sex chromosome evolution: bursts of adaptive fixations on a newly formed X chromosome. Taking advantage of the recently formed neo-X chromosome of Drosophila miranda, we compare patterns of DNA sequence variation at genes located on the neo-X to genes on the ancestral X chromosome. This contrast allows us to draw inferences of selection on a newly formed X chromosome relative to background levels of adaptation in the genome while controlling for demographic effects. Chromosome-wide synonymous diversity on the neo-X is reduced 2-fold relative to the ancestral X, as expected under recent and recurrent directional selection. Several statistical tests employing various features of the data consistently identify 10%–15% of neo-X genes as targets of recent adaptive evolution but only 1%–3% of genes on the ancestral X. In addition, both the rate of adaptation and the fitness effects of adaptive substitutions are estimated to be roughly an order of magnitude higher for neo-X genes relative to genes on the ancestral X. Thus, newly formed X chromosomes are not passive players in the evolutionary process of sex chromosome differentiation, but respond adaptively to both their sex-biased transmission and to Y chromosome degeneration, possibly through demasculinization of their gene content and the evolution of dosage compensation.

Sex chromosomes have evolved independently many times in both animals and plants from ordinary chromosomes. Much research on sex chromosome evolution has focused on the degeneration and loss of genes from the Y chromosome. Here, we describe another principle of sex chromosome evolution: bursts of adaptive fixations on a newly formed X chromosome. By employing a comparative population genomics approach and taking advantage of the recently formed sex chromosomes in the fruit fly Drosophila miranda, we show that rates of adaptation are increased about 10-fold on a newly formed X chromosome relative to background levels of selection in the genome. This suggests that a young X chromosome responds adaptively to both its female-biased transmission and to Y chromosome degeneration. Thus, contrary to the traditional view of being passive players, the X chromosome has a very active role in the evolutionary process of sex chromosome differentiation.

Here, we describe patterns of DNA sequence polymorphism at many gene fragments across the D. miranda neo-X chromosome and compare them to gene fragments surveyed from the ancestral X chromosome. Contrasting patterns of polymorphism of genes from the recently formed neo-X with genes located on the ancestral X allows us to control, to some extent, for recent demographic events and life-history differences that otherwise pose a problem for identifying adaptive evolution using population variability data [ 25 – 27 ]. Thus, the unusual chromosomal configuration of D. miranda enables us to test for an elevation in rates of adaptation on a recently formed X chromosome relative to background rates of adaptive evolution in the genome [ 28 ].

The ancestral X chromosome of D. miranda consists of two arms (Figure 1): X-L (Muller's element A), which is part of the X chromosome in all species of the genus Drosophila and is >60 million years (MY) old [ 22 ], and X-R (Muller's element D), which became part of the X only approximately 10 MY ago. This X-autosome fusion is shared by species in the D. affinis and D. pseudoobscura subgroup [ 11 , 12 ]. Interestingly, X-R has already acquired the classical characteristics of an evolved X chromosome, including the evolution of dosage compensation along its entire length [ 11 , 12 ] and demasculinization of its gene content [ 17 ]. The neo-sex chromosomes of D. miranda (Muller's element C) were formed about 1 MY ago (∼10N_e generations) [ 23 ], and appear to be in transition from an ordinary autosome to a pair of heteromorphic sex chromosomes [ 4 ]. Specifically, about half of all genes have become pseudogenized on the neo-Y chromosome of D. miranda [ 24 ], and some genes on the neo-X are acquiring dosage compensation [ 11 , 12 ].

The ancestral X chromosome consists of two chromosomal arms; X-L (light grey), which is part of the X chromosome in all species of the genus Drosophila (>60 MY old), and X-R (medium grey), a former autosome that fused to X-L approximately 10 MY ago. The neo-sex chromosomes (dark grey) were formed by the fusion of another autosome to the ancestral Y chromosome about 1 MY ago. The neo-X chromosome segregates with the X chromosome in D. miranda, but is not fused to it. X-R has already acquired all the stereotypical properties of X chromosomes, whereas the neo-sex chromosomes are in transition from an ordinary autosome to a pair of heteromorphic sex chromosomes.

To test for signatures of pervasive adaptive evolution at the DNA level on a newly formed X chromosome, we take advantage of the unusual sex chromosomes (termed neo-sex chromosomes) of Drosophila miranda ( Figure 1 ). In the genus Drosophila, fusions between autosomes and the ancestral sex chromosomes (that is, the original X and Y chromosomes shared by all members of the genus Drosophila) have repeatedly created so-called neo-sex chromosomes [ 19 – 21 ]. As a result of such a fusion, one chromosome—the neo-Y—is cotransmitted with the Y chromosome through males only. Given the lack of crossing over in male Drosophila, such fusions restrict recombination between the male-limited neo-Y chromosome and its former homolog (the neo-X chromosome). In fact, the neo-Y chromosome is completely sheltered from recombination and thus exposed to the evolutionary forces causing Y degeneration [ 19 – 21 ]. The neo-X chromosome, in contrast, can still recombine in females and cosegregates with the ancestral X chromosome (i.e., it is present in two copies in females and one copy in males). Over evolutionary time periods, neo-sex chromosomes of several Drosophila species have evolved the classical properties of ancestral sex chromosomes (i.e., the neo-Y chromosome degenerates, and the neo-X chromosome evolves dosage compensation [ 19 – 21 ]).

In particular, genes on X chromosomes are faced with several unusual challenges relative to autosomal genes. First, the degeneration of the Y chromosome creates a gene dose problem for X-linked genes in males [ 10 ], resulting in the evolution of dosage compensation mechanisms on the X [ 10 – 13 ]. Another consequence of Y chromosome degeneration is the hemizygosity of X-linked genes in males, increasing the efficacy of natural selection acting on recessive mutations (known as faster-X evolution [ 14 ]). Finally, sex-biased transmission of X chromosomes can result in an accumulation, or deficiency, of genes with female- or male-beneficial functions [ 15 – 17 ]. Indeed, ancestral X chromosomes have often evolved dosage compensation mechanisms, and male-specific genes are depleted (i.e., demasculinization of the X chromosome; [ 11 , 12 , 17 ]). Genes on a newly formed X chromosome may therefore undergo accelerated evolutionary change relative to background levels of adaptation in the genome, to adjust to their altered genomic environment [ 13 , 18 ].

Sex chromosomes have originated independently many times in both animals and plants from ordinary autosomes [ 1 , 2 ]. Their evolution is characterized by a loss of gene function on the nonrecombining Y chromosome, as seen in many taxa [ 3 – 5 ]. For example, of the roughly 1,000 genes originally present on the ancestral Y chromosome of humans, only a few dozen remain [ 3 ]. Conventionally, X chromosomes were often viewed as static entities in the evolutionary process of sex chromosome differentiation, with relatively little change occurring that would distinguish the X from the autosome from which it was derived [ 2 ]. However, several recent studies have shown that the X chromosome has also undergone substantial evolutionary modifications (reviewed in [ 6 – 9 ]).

Results and Discussion

Reduced Diversity on the Neo-X

Natural selection can increase the frequency of a beneficial mutation in a population, thereby reducing neutral variation in the genomic region linked to the advantageous allele (i.e., a selective sweep [29,30]). Thus, one signature of directional selection at the DNA level is a reduction in neutral variation in genomic regions surrounding the targets of selection [29]. To test for increased rates of adaptation on the D. miranda neo-X chromosome, we surveyed DNA sequence polymorphism at 152 gene fragments across the neo-X and compared them to 112 gene fragments from the ancestral X chromosome (60 genes from X-L and 52 from X-R). Certain classes of genes tend to undergo increased rates of adaptive evolution in Drosophila, such as genes showing sex-biased expression or genes involved in some biological pathways [31,32]. Genes on both the neo-X and the ancestral X chromosome were selected randomly with regard to gene function or expression patterns, and no significant heterogeneity in gene count among gene ontology classes or patterns of sex-biased expression was detected between loci on the ancestral X chromosome and the neo-X (see Tables S1–S4).

Estimated levels of synonymous variation and synonymous divergence to an outgroup species are similar for genes on X-L and X-R (mean π_s = 0.51% vs. π_s = 0.71%; Wilcoxon two-sample test, p > 0.05 and mean K_s = 4.26% vs. K_s = 3.98% between D. miranda and D. pseudoobscura; Wilcoxon two-sample test, p > 0.05), consistent with observations that X-R has reached the typical properties of an evolved X chromosome (i.e., X-R appears fully dosage compensated in males and its gene content shows a similar deficiency of male-biased genes as X-L, the ancestral X chromosome; [12,17]). Table 1 summarizes average levels of synonymous diversity across the genomic regions studied. Synonymous site diversity is reduced by about 50% on the neo-X compared to the ancestral X (average π_s = 0.33% vs. π_s = 0.60%; Wilcoxon two-sample test, p < 3e-6), while levels of synonymous divergence to an outgroup species between the chromosomes are similar (K_s = 4.36% vs. K_s = 4.13% for the neo-X and the ancestral X chromosome between D. miranda and D. pseudoobscura; Wilcoxon two-sample test, p = 0.13), suggesting that the two chromosomes have similar mutation rates. Comparison of polymorphism and divergence levels at synonymous sites on the neo-X versus the ancestral X using a Hudson-Kreitman-Aguadé (HKA) test [30] confirms that the reduced diversity observed at neo-X genes is not attributable to a lower mutation rate (HKA test p < 10^−4, see Table S6). Additionally, many more invariant loci are observed on the neo-X chromosome (23 vs. three genes, Table 1).
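
As a rough illustration of the diversity measure compared above, the following sketch computes nucleotide diversity (the average number of pairwise differences per site) from a small aligned sample. The function name and the toy alignment are hypothetical, and the sketch treats every site equally, whereas the values reported in the text are restricted to synonymous sites. The last line simply reproduces the arithmetic behind the reported reduction (0.33% vs. 0.60%).

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site (pi) for aligned sequences.

    Assumption: all sequences have the same length and every site is
    counted; no distinction is made between synonymous and other sites.
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")

    total_diffs = 0
    n_pairs = 0
    for a, b in combinations(sequences, 2):
        # Count mismatching positions for this pair of sequences.
        total_diffs += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1
    return total_diffs / (n_pairs * length)


# Hypothetical toy alignment: four sequences, ten sites, two segregating sites.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGTACGTAT"]
pi = nucleotide_diversity(toy_alignment)

# Relative reduction implied by the averages quoted in the text.
relative_reduction = 1 - 0.33 / 0.60
```

With the averages quoted in the text, the relative reduction comes out at roughly 45%, which the text rounds to "about 50%".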

Table 1. Average Diversity Measures in Drosophila miranda across X-Linked and Neo-X Gene Fragments and Numbers of Loci Showing Evidence for Recent Adaptive Evolution
https://doi.org/10.1371/journal.pbio.1000082.t001

Nonequilibrium demography, such as recent population bottlenecks, or differences in life-history strategies between males and females can cause levels of diversity to differ between sex chromosomes and autosomes [25–27]. However, because the ancestral X chromosome and the neo-X chromosome show identical patterns of inheritance, demography and life history are expected to influence patterns of diversity on the X and the neo-X in a similar manner [25–27]. Note, however, that the neo-X chromosome was segregating as an autosome until the formation of the neo-sex chromosomes roughly 1 MY ago. This event was presumably associated with a modest decline in the population size of the neo-X (from 2N to 1.5N), but is sufficiently ancient (∼10N_e generations ago) to not leave signatures in current levels of population variation [33,34]. Thus, the ancestral X should serve as an adequate control for demographic effects on the neo-X, which suggests that natural selection is responsible for reduced levels of variability on the neo-X relative to the ancestral X. We employed several statistical approaches in order to quantify rates of adaptive evolution on the neo-X versus the ancestral X.

More Recent Selective Sweeps on the Neo-X

Recent positive selection results not only in a local reduction of variation in the genomic region surrounding the target of selection, but also in a skew in the frequency distribution of mutations surrounding the target of selection (i.e., the hitchhiking effect; [29,35]). In particular, recent adaptive evolution in the genome results in an excess of both low- and high-frequency mutations relative to neutral expectations [33,36,37]. Such approaches to detect selection using population variability data cannot be applied to invariant loci, but the 129/152 polymorphic neo-X–linked and 109/112 polymorphic X-linked loci can be examined. A composite likelihood ratio (CLR) test [38] that utilizes these population variation patterns to detect recent adaptations reveals that a greater proportion of genes located on the neo-X chromosome relative to the ancestral X reject a model of neutral sequence evolution in favor of a genetic hitchhiking model at the 5% significance level (19/129 testable neo-X genes, 15%, vs. 3/109 testable X-linked genes, 3%, Table 1; p = 0.0013, Fisher exact test). Further, the distributions of CLR test p-values for the neo-X and X are significantly different from one another (p = 2e−9, Kolmogorov-Smirnov test). Although the CLR test is not robust to some demographic scenarios [34], the ancestral X chromosome functions as an internal control to account for such effects (see above). Thus, we detect about 5-fold more adaptive events on the newly formed neo-X compared to the ancestral X.

To further evaluate evidence in support of more adaptation on the neo-X, we also applied a goodness-of-fit test (GOF test) [34] to genes rejecting the CLR test. This statistic was proposed to assess the fit of data to a selective sweep model, in order to identify loci with significant CLR tests that may be explained by demographic effects. Only one of the three genes that rejected the CLR test on the ancestral X chromosome was consistent with a selective sweep model using the GOF test, whereas 13 of the 19 neo-X genes rejecting the CLR test were consistent with a recent selective sweep (Table 1).

Adaptive evolution also leaves characteristic signatures in patterns of linkage disequilibrium (LD), with reduced LD across the target of selection, and increased LD in genomic regions flanking the target [39–41]. The ω_max statistic [39] was used to identify loci under selection based on these patterns of LD. Again we detect more selection on the neo-X chromosome (7/129 testable genes, 5%, reject neutrality at the 5% significance level) compared to the ancestral X (1/109 testable genes, 1%, reject neutrality at the 5% significance level, Table 1; p = 0.07, Fisher exact test). The genes identified as targets of recent adaptive evolution using the ω_max statistic are a subset of those identified with the CLR+GOF test. Thus, several locus-by-locus tests identify a much larger fraction of genes having undergone recent adaptive evolution on the neo-X chromosome compared to the ancestral X.

Note that the locus-by-locus tests for selection are not corrected for multiple testing, because our main interest lies in quantifying the relative excess of statistical tests rejecting neutrality on the neo-X relative to the ancestral X, and not the absolute number of significant rejections. A similar excess of adaptive evolution of neo-X–linked genes relative to the ancestral X is found if we use the false discovery rate to account for multiple testing (see Table S7). In addition, the above locus-by-locus tests of selection assume that the genomic regions surveyed are unlinked. Indeed, the genes surveyed on the neo-X and the ancestral X chromosome appear mostly independent from each other, with levels of LD being similarly low between loci (unpublished data). Also, the genomic regions we identified as having undergone recent selection on the neo-X show little evidence of clustering (the median distance between loci using the D. pseudoobscura genome sequence as a guide is 0.97 Mb for the 19 significant loci identified using the CLR test, and only two regions rejecting this test are adjacent to each other). In addition, many of the gene fragments studied here were mapped previously by in situ hybridization experiments, and found to be scattered along the polytene chromosomes of D. miranda [42–44]. Thus, most selective sweeps identified on the neo-X represent independent events.

A more formal approach to take account of multiple testing and linkage when comparing rates of evolution on the ancestral and the newly formed X chromosome is to consider all loci simultaneously rather than testing them individually. To this end, we employed a composite likelihood method (the maximized composite likelihood surface test [MCLS test]) [45] for detecting positive selection, which uses a likelihood framework similar to that of the CLR test. As opposed to the CLR or GOF tests, however, which use a specific population genetic model as the null (the equilibrium neutral model or a selective sweep model, respectively), the MCLS statistic derives the null model from the empirical data itself. In this way, the test seeks to identify loci that show unusual patterns of variation relative to the other loci (the “background loci” [45]) in the genomic screen. To determine significance of the test statistic, it is still necessary to simulate data based on an explicit model [45]. In our implementation, we use the background allele frequency distribution (the background site-frequency spectrum) obtained from the ancestral X, to test for selection on both the ancestral X and the neo-X chromosome. Again, we find evidence for many more genomic regions having undergone recent positive selection on the neo-X (13 nonoverlapping regions) compared to the ancestral X (three genomic regions, Figure 2; p = 0.035, Fisher exact test). The distributions of the minimum p-value windows are significantly different for loci surveyed on the neo-X and the ancestral X (p = 0.022, Kolmogorov-Smirnov test). All 13 neo-X regions identified as targets of recent adaptive evolution using the MCLS approach correspond to loci identified as positively selected using the locus-by-locus CLR+GOF test (above).
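
The locus-count comparisons in this section (19/129 vs. 3/109 for the CLR test, and 7/129 vs. 1/109 for the LD-based statistic) are 2×2 contingency tables evaluated with a Fisher exact test. A minimal sketch of such a test is given below, using only the standard library. The function name is hypothetical, and the two-sided convention used here (summing all tables no more probable than the observed one) is an assumption, so the values need not match the reported p = 0.0013 and p = 0.07 to the last digit.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed convention: sum the hypergeometric probabilities of every
    table (with the same margins) that is no more probable than the
    observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that k of the col1 "significant" loci fall in row 1.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    tail = sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_obs * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, tail)


# Rows: neo-X, ancestral X. Columns: rejecting neutrality, not rejecting.
p_clr = fisher_exact_two_sided(19, 129 - 19, 3, 109 - 3)
p_ld = fisher_exact_two_sided(7, 129 - 7, 1, 109 - 1)
```

Only the counts of testable (polymorphic) loci enter the tables, which mirrors the text's restriction of these tests to the 129 neo-X and 109 X-linked polymorphic loci.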

Figure 2. The Maximized Composite Likelihood Surface Calculated for Genes from the Ancestral X and the Neo-X Chromosome
The horizontal line indicates the 5% cutoff values (LR_crit) as determined separately for the X (LR_crit = 4.5) and the neo-X (LR_crit = 2.7) by simulation under a neutral equilibrium model. More significant peaks are identified on the neo-X, suggesting that more recent selective sweeps have occurred on this chromosome.
https://doi.org/10.1371/journal.pbio.1000082.g002

Could other systematic biases, such as differences in overall recombination rates or levels of variability between sampled loci, result in differential power to detect selection on the two chromosomes? Specifically, selective sweeps are more easily identified in low-recombining regions due to the increased effects of hitchhiking [34], and statistical tests have reduced power to detect selection if levels of variability are low [34]. Recombination rates do not appear to systematically differ between loci on the neo-X and the ancestral X chromosome (average levels of LD, as measured by Wall's B or Q [46], are not significantly different within loci on the two chromosomes; p > 0.2, Kolmogorov-Smirnov test). However, the neo-X chromosome has significantly reduced levels of variability and more invariant genes relative to the X (Table 1), as expected under a model of recurrent selection. Methods based on the site-frequency spectrum (CLR, GOF, and MCLS) and LD (ω_max) have reduced power to detect selection when levels of variability are low (and cannot be applied to invariant loci), implying less power to detect selective events on the neo-X chromosome. This suggests that the difference in rates of adaptive evolution between the neo-X and the ancestral X is likely underestimated. Nevertheless, utilizing many different features of our data, we consistently estimate that the fraction of genes having undergone recent adaptive evolution on the newly formed X chromosome increases 5–10-fold relative to background levels of adaptation on the ancestral X.
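
To make concrete why frequency-spectrum methods lose power at low variability, the sketch below tabulates a folded site-frequency spectrum (counts of segregating sites by minor-allele count) from an aligned sample. The function name and toy data are hypothetical, and the folded spectrum is used only for simplicity, since no outgroup is needed. An invariant locus yields an empty spectrum, so there is nothing for such a test to evaluate.

```python
from collections import Counter


def folded_sfs(sequences):
    """Folded site-frequency spectrum for aligned sequences.

    Entry i of the result is the number of biallelic sites whose minor
    allele occurs in exactly i sequences. Invariant sites and sites with
    more than two alleles are skipped (a simplifying assumption).
    """
    n = len(sequences)
    spectrum = [0] * (n // 2 + 1)
    for column in zip(*sequences):
        counts = Counter(column)
        if len(counts) != 2:
            continue
        spectrum[min(counts.values())] += 1
    return spectrum


# Hypothetical polymorphic locus: two singleton sites in four sequences.
polymorphic = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGTACGTAT"]
# Hypothetical invariant locus: no segregating sites at all.
invariant = ["ACGTACGTAC"] * 4

sfs_polymorphic = folded_sfs(polymorphic)
sfs_invariant = folded_sfs(invariant)
```

The fewer segregating sites a locus carries, the fewer observations contribute to the spectrum, which is consistent with the reduced power described for the neo-X.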

Increased Rates of Adaptation on the Neo-X

To what extent are rates of adaptation accelerated on the neo-X? To evaluate the difference in rates of adaptation on the neo-X versus the ancestral X using all loci (including invariant ones), we estimate parameters of a recurrent selection model for each chromosome. Under a model of recurrent adaptation, the average reduction in levels of variability depends on the rate at which adaptive substitutions occur (2N_eλ) and their average effect on fitness (s) [47]. We use a recently developed approximate Bayesian approach [48] to estimate these parameters. This approach uses multiple summary statistics of the population variation data (see Materials and Methods) to obtain maximum a posteriori (MAP) estimates of both 2N_eλ and s. Figure 3 shows the marginal and joint posterior distributions of the rate and the strength of selection inferred for loci on the ancestral X and the neo-X. The MAP estimate of the rate of adaptive substitutions again is roughly 10-fold higher for neo-X–linked genes compared to genes that are located on the ancestral X (Figure 3). Interestingly, we also estimate the strength of selection to be an order of magnitude higher for genes on the neo-X chromosome (Figure 3). Importantly, MAP estimates of the strength and the rate of sweeps for the neo-X fall outside of the 95% credibility intervals for the estimates on the X. Thus, not only does adaptive evolution appear to be more frequent on a newly formed X chromosome, but also the selective benefit of mutations arising may be larger for genes that have only recently become X-linked relative to genes that have been evolving under a stable chromosomal configuration. This may be expected if genes on a young X chromosome are farther from their fitness optimum [49], since they experience a new genomic environment (i.e., they used to segregate as an autosome but have only recently become X-linked).

Figure 3. Approximate Bayesian Estimation of the Rate of Adaptive Substitutions (2N_eλ) and Their Average Effect on Fitness (s) for Genes from the Ancestral X and the Neo-X Chromosome
Estimation is based on 10^6 draws from the prior (s ∼ Uniform(1.0E−06, 1.0) and 2N_eλ ∼ Uniform(1.0E−07, 1.0E−01)). (Top) The joint posterior distributions for the X and neo-X. The dotted lines correspond to MAP estimates, and darker regions indicate greater posterior density. (Bottom) The marginal posterior distributions for the strength and rate of sweeps. The marginal posterior distribution for the X is indicated in grey, and for the neo-X in black. Both the rate and the strength of selection are inferred to be an order of magnitude higher for neo-X–linked genes.
https://doi.org/10.1371/journal.pbio.1000082.g003
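
The general logic of approximate Bayesian estimation (draw parameters from the prior, simulate summary statistics, keep draws whose simulated statistics lie close to the observed ones) can be sketched with a plain rejection sampler. This is not the method of [48]: the simulator below is a made-up placeholder with a hypothetical scaling constant, it uses a single summary statistic instead of several, and it reports a posterior median instead of a MAP estimate. Only the prior bounds are taken from the Figure 3 caption.

```python
import random
import statistics

S_PRIOR = (1.0e-6, 1.0)        # prior bounds for s, from the Figure 3 caption
RATE_PRIOR = (1.0e-7, 1.0e-1)  # prior bounds for 2N_e*lambda, same source
TOY_SCALE = 1000.0             # hypothetical constant, for illustration only


def toy_diversity_ratio(s, rate, rng):
    """Placeholder simulator: diversity relative to a sweep-free baseline.

    Assumption: diversity declines as rate * s grows. This stands in for
    a real population genetic simulation and has no empirical basis.
    """
    expected = 1.0 / (1.0 + TOY_SCALE * rate * s)
    return expected + rng.gauss(0.0, 0.02)


def abc_rejection(observed, n_draws, tolerance, seed=1):
    """Keep prior draws whose simulated statistic is within tolerance."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        s = rng.uniform(*S_PRIOR)
        rate = rng.uniform(*RATE_PRIOR)
        if abs(toy_diversity_ratio(s, rate, rng) - observed) <= tolerance:
            accepted.append((s, rate))
    return accepted


# Toy target: the neo-X to ancestral X diversity ratio quoted in the text.
accepted = abc_rejection(observed=0.33 / 0.60, n_draws=20000, tolerance=0.02)
if accepted:
    median_s = statistics.median(s for s, _ in accepted)
    median_rate = statistics.median(rate for _, rate in accepted)
```

With one summary statistic that depends only on the product of rate and strength, the two parameters cannot be separated, which illustrates why the approach described in the text relies on multiple summary statistics.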