Genetic variation within populations was significantly higher in SBGs, whether estimated as single nucleotide polymorphism or as nucleotide diversity (π), and this was particularly pronounced for MBGs (Fig. 1a–c). In theory, increased genetic variation could be due to stronger balancing selection, but relaxed purifying selection on SBGs could also contribute, since net screening of gene copies by selection is weaker if only one sex expresses a gene3. We used the ratio of polymorphism (p) at non-synonymous and synonymous sites (p_N/p_S) between segregating variants to infer the efficacy of purifying selection within populations20. The absolute value of p_N/p_S is somewhat difficult to interpret in this context, but our observation of within-population p_N/p_S ratios of approximately 0.1 for weakly SBGs is at least consistent with simulations of strong purifying selection21. More importantly, the fact that this ratio was significantly higher for more strongly SBGs (Fig. 1d) aligns well with the prediction that SBGs should experience relaxed purifying selection3. That this pattern was more pronounced for nucleotide diversity at non-synonymous sites (π_ns) than at synonymous sites (π_s) (Fig. 1b versus 1c) is also consistent with this conclusion. Similar findings have previously been reported in flycatchers20, guppies10, birds9 and humans22. This strongly suggests that the unique properties of SBGs are at least in part due to relaxed purifying selection3.

Fig. 1: Population genomic analyses of SBG expression in three diverged C. maculatus populations. Shown are mean (±95% bootstrap CI) metrics for genes showing different degrees of sex-biased expression, separately for the three populations (Brazil, blue; California, red; Yemen, green). Genes were grouped into quartiles based on their log2FC value, separately for FBGs and MBGs, resulting in eight bins in total. Sample size per bin is n = 592–656 genes. a, The density of polymorphism varied significantly across SBG categories (all P < 10⁻⁶). b,c, Nucleotide diversity also varied significantly across the SBG categories in synonymous (b; all P < 0.002) and, particularly, in non-synonymous (c; all P < 10⁻⁶) sites. These three different measures of DNA sequence variation all showed increased variation in SBGs, particularly in MBGs. d, The pattern of p_N/p_S across SBG categories (all P < 10⁻⁶) was consistent with a history of strong negative selection in the least sex-biased genes and relatively relaxed purifying selection with increasing sex-bias. e,f, Estimates of D also varied across SBG categories, significantly so in all populations when based on synonymous sites (e; Brazil: F(7,4697) = 3.05, P = 0.004; California: F(7,4877) = 4.97, P < 0.001; Yemen: F(7,4851) = 5.66, P < 0.001) and in one population when based on non-synonymous sites (f; Brazil: F(7,4011) = 1.76, P = 0.090; California: F(7,4149) = 4.35, P < 0.001; Yemen: F(7,4176) = 1.63, P = 0.122). Those SBG categories showing overall positive D with CIs not overlapping zero were intermediately biased FBGs. In fact, D tended to relate to sex-bias in gene expression by a wave-shaped pattern, which was significantly sigmoidal in three out of the four cases where the effect of SBG category was significant (third-order polynomial contrasts: California D_ns: F(1,4190) = 6.372, P = 0.012; California D_s: F(1,4877) = 11.59, P < 0.001; Yemen D_s: F(1,4851) = 8.45, P = 0.003). Fitted functions in e and f represent cubic polynomials. 
This pattern was also seen when instead modelling sex-bias as a continuous trait (Supplementary Fig. 4) and remained intact when accounting for variation in overall gene expression, gene length and GC content (Supplementary Table 5).

Variation in standing genetic variation within populations represents the outcome of several interacting processes, notably balancing selection, purifying selection and genetic drift. To estimate the net effect of these processes, we related Tajima’s D (D) to sexual dimorphism in expression. D summarizes the site-frequency spectrum (the distribution of single nucleotide polymorphism, SNP, frequencies in a population) and represents a measure of the relative proportion of variable sites at a given locus, normalized such that D = 0 is expected for genes under mutation-drift equilibrium while D > 0 signifies balancing selection and D < 0 purifying selection. Our analyses unveiled a characteristic wave-shaped relationship between expression dimorphism and D, which was consistent across populations and across synonymous and non-synonymous sites (Fig. 1e,f), showing that the strength and nature of overall selection depends upon the type and degree of sex-bias in gene expression. Weakly biased FBGs showed strongest signs of balancing selection within populations, consistent with an elevation of genetic variation in these genes due to SA selection. We identified 149 candidate SA loci, representing FBGs (log2FC > 1, where FC denotes fold change) that also showed D_ns > 0 and D_s > 0 in all three populations. Classic theory23 predicts that the X chromosome should be enriched with SA loci24, and some population genomic studies have found support for this tenet25 while others have not26. We found little evidence for a general enrichment of candidate SA loci on the X chromosome in our study, as only two of these 149 loci were located on X-linked contigs (Fisher’s exact test; P = 0.336). 
We also note that genes with sex-limited expression, potentially reflecting genes where SA has been resolved1, were not significantly overrepresented on the X chromosome although this may in part reflect the apparent occurrence of partial dosage compensation and/or female X-inactivation in this species (Supplementary Results). Gene ontology enrichment analyses of our candidate set showed enrichment for genes involved in (1) a variety of general metabolic processes, (2) organelle (for example mitochondrial) organization and (3) cell division and egg production (Supplementary Table 6). Several of those FBGs that showed a signal of strong balancing selection in all three populations showed significant homologies with key metabolic genes, for example involved in ATP production, known to affect life-history traits such as life-span in other species27 (Supplementary Results).
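As an illustration of how D summarizes the site-frequency spectrum, the following is a minimal sketch of the standard Tajima's D calculation for a single gene. It assumes individual-level allele counts and the same sample size n at every segregating site; the estimator actually used in the study is not described in this section, and the function name `tajimas_d` is hypothetical.

```python
import math
from typing import Optional, Sequence


def tajimas_d(allele_counts: Sequence[int], n: int) -> Optional[float]:
    """Tajima's D for one locus sampled as n sequences.

    allele_counts holds, for each segregating site, the number of the n
    sequences that carry one of the two alleles (0 < count < n).
    Returns None when the locus has no segregating sites.
    Assumption: every site was scored in all n sequences.
    """
    if n < 4:
        raise ValueError("need at least 4 sequences for a usable variance")
    s = len(allele_counts)  # number of segregating sites, S
    if s == 0:
        return None
    if any(k <= 0 or k >= n for k in allele_counts):
        raise ValueError("each count must satisfy 0 < count < n")

    # Mean number of pairwise differences (pi), summed over sites.
    pi = sum(2 * k * (n - k) for k in allele_counts) / (n * (n - 1))

    # Constants from Tajima's (1989) variance approximation.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D contrasts pi with Watterson's estimator S / a1.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

With ten sequences, a locus carrying only singletons (`tajimas_d([1] * 5, 10)`) gives a negative value, whereas a locus whose variants all sit at 50% frequency (`tajimas_d([5] * 5, 10)`) gives a positive one. This matches the interpretation given above: an excess of rare variants pushes D below zero and an excess of intermediate-frequency variants pushes it above zero.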

Sexual conflict can be resolved by sex-limited expression of genes1. We therefore expect balancing SA selection to be absent or weakened in highly SBGs, as an indirect result of relaxed selection in the sex showing little or no expression of a given gene3. We found that overall balancing selection was indeed weakened in strongly FBGs (Fig. 1e,f). A similar finding in humans has been interpreted as evidence for SA selection being a major source of balancing selection among SBGs28. The fact that weakly SBGs showed the clearest hallmarks of balancing selection is consistent with the hypothesis that SA is more likely to promote the maintenance of polymorphism in genes where the evolution of sex-specific expression is constrained such that SA selection is more enduring29,30.

The pattern of overall selection in MBGs within populations was different from that in FBGs (Fig. 1e,f). While very weakly biased MBGs showed some evidence for overall balancing selection in two populations, intermediately biased MBGs tended to show, if anything, overall purifying selection. Clearly, the overall pattern of selection is distinct in MBGs and FBGs and the marked influence of balancing selection seen in intermediately biased FBGs was absent in MBGs. This does not, of course, negate the possibility that some MBGs may be involved in balancing SA selection, but it does suggest that MBGs are overall more affected by negative selection than are FBGs. This accords with the suggestion that MBGs should be less constrained by pleiotropy than FBGs or unbiased genes4,8,30 and should be more affected by purifying sexual selection1 than FBGs. The fact that there are overall more MBGs than FBGs in C. maculatus supports this possibility, as does the fact that FBGs generally show more overlap across tissues than do MBGs19. Available evidence thus implies that FBGs are more often subject to antagonistic pleiotropy through shared function across sexes and tissues. We found further support for this hypothesis in that the degree of shared expression across tissues (abdomen versus head and thorax) among our 149 candidate SA loci was considerably and significantly higher (92%) than expected on the basis of all expressed genes (79%; Supplementary Results).

Several studies have shown that genetic variation in fitness in C. maculatus populations is, to an appreciable extent, SA16,17,18. Detailed phenotyping and experimental studies have placed general life-history traits, such as metabolic rate, locomotor activity, body mass, life-span, mitochondrial function and female egg production at the epicentre of SA selection13,14,15. The molecular hallmarks of selection documented here accord remarkably well with this previous body of research: general life-history genes tend to be female- rather than male-biased in expression in this species19 and we found that candidate SA loci were indeed enriched with genes involved in general metabolic processes and egg production. MBGs are instead enriched with genes with more special functions, such as receptor signalling pathways, visual perception, detection of chemical stimulus and neurotransmitter transport19. Interestingly, a focused analysis of 185 genes encoding C. maculatus male ejaculate proteins, which are male-biased in expression (Supplementary Results), under sexual selection12 and generally assumed to be candidate SA loci31, provided only limited evidence for overall balancing selection (Supplementary Figs. 5 and 6). Our results thus imply that genetic variation maintained by balancing SA selection is highly polygenic and is dominated by weakly FBGs involved in general life-history traits rather than by sex-specific traits under sexual selection in males.

Loci under balancing SA selection are predicted to show a higher degree of shared polymorphism across diverging populations, as ancestral polymorphisms are more likely to be maintained over time by balancing selection5. Previous studies have found that candidate SA loci are indeed more likely to show shared polymorphism, both across populations in fruit flies26 and across closely related species in flycatchers20. We tested this prediction by modelling the probability that genes carrying ≥1 SNP showed shared intermediate frequency polymorphism (minor allele frequency 0.3–0.5 in all three populations). This analysis revealed that FBGs showed a significantly higher probability of shared intermediate frequency polymorphism than did MBGs (Fig. 2). This is consistent with our analyses of balancing selection and, in fact, within-population estimates of D covaried strongly with shared polymorphism across populations (Supplementary Table 5). Genes with high values of D_ns, and to a lesser extent also p_N/p_S, in the three populations were more likely to show shared polymorphism, while genes with more divergent estimates of D_ns were less likely to show shared polymorphism. Male seminal fluid proteins and, in particular, sex-linked genes showed a relatively low incidence of shared intermediate frequency polymorphism, in concordance with a more pronounced role of purifying selection in these genes (Supplementary Fig. 7).
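The criterion for shared intermediate frequency polymorphism can be sketched as a short check on per-population allele frequencies. This is an illustrative sketch only: the function names and the input layout (one tuple of allele frequencies per SNP, one entry per population) are assumptions, and the study's own SNP-calling pipeline is not reproduced here.

```python
from typing import Iterable, Sequence


def minor_allele_frequency(freq: float) -> float:
    """Fold an allele frequency so that the result lies in [0, 0.5]."""
    return min(freq, 1.0 - freq)


def is_shared_intermediate(snp_freqs: Sequence[float], low: float = 0.3) -> bool:
    """True if the SNP's minor allele frequency is at least `low` in every
    population. The upper bound of 0.5 holds by definition of the MAF."""
    return all(minor_allele_frequency(f) >= low for f in snp_freqs)


def gene_has_shared_polymorphism(
    snps: Iterable[Sequence[float]], low: float = 0.3
) -> bool:
    """True if the gene carries at least one SNP that is at intermediate
    frequency in all populations (here: all entries of each tuple)."""
    return any(is_shared_intermediate(freqs, low) for freqs in snps)
```

For example, a gene with SNP frequencies `[(0.4, 0.65, 0.5), (0.1, 0.5, 0.5)]` qualifies because its first SNP has a minor allele frequency of at least 0.3 in all three populations, whereas a gene whose only SNP is at `(0.1, 0.4, 0.4)` does not. The resulting per-gene indicator is the kind of binary response that a binomial model with a logit link would take.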

Fig. 2: The effect of SBG expression on shared polymorphism across the three populations. This figure shows predicted values (± s.e.m.) of the probability that a gene harbours ≥1 SNP that shows intermediate frequency polymorphism in all three populations across bins, from a generalized linear model (binomial errors and a logit link function) accounting for the effects of gene length and SNP density. The effect of SBG category on shared polymorphism, accounting for these covariates, was highly significant (Wald χ²(7) = 24.93, P < 0.001). Sample size per bin is n = 592–656 genes.

The identification of candidate SA loci can be refined by combining several metrics, each of which suggests a history of balancing SA selection6. We inspected the subset of genes showing (1) shared intermediate frequency polymorphism across populations, (2) signs of balancing selection within all populations (D_ns > 1) and (3) at least weak sex-biased expression (log2FC > 1 or < −1). This identified 15 FBGs and 10 MBGs. Functional enrichment of these genes again showed an enrichment for general metabolic and catabolic processes (both sets) and egg production (the female set) (Supplementary Table 7).
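The three-part filter described above can be expressed as a small classification function. This is a hedged sketch, not the study's pipeline: `GeneSummary`, `classify_candidate` and their fields are hypothetical names, the thresholds are parameters whose defaults are the values quoted in the text, and the sign convention (positive log2FC as female-biased) follows the definition of candidate FBGs given earlier in this article.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class GeneSummary:
    """Per-gene summary statistics (hypothetical container)."""
    name: str
    log2fc: float                # sex-bias in expression
    d_ns: Sequence[float]        # non-synonymous Tajima's D, one per population
    shared_polymorphism: bool    # shared intermediate frequency polymorphism


def classify_candidate(
    gene: GeneSummary, d_min: float = 1.0, fc_min: float = 1.0
) -> Optional[str]:
    """Return "FBG" or "MBG" if the gene passes all three criteria, else None."""
    # (1) shared intermediate frequency polymorphism across populations
    if not gene.shared_polymorphism:
        return None
    # (2) signs of balancing selection within every population
    if not all(d > d_min for d in gene.d_ns):
        return None
    # (3) at least weak sex-biased expression, in either direction
    if gene.log2fc > fc_min:
        return "FBG"
    if gene.log2fc < -fc_min:
        return "MBG"
    return None
```

Applying `classify_candidate` to every gene and counting the two labels would yield the sizes of the female and male candidate sets. Requiring all three criteria at once is deliberately conservative, since each metric on its own can be produced by processes other than balancing SA selection.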

We currently lack a recombination map of the C. maculatus genome and it is therefore not possible to assess whether and how variation in recombination rate across the genome might have influenced our results. C. maculatus has a fairly large (1.2 gigabases) and repeat-rich (>65%) genome with ten chromosome pairs (2n = 18 + XX/XY), and we found that genes carrying SNPs showing intermediate frequency polymorphism were distributed across contigs in accordance with random expectations rather than being enriched on some contigs (Supplementary Results). These facts suggest that linked selection is not responsible for the genome-wide patterns documented here.

Some studies have shown that loci with SB expression, in particular MB genes, tend to show increased rates of divergent sequence and expression evolution1. This is consistent with relaxed purifying selection in SBGs3 but is difficult to reconcile with theory4,7 and empirical observations26 of signals of balancing selection and shared polymorphism in candidate SA genes. This apparent incongruence is not yet fully resolved. Possible resolutions may include a release from SA constraints in strongly SBGs29, allowing such SBGs to respond to divergent sex-specific selection. Other factors may involve strong positive sexual selection in a subset of MB genes4, fewer constraints through antagonistic pleiotropy in certain MB genes1,4,30, the fixation of alternative alleles across some SA loci for complex polygenic traits under SA selection32 and the possibility that inter-locus SA coevolution2 spurs rapid evolution in a subset of SBGs. This is clearly an issue that deserves further attention.

In conclusion, the hypothesis that balancing SA selection has a major influence on genome-wide levels of genetic variation has considerable support from quantitative genetic studies but has rarely been tested using large-scale genomic data in species in which SA selection, SBG expression and SA phenotypes are well understood4. We provide such a test and our findings supported many key predictions and generated new insights ((1)–(4) as follows). (1) We found genome-wide evidence for relaxed purifying selection in SBGs, supporting the tenet that relaxed selection contributes to relatively high levels of genetic variation and rates of evolution of SBGs3. (2) However, our analyses also showed that indices of balancing selection covaried more tightly with shared genetic variation across populations than did those of relaxed purifying selection, which suggests that SA pleiotropy plays a central role in the elevation of genetic variation seen in SBGs. (3) Theory suggests that SA should be highly polygenic7, which seems to be true in Drosophila26. In line with this prediction, our analyses identified many candidate SA loci. This molecular genetic finding corresponds well with recent quantitative genetic findings in this species, which have documented a negative genetic covariance between male and female reproductive fitness15,16,17 and have provided evidence for genome-wide sex-specific dominance reversal for fitness18. The latter phenomenon greatly increases the capacity for SA selection to generate balancing selection that results in stable polymorphism33 and promotes the maintenance of polygenic SA variation.

Finally, strong sexual selection on MBGs and male-specific traits has traditionally been assumed to be the primary generator of SA pleiotropy. In contrast to this belief, we found that (4) the footprints of balancing SA selection were most pronounced in weakly FBGs involved in metabolic processes that affect general life-history traits, matching previous studies identifying SA phenotypes in this species14,15,16,17,18. MBGs known to be under sexual selection in males (that is, male seminal fluid proteins) did not generally show consistent evidence for balancing SA selection. The degree to which the patterns documented here are general, as opposed to being specific to our model system, is currently unclear, as few studies have examined genetic diversity in SBGs in species with known SA phenotypes4,5. However, in conjunction with recent single-locus studies that have also revealed SA selection on genes related to metabolic processes and life-history traits33,34,35, our findings do suggest that our understanding of SA pleiotropy may need to be revised: a primary generator of this perpetual genetic tug-of-war between the sexes seems to be genes involved in a variety of general metabolic cascades, where sex-biased expression is constrained by shared function across the sexes.