Mapping sex-differential gene expression, we found more than 6500 protein-coding genes with significant SDE in one tissue or more. The most differentiated tissue was the breast mammary gland, with more than 6000 genes having significant SDE (Fig. 1). This remarkable sex-biased gene expression is likely due to the distinct physiologic properties of this tissue between men and women [2]. In evolutionary terms, differential selection between the sexes of so many genes that are likely involved in lactation, an essential reproductive trait, might inhibit optimal adaptation of this trait due to its distinct importance in men and women.

Almost all SDE genes are sex differentiated in one or just a few tissues. Thirty-one genes have SDE in six or more tissues. Besides Y-linked genes that have men-specific expression, 16 of the other genes are X-linked, with multiple-tissue SDE in either men or women. Three of these X-linked genes are located in the PAR1 region (Additional file 6: Figure S4; Additional file 5: Table S2), which includes genes that undergo recombination with the Y chromosome and also escape X-inactivation [33]. These PAR1 genes have identical sequences in their X and Y copies (Additional file 5: Table S2), but are only classified as X-linked in the GTEx data. While this should have led to similar expression in men and women (as in most autosomal genes), these genes have men-biased expression in multiple tissues. It is possible that although the copies are identical, the regulation of their expression is distinct between the X and Y-chromosomes. Besides the PAR1 genes, X-linked SDE genes in multiple tissues were found to only have women-biased expression (Additional file 6: Figure S4). In several cases we found that such genes have an active paralog on the Y chromosome and it is therefore likely that these genes escape X-inactivation and both X alleles are expressed in women, while men have only one X-linked allele.

Aside from the mammary glands, the adipose, skeletal muscle, skin, and heart tissues have over one hundred SDE genes. This indicates substantial differences in the physiology, or alternate biological pathways, in these tissues between adult men and women. However, the differences in the number of SDE genes per tissue should be carefully assessed because the variability in tissue sample sizes could contribute to the number of SDE genes per tissue that we can identify. Functional terms analysis of SDE genes suggests sexual dimorphism in fat biogenesis, muscle contraction, and cardiomyopathy (Additional files 13 and 14: Tables S4 and S5). Tissues with few identified SDE genes might have overall similar function between men and women, yet even very few SDE genes can have extensive physiological impacts on the organism. For instance, the pituitary gland has only 26 identified SDE genes (Figs. 1 and 2), but two of them are the FSHB (women-biased) and TSHB (men-biased) gonadotropin hormones that have wide-ranging roles in human reproduction and metabolism [46, 47]. Another example is the CYP3A4 and CYP2B6 cytochrome P450 enzymes, which have women-biased expression in liver. Cytochrome P450 (P450, CYP) enzymes are associated with drug metabolism and other essential catabolic processes [48], and might be involved in sex-differential drug responses, as previously reported [49]. Other identified specific genes might shed new light on the pathophysiology of human diseases. For instance, the NPPB gene, which is mainly overexpressed in young women’s hearts (Additional file 18: Figure S13), is related to cardiovascular homeostasis [36, 37]. Variations in this gene are associated with postmenopausal osteoporosis, a health condition mainly affecting women [50]. Thus, a sexually dimorphic effect of this gene on both phenotypes would be interesting to assess.

To evaluate the association between SDE and selection we identified sex-specific genes. Such genes are likely to possess different roles between the sexes and therefore are likely to undergo different selection pressures in each sex. The vast majority of sex-specific genes we found are overexpressed in the testis. We previously showed reduced selection and accumulation of damaging mutations in such genes. Here we confirmed our previous findings, and extended them to many more testis-overexpressed genes and to sex-specific genes of other tissues in men and women. Many of the non-testis sex-specific genes are also related to the reproductive system, including genes expressed in tissues common to both sexes, such as gonadotropin hormones expressed in the pituitary (e.g., FSHB and CGB7). Dozens of genes with no direct association to reproduction were also identified as sex specific. Many of these genes are expressed in skin tissues, are linked to hairiness (Additional files 13 and 14: Tables S4 and S5), and are likely involved in hair dimorphism in women and men. Other non-reproductive genes do not seem to share common features with each other, but are each interesting on their own, for example, the moderately men-specific growth hormone GHRH and the men-specific calcitonin-related polypeptide alpha (CALCA) (Additional file 17: Table S7). The latter is involved in calcium regulation and functions as a vasodilator [51, 52]. The genes for both seem specific to adult men, although they are related to apparently general biological processes.

Analyzing selection on highly and moderately men- and women-specific genes, we found a significant association with reduced selection efficiency, as reflected in their dDNS/dS and dStop/dS ratios (Table 1, Fig. 6). The reduced purifying selection efficiency was also correlated with the level of sex specificity. This suggests that higher sex specificity indicates greater distinction in the functional importance for each sex, and reduced selection efficiency. This in turn enables the propagation of damaging alleles through the non-expressing sex lineages. The resulting relatively high population frequencies of these alleles can enhance the prevalence of different human diseases.

Although we found reduced selection on both men- and women-specific genes, it is notable that reduced selection was more prevalent in men-specific genes (Fig. 6). This supports our previous expectation that men-specific genes are under less selection than women-specific genes [24]. We suggest that the basis for this could be the practically unlimited number of available male gametes compared to the restricted number of available female gametes, as suggested in the Bateman principle [53]. Thus, the ability of women to pass on alleles that cause men-specific lethality will have less effect on the number of fertile men required to sustain the population, but not vice versa.

In this work we focused on protein-coding genes, because there is currently broad functional knowledge of these genes and extensive experience in analyzing and quantifying the selection trends they have undergone. However, the importance of non-coding RNA genes for the regulation and execution of sexual dimorphism was not ignored. For instance, the function of the XIST long non-coding RNA gene in the sex-specific X-inactivation process is well documented (Additional file 19: Figure S11) [54]. Our preliminary observations of the RNA gene differential transcriptome support a global role of these genes in the sex genetic architecture (Additional file 20: Figure S12). Hence, this work and the data it provides might trigger further in-depth studies on the contribution of RNA genes to sexual dimorphism.

Finally, the vast majority of sex-specific genes we found are associated with the reproductive system. Damaging mutations in many reproductive genes can hence propagate to high population frequencies. We suggest that sex-specific genes are major contributors to the high incidence of infertility in men and women.

Our results are delimited by the scope of the data in the GTEx study. This study includes 53 tissues from adult humans. All tissues are composed of several cell types and a few are represented in fewer than 15 men or women donors. We believe our statistical and analysis measures excluded most false-positive results. However, the distinct age limits of the samples are acutely pertinent to sexual dimorphism and we do not know how much of our findings can be extended beyond adults. Examining comparable data from puberty and during embryonic stages of sex determination will likely augment the genes and phenomena described here.

After submitting this work for review, two studies on sexual dimorphism in human gene expression were made public. Kassam et al. examined the sex-specific genetic architecture of autosomal gene expression in whole blood samples from about one thousand men and one thousand women using DNA arrays [55]. No differences between men and women were found in autosomal genetic control of gene expression. We too did not identify autosomal genes with different expression between men and women in the GTEx whole blood tissue (Fig. 1; Additional file 3: Table S1). Chen et al. posted to bioRxiv a non-peer-reviewed preprint analyzing the GTEx data for gene expression sexual dimorphism and regulatory networks [56]. They report sexually dimorphic patterns of gene expression involving as many as 60% of autosomal genes. Similar to our findings, they reported breast, skin, adipose, heart, and skeletal muscle as the most sexually dimorphic tissues. The studies vary in their analysis procedures and emphasize different contexts of SDE. These studies are complementary works with different insights.

The mode of gene expression is very complex, depending on the gene’s genomic and chromatin contexts, activity of other genes, expressing tissue, the individual’s developmental stage, and external factors such as exposure to pathogens, diet, and temperature. The expression level of genes thus varies temporally (in scales of minutes to decades) and across tissues, and is a multidimensional system. This is the key challenge in evaluating differential gene expression between populations.

SDE between men and women stems from any deviations of gene activity in place (i.e., organs, tissues, and cells) and time (e.g., developmental stage, age, cell cycle point, or periodic processes). The overall distribution of gene expression values in two populations could be highly similar, and distinct in only a minor subset of samples that represents a genuine biological difference in time and/or place. For instance, a gene can have similar basal expression in men and women, but upon sex-specific induction its expression will be altered only in one sex. Thus, only a small fraction of one population in any one time might differentially express this gene. Identifying differential expression is thus a challenging problem. In addition, sex-specific expression is a particular case of SDE, in which genes present a global bias in their mode of expression in one sex compared to the other.
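The point that two populations can look alike overall while differing in a small induced subset can be illustrated with a toy calculation. The sketch below uses made-up expression values (not GTEx data) and a hypothetical induction threshold of 50. It shows identical medians in both groups even though 5% of one group expresses the gene at a tenfold higher level.

```python
from statistics import mean, median

# Toy expression values (arbitrary units); these numbers are invented
# for illustration and are not taken from the GTEx data.
women = [10.0] * 100                 # basal expression in every sample
men = [10.0] * 95 + [100.0] * 5      # basal in most samples, induced in 5%


def fraction_above(values, threshold):
    """Fraction of samples whose expression exceeds the threshold."""
    return sum(v > threshold for v in values) / len(values)


summary = {
    "median_men": median(men),
    "median_women": median(women),
    "mean_men": mean(men),
    "mean_women": mean(women),
    # The threshold of 50 is a hypothetical cutoff for "induced" samples.
    "induced_fraction_men": fraction_above(men, 50.0),
    "induced_fraction_women": fraction_above(women, 50.0),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

A statistic driven by the bulk of the distribution, such as the median, is blind to the induced subset in this toy case, whereas a measure sensitive to deviating samples picks it up.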

We applied several approaches to identify SDE and sex-specific expression. Besides analyzing differences according to the population variance (NOISeqBIO), we also used an approach that gave weight to a subset of samples that notably deviated from all other samples (using count trimmed means and NOISeq-sim). The DESeq2 method was also used to validate the results in selected datasets. In addition, we used a new normalized measure for gene differential expression between pairs of sample populations. This differential expression measure takes into account the expression difference between the sexes and the maximal expression of the gene in all tissues, placing the difference in specific tissues in the context of the gene's overall mode of expression. This measure is general and can be used in other population-based differential gene expression studies (Additional file 1: Figure S1). Combining these approaches increased our ability to identify differential expression from various modes of gene expression. Accumulation of many more samples from different donors and conditions will uncover the full spectra of gene modes of expression and improve the resolution of differential expression analyses.
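The idea of scaling a per-tissue sex difference by the gene's maximal expression across all tissues can be sketched as follows. This is a minimal illustration under our own assumptions, not the exact formula defined in Additional file 1: the function name `normalized_sde`, the use of plain per-sex means, and the choice of the largest per-sex mean as the denominator are all hypothetical.

```python
from statistics import mean


def normalized_sde(men_by_tissue, women_by_tissue):
    """Per-tissue sex difference scaled by the gene's peak expression.

    Both arguments map tissue name -> list of expression values for one
    gene and must share the same tissue keys. The score is
    (mean in men - mean in women) / (largest per-sex mean over all tissues),
    so it lies in [-1, 1] for non-negative expression values.
    This is an assumed form, not the measure from the paper.
    """
    men = {t: mean(v) for t, v in men_by_tissue.items()}
    women = {t: mean(v) for t, v in women_by_tissue.items()}
    peak = max(list(men.values()) + list(women.values()))
    if peak == 0:
        # Gene not expressed anywhere: no meaningful difference.
        return {t: 0.0 for t in men}
    return {t: (men[t] - women[t]) / peak for t in men}


# Toy values for a single gene (invented for illustration).
men_expr = {"liver": [10, 10, 10, 10], "skin": [2, 2, 2, 2]}
women_expr = {"liver": [5, 5, 5, 5], "skin": [2, 2, 2, 2]}

scores = normalized_sde(men_expr, women_expr)
print(scores)
```

Dividing by the peak expression means that a given absolute difference counts for less in a gene that is highly expressed elsewhere, which places each tissue's difference in the context of the gene's overall expression.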