UK Biobank subsample

We performed GWASs on a European subsample (defined by 4-means clustering of the genetic principal components)57 of unrelated, genotyped UK Biobank participants (n = 155,961, 45% female, 32% of the genotyped participants, Supplementary Table 1)58,59. Related individuals were excluded (KING relatedness metric >0.044, equivalent to a relatedness value of 0.088; n related = 7765). The UK Biobank (URLs) is a prospective cohort sampled from the general population between 2006 and 2010. All participants were between 40 and 69 years old, were registered with a general practitioner through the United Kingdom’s National Health Service, and lived within travelling distance of one of the assessment centres.

Ethics

The UK Biobank is approved by the North West Multi-centre Research Ethics Committee. All procedures performed in studies involving human participants were in accordance with the ethical standards of the North West Multi-centre Research Ethics Committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. All participants provided written informed consent to participate in the study. This study has been completed under UK Biobank approved study application 27546.

Power calculations of the GWASs

We conducted power calculations for the female and male GWASs using the Genetic Power Calculator60. A minimum of 39,580 individuals is required to detect a SNP that accounts for 0.1% of trait variance at 80% power at a genome-wide significance threshold of \(p \le 5 \times 10^{-8}\) and a minor allele frequency of 0.20. According to these results, the female and the male GWASs were sufficiently powered to detect genome-wide significant loci with 70,700 females and 85,261 males. With these parameters, the female GWAS had a power of 99.8% and the male GWAS of 99.9%.
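
The following Python sketch illustrates the kind of calculation involved. It is not the Genetic Power Calculator itself: it assumes a 1-degree-of-freedom test on a quantitative trait with non-centrality parameter N·q²/(1 − q²), where q² is the proportion of trait variance explained, and the function name `gwas_power` is hypothetical. Under this approximation the minor allele frequency does not enter once q² is fixed.

```python
from statistics import NormalDist


def gwas_power(n, var_explained, alpha=5e-8):
    """Approximate power of a 1-df association test for a quantitative trait.

    Assumption (not from the source): the test statistic is chi-square with
    1 df and non-centrality parameter n * q2 / (1 - q2).
    """
    norm = NormalDist()
    # Two-sided critical value on the z scale for the significance threshold.
    z_crit = -norm.inv_cdf(alpha / 2)
    # Non-centrality parameter and the corresponding shift of the z statistic.
    ncp = n * var_explained / (1 - var_explained)
    shift = ncp ** 0.5
    # P(|Z + shift| > z_crit) for a standard normal Z.
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)


for label, n in [("minimum", 39_580), ("female", 70_700), ("male", 85_261)]:
    print(f"{label:8s} n={n:6d} power={gwas_power(n, 0.001):.4f}")
```

Under these assumptions the three values should land close to the 80%, 99.8% and 99.9% figures reported above, although small differences from the dedicated calculator are to be expected.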

GWASs on body composition traits in the UK Biobank

The continuous body composition traits (BF%, FM, FFM and BMI) were measured for every participant at every assessment centre across the UK using the validated bioelectrical impedance analyser Tanita BC-418 MA (Tanita Corporation, Arlington Heights, IL)61,62. To identify genetic variation associated with body composition phenotypes that is not confounded by illnesses and their downstream effects or by metabolism-changing medication, we applied trait-specific medication and illness filtering, excluding participants with compromised hydration status and participants with medications or illnesses known to affect body composition. We applied stringent exclusion criteria and covaried for addictive behaviour-related phenotypes, including smoking and alcohol consumption (for exclusion criteria, see Supplementary Table 2). We regressed the body composition traits on factors related to assessment centre, genotyping batch, smoking status, alcohol consumption and menopause, and on continuous measures of age and socioeconomic status (SES), measured by the Townsend Deprivation Index63, as independent variables. We took the residuals from these regressions as our phenotypes for the GWASs. We included 7,794,483 SNPs and insertion–deletion variants (hereafter referred to as SNPs) that had a minor allele frequency >1% and an imputation quality score >0.8, and that were either genotyped or present in the HRC reference panel64. We used an additive model on the imputed dosage data provided by UK Biobank, using BGENIE v1.2 (ref. 65). We accounted for underlying population stratification by including the first six principal components, calculated on the genotypes of our European subsample using FlashPCA2 (ref. 66). We performed GWASs including incremental numbers of principal components and checked each GWAS for inflation by calculating its LDSC intercept. We identified six principal components as the optimal number to adjust for population stratification within the European subsample without overcorrecting the analysis, thereby retaining the greatest signal. Additionally, we included assessment centre as a covariate to adjust for population stratification. We then meta-analysed the sex-specific GWASs using METAL67 (URLs), applying an inverse variance-weighted fixed-effect model, to obtain sex-combined results.
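
As an illustration of the inverse variance-weighted fixed-effect model, the Python sketch below combines per-SNP effect estimates from two or more strata (here, the female and male GWASs). It is a textbook formulation and not METAL's code; the function name `ivw_fixed_effect` and the example numbers are hypothetical.

```python
from math import erfc, sqrt


def ivw_fixed_effect(betas, ses):
    """Combine effect estimates for one SNP with inverse-variance weights."""
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = sqrt(1.0 / total)
    # Two-sided p value from the normal distribution; erfc stays accurate
    # for the very small p values typical of GWAS.
    p = erfc(abs(beta / se) / sqrt(2))
    return beta, se, p


# Hypothetical female and male estimates for a single SNP.
beta, se, p = ivw_fixed_effect(betas=[0.1, 0.3], ses=[0.1, 0.1])
print(f"beta={beta:.3f} se={se:.4f} p={p:.3g}")
```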

Clumping and genome-wide significant loci

Significantly associated SNPs (\(p < 5 \times 10^{-8}\)) were considered as potential index SNPs. SNPs in LD (\(r^2 > 0.2\)) with a more strongly associated SNP within 3000 kb were assigned to the same locus using FUMA (URLs)68. Overlapping clumps were merged with a second clumping procedure in FUMA, merging all lead SNPs with \(r^2 = 0.1\) into genomic loci. After clumping, independent genome-wide significant loci (\(p < 5 \times 10^{-8}\)) were compared with entries in the NHGRI-EBI GWAS catalogue69, using FUMA68.
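
The first clumping step can be sketched as a greedy procedure in Python. This is a simplified illustration under stated assumptions, not FUMA's implementation: the function name `clump`, the tuple layout and the pairwise LD lookup `r2` are hypothetical, and the second step that merges overlapping clumps into genomic loci is not shown.

```python
def clump(snps, r2, p_thresh=5e-8, r2_thresh=0.2, window_kb=3000):
    """Greedy LD clumping.

    snps: list of (snp_id, chrom, pos_bp, p_value) tuples.
    r2:   dict mapping frozenset({id_a, id_b}) to pairwise LD r^2
          (hypothetical input; missing pairs are treated as r^2 = 0).
    Returns a dict mapping each significant SNP to its index SNP.
    """
    # Only genome-wide significant SNPs are considered, strongest first.
    significant = sorted((s for s in snps if s[3] < p_thresh), key=lambda s: s[3])
    index_of = {}
    for snp_id, chrom, pos, _ in significant:
        if snp_id in index_of:
            continue  # already assigned to a stronger index SNP
        index_of[snp_id] = snp_id  # this SNP becomes an index SNP
        for other_id, other_chrom, other_pos, _ in significant:
            if other_id in index_of:
                continue
            close = other_chrom == chrom and abs(other_pos - pos) <= window_kb * 1000
            if close and r2.get(frozenset((snp_id, other_id)), 0.0) > r2_thresh:
                index_of[other_id] = snp_id
    return index_of


# Hypothetical example: rs2 is in LD with rs1; rs3 lies outside the window.
snps = [("rs1", 1, 1_000, 1e-10), ("rs2", 1, 2_000, 1e-9),
        ("rs3", 1, 5_000_000, 1e-9), ("rs4", 1, 3_000, 0.01)]
r2 = {frozenset(("rs1", "rs2")): 0.8, frozenset(("rs1", "rs3")): 0.9}
print(clump(snps, r2))
```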

Heritability estimation and investigation of sex differences

To ensure the robustness of our results, we applied multiple approaches to calculate heritability estimates and genetic correlations. We used BOLT-LMM70, LDSC11 and GREML71 implemented in GCTA72 to calculate common variant \(h_{\mathrm{SNP}}^2\) (URLs). Additionally, we calculated the genetic correlation between females and males using LDSC11 and Haseman–Elston regression38 implemented in GCTA72 to estimate sex differences in the genetic architecture of body composition, glycaemic traits and physical activity. Haseman–Elston regression uses the cross-product of phenotypes for pairs of individuals and a genetic relatedness matrix to calculate heritability and genetic correlations73. All other statistics were calculated in R 3.4.1 if not otherwise stated (URLs).
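
To make the Haseman–Elston idea concrete, the Python sketch below regresses the cross-products of standardised phenotypes on the off-diagonal entries of a genetic relatedness matrix; the slope is the heritability estimate. It is a minimal single-trait illustration and not GCTA's implementation: covariates, standard errors and the bivariate (genetic correlation) form are omitted, and the function name `haseman_elston_h2` is hypothetical.

```python
from statistics import fmean, pstdev


def haseman_elston_h2(phenotypes, grm):
    """Slope of phenotype cross-products regressed on pairwise relatedness.

    phenotypes: list of trait values, one per individual.
    grm:        square list-of-lists genetic relatedness matrix (same order).
    """
    # Standardise the phenotype to mean 0 and variance 1.
    mean, sd = fmean(phenotypes), pstdev(phenotypes)
    z = [(y - mean) / sd for y in phenotypes]
    # One observation per pair of distinct individuals (i < j).
    x, y = [], []
    for i in range(len(z)):
        for j in range(i + 1, len(z)):
            x.append(grm[i][j])
            y.append(z[i] * z[j])
    # Ordinary least-squares slope with an intercept.
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx


# Toy data in which phenotypic similarity tracks relatedness exactly.
toy_pheno = [1.0, 1.0, -1.0, -1.0]
toy_grm = [[1, 1, -1, -1], [1, 1, -1, -1], [-1, -1, 1, 1], [-1, -1, 1, 1]]
print(haseman_elston_h2(toy_pheno, toy_grm))
```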

GWASs of psychiatric disorders and behavioural traits

All of the following traits were used for the sex-specific and age-dependent analyses (Supplementary Data 1). The sex-specific summary statistics for the psychiatric disorders, including major depressive disorder27, schizophrenia3, anorexia nervosa25, bipolar disorder74,75, ADHD26,76, alcohol dependence77, autism spectrum disorder78 and PTSD79, were provided by the PGC (URLs), for OCD80,81 by the International Obsessive Compulsive Disorder Foundation Genetics Collaborative (IOCDF-GC) and the OCD Collaborative Genetics Association Studies (OCGAS), for borderline personality disorder82 by the German Borderline Genomics Consortium, for cannabis use by the International Cannabis Consortium83, for anxiety84 by our own group, for insomnia85 by the Complex Trait Genetics group at VU University Amsterdam (URLs), for heavy smoking86 by the University of Leicester, available from the UK Biobank (URLs), for the behavioural traits years of education87 by the Social Science Genetic Association Consortium (SSGAC) (URLs), for neuroticism41 by our own group (Supplementary Data 1) and for migraine88,89 by the International Headache Genetics Consortium (IHGC). Summary statistics for glycaemic traits90 were provided by the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC), whereas childhood obesity91 results were provided by the Early Growth Genetics (EGG, URLs) Consortium, BMI in young adulthood by Graff et al.92 and physical activity by our group41.

Genetic correlations

Using an analytic extension of LDSC11, we calculated SNP-based bivariate genetic correlations (\(r_{\mathrm{g}}\)) to examine the genetic overlap of body composition and glycaemic traits with psychiatric and behavioural traits and disorders in a sex-specific manner. We calculated differences in genetic correlations and estimated their standard errors (s.e.) using a block jackknife approach, as previously described41.
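
A generic delete-one-block jackknife for the difference between two genetic correlations can be sketched in Python as follows. The details of the procedure used in ref. 41 are not given here, so this is only an assumed textbook form: the inputs are the full-data estimates and the estimates recomputed with each block of SNPs left out, and all function names and example values are hypothetical.

```python
from math import erfc, sqrt


def jackknife_se(delete_one_estimates):
    """Standard error from leave-one-block-out estimates."""
    n = len(delete_one_estimates)
    mean = sum(delete_one_estimates) / n
    # Delete-one jackknife variance: (n - 1) / n times the sum of squares.
    var = (n - 1) / n * sum((e - mean) ** 2 for e in delete_one_estimates)
    return sqrt(var)


def rg_difference_test(rg_female, rg_male, delete_female, delete_male):
    """Test whether two genetic correlations differ.

    delete_female / delete_male hold the estimates obtained with the same
    block removed in both analyses, so block-wise differences are paired.
    """
    diff = rg_female - rg_male
    delete_diff = [f - m for f, m in zip(delete_female, delete_male)]
    se = jackknife_se(delete_diff)
    p = erfc(abs(diff / se) / sqrt(2))  # two-sided normal p value
    return diff, se, p


# Hypothetical values with three blocks (real analyses use many more).
print(rg_difference_test(0.5, 0.2, [0.5, 0.6, 0.4], [0.2, 0.2, 0.2]))
```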

Generalized summary data-based Mendelian randomization

We investigated putative causal bidirectional relationships between these traits using GSMR37. Mendelian randomization is a method that uses genetic variants as instrumental variables, which are expected to be independent of confounding factors, to test for causative associations between an exposure and an outcome93. Mendelian randomization can be used to infer credible causal associations when randomized-controlled trials are not feasible or are unethical93. GSMR performs a multi-SNP Mendelian randomization analysis using summary statistics. Let z be a genetic variant (e.g. SNP), x be the exposure (e.g. psychiatric disorder) and y be the outcome (e.g. body composition trait). First, GSMR is based on the premise that several nearly independent SNPs (z) are associated with the exposure (x). Second, it assumes that the exposure (x) has a causal effect on the outcome (y). If both assumptions hold true, the SNPs that are associated with the exposure (x) will exert an effect on the outcome (y) via the exposure (x). If no pleiotropy is present in this instance, the estimates of \(b_{xy}\) at all SNPs associated with the exposure (x) should be highly similar, because the effect of each SNP on the outcome (y) will be mediated through the exposure (x). With the help of a generalized least squares (GLS) model, the estimates of \(b_{xy}\) from each SNP that is associated with the exposure (x) can be combined, resulting in higher statistical power37,94. The GSMR method essentially implements summary data-based Mendelian randomization analysis for each SNP instrument individually, and integrates the \(b_{xy}\) estimates of all the SNP instruments by GLS, accounting for the sampling variance in both \(b_{zx}\) and \(b_{zy}\) for each SNP and the LD among SNPs. We used individual-level genotype data from a subsample of the anorexia nervosa GWAS to approximate the underlying LD structure to account for LD between the variants in the multi-SNP instrument.
Pleiotropy is an important potential confounding factor that could bias the estimate and often results in an inflated test statistic in Mendelian randomization analysis. We therefore removed potentially pleiotropic SNPs (i.e. SNPs that have effects on both risk factor and outcome) from this analysis using the heterogeneity in dependent instruments outlier method37,95, which detects pleiotropic SNPs at which the estimates of \(b_{xy}\) differ significantly from those expected under a causal model. The power of detecting a pleiotropic SNP depends on the sample sizes of the GWAS data sets and the deviation of \(b_{xy}\) estimated at the pleiotropic SNP from the causal model. The overall \(b_{xy}\) can then be estimated from all the remaining instruments using a GLS approach that takes the LD between the variants and the correlations between their effect sizes into account by modelling them in a covariance matrix. Additionally, GSMR uses the intercept of the bivariate LD score regression to account for potential sample overlap between the GWASs used as instruments for the exposure or outcome12. Estimates with binary exposures were converted to the liability scale40. Some of these analyses are exploratory because a few of the GWASs used were underpowered (i.e. did not detect ≥10 genome-wide significant independent loci at a p value level of \(5 \times 10^{-8}\)); we therefore lowered the p value threshold for inclusion, in order to include at least 10 independent SNP instruments, as previously recommended37.
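
The core of the estimate described above can be sketched in Python for the simplified case of fully independent instruments, where the GLS covariance matrix is diagonal and the combination reduces to inverse-variance weighting. This is an illustration only, not the GSMR software: the LD-based covariance terms, the outlier removal step and the sample-overlap correction are all omitted, the per-SNP variance uses a first-order delta-method approximation, and the function names and example values are hypothetical.

```python
from math import erfc, sqrt


def ratio_estimates(b_zx, se_zx, b_zy, se_zy):
    """Per-SNP estimates b_xy = b_zy / b_zx with approximate variances."""
    est, var = [], []
    for bx, sx, by, sy in zip(b_zx, se_zx, b_zy, se_zy):
        est.append(by / bx)
        # First-order delta method, ignoring covariance between b_zx and b_zy.
        var.append(sy ** 2 / bx ** 2 + (by ** 2) * (sx ** 2) / bx ** 4)
    return est, var


def combine_independent(est, var):
    """GLS with a diagonal covariance matrix (independent instruments)."""
    weights = [1.0 / v for v in var]
    total = sum(weights)
    b_xy = sum(w * e for w, e in zip(weights, est)) / total
    se = sqrt(1.0 / total)
    p = erfc(abs(b_xy / se) / sqrt(2))  # two-sided normal p value
    return b_xy, se, p


# Hypothetical SNP effects on exposure (zx) and outcome (zy).
est, var = ratio_estimates([0.1, 0.2], [0.01, 0.01], [0.05, 0.10], [0.01, 0.01])
print(combine_independent(est, var))
```

With correlated instruments, the diagonal weights would be replaced by the inverse of the full covariance matrix of the per-SNP estimates, which is the part of the method that requires the LD reference described above.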

Correction for multiple testing

We calculated the number of independent traits by matrix decomposition (i.e. number of principal components accounting for 99.5% of variance explained) and adjusted our p value threshold accordingly. The first matrix of the main analysis contained all 17 psychiatric traits, all four body composition traits, physical activity and childhood obesity (Supplementary Data 2). All sex-specific correlations were entered when available. The second matrix comprised all 17 psychiatric traits and all glycaemic traits listed in Supplementary Data 6, including their sex-specific correlations. The family-wise Bonferroni-corrected p value threshold for the main analysis, including the genetic correlations with body composition traits and physical activity, was \(p_{\mathrm{Bonferroni}} = 0.05/190 = 2.6 \times 10^{-4}\) and the family-wise p value threshold for the genetic correlations with glycaemic traits was \(p_{\mathrm{Bonferroni}} = 0.05/231 = 2.2 \times 10^{-4}\).
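
The thresholding logic can be sketched in Python as follows. The sketch assumes that the eigenvalues of the trait correlation matrix have already been obtained from a decomposition (for example with a linear algebra library); the function names and the example eigenvalues are hypothetical, and how the divisors 190 and 231 follow from the decomposition is not reconstructed here. They are used only as the reported numbers of tests.

```python
def n_components_for_variance(eigenvalues, variance_fraction=0.995):
    """Number of principal components needed to reach the variance fraction."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= variance_fraction:
            return k
    return len(ordered)


def bonferroni_threshold(n_tests, alpha=0.05):
    """Family-wise p value threshold for n_tests independent tests."""
    return alpha / n_tests


# Hypothetical eigenvalues of a 4 x 4 correlation matrix.
print(n_components_for_variance([3.0, 0.9, 0.09, 0.01]))
# Reported divisors from the text above.
print(f"{bonferroni_threshold(190):.2e}", f"{bonferroni_threshold(231):.2e}")
```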

URLs

For METAL, see http://csg.sph.umich.edu/abecasis/metal/; for FUMA, see http://fuma.ctglab.nl/; for SSGAC, see https://www.thessgac.org/; for Complex Traits Genetics lab, see https://ctg.cncr.nl; for International Headache Genetics Consortium, see http://www.headachegenetics.org/; for the MAGIC, see https://www.magicinvestigators.org/; for UK Biobank, see https://www.ukbiobank.ac.uk/; for the PTSD working group of the Psychiatric Genomics Consortium, see https://pgc-ptsd.com/; for the Psychiatric Genomics Consortium, see http://www.med.unc.edu/pgc; for the R project, see https://www.r-project.org/; for the EGG Consortium, see https://egg-consortium.org/.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.