Participating studies

Nineteen studies, comprising subjects of European ancestry only, were used in this analysis. The discovery dataset consisted of 124,711 individuals with available childhood maltreatment data from the UK Biobank (UKBB)18, and the replication sample comprised 26,290 individuals—a subset of the PGC-PTSD Freeze 1.5 dataset (PGC1.5)17. The details of these studies, including the demographics and instruments used to assess maltreatment can be found in Supplementary Table 1. We have complied with relevant ethical regulations for work with human subjects. All subjects provided written informed consent and studies were approved by the relevant institutional review boards and the UCSD IRB (protocol #16097×).

Phenotype harmonization

For the childhood maltreatment phenotype, Childhood Trauma Questionnaire (CTQ) scores on physical, sexual, and emotional abuse19 were obtained from the participating studies. From this, an overall childhood maltreatment count score of 0–3 was constructed, based on a count of the three abuse categories listed above. An individual was considered to have endorsed a childhood abuse category if they scored in the moderate to extreme range for that particular category, per established cut-offs20 (Supplementary Table 2). If CTQ data were not available, the event assessment during childhood (occurring before 18 years of age) that was most validated for that particular study was obtained, providing a count of the total number of different categories of reported childhood events (e.g. physical, sexual, or severe emotional abuse) along with the range of possible scores for the measure. The reported maltreatment exposure from the UKBB dataset comprised a score of three items where participants were asked whether they were (i) “physically abused by family as a child”, (ii) “sexually molested as a child”, and whether they (iii) “felt hated by family member as a child”. The childhood maltreatment count score, whether it was generated from the CTQ or another instrument, was used as the main outcome measure in the association analysis. The range and mean of maltreatment count scores for each study can be seen in Supplementary Table 1.

Global ancestry determination, genotyping quality control, and imputation

Study participants from the PGC-PTSD were genotyped with a number of different arrays (Supplementary Table 1). Genotype data were quality controlled and processed using the standard PGC pipeline, Ricopili-MANC (https://sites.google.com/a/broadinstitute.org/ricopili/ and https://github.com/orgs/Nealelab/teams/ricopili) as part of the PGC-PTSD Freeze 2 data analysis17,21. This work was carried out on the Dutch national e-infrastructure with the support of SURF Cooperative. A detailed outline of these methods can be found in ref. 17. Briefly, ancestry was determined with pre-QC genotypes using a SNPweights panel of 10,000 ancestry informative markers from a reference panel comprising 2911 subjects from 71 diverse populations and six continental groups (https://github.com/nievergeltlab/global_ancestry). Samples with estimated > 90% European ancestry were classified as European. Samples were excluded if they had call rates < 98%, deviated from the expected inbreeding coefficient (fhet < −0.2 or >0.2), or had a sex discrepancy between reported and genotypic sex (based on inbreeding coefficients calculated from SNPs on the X chromosome). Markers were excluded if they had call rates < 98%, >2% difference in missing genotypes between PTSD cases and controls, or were monomorphic. Markers with a Hardy–Weinberg equilibrium (HWE) p < 1 × 10−6 in controls were excluded from all subjects. Principal components (PCs) were calculated using the smartPCA algorithm in EIGENSTRAT22. Pre-phasing and phasing was performed using SHAPEIT2 v2.r83723. Imputation was performed with IMPUTE2 v2.2.224 using the 1000 Genomes (1000G) phase 3 data25 as the reference.

Details regarding the QC, imputation, and ancestry determination of the UKBB dataset can be found in ref. 26. Briefly, study participants were genotyped with two custom genotyping arrays (with ∼800,000 markers). A two-stage imputation was performed using the Haplotype Reference Consortium (HRC)27 and the UK10K28 as the reference panels. Variants in the UKBB dataset were filtered to include only those with a minor allele frequency (MAF) of > 1% and an INFO score of > 0.4. Related individuals (third degree and closer) and those with a genotyping call rate < 98% were excluded. Ancestry was determined by 4-means clustering on the first two PCs provided by the UKBB29. Additional principal component analysis was conducted on the European-only data subset using flashpca230.

Main GWAS

GWAS analysis was conducted separately for each study. Best-guess genotypes were tested for association to self-reported childhood maltreatment using an ordinal logistic regression model with age, sex, and the first five PCs included as covariates. Variants with a MAF < 0.5% and a genotyping rate < 98% were excluded, for all studies except the UKBB. These analyses were implemented in PLINK 1.931 using the plug-in Rserve. To ensure computational efficiency, linear regression models were run for 4 of the larger contributing studies (NSS1; NSS2; PPDS; and UKBB, N = 143,392 subjects)17. For the NSS1; NSS2; and PPDS studies, age, sex, and 5 PCs were included as covariates in the regression model. For the UKBB dataset, the regression analysis was implemented in BGenie v1.232 with age, sex, 6 PCs, batch, and site included as covariates. All tests performed were two-sided.

Meta-analysis

As both linear and ordinal logistic models were implemented in the GWASs, which resulted in different effect statistics, fixed effects meta-analysis was conducted across studies using p-values and direction of effect, weighted according to the effective sample size as the analysis scheme, in METAL (v. March 25 2011)33. Effective sample sizes (N eff ) for ordinal logistic regressions were calculated as N eff = harmonic mean*n levels of childhood maltreatment, and for linear regressions as N eff = ((1−probability of having a zero score) × mean of nonzero data)34. Heterogeneity across datasets was tested using the Cochran’s Q-test for heterogeneity, also implemented in METAL. Only variants with an INFO score of >0.8 and a conservative MAF of >5% were included in the meta-analysis, except where otherwise indicated in the results. Forest plots were generated for genome-wide significant hits using the R package meta35.

Functional mapping and annotation

Genome-wide significant hits identified from the GWAS meta-analysis were annotated using the web-based tool FUnctional Mapping and Annotation (FUMA) v1.3.4c36. Default settings were used and annotations were based on the human genome assembly GRCh37 (hg19). The SNP2GENE module was used to identify genomic risk loci and these were mapped to protein-coding genes within a 10 kb window. An r2 of ≥ 0.6 was used to identify variants in LD with lead SNPs. The 1000G European Phase 3 was used as the reference dataset. Variants were functionally annotated using ANNOVAR, combined dependent depletion (CADD), RegulomeDB (RDB), and chromatin states (only tissues/cells from brains were included). The NHGRI-EBI GWAS catalog was used to determine any previous associations with the identified risk variants. The GTEx v7 brain tissue, RNAseq data from the CommonMind Consortium and the BRAINEAC database were used to perform eQTL mapping for significant SNP–gene pairs (FDR q < 0.05).

A gene-based analysis was performed within FUMA using MAGMA whereby SNPs were mapped to 18,989 protein-coding genes. Genome-wide significance was set at a Bonferroni-corrected threshold p < 2.63 × 10−6. In addition, gene-based test statistics were used to determine whether specific biological pathways are associated with childhood maltreatment. This was performed for 10,678 curated gene sets and GO terms obtained from MsigDB, using MAGMA. The significance threshold was set at a Bonferroni-corrected threshold of p = 4.68 × 10−6 (0.05/10,678).

Heritability estimation

Linkage disequilibrium score regression (LDSR) is a technique for quantifying polygenicity and confounding, such as population stratification, in GWAS summary statistics37. This is accomplished by evaluating the relationship between linkage disequilibrium (LD) scores (the average squared correlation of a SNP with all neighboring SNPs) and SNP test statistics. Using this approach, the LDSR intercept was used to estimate the proportion of inflation in test statistics due to polygenic signal (rather than inflation due to population stratification and cryptic relatedness), with the Eq. (1)—(LDSR intercept−1)/(mean observed chi-square−1)17. Using GWAS summary statistics, SNP-based heritability (h2 snp ) was calculated, which is one of the applications of LDSR.

Polygenic risk scoring

Using PRSice v2.1.3.beta38, PRS were calculated in target samples (PGC1.5) based on SNP effect sizes from childhood maltreatment GWAS in non-overlapping discovery/training samples (UKBB). Multiple p-value thresholds (P T ) (0.001, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 1) were generated using the best guess genotype data of target samples. Variants with a MAF < 5% were excluded from the discovery dataset. As a default in PRSice, LD pruning was performed whereby variants were pruned if they were nearby (within 250 kb) and in LD (r2 > 0.1) with the leading variant (lowest p-value) in a given region. For this analysis, a rescaled childhood maltreatment phenotype was generated whereby the childhood maltreatment score for each individual was divided by the theoretical maximum score for a given study. Best-fit PRS (at P T = 0.0354) were used to predict childhood maltreatment status as a quantitative trait, adjusting for five PCs and dummy study indicator variables. As women in PGC1.5 experienced significantly more childhood maltreatment than men, we generated PRS in women and men separately. The proportion of variance explained by PRS was estimated as the difference in Nagelkerke’s R2 between the full model (which includes PRS plus covariates) and the null model (which only has the covariates). PRS prediction plots were based on quantiles of PRS, with effect sizes calculated in reference to the lowest quantile. p-values for PRS were derived from a likelihood ratio test comparing the two models. The significance threshold was set at a Bonferroni-corrected threshold of p = 0.006 (0.05/8).

Genetic correlation

Another application of LDSR is the measurement of genetic correlation, i.e. the degree and direction of shared genetic effects between different traits37,39. Cross-cohort genetic correlation (r g ) was calculated using LDSR. The web-based interface for LDSR, LD Hub, was used to further calculate pairwise genetic correlations between childhood maltreatment and 247 non-UKBB traits of interest including psychiatric, anthropomorphic, smoking behavior, reproductive, aging, education, autoimmune, and cardio-metabolic categories.

Conditional analyses of childhood maltreatment top hits

To evaluate if the effects of top variants in the UKBB GWAS and meta-analysis were specific to childhood maltreatment, we conditioned childhood maltreatment on genetically correlated traits using the multi-trait conditional and joint analysis (mtCOJO)40 feature in GCTA41. Data for major depressive disorder (MDD)42 (from https://www.med.unc.edu/pgc/results-and-downloads) was used to minimize sample overlap with the UK Biobank data. The effect of the correlated trait on childhood maltreatment was estimated using a generalized summary-data based Mendelian randomization analysis of significant LD independent SNPs (r2 < 0.05, based on 1000G Phase 3 CEU samples). The threshold for significance was set at p < 5 × 10−6, due to having less than the required 10 significant independent SNPs at the program default of p < 5 × 10−8, for the correlated trait.