Participating studies

The PGC-PTSD Freeze 2 dataset (PGC2) includes 60 ancestrally diverse studies from Europe, Africa and the Americas. Of these, 12 were already included in Freeze 110. Study details and demographics can be found in Supplementary Data 1. PTSD assessment was based on either lifetime (where possible) or current PTSD (i.e. including participants with a potential lifetime PTSD diagnosis as controls), and PTSD diagnosis was established using various instruments and different versions of the DSM (DSM-III-R, DSM-IV, DSM-5). For GWAS analyses, all studies provided PTSD case status as determined using standard criteria and control subjects not meeting the PTSD diagnostic criteria (see Supplementary Data 1 for additional exclusion criteria). The majority of controls were trauma-exposed. A detailed description of the studies included is presented in Supplementary Methods. We have complied with relevant ethical regulations for work with human subjects. All subjects provided written informed consent and studies were approved by the relevant institutional review boards and the UCSD IRB (protocol #16097×).

Data assimilation

Subjects were genotyped on a range of Illumina genotyping arrays (exception: UKB was genotyped on the Affymetrix Axiom array). At the time of analysis, direct access to individual-level genotypes was permitted for 65,555 subjects. For these, pre-QC’ed genotype data were deposited on the LISA server for central data processing and analysis, using the standard PGC pipelines (https://sites.google.com/a/broadinstitute.org/ricopili/) and (https://github.com/orgs/Nealelab/teams/ricopili). Studies with data sharing restrictions (eight studies, N = 137,114 subjects) performed analyses off site using identical pipelines unless otherwise indicated (Supplementary Data 1). Such studies then shared summary results for meta-analyses.

Global ancestry determination

To determine consistent global ancestry estimates across studies, each subject was run through a standardized pipeline, based on SNPweights67 of 10,000 ancestry informative markers genotyped in a reference panel including 2911 unique subjects from 71 diverse populations and six continental groups (K = 6)68 (https://github.com/nievergeltlab/global_ancestry). Pre-QC genotypes were used for these analyses.

For the present GWA studies, subjects were placed into three large, homogeneous groupings, using previously established cut-offs (Supplementary Table 12): European and European Americans (EUA; subjects with ≥90% European ancestry), African and African-Americans (AFA; subjects with ≥5% African ancestry, <90% European ancestry, <5% East Asian, Native American, Oceanian, and Central-South Asian ancestry; and subjects with ≥50% African ancestry, <5% Native American, Oceanian, and <1% Asian ancestry), and Latinos (AMA; subjects with ≥5% Native American ancestry, <90% European, <5% African, East Asian, Oceanian, and Central-South Asian ancestry). Native Americans (subjects with ≥60% Native American ancestry, <20% East Asian, <15% Central-South Asian, and <5% African and Oceanian ancestry) were grouped together with AMA. All other subjects were excluded from the current analyses (N = 6,740). Supplementary Fig. 1 shows the ancestry grouping used for GWAS of 69,484 subjects for which individual-level genotype data was available to the PGC. The ancestry pipeline was shared with external sites in order to ensure consistency in ancestry calling across cohorts.

Genotype quality control

The standard PGC pipeline RICOPILI was used to perform QC, but modifications were made to allow for ancestrally diverse data. In the modified pipeline, each dataset was processed separately, including subjects of all ancestries. Sample exclusion criteria: using SNPs with call rates >95%, samples were excluded for call rates <98%, deviation from the expected inbreeding coefficient (Fhet < −0.2 or >0.2), or a discrepancy between reported sex and sex estimated from inbreeding coefficients calculated from SNPs on the X chromosome. Marker exclusion criteria: SNPs were excluded for call rates <98%, a >2% difference in missing genotypes between cases and controls, or being monomorphic. Hardy-Weinberg equilibrium (HWE): the modified pipeline identified the largest homogeneous ancestry group in the data, identified SNPs with a HWE P-value < 1 × 10−6 in controls, and excluded these SNPs in all subjects of that dataset, irrespective of ancestry.
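As an illustration only, the sample and marker exclusion thresholds listed above can be written as simple predicates. This is a minimal Python sketch and not the RICOPILI implementation: the function names, argument names and reason labels are hypothetical, and the inputs are assumed to be precomputed per-sample and per-SNP summaries expressed as proportions.

```python
def sample_exclusion_reasons(call_rate, f_het, reported_sex, inferred_sex):
    """Return the reasons a sample would be excluded (empty list if it passes)."""
    reasons = []
    if call_rate < 0.98:                      # sample call rate <98%
        reasons.append("call_rate")
    if f_het < -0.2 or f_het > 0.2:           # inbreeding coefficient outside [-0.2, 0.2]
        reasons.append("inbreeding_coefficient")
    if reported_sex != inferred_sex:          # reported vs. genetically estimated sex
        reasons.append("sex_discrepancy")
    return reasons


def marker_exclusion_reasons(call_rate, missing_cases, missing_controls, maf):
    """Return the reasons a SNP would be excluded (empty list if it passes)."""
    reasons = []
    if call_rate < 0.98:                                  # SNP call rate <98%
        reasons.append("call_rate")
    if abs(missing_cases - missing_controls) > 0.02:      # >2% case/control missingness gap
        reasons.append("differential_missingness")
    if maf == 0.0:                                        # monomorphic marker
        reasons.append("monomorphic")
    return reasons
```

The HWE filter is left out of the sketch because it depends on identifying the largest homogeneous ancestry group within each dataset first.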

Relatedness within studies

Within-study relatedness was estimated using the IBS function in PLINK 1.969. From each pair with relatedness \(\hat \pi\) > 0.2, one individual was removed from further analysis, retaining cases where possible.
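The rule of removing one individual per related pair while retaining cases where possible can be sketched as a greedy pass over the pairs. This is an illustrative Python sketch under stated assumptions, not the procedure actually run in the pipeline: the function name, the input layout, the processing order (most related pairs first) and the tie-break for pairs with equal case status are all hypothetical choices.

```python
def prune_related(pairs, is_case, threshold=0.2):
    """Return the set of sample IDs to remove.

    pairs   : iterable of (id_a, id_b, pi_hat) tuples
    is_case : dict mapping sample ID -> True (case) / False (control)
    """
    removed = set()
    # Assumption: handle the most closely related pairs first.
    for a, b, pi_hat in sorted(pairs, key=lambda t: -t[2]):
        if pi_hat <= threshold or a in removed or b in removed:
            continue  # pair is unrelated, or already resolved by an earlier removal
        if is_case[a] and not is_case[b]:
            removed.add(b)          # keep the case, drop the control
        elif is_case[b] and not is_case[a]:
            removed.add(a)
        else:
            removed.add(max(a, b))  # same status: arbitrary but deterministic choice
    return removed
```

For example, with pairs `[("s1", "s2", 0.5), ("s2", "s3", 0.25), ("s4", "s5", 0.1)]` where only `s2` and `s3` are cases, the sketch removes `s1` (a control related to a case) and `s3` (one of two related cases) and leaves the unrelated pair untouched.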

Calculation of principal components (PC’s) for GWAS

For each dataset, unrelated subjects were subset into the three ancestry groups (EUA, AFA, AMA; Supplementary Tables 3, 5, 6) for analysis. SNPs were excluded if they had a MAF <5%, HWE P < 1 × 10−3, or call rate <98%, were ambiguous (A/T, G/C), or were located in the MHC region (chr. 6, 25–35 MB) or the chromosome 8 inversion (chr. 8, 7–13 MB). SNPs were pairwise LD-pruned (r2 > 0.2) and a random set of 100,000 markers was used for each subset to calculate PC’s based on the smartPCA algorithm in EIGENSTRAT70.

Imputation

Imputation was based on the 1000 Genomes phase 3 data (1KGP phase 371). Any dataset using a human genome assembly version prior to GRCh37 (hg19) was lifted over to GRCh37 (hg19). SNP alignment proceeded as follows: for each dataset, SNPs were aligned to the same strand as the 1KGP phase 3 data. For ambiguous markers, the largest ancestry group was used to calculate allele frequencies and only SNPs with MAF <40% and ≤15% difference between matching 1KGP phase 3 ancestry data were retained. Pre-phasing was performed using default settings in SHAPEIT2 v2.r83772 without reference subjects, and phasing was done in 3 megabase (MB) blocks, where an additional 1 MB of buffer was added to either end of the block. Haplotypes were then imputed using default settings in IMPUTE2 v2.2.273, with 1KGP phase 3 reference data and genetic map, a 1 MB buffer, and effective population size set to 20,000. RICOPILI default filters for MAF and Info were removed since analyses were run across ancestry groups at this step. Imputed datasets were deposited with the PGC DAC and are available for approved requests.

Main GWAS

The analysis strategy for the main association analyses is shown in Supplementary Tables 3, 5 and 6. Analyses were performed separately for each study and ancestry group, unless otherwise indicated. The minimum number of subjects per analysis unit was set at 50 cases and 50 controls, or a total of at least 200 subjects, and subsets of smaller size were excluded. Smaller studies of similar composition were genotyped jointly in preparation for joint analyses (e.g. PSY1, PSY3). For studies with unrelated subjects, imputed SNP dosages were tested for association with PTSD under an additive model using logistic regression in PLINK 1.9, including the first five PC’s as covariates. For family and twin studies (VETSA, QIMR), analyses were performed using linear mixed models in GEMMA v0.9674, including a genetic relatedness matrix (GRM) as a random effect to account for population structure and relatedness, and the first five PC’s as covariates. The UKB data (UKB) were analyzed with BGenie v1.2 (https://www.biorxiv.org/content/early/2017/07/20/166298) using linear regression with six PC’s and batch and center indicator variables as covariates (see Supplementary Methods for additional details). All GWAS analyses were additionally performed stratified by sex.

Meta-analyses

Summary statistics on the linear scale (from GEMMA and BGenie) were converted to a logistic scale prior to meta-analysis (for formula see75). Within each dataset and ancestry group, summary statistics were filtered to MAF ≥1% and PLINK INFO score ≥0.6. Meta-analyses across studies were performed within each of the three ancestry groups and across all ancestry groups. Inverse variance weighted fixed effects meta-analysis was performed with METAL (v. March25 2011)76. Heterogeneity between datasets was tested with Cochran’s Q test and, for markers with nominally significant Q-values, a Han-Eskin random effects model (RE-HE) meta-analysis was performed with METASOFT v.2.0.177. Markers with summary statistics in less than 25% of the total effective sample size or present in fewer than three studies were removed from meta-analyses. Quantile-quantile (QQ) plots of expected versus observed −log10 P-values included genotyped and imputed SNPs at MAF ≥1%. The proportion of inflation of test statistics due to the actual polygenic signal (rather than other causes such as population stratification) was estimated as 1 − (LDSC intercept − 1)/(mean observed chi-square − 1), using LD-score regression12 (LDSC).
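For orientation, the two calculations named in this paragraph can be sketched in a few lines of Python. This is a textbook illustration of inverse variance weighted fixed effects pooling with Cochran's Q and of the polygenic-signal proportion quoted above. It is not the METAL or LDSC code, and the function names are hypothetical.

```python
import math


def ivw_fixed_effects(betas, ses):
    """Pool per-study effect estimates with inverse variance weights.

    Returns (pooled_beta, pooled_se, cochran_q).
    """
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)
    # Cochran's Q: weighted squared deviations of study effects from the pooled effect.
    q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))
    return pooled_beta, pooled_se, q


def polygenic_proportion(ldsc_intercept, mean_chi2):
    """1 - (LDSC intercept - 1) / (mean observed chi-square - 1)."""
    return 1.0 - (ldsc_intercept - 1.0) / (mean_chi2 - 1.0)
```

Under the null hypothesis of no heterogeneity, Q is compared against a chi-square distribution with (number of studies − 1) degrees of freedom. With purely illustrative inputs of an intercept of 1.02 and a mean chi-square of 1.20, the second function returns 0.9, i.e. 90% of the inflation would be attributed to polygenic signal.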

For primary analyses, genome-wide significance was declared at P < 5 × 10−8. To account for multiple comparisons in analyses stratified by sex, genome-wide significance was also considered at P < 1.67 × 10−8. For genome-wide significant hits, Forest plots and PM-Plots were generated using the program METASOFT with default settings, and M-values were generated using the MCMC option13,78. For a given study and SNP, the M-value is the posterior probability that there is a SNP effect in that study. Studies with values <0.1 are predicted to have no effect, values ≥0.1 and ≤0.9 are ambiguous, and values >0.9 are predicted to have an effect. In PM-plots, M-values are plotted against −log10 P-values. Regional association plots were generated using LocusZoom79 with 400 kb windows around the index variant and compared to the corresponding windows in the other ancestry groups, including the 1000 Genomes Nov. 2014 reference populations EUR, AFR and AMR, respectively. To test for sex-specific effects, a z-test was performed on the difference of the effect estimates from male and female sex-stratified analyses.
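The M-value interpretation and the sex-difference test can be illustrated as follows. This is a minimal Python sketch under the assumption that the male and female analyses are independent, so that no covariance term enters the standard error of the difference. The function names are hypothetical and the code is not taken from METASOFT.

```python
import math


def classify_m_value(m):
    """Interpret an M-value using the cut-offs given in the text."""
    if m < 0.1:
        return "no effect"
    if m > 0.9:
        return "effect"
    return "ambiguous"  # 0.1 <= m <= 0.9


def sex_difference_z_test(beta_male, se_male, beta_female, se_female):
    """z-test on the difference between male and female effect estimates.

    Assumes independent samples. Returns (z, two_sided_p).
    """
    z = (beta_male - beta_female) / math.sqrt(se_male ** 2 + se_female ** 2)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p
```

As a purely illustrative example, effect estimates of 0.3 (SE 0.03) in men and 0.1 (SE 0.04) in women give z = 4.0.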

Estimating PTSD heritability

SNP-based heritability estimates (\(h^2_{SNP}\)) in EUA subjects were calculated using LDSC on meta-analysis summary data. Estimates were calculated for the combined PGC freeze 2 samples (PGC2) and separately for PGC1.5 (without UKB), the UK Biobank (including alternative subject/phenotype selections), and for men and women. Unconstrained regression intercepts were used to account for potential inclusion of related subjects and residual population stratification, and precomputed LD scores from 1KGP EUR populations were used. For population prevalence we used a range of values (conservative low at 10%, moderate at 30%, and very high at 50%), based on prevalences reported for subjects exposed to different types of trauma80. Sample prevalence was set to the actual proportion of cases in each set of data.
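To show how the assumed population prevalence and the sample prevalence enter a heritability estimate, the widely used observed-to-liability scale transformation for case-control data can be sketched in Python. The text does not spell out the formula, so its use here is an assumption, as are the function name and the example inputs, which are illustrative values and not study results.

```python
from statistics import NormalDist


def liability_scale_h2(h2_observed, population_prevalence, sample_prevalence):
    """Convert observed-scale h2 to the liability scale.

    h2_liab = h2_obs * K^2 (1 - K)^2 / (z^2 * P (1 - P)),
    where K is the population prevalence, P the sample prevalence, and z the
    standard normal density at the liability threshold for K.
    """
    k, p = population_prevalence, sample_prevalence
    threshold = NormalDist().inv_cdf(1.0 - k)
    z = NormalDist().pdf(threshold)
    return h2_observed * (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))


# Illustrative inputs only (hypothetical observed h2 and sample prevalence),
# evaluated at the three population prevalences considered in the text.
for k in (0.10, 0.30, 0.50):
    print(k, round(liability_scale_h2(0.05, k, 0.15), 4))
```

For a fixed observed-scale estimate and sample prevalence, the liability-scale value rises as the assumed population prevalence moves from 10% toward 50%, which is why a range of prevalences is informative.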

To estimate \(h^2_{SNP}\) in admixed individuals and compare \(h^2_{SNP}\) across different ancestries, individual-level genotype data were analyzed using an unweighted linear mixed model81 as implemented in the LDAK software82. For each ancestry group (EUA and AFA, respectively), imputed individual-level genotype data were filtered to bi-allelic SNPs with MAF ≥1% in the corresponding 1KGP phase 3 superpopulation. Imputed genotype probabilities ≥0.8 were converted to best-guess genotype calls, and for each ancestry group, studies were merged and SNPs with <95% genotyping rate or MAF <10% were removed. Next, to estimate relatedness between subjects, a genetic relatedness matrix (GRM) was constructed based on autosomal SNPs that were LD pruned at r2 > 0.2 over a 1 MB window, using an unweighted model with α = −1, where α is the power parameter controlling the relationship between heritability and MAF. To prevent bias of \(h^2_{SNP}\) due to cryptic relatedness, strict relatedness filters were applied: for pairs with relatedness values greater than the negative of the smallest observed kinship (−0.014 for EUA and −0.045 for AFA, respectively), one subject was randomly removed. PC’s were then calculated in the remaining sets of unrelated subjects. Finally, to estimate \(h^2_{SNP}\), an unweighted GRM was estimated without LD-pruning, and \(h^2_{SNP}\) was calculated on the liability scale using REML in LDAK, including 5 PC’s and dummy indicator variables for study (number of studies − 1) as covariates.

Comparability of PGC2 studies

To compare the genetic signal between specific PGC2 subsets, LDSC12 was used to estimate heritability and genetic correlations. Small EUA studies with N < 200 cases and total effective sample size of N < 500 were selected (N = 24 studies; GWAS including 2102 cases and 7366 controls, effective N = 5162) and compared to larger studies. To reduce standard error given this relatively small sample, we estimated heritability with the LDSC intercept constrained to 1, after first testing that the intercept was not significantly different from 1.

Replication study

Data from the US Million Veteran Program (MVP) were used to replicate GWAS findings9. Participants reported here completed the PCL-C, which asked respondents to report how much they had been bothered in the past 30 days by symptoms in response to stressful experiences (i.e. not just military experiences). The symptom cluster most distinctive for PTSD, re-experiencing symptoms (range 5–25), was analyzed. After accounting for missing phenotype data, the final sample for European Americans was 146,660, of whom 41.3% were combat-exposed. Genotyping was accomplished via a 723,305-SNP Affymetrix Axiom biobank array, customized for the MVP. Imputation was performed with Minimac 383 and the 1000 Genomes Phase 3 reference panel. GWAS analysis was conducted in RVTEST84 using linear regression with the first 10 principal components, age, and sex included as covariates. The results were filtered with imputation quality score R2 ≥ 0.9, MAF > 0.01 and HWE test P-value > 1 × 10−6. LDSC was used to estimate genetic correlation with the PGC2 EUA sample. The PGC2 EUA GWAS summary statistics were used to estimate PRS in MVP samples, where linear regression was then used to test for association between PRS and re-experiencing symptoms.

Local ancestry deconvolution

A pipeline was developed to determine local ancestry in subjects with African and/or Native American admixture (AFA, AMA; Supplementary Fig. 28). Additional QC to consistently prepare cohort data for downstream analysis was performed with a custom script (https://github.com/eatkinson/Post-QC). Post-QC steps involved extracting autosomal data, removing duplicate loci, updating SNP IDs to dbSNP 144, orienting data to the 1KGP reference (with removal of indels and loci that either were not found in 1KGP or that had different coding alleles), flipping alleles that were on the wrong strand, and removing ambiguous SNPs.

Data harmonization and phasing: We then intersected and jointly phased the post-QC’ed cohort data with autosomal data from 247 1KGP reference panel individuals, removing conflicting sites and flipping any remaining strand flips. The merged dataset was then filtered to include only informative SNPs present in both the cohort and reference panel using a filter of MAF ≥ 0.05 and a genotype missingness cutoff of 90%. The program SHAPEIT285 was used to phase chromosomes, informed by the HapMap combined b37 recombination map86. Individuals from the cohort and reference panel were then separated and exported as harmonized sample and reference panel VCFs to be fed into RFMix87.

Reference panel: Three ancestral populations of European, African, and Native American ancestry were chosen for the admixed AFA cohorts based on ancestry proportion estimates from SNPweights runs. All reference populations were taken from 1KGP phase 3 data71. Specifically, 108 West African Bantu-speaking YRI were used as the African reference population, 99 CEU comprised the European reference, and 40 PEL of >85% Native American ancestry were used as the Native American reference panel. The individuals used as the reference panel are listed at (https://github.com/eatkinson).

Local ancestry inference (LAI) parameters: LAI was run on each cohort separately using RFMix version 287 (https://github.com/slowkoni/rfmix) with 1 EM iteration and a window size of 0.2 cM. We used the HapMap b37 recombination map86 to inform switches. The -n 5 flag (terminal node size for random forest trees) was included to account for an unequal number of reference individuals per reference population. We additionally used the --reanalyze-reference flag, which recalculates admixture in the reference samples for improved ability to distinguish ancestries.

Local ancestry of genome-wide significant variants: Haplotypes of the genomic regions around genome-wide significant associations were aligned to the local ancestry calls according to genomic position. To compare MAF of top hits on different ancestral backgrounds within a specific admixed population (AFA or AMA), subjects were grouped according to the number of copies (1 or 2) of a specific ancestry (European, African, and Native American, respectively) at that position. For a given SNP, MAF was calculated within each of the six groups. To ensure successful elimination of population stratification by standard global PC’s in regression analyses of admixed populations, two (out of 3, to reduce redundancy) local ancestry dosage covariates were included, coded as the number of copies (0, 1 or 2) from a given ancestral background. Finally, to test whether effects of the minor allele depend on a specific ancestral background (European, African, and Native American), for each SNP we coded variables that counted the number of copies of the minor allele per ancestral background. Association between these three variables and PTSD was jointly evaluated using logistic regression, including study indicators and five global ancestry PC’s as additional covariates.
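The two kinds of variables described above (local ancestry dosage and ancestry-specific minor allele counts) can be illustrated with a small Python sketch. The data layout is an assumption: each subject is represented at one SNP by two phased haplotypes, each given as a tuple of minor-allele indicator and local ancestry label. The labels `EUR`, `AFR`, `NAT` and the function names are hypothetical.

```python
ANCESTRIES = ("EUR", "AFR", "NAT")  # hypothetical labels for the three backgrounds


def local_ancestry_dosage(haplotypes):
    """Number of copies (0, 1 or 2) of each ancestry at the SNP position."""
    return {a: sum(1 for _, anc in haplotypes if anc == a) for a in ANCESTRIES}


def ancestry_specific_minor_allele_counts(haplotypes):
    """Number of minor-allele copies carried on each ancestral background."""
    return {a: sum(allele for allele, anc in haplotypes if anc == a)
            for a in ANCESTRIES}


# One subject: minor allele on an African-ancestry haplotype,
# major allele on a European-ancestry haplotype.
subject = [(1, "AFR"), (0, "EUR")]
print(local_ancestry_dosage(subject))
print(ancestry_specific_minor_allele_counts(subject))
```

Because the three dosages always sum to 2 for a diploid subject, any one of them is determined by the other two, which is consistent with including only two of the three as covariates.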

Functional mapping and annotation

We used Functional Mapping and Annotation of genetic associations (FUMA) v1.3.0 (https://fuma.ctglab.nl/) to annotate GWAS data and obtain functional characterization of risk loci. Annotations are based on human genome assembly GRCh37 (hg19). FUMA was used with default settings unless stated otherwise. The SNP2Gene module was used to define independent genomic risk loci and variants in LD with lead SNPs (r2 > 0.6, calculated using ancestry appropriate 1KGP reference genotypes). SNPs in risk loci were mapped to protein-coding genes with a 10 kb window. Functional consequences for SNPs were obtained by mapping the SNPs on their chromosomal position and reference alleles to databases containing known functional annotations, including ANNOVAR, Combined Annotation Dependent Depletion (CADD), RegulomeDB (RDB), and chromatin states (only brain tissues/cell types were selected). Next, eQTL mapping was performed on significant (FDR q < 0.05) SNP-gene pairs, mapping to GTEx v7 brain tissue, RNA-seq data from the CommonMind Consortium and the BRAINEAC database. Chromatin interaction mapping was performed using built-in chromatin interaction data from the dorsolateral prefrontal cortex, hippocampus and neuronal progenitor cell line. We used an FDR q < 1 × 10−5 to define significant interactions, based on previous recommendations, modified to account for the differences in cell lines used here. SNPs were also checked for previously reported phenotypic associations in published GWAS listed in the NHGRI-EBI catalog.

Gene-based and gene-set analysis with MAGMA

Gene-based analysis was performed with the FUMA implementation of MAGMA. SNPs were mapped to 18,222 protein-coding genes. For each gene, its association with PTSD was determined as the weighted mean χ2 test statistic of SNPs mapped to the gene, where LD patterns were calculated using ancestry appropriate 1KGP reference genotypes. Significance of genes was set at a Bonferroni-corrected threshold of P = 0.05/18,222 = 2.7 × 10−6.

To test whether specific biological pathways were implicated in PTSD, gene-based test statistics were used to perform a competitive set-based analysis of 10,894 pre-defined curated gene sets and GO terms obtained from MsigDB using MAGMA. Significance of pathways was set at a Bonferroni-corrected threshold of P = 0.05/10,894 = 4.6 × 10−6. To test if tissue-specific gene expression was associated with PTSD, gene-set-based analysis was also used with expression data from GTEx v7 RNA-seq and BrainSpan RNA-seq, where the expression of genes within specific tissues was used to define the gene properties used in the gene-set analysis model.
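The two Bonferroni thresholds quoted in this section follow directly from dividing the family-wise error rate by the number of tests. The short Python check below reproduces the arithmetic. The helper name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


gene_threshold = bonferroni_threshold(0.05, 18222)      # protein-coding genes
gene_set_threshold = bonferroni_threshold(0.05, 10894)  # curated gene sets and GO terms

print(f"{gene_threshold:.1e}")      # 2.7e-06
print(f"{gene_set_threshold:.1e}")  # 4.6e-06
```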

Functional follow-up of the AFA top hit rs115539978

Cell Culture Experiments, RNA extraction and qPCR: Lymphoblastoid cell lines (LCLs) from the AFR superpopulation were obtained from the Coriell Institute, NJ (Supplementary Table 13, N = 6 lines each for the homozygous major and homozygous minor allele). Cells were cultured in RPMI 1640 medium with GlutaMAX (Thermo Scientific, 61870-036) supplemented with 15% FBS (Thermo Scientific, 26140079) and 1X Antibiotic-Antimycotic (Thermo Scientific, 15240-062) at 37 °C and 5% CO2 in a humidified incubator. For Dexamethasone (Dex) treatment, a final concentration of 100 nM Dex (Sigma–Aldrich) in 100% Ethanol was added to the medium for a total of 4 h. All experiments were run in duplicate.

RNA was extracted using the Quick-RNA MiniPrep Kit (Zymo, R2060) according to the instructions of the manufacturer including an additional DNase digestion. RNA concentrations were quantified via Qubit and cDNA was generated using the SuperScript IV First Strand Kit (Life Technologies, 18091200) according to the manufacturer’s instructions. SYBR green qPCR reactions were carried out in duplicate using POWERUP SYBR Green Master Mix (Life Technologies, A25743) and custom primer pairs (Supplementary Table 14) according to the manufacturer’s recommendations. Data were analyzed using the ΔΔCt method88 with GAPDH as reference. Between-group differences were calculated using one-way ANCOVA with sex as covariate. The significance threshold was set at P = 0.05.
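For readers unfamiliar with the ΔΔCt calculation, the following Python sketch shows the standard form of the method, which assumes approximately 100% amplification efficiency for both the target and the reference gene. The function and argument names are hypothetical and the Ct values in the example are invented for illustration.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the delta-delta-Ct method (2 ** -ddCt)."""
    # Normalize the target gene to the reference gene within each condition.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized values between conditions.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Illustrative Ct values: the target crosses threshold two cycles earlier
# (relative to the reference) after treatment, i.e. a 4-fold increase.
print(fold_change_ddct(24.0, 18.0, 26.0, 18.0))  # 4.0
```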

Deep phenotyping of the AFA top hit rs115539978

Neuroimaging: Scanning of 87 GTPC subjects took place on a 3.0 T Siemens Trio with echo-planar imaging (Siemens, Malvern, PA). High-resolution T1-weighted anatomical scans were collected using a 3D MP-RAGE sequence, with 176 contiguous 1 mm sagittal slices (TR/TE/TI = 2000/3.02/900 ms, 1 mm3 voxel size). T1 images were processed in Freesurfer version 5.3 (https://surfer.nmr.mgh.harvard.edu). Gray matter volume from subcortical structures was extracted through automated segmentation, and data quality checks were performed following the ENIGMA 2 protocol (http://enigma.ini.usc.edu/protocols/imaging-protocols/), a method designed to standardize quality control procedures across laboratories to facilitate replication. Briefly, segmented T1 images were visually examined for errors, and summary statistics and a summary of outliers ± 3 SD from the mean were generated from the segmentation of the left and right amygdala and hippocampus. Regional volumes that were visually confirmed to contain a segmentation error were discarded.

Startle Physiology: The physiological data of 299 GTPC subjects were acquired using Biopac MP150 for Windows (Biopac Systems, Inc., Aero Camino, CA). The acquired data were filtered, rectified, and smoothed using MindWare software (MindWare Technologies, Ltd., Gahanna, OH) and exported for statistical analyses. Startle data were collected by recording the eyeblink muscle contraction using the electromyography (EMG) module of the Biopac system. The startle response was recorded with two Ag/AgCl electrodes; one was placed on the orbicularis oculi muscle below the pupil and the other 1 cm lateral to the first electrode. A common ground electrode was placed on the palm. Impedance levels were less than 6 kilo-ohms for each participant. The startle probe was a 108-dB(A)SPL, 40 ms burst of broadband noise delivered through headphones (Maico, TDH-39-P). The maximum amplitude of the eyeblink muscle contraction 20–200 ms after presentation of the startle probe was used as a measure of startle magnitude.
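The startle magnitude definition (maximum EMG amplitude 20–200 ms after the probe) can be expressed as a short Python sketch. It assumes an already filtered, rectified and smoothed trace supplied as parallel lists of sample times and amplitudes, and it treats the window endpoints as inclusive. Both are assumptions, and the function name is hypothetical.

```python
def startle_magnitude(times_ms, emg, probe_onset_ms, start_ms=20, end_ms=200):
    """Maximum EMG amplitude in the window [onset + start, onset + end]."""
    window = [
        amplitude
        for t, amplitude in zip(times_ms, emg)
        if probe_onset_ms + start_ms <= t <= probe_onset_ms + end_ms
    ]
    if not window:
        raise ValueError("no samples fall inside the scoring window")
    return max(window)
```

Samples before 20 ms or after 200 ms are ignored even if they are larger, so an early artifact or a late movement does not count as the startle response.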

Polygenic scoring

Polygenic risk scores (PRS) were calculated in hold-out target samples based on SNP effect sizes from PTSD GWAS in non-overlapping discovery/training samples. GWAS summary statistics were filtered to common (MAF > 5%), well-imputed variants (INFO > 0.9). Indels, ambiguous SNPs, and variants in the extended MHC region (chr6:25-34 Mb) were removed. LD pruning was performed using the --clump procedure in PLINK1.9, where variants were pruned if they were nearby (within 500 kb) and in LD (r2 > 0.3) with the leading variant (lowest P-value) in a given region. PRS were calculated in PRSice v2.1.2 using the best-guess genotype data of target samples, where for each SNP the risk score was estimated as the natural log of the odds ratio multiplied by the number of copies of the risk allele. The PRS was estimated as the sum of risk scores over all SNPs. PRS were generated at multiple P-value thresholds (\(P_T\)) (at intervals of 0.01 ranging from P = 0.0001 to P = 1). Best-fit PRS (at \(P_T\) = 0.3 for PTSD and \(P_T\) = 0.3 for re-experiencing symptoms, respectively) were used to predict PTSD status under logistic regression, adjusting for 5 PCs and dummy study indicator variables, using the glm function in R 3.2.1. PRS prediction plots were based on quintiles of PRS, with odds ratios calculated in reference to the lowest quintile. The proportion of variance explained by PRS was estimated as the difference in Nagelkerke’s R2 between a model including PRS plus covariates and a model with only covariates. R2 was converted to the liability scale assuming a 30% prevalence, using the equation found in Lee et al.89. P-values for PRS were derived from a likelihood ratio test comparing the two models.
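The per-subject score described above (sum over SNPs of ln(odds ratio) times the number of risk alleles, restricted to SNPs passing a P-value threshold) can be sketched in Python as follows. This is an illustration, not PRSice code: the function and argument names are hypothetical, the inputs are assumed to be already clumped, and the choice of an inclusive threshold (P ≤ threshold) is an assumption.

```python
import math


def polygenic_risk_score(odds_ratios, risk_allele_counts, p_values, p_threshold):
    """Sum of ln(OR) * risk-allele count over SNPs with P <= p_threshold.

    All three input sequences are aligned by SNP; risk_allele_counts holds
    best-guess genotype counts (0, 1 or 2) for one subject.
    """
    score = 0.0
    for odds_ratio, count, p in zip(odds_ratios, risk_allele_counts, p_values):
        if p <= p_threshold:
            score += math.log(odds_ratio) * count
    return score


# Illustrative values only: the third SNP (P = 0.5) is excluded at a threshold of 0.3.
score = polygenic_risk_score([1.1, 0.9, 1.2], [2, 1, 0], [0.01, 0.2, 0.5], 0.3)
print(round(score, 4))
```

Repeating the calculation over a grid of thresholds and keeping the one that best predicts case status corresponds to the best-fit threshold selection described in the text.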

Genetic correlation of PTSD with other traits and disorders

Bivariate LD-score regression (LDSC) was used to calculate pairwise genetic correlation (\(r_g\)) between PTSD and 235 traits with publicly available GWAS summary statistics on LD Hub12. Summary statistics for PTSD studies were restricted to the EUA meta-analysis, including UKB subjects (23,212 cases, 151,447 controls) and significance was evaluated based on a conservative Bonferroni correction for 235 phenotypes (i.e. correlated traits and traits measured twice in independent studies were counted independently).

In addition, these phenotypes were compared with genetic correlations reported for PTSD and several psychiatric disorders, including 221 phenotypes and MDD18, 172 phenotypes and Schizophrenia (SCZ)10, 196 phenotypes and bipolar disorder (BPD)19 and 219 phenotypes and attention-deficit/hyperactivity disorder (ADHD)20. Due to substantial overlap with other traits, two education, four anthropometric and two cancer phenotypes were omitted.

Conditional analyses to test for disease specific effects

To evaluate whether the effects of the top variants identified in the PTSD GWAS were specific to PTSD, we conditioned PTSD on MDD, and on MDD plus BPD plus SCZ, using the multi-trait conditional and joint analysis (mtCOJO)24 feature in GCTA to regress out the effects of correlated traits based on external GWAS summary data. MDD was selected here as the main psychiatric trait because of the high co-morbidity and genetic correlation of depressive symptoms and PTSD (\(r_g\) = 0.80 for depressive symptoms and \(r_g\) = 0.62 for MDD; see Supplementary Data 3). Publicly available summary statistics were supplied as program inputs: Bipolar cases vs. controls for BPD, and MDD2 excluding 23andMe for MDD (both from https://www.med.unc.edu/pgc/results-and-downloads); Schizophrenia: CLOZUK + PGC2 meta-analysis for SCZ (http://walters.psycm.cf.ac.uk/). The effect of each psychiatric disorder on PTSD was estimated using a generalized summary-data based Mendelian randomization analysis of significant LD independent psychiatric trait SNPs (r2 < 0.05, based on 1000 Genomes Phase 3 CEU samples), where the threshold for significance was set to P < 5 × 10−7 due to having fewer than the required 10 significant independent SNPs at the program default P < 5 × 10−8 for MDD. Heritability, genetic correlation, and sample overlap of psychiatric trait and PTSD GWAS were estimated using precomputed LD scores based on 1000 Genomes Europeans that were supplied with LDSC (https://data.broadinstitute.org/alkesgroup/LDSCORE/eur_w_ld_chr.tar.bz2).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.