All the protocols and methods used in this study were approved by the institutional review board of the Kunming Institute of Zoology, Chinese Academy of Sciences and adhere to all relevant national and international regulations.

Clinical association samples

In the discovery stage, we performed a meta-analysis using summary statistics from a BPD GWAS described by Ruderfer et al.27 and from a non-overlapping MDD GWAS described elsewhere.28 In brief, the BPD GWAS sample included 10 410 cases and 10 700 controls; it partially overlaps with the Psychiatric Genomics Consortium (PGC) BPD GWAS,12 but also includes four additional BPD samples.12 Standardized semi-structured interviews were used to collect clinical information about lifetime history of psychiatric illness, and operational criteria were applied to make lifetime diagnoses. All cases had experienced pathologically relevant episodes of elevated mood (mania or hypomania) and met the criteria for BPD within the primary study classification system. Controls were selected from the same geographical and ethnic populations as the cases and had a low probability of having BPD.

The MDD GWAS includes 9227 patients and 7383 controls. Cases were required to have diagnoses of DSM-IV lifetime MDD established using structured diagnostic instruments from direct interviews by trained interviewers or clinician-administered DSM-IV checklists. Most samples ascertained cases from clinical sources, and most controls were randomly selected from the population and screened for lifetime history of MDD. In each GWAS, logistic regression was applied to test the association of clinical diagnosis with SNP dosages under an additive model. Covariates included sample grouping and principal components reflecting ancestry. Detailed descriptions of the samples, data quality, genomic controls and statistical analyses can be found in the original GWAS.27, 28

Replication analyses were performed in nine independent BPD or MDD samples comprising 9920 patients and 13 973 controls, none of which overlapped with the discovery samples. Detailed information on individual samples, including diagnostic assessment, genotyping and quality control, is shown in the Supplementary Data and Supplementary Table 1. Most of these replication samples were reported in earlier large-scale collaborative studies, where they were found to be effective in detecting genetic risk variants for BPD.10, 22, 29 Subjects in each of the original samples were recruited under the relevant ethical and legal guidelines for their respective areas, and all provided written informed consent prior to their inclusion in the earlier studies. In brief, the origins and sizes of the replication samples are as follows: (1) Sweden (1415 BPD cases and 1271 controls);29 (2) Romania (461 BPD cases and 329 controls);30 (3) Germany II (181 BPD cases and 527 controls);14, 29 (4) Australia (330 BPD cases and 1811 controls);14, 22 (5) USA (58 BPD cases and 145 controls); (6) China-I (198 BPD cases and 135 controls); (7) PsyCoLaus (1585 MDD cases and 2362 controls);15, 29 (8) China-II (5303 MDD cases and 5337 controls);31 (9) The Netherlands (389 MDD cases and 2056 controls).32

SNP selection, genotyping and statistical analysis

For genotyping in our replication samples, we mainly used the Illumina (San Diego, CA, USA) and Affymetrix platforms (details are shown in the Supplementary Data), and the genotyping yield was at least 99% in cases and control subjects of all groups. For initial screening in the discovery BPD and MDD samples, association statistics for a total of 559 SNPs covering 2.0 Mb in the 13q21.1 region were obtained from both GWAS samples. We used PLINK v1.07 to perform the meta-analysis of the 559 SNPs in the two samples. We used the odds ratio (OR) and standard error (SE) from each sample to estimate heterogeneity between individual samples and to calculate the pooled OR and 95% confidence interval (CI) in the combined samples. Heterogeneity between samples was assessed using Cochran’s Q χ² test, where Q is a weighted sum of the squared deviations of the individual OR estimates from the overall estimate. In the absence of heterogeneity among individual studies, we used a fixed-effect model to combine the samples and to calculate the pooled OR and the corresponding 95% CI; otherwise, a random-effect model was applied. The meta-analysis was performed using the classical inverse-variance weighted method. The regional association results for the 559 SNPs were plotted using LocusZoom (http://locuszoom.sph.umich.edu/locuszoom/).33 For the replication analysis and all combined analyses of rs9537793, the ‘metafor’ package in R (http://www.R-project.org) was used to perform the meta-analysis under the appropriate genetic model. We used a forest plot to present the pooled ORs and 95% CIs for rs9537793. Each study is represented by a square in the plot, and the weight of each study is also shown. As described in a previous GWAS meta-analysis,12 P-values for replication samples are reported as one-tailed tests and P-values for all combined samples as two-tailed tests. A P-value <8.94 × 10⁻⁵ was set as the statistical significance level in the discovery and combined samples; in the replication samples, a P-value <0.05 was considered significant.
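The inverse-variance weighting and Cochran’s Q statistic described above can be illustrated with a minimal Python sketch. It is not the PLINK or metafor implementation; the function name and the example ORs and SEs are hypothetical, and the SE is assumed to be on the log-OR scale. Only the fixed-effect case is covered. The last line notes that 8.94 × 10⁻⁵ is numerically equal to 0.05/559, which would be consistent with a Bonferroni correction for the 559 SNPs; this reading is an inference rather than something stated in the text.

```python
import math


def inverse_variance_meta(odds_ratios, se_log_or):
    """Fixed-effect inverse-variance meta-analysis on the log-OR scale.

    odds_ratios: per-sample OR estimates.
    se_log_or:   per-sample standard errors of log(OR) (assumed scale).
    """
    betas = [math.log(o) for o in odds_ratios]
    weights = [1.0 / se ** 2 for se in se_log_or]
    w_sum = sum(weights)

    # Pooled effect is the weight-averaged log(OR).
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)

    # Cochran's Q: weighted sum of squared deviations from the pooled effect.
    q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))

    # Two-tailed P-value from the normal approximation.
    z = pooled_beta / pooled_se
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))

    return {
        "or": math.exp(pooled_beta),
        "ci95": (math.exp(pooled_beta - 1.96 * pooled_se),
                 math.exp(pooled_beta + 1.96 * pooled_se)),
        "q": q,
        "q_df": len(betas) - 1,
        "p": p_two_tailed,
    }


# Hypothetical per-sample estimates for one SNP (not taken from the study).
result = inverse_variance_meta([1.10, 1.08], [0.03, 0.04])

# 0.05 divided by the 559 tested SNPs (assumed Bonferroni interpretation).
BONFERRONI_THRESHOLD = 0.05 / 559
```

Q would then be compared with a χ² distribution on `q_df` degrees of freedom to decide between the fixed-effect and random-effect models; that step is omitted here to keep the sketch dependency-free.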

Cognitive index

We used educational attainment as a ‘proxy phenotype’ for cognitive function. Although it is not a direct cognitive measure, educational attainment is correlated with cognitive ability (r~0.5) and with some personality traits related to persistence and self-discipline.34 Educational attainment is strongly associated with social outcomes, and there is a well-documented health-education gradient. Estimates suggest that around 40% of the variance in educational attainment is explained by genetic factors.34 The harmonized measurements of educational attainment were coded from study-specific measures using the International Standard Classification of Education (1997) scale,35 and included a binary variable for college completion (named ‘College’, that is, whether a college degree was completed) and a quantitative variable defined as an individual’s years of schooling (named ‘EduYears’, that is, the number of years of schooling completed). College may be more comparable across countries, whereas EduYears contains more information about individual differences within countries. Recently, a GWAS of these ‘educational attainment’ phenotypes was performed in 101 069 European individuals,34 and we used the statistical results from that GWAS as our first-step analysis. Briefly, educational attainment was measured at an age at which participants were very likely to have completed their education (more than 95% of the sample was at least 30 years old). On average, participants had 13.3 years of schooling, and 23.1% had a college degree. In the second-step analysis,36 we used a larger sample (n=293 723) that partially overlaps with the first-step sample; in this sample, only the ‘EduYears’ phenotype was assessed, using the same standard of measurement as in the first-step analysis. Detailed information on the samples, genotyping methods and statistical analyses can be found in the original GWAS reports.34, 36

Personality traits measurement

Personality can be regarded as a set of characteristics that influence people’s thoughts, feelings and behavior across a variety of settings. Over the last century, scientific consensus has converged on a taxonomic model of personality traits based on five higher-order dimensions (neuroticism, extraversion, openness to experience, agreeableness and conscientiousness), known as the five-factor model.37 Neuroticism refers to the tendency to experience diverse and relatively more intense negative emotions, and is commonly defined as emotional instability; it involves the experience of negative emotions such as anxiety, depression and hostility, as well as vulnerability to stress. Neuroticism is a pervasive risk factor for different psychiatric conditions, including mood disorders and personality disorders, and is also associated with emotional dysregulation.38, 39

In 2015, the Genetics of Personality Consortium (GPC) conducted a GWAS of neuroticism40 in 63 661 individuals from Europe, the United States and Australia. We obtained the statistical results for the PCDH17 risk SNP from this GWAS as our discovery analysis.40 In brief, in their GWAS, neuroticism scores were harmonized across all 29 discovery cohorts by item response theory analysis, and association with SNPs was tested using additive linear regression, with sex, age and principal components as covariates. Later, in 2016, Okbay et al.41 performed an expanded analysis (n=170 911) by pooling summary statistics from the published GPC study40 (n=63 661) with results from a new analysis of UK Biobank data42 (UKB, n=107 245). In the UKB cohort, the measure was the respondent’s score on a 12-item version of the Eysenck Personality Inventory Neuroticism scale.

Subcortical structure testing

Subcortical brain regions form circuits with cortical areas to coordinate learning, memory43 and motivation,44 and altered circuits can lead to abnormal behavior and disease.45 To investigate how common genetic variants affect the structure of these brain regions, the ENIGMA2 consortium conducted GWASs of the volumes of several subcortical regions derived from magnetic resonance images (MRI).46

We focused on two phenotypes (amygdala volume and hippocampal volume) closely relevant to the risk of mood disorders, and obtained the statistical results from the ENIGMA2 GWAS discovery sample.46 In short, the discovery sample includes 13 171 European subjects. The subcortical brain measures (amygdala and hippocampus) were delineated using well-validated, freely available brain segmentation software packages: FIRST, part of the FMRIB Software Library (FSL), or FreeSurfer. The standardized protocols for image analysis and quality assurance are openly available online (http://enigma.ini.usc.edu/protocols/imaging-protocols/). For each SNP, the additive dosage value was regressed against each trait of interest separately using a multiple linear regression framework controlling for age, age², sex, four multidimensional scaling (MDS) components, intracranial volume (ICV) and diagnosis (when applicable). For studies with data collected from several centers or scanners, dummy-coded covariates were also included in the model. Detailed information on the samples, imaging procedures and genotyping methods can be found in the original GWAS.46

Functional MRI analysis

Imaging Subjects

Functional magnetic resonance images were obtained from healthy German participants (N=297) of European ancestry, as part of a tricentric study on the neurogenetic mechanisms of psychiatric disease (the MooDS cohort).47, 48, 49 The subjects were recruited from the communities in Mannheim, Bonn and Berlin (mean age 33.77±9.81 years, 134 males and 163 females). Exclusion criteria included a lifetime history of significant general medical, psychiatric or neurological illness, prior drug or alcohol abuse, head trauma, and the presence of a first-degree relative with mental illness. This particular experiment was approved by the ethics committees of the Universities of Bonn, Heidelberg and Berlin. All subjects provided written informed consent to participate in the study.

Genotyping

Genotyping of rs9537793 was performed using Illumina Human 610-Quad and Illumina Human 660 W-Quad arrays (Illumina). Genotype frequencies for the SNP were in Hardy–Weinberg equilibrium (77 AA, 146 AG, 74 GG, P=0.77). Age, handedness, sex, site and level of education did not differ significantly between genotype groups (see Supplementary Table 2 for characteristics of the matched sample).
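As a worked check of the Hardy–Weinberg result, the sketch below applies a one-degree-of-freedom χ² goodness-of-fit test to the reported genotype counts. The text does not state which test was used, so the choice of the χ² test (rather than an exact test) is an assumption, and the function name is illustrative.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Returns the chi-square statistic and its P-value.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of the first allele
    q = 1.0 - p

    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Genotype counts reported for rs9537793 (AA, AG, GG).
chi2, p_value = hwe_chi_square(77, 146, 74)
```

With these counts the test gives a P-value close to the reported 0.77, so the observed genotype distribution shows no evidence of departure from equilibrium.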

Emotional Face-matching Task

During functional MRI (fMRI) scanning, participants completed an emotional face-matching task. The face-matching task is an implicit emotion processing task which has previously been shown to robustly engage the amygdala.50, 51 This task includes two conditions: an emotional condition (matching faces) and a control condition (matching shapes). In the emotional condition, subjects view trios of faces with fearful or angry expressions and are asked to match the two corresponding stimuli illustrating the same individual. In the control condition, the participants view trios of simple geometric shapes (circles, vertical and horizontal ellipses) and are asked to match the two corresponding geometric shapes. The task is presented in eight blocks of six trials (30 s) with alternating epochs of face- and shape-matching conditions (task duration: 4.3 min or 130 whole-brain scans).

Imaging parameters

Blood oxygenation level-dependent fMRI was performed using three identical scanners (Siemens Trio 3 T; Siemens Medical Solutions, Erlangen, Germany) at the Central Institute of Mental Health Mannheim, University of Bonn and the Universitätsmedizin Charité, Berlin. Data were acquired with gradient-recalled echo-planar imaging (GRE-EPI) sequences with the following parameters: TR 2000 ms, TE 30 ms, 28 oblique slices (descending acquisition) per volume, 4 mm slice thickness, 1 mm slice distance, 80° flip angle, 192 mm FOV, and 64 × 64 matrix. Quality assurance measures were conducted on every measurement day at all sites according to a multicenter quality assurance protocol revealing stable signals over time.

Functional Imaging Processing

fMRI images were processed using Statistical Parametric Mapping (SPM8, http://www.fil.ion.ucl.ac.uk/spm/). The procedures followed our previously published studies with the same task.52, 53 In brief, the preprocessing included realignment, slice timing correction, normalization to Montreal Neurological Institute (MNI) space with a voxel size of 3 × 3 × 3 mm³, and spatial smoothing with a 9 mm full-width at half-maximum (FWHM) Gaussian kernel. The preprocessed images were then analyzed at two levels. At the first level, images for each individual were analyzed using general linear models (GLM), in which the boxcar vectors for task conditions (convolved with the standard SPM hemodynamic response function) were included as regressors of interest and the six head motion parameters from the realignment step were included as regressors of no interest. The data were high-pass filtered (cutoff, 128 s) and individual maps for the ‘face-matching>shape-matching’ contrast were computed. The contrast images were then used for a second-level random-effects analysis. To test for genetic association, these contrast images were analyzed using a multiple regression model that included the three genotype groups (coded as 0, 1 and 2) as the variable of interest and age, sex and scanner site as nuisance covariates. Significance was assessed at P<0.05, family-wise error corrected across an a priori defined anatomical mask of the bilateral amygdala from the Automated Anatomical Labeling atlas.54 To determine more precisely in which subregion the peak voxel was located, we further extracted three amygdala subdivisions (superficial, latero-basal and centro-medial complex) from the Anatomy toolbox55, 56 and corrected the peak voxel across the three subregional masks. The corrected P-values for each of the masks are reported.

Healthy subjects for expression quantitative trait loci analysis

To identify the impact of risk SNPs on mRNA expression, we used the well-characterized gene expression database BrainCloud (http://braincloud.jhmi.edu/).57 BrainCloud is intended to increase understanding of the regulation of gene expression in the human brain and to support functional follow-up of disease-associated variants. It comprises postmortem dorsolateral prefrontal cortex samples from 261 non-psychiatric individuals across the lifespan, including 113 Caucasian and 148 African American subjects. We used 224 postnatal individuals (110 Caucasians and 114 African Americans) from BrainCloud, which also provides genotype data. The raw genotype data were obtained from BrainCloud, together with expression data and demographic information such as RNA integrity number, race, sex and age. Prenatal subjects were removed from the expression quantitative trait loci (eQTL) analysis because PCDH17 mRNA is differentially expressed in fetal subjects compared with postnatal subjects. The statistical analysis was conducted using linear regression, with RNA integrity number, sex, race and age as covariates.

RNA-seq data processing in SMRI data set for diagnostic analysis

We downloaded raw RNA-sequencing (RNA-seq) reads from the SMRI data set (http://sncid.stanleyresearch.org/) in FASTQ format. The RNA-seq data were generated from frontal cortex samples (15 BPD, 15 MDD and 15 healthy controls) of the SMRI neuropathology collection. After adaptor and low-quality filtering using btrim64 (ref. 58), reads were aligned to the human reference genome (hg38, http://asia.ensembl.org/index.html) with Tophat2 v2.0.14 (ref. 59), allowing no more than 3 bases each for mismatches, gap length and edit distance. Cufflinks v2.2.1 (ref. 60) was then applied with default parameters to call new transcripts and to quantify both new and previously annotated transcripts. For replicate samples, the accepted-hits BAM files from the Tophat2 alignment were merged with Samtools v0.1.18 (ref. 61), and the merged files were used for the subsequent Cufflinks quantification. Only reads uniquely mapped to genes were used to calculate gene expression levels. To quantify mRNA expression, FPKM (fragments per kilobase per million mapped reads) was calculated at the gene level according to the formula $\mathrm{FPKM} = (F \times 10^{3}/L) \times (10^{6}/N)$, where F is the number of fragments mapping to the gene annotation, L is the length of the gene structure in nucleotides, and N is the total number of sequence reads mapped to the genome.
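The FPKM formula can be made concrete with a short Python sketch. This illustrates the arithmetic only and is not how Cufflinks computes its estimates internally; the function name and the example counts are hypothetical.

```python
def fpkm(fragments, gene_length_nt, total_mapped_reads):
    """Fragments per kilobase per million mapped reads.

    fragments:          F, fragments mapping to the gene annotation.
    gene_length_nt:     L, length of the gene structure in nucleotides.
    total_mapped_reads: N, total number of mapped sequence reads.
    """
    if gene_length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and mapped-read total must be positive")
    # Scale by gene length in kilobases, then by sequencing depth in millions.
    return fragments * 1e3 / gene_length_nt * 1e6 / total_mapped_reads


# Hypothetical gene: 500 fragments, 2000 nt long, 25 million mapped reads.
example_value = fpkm(500, 2000, 25_000_000)
```

In this example, 500 fragments on a 2 kb gene give 250 fragments per kilobase, and dividing by 25 (million mapped reads) yields an FPKM of 10.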

Statistical analyses of mRNA expression associated with diagnosis were conducted in R 3.0.1 using linear regression, covarying for RNA integrity number, sex, age, race, duration of illness, brain pH, post-mortem interval, suicide status and batch number in each sample. All reported two-sided P-values were calculated from t statistics computed from the log fold change and its standard error in each multiple regression model, and therefore represent covariate-adjusted P-values.

Pluripotent stem cell analysis

The expression analysis of PCDH17 in induced pluripotent stem cells (iPSCs) and neurons derived from BPD patients and healthy controls has been described in a previous study.62 In brief, subjects contributing a skin sample were Caucasian and were either patients diagnosed with bipolar I disorder, recruited from a psychiatric clinic in a mid-western college city, or healthy unaffected controls ascertained through advertising on the University of Michigan Clinical Studies website. To characterize the iPSCs and to determine whether there were differences in their gene expression profiles with neuronal differentiation, total RNA was isolated from six individual iPSC lines (three BPD patients and three controls) before and after 8 weeks of neuronal differentiation using the TRIzol reagent (Invitrogen, Grand Island, NY, USA). The RNAs were amplified and hybridized to GeneChip U133 Plus 2.0 microarrays (Affymetrix, Santa Clara, CA, USA). Only complete sets of iPSCs (three BPD patients and three controls) and neurons (derived from the same six cell lines) were analyzed to minimize stochastic changes due to culture conditions. Detailed protocols for fibroblast derivation, iPSC derivation and neuronal differentiation were described previously.62

Plasmid constructs and reagents

A pCMV vector (Agilent Technologies, Santa Clara, CA, USA) encoding human PCDH17 with a C-terminal Myc tag and a pEGFP vector were used. The integrity of the constructs was verified by sequencing. The following antibodies were used: GFP rabbit polyclonal (MBL) and Myc mouse monoclonal (MBL).

Cortical neuronal cultures and transfection

Dissociated cortical neurons were prepared from the cerebral cortex of embryonic day 16.5 (E16.5) C57BL/6J mice. In brief, cortices were dissected, trypsinized and gently minced. Neurons were seeded at a density of 1 × 10⁶ viable cells per 35 mm glass-bottom dish previously coated with poly-D-lysine (1 mg ml⁻¹) for at least 12 h at 37 °C. Cultures were maintained at 37 °C with 5% CO₂ in Neurobasal medium supplemented with 2% B27 (Invitrogen), penicillin/streptomycin (100 U ml⁻¹ and 100 μg ml⁻¹, respectively), 2.5 mM glutamine and 5% fetal bovine serum. Cultures were transfected with Lipofectamine 2000 (Invitrogen) at 17–18 days in vitro (DIV) with EGFP plus mock plasmid or PCDH17-myc and maintained for an additional day before imaging analysis.

Quantitative morphological analysis of dendritic spines

Transfected neurons were fixed in 4% paraformaldehyde with 4% sucrose at 4 °C. Immunostaining with an antibody to GFP was used to circumvent potential unevenness of GFP diffusion in spines. For co-transfection experiments, neurons that were clearly transfected with both GFP and PCDH17-myc were imaged. Transfected neurons were chosen randomly and images were obtained using a TCS SP8 confocal microscope (Leica Microsystems). The acquisition parameters were kept constant for all scans in the same experiment. Deconvolution was performed and image stacks (0.13 mm z series) were quick projected. The first- or second-order dendrites arborizing from a neuron were subjected to morphological analysis. Data analysis was carried out using ImageJ software (NIH, MD, USA). Dendritic spine density was evaluated manually. Individual spines on dendrites were traced, and the neck length and head width of each spine were measured. To analyze spine morphology, at least 400 spines (from 16 neurons) were measured for each condition. On the basis of morphology, spines were classified into the following categories: (1) Thin, where the head width was <0.4 μm; (2) Mushroom, where the head width was >0.4 μm; and (3) Stubby, where the neck length was <0.1 μm. Statistics were calculated in Prism v 6.07 (GraphPad Software, San Diego, CA, USA). Spine density, neck length and spine width were compared between two groups using a two-sided Student’s t-test. To compare the proportions of different spine types between two groups, two-way analysis of variance with a Bonferroni post hoc test was used.
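The spine classification criteria can be expressed as a small Python sketch. The text gives the thresholds but not the order in which they were applied, so two choices here are assumptions: the stubby criterion (neck length) is checked first, and a head width of exactly 0.4 μm is assigned to the mushroom class. The function names and the example measurements are hypothetical.

```python
from collections import Counter


def classify_spine(head_width_um, neck_length_um):
    """Assign a spine to 'stubby', 'thin' or 'mushroom'.

    Assumption: the neck-length criterion takes precedence over head width.
    """
    if neck_length_um < 0.1:
        return "stubby"
    if head_width_um < 0.4:
        return "thin"
    # Head width > 0.4 um; exactly 0.4 um also lands here by assumption.
    return "mushroom"


def spine_type_proportions(spines):
    """Return the fraction of spines in each class.

    spines: iterable of (head_width_um, neck_length_um) pairs.
    """
    counts = Counter(classify_spine(h, n) for h, n in spines)
    total = sum(counts.values())
    return {kind: counts[kind] / total
            for kind in ("thin", "mushroom", "stubby")}


# Hypothetical measurements: (head width, neck length) in micrometres.
example = [(0.30, 0.80), (0.55, 0.60), (0.45, 0.05), (0.35, 0.50)]
proportions = spine_type_proportions(example)
```

The per-class proportions computed this way for each neuron or condition are the kind of values that would feed the two-way analysis of variance described above.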