Sample recruitment, microarray genotyping, and WGS

We recruited 26 pedigrees (15 from CR and 11 from CO), each ascertained for multiple individuals diagnosed with BP1 (Table 1). Some families were previously studied using linkage analysis13,14,15,16,17. The ascertainment and phenotyping strategy was previously reported18, and is briefly reviewed in the Supplementary Text. Control individuals were relatives of BP1 individuals in families, and either they went through the complete psychiatric evaluation and were found to have no mental illness, or they answered negatively to all Mini International Neuropsychiatric Interview19 questions related to mood or psychotic symptoms and were > 60 years of age. Individuals who were not diagnosed as BP1 or who were not considered as controls had unknown disease status. Written informed consent was obtained from all participants. Institutional Review Boards at participating institutions approved all study procedures. Using DNA extracted from whole blood we performed microarray genotyping using Illumina Omni 2.5 chips; as reported previously12 this procedure yielded data after QC for 838 individuals (206 BP1) with 2,026,257 SNPs (Supplementary Figure 1). For WGS, we used ExomePicks to identify the subset of individuals to sequence that would enable maximum opportunity to impute variants into the remaining genotyped pedigree members. Owing to budgetary constraints, 22 pedigrees out of the 26 pedigrees (449 individuals after QC) were sequenced including 143 BP1 (Supplementary Figure 1). Illumina performed WGS using HiSeq 2000 with 36× mean coverage (100 bp read length).

Table 1 Description of families included in the current study. Full size table

Variant calling, QC, and imputation

We called SNVs using GATK best practices20,21. We removed variants that failed variant quality score recalibration and set each genotype whose quality score was ≤ 20 to missing (see Supplementary Text for details on QC). After QC, we had 449 individuals (143 BP1) and 20,396,290 SNVs (Supplementary Table 1). We then performed genotype refinement using Polymutt22, which corrected almost all Mendelian inheritance errors (Supplementary Table 2). To increase the sample size, we performed pedigree-aware genotype imputation using GIGI23,24, which imputed 334 individuals with only microarray data. After imputation, 782 individuals (190 BP1, 130 controls, and 462 unknown disease status) were either sequenced or imputed with high quality (see Supplementary Text and Supplementary Figure 2 for measuring imputation accuracy).

We performed genome-wide detection of CNVs using microarray and WGS data. For microarray data, we adapted a previously established pipeline25 based on PennCNV26, and QuantiSNP27. After removing individuals failing QC (Supplementary Figure 3), we detected 5,437 CNVs (3,317 deletions and 2,120 duplications) after filtering for rare events > 5 kb in length and spanned by a minimum of 10 probes among 782 individuals (189 BP1, 128 controls, and 465 unknown disease status). For WGS data, we called 8,768 bi-allelic deletions using GenomeSTRiP software28,29 among the 449 sequenced individuals, and used these calls to impute CNVs in the same set of individuals imputed for SNVs. We found that CNVs from GenomeSTRiP had low Mendelian error rate (Supplementary Figure 4) and low false-discovery rate (Supplementary Table 3 and Supplementary Text). A summary on the number of variants and individuals after QC and on which analysis is applied to each type of variant, is in Supplementary Table 4.

Variant annotation

SNVs were mapped to UCSC knownGene30 and GENCODE V.1931 transcripts. To identify rare variants, we used both external and internal sources of allele frequency. For SNVs we used allele frequencies in 1000 Genomes32 (1KG) Colombians (CLM) and ExAC33 Latinos (AMR). For CNVs, we extracted frequency information from the Database of Genomic Variants Gold Standard Variants34 for microarray CNVs and from Phase 3 of 1KG35 (AMR) for WGS CNVs. If a variant in our data set is present in an external source, we considered it rare if its MAF is < 1% in that source. For variants not present in any external source, we considered them rare if their MAF is < 10% in our data set where MAF is estimated from all sequenced individuals. Deleterious SNVs are stop-gain/loss, splice-site, and missense variants predicted damaging by PolyPhen-236.

Estimation of global admixture proportions

We generated estimates of admixture proportions for the 838 individuals with microarray data using ADMIXTURE37 with 57,180 LD-pruned SNPs. The reference populations were CEU (n = 112) and YRI (n = 113) from HapMap38,39, and 52 Native American samples from Central or South America who have virtually no European or African admixture40. We compared the proportion of European ancestry between BP1 individuals and controls using both a linear mixed model (LMM) based on lmekin function in coxme R package41 and a generalized linear mixed model (GLMM) based on GMMAT software42; both took into account relatedness of individuals using a kinship matrix calculated from theoretical kinship. In LMM, the dependent variable was the proportion of European ancestry, whereas the independent variable was BP1 status, and it was vice versa in GLMM.

PRS analysis

We calculated PRS of our samples with WGS data using PRSice43 and summary statistics from the Psychiatric Genomics Consortium (PGC) GWAS of BP12 and schizophrenia (SCZ)44 after excluding A/T and G/C SNVs and SNVs in the MHC region. Our WGS data were LD clumped, and we retained from the GWAS summary statistics the most significant SNV for each clump. We used LMM and GLMM to test association between BP1 status and PRS at each of five GWAS p value thresholds while considering relationships among individuals and global admixture proportions of European ancestry. We used logistic regression without considering relationships to estimate Nagelkerke R2 as it was not straightforward to estimate R2 using GLMM. We also did not include the admixture proportions when calculating R2 because we were interested in variance of BP1 explained only by PRS.

Identifying genes relevant to BP1

To increase power to detect effects of rare variants on BP1, we focused on genes for which a priori information indicated their relevance to BP1. To identify such genes, we utilized three sources of information. First, we performed a stratified LD score regression45 using the latest PGC BP1 GWAS summary statistics2 to identify cell-type specific promotor or enhancer regions in which BP1 heritability is enriched. Among the 10 cell-types groups tested, we observed enrichment of heritability for BP1 only in the central nervous system (CNS) group (Supplementary Figure 5), which contained 8,714 genes. Second, we used genes near 15 genome-wide significant independent lead SNPs in the latest PGC GWAS that analyzed only individuals with BP1, excluding individuals with other types of BP. We identified 72 genes around these SNPs using windows of 250 Kb. At last, we identified 99 genes within 1 Mb of BP1 linkage peaks (Supplementary Text, Supplementary Figure 6, Supplementary Table 5). These three sources yielded a gene-set of 8,757 unique protein-coding genes with one or more deleterious SNVs in at least one individual in our dataset (Supplementary Text and Supplementary Table 6).

Rare variant burden analysis

We compared burden scores between BP1 individuals and controls. For SNVs, this score was the mean burden of rare deleterious SNVs in our gene-set, which corresponds to the fraction of deleterious alternative minor alleles at those SNVs that each individual has. For CNVs, it was the number of genes in our gene-set affected by rare CNVs. Individuals who carry an overall larger number of rare SNVs may carry a larger number of rare deleterious SNVs; we, therefore, also calculated mean burden of all rare SNVs in the gene-set. For CNVs, we calculated the total number of CNVs and the average size of all CNVs that each individual carried. The mean burden of rare deleterious SNVs was regressed on the mean burden of all rare SNVs using LMM to account for relatedness. We performed a similar correction for the CNV burden score using the total number of CNVs and the average size of CNVs, and for microarray CNVs we also corrected for genotyping batch. The residuals after the LMM were then quantile-normal (QN) transformed, and we compared the QN transformed residuals between BP1 individuals and controls using both LMM and GLMM while taking into account relationships among individuals and admixture proportions of European ancestry.

Rare variant segregation analysis

Given a rare variant in a family, we developed a statistical approach that computes p values to estimate the probability of having the observed segregation pattern or more extreme segregation patterns under the null hypothesis of random segregation. Our segregation statistic for a rare variant (S rare ) is the sum of the number of BP1 individuals with a rare allele and the number of controls without the allele. To calculate the p value of S rare , we assume that the rare allele was introduced by certain founders in a family (denoted as Frv), enumerate many random inheritance vectors (IVs), and find the proportion of IVs that generate the same or larger S rare values (Supplementary Figure 7). To obtain a “Family-level” p value we computed the p value for each rare variant in each family. We also computed “Variant-level” and “Gene-level” p values that meta-analyzed p values using the Fisher’s method across different families and across different rare variants and families in a gene, respectively. The performance of our approach using simulation data and an approach to detect Frv using imputation results are discussed in the Supplementary Text.

Multiple testing correction

We summarized the types of variants (common or rare), the number of tests, and type of multiple testing correction applied to each analysis in Supplementary Table 7. There are two main questions of interest in this study: (1) identifying genetic architecture of BP1 and (2) identifying specific loci segregating in the CO/CR pedigrees. As expected, a majority of tests in this study were employed in addressing the second question where we implemented standard procedures to account for multiplicity. We did not perform multiple testing correction to analyses related to the first question, and while we can apply the study-wide testing correction that considers all tests performed in this study, it would inappropriately reduce our power to learn about genetic architecture if we treat p values from analyses related to genetic architecture as we would treat p values from the rare variant segregation analysis. As we perform 11 main analyses, we could use a significance threshold of 0.5/11 and account for further multiplicity within each main analysis. However, we opted against this idea as this is not standard and it would make it difficult to compare our results with those from other studies. Instead, we presented the total number of hypotheses tested and the multiplicity adjustment procedure in Supplementary Table 7. Furthermore, we did not use the term “significant” in describing our findings, we report p values explicitly to show the strength of evidence. At last, it is important to note that we report results of all our analyses, even when they do not lead to the identification of any promising hypothesis, thereby avoiding selection bias.