Samples

Since January 2013 subjects for the NFID (Northern Finland Intellectual Disability) project have been recruited from the Northern Ostrobothnia Hospital District Center for Intellectual Disability Care and from the Department of Clinical Genetics of Oulu University Hospital. In January 2016 the recruitment was expanded to include all pediatric neurology units and centers for intellectual disability care in the special responsibility area of Oulu University Hospital. Subjects of all ages with either intellectual disability or pervasive and specific developmental disorders (ICD-10 codes F70-79 and F80-89, respectively) of unknown etiology were included. Individuals with copy number variations of unknown clinical significance or highly variable phenotypes were also included in order to uncover other possible factors of genetic etiology. Subjects were identified through hospital records and invited via mail to take part in the study. In addition, they were recruited during routine visits to any of the study centers.

The cases have been evaluated and examined clinically by multi-professional teams. Depending on the situation in question the team may consist of psychologist, physician, speech and occupational therapist, physiotherapist, nurse and social worker. Standardized IQ tests that were used included different versions of following tests: Wechsler Preschool And Primary Scale Of Intelligence (WPPSI), Wechsler Intelligence Scale for Children (WISC) and Wechsler Adult Intelligence Scale (WAIS) for adults.

In case of autism spectrum disorder the diagnoses were also based on multiprofessional evaluation and different, clinically used methods such as ADOS (Autism Diagnostic Observation Schedule), ADI-R (Autism Diagnostic Interview), and CARS (Childhood Autism Rating Scale).

All research subjects and/or their legal guardians provided a written informed consent to participate in the study. DNA samples from the participants were extracted primarily from peripheral blood. In a few cases where a blood sample could not be obtained, DNA was extracted from saliva. The ethical committees of the Northern Ostrobothnia Hospital District and the Hospital District of Helsinki and Uusimaa approved the study.

Clinical diagnostic tests varied considerably depending on the subject´s age, clinical diagnosis and phenotype. During the past 20 years, blood and urine metabolic screening tests, chromosome karyotyping, FMR1 CGG repeat analysis, electroencephalography (EEG) and brain computed tomography (CT) or magnetic resonance imaging (MRI) have been routinely performed on almost all individuals with remarkable developmental delay or intellectual disability. Array CGH and whole exome sequencing have been widely used for less than ten and three years, respectively.

Identification of other neurodevelopmental disorder cases

We identified individuals with neurodevelopmental disorder (NDD) phenotypes (intellectual disability, schizophrenia, autism and epilepsy; N = 636, NFNDD and SFNDD cases in Table 2) among 5904 individuals with exome sequence data in the FINRISK study. FINRISK is a series of population-based health examination surveys carried out every 5 years since 1972 to monitor the risk of chronic diseases30. The cohorts have been followed up for disease end-points using annual record linkage with the Finnish National Hospital Discharge Register and the National Causes-of-Death Register.

Additional Finnish NDD cases were included from cohorts of schizophrenia and autism patients sequenced as part of the UK10K-study (i.e. subcohorts UK10K_NK_SCZ, UK10K_KUUSAMO_SCZ and UK10K_ASDFI) and a collection of autism patients from Southern Finland (AUTISM_ASDFI) (see Supplementary Data for cohort descriptions). We genetically matched each NDD case to five exome sequenced controls using the first 2 principal components (PCs). We further divided these cases and controls approximately to Northern Finnish NDD (NFNDD, Northern Finland NeuroDevelopmental Disorder) and Southern Finnish NDD (SFNDD, Southern Finland NeuroDevelopmental Disorder) cohorts based on principal component analysis (PCA).

Regional prevalence of intellectual disability in Finland

To estimate regional prevalence of ID and SCZ in Finland, we used The Social Insurance Institution of Finland provides social security coverage for Finnish residents. The Social Insurance Institution of Finland centrally provides all disability pensions in Finland and maintains a database of all residents on a disability pension and the reason for the pension. We requested the number of individuals over 16 years of age receiving a disability pension for ID or schizophrenia (SCZ) at the end of year 2016 in each of the 19 high-level administrative regions in Finland. We divided the number of beneficiaries by the population aged over 16 in each region to get a crude estimate of the relative prevalence of more severe SCZ and ID cases. The prevalence of schizophrenia particularly is higher in more detailed prevalence estimaties11. Schizophrenia tends to be underdiagnosed in the first years of illness31, and only 50% of patients with schizophrenia receive a disability pension after 5 years of initial diagnosis32.

CNV analysis

To analyze the copy number variations (CNVs), we performed DNA Chip Array (Illumina HumanCoreExome v 12.0, Illumina PsychArray) based copy number analysis of 497 cases and 504 unaffected family members of the NFID cohort. To assess CNV frequencies in the general population, we used as controls a population-based cohort of 13,390 participants from the FINRISK study33. CNV calls in controls were generated using raw data from the Illumina HumanCoreExome v12.0 and v12.1 chips.

CNVs were called using a CNV pipeline powered by PennCNV34 for sensitive CNV calling. Adjacent CNVs of similar copy number were called as one if the adjoining region between the two calls was ≤20% of the joined CNV. To increase the confidence in the called CNVs, we considered only CNVs supported by at least 10 consecutive probes and which covered a genomic region of at least 100 kb, omitting known CNV artifacts regions35. The large regional requirement was set to support analysis across the different DNA chips.

Samples were excluded if they had: (1) a high variance (SD > 0.3) in intensity (1.5% in NFID; 5.6% in FINRISK), (2) a high (>0.005) drift of B allele frequency (0 additional samples in NFID; 0.2% in FINRISK), and (3) CNVs called in excess of 10 for one individual (10 samples in NFID; 8.9% in FINRISK). All called CNVs for the NFID cohort, both for patients and for unaffected family members, were manually curated. For the FINRISK population cohort, CNVs were manually curated if large (>500 kb) or if they fit into a category of interest relevant to study (see Identifying likely pathogenic mutations chapter below). Otherwise, CNVs of controls were rejected if at least 50% of the CNV overlapped a known artifact region36, or had a poor coverage (≤1.08 SNPs per 10 kb).

GWAS data processing

All samples were genotyped in seven batches on either the Illumina CoreExome or Illumina PsychArray, which contains 480,000 common variants. The NFID samples were genotyped in three batches, one with Illumina CoreExome and two with PsychArray. FINRISK population controls were genotyped in five batches using Illumina CoreExome.

We excluded markers that exhibited high missingness rates (>5%), low minor allele frequency (<1%), or failed a test of Hardy–Weinberg equilibrium (p < 1e−9). We also excluded individuals with high rates of heterozygosity (>3sd from the mean), or a high proportion of missing genotypes (>5%). To control for population stratification, we merged the genotypes from individuals passing QC with HapMap III data from European (CEU), Asian (CHB + JPT), and African (YRI) populations. We then performed a PCA on this combined data and excluded population outliers not clustering with the Finnish samples

We then merged genotyping batches one-by-one and repeated the QC procedures described above on the merged dataset. To prevent any potential batch effects in the merged data, we also excluded any markers that failed a test of differential missingness (p < 1e−5, Fisher’s exact test) between the merged batches. Furthermore, during each round of merging we performed a association analysis (using a logistic mixed-model for individuals) between samples from each batch to identify markers where the minor allele frequency deviated significantly between batches (p < 1e−5, score test). Finally, we removed related individuals (identity by descent > 0.185).

We used a custom Finnish imputation reference panel containing 1941 low-pass whole genomes (4.6×) and 1540 high coverage exomes. We used Shape-IT37 for pre-phasing and Impute-238 for imputation.

Exome sequencing

NFID cases were exome sequenced at the Broad Institute using Illumina Nextera Rapid Capture Exome-capture kit and sequenced with Illumina HiSeq2000 or 2500. NFID cases were jointly called with a collection of Finnish individuals collected as part of the Sequencing Initiative Suomi (SISU)-study (www.sisuproject.fi). The sequence data processing and variant calling has been described previously39. See Supplemental Note 1 for descriptions of cohorts used in the current study.

We filtered samples with estimated contamination > 3% (n = 590), chimeric reads > 3% (n = 51), samples significantly deviating from other samples within each project/batch on selected metrics (transition/transversion ratio, insertion/deletion ratio, heterozygous/homozygous variant ratio, number of singletons, n = 243) and finally included only those with empirically confirmed ≥99% Finnish ancestry (described in Rivas et al.39).

We first split the multiallelic variants in to bi-allelic variants. For genotype QC, we set the following genotypes to missing; genotype quality (GQ) < 20, read depth (DP) < 10, heterozygote allelic balance less than 20% or greater than 80%, homozygous reference alt reads ≥10%, alternate allele homozygous reference reads ≥10%. Variants were filtered out if Variant Quality Score Recalibration (VQSR) did not indicate PASS, the p-value from a test of Hardy–Weinberg Equilibrium (pHWE) < 1e−9 in controls (in females only in the X chromosome), SNP quality-by-depth (QD) < 2, INDEL QD < 3 or more than 20% of heterozygote calls had allelic balance out of the 20–80% range. To account for the different batches of exome sequencing we required a stringent genotype call rate ≥0.95 in cases and controls separately after genotype QC. All variant and genotype QC was performed using Hail40 and executed in the Google Cloud dataproc cluster.

Finally, we ensured cases and controls were approximately independent by filtering such that all samples had a pairwise kinship coefficient < 0.0442 to every other sample. We estimated kinship coefficient using King41 and when possible we always retained cases rather than a related control (N filtered = 1531).

Variant annotation

We annotated variants using VEP v.85 and the LOFTEE VEP plugin [https://github.com/konradjk/loftee] to filter likely false positive protein truncating variants (PTV). We considered variant annotations of the canonical (as defined by ENSEMBL) transcript only. A variant was considered to be a protein truncating variant (PTV) if LOFTEE predicted it to be a high confidence loss-of-function variant (stop-gained, splice site disrupting or frameshift) without any warning flags.

Identifying likely pathogenic mutations

As a basis for identifying Likely pathogenic variants, we used a gene list curated within the Deciphering Developmental Disorders study (DDD) and a gene list of 93 exome-wide significant genes from the latest DDD study meta-analysis of de novo variants4. We downloaded a gene list curated within the DDD study [https://decipher.sanger.ac.uk/ddd#ddgenes] containing 1897 genes with varying degrees of evidence of mutations in those genes causing developmental delays. We further subset the list to only confirmed or probable developmental delay genes contributing to a brain/cognition phenotype. This gene set was further extended by a set of 93 genes with a significant excess of damaging de novo variants in the latest DDD meta-analysis4. These two lists resulted in a total of 818 genes (Supplemental Table 1). For each ID patient we searched for PTV or damaging missense (MPC ≥242) variants not observed (as homozygotes in recessive genes) in non-Finnish GnomAD individuals or in our control individuals. We used only non-Finnish GnomAD individuals, as all Finnish individuals in GnomAD are included in our control exome cohort. Variants were classified as Other high impact variants if the variant was a PTV (in PTV constrained gene, pLI22 > 0.95) or a damaging missense variant (MPC ≥2) in a gene that was not in the list of known genes (as above) and not observed in non-Finnish GnomAD individuals or in our control individuals. For homozygotes we used CADD43 score > 20 to filter to putatively damaging variants, as MPC score is a measure of heterozygous constraint.

CADD was chosen as pathogenicity prediction method as CADD integrates multiple different prediction tools in to a single prediction score. CADD contains both conservation-based methods (e.g., GERP, Phastcons) as well as protein level scores (e.g., SIFT and Polyphen)43.

In homozygote variant filtering we required that the variant was not seen as homozygous in non-Finnish GnomAD samples or in our internal Finnish controls.

In cases where we had parental exome data we further filtered the “likely pathogenic” variants if they were inherited from control parent or if clinical phenotype was clearly different than what has been reported in the literature (as assessed by clinical geneticist).

The algorithm for identifying pathogenic mutations was implemented in Hail40 and executed in a Google Cloud dataproc cluster.

All CNVs passing QC criteria were classified as either (1) likely pathogenic, (2) other high impact variant, or (3) uncertain. A “likely pathogenic” classification was assigned to deletions where the size was at least 1 Mb, and 500 kb for de novo deletions. All CNV types were additionally considered likely pathogenic (class i) when overlapping at least 75% with an established disease associated locus44, or deleting an ID associated gene of interest (see above). CNVs were classified as “other high impact variant” (class 2) if both: (A) they were never seen in unaffected family members, population controls, or the high-quality variant set of the Database of Genomic Variants; and (B) they deleted a gene with a high probability of loss-of-function intolerance22 (pLI > 0.95). Otherwise, a CNV was classified as a variant of uncertain significance (class 3).

Polygenic risk scores

As SNP weights we used summary statistics from GWA studies of schizophrenia45, IQ17, and educational attainment18. To avoid potential biases caused by non-random regional sampling of individuals in the GWA studies the summary statistics were generated after excluding all Finnish cohorts.

For polygenic scoring we used only well-imputed and genotyped common SNPs (Impute 2 info ≥0.9, allele frequency > 0.05). We pruned the SNPs to a subset of uncorrelated SNPs (r2 < 0.1 within 500 kb) and used the remaining SNPs for calculating a polygenic risk score (PRS) for each individual by summing the product of beta from the summary statistics and the number of effect alleles (genotype dosage for imputed SNPs) over all SNPs. Our primary hypothesis testing used a PRS constructed from nominally significant variants (p < 0.05) in the original GWAS study. The genetic scores were standardized to z-scores using Finnish population controls.

For visualizing geographical differences in the PRSs within Finland, we subset the controls to those whose parents’ birthplaces were within 100 km of each other. An individual’s coordinates were set to the average of the parents’ birthplaces’ longitude and latitude. We smoothed the PRS across a map of Finland. At each map position we calculated weighted average by weighting each individual’s PRS by the inverse of the squared distance between the map point and the individual’s coordinate. Individuals within 50 km from the map point contributed equally to the map point, i.e., the full weight was given to those individuals independent of their exact distance from the map point.

Association analysis

To control for population stratification, we matched each case to its five genetically closest controls given by the first two PC’s using the optmatch R package.

For replication and for studying the neurodevelopmental spectrum of candidate variants in the exome analysis, we identified neurodevelopmental (NDD) cases (ID, SCZ, and ASD) from the Finnish FINRISK population cohort as well as disease-specific collections sequenced in the UK10K study (SCZ and ASD) (Table 2). Each NDD case was genetically mapped to its five closest controls that were not matched to NFID patients.

For the dominant association analysis, we used both Fisher’s exact test and Firth bias corrected logistic regression using the four first PC’s as covariates. We meta-analyzed the results across the three cohorts (NFID, North NDD and South NDD) using Mantel–Haenzel meta-analysis (rma.mh in metaphor46 R package) for Fisher’s analysis and a sample size weighted meta-analysis for Firth47.

For the recessive analysis we used a recessive allele frequency test (RAFT)48, which takes the population allele frequency of the variant tested into account to estimate the probability of observing as many cases and controls as homozygotes under the null. As we genetically matched all cases to controls we present the analysis results from Fisher’s exact test and present Mantel–Haenzel meta-analysis and Firth results in the supplement.

Association analyses were performed using Hail40 and executed in a Google Cloud dataproc cluster.

Enrichment analysis

For testing if different classes of variants were enriched in cases vs. controls we used Fisher’s exact test and for significant variant classes we estimated the variance explained by Nagelkerke’s pseudo r2.

For the CNV analysis, we used the same cases and controls as in the exome analysis where GWAS data was available (433 cases and 1100 controls passing QC for CNV analysis). Association analysis was performed testing carrier ratios using Fisher’s exact test. The relevant categories were: (1) CNVs overlapping one of DECIPHER’s syndromic regions (2) deletions overlapping a known developmental delay gene (Supplementary Data 1), and (3) deletions overlapping a gene with high probability of protein truncating variant intolerance (pLI > 0.95)22.

Heritability estimation

We estimated the variance explained by different variant categories by fitting a logistic model and computing Nagelkerke’s pseudo r2 from the fitted full and null models.

Case/control status was used as a dependent variable and as an explanatory variable we used either a binary indicator for presence of variant in a given category (likely diagnostic or other high impact) or a continuous variable for PRS variance estimation. As we observed geographical differences in all evaluated PRSs we corrected for the first four PCs even after genetic matching of cases and controls to account for any residual stratification (i.e., the null model included the first four PCs). Confidence intervals for r2 were estimated using adjusted bootstrap percentile method49 by drawing 5000 bootstrap samples and computing the r2 for each sample. We compared the variance explained for the whole ID cohort and also in mild and severe ID separately. As mild and severe ID have different population prevalence we transformed the observed scale variance explained to the liability scale50. We used the population prevalence from a cumulative normal distribution function with mean 100 and standard deviation 15. Prevalence of 1.94%, 1.91% and 0.034% were used for all ID (IQ < 70), mild ID (50 ≤ IQ < 70) and other more severe ID combined (IQ < 50), respectively.

Code availability

All code used within the manuscript for all analyses is available from the corresponding author upon reasonable request.

Reporting summary

Further information on experimental design is available in the Nature Research Reporting Summary linked to this article.