Participants

Data analyzed in this study were obtained from 1583 14-year-old adolescents, participants of the IMAGEN project, for whom magnetic resonance images passing quality control procedures were available. Recruitment procedures have been described previously,15 and written informed consent was obtained from all participants and their legal guardians. Individuals completed an extensive battery of neuropsychological, clinical, personality and drug use assessments online and at the testing centers. Participants were excluded if they had contraindications for magnetic resonance imaging (for example, metal implants and claustrophobia). Some individuals were only included in part of the analyses, depending on availability of the genotype, imaging and cognitive data for each participant. The characteristics of this sample are described in Table 1.

Table 1 Characteristics of study participants

Cognitive assessment

The Block Design and Matrix Reasoning subtests of the Wechsler Intelligence Scale for Children-Fourth Edition16 were used to generate a Perceptual Reasoning Index assessing nonverbal intelligence (nonverbal intelligence quotient (IQ)). The Similarities and Vocabulary subtests were used to generate a Verbal Comprehension Index measuring verbal concept formation, that is, the subjects’ ability to reason verbally (referred to as verbal IQ). For this, single test scores were converted to more precise age-equivalent score values. Score values of the relevant subtests were summed to generate indices for Perceptual Reasoning or Verbal Comprehension. To control for differences in developmental status between participants, pubertal status of the sample was assessed using the Puberty Development Scale,17 which provides an eight-item self-report measure of physical development based on the Tanner stages.

SNP genotyping and quality control

DNA purification and genotyping were performed by the Centre National de Génotypage in Paris. DNA was purified from whole-blood samples (~10 ml) preserved in BD Vacutainer EDTA tubes (Becton, Dickinson and Company, Oxford, UK) using the Gentra Puregene Blood Kit (Qiagen, Manchester, UK) according to the manufacturer’s instructions. A total of 705 and 1382 individuals were genotyped with the Illumina (Little Chesterford, UK) Human610-Quad Beadchip and Illumina Human660-Quad Beadchip, respectively. For each genotyping platform the following quality control was performed separately. Single-nucleotide polymorphisms (SNPs) with call rates <95%, minor allele frequency <5%, deviation from the Hardy–Weinberg equilibrium (P⩽1 × 10⁻³) and nonautosomal SNPs were excluded from the analyses. Individuals with excessive missing genotypes (failure rate >5%) were also excluded. Population homogeneity was examined with the Structure software using HapMap populations as reference groups.18 Individuals with divergent ancestry (from Utah residents with ancestry from northern and western Europe) were excluded. Identity-by-state clustering and multidimensional scaling were used to estimate cryptic relatedness for each pair of individuals using the PLINK software19 and closely related individuals were eliminated from the subsequent analysis. We applied principal component analysis to remove remaining outliers,20 defined as individuals located more than four s.d. from the mean principal component analysis scores on one of the first 20 dimensions. Finally, the integrated genotypes from both Illumina Human610-Quad BeadChip and Human660-Quad BeadChip were combined and platform-specific SNPs were removed. After the quality control measures, we obtained a total of 466 125 SNPs in 1834 individuals.
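The per-SNP filters described above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: the function name `snp_passes_qc` is hypothetical, genotypes are assumed to be coded as counts of one allele (0, 1, 2, or `None` for a missing call), and the Hardy–Weinberg check uses a one-degree-of-freedom chi-square approximation, which may differ from the test applied in the study.

```python
import math

def snp_passes_qc(genotypes, min_call_rate=0.95, min_maf=0.05, hwe_p_cutoff=1e-3):
    """Return True if a SNP survives the call-rate, MAF and HWE filters.

    genotypes: allele counts (0, 1, 2) per individual, None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return False
    call_rate = n / len(genotypes)

    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    p = (2 * n_bb + n_ab) / (2 * n)      # frequency of the counted allele
    q = 1.0 - p
    maf = min(p, q)
    if call_rate < min_call_rate or maf < min_maf:
        return False

    # Chi-square goodness-of-fit test against Hardy-Weinberg proportions
    # (assumption: an approximation, not necessarily the test used in the study).
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_hwe = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 d.f.
    return p_hwe > hwe_p_cutoff
```

A SNP is dropped as soon as any one criterion fails, which mirrors the exclusion rules listed in the text.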

Magnetic resonance imaging

Full details of the magnetic resonance imaging acquisition protocols and quality checks have been described previously.21 Brain images were segmented with the FreeSurfer software package (http://surfer.nmr.mgh.harvard.edu/) and the entire cortex of each individual was inspected for inaccuracies. Individuals with major malformations of the cerebral cortex were excluded from further analysis. Out of 1909 images, 1584 passed these quality control checks. In addition to global mean thickness of the left and right cerebral hemispheres, neuroimaging measures included cortical thickness for 33 individual regions per hemisphere. These were combined to produce weighted average thickness (weighted for surface at each region) for the four cerebral lobes (that is, frontal, temporal, parietal and occipital). The effect of magnetic resonance imaging site was controlled by adding it as a nuisance covariate in all statistical analyses.
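The surface-weighted lobar average can be sketched as below. The function name and the two example regions with their values are hypothetical and purely illustrative; they are not data from the study.

```python
def weighted_mean_thickness(thickness_mm, surface_area_mm2):
    """Average regional thickness, weighting each region by its surface area.

    Both arguments are dicts keyed by region name.
    """
    total_area = sum(surface_area_mm2.values())
    weighted_sum = sum(thickness_mm[region] * area
                       for region, area in surface_area_mm2.items())
    return weighted_sum / total_area

# Illustrative (made-up) values for two regions assigned to the same lobe.
frontal_thickness = {"superiorfrontal": 2.9, "precentral": 2.6}
frontal_area = {"superiorfrontal": 7000.0, "precentral": 5000.0}
frontal_mean = weighted_mean_thickness(frontal_thickness, frontal_area)
```

Weighting by surface area keeps small regions from contributing as much to the lobar mean as large ones.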

Human neural stem cell culture

The human neural stem cell line SPC-04 was generated from 10-week-old human fetal spinal cord22 and was cultured mainly as previously described.23 In brief, cells were plated on tissue culture flasks that had been freshly coated with laminin (20 μg ml⁻¹ in Dulbecco’s modified Eagle’s medium:F12 for 3 h at 37 °C), at a density of 20 000 cells cm⁻² and routinely grown in a reduced minimum media formulation consisting of Dulbecco’s modified Eagle’s medium:F12 with 0.03% human serum albumin, 100 μg ml⁻¹ human Apo-transferrin, 16.2 μg ml⁻¹ putrescine dihydrochloride, 5 μg ml⁻¹ human insulin, 60 ng ml⁻¹ progesterone, 2 mM L-glutamine and 40 ng ml⁻¹ sodium selenite. This reduced minimum medium was also supplemented with growth factors (10 ng ml⁻¹ basic fibroblast growth factor and 20 ng ml⁻¹ epidermal growth factor) and 100 nM 4-hydroxy-tamoxifen. Cell differentiation was triggered when cells reached about 80% confluence by depleting the medium of growth factors and 4-hydroxy-tamoxifen. This was achieved in two steps. First, the growth factor- and 4-hydroxy-tamoxifen-depleted medium was supplemented with 10 μM of the γ-secretase inhibitor N-[N-(3,5-difluorophenacetyl)-L-alanyl]-S-phenylglycine t-butyl ester and 100 nM all-trans-retinoic acid for 48 h. We refer to this stage as ‘pre-differentiation’. Afterward, differentiation was achieved by maintaining the cells in reduced minimum media without any supplements for up to 7 days, with media change every 2 days.

RNA extraction, microarray analyses and SNP selection

RNA was extracted from triplicate SPC-04 differentiation experiments using the RNeasy Mini Kit (Qiagen), according to the manufacturer’s instructions. Total RNA samples were processed using the TargetAmp-Nano Labeling Kit (Cambio, Cambridge, UK) and hybridized to Illumina HumanHT-12 v4 Expression BeadChips according to the manufacturers’ instructions at the Biomedical Genomics microarray core facility of the University of California, San Diego, CA, USA. Raw data were extracted by the Illumina BeadStudio software and further processed in the R statistical environment (http://www.r-project.org) using the lumi24 and limma25 Bioconductor packages. Raw expression data were log2 transformed and normalized by quantile normalization. Differential expression between each differentiated condition and the undifferentiated condition was assessed using the linear models for microarray analyses (limma) package. P-values were adjusted for multiple testing according to the false discovery rate procedure of Benjamini and Hochberg, and differentially expressed genes were selected at false discovery rate <5%. See Supplementary Table 1 for the list of differentially expressed genes. The functional annotation clustering tool, part of the Database for Annotation, Visualisation and Integrated Discovery,26 was used to determine enrichment of functional groups in the gene lists generated from the microarray analyses. SNPs (n=59 643) lying within ±10 kb of each differentially expressed autosomal gene were selected for genetic association studies; of these, n=54 837 passed genetic quality controls and were used in further association analyses.
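The Benjamini–Hochberg adjustment used to select genes at false discovery rate <5% can be sketched in a few lines. This is a plain-Python illustration of the standard step-up procedure, not the R code used in the study; the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_at_fdr(pvalues, fdr=0.05):
    """Indices of tests whose adjusted P-value is below the FDR level."""
    return [i for i, q in enumerate(benjamini_hochberg(pvalues)) if q < fdr]
```

Genes whose adjusted P-value falls below 0.05 would then be carried forward to the SNP selection step.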

Genetic associations

Linear regression analyses were performed in PLINK19 using average cortical thickness of the left or right hemisphere as a dependent variable and the additive dosage of each SNP as an independent variable of interest, controlling for covariates of age, sex, puberty and the first four principal components from multidimensional scaling analysis. Dummy covariates were also used to control for different scanning sites. Genome-wide complex trait analysis27 was used to estimate the proportion of phenotypic variance in left cortical thickness explained by all genotyped SNPs and SNPs selected from our differential gene expression analyses. The genome-wide complex trait analysis was fitted using a restricted maximum likelihood method. The Broad Institute’s SNAP online plotting tool28 was used to generate the regional association and recombination rate plots.
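The additive model fitted for each SNP can be illustrated with an ordinary least-squares sketch. This is not PLINK’s implementation: the helper names are hypothetical, the solver uses the normal equations with Gauss–Jordan elimination for brevity, and the covariate columns passed in stand for age, sex, puberty, scaling components and site dummies.

```python
def design_matrix(dosage, covariate_columns):
    """Rows of [intercept, SNP dosage, covariate_1, covariate_2, ...]."""
    return [[1.0, float(d), *map(float, covs)]
            for d, covs in zip(dosage, zip(*covariate_columns))]

def ols(X, y):
    """Least-squares coefficients obtained by solving (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting, then eliminate this column from every other row.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def snp_effect(dosage, covariate_columns, phenotype):
    """Coefficient of the additive SNP dosage (second column of the design)."""
    return ols(design_matrix(dosage, covariate_columns), phenotype)[1]
```

The returned coefficient is the estimated change in the phenotype per additional copy of the counted allele, holding the covariates fixed.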

The same conditions were used when investigating the association between rs7171755 and IQ except that ethnicity was also included as a nuisance covariate. Given correlations between brain volume (that is, the sum of all cortical and subcortical gray and white matter, excluding ventricle and cerebrospinal fluid), cortical thickness, cortical surface area and IQ, left surface area was also included as a covariate when using brain volume or IQ as a variable. For the associations of rs7171755 with brain volume, linear regression analyses were performed using site, sex, left surface area and four multidimensional scaling components as covariates. Handedness influenced none of the above associations and was not included as a covariate in our analyses. Mediation analyses between SNP × left (average or frontal) cortical thickness × nonverbal IQ were performed in SPSS (version 20.0) using the PROCESS bootstrapping procedure29 with 1000 bootstrap samples used to calculate 95% confidence interval estimates of indirect effects.

Bonferroni corrections adjusting for the total number of tests in each analysis were performed to control for multiple testing. For the genotypes × cortical thickness association analyses with the selected 54 837 SNPs, on the left and right hemispheres, the corresponding significance threshold was P=4.56 × 10⁻⁷.
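The reported threshold follows from dividing the family-wise error rate by the number of tests. The short check below assumes a family-wise α of 0.05, which is not stated explicitly for this analysis but reproduces the reported value.

```python
n_snps = 54_837          # SNPs passing quality control
n_hemispheres = 2        # left and right mean cortical thickness
alpha = 0.05             # assumed family-wise error rate

bonferroni_threshold = alpha / (n_snps * n_hemispheres)
print(f"{bonferroni_threshold:.2e}")  # prints 4.56e-07
```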

Meta-analytic association of rs7171755 with brain volumes in ENIGMA

We used the ENIGMA data set, the largest meta-analysis of gene × neuroimaging phenotypes, to investigate association of rs7171755 with total brain volume, the brain phenotype most closely related to cortical thickness available in this data set. Association of rs7171755 with brain volume was performed using the online tool EnigmaVis,30 generating an interactive association plot. Only the healthy subsample (N=5775) of ENIGMA, for which this brain volume was available, was included in the meta-analysis.

Bootstrapping procedure

To provide bias-reduced estimates of the associations reported above, we used a bootstrap resampling approach31 for linear regression models in the following way: first, subjects were resampled with replacement from the subjects passing quality control criteria, here referred to as the bootstrap sample. Second, the coefficient βSNP for the SNP of interest from the bootstrap sample was calculated. We shuffled the SNP column of the bootstrap sample 100 000 times and recalculated the βSNP, generating a null distribution of βSNP for the bootstrap sample, denoted as βNULL. Third, the Pemp (empirical P-value) of the bootstrap sample was determined as the proportion of βNULL greater than βSNP. We repeated this bootstrap procedure 10 000 times to obtain an empirical distribution of the P-values for each variable of interest.
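The three steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the regression is reduced to a single-predictor slope without covariates, the default iteration counts are far smaller than the 100 000 shuffles and 10 000 bootstrap repetitions reported, and all function names are hypothetical.

```python
import random

def slope(x, y):
    """Simple-regression slope of y on x (0.0 if x has no variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        return 0.0
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx

def bootstrap_empirical_p(snp, phenotype, n_boot=200, n_perm=200, seed=0):
    """One empirical P-value per bootstrap sample."""
    rng = random.Random(seed)
    n = len(snp)
    p_values = []
    for _ in range(n_boot):
        # Step 1: resample subjects with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        x = [snp[i] for i in idx]
        y = [phenotype[i] for i in idx]
        # Step 2: observed coefficient, then a null distribution obtained
        # by shuffling the SNP column of this bootstrap sample.
        beta_snp = slope(x, y)
        shuffled = x[:]
        exceed = 0
        for _ in range(n_perm):
            rng.shuffle(shuffled)
            if slope(shuffled, y) > beta_snp:
                exceed += 1
        # Step 3: proportion of null coefficients greater than the observed one.
        p_values.append(exceed / n_perm)
    return p_values
```

The comparison is one-sided, following the wording of the text; a negative effect would need the inequality reversed or absolute values compared.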

Least square kernel machine association tests for candidate genes

As genetic association testing based on single SNPs might suffer from low power, we have also used a more sophisticated least square kernel machine (LSKM) procedure that we have recently developed to analyze joint effects of several SNPs with imaging traits32 to detect possible genetic influences on cortical thickness. In short, this procedure compares individuals’ allele profiles, composes a similarity matrix (Kernel Matrix), and then determines to what extent the similarity matrix explains variations in the phenotype. A summary statistic is used to evaluate the significance under the null hypothesis. We considered SNPs within ±10 kb of a gene’s transcript region as ‘belonging to’ the corresponding gene. In the current analysis, a gene-wide identity-by-state matrix was used as the similarity matrix. After quality control, 2659 out of the ~3540 genes differentially expressed in our microarray analyses were retained and subjected to the LSKM analysis. As for the single SNP association analyses, recruitment site, gender, age, puberty, ethnicity and the first four multidimensional scaling components were used as covariates in the LSKM analyses.
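A gene-wide identity-by-state similarity matrix of the kind described can be sketched as below. The definition used here (alleles shared per SNP, averaged over the SNPs of a gene and scaled to the range 0 to 1) is a common convention and is an assumption; the exact kernel in the LSKM procedure may differ.

```python
def ibs_kernel(genotypes):
    """Pairwise identity-by-state similarity between individuals.

    genotypes: one list per individual holding allele counts (0, 1, 2)
    for the SNPs assigned to a single gene.
    """
    n = len(genotypes)
    n_snps = len(genotypes[0])
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Two genotypes share 2 - |difference| alleles at each SNP.
            shared = sum(2 - abs(a - b)
                         for a, b in zip(genotypes[i], genotypes[j]))
            kernel[i][j] = kernel[j][i] = shared / (2 * n_snps)
    return kernel
```

Each entry equals 1 for identical allele profiles and 0 for profiles that are opposite homozygotes at every SNP.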

NPTN expression in mouse brain samples

RNA samples extracted from CD1 mouse brains at embryonic day 10 (E10), E14, E18 and at postnatal (P) stages 1 week, 1 month or 6 months were obtained from AMS Biotechnology (Abingdon-on-Thames, UK). Whole-brain mouse RNAs extracted from pools of five and three embryos were used for the E10 and E14 stages, respectively. RNAs extracted from the frontal cortex were used for later developmental stages (that is, E18–P6 months). In this case, triplicate samples from independent brains were analyzed for each stage, except for the P6 month stage for which data were derived from a single mouse brain. Complementary DNAs obtained by reverse transcription using the SuperScript III First-Strand Synthesis System (Invitrogen, Paisley, UK) following the manufacturer’s instructions were amplified by PCR with GAPDH as an internal control, using the following forward and reverse primers: GAPDH-F 5′-TGTTCCTACCCCCAATGTGT-3′; GAPDH-R 5′-CCTGCTTCACCACCTTCTTG-3′; NPTN-F 5′-GCCTTTCTTGGGAATTCTGGC-3′; NPTN-R 5′-AGAGTTGGTTTTCATTGGTCCAG-3′. PCRs were run in triplicate in the Applied Biosystems real-time PCR device (7900HT Fast Real-Time PCR system) in 20 μl reactions containing 4 μl complementary DNA, 0.5 μM of each forward and reverse primers and 1 × Power SYBR Green Mix (Applied Biosystems, Paisley, UK) using the following cycles: 95 °C for 15 min and 40 cycles at 95 °C for 30 s and 59 °C for 30 s. The PCR reaction products were evaluated by a melting curve analysis. Relative quantification of the PCR products was performed using the SDS software (Applied Biosystems) comparing threshold cycles (Ct). NPTN mRNA levels were first normalized to that of GAPDH (ΔCt = Ct(NPTN) − Ct(GAPDH)) at each developmental stage, and changes in expression relative to E10 were calculated as 2^(−(ΔCt − ΔCt(E10))). Statistical analysis (one-way analysis of variance, followed by Bonferroni-based post hoc analysis with α=0.05, two sided) was performed comparing expression of triplicates at the E18, P1 week and P1 month stages to that at E10.
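The relative quantification described above (normalization to GAPDH, then expression relative to E10) corresponds to the following calculation. The function names and the Ct values in the usage lines are hypothetical and serve only to illustrate the arithmetic.

```python
def delta_ct(ct_nptn, ct_gapdh):
    """Normalize the target Ct to the internal control: Ct(NPTN) - Ct(GAPDH)."""
    return ct_nptn - ct_gapdh

def relative_expression(delta_ct_stage, delta_ct_e10):
    """Fold change relative to E10: 2 ** -(dCt(stage) - dCt(E10))."""
    return 2.0 ** -(delta_ct_stage - delta_ct_e10)

# Made-up Ct values: a normalized Ct that is one cycle lower than at E10
# corresponds to a two-fold higher expression.
e10 = delta_ct(ct_nptn=26.0, ct_gapdh=18.0)   # 8.0
e18 = delta_ct(ct_nptn=25.0, ct_gapdh=18.0)   # 7.0
fold_change = relative_expression(e18, e10)   # 2.0
```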

NPTN expression in human brain samples

Expression of NPTN in the human brain was investigated using two databases. To study effects of rs7171755 on NPTN expression (probe 33624, targeting NM_001161363 and NM_012428), we used the publicly available BrainCloud database (http://BrainCloud.jhmi.edu/), which includes data on gene expression and genotypes from post-mortem dorsolateral prefrontal cortex samples collected from 272 subjects across the lifetime. In this database, transcript expression levels were measured on an Illumina Oligoset array of 49 152 probes, and genotyping was performed using Illumina Infinium II or HD Gemini 1M Duo BeadChips.33 The genetic data sets were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/gap through dbGaP accession number phs000417.v1.p1. Submission of the data phs000417.v1.p1 to dbGaP was provided by Drs Barbara Lipska and Joel Kleinman. Data collection was through a collaborative study sponsored by the National Institute of Mental Health Intramural Research Program. The initial report on this data set is by Colantuoni et al.33 For this study, we considered only samples with good RNA quality (RNA integrity number ⩾8). Statistical analyses measuring effects of rs7171755 on the postnatal expression of NPTN were performed on 147 samples (individuals ⩾0.5 years old) by general linear models controlling for age, ethnicity and RNA quality.

To investigate possible differences in NPTN expression between brain hemispheres, we analyzed a database (GEO series GSE25219) containing genome-wide gene expression data from 16 brain regions on both hemispheres, collected from 57 subjects across the lifetime (N=1340 post-mortem brain samples).34 Paired sample t-tests were performed comparing expression of NPTN on the right and the left hemisphere for each sample, controlling for the developmental stage and RNA integrity factor.