Subject characteristics

Between 2008 and 2016, adult patients with MDD who started iCBT at the Internet Psychiatry Clinic in Stockholm [40], a government-funded psychiatric clinic specializing in delivering psychologist-guided iCBT, were asked to participate in the study. The treatment center is part of the public psychiatric care provided by the Stockholm County Council. The patients were asked to donate a blood sample for DNA. The patients had been referred to the clinic either by their general practitioner or via an online self-referral system. See Table 1 for a full description of the 894 study participants included in the final analysis. As detailed below, 70 individuals from the original sample of 964 were excluded from the study for the following reasons: being an ancestry outlier (n = 49), quality control issues (n = 11), and missing phenotypes (n = 10).

Table 1 Demographic characteristics of the participants

After an online screening, the patients came to the clinic for psychiatric assessments, including a structured diagnostic interview (Mini-International Neuropsychiatric Interview) [41]. A psychiatrist or supervised psychiatry resident performed the interview. For enrollment in the study, the patient had to meet the following requirements: fulfill the criteria in the DSM-IV-TR for current MDD [42, 43], be able to read and write in Swedish, and be at least 18 years old. The exclusion criteria were any of the following: severe MDD combined with moderate to high risk of suicide, recent medication changes, comorbid bipolar or other psychotic disorder, unable to participate in concurrent psychotherapy, current alcohol or illicit drug abuse/dependence, or communication difficulties that impact treatment. The study was approved by the Regional Ethics Board in Stockholm, Sweden. All participants provided written informed consent.

Intervention

The core interventions of iCBT are the same as those administered face-to-face in conventional CBT. The iCBT program consisted of 10 text modules with components covering standard CBT interventions for patients with MDD, such as psychoeducation, cognitive restructuring, behavioral activation, and relapse prevention, that were to be completed in 12 weeks. Each module had a set of tasks and homework assignments to be completed each week that were monitored by the therapist via the secure online platform. In general, the patient and therapist interactions were limited to email contact, and there were no live meetings. A thorough description of the program has been published previously [44].

Primary outcome measure

The primary outcome measure was assessed using the Montgomery Åsberg Depression Rating Scale-Self report (MADRS-S) [45]. The MADRS-S total score, which ranges from 0 to 54, measures nine clinical characteristics of depression. The MADRS-S was assessed at treatment start (MADRS-S baseline), once each week during treatment, and in the last week of treatment (MADRS-S Post). Thus, each individual provided up to 12 weekly MADRS-S assessments that were included in the analyses. See Supplementary Table 2.

Genotyping

Genotyping was performed at LIFE & BRAIN GmbH (Bonn, Germany) using the Infinium Global Screening Array 1.0 BeadArray (Illumina, Inc., San Diego, CA, USA) and an automated workflow according to the manufacturer's instructions. The raw data were analyzed in GenomeStudio 2.0 (Illumina, Inc.) with the Infinium cluster file (GSA-24v1-0_A1_ClusterFile.egt). A reclustering step was performed using the GenTrain 3 algorithm in GenomeStudio 2.0.

Discovery datasets

GRSs were generated for the following six phenotypes: MDD, bipolar disorder (BIP), attention-deficit/hyperactivity disorder (ADHD), autism spectrum disorder (ASD), intelligence (IQ), and educational attainment (EDU). We obtained the corresponding GWAS results for MDD, BIP [46], ADHD [47], and ASD [48] from the Psychiatric Genomics Consortium (PGC) website (https://www.med.unc.edu/pgc/results-and-downloads) and the GWAS results for IQ and EDU from published GWAS meta-analyses [49, 50]. The target set (the currently studied iCBT samples) was not part of these previous GWAS meta-analyses.

Target dataset

The GWAS data from the 964 iCBT samples were processed using the PGC Ricopili pipeline for quality control and genotype imputation with reference genomes from the 1000 Genomes Project (phase 1 version 3) [51]. Eleven samples were excluded due to sample overlap (two pairs), cryptic relatedness (two pairs with pi-hat ≥ 0.2), or poor call rate (three samples). After excluding 49 subjects due to non-European ancestry, the top 20 ancestry principal components (PCs) were calculated from the best-guess imputed genotypes (see Supplementary Figure 1). Ten participants who failed to start treatment after inclusion were excluded due to missing phenotype data, resulting in a final sample total of 894. The details of the SNP quality control of the discovery and target datasets and reference data, together with the overlapping numbers of SNPs among these three sets, are provided in Supplementary Figure 2.

GRS calculation

The GRS values were derived for the target set iCBT samples as the sum of the scores based on the risk alleles weighted by the effect size from the discovery sample. To select an independent set of SNPs for calculating the GRS, we conducted linkage disequilibrium clumping (r² < 0.1 in a 1-Mb window) on the overlapping SNPs using the European samples from the 1000 Genomes Project as a linkage disequilibrium reference. We computed eight sets of GRS for each phenotype under the p-value cutoffs of ≤ 1 × 10⁻⁵, ≤ 1 × 10⁻⁴, ≤ 0.001, ≤ 0.01, ≤ 0.05, ≤ 0.1, ≤ 0.5, and ≤ 1. The GRS calculations were performed using PLINK (version 1.9) [52].
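As a minimal illustration of the weighted-sum scoring described above, the following Python sketch computes a GRS for each p-value cutoff. It is not the PLINK procedure used in the study: the SNP identifiers, effect sizes, p-values, and genotypes are hypothetical, the function names are invented for this example, and the SNPs are assumed to have already been clumped.

```python
# Hypothetical discovery summary statistics (assumed already LD-clumped):
# SNP id -> (effect size for the risk allele, discovery p-value).
DISCOVERY = {
    "rs1": (0.10, 1e-6),
    "rs2": (0.05, 0.003),
    "rs3": (0.02, 0.04),
    "rs4": (0.01, 0.60),
}

# Hypothetical target genotypes: risk-allele count (0, 1, or 2) per SNP.
GENOTYPES = {
    "person_a": {"rs1": 2, "rs2": 1, "rs3": 0, "rs4": 1},
    "person_b": {"rs1": 0, "rs2": 2, "rs3": 1, "rs4": 2},
}

# The eight p-value cutoffs listed in the text.
P_CUTOFFS = [1e-5, 1e-4, 0.001, 0.01, 0.05, 0.1, 0.5, 1.0]


def grs(dosages, discovery, p_cutoff):
    """Sum of risk-allele counts weighted by discovery effect sizes,
    restricted to SNPs whose discovery p-value is <= p_cutoff."""
    total = 0.0
    for snp, (beta, p_value) in discovery.items():
        if p_value <= p_cutoff and snp in dosages:
            total += beta * dosages[snp]
    return total


def grs_all_cutoffs(dosages, discovery, cutoffs=P_CUTOFFS):
    """Return one GRS per p-value cutoff for a single individual."""
    return {cutoff: grs(dosages, discovery, cutoff) for cutoff in cutoffs}


if __name__ == "__main__":
    for person, dosages in GENOTYPES.items():
        for cutoff, score in grs_all_cutoffs(dosages, DISCOVERY).items():
            print(person, cutoff, round(score, 4))
```

Stricter cutoffs keep only the most strongly associated SNPs, whereas the cutoff of 1 includes every clumped SNP, so each individual receives eight scores per phenotype.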

Statistical analyses

The statistical analyses were performed using R [53]. To analyze the association between the six calculated GRS values and iCBT treatment outcome measured by MADRS-S, we used the lme4 package [54] to fit mixed models with full information maximum likelihood, including all available data for all patients. First, we fitted a model that determined the overall course of the MADRS-S values over the treatment period. This model included linear and quadratic effects of time as fixed effects (the quadratic term allowed for curvilinear development over time and provided the best fit to the data). The model also included a random intercept and a random effect of time. Second, we investigated the influence of GRS on the rate of change during treatment. In all models, covariates (i.e., GRS) and possible confounders (i.e., ancestry PC scores, age, and sex) were added as both main effects and interaction effects with the linear effect of time. A significant main effect of a GRS is interpreted as a constant effect of the GRS on the MADRS-S rating throughout the entire treatment period. A significant GRS × time interaction effect is interpreted as an influence of the GRS on the rate of improvement during treatment. These analyses were performed in the following steps: (1) Each of the six GRSs at the predetermined p-value cutoff (p < 0.05) was investigated in a separate model while controlling for the top five ancestry PC scores. (2) Age and sex were added to the models in step 1. (3) A full model was created in which all six GRSs were entered while controlling for ancestry PCs, age, and sex. As stated above, all covariates (GRSs, ancestry PCs, age, and sex) were entered as both main effects and interaction effects with linear time in these analyses. To reduce multiple testing, the main analyses tested each of the six GRSs only at the predetermined p-value cutoff of p < 0.05. In addition, we presented the results for GRSs at all p-value cutoffs as sensitivity analyses (Supplementary Table 1).
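One way to write the single-GRS model described above (steps 1 and 2) is sketched below. The notation is introduced here for illustration and is not taken from the original analysis: y_ij is the MADRS-S score of patient i at assessment j, t_ij is time in treatment, and c_ik is the k-th confounder (ancestry PC score, age, or sex).

```latex
\begin{aligned}
y_{ij} ={}& \beta_0 + \beta_1 t_{ij} + \beta_2 t_{ij}^2
          + \beta_3\,\mathrm{GRS}_i + \beta_4\,(\mathrm{GRS}_i \times t_{ij}) \\
         & + \sum_{k} \bigl(\gamma_k\, c_{ik} + \delta_k\, c_{ik}\, t_{ij}\bigr)
          + u_{0i} + u_{1i}\, t_{ij} + \varepsilon_{ij}
\end{aligned}
```

In this notation, β3 is the main effect of the GRS (a constant shift in MADRS-S across treatment) and β4 is the GRS × time interaction (a change in the rate of improvement). The terms u_0i and u_1i are the random intercept and the random effect of time, and ε_ij is the residual. Treating the random effects as jointly normal and the residuals as independent normal is the usual assumption for such models and is assumed here rather than stated in the text.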

Outlier analyses

We performed outlier analyses to detect influential cases that may have biased the regression models. These analyses were performed on the GRS p < 0.05 models (controlling for PC scores, age, and sex) in which significant or near-significant (p < 0.10) main or interaction effects were obtained. For this, we used the influence.ME package [55] to calculate Cook’s distance for all observations (i.e., one MADRS-S rating) and all individuals (i.e., all MADRS-S ratings by one individual). Possible influential observations and individuals were identified by visual inspection of the Cook’s distance plots, and the regression analyses were rerun with the outlying observations or individuals removed. Removing influential observations or individuals did not alter the interpretation of any of the significant or near-significant results.
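The deletion idea behind Cook’s distance can be illustrated with a deliberately simplified Python sketch. It swaps in an ordinary least-squares straight line for the mixed models used in the study and does not reproduce the influence.ME computations; the data and function names are hypothetical. Each observation is removed in turn, the line is refitted, and the shift in all fitted values is scaled by the number of parameters and the residual variance of the full fit.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def cooks_distances(xs, ys):
    """Cook's distance for each observation, computed by explicit deletion."""
    n = len(xs)
    n_params = 2  # intercept and slope
    b0, b1 = fit_line(xs, ys)
    fitted = [b0 + b1 * x for x in xs]
    # Residual variance estimate from the full fit.
    mse = sum((y - f) ** 2 for y, f in zip(ys, fitted)) / (n - n_params)
    distances = []
    for i in range(n):
        # Refit without observation i, then predict for all observations.
        a0, a1 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        refitted = [a0 + a1 * x for x in xs]
        shift = sum((f - g) ** 2 for f, g in zip(fitted, refitted))
        distances.append(shift / (n_params * mse))
    return distances


if __name__ == "__main__":
    # Hypothetical weekly ratings with one unusually low final score.
    weeks = [0, 1, 2, 3, 4, 5]
    scores = [30, 28, 26, 24, 22, 5]
    for week, d in zip(weeks, cooks_distances(weeks, scores)):
        print(week, round(d, 3))
```

In this toy series the final rating departs sharply from the otherwise linear trend and therefore receives the largest distance. In the study, the same logic was applied both to single ratings and to all ratings from one individual.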