Recruitment and consent

We enrolled patients with FEP at admission to the Centro de Atenção Integral a Saúde Mental (CAISM), São Paulo. The study protocol was designed to address the acute but temporary lack of capacity in FEP patients at admission. When a patient meeting the inclusion criteria (below) was admitted, medical staff explained the study to family members and provided printed information sheets; families who agreed then signed a written informed consultee consent form, with the assent of the patient. At the follow-up assessment, patients who had capacity were consented directly into the study. If patients lacked capacity at the follow-up assessment, consent was obtained at a later stage, once capacity was regained. The local Research Ethics Committee of Universidade Federal de São Paulo (CEP-UNIFESP 0603/10) and the national Brazilian Ethics Committee (CONEP-CAAE 33148114.6.0000.5505, CAAE 48242015.9.0000.5505) approved the research protocol.

Longitudinal cohort of FEP patients

Our cohort of antipsychotic-naive FEP patients includes 154 subjects recruited from a psychiatric emergency unit in São Paulo (Brazil). The diagnosis of a psychotic disorder was established by trained psychiatrists according to Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) criteria, using the Structured Clinical Interview for DSM-IV (SCID-I). Inclusion criteria were age between 16 and 40 years, no previous history of antipsychotic medication, and a confirmed diagnosis of non-affective psychosis (SCZ, schizophreniform disorder or brief psychotic disorder) after two months of treatment. Prior or current treatment with benzodiazepines was allowed. Patients with psychotic episodes due to a general medical condition, substance-induced psychotic disorder, intellectual disability, major depressive disorder or bipolar disorder were excluded.

A total of 60 patients met criteria for antipsychotic-naive FEP after the follow-up (FEP, N = 60). These patients were assessed at baseline and followed up for 9.03 ± 2.76 weeks of risperidone treatment. At baseline, four patients were taking benzodiazepines and one clonazepam. During follow-up, in addition to risperidone, 12 patients were taking clonazepam and 7 were taking mood stabilizers.

The healthy control group (N = 60) comprised age-, gender- and ethnicity-matched volunteers with no first-degree family history of psychotic disorders, who were evaluated by trained psychiatrists using a modified SCID-I to ensure they had no current or previous psychiatric diagnoses. Peripheral blood samples were collected in EDTA tubes at baseline and follow-up for patients, and after the psychiatric interview for controls.

Clinical assessments

All psychiatrists received the same training at the “Programa de Esquizofrenia da UNIFESP”, and each FEP patient was assessed by the same psychiatrist at both time points on the following scales: (a) PANSS (Positive and Negative Syndrome Scale), (b) CGI (Clinical Global Impression Scale)18, (c) GAF (Global Assessment of Functioning Scale), (d) CDSS (Calgary Depression Scale for SCZ)19.

Symptom clusters (negative, positive, disorganization, excited and anxiety/depression) from the PANSS items20 were calculated using the algorithm from a previous study in a Brazilian population21. For more information about each symptom cluster, see Supplementary Table S1. Response to treatment was defined as a > 50% reduction in baseline PANSS total score22. GAF is the only scale on which higher values represent less impairment; we therefore transformed GAF scores to negative values (referred to as −GAF).
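The response definition and the GAF transformation can be illustrated with a minimal sketch. It assumes that the percentage reduction is computed on the raw PANSS total (whether the 30-point scale minimum was subtracted first is not stated here), and the function names are hypothetical.

```python
def panss_percent_reduction(baseline_total: float, followup_total: float) -> float:
    """Percentage reduction relative to the baseline PANSS total score.

    Assumption: raw totals are used, without subtracting the scale minimum.
    """
    if baseline_total <= 0:
        raise ValueError("baseline PANSS total must be positive")
    return 100.0 * (baseline_total - followup_total) / baseline_total


def is_responder(baseline_total: float, followup_total: float,
                 cutoff: float = 50.0) -> bool:
    """Responder if the reduction is strictly greater than the cutoff (> 50%)."""
    return panss_percent_reduction(baseline_total, followup_total) > cutoff


def negate_gaf(gaf: float) -> float:
    """Return -GAF so that higher values mean more impairment, as on the other scales."""
    return -gaf
```

Under these assumptions, a patient whose total falls from 100 to 45 (a 55% reduction) counts as a responder, whereas a fall from 100 to 50 (exactly 50%) does not.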

DNA isolation

Whole blood was collected into EDTA tubes, and genomic DNA isolation was performed using the Gentra Puregene Kit (Qiagen) according to the manufacturer’s protocol.

Genomic arrays

The genotyping was performed at King’s College London using the Infinium PsychArray-24 BeadChip (Illumina) with a GWAS core backbone (~590 K markers) and specific content from the Psychiatric Genomics Consortium: https://www.med.unc.edu/pgc/psychchip.

Quality control and imputation

For quality control (QC), we removed SNPs with a minor allele frequency (MAF) < 1%, locus missingness > 10% or a Hardy–Weinberg equilibrium test p-value < 0.00001. We also excluded individuals with missingness > 10% or an estimated identity-by-descent > 0.12. Genotype imputation was performed with the Sanger Imputation Service (https://imputation.sanger.ac.uk), using the Haplotype Reference Consortium (release 1; 32,488 samples, 39 M sites) as the reference panel and SHAPEIT2 as the pre-phasing algorithm. After post-imputation QC with the same parameters as above, ~9 M SNPs were analysed.
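The SNP-level filters can be sketched as follows. This is an illustration only, not the pipeline used in the study: the function names are hypothetical, and the Hardy–Weinberg p-value comes from a 1-degree-of-freedom chi-square approximation, whereas genotyping tools such as PLINK typically use an exact test.

```python
import math


def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """MAF from genotype counts (AA, AB, BB) among called genotypes."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return min(p, 1.0 - p)


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) approximation to the Hardy-Weinberg test p-value."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def snp_passes_qc(n_aa: int, n_ab: int, n_bb: int, n_missing: int,
                  maf_min: float = 0.01, missing_max: float = 0.10,
                  hwe_min: float = 1e-5) -> bool:
    """Keep a SNP unless MAF < 1%, missingness > 10% or HWE p < 0.00001."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    missing_rate = n_missing / (called + n_missing)
    return (minor_allele_frequency(n_aa, n_ab, n_bb) >= maf_min
            and missing_rate <= missing_max
            and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_min)
```

The thresholds are exposed as parameters so that the same check can be reused for post-imputation QC, for which the same parameters were applied.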

Polygenic risk scores

For more information about how the scores are calculated, please see the Supplementary Material of Purcell et al.8. To generate the PRS we used the PRSice software (www.PRSice.info) with default options. The SCZ sample from PGC2 (downloaded from https://www.med.unc.edu/pgc) was used as the training sample and our imputed genotyping sample as the target. The PGC2 SCZ PRS is generated from many individual samples that may represent more chronic and severe SCZ, such as patients on clozapine. This means the PGC PRS represents a powerful tool to understand the influence of SCZ risk on clinically important symptom dimensions pre-treatment. We performed P-value-informed clumping with a cutoff of r² = 0.1 in a 250-kb window and calculated scores per individual at multiple p-value thresholds (ranging from 0.0001 to 0.5 in increments of 0.00005), including or excluding the MHC (major histocompatibility complex) region on chromosome 6, which has a complex linkage disequilibrium structure. Given that our sample was drawn from an admixed southeastern Brazilian population, we carefully assessed population stratification and used the first four principal components generated by plink1.9 as covariates. PRSice then runs a regression to find the best p-value threshold based on the explained variance (Nagelkerke’s pseudo-r²); in our case, this yielded the PRS that explained the most variance in FEP case-control status.
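As an illustration of p-value thresholding, the sketch below scores one individual as the sum of training-sample effect sizes multiplied by risk-allele dosages over the clumped SNPs that pass a given threshold. It is a simplified assumption rather than a description of PRSice internals (which may, for example, average rather than sum); the SNP identifiers and helper names are made up.

```python
from typing import Dict, List, NamedTuple


class Snp(NamedTuple):
    snp_id: str
    effect: float    # effect size (e.g. log odds ratio) from the training GWAS
    p_value: float   # association p-value from the training GWAS


def polygenic_score(dosages: Dict[str, float], snps: List[Snp],
                    p_threshold: float) -> float:
    """Sum of effect x dosage over clumped SNPs with p-value <= threshold."""
    return sum(s.effect * dosages[s.snp_id]
               for s in snps
               if s.p_value <= p_threshold and s.snp_id in dosages)


def threshold_grid(lower: float = 0.0001, upper: float = 0.5,
                   step: float = 0.00005) -> List[float]:
    """P-value thresholds from lower to upper (inclusive) in fixed increments."""
    n_steps = round((upper - lower) / step)
    return [round(lower + i * step, 5) for i in range(n_steps + 1)]


# Hypothetical clumped SNPs and one individual's risk-allele dosages (0 to 2).
example_snps = [Snp("rsA", 0.2, 0.00005), Snp("rsB", -0.1, 0.01), Snp("rsC", 0.3, 0.4)]
example_dosages = {"rsA": 2.0, "rsB": 1.0, "rsC": 0.0}
example_scores = {t: polygenic_score(example_dosages, example_snps, t)
                  for t in (0.0001, 0.05, 0.5)}
```

With the range and increment stated above, the grid contains 9,999 thresholds, each of which yields one candidate score per individual.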

Statistical analysis

We used R for all statistical analyses. With the PRSs calculated for the case-control comparison, we tested PRS associations using a generalized linear model assuming a Poisson distribution (Poisson regression), which is more suitable for ordinal variables (such as psychiatric scales), with each clinical trait as the dependent variable, the best p-threshold PRS as the independent variable and the first four principal components as covariates. As clinical outcome variables, we considered, for both time points, GAF score, total CGI score, total PANSS scores and the five PANSS dimension clusters suggested by Wallwork et al.20 and validated by Higuchi et al.21 in the Brazilian population. GAF values were transformed to negative values (−GAF) so that all clinical variables could be compared easily, with high values meaning high symptomatology. We defined as outliers those observations lying beyond 1.5 times the interquartile range (the difference between the 75th and 25th percentiles).
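The outlier rule can be sketched as below. The sketch assumes the conventional Tukey fences, i.e. that "beyond 1.5 times the interquartile range" is measured below the 25th percentile and above the 75th percentile; that reading, the quantile interpolation method and the function name are assumptions.

```python
import statistics
from typing import List, Sequence


def iqr_outliers(values: Sequence[float], k: float = 1.5) -> List[float]:
    """Return observations lying more than k * IQR outside the quartiles."""
    # The "inclusive" method interpolates linearly between the observed data points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]
```

Different quantile definitions can shift the fences slightly in small samples, so borderline observations may be classified differently by other software.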

We applied the Bonferroni correction for multiple comparisons (number of psychiatric scales tested N = 36), considering a p-value < 0.0014 (0.05/36) significant. As the Brazilian population is known to be highly admixed, we first plotted the principal components of cases and controls to check whether they had a similar background, and then performed a sensitivity analysis restricted to cases of full European ancestry.
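The corrected significance threshold follows from simple arithmetic, sketched here with hypothetical function names. Note that 0.05/36 is approximately 0.00139, which is reported as 0.0014 after rounding.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value: float, alpha: float = 0.05, n_tests: int = 36) -> bool:
    """True if the p-value is below the unrounded threshold alpha / n_tests."""
    return p_value < bonferroni_threshold(alpha, n_tests)
```

Using the unrounded threshold is the stricter choice: a p-value of 0.00139 lies below 0.0014 but not below 0.05/36.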

Using the residuals from the PRS with principal components, we tested whether the available demographics could be potential confounders. Further, we tested whether response to risperidone, overall or within the subtypes of FEP included in our study (SCZ or schizophreniform), was associated with SCZ PRS. First, we tested whether the change in symptoms from baseline to follow-up and the subtype of FEP were associated with the PRS using a Poisson regression. Second, we tested the correlation between the change in total PANSS and PRS using a Pearson correlation. Finally, we verified whether there was an association of clonazepam or mood stabilizers with CDSS, CGI, GAF and PANSS symptoms that could be affecting the results.
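The correlation between symptom change and PRS can be sketched as follows. The direction of the change score (baseline minus follow-up, so that positive values indicate improvement) is an assumption, and the helper names are hypothetical.

```python
import math
from typing import List, Sequence


def panss_change(baseline: Sequence[float], followup: Sequence[float]) -> List[float]:
    """Per-patient change in PANSS total (baseline minus follow-up)."""
    if len(baseline) != len(followup):
        raise ValueError("baseline and follow-up must have the same length")
    return [b - f for b, f in zip(baseline, followup)]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

A call such as pearson_r(panss_change(baseline, followup), prs) would then give the coefficient for one cohort, where prs holds each patient's score in the same order.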