Systemic lupus erythematosus (SLE) is an autoimmune disease characterized by loss of tolerance to nucleic acids and highly diverse clinical manifestations. To assess its molecular heterogeneity, we longitudinally profiled the blood transcriptome of 158 pediatric patients. Using mixed models accounting for repeated measurements, demographics, treatment, disease activity (DA), and nephritis class, we confirmed a prevalent IFN signature and identified a plasmablast signature as the most robust biomarker of DA. We detected gradual enrichment of neutrophil transcripts during progression to active nephritis and distinct signatures in response to treatment in different nephritis subclasses. Importantly, personalized immunomonitoring uncovered individual correlates of disease activity that enabled patient stratification into seven groups, supported by patient genotypes. Our study uncovers the molecular heterogeneity of SLE and provides an explanation for the failure of clinical trials. This approach may improve trial design and implementation of tailored therapies in genetically and clinically complex autoimmune diseases.

We aimed to identify immune correlates of DA in children with SLE, who present with aggressive disease and lack co-morbidities that might confound data interpretation in older age groups. To this end, we transcriptionally profiled 924 longitudinal blood samples from 158 pediatric patients. To analyze these data, we developed models that incorporate DA, demographics, treatments, SLEDAI components, and nephritis classes. Finally, we developed a personalized transcriptional immunomonitoring approach. This enabled patient stratification based on the immune networks best correlating with DA in each patient. The accuracy of this stratification was supported by SNP analysis and provides a rationale for tailored therapeutic interventions.

SLE patients display unique blood transcriptional signatures linked to type I interferon (IFN) and granulocytes (). Preliminary work suggests that these signatures can be used to assess DA (). Most studies have focused on IFN-induced transcripts or proteins as biomarkers () and suffer from sample-size limitations and disease-inherent clinical and therapeutic heterogeneity, making data interpretation difficult. This underscores the need to assess larger cohorts longitudinally, use unbiased approaches that incorporate all elements of the signature, and account for disease heterogeneity and clinical covariates during data interpretation.

SLE is primarily managed with hydroxychloroquine (HC), corticosteroids, and immunosuppressive agents, such as mycophenolate mofetil (MMF) and cyclophosphamide (). An anti-BAFF human monoclonal antibody (belimumab) was recently approved (), representing the only novel therapy for SLE in more than 50 years. Failure to achieve primary endpoints in clinical trials targeting CD20 with rituximab and IFNα with rontalizumab () suggests that SLE is molecularly heterogeneous. Therefore, stratification of patients according to immune networks that associate with DA may facilitate the development of customized therapies using rational clinical trial designs.

Classification of SLE requires the presence of 4 out of 11 criteria. Disease activity (DA) is measured using any of six validated composite scores, including the SLE Disease Activity Index (SLEDAI), a weighted metric combining 24 components (). These scoring systems, while useful, only quantify a non-exhaustive set of parameters. Because SLE is heterogeneous, not all manifestations are included in the SLEDAI, making reliable patient assessment challenging. Thus, there is a significant need for biomarkers of disease pathogenesis and DA.

Systemic lupus erythematosus (SLE) is an incurable systemic autoimmune disease predominantly affecting young women. Patients characteristically produce autoantibodies against double-stranded DNA (dsDNA), ribonucleoproteins (RNPs), cardiolipin, and phospholipids, among others. Autoantibodies form immune complexes (ICs) that deposit in many different organs, such as the skin, joints, and kidneys, leading to rashes, arthritis, and lupus nephritis (LN). The disease course is unpredictable, with periods of remission and flares that lead to cumulative damage over time (). This clinical heterogeneity calls for treatment personalization according to underlying molecular mechanisms.

Finally, we sought a gene set that could stratify SLE patients with shorter follow-up. To do so, we conducted ANOVA (p < 0.001) on the 11,457 transcripts that form the union of the 80 WGCNA SLEDAI modules. 3,011 transcripts were differentially correlated between the seven patient groups. Of these, 797 intersected with the blood module space (erythropoiesis: 149 transcripts; IFN response: 40; myeloid lineage/neutrophils: 163; plasmablasts: 9; lymphoid lineage: 436) ( Figure 7 A; Table S5 ). Classification accuracy of these transcripts in cross-validation analysis ranged between 0.81 and 0.91 depending on PG ( Supplemental Experimental Procedures ). Next, we correlated the SLEDAI of 12 additional patients followed during three or four clinic visits with the expression of these 797 transcripts. Specific immune correlates of the SLEDAI were identified in these individuals (e.g., IFN in SLE-187, myeloid lineage/neutrophils in SLE-255, and plasmablasts in SLE-268) ( Figure 7 B). Finally, we assigned PG groups to these 12 donors using both hierarchical cluster tree inspection and k-nearest neighbors (k = 5) approaches ( Figure 7 C). The two methods agreed for 8 out of 12 patients. SLE-157, 230, and 267 reproducibly associated with PG2, while SLE-187 and 255 associated with PG5. This suggests that these transcripts can stratify SLE patients and could be used in a targeted cost-efficient bioassay. Overall, our approach highlights immune signatures correlating with SLEDAI across different genotypes and phenotypes and supports the development of customized treatment strategies.

(C) Hierarchical cluster of the 92 SLE patients considered, according to individual SLEDAI correlation profile in the collapsed blood module space. Patients are colored by PG classification. PG assignment of the 12 new patients is highlighted by colored dots.

(B) Summary heatmap of immune signatures correlating with the SLEDAI of 12 independent patients with three to four visits (upper). Line charts representing the mean normalized expression of immune signatures (left y axis) and SLEDAI (right y axis) for three representative patients (lower).

To assess potential SNP differences between PGs, while maximizing sample size, we grouped patients from PG2 and PG3 (PG2/3), where DA correlated with plasmablasts and/or lymphoid lineage, and patients from PG4 and PG5 (PG4/5), where DA correlated with IFN response and myeloid lineage. We compared these groups to race-matched healthy controls ( Figure 6 D) and to each other. In parallel, we conducted expression quantitative trait loci (eQTL) analysis to identify SNPs acting in cis on genes that change with DA ( Supplemental Experimental Procedures Table S3 ). We then overlapped the results from these analyses, focusing on 362 group-specific SNPs that overlapped with eQTLs ( Figure 6 E; Table S4 ). Numerous genes in cis with SNPs and distinct between PG2/3 and PG4/5 were related to lymphoid lineage/B cells (BANK1, CD3E, and CD3G), myeloid lineage/neutrophils (BPI, CD300A, CXCL16, CCR1, RXRA, TKT, LILRB3, NOD2, SIGLEC9, PGLYRP1, MYD88, IL8, TNFAIP2, and TNFRSF1A) or both (CD40). Overall, 126 out of 362 (35%) SNPs were acting in cis with an IFN-inducible gene (). While studies in larger cohorts are warranted, these data suggest an association between genotype and transcriptional correlates of DA in discrete groups of SLE patients, which in combination might accelerate the identification of targets for personalized treatment.

To assess the contribution of genetics to this distribution, we genotyped 135 patients by SNP array and collected race-matched healthy controls from both our cohort and the 1000 Genomes Project (G1K) ( Figure S6 E; Supplemental Experimental Procedures ). We detected SNPs previously associated with SLE among numerous members of the HLA cluster on chromosome 6, as well as ITGAM, ITGAX, IRF5, TNPO3, and BANK1 with confirmatory significance (p < 0.001) ( Figure 6 C) (). These results were validated by random permutation analysis ( Supplemental Experimental Procedures ).

We analyzed the distribution of renal SLEDAI components, patient demographics, laboratory values, and treatment between groups ( Table S2 ). No difference was found between PGs for age, gender, race, sampling season, or visit number. PG4 patients, for whom SLEDAI best correlated with IFN response, myeloid lineage, and plasmablasts, all had LN, displayed the highest proportion of PLN, were on aggressive therapy, and had the highest percentage of neutrophils, suggesting higher disease burden. Conversely, PG1 and PG6 patients, for whom SLEDAI best correlated with erythropoiesis, displayed the lowest proportion of active nephritis.

We considered a different stratification approach that used samples with high DA only ( Supplemental Experimental Procedures ). Clustering of transcripts modulated in this dataset revealed partial conservation of the original PG stratification, in particular for PG4 and PG5 ( Figure S6 B). High DA and longitudinal correlation analyses were then conducted in parallel ( Figure S6 C). Similarities between methods were observed for the plasmablast signature (R= 0.47; Figure S6 D). More pronounced differences were observed for IFN, erythropoiesis, and myeloid lineage signatures, suggesting that the longitudinal approach contributes additional stratification information by incorporating molecular status during quiescent disease.

We automated this analytical approach to stratify 80 patients with five or more visits. Clustering of the interindividual SLEDAI correlation matrix identified seven patient groups (PG1–7), each displaying a specific combination of five immune signatures correlating with the SLEDAI, including erythropoiesis, IFN response, myeloid lineage/neutrophils, plasmablasts, and lymphoid lineage ( Figure 6 A). Group-specific signature enrichment patterns are summarized in Figure 6 B. Only plasmablast modules correlated with the SLEDAI in PG3, while only IFN response and myeloid lineage modules did in PG5. As a positive control, we identified WGCNA modules that best correlated with neutrophil percent in each patient, revealing a homogenous correlation pattern with myeloid lineage/neutrophil modules across patients ( Figure S6 A).

(E) Principal component analysis of the SNP data from healthy controls selected from the 1000 Genomes project for comparative genotype analysis of SLE patient groups (AA: African-American; ASW: Americans of African Ancestry in SW USA; C: Caucasian; GBR: British in England and Scotland; H: Hispanic; MXL: Mexican Ancestry from Los Angeles USA; BIIR, Baylor Institute for Immunology Research).

(C) Hierarchical clustering of the 2,192 transcripts overlapping with the blood module transcripts from Figure 7 A, collapsed by immune network, by both the high DA and temporal correlation approaches.

(A) Heatmap representing the WGCNA modules that best correlate with neutrophil percentage for 80 SLE patients with at least five visits.

(E) Venn diagram of overlapping SNPs between PG2/3 versus healthy, PG4/5 versus healthy, PG2/3 versus PG4/5, and eQTLs (p < 0.05). Genes in cis with these SNPs are displayed in boxes for relevant lists.

(C) Manhattan plot representing the results from the comparative SNP analysis between SLE (n = 80) and healthy controls. Loci related to genes previously associated with SLE are highlighted in red.

(A) Hierarchical clustering of the interindividual correlation matrix between the SLEDAI WGCNA modules for 80 patients (rows) and blood modules (columns) (left). Blood modules were averaged by immune group for each SLEDAI WGCNA module (center). A black square indicates a correlation ≥0.4. The transcript overlap between each WGCNA module and the combined list of blood module transcripts from the five groups (PG, patient group) are shown (right).

To assess the heterogeneity of immune signatures associating with DA across patients, SLEDAI WGCNA modules were then recombined into an interindividual correlation matrix between WGCNA (y axis) and whole-blood (x axis) modules, and hierarchically clustered. Subgroups of patients were identified based on the combination of blood immune signatures that best correlated with the SLEDAI. Finally, to assess whether there is a genetic basis for these clusters, we conducted SNP analysis ( Figure S4 C). We reasoned that the identification of individual immune correlates of DA could inform personalized therapeutic interventions ( Figure S4 D).

To facilitate access to these data, we developed a web interface available at http://websle.com . For each patient, users can (1) follow clinical trait changes over time, (2) analyze the profile and content of each WGCNA module, (3) identify modules and transcripts best correlating with clinical traits, and (4) identify major module hubs through module membership quantification. The interface’s features are detailed in Figure S5

Users can browse all results from the four linear mixed models presented in the manuscript. The SAS PROC MIXED code is also displayed. Users can browse individual subject clinical data, as well as assess the distribution of specific clinical variable by race/ethnicity or treatment group for example. Individual WGCNA results can be analyzed through correlation matrices, linecharts representing module profiles and tables with module transcript content. Finally, blood module fingerprints can be browsed by individual over time.

An example is provided for patient SLE-55, an African-American female with PLN immunomonitored 13 times over 798 days. The SLEDAI indicated two flares at visits 3 and 9. Both were accompanied by increases in anti-dsDNA antibody titers and neutrophil counts, but only the second flare was accompanied by an increase in ESR ( Figure 5 A). WGCNA identified 41 modules specific to her immunomonitoring time course, which were correlated with clinical traits ( Figure 5 B). The modules that best correlated with SLEDAI (brown), anti-dsDNA antibody titers (dark orange), neutrophil count (blue), C3 (midnight blue), and ESR (yellow) were selected ( Figure 5 C). To biologically interpret these modules, we projected their expression profile in the annotated blood module space, by correlating their eigengenes to those of blood modules. This approach allows us to automate the annotation of hundreds of WGCNA modules by comparing their expression pattern and transcript content to those of our reference blood modules. In this patient, the profile of the brown (SLEDAI) and blue (neutrophil percent) modules positively correlated with myeloid lineage, inflammation, and IFN response blood modules. The dark orange module that best fitted anti-dsDNA antibody titers correlated with plasmablast, cell cycle, and IFN modules. Finally, the yellow module (ESR) highly correlated with erythropoiesis ( Figure 5 D).

(C) Line charts representing the modules that best correlate with SLEDAI, anti-dsDNA antibody titers, neutrophil count, C3, and ESR. The profile of the clinical trait is overlaid on the plot and represented as an interrupted black line.

(A) Clinical summary for patient SLE-55. SLEDAI, anti-dsDNA antibody titers, neutrophil count, C3, and ESR are displayed as line charts as a function of days since the first visit (x axis). Treatment and nephritis class are also displayed.

SLE is a heterogeneous disease, for which no single treatment is curative. Stratification of patients will be important to formulate customized therapies and improve clinical trial design. To address this, we leveraged weighted gene co-expression network analysis (WGCNA) () to identify modules of longitudinally co-expressed transcripts for individual patients and uncover associated clinical traits. Briefly, the entire dataset was divided into subsets of samples corresponding to each subject. Patient-specific modules of co-expressed transcripts over time were identified by WGCNA, and continuous clinical traits were correlated to module eigengenes ( Figure S4 A). A module/trait correlation matrix was built for each patient and the module that best correlated with the SLEDAI was selected (hereafter SLEDAI WGCNA module) ( Figure S4 B).

(C) The SLEDAI WGCNA modules are reassembled and correlated to blood modules expression, for inference of biological function. The resulting interindividual correlation matrix is clustered and patient groups are identified by hierarchical tree inspection, based on their separation across major leukocyte immune networks (erythropoiesis, IFN response/neutrophils, myeloid lineage/neutrophils, plasmablasts, and lymphoid lineage). Patients are genotyped and SNP analysis is conducted for each patient group.

In silico ingenuity pathway analysis (IPA) further revealed that MMF-treated PLN overexpressed transcripts linked to myeloid lineage and inflammation (FcRs, CR1, IL1B, IL6R, and CTSC) and underexpressed B cell and plasmablast transcripts (CD19, EBF1, E2F5, GAS6, and ADARB1) ( Figure 4 F). Conversely, MMF-treated MLN overexpressed transcripts linked to activated neutrophils (DEFA1, DEFA4, CAMP, RNASE2, and LTF) and the IFN response (MHC class I, IRF9, PSMBs, OASL, and TRIM22) ( Figure 4 G). These data highlight a differential enrichment of molecular signatures in MMF-treated PLN and MLN, perhaps reflecting different pathogenic mechanisms leading to disease.

MMF-treated PLN and MLN groups showed no difference in SLEDAI or neutrophil percentage ( Figure S3 E). Untreated PLN displayed enrichment (FDR < 0.001) for modules correlating with DA, including neutrophils (3.49), plasmablasts (1.71), the IFN response (1.63), and B cells (1.33). Upon MMF treatment, enrichment of these modules decreased significantly more in PLN than in MLN (1.42-, 1.27-, 1.27-, and 1.41-fold, respectively, FDR < 0.001) ( Figures 4 D and 4E).

The term lupus nephritis (LN) includes a heterogeneous group of kidney pathologies histologically classified into six major types (I–VI) (). Severe cases comprise proliferative (PLN, III and IV) and membranous (MLN, V) nephritis, or a combination of the two (VI). MMF is used to treat both PLN and MLN, though the pathogenesis of these two nephrites is different. To determine whether blood transcriptomics could inform on these differences, we developed a fourth model accounting for interactions between treatment and nephritis class, while adjusting for SLEDAI differences between groups. Samples were categorized into no LN (I), mesangial nephritis (II), PLN (III and IV), and MLN (V) (type VI was excluded from the analysis). We focused on three comparisons: (1) untreated patients without LN versus untreated PLN (untreated MLN samples were underrepresented in our cohort), (2) MMF-treated patients without LN versus PLN or MLN, and (3) direct comparison of PLN versus MLN on MMF ( Figure S3 D).

Because the SLEDAI is a composite score evaluating 24 parameters, we assessed whether transcriptional differences could be detected as a function of discrete SLEDAI categories. We classified samples according to five groups of SLEDAI components: no SLEDAI parameters (“none”), alterations in serum parameters only (“serology”), connective tissue ± serology (“skin/musculoskeletal”), kidney ± serology (“renal”), or all combined (“global”), reflecting the highest disease severity ( Figure S3 B). SLEDAI distribution across these groups confirmed a pattern of increasing DA from “none” to “global” ( Figure S3 C). To compare these groups, we developed a third model, accounting for treatment, and obtained estimates for the comparisons between each active component group and “none.” QuSAGE detected enrichment in IFN response, plasmablast, and B cell modules in all comparisons, suggesting early involvement of these pathways in disease progression. Conversely, the neutrophil, myeloid lineage, and inflammation modules were only enriched in the renal and global component groups, suggesting an association with kidney involvement ( Figures 4 A and 4B ). FACS analysis in a subset of patients (n = 31) revealed increased total and activated CD62L-low circulating neutrophil counts in patients with active nephritis ( Figure 4 C). These data support a model of gradual disease progression, whereby an initial increase in IFN response and differentiation of B cells into short-lived plasmablasts, which may occur before clinical onset (), is followed by kidney disease and full-blown systemic inflammation fueled by myeloid cells, including neutrophils.

(F) Ingenuity pathway analysis (IPA) networks of DETs from the estimates of PLN versus no LN treated with MMF. DETs are represented on the outer circle and colored by fold change (red, overexpressed; green, underexpressed). Predicted upstream and downstream regulators (absolute Z score > 1) are represented on the inner circle.

(D) Heatmap representing the QuSAGE fold enrichment for PLN and MLN, compared either to no nephritis (No LN) or directly to each other, with (MMF) or without (NT) treatment.

(C) Box plots representing the absolute counts of bulk (left) and activated CD62L-low (right) neutrophils by FACS in a subset of patients with (n = 8) or without (n = 23) active LN. ( ∗ p < 0.05; ∗∗ p < 0.01.) Data were adjusted for age through a mixed model. Whiskers represent the minimum and maximum values.

Next, we assessed how treatment influences SLE blood transcriptional signatures by categorizing each sample into one of five treatment categories (NT, HC, OS, MMF, and CIV). We identified 622 DETs in any treatment versus others. Patients receiving HC only, given as primary therapy for milder clinical manifestations, displayed a module enrichment profile similar to that of untreated patients, who for the most part were in remission. Treatment groups including corticosteroids (OS and CIV) displayed an increased neutrophil signature, consistent with increased numbers of circulating neutrophils following these treatments. The plasmablast signature was decreased by all treatments compared to NT, but most strongly by MMF and CIV, two cytostatic drugs that suppress activated lymphocytes. These two therapies were also the most effective in decreasing the IFN response ( Figures 3 E and 3F). These results highlight changes in circulating leukocyte ratios, as well as the association of more potent therapies with significant decrease in plasmablast and IFN response.

To complement these observations, we analyzed the distribution of SLEDAI, anti-dsDNA antibody titers, C3, and ESR among these races by DA level. African-Americans displayed a significantly higher SLEDAI in the DA3 group and higher systemic inflammation as measured by ESR and C3 levels, consistent with increased clinical burden ( Figure S3 A). Accordingly, their anti-dsDNA antibody titers were increased compared to Caucasians and Hispanics in the DA3 group, as substantiated by the slope of the linear regression of SLEDAI versus anti-dsDNA antibody titers (p < 0.0001) ( Figure 3 D). Therefore, an increase in DA is connected to enrichment in plasmablast signatures and circulating anti-dsDNA antibodies, particularly in African-Americans.

(B) Classification of samples in five groups based on SLEDAI component category (serum, musculo-skeletal and kidney). Dark gray squares represent the components a sample must be positive for (at least one per category) to be part of a group.

The influence of race on SLE clinical severity is well documented, with African-Americans often presenting with more severe disease than Caucasians (). To assess whether transcriptional differences among individuals of various races exist, we compared the transcriptomes from African-American, Hispanic, and Caucasian patients in paired analyses, i.e., (1) African-American, (2) Caucasian, and (3) Hispanic versus Others. We uncovered 444 DETs among the three groups ( Figure 3 A). Functional interpretation by QuSAGE identified an enrichment of plasmablast, cell cycle, and erythropoiesis modules in African-Americans ( Figures 3 B and 3C). Conversely, Hispanics and Caucasians showed enrichment of neutrophil, myeloid lineage, and inflammation-related modules.

(B) Heatmap representing the QuSAGE fold enrichment of each group using blood modules as gene sets. The reproducibility of each module between training and test sets is displayed on the right.

Finally, transcripts linked to DA changes were represented as a Cytoscape network () ( Figure 2 F). Along with blood module transcripts, this network displays DETs that do not belong to pre-established modules (connected to “NA” hub). Within this group, numerous IFN-regulated transcripts (IFI6, IFI27, IFI27L1, DDX60L, and SIGLEC1), histone (HIST1H3E, HIST2H2AA4, and HIST2H4B), and B-cell-related (STAP1 and LOC652126) genes were found, thereby expanding the results from modular enrichment analysis.

To functionally interpret these gene lists, we applied quantitative set analysis for gene expression (QuSAGE) (), using the blood modules as gene sets ( Figures 2 D and S2 D). Positive correlates of DA were enriched for IFN response, plasmablast, cell cycle, neutrophil, histone, and B cell modules. Reproducibility of module enrichment was assessed by Pearson correlation analysis of a module’s eigengene profile in training and test sets ( Figure S2 E). The plasmablast signature was the most reproducible (M7.32: R= 0.86; M4.11: R= 0.65; M7.7: R= 0.61), while the IFN response showed lower reproducibility (M1.2: R= 0.07; M3.4: R= 0.25; M5.12: R= 0.22). Transcripts negatively correlated with DA were enriched for NK cell/cytotoxicity, protein synthesis, and erythropoiesis modules. Fluorescence-activated cell sorting (FACS) analysis of circulating plasmablasts in a subset of patients (n = 36) confirmed their increasing frequency with increasing DA ( Figure 2 E).

Comparing DA3 versus DA1 yielded 486 transcripts, of which 383 positively and 103 negatively correlated with DA ( Figure 2 A). These transcripts were differentially expressed between DA levels regardless of race or treatment, as shown by their sustained upward or downward expression pattern across RaceDA and TreatmentDA interaction groups (subgroups of samples organized by DA levels 1, 2, and 3 for each value of the race or treatment variables) ( Figure 2 B). Their expression pattern was consistent in the test set (p < 0.0001, overexpressed: R= 0.35; underexpressed: R= 0.55) ( Figures 2 C and S2 C).

(F) Cytoscape network displaying the connectivity of DETs in DA3 versus DA1 with blood modules, which are represented as hubs (squares). Genes not connected to a blood module are linked to the “NA” hub. Nodes are colored by the standard least-squares mean expression of the transcript in DA3.

(E) Box plots representing the absolute count of circulating plasmablasts by FACS in a subset of patients, grouped by DA level. Plasmablast counts were found significantly different in DA3 versus DA1 (p = 0.0069) and DA3 versus DA2 (p = 0.045) by a mixed model that adjusted for age and treatment ( ∗ p < 0.05; ∗∗ p < 0.01).

Next, we assessed how the SLE signature changes with DA. To do so, we developed a second model that incorporated both race and treatment, as these parameters impact the clinical course of SLE. It identified 3,501 DETs, which were clustered to highlight their individual behavior across DA, race, and treatment groups, in training and test sets ( Figures S2 A and S2B).

(E) Upper panel: reproducibility of the 97 blood modules between training and test sets, displayed in decreasing order. Lower panel: standard least-squares mean average profiles for plasmablasts and IFN response modules in training and test sets. X-Y plots display the linear regression between training and test sets for each module. Dots represent the standard least-squares means for each comparison group considered by the model.

(D) QuSAGE fold change enrichment analysis of whole blood modules for DETs between DA3 and DA1 in training (salmon) and test (turquoise) sets. Modules are displayed in decreasing order of QuSAGE fold enrichment score in the training set. Module expression reproducibility between training and test set is displayed as a heatmap.

(C) Disease activity signature reproducibility between training and test sets. Line charts representing the eigengenes of the transcripts up- (upper panel) or down- (lower panel) regulated with disease activity in training (salmon) and test (turquoise) sets.

The SLE modular fingerprint from the training set revealed overexpression of IFN response, neutrophil, inflammation, cell cycle, erythropoiesis, and histone modules. Conversely, modules linked to NK cell/cytotoxicity, lymphoid lineage, B cells, T cells, and protein synthesis were underexpressed ( Figure 1 C). The SLE module fingerprint was highly reproducible in the test set as shown by linear regression (R= 0.94) ( Figure 1 D). Frequency analysis of IFN, plasmablast, and neutrophil signatures, which were previously associated with SLE, revealed a more transient plasmablast signature (positive in 21.2% of samples) than IFN (84.8%) or neutrophil (48.8%), explaining the absence of plasmablast-related modules in the global SLE signature ( Figure 1 E). These results highlight the breadth and frequency of systemic immune signatures altered in SLE.

To functionally interpret these DETs, we conducted modular analysis (). Briefly, we used a framework of 260 modules of transcripts co-expressed in blood across various immunological conditions to reduce data dimensionality. These modules were annotated using knowledge-based and data-driven approaches. To assess their specific enrichment in sorted hematopoietic cell populations, we applied our modular analysis to two public datasets ( Figures S1 A and S1B) (). In addition, a correlation matrix of module expression across our cohort was built and clustered ( Figure S1 C). Functional annotation of several undetermined modules could be inferred based on their proximity to annotated modules from lymphoid, myeloid, or erythropoietic lineages.

(C) Correlation matrix of blood modules across the SLE longitudinal cohort. Individual blood module fingerprints were obtained for each of the 972 samples included in the dataset. The median transcript expression from healthy controls was used as reference. A correlation matrix was built (Pearson) and hierarchically clustered. Major cluster groups were highlighted by hierarchical tree inspection.

(B) Blood module fingerprints from six leukocyte populations sorted from whole blood (). Left panel: hierarchical clustering of the 12,775 transcripts differentially expressed between the six populations (one-way ANOVA). Data were normalized to the median of all samples. Right panel: hierarchical clustering of the blood module fingerprints from the six populations. Cluster organization and population-specific enrichment were used to expand blood module annotations.

(A) Hierarchical clustering (Pearson) of the blood module fingerprints from 38 leukocyte populations (average of 4 to 10 biological replicates) sorted from whole blood (). The data were normalized to the median of all samples. Cluster organization and population-specific enrichment were used to expand blood module annotations.

We first defined the global SLE signature by comparing all patient samples to healthy controls. Unsupervised hierarchical clustering of the 15,386 transcripts revealed a prevalent IFN signature in 784 out of 924 SLE samples (84.8%) ( Figure 1 A). To identify differentially expressed transcripts (DETs) between SLE patients and healthy controls, we developed a linear mixed model. These models include both fixed effects, such as disease, and random effects, such as individual. They account for repeated measurements and enable the use of unbalanced data without excluding individuals with missing records. This model yielded 1,052 DETs in the training set, with high reproducibility in the test set ( Figure 1 B).

(A) Hierarchical clustering of the 15,386 transcripts detected across the 972 samples composing the Dallas pediatric SLE cohort. IFN response is highlighted in the dashed rectangle, and representative transcripts are listed.

We collected clinical and blood transcriptional profiles from 158 pediatric SLE patients followed longitudinally for up to 1,412 days, representing 924 unique visits. The 24 components of the SLEDAI were collected at all time points to assess DA. Samples were categorized as DA1 (SLEDAI: 0–2), DA2 (SLEDAI: 3–7), or DA3 (SLEDAI > 7), based on SLEDAI distribution across the cohort. Treatment was recorded at blood draw and categorized as follows: no treatment (NT), HC only, oral steroids ± HC (OS), MMF ± HC/OS, and intravenous therapy consisting of cyclophosphamide and/or methylprednisolone ± HC/OS (CIV). 48 healthy pediatric individuals were enrolled as controls. For validation, patients were separated into independent training and test sets in a two-to-one sample-size ratio. No significant difference was observed in the distribution of race, DA, or treatment between the sets. Effects of circadian rhythm, season, treatment dose, and frequency on the transcriptional variance were either accounted for or negligible. Cohort characteristics are summarized in Table S1

Discussion

SLE is a heterogeneous disease characterized by a wide spectrum of clinical manifestations and degrees of severity. To gain insight into the molecular heterogeneity of SLE, we profiled the blood transcriptome of a longitudinal cohort of pediatric patients. We identified a plasmablast signature as the most robust biomarker of DA. We also detected a gradual enrichment of immune signatures during disease progression, linking neutrophils to active nephritis. Finally, personalized immunomonitoring enabled patient stratification based on correlates of DA. This resulted in a comprehensive view into the heterogeneity of molecular signatures associated with clinical progression that could not be predicted with cohort-level analyses.

Odendahl et al., 2000 Odendahl M.

Jacobi A.

Hansen A.

Feist E.

Hiepe F.

Burmester G.R.

Lipsky P.E.

Radbruch A.

Dörner T. Disturbed peripheral B lymphocyte homeostasis in systemic lupus erythematosus. Sanz and Lee, 2010 Sanz I.

Lee F.E. B cells as therapeutic targets in SLE. Jonsson and Carlsten, 2003 Jonsson C.A.

Carlsten H. Mycophenolic acid inhibits inosine 5′-monophosphate dehydrogenase and suppresses immunoglobulin and cytokine production of B cells. Previous studies designed to identify blood transcriptional correlates of DA in SLE mainly focused on limited numbers of IFN-inducible transcripts that might be induced indistinctly by type I–III IFN family members. Through an unbiased approach, we confirmed that the majority of samples in our cohort overexpressed IFN-inducible transcripts. This signature, together with plasmablast, B cell, neutrophil, and histone transcripts, correlated with DA. Of these, the plasmablast signature was the most robust DA biomarker. These observations complement known associations between DA and numbers of circulating plasmablasts (), which are the source of anti-dsDNA antibodies that fluctuate with DA (). The plasmablast signature was reduced by all conventional SLE therapies, but most significantly by MMF and CIV. In addition to the anti-proliferative effect of both drugs, MMF is known to inhibit inosine 5′-monophosphate dehydrogenase, which is required for differentiation of mature B cells into plasma cells ().

Merrill et al., 2010 Merrill J.T.

Neuwelt C.M.

Wallace D.J.

Shanahan J.C.

Latinis K.M.

Oates J.C.

Utset T.O.

Gordon C.

Isenberg D.A.

Hsieh H.J.

et al. Efficacy and safety of rituximab in moderately-to-severely active systemic lupus erythematosus: the randomized, double-blind, phase II/III systemic lupus erythematosus evaluation of rituximab trial. Navarra et al., 2011 Navarra S.V.

Guzmán R.M.

Gallacher A.E.

Hall S.

Levy R.A.

Jimenez R.E.

Li E.K.

Thomas M.

Kim H.Y.

León M.G.

et al. BLISS-52 Study Group

Efficacy and safety of belimumab in patients with active systemic lupus erythematosus: a randomised, placebo-controlled, phase 3 trial. Ritterhouse et al., 2011 Ritterhouse L.L.

Crowe S.R.

Niewold T.B.

Merrill J.T.

Roberts V.C.

Dedeke A.B.

Neas B.R.

Thompson L.F.

Guthridge J.M.

James J.A. B lymphocyte stimulator levels in systemic lupus erythematosus: higher circulating levels in African American patients and increased production after influenza vaccination in patients with low baseline levels. When accounting for ethnicity, the plasmablast signature showed stronger signal in African-Americans, who also displayed the highest DA levels. Accordingly, we found the best correlation between anti-dsDNA antibody titers and SLEDAI in this ethnic group. African-American patients respond better to B-cell-depletion therapies than Caucasian patients (). Conversely, they displayed lower responses to anti-BAFF treatment in a phase III clinical trial (). This might be explained by their higher serum levels of BAFF (), which may require higher drug doses for neutralization.

Garcia-Romo et al., 2011 Garcia-Romo G.S.

Caielli S.

Vega B.

Connolly J.

Allantaz F.

Xu Z.

Punaro M.

Baisch J.

Guiducci C.

Coffman R.L.

et al. Netting neutrophils are major inducers of type I IFN production in pediatric systemic lupus erythematosus. Hakkim et al., 2010 Hakkim A.

Fürnrohr B.G.

Amann K.

Laube B.

Abed U.A.

Brinkmann V.

Herrmann M.

Voll R.E.

Zychlinsky A. Impairment of neutrophil extracellular trap degradation is associated with lupus nephritis. Villanueva et al., 2011 Villanueva E.

Yalavarthi S.

Berthier C.C.

Hodgin J.B.

Khandpur R.

Lin A.M.

Rubin C.J.

Zhao W.

Olsen S.H.

Klinker M.

et al. Netting neutrophils induce endothelial damage, infiltrate tissues, and expose immunostimulatory molecules in systemic lupus erythematosus. Neutrophil transcripts were enriched in patients with active renal disease. In lupus, these cells are primed by IFN and release both pro-inflammatory mediators and interferogenic DNA when exposed to SLE ICs (). Their specific contribution to different LN types has yet to be elucidated. Similarly to the plasmablast and IFN signatures, the neutrophil signature was extinguished after MMF treatment in PLN, but not in MLN. Mechanistically, these observations are difficult to interpret. MMF may target an upstream event leading to induction of these signatures only in PLN. For example, a short-lived plasmablast-derived autoantibody responsible for pDC and neutrophil activation could drive PLN, as opposed to autoantibodies from long-lived non-proliferative plasma cells that would be resistant to this mode of therapy in MLN. Conversely, a pro-inflammatory signature, including IL1A, IL1B, and IL6R, subsisted in PLN under MMF, suggesting that combination therapies, including drugs targeting these pathways, might be superior to current regimens against PLN.

Cheng and Anderson, 2012 Cheng M.H.

Anderson M.S. Monogenic autoimmunity. The FDA approved only one drug for SLE treatment in the past 50 years. While several treatments improved disease in pre-clinical models and/or phase I–II clinical trials, larger phase III trials have proved unsuccessful. A major hurdle in lupus therapy development is the lack of knowledge about molecular mechanisms driving DA in individual patients. Genomic approaches have identified various genetic pathways contributing to familial SLE, including immature B cell survival, early complement cascade components, regulation of IFN production, and defects in cytoplasmic or extracellular DNA degradation (). Allelic variants of genes involved in some of these pathways increase the risk of developing SLE in more common, non-familial cases. However, these alleles have so far not enabled the stratification of patients for customized treatment. Our longitudinal personalized immunoprofiling approach stratified patients into seven molecular groups according to DA correlates. This transcriptional stratification was complemented by SNP analysis that pointed at specific associations in distinct groups, despite limited sample size. Simultaneous eQTL analysis enabled the identification of group-specific SNPs that may directly affect the expression of neighboring genes. Further studies in larger cohorts are warranted and should include leukocyte profiling by flow cytometry to correct for blood population distribution in eQTL analyses.

By highlighting the spectrum of individual correlates of DA over time, our approach provides a rationale for the failure of clinical trials in SLE and an opportunity for the development of personalized therapies. Indeed, prevalent signatures that correlated with DA at the cohort level, such as IFN or plasmablast, failed to do so at the individual level in two-thirds of patients. Future studies should assess the response to treatments targeting B cells/plasmablasts, IFN, and pro-inflammatory cytokines in these subgroups of patients.

While our longitudinal data could have been used to stratify patients solely based on molecular profiles, one of our main goals was to identify immune pathways associated with changes in clinical features. A potential caveat is the use of the SLEDAI to assess DA. This measure, although validated in all age and ethnic groups, still falls short of accurately capturing some relevant clinical aspects of the disease. In our pediatric cohort, however, the major contributor to SLEDAI is LN, which is more objectively quantified than other components of the index.

McKinney et al., 2015 McKinney E.F.

Lee J.C.

Jayne D.R.

Lyons P.A.

Smith K.G. T-cell exhaustion, co-stimulation and clinical outcome in autoimmunity and infection. Biomarkers that can predict occurrence and frequency of flares will be of great value in the clinic (). The observational nature of this study, however, prevented the identification of flare predictors, as this would require more standardized and frequent sampling of individuals. Nevertheless, the molecular heterogeneity we unravel may inform the design of future studies to identify classifiers and predictors of DA.

While our study focused on SLE, our approach should be applicable to other clinically and genetically complex autoimmune diseases, such as rheumatoid arthritis, inflammatory bowel disease, or inflammatory myopathies, where even approved therapies fail to induce remission in a significant number of patients. Future challenges include establishing the value of this approach to predict responses to targeted therapies and reducing the time to stratification. This might require integration of additional information, such as epigenetic and proteomic data. Altogether, uncovering the molecular heterogeneity of complex diseases should enable a more rational use of available treatments, improve patient selection for clinical trials, and guide the development of novel targeted drugs for precision medicine.