Differences in cytokine expression between healthy and ME/CFS populations

Differences in the expression of individual markers between HC subjects and subjects diagnosed with ME/CFS were assessed in each of the age and duration of illness groups using the non-parametric ranksum test and the standard two-tailed t test applied to the log2-transformed cytokine data (Additional file 1: Table S2). Significant differences (p < 0.05) in both the mean (t test) and median (ranksum test) log-transformed expression values were observed between the HC and ME/CFS subjects in at least one of the 3 subgroups for 7 out of the 16 cytokines measured. These differences were especially abundant in the subgroup of subjects aged 50 years or more and ill for 11 years on average. In this subgroup, IL-4, IL-5, IL-12 and LTα were increased in expression in ME/CFS, while IL-8 and IL-15 were expressed at lower levels in ME/CFS. Conversely, increased expression of IL-8 was observed in adolescent ME/CFS subjects, with log-transformed concentrations of IL-23 being significantly lower in this subgroup.
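The rank-based comparison described above can be illustrated with a minimal sketch. It assumes a normal approximation to the rank-sum statistic with no tie or continuity correction, which may differ from the software actually used in the study. The function names and the concentration values are hypothetical and serve only as an illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def ranksum_test(group_a, group_b):
    """Two-sided rank-sum test via the normal approximation.

    Simplification (assumption): no tie or continuity correction.
    Returns (z, p).
    """
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    expected = n_a * (n_a + n_b + 1) / 2
    spread = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (rank_sum_a - expected) / spread
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical concentrations (arbitrary units), not values from the study.
hc_raw = [3.1, 4.0, 2.7, 5.2, 3.8, 4.4]
cfs_raw = [6.3, 5.9, 7.1, 4.9, 8.0, 6.6]

# log2 transform, as applied to the cytokine data before testing.
hc_log2 = [math.log2(x) for x in hc_raw]
cfs_log2 = [math.log2(x) for x in cfs_raw]

z, p = ranksum_test(hc_log2, cfs_log2)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Because the log2 transform preserves the ordering of the values, the rank-based result is the same on raw and transformed data; the transform matters for the t test, which compares means.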

Validating a prior classification model

As cytokines are not expressed independently of one another [34], we had previously applied both a sequential step-wise selection procedure and an all-possible subsets procedure to identify subsets of cytokines that, when used as the basis of a linear classification model, might provide a co-expression signature characteristic of ME/CFS in the adolescent cohort used here [35]. The sequential selection procedure identified IL-1α, IL-6, IL-8, IL-13 and IL-23 as potential markers of ME/CFS in this adolescent population (Table 1). Of these cytokines, IL-6, IL-8 and IL-23 were also selected by the all-possible subsets method, forming the basis for a reduced consensus model. As described in Broderick et al. [35], random sub-sampling of the adolescent subjects indicated that IL-6 and IL-8 provided an especially robust basis for a minimal classification model of post-infectious ME/CFS in our adolescent population. Even this minimal model supported a classification accuracy of close to 80 % in the adolescent training set. However, as shown in Table 1, all three variants of this classification model extrapolated poorly to the older mid-course and late-course subgroups, with accuracies of less than 50 %.

Table 1 Extrapolation of classification models identified in adolescent CFS (Broderick et al., 2012) to adult pre and post-menopausal groups

Selecting cytokines broadly conserved across age and BMI

Even in healthy individuals, cytokine expression is influenced by a variety of factors such as age and BMI [17, 42–44]. Indeed, when examining changes in cytokine expression in healthy individuals alone, we found that the majority of the 16 cytokines measured here changed significantly in expression (Table 2a). Although the data were transformed, the significance analysis based on classical ANOVA was further verified using the Kruskal–Wallis non-parametric test, and variable selection was based on the more conservative result. Only IL-6, TNF-α and LTα levels were not significantly different among healthy control subjects across all 3 age subgroups (Table 2a). Expression of IL-1α, IL-8 and IL-15 was statistically stable only across subgroups composed of predominantly premenopausal healthy individuals (Table 2b). While significant differences in BMI were seen in healthy subjects between the adolescent subgroup and the subgroups with older subjects, the overall range of values was such that the vast majority of subjects were non-obese (58 of 73 with BMI < 30 kg/m²) [45].
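As an illustration of the non-parametric check across three age subgroups, the following minimal sketch computes the Kruskal–Wallis H statistic. It assumes no tie correction and uses the chi-square approximation, whose survival function for 2 degrees of freedom (three groups) reduces to exp(-H/2). The function names and the data are hypothetical and do not reproduce the study's analysis.

```python
import math


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (assumption: no tie correction)."""
    pooled = sorted(value for group in groups for value in group)
    # Map each distinct value to the mean of the 1-based positions it occupies.
    rank_of = {}
    start = 0
    while start < len(pooled):
        end = start
        while end + 1 < len(pooled) and pooled[end + 1] == pooled[start]:
            end += 1
        rank_of[pooled[start]] = (start + end) / 2 + 1
        start = end + 1
    n = len(pooled)
    weighted = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group)
        for group in groups
    )
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom (3 groups only)."""
    return math.exp(-h / 2)


# Hypothetical log2 expression values for one cytokine in healthy controls.
adolescent = [1.2, 1.5, 1.1, 1.8, 1.4]
middle_aged = [1.9, 2.3, 2.0, 2.6, 2.1]
older = [2.8, 3.1, 2.7, 3.4, 2.9]

h = kruskal_wallis_h([adolescent, middle_aged, older])
print(f"H = {h:.2f}, approximate p = {p_value_three_groups(h):.4f}")
```

A cytokine would be a candidate "stable" marker in this sense only when such a test gives no evidence of a difference among the healthy subgroups.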

Table 2 Results of one-way ANOVA and Kruskal–Wallis tests in healthy control (HC) subjects only (a) across all 3 age groups, and (b) across the 2 age groups with predominantly premenopausal subjects (age ≤50 years)

In an attempt to remove the confounding effects of age and BMI on cytokine expression, we constructed a new set of classification models using only IL-1α, IL-6, IL-8, IL-15, TNF-α and LTα as candidate markers, since these were reasonably invariant in healthy subjects. Cytokines IL-1α, TNF-α, LTα, IL-6 and IL-8 had been previously selected as discriminatory markers [35] in the adolescent set and were retained here as the basic model (Table 3), producing a classification accuracy of 88 % in this subgroup. Repeating the stepwise selection procedure for the middle-aged mid-course subgroup identified IL-1α and IL-15 as the best markers for this subgroup, yielding an accuracy of 72 %. Likewise, in the predominantly post-menopausal late-course group, IL-6, IL-8, IL-15 and TNF-α were selected, delivering a classification accuracy of 84 %. As these subgroup-specific marker sets overlap, the applicability of the simple model identified in the adolescent subgroup was tested on the two other age and illness duration groups, yielding accuracies of 37 and 48 %. To explore whether this decrease in performance was related to the choice of cytokines, we constrained the structure of the classification model to be based on IL-1α, IL-6 and IL-8 but allowed the coefficients to be tuned for each of the illness subgroups. When coefficients were tuned in this way, the classification accuracy based on IL-1α, IL-6 and IL-8 rose to 77 and 75 % in the middle-aged and post-menopausal subgroups respectively (Table 3). As IL-1α, IL-6 and IL-8 were selected from a candidate set of cytokines that were reasonably invariant across age and BMI in healthy subjects, this result suggests that duration of illness may be a main factor driving the need for parameter tuning across ME/CFS subgroups, at least in the group of cytokines measured here. This choice of markers would also be consistent with a cursory analysis of illness severity showing that IL-1α and IL-8 in particular display a correlation of at least marginal significance (p ≤ 0.07) with the general fatigue, physical fatigue and reduced activity components of the MFI (Additional file 1: Table S4).

Table 3 Best stepwise models selected for each set from cytokines stable across HC groups of age ≤50 years

A classification model corrected for duration of illness

The classification results above suggest that IL-1α, IL-6 and IL-8 may be broadly applicable as ME/CFS illness markers but that their contribution should be adjusted based on duration of illness and perhaps other related covariate factors. To assess the variability of such adjustments, the coefficients for these cytokines in the linear classification model were estimated repeatedly on 50 random subsets of 10 healthy control and 10 ME/CFS subjects in each illness duration subgroup. Subjects would be assigned to the ME/CFS class if 0 < α₀ + α₁ × [IL-1α] + α₂ × [IL-6] + α₃ × [IL-8], where [x] is the z score normalized concentration of cytokine x based on the mean and standard deviation values listed in Additional file 1: Table S3. Results of this piece-wise optimal tuning of classification coefficients are shown in Table 4, supporting a classification accuracy of approximately 75 ± 8 % (standard error) on these smaller random subsets. Sensitivity values were 75 ± 12 % to 78 ± 9 % in the adult subsets. Corresponding specificity levels exceeded 73 ± 11 % and ranged up to 83 ± 8 %.
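The decision rule stated above can be written as a short sketch. All numeric values below (reference means, standard deviations and coefficients) are hypothetical placeholders; the actual values are those reported in Additional file 1: Table S3 and Tables 5 and 6. The function and key names are illustrative assumptions.

```python
CYTOKINES = ("IL-1α", "IL-6", "IL-8")


def z_score(value, mean, sd):
    """Normalize a concentration against reference mean and standard deviation."""
    return (value - mean) / sd


def classification_score(concentrations, reference, alphas):
    """Return alpha_0 + sum_i alpha_i * z(cytokine_i).

    concentrations: cytokine name -> measured value
    reference:      cytokine name -> (mean, sd) used for z scoring
    alphas:         "intercept" and cytokine name -> coefficient
    """
    score = alphas["intercept"]
    for name in CYTOKINES:
        mean, sd = reference[name]
        score += alphas[name] * z_score(concentrations[name], mean, sd)
    return score


def is_me_cfs(concentrations, reference, alphas):
    """Assign the ME/CFS class when the score is strictly positive."""
    return classification_score(concentrations, reference, alphas) > 0


# Hypothetical numbers for illustration only.
reference = {"IL-1α": (2.0, 1.0), "IL-6": (4.0, 2.0), "IL-8": (1.0, 0.5)}
alphas = {"intercept": -0.5, "IL-1α": 0.5, "IL-6": -0.25, "IL-8": 0.75}
subject = {"IL-1α": 3.0, "IL-6": 2.0, "IL-8": 1.0}

print(classification_score(subject, reference, alphas), is_me_cfs(subject, reference, alphas))
```

Keeping the reference statistics and the coefficients as separate inputs mirrors the text, where the model structure is fixed and only the coefficients are tuned per subgroup.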

Table 4 Performance statistics for classification models built from 50 random subsets of 10 healthy and 10 CFS/ME subjects
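The kind of summary reported in Table 4 can be sketched as follows. This is a simplified illustration under stated assumptions: it draws 50 random subsets of 10 subjects per class and evaluates a fixed decision rule on each, whereas the study re-estimated the coefficients on every subset. The spread is reported here as the standard deviation across subsets, which may not match how the standard error in the text was computed. All names and scores are hypothetical.

```python
import random
import statistics


def performance(truth, predicted):
    """Return (accuracy, sensitivity, specificity); both classes must be present."""
    true_pos = sum(1 for t, p in zip(truth, predicted) if t and p)
    true_neg = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    positives = sum(1 for t in truth if t)
    negatives = len(truth) - positives
    return (true_pos + true_neg) / len(truth), true_pos / positives, true_neg / negatives


def subset_performance(hc_subjects, cfs_subjects, predict,
                       n_subsets=50, n_per_class=10, seed=0):
    """Summarize performance of `predict` over random balanced subsets."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_subsets):
        hc_draw = rng.sample(hc_subjects, n_per_class)
        cfs_draw = rng.sample(cfs_subjects, n_per_class)
        truth = [False] * n_per_class + [True] * n_per_class
        predicted = [predict(subject) for subject in hc_draw + cfs_draw]
        runs.append(performance(truth, predicted))
    names = ("accuracy", "sensitivity", "specificity")
    return {
        name: (statistics.mean(column), statistics.stdev(column))
        for name, column in zip(names, zip(*runs))
    }


# Hypothetical classification scores, one per subject (score > 0 means ME/CFS).
hc_scores = [-1.2, -0.4, 0.3, -0.9, -0.1, 0.6, -1.5, -0.7, -0.2, 0.1, -0.8, -1.1, 0.4, -0.3, -0.6]
cfs_scores = [0.9, 0.2, -0.3, 1.4, 0.7, -0.1, 1.1, 0.5, 0.8, -0.5, 1.3, 0.4, 0.6, -0.2, 1.0]

summary = subset_performance(hc_scores, cfs_scores, lambda score: score > 0)
for name, (mean, spread) in summary.items():
    print(f"{name}: {100 * mean:.0f} ± {100 * spread:.0f} %")
```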

The corresponding mean and median values for these optimally tuned coefficients are listed in Table 5 and shown in Fig. 1. These indicate that duration of illness influences both the magnitude and the polarity of the contribution made by each cytokine in determining membership to the ME/CFS class. The coefficient α₁ for IL-1α suggests that increased levels of the latter are most characteristic of ME/CFS in the early course of illness but that this feature decreases in importance as illness progresses. Coefficients α₂ and α₃ actually reverse in polarity as illness progresses, with a combination of lower than average IL-6 and higher than average IL-8 levels being more discriminatory for early stage ME/CFS but the reverse pattern being more prominent in subjects with more established illness. Changes in the intercept α₀ and the coefficients α₁, α₂, α₃ with respect to duration of illness were captured using a simple second order polynomial of the form αᵢ = β₀ + β₁ × (years ill) + β₂ × (years ill)². This regression model is presented in Table 6. Results show that, for all classification coefficients, the duration of illness is a highly significant contributor (F > 32; p < 0.01). Indeed, close to 60 % of the total variability in the classification coefficients for IL-6 and IL-8 is captured by duration of illness alone (R² = 0.57). However, only slightly more than 30 % of the total variability in the classification coefficient for IL-1α, as well as that for the intercept, is explained by changes in duration of illness. To evaluate the impact of this unexplained variability on classification accuracy, we applied the simple protocol proposed in Additional file 4: Figure S2, where duration of illness alone is first used to calculate the appropriate values of the classification weights αᵢ for IL-1α, IL-6 and IL-8 using the βᵢ values listed in Table 6. Using the resulting estimates of coefficients α₀, α₁, α₂, α₃ in the linear classification model, each subject was assigned a predicted ME/CFS (score > 0) or non-ME/CFS status (score ≤ 0). The protocol based on duration of illness alone supported a classification accuracy of 63 % across the full range of illness duration, a performance comparable to that obtained with an optimal tuning based on all subjects (Additional file 1: Table S5). However, important gaps in performance emerged across the different phases of illness. This was especially obvious for classification performed in the mid-range of illness duration (i.e. 6–7 years ill). These results illustrate that although duration of illness may be a highly significant contributor to the evolution of the classification coefficients (p < 0.01), other covariate factors related to illness progression may also play an important role, in particular during the transition from early to late phase.
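The duration-corrected protocol described above can be outlined in a few lines: each coefficient is first computed from years ill using the second order polynomial, and the resulting coefficients are then applied to the z scored cytokine values. The β values below are hypothetical placeholders chosen only to show a coefficient changing sign with illness duration; the fitted values are those in Table 6. Function and key names are illustrative assumptions.

```python
def coefficient_from_duration(betas, years_ill):
    """Evaluate alpha_i = beta_0 + beta_1 * years + beta_2 * years**2."""
    beta_0, beta_1, beta_2 = betas
    return beta_0 + beta_1 * years_ill + beta_2 * years_ill ** 2


def duration_adjusted_score(z_scores, years_ill, beta_table):
    """Linear classification score with duration-dependent coefficients.

    z_scores:   cytokine name -> z scored concentration
    beta_table: "intercept" and cytokine name -> (beta_0, beta_1, beta_2)
    """
    score = coefficient_from_duration(beta_table["intercept"], years_ill)
    for name, z in z_scores.items():
        score += coefficient_from_duration(beta_table[name], years_ill) * z
    return score


# Hypothetical beta values for illustration only (not those of Table 6).
beta_table = {
    "intercept": (0.0, 0.0, 0.0),
    "IL-1α": (0.5, -0.125, 0.0),
    "IL-6": (-1.0, 0.5, 0.0),
    "IL-8": (1.0, -0.5, 0.0),
}
subject_z = {"IL-1α": 0.0, "IL-6": -1.0, "IL-8": 1.0}

for years in (1, 4):
    score = duration_adjusted_score(subject_z, years, beta_table)
    status = "ME/CFS" if score > 0 else "non-ME/CFS"
    print(f"{years} years ill: score = {score:+.2f} -> {status}")
```

With these placeholder values the same cytokine profile is assigned to different classes at different illness durations, which is the behaviour that a sign reversal in α₂ and α₃ would produce.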

Table 5 Distribution statistics for coefficients in linear discriminant model based on 50 random subsets of 10 healthy and 10 CFS/ME subjects in each age group

Fig. 1 Relative contribution to classification of ME/CFS subjects versus healthy control subjects across duration of illness for cytokines largely unaffected by age and BMI. Average value with standard error for the coefficients of IL-1α, IL-6 and IL-8 in a linear model for classification of ME/CFS subjects as estimated across 50 random subsets of n = 10 ME/CFS and n = 10 healthy control subjects sampled from subgroups of differing illness duration