In a nutshell: Analysing PACE the way the authors originally promised to do showed that cognitive behavioural therapy (CBT) and graded exercise therapy (GET) didn’t do much to improve self-reported physical function and fatigue and did not lead to recovery. Even the very limited self-reported gains in this unblinded trial are likely to be illusory because they are not backed up by meaningful gains in objective measures, such as fitness. The self-report gains also appear not to last. We now need biomedical research to pave the way for effective treatments.

Researchers and patients have been pointing out problems with the PACE trial for years. A new paper goes further by reanalysing the raw data to give the results the way the trial authors originally said they would give them, before they opted for softer measures of success. The new paper, published in the journal BMC Psychology, also sets out all the flaws of the PACE trial in one place.

Rethinking the treatment of chronic fatigue syndrome—A reanalysis and evaluation of findings from a recent major trial of graded exercise and CBT

Carolyn E. Wilshire, Tom Kindlon, Robert Courtney, Alem Matthees, David Tuller, Keith Geraghty and Bruce Levin

It comes from a team that was led by research psychologist Dr Carolyn Wilshire and included a professor of biostatistics and several researcher-patients, each of whom has a string of publications to their name. Alem Matthees, whose long Freedom of Information battle with the PACE authors secured the release of the underlying data, is among the co-authors.

The new work, using the original analysis method published in the study protocol, revealed results that are much less impressive than the ones published by the PACE authors.

How PACE’s results really looked

“we suggest that these findings show that either CBT or GET, when added to [standard medical care], is an effective treatment for chronic fatigue syndrome”

PACE trial authors writing in The Lancet, 2011

Back in 2007, while the PACE trial was still treating patients, the authors published their protocol for the study, which set out exactly how they would run the trial and how they would measure success. The focus was on scores from questionnaires that would be completed by patients who rated their fatigue and physical function. The key success measure was set as the proportion of patients who improved by a substantial amount for fatigue, for physical function, or for both.

However, just before analysing the data, the authors decided to switch to a measure that effectively made success easier to achieve. Instead of looking at how many patients improved, they asked if average score differences between therapy groups were bigger than a rather modest “clinically useful” amount (essentially, anything more than the minimum possible improvement for fatigue or physical function scores). They then published these results but didn’t publish the protocol-specified results, which meant that no one could see the impact of switching outcomes.

The new analysis by Wilshire et al. has put that right. It found that, according to the protocol definition of improvement, only 20% of CBT and 21% of GET patients improved on both fatigue and physical function. The CBT result wasn’t even statistically significant compared with controls after appropriate corrections were made for multiple comparisons (necessary to guard against the risk of chance findings). The result for GET was statistically significant using the correction method set out in the statistical analysis plan (which itself was published years after the protocol). However, using the correction method indicated by the protocol, even the GET result would not reach statistical significance.

So it appears that, according to the analysis set out in the protocol, neither CBT nor GET led to significantly more patients improving on both fatigue and function.
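
To see why the choice of correction method can decide whether a result counts as significant, here is a minimal sketch. It assumes a Bonferroni-style correction, which the text above does not name, and it uses a made-up p-value and made-up comparison counts rather than any figure from PACE or from Wilshire et al.

```python
def bonferroni_significant(p_value: float, n_comparisons: int, alpha: float = 0.05) -> bool:
    """Return True if p_value stays below alpha after dividing alpha
    by the number of comparisons (Bonferroni correction)."""
    return p_value < alpha / n_comparisons


# Hypothetical raw p-value, chosen only for illustration. NOT a PACE result.
hypothetical_p = 0.02

# Hypothetical comparison counts: a plan that corrects for fewer comparisons
# sets a more lenient threshold than one that corrects for more.
lenient = bonferroni_significant(hypothetical_p, n_comparisons=2)  # threshold 0.025
strict = bonferroni_significant(hypothetical_p, n_comparisons=5)   # threshold 0.010

print(lenient, strict)
```

The same raw p-value passes under the more lenient correction and fails under the stricter one, which is the kind of split described above for the GET result.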

Although the PACE authors switched the primary outcome to the average score differences between groups, they also published “improvement” rates using their new, watered-down definition of improvement. The graph below compares the protocol improver rates from the Wilshire et al. paper with those originally published in the Lancet. It shows that the changes to the definition of improvement inflated rates roughly three-fold.

The new analysis also looked at the less important measure of the proportion of patients improving on fatigue or function separately, rather than on both. This found that significantly more CBT patients improved than control patients for fatigue, but not for function. For GET, there was a significant difference for function, but not for fatigue.

The situation with “recovery” is even worse than for “improvers”. The PACE authors published results showing that one in five GET and CBT patients “recovered”, but only after the goalposts had been moved halfway up the pitch.

For example, the protocol set the recovery threshold for physical function as a score of 85 out of a maximum possible 100 (‘100’ indicating good, healthy function and ‘0’ very poor function). But that recovery threshold was then revised down to just 60 – lower than the score of 65 that indicated such poor physical function that scoring at or below it qualified patients to join the trial in the first place. For comparison, patients with Class II congestive heart failure have an average score of 57. No patient or clinician would recognise this level of physical function as representing recovery.
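
The overlap between the entry criterion and the revised recovery threshold can be checked with a few lines of arithmetic. This is a sketch that looks only at the physical function score (recovery involved other criteria too). It assumes the recovery thresholds are inclusive ("at or above") and that scores move in steps of 5; the function and variable names are made up for illustration.

```python
ENTRY_MAX = 65              # scoring at or below this qualified patients for the trial
PROTOCOL_RECOVERY_MIN = 85  # recovery threshold set in the protocol
REVISED_RECOVERY_MIN = 60   # recovery threshold after revision


def eligible_for_trial(score: int) -> bool:
    """Poor enough physical function to join the trial."""
    return score <= ENTRY_MAX


def meets_recovery_threshold(score: int, threshold: int) -> bool:
    """Assumption: a score at or above the threshold counts as recovered."""
    return score >= threshold


# Assumption: the questionnaire yields scores from 0 to 100 in steps of 5.
possible_scores = range(0, 101, 5)

overlap_revised = [
    s for s in possible_scores
    if eligible_for_trial(s) and meets_recovery_threshold(s, REVISED_RECOVERY_MIN)
]
overlap_protocol = [
    s for s in possible_scores
    if eligible_for_trial(s) and meets_recovery_threshold(s, PROTOCOL_RECOVERY_MIN)
]

print(overlap_revised)   # [60, 65]
print(overlap_protocol)  # []
```

Under the revised threshold, a patient scoring 60 or 65 would be ill enough to enter the trial and, on this measure, well enough to count as recovered. Under the protocol threshold no such overlap exists.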

Recovery rates using the protocol definition were much lower and not statistically better for CBT or GET than for controls, as shown in the graph below. The new results make it clear that CBT and GET don’t lead to recovery from this illness.

Another new analysis provided by the Wilshire et al. paper examined an aspect of PACE’s long-term follow-up results that had not previously been explored. The long-term follow-up results were published in 2015 and showed no difference between groups. At the time, the PACE team argued that this meant that the benefits of CBT and GET had been maintained and that patients in the control group had caught these groups up because by then, some of the control-group patients had also received CBT and GET.

But the new paper elegantly unpicks this optimistic claim, by analysing data from those patients who did not have any further therapy after the trial – and here too the control group caught up with the CBT and GET groups. The evidence shows that even the modest gains in self-reported fatigue and physical function didn’t last.

What should have happened with the PACE results?

The Wilshire et al. paper highlights two fundamental issues with PACE changing the analyses from what was specified in the protocol. The first problem was making substantial changes without solid justification, and the second was never publishing the results using the analyses that they had originally promised.

As the paper notes, although the Trial Steering Committee did approve the changes, it isn’t clear why the PACE authors suddenly decided that their planned measure “would be hard to interpret, and would not allow us to answer properly our primary questions of efficacy”. They’d clearly thought differently when they’d published their detailed protocol three years earlier.

Wilshire et al. argue that it was important to publish the results as set out in the protocol, which would still have left the PACE authors free to use more sensitive methods to explore smaller changes in outcomes.

The central flaw: self-reported outcomes in a nonblinded trial

The definition of who was classed as an “improver” was based on self-reported measures, and “recovery” was largely built on these too. But PACE was a nonblinded study where patients knew whether they were getting the active treatment. In a drug trial, it would be unacceptable for the researchers to rely on patients’ self-ratings if the patients knew whether they were receiving the new drug or a control pill because those self-ratings would be affected by the patients’ expectations.

Wilshire et al. make the case that patients receiving CBT or GET had their expectations raised as part of their therapy. Patients receiving CBT were assured it was a ‘powerful’ and ‘effective’ therapy, while GET patients were told it was ‘one of the most effective therapy strategies currently known’. Both groups were assured that faithfully adhering to their therapy could lead to full recovery. This was likely to influence how patients answered questions about how they felt after treatment.

Because the nature of the therapies meant that it was not possible to conduct PACE in a blinded way, it was crucial to compare its self-report outcomes with its objective ones, such as fitness and walking distance. But this did not happen.

The original Lancet paper included results for a test measuring how far patients could walk in six minutes, which showed that CBT patients failed to improve after a year of therapy, despite making modest gains in self-report measures. Meanwhile, GET patients could walk 67 m further than when they started therapy, but as Wilshire et al. point out, this is less than half the improvement found in a study of a sample of Class II chronic heart failure patients after just three weeks of graded exercise.

Fitness results revealed no gains at all for either CBT or GET. Similarly, there was no improvement for either group in the hours that patients were able to work, and slightly more patients were awarded state sickness benefits at the end of the trial than at the start.

There were no meaningful objective gains for patients after either CBT or GET, despite gains in self-report outcomes. Wilshire et al. comment that this pattern is exactly what you would expect from a nonblinded study of therapies that had no real effect.

“[the] findings of the trial cannot safely be used to support behavioural interventions for chronic fatigue syndrome.” Lead author Wilshire, speaking in 2018

Looking to the future

This impressive new paper demonstrates that the changes to the analysis methods led to more flattering results for the PACE trial both for primary outcomes and for recovery. It confirms that in the longer term, even self-report outcomes were no better with CBT and GET. And it shows how increased expectations of success among patients getting CBT and GET are likely to explain why the self-reported gains weren’t matched by objective ones.

This leads to a remarkable situation: the PACE trial aimed to demonstrate the effectiveness of therapies explicitly based on the notion that ME/CFS is perpetuated by patients’ flawed beliefs and behaviours. Instead, it showed that these approaches, CBT and GET, deliver few or no benefits for patients.

The paper concludes that “the time has come to look elsewhere for effective treatments”, pointing to biomedical studies funded by the National Institutes of Health in the US. It is indeed time to find the biological causes of this illness and so pave the way for the effective treatments that patients need.


Sadly, Robert Courtney, one of the researcher-patients who co-authored this new paper, died while it was in press. ME patients have published a tribute to their friend Bob.