In their response to Geraghty, the PACE investigators state that they have “repeatedly addressed” the various methodological concerns raised about the trial. While this is true, these responses have repeatedly failed to provide satisfactory explanations for the trial’s very serious flaws. This commentary examines how the current response once again demonstrates the ways in which the investigators avoid acknowledging the obvious problems with PACE and offer non-answers instead—arguments that fall apart quickly under scrutiny.

Other commentaries in this Special Section focus on specific methodological aspects of the PACE trial. (The study’s full name: “Comparison of adaptive pacing therapy, cognitive behaviour therapy, graded exercise therapy, and specialist medical care for chronic fatigue syndrome: A randomised trial.”) I would like to examine how the PACE investigators, in their response to the troubling questions about their research raised in Geraghty’s editorial (Geraghty, 2016), have strategically avoided providing direct answers (White et al., 2017). Instead, they have provided non-answers—persuasive-sounding arguments that fall apart quickly under scrutiny. This approach is consistent with their earlier efforts to rebut legitimate criticism.

The PACE investigators note that they have “repeatedly addressed” the various concerns about the trial, citing journal correspondence as well as popular forums such as blog posts and news articles. A review of some of these publications confirms their point. But “addressing” concerns is not the same as offering credible explanations, and the investigators have failed this test each time they have responded. For example, they have acknowledged that some participants already qualified as “recovered” on primary outcomes at entry—even though these participants had been found on the same measures to be disabled enough for the study. But the investigators have not acknowledged the obvious—that this peculiar overlap in disability and recovery thresholds presents serious problems of interpretation. And they have not explained why people who were recovered on primary outcomes were included in the study to begin with.

In their trial protocol (White et al., 2007), the PACE investigators included four separate outcomes on which participants had to meet recovery criteria in order to be considered fully recovered. Two of them were the primary outcomes of physical function and fatigue. In the 2013 paper in Psychological Medicine, as has been reported previously, all four of the recovery criteria were watered-down versions of the criteria listed in the protocol (White et al., 2013; Wilshire et al., 2016). In essence, the investigators overhauled their definition of “recovery” in ways that boosted the trial’s apparent success rate.

In responding to Geraghty, the investigators quote his editorial thus: “Dr Geraghty states that ‘… some trial participants had reached the level required to be classified as improved or recovered at trial entry’. This is incorrect.” Yet in quoting Geraghty, the investigators have truncated his comments in a way that distorts his meaning. Geraghty was clearly referring to participants who were recovered at trial entry on a single recovery outcome—and in particular the physical function outcome. Unbiased readers could not reasonably interpret Geraghty’s actual statement the way the PACE investigators have chosen to present it, as if he were referring to participants who were recovered at entry for all four criteria.

The PACE investigators next answer a question Geraghty did not ask. They provide assurances that no participants met the “full criteria for recovery” at trial entry—that is, none were recovered on all four criteria. This statement, while true, is a diversion, since neither Geraghty nor anyone else (to my knowledge) has argued that any participants met the “full criteria for recovery” at entry. And while making that point, the investigators decided not to explain why anyone was recovered on any of the four criteria at baseline—and especially on the two measures that were used to determine whether patients were sufficiently disabled to be in the trial. In short, the investigators refuse to grapple with the implications of this massive flaw at the core of their research. Instead, they appear to believe they deserve some credit because none of their participants entered the study having already met the “full criteria” for recovery.

The PACE investigators further state that only three participants, or less than 1 percent of the sample, met the recovery thresholds for both physical function and fatigue at baseline. Like their statement that no participants met all four of the recovery criteria, this point is also true, and also disingenuous. In fact, as was discovered after the trial through a freedom-of-information request, almost 13 percent of the sample—81 of 641 participants—met the recovery threshold for physical function at baseline. Seven participants met the fatigue recovery threshold at baseline; three members of this group also met the physical function recovery threshold. In all, 85 participants met at least one of these two recovery thresholds at baseline. Yet the investigators mention in their response only the three participants who met both thresholds, ignoring the many dozens of others who met at least one of them.

In not reporting these relevant facts in Psychological Medicine, the PACE investigators withheld important evidence from the scientific record. This omission should raise a host of difficult questions about the overall integrity of the study. It is self-evident that participants cannot logically be defined simultaneously as “disabled” and “recovered” on an indicator, even if it is only one of four indicators under investigation. Such an anomaly in a study of breast cancer, AIDS, or any other illness would disqualify it from being published. Lack of approval from oversight committees for major changes in outcome measures would also be disqualifying; the PACE investigators do not reference any such approvals in Psychological Medicine.

Conscientious editors at journals that mistakenly published such flawed research would immediately move to correct or retract it. Yet those serving as gatekeepers and decision-makers at Psychological Medicine and other prestigious journals have yet to acknowledge the glaring and fundamental problems with the PACE trial. This editorial recalcitrance and willful obtuseness harm the field of public health and undermine public belief in science. (One of the PACE investigators, Michael Sharpe, is on the Psychological Medicine editorial board.)

In their response to Geraghty, the PACE investigators also suggest that the improvement rates from cognitive behavior therapy (CBT) and graded exercise therapy (GET)—that is, the percentage of those found to have reached designated “improvement” thresholds for both physical function and fatigue—are irrelevant to their claims that the treatments work. What matters instead, they write, is the revised method they used to assess the effectiveness of the primary outcomes: a comparison of averages between the groups, which they reported in The Lancet (White et al., 2011). The PACE investigators explain, as they have previously, that their original method of measuring improvement rates was too complicated to interpret, so they substituted the comparison of averages.

Yet rates provide key information that averages, however useful, do not—namely, how many people in the different groups got better. This is information that patients and clinicians want, need, and deserve. Perhaps anticipating criticism of their decision to scrap the protocol measure of improvement rates, the investigators reported in The Lancet a post hoc measure of improvement rates that was much more expansive than the protocol version. This revised method yielded improvement rates of 59 percent for CBT and 61 percent for GET. Last year, after a tribunal ordered the release of anonymized PACE data, the investigators published their own reanalysis of the data and reported that only 20 percent of participants were defined as “improved” under the protocol methodology (Queen Mary University of London, 2016).

This big drop does not trouble the PACE investigators, nor does it alter their interpretation of the study. The issue, they write in their current response, has “nothing to do with efficacy.”

It is true that the investigators found in their reanalysis of improvement rates that the extremely modest benefits for CBT and GET were nonetheless statistically significant. But the 59–61 percent improvement rates have been widely cited as a measure of the PACE trial’s success. To insist now that anyone assessing the study should ignore the implications of the sharp decline in reported improvement rates is not a serious argument.

Indeed, the investigators appear perplexed that anyone would think of comparing the two sets of results. “It is no surprise,” they write, “that fewer participants are regarded as improved if more stringent criteria are applied.” By the same logic, it could have been “no surprise” to the PACE investigators themselves that more participants would be “regarded as improved”—and therefore reported as “improved” in their Lancet paper—if they substituted less stringent criteria to measure improvement rates in the trial.

If the investigators believed so strongly that their new improvement rate criteria were better than those in the protocol, they should have published both sets of findings or the appropriate sensitivity analyses. Then, they could have explained why the revised methods that produced the higher improvement rates were more valid and reliable than the original methods that produced the lower rates. That the investigators received oversight committee approval for the changes in primary outcome measures does not mitigate their responsibility to provide sufficient data for others to assess the results. That The Lancet did not require inclusion of this sort of information was a puzzling lapse in editorial judgment.

In addressing Geraghty’s concerns, the PACE investigators refer readers seeking further explanations to previous correspondence and articles. Yet the claims in these publications also fall short in transparency and common sense. White, for example, wrote a Guardian commentary last fall after an independent group reanalyzed the recovery data and found null results (Matthees et al., 2016). In the commentary, White complained that the researchers had made “tweaks” to the outcome measures that made it harder for trial participants to achieve recovery thresholds (White, 2016). White failed to mention that those “tweaks” were simply the stricter recovery methods he and his colleagues had themselves promised in their protocol, and later abandoned. In other words, the reanalysis did not “tweak” anything; rather, it corrected the scientific record by un-tweaking the investigators’ own post hoc, unauthorized tweaks. Those post hoc changes had yielded recovery rates of 22 percent for GET and CBT, rather than the null results found in the reanalysis.

White also recently told The BMJ that it was unfair of critics to compare the high improvement rates from the Lancet paper with the lower improvement rates calculated from the protocol definition (Hawkes, 2016). “They’re comparing one measure with a completely different one—it’s apples and pears,” White said. Indeed it is. White and his colleagues took 5 million pounds in government funds and promised to bring back apples from the market. Instead they brought back pears, refusing to show anyone the apples they had rejected. Given the resources involved, it should not be hard to understand why people would want to examine those apples for themselves, to make their own comparisons with the pears and draw their own conclusions about whether their 5 million pounds were spent wisely. The investigators appear to view this public interest in accountability for public money as confusing or even offensive.

For years, the PACE investigators have repeated their standard arguments while affirming their faith in the integrity of the study. They do not yet grasp that the ground has shifted, in ways that do not benefit this strategy. With the release of the trial data, it is no longer enough to have persuaded themselves, Sir Simon Wessely, and other adherents that their methods are sound and the findings robust. The larger scientific world is now scrutinizing both the study itself and the investigators’ defense of their work and has found their reasoning problematic and their intellectual position unsustainable.

Rather than acknowledging the flaws that others now see clearly, the investigators appear determined to persist with their current approach and resist any concession of error. Given this ill-advised and anti-scientific stance, they should prepare themselves for an even greater onslaught of questions and challenges from leading researchers, clinicians, and other experts, not to mention myalgic encephalomyelitis (ME)/chronic fatigue syndrome (CFS) patients and advocates. Inadequate and evasive responses to tough but fair criticism apparently served the PACE investigators well in previous exchanges, when few but Sir Simon Wessely were paying attention. That time has passed.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References