Trial By Error: Andrew Lloyd’s Past Endorsement of PACE

By David Tuller, DrPH

This post is sort of long and complicated, but I think the details are important given Andrew Lloyd’s outsized role in the ME/CFS domain in Australia. I urge patients to take care not to over-exert themselves in reading it!

**********

A few weeks ago, I interviewed Andrew Lloyd, an infectious disease specialist at the University of New South Wales. When I asked about his past referencing of the PACE trial as evidence for treating ME/CFS with CBT and GET, he told me he “might have” cited it in papers but couldn’t actually remember whether he had or not. He further declared himself “unfussed” about the controversy over the trial because other evidence (e.g. the Cochrane reviews) also found that these therapies were effective. (More on that problematic argument in another post.)

In my short blog about our conversation, I indicated that I found it hard to believe Professor Lloyd couldn’t remember whether or not he’d cited PACE. Here’s one reason: In 2015, he co-wrote a BMJ editorial that discussed three separate PACE papers. Here’s another: Three years before that, he co-authored a commentary in the Journal of Internal Medicine that not only mentioned the PACE findings favorably but unfairly criticized smart patients and researchers who had questioned the reported results.

Professor Lloyd co-authored both of these publications with Jos van der Meer, a member of the Dutch branch of the CBT/GET ideological brigades. Many academic papers have five or ten or 15 authors, some of whom might play little or no role in the writing and might not even be fully aware of what is included in a final draft. In the circumstances under consideration here, both men presumably were familiar with what was sent out under their joint byline, no matter which one took the lead in composing the draft.

In the 2012 commentary, Professor van der Meer was listed as the corresponding author; in the 2015 editorial, Professor Lloyd took on that role. I suppose it’s conceivable that, in the midst of our conversation, Professor Lloyd might have forgotten about these public endorsements of the PACE findings, but that seems highly unlikely—to me, at least. After all, my question about PACE didn’t come out of nowhere; he knew in advance that the trial was on our interview agenda.

Let’s review the two publications. The first, called “A controversial consensus,” was a critique of the International Consensus Criteria and appeared in the Journal of Internal Medicine in 2012. (The authors’ overall concerns about these criteria are not the focus of this post.) The second paragraph of the commentary, which recounted the longstanding battles between different factions involved in ME/CFS research and advocacy, included this passage:

This dispute between the various protagonists recently surfaced with the PACE trial published in the Lancet, which provided evidence for effectiveness of elements of cognitive-behavioural therapy (CBT) and graded exercise therapy (GET) for patients with CFS. This publication triggered unscientific and sometimes personal attacks on the researchers in both the scientific literature and via the Internet.

The sentence about “unscientific and sometimes personal attacks on the researchers” cited eight letters published by The Lancet in May 2011, a few months after the release of the PACE trial results. It also cited a paper co-authored by Professor van der Meer and some Dutch colleagues. Since the referenced paper was published in 2006, it is hard to understand how it could have any relationship to these purported “attacks” triggered by the PACE paper published five years later. Of more interest here is the off-hand dismissal of the concerns expressed by the advocates and researchers whose letters were included in the Lancet correspondence about PACE.

It is unclear why Professors van der Meer and Lloyd would characterize these generally cogent, well-argued and scientifically literate letters as “unscientific and sometimes personal attacks.” This inaccurate description raises doubts about their own integrity and scientific credibility, as well as their ability to recognize blatant data manipulation.

These letters identified some of the core problems with PACE. Among other issues, they raised questions about the rampant outcome-switching that resulted in overlaps between the entry and outcome thresholds for the two main measures of self-reported physical function (the SF-36 form) and self-reported fatigue (the Chalder Fatigue Scale)—a bizarre paradox that should have disqualified the PACE papers from ever being published. The letters also criticized the decision by the investigators to not report the findings promised in their protocol, the claim that the trial proved the treatments were not only effective but safe, the fallacious statement by one of the lead investigators about patients who “got back to normal,” the absurd claim in an accompanying commentary that patients met a “strict criterion” for recovery, and so on.

Let’s look at some of these purportedly “unscientific and sometimes personal attacks” from the Lancet correspondents.

Sarah Feehan, on behalf of the Liverpool ME Support Group:

Given…the fact that those with a Chalder fatigue questionnaire Likert score of 18 could still meet the trial’s entry criteria (bimodal score of 6 or more), it would be good if White and colleagues would now recalculate the data using the original definition of “fatigue caseness.”

Andrew James Kewley, from the faculty of science and engineering at Flinders University in Australia:

I am concerned by the change in assessment method between the published results of the PACE trial and the trial protocol…In particular, the protocol stated that those with short-form 36 physical function subscale scores of 65 or less would be deemed ill enough to participate, and that those with scores of 85 or more would be regarded as “recovered.” However, the authors have questionably defined “normal” as a score of 60 or more, based on general population scores which did not exclude those reporting chronic illnesses.

John Mitchell, Jr., a patient:

Much has been made of the “recovery” achieved by some participants in Peter White and colleagues’ PACE trial, one of the authors having stated to the media that “twice as many people on graded exercise therapy and cognitive behaviour therapy got back to normal” and the accompanying Comment stating that, by use of a “strict criterion” for recovery, “the recovery rate of cognitive behaviour therapy and graded exercise therapy was about 30%”…Although the trial protocol does give a strict definition for recovery, this information is omitted from the published paper, which instead refers to physical function and fatigue in the “normal range.” Whether the values given are indicative of normal function is open to question, however.

Tom Kindlon, a representative of the Irish ME/CFS Association:

Peter White and colleagues claim that, if cognitive behaviour therapy and graded exercise therapy are delivered as described, they are “safe” for chronic fatigue syndrome (CFS); the CONSORT statement on harms reporting recommends against such claims…Both cognitive behaviour therapy and graded exercise therapy are designed to increase activity; however, actometers were not used, so one cannot be sure how many patients were actually more active…Analysis of three trials of cognitive behaviour therapy found that activity levels before and after therapy were similar, despite improvements being reported on fatigue and other subjective measures. This finding suggests that patients might simply substitute the activity component of cognitive behaviour therapy for other activities; if this situation occurred in White and colleagues’ study, we would not have information on the effects of actually increasing activity levels.

And so on.

The letters were all well within the bounds of acceptable and appropriate academic discourse. It was preposterous to accuse the letter-writers of mounting unscientific arguments or launching personal attacks. To the contrary, they were pointing out how the PACE investigators engaged in serious violations of key scientific and statistical principles in ways that allowed them to report better results for their five-million-pound trial than would otherwise have been possible. Professors van der Meer and Lloyd were apparently unconcerned that participants could meet outcome thresholds for both primary measures at baseline, which suggests that their adherence to their preferred interpretation of the PACE results or their collegial relationships with the PACE authors impaired their judgement and their ability to weigh the evidence objectively.

Since Professor van der Meer was listed as the corresponding author, it is possible that he wrote the commentary and that Professor Lloyd paid minimal attention to the text. But it is worth noting that, as a follow-up, Professor Lloyd authored a brief mea culpa on his own. In that apology, he retracted the accusation of “unscientific and sometimes personal attacks” in relation to just one of the eight letters he and Professor van der Meer had criticized. Here’s what he wrote:

In our commentary on [a] paper by Broderick et al. regarding diagnostic criteria for CFS, Professor van Der Meer and I inadvertently cited a letter by Stouten et al. in a section of our correspondence suggesting that their response to the publication of the PACE trial for patients with CFS was unscientific and included personal attacks on the authors of that study, as it was interspersed with a series of letters that met those descriptors. However, it is clear that the letter by Stouten et al. raised valid scientific concerns about the trial. We wish to note that this letter should not have been included as an example in our correspondence and apologize to the authors for any distress caused.

While Professor Lloyd should perhaps be commended for having issued any apology at all, it is surprising that he exempted just one of the letters from the accusation. It is certainly true that the Stouten et al. letter raised legitimate and thoughtful concerns about “the discrepancy between the definitions of improvement in the protocol and the paper.” But it was ridiculous to suggest that the other letters did not, and that they deserved the condemnation they received in the commentary. Professors Lloyd and van der Meer owed—and still owe—heartfelt apologies to all the letter-writers they bashed. Moreover, that Professor Lloyd personally wrote this note certainly makes it harder to fathom his recent forgetfulness about his past defense of PACE.

**********

Now let’s review the 2015 editorial in The BMJ, titled “The long wait for a breakthrough in chronic fatigue syndrome: not over yet.” The editorial mentioned the failure of the XMRV retrovirus hypothesis and of research into various biomedical treatments. It also noted that “there is solid evidence from multiple controlled studies that patients can gain control of symptoms and functional improvement through multidisciplinary interventions incorporating graded exercise therapy and cognitive behavioural therapy.”

In discussing this “solid evidence,” the editorial declared the following:

The recent mediation analysis of the outcomes of the PACE trial [a 2015 paper published in Lancet Psychiatry] is of interest. This trial compared standard medical care, cognitive behavioural therapy, graded exercise, and adaptive pacing therapy, concluding that both cognitive behavioural and graded exercise therapy were more effective at reducing fatigue and improving physical disability than standard care or adaptive pacing. The mediation analysis suggested that both cognitive behavioural therapy and graded exercise worked by reducing avoidance of activity.

In another passage, the editorial discussed the PACE recovery paper, published in Psychological Medicine in 2013, while noting that there had been “recent contention” about whether these treatments could lead to a cure:

An analysis of the PACE trial suggested cure was possible, but recovery outcomes were defined post hoc using population norms with generous thresholds (such as the population mean plus one standard deviation for self reported fatigue). This analysis was criticised because of the limited assessments and less than full restoration of health, leading to a recommendation that trials use more accurate outcomes (such as clinically relevant improvement) defined in advance and capturing a broad based return to health with assessments of fatigue and function…Even with the unduly liberal designation of recovery, less than one quarter of patients ‘recovered’ in the PACE trial.

Notably, this statement appeared to offer modest criticism of the PACE methodology for assessing recovery and pointed out that a relatively small percentage of participants were reported to have achieved that status. In one sense, this represented a tiny victory for those who had consistently been questioning the PACE recovery claims. Yet viewed from another perspective, the passage essentially ignored and whitewashed the most egregious deficiencies that marked the 2013 paper.

As its source for the criticism of the Psychological Medicine paper, the BMJ editorial cited a letter to Evidence-Based Mental Health, another journal from the BMJ stable. The letter was written by Tom Kindlon, who wrote one of the supposedly “unscientific” responses to the Lancet paper, and Adrian Baldwin, another of the smart citizen-researchers whose stellar analyses have informed my own. Yet the editorial’s account of this Kindlon-Baldwin letter falls short of the transparency readers have a right to expect from leading journals like The BMJ.

Contrary to the statement by Professors Lloyd and van der Meer, the letter did not just focus on “limited assessments” and the “less than full restoration of health.” It specifically noted—as had the correspondence about the 2011 Lancet paper—that the outcome threshold for self-reported physical function was lower than the threshold for disability required on the same measure for eligibility to enter the trial. Here’s a key passage from the letter:

An entry criterion for the PACE trial was SF-36 PF ≤65 [SF-36 was the self-reported physical function questionnaire] to ensure participants had disabling impairments in physical functioning. We thus believe it was inappropriate that a threshold of SF-36 PF ≥60 was used to define recovery, meaning a participant could be classed as recovered even if their physical functioning deteriorated during the trial!

In calling this threshold “inappropriate,” Kindlon and Baldwin were using understated language to describe this inexplicable feature of the study. In the letter, they explained how this anomaly had happened: the PACE investigators used the wrong statistical method to determine their so-called “normal range” for physical function. The investigators derived this range from a population whose scores on the SF-36 questionnaire for self-reported physical function did not form the symmetrical, bell-shaped curve known as a normal distribution. Instead, the responses were highly skewed toward the healthier end.

When data from a sample fall into a normal distribution, the method for determining a range that includes about 68% of the group is to take the population mean plus/minus one standard deviation—the method referred to as “generous” in the BMJ editorial. But using this method for values that are not normally distributed, as the PACE team did with the population-level SF-36 data, yields a markedly different result. In the case of the purported “normal range” derived for the SF-36, the spread of values was so expansive that the lower boundary fell under the threshold the investigators themselves had designated to indicate disability. They used this same inappropriate statistical method to derive a “normal range” for the self-reported fatigue scale from another population-based sample, even though the values in that sample were also skewed toward the healthy end and therefore did not generate a normal distribution.
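The arithmetic can be sketched with a few lines of code. All the numbers below are illustrative assumptions, not the actual population sample the PACE team used; the point is simply to show how, on a ceiling-skewed 0–100 scale, the mean-minus-one-standard-deviation formula can produce a “normal range” whose lower boundary falls below a disability cutoff:

```python
# Sketch with synthetic data (all figures here are illustrative
# assumptions, not the real population data behind the PACE thresholds).
import numpy as np

rng = np.random.default_rng(0)

# A ceiling-skewed population on a 0-100 scale: a healthy majority
# clustered near the top, plus a minority with chronic illness.
healthy = np.clip(rng.normal(95, 8, 8000), 0, 100)
ill = np.clip(rng.normal(40, 25, 2000), 0, 100)
scores = np.concatenate([healthy, ill])

# The "normal range" formula: population mean minus one standard
# deviation. Under true normality this excludes roughly the lowest 16%.
lower_bound = scores.mean() - scores.std()

print(f"median score:      {np.median(scores):.1f}")
print(f"mean - 1 SD bound: {lower_bound:.1f}")

# On data this skewed, the lower bound of the "normal range" drops
# below a disability cutoff such as SF-36 <= 65, so a participant
# could count as disabled and "within the normal range" at once.
disability_cutoff = 65
print(f"bound below disability cutoff? {lower_bound < disability_cutoff}")
```

With a sample skewed this heavily toward the healthy end, the typical (median) person scores far above the computed lower bound, and the bound itself lands below the hypothetical disability cutoff, reproducing the overlap the letter-writers flagged.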

In fact, the PACE authors themselves knew that they were using a calculation that produced distorted findings. How can we be sure? Because the lead PACE investigator, Peter White, was the senior author of a 2007 paper that included this key information. The paper was called “Is a full recovery possible after cognitive behavioural therapy for chronic fatigue syndrome?” (Not surprisingly, the authors answered that question in the affirmative.)

In that paper, Professor White and his colleagues provided a caveat. They acknowledged that this standard method of determining ranges “assumed a normal distribution of scores” and noted that the formula would yield different results given “a violation of the assumptions of normality.” The paper further acknowledged that the population-based responses on the SF-36 physical function questionnaire were not normally distributed.

As it turns out, Professor van der Meer, Professor Lloyd’s co-author on both the Journal of Internal Medicine commentary and the BMJ editorial, was also one of the co-authors of Professor White’s 2007 paper. Therefore, he presumably knew that the usual statistical approach cited in the editorial was a problematic method of generating an accurate range when population-based responses were not normally distributed. In PACE, this formula therefore led to ranges that were not only “generous,” as the BMJ editorial stated, but unacceptable and in fact absurd, given that the results allowed PACE participants to be “recovered” and “disabled” simultaneously for both the physical function and fatigue measures. Furthermore, the PACE papers did not include the important caveat featured in the 2007 paper—nor did the BMJ editorial.

As the editorial accurately noted, the “recovery” definition in the 2013 paper was post-hoc—that is, it was created after reviewing the data, not before. But as was immediately obvious to patients, the problems went way beyond that. In fact, all four of the recovery criteria included in this post-hoc definition—not just the physical function and fatigue measures—were significantly weakened from what the PACE investigators promised in their protocol. Moreover, they created this new definition without any apparent oversight committee approval, since none was cited in the 2013 paper. This salient detail should also have rendered the paper unpublishable, but it was not mentioned by Professors van der Meer and Lloyd in their editorial.

If patients were able to figure all this out, why couldn’t Professor Lloyd—especially since he has declared himself to be Australia’s “leading light” on the illness? And if Professor Lloyd was aware of these issues, why didn’t he and Professor van der Meer hold the PACE investigators to account for statistical manipulations that generated unreliable claims of “recovery”? We now know, of course, that 13% of the PACE participants were already “recovered” for physical function when they entered the trial. Since these data had long since been released under a freedom of information request, they were certainly available for Professors Lloyd and van der Meer to review before they published their editorial.

**********

Any study including an analysis in which outcome thresholds overlapped with entry criteria should not have seen the light of day—period. The presence of such an analysis should have been a huge red flag to editors, peer reviewers and intelligent readers that something was seriously amiss with the study’s statistical methodology. Anyone who had compared the trial protocol with the results reported in both the Lancet and Psychological Medicine papers would also have noticed that the other changes, too, yielded better reportable results than the investigators had promised to provide. It should not be asking too much for commentators and editorialists to read not only the studies on which they are opining but also the protocols that led to approval for such research in the first place.

That Professor Lloyd failed to notice these flaws—or, if he noticed, that he remained “unfussed” about them—is problematic. That he now suggests he’s unbothered about the entire matter and can’t even remember whether he’d ever cited PACE suggests that he has abdicated responsibility for the bad choice he made in touting the findings in the first place. The Kindlon-Baldwin letter clearly laid out some key issues that rendered the reported PACE results incoherent and nonsensical. The BMJ editorial’s failure to take these issues into account was indefensible—just like the decision by Professors Lloyd and van der Meer to describe the concerns raised in the Lancet correspondence as “unscientific and sometimes personal attacks.”

When I first wrote to Professor Lloyd seeking an interview, I assured him he would have a chance to respond on Virology Blog to any critical statements. He wrote back that he would be unfussed by such criticism and only responded to concerns raised in the scientific literature. Nevertheless, I sent him an e-mail after my post about our conversation, offering him a chance to comment on what I had written. He declined. I then advised him that I would no longer solicit responses from him, but that he could certainly contact me if he changed his mind.

That invitation remains open.