The PACE trial of cognitive behavioural therapy and graded exercise therapy for chronic fatigue syndrome/myalgic encephalomyelitis has raised serious questions about research methodology. An editorial article by Geraghty gives a fair account of the problems involved, if anything understating the case. The response by White et al. fails to address the key design flaw, of an unblinded study with subjective outcome measures, apparently demonstrating a lack of understanding of basic trial design requirements. The failure of the academic community to recognise the weakness of trials of this type suggests that a major overhaul of quality control is needed.

Some years ago I was asked to advise on research strategies for chronic fatigue syndrome (CFS)/myalgic encephalomyelitis (ME), on the grounds of having expertise that might be relevant, although I never practised in the field. I was introduced to the PACE study in 2014 by a presentation by Peter White that consisted of a cursory showing of one or two data images intended to assure the audience of robust evidence for efficacy of cognitive behavioural therapy (CBT) and graded exercise therapy (GET), followed by an extended series of unsupported statements directed at patient advocates who were accused of ‘attacking science’ by raising criticisms of the trial. Subsequent interaction with the patient community made it clear to me that the advocates’ criticisms were, if anything, over-lenient and that if there was any threat to science it came from the poor quality of the study itself.

Debate about PACE has often focused on detail. Yet the trial has a central flaw that can be lost sight of: it is an unblinded trial with subjective outcome measures. That makes it a non-starter in the eyes of any physician or clinical pharmacologist familiar with problems of systematic bias in trial execution. In their article responding to a recent critical editorial by Geraghty (2016), White et al. (2017) write that ‘[Geraghty] has not said which [scientific] procedures and standards we neglected or bypassed’. In fact, Geraghty (2016) itemises these in detail. However, it is true that, perhaps because it seems too obvious, he does not spell out the central problem in full – the combination of lack of blinding and subjective outcome measures.

There is no way of addressing this flaw. The defence that the trial was peer reviewed by the Medical Research Council is no argument; it appears just to indicate that ignorance of methodological principles is widespread in the British medical establishment (not to mention the editor of a high-profile journal). Nevertheless, I have heard two arguments raised by the PACE authors and their colleagues (personal communications), which are worth at least mentioning.

First, it is argued that there are many good unblinded trials (surgery in oncology for example) and many good trials with subjective endpoints (drug studies in rheumatoid disease). That is undisputed, but misses the key point that there are essentially no good trials that have both features. Blinding is introduced specifically to address the potential for bias in the use of subjective outcome measures. It is not needed for objective outcomes and objective outcomes are not needed for fully blinded trials. It is hard to credit that anyone could miss this point, but if they do, it would at least explain how trials come to be designed without taking it into account.
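The interaction between the two features can be illustrated with a toy simulation. What follows is a minimal sketch in Python, not a model of PACE: the function name `simulate_trial`, the size of the `reporting_bias` shift and the noise levels are all hypothetical values chosen for illustration. The treatment is given no true effect at all, and the only thing that varies is whether participants know which arm they are in.

```python
import random
import statistics


def simulate_trial(n_per_arm, true_effect, reporting_bias, blinded, seed=0):
    """Return (subjective_diff, objective_diff): treatment mean minus control mean.

    All parameters are hypothetical illustration values, not trial data.
    """
    rng = random.Random(seed)

    def arm(is_treatment):
        subjective, objective = [], []
        for _ in range(n_per_arm):
            # Underlying change in health, shared by both kinds of measure.
            change = rng.gauss(true_effect if is_treatment else 0.0, 1.0)
            # The objective measure sees the change plus measurement noise.
            objective.append(change + rng.gauss(0.0, 0.5))
            # Self-report is shifted by expectation only when the participant
            # knows they received the favoured treatment (unblinded).
            shift = reporting_bias if (is_treatment and not blinded) else 0.0
            subjective.append(change + shift + rng.gauss(0.0, 0.5))
        return subjective, objective

    t_subj, t_obj = arm(True)
    c_subj, c_obj = arm(False)
    return (statistics.mean(t_subj) - statistics.mean(c_subj),
            statistics.mean(t_obj) - statistics.mean(c_obj))


if __name__ == "__main__":
    for blinded in (True, False):
        subj, obj = simulate_trial(2000, true_effect=0.0,
                                   reporting_bias=0.5, blinded=blinded)
        print(f"blinded={blinded}: subjective diff={subj:.2f}, "
              f"objective diff={obj:.2f}")
```

Under these assumptions, three of the four combinations should report a difference close to zero, as they should for an ineffective treatment. Only the unblinded arm scored on the subjective measure should show an apparent benefit, and it does so purely because of the assumed reporting shift.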

Second, it has been suggested that if practical issues make robust methodology hard to set up then weaker methodology has to be used. That will sometimes be so. But it makes no sense to say that if you cannot work out how to do a reliable study then an unreliable study can be taken as reliable.

Apart from the apparent lack of understanding of trial design, the irony is that what appears least understood by the defenders of PACE is that its problems stem from what one might call human nature, or in jargon terms ‘psychology’. If, as White et al. claim, the PACE team had done all in their power to minimise systematic bias due to human nature – loaded beliefs or motivations – this might have had some mitigating force. However, in contrast, as Geraghty indicates, material likely to lead to such bias, including the instruction manuals for patients and therapists and a subsequent newsletter, emphasising which treatments were expected to do best, seems to have been laid on with a trowel.

Reliable assessment of therapies delivered by dedicated therapists presents a serious methodological challenge. In rheumatology, the problem became familiar with physiotherapy techniques and joint protection programmes from occupational therapists. In the end, pretty much all evidence from studies of these modalities was discarded as unreliable. The central problem is that it is very difficult to find therapists who have no prior commitment to the validity of certain techniques. White et al. argue that it would be inappropriate for trials to be performed by disinterested parties. Geraghty’s suggestion may be impractical, but I do not see it as misguided. White et al. argue that ‘The clinicians amongst us have dedicated their careers to care for thousands of patients with CFS/ME and we always want the best for them’. It is precisely this sort of emotionally laden justification of ‘those of us who know best’ that needs to be removed from trial design. The way that human nature creeps into the research environment is something all too well known to physicians and pharmacologists. It seems strange that it should be unfamiliar in psychological medicine.

Another peculiar line of argument has been used to justify the claim that bias would not have been a problem in PACE. It has been claimed that there tends to be no significant placebo effect in CFS/ME; at the same time, it is pointed out that CBT operates through essentially the same mechanism as a placebo effect (Knoop et al., 2007) – by encouraging the patient to take a positive view of their progress through modifying perceptions rather than pharmacological means. The two premises would be compatible if the PACE trial had yielded a negative result for CBT. But if it is claimed that CBT was effective, then it is hard to maintain the first premise in the face of the second.

The problem highlighted here is that we have no real way of knowing what aspect of the modality called ‘CBT’ is responsible for any improvement, if indeed reported improvement reflects more than just a desire to meet a therapist’s expectations. In pharmacology, some form of quantitation is normally considered essential before evidence is considered reliable. This is often a dose–response curve, but there are other options. PACE provides nothing of this kind.

Moreover, the ‘control’ group does not meet reasonable criteria for an adequate control, which would require replicating all contextual aspects of treatment that might have a non-specific effect on reporting of outcome. The standard medical care arm apparently had no equivalent contact with professionals (White et al., 2011). Again, the PACE authors have failed to take the opportunity to mitigate the central flaw in the trial. In short, the trial was set up in such a way that the default assumption would be that systematic bias due to the usual factors associated with subjective outcomes in an unblinded setting would be operating full tilt. It would be quite surprising if the treatments advertised as best had not led to a better reported outcome.

It may be that it is easier for those of us involved in pharmacological interventions to recognise extraneous psychological influences on trial outcomes as extraneous. Systematic bias is rife within science wherever there is leeway in analysis of outcome. The scale and subtlety of the problem was brought home to me by a paper on the putative paranormal phenomenon of ‘the [non-visual] sense of being stared at’ (Radin, 2005). An inverted funnel plot of a set of studies of this effect subject to meta-analysis showed evidence for systematic bias towards a positive result, a familiar finding. More interestingly, the results were too consistently only slightly positive. If all studies were tracking the same effect, more of them should have been more positive, due to noise. The suspicion must be that more dramatic ‘effects’ were not reported since they might appear ‘too big’ and therefore implausible! Bias is not just common, but also often finely tuned, even if unwittingly. Judging from my own experience of both laboratory and clinical research, Murphy’s Law applies. Whenever bias can creep in it will. The only solution is to design it out of the study from the outset.
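The reasoning about results being 'too consistent' can be made concrete with a small sketch. The Python below is not a reconstruction of the meta-analysis cited above; it assumes a hypothetical set of studies of varying size and a hypothetical reporting filter (`keep`) that lets through only slightly positive estimates. It uses Cochran's Q, a standard heterogeneity statistic, whose expected value is roughly the number of studies minus one when all studies track the same effect and differ only by sampling noise.

```python
import random


def cochran_q(effects, std_errors):
    """Cochran's Q: weighted squared deviation of study effects from the pooled mean."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def simulate_studies(k, true_effect, rng, keep=lambda y: True):
    """Draw studies until k of them pass the hypothetical reporting filter `keep`."""
    effects, std_errors = [], []
    while len(effects) < k:
        n = rng.randint(20, 200)        # hypothetical study size
        se = 1.0 / n ** 0.5             # standard error of a mean with unit SD
        y = rng.gauss(true_effect, se)  # observed effect = truth + sampling noise
        if keep(y):
            effects.append(y)
            std_errors.append(se)
    return effects, std_errors


if __name__ == "__main__":
    rng = random.Random(1)
    k = 30
    # Every study reported, whatever it found.
    honest = simulate_studies(k, 0.05, rng)
    # Only 'plausibly small' positive results reported (assumed filter).
    tuned = simulate_studies(k, 0.05, rng, keep=lambda y: 0.0 < y < 0.1)
    print("expected Q under pure sampling noise is about", k - 1)
    print("Q, all studies reported:", round(cochran_q(*honest), 1))
    print("Q, filtered reporting:  ", round(cochran_q(*tuned), 1))
```

A Q far below the number of studies minus one indicates less scatter than sampling noise alone should produce, which is the signature described above: small studies in particular ought to throw up some large estimates by chance, and their absence suggests selection.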

More detailed criticisms of PACE in terms of shifting of recruitment and outcome criteria and implausible criteria for recovery have been covered by Geraghty, Tuller, Matthees, Wilshire, Kindlon, Rehmeyer and others (Geraghty, 2016; Rehmeyer, 2016; Tuller, 2015; Wilshire et al., 2016). As indicated, I see these problems chiefly in terms of failed opportunities to mitigate the basic design flaw. However, I think the claim that the effects of CBT and GET were maintained at two and a half years (Sharpe et al., 2015) is worth challenging again because it is not what any reasonably intelligent person would conclude. If there is no longer a difference in the level of improvement between treatment groups, then a preferential causal influence of one therapy or another cannot be claimed to be ‘maintained’. It is conceivable that exposure of other patients to CBT allowed them to catch up, but there is no way that this can be used to shore up evidence that is otherwise entirely negative.

I think it is a matter of concern that White et al. (2017) reject out of hand the possibility that the ‘actions [of the PACE authors] have arguably caused distress to patients’. They have. Distress is very evident among the patient community, as much as anything in terms of the insult to their intelligence made by insisting that seriously flawed research is in their interest. I have no doubt that most CFS/ME patients in the United Kingdom would want to campaign to preserve services, but it seems disingenuous to suggest that this is because they want more CBT and GET. If they are still ill, presumably these approaches have failed and the priority is to find something more effective.

I find it particularly disappointing that at the end of White et al.’s response there is an uncalled-for innuendo that somehow, in writing his editorial, Geraghty might be inhibiting future high-quality research. I think Geraghty is to be congratulated for voicing a reasonable opinion with the admirable aim of inhibiting poor research and calling for something properly grounded. What the patients want most is confidence in the level of research and that will only come when the poverty of past attempts is fully appreciated.

White et al. conclude that they stand firmly by the findings of the PACE trial, presumably because of their inability to understand its basic flaws. As has been suggested by others, the flaws are so egregious that the trial would serve well in an undergraduate textbook as an object lesson in how not to design a trial. Its flaws may have only been widely appreciated recently simply because those involved in trial design in other disciplines were unaware of its existence. Now that they are aware, there appears to be near unanimity. The patients have been aware of the problems for several years, and all credit to them for their detailed analyses. In my experience, most of the people with a deep understanding of the scientific questions associated with CFS/ME are patients or carers. To suggest that when these people voice their opinions they are doing a disservice to their peers seems to me inexcusable.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship and/or publication of this article.