Conclusions Placebo controlled trials are a powerful, feasible way of showing the efficacy of surgical procedures. The risks of adverse effects associated with the placebo are small. In half of the studies, the results provide evidence against continued use of the investigated surgical procedures. Without well designed placebo controlled trials of surgery, ineffective treatment may continue unchallenged.

Results In 39 out of 53 (74%) trials there was improvement in the placebo arm and in 27 (51%) trials the effect of placebo did not differ from that of surgery. In 26 (49%) trials, surgery was superior to placebo but the magnitude of the effect of the surgical intervention over that of the placebo was generally small. Serious adverse events were reported in the placebo arm in 18 trials (34%) and in the surgical arm in 22 trials (41.5%); in four trials authors did not specify in which arm the events occurred. However, in many studies adverse events were unrelated to the intervention or associated with the severity of the condition. The existing placebo controlled trials investigated only less invasive procedures that did not involve laparotomy, thoracotomy, craniotomy, or extensive tissue dissection.

Study selection Randomised clinical trials comparing any surgical intervention with placebo. Surgery was defined as any procedure that both changes the anatomy and requires a skin incision or use of endoscopic techniques.

We assessed whether placebo controls should be used in the evaluation of surgical interventions by systematically reviewing all clinical trials in which the efficacy of surgery was compared with a placebo control.

Placebo controlled randomised clinical trials of surgical interventions are relatively uncommon. 15 16 17 Studies published so far have often led to fierce debates on the ethics, feasibility, and role of placebo in surgery. 18 19 20 21 One reason for the poor uptake is that many surgeons, as well as ethicists, have voiced concerns about the safety of patients in the placebo group. Many of the concerns are based on personal opinion, with little supporting evidence. 18 19 20 21 In the absence of any comprehensive information on the use of placebo controls in surgery, and the lack of evidence for harm or benefit of incorporating a placebo intervention, a systematic review of placebo use in surgical trials is warranted.

In considering any scientific evaluation, it is important to remember that the outcome of a surgical treatment is the cumulative effect of three main elements: the critical surgical element, placebo effects, and non-specific effects. 7 The critical or crucial surgical element is the component of the surgical procedure that is believed to provide the therapeutic effect and is distinct from aspects of the procedure that are diagnostic or required to access the disease being treated. 8 The placebo effects are related to the patients’ expectation and the “meaning of surgery,” whereas the non-specific effects are caused by fluctuations in symptoms, the clinical course of the disease, regression to the mean, report bias, and consequences of taking part in the trial, including interaction with the surgeons, nurses, and medical staff. 7 9 10 It is reasonable to assume that surgery is associated with a placebo effect, 11 12 13 firstly because invasive procedures have a stronger placebo effect than non-invasive ones 12 and secondly because a confident diagnosis and a decisive approach to treatment, typical for surgery, usually result in a strong placebo effect. 14
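The three elements described above can be illustrated with a minimal sketch. It assumes that the elements combine additively, which the text does not state (it says only that the effect is cumulative), and it assumes a hypothetical no-treatment arm that captures non-specific effects only. The class, function, and arm names are illustrative, and the numbers are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutcomeComponents:
    """Hypothetical contributions to improvement, all on the same outcome scale."""
    critical_surgical_element: float
    placebo_effect: float
    non_specific_effects: float


def expected_improvement(arm: str, parts: OutcomeComponents) -> float:
    """Expected improvement in a trial arm under the (assumed) additive model."""
    if arm == "surgery":
        return (parts.critical_surgical_element
                + parts.placebo_effect
                + parts.non_specific_effects)
    if arm == "placebo":
        # The placebo procedure omits the critical surgical element.
        return parts.placebo_effect + parts.non_specific_effects
    if arm == "no_treatment":
        return parts.non_specific_effects
    raise ValueError(f"unknown arm: {arm}")


# Invented values for illustration only; they are not taken from any trial.
parts = OutcomeComponents(critical_surgical_element=2.0,
                          placebo_effect=3.0,
                          non_specific_effects=1.0)

surgery = expected_improvement("surgery", parts)
placebo = expected_improvement("placebo", parts)
untreated = expected_improvement("no_treatment", parts)

# Surgery versus placebo isolates the critical surgical element.
print(surgery - placebo)
# Placebo versus no treatment isolates the placebo effect.
print(placebo - untreated)
```

Under these assumptions, a two arm comparison of surgery with placebo recovers only the contribution of the critical surgical element; separating the placebo effect from non-specific effects would need a third, untreated or observational, arm.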

The increase in the applications for surgical procedures has been driven by a greater involvement of technology in surgical procedures. Such technological advances have made many interventions less invasive, more likely to be endoscopic, and less like typical open surgery, such as laparotomy. However, these new procedures are often introduced into surgical practice without any formal evaluation of safety and efficacy, such as using randomised clinical trials. This is because, unlike for drug products, such verification is currently not mandated by regulatory authorities. 6 Furthermore, there is generally a scarcity of information reported on the surgical learning curves or the iterative development of a new technique. Both existing and innovative surgical practice clearly need to be evaluated, and any evaluative method should take account of the unique idiosyncrasies and challenges presented by surgical interventions.

Modern surgery is changing rapidly. Surgical interventions can now be offered to improve function and quality of life, not just to save life. The improvement in the safety of surgical procedures and anaesthesia has facilitated this change. 1 The mortality associated with anaesthesia has decreased from between 64 and 100 in 100 000 in the 1940s to between 0.4 and 1 in 100 000 at present. 1 2 The prevalence of serious adverse events related to surgical interventions has remained relatively constant over the past 10 years, despite an increase in the number of surgical procedures performed each year. 3 4 The postoperative death rate is between 1.9% and 4% and in most cases is due to the primary disease. 3 5

Owing to the considerable heterogeneity of conditions, interventions, and outcomes it was not feasible to combine the results of individual studies in a meta-analysis. We present a descriptive analysis of the results of each individual study and present data in tables and figures. All analyses were carried out in Stata (version 12.1).

As a measure of harms, we examined serious adverse events and whether they were attributable to the surgical or the placebo intervention. An optimal strategy to identify reports of adverse events does not exist. 24 We defined serious adverse events as harmful events that occurred during the trial, such as prolonged hospital stay, and events that required admission to hospital or resulted in death. We summarised the serious adverse events data using two grading systems: one according to severity (definitely, likely, or unlikely to be serious) and one according to the relation between the event and the procedure (definitely, likely, or unlikely to be related to the procedure). Wherever possible, we used the results from the intention to treat analysis. Most studies did not provide sufficient information about harms to enable a formal statistical analysis.

We assessed the beneficial effect of the surgical intervention on the basis of the original conclusions, as any improvement in the main outcome of the trial and as superiority of the surgical treatment over the placebo (that is, the additional benefits of the critical surgical element). Moreover, we calculated whether the difference between the surgical intervention and placebo was statistically significant, using the information reported in the results section of each study. We calculated the odds ratio for binary outcomes and the effect size for continuous outcomes using the effect estimate from analysis of covariance, the difference in change score, or the difference in postoperative score, depending on the method of analysis and data reported within each individual study. We included only the primary outcome measure, whenever it was explicitly specified. If two primary outcomes were reported, we used both; however, when there were several main outcome measures, we chose those reported in the abstract or those used in other studies, so that the forest plots present similar outcomes. Where necessary, we changed the direction of effect so that the improvement was consistently presented in the same way in the forest plots.
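The two measures named above can be sketched as follows. This is an illustration under stated assumptions, not the review's own analysis, which was run in Stata: the text does not specify the formulas, so the sketch uses a standard odds ratio with a log-scale (Woolf) 95% confidence interval and a standardised mean difference based on the pooled standard deviation. The function names and the `higher_is_better` flag (standing in for the change of direction of effect) are hypothetical, and the example numbers are invented.

```python
import math


def odds_ratio(events_surgery, n_surgery, events_placebo, n_placebo, z=1.96):
    """Odds ratio (surgery vs placebo) with a Woolf confidence interval."""
    a, b = events_surgery, n_surgery - events_surgery
    c, d = events_placebo, n_placebo - events_placebo
    if min(a, b, c, d) == 0:
        # Assumption: add 0.5 to every cell when any cell is empty
        # (a common continuity correction), so the ratio stays defined.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))


def effect_size(mean_surgery, sd_surgery, n_surgery,
                mean_placebo, sd_placebo, n_placebo,
                higher_is_better=True):
    """Standardised mean difference using the pooled standard deviation."""
    pooled_var = ((n_surgery - 1) * sd_surgery ** 2
                  + (n_placebo - 1) * sd_placebo ** 2) / (n_surgery + n_placebo - 2)
    d = (mean_surgery - mean_placebo) / math.sqrt(pooled_var)
    # Flip the sign for scales where a lower score means improvement,
    # so that positive values always favour surgery.
    return d if higher_is_better else -d


# Invented numbers for illustration only; they are not data from any trial.
or_estimate, ci_low, ci_high = odds_ratio(20, 40, 10, 40)
d_pain = effect_size(30.0, 10.0, 30, 35.0, 10.0, 30, higher_is_better=False)
print(round(or_estimate, 2), round(ci_low, 2), round(ci_high, 2))
print(round(d_pain, 2))
```

Which input feeds `effect_size` (postoperative score, change score, or an adjusted estimate) would depend on what each study reported, as described above.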

The three review authors also independently assessed the risk of bias in the included studies using the risk of bias tool criteria recommended by the Cochrane guidelines. 22 23 In particular we assessed the method of random sequence generation; concealment of treatment allocation; blinding of participants, care providers, and assessors; success of blinding; and use of intention to treat analysis.

We used a standardised data extraction form to collect information about the characteristics of each study as well as the clinical improvement and superiority of the surgical intervention compared with the placebo for the main study outcome, as reported by the authors in the published article. For each study we extracted the year of publication; study population; condition; intervention; outcomes; sample size; number of participants; number of events, as well as mean and standard deviation for continuous outcomes; and serious adverse events and whether they were related to the procedure. To reduce the chance of errors, the three review authors extracted data separately, checked the entries for consistency, and agreed on a single set of data.

If several articles reported outcomes from a single trial (that is, with the same authors, location, patient population, and recruitment dates), we only included the article reporting the main outcome for the trial.

Three reviewers (KW, IR, BJFD) independently screened the initial set of records identified from the search and then screened the full text of any potentially relevant articles. Each reviewer independently assessed the eligibility of each study, and the final list of included studies was agreed by consensus.

We developed search strategies for three electronic databases: Medline (Ovid), Embase (Ovid), and the Cochrane Central Register of Controlled Trials. We searched the databases from the date of their inception to 14 November 2013, with no restriction on language. (See supplementary appendix 1 for details of the search terms.) We did not systematically search for studies reported only as conference abstracts.

We performed a systematic review adhering to published guidance from the Cochrane Collaboration. 22 Studies were eligible if they were randomised clinical trials in which the efficacy of surgery was compared with placebo. We defined surgery as any interventional procedure that changes the anatomy and requires a skin incision or the use of endoscopic techniques; dental studies were excluded. We used the term placebo to refer to a surgical placebo, a sham surgery, or an imitation procedure intended to mimic the active intervention, including the scenario in which a scope was inserted and nothing was done but patients were sedated or under general anaesthesia and could not distinguish whether or not they underwent the actual surgery. We did not limit the inclusion criteria to any particular condition, patient group, intervention, or type of outcome. We excluded studies investigating anaesthesia or other drug substances used perioperatively.

The only adverse event reported as related to anaesthesia was in a trial by Schwartz and colleagues, 56 in which one patient in the placebo group had a bruise as a result of a misplaced intravenous line during sedation.

Of the 27 trials in which serious adverse events occurred, 17 trials reported events that were related or likely to be related to the procedure in the surgical group (table 1). Most of the studies did not specify whether the serious adverse events were directly or potentially associated with the critical surgical element or with other elements of the surgical intervention. Complications in the placebo group related or likely to be related to some element of the procedure were reported in nine studies. Harms definitely directly related to the surgical placebo were reported in two studies, and included infections 64 as well as complications related to the device and the investigated condition itself; 37 both trials were stopped early because of concerns about safety. In one of the trials on gastrointestinal bleeding, aspiration occurred in two patients in the placebo group that could have been related to the procedure as well as to the condition. 31

The interventions in the placebo arm were overall associated with fewer serious adverse events than the active arm, as the main surgical element was omitted and the authors made an effort to minimise risks by withholding part of the intervention, for example, using partial burr holes rather than full trepanation 40 or not administering heparin. 39 Often the type of serious adverse events in the placebo group depended on the severity of the investigated condition and the invasiveness of the chosen procedure. Moreover, serious adverse events in the placebo group were more likely if the procedures involved exogenous material; out of 13 trials using implanted exogenous substances, materials, or tissue, eight reported serious adverse events. 28 37 38 39 40 62 63 64

In just under half of the trials (n=23/53; 43%) the authors stated that there were no serious adverse events, although sometimes they reported minor adverse events (table 2). Three out of 53 studies (6%) did not report any information about adverse events. In the remaining 27 studies (51%), serious adverse events occurred in at least one of the study arms (table 1). In 17 of these 27 studies, serious adverse events were observed in both the surgical and the placebo arms, in five studies serious adverse events were only present in the active group, whereas in four studies the authors did not specify in which group the serious adverse events occurred. Not all serious adverse events were related to the procedure. For example, in six trials on gastrointestinal bleeding, deaths, rebleeds, or continuous bleeding were the main outcome of the study and were a result of the investigated condition and of the procedure being ineffective rather than it being harmful. In only two of these trials, adverse events, such as a perforation, were directly related to the intervention. In several trials, adverse events were rare (<5% of patients) or were unrelated to the procedure—for example, death from other causes. In general, the placebo arm was reported to be safer and adverse events were more serious and more common in the active group.

In 12 studies, the surgical intervention was significantly superior to placebo (tables 1 and 2 and figs 2 and 3). 28 29 30 33 34 36 44 45 46 47 48 49 In 11 studies, the effect of the surgical intervention was significantly better for only some of the reported primary outcome measures. 31 32 50 51 52 53 54 55 56 57 58 In 24 studies (45%), there was no statistically significant benefit of the active surgical intervention over the placebo (tables 1 and 2 and figs 2 and 3). In six studies we could not calculate a statistical estimate of the clinical benefit because the results were reported as median values and could not be used to calculate an effect size. 27 35 42 59 60 61

Fig 3 Forest plot of studies with continuous outcome measures showing magnitude of effect (effect sizes) in active group compared with placebo group. Outcome values in the trial by Stone and colleagues were not normally distributed; therefore, the effect size does not represent the true difference. WOMAC=Western Ontario and McMaster Universities arthritis index; KSPS=knee specific pain scale; QoL=quality of life; BMI=body mass index; GERD-HRQL=gastro-oesophageal reflux disease health related quality of life; NRS=numerical rating scale; RF=radiofrequency; SF-36=short form (36) health survey; CSF=cerebrospinal fluid; UPDRS=unified Parkinson’s disease rating scale; EQ5D=EuroQol Group health questionnaire; RMS=modified Roland-Morris scale; ODI=Oswestry disability index; ESS=Epworth sleepiness scale; MDRS=Mattis dementia rating scale

Overall, the magnitude of the treatment effect when the active surgical intervention group was compared with the placebo group was small but generally favoured the surgical treatment. The forest plots in figures 2 and 3 present the effect sizes and odds ratios for individual trials.

In half of the included studies (n=26; 49%) the authors reported superiority of the surgical procedure over the placebo intervention, and in the remaining trials (n=27; 51%) the effect of the active surgical procedure was not statistically different from that of the placebo intervention (table 3).

In around three quarters (n=39; 74%) of the studies the authors reported an improvement in both the surgical group and the placebo group (table 3). In a further seven trials, 30 31 32 33 34 35 36 the clinical improvement was observed in the surgical group but not in the placebo group; however, in five of these studies, 30 31 32 33 36 the outcome measures were objective and did not depend on patients’ ratings. Finally, in six studies no improvement was reported in either group, 37 38 39 40 41 42 and one study could not be interpreted in terms of improvement as the main outcome was failure of treatment defined as new or continued bleeding. 43

Overall, 39 of the 53 (74%) included studies were published after 2000. Most of the trials investigated minor and not directly life threatening conditions, such as severe obesity (n=7; 13%) or gastro-oesophageal reflux (n=6; 11%). The most common type of intervention was endoscopy, with 23 trials (43%) using this technique as a part of the investigated procedure. Thirteen trials (25%) used some exogenous material, implant, or tissue, and a further six used balloons. Most studies reported subjective outcomes such as pain (n=13; 25%), improvement in symptoms or function (n=17; 32%), or quality of life (n=8; 15%). Fewer than half of the trials (n=22; 42%) reported an objective primary outcome, that is, a measure that did not depend on the judgment of patients or assessors. The majority of trials were small; the number of randomised participants ranged between 10 and 298, with a median of 60. No placebo controlled surgical trials investigating more invasive surgical procedures such as laparotomy, thoracotomy, craniotomy, or extensive tissue dissection were identified. Tables 1 and 2 list the characteristics of each trial.

All included studies were assessed for risk of bias (see supplementary appendix 2). The risk of bias related to sequence generation, concealment of treatment allocation, and blinding during the procedures was generally low. In some trials the measures undertaken to maintain blinding were not explicitly described; however, these trials used endoscopic techniques in patients under general anaesthesia or sedation. Patients and assessors were blinded in almost all trials; in the trials in which assessors were not blinded, the outcomes were objective. Only 12/53 (23%) trials assessed whether the blinding was successful, including one in which the patients were likely to guess the correct allocation. 60

A keyword search of the ClinicalTrials.gov database identified 14 relevant trials, including two studies already found using the electronic search 25 26 and 12 studies that were not yet available as full text articles in November 2013.

The search of Medline, Embase, and Cochrane databases identified a total of 4543 records. We found an additional 23 articles using a hand search of relevant literature and the references of the articles identified using the database search. Out of these 4566 records, 1597 duplicates were discarded, leaving 2969 records; of these, 2860 did not meet inclusion criteria, and a further 56 studies were excluded after reviewing the full text of the article. Among them were 12 articles reporting additional outcomes of a trial, seven follow-up papers, and seven studies that reported results in a way that made the comparison between the surgical and placebo arm impossible. This resulted in 53 full text articles that met the inclusion criteria and were included in this review. Figure 1 details the study identification and reasons for exclusion.
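The counts in the study flow above can be reconciled with simple arithmetic. The short check below uses only the numbers quoted in the text; the variable names are illustrative, and the intermediate figure of 109 full text articles is derived here rather than quoted from the article. It also assumes that the 2860 exclusions happened at the screening stage, before full text review.

```python
# Counts as reported in the text above.
database_records = 4543
hand_search_records = 23
duplicates = 1597
excluded_on_screening = 2860   # assumed to be the pre-full-text screening stage
excluded_on_full_text = 56

identified = database_records + hand_search_records
after_deduplication = identified - duplicates
full_text_assessed = after_deduplication - excluded_on_screening  # derived, not quoted
included = full_text_assessed - excluded_on_full_text

# The totals stated in the text should be reproduced exactly.
assert identified == 4566
assert after_deduplication == 2969
assert included == 53

print(f"identified={identified}, after deduplication={after_deduplication}, "
      f"full text assessed={full_text_assessed}, included={included}")
```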

Discussion

Surgical randomised clinical trials incorporating a placebo arm are rare, but this review shows that the results of many of these trials provide clear evidence against continued use of the investigated surgical procedures. In well designed studies the risks of adverse effects are small and the placebo arm is safer than surgery. The identified surgical randomised clinical trials were heterogeneous. The existing placebo controlled trials investigated only less invasive procedures that did not involve laparotomy, thoracotomy, craniotomy, or extensive tissue dissection. About half of the reviewed trials showed superiority of the surgical procedure over the placebo intervention, but the magnitude of the effect directly related to the crucial surgical element was generally small. The majority of the trials showed an improvement in the surgical group as well as in the placebo group, which suggests that some surgical procedures may have a placebo effect and that some of the benefits of surgery are related to factors other than the crucial surgical element.

Serious adverse events were reported in half of the reviewed trials, and the severity of possible serious adverse events was often related to the seriousness of the investigated condition and the invasiveness of the chosen procedure. Generally, the incidence of complications in the placebo group was lower than that in the surgical arm.

Strengths and weaknesses of this study

Modern surgery involves not only open surgery but also minimally invasive procedures, implants, and transplants; therefore, the boundary between surgery and other medical procedures was not always clear. In addition, identifying unique studies was not always straightforward (that is, differentiating between papers reporting two similar trials or different outcomes of the same trial). Many trials reported several outcomes, often without identifying the primary measure. As a consequence we had to report only the outcomes for which the study had been powered, those reported in the abstract, or those used in other similar studies. The lack of a clear primary outcome also meant that we could not report a single primary outcome per study.

The available data only allowed comparison of effects in the surgical arm versus the placebo arm and did not allow for any estimation of the magnitude of the placebo effect. Interestingly, the effects for surgery versus placebo were smaller than expected. Only one study included an observational group.56 We specifically chose not to include surgical trials with waiting list controls. Analysis of the placebo effect without controlling for non-specific effects of being in the trial would be flawed, as was pointed out in the comments66 67 in the original paper on placebo effects in surgery by Beecher.68

The quality of the included studies varied. Many studies were small and used several primary outcome measures or cumulative measures or used a subgroup analysis without comparisons between groups. Reporting of adverse events was poor and many authors did not specify the primary cause of the complications or in which group they occurred. In the recent trials, the adverse events were described in more detail, but further improvement is required.69 The standardised way of reporting harms was published only in 2004.70

Possible benefits and harms and implications for clinicians and policy makers

The main benefit and value of placebo controlled randomised clinical trials is their ability to show the real efficacy of a surgical intervention. If a procedure is effective and superior to placebo, the case is strong for it to be commissioned and funded. The opposite is also true; if a surgical procedure has no benefit over placebo the argument is strong for stopping its use. The well designed placebo controlled trial of surgery is a useful tool to challenge the continued commissioning and use of ineffective treatments. The trial participants usually gain direct benefits, the main one being the perceived improvement attributable to the intervention. Whether this is acceptable to doctors and patients is the subject of an ongoing debate.71 72 Indirect benefits may include confirming or disproving the primary diagnosis, thereby allowing patients to be referred for a more appropriate treatment. A reduced waiting time or receiving treatment free of charge (for treatment that is ordinarily paid for by patients) is a further potential advantage of taking part in a trial. In the modern clinical setting the distinction between research and treatment is less clear and the two are often integrated. The balance between benefits and overall risk to a patient in a trial and in standard clinical care is usually not much different, as both the clinical outcomes and the additional risks and inconveniences caused by additional assessments are not great.73 74 The negative consequences for the trial patients may be considered mainly in terms of harms. In most of the trials, the investigated conditions were non-life threatening and the main aim of surgery was to improve function, symptoms, and quality of life, to reduce pain, or to remove the need for drug use.
In these trials the harms were either lack of improvement or complications arising from the procedure, such as perforation after endoscopy. In the trials on gastrointestinal bleeding, the investigated condition was potentially life threatening and the procedure was not elective. In these six trials the negative events, such as bleeding and deaths, were the primary outcome and an indicator that the intervention was ineffective. Placebo controlled surgical trials raise important ethical concerns18 19 20 but are justified when there is a genuine equipoise19; that is, a disagreement in the medical community about whether one treatment is superior to another, because standard treatment does not exist or its efficacy is questioned. Such trials conform to the ethical principles of non-maleficence—that is, the duty not to inflict expected harm, and justice.21 Surgical intervention may be associated with greater risk than drug intervention; therefore, to be justified it must be associated with greater “pure” benefit to outweigh such risks. Surgical trials are justified by equipoise not only when there is uncertainty about pure benefit, but also when there is uncertainty about whether benefit outweighs harms. If a standard medical treatment is available, it may be offered as a part of the study, as in the trials on tissue transplantation in Parkinson’s disease, in which the participants continued their L-dopa drug. In addition to that, if one of the treatments in the trial becomes recognised as effective or a new treatment becomes available during the conduct of the trial, equipoise is disturbed and ethically the study should be stopped, as was the case in one trial61 where the intervention was accepted by the insurer as a standard procedure and the trial was terminated early, before it could show the superiority of the active procedure. Another ethical concern is deception and risk to the trust between patients and surgeons. 
In the majority of studies and always in the recent trials, patients were informed about taking part in a placebo controlled trial, and informed consent was obtained. In the study by Moseley and colleagues, participants were also asked to write a clause in their notes acknowledging the fact that they might receive placebo intervention and that placebo might not be effective.75 If patients are fully informed and give proper consent, then from the ethical point of view the conditions of free consent and autonomy are fulfilled. Surgery of any form, including placebo surgery, is associated with some level of risk, whereas a placebo tablet or drug is not. Therefore the balance between risks and benefits in surgical placebo controlled randomised clinical trials is different from that in drug trials. In most of the studies that reported serious adverse events, such complications were expected in the investigated conditions; even in the study on Alzheimer’s disease, mortality was comparable with other trials.37 What matters most is that risk is minimised and that actual harm is as small as possible.76 77 78 In the reviewed trials, the placebo arm was usually designed to pose as little risk to the participants as possible and to be significantly less risky than the active surgical procedure. In a situation where there is no certainty that the surgery is effective, the balance between risks and benefits within a study actually may be better in the placebo arm. For example, in the trials on fetal nigral tissue transplantation for Parkinson’s disease, the active treatment was no better than placebo in terms of outcome and resulted in more severe side effects, such as dystonias and dyskinesias. Moreover, the risk of infection and other complications associated with the actual tissue implantation was much higher than the risk in the placebo group. 
Interestingly, in the study by Freed and colleagues,38 patients who had been in the placebo group still opted for the surgical procedure after the trial, despite the fact that no clear benefit of fetal nigral tissue transplantation had been shown. This may be because patients believe that invasive,12 new,79 and expensive80 procedures are actually more effective.

Effects of existing trials on change in practice

The results of the placebo controlled surgical trials performed so far have had a varied impact on clinical care. With the exception of the trials on debridement for osteoarthritis75 or internal mammary artery ligation for angina,81 most of the trials did not result in a major change in practice. Moseley’s study on debridement for knee osteoarthritis was well received and resulted in limiting the recommendations for debridement and lavage for osteoarthritis of the knee only to patients with clear mechanical symptoms such as locking.82 The reaction of the medical community to the trials on tissue transplantation to treat Parkinson’s disease was also favourable. Although this treatment is not currently recommended,40 the need for more studies on mechanisms of disease and on tissue transplantation is recognised.83 84 These studies also provoked a discussion about ethical aspects of randomised clinical trials and placebo.18 19 21 85 In contrast, the results of the trials on the efficacy of vertebroplasty62 64 were challenged and their authors were criticised for undermining the evidence supporting this commonly used procedure. The critics acknowledged that the injection of cement might be associated with many side effects, some of them potentially dangerous, but they argued that the treatment was justified because earlier unblinded trials had shown superiority of vertebroplasty over medical treatment.62 64 86 This argument against the validity of placebo controlled trials neglected any potential placebo effect of vertebroplasty.