Acupuncture study on colicky kids shows the opposite of what the authors conclude

This new RCT was embargoed until today, so I had to wait before publishing my comments. Here are the essentials of the study:

The Swedish investigators compared the effect of two types of acupuncture versus no acupuncture in infants with colic in public child health centres (CHCs). The study was designed as a multicentre, randomised controlled, single-blind, three-armed trial (ACU-COL) comparing two styles of acupuncture with no acupuncture, as an adjunct to standard care. Among 426 infants whose parents sought help for colic and registered their child’s fussing/crying in a diary, 157 fulfilled the criteria for colic and 147 started the intervention.

Parallel to usual care, study participants visited the study CHC twice a week for 2 weeks. Thus, all infants received usual care plus 4 extra visits to a CHC, during which parents met a nurse for 20–30 min and were able to discuss their infant’s symptoms. Together these were considered to represent gold standard care. The nurse listened and gave evidence-based advice and calming reassurance. Breastfeeding mothers were encouraged to continue breastfeeding. At each visit, the study nurse carried the infant to a separate treatment room where the infant was left alone with the acupuncturist for 5 min.

The acupuncturist treated the baby according to group allocation and recorded the treatment procedures and any adverse events. Disposable stainless steel 0.20×13 mm Vinco needles (Helio, Jiangsu Province, China) were used. Infants allocated to group A received standardised minimal acupuncture (MA) at LI4. One needle was inserted to a depth of approximately 3 mm unilaterally for 2–5 s and then withdrawn without stimulation. Infants allocated to group B received semi-standardised individualised acupuncture, mimicking clinical traditional Chinese medicine (TCM) practice. Following a manual, the acupuncturists were able to choose one point, or any combination of Sifeng, LI4 and ST36, depending on the infant’s symptoms, as reported in the diary. A maximum of five insertions were allowed per treatment. Needling at Sifeng consisted of 4 insertions, each to a depth of approximately 1 mm for 1 s. At LI4 and ST36, needles were inserted to a depth of approximately 3 mm, uni- or bilaterally. Needles could be retained for 30 seconds. De qi was not sought; stimulation was therefore similarly minimal in groups A and B. Infants in group C spent 5 min alone with the acupuncturist without receiving acupuncture.

The effect of the two types of acupuncture was similar and both were superior to gold standard care alone. Relative to baseline, there was a greater relative reduction in time spent crying and colicky crying by the second intervention week (p=0.050) and follow-up period (p=0.031), respectively, in infants receiving either type of acupuncture. More infants receiving acupuncture cried <3 hours/day, and thereby no longer fulfilled criteria for colic, in the first (p=0.040) and second (p=0.006) intervention weeks. No serious adverse events were reported.

The authors concluded that acupuncture appears to reduce crying in infants with colic safely.

Notice that the investigators are cautious and state in the abstract that “acupuncture appears to reduce crying…” Their conclusions in the article itself are, however, quite different; there they state the following:

Among those initially experiencing excessive infant crying, the majority of parents reported normal values once the infant’s crying had been evaluated in a diary and a diet free of cow’s milk had been introduced. Therefore, objective measurement of crying and exclusion of cow’s milk protein are recommended as first steps, to avoid unnecessary treatment. For those infants that continue to cry >3 hours/day, acupuncture may be an effective treatment option. The two styles of MA tested in ACU-COL had similar effects; both reduced crying in infants with colic and had no serious side effects. However, there is a need for further research to find the optimal needling locations, stimulation and treatment intervals.

Such phraseology is much more assertive and seems to assume acupuncture caused specific therapeutic effects. Yet, I think, this assumption is not warranted.

In fact, I believe the study shows almost the opposite of what the authors conclude. Both minimal and TCM acupuncture seemed to reduce the symptoms of colic compared to no acupuncture at all. I think this confirms previous research showing that acupuncture is a ‘theatrical placebo’. The study was designed without an adequate placebo group. It would have been easy to use some form of sham acupuncture in the control group. Why did the authors not do that? Heaven knows, but one might speculate that they were aiming for a positive result – and what better way to ensure it than with a ‘no treatment’ control group?

There are, of course, numerous other flaws. For instance, Prof David Colquhoun FRS, Professor of Pharmacology at University College London, criticised the study because of its lousy statistics:

START OF QUOTE

“It is truly astonishing that, in the 21st century, the BMJ still publishes a journal devoted to a form of pre-scientific medicine which after more than 3000 trials has still not been able to produce convincing evidence of efficacy [1]. Like most forms of alternative medicine, acupuncture has been advocated for a vast range of problems, and there is little evidence that it works for any of them. Colic has not been prominent in these claims. What parent would think that sticking needles into their baby would stop it crying? The idea sounds bizarre. It is. This paper certainly doesn’t show that it works.

“The statistical analysis in the paper is incompetent. This should have been detected by the referees, but wasn’t. For a start, the opening statement, ‘A two-sided P value ≤0.05 was considered statistically significant’ is simply unacceptable in the light of all recent work about reproducibility. Still worse, Table 1 uses the description ‘statistical tendency towards significance (p=0.051–0.1)’.

“Worst of all, Table 1 reports 24 different P values, of which three are (just) below 0.05. Yet no correction has been used for multiple comparisons. This is very bad practice. It’s highly unlikely that, if the proper correction had been done, any of the results would have given a type 1 error rate below 5%.

“Even were it not for this, most of the ‘significant’ P values are marginal (only slightly less than 0.05). It is now well known that the type 1 error rate gives an optimistic view. What matters is the false positive rate – the chance that a ‘significant’ result is a false positive. A p-value close to 0.05 implies that there is at least a 30% chance that they are false positives. If one thought, a priori, that the chance of colic being cured by sticking needles into a baby was less than 50%, the false positive rate could easily be greater than 80% [2]. It is now recognised that this misinterpretation of p-values is a major contributor to the crisis of reproducibility.

“Other problems concern the power calculation. A priori calculations of power are well-known to be overoptimistic, because small trials usually overestimate the effect size. In this case the initial estimated sample size was not attained, and a rather mysterious recalculation of power was used.

“Another small problem: the discussion points out that ‘the majority of infants in this cohort did not have colic’.

“The nature of the control group is not very clear. An appropriate control might have been to cuddle the baby – this was used in a study in which another implausible treatment, chiropractic, was shown not to work. This appears not to have been done.

“Lastly, p-values are reported in the text without mention of effect sizes. This is contrary to all statistical advice.

“In conclusion, the design of the trial is reasonable (apart from the control group) but the statistical analysis is appalling. It’s very likely that there aren’t any real effects of acupuncture at all. This paper serves more to muddy the waters than to add useful information. It’s a model for the sort of mistakes that have led to the crisis in reproducibility. The BMJ should not be publishing this sort of stuff, and the referees seem to have no understanding of statistics.”

END OF QUOTE
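
To put rough numbers on the multiple-comparisons point made in the quote, here is a minimal sketch. It rests on assumptions that are mine, not the paper's: the 24 tests are treated as independent (they almost certainly are not, since they come from the same infants), the correction shown is a plain Bonferroni adjustment, and the four p-values are merely those quoted from the abstract above, not the full contents of Table 1. The function names are hypothetical.

```python
# Illustrative only: how likely is at least one "significant" result among
# 24 tests when no real effect exists, and what threshold would a simple
# Bonferroni correction demand?

ALPHA = 0.05      # significance level stated in the paper
N_TESTS = 24      # number of P values reported in Table 1, per the quote

# P values quoted from the abstract earlier in this post (NOT all of Table 1).
QUOTED_P_VALUES = [0.050, 0.031, 0.040, 0.006]


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive, ASSUMING independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate <= alpha."""
    return alpha / n_tests


fwer = familywise_error_rate(ALPHA, N_TESTS)
threshold = bonferroni_threshold(ALPHA, N_TESTS)
survivors = [p for p in QUOTED_P_VALUES if p <= threshold]

print(f"Chance of >=1 false positive in {N_TESTS} tests: {fwer:.1%}")
print(f"Bonferroni per-test threshold: {threshold:.4f}")
print(f"Quoted p-values surviving the correction: {survivors}")
```

Under these assumptions, the chance of at least one spurious ‘significant’ finding among 24 tests is roughly 71%, and the Bonferroni threshold is about 0.0021, which none of the four quoted p-values reaches. Bonferroni is conservative, so this is a back-of-the-envelope check rather than a re-analysis of the trial.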

Despite these rather obvious – some would say fatal – flaws, the editor of ACUPUNCTURE IN MEDICINE (AIM) thought this trial to be so impressively rigorous that he issued a press release about it. This, I think, is particularly telling, perhaps even humorous: it shows what kind of a journal AIM is, and also provides an insight into the state of acupuncture research in general.

The long and short of it is that conclusions about specific therapeutic effects of acupuncture are not permissible. We know that colicky babies respond even to minimal attention, and this trial confirms that even a little additional TLC in the form of acupuncture will generate an effect. The observed outcome is most likely unrelated to any specific effect of acupuncture.