I noticed this morning that this week’s New Scientist cover feature (by Michael Brooks) is entitled Exclusive: Grave doubts over LIGO’s discovery of gravitational waves. The article is behind a paywall and I’ve so far been unable to locate a hard copy in Maynooth, so I haven’t read it yet, but it is about the so-called `Danish paper’ that pointed out various unexplained features in LIGO data associated with the first detection of gravitational waves from a binary black hole merger.

I did know this piece was coming, however, as I spoke to the author on the phone some time ago to clarify some points I made in previous blog posts on this issue (e.g. this one and that one). I even ended up being quoted in the article:

Not everyone agrees the Danish choices were wrong. “I think their paper is a good one and it’s a shame that some of the LIGO team have been so churlish in response,” says Peter Coles, a cosmologist at Maynooth University in Ireland.

I stand by that comment, as I think certain members – though by no means all – of the LIGO team have been uncivil in their reaction to the Danish team, implying that they consider it somehow unreasonable that the LIGO results should be subject to independent scrutiny. I am not convinced that the unexplained features in the data released by LIGO really do cast doubt on the detection, but unexplained features there undoubtedly are. Surely it is the job of science to explain the unexplained?

An important aspect of the way science works is that when a given individual or group publishes a result, it should be possible for others to reproduce it (or not, as the case may be). In normal-sized laboratory physics it suffices to explain the experimental set-up in the published paper in sufficient detail for another individual or group to build an equivalent replica experiment if they want to check the results. In `Big Science’, e.g. with LIGO or the Large Hadron Collider, it is not practically possible for other groups to build their own copy, so the best that can be done is to release the data coming from the experiment. A basic problem with reproducibility obviously arises when this does not happen.

In astrophysics and cosmology, results in scientific papers are often based on very complicated analyses of large data sets. This is also the case for gravitational wave experiments. Fortunately, in astrophysics these days, researchers are generally pretty good at sharing their data, but there are a few exceptions in that field.

Even allowing open access to data doesn’t always solve the reproducibility problem. Often extensive numerical codes are needed to process the measurements and extract meaningful output. Without access to these pipeline codes it is impossible for a third party to check the path from input to output without writing their own version, assuming that there is sufficient information to do that in the first place. That researchers should publish their software as well as their results is quite a controversial suggestion, but I think it’s the best practice for science. In any case there are often intermediate stages between `raw’ data and scientific results, as well as ancillary data products of various kinds. I think these should all be made public. Doing that could well entail a great deal of effort, but I think in the long run that it is worth it.

I’m not saying that scientific collaborations should not have a proprietary period, just that this period should end when a result is announced, and that any such announcement should be accompanied by a release of the data products and software needed to subject the analysis to independent verification.

Given that the detection of gravitational waves is one of the most important breakthroughs ever made in physics, I think this is a matter of considerable regret. I also find it difficult to understand the reasoning that led the LIGO consortium to think it was a good plan to go only part of the way towards open science, by releasing only part of the information needed to reproduce the processing of the LIGO signals and their subsequent statistical analysis. There may be good reasons that I know nothing about, but at the moment it seems to me to represent a wasted opportunity.

CLARIFICATION: The LIGO Consortium released data from the first observing run (O1) – you can find it here – early in 2018, but this data set was not available publicly at the time of publication of the first detection, nor when the team from Denmark did their analysis.

I know I’m an extremist when it comes to open science, and there are probably many who disagree with me, so here’s a poll I’ve been running for a year or so on this issue:

Any other comments welcome through the box below!

UPDATE: There is a (brief) response from LIGO (& VIRGO) here.