By Guillaume Filion, filed under bioinformatics revolution, open source, reproducible research, obfuscated open source.





A recent Nature editorial entitled “Code share” discusses an update in Nature’s policy regarding the use of software. Interestingly, the subtitle is:

Papers in Nature journals should make computer code accessible where possible.

Yes, finally! The last decade was a transition period which, in the history of bioinformatics, will probably be known as the “bioinformatics revolution”. Following the completion of the first genome projects, the demand for bioinformatics rose steadily, to the detriment of biochemistry and genetics, which have now fallen from grace. Something this traumatic cannot happen in a day, and it cannot happen without pain. In fact, the transition is still ongoing, and it regularly causes difficulties of all kinds in biology.

One of the most perverse effects of the massive popularization of bioinformatics is that senior scientists were not properly trained for it. This led to an implicit view that bioinformatics is a tool, somewhat like a microscope or a FACS. This explains why the materials and methods sections of the first papers using bioinformatics were often reduced to something like “all the bioinformatics analyses were performed using R”. In other words, “we got some bioinformatics software and asked a qualified technician to use it according to the instructions”.

But bioinformatics is not a tool, it is a science. Nobody would agree to publish a methods section saying “all the equations were solved with a pen and paper”. But if you wrap those equations in some code, then you can safely ignore them.

There is no universal agreement on what should go in a scientific paper and what should not, but we can question the motivation for publishing scientific articles in which the authors do not have to say what they did. All this to say that I was rather pleased that Nature, as a leader in scientific publishing, would take a reasonable step forward. However, my jaw dropped to the floor when I read on. Here is what the editorial says.

Nature and the Nature journals have decided that, given the diversity of practices in the disciplines we cover, we cannot insist on sharing computer code in all cases. But we can go further than we have in the past, by at least indicating when code is available.

I don’t know which is more shocking: that until now authors did not even have to mention that the code is not available, or that an editorial entitled “Code share” actually says that you do not have to share your code. The only statement that shows some progress is the following.

Editors will insist on availability where they consider it appropriate: any practical issues preventing code sharing will be evaluated by the editors, who reserve the right to decline a paper if important code is unavailable.

In short, it is now up to the editors. I have to admit that this is much better than not caring about the issue at all. I hope that we will see some real progress towards reproducible research in the Nature journals. But let us face it, the real challenge is not to share the code, it is to make research reproducible. Uploading your code does not, by itself, make your research reproducible. Actually, the Nature group has a history of publishing obfuscated open source code. In other words, the code is there, but you cannot do anything with it (if you like challenges, you can try to do something with the code from this publication in Nature Biotechnology... starting with finding it in the supplementary information).

The editorial rightly points out that practices differ widely between scientific communities. But calling something a “culture” does not make it right. For Nature, reproducibility is not the priority. Now, the question that matters is whether it is a priority for you.