An Error in a Python Script May Have Invalidated 150+ Research Projects

And three other Python stories you may have missed from the past month


A coding error in a set of Python scripts used for computational analysis may have invalidated more than 150 published research studies in chemistry.

A recently published research article from the University of Hawaii identifies a programming error in the Willoughby-Hoye scripts.

The researchers, while examining results from a cyanobacteria experiment, observed notable variations in the outcomes obtained from the same kind of Nuclear Magnetic Resonance (NMR) spectroscopy data.

Whether the error appeared depended on the operating system the scripts were run on. The scripts were found to give accurate results on Windows 10 and macOS Mavericks but were less accurate by almost a full percent on macOS Mojave and Ubuntu.

These variations stem from the scripts’ use of Python’s glob module.

The glob module seeks files that correspond to a specific name pattern, and based on the glob results, the scripts generate a list of input files to read.

However, the order in which this module returns the files depends on the operating system. Because the scripts process the files in the order they are returned, a different order leads to different calculation results.
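To make the ordering issue concrete, here is a minimal sketch. It is not the Willoughby-Hoye code; the helper name `list_input_files` and the `*.out` pattern are assumptions chosen for illustration. Python's documentation states that `glob.glob` returns results in arbitrary order, so code that depends on position should sort the list explicitly.

```python
import glob
import os


def list_input_files(directory, pattern="*.out"):
    """Return the files matching `pattern` in a deterministic order.

    Hypothetical helper for illustration; the name and default
    pattern are assumptions, not taken from the original scripts.
    """
    # glob.glob yields names in whatever order the OS and file system
    # report them, so the raw result can differ between machines.
    matches = glob.glob(os.path.join(directory, pattern))
    # Sorting makes the order lexicographic and therefore reproducible.
    return sorted(matches)
```

Calling `list_input_files("results")` would then return the same sequence on Windows, macOS, and Linux, provided the directory holds the same files.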

This small detail may invalidate many previous research papers due to inaccurate outputs.

Phillip Williams and Rui Sun wrote code that fixes the problem by sorting the files explicitly, which now guarantees consistent outcomes. While the variations did not affect the results obtained by the University of Hawaii’s team, they may have a substantial impact on other published research projects.

The Willoughby-Hoye scripts are named after their authors, Patrick Willoughby and Thomas Hoye, of the University of Minnesota.
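Beyond sorting, a script that combines two sets of files can avoid relying on list position altogether by matching files on a shared label. The sketch below illustrates that general idea only; it is not the published fix, and the function name `pair_by_label` and the `<label>-nmr.out` / `<label>-freq.out` naming scheme are hypothetical.

```python
import os


def pair_by_label(nmr_files, freq_files):
    """Pair files by a shared label instead of by list position.

    Assumes a hypothetical naming scheme: "<label>-nmr.out" and
    "<label>-freq.out". Raises ValueError if the labels do not match.
    """
    def label(path):
        # Everything before the last "-" in the file name is the label.
        return os.path.basename(path).rsplit("-", 1)[0]

    nmr = {label(p): p for p in nmr_files}
    freq = {label(p): p for p in freq_files}

    # Refuse to continue if a file in one set has no partner in the other.
    if nmr.keys() != freq.keys():
        unmatched = sorted(nmr.keys() ^ freq.keys())
        raise ValueError(f"Unmatched labels: {unmatched}")

    # Sorted labels give a reproducible order regardless of input order.
    return [(nmr[k], freq[k]) for k in sorted(nmr)]
```

With this approach, a mismatch between the two sets surfaces as an error rather than as silently misaligned numbers.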

Patrick Willoughby, now an assistant professor of chemistry at Ripon College, has acknowledged the findings and the corrections to the scripts in a post on Twitter:

“Great find by Rui and Prof. Williams. When I wrote the scripts six years ago, the OS was able to handle the sorting. Rui and Williams added the necessary sort code and added a function to ensure the calcs were properly aligned. Kudos!” — Patrick Willoughby (@pat_willoughby)

Sometimes, trusting external scripts and libraries can lead to unexpected results.

And now, three other updates you may have missed.