Microsoft Excel could be to blame for errors in genomics research.

In a paper published last week in the journal Genome Biology, researchers Mark Ziemann, Yotam Eren, and Assal El-Osta report that an auto-correct feature built into Microsoft Excel caused errors in approximately 20 percent of genomics research papers that include supplementary Excel gene lists. As Quartz reports, the Australian scientists reached that figure after analyzing more than 7,500 Excel files in more than 3,600 papers across 18 journals, all published in the last 10 years.

The issue, according to the researchers, is that Excel misreads certain gene symbols and automatically converts them to dates or floating-point numbers. For instance, the researchers say that gene symbols like Sept2 or March1, which refer to genes and not dates, are automatically converted to dates. In other cases, gene identifiers like "2310009E13" are read as scientific notation and converted to massive floating-point numbers.

"The spreadsheet software Microsoft Excel, when used with default settings, is known to convert gene names to dates and floating-point numbers," the researchers say. "A programmatic scan of leading genomics journals reveals that approximately one-fifth of papers with supplementary Excel gene lists contain erroneous gene name conversions."

While the issue only affects a couple dozen genes out of approximately 30,000 in the human genome, data from other scientists is often used as a jumping-off point for further research. If those genes are missing or mislabeled in that follow-up data, the resulting inaccuracies and inconsistencies could be profound.

What's worse, the scientists say the problems were first reported in 2004. Since then, a massive amount of genomics research has been done.

"This suggests that gene name errors continue to be a problem in supplementary files accompanying articles," the researchers write. "Inadvertent gene symbol conversion is problematic because these supplementary files are an important resource in the genomics community that are frequently reused. Our aim here is to raise awareness of the problem."

Luckily, the feature can be temporarily turned off in Excel, and scientists can run scripts to determine whether their files have fallen victim to the auto-correct feature. The researchers also note that Google Sheets doesn't convert the data automatically, making the search giant's alternative spreadsheet tool a potentially appealing option for scientists.
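
As a rough illustration of what such a check might look like, here is a minimal Python sketch. It is not the researchers' script. It assumes that converted cells show up in an exported gene list as text like "2-Sep" or "2.31E+13"; the patterns, the function name find_suspect_gene_names, and the sample values are all hypothetical.

```python
import re

# Assumed appearance of a gene symbol after conversion to a date, e.g. "2-Sep".
DATE_LIKE = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)

# Assumed appearance of an identifier after conversion to scientific
# notation, e.g. "2.31E+13". The "+" is required so that an intact
# identifier such as "2310009E13" is not flagged.
FLOAT_LIKE = re.compile(r"^\d(\.\d+)?E\+\d+$", re.IGNORECASE)


def find_suspect_gene_names(values):
    """Return (index, value) pairs that look like dates or floats."""
    suspects = []
    for index, raw in enumerate(values):
        value = str(raw).strip()
        if DATE_LIKE.match(value) or FLOAT_LIKE.match(value):
            suspects.append((index, value))
    return suspects


# Hypothetical gene-name column exported from a spreadsheet.
column = ["TP53", "2-Sep", "BRCA1", "1-Mar", "2.31E+13", "2310009E13"]
suspects = find_suspect_gene_names(column)
print(suspects)
```

A check like this only flags values for a human to review. It cannot recover the original gene symbol, since a value like "2-Sep" does not record which symbol it came from.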
