A number of studies have spotted a worrisome trend: although the number of scientific journals and articles published each year is increasing, the rate at which papers are retracted as invalid is increasing even faster. Some of these retractions stem from obvious ethical lapses (fraudulent data or plagiarism), but past studies have suggested that honest errors and technical problems were behind the majority of retractions.

A new analysis, published in PNAS, shows this rosy picture probably isn't true. Researchers like to portray their retractions as being the result of errors, but a lot of these same papers turn out to be fraudulent when fully investigated. If there's any good news here, it's that a limited number of labs (38, to be exact) are responsible for a third of the fraudulent papers that end up being retracted.

The new paper focuses on the fact that problems with a paper may crop up after—sometimes years after—it's first published. If these problems are minor, authors can issue clarifications or corrections (typically called errata or corrigenda). But if they raise questions about the validity of the work in general, the journal will sometimes retract the paper entirely. This is the equivalent of saying the paper never existed, and shouldn't be considered part of the peer-reviewed literature. To notify the scientific community of this action, the journals will typically mark the online versions of the original publication, and separately publish a short retraction notice.

The authors of the new paper took advantage of this by searching for every retraction notification that has appeared in the NIH's PubMed database. They found just over 2,000 of them as of May of this year, then categorized them according to the reason for the retraction, based on the contents of the announcement: plagiarism, duplicated work, errors, and fraud.

A bumper crop of fraud

Honest errors occur in research all the time, and I've seen papers retracted because of software errors or after a commercial product didn't live up to its promised capabilities. And past studies (such as this one) have suggested these errors account for the majority of retractions. But the authors of the new study find that this isn't the case. Many retractions (over 15 percent by one measure) claim to be due to errors but ultimately turn out to involve fraud. The authors discovered this by checking the author lists against reports prepared by the Office of Research Integrity, which polices research fraud.

This may be because the retractions come before investigations are complete, or because researchers get to write their own retraction notices. But whatever the cause, the gap between retractions and reality can be hilariously large. One paper ostensibly retracted because of "flaws in methodological execution and data analysis" turned out to have "many instances of data fabrication and falsification." Another that was pulled because one image "was not correct" was found, after investigation, to be the result of an author knowingly selecting an experiment that gave the answer he wanted.

With these retractions reevaluated, the authors find that cases of fraud account for over 43 percent of all retractions. Duplicate publications and plagiarism account for another 24 percent (the Retraction Watch blog found an awkward example of the former just today). Those numbers drop honest errors down to the point where they account for just over 20 percent of the total retractions. Fraud is a bigger problem than we'd thought.

And it's getting bigger. The authors find that, since 1975, the rate of retracted articles as a percent of total publications has increased nearly tenfold. Duplicate publications and plagiarism, which didn't use to be a significant problem, have boomed since 2005. And while retractions due to errors have increased, those due to fraud have increased much faster.

Patterns of deceit

When it comes to fraud, the traditional research powers are leading the way. The US has the largest number of cases, followed by Germany and Japan. But the pattern for plagiarism and duplicate publication is quite different, with China being a major player and India having a large presence. These sorts of copying problems are rare in high-profile journals like Nature and Science. Fraud is the opposite: there was a strong correlation between its incidence and the prominence of the journal, as measured by its impact factor.

The authors suggest that the increasing levels of fraud may come from "the incentive system of science, which is based on a winner-takes-all economics that confers disproportionate rewards to winners in the form of grants, jobs, and prizes at a time of research funding scarcity." That could certainly explain its prevalence in the US, where competition for grant money has become increasingly fierce in a way that roughly parallels the rising rates of fraud.

If there's good news in the data, it's that fraud may be increasing, but it might not be as widespread as the numbers would appear to suggest. A total of 38 labs, each of which had at least five retractions, accounted for a hefty 34 percent of the total frauds evaluated by the authors.

Another interesting result the study turns up is that retractions often do their job: other papers stop citing the original work shortly after the retraction notice is printed. But this doesn't always happen. In the case of one 2005 article, the HTML and PDF versions available online are clearly marked with retraction notices, yet the authors found it continues to be cited. Even though the paper contained what appears to be inaccurate data about a molecule's function, it remains the first description of a molecule that turned out to be important.

"This practice," the authors conclude, "suggests that under certain circumstances, scientists continue to find utility in retracted articles."

PNAS, 2012. DOI: 10.1073/pnas.1212247109 (About DOIs).