Nearly 20 percent of the nonhuman genomes held in computer databases are contaminated with human DNA, presumably from the researchers who prepared the samples, say scientists who chanced upon the finding while looking for a human virus.

The affected species include crop plants and model organisms used in many research laboratories, such as the C. elegans roundworm and the Xenopus frog, say three researchers at the University of Connecticut, Mark S. Longo, Michael O’Neill and Rachel O’Neill. Their report was published Wednesday in the journal PLoS One.

The contamination may mislead researchers who assume that any genome sequence in a major databank is highly accurate. Rachel O’Neill said the problem was likely to become more serious as individual human genomes are sequenced for medical reasons. Contamination of a human sample by another person’s DNA is very hard to distinguish from normal variation and could lead to erroneous medical decisions.

“The level of contamination found in these databases is significant and worrisome,” the researchers write.