Humanity’s last hope could someday rest on a humble seed, tucked away in the cool and dark of a collection. Seed collections, called seedbanks or seed genebanks, store millions of grains, nuts, and kernels in baggies and plastic bins, meant as the ultimate backup for worldwide plant diversity in the event of a major natural or man-made disaster.

But the collections aren’t always well-organized or clear about what they contain. A recent study in Scientific Reports aims to help seedbanks get their shelves in order, by using genetics to catalogue their holdings. The study found that, at least for Aegilops tauschii goatgrass, a wild relative of wheat, seedbanks are surprisingly redundant in their collections. The work comes in the wake of a November study in Nature Genetics that also found redundancy and limited genetic diversity. In that case, the suspect seed was the barley collection of the German federal ex situ genebank in Gatersleben, one of the world’s largest crop plant collections.

For the Scientific Reports study, researchers from Kansas State University’s wheat seedbank, using two kinds of low-cost sequencing, read the genes of more than 1,000 packets of goatgrass seeds stored in three separate facilities. More than half of the packets, both within and across the three collections, were so similar that the researchers surmised they had probably come from the same handful of wild plants, furring the margins of farmers’ fields from Turkey to China.
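For readers curious how "so similar" might be turned into a test, here is a minimal Python sketch of one way to flag likely duplicate packets from genotype data. It is an illustration only, not the researchers' actual pipeline: the packet names, the marker calls, the `similarity` and `group_duplicates` helpers, and the 99 percent cutoff are all hypothetical assumptions made for this example.

```python
from itertools import combinations


def similarity(a, b):
    """Fraction of markers with matching calls; missing calls (None) are skipped."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 0.0
    return sum(x == y for x, y in shared) / len(shared)


def group_duplicates(genotypes, threshold=0.99):
    """Group packets whose pairwise similarity meets the (assumed) threshold."""
    # Simple union-find: every packet starts out as its own group.
    parent = {name: name for name in genotypes}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    # Merge the groups of any two packets that look near-identical.
    for a, b in combinations(genotypes, 2):
        if similarity(genotypes[a], genotypes[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in genotypes:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical packets from three seedbanks; each list holds made-up
# marker calls (0, 1, or 2), with None marking a missing call.
packets = {
    "bankA-001": [0, 2, 2, 0, 1, 0, 2, 0],
    "bankB-417": [0, 2, 2, 0, 1, 0, 2, 0],
    "bankC-052": [0, 2, 2, 0, None, 0, 2, 0],
    "bankA-230": [2, 0, 0, 2, 0, 2, 0, 1],
}

groups = group_duplicates(packets)
print(groups)
```

In this toy data, the first three packets match at every marker they share and land in one group, while the fourth stands alone. A real analysis would involve thousands of markers and would have to account for sequencing error, so the cutoff would need to be chosen with care.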

Museums, arboretums, and other living libraries often share seeds, so the redundancy isn’t shocking, especially when so many of the original records are lost, says study coauthor Jon Raupp, who runs the genebank at Kansas State. Many seedbanks started in an age before digitized record keeping, and their earliest paper files yellowed and crumbled years ago. Modern genetic sequencing offers these facilities a way to get their houses in order, to re-label and reorganize the musty brown halls of their botanical history.

Each of the roughly 1,750 genebanks around the world has its own numbered and labeled seed packets. Ideally, identical seeds would have identical numbers and labels across all these facilities. But there are no guarantees, according to Raupp. “Along the way, somebody might have mixed up one of the lines,” he says. That’s a potential problem for plant breeders.

Commercially grown Kansas wheat has a lot to contend with, including fungal and insect pests. For Raupp, his seedbank is more than a genetic ark; it’s a library of wild wheat relatives (such as goatgrass) to crossbreed with domesticated wheats, bolstering their resilience in the arms race with pestilence and disease. Hybridizing wild and domesticated plants can take years, though, and it would be a waste of time for seedbanks around the world to run breeding experiments unknowingly using identical seeds. That’s why genebanks need to know what they have, and who else has it, Raupp explains. Sharing genetic information can help “make sure we’re not duplicating the same research,” Raupp says. Knowing their contents can also help seedbanks figure out what to collect next to fill out their shelves.

Duplication isn’t all bad. “A certain level of redundancy between genebank collections provides additional backups,” says Nils Stein, a coauthor of the November Nature Genetics paper who studies the genetics of wheat, barley, and rye at the Leibniz Institute of Plant Genetics and Crop Plant Research, also in Gatersleben, Germany. If a seedbank were destroyed without backups, for example, those collections would be gone. Stein’s study analyzed the entire barley collection of the German federal ex situ genebank, and found that it does not capture a complete snapshot of barley’s wild genetic diversity. Some seed packets there are duplicates, too. Genotyping tells the researchers exactly how many, and which ones, Stein says.

Both of the recent studies, researchers say, illustrate the power of low-cost sequencing to unlock greater potential from seedbanks. Systematically characterizing seed collections using genetics, as these studies have done, is “a blueprint for what a global effort could look like,” Stein says. Genetic stores in genebanks number in the millions. “But as long as we don’t know what is unique and what is redundant,” he says, “this has a negative effect on the power of the resources.”