$180 million DNA ‘barcode’ project aims to discover 2 million new species

For centuries biologists have identified new species at a painstakingly slow pace, describing specimens' physical features and other defining traits, and often trying to fit a species into the tree of life before naming and publishing it. Now, they have begun to determine whether a specimen is likely a novel species in hours—and will soon do so at a cost of pennies. It's a revolution driven by short stretches of DNA—dubbed barcodes in a nod to the familiar product identifiers—that vary just enough to provide species-distinguishing markers, combined with fast, cheap DNA sequencers.

"Biodiversity science is entering a very golden era," says Paul Hebert of the University of Guelph in Canada. On 16 June, a team he leads will launch a $180 million global effort to identify more than 2 million new species of multicellular creatures. Other teams are also adopting the approach to comb samples for new species in their labs—or even directly in the field. With the world losing species faster than they are discovered, biologists are welcoming the technology.

"For many years I dreamed of changing the rules by being able to bring a portable genomic lab [to] where the samples are," says Massimo Delledonne, a genomicist at the University of Verona in Italy who recently performed barcoding studies in a forest on the island of Borneo that quickly revealed a new species of snail. "Field barcoding is now ready for prime time."

Biodiversity experts estimate that Earth has between 8.7 million and 20 million kinds of plants, animals, and fungi, but to date only 1.8 million of them have received formal descriptions. Insects, in particular, are a vast realm of undiscovered species. "Yet collectively, they may contribute more biomass in terrestrial habitats than all wild vertebrates combined," says Rudolf Meier, a biologist from the National University of Singapore who has been developing barcoding approaches with a small DNA sequencer.

In 2003, Hebert proposed the DNA barcode concept: that animal species could be distinguished by sequencing fewer than 1000 bases of mitochondrial DNA from a specimen. It took a while for the idea to catch on, but Hebert and other enthusiasts began to compile barcodes from known species. In 2010, for example, he spearheaded a consortium called the International Barcode of Life (iBOL), an $80 million effort centered at Guelph that began to build a reference library of known species with their identifying sequences. It now tops 7.3 million barcodes (each species can have more than one) and has proved to be a resource not only for identifying known organisms, but also for documenting their interactions with other species, including who eats whom, based on the different barcodes in a particular sample.

Now, with additional support in money and in-kind services from its 30 international partners, iBOL is about to start a 7-year follow-up effort. Called BIOSCAN, it will gather specimens and study species interactions at 2500 sites around the world, aiming to expand its reference library by 15 million barcode records, 90% of them coming from undescribed species. The data will set the stage for monitoring the effects of pollution, land-use changes, and global warming on biodiversity, Hebert says. Ultimately, "We will be able to track life on the planet the way we track the weather."

And, in a departure from iBOL's previous focus on deriving barcodes for known species, "One of the primary goals will be species discovery," Hebert says. If software fails to match a sample's barcode sequence to an existing species, it will immediately flag the specimen for closer genetic and visual scrutiny and possible identification as a new species. In the past, it might have taken years or even decades to confirm some organisms as new species—for example, certain flies in which species differ visibly only in the shape of the male genitalia.
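The matching step described above can be illustrated with a minimal sketch. It is not BIOSCAN's software: the function names, the toy 20-base sequences, and the 3% divergence cutoff are all assumptions made for illustration, and real pipelines work with aligned barcodes of several hundred bases and more careful statistics. The idea is simply to compare a query barcode with every reference barcode and flag the specimen when even the closest reference is too different.

```python
from typing import Dict, Optional, Tuple


def p_distance(a: str, b: str) -> float:
    """Fraction of positions that differ between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


def classify_barcode(
    query: str,
    reference: Dict[str, str],
    threshold: float = 0.03,  # assumed cutoff, chosen only for illustration
) -> Tuple[Optional[str], float, bool]:
    """Return (matched species or None, smallest distance, flagged)."""
    best_name: Optional[str] = None
    best_dist = float("inf")
    for name, seq in reference.items():
        dist = p_distance(query, seq)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_name is not None and best_dist <= threshold:
        return best_name, best_dist, False
    # No reference is close enough: flag the specimen for closer scrutiny.
    return None, best_dist, True


# Hypothetical reference library with toy sequences (not real barcodes).
REFERENCE = {
    "species_A": "ACGTACGTACGTACGTACGT",
    "species_B": "ACGTTCGTACGAACGTACCT",
}

known = classify_barcode("ACGTACGTACGTACGTACGT", REFERENCE)
unknown = classify_barcode("TTGTACGAACGTTCGTACGA", REFERENCE)
print(known)
print(unknown)
```

In this toy run the first query is identical to one reference and is assigned to it, while the second differs from both references at many positions and is flagged. A flag here means only "no close match in the library", which is why the article describes further genetic and visual scrutiny before anything is called a new species.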

Customized bioinformatics and sequencers that can read enough bases in one shot to get a full barcode will keep the cost low, Hebert predicts—about $1 per specimen including collection, preservation, DNA extraction, sequencing, and follow-up analysis. He expects the sequencing component of the overall costs will eventually drop to about $0.02 per specimen.

For now, all the specimens gathered for BIOSCAN's barcoding will be shipped to the University of Guelph. But Meier has been developing a barcoding approach that he hopes will be accessible to many labs doing species surveys. He got interested in more efficient ways of identifying species in 2012, when Singapore officials asked him to study tiny flies emerging from two local reservoirs. "It was a nightmare" to pin down the midge species responsible, he recalls. As a result, he, too, turned to barcoding.

Now, he's cataloging all of Singapore's biodiversity, particularly its "silent majority," as he calls small insects. For this work, Meier says, "We abandoned the over-engineered and expensive techniques that are traditionally used for barcoding." Instead, he has turned to a recently developed sequencer called the MinION that is the size of a cellphone and costs less than $1000. Designed to identify DNA's four bases by how they change an electrical current when they pass through a nanometer pore, it sequences thousands of bases in one stretch, more than enough for a barcode.

Working with undergraduates and volunteers in Singapore who both collect and sequence specimens, Meier's team has already generated 200,000 insect barcodes. These represent 10,000 species, more than 70% of them new to science, he and his colleagues reported last year. Meier envisions many countries setting up such lab-based efforts to independently catalog their biodiversity.

His group recently demonstrated the power of the approach with a study of the insects an entomologist had caught in a single net trap in Kibale National Park in Uganda. Meier and his colleagues focused on a very large, diverse group of flies called Phoridae, which are hard to tell apart visually. Barcoding just one-third of the trap's haul of insects—about 8700 in all—yielded 650 Phoridae species new to science, the team reported in a 30 April preprint on bioRxiv. That's more than all the known Phoridae in tropical Africa. Emily Hartop from Stockholm University, an expert in fly identification, confirmed that the barcodes correctly separated species 90% of the time, Meier's group reports.

The numbers show "there's a lot of diversity out there that we have not named and don't know about," says John Kress, a botanist at the Smithsonian Institution's National Museum of Natural History in Washington, D.C.

Delledonne and his colleagues have been working out ways to do at remote field sites what Meier does in his Singapore lab. They, too, use a MinION sequencer, which is typically run by a laptop computer. The laptop normally requires an internet connection, but the sequencer's manufacturer, Oxford Nanopore Technologies, made it possible for the system to work in remote locations with no such access. Everything needed for barcoding fits into a carry-on suitcase.

On a 2018 Borneo expedition, the team joined with citizen scientists and barcoded about a dozen animals, among them a new snail they named Microparmarion exquadratus and described last year in the Journal of Molluscan Studies. In a 6 May bioRxiv preprint, the group detailed their protocols so others can follow in their footsteps. Anyone with a few days' training can now get set up to do barcoding in the field for less than $7000, Delledonne suggests. "We see how smartphones have changed our lives. We have shown that sequencing may follow the same trend."

Indeed, a team from Duke University in Durham, North Carolina, has just reported a similar field study with a MinION in the Madagascar dry forest, obtaining barcodes to identify mouse lemurs. A few decades ago, only a few species of these secretive, nocturnal primates were known. The tally is now up to 24, but finding new species is still slow. "It could sometimes take years," says Duke primatologist Marina Blanco. In a 26 May bioRxiv preprint, she, Duke ecologist Lydia Greene, and colleagues described using a MinION-based barcoding system to take a DNA sample from a live-trapped lemur, barcode it, and decide on the spot whether it was a new species.

The Duke work, says Delledonne, is a "good example of what is going to be the impact of mobile genomic labs on ecological and evolutionary studies."

For Meier, such field studies, along with his own lab's work, bode well for the democratization of barcoding: "It all points to the bright future of decentralized biodiversity science that yields rapid results." But Kress thinks industrial-scale efforts like BIOSCAN will be important as well if there's any hope of cataloging all species on the planet. "It has to be both approaches in parallel," he says. "We will never get it done if we do it in individual labs."