SAN FRANCISCO (September 20, 2018) – Days after a fire tore through Brazil’s National Museum and destroyed specimens of irreplaceable heritage, a team of scientists has quantified the vast number of fossils that sit unstudied in natural history collections. Based on their findings, the team estimates only 3 to 4 percent of recorded fossil locations from across the globe are currently accounted for in published scientific literature. This means any shelved specimens that have never been published or documented digitally remain vulnerable to loss. Researchers from the California Academy of Sciences, University of California Museum of Paleontology (UCMP), and partner institutions are working to preserve these “dark data” in online databases, highlighting the need for underfunded museums around the world to invest in the digital preservation of their collections. The three-year-old project’s preliminary results were published in Biology Letters earlier this month.

“The fossil record offers invaluable insight into our planet’s ecological and evolutionary past,” says co-author Dr. Peter Roopnarine, the Academy’s Curator of Invertebrate Zoology and Geology. “Yet published literature only documents a fraction of the fossils housed in museum collections. Digitizing specimens preserves valuable data and makes it readily accessible to researchers everywhere.”

Fossil-finding long predates the digital age, leaving modern paleontologists with the Herculean task of compiling enough data by hand to address large-scale questions of planetary change. The first digital revolution for fossil collections began in the 1990s, when the scientific community launched several still-growing online databases based on published literature, the most comprehensive being the Paleobiology Database (PBDB).

Today, a second digital revolution is underway. Led by UCMP, ten institutions are digitally cataloging fossil specimens from their collections that have never been cited in published literature. The new database, known as EPICC (Eastern Pacific Invertebrate Communities of the Cenozoic), compiles marine invertebrate fossils that span the past 66 million years and hail from Chile to Alaska.

The study’s co-authors compared the number of locations represented by fossils in the literature-based PBDB to the number of locations tallied in the new EPICC database for the states of Washington, Oregon and California. They found that for every fossil-bearing location recorded in the scientific literature, 23 more exist on shadowy museum shelves. This finding informed the team’s global estimate for all fossil types: Of the fossil-bearing locations known to exist across the globe, only 3 to 4 percent are accounted for in published literature.
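
As a rough check on the arithmetic, the 23-to-1 ratio alone implies that about one known location in 24 appears in the literature, or roughly 4 percent. The short Python sketch below works through that calculation. The function name `published_share` is illustrative only, and the sketch does not attempt to reproduce whatever further analysis lies behind the team's 3 to 4 percent global range.

```python
# Illustrative arithmetic only; the study's full method is not described here.
UNPUBLISHED_PER_PUBLISHED = 23  # ratio reported for Washington, Oregon and California


def published_share(unpublished_per_published: float) -> float:
    """Return the fraction of all known locations that appear in published literature."""
    # One published location plus N unpublished ones makes 1 + N known locations.
    return 1 / (1 + unpublished_per_published)


share = published_share(UNPUBLISHED_PER_PUBLISHED)
print(f"Published share of known locations: {share:.1%}")
```

Run as written, this gives a published share of about 4.2 percent, at the upper end of the range the team reports.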

“What this means is that within most of the great museums of the world there are specimens that have not been fully utilized to understand the nature of our planet, how ecosystems responded to climate change in the past, and how they’ll respond moving forward,” says lead author Dr. Charles Marshall, Director of UCMP and Fellow of the Academy. “We need that perspective to forecast the future.”

Modern digital technologies have already allowed the team to harness the collective power of hundreds of thousands of specimens for coherent analysis. The research potential is vast: Teams continue to make new-to-science discoveries simply by delving deeper into their collections. Digitization also protects the enormous upfront investment that museums have already made to collect and steward natural history specimens.

Marshall says the paper’s coincidental publication shortly after Brazil’s National Museum fire is a call to arms. “In the wake of the fire, my reaction was one of heartbreak, dismay, and shock. As scientists, seeing a fire like this is akin to learning your parent’s house has just burnt to the ground. It’s time for government and funding agencies to step up investment in the digitization of natural history collections and preserve our world heritage for decades to come.”