A Smear of DNA Can Hold 10,000 Gigabytes of Data

Facing a storage crisis, the U.S. is investing $48 million to turn DNA into living hard drives

Around the world, warehouses the size of several football fields store millions of hard drives’ worth of data. Every time we send an email, search Google, upload photos to Facebook, or stream a movie on Netflix — which is to say, all the time — those hard drives are put to work.

Big tech is building more of these sprawling data centers to keep up with the massive growth in data needs. But we are generating so much digital data that our current storage systems won’t be able to keep up for long. Already, large-scale U.S. data centers cost hundreds of millions of dollars to maintain and account for nearly 2% of the country’s electricity consumption, and those numbers are only expected to grow.

“There’s a problem coming where we’re going to have more data than we can store,” says Nicholas Guise, a senior research scientist at the Georgia Tech Research Institute who works on cybersecurity. To solve it, he says, we’ll need to figure out how to store more data in less space.

The U.S. government, which has a huge data storage problem of its own, has just invested $48 million in one possible solution: storing data in DNA.

For the past few years, researchers have been tinkering with encoding songs, images, and other files in DNA. But the process is still expensive and time-consuming. Now a new program launched by the Intelligence Advanced Research Projects Activity (IARPA), a research agency within the Office of the Director of National Intelligence, aims to change that. Its goal is to shrink a warehouse-sized data center into an affordable tabletop device that can store one exabyte of data, which is equal to a million terabyte-sized hard drives.
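
To give a rough sense of what encoding a file in DNA means, here is a minimal Python sketch. It assumes a deliberately simplified scheme in which each of DNA's four bases (A, C, G, T) stands for two bits, so one byte becomes four bases. The mapping and the `encode` and `decode` helpers are hypothetical, chosen only for illustration; they are not the method used by any of the teams in this story, and practical schemes typically add error correction and avoid sequences that are hard to synthesize or read.

```python
# Hypothetical 2-bits-per-base mapping, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of DNA bases, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Invert encode(): read two bits back from each base."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

In this toy scheme, the two-character message "Hi" becomes an eight-base strand, and reading the bases back recovers the original bytes.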

“The scale and complexity of the world’s ‘big data’ problems are increasing rapidly, and we are entering an era when the solutions will require storage and random access from an exabyte or more of data,” IARPA program manager David Markowitz tells OneZero. “Faced with exponential data growth, large data consumers may soon face a choice between investing exponentially more resources in storage or discarding an exponentially increasing fraction of data.”

In January, IARPA awarded Guise’s team up to $25 million to start working toward this goal. His group will work with San Francisco-based DNA synthesis company Twist Bioscience, San Diego startup Roswell Biotechnologies, and a team at the University of Washington that’s collaborating with Microsoft to develop a fully automated DNA data storage system. Meanwhile, researchers from the Broad Institute of MIT and Harvard and French company DNA Script have been awarded a separate contract worth up to $23 million to work on ways to encode data into DNA and retrieve that information.

Like big tech companies, the government needs archival data storage that is more affordable than conventional systems. The federal government collects and stores data on everything from taxes and crime to public health and climate. DNA offers an extremely compact means of storing immense amounts of data: A data storage facility as big as a Walmart Supercenter could be shrunk down to the size of a sugar cube.