DNA data on a computer monitor. Image: AP

Ten years ago, if you wanted to back up some old photos, you might have stored them on a big, clunky external hard drive that weighed a couple of pounds and was a pain to lug around. Ten years from now, you might back up all the data from your entire life on just a few grams of DNA.




Researchers have now embedded an 1895 French film, a computer virus and a $50 Amazon gift card in the code of life.

This is not the first time scientists have turned to the double helix for storage. In 2011, Harvard University geneticist George Church pioneered the use of DNA for electronic data storage, encoding his own book, some images, and a JavaScript program in the molecules. A year later, researchers at the European Bioinformatics Institute improved the method, and uploaded all of Shakespeare’s sonnets, a clip of Martin Luther King’s “I Have a Dream” speech, a PDF of the paper from James Watson and Francis Crick that detailed the structure of DNA, and a photo of their institute into a tiny speck of DNA. In July, a team from Microsoft and the University of Washington also managed to store a record 200 megabytes of data in DNA.


But it was difficult to encode data in more than a few hundred letters of DNA without it turning into an undecipherable mess of gobbledygook.

In their new paper out Friday in Science, Yaniv Erlich and Dina Zielinski, from the New York Genome Center and Columbia University, respectively, detail a major improvement. Their new method, dubbed “DNA Fountain,” riffs on what’s known as fountain code, which slices data into chunks and then reassembles it, allowing, say, a large file like a movie to be flawlessly streamed over a lousy connection. Using their new method, they were able to store a total of more than two megabytes of data in 72,000 DNA strands and easily retrieve it. One of Erlich’s Twitter followers was even able to crack the code and retrieve the Amazon gift card. The method allows them to pack 215 petabytes of data on a single gram of DNA, 100 times more than Church managed just a few years back.
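To make the fountain idea concrete, here is a rough Python sketch of how such a code can slice a file into chunks, mix them into "droplets," and rebuild the file even when some droplets go missing. It is not the authors' implementation: the function names, the toy rule for how many chunks each droplet combines, and the "every third droplet is lost" scenario are all assumptions made for illustration.

```python
import random

def split(data, size):
    """Cut data into equal-sized chunks, zero-padding the last one."""
    padded = data + bytes(-len(data) % size)
    return [padded[i:i + size] for i in range(0, len(padded), size)]

def xor(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def indices_for(seed, k):
    """Rebuild, from the seed alone, which chunks a droplet combines."""
    rng = random.Random(seed)
    # Toy degree distribution (an assumption, not a tuned one).
    degree = min(k, rng.choice([1, 1, 2, 2, 3, 4]))
    return rng.sample(range(k), degree)

def make_droplet(chunks, seed):
    """A droplet is a seed plus the XOR of the chunks that seed selects."""
    payload = bytes(len(chunks[0]))
    for i in indices_for(seed, len(chunks)):
        payload = xor(payload, chunks[i])
    return seed, payload

def decode(droplets, k):
    """Peeling decoder: returns {index: chunk}, or None if more droplets are needed."""
    pending = [(set(indices_for(seed, k)), payload) for seed, payload in droplets]
    solved = {}
    progress = True
    while progress and len(solved) < k:
        progress = False
        remaining = []
        for idxs, payload in pending:
            # Strip out every chunk we already know.
            for i in [i for i in idxs if i in solved]:
                payload = xor(payload, solved[i])
                idxs.discard(i)
            if len(idxs) == 1:
                # Only one unknown chunk left, so the payload is that chunk.
                solved[idxs.pop()] = payload
                progress = True
            elif idxs:
                remaining.append((idxs, payload))
        pending = remaining
    return solved if len(solved) == k else None

def roundtrip(data, chunk_size=4, max_seeds=10_000):
    """Encode, lose every third droplet, and keep collecting until decoding works."""
    chunks = split(data, chunk_size)
    received = []
    for seed in range(max_seeds):
        if seed % 3 == 0:
            continue  # simulate a droplet that never arrives
        received.append(make_droplet(chunks, seed))
        solved = decode(received, len(chunks))
        if solved is not None:
            joined = b"".join(solved[i] for i in range(len(chunks)))
            return joined[:len(data)], len(received)
    raise RuntimeError("not enough droplets")

if __name__ == "__main__":
    message = b"Any large-enough handful of droplets rebuilds the file."
    recovered, used = roundtrip(message)
    print(recovered == message, "droplets used:", used)
```

The point of the design is that no single droplet is essential: the receiver only needs enough of them, in any order, which is why the approach tolerates a lossy channel.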

DNA as a storage medium makes sense. After all, it already stores the billions of letters that code for life. It is compact and durable. Unlike the floppy disk or that five-pound hard drive you used to lug around, it will never become obsolete. Instead of 1s and 0s, code is written in As, Gs, Cs and Ts. Using DNA, you could store all the data in the world in a nice-sized walk-in closet and keep it there for thousands and thousands of years.
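As a rough illustration of swapping 1s and 0s for As, Cs, Gs and Ts, the Python sketch below maps every two bits to one DNA letter. The particular mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions chosen for simplicity; real schemes also have to steer clear of sequences that are hard to synthesize or read, which this sketch ignores.

```python
BASES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T

def bytes_to_dna(data: bytes) -> str:
    """Write each byte as four letters, two bits per letter, high bits first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )

def dna_to_bytes(strand: str) -> bytes:
    """Reverse the mapping: every four letters become one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

if __name__ == "__main__":
    strand = bytes_to_dna(b"Hi")
    print(strand)                # CAGACGGC
    print(dna_to_bytes(strand))  # b'Hi'
```

At two bits per letter, one byte takes four letters, so a megabyte of data needs roughly four million letters before any error-correcting overhead is added.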

The DNA Fountain technique is remarkable in its resistance to errors and ability to maximize the storage capacity of DNA.


Before we’re all walking around with bits of DNA on our key rings instead of flash drives, however, sequencing will have to become significantly cheaper. But that might happen sooner than we realize. This year, Illumina announced plans to bring the cost of sequencing an entire human genome down to $100. Sequencing a few megabytes of data would cost a small fraction of that.

[Science]