Forget flash – the computers of the future might store data in DNA. George Church of the Wyss Institute at Harvard University and colleagues have encoded a 53,400-word book, 11 JPG images and a JavaScript program – amounting to 5.27 million bits of data in total – into sequences of DNA. In doing so, they have beaten the previous record, set in 2010 by J. Craig Venter’s team, which encoded a 7920-bit watermark in its synthetic bacterium.

DNA is one of the densest and most stable media known for storing information. In theory, DNA can encode two bits per nucleotide. That’s 455 exabytes – roughly the capacity of 100 billion DVDs – per gram of single-stranded DNA, making it five or six orders of magnitude denser than currently available digital media, such as flash memory. Information stored in DNA can also be read thousands of years after it was first laid down.
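The per-gram figure can be checked with a little arithmetic. The sketch below is a back-of-the-envelope estimate rather than the team's own calculation: it assumes an average mass of about 330 grams per mole of nucleotides in single-stranded DNA and a 4.7-gigabyte single-layer DVD, neither of which is stated in the article.

```python
AVOGADRO = 6.02214076e23           # nucleotides per mole
NUCLEOTIDE_MASS_G_PER_MOL = 330.0  # assumed average for single-stranded DNA
BITS_PER_NUCLEOTIDE = 2            # theoretical maximum quoted above
DVD_BYTES = 4.7e9                  # assumed single-layer DVD capacity


def bytes_per_gram(mass_per_mol: float = NUCLEOTIDE_MASS_G_PER_MOL) -> float:
    """Theoretical storage capacity of one gram of single-stranded DNA."""
    nucleotides = AVOGADRO / mass_per_mol      # nucleotides in one gram
    bits = nucleotides * BITS_PER_NUCLEOTIDE   # two bits per nucleotide
    return bits / 8                            # convert bits to bytes


def dvds_per_gram() -> float:
    """How many assumed 4.7 GB DVDs one gram of DNA could replace."""
    return bytes_per_gram() / DVD_BYTES


if __name__ == "__main__":
    print(f"{bytes_per_gram() / 1e18:.0f} exabytes per gram")
    print(f"{dvds_per_gram() / 1e9:.0f} billion DVDs per gram")
```

Under these assumptions the estimate comes out at about 456 exabytes and 97 billion DVDs per gram, in line with the figures quoted above.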

Until now, however, the difficulty and cost involved in reading and writing long sequences of DNA have made large-scale data storage impractical. Church and his team got round this by developing a strategy that eliminates the need for long sequences. Instead, they encoded data in distinct blocks and stored these in shorter separate stretches.

The strategy is exactly analogous to data storage on a hard drive, says co-author Sriram Kosuri, where data is divided up into discrete blocks called sectors.
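The block idea can be sketched in a few lines of code. This is a minimal illustration, not the team's actual layout: the function names are hypothetical, the 96-bit payload and 19-bit address sizes are assumed for the example, and the use of an address prefix to restore the order of the blocks is an assumption the article does not spell out.

```python
def split_into_blocks(bits: str, payload_bits: int = 96, address_bits: int = 19) -> list[str]:
    """Cut a bit string into short blocks, each prefixed with a binary address."""
    starts = range(0, len(bits), payload_bits)
    if len(starts) > 2 ** address_bits:
        raise ValueError("too much data for the chosen address size")
    blocks = []
    for index, start in enumerate(starts):
        # Pad the final block with zeros so every block has the same length.
        payload = bits[start:start + payload_bits].ljust(payload_bits, "0")
        address = format(index, f"0{address_bits}b")
        blocks.append(address + payload)
    return blocks


def reassemble(blocks: list[str], total_bits: int, address_bits: int = 19) -> str:
    """Sort blocks by address, strip the addresses and drop the padding."""
    ordered = sorted(blocks, key=lambda block: int(block[:address_bits], 2))
    return "".join(block[address_bits:] for block in ordered)[:total_bits]


if __name__ == "__main__":
    message = "1011001110" * 30               # 300 bits of example data
    blocks = split_into_blocks(message)
    shuffled = list(reversed(blocks))         # blocks can come back in any order
    assert reassemble(shuffled, len(message)) == message
```

Because each block carries its own address, the short stretches can be read in any order and still be put back together, much as a drive locates sectors by number.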


The team has also put the strategy into practice. They converted the book, which Church co-wrote, along with the images and the JavaScript program, into bit form. They then synthesised DNA to represent that sequence of bits, encoding one bit at every DNA base, half the theoretical maximum of two. The DNA bases A and C each encoded a ‘0’, while G and T each encoded a ‘1’.
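The one-bit-per-base scheme is simple enough to sketch in code. The example below is a toy version under stated assumptions: the function names are hypothetical, and the rule for choosing between the two possible bases (pick whichever differs from the previous base, so the same letter never appears twice in a row) is an illustrative choice, since the article does not say how the team made that decision.

```python
ZERO_BASES = "AC"  # either base stands for a 0 bit
ONE_BASES = "GT"   # either base stands for a 1 bit


def text_to_bits(text: str) -> str:
    """Turn text into a string of 0s and 1s, eight bits per UTF-8 byte."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits: str) -> str:
    """Inverse of text_to_bits."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def encode(bits: str) -> str:
    """Write one base per bit, never repeating the previous base (assumed rule)."""
    bases = []
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a bit: {bit!r}")
        options = ZERO_BASES if bit == "0" else ONE_BASES
        previous = bases[-1] if bases else ""
        bases.append(options[1] if options[0] == previous else options[0])
    return "".join(bases)


def decode(dna: str) -> str:
    """Read the bits back: A or C gives 0, G or T gives 1."""
    bits = []
    for base in dna:
        if base in ZERO_BASES:
            bits.append("0")
        elif base in ONE_BASES:
            bits.append("1")
        else:
            raise ValueError(f"not a DNA base: {base!r}")
    return "".join(bits)


if __name__ == "__main__":
    dna = encode(text_to_bits("DNA"))
    print(dna)
    print(bits_to_text(decode(dna)))
```

Having two bases available for each bit value is what leaves room for a rule like this one: the encoder can steer clear of awkward sequences without changing the data.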

Because the DNA is synthesised as the data is encoded, the approach doesn’t allow for rewritable data storage. A write-once DNA molecule is still suitable for long-term archival storage, though. “I don’t want to say [rewriting is] impossible,” says Kosuri, “but we haven’t yet looked at that.”

But the result does show that DNA synthesis and sequencing technologies have finally progressed to the stage where integrating DNA sequence information into a storage medium is a real possibility, says Dan Gibson at the J. Craig Venter Institute in La Jolla, California, who was part of Venter’s team in 2010. “Cost, speed and instrument size currently make this impractical for general use, but the field is moving fast, and the technology will soon be cheaper, faster and smaller,” he says.

Science, DOI: 10.1126/science.1226355