In 60 short years we've discovered how to read, write, and edit DNA. Now could it be the answer to our data storage problems?

Post written by Guest Writer Anthony Loder





Deoxyribonucleic acid (DNA) has acted as nature’s hard drive since life evolved on Earth some 3.5 billion years ago. Its four bases (adenine, thymine, cytosine, and guanine) are the basic building blocks that carry our (and every other known living thing’s) genetic code. The fact that such a small number of molecules can produce such a vast array of organisms, from algae to humans, has interested scientists for decades. As we continue to unravel the mysteries of the genetic code, we can attempt to apply its ingenious methodology to our own needs.

The idea of using DNA as a data storage device is not a novel one. Soviet physicist Mikhail Samoilovich Neiman proposed such a mechanism in the mid-1960s [1]. One of the first actual uses of DNA as a medium for data storage was in 1986, as an art project by MIT researcher Joe Davis titled “Microvenus” [2]. Davis converted the Germanic rune for “female earth” into a simple binary code, which was in turn converted into a DNA sequence and inserted into living Escherichia coli (E. coli) cells. The genetic sequence that made up Davis’s “Microvenus” was a mere 28 base pairs (bps) in length - not exactly an encyclopedia’s worth of information, but it nonetheless helped pave the way for the use of DNA as a storage device. Fortunately, our ability to synthesize DNA in the laboratory has vastly improved since the 1980s.



Fig. 1 The Germanic rune representing “female earth” and Joe Davis’s binary representation of the rune (Source: Anthony Loder)

While Davis was able to write and store data in DNA, he missed a critical piece of the puzzle: retrieval of that data. The ability to reliably read messages encoded in DNA is of paramount importance if such a medium is to be taken seriously as a storage system for our precious data. Rapid advancement in the field of genomic sequencing led to the sequencing of the first bacterial genome (1,830,137 bps) [3] in 1995, followed by the first human genome (more than 3 billion bps) in 2003 [4]. The capability to read much longer sequences of DNA gives scientists the proverbial key to the previously locked messages in the genetic code. We can now write, store, retrieve, and read large quantities of DNA sequences.

In 1999, a team of researchers imagined the use of DNA as a medium for secret messages [5]. They successfully encoded and retrieved the message “JUNE 6 INVASION: NORMANDY” (69 bps) using an encoding scheme similar to the way DNA is translated into proteins (each character of the message was encoded using a unique combination of three bases). Over the next 17 years, many papers were published on this subject, with increasing amounts and types of data being stored. In 2012, a paper came out of Harvard that detailed the use of DNA to store a 53,426-word book [6]. A recently published article about the promises of DNA data storage described how a team wrote and retrieved three images from laboratory-synthesized DNA [7].
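
To make the three-bases-per-character idea concrete, here is a minimal Python sketch. The character set and the codebook below are assumptions made purely for illustration (triplets are simply assigned in alphabetical order); they are not the table used by the researchers in [5].

```python
from itertools import product

BASES = "ACGT"

# Hypothetical 40-symbol character set, chosen for illustration only.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 :.,"

# There are 4**3 = 64 possible triplets, enough for every symbol above.
TRIPLETS = ["".join(t) for t in product(BASES, repeat=3)]

# zip() stops at the shorter input, so only the first 40 triplets are used.
ENCODE = dict(zip(ALPHABET, TRIPLETS))
DECODE = {triplet: char for char, triplet in ENCODE.items()}


def encode(message: str) -> str:
    """Translate each character into its three-base codeword."""
    return "".join(ENCODE[char] for char in message.upper())


def decode(dna: str) -> str:
    """Read the sequence three bases at a time and look up each character."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(DECODE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    secret = encode("JUNE 6 INVASION: NORMANDY")
    print(secret)
    print(decode(secret))
```

In this sketch the encoded sequence is always three bases long for every character in the message, and decoding is just the reverse lookup, much like reading codons one at a time.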



Fig. 2 The 3 images that were successfully recovered after being stored in DNA (Source: Figure 9 from Bornholt, James, et al. [7])

The value of DNA as a storage system comes from two of its properties: extreme information density and extraordinary durability. The information density of DNA is enormous, about 1,000,000,000 GB per cubic millimeter [7]. To put that into perspective, if you wanted to download all of the videos currently on YouTube (roughly 93,540 years’ worth of content) and store this data in DNA, the space required would be about the size of an ordinary playing die. By comparison, Facebook built a 5,700 m² cold storage center to store that same amount of data [8]. Our current methods for data storage are also not built to last. At best, with the newest tape drive technology, stored data is rated to remain readable for just 30 years. DNA, on the other hand, has been storing data for millennia and has a half-life of over 500 years [9].
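
The two figures above lend themselves to some quick back-of-the-envelope arithmetic. The Python sketch below takes the density and half-life quoted in this post at face value and assumes simple exponential decay; the function names are made up for illustration, and real-world decay depends heavily on storage conditions.

```python
# Figures quoted in the text; treated here as rough constants.
DENSITY_GB_PER_MM3 = 1_000_000_000  # about 10^9 GB per cubic millimeter [7]
HALF_LIFE_YEARS = 500               # "over 500 years" [9], used as a lower bound


def volume_mm3(data_gb: float) -> float:
    """Volume of DNA (in cubic millimeters) needed to hold data_gb gigabytes."""
    return data_gb / DENSITY_GB_PER_MM3


def fraction_intact(years: float, half_life: float = HALF_LIFE_YEARS) -> float:
    """Fraction remaining after `years`, assuming simple exponential decay."""
    return 0.5 ** (years / half_life)


if __name__ == "__main__":
    # One billion GB fits in a single cubic millimeter at the quoted density.
    print(volume_mm3(1e9), "mm^3")
    # Compare the 30-year rating of tape with one DNA half-life.
    print(f"after 30 years:  {fraction_intact(30):.1%} intact")
    print(f"after 500 years: {fraction_intact(500):.1%} intact")
```

Under these assumptions, most of the DNA is still intact at the point where tape reaches the end of its rated life.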

DNA as a data storage system is not without its drawbacks. Accessing the data is still quite costly and time-consuming (on the order of hours). The current technology required to read and write DNA is far from perfect and even further from everyday use. Because of these problems, the system is currently being considered for data archiving rather than day-to-day storage. However, if genomic technology continues to progress at even half the rate that it has over the last 25 years, we may find ourselves storing our most personal and valuable information in DNA sooner than we think. As we continue to generate huge amounts of data on a daily basis, our need for an effective and efficient way of storing this information grows. The hyper-dense storing power of DNA could be the future solution.




Anthony Loder studied at Rowan University in Southern New Jersey. He is currently working in the food industry doing QA and R&D. In his spare time, he likes to brew beer, go for bike rides, and participate in MOOCs.

Contact: anthony.d.loder@gmail.com



References:

[1] Neiman, M.S. “Some fundamental issues of microminiaturization.” Radiotekhnika 1 (1964): 3-12.

[2] Davis, Joe. “Microvenus.” Art Journal 55, no. 1 (1996): 70-74.

[3] Fleischmann, Robert D., Mark D. Adams, Owen White, Rebecca A. Clayton, Ewen F. Kirkness, Anthony R. Kerlavage, Carol J. Bult, Jean-Francois Tomb, Brian A. Dougherty, and Joseph M. Merrick. “Whole-genome random sequencing and assembly of Haemophilus influenzae Rd.” Science 269, no. 5223 (1995): 496-512.

[4] Eaton, Lynn. “Human genome project completed.” BMJ: British Medical Journal 326, no. 7394 (2003): 838.

[5] Clelland, Catherine Taylor, Viviana Risca, and Carter Bancroft. “Hiding messages in DNA microdots.” Nature 399, no. 6736 (1999): 533-534.

[6] Church, George M., Yuan Gao, and Sriram Kosuri. “Next-generation digital information storage in DNA.” Science 337, no. 6102 (2012): 1628-1628.

[7] Bornholt, James, Randolph Lopez, Douglas M. Carmean, Luis Ceze, Georg Seelig, and Karin Strauss. “A DNA-based archival storage system.” Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems pp. 637-649. ACM, 2016.

[8] Miller, Rich. “Facebook Builds Exabyte Data Centers for Cold Storage.” Data Center Knowledge. January 18, 2013. Accessed May 15, 2016. http://www.datacenterknowledge.com/archives/2013/01/18/facebook-builds-new-data-centers-for-cold-storage/.

[9] Allentoft, Morten E., Matthew Collins, David Harker, James Haile, Charlotte L. Oskam, Marie L. Hale, Paula F. Campos et al. “The half-life of DNA in bone: measuring decay kinetics in 158 dated fossils.” Proceedings of the Royal Society of London B: Biological Sciences 279, no. 1748 (2012): 4724-4733.
