DNA Data Storage: Movie Encoded in the Genomes of Living Bacteria.

Digital data storage is shifting from physical devices (tape drives, floppy disks, CDs, DVDs, memory cards, flash drives, hard drives and so on) toward a biological material: DNA. Digital data can be video, audio, images, or alphanumeric text. Physical storage devices have several disadvantages: they are easily damaged, costly to replace, need energy to operate, and offer limited space for very large files. In this digital era, the exponential growth of data calls for alternative technologies that can store the world’s data securely. Because of these problems with current storage devices, researchers are exploring the possibility of using DNA as a storage medium.

DNA is the genetic material found in nearly every cell of all living organisms. It is located mostly in the nucleus, and a small amount is also found in the mitochondria. DNA is nature’s own storage medium: it holds all the information about a living being, written with four building blocks, adenine (A), guanine (G), cytosine (C), and thymine (T). This is similar to the 0s and 1s used in digital devices such as computers and laptops. That is why Microsoft has invested in research on how DNA can be used to store data. The company bought ten million strands of synthetic DNA from Twist Bioscience, as it believes DNA is a high-capacity, highly efficient, long-term storage option that can securely store 1 × 10⁹ TB per gram.
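The comparison between the four bases and binary digits can be made concrete with a small sketch. The two-bits-per-base table below is an assumption chosen for illustration (any fixed assignment would do), and the function names are hypothetical. Practical schemes also add error correction and avoid sequences that are hard to synthesize or read, which this sketch leaves out.

```python
# Assumed mapping: each nucleotide carries two bits.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Turn bytes into a nucleotide string, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(sequence: str) -> bytes:
    """Reverse the mapping: read two bits per base and regroup into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"Hi"
encoded = bytes_to_dna(message)   # "CAGACGGC"
decoded = dna_to_bytes(encoded)   # b"Hi"
```

Under this mapping every byte becomes four bases, so the two-byte message above needs eight nucleotides.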

Mikhail Neiman’s original idea, published in the journal Radiotekhnika in 1964, raised the possibility of recording, storing, and retrieving real data on DNA molecules. Since then, several publications have reported that DNA could be a potential resource for data storage. The study by Erlich and Zielinski, published in the journal Science in 2017, demonstrated a method of archiving data in DNA that is highly robust and approaches the information capacity per nucleotide. Several earlier reports on DNA data storage were based on synthetic DNA, but the new study shows how to store information in the genome of the living bacterium Escherichia coli.


Movie 1: Original images of the galloping horse (left) and the images encoded in bacterial DNA using CRISPR and then recovered (right).

Recently, Harvard Medical School geneticist Seth L. Shipman and his colleagues published their technical achievement in Nature on 12 July 2017. They describe the use of the CRISPR-Cas system to encode the pixel values of black-and-white images and a short movie into the genomes of a population of living bacteria, where the data are captured and stably stored. The study combines the principles of information storage in DNA with DNA-capture systems that function in living cells, creating living organisms that capture, store, and propagate information (images and movies) over time. The researchers used the gene-editing system CRISPR to incorporate an image and a short animated image (GIF) into the genome of E. coli. For the animation, they chose images of a galloping horse and rider taken by the British photographer Eadweard Muybridge, who produced the first stop-motion photographs in the 1870s.

First, they created strands of synthetic DNA that encoded, in the letters G, T, C, and A, the positions and shades of the pixels in an image of a hand and in five pictures of the galloping horse. The pixel values of each image were stored in a nucleotide code distributed over many individual synthetic oligonucleotides. These oligonucleotides were delivered by electroporation into a population of E. coli cells, each harboring a functional CRISPR array and overexpressing the Cas1-Cas2 integrase complex, which allowed the cells to acquire the oligonucleotides into their genomes. The stored data were retrieved by high-throughput sequencing of the bacterial DNA, and the newly acquired oligonucleotides were decoded to reconstruct the original images (Fig. 1) and movie (Mov. 1). The researchers thus reported on what is currently possible and detailed several remarkable innovations, not the least of which is encoding a movie in DNA.
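The idea of spreading pixel values over many oligonucleotides, each recording where its pixels belong, can be sketched in a few lines. This is a simplified illustration and not the encoding used in the study: it assumes an image with only four gray shades (one base per pixel), a three-base address at the start of each oligonucleotide, and hypothetical function names. Shuffling the list stands in for the fact that sequencing returns reads in no particular order.

```python
import random

BASES = "ACGT"  # assumed: the index of a base (0..3) doubles as a gray shade


def int_to_bases(value: int, length: int) -> str:
    """Write a non-negative integer in base 4 using `length` nucleotides."""
    digits = []
    for _ in range(length):
        digits.append(BASES[value % 4])
        value //= 4
    return "".join(reversed(digits))


def bases_to_int(sequence: str) -> int:
    """Read a nucleotide string back as a base-4 integer."""
    value = 0
    for base in sequence:
        value = value * 4 + BASES.index(base)
    return value


def encode_image(pixels, chunk=4, address_len=3):
    """Split a flat list of shades (0..3) into addressed oligonucleotides."""
    oligos = []
    for start in range(0, len(pixels), chunk):
        address = int_to_bases(start // chunk, address_len)
        payload = "".join(BASES[p] for p in pixels[start:start + chunk])
        oligos.append(address + payload)
    return oligos


def decode_image(oligos, address_len=3):
    """Sort reads by address, then turn each payload base back into a shade."""
    ordered = sorted(oligos, key=lambda o: bases_to_int(o[:address_len]))
    return [BASES.index(b) for o in ordered for b in o[address_len:]]


# A 4 x 4 image stored row by row, one row per oligonucleotide.
pixels = [0, 1, 2, 3,
          3, 2, 1, 0,
          0, 0, 3, 3,
          1, 2, 2, 1]
oligos = encode_image(pixels)
random.shuffle(oligos)            # reads come back in arbitrary order
recovered = decode_image(oligos)  # equals the original pixel list
```

Because every oligonucleotide carries its own address, the image can be rebuilt regardless of the order in which the pieces are read. With a three-base address this sketch can label at most 64 pieces.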

Figure 1: Original image of a hand (left) and the image encoded in bacterial DNA and then extracted after many generations of bacterial growth (right).


Advantages of DNA Storage

  • Huge capacity to store data in a small volume.

  • DNA is a stable medium for keeping data.

  • Highly secure, as DNA is invisible to the naked eye.

  • More efficient and easier data extraction.

  • Data can remain durable for future use.


Disadvantages of DNA Storage

  • Very expensive due to the high cost of DNA synthesis.

  • DNA is not re-writable.

  • Requires laboratories equipped with advanced technology.


In summary, the study demonstrates the promising potential of DNA for long-term digital data storage. In the future, DNA storage systems will be used not only for storing our favorite movies and music but also for storing the world’s information.

(Shrestha is a Doctoral Researcher at Tzu Chi University, Taiwan)

Rupendra Shrestha