Researchers code entire movie and operating system into DNA

3 Mar 2017

DNA samples in multichannel pipette. Image: science photo/Shutterstock

DNA itself might remain the same, but the amount of data we can store within the genetic code is growing substantially by the year, as researchers have just demonstrated.

In the near future, the differences between the biological and the machine might not be so clear, particularly with the advent of synthetic biology within the fields of medicine and computer science.

Researchers from the University of Manchester recently demonstrated a concept for a computer that would use DNA molecules to reach speeds beyond even those promised by the fledgling field of quantum computing.

Fitted on to 72,000 DNA strands

However, the area most likely to lead the way in combining DNA with computer science is data storage. Companies such as Helixworks in Cork are already offering people the chance to put a limited amount of data within a piece of genetic code.

Aside from being microscopically small, DNA could allow us to safely store information for hundreds of thousands of years into the future.

Now, researchers at Columbia University and the New York Genome Center (NYGC) have achieved a major storage breakthrough by fitting six large files into 72,000 DNA strands.

These included an 1895 French film called Arrival of a Train at La Ciotat, a full computer operating system, a $50 Amazon gift card, a computer virus, a Pioneer plaque and a 1948 study by information theorist Claude Shannon.

To do this, the team of Yaniv Erlich and Dina Zielinski compressed the files into a single file, and then split that data into short strings of binary code made up of ones and zeros.

Then, using an erasure-correcting algorithm called fountain codes, they randomly packaged the strings into so-called droplets, and mapped the ones and zeros in each droplet to the four nucleotide bases in DNA: A, G, C and T.
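To make those steps concrete, here is a minimal Python sketch of the general idea, not the researchers' actual DNA Fountain software. The two-bits-per-base mapping, the uniform choice of how many segments go into a droplet, and every function name are assumptions made for illustration. A real scheme would use a carefully chosen distribution and would need to store the seed with each strand.

```python
import random

# Assumed mapping: two bits per base. The article only says that the ones
# and zeros are mapped to A, G, C and T; this exact table is hypothetical.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def split_segments(data, size):
    """Split data into fixed-size segments, zero-padding the last one."""
    padded = data + b"\x00" * (-len(data) % size)
    return [padded[i:i + size] for i in range(0, len(padded), size)]


def make_droplet(segments, seed):
    """XOR a seed-chosen random subset of segments into one payload.

    Returns the sorted indices of the chosen segments and the payload.
    The uniform degree choice is a simplification for illustration.
    """
    rng = random.Random(seed)
    degree = rng.randint(1, len(segments))
    chosen = rng.sample(range(len(segments)), degree)
    payload = bytearray(len(segments[0]))
    for idx in chosen:
        for j, byte in enumerate(segments[idx]):
            payload[j] ^= byte
    return sorted(chosen), bytes(payload)


def bytes_to_dna(payload):
    """Translate bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in payload)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand):
    """Translate a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    segments = split_segments(b"Hello, DNA storage!", 4)
    for seed in range(3):
        chosen, payload = make_droplet(segments, seed)
        print(seed, chosen, bytes_to_dna(payload))
```

Because each droplet is built from a random combination of segments, a decoder that collects enough distinct droplets can solve for the original segments even if some strands are lost, which is the property that makes fountain codes attractive here.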

215 petabytes of data on a single gram of DNA

With the data encapsulated within the DNA, the researchers were able to retrieve it using DNA sequencing and software, and translate it back into binary with zero errors in the code.

Using their coding strategy, the researchers said it would be possible to store 215 petabytes of data on a single gram of DNA.

“We believe this is the highest-density data storage device ever created,” said Erlich.

The main barrier to commercialising the technique at the moment is cost: synthesising the DNA cost the researchers $7,000, and retrieving the data cost another $2,000.

A possible solution, Erlich said, would be to produce lower-quality DNA molecules and rely on coding strategies such as DNA Fountain to fix molecular errors.

The pair’s research has been published in the journal Science.