Researchers Store Full Computer Operating System on DNA

Proving that everything old is new again, researchers are now storing data on the oldest information storage solution there is: DNA.

A pair of researchers at Columbia University and the New York Genome Center (NYGC) have come up with a technique to store massive amounts of data on DNA. The result, according to study coauthor Yaniv Erlich, is the "highest-density data-storage device ever created."

The researchers say DNA is the perfect storage medium: it's ultra-compact and can last hundreds of thousands of years if kept cool and dry, according to a news release from Columbia.

"DNA won't degrade over time like cassette tapes and CDs, and it won't become obsolete—if it does, we have bigger problems," Erlich, a computer science professor at Columbia Engineering, said in a statement.

Erlich and his colleague Dina Zielinski, an associate scientist at NYGC, successfully encoded six files into DNA: a full computer operating system, the 1895 French film Arrival of a train at La Ciotat, a $50 Amazon gift card, a computer virus, a Pioneer plaque and a 1948 study by information theorist Claude Shannon.

They first compressed the files into a master file and split the data into short strings of binary code, made up of ones and zeros. Next, "using an erasure-correcting algorithm called fountain codes, they randomly packaged the strings into so-called droplets, and mapped the ones and zeros in each droplet to the four nucleotide bases in DNA: A, G, C and T," according to the release.
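For readers who want a feel for that encoding step, here is a minimal Python sketch. It is not the researchers' software: the two-bits-per-base table, the one-byte seed, the uniform random choice of segments and the helper names (`bytes_to_dna`, `dna_to_bytes`, `make_droplet`) are all assumptions made for illustration, and a real fountain code draws the number of combined segments from a carefully tuned distribution.

```python
import random

# Assumed mapping of two bits to one base; the release does not give the actual table.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_dna(data: bytes) -> str:
    """Turn every pair of bits into one nucleotide letter."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bytes(strand: str) -> bytes:
    """Reverse the mapping: nucleotide letters back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def make_droplet(segments: list[bytes], seed: int) -> bytes:
    """XOR a seeded random subset of equal-length segments into one payload.

    The seed (0-255 in this sketch) is stored in front of the payload so a
    decoder could repeat the same random selection.
    """
    rng = random.Random(seed)
    count = rng.randint(1, len(segments))           # uniform choice: a simplification
    chosen = rng.sample(range(len(segments)), count)
    payload = bytes(len(segments[0]))                # all-zero starting payload
    for index in chosen:
        payload = bytes(a ^ b for a, b in zip(payload, segments[index]))
    return bytes([seed]) + payload


# Hypothetical usage: three tiny segments packaged into four DNA strings.
segments = [b"Hi", b"DN", b"A!"]
strands = [bytes_to_dna(make_droplet(segments, seed)) for seed in range(4)]
```

Because each droplet mixes several segments, a decoder does not need every strand to survive, only enough of them to untangle the combinations, which is the property that makes erasure-correcting codes attractive for a medium where some molecules are inevitably lost.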

They wound up with a digital list of 72,000 DNA strands and sent it in a text file to Twist Bioscience, a San Francisco DNA synthesis startup that specializes in turning digital data into biological data.

"Two weeks later, they received a vial holding a speck of DNA molecules," the school wrote. "To retrieve their files, they used modern sequencing technology to read the DNA strands, followed by software to translate the genetic code back into binary. They recovered their files with zero errors."

The researchers say this strategy allows for 215 petabytes of data to be stored on a single gram of DNA. The technique comes at a high cost, however, so don't expect it to go mainstream any time soon: the researchers spent $7,000 to synthesize the 2MB of data and another $2,000 to read it.
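To put those figures in perspective, the back-of-the-envelope arithmetic can be sketched in a few lines of Python. The constant and function names are made up for illustration, and the mass estimate assumes decimal units (1 MB = 10^6 bytes, 1 PB = 10^15 bytes) and the stated theoretical density, not the amount of DNA actually in the vial.

```python
SYNTHESIS_COST_USD = 7_000      # cost to write the data, per the article
SEQUENCING_COST_USD = 2_000     # cost to read it back, per the article
DATA_MB = 2                     # size of the stored data, per the article
DENSITY_PB_PER_GRAM = 215       # claimed density, per the article


def cost_per_mb(total_usd: float, megabytes: float) -> float:
    """Average cost in dollars for each megabyte stored."""
    return total_usd / megabytes


write_cost = cost_per_mb(SYNTHESIS_COST_USD, DATA_MB)                         # 3500.0
read_cost = cost_per_mb(SEQUENCING_COST_USD, DATA_MB)                         # 1000.0
round_trip = cost_per_mb(SYNTHESIS_COST_USD + SEQUENCING_COST_USD, DATA_MB)   # 4500.0

# Mass of DNA that 2MB would occupy at the claimed density (assumes decimal units).
grams_needed = (DATA_MB * 10**6) / (DENSITY_PB_PER_GRAM * 10**15)

print(f"write ${write_cost:,.0f}/MB, read ${read_cost:,.0f}/MB, total ${round_trip:,.0f}/MB")
print(f"theoretical mass for {DATA_MB}MB: {grams_needed:.2e} g")
```

At roughly $4,500 per megabyte for a full write-and-read cycle, the price gap with conventional storage is what keeps the approach in the lab for now.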
