DNA has many advantages for storing digital data. It’s ultracompact, and it can last hundreds of thousands of years if kept in a cool, dry place. And as long as human societies are reading and writing DNA, they will be able to decode it. “DNA won’t degrade over time like cassette tapes and CDs, and it won’t become obsolete,” says Yaniv Erlich, a computer scientist at Columbia University. And unlike other high-density approaches, such as manipulating individual atoms on a surface, DNA storage can be scaled up, because new technologies can write and read large amounts of DNA at a time.
Scientists have been storing digital data in DNA since 2012. That was when Harvard University geneticists George Church, Sri Kosuri, and colleagues encoded a 52,000-word book in thousands of snippets of DNA, using strands of DNA’s four-letter alphabet of A, G, T, and C to encode the 0s and 1s of the digitized file. Their particular encoding scheme was relatively inefficient, however, and could store only 1.28 petabytes per gram of DNA. Other approaches have done better. But none has been able to store more than half of what researchers think DNA can actually handle, about 1.8 bits of data per nucleotide of DNA. (The number isn’t 2 bits because of rare, but inevitable, DNA writing and reading errors.)
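To see where the 2-bit ceiling comes from, consider a minimal sketch of the simplest possible mapping, in which every pair of bits becomes one of the four letters. This is an illustration only, not the scheme used by the Harvard team or any other group mentioned here; the table `BITS_TO_BASE` and the function names are hypothetical, and the sketch assumes perfectly error-free writing and reading, which real DNA does not offer.

```python
# Hypothetical naive mapping: each pair of bits becomes one nucleotide.
# Assumes no synthesis or sequencing errors, so it carries no redundancy.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a DNA string, two bits per nucleotide."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Invert encode(): read two bits back from each nucleotide."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def bits_per_nucleotide(data: bytes, strand: str) -> float:
    """Information density actually achieved by an encoding."""
    return 8 * len(data) / len(strand)


if __name__ == "__main__":
    message = b"Hi"
    strand = encode(message)
    print(strand)                                # CAGACGGC
    print(decode(strand))                        # b'Hi'
    print(bits_per_nucleotide(message, strand))  # 2.0
```

With four letters, no mapping can exceed the 2 bits per nucleotide this toy version reaches. Practical schemes give up some of that density for redundancy that lets them recover from writing and reading errors, which is why the realistic limit sits lower.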