DNA Storage: A Way Out of Humanity's Data Crisis?

2025-01-28 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

A thought experiment: if the earth faced an imminent, devastating interstellar disaster and humanity wanted to preserve as much of the planet's life and civilization as possible, what could we do with the means available today?

Stopping the earth's rotation and fleeing the solar system, as in the fiction of Da Liu (Liu Cixin), would probably come too late. And if we tried a Noah's Ark approach, loading people, animals, plants, and human knowledge onto spaceships, existing rocket payload capacity might not hold even one billionth of it.

To preserve the earth's creatures as completely and for as long as possible, we would only need to collect and package the DNA sequence information of every species, which can be preserved for hundreds of thousands of years in the low-temperature environment of a spaceship. But what about the information of human civilization? Its most efficient form is data, which today is stored mainly on hard drives and optical discs.

Given the weight and limited data density of such media, we are discouraged again. Worse, the hard drives or discs would fail and the data would be lost before the spacecraft had even left the solar system.

So can DNA be used as a hard disk to store data and information? The answer is yes.

DNA is the oldest medium for storing the information of life on the planet, and it can also serve as a storage medium for digital data, with a storage density and service life far beyond existing disk-based solutions. DNA storage is therefore increasingly regarded as the future of data storage and as the most promising answer to humanity's data storage crisis.

How exactly does DNA storage work? What stage of development has it reached? What obstacles stand in the way of commercial use? Let us answer these questions one by one.

How does DNA storage work?

Before we understand how DNA storage works, let's take a brief look at the principles of two existing solutions, magnetic storage and optical storage.

Magnetic storage works by coating a metal platter with a magnetic medium; under an electric current, an electromagnetic effect magnetizes the medium so that it can store and represent binary 0s and 1s. The advantage of magnetic hard disks is fast writing and reading; the disadvantage is relatively low data density for their volume and weight. After 60 years of development, a 3.5-inch hard drive can store 3 TB of data.

Optical storage records digitally encoded video and audio in grooves on the surface of a disc; a laser then reads the data out of these grooves for transfer or playback. Optical storage is also approaching its limits: storing more data requires smaller, more tightly packed grooves, which in turn requires a more precise laser. At present, a single-layer Blu-ray disc can store more than 25 GB. If an ultraviolet laser is successfully developed, disc capacity could reach 500 GB.

What are the advantages of DNA storage over magnetic storage and optical storage?

The first is space savings. Compared with the three-dimensional double-helix structure of DNA, the capacity of these single-layer, flat storage methods falls short by several orders of magnitude. DNA itself is extremely small and three-dimensional, so its data density per unit of space is very high. To take a simple example, a gram of DNA is smaller than a dewdrop on your fingertip, yet it can store 700 TB of data, equivalent to 14,000 Blu-ray discs of 50 GB each, or 233 hard drives of 3 TB each (about 151 kg in total).
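The comparison above is simple division, and a short Python sketch makes it easy to check. The capacities are the figures quoted in the text; the use of decimal units (1 TB = 1000 GB) and all variable names are assumptions made for this illustration.

```python
# Figures quoted in the text; decimal units (1 TB = 1000 GB) are an assumption.
DNA_CAPACITY_GB_PER_GRAM = 700 * 1000   # 700 TB in one gram of DNA
BLU_RAY_GB = 50                         # one 50 GB Blu-ray disc
HARD_DRIVE_GB = 3 * 1000                # one 3 TB hard drive


def equivalent_units(total_gb: float, unit_gb: float) -> float:
    """Return how many storage units of size unit_gb are needed to hold total_gb."""
    return total_gb / unit_gb


blu_ray_discs = equivalent_units(DNA_CAPACITY_GB_PER_GRAM, BLU_RAY_GB)
hard_drives = equivalent_units(DNA_CAPACITY_GB_PER_GRAM, HARD_DRIVE_GB)

print(f"Blu-ray discs per gram of DNA: {blu_ray_discs:.0f}")
print(f"3 TB hard drives per gram of DNA: {hard_drives:.1f}")
```

The result is 14,000 discs and a little over 233 hard drives, which matches the figures in the paragraph.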

It is also very energy efficient. Existing storage methods, such as data centers, consume large amounts of monocrystalline silicon and electricity. DNA only needs to be kept in a cool, dry place and requires essentially no additional maintenance. Even if the DNA must be frozen, the resources and energy consumed are almost negligible.

Most important of all, it lasts a very long time. Today's high-density storage media decay over time; the longest-lived of them is magnetic tape, which lasts only about 50 years, and other media last even less. By comparison, the shelf life of DNA is measured in centuries, and if frozen it can be preserved for thousands or even tens of thousands of years.

That sounds like a plan for saving human civilization, but how exactly does DNA storage do it?

As is well known, DNA is built from four nitrogen-containing bases, adenine (A), thymine (T), cytosine (C), and guanine (G), which pair complementarily. Scientists assign binary values to the bases in two groups (A and C for one value, G and T for the other), and then synthesize the DNA sequence on a microfluidic chip so that each position in the sequence matches the corresponding data. In this way the bases encode a combination of 1s and 0s, and the sequence information of DNA can express binary language.

Once the binary data has been written into a DNA sequence, the "DNA hard drive" can be stored in a low-temperature environment. To read the data, we only need to sequence the target DNA, convert the bases back into binary code, and then decode it to recover ordinary data.
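The write and read steps described above can be sketched in a few lines of Python. This is a toy model, not any laboratory's actual pipeline: it assumes that A and C stand for 0 and that G and T stand for 1, and it picks between the two candidate bases so that the same base never appears twice in a row. The function names and that tie-breaking rule are illustrative assumptions.

```python
ZERO_BASES = "AC"  # assumption: either base represents binary 0
ONE_BASES = "GT"   # assumption: either base represents binary 1


def encode(data: bytes) -> str:
    """Turn bytes into a base sequence, one bit per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    strand = []
    for bit in bits:
        options = ZERO_BASES if bit == "0" else ONE_BASES
        base = options[0]
        # Switch to the alternative base if it would repeat the previous one.
        if strand and strand[-1] == base:
            base = options[1]
        strand.append(base)
    return "".join(strand)


def decode(strand: str) -> bytes:
    """Turn a base sequence produced by encode() back into bytes."""
    bits = []
    for base in strand:
        if base in ZERO_BASES:
            bits.append("0")
        elif base in ONE_BASES:
            bits.append("1")
        else:
            raise ValueError(f"unexpected base: {base!r}")
    joined = "".join(bits)
    return bytes(int(joined[i:i + 8], 2) for i in range(0, len(joined), 8))


message = b"DNA"
strand = encode(message)
print(strand)
print(decode(strand))
```

Storing one bit per base uses only half of what four symbols could carry, but leaving two bases available for each bit makes it easy to avoid runs of identical bases.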

The principle is very simple, but how do scientists do it? It's time to briefly review the history of DNA storage technology.

How did DNA storage evolve to where it is now?

The first to come up with this method was Joe Davis, an artist who worked with Harvard researchers in 1988 to convert a 7×5 pixel image called Microvenus into a 35-base DNA sequence and insert it into E. coli, writing information that did not arise from natural evolution into DNA for the first time.

(Microvenus stands for women and the earth)

In 2010, the American synthetic biologist Craig Venter led a team of researchers that chemically synthesized an entire mycoplasma genome, nicknamed "Synthia", and playfully encoded the names of the researchers, the institute's website address, and a line from the Irish writer James Joyce into the newly synthesized DNA.

In 2011, a team led by George Church, a synthetic biologist at Harvard University, together with Sriram Kosuri of the University of California and Yuan Gao, a genomics expert at Johns Hopkins University, conducted the first proof-of-concept experiment. The team used short DNA fragments to encode one of Church's books, about 659 KB of data.

In 2013, Nick Goldman of the European Bioinformatics Institute (EBI) and his team encoded five files into DNA fragments, including Shakespeare's sonnets, Martin Luther King's "I Have a Dream" speech, and a copy of Watson and Crick's paper on the DNA double helix. At 739 KB, it was the largest DNA storage file at that time.

In 2016, Microsoft and the University of Washington used DNA storage technology to store about 200 MB of data, a leap forward for DNA information storage.

In July 2017, Nature published a study of DNA storage in living cells by Seth Shipman and George Church of Harvard Medical School. They wrote a 130-year-old black-and-white film of a running horse into the DNA of E. coli. Although the E. coli carried this "strange" DNA, it survived and reproduced normally, and each reproduction copied the data. The film stored in the genome was preserved intact in each generation of E. coli.

However, because cell replication, division, and death carry a risk of information errors that threatens long-term data security, most DNA used to store information is kept as dry DNA powder, and research has shifted from living-cell storage to synthetic DNA storage.

In the same year, Columbia University and the New York Genome Center published an efficient DNA storage strategy called "DNA Fountain" in the journal Science. The technique shows how to make the most of DNA's storage potential: it packs large amounts of information into the four bases of DNA at 1.6 bits per nucleotide, 60% more than previous methods and close to the theoretical limit (1.8 bits). With this method, one gram of DNA can store 215 PB of data, equivalent to 220 million movies.
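The density figures in this paragraph can be related to each other with a little arithmetic. The sketch below uses only the numbers quoted above plus one standard fact: four equally likely symbols carry at most log2(4) = 2 bits each. That the quoted limit of 1.8 bits is lower than 2 is presumably due to practical constraints on which sequences can be synthesized and read reliably. The variable names are illustrative.

```python
import math

NUM_BASES = 4
ACHIEVED_BITS_PER_NT = 1.6   # DNA Fountain figure quoted in the text
QUOTED_LIMIT_BITS = 1.8      # theoretical limit quoted in the text
IMPROVEMENT = 0.60           # "60% more than previous methods"

# Upper bound for any four-symbol alphabet with no constraints at all.
unconstrained_limit = math.log2(NUM_BASES)

# How close the achieved density is to the quoted limit.
fraction_of_limit = ACHIEVED_BITS_PER_NT / QUOTED_LIMIT_BITS

# Density of the earlier methods implied by the 60% improvement.
previous_bits_per_nt = ACHIEVED_BITS_PER_NT / (1 + IMPROVEMENT)

print(f"Unconstrained limit: {unconstrained_limit:.1f} bits per nucleotide")
print(f"Fraction of quoted limit reached: {fraction_of_limit:.0%}")
print(f"Implied earlier density: {previous_bits_per_nt:.1f} bits per nucleotide")
```

The implied earlier density of about one bit per nucleotide is consistent with the one-bit-per-base mapping described in the previous section.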

In 2018, researchers at the Waterford Institute of Technology (WIT) in Ireland developed a new DNA storage method that can store 1 ZB of data in 1 gram of E. coli DNA.

In 2019, Church's team published the results of another experiment in the journal Science. They encoded Church's book of about 53,400 words, "Regenesis: How Synthetic Biology Will Reinvent Nature and Ourselves," together with 11 pictures and a JavaScript program, into a DNA microchip weighing less than one billionth of a gram, and successfully read the book back by DNA sequencing.

The rapid progress of this research also shows that DNA synthesis (data writing) and DNA sequencing (data reading) technologies are maturing. At the same time, problems remain in the DNA encoding process, such as write and read speed and cost, and DNA storage is still on the road to commercialization.

Problems and Progress of commercialization of DNA Storage

In the lab, DNA storage does not seem to be complex, but there are still some problems in commercialization.

First, writing and reading are very slow. Unlike the electromagnetic signals used on disks, DNA synthesis depends on a series of chemical reactions. Writing 200 MB of data to disk takes less than a second; synthesizing it as DNA takes almost three weeks.
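To put the gap in perspective, the two write times can be converted into throughput. This sketch takes the figures above at face value, treating "less than a second" as exactly one second and "almost three weeks" as exactly 21 days; both roundings and all names are assumptions.

```python
DATA_BYTES = 200 * 10**6              # 200 MB, decimal units assumed
DISK_SECONDS = 1                      # "less than a second", rounded up
DNA_SECONDS = 3 * 7 * 24 * 60 * 60    # "almost three weeks", rounded up


def throughput(num_bytes: int, seconds: float) -> float:
    """Return the average write rate in bytes per second."""
    return num_bytes / seconds


disk_rate = throughput(DATA_BYTES, DISK_SECONDS)
dna_rate = throughput(DATA_BYTES, DNA_SECONDS)
slowdown = disk_rate / dna_rate

print(f"Disk: {disk_rate / 10**6:.0f} MB/s")
print(f"DNA synthesis: {dna_rate:.0f} bytes/s")
print(f"DNA synthesis is about {slowdown:,.0f} times slower")
```

Under these assumptions DNA synthesis writes on the order of a hundred bytes per second, more than a million times slower than the disk.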

Second, DNA media cannot be overwritten or rewritten. Once information is saved in DNA, it generally cannot be modified. To read a file, you need to sequence all of the stored information and then transcode it.

Third, the accuracy of data storage needs to improve. At present, repeated reading in DNA sequencing leads to a high probability of read errors.

Fourth, random reads and writes are difficult. Current DNA synthesis technology cannot produce long DNA molecules in one pass; it can only synthesize large numbers of short fragments. This makes it hard to quickly access specific data in a mixture of small DNA fragments.

Last but not least, DNA storage costs too much. For example, storing 200 MB of data in DNA currently costs $800,000, while storing it on electronic devices costs less than $1.

But as mentioned above, on a longer time scale and under pressure on data storage space, DNA shows its unique advantages: high storage density, low energy use and environmental friendliness, and very long-term stability. As storage and reading technology develops, DNA encoding and sequencing become more efficient, and costs fall sharply, DNA storage will not be far from commercial application.
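Expressed per megabyte, the cost gap looks like this. The sketch uses the two costs quoted above and, for comparison, the future target of 0.001 cents per MB that Catalog is cited as estimating later in this article. Treating "less than $1" as exactly $1 is an assumption, as are the variable names.

```python
DATA_MB = 200
DNA_COST_USD = 800_000
ELECTRONIC_COST_USD = 1.0                  # "less than $1", used as an upper bound
CATALOG_TARGET_USD_PER_MB = 0.001 / 100    # 0.001 cents, quoted later in the article

dna_cost_per_mb = DNA_COST_USD / DATA_MB
electronic_cost_per_mb = ELECTRONIC_COST_USD / DATA_MB
cost_ratio = dna_cost_per_mb / electronic_cost_per_mb
required_reduction = dna_cost_per_mb / CATALOG_TARGET_USD_PER_MB

print(f"DNA: ${dna_cost_per_mb:,.0f} per MB")
print(f"Electronic: ${electronic_cost_per_mb:.3f} per MB")
print(f"DNA is at least {cost_ratio:,.0f} times more expensive")
print(f"Reduction needed to reach the target: {required_reduction:,.0f}x")
```

On these figures DNA storage costs about $4,000 per MB, so reaching the quoted target would require a price reduction of several hundred million times.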

So, what progress has been made in commercialization?

In 2015, Microsoft and the University of Washington jointly published a method for reading information at a specific location: tracking tags are added to the DNA strands. These tags work like an index, so the data they mark can be read without waiting for all of the DNA to be sequenced each time.
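The idea of tags acting as an index can be modeled in software. The sketch below is a conceptual illustration only, not the published laboratory procedure: each fragment carries a tag shared by all fragments of the same file plus a position number, and a read selects only the fragments with the wanted tag instead of processing the whole pool. The `Fragment` class, the function names, and the example tags are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    tag: str      # hypothetical address tag shared by all fragments of one file
    index: int    # position of this fragment within its file
    payload: str  # bases carrying the data


def build_pool(files: dict[str, list[str]], seed: int = 0) -> list[Fragment]:
    """Tag every chunk of every file and mix them into one unordered pool."""
    pool = [
        Fragment(tag, index, chunk)
        for tag, chunks in files.items()
        for index, chunk in enumerate(chunks)
    ]
    random.Random(seed).shuffle(pool)  # a real DNA pool has no inherent order
    return pool


def read_file(pool: list[Fragment], tag: str) -> str:
    """Select only fragments carrying the tag, then reassemble them in order."""
    selected = sorted((f for f in pool if f.tag == tag), key=lambda f: f.index)
    return "".join(f.payload for f in selected)


pool = build_pool({
    "ACGT": ["AAC", "GGT"],
    "TGCA": ["CCA", "TTG", "ACA"],
})
print(read_file(pool, "TGCA"))
```

The position number matters as much as the tag: because the pool is an unordered mixture of short fragments, the original order has to travel with each fragment.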

In 2018, in another breakthrough in reading technology, Microsoft developed a "nanopore" reading technique in which DNA strands are squeezed through a tiny nanopore so that each base can be read. The technique greatly reduces the size of the reading device, which can be a palm-sized USB device, but the reading speed is only a few KB per second, which is quite slow.

In March 2019, the Microsoft team reported a new development in Scientific Reports, a Nature Research journal: the world's first automated DNA storage system. Compared with manual DNA synthesis and sequencing, automated DNA encoding and decoding is the path to future commercialization.

In addition, Catalog, an American startup founded in 2016, is also trying to reduce the time and cost of writing and reading DNA.

Last year, Catalog stored a total of 16 GB of English Wikipedia text in DNA molecules. They used a DNA writer device to record the data in DNA at a speed of 4 Mbps. This means that 125 GB can be recorded in a day, about the amount of storage in a high-end phone. This speed is already three times the storage speed of previous studies.

At present, Catalog uses prefabricated DNA strands 20 to 30 base pairs long, which enzymes join together so that more data can be stored. These fragments are arranged much like the 26 letters of the English alphabet, which can in theory form countless combinations. Catalog estimates that storing 1 MB of data in DNA will cost less than 0.001 cents in the future.

Of course, if the startup can really significantly reduce costs in the future, it could pave the way for the commercialization of DNA data storage.

In 2019, DNA data storage technology was among the top ten emerging technologies jointly released by Scientific American and the World Economic Forum.

It is predictable that magnetic and optical storage will remain the mainstream of data storage for some time. However, even without an extreme end-of-the-earth scenario, the proliferation of data in recent years means that humanity faces a serious shortage of data storage space. At the same time, surging demand for data storage has driven up the use of silicon wafers, along with environmental pollution and the consumption of water and energy.

The implementation of DNA storage technology will alleviate the capacity problem of traditional storage to a certain extent, and greatly reduce the consumption of electronic components and energy.

Of course, in access technology and cost control, carbon-based storage, represented by DNA storage, still has a long way to go, but as commercialization progresses, its adoption at scale will accelerate. The history of data storage shows that changes in storage media keep coming and keep accelerating, and DNA storage deserves to become a technical direction of attention and research in our country as well.

© 2024 shulou.com SLNews company. All rights reserved.
