An In-Depth Look at Dragonfly, Alibaba's Cloud-Native Image Distribution System

2025-04-05 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

Dragonfly is a cloud-native image distribution system open-sourced by Alibaba. It mainly solves the image distribution problem in distributed application orchestration systems built around Kubernetes. As enterprises digitize, industry applications are moving toward microservice architectures and relying on cloud platforms to streamline business management. Dragonfly, which originated at Alibaba, grew out of real production scenarios and addresses the three major problems of cloud-native image distribution: efficiency, flow control and security.

On November 14, 2018, Dragonfly officially joined the CNCF as a Sandbox Level Project.

The Origin of Dragonfly

With the explosive growth of Alibaba Group's business, the release system was handling more than 20,000 releases per day on average by 2015, many applications had grown beyond 10,000 machines, and the release failure rate began to rise. The root cause was that the release process pulls a large number of files, and the file servers could not handle that many requests. The first idea, of course, was to scale out the servers. After scaling out, however, back-end storage became the bottleneck, and the cost of further expansion was enormous (by our calculation, at least 2,000 high-end physical machines would have been needed to keep up with the business, with no upper bound in sight). In addition, the large number of client requests coming from different IDCs consumed a huge amount of network bandwidth and caused network congestion.

At the same time, many of Alibaba's businesses were going international and a large number of applications were deployed overseas. Overseas servers had to download from origin servers back in China, which wasted a lot of international bandwidth and was also very slow. When large files were transferred over a poor network, any failure meant starting over from the beginning, so efficiency was extremely low.

So we naturally thought of P2P technology. P2P is not new, and at the time we evaluated many systems from China and abroad, but we concluded that none of them met our expectations for scale and stability. That is how Dragonfly was born.

What problems can Dragonfly solve?

As a general-purpose file distribution system, Dragonfly mainly solves the following problems:

Large-scale download problem: software packages or image files must be downloaded during an application release. Suppose 1,000 machines are released at the same time and each pulls a 500 MB image directly from the image registry. Even if the registry has 10,000 Mbps of bandwidth, the transfer takes on the order of 10 minutes under ideal conditions (about 400 seconds of raw transfer time before any protocol overhead), and the registry is likely to have collapsed under the load well before that.

Long-distance transmission problem: cross-regional and international applications such as AliExpress need to be deployed not only in China but also in the United States and Russia, while the software packages are usually stored in a single region, such as Shanghai. Machines in the United States or Russia therefore have to download packages over the international network, which has high latency and is extremely unstable. This seriously hurts transmission efficiency, keeps new features or fixes from going live in time, and may even lead to business failures.

Bandwidth cost problem: besides transmission efficiency, high bandwidth cost is also a serious problem. For many Internet companies, especially video-related ones, bandwidth often accounts for a large part of overall cost.

Secure transmission: according to statistics, network security problems cause annual economic losses of as much as 450 billion US dollars, so security must be treated as the first lifeline. If the file transfer process has no security mechanism, file contents can easily be sniffed. If a file contains data such as account information or secret keys, the consequences of interception are unimaginable.
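The large-scale download scenario above (1,000 machines, a 500 MB image, a 10,000 Mbps registry) can be checked with a little arithmetic. The sketch below assumes ideal conditions: every bit of registry bandwidth is usable and there is no protocol overhead. The function name is illustrative only.

```python
def direct_download_seconds(machines: int, file_mb: float, registry_mbps: float) -> float:
    """Ideal time for every machine to pull the file straight from the registry."""
    total_megabits = machines * file_mb * 8  # megabytes -> megabits
    return total_megabits / registry_mbps


seconds = direct_download_seconds(machines=1000, file_mb=500, registry_mbps=10_000)
print(f"{seconds:.0f} s (~{seconds / 60:.1f} min)")  # prints "400 s (~6.7 min)"
```

The raw figure of about 400 seconds is a theoretical floor. With connection setup, retries and a registry that is not perfectly saturated, a real transfer would plausibly land in the 10-minute range quoted above, or fail outright.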

How does Dragonfly solve these problems?

Solve the problem of large-scale image download through P2P technology

The principle is as follows:

Several concepts in the figure above need to be explained:

PouchContainer: Alibaba Group's open-source, efficient, lightweight, enterprise-grade rich container engine.

Registry: the repository for container images. Each image consists of multiple image layers, and each layer is stored as an ordinary file.

Block: when a layer file is downloaded through Dragonfly, the SuperNode splits the whole file into blocks. The blocks held by the SuperNode are called seed blocks. Seed blocks are downloaded by a few initial clients and then propagate quickly among all clients. The block size is calculated dynamically.

SuperNode: the server side of Dragonfly. It is mainly responsible for the life-cycle management of seed blocks, for building the P2P network, and for scheduling clients to transmit designated blocks to each other.

DFget: the Dragonfly client, installed on each host. It is mainly responsible for uploading and downloading blocks and for interacting with the container daemon.

Peer: hosts downloading the same file are peers of each other.
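To make the idea of dynamically sized blocks (chunks) concrete, here is a minimal sketch of splitting a layer file into byte ranges. The sizing rule in `block_size_for` (a 4 MiB base that grows with file size and is capped at 15 MiB) is a hypothetical illustration; the article does not say how Dragonfly actually computes the size.

```python
MIB = 1024 * 1024


def block_size_for(file_size: int) -> int:
    """Hypothetical sizing rule: 4 MiB base, growing with file size, capped at 15 MiB."""
    if file_size <= 200 * MIB:
        return 4 * MIB
    extra = (file_size // (100 * MIB) - 2) * MIB
    return min(4 * MIB + extra, 15 * MIB)


def split_into_blocks(file_size: int) -> list[tuple[int, int]]:
    """Return (start, end) byte ranges, end exclusive, that cover the whole file."""
    size = block_size_for(file_size)
    return [(start, min(start + size, file_size)) for start in range(0, file_size, size)]


blocks = split_into_blocks(500 * MIB)
print(len(blocks), blocks[0], blocks[-1])
```

Larger blocks mean fewer scheduling round trips for big files, while smaller blocks let a seed block start propagating sooner. Any real rule has to trade these off.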

The main download process is as follows:

First, PouchContainer issues an image pull command, which is intercepted by the DFget proxy.

DFget then sends a scheduling request to the SuperNode.

On receiving the request, the SuperNode checks whether the file is already cached locally. If not, it downloads the file from the Registry and generates seed blocks (a seed block can be propagated as soon as it is generated, without waiting for the SuperNode to finish downloading the whole file). If the file is already cached, the SuperNode generates the block download task directly.

The client parses the task and downloads blocks from other Peers or from the SuperNode. Once all blocks of a layer have been downloaded, the layer is complete and is handed to the container engine. Once all layers are complete, the whole image has been downloaded.
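The scheduling step in this flow can be illustrated with a toy simulation. This is a sketch under strong assumptions, not Dragonfly's scheduler: `ToySuperNode` and `simulate` are hypothetical names, downloads are assumed to succeed instantly, and the source chosen is simply the most recent peer that holds the block.

```python
from collections import defaultdict


class ToySuperNode:
    """Tracks which peers hold which blocks and tells clients where to fetch from."""

    def __init__(self) -> None:
        self.holders = defaultdict(list)  # block index -> peers that already have it

    def schedule(self, peer: str, block: int) -> str:
        """Pick a source for `block`: another peer if possible, else the SuperNode."""
        candidates = [p for p in self.holders[block] if p != peer]
        source = candidates[-1] if candidates else "supernode"
        # Assumption: the download succeeds, so the peer can now serve this block.
        self.holders[block].append(peer)
        return source


def simulate(peers: list[str], block_count: int) -> dict[str, int]:
    """Count how many blocks each source served when every peer pulls every block."""
    node = ToySuperNode()
    served = defaultdict(int)
    for peer in peers:
        for block in range(block_count):
            served[node.schedule(peer, block)] += 1
    return dict(served)


print(simulate(["a", "b", "c"], block_count=4))
```

In this toy model the SuperNode serves each block exactly once, no matter how many peers join; every later request is satisfied by another peer. That is the intuition behind the registry no longer being the bandwidth bottleneck.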

With this P2P approach, the bandwidth bottleneck of the image registry is removed, the hardware resources and network capacity of every Peer are put to use, and the larger the scale, the faster the transmission.

The Dragonfly architecture requires no changes to the container technology stack, so it can seamlessly give containers P2P image distribution capability and greatly improve file distribution efficiency.

Solve the problem of long-distance transmission by combining CDN and prefetch technology

Through CDN caching, each client can download seed blocks from a nearby SuperNode without transferring data across regions. The principle of CDN caching is roughly as follows:

The first requester of a file triggers a check, which computes the cache location from the request information. If no cache exists, an origin-pull synchronization is triggered to generate seed blocks. Otherwise, a HEAD request carrying the If-Modified-Since field is sent to the origin server; the value of this field is the file's last modification time as previously returned by the origin server. If the response code is 304, the file on the origin server has not been modified and the cache is valid; the meta-information of the cached file is then used to determine whether the file is complete. If it is complete, the cache hits in full. Otherwise, the remaining parts are downloaded in segments by resuming from the breakpoint. Resuming requires the origin server to support segmented (range) downloads; if it does not, the entire file must be synchronized again. If the response code of the HEAD request is 200, the file on the origin server has been modified and the cache is invalid, so an origin-pull synchronization is needed. If the response code is neither 304 nor 200, the origin server is abnormal or the address is invalid, and the download task fails immediately.
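The cache check described above is essentially a small decision table, sketched here as a pure function. The names `CacheAction` and `decide_cache_action` are hypothetical, and the caller is assumed to have already sent the HEAD request with If-Modified-Since and to pass in its status code.

```python
from enum import Enum
from typing import Optional


class CacheAction(Enum):
    FULL_SYNC = "pull the whole file from the origin"
    CACHE_HIT = "serve entirely from the cache"
    RESUME = "download only the missing segments"
    FAIL = "fail the download task"


def decide_cache_action(
    cache_exists: bool,
    head_status: Optional[int] = None,
    cache_complete: bool = False,
    origin_supports_range: bool = False,
) -> CacheAction:
    """Map the cache state and the HEAD response code to the next step."""
    if not cache_exists:
        return CacheAction.FULL_SYNC  # nothing cached: origin-pull and generate seed blocks
    if head_status == 304:  # origin file unchanged, cache still valid
        if cache_complete:
            return CacheAction.CACHE_HIT
        # Resuming needs range support on the origin; otherwise start over.
        return CacheAction.RESUME if origin_supports_range else CacheAction.FULL_SYNC
    if head_status == 200:  # origin file changed, cache invalid
        return CacheAction.FULL_SYNC
    return CacheAction.FAIL  # origin abnormal or address invalid


print(decide_cache_action(True, 304, cache_complete=False, origin_supports_range=True))
```

Keeping the decision separate from the network call makes each branch easy to test without a live origin server.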

CDN caching lets clients download from a nearby node instead of pulling from the origin. On a cache miss, however, the SuperNode's own origin-pull synchronization is very inefficient over cross-region, long-distance links, which directly affects overall distribution efficiency. To solve this, Dragonfly uses an automatic layered prefetch mechanism to maximize the cache hit rate. The general principle is as follows:

When an image is pushed to the Registry with the Push command, each layer that is pushed immediately triggers a P2P synchronization of that layer to the SuperNode. This makes full use of the gap between a user's Push and Pull operations (about 10 minutes) to synchronize every layer of the image to the SuperNode. When the user later runs the Pull command, the cached files on the SuperNode can be used directly, and the long-distance transmission problem does not arise.

Solve the problem of bandwidth cost through dynamic compression and intelligent scheduling

Dynamic compression applies a compression strategy to the parts of a file most worth compressing, without affecting the normal operation of the SuperNode or the Peers. This saves a great deal of network bandwidth and further improves distribution speed. Compared with traditional native HTTP compression, dynamic compression has the following advantages:

First, as the name suggests, it is dynamic: compression is enabled only when the load on the SuperNode and the Peer is normal, only the blocks most worth compressing are compressed, and the compression strategy is determined dynamically. In addition, multi-threaded compression greatly increases compression throughput, and thanks to the SuperNode's cache a file needs to be compressed only once for the entire download process. The resulting compression benefit is at least 10 times that of native HTTP compression.
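A minimal sketch of a load-aware, per-block compression decision follows. It is only an illustration of the idea: the thresholds `max_load` and `min_saving`, the sampling approach and the use of zlib are all assumptions, not details taken from Dragonfly.

```python
import os
import zlib


def maybe_compress(
    block: bytes,
    load: float,
    max_load: float = 0.7,    # assumed threshold: skip compression on a busy node
    min_saving: float = 0.2,  # assumed threshold: need at least 20% size reduction
    sample_size: int = 4096,
) -> tuple[bool, bytes]:
    """Return (was_compressed, payload) for one block."""
    if not block or load > max_load:
        return False, block
    # Estimate compressibility on a small prefix before paying for the full block.
    sample = block[:sample_size]
    saving = 1 - len(zlib.compress(sample)) / len(sample)
    if saving < min_saving:
        return False, block
    return True, zlib.compress(block)


text_block = b"dragonfly " * 10_000
random_block = os.urandom(8192)
print(maybe_compress(text_block, load=0.3)[0])    # repetitive data on an idle node
print(maybe_compress(random_block, load=0.3)[0])  # incompressible data
print(maybe_compress(text_block, load=0.9)[0])    # busy node
```

Because the SuperNode caches the result, a compressed block would only need to be produced once and could then be served to every Peer, which is where the benefit over per-request HTTP compression would come from.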

Beyond dynamic compression, the SuperNode's task scheduling lets Peers under the same network device transfer to each other wherever possible, reducing traffic across network devices and machine rooms and further lowering bandwidth cost.
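Locality-aware scheduling of this kind can be sketched as a ranking over candidate Peers. The `PeerInfo` fields (`switch`, `datacenter`) and the three-level ranking are hypothetical; the article does not describe what topology information the SuperNode actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerInfo:
    name: str
    switch: str      # network device the host is attached to (hypothetical field)
    datacenter: str  # machine room / IDC (hypothetical field)


def locality_rank(client: PeerInfo, candidate: PeerInfo) -> int:
    """Lower is closer: 0 = same network device, 1 = same machine room, 2 = elsewhere."""
    if candidate.datacenter == client.datacenter:
        return 0 if candidate.switch == client.switch else 1
    return 2


def pick_sources(client: PeerInfo, candidates: list[PeerInfo], limit: int = 2) -> list[PeerInfo]:
    """Prefer the closest peers that hold the block, keeping at most `limit` of them."""
    others = [c for c in candidates if c != client]
    return sorted(others, key=lambda c: locality_rank(client, c))[:limit]


client = PeerInfo("client", "sw1", "dc1")
holders = [
    PeerInfo("far", "sw9", "dc2"),
    PeerInfo("near", "sw1", "dc1"),
    PeerInfo("mid", "sw2", "dc1"),
]
print([p.name for p in pick_sources(client, holders)])  # prints ['near', 'mid']
```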

Solve the problem of secure transmission through encryption plug-in

When downloading sensitive files (such as secret key files or account data), transfer security must be effectively guaranteed. In this regard, Dragonfly mainly does the following:

Supports passing HTTP headers, to satisfy download requests that require permission verification through headers.

Packages data blocks for transmission using a self-developed data storage protocol; the packaged data will be further encrypted in the future.

Will soon support a security encryption plug-in.

Strictly prevents data tampering through multiple verification mechanisms.
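One common way to layer several integrity checks is to verify a digest per block and then a digest over the reassembled file. The sketch below shows that pattern; the choice of SHA-256 and the two-level layout are assumptions for illustration, since the article does not specify which checks Dragonfly performs.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex digest of a byte string (SHA-256 chosen here as an assumption)."""
    return hashlib.sha256(data).hexdigest()


def verify_blocks(blocks: list[bytes], block_digests: list[str], file_digest: str) -> bool:
    """Check each block against its digest, then the reassembled file as a whole."""
    if len(blocks) != len(block_digests):
        return False  # a block is missing or extra
    if any(digest(b) != d for b, d in zip(blocks, block_digests)):
        return False  # at least one block was corrupted or tampered with
    return digest(b"".join(blocks)) == file_digest


blocks = [b"layer-part-1", b"layer-part-2"]
expected = [digest(b) for b in blocks]
whole = digest(b"".join(blocks))
print(verify_blocks(blocks, expected, whole))
```

The per-block check lets a client reject a bad block from one Peer and fetch it again from another source, while the whole-file check catches blocks that are individually valid but assembled incorrectly.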

What is the current maturity of Dragonfly?

Within Alibaba Group, Dragonfly is a basic technical component used across the whole group and carries more than 90% of its file download tasks, including image files, application software packages, algorithm data files, static resource files, index files and so on. Daily distribution peaks at 100 million downloads, giving the group's businesses efficient and stable file distribution. During the Singles' Day shopping festival every year, the most critical marketing campaign data (gigabytes in size) is also delivered to tens of thousands of machines through Dragonfly around midnight. If anything went even slightly wrong in this process, you can imagine what would happen to Singles' Day.

Dragonfly is now also open source, and the project currently has 2,500 stars. Many external users have shown strong interest in Dragonfly, and many companies, such as China Mobile, Didi and iFLYTEK, are using it to solve the problems they encounter in image or file distribution. In addition, Dragonfly is the third project from China to enter the CNCF Sandbox, and we will keep working to graduate as soon as possible.

President of CNCF Announces Dragonfly joining CNCF

With the above introduction, you should have your own sense of whether Dragonfly is mature enough. Of course, Dragonfly still has a lot to improve. We sincerely invite talented people of all kinds to help build Dragonfly into a world-class product!

Prospects for the future

Become a CNCF graduated project and provide richer and more powerful file distribution capabilities for cloud-native applications.

Merge the open-source version with the group's internal version and open up more advanced features to the community.

Explore and improve further in the area of intelligence.

Original link: https://mp.weixin.qq.com/s/UUZDIGopz5UruRpnxcOZ8Q



© 2024 shulou.com SLNews company. All rights reserved.
