What is the principle of CDN origin-pull and CDN multi-level cache

2025-04-16 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains the CDN back-to-origin (origin-pull) principle and how CDN multi-level caching works.

CDN back-to-origin principle

Back-to-origin (also called origin-pull) means that when a browser sends a request, the response comes from the origin server rather than from the cache server on a CDN node (for example, nginx with caching enabled). In other words, whenever the cache server on a node cannot answer a request itself and has to forward it to the origin, that process is called back-to-origin. If there are too many back-to-origin requests or too much back-to-origin traffic, the origin server may come under heavy load, which affects normal access to the service.

The origin-pull domain name is a term specific to the CDN field. In general, origin-pull goes directly to the IP address of the origin server. However, if the customer's origin server has multiple IP addresses that change frequently, the CDN vendor pulls from an origin-pull domain name instead, to avoid frequently changing the configured origin-pull IP. In this way, even if the IP of the origin server changes, the existing configuration is not affected.
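The idea of pulling by domain name rather than by fixed IP can be sketched as follows. This is a minimal illustration, not any vendor's implementation: the host name `origin.example.com`, the addresses, and the `pick_origin_address` helper are all hypothetical, and the resolver is a plain dictionary standing in for DNS.

```python
from typing import Callable, Dict, List

Resolver = Callable[[str], List[str]]


def pick_origin_address(origin_host: str, resolve: Resolver) -> str:
    """Resolve the origin-pull domain name at request time and pick an address."""
    addresses = resolve(origin_host)
    if not addresses:
        raise LookupError(f"no address found for {origin_host}")
    return addresses[0]


# Hypothetical DNS data; a real resolver could wrap socket.getaddrinfo instead.
dns_records: Dict[str, List[str]] = {"origin.example.com": ["192.0.2.10"]}


def resolve(host: str) -> List[str]:
    return dns_records.get(host, [])


# The CDN configuration stores only the domain name, never the IP.
ORIGIN_HOST = "origin.example.com"

before = pick_origin_address(ORIGIN_HOST, resolve)

# The customer moves the origin to a new IP; only DNS changes.
dns_records["origin.example.com"] = ["198.51.100.7"]
after = pick_origin_address(ORIGIN_HOST, resolve)

print(before, after)
```

Because the address is looked up on each pull, the stored configuration (`ORIGIN_HOST`) stays the same when the origin IP changes.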

A regular CDN works in pull mode. That is, when a user accesses a URL and the CDN node that the request resolves to has not cached the response, or the cached copy has expired, the node goes back to the origin server to fetch it. If nobody accesses the URL, the CDN node does not fetch it from the origin server on its own initiative.
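A minimal sketch of pull mode, under simplifying assumptions: the `EdgeNode` class and `fake_origin` function are hypothetical names, the origin is a local function instead of an HTTP server, and expiration is ignored so that only the miss-then-fetch behaviour is shown.

```python
from typing import Callable, Dict


class EdgeNode:
    """Hypothetical pull-mode edge node: it contacts the origin only on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._cache: Dict[str, bytes] = {}
        self.origin_pulls = 0  # number of back-to-origin requests issued

    def get(self, url: str) -> bytes:
        if url in self._cache:
            return self._cache[url]  # cache hit: the origin is not contacted
        self.origin_pulls += 1       # cache miss: go back to the origin
        body = self._fetch_from_origin(url)
        self._cache[url] = body
        return body


def fake_origin(url: str) -> bytes:
    # Stand-in for the origin server; a real node would issue an HTTP request.
    return f"content of {url}".encode()


edge = EdgeNode(fake_origin)
edge.get("/index.html")  # first access: miss, pulled from the origin
edge.get("/index.html")  # second access: served from the edge cache
print(edge.origin_pulls)
```

Nothing is fetched until `get` is called, which mirrors the point that a node does not pick up content nobody has requested.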

When the content of the origin server is updated, the origin server can also actively push the content to the CDN nodes. See Alibaba Cloud URL prefetch (preheating): https://help.aliyun.com/knowledge_detail/40106.html?spm=a2c4e.11153987.0.0.419f6ec5UvPSJ1
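As a rough sketch of what prefetching achieves, the example below fills a node's cache before any user asks for the content. The `prefetch` function and `fake_origin` are hypothetical and do not represent Alibaba Cloud's API; the push is modelled as the node being told which URLs to load ahead of time.

```python
from typing import Callable, Dict, Iterable, List


def prefetch(
    cache: Dict[str, bytes],
    urls: Iterable[str],
    fetch_from_origin: Callable[[str], bytes],
) -> int:
    """Load the given URLs into the node cache ahead of user traffic."""
    count = 0
    for url in urls:
        cache[url] = fetch_from_origin(url)
        count += 1
    return count


origin_requests: List[str] = []


def fake_origin(url: str) -> bytes:
    # Stand-in for the origin server; records every URL it is asked for.
    origin_requests.append(url)
    return f"fresh {url}".encode()


node_cache: Dict[str, bytes] = {}
warmed = prefetch(node_cache, ["/index.html", "/app.js"], fake_origin)

# A later user request is answered from the cache without touching the origin.
hit = node_cache.get("/index.html")
print(warmed, len(origin_requests), hit)
```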

A CDN is supposed to speed up a website, but an inappropriate back-to-origin policy can instead put extra load on the origin server. Only the right policy brings higher access efficiency to the website.

Calculation method of CDN back-to-origin rate

The back-to-origin rate covers two ratios: the ratio of back-to-origin requests and the ratio of back-to-origin traffic:

Ratio of back-to-origin requests

The statistics come from the request records on all edge nodes. Cacheable requests whose content is not cached or whose cache has expired, together with requests that are not cacheable, are counted as back-to-origin requests; all other requests hit the cache directly and are counted as hit requests.
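The counting rule above can be sketched as a small function. The status labels (`HIT`, `MISS`, `EXPIRED`, `UNCACHEABLE`) are assumptions made for illustration, since real edge logs use vendor-specific fields, and the ratio is assumed to be back-to-origin requests divided by all requests.

```python
from typing import Iterable

# Hypothetical cache-status labels for edge request records.
ORIGIN_PULL_STATUSES = {"MISS", "EXPIRED", "UNCACHEABLE"}


def origin_pull_request_ratio(statuses: Iterable[str]) -> float:
    """Share of requests that had to go back to the origin."""
    records = list(statuses)
    if not records:
        return 0.0
    pulls = sum(1 for status in records if status in ORIGIN_PULL_STATUSES)
    return pulls / len(records)


records = ["HIT", "HIT", "MISS", "EXPIRED", "HIT", "UNCACHEABLE", "HIT", "HIT"]
ratio = origin_pull_request_ratio(records)  # 3 of the 8 requests are origin pulls
print(ratio)
```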

Ratio of back-to-origin traffic

Back-to-origin traffic is the traffic generated by the file sizes of back-to-origin requests plus the traffic generated by the requests themselves. Ratio of back-to-origin traffic = back-to-origin traffic / (back-to-origin traffic + traffic requested by users).
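A short worked example of the traffic ratio, reading the formula as back-to-origin traffic divided by the sum of back-to-origin traffic and user-requested traffic. The function name and the byte counts are made up for illustration.

```python
def origin_pull_traffic_ratio(origin_bytes: int, user_bytes: int) -> float:
    """Back-to-origin traffic / (back-to-origin traffic + user-requested traffic)."""
    total = origin_bytes + user_bytes
    if total == 0:
        return 0.0
    return origin_bytes / total


# Hypothetical totals for one day: 20 GB pulled from the origin,
# 180 GB served to users.
GB = 1024 ** 3
traffic_ratio = origin_pull_traffic_ratio(20 * GB, 180 * GB)  # 20 / 200
print(traffic_ratio)
```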

Common multi-level caches in CDN

CDN concepts

CDN stands for Content Delivery Network. The basic idea is to avoid, as far as possible, the bottlenecks and links on the Internet that may affect the speed and stability of data transmission, so that content is delivered faster and more reliably. By placing node servers throughout the network to form a layer of intelligent virtual network on top of the existing Internet, a CDN system can redirect a user's request in real time to the nearest service node, based on combined information such as network traffic, the connections and load of each node, the distance to the user, and response time. Its purpose is to let users fetch the content they need from a nearby node, relieve Internet congestion, and improve the response speed of the websites users visit.

How a CDN works

The client browser first checks whether its local cache has expired. If it has, the browser sends a request to the CDN edge node. The CDN edge node then checks whether its cached copy of the requested data has expired. If it has not, the edge node responds to the user directly and the HTTP request is complete. If the data has expired, the CDN issues a back-to-origin request to the origin server to pull the latest data. A typical CDN topology is shown in the following figure:

[Figure: Typical topology diagram of CDN]

CDN hierarchy:

In a CDN system, the cache devices that face users directly and serve content to them are deployed at the edge of the whole CDN network, so this layer is called the edge layer.

The central layer is responsible for global management and control, and it also holds the largest content cache. When an edge-layer device misses its cache, it requests the content from a central-layer device; when the central layer also misses, it requests the content from the origin server. CDN systems differ in design: the central layer may be able to serve users directly, or it may serve only the layer below it.

If the CDN system is large and the edge layer requests too much content from the central layer, the central layer comes under too much load. In that case a regional layer is deployed between the central layer and the edge layer. It is responsible for the management and control of one region, and it also holds some cached content for the edge layer to access.
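The edge, regional, and central lookup order described above can be sketched as a chain of caches. This is a simplified model: `tiered_get` and `origin` are hypothetical names, each layer is a plain dictionary, expiration is ignored, and filling every layer that missed is an assumption rather than something every CDN does.

```python
from typing import Callable, Dict, List, Tuple

Layer = Tuple[str, Dict[str, bytes]]


def tiered_get(
    url: str,
    layers: List[Layer],
    fetch_from_origin: Callable[[str], bytes],
) -> Tuple[bytes, str]:
    """Look the URL up layer by layer; return the body and where it was found."""
    missed: List[Dict[str, bytes]] = []
    for name, cache in layers:
        if url in cache:
            body, source = cache[url], name
            break
        missed.append(cache)
    else:
        # Every layer missed, so the request goes back to the origin.
        body, source = fetch_from_origin(url), "origin"
    for cache in missed:
        cache[url] = body  # fill the layers that missed on the way back
    return body, source


def origin(url: str) -> bytes:
    # Stand-in for the origin server.
    return f"origin copy of {url}".encode()


edge: Dict[str, bytes] = {}
regional: Dict[str, bytes] = {}
central: Dict[str, bytes] = {"/video.mp4": b"cached at the center"}
layers = [("edge", edge), ("regional", regional), ("central", central)]

first = tiered_get("/video.mp4", layers, origin)   # found in the central layer
second = tiered_get("/video.mp4", layers, origin)  # now found at the edge
third = tiered_get("/new.css", layers, origin)     # all layers miss
print(first[1], second[1], third[1])
```

Listing the layers from nearest to farthest keeps the lookup order explicit, and dropping the regional entry from `layers` gives the two-layer design used by smaller systems.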

CDN caching

When the browser local cache expires, the browser initiates a request to the CDN edge node. Similar to browser caching, CDN edge nodes also have a caching mechanism.

Disadvantages of CDN caching

By offloading traffic, a CDN reduces both the access latency for users and the load on the origin server. Its disadvantage is just as obvious: when the website is updated, if the data on the CDN nodes is not updated in time, users still get stale content even after invalidating the browser cache with Ctrl + F5, because the CDN edge node has not synchronized the latest data.

CDN caching strategy

CDN edge node caching strategies vary between service providers, but they generally follow the HTTP standard: the Cache-Control: max-age field in the HTTP response header sets how long data is cached on the CDN edge node.

When the client requests data from the CDN node, the CDN node will determine whether the cached data has expired, and if the cached data has not expired, it will directly return the cached data to the client; otherwise, the CDN node will send a back-to-origin request to the origin server, pull the latest data from the origin server, update the local cache, and return the latest data to the client.
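The expiry check described above can be sketched with two small helpers. `parse_max_age` and `is_fresh` are hypothetical names; the sketch handles only the `max-age` directive and ignores other directives such as `s-maxage` or `no-cache` that a real cache would also honour.

```python
from typing import Optional


def parse_max_age(cache_control: str) -> Optional[int]:
    """Return the max-age value in seconds, or None if the directive is absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        value = value.strip()
        if name.strip().lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_fresh(stored_at: float, max_age: int, now: float) -> bool:
    """A cached copy is fresh while its age is below max-age."""
    return now - stored_at < max_age


max_age = parse_max_age("public, max-age=3600")
print(max_age)
print(is_fresh(stored_at=0.0, max_age=3600, now=1800.0))  # still within one hour
print(is_fresh(stored_at=0.0, max_age=3600, now=7200.0))  # expired: pull from origin
```

When `is_fresh` returns False, the node would issue a back-to-origin request, store the new response, and restart the clock.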

CDN service providers generally let you specify the CDN cache time along multiple dimensions, such as file suffix and directory, for finer-grained cache management.
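A minimal sketch of suffix-based and directory-based cache times. The rule table, the TTL values, and the first-match-wins ordering are all assumptions for illustration; real providers define their own rule formats and priorities.

```python
from typing import List, Tuple

Rule = Tuple[str, str, int]  # (kind, pattern, ttl_seconds)

# Hypothetical rules; earlier rules take precedence over later ones.
RULES: List[Rule] = [
    ("directory", "/static/", 86400),  # everything under /static/: one day
    ("suffix", ".jpg", 3600),          # images: one hour
    ("suffix", ".html", 60),           # pages: one minute
]
DEFAULT_TTL = 0  # no rule matched: do not cache


def ttl_for(path: str, rules: List[Rule] = RULES, default: int = DEFAULT_TTL) -> int:
    """Return the cache time in seconds for a request path."""
    for kind, pattern, ttl in rules:
        if kind == "directory" and path.startswith(pattern):
            return ttl
        if kind == "suffix" and path.lower().endswith(pattern):
            return ttl
    return default


for path in ["/static/app.js", "/img/banner.JPG", "/index.html", "/api/data"]:
    print(path, ttl_for(path))
```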

CDN cache time has a direct impact on the "back-to-origin rate". If the CDN cache time is too short, the data on the CDN edge nodes expires often, which causes frequent back-to-origin requests, increases the load on the origin server, and increases access latency. If the CDN cache time is too long, updated data is slow to reach users. Developers need to set cache times for specific data according to their specific business.
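The relationship between cache time and back-to-origin rate can be illustrated with a small simulation. It assumes a single URL, one edge node, and a fixed request pattern (one request every 10 seconds for 10 minutes); `origin_pull_rate` is a hypothetical helper, and real traffic is far less regular.

```python
from typing import List, Optional


def origin_pull_rate(ttl_seconds: int, request_times: List[int]) -> float:
    """Fraction of requests for one URL that go back to the origin."""
    if not request_times:
        return 0.0
    pulls = 0
    expires_at: Optional[int] = None
    for t in request_times:
        if expires_at is None or t >= expires_at:
            pulls += 1                      # nothing cached, or the copy expired
            expires_at = t + ttl_seconds    # the fresh copy is cached again
    return pulls / len(request_times)


times = list(range(0, 600, 10))  # 60 requests, one every 10 seconds

rate_short = origin_pull_rate(5, times)    # TTL shorter than the request interval
rate_medium = origin_pull_rate(30, times)  # one pull for every three requests
rate_long = origin_pull_rate(300, times)   # only two pulls in ten minutes
print(rate_short, rate_medium, rate_long)
```

With a 5-second cache time every request goes back to the origin, while a 300-second cache time cuts that to 2 pulls out of 60, at the cost of serving content that may be up to five minutes old.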

CDN cache refresh

CDN edge nodes are transparent to developers. Whereas Ctrl+F5 forces a refresh that invalidates the browser's local cache, developers can clear the cache of CDN edge nodes through the "refresh cache" interface provided by the CDN service provider. After updating data, the developer can use the "refresh cache" function to force the cached data on the CDN nodes to expire, so that clients pull the latest data on their next access.
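What a "refresh cache" call does on a node can be sketched as follows. The `EdgeCache` class and its `purge_url` and `purge_directory` methods are hypothetical and are not any provider's API; real providers expose refresh through their own consoles and interfaces.

```python
from typing import Dict, Optional


class EdgeCache:
    """Hypothetical edge cache that supports refresh by URL or by directory."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def put(self, url: str, body: bytes) -> None:
        self._store[url] = body

    def get(self, url: str) -> Optional[bytes]:
        return self._store.get(url)

    def purge_url(self, url: str) -> bool:
        """Drop one cached URL; return True if something was removed."""
        return self._store.pop(url, None) is not None

    def purge_directory(self, prefix: str) -> int:
        """Drop every cached URL under a directory; return how many were removed."""
        doomed = [url for url in self._store if url.startswith(prefix)]
        for url in doomed:
            del self._store[url]
        return len(doomed)


cache = EdgeCache()
cache.put("/css/site.css", b"old styles")
cache.put("/css/print.css", b"old print styles")
cache.put("/index.html", b"old page")

removed = cache.purge_directory("/css/")  # after deploying new stylesheets
print(removed, cache.get("/css/site.css"), cache.get("/index.html"))
```

After the purge, the next request for a removed URL misses the cache and triggers a back-to-origin request, which is how the client ends up with the latest data.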
