A brief introduction to the functions and rules of HTTP caching

2025-07-04 Update, from SLTechnology News&Howtos (Shulou.com)

This article introduces the functions and rules of HTTP caching: why caching is used, how forced caching and negotiation caching work, and how to implement both on a Node.js server.

Preface

The HTTP caching mechanism is an important means of optimizing Web application performance. For anyone engaged in Web development it should be a basic part of the knowledge system, and it is a necessary skill for anyone who wants to become a front-end architect.

The role of caching

We use caching because it brings the following benefits to a Web project, improving both performance and user experience.

Speed up the loading of web pages in the browser

Reduce redundant data transmission and save network traffic and bandwidth

Reduce the load on the server and greatly improve the performance of the website

Reading static resources from the local cache certainly speeds up page loading in the browser and reduces data transmission. As for server load, the effect is not obvious when only one or two users visit the site, but under high concurrency caching makes a qualitative difference to the pressure on the server and to the performance of the whole site.

Brief introduction of caching rules

For ease of understanding, we can think of the browser as having a cache database that stores cache information (in practice, static resources are cached in memory and on disk). When the browser requests a resource for the first time, the cache database has no corresponding entry, so the request must go to the server. The server returns the caching rules together with the data, and the browser stores both in the cache database.

When an address is entered in the browser address bar, the requested index.html itself is typically not served from the cache, but the other resources requested by index.html follow the caching policy. HTTP caching has a variety of rules, which fall into two main categories according to whether the request needs to be sent to the server: forced caching and negotiation caching.

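Forced caching

1. Forced caching process

With forced caching, once the data has been obtained from the server on the first visit, the server is not contacted again while the cache is still valid; the cached data is used directly. In other words, the browser first checks the cache: if a valid entry exists it is used, and only if the entry is missing or expired does the request go to the server, whose response is then stored together with its caching rules.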
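The flow described above can be sketched with a small in-memory model. This is only an illustration under simplifying assumptions, not how a real browser is implemented: createForcedCache, fetchFromServer and maxAgeSeconds are hypothetical names, and time is passed in explicitly as milliseconds so the behavior is easy to follow.

```javascript
// Minimal in-memory model of forced caching (all names are hypothetical).
function createForcedCache(fetchFromServer) {
  const store = new Map(); // url -> { body, expiresAt }

  return function get(url, now) {
    const entry = store.get(url);
    if (entry && now < entry.expiresAt) {
      // Still valid: answer from the cache without contacting the server.
      return { body: entry.body, fromCache: true };
    }
    // Missing or expired: ask the server, then store the data and its rule.
    const { body, maxAgeSeconds } = fetchFromServer(url);
    store.set(url, { body, expiresAt: now + maxAgeSeconds * 1000 });
    return { body, fromCache: false };
  };
}

// Usage: a fake server that counts how often it is contacted.
let serverHits = 0;
const get = createForcedCache(() => {
  serverHits += 1;
  return { body: "logo-bytes", maxAgeSeconds: 30 };
});

const first = get("/logo.png", 0);      // first visit: goes to the server
const second = get("/logo.png", 10000); // 10 s later: served from the cache
const third = get("/logo.png", 31000);  // 31 s later: expired, server again
```

In this model the server is contacted only for the first and third requests; the second one never leaves the cache.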

2. How forced caching determines the expiration time

So how does the browser tell whether the cache has expired? It relies on the response headers the server sent on the first visit, and these differ between HTTP 1.0 and HTTP 1.1.

In HTTP 1.0, the server uses the Expires response header, whose value is an absolute time in the future (an HTTP date in GMT). If the current time at which the browser makes the request is later than the time set by Expires, the cache has expired and the request must be sent to the server again; otherwise the data is read directly from the cache database.
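The Expires comparison can be written as a one-line check. The following is a minimal sketch, assuming the header value is a standard HTTP date; isFreshByExpires is a hypothetical helper name, and treating an unparseable date as expired is a choice made here for safety.

```javascript
// Returns true while the current time is still before the Expires time.
// isFreshByExpires is a hypothetical helper, not a browser or Node.js API.
function isFreshByExpires(expiresHeader, now = Date.now()) {
  const expiresAt = Date.parse(expiresHeader);
  // An unparseable date is treated as already expired.
  if (Number.isNaN(expiresAt)) return false;
  return now < expiresAt;
}

// Usage with an example date and two explicit "current" times.
const expires = "Wed, 21 Oct 2015 07:28:00 GMT";
const oneMinuteBefore = Date.parse(expires) - 60 * 1000;
const oneMinuteAfter = Date.parse(expires) + 60 * 1000;

const stillFresh = isFreshByExpires(expires, oneMinuteBefore); // cache usable
const expired = isFreshByExpires(expires, oneMinuteAfter);     // must re-request
```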

In HTTP 1.1, the server uses the Cache-Control response header, which accepts several values (directives) with different meanings.

private: only the client (the browser) may cache the response

public: both the client and proxy servers may cache the response (from the front end's point of view it can be treated the same as private)

max-age=xxx: the cached content expires after xxx seconds (a relative time, in seconds)

no-cache: the cached copy must be validated with the server through negotiation caching (described later) before it is used

no-store: nothing is cached; neither forced caching nor negotiation caching is triggered

The most commonly used Cache-Control value is max-age=xxx. Since caching exists to optimize data transfer and performance, no-store is rarely used.
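To work with these directives in code, the header value can be split into a simple object. This is a deliberately small sketch: parseCacheControl is a hypothetical helper, and it assumes plain directives separated by commas (a quoted value that itself contains a comma would not be handled).

```javascript
// Splits a Cache-Control value into { directive: value } pairs.
// parseCacheControl is a hypothetical helper written for this example.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(",")) {
    const [rawName, rawValue] = part.trim().split("=");
    if (!rawName) continue; // skip empty segments
    const name = rawName.toLowerCase();
    // Directives without a value (for example no-store) are stored as true.
    directives[name] =
      rawValue === undefined ? true : rawValue.replace(/"/g, "");
  }
  return directives;
}

// Usage: a response that may be cached by anyone for one year.
const parsed = parseCacheControl("public, max-age=31536000");
const maxAgeSeconds = Number(parsed["max-age"]);
```

Directive names are lowercased because they are case-insensitive, so "No-Store" and "no-store" end up under the same key.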

Note: in HTTP 1.0, the absolute time in the Expires field is taken from the server's clock, while the browser compares it against its own clock. Because the request takes time to travel and the two clocks may differ, the browser's idea of the current time does not exactly match the server's, which leads to errors in deciding cache hits. In HTTP 1.1, the xxx in Cache-Control: max-age=xxx is a relative time in seconds, so the countdown starts when the browser receives the resource, which avoids this drawback of HTTP 1.0. For compatibility with older versions of the HTTP protocol, both response headers are normally sent together in real development, and the HTTP 1.1 header (Cache-Control) takes priority over the HTTP 1.0 header (Expires).
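The priority rule in the note can be illustrated with a small function that works out how long a response stays valid. This is a sketch under stated assumptions: freshnessLifetimeSeconds is a hypothetical name, headers is assumed to be an object with lowercase keys (as in Node.js), and receivedAt is the time in milliseconds at which the response arrived.

```javascript
// Returns the number of seconds a response may be served from the cache.
// max-age (HTTP 1.1) is checked first; Expires (HTTP 1.0) is the fallback.
function freshnessLifetimeSeconds(headers, receivedAt) {
  const cacheControl = headers["cache-control"] || "";
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl);
  if (match) return Number(match[1]); // max-age wins when both are present

  if (headers["expires"]) {
    const expiresAt = Date.parse(headers["expires"]);
    if (!Number.isNaN(expiresAt)) {
      return Math.max(0, (expiresAt - receivedAt) / 1000);
    }
  }
  return 0; // no rule found: treat the response as not cacheable here
}

// Usage: the server sends both headers, and they disagree on purpose.
const receivedAt = Date.parse("Wed, 21 Oct 2015 07:28:00 GMT");
const bothHeaders = {
  "cache-control": "max-age=30",
  "expires": "Wed, 21 Oct 2015 07:38:00 GMT", // ten minutes later
};
const lifetime = freshnessLifetimeSeconds(bothHeaders, receivedAt);
```

With both headers present the result follows max-age (30 seconds) rather than the ten minutes implied by Expires.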

3. View the forced cache through Network

We open the Network panel in the Chrome developer tools to view information about forced caching.

The screenshot above shows the response for Baidu's logo image. We can clearly see that it is compatible with both HTTP 1.0 and HTTP 1.1 and that it is kept in the forced cache for 10 years.

Let's look at how cached data differs from other resources in the Network panel.

In fact, the cache is stored in memory and on disk, and which one is used is decided by the browser's own policy, so from the page's point of view it is fairly unpredictable. Resources read from the in-memory cache are shown as (from memory cache), and resources read from the disk cache are shown as (from disk cache).

4. Implementing forced caching in a Node.js server

// Forced caching
const http = require("http");
const url = require("url");
const path = require("path");
const fs = require("fs");
// Third-party module; this assumes a version that can be loaded with require()
const mime = require("mime");

const server = http.createServer((req, res) => {
  let { pathname } = url.parse(req.url, true);
  pathname = pathname !== "/" ? pathname : "/index.html";

  // Get the absolute path of the file to read
  const p = path.join(__dirname, pathname);

  // Check whether the path is accessible
  fs.access(p, err => {
    // If the path is invalid, respond with 404 and stop
    if (err) {
      res.statusCode = 404;
      return res.end("Not Found");
    }

    // Set the forced cache: valid for 30 seconds
    res.setHeader("Expires", new Date(Date.now() + 30000).toUTCString());
    res.setHeader("Cache-Control", "max-age=30");

    // Set the file type and respond to the browser
    res.setHeader("Content-Type", `${mime.getType(p)}; charset=utf-8`);
    fs.createReadStream(p).pipe(res);
  });
});

server.listen(3000, () => {
  console.log("server start 3000");
});

The getType method of the mime module returns the file type that corresponds to the path passed in, such as text/html or application/javascript. mime is a third-party module and must be installed before use.

npm install mime

Negotiation cache

1. Negotiation caching process

Negotiation caching is also known as comparison caching. When it is enabled, the server returns the data together with a cache identifier the first time the browser requests the resource, and the browser stores both in the cache database. On the next request, the browser first takes the identifier out of the cache and sends it to the server. The server updates the identifier whenever the data changes, so it only needs to compare the identifier sent by the browser with the current one. If they are the same, the data has not changed: the server tells the browser so, and the browser reads the data from its cache. If they differ, the data has changed: the server returns the new data and the new identifier, and the browser stores both in the cache. The process of negotiation caching is as follows.

The difference from forced caching is that negotiation caching communicates with the server on every request, and when the cache is hit the server returns status code 304 (Not Modified) instead of 200.
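The exchange described above can be modeled with a fake server and a fake browser cache. This is only a sketch of the idea: serverRespond, browserRequest and the version field are hypothetical, and the version string stands in for whatever identifier the real server uses.

```javascript
// Fake server: the version string plays the role of the cache identifier.
const server = { body: "v1 content", version: "v1" };

function serverRespond(clientVersion) {
  if (clientVersion === server.version) {
    return { status: 304 }; // unchanged: no body is sent back
  }
  return { status: 200, body: server.body, version: server.version };
}

// Fake browser cache holding the data together with its identifier.
let cached = null;

function browserRequest() {
  const response = serverRespond(cached ? cached.version : undefined);
  if (response.status === 304) {
    return { status: 304, body: cached.body }; // reuse the cached data
  }
  cached = { body: response.body, version: response.version };
  return { status: 200, body: response.body };
}

const r1 = browserRequest(); // first visit: full response
const r2 = browserRequest(); // nothing changed: 304, body comes from the cache
server.body = "v2 content";  // the data changes on the server
server.version = "v2";
const r3 = browserRequest(); // identifiers differ: new data and identifier
```

Note that every call to browserRequest reaches the server, which is exactly what distinguishes this from forced caching.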

2. Negotiation cache identifiers

Forced caching controls access to the server through an expiration time, while negotiation caching talks to the server every time to compare cache identifiers. As with forced caching, the implementation differs between HTTP 1.0 and HTTP 1.1.

In HTTP 1.0, the server sets the cache identifier through the Last-Modified response header, whose value is usually the last modification time (an absolute time) of the requested data. The browser stores the returned data and identifier in the cache, and on the next request it automatically sends the If-Modified-Since request header, whose value is the last modification time (the identifier) returned earlier. The server reads the value of If-Modified-Since and compares it with the data's current last modification time. If the last modification time is later than the value of If-Modified-Since, the data has been modified, so the server returns the new data along with the new last modification time in the Last-Modified response header. Otherwise the data has not been modified, and the server returns status code 304 to tell the browser that the cache was hit.
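The server-side comparison for Last-Modified can be sketched as follows. isModifiedSince is a hypothetical helper; it assumes the file's modification time is available in milliseconds (as fs.Stats provides in mtimeMs) and treats an unparseable header as "modified" so that the full response is sent.

```javascript
// Returns true when the data changed after the time the browser sent.
// isModifiedSince is a hypothetical helper written for this example.
function isModifiedSince(lastModifiedMs, ifModifiedSinceHeader) {
  const since = Date.parse(ifModifiedSinceHeader);
  if (Number.isNaN(since)) return true; // cannot compare: send the data
  // HTTP dates have one-second resolution, so drop the milliseconds first.
  const lastModified = Math.floor(lastModifiedMs / 1000) * 1000;
  return lastModified > since;
}

// Usage: the browser echoes back the Last-Modified value it received.
const sentToBrowser = "Wed, 21 Oct 2015 07:28:00 GMT";
const fileTime = Date.parse(sentToBrowser) + 500; // same second, plus 500 ms

const unchanged = isModifiedSince(fileTime, sentToBrowser);       // -> 304
const changed = isModifiedSince(fileTime + 2000, sentToBrowser);  // -> 200
```

Truncating to whole seconds matters because the header cannot carry milliseconds; without it, an untouched file could look newer than the time the browser sends back.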

In HTTP 1.1, the server sets the cache identifier through the ETag response header (a unique identifier, like a fingerprint, whose generation rules are decided by the server). The browser receives the data and stores the unique identifier in the cache, and on the next request it sends the identifier to the server in the If-None-Match request header. The server reads the identifier and compares it with the current one. If they differ, the data has been modified, and the server returns the new identifier and the new data. Otherwise it returns status code 304 to tell the browser that the cache was hit.
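The If-None-Match comparison can be sketched in a few lines. etagMatches is a hypothetical helper; it assumes the header is either "*" or a comma-separated list of quoted tags, ignores the weak prefix W/ when comparing, and does not handle tags that themselves contain commas.

```javascript
// Returns true when one of the tags sent by the browser equals the
// server's current tag. etagMatches is a hypothetical helper.
function etagMatches(ifNoneMatchHeader, currentEtag) {
  if (!ifNoneMatchHeader) return false;
  if (ifNoneMatchHeader.trim() === "*") return true; // matches any tag
  // Compare without the weak-validator prefix (W/).
  const strip = tag => tag.trim().replace(/^W\//, "");
  return ifNoneMatchHeader
    .split(",")
    .some(tag => strip(tag) === strip(currentEtag));
}

// Usage: decide between 304 and 200 for a resource tagged "abc".
const hit = etagMatches('"abc"', '"abc"');           // same tag: 304
const hitInList = etagMatches('"x", W/"abc"', '"abc"'); // one of several: 304
const miss = etagMatches('"old"', '"abc"');          // different tag: 200
```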

The flow chart of HTTP negotiation cache policy is as follows:

Note: the HTTP 1.0 approach to negotiation caching is still unreliable. If a character is added to a file and then deleted, the content is effectively unchanged, but the modification time has changed, so the file is treated as modified and the server resends the data when it should have hit the cache. The ETag identifier used in HTTP 1.1 is therefore generated from the file content or a digest of it, which ensures that the cache is hit as long as the file content remains unchanged. For compatibility with older versions of the HTTP protocol, both response headers are sent together, and again the HTTP 1.1 mechanism takes priority over the HTTP 1.0 one.
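The add-then-delete case from the note is easy to reproduce with a content hash. This is a minimal sketch using Node's built-in crypto module; contentEtag is a hypothetical helper, and MD5 is used only because the article's own server example uses it.

```javascript
const crypto = require("crypto");

// Derives an identifier purely from the content, ignoring timestamps.
// contentEtag is a hypothetical helper written for this example.
function contentEtag(content) {
  return crypto.createHash("md5").update(content).digest("hex");
}

const original = contentEtag("hello");  // file as first served
const edited = contentEtag("hello!");   // one character added
const reverted = contentEtag("hello");  // the character deleted again
```

Here original and reverted are identical even though the file's modification time would have changed twice, while edited differs from both.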

3. View the negotiation cache through Network

Again we open the Network panel in the Chrome developer tools to view information about the negotiation cache.

Request headers sent to the server on the repeat request:

Response headers when the negotiation cache is hit:

Let's look at how data fetched through the negotiation cache differs from the first load in the Network panel.

First request:

Request after caching:

Comparing the two figures, we can see that the status code is 304 when the negotiation cache takes effect, and that the message size and request time are greatly reduced. This is because, after comparing identifiers, the server returns only the headers and uses the status code to tell the browser to use its cache, so the message body no longer needs to be sent.

4. Implementing negotiation caching in a Node.js server

// Negotiation caching
const http = require("http");
const url = require("url");
const path = require("path");
const fs = require("fs");
const crypto = require("crypto");
// Third-party module; this assumes a version that can be loaded with require()
const mime = require("mime");

const server = http.createServer((req, res) => {
  let { pathname } = url.parse(req.url, true);
  pathname = pathname !== "/" ? pathname : "/index.html";

  // Get the absolute path of the file to read
  const p = path.join(__dirname, pathname);

  // Check whether the path is valid and read the file metadata
  fs.stat(p, (err, statObj) => {
    // If the path is invalid, respond with 404 and stop
    if (err) {
      res.statusCode = 404;
      return res.end("Not Found");
    }

    const md5 = crypto.createHash("md5"); // MD5 hash object
    const rs = fs.createReadStream(p);    // readable stream used for hashing

    // Read the contents of the file and feed them into the hash
    rs.on("data", data => md5.update(data));
    rs.on("error", () => {
      res.statusCode = 500;
      res.end();
    });
    rs.on("end", () => {
      // Last modification time of the file
      const mtime = statObj.mtime.toUTCString();
      // Content-based unique identifier (ETag values are quoted strings)
      const flag = `"${md5.digest("hex")}"`;

      // Get the identifiers sent by the browser
      const ifModifiedSince = req.headers["if-modified-since"];
      const ifNoneMatch = req.headers["if-none-match"];

      if (ifModifiedSince === mtime || ifNoneMatch === flag) {
        res.statusCode = 304;
        res.end();
      } else {
        // Set the negotiation cache identifiers
        res.setHeader("Last-Modified", mtime);
        res.setHeader("ETag", flag);

        // Set the file type and respond to the browser
        res.setHeader("Content-Type", `${mime.getType(p)}; charset=utf-8`);
        // rs was already consumed for hashing, so open a fresh stream
        fs.createReadStream(p).pipe(res);
      }
    });
  });
});

server.listen(3000, () => {
  console.log("server start 3000");
});

In the code above, the contents of the file are read through a readable stream, and the crypto module computes an MD5 hash of them to serve as the unique identifier. This ensures that the cache is hit as long as the file content remains unchanged. The code is compatible with both HTTP 1.0 and HTTP 1.1: as soon as either condition is met, it returns 304 directly to tell the browser that the cache was hit.

Note: in practice, reading the whole file and hashing it on every request is not advisable. For a large file, reading the contents and computing the MD5 hash takes a lot of time. In real development you should choose an approach that protects server performance according to the needs of the business, such as generating the unique identifier from a summary of the file rather than its full content.
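One cheaper option is to build the identifier from file metadata instead of the content. The sketch below is an assumption about what such a summary could look like, not the article's prescribed method: weakEtagFromStat is a hypothetical helper that combines the size and modification time from an fs.Stats-like object and marks the result as a weak validator with the W/ prefix.

```javascript
// Builds a weak identifier from metadata only; the file is never read.
// weakEtagFromStat is a hypothetical helper; stat is any object with the
// size and mtimeMs fields that fs.Stats provides.
function weakEtagFromStat(stat) {
  const size = stat.size.toString(16);
  const mtime = Math.floor(stat.mtimeMs).toString(16);
  return `W/"${size}-${mtime}"`;
}

// Usage with plain objects standing in for fs.statSync(...) results.
const tag = weakEtagFromStat({ size: 1024, mtimeMs: 1700000000000 });
const sameAgain = weakEtagFromStat({ size: 1024, mtimeMs: 1700000000000 });
const afterGrowth = weakEtagFromStat({ size: 2048, mtimeMs: 1700000000000 });
```

The trade-off is the one described earlier for Last-Modified: an edit that is later reverted changes the modification time, so this identifier changes too even though the content does not.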

Summary

To make the caching strategy more robust and flexible, the HTTP 1.0 and HTTP 1.1 caching mechanisms are used together, and forced caching and negotiation caching are often combined as well. With forced caching, the server tells the browser how long the resource may be cached; within that time, subsequent requests use the cache directly. Once the time has passed, the negotiation cache policy applies: the ETag and Last-Modified values stored with the cache entry are sent to the server in the If-None-Match and If-Modified-Since request headers, and the server validates them and sets a new forced cache period at the same time. If validation succeeds, the server returns status code 304 and the browser uses the cache directly. If validation fails, the server returns the new data and resets the negotiation cache identifiers.
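The combined strategy can be summarized as a single decision function on the browser side. This is a sketch under simplifying assumptions: decide, freshUntil, etag and lastModified are hypothetical names for a cache entry's fields, and times are plain millisecond values.

```javascript
// Decides what the browser should do for a request, given its cache entry.
// Returns the action and the request headers to send (names hypothetical).
function decide(entry, now) {
  if (!entry) {
    return { action: "fetch", headers: {} }; // nothing cached yet
  }
  if (now < entry.freshUntil) {
    return { action: "use-cache", headers: {} }; // forced cache still valid
  }
  // Forced cache expired: fall back to negotiation with stored identifiers.
  const headers = {};
  if (entry.etag) headers["If-None-Match"] = entry.etag;
  if (entry.lastModified) headers["If-Modified-Since"] = entry.lastModified;
  return { action: "revalidate", headers };
}

// Usage: an entry that was valid for 30 seconds after time 0.
const entry = {
  freshUntil: 30000,
  etag: '"abc"',
  lastModified: "Wed, 21 Oct 2015 07:28:00 GMT",
};

const noEntry = decide(null, 0);       // first visit
const within = decide(entry, 10000);   // inside the forced cache period
const after = decide(entry, 40000);    // expired: ask the server to validate
```

After a "revalidate" request, a 304 response means the entry can be reused with a new freshUntil, while a 200 response replaces the entry and its identifiers.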
