Shulou (Shulou.com), SLTechnology News & Howtos — 2025-01-16
This article focuses on what should be considered in Java cache design. The techniques introduced here are simple, fast, and practical.
Cache penetration
Cache penetration refers to querying data that definitely does not exist. Because the data does not exist, it is never cached, so every request falls through to the database.
For example, suppose we request user data with a UserID of -1. Because the user does not exist, every such request reads the database. If malicious actors exploit this loophole to forge a large number of requests, the database may well fail under traffic it cannot withstand.
For cache penetration there are two broad solutions: prevention in advance and prevention after the fact.
Prevention in advance means validating the parameters of every request and blocking the vast majority of illegal requests at the outermost layer. In our example, we would add parameter validation and reject all requests with a UserID less than 0. But even comprehensive parameter checks can let some requests slip through the net, and there will always be cases we did not anticipate.
For example, suppose our UserIDs are assigned incrementally and the largest existing UserID is 10,000. If someone requests user information for a very large UserID (say 1,000,000), we cannot simply declare every UserID greater than 10,000 (or 100,000) illegal, because valid IDs keep growing; so such a UserID passes the parameter check. Yet the user does not exist, and every such request hits the database.
That is just one case I can think of; there are surely many others we have not anticipated. For those, all we can do is mitigate the damage after the fact.
Prevention after the fact means that when a query returns an empty result, we still cache that empty result, but with a short expiration time (for example, one minute). Note that this does not block illegal requests entirely; it shifts their load from the database to Redis, which can absorb far more traffic, keeping the database safe.
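The two defenses above — up-front parameter validation plus caching empty results with a short TTL — can be sketched as follows. This is a minimal illustration using an in-memory map in place of Redis; the names (`UserCache`, `findUser`, `NULL_TTL_MILLIS`) and the TTL values are illustrative assumptions, not from the original article.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class UserCache {
    private static final long NULL_TTL_MILLIS = 60_000;   // short TTL for cached misses (1 minute)
    private static final long HIT_TTL_MILLIS  = 600_000;  // longer TTL for real data

    private static final class Entry {
        final String value;      // null means "user does not exist"
        final long expiresAt;
        Entry(String value, long ttl) {
            this.value = value;
            this.expiresAt = System.currentTimeMillis() + ttl;
        }
    }

    private final Map<Long, Entry> cache = new ConcurrentHashMap<>();

    public Optional<String> findUser(long userId) {
        // Prevention in advance: reject obviously illegal IDs at the outermost layer.
        if (userId <= 0) return Optional.empty();

        Entry e = cache.get(userId);
        if (e != null && e.expiresAt > System.currentTimeMillis()) {
            return Optional.ofNullable(e.value);   // hit, possibly a cached miss
        }

        String user = loadFromDatabase(userId);
        // Prevention after the fact: cache even a null result, with a much
        // shorter expiration, so repeated requests for a missing user stop
        // hitting the database.
        cache.put(userId, new Entry(user, user == null ? NULL_TTL_MILLIS : HIT_TTL_MILLIS));
        return Optional.ofNullable(user);
    }

    // Stand-in for the real database query; here only IDs up to 10,000 exist.
    String loadFromDatabase(long userId) {
        return userId <= 10_000 ? "user-" + userId : null;
    }
}
```

With a real Redis deployment, the map operations would become `GET`/`SETEX` calls with the same two-tier TTL idea.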
Combining these two approaches basically solves cache penetration: parameter validation filters out the bulk of illegal requests (say 80%) in advance, and Redis absorbs the risk of the remainder.
Cache breakdown
If your application has hot, high-traffic data, you usually put it in the cache to improve access speed, and set an expiration time to keep it fresh. But for these high-traffic keys there is a question to consider: when a hot key expires, will the flood of requests turn into a flood of database queries and crash the database?
For example, suppose a business key receives 10,000 concurrent requests. The moment the key expires, 10,000 threads may all query the database to rebuild the cache. Without appropriate measures, the database is likely to crash.
This is the cache breakdown problem, and it occurs at the instant a cached key expires. There are two common solutions: a mutex lock, or never letting the key expire.
Mutex lock
With a mutex, when a cached key expires and needs updating, threads must first acquire a lock, and only the thread holding the lock is allowed to rebuild the key. Threads that fail to acquire the lock sleep briefly and then re-read the cache. This way only one thread at a time reads the database, shielding it from the flood of requests.
For the lock, we can use the atomic operations provided by the cache itself. With Redis, for example, we can use the SETNX command.
Here key_mutex is an ordinary key-value pair that we set to 1 with SETNX. If another thread is already updating the cached key, SETNX returns 0, indicating the set failed and the lock was not acquired.
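The SETNX-based mutex can be sketched as below. This emulation uses `ConcurrentMap.putIfAbsent`, which has the same "set only if absent" semantics as Redis SETNX, so it runs without a Redis server; with a real client (e.g. Jedis) the equivalent would be a `SET key 1 NX` with an expiry. The class and method names are illustrative.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

public class CacheMutex {
    private final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();

    /** Returns true if this caller acquired the lock (SETNX returned 1). */
    public boolean tryLock(String key) {
        return store.putIfAbsent(key + "_mutex", "1") == null;
    }

    /** Release the lock after the cache has been rebuilt (DEL in Redis). */
    public void unlock(String key) {
        store.remove(key + "_mutex");
    }

    public String getWithMutex(String key, Supplier<String> loadDb) {
        String value = store.get(key);
        if (value != null) return value;              // cache hit
        if (tryLock(key)) {                           // only one thread rebuilds the key
            try {
                value = loadDb.get();                 // the single database read
                store.put(key, value);                // refill the cache
                return value;
            } finally {
                unlock(key);
            }
        }
        // Lost the race: sleep briefly, then re-read the (now refilled) cache.
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return store.get(key);
    }
}
```

In production the lock key should also carry an expiration time, so a crashed holder cannot leave the lock stuck forever.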
Never expire
From the cache's point of view, if keys are set to never expire, there can be no stampede of requests to the database on expiry. Instead, we usually refresh the cached data from the database on a schedule: a dedicated thread, or, more maturely, a scheduled task that periodically synchronizes the cache with the database.
The trade-off is data latency: readers may see data that is not quite up to date. For typical Internet features, a small delay is acceptable.
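The never-expire approach above can be sketched as follows: keys are stored without a TTL, and a scheduled background task periodically re-syncs them from the database. The names (`HotKeyCache`, `refreshAll`, the refresh period) are illustrative assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class HotKeyCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    /** Start the scheduled task that keeps the cache in sync with the database. */
    public void start(long periodSeconds) {
        // No per-key expiration; instead, refresh everything on a fixed schedule.
        scheduler.scheduleAtFixedRate(this::refreshAll, periodSeconds,
                periodSeconds, TimeUnit.SECONDS);
    }

    void refreshAll() {
        for (String key : cache.keySet()) {
            // Readers may briefly see stale data between refreshes.
            cache.put(key, loadFromDatabase(key));
        }
    }

    public String get(String key) {
        // Once loaded, a key never expires, so expiry can't stampede the database.
        return cache.computeIfAbsent(key, this::loadFromDatabase);
    }

    // Stand-in for the real database query.
    String loadFromDatabase(String key) {
        return "value-of-" + key;
    }
}
```

The refresh period directly bounds the staleness readers can observe, so it should be chosen per business requirement.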
Cache avalanche
Cache avalanche occurs when many keys are set with the same expiration time, so they all expire at once; every request then falls through to the database, and the instantaneous pressure can crash it.
For example, suppose we have 1,000 keys, each with only 10 concurrent requests. A cache avalanche means all 1,000 keys expire at the same moment, so the database suddenly receives 1,000 × 10 = 10,000 queries.
Problems caused by a cache avalanche are generally hard to troubleshoot; without prevention in advance, finding the cause can take a lot of effort. The simplest defense is to add a random offset (for example, 1-5 minutes) to each key's base expiration time, so that expiration times rarely coincide and avalanches become much less likely.
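The random-offset idea is a one-liner; the sketch below adds a uniform 1-5 minutes of jitter to a base TTL. The class and method names are illustrative, and the result would be passed as the expiry to whatever write command the cache uses (e.g. Redis SETEX).

```java
import java.util.concurrent.ThreadLocalRandom;

public class TtlJitter {
    /** Base TTL in seconds plus a uniform random 60-300 seconds of jitter. */
    public static long jitteredTtlSeconds(long baseTtlSeconds) {
        long jitter = ThreadLocalRandom.current().nextLong(60, 301); // 1-5 minutes
        return baseTtlSeconds + jitter;
    }
}
```

Keys written together now spread their expirations across a five-minute window instead of all expiring in the same instant.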
"Cache penetration" refers to requests for data that does not exist, so the cache never holds it and every request penetrates through the cache layer. For example, a request for a UserID of -1 reads the database every time because the user does not exist; attackers forging a large number of such requests can overwhelm the database.
"Cache breakdown" concerns a single key under high concurrency: at the instant the key expires, many requests hit the database simultaneously to rebuild the cache. With 10,000 concurrent requests on one business key, its expiry can send 10,000 threads to the database at once; without appropriate measures, the database is likely to crash.
"Cache avalanche" means many keys expire at the same time, like all the snow on a slope falling at once. With 1,000 keys at only 10 concurrent requests each, simultaneous expiry suddenly produces 1,000 × 10 = 10,000 queries.
To summarize how each arises:
Cache penetration stems from a business-level loophole that admits illegal requests; it has nothing to do with request volume or cache expiry. Cache breakdown only appears on hot data, happens at the instant a key expires, and has little to do with the business logic. Cache avalanche comes from many keys expiring at the same time, flooding the database with requests; even non-hot data can cause one, as long as enough keys expire together.
At this point you should have a deeper understanding of the issues to consider in Java cache design. Try putting these techniques into practice.