Cache Avalanche, Cache Breakdown, and Cache Penetration in Redis: An Example Analysis


This article walks through cache avalanche, cache breakdown, and cache penetration in Redis with examples. The content is easy to follow, and I hope it clears up your doubts about all three.

Cache avalanche

Cache breakdown

Cache penetration

These three problems are discussed all over the Internet, but I still want to cover them here, with plenty of diagrams to make them stick. They are also high-frequency interview questions, and it takes some skill to explain them clearly.

Before getting into the three problems, let's look at a normal request flow. See the figure:

The figure above shows roughly the following:

Your application code (a tomcat app or an rpc service) first checks whether the data it wants exists in the cache. If it does, the value is returned to the caller directly. If not, it queries the database, writes the result back to the cache, and then returns it to the caller, so the next query hits the cache.
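To make the flow concrete, here is a minimal cache-aside sketch in Java, assuming the Jedis client and a hypothetical loadFromDb helper standing in for the real database query; the 600s TTL is an illustrative value:

```java
import redis.clients.jedis.Jedis;

public class ProductCache {
    private final Jedis jedis = new Jedis("localhost", 6379);

    // Cache-aside read path: try the cache first, fall back to the DB on a miss.
    public String getProduct(String key) {
        String cached = jedis.get(key);
        if (cached != null) {
            return cached;                 // cache hit: return directly to the caller
        }
        String fromDb = loadFromDb(key);   // cache miss: query the database
        if (fromDb != null) {
            jedis.setex(key, 600, fromDb); // write back so the next query hits the cache
        }
        return fromDb;
    }

    // Hypothetical stand-in for the real database query.
    private String loadFromDb(String key) {
        return "...";
    }
}
```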

Cache avalanche

Definition

When I worked on a recommendation system, some data was produced by an offline algorithm: for each product we computed which similar products to recommend, and the results were stored in both hbase and redis. Because the data came from a batch job and was written to redis in bulk, setting the same expiration time on every key meant a huge number of keys would expire at the same moment. All of those requests would then fall through to the backend database, and since database throughput is limited, this can easily bring the database down. That is a cache avalanche. See the figure:

The point is that a cache avalanche is most likely exactly when a scheduled job sets cache entries in batches, so pay close attention to how the expiration times are set.

How to prevent an avalanche

The fix is simple: when you set cache entries in a batch, add a random offset to each expiration time (for example, a random number of seconds within 10 minutes, generated with java's Random). That way a large number of keys will not all expire at the same moment. See the figure, and the sketch below:
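A minimal sketch of the jitter idea, assuming the Jedis client; the base TTL and the 10-minute jitter window are illustrative values:

```java
import java.util.concurrent.ThreadLocalRandom;
import redis.clients.jedis.Jedis;

public class JitteredCacheWriter {
    // Batch writes add random jitter to each TTL so the keys
    // do not all expire at the same instant.
    public void cacheWithJitter(Jedis jedis, String key, String value) {
        int baseTtlSeconds = 3600;                                    // assumed base TTL
        int jitterSeconds = ThreadLocalRandom.current().nextInt(600); // up to 10 extra minutes
        jedis.setex(key, baseTtlSeconds + jitterSeconds, value);
    }
}
```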

What if an avalanche happens anyway?

If the traffic is small and the database can absorb it: congratulations, you dodged a bullet.

If the traffic is heavy and exceeds what the database can handle, the database goes down, and congratulations, you have earned yourself a P0 incident.

If the traffic is heavy but you have a rate-limiting scheme in front of the database, requests beyond the configured threshold are rejected, which protects the backend db. A few more words on rate limiting:

You can cap the number of requests per second that reach the DB. Note that this is not a global requests-per-second figure for all data; it can be set per key, i.e. the number of DB queries per second allowed for a given key. The goal is to stop a flood of requests for the same key from reaching the backend database, so that most of them are intercepted.

See the figure:

This way most requests for the same key are rejected, which protects the database.
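One way to sketch the per-key limit locally is Guava's RateLimiter, one limiter per key; the permits-per-second value is an assumption you would tune:

```java
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.ConcurrentHashMap;

public class PerKeyDbLimiter {
    private final ConcurrentHashMap<String, RateLimiter> limiters = new ConcurrentHashMap<>();
    private final double permitsPerSecond;

    public PerKeyDbLimiter(double permitsPerSecond) {
        this.permitsPerSecond = permitsPerSecond;
    }

    // Returns true if this request may go on to the database.
    public boolean tryQueryDb(String key) {
        RateLimiter limiter =
                limiters.computeIfAbsent(key, k -> RateLimiter.create(permitsPerSecond));
        return limiter.tryAcquire(); // false -> reject and protect the DB
    }
}
```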

Rate limiting comes in two forms: local and distributed. A later article will cover local rate limiting and redis-based distributed rate limiting.

Cache breakdown

Definition

Think of a site running a big promotion such as Singles' Day or a flash sale: overall traffic is very high, and one product becomes a hot item whose traffic is enormous. If that product's cache entry disappears for some reason, all of that key's traffic instantly rushes to the database, and the db eventually goes down. You can imagine the consequences: queries for perfectly normal other data stop working too.

See the figure:

The huawei pro key in redis suddenly disappears; maybe it expired, maybe memory pressure got it evicted. A flood of requests hits redis, finds the key missing, and falls through to the DB to query huawei pro there. The DB cannot hold the load and goes down.

How to solve it

In the end it all comes down to one thing: don't let too much traffic reach the DB, so we just need to limit the traffic that gets through to it.

1. Rate limiting

Similar to the avalanche case, this mainly limits traffic per key: when a key is broken down, only one request is allowed through to the db; the others are rejected, or wait and then retry the redis query.

For a diagram, refer to the rate-limiting figure in the cache avalanche section.

Again there are two variants: local rate limiting and distributed rate limiting.

Local rate limiting restricts the traffic for the key within a single instance; the limit only applies to that instance.

Distributed rate limiting applies across multiple instances in a distributed environment: the counted traffic for the key is the combined traffic from all instances, and once the limit is reached, every instance throttles its traffic to the DB. A sketch follows.
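A minimal distributed sketch, assuming the Jedis client: a fixed one-second window counter shared in redis, so the limit covers the combined traffic of all instances. The rl: key prefix and window size are illustrative choices:

```java
import redis.clients.jedis.Jedis;

public class DistributedKeyLimiter {
    // Fixed one-second window: all instances INCR the same counter in redis,
    // so the limit applies to their combined traffic for this key.
    public boolean allowDbQuery(Jedis jedis, String key, long limitPerSecond) {
        String counterKey = "rl:" + key + ":" + (System.currentTimeMillis() / 1000);
        long count = jedis.incr(counterKey);
        if (count == 1) {
            jedis.expire(counterKey, 2); // let old windows clean themselves up
        }
        return count <= limitPerSecond;
    }
}
```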

2. Using distributed locks

First, a quick definition of a distributed lock. In concurrent code we use locks for mutually exclusive access to shared resources to keep things thread-safe; likewise, in a distributed setting we need a mechanism that guarantees mutually exclusive access to a resource shared by multiple nodes. That mechanism is a distributed lock.

Here the shared resource is the huawei pro from the example: when loading huawei pro from the db, we must ensure that only one thread (one request) does the loading. That is exactly what a distributed lock gives us.

See the figure:

Acquiring the lock:

A large number of requests fail to find the huawei pro key and prepare to load the data from the db. Since the db-loading code is guarded by a distributed lock, every request (every thread) first tries to acquire the distributed lock for huawei pro (in the figure the distributed lock is implemented with redis; a separate article will cover distributed lock implementations, which are not limited to redis).

After acquiring the lock:

Say thread A acquires the distributed lock for huawei pro. Thread A loads the data from the DB, writes huawei pro back into the cache, and returns the data.

The threads that did not get the lock have two options. One is to return a null value to the client immediately. The other is to wait 50-100ms (querying the db and writing to redis is fast), then query redis again; by then the result is probably there. If not, they can return null, or retry. Under heavy concurrency, though, you want to return quickly and avoid too many retries. A sketch of the whole flow:
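Here is a minimal sketch of the lock flow with Jedis, using SET NX PX to take the lock and a small Lua script to release only our own lock; loadFromDb, the TTLs, and the 100ms wait are illustrative assumptions:

```java
import java.util.Collections;
import java.util.UUID;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class LockedCacheLoader {
    public String getWithLock(Jedis jedis, String key) throws InterruptedException {
        String value = jedis.get(key);
        if (value != null) return value;

        String lockKey = "lock:" + key;
        String token = UUID.randomUUID().toString();
        // SET NX PX: take the lock only if nobody holds it, with a safety timeout.
        if ("OK".equals(jedis.set(lockKey, token, SetParams.setParams().nx().px(3000)))) {
            try {
                value = loadFromDb(key);      // only the lock holder touches the DB
                jedis.setex(key, 600, value); // repopulate the cache
                return value;
            } finally {
                // Release only our own lock, atomically, via a small Lua script.
                String unlock = "if redis.call('get', KEYS[1]) == ARGV[1] "
                        + "then return redis.call('del', KEYS[1]) else return 0 end";
                jedis.eval(unlock, Collections.singletonList(lockKey),
                        Collections.singletonList(token));
            }
        }
        Thread.sleep(100);     // lost the race: wait 50-100ms, then re-check the cache
        return jedis.get(key); // may still be null; the caller can retry or give up
    }

    // Hypothetical stand-in for the real database query.
    private String loadFromDb(String key) {
        return "...";
    }
}
```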

3. Refresh hot keys with a scheduled task

This one is easy to understand: a scheduled task keeps an eye on the expiration times of the hot keys, and when a key is about to expire, it extends the key's time in the cache.

A single thread can poll, check, and renew the expiration times. See the figure:

With a single thread the number of hot keys must stay small, since one thread watches many keys. If there are many hot keys, use a thread pool, as in the figure and the sketch below:
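A minimal sketch with the JDK's ScheduledExecutorService and Jedis; the 30s polling period, the 60s "about to expire" threshold, and the 600s renewal are illustrative values:

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import redis.clients.jedis.Jedis;

public class HotKeyRefresher {
    // Poll the hot keys every 30s; if one is close to expiring, extend its TTL.
    public void start(Jedis jedis, List<String> hotKeys) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(4);
        pool.scheduleAtFixedRate(() -> {
            for (String key : hotKeys) {
                long ttl = jedis.ttl(key);  // seconds left; negative if gone or persistent
                if (ttl >= 0 && ttl < 60) {
                    jedis.expire(key, 600); // about to expire: push the TTL out again
                }
            }
        }, 0, 30, TimeUnit.SECONDS);
    }
}
```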

Delay queue implementation

Whether single-threaded or multi-threaded, the approaches above poll, which burns cpu on every pass and checks at imprecise times: while you wait for the next pass, the key may already be gone and the breakdown has already happened. The probability is low, but it exists. How do we avoid it? With a delay queue (often built on a ring structure such as a timing wheel; I won't go into the internals here, and you can look them up yourself). The idea is that you put a message on the queue with a delivery time, and it is not consumed until that time arrives. The flow, sketched in code after the steps:

1. On startup, the program reads the expiration time of each hot key on its list.

2. For each key, it enqueues a task whose delay fires earlier than the key's expiration time.

3. When a task's delay elapses, the consumer takes it off the queue.

4. The consumer processes the task and extends the key's expiration time in the cache.

5. The key is then re-enqueued with its new expiration time, to wait for the next renewal.
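A minimal sketch of steps 1-5 with the JDK's DelayQueue and Jedis; the TTL and renewal timings are illustrative:

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;
import redis.clients.jedis.Jedis;

// A renewal task that becomes ready shortly BEFORE the key's TTL runs out.
class RenewTask implements Delayed {
    final String key;
    final long fireAtMillis;

    RenewTask(String key, long fireAtMillis) {
        this.key = key;
        this.fireAtMillis = fireAtMillis;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(fireAtMillis - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                other.getDelay(TimeUnit.MILLISECONDS));
    }
}

class RenewConsumer {
    // take() blocks until a task is due (step 3), the TTL is extended (step 4),
    // and the key is re-enqueued for the next renewal (step 5).
    void consume(Jedis jedis, DelayQueue<RenewTask> queue) throws InterruptedException {
        while (true) {
            RenewTask task = queue.take();
            jedis.expire(task.key, 600);
            queue.put(new RenewTask(task.key,
                    System.currentTimeMillis() + 9 * 60 * 1000)); // fire 1 min before expiry
        }
    }
}
```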

4. Set the key to never expire

Note that even a key with no expiration can still be evicted when redis runs short of memory; think about which eviction policies can remove such a key.

Cache penetration

Definition

Penetration means accessing a key that exists neither in the cache nor in the database, so the traffic goes straight to the DB every time. Some bad actor can exploit this hole, hammer your interface with such keys, and wreck your DB, and your business stops working normally.

How to solve the problem?

1. Cache a null or sentinel value

When the database comes back empty, we can store a null (or some special sentinel value) for that key in redis, typically with a short expiration time, so the next request for it is answered from redis directly. A sketch:
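A minimal sketch with Jedis; the sentinel string, the short 60s TTL for misses, and loadFromDb are illustrative assumptions:

```java
import redis.clients.jedis.Jedis;

public class NullCachingLoader {
    private static final String NULL_SENTINEL = "<null>"; // assumed marker value

    public String getProduct(Jedis jedis, String key) {
        String cached = jedis.get(key);
        if (NULL_SENTINEL.equals(cached)) return null; // known-missing key: skip the DB
        if (cached != null) return cached;

        String fromDb = loadFromDb(key);
        if (fromDb == null) {
            jedis.setex(key, 60, NULL_SENTINEL); // cache the miss with a short TTL
        } else {
            jedis.setex(key, 600, fromDb);
        }
        return fromDb;
    }

    // Hypothetical stand-in for the real database query.
    private String loadFromDb(String key) {
        return null;
    }
}
```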

But this does not solve the root problem: if the attacker fabricates a large stream of distinct, useless keys, no amount of cached nulls or sentinels will help. So what then?

2. Bloom filter

In English this is a Bloom filter. Here we only give a brief introduction; for reasons of length, a separate article will cover it in depth.

Suppose the database stores tens of millions of sku records, and the requirement is: check redis for the sku; if redis doesn't have it, query the database and then update redis. The first idea is to load all skus into a hashmap keyed by sku, but with that many skus the map eats a huge amount of memory and may blow the heap; the cost outweighs the benefit.

How can we save memory? Use a bit array to record whether each sku exists: 0 means absent, 1 means present. Take a hash function, hash the sku, take the hash modulo the bit-array length to find a position, and set that bit to 1. When a request arrives, compute the same position: a 1 means the sku probably exists, a 0 means it definitely does not. That is a bare-bones Bloom filter. A Bloom filter has a false-positive rate; you can increase the array length and the number of hash functions to improve accuracy (the details are easy to look up, and we won't cover them today). A toy implementation:
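This is a toy Bloom filter over a java BitSet; the seeds, sizing, and string hash are illustrative, and a real deployment would use a tested implementation (e.g. Guava's BloomFilter or a redis-side filter):

```java
import java.util.BitSet;

public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int[] seeds = {7, 31, 131}; // one seed per hash function

    public SimpleBloomFilter(int size) {
        this.size = size;
        this.bits = new BitSet(size);
    }

    private int position(String value, int seed) {
        int h = 0;
        for (char c : value.toCharArray()) {
            h = seed * h + c;          // simple seeded string hash
        }
        return Math.floorMod(h, size); // map the hash into the bit array
    }

    public void add(String value) {
        for (int seed : seeds) bits.set(position(value, seed));
    }

    public boolean mightContain(String value) {
        for (int seed : seeds) {
            if (!bits.get(position(value, seed))) return false; // definitely absent
        }
        return true; // possibly present (false positives are possible)
    }
}
```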

Now let's see how a Bloom filter prevents cache penetration. See the figure:

The Bloom filter can be initialized by a scheduled task that reads the db: allocate the bit array (all zeros, meaning absent), then for each record compute the hash positions and set those bits.

For the request process, see the figure:

Without the Bloom filter, every key that doesn't exist in the database wastes two IOs: one redis query and one DB query. With the Bloom filter in front, both useless IOs are saved, reducing the load on redis and the DB. In code, the request path looks like this:
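Continuing the toy SimpleBloomFilter sketch above, a request path could look like this (loadFromDb is again a hypothetical stand-in, and the 600s TTL is illustrative):

```java
import redis.clients.jedis.Jedis;

public class SkuQueryService {
    // Bloom miss -> the sku is certainly not in the DB: answer immediately
    // and spare both the redis IO and the DB IO.
    public String getSku(SimpleBloomFilter filter, Jedis jedis, String sku) {
        if (!filter.mightContain(sku)) {
            return null; // definitely absent: zero IO spent
        }
        String cached = jedis.get(sku);
        if (cached != null) return cached;
        String fromDb = loadFromDb(sku);
        if (fromDb != null) jedis.setex(sku, 600, fromDb);
        return fromDb;
    }

    // Hypothetical stand-in for the real database query.
    private String loadFromDb(String sku) {
        return "...";
    }
}
```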

Summary

Cache avalanche

Solution:

Add random jitter to the expiration times; a spread of a few minutes is enough.

If an avalanche happens anyway, fall back to rate limiting.

Cache breakdown

Solution:

Rate limiting

Distributed lock

Refresh hot keys with a scheduled task, or better, a delay queue

Set the key to never expire

Cache penetration

Solution:

Cache a null or sentinel value in redis

Use a Bloom filter

That's all for "Cache Avalanche, Cache Breakdown, and Cache Penetration in Redis: An Example Analysis". Thanks for reading; I hope it helped clear things up.
