
How to solve java cache avalanche and cache breakdown

2025-04-06 Update From: SLTechnology News&Howtos > Development


Shulou (Shulou.com) 06/01 Report

This article mainly introduces how to solve cache avalanche and cache breakdown in Java. Many people run into doubts about these problems in daily work, so the editor has consulted various materials and sorted out simple, practical solutions. I hope it helps you resolve your doubts about cache avalanche and cache breakdown. Please follow along and study!

Cache avalanche

Cache avalanche means that a large number of keys in Redis expire at the same time. If a large number of requests arrive at that moment, they all bypass the cache and hit the database directly, which may overwhelm it.

Cache avalanche generally describes data that exists in the database but is missing from the cache: because the cached copies have expired, requests go straight to the database.

Solution

There are many ways to solve cache avalanches:

1. Add locks to ensure single-threaded access to the cache, so that many requests cannot reach the database at the same time.

2. Do not give every key the same expiration time. Typically, when warming up the cache during initialization, add a random offset to each key's expiry so that a large number of keys cannot expire at the same moment.

3. If memory allows, the cache can be set to never expire.
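As a concrete illustration of point 2, here is a minimal sketch of randomized expiry. The 600 s base TTL and 300 s jitter window are illustrative assumptions, not values from the article:

```java
import java.util.concurrent.ThreadLocalRandom;

public class JitteredTtl {
    static final int BASE_TTL_SECONDS = 600;   // assumed base expiry
    static final int MAX_JITTER_SECONDS = 300; // assumed jitter window

    // Spread expiries so warm-up keys do not all lapse at once.
    static int ttlWithJitter() {
        return BASE_TTL_SECONDS
                + ThreadLocalRandom.current().nextInt(MAX_JITTER_SECONDS + 1);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            System.out.println("key-" + i + " expires in " + ttlWithJitter() + "s");
        }
    }
}
```

In a real Redis setup the computed TTL would be passed to `SETEX` (or `EXPIRE`) when the warm-up data is written.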

Cache breakdown

Cache breakdown is very similar to cache avalanche, except that it refers to the expiry of a single hot key: a large number of concurrent requests all access that key at the same time, putting pressure on the database.

Solution

The solutions to cache breakdown are similar to those for cache avalanche:

1. Add locks to ensure single-threaded access to the cache. This way, the first request to reach the database rebuilds the cache, and subsequent requests read from the cache directly.

2. If memory allows, the cache can be set to never expire.
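The locking approach in point 1 can be sketched as follows. This is a simplified in-process example: a `ConcurrentHashMap` stands in for Redis, and the class and method names are hypothetical. Only the first thread for a missing key loads from the database; later threads read the value it cached:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class SingleFlightCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger dbHits = new AtomicInteger(); // counts database loads

    String get(String key, Function<String, String> dbLoader) {
        String v = cache.get(key);
        if (v != null) return v;           // fast path: cache hit, no lock taken
        synchronized (this) {              // coarse lock; a per-key lock also works
            v = cache.get(key);            // double-check after acquiring the lock
            if (v == null) {
                dbHits.incrementAndGet();  // only one thread pays this cost
                v = dbLoader.apply(key);
                cache.put(key, v);
            }
            return v;
        }
    }

    public static void main(String[] args) {
        SingleFlightCache c = new SingleFlightCache();
        c.get("user:1", k -> "alice");
        c.get("user:1", k -> "alice");
        System.out.println("db hits = " + c.dbHits.get()); // prints "db hits = 1"
    }
}
```

With Redis as the shared cache, the same idea is usually implemented with a distributed lock (e.g. `SET key value NX PX ttl`) rather than a JVM monitor.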

Cache penetration

The essential difference between cache penetration and the two phenomena above is that the requested data does not exist in the database at all; since it is not in the database, it will certainly not be in the cache either. Under high concurrency, every such request flows straight through to the database, putting it under great pressure.

Solution

For cache penetration, locking does not help much: because the key itself does not exist, requests keep reaching the database even if the number of concurrent threads is limited.

Generally speaking, the following solutions can be used to solve the problem of cache penetration:

1. Validate at the interface layer and reject illegal keys directly. For example, if the database uses an auto-increment id, a non-integer or negative id can be rejected immediately; if it uses a 32-character UUID, an id whose length is not 32 can likewise be rejected.

2. Cache the non-existent data as well: store an empty value or some other agreed-upon invalid value directly. With this solution, it is best to give such keys a short expiration time; otherwise a large number of non-existent keys will accumulate in Redis and take up a lot of memory.
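The empty-value caching in point 2 can be sketched like this. The sentinel string and the in-memory map standing in for Redis are illustrative assumptions; a real setup would also give the sentinel a short TTL as the article advises:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class NullCaching {
    private static final String NULL_SENTINEL = "__NULL__"; // agreed invalid value
    private final Map<String, String> cache = new HashMap<>();
    int dbQueries = 0;

    // Simulated database containing only one row.
    private Optional<String> dbLookup(String key) {
        dbQueries++;
        return "user:1".equals(key) ? Optional.of("alice") : Optional.empty();
    }

    Optional<String> get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return NULL_SENTINEL.equals(cached) ? Optional.empty() : Optional.of(cached);
        }
        Optional<String> fromDb = dbLookup(key);
        // Cache the sentinel for misses too, so repeat misses never reach the DB.
        cache.put(key, fromDb.orElse(NULL_SENTINEL));
        return fromDb;
    }

    public static void main(String[] args) {
        NullCaching c = new NullCaching();
        c.get("user:999"); // miss hits the database once
        c.get("user:999"); // repeat miss is answered from the cached sentinel
        System.out.println("db queries = " + c.dbQueries); // prints "db queries = 1"
    }
}
```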

Bloom filter (Bloom Filter)

Consider the second cache penetration solution above: if keys can bypass the first method's validation, and a huge number of non-existent keys are accessed (say 100 million or 1 billion), then caching all of them would occupy a great deal of space, waste server memory, and eventually exhaust it.

So is there a better solution? That is the Bloom filter, introduced next, which largely solves the problem of storing too many keys.

What is a Bloom filter?

Most people have seen this interview question: how do you quickly determine whether an element exists in a billion items of unordered data?

Solving it requires a Bloom filter; otherwise most servers simply do not have enough memory to hold data of that order of magnitude.

The Bloom filter (Bloom Filter) was proposed by Bloom in 1970. It is actually a very long binary vector (bitmap) and a series of random mapping functions (hash functions).

A Bloom filter can test whether an element is in a set. Its advantages are space efficiency and query time far better than those of general-purpose algorithms; its disadvantages are a certain false-positive rate and the difficulty of deleting elements.

Bitmap (Bitmap)

Redis offers a bitmap data structure, and the bitmap, i.e. a bit array, is the key implementation detail of a Bloom filter. Each position in this array holds only a 0 or a 1 and occupies just 1 bit, where 0 means no element maps there and 1 means some element does. The following figure shows a simple Bloom filter example (a key is hashed, then reduced with a bit operation to determine where it falls):

Hash collision

In the figure above, lonely and wolf land in the same position. The phenomenon of different keys producing the same value after hashing is called a hash collision; once two keys collide, the subsequent bit operation necessarily maps them to the same position.

If too many hash collisions occur, the accuracy of the judgment suffers, so to reduce collisions we generally consider the following two factors:

1. Increase the size of the bitmap array (the larger the bitmap array, the more memory it takes).

2. Increase the number of hash functions (if two keys collide under one hash function, the probability that they also produce identical results under two or more hash functions is naturally much lower).

These two methods must be weighed together: a larger bit array consumes more space, while more hash calculations consume more CPU and lengthen each lookup. How large the bit array should be and how many hash functions to use must therefore be analyzed case by case.
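To make the bit-array-plus-hash-functions idea concrete, here is a minimal Bloom filter sketch over `java.util.BitSet`. The double-hashing scheme for deriving the k positions is one common choice, an assumption of this sketch rather than something the article prescribes:

```java
import java.util.BitSet;

public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;       // m: length of the bit array
    private final int hashCount;  // k: number of hash functions

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive k positions from two base hashes (double hashing).
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9e3779b9;
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String key) {
        for (int i = 0; i < hashCount; i++) bits.set(position(key, i));
    }

    // true = "may exist" (false positives possible); false = "definitely absent".
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(key, i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter bf = new SimpleBloomFilter(1 << 16, 3);
        bf.add("wolf");
        bf.add("lonely");
        System.out.println(bf.mightContain("wolf"));  // true: added keys are always found
        System.out.println(bf.mightContain("nosql")); // almost certainly false here
    }
}
```

Note the asymmetry the next section formalizes: an added key is always reported present, while a key that was never added is only *probably* reported absent.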

Two characteristics of Bloom filter

The figure below shows a Bloom filter after two hash passes. It is easy to see that even though the key redis was never added, the two positions it hashes to are already set to 1 (one was set by wolf through f2, the other by Nosql through f1), so the filter would wrongly report that redis exists.

Therefore, from the point of view of the Bloom filter, we can draw a conclusion that the Bloom filter has two main characteristics:

1. If the Bloom filter determines that an element exists, then the element may exist.

2. If the Bloom filter determines that an element does not exist, then the element must not exist.

From the point of view of elements, two major characteristics can also be obtained:

1. If the element actually exists, then the Bloom filter must judge that it exists.

2. If the element does not exist, then the Bloom filter may determine that it exists.

PS: note that an element is judged to exist only if all N positions produced by the N hash functions are 1. As long as even one of them is 0, the element is definitely not in the Bloom filter.

Fpp

Because hash collisions cannot be avoided, a Bloom filter always has a misjudgment rate. This rate is called the false positive probability (False Positive Probability), abbreviated fpp.

In practice, you can specify an fpp when creating a Bloom filter, and Bloom filter theory then determines how many hash functions and how large a bit array you need. Note that this fpp cannot be set to 0, because there is no way to guarantee that hash collisions never occur.
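The classical sizing formulas behind that calculation (standard Bloom filter theory, not stated explicitly in this article) are m = -n·ln(p)/(ln 2)² bits and k = (m/n)·ln 2 hash functions, for n expected elements and target false positive probability p. A small sketch:

```java
public class BloomSizing {
    // m = -n * ln(p) / (ln 2)^2 : required bit-array length
    static long optimalBits(long n, double fpp) {
        return (long) Math.ceil(-n * Math.log(fpp) / (Math.log(2) * Math.log(2)));
    }

    // k = (m / n) * ln 2 : optimal number of hash functions
    static int optimalHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 1_000_000;   // expected number of keys (illustrative)
        double fpp = 0.01;    // desired false-positive probability
        long m = optimalBits(n, fpp);
        int k = optimalHashes(n, m);
        // roughly 9.6 million bits (about 1.2 MB) and 7 hash functions
        System.out.println("bits = " + m + ", hashes = " + k);
    }
}
```

This is why fpp can be made arbitrarily small at the cost of memory and CPU, but never exactly 0.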

This concludes the study of how to solve cache avalanche and cache breakdown in Java. I hope it has resolved your doubts. Pairing theory with practice is the best way to learn, so go and try it! If you want to keep learning related knowledge, please continue to follow this site for more practical articles.
