
How to optimize your cache architecture when a huge number of users access one hot Key


This article explains how to optimize your cache architecture when a huge number of users access a single hot Key. The approach is quite practical and worth sharing; hopefully you will get something out of it. Without further ado, let's dive in.

Why use a cache cluster

What are a hot Key and a big Value? Simply put, a hot Key is a single Key in your cache cluster that is suddenly hammered by tens of thousands or even hundreds of thousands of concurrent requests.

A big Value means the Value stored under one of your Keys may be gigabytes in size, causing network-related failures whenever that Value is queried.

Let's take a look at the following picture:

Simply put, suppose you have a system deployed as a cluster, with a cache cluster behind it. That cache cluster might be Redis Cluster, Memcached, or your company's home-grown cache; any of them will do.

So what does the system use the cache cluster for? Very simple: it puts data that rarely changes into the cache, so that when users query that rarely-changing data in large volumes, the reads can be served straight from the cache.

A cache cluster handles high concurrency well and delivers high read performance. For example, suppose you receive 20,000 requests per second and 90% of them are reads; that means 18,000 requests per second are reading data that barely changes, not writing it.

If you kept that data only in the database and sent all 20,000 requests per second to it for reads and writes, would that be appropriate?

Of course not. To make the database carry 20,000 requests per second, you would probably need sharding (splitting databases and tables) plus read/write separation.

For example, you might split the data across 3 master databases carrying 2,000 write requests per second, then hang 3 slaves off each master, for a total of 9 slaves carrying the 18,000 read requests per second.

That setup requires 12 high-spec database servers in total, which is expensive, very expensive, and a poor fit for the problem.

Take a look at the picture below to understand this situation:

Instead, you can put the rarely-changing data into a cache cluster, say 2 masters and 2 slaves, with the master nodes taking cache writes and the slave nodes serving cache reads.

Given cache-cluster performance, the two slave nodes can comfortably carry the 18,000 reads per second, while three database masters carry the 2,000 write requests per second plus a small number of remaining reads.

As the figure below shows, the machine count drops to 4 cache machines + 3 database machines = 7 machines, a lot less resource overhead than the 12 machines before.

Caching really is a crucial part of system architecture: for data that changes little but is read with high concurrency, a cache cluster is the right tool for absorbing those reads.
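
To make the read path concrete, here is a minimal sketch of the cache-aside pattern described above, using the Jedis client for Redis. The connection address, the 300-second TTL, and the loadFromDatabase helper are assumptions for the example, not details from the original article.

```java
import redis.clients.jedis.Jedis;

public class CacheAsideReader {

    private final Jedis jedis = new Jedis("cache-host", 6379); // assumed address

    public String get(String key) {
        // 1. Try the cache cluster first; most reads should end here.
        String value = jedis.get(key);
        if (value != null) {
            return value;
        }
        // 2. On a miss, fall back to the database...
        value = loadFromDatabase(key);
        // 3. ...and repopulate the cache with a TTL so stale data eventually expires.
        jedis.setex(key, 300, value);
        return value;
    }

    // Hypothetical database lookup; stands in for your real DAO/repository.
    private String loadFromDatabase(String key) {
        return "value-for-" + key;
    }
}
```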

All the machine counts and request numbers here are just examples; the main purpose is to give readers unfamiliar with caching enough background to understand what it means for a cache cluster to carry a system's read requests.

The problem: 200,000 users simultaneously accessing one hot cache key

With the background out of the way, we can get to today's key issue: the hot cache.

Let's assume you now have 10 cache nodes handling a large volume of read requests. Under normal circumstances, those reads should fall evenly across the 10 nodes, right?

Say each of those 10 nodes carries roughly 10,000 requests per second. Assume further that 20,000 requests per second is a node's hard limit, so you normally cap each node at 10,000 to leave some buffer.

So what is the hot cache problem? Very simple: suddenly, for whatever reason, a large number of users access the same piece of cached data.

For example, when a celebrity suddenly announces a marriage, won't that drive hundreds of thousands of users to view the same news story within a short time?

Suppose that news story is cached under a single cache Key living on one cache machine, and suddenly 200,000 requests converge on that one Key on that one machine.

What happens at this point? Let's look at the picture below to understand this feeling of despair:

The outcome is obvious. We just assumed a cache slave node tops out at 20,000 requests per second; a real single cache machine might handle 50,000 to 100,000 reads per second, but the figure here is only an assumption.

So what happens when 200,000 requests per second suddenly slam into that machine? Quite simply, the cache machine in the picture above, targeted by 200,000 requests, is overwhelmed and goes down.

And what happens once machines in the cache cluster start going down? Read requests that can no longer find the data pull the original data from the database and push it into the remaining cache machines.

But with 200,000 requests per second still pouring in, the next cache machine gets overwhelmed too, and so on, until the whole cache cluster collapses and takes the entire system down with it.

Let's take a look at the picture below and feel this horrible scene again:

Automatic Cache Hotspot Discovery Based on Streaming Computing Technology

The key point is that your system must be able to detect this kind of hotspot the moment it appears, and then rebalance the load automatically within milliseconds.

So, to begin: how do you automatically discover hot cache keys? First, understand that when a cache hotspot appears, per-second concurrency is very high; hundreds of thousands or even millions of requests per second are entirely possible.

That makes it entirely feasible to count real-time data accesses using big-data streaming technologies such as Storm, Spark Streaming, or Flink.

Then, while counting real-time accesses, if a piece of data is suddenly accessed more than, say, 1,000 times within one second, it is immediately flagged as hot data, and the discovered hot key can be written into Zookeeper, for example.

Of course, the exact threshold for deciding what counts as hot data should come from your own business and operational experience.
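
As a rough sketch of this detection step (in plain Java rather than a full Storm/Flink job), the counter below tallies accesses per key in one-second tumbling windows and fires a callback when a key crosses the threshold. The 1,000-per-second threshold matches the example above; publishing the hot key to Zookeeper is left as a comment, since the znode layout is up to you.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Consumer;

public class HotKeyDetector {

    private static final long THRESHOLD_PER_SECOND = 1_000; // from the example above

    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();
    private final Consumer<String> onHotKey;

    public HotKeyDetector(Consumer<String> onHotKey) {
        this.onHotKey = onHotKey;
        // Evaluate and reset the counters once per second (a tumbling window).
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::rollWindow, 1, 1, TimeUnit.SECONDS);
    }

    /** Call this on every cache access. */
    public void recordAccess(String key) {
        counters.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    private void rollWindow() {
        counters.forEach((key, counter) -> {
            if (counter.sum() > THRESHOLD_PER_SECOND) {
                // In a real deployment this callback would write the hot key to
                // Zookeeper (e.g. create a znode under an agreed path) so every
                // system instance can react.
                onHotKey.accept(key);
            }
        });
        counters.clear(); // start the next one-second window fresh
    }
}
```

A usage example: new HotKeyDetector(key -> System.out.println("hot key: " + key)) would simply log each detected hot key.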

Take a look at the following diagram to see how the whole process works:

Of course, some will ask: when the streaming system counts data accesses, won't a single counting machine itself end up receiving hundreds of thousands of requests per second?

No, because streaming technology, Storm in particular, can scatter accesses to the same piece of data across many machines for local counting first, and then aggregate the local results on one machine for a global summary.

So hundreds of thousands of access events can be spread over, say, 100 machines, each counting only a few thousand events for that piece of data.

The 100 local results are then merged on a single machine for the global count, so the statistics layer itself has no hotspot problem.
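
This scatter-then-gather idea boils down to merging partial counts. Here is a tiny sketch of the merge step, under the assumption that each worker has already produced a per-key partial count for the current window:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GlobalMerge {
    /** Merge per-worker partial counts into one global count per key. */
    static Map<String, Long> merge(List<Map<String, Long>> partialCounts) {
        Map<String, Long> global = new HashMap<>();
        for (Map<String, Long> partial : partialCounts) {
            partial.forEach((key, n) -> global.merge(key, n, Long::sum));
        }
        return global;
    }
}
```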

Hotspot cache automatically loaded into the JVM local cache

Each of our system instances can watch the Zookeeper znode that holds the discovered hot keys, and sense any change to it immediately.

When a change fires, the system layer immediately loads the relevant data from the database and places it directly in its own in-process local cache.
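
A minimal sketch of that watch-and-load step using the raw ZooKeeper client follows. The /hot-keys path, the one-child-znode-per-hot-key layout, and the loadFromDatabase helper are all assumptions for illustration:

```java
import org.apache.zookeeper.ZooKeeper;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

public class HotKeyWatcher {

    private final ConcurrentHashMap<String, String> localHotCache = new ConcurrentHashMap<>();
    private final ZooKeeper zk;

    public HotKeyWatcher(String zkConnect) throws Exception {
        zk = new ZooKeeper(zkConnect, 30_000, event -> { /* session events ignored here */ });
        watchHotKeys();
    }

    private void watchHotKeys() throws Exception {
        // Assumed layout: one child znode under /hot-keys per discovered hot key.
        List<String> hotKeys = zk.getChildren("/hot-keys", event -> {
            try {
                watchHotKeys(); // re-register the one-shot watch and reload
            } catch (Exception ignored) { }
        });
        for (String key : hotKeys) {
            // Load from the database and pin a copy in the JVM-local cache.
            localHotCache.put(key, loadFromDatabase(key));
        }
    }

    private String loadFromDatabase(String key) {
        return "value-for-" + key; // hypothetical DB lookup
    }
}
```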

For this local cache you can use Ehcache, a HashMap, or whatever suits your business. The point is that data normally held centrally in the cache cluster is placed directly in each system's own local cache, even though each system cannot cache very much data on its own.

A typical single-instance deployment machine might have 4 cores and 8 GB of RAM, leaving little room for a local cache, so reserving that space for exactly this kind of hot data is just the right trade-off.

Suppose your system-tier cluster has 100 machines: all 100 will instantly hold a local copy of the hotspot data.

From then on, reads of the hot key are served straight from each system's local cache and never touch the cache cluster.

Now the 200,000 reads per second can never converge on a single cache machine; instead, each of the 100 machines carries a couple of thousand requests and answers them from its own local cache, which is no problem at all.
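
Putting it together, the read path now checks the JVM-local hot cache before touching the cache cluster. A minimal sketch, reusing the hypothetical localHotCache map, jedis client, and loadFromDatabase helper from the earlier examples:

```java
public String get(String key) {
    // 1. Hot keys are answered from the in-process copy; no network hop at all.
    String value = localHotCache.get(key);
    if (value != null) {
        return value;
    }
    // 2. Everything else follows the normal cache-aside path against the cluster.
    value = jedis.get(key);
    if (value == null) {
        value = loadFromDatabase(key);
        jedis.setex(key, 300, value);
    }
    return value;
}
```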

Let's draw a picture again and look at this process together:

Rate-limiting and circuit-breaker protection

In addition, each system should add rate-limiting and circuit-breaker protection around hot data access.

Within each system instance, a breaker mechanism can be added. Suppose the cache cluster can carry at most 40,000 read requests per second and you have 100 system instances in total.

Then each instance should limit itself to at most 400 cache-cluster reads per second. Beyond that limit the breaker trips: the instance stops calling the cache cluster, returns a blank response directly, and the user simply refreshes the page a moment later.
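
Here is a minimal sketch of that per-instance limit using Guava's RateLimiter (a simple stand-in for a full circuit breaker such as Sentinel or Hystrix). The 400-permits-per-second figure comes from the example above, and the empty-string fallback stands in for whatever blank response your pages render:

```java
import com.google.common.util.concurrent.RateLimiter;

public class GuardedCacheReader {

    // 400 cache-cluster reads per second per instance, as in the example above.
    private final RateLimiter limiter = RateLimiter.create(400.0);

    public String read(String key) {
        if (limiter.tryAcquire()) {
            return readFromCacheCluster(key); // normal path
        }
        // Limit exceeded: don't touch the cache cluster, return a blank
        // response and let the user retry by refreshing.
        return "";
    }

    private String readFromCacheCluster(String key) {
        return "value-for-" + key; // hypothetical cluster read
    }
}
```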

By adding rate-limiting and circuit-breaking directly at the system layer, you protect the cache cluster and the database cluster behind it from being crushed. Let's take a look at the following figure:

Summary

Should you implement this complex cache-hotspot optimization architecture in your own system? That depends on whether your system actually faces this scenario.

If it does have a hot-cache problem, then it makes sense to build a hotspot support architecture along the lines described in this article.

But if you don't, don't over-design your system; it probably doesn't need such a complex architecture at all.

That is how to optimize the cache architecture when users pile onto a single hot Key; hopefully some of these points will prove useful in your daily work.
