Why must there be Redis in distributed systems?


This article introduces the reasons why Redis is a must in distributed systems. The content is detailed and easy to understand, and I believe you will gain something from reading it. Let's take a look.

Why use Redis

I think the use of Redis in a project should be considered mainly from two angles: performance and concurrency.

Of course, Redis has other functions as well, such as distributed locks, but if you only need distributed locks and the like, other middleware such as Zookeeper can take Redis's place. Therefore, this question is mainly answered from the angles of performance and concurrency.

Performance

When we encounter a SQL query that takes a long time to execute and whose result does not change often, it is especially suitable to cache the result. Subsequent requests then read from the cache, so they respond quickly.
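As a concrete illustration, here is a minimal cache-aside sketch in Python using the redis-py client; the query_slow_sql function, key name, and TTL are hypothetical stand-ins for your own slow query:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def query_slow_sql():
    # Hypothetical stand-in for a slow SQL query whose result rarely changes.
    return {"total_orders": 123456}

def get_report(cache_key="report:daily", ttl=300):
    cached = r.get(cache_key)                     # 1. try the cache first
    if cached is not None:
        return json.loads(cached)                 # fast path: no database hit
    result = query_slow_sql()                     # 2. miss: run the slow query
    r.set(cache_key, json.dumps(result), ex=ttl)  # 3. cache the result with a TTL
    return result
```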

Digression: I suddenly want to talk about this standard of "responding quickly." There is no fixed number for response time; it depends on the desired interaction effect.

But someone once told me: "Ideally, a page jump should finish within an instant, and an operation on a page should finish within a flick of a finger. Any operation that takes longer than a flick of a finger must show a progress indicator and be suspendable or cancelable at any time, to give the user a good experience."

So how long, exactly, are an instant, a thought, and a flick of a finger?

According to the Mahasanghika Vinaya (《摩诃僧祇律》):

One kṣaṇa is one thought; twenty thoughts make an instant; twenty instants make a flick of a finger; twenty flicks of a finger make a lava; twenty lavas make a moment; and a day and a night contain thirty moments.

So, doing the arithmetic: a day and a night is 86,400 seconds, which makes a moment 2,880 seconds, a lava 144 seconds, a flick of a finger 7.2 seconds, an instant 0.36 seconds, and a thought a mere 0.018 seconds.

Concurrency

Under heavy concurrency, if all requests hit the database directly, the database connections will fail.

At this point, you need Redis as a buffer, so that requests hit Redis first instead of going straight to the database.

What are the disadvantages of using Redis

Having used Redis for so long, you must understand this issue. You will inevitably run into some problems when using Redis, and the common ones come down to just a few.

The answer covers mainly four problems:

Double write consistency between cache and database

Cache avalanche problem

Cache penetration problem

Concurrent contention for a cached Key

I personally think these four problems are frequently encountered in projects; concrete solutions are given later.

Why is single-threaded Redis so fast?

This question probes Redis's internal mechanism. In my interview experience, many people do not know that Redis works on a single-threaded model, so this question deserves a review.

The answer is mainly the following three points:

Pure memory operation

Single-thread operation avoids frequent context switching

A non-blocking I/O multiplexing mechanism is adopted.

Digression: now let's talk in detail about the I/O multiplexing mechanism, because the term is so popular that most people do not actually understand what it means.

Let's take an example: Xiaoqu opened a courier store in City S, handling same-city express delivery. Xiaoqu hired a group of couriers, but then found that, money being tight, there was only enough to buy one car for making deliveries.

Operating mode 1

Every time a customer sends a delivery, Xiaoqu assigns a courier to keep an eye on it, and that courier then drives off to deliver it.

Gradually, Xiaoqu found the following problems with this mode:

Dozens of couriers basically spend their time fighting over the one car; most couriers sit idle, and only whoever grabs the car can make a delivery.

As deliveries increase, so does the number of couriers, and Xiaoqu finds the store getting more and more crowded, with no room to hire new couriers.

Coordination among the couriers eats up a lot of time.

Given these shortcomings, Xiaoqu learned from the painful experience and came up with the following mode of operation.

Operating mode 2

Xiaoqu employs only one courier. For each delivery a customer sends in, Xiaoqu marks it with its destination and places them all in one spot, in order.

The courier picks up the deliveries in turn, one at a time, drives off to deliver it, then comes back for the next one.

Comparing the two modes of operation, don't you clearly feel that the second is more efficient?

In the above analogy:

Every courier → every thread

Every delivery → every Socket (I/O stream)

The delivery's destination → the different states of a Socket

A customer's delivery request → a request from a client

Xiaoqu's mode of operation → the code running on the server

The one car → the number of CPU cores

So we have the following conclusion:

The first mode of operation is the traditional concurrency model, where each I/O stream (delivery) is managed by a new thread (courier).

The second mode of operation is I/O multiplexing: a single thread (a single courier) manages multiple I/O streams by tracking the state of each stream (the destination of each delivery).

Mapping the analogy onto the real Redis threading model:

Simply put, our redis-client generates Sockets with different event types as it operates.

On the server side, there is an I/O multiplexing program that places these Sockets in a queue. The file event dispatcher then takes them from the queue in turn and forwards them to the appropriate event handlers.

Note that for this I/O multiplexing mechanism, Redis supports several multiplexing libraries, such as select, epoll, evport, and kqueue; you can look into them.
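To make the idea concrete in code (this is an analogy, not Redis's actual C implementation), here is a single-threaded echo server built on Python's selectors module: one thread, like the lone courier, watches many sockets and handles whichever becomes ready. The host and port are illustrative.

```python
import selectors
import socket

sel = selectors.DefaultSelector()     # picks epoll/kqueue/select per platform

def accept(server):
    conn, _ = server.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle)  # start tracking this socket

def handle(conn):
    data = conn.recv(1024)
    if data:
        conn.sendall(data)            # echo the data back
    else:
        sel.unregister(conn)          # client closed: stop tracking
        conn.close()

server = socket.socket()
server.bind(("localhost", 6399))
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept)

while True:                           # the single-threaded event loop
    for key, _ in sel.select():       # block until some socket is ready
        key.data(key.fileobj)         # dispatch to the registered handler
```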

The data types of Redis and the usage scenarios for each data type

Do you think this question is very basic? I think so too. However, from interview experience, at least 80% of candidates cannot answer it well.

My suggestion: use the types in a real project first, then review them; the impression will be far deeper than rote memorization. Basically, a qualified programmer will use all five types.

String

Nothing much to say here: the most conventional set/get operations, where Value can be a string or a number. It is generally used as a cache for various counting functions.
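A minimal counting sketch with redis-py (the key name is illustrative); INCR is atomic, so concurrent clients cannot lose updates:

```python
import redis

r = redis.Redis(decode_responses=True)

r.set("page:home:views", 0)        # initialize the counter
r.incr("page:home:views")          # atomic increment, safe under concurrency
print(r.get("page:home:views"))    # '1'
```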

Hash

Here Value stores a structured object, and it is convenient to manipulate one of its fields.

When I built single sign-on, I used this data structure to store user information, with the CookieId as the Key and a 30-minute cache expiration time, which nicely simulates a Session-like effect.
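A minimal sketch of that session pattern, assuming a hypothetical CookieId and field names:

```python
import redis

r = redis.Redis(decode_responses=True)

cookie_id = "sess:9f8e7d"                      # hypothetical CookieId as the Key
r.hset(cookie_id, mapping={"uid": "42", "name": "alice", "role": "admin"})
r.expire(cookie_id, 30 * 60)                   # 30-minute "session" lifetime

user = r.hgetall(cookie_id)                    # read the whole object back
r.hset(cookie_id, "role", "user")              # or update a single field in place
```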

List

With the List data structure, you can implement a simple message queue. You can also use the lrange command to build Redis-based paging, with good performance and a good user experience.
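A minimal sketch of both uses, with illustrative key names:

```python
import redis

r = redis.Redis(decode_responses=True)

# Simple message queue: producers LPUSH, a consumer blocks on BRPOP.
r.lpush("queue:email", "msg-1", "msg-2")
source, msg = r.brpop("queue:email")       # ('queue:email', 'msg-1')

# Paging: fetch items 0-9 of a list, i.e. page 1 with a page size of 10.
page = r.lrange("timeline:alice", 0, 9)
```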

Set

Because a Set holds a collection of unrepeated values, it can provide global deduplication. Why not use the Set that comes with the JVM for deduplication?

Because our systems are generally deployed as clusters, using the JVM's own Set is troublesome: would you really stand up a shared service just for global deduplication?

In addition, using intersection, union, difference, and other operations, you can compute common preferences, all preferences, one's own unique preferences, and so on.
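A minimal sketch of deduplication and the preference computations, with illustrative key names:

```python
import redis

r = redis.Redis(decode_responses=True)

# Global dedup: SADD returns 1 the first time, 0 for a duplicate.
if r.sadd("seen:user_ids", "u42"):
    print("first time seeing u42")

r.sadd("likes:alice", "redis", "mysql", "go")
r.sadd("likes:bob", "redis", "kafka")

common = r.sinter("likes:alice", "likes:bob")   # common preferences
every = r.sunion("likes:alice", "likes:bob")    # all preferences combined
only_a = r.sdiff("likes:alice", "likes:bob")    # alice's unique preferences
```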

Sorted Set

Sorted Set has an extra weight parameter, Score, and the elements in the collection are ordered by Score.

You can build leaderboard applications and TOP N queries with it. Sorted Set can also implement delayed tasks. A final application is range queries.
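A minimal leaderboard sketch (member names and scores are illustrative):

```python
import redis

r = redis.Redis(decode_responses=True)

r.zadd("leaderboard", {"alice": 3200, "bob": 2750, "carol": 4100})

# TOP 3 by score, highest first, with scores attached.
top3 = r.zrevrange("leaderboard", 0, 2, withscores=True)

# Range query: everyone whose score falls between 3000 and 5000.
mid = r.zrangebyscore("leaderboard", 3000, 5000)
```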

Redis's expiration policy and memory eviction mechanism

This question is very important; it reveals whether you have really gotten to the bottom of Redis.

For example, suppose your Redis can only hold 5 GB of data but you write 10 GB: 5 GB of data will be deleted. How is it deleted? Have you thought about that?

Also, your data has expiration times set, yet when the time is up, memory usage is still high. Have you thought about why?

Answer: Redis adopts a strategy of periodic deletion plus lazy deletion.

Why not use a timed deletion policy?

Timed deletion means using a timer to watch each Key and delete it automatically the moment it expires. Although this releases memory promptly, it consumes CPU resources.

Under heavy concurrent requests, the CPU should spend its time processing requests rather than deleting Keys, so this strategy is not adopted.

How does periodic deletion + lazy deletion work?

With periodic deletion, Redis by default checks every 100ms for expired Keys and deletes any it finds.

Note that Redis does not check all Keys every 100ms; it randomly samples them (if every Key were checked every 100ms, Redis would grind to a halt).

Therefore, with the periodic deletion policy alone, many Keys will still be alive past their deadline, which is where lazy deletion comes in handy.

That is, when you get a Key, Redis checks whether that Key has an expiration time set and whether it has expired; if it has expired, it is deleted on the spot.

Are there no other problems with periodic + lazy deletion?

There are. Suppose periodic deletion misses a Key and you never request that Key afterwards, so lazy deletion never kicks in either. Redis's memory will keep climbing. This is where the memory eviction mechanism must step in.

There is a line of configuration in redis.conf:

# maxmemory-policy volatile-lru

This line configures the memory eviction strategy (what, you haven't configured it? Go reflect on yourself):

noeviction: when memory is insufficient to hold newly written data, new writes report an error. I don't think anyone uses this.

allkeys-lru: when memory is insufficient to hold newly written data, remove the least recently used Key from the key space. Recommended; this is what our current project uses.

allkeys-random: when memory is insufficient to hold newly written data, remove a random Key from the key space. Surely no one uses this; if you must evict, why evict at random rather than the least recently used Key?

volatile-lru: when memory is insufficient to hold newly written data, remove the least recently used Key from among the Keys with an expiration time set. This is generally used when Redis serves as both a cache and persistent storage. Not recommended.

volatile-random: when memory is insufficient to hold newly written data, remove a random Key from among the Keys with an expiration time set. Still not recommended.

volatile-ttl: when memory is insufficient to hold newly written data, among the Keys with an expiration time set, remove those expiring soonest first. Not recommended.

PS: if no Key ever has expire set, the precondition ("the key space with an expiration time set") is never met; in that case the volatile-lru, volatile-random, and volatile-ttl policies behave essentially like noeviction (nothing is deleted).
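A minimal sketch of setting the policy at runtime with redis-py; the 2gb limit is illustrative, and CONFIG SET may be disabled on managed Redis services:

```python
import redis

r = redis.Redis(decode_responses=True)

# Equivalent to these redis.conf lines (values are illustrative):
#   maxmemory 2gb
#   maxmemory-policy allkeys-lru
r.config_set("maxmemory", "2gb")
r.config_set("maxmemory-policy", "allkeys-lru")
print(r.config_get("maxmemory-policy"))   # {'maxmemory-policy': 'allkeys-lru'}
```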

Double-write consistency between Redis and the database

Consistency is a common distributed-systems problem, which subdivides into eventual consistency and strong consistency. Once you double-write to the database and the cache, inconsistency is bound to appear.

To answer this question, first understand one premise: if the data has strong consistency requirements, it must not go through the cache. Everything we do can only guarantee eventual consistency.

Moreover, fundamentally, any scheme we devise can only reduce the probability of inconsistency, never eliminate it entirely. That is why strongly consistent data cannot be cached.

Answer: first, adopt the correct update strategy: update the database first, then delete the cache. Second, since deleting the cache may fail, provide a compensating measure, for example using a message queue to retry the deletion.
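A minimal sketch of that strategy; update_db and mq_send are hypothetical stand-ins for your database write and your message-queue producer:

```python
import redis

r = redis.Redis(decode_responses=True)

def update_db(user_id, data):
    ...  # stand-in for the actual database write

def mq_send(topic, payload):
    ...  # stand-in for publishing to a real message queue (Kafka, RabbitMQ, ...)

def update_user(user_id, data):
    update_db(user_id, data)              # 1. update the database first
    try:
        r.delete(f"user:{user_id}")       # 2. then delete the cache
    except redis.RedisError:
        # 3. deletion failed: hand the key to a message queue, whose consumer
        #    keeps retrying the delete until it succeeds (eventual consistency).
        mq_send("cache-delete-retry", f"user:{user_id}")
```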

How to deal with cache penetration and cache avalanche

To tell the truth, small and medium-sized traditional software companies rarely run into these two problems. But if you have a highly concurrent project with traffic in the millions, these two issues must be considered carefully.

Cache penetration: attackers deliberately request data that does not exist in the cache, so every request falls through to the database, causing database connection failures.

Cache penetration solutions:

Use a mutex: when the cache misses, grab a lock first; whoever gets the lock queries the database and rebuilds the cache, and whoever fails to get it sleeps for a while and retries (see the sketch after this list).

Use an asynchronous update strategy: return directly whether or not the Key yields a value. The Value itself carries a cache expiration time; if the cached value has expired, asynchronously start a thread to read the database and refresh the cache. This requires cache warm-up (loading the cache before the project launches).

Provide an interception mechanism that quickly judges whether a request is valid, for example a Bloom filter that maintains the set of legitimate Keys internally. Quickly check whether the Key carried by a request is legitimate; if not, return directly.
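A minimal sketch of the mutex approach from the first item above, using SET with NX/EX as the lock; the key names, timings, and load_from_db are illustrative:

```python
import json
import time
import redis

r = redis.Redis(decode_responses=True)

def load_from_db(key):
    ...  # stand-in for the real database query

def get_with_mutex(key, ttl=300):
    while True:
        val = r.get(key)
        if val is not None:
            return json.loads(val)                    # cache hit
        # Cache miss: only the lock winner rebuilds the cache.
        if r.set(f"lock:{key}", "1", nx=True, ex=10):
            try:
                fresh = load_from_db(key)
                r.set(key, json.dumps(fresh), ex=ttl)
                return fresh
            finally:
                r.delete(f"lock:{key}")
        time.sleep(0.05)                              # lost the lock: nap and retry
```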

Cache avalanche: a large swath of the cache expires at the same moment, so the next wave of requests all hit the database, causing database connection failures.

Cache avalanche solutions:

Add a random value to each cache expiration time to avoid collective expiry (see the sketch after this list).

Use a mutex, though this scheme's throughput drops noticeably.

Double cache: keep two caches, cache A and cache B. Cache A expires after 20 minutes; cache B has no expiration time. Do the cache warm-up yourself.

Then break the read path down as follows: read from cache A and, if it has data, return directly; if A is empty, read from B, return directly, and asynchronously start an update thread that refreshes both cache A and cache B.
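A minimal sketch of the first item, adding random jitter to the TTL; the base TTL and jitter range are illustrative:

```python
import random
import redis

r = redis.Redis(decode_responses=True)

def set_with_jitter(key, value, base_ttl=1200):
    # Spread expirations over base_ttl .. base_ttl + 300 seconds so that
    # keys written in the same batch do not all expire in the same instant.
    r.set(key, value, ex=base_ttl + random.randint(0, 300))
```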

How to solve the problem of concurrent contention for a Key in Redis

The problem: multiple subsystems Set the same Key at the same time. Have you thought about what to watch out for here?

I should note that I searched Baidu in advance and found that the answers basically recommend Redis's transaction mechanism.

I do not recommend Redis's transaction mechanism, because our production environment is basically a Redis cluster with data sharding.

When a transaction involves operations on multiple Keys, those Keys may not live on the same redis-server, so Redis's transaction mechanism is of little practical use here.

If the operations on this Key do not require ordering

In this case, prepare a distributed lock; everyone competes for the lock, and whoever grabs it performs the set operation. This is relatively simple (a sketch follows).
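A minimal sketch of that lock, using SET with NX/EX and a unique token so one system cannot release another's lock. For production you would want a Lua script or a library such as Redlock; treat this only as the idea:

```python
import uuid
import redis

r = redis.Redis(decode_responses=True)

def set_with_lock(key, value):
    token = str(uuid.uuid4())
    # NX: succeed only if nobody holds the lock; EX: auto-release after 10s.
    if not r.set(f"lock:{key}", token, nx=True, ex=10):
        return False                        # someone else grabbed the lock
    try:
        r.set(key, value)                   # the guarded set operation
    finally:
        if r.get(f"lock:{key}") == token:   # release only our own lock
            r.delete(f"lock:{key}")         # (check+delete is not atomic; use Lua in prod)
    return True
```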

If the operations on this Key require ordering

Suppose there is a key1: system A needs to set key1 to valueA, system B to valueB, and system C to valueC.

We expect key1's value to change in the order valueA → valueB → valueC. In this case, save a timestamp when writing each value to the database.

Suppose the timestamp is as follows:

System A key1 {valueA 3:00}

System B key1 {valueB 3:05}

System C key1 {valueC 3:10}

So, suppose system B grabs the lock first and sets key1 to {valueB 3:05}. System A then grabs the lock and finds that valueA's timestamp is earlier than the timestamp in the cache, so it skips the set operation. And so on.
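A minimal sketch of that timestamp check, assumed to run while holding the distributed lock; storing the value as JSON together with its write timestamp is an illustrative choice:

```python
import json
import redis

r = redis.Redis(decode_responses=True)

def set_if_newer(key, value, ts):
    # Assumed to be called while holding the distributed lock on `key`.
    cached = r.get(key)
    if cached is not None and json.loads(cached)["ts"] >= ts:
        return False                  # cache already holds a newer write: skip
    r.set(key, json.dumps({"value": value, "ts": ts}))
    return True

set_if_newer("key1", "valueB", 305)   # system B wins the lock first
set_if_newer("key1", "valueA", 300)   # returns False: 3:00 < 3:05, no overwrite
```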

Other approaches also work, such as using a queue to turn the set operations into serial access. In short, be flexible.

This concludes the article on why there must be Redis in distributed systems. Thank you for reading! I hope you now have a clearer understanding of the topic; if you want to learn more, you are welcome to follow the industry information channel.
