How to solve the problem of a large number of connection timeouts in Redis

2025-01-16 Update. From: SLTechnology News&Howtos (shulou)

Today I would like to talk about how to troubleshoot a large number of connection timeouts in Redis. Many people may not be familiar with the process, so the checks are summarized below in the order we usually run them. I hope you get something out of this article.

Troubleshooting approach

Check the distribution of exceptions

First of all, based on experience, look at which servers the exceptions occur on. Switch the monitoring view to the per-host dimension and check whether the exceptions are evenly distributed. If the distribution is uneven and only a few hosts show an especially high count, you have basically located the problem machines.
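The per-host check above can be sketched in Python. This is a minimal sketch under stated assumptions: the host names, the `ratio` of 5x the fleet median, and the `min_count` floor are all hypothetical values chosen for illustration, not thresholds from the article.

```python
from statistics import median


def find_outlier_hosts(timeouts_by_host, ratio=5.0, min_count=10):
    """Return hosts whose timeout count is far above the fleet median.

    ratio and min_count are illustrative assumptions, not recommended values.
    """
    if not timeouts_by_host:
        return []
    baseline = median(timeouts_by_host.values())
    # Guard against a zero median when most hosts report no timeouts.
    floor = max(baseline, 1.0)
    return sorted(
        host
        for host, count in timeouts_by_host.items()
        if count >= min_count and count / floor >= ratio
    )


# Hypothetical per-host timeout counts taken from a monitoring dashboard.
counts = {"app-01": 3, "app-02": 4, "app-03": 250, "app-04": 2, "app-05": 5}
print(find_outlier_hosts(counts))
```

An empty result means the timeouts are spread fairly evenly, so the cause is probably not a single bad machine and the remaining checks apply.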

Ah, that would be very comfortable: the problem jumps out right away, with only a few machines showing very high counts.

However, it is rarely that easy, so let's continue with the troubleshooting approach.

Redis situation

Once again, based on experience: although the colleagues in charge of Redis insist that Redis is super stable, blah blah blah, we stay skeptical and do not take their word for it. This is very important, especially at work: don't simply believe what others say. In the spirit of Detective Conan, everyone is a suspect when a murder occurs. Of course, you rule yourself out first and firmly believe that the blame is not yours!

All right, let's see whether any node in the Redis cluster is overloaded. As a rule of thumb, 80% load can be used as the threshold.

If only one or a few nodes exceed the threshold, there may be a hot key problem. If most of the nodes exceed it, Redis as a whole is under too much pressure.
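The rule above can be written down as a small classifier. This is only a sketch: the node names and load values are hypothetical, load is assumed to be a fraction between 0 and 1, and treating "most of the nodes" as more than half is my assumption rather than something the article specifies.

```python
def classify_redis_load(load_by_node, threshold=0.80, majority=0.5):
    """Classify cluster load using the 80% rule of thumb.

    Returns (verdict, overloaded_nodes). The majority cutoff of 0.5 is an
    assumption used to separate "a few nodes" from "most nodes".
    """
    overloaded = sorted(n for n, load in load_by_node.items() if load > threshold)
    if not overloaded:
        return "ok", overloaded
    if len(overloaded) / len(load_by_node) > majority:
        return "overall pressure", overloaded
    return "possible hot key", overloaded


# Hypothetical load per node, as a fraction of capacity.
cluster = {"node-1": 0.35, "node-2": 0.92, "node-3": 0.40, "node-4": 0.38}
print(classify_redis_load(cluster))
```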

In addition, check whether there are slow requests. If there are, and their timing matches the time of the problem, there may be a big key problem.
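Matching slow requests against the incident window could look like the sketch below. It assumes each entry is a dict with `start_time` (Unix seconds) and `duration` (microseconds), which is the shape of the data reported by the Redis `SLOWLOG GET` command; the sample entries, the window, and the 10 ms cutoff are hypothetical.

```python
def slow_requests_in_window(entries, window_start, window_end, min_duration_us=10_000):
    """Return slow log entries that started inside the incident window.

    entries: dicts with 'start_time' (Unix seconds) and 'duration'
    (microseconds). The 10 ms default cutoff is an illustrative assumption.
    """
    return [
        e
        for e in entries
        if window_start <= e["start_time"] <= window_end
        and e["duration"] >= min_duration_us
    ]


# Hypothetical slow log entries and incident window.
slowlog = [
    {"id": 1, "start_time": 1700000000, "duration": 45_000, "command": "HGETALL big:hash"},
    {"id": 2, "start_time": 1700000900, "duration": 12_000, "command": "GET small:key"},
    {"id": 3, "start_time": 1700000030, "duration": 2_000, "command": "GET other:key"},
]
matches = slow_requests_in_window(slowlog, 1700000000, 1700000060)
print([e["command"] for e in matches])
```

If the commands that show up inside the window keep touching the same key, that key is a reasonable big key candidate to inspect.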

Hmm...

Redis is fine. Redis is as steady as an old dog.

CPU

Let's assume we are still stuck and have not found the problem. Don't worry: keep looking for someone else to blame and check the CPU. Maybe the operations team quietly gave us under-provisioned machines.

Check how high CPU utilization is and whether it exceeds 80%. Also compare against your own history; in our experience, our services previously peaked at around 60% at most.

Then check whether the CPU is being throttled, and whether the throttling is dense or long-lasting.

If you see these symptoms, the blame lies with operations for giving us insufficient machine resources.
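One way to quantify CPU throttling is sketched below. It assumes the service runs under a Linux cgroup CPU quota, where the cgroup's `cpu.stat` file exposes cumulative `nr_periods` and `nr_throttled` counters; the article does not say which mechanism its monitoring uses, so this is an assumption, and the sample counter values are made up.

```python
def parse_cpu_stat(text):
    """Parse 'key value' lines from a cgroup cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(before, after):
    """Fraction of scheduler periods that were throttled between two samples."""
    periods = after["nr_periods"] - before["nr_periods"]
    throttled = after["nr_throttled"] - before["nr_throttled"]
    return throttled / periods if periods > 0 else 0.0


# Hypothetical cpu.stat contents sampled one minute apart.
sample_before = "nr_periods 1000\nnr_throttled 50\nthrottled_time 123456789\n"
sample_after = "nr_periods 1600\nnr_throttled 200\nthrottled_time 923456789\n"
ratio = throttled_ratio(parse_cpu_stat(sample_before), parse_cpu_stat(sample_after))
print(f"throttled in {ratio:.0%} of periods")
```

A ratio that stays high across many consecutive samples corresponds to the dense or long-lasting throttling described above.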

GC pause

All right, operations did not mess up this time.

Let's see how GC looks.

Frequent GC and long GC pauses prevent threads from reading Redis responses in time.

How do we judge the numbers?

Usually we calculate it like this, again based on our own rough experience: total GC time per minute / 60 s / number of GCs per minute. If the result reaches the millisecond level, the impact on Redis read and write latency will be noticeable.
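The formula can be worked through with concrete numbers. The sketch below applies it literally; the input values are hypothetical, and reading the result on a seconds scale (so that 0.001 counts as "millisecond level") is my interpretation of the article's rule of thumb, not something it states explicitly.

```python
def gc_impact(total_gc_seconds_per_minute, gc_count_per_minute):
    """Apply the rule of thumb: total GC time per minute / 60 s / GC count per minute."""
    if gc_count_per_minute <= 0:
        return 0.0
    return total_gc_seconds_per_minute / 60.0 / gc_count_per_minute


# Hypothetical minute: 0.6 s spent in GC, spread over 10 collections.
impact = gc_impact(0.6, 10)
# Assumption: the result is read on a seconds scale, so 0.001 is "ms level".
print(f"{impact * 1000:.2f} ms")
```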

To be safe, also compare the result with historical monitoring data to see whether it is roughly at the usual level.

All right, sorry to bother you. Let's move on.

The network

On the network side, we mainly look at the TCP retransmission rate, which most larger companies already monitor.

TCP retransmission rate = number of TCP retransmission packets per unit time / total number of TCP packets sent

We can regard TCP retransmission rate as a measure of network quality and server stability.

In our experience, the lower the TCP retransmission rate, the better. If it stays above 0.02% (a threshold based on our own environment) or suddenly spikes, we can suspect a network problem.
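Where no ready-made metric exists, the rate can be derived from kernel counters. The sketch below assumes a Linux host, where `/proc/net/snmp` contains a `Tcp:` header line and a `Tcp:` value line with cumulative `OutSegs` and `RetransSegs` counters. The sample text is trimmed to three columns and its numbers are invented; the 0.02% threshold is the one from the article.

```python
def parse_tcp_counters(snmp_text):
    """Extract (OutSegs, RetransSegs) from /proc/net/snmp-style text."""
    tcp_lines = [l.split() for l in snmp_text.splitlines() if l.startswith("Tcp:")]
    if len(tcp_lines) < 2:
        raise ValueError("expected a Tcp header line and a Tcp value line")
    # The first Tcp line holds column names, the second holds the values.
    fields = dict(zip(tcp_lines[0][1:], tcp_lines[1][1:]))
    return int(fields["OutSegs"]), int(fields["RetransSegs"])


def retransmission_rate(before, after):
    """Retransmitted segments / sent segments between two counter samples."""
    sent = after[0] - before[0]
    retrans = after[1] - before[1]
    return retrans / sent if sent > 0 else 0.0


# Hypothetical samples, trimmed to the columns used here.
snap_1 = "Tcp: InSegs OutSegs RetransSegs\nTcp: 900000 1000000 100\n"
snap_2 = "Tcp: InSegs OutSegs RetransSegs\nTcp: 1300000 1500000 300\n"
rate = retransmission_rate(parse_tcp_counters(snap_1), parse_tcp_counters(snap_2))
THRESHOLD = 0.0002  # 0.02%, the article's own environment-specific threshold
print(f"{rate:.4%}", "suspect network" if rate > THRESHOLD else "ok")
```

Because the counters are cumulative since boot, the rate has to be computed from the difference between two samples rather than from a single reading.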
