Source: SLTechnology News&Howtos (Shulou.com), Development. Updated 2025-04-05.
This article explains how to understand the TCP semi-connected queue (SYN queue) and the fully connected queue (accept queue), working through a real troubleshooting case.
Problem description
A Java client and server communicate over sockets. The server uses NIO.
1. Intermittently, the client completes the three-way handshake with the server, but the server's selector never responds to the connection.
2. When the problem occurs, many connections are affected at the same time.
3. The selector is never destroyed and rebuilt; the same one is used throughout.
4. Some failures appear when the program starts, and then intermittently afterwards.
Problem analysis
The normal TCP three-way handshake proceeds as follows:
Step 1: the client sends a SYN to the server to initiate the handshake.
Step 2: on receiving the SYN, the server replies to the client with a SYN+ACK.
Step 3: on receiving the SYN+ACK, the client replies to the server with an ACK to acknowledge it (at this point the connection on the client's port 56911 is already established).
Judging from the problem description, this looks like the fully connected queue (accept queue) being full when TCP establishes a connection, especially symptoms 2 and 4. To confirm this, check the queue overflow statistics with netstat -s:
667399 times the listen queue of a socket overflowed
Checking several times shows that the overflow count keeps increasing, so the fully connected queue on the server has clearly been overflowing.
Next, look at how the OS handles the overflow:
# cat /proc/sys/net/ipv4/tcp_abort_on_overflow
0
A tcp_abort_on_overflow value of 0 means that if the fully connected queue is full at the third step of the three-way handshake, the server discards the ACK sent by the client (from the server's point of view, the connection has not been established yet).
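The check above can also be scripted. The following is a minimal sketch, assuming a Linux host where sysctls are exposed under /proc/sys; the helper names read_sysctl and describe_abort_on_overflow are hypothetical and exist only for this illustration.

```python
from pathlib import Path


def read_sysctl(name: str, root: str = "/proc/sys") -> int:
    """Read an integer sysctl such as net.ipv4.tcp_abort_on_overflow.

    Assumption: dots in the sysctl name map to directories under `root`.
    """
    path = Path(root).joinpath(*name.split("."))
    return int(path.read_text().strip())


def describe_abort_on_overflow(value: int) -> str:
    """Summarise the two behaviours described in the article."""
    if value == 0:
        return "accept queue full: discard the client's final ACK"
    return "accept queue full: send a reset to the client"
```

On the server from the article, read_sysctl("net.ipv4.tcp_abort_on_overflow") would be expected to return 0, matching the cat output shown earlier.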
To prove that the exception in the client application is related to the fully connected queue being full, I first changed tcp_abort_on_overflow to 1. A value of 1 means that if the fully connected queue is full at the third step, the server sends a reset packet to the client, cancelling the handshake and the connection (the connection was never established on the server side).
Testing again, the client now reported many "connection reset by peer" errors, which proves that the client errors are caused by the full queue.
The developer then looked through the Java source code and found that the default socket backlog (this value controls the size of the fully connected queue and is described in more detail later) is 50, so the value was increased and the test was run again. After more than 12 hours of stress testing the error did not occur once, and the overflow counter stopped increasing.
That solved the problem. Put simply, there is an accept queue after the TCP three-way handshake, and a connection must enter this queue before it can move from listen to accept. The default backlog of 50 fills up easily. When the queue is full at the third step of the handshake, the server ignores the ACK sent by the client (and every so often resends the SYN+ACK of the second step to the client). If the connection never gets into the queue, the client ends up with an exception.
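As a rough illustration of the fix, the sketch below opens a listening socket with an explicit backlog instead of relying on a default. It is written in Python rather than Java; in Java the corresponding knob is the backlog argument of the ServerSocket(int port, int backlog) constructor or of ServerSocketChannel.bind(SocketAddress, int). The function name open_listener and the value 1024 are assumptions chosen for this example, and the kernel may still cap the effective queue size.

```python
import socket


def open_listener(port: int = 0, backlog: int = 1024,
                  host: str = "127.0.0.1") -> socket.socket:
    """Create a TCP listening socket with an explicit backlog.

    port=0 asks the OS for any free port; backlog is the requested
    size of the fully connected (accept) queue.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(backlog)  # the backlog is handed to the kernel here
    return srv
```

A caller would then loop on srv.accept(); the faster the application drains the queue, the less likely it is to fill up under load.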
A deeper look at connection establishment and the queues in the TCP handshake
As shown in the figure above, there are two queues: the syns queue (semi-connected queue) and the accept queue (fully connected queue).
In the first step of the three-way handshake, the server receives the client's SYN, puts the connection information into the semi-connected queue, and replies to the client with a SYN+ACK (the second step).
A SYN flood attack, for example, targets the semi-connected queue: the attacker keeps opening connections but only performs the first step. After receiving the server's SYN+ACK in the second step, the attacker deliberately discards it and does nothing, so the queue on the server fills up and other, normal requests cannot get in.
In the third step, the server receives the client's ACK. If the fully connected queue is not full, the connection information is moved from the semi-connected queue into the fully connected queue; otherwise the server acts as tcp_abort_on_overflow dictates.
If the fully connected queue is full and tcp_abort_on_overflow is 0, the server resends the SYN+ACK to the client after a while (that is, it repeats the second step of the handshake). If the client's timeout is relatively short, the client easily runs into an exception.
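The interaction between the two queues can be sketched as a toy model. This is a deliberate simplification for illustration, not kernel code: the class name ListenSocketModel and its methods are hypothetical, and the model ignores timers, retransmissions, and any limit on the semi-connected queue.

```python
from collections import deque


class ListenSocketModel:
    """Toy model of a listening socket's two queues."""

    def __init__(self, accept_limit: int, abort_on_overflow: int = 0):
        self.accept_limit = accept_limit
        self.abort_on_overflow = abort_on_overflow
        self.syn_queue = set()       # semi-connected: SYN seen, no ACK yet
        self.accept_queue = deque()  # fully connected: waiting for accept()

    def on_syn(self, client: str) -> str:
        # Step 1 arrives; the server records it and answers with step 2.
        self.syn_queue.add(client)
        return "SYN+ACK"

    def on_ack(self, client: str) -> str:
        # Step 3 arrives.
        if client not in self.syn_queue:
            return "IGNORED"
        if len(self.accept_queue) >= self.accept_limit:
            if self.abort_on_overflow:
                self.syn_queue.discard(client)
                return "RST"
            # ACK is discarded; the entry stays semi-connected.
            return "DROPPED"
        self.syn_queue.discard(client)
        self.accept_queue.append(client)
        return "ESTABLISHED"

    def accept(self):
        # The application takes one connection off the accept queue.
        return self.accept_queue.popleft() if self.accept_queue else None
```

With accept_limit=1 and abort_on_overflow=0, a second client's ACK is dropped until the application calls accept(), which mirrors the symptom in this article: the client believes the connection is established while the server does not.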
On our OS, the second step is retried 2 times by default (the CentOS default is 5):
net.ipv4.tcp_synack_retries = 2
What metrics reveal a TCP connection queue overflow?
The troubleshooting process above was somewhat roundabout. The next time a similar problem appears, is there a faster, clearer way to confirm it?
netstat -s
[root@server ~]# netstat -s | egrep "listen|LISTEN"
667399 times the listen queue of a socket overflowed
667399 SYNs to LISTEN sockets ignored
For example, the 667399 times shown above is the number of fully connected queue overflows. Run the command every few seconds; if the number keeps increasing, the fully connected queue must be filling up from time to time.
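Comparing two samples by eye is error-prone, so the comparison can be scripted. The sketch below only parses text in the format shown in this article; the function names parse_listen_counters and overflow_growth are hypothetical, and the two sample strings are taken from netstat output quoted in this article.

```python
import re

_OVERFLOW = re.compile(r"(\d+) times the listen queue of a socket overflowed")
_DROPPED = re.compile(r"(\d+) SYNs to LISTEN sockets (?:ignored|dropped)")


def parse_listen_counters(netstat_output: str) -> dict:
    """Extract the two listen-queue counters from `netstat -s` text."""
    overflow = _OVERFLOW.search(netstat_output)
    dropped = _DROPPED.search(netstat_output)
    return {
        "overflowed": int(overflow.group(1)) if overflow else 0,
        "dropped": int(dropped.group(1)) if dropped else 0,
    }


def overflow_growth(before: str, after: str) -> int:
    """How much the overflow counter grew between two samples."""
    return (parse_listen_counters(after)["overflowed"]
            - parse_listen_counters(before)["overflowed"])


before = ("1641685 times the listen queue of a socket overflowed\n"
          "1641685 SYNs to LISTEN sockets ignored")
after = ("1641906 times the listen queue of a socket overflowed\n"
         "1641906 SYNs to LISTEN sockets ignored")
print(overflow_growth(before, after))  # 221
```

Any positive growth between samples means the fully connected queue overflowed during that interval.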
ss command
[root@server ~]# ss -lnt
Recv-Q Send-Q Local Address:Port Peer Address:Port
0      50     *:3306             *:*
The Send-Q value in the second column shows that the maximum size of the fully connected queue on the listening port in the third column is 50; the Recv-Q value in the first column shows how much of the fully connected queue is currently in use.
The size of the fully connected queue is min(backlog, somaxconn). backlog is passed in when the socket is created, and somaxconn is an OS-level system parameter.
The size of the semi-connected queue is max(64, /proc/sys/net/ipv4/tcp_max_syn_backlog). This differs somewhat between OS versions.
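The two sizing rules are simple enough to write down directly. This is a minimal sketch of the formulas as stated in this article; the function names are hypothetical, the sample values passed in are illustrative only, and the exact rule can vary between kernel versions.

```python
def accept_queue_limit(backlog: int, somaxconn: int) -> int:
    """Fully connected queue size: min(backlog, somaxconn)."""
    return min(backlog, somaxconn)


def syn_queue_limit(tcp_max_syn_backlog: int) -> int:
    """Semi-connected queue size: max(64, tcp_max_syn_backlog)."""
    return max(64, tcp_max_syn_backlog)


# Illustrative values only: a backlog of 50 with a hypothetical
# somaxconn of 128 leaves the application's backlog as the bottleneck.
print(accept_queue_limit(50, 128))    # 50
print(accept_queue_limit(1024, 128))  # 128
```

The second call shows why raising the application's backlog alone may not be enough: a smaller somaxconn still caps the queue.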
Verifying the understanding in practice
Change the backlog in the Java code to 10 (the smaller it is, the easier it overflows) and keep running the stress test. The client starts reporting exceptions again; then observe the server with the ss command:
Fri May 5 13:50:23 CST 2017
Recv-Q Send-Q Local Address:Port Peer Address:Port
11     10     *:3306             *:*
Based on the understanding above, the maximum size of the fully connected queue for the service on port 3306 is 10, yet 11 connections are now in the queue or waiting to enter it. At least one connection cannot get into the queue and will overflow.
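The same observation can be automated by scanning ss -lnt output for listeners whose Recv-Q exceeds Send-Q. The sketch below assumes the column layout shown in this article and also tolerates a leading State column containing LISTEN; the function name find_overflowing_listeners is hypothetical, and the 8080 row in the sample is made up for contrast.

```python
def find_overflowing_listeners(ss_output: str) -> list:
    """Return (local_address, recv_q, send_q) where Recv-Q > Send-Q."""
    results = []
    for line in ss_output.splitlines():
        fields = line.split()
        # Some ss versions print a State column first; skip it if present.
        if fields and fields[0] == "LISTEN":
            fields = fields[1:]
        # Data rows start with two integers: Recv-Q and Send-Q.
        if len(fields) < 3:
            continue
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        recv_q, send_q = int(fields[0]), int(fields[1])
        if recv_q > send_q:
            results.append((fields[2], recv_q, send_q))
    return results


sample = """Recv-Q Send-Q Local Address:Port Peer Address:Port
11 10 *:3306 *:*
0 50 *:8080 *:*"""
print(find_overflowing_listeners(sample))  # [('*:3306', 11, 10)]
```

Only the 3306 listener is reported, because its queue usage (11) exceeds its maximum (10).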
Further thinking
If the client completes the third step, the connection looks established from the client's side, but the corresponding connection on the server is not actually ready. What does the server do if the client sends it data at this point? (Some say the server will reset the connection; let's take a look.)
Let's take a look at an example:
As shown in the figure above, packet 150166 is the third step of the three-way handshake, in which the client sends an ACK to the server. In packet 150167 the client then sends an 816-byte packet to the server, because the client believes the connection has been established. The connection is not ready on the server, however, so the server does not reply. After a while the client assumes the packet was lost and retransmits the 816-byte packet, repeating until it times out, at which point the client sends a FIN to close the connection.
This issue, also known as client fooling, is described here: https://github.com/torvalds/linux/commit/5ea8ea2cb7f1d0db15762c9b0bb9e7330425a071 (thanks to @Liu Huan for the hint).
As the packet capture above shows, the server does not send a reset. It ignores these packets and the client retransmits them; after a certain number of retransmissions the client decides the connection is abnormal and disconnects.
A strange problem found along the way
[root@server ~]# date; netstat -s | egrep "listen|LISTEN"
Fri May 5 15:39:58 CST 2017
1641685 times the listen queue of a socket overflowed
1641685 SYNs to LISTEN sockets ignored
[root@server ~]# date; netstat -s | egrep "listen|LISTEN"
Fri May 5 15:39:59 CST 2017
1641906 times the listen queue of a socket overflowed
1641906 SYNs to LISTEN sockets ignored
As shown above, the overflowed and ignored counts are always equal and increase in step. overflowed is the number of fully connected queue overflows, and "SYNs to LISTEN sockets ignored" is the number of semi-connected queue overflows.
Look at the kernel source code:
The source shows that whenever overflow is incremented, drop is incremented as well (drop++, the "SYNs to LISTEN sockets ignored" counter), so drop must always be greater than or equal to overflow.
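The counting relationship can be paraphrased as a toy model. This is not the kernel source, which is not reproduced in this text; the class ListenStats and its method are hypothetical names that only mirror the behaviour described: every overflow also counts as a drop, while a drop for any other reason leaves the overflow counter untouched.

```python
class ListenStats:
    """Toy paraphrase of the two listen-queue counters."""

    def __init__(self):
        self.listen_overflows = 0  # "listen queue of a socket overflowed"
        self.listen_drops = 0      # "SYNs to LISTEN sockets ignored/dropped"

    def record_drop(self, accept_queue_full: bool) -> None:
        # An overflow is counted first, then falls through to the drop path.
        if accept_queue_full:
            self.listen_overflows += 1
        # Every dropped request increments the drop counter.
        self.listen_drops += 1


stats = ListenStats()
stats.record_drop(accept_queue_full=True)
stats.record_drop(accept_queue_full=False)
print(stats.listen_overflows, stats.listen_drops)  # 1 2
```

Under this model the drop counter can never fall behind the overflow counter, and the two stay equal only when every drop is caused by a full accept queue.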
I also checked these two values on several other servers, which confirms that drop is always greater than or equal to overflow:
server1
150 SYNs to LISTEN sockets dropped
server2
193 SYNs to LISTEN sockets dropped
server3
16329 times the listen queue of a socket overflowed
16422 SYNs to LISTEN sockets dropped
server4
20 times the listen queue of a socket overflowed
51 SYNs to LISTEN sockets dropped
server5
984932 times the listen queue of a socket overflowed
988003 SYNs to LISTEN sockets dropped
That concludes this look at the TCP semi-connected queue and fully connected queue. The specific behavior should still be verified in practice.