Understanding TIME_WAIT and thoroughly solving "TCP: time wait bucket table overflow"

2025-01-19 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

I have known about this problem for a long time without understanding its cause. I ran into it again recently, read a lot of material, and finally resolved it properly. Let's start with a diagram, since everything below revolves around it. The diagram shows the whole four-way close (the FIN/ACK exchange that tears down a TCP connection):

A few concepts, explained with reference to this diagram:

When TIME_WAIT arises: the side that actively closes the connection enters TIME_WAIT after it sends the final ACK of the four-way close, and it stays in that state for two MSL (on Linux one MSL is effectively 30 s, so the state lasts 60 s; this is not configurable).

Why TIME_WAIT lasts two MSL: to close the TCP connection reliably and safely. For example, if the network is congested and the passive closer never receives the final ACK from the active closer, the passive closer retransmits its FIN, possibly several times. The connection still held in TIME_WAIT absorbs these trailing packets, so they do not affect new connections or other services.
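
The close sequence described above can be sketched as a small state table. This is only a toy model for illustration: the state names follow the standard TCP state diagram, while the event labels and the names `TRANSITIONS`, `walk`, `ACTIVE_CLOSER` and `PASSIVE_CLOSER` are made up for this example and do not come from any library.

```python
# Toy model of the TCP close sequence. It is not a real TCP stack;
# it only shows which side passes through TIME_WAIT.
# Each entry maps (current state, event) to the next state.
TRANSITIONS = {
    # Active closer: the side that sends the first FIN.
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timer expires"): "CLOSED",
    # Passive closer: the side that receives the first FIN.
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}

ACTIVE_CLOSER = ["send FIN", "recv ACK", "recv FIN, send ACK", "2MSL timer expires"]
PASSIVE_CLOSER = ["recv FIN, send ACK", "send FIN", "recv ACK"]


def walk(events, state="ESTABLISHED"):
    """Return the list of states visited while applying the events in order."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


if __name__ == "__main__":
    print("active :", " -> ".join(walk(ACTIVE_CLOSER)))
    print("passive:", " -> ".join(walk(PASSIVE_CLOSER)))
```

Only the active closer's path contains TIME_WAIT, which is why a machine that closes many connections first is the one that accumulates TIME_WAIT sockets.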

Resources consumed by TIME_WAIT: a small amount of memory (roughly 4 KB, as a reference figure) and one fd.

The harm of skipping TIME_WAIT:

1. Suppose the network is in bad shape and the active closer does not wait in TIME_WAIT. After the previous connection is closed, the active and passive sides establish a new TCP connection. A FIN that the passive side retransmitted, or that was delayed in the network, can then arrive and directly disrupt the new connection.

2. Suppose again that the network is in bad shape and there is no TIME_WAIT, but no new connection is opened after the close. When the active closer receives a FIN retransmitted or delayed by the passive side, it replies with an RST, which may affect other services on the passive side.

Cause and impact of "TCP: time wait bucket table overflow": the number of TIME_WAIT (tw) sockets has exceeded the limit configured on the Linux system. Once the limit is exceeded, the kernel destroys the excess TIME_WAIT sockets and logs this warning. With heavy traffic in a NAT environment, this leads to all kinds of intermittent disconnections.
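
To see how close a machine is to that limit, one option is to count the TIME_WAIT entries in `/proc/net/tcp`, where the `st` column holds the socket state in hex and `06` means TIME_WAIT. The following is a minimal sketch that assumes that file format; the function names and the `SAMPLE` text are invented for this example, and the file only covers IPv4 sockets (IPv6 sockets are listed in `/proc/net/tcp6`).

```python
from collections import Counter

# Hex state codes used in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

# Made-up sample in the /proc/net/tcp layout, so the sketch runs anywhere.
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345
   1: 0100007F:1F90 0100007F:C350 06 00000000:00000000 03:00000BB8 00000000     0        0 0
   2: 0100007F:1F90 0100007F:C351 06 00000000:00000000 03:00000BB8 00000000     0        0 0
"""


def count_states(proc_text):
    """Count sockets per TCP state from text in /proc/net/tcp format."""
    counts = Counter()
    for line in proc_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) > 3:
            counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts


def time_wait_usage(proc_text, max_tw_buckets):
    """Return (TIME_WAIT count, fraction of the configured limit in use)."""
    tw = count_states(proc_text)["TIME_WAIT"]
    return tw, tw / max_tw_buckets


if __name__ == "__main__":
    tw, fraction = time_wait_usage(SAMPLE, max_tw_buckets=256000)
    print(f"TIME_WAIT sockets: {tw} ({fraction:.4%} of the limit)")
```

On a real server the text would come from reading `/proc/net/tcp`, and the limit from `net.ipv4.tcp_max_tw_buckets`.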

Tuning the relevant parameters (configure them according to your server's actual situation, of course; the focus here is on what each parameter means):

Now that the purpose of TIME_WAIT is clear, try to tune it in ways that stay within the TCP protocol. The tw reuse and recycle options both violate the TCP specification, so avoid enabling them as long as server resources allow and the load is not heavy. When "TCP: time wait bucket table overflow" appears, first try raising the following parameter:

net.ipv4.tcp_max_tw_buckets = 256000
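
Before raising the limit, it is worth a rough check of what it could cost in memory. The arithmetic below simply reuses the article's reference figure of about 4 KB per TIME_WAIT socket. That figure is an assumption, not a measured value, and the real per-socket cost depends on the kernel version; the function name is made up for this example.

```python
def worst_case_tw_memory_mib(max_tw_buckets, bytes_per_socket=4096):
    """Memory in MiB if every TIME_WAIT bucket were in use.

    bytes_per_socket defaults to the article's rough 4 KB reference figure.
    """
    return max_tw_buckets * bytes_per_socket / (1024 * 1024)


if __name__ == "__main__":
    # 256000 buckets * 4096 bytes = 1,048,576,000 bytes = 1000 MiB
    print(worst_case_tw_memory_mib(256000))
```

Under that assumption a completely full table of 256000 entries would take about 1000 MiB, so the value should be sized against the memory the server actually has.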

Next, adjust the timeout for the transition from FIN_WAIT_2 to TIME_WAIT. The default is 60 s; here it is tuned down to 30 s:

net.ipv4.tcp_fin_timeout = 30

Other related TCP parameters, such as the number of SYN-ACK retransmissions (net.ipv4.tcp_synack_retries) and SYN retransmissions (net.ipv4.tcp_syn_retries), are also worth tuning.
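
Before changing any of these values, it helps to record what the system currently uses. On Linux each sysctl is exposed as a file under `/proc/sys`, with the dots in its name mapped to directory separators. The sketch below relies on that mapping; `sysctl_path` and `read_sysctl` are helper names invented here, and `read_sysctl` returns `None` when the file cannot be read (for example on a non-Linux machine).

```python
def sysctl_path(name, root="/proc/sys"):
    """Map a sysctl name such as net.ipv4.tcp_fin_timeout to its file path."""
    return root.rstrip("/") + "/" + name.replace(".", "/")


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        with open(sysctl_path(name, root)) as f:
            return f.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for name in (
        "net.ipv4.tcp_max_tw_buckets",
        "net.ipv4.tcp_fin_timeout",
        "net.ipv4.tcp_synack_retries",
        "net.ipv4.tcp_syn_retries",
    ):
        print(f"{name} = {read_sysctl(name)}")
```

Keeping the printed values alongside the new settings makes it easy to roll a change back if it does not help.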

Now for the reuse and recycle options, which are Linux-specific optimizations for TIME_WAIT. Both are off by default, and both take effect only when timestamps are enabled:

net.ipv4.tcp_timestamps = 1

net.ipv4.tcp_tw_reuse = 1

This one applies when the machine acts as a client. With it enabled, TIME_WAIT sockets are reclaimed within one second.

net.ipv4.tcp_tw_recycle = 0 (do not turn it on: NAT is so common on the Internet today that clients may fail to complete the three-way handshake)

With tcp_tw_recycle enabled, TIME_WAIT sockets are reclaimed after 3.5 × RTO (the RTO is derived from the measured RTT), and within a 60 s window the timestamps carried by connection requests from the same source IP must keep increasing. From the server's point of view, a single source IP may hide many machines behind a NAT, and the timestamps of those machines are not guaranteed to increase relative to one another. The server rejects connection requests whose timestamp does not increase, so the three-way handshake fails outright. Note that tcp_tw_recycle was removed from the kernel in Linux 4.12, so the setting no longer exists on newer systems.
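
The NAT failure mode can be illustrated with a toy model of the per-source-IP timestamp check. This is a deliberate simplification of the behaviour described above, not the kernel's actual code: the class name, the method name and the rule "reject if the timestamp goes backwards within the window" are assumptions made for this example, and the IP address is a documentation address.

```python
class PerHostTimestampCheck:
    """Toy model: remember the last TCP timestamp seen per source IP."""

    def __init__(self, window=60.0):
        self.window = window   # seconds during which the last timestamp is remembered
        self.last_seen = {}    # source IP -> (arrival time, timestamp value)

    def accept_syn(self, src_ip, ts_val, now):
        """Return True if the SYN is accepted, False if it is dropped."""
        prev = self.last_seen.get(src_ip)
        if prev is not None:
            prev_time, prev_ts = prev
            if now - prev_time < self.window and ts_val < prev_ts:
                return False   # timestamp went backwards: treated as an old duplicate
        self.last_seen[src_ip] = (now, ts_val)
        return True


if __name__ == "__main__":
    server = PerHostTimestampCheck()
    nat_ip = "203.0.113.7"     # two different hosts share this public address
    print(server.accept_syn(nat_ip, ts_val=500000, now=0.0))   # host A
    print(server.accept_syn(nat_ip, ts_val=1200, now=1.0))     # host B, lower clock
    print(server.accept_syn(nat_ip, ts_val=500100, now=2.0))   # host A again
    print(server.accept_syn(nat_ip, ts_val=1300, now=70.0))    # host B after the window
```

In this model host B is refused only because its timestamp clock happens to be lower than host A's, even though both are legitimate clients. That is the handshake failure seen behind NAT.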

The author's own original site is Operation and maintenance Web Service (www.net-add.com). New blog posts will be published there; you are welcome to visit.
