Troubleshooting a container that cannot access an application service

Updated 2025-04-09. From: SLTechnology News&Howtos (shulou)

The starting point of the problem

1. A user reports that several containers were started on the same ECS instance and that one of them cannot communicate with the AnyTunnel address 100.64.0.1: the TCP connection fails to establish. The failing container's IP is 172.16.0.13. Other containers on the host, such as 172.16.0.15, work normally.

2. A packet capture on the backend RS (real server) of AnyTunnel 100.64.0.1 does contain successful TCP connections from 172.16.0.13. This indicates that clients in other VPCs also reach the service from the address 172.16.0.13, and that their access works.

PS: An AnyTunnel address is an address within 100.64.0.0/10 in each VPC. It is used by cloud services such as DNS, YUM, NTP, OSS, and SLS inside the VPC. It can be understood simply as a special SLB that provides intranet interworking between different VPCs: it can be accessed from every VPC in the same region.
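
Whether an address belongs to the AnyTunnel range described in the note above can be checked with a subnet membership test. The following is a minimal sketch using Python's standard ipaddress module; the names ANYTUNNEL_NET and is_anytunnel are hypothetical helpers introduced for illustration and are not part of any cloud SDK.

```python
import ipaddress

# Range named in the note above; the constant name is illustrative only.
ANYTUNNEL_NET = ipaddress.ip_network("100.64.0.0/10")


def is_anytunnel(addr: str) -> bool:
    """Return True if addr lies inside 100.64.0.0/10."""
    return ipaddress.ip_address(addr) in ANYTUNNEL_NET


print(is_anytunnel("100.64.0.1"))   # the AnyTunnel address from this case
print(is_anytunnel("172.16.0.13"))  # the container address
```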

Investigation process

1. First, determine whether the requests from 172.16.0.13 in VPC1 actually reach the AnyTunnel backend RS. The easiest way is to capture packets on both 172.16.0.13 and the backend RS and compare them.

Filtering the client and RS captures with the "tcp.analysis.retransmission" condition isolates the problematic retransmitted packets, all of which are SYN retransmissions.

The retransmitted messages number 7 and share the same source port, which means the request did arrive at the RS, but the RS did not answer with a SYN-ACK, so the TCP connection could not be established.
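
The idea of "same SYN, seen again" can be sketched in a few lines. This is only an illustration of the counting logic, assuming the capture has already been parsed into simple records. The Syn record type, the syn_retransmissions helper, and the sample ports and sequence numbers are all made up for the example and are not taken from the real capture.

```python
from collections import Counter
from typing import NamedTuple


class Syn(NamedTuple):
    """Minimal, hypothetical record of one captured SYN segment."""
    src: str
    sport: int
    dst: str
    dport: int
    seq: int


def syn_retransmissions(syns):
    """Map each SYN that appears more than once to its number of repeats."""
    seen = Counter(syns)
    return {syn: count - 1 for syn, count in seen.items() if count > 1}


# Made-up sample: one SYN sent three times, another sent once.
capture = [
    Syn("172.16.0.13", 40000, "100.64.0.1", 80, 1000),
    Syn("172.16.0.13", 40000, "100.64.0.1", 80, 1000),
    Syn("172.16.0.15", 40001, "100.64.0.1", 80, 2000),
    Syn("172.16.0.13", 40000, "100.64.0.1", 80, 1000),
]

for syn, repeats in syn_retransmissions(capture).items():
    print(f"{syn.src}:{syn.sport} -> {syn.dst}:{syn.dport} retransmitted {repeats} time(s)")
```

Seeing the same repeats in the server-side capture is what shows that the SYN reached the RS and was left unanswered.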

2. Check the TCP counters on the RS:

netstat -ts | grep SYNs
3631812 SYNs to LISTEN sockets dropped

A large number of TCP SYN packets have been dropped and the count keeps growing, which further indicates that the RS received the client's SYN requests but did not respond.

Why the RS did not respond to the SYN requests

The netstat statistics include several counters for failed TCP connections. They are defined as follows:

resets received for embryonic SYN_RECV sockets: in the SYN_RECV state a non-retransmitted SYN packet was received, and a reset was returned.

passive connections rejected because of time stamp: sysctl_tw_recycle is enabled, and the timestamp of the SYN packet is smaller than the timestamp saved in the route for that peer.

failed connection attempts: in the SYN_RECV state the socket was closed, or a non-retransmitted SYN packet was received.

times the listen queue of a socket overflowed: the final ACK of the three-way handshake was received, but the accept queue was full.

SYNs to LISTEN sockets ignored: the final ACK of the three-way handshake was received, but creating the socket failed for some reason (including a full accept queue).

The netstat -ts output shows that the "passive connections rejected because of time stamp" counter is very large and close to the number of "SYNs to LISTEN sockets dropped":

3631476 passive connections rejected because of time stamp

In other words, the SYN packets are being rejected because of the timestamps they carry.
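
The comparison of the two counters can be scripted. The following is a minimal sketch that parses text in the netstat format quoted above. The parse_counter helper is hypothetical, and the sample text consists only of the two lines quoted in this article; on a real host the input would come from the netstat -ts output.

```python
import re


def parse_counter(netstat_output: str, label: str) -> int:
    """Return the leading integer of the first line containing label, or 0 if absent."""
    for line in netstat_output.splitlines():
        if label in line:
            match = re.match(r"\s*(\d+)", line)
            if match:
                return int(match.group(1))
    return 0


# The two counter lines quoted in this article.
sample = """\
    3631812 SYNs to LISTEN sockets dropped
    3631476 passive connections rejected because of time stamp
"""

dropped = parse_counter(sample, "SYNs to LISTEN sockets dropped")
rejected = parse_counter(sample, "passive connections rejected because of time stamp")
share = rejected / dropped
print(f"{rejected} of {dropped} dropped SYNs ({share:.2%}) were timestamp rejections")
```

A share close to 100% points at timestamp rejection rather than a full accept queue as the dominant cause of the drops.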

So examine the RS packet capture again:

Message   SYN answered   TimeStamp value
1         Yes            2088983548
2         No             271344470
3         No             271345544
4         No             271346612
5         No             271348509
6         No             271351766
7         Yes            2088993553

The capture shows that when the TimeStamp value of a SYN is smaller than that of the previous SYN that was answered successfully, the system does not respond to the SYN.

The problem is therefore clearly related to TCP timestamps. When tcp_tw_recycle and tcp_timestamps are both enabled, the timestamps in connection requests arriving from the same source IP must keep increasing within a 60-second window; a request carrying a smaller timestamp is dropped.

Source function: tcp_v4_conn_request(), the server-side handler for the SYN packet of the TCP three-way handshake.

Source snippet:

    if (tmp_opt.saw_tstamp &&
        tcp_death_row.sysctl_tw_recycle &&
        (dst = inet_csk_route_req(sk, req)) != NULL &&
        (peer = rt_get_peer((struct rtable *)dst)) != NULL &&
        peer->v4daddr == saddr) {
        if (get_seconds() < peer->tcp_ts_stamp + TCP_PAWS_MSL &&
            (s32)(peer->tcp_ts - req->ts_recent) > TCP_PAWS_WINDOW) {
            NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_PAWSPASSIVEREJECTED);
            goto drop_and_release;
        }
    }

tmp_opt.saw_tstamp: the socket supports tcp_timestamps.

sysctl_tw_recycle: the system has the tcp_tw_recycle option enabled.

TCP_PAWS_MSL: 60 s. This condition means that the last TCP communication from the source IP took place within the past 60 s.

TCP_PAWS_WINDOW: 1. This condition means that the timestamp of the last TCP communication from the source IP is greater than the timestamp of the current one.
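
To see how this check produces the pattern in the capture, the condition can be replayed in a small simulation. The sketch below is not the kernel's implementation: the PeerCache class is a hypothetical stand-in for the per-peer record, it simply stores the timestamp of each accepted SYN (the kernel updates its peer entry at a different point), and the arrival times of one message per second are an assumption, since the article does not list them. The TimeStamp values are the seven from the capture.

```python
TCP_PAWS_MSL = 60     # seconds, value given in the text
TCP_PAWS_WINDOW = 1   # value given in the text


def s32(value: int) -> int:
    """Interpret the low 32 bits of value as signed, like the kernel's (s32) cast."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


class PeerCache:
    """Hypothetical per-source-IP record of the last timestamp seen."""

    def __init__(self):
        self.peers = {}  # saddr -> (tcp_ts, tcp_ts_stamp)

    def accept_syn(self, saddr: str, tsval: int, now: float) -> bool:
        """Return False if the SYN would be dropped by the tw_recycle timestamp check."""
        entry = self.peers.get(saddr)
        if entry is not None:
            tcp_ts, tcp_ts_stamp = entry
            recent = now < tcp_ts_stamp + TCP_PAWS_MSL
            older = s32(tcp_ts - tsval) > TCP_PAWS_WINDOW
            if recent and older:
                return False
        # Simplification: remember the timestamp of every accepted SYN.
        self.peers[saddr] = (tsval, now)
        return True


# TimeStamp values from the capture; arrival times (one per second) are assumed.
tsvals = [2088983548, 271344470, 271345544, 271346612,
          271348509, 271351766, 2088993553]

cache = PeerCache()
results = [cache.accept_syn("172.16.0.13", ts, now=float(i))
           for i, ts in enumerate(tsvals)]
for number, (ts, ok) in enumerate(zip(tsvals, results), start=1):
    print(number, ts, "answered" if ok else "dropped")
```

Because the server sees a single source IP, the two timestamp series (around 2088 million and around 271 million) look like one host whose clock jumps backwards, so the lower series is rejected while the higher one keeps being accepted.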

tcp_timestamps is an optimization option defined in RFC 1323, used mainly to calculate the RTT (Round Trip Time) of TCP connections. Enabling tcp_timestamps lets the system compute RTT more accurately, which also helps TCP performance.

The official description of tcp_tw_recycle reads:

tcp_tw_recycle (Boolean; default: disabled; since Linux 2.4) Enable fast recycling of TIME_WAIT sockets. Enabling this option is not recommended since this causes problems when working with NAT (Network Address Translation).

That is, tcp_tw_recycle enables fast recycling of TIME_WAIT sockets, and enabling it in a NAT environment is not recommended because it causes problems.

The official definition mentions NAT, but the root cause is more general: multiple clients reach the server through the same source address. Our case fits that scenario.

Solution

There are two options:

1. On the client, disable tcp_timestamps by setting it to 0.

2. On the server, do not set tcp_tw_recycle and tcp_timestamps to 1 at the same time.

The clients are in the customers' hands and we cannot control each client's configuration, so the fix has to be made on the server.

With tcp_timestamps off, tcp_tw_recycle has no effect, whereas tcp_timestamps can be enabled and works on its own. It is therefore recommended to disable tcp_tw_recycle on the server.

To disable tcp_tw_recycle on the server (RS):

1. Temporarily: echo "0" > /proc/sys/net/ipv4/tcp_tw_recycle

2. Permanently: add net.ipv4.tcp_tw_recycle = 0 to the /etc/sysctl.conf file, then run sysctl -p to apply the configuration file.
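
Before and after the change, it can help to confirm what the server is actually running with. The following sketch reads the two settings from /proc and reports whether the risky combination is active. The read_sysctl and tw_recycle_risk helpers and the base parameter are hypothetical names introduced here. Note that tcp_tw_recycle was removed in Linux 4.12, so on newer kernels the file is absent, which the sketch treats as "not enabled".

```python
from pathlib import Path

DEFAULT_BASE = Path("/proc/sys/net/ipv4")


def read_sysctl(name: str, base: Path = DEFAULT_BASE):
    """Return the integer value of a net.ipv4 setting, or None if the file is absent."""
    path = base / name
    if not path.exists():
        return None
    return int(path.read_text().split()[0])


def tw_recycle_risk(base: Path = DEFAULT_BASE) -> bool:
    """True when tcp_tw_recycle and tcp_timestamps are both non-zero."""
    recycle = read_sysctl("tcp_tw_recycle", base)
    timestamps = read_sysctl("tcp_timestamps", base)
    return bool(recycle) and bool(timestamps)


print("tcp_tw_recycle and tcp_timestamps both enabled:", tw_recycle_risk())
```

The base parameter exists only so the check can be pointed at a test directory instead of the live /proc tree.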

Extension

By the same principle, whenever multiple clients access the same server at the same time through the same IP, tcp_tw_recycle should be turned off on the server. Typical scenarios:

1. The same client accesses a layer-4 private-network SLB and also accesses the SLB's backend ECS directly, bypassing the SLB.

2. The same ECS is attached behind several layer-4 SLBs, and the same client accesses those SLBs simultaneously or in succession.

3. The ECS serves traffic directly over public addresses, NAT gateways, or EIP. Because IPv4 addresses are running out, the vast majority of clients reach it through SNAT.

Appendix

Reference:

http://blog.sina.com.cn/s/blog_781b0c850101pu2q.html
