
Does using the TCP protocol mean you will never lose packets?


On the face of it, I'm a technology blogger.

But I didn't expect to become an emotional blogger today.

I never expected that one day I would use technical knowledge to save a reader's relationship on the brink of breaking up.

To be honest, this counts as a good deed of immeasurable merit.

Here's the thing.

Recently, a reader added me on the little green chat app. A girl, with a very good-looking profile picture. Just when I was assuming she wanted me to pull her into a group to post some shady adult ads...

The style of the conversation suddenly went sideways.

She said her boyfriend is also a programmer, they are in a long-distance relationship, and he follows me too, studying TCP, UDP and networking every day. The kind who studies all night and doesn't reply to her messages all night.

There was something between the lines. I understood.

Sure enough, out came the soul-searching question.

"Are you programmers really that busy? So busy you can't even manage a reply?"

I didn't expect it to be a straight punch.

But I caught this punch.

I really wanted to tell her, "Break up. Next question."

But I couldn't. Because the one getting hurt would be my reader, my brother.

There was a moment of silence.

My single-core CPU nearly started smoking as I trembled and tapped out a message on the nine-grid keyboard.

Replying any slower would be letting down my full-time bachelor's degree.

"Actually, he already replied to your message. But you know what? Networks lose packets."

"Let me explain for him. This has to start with how a data packet gets sent."

First of all, the little green chat app clients on our two phones talk to each other through the app's own server. It looks like this.

▲ Three-party communication in the chat app

But to simplify the model, let's omit the server in the middle and assume it's end-to-end communication. And to guarantee message reliability, let's take a blind guess that the two ends communicate over the TCP protocol.

▲ End-to-end communication in the chat app

In order to send a packet, the two ends first establish a TCP connection through a three-way handshake.

A packet setting off from the chat box is first copied from user space, where the chat app lives, into the send buffer in kernel space. It then travels down through the transport layer and the network layer into the data link layer, where it passes through flow control (qdisc) and is handed to the physical layer's NIC via the RingBuffer. From the NIC, the data heads out into the tangled network world, hopping across who knows how many routers and switches before finally arriving at the destination machine's NIC.

At that point, the destination machine's NIC puts the packet into its RingBuffer via DMA, then raises a hard interrupt to the CPU; the CPU triggers a soft interrupt to collect the packet from the RingBuffer, and the packet climbs back up through the physical layer, data link layer, network layer and transport layer, until it is finally copied from kernel space into the chat app in user space.

▲ Panorama of sending and receiving a packet over the network

I drew such a big picture and then explained it in barely 200 words; it hurts a little.

At this point, details aside, you roughly know the macro journey of a packet from sender to receiver.

As you can see, it's wall-to-wall jargon.

Along this whole link, packet loss can happen in a lot of places.

But to keep you from squatting on the toilet so long it damages your health, I'll only cover a few common scenarios where packet loss tends to happen.

Packet loss when establishing a connection

The TCP protocol establishes a connection through a three-way handshake. It looks like this.

▲ TCP three-way handshake

On the server side, the first handshake creates a half-open connection, and the server then sends the second handshake. These half-open connections need somewhere to be stored temporarily; that place is called the semi-connection (SYN) queue.

When the third handshake later arrives, the half-open connection is promoted to a full connection and parked in another place, the full-connection (accept) queue, where it waits for the program to call accept() and take it away.

▲ Semi-connection queue and full-connection queue

Both queues have a length, and anything with a length can fill up. Once they are full, newly arriving packets are discarded.

You can check whether this kind of packet loss has happened with the commands below.

    # number of full-connection queue overflows
    # netstat -s | grep overflowed
        4343 times the listen queue of a socket overflowed

    # number of semi-connection queue overflows
    # netstat -s | grep -i "SYNs to LISTEN sockets dropped"
        109 SYNs to LISTEN sockets dropped

Seen from the outside, this shows up as connection establishment failing.

I covered this topic in more detail in an earlier article, "Can a TCP connection be established without accept()?" If you're interested, you can go back and take a look.
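To make the accept queue a bit more concrete, here is a minimal Python sketch of my own (not from the original article); the loopback address, port 8080 and the backlog of 2 are arbitrary assumptions. The backlog passed to listen() is essentially the length of the full-connection queue (capped by the kernel's net.core.somaxconn); if the program never calls accept(), completed connections pile up there until new handshakes start getting dropped.

    import socket

    # A server that completes handshakes but never calls accept():
    # finished connections just sit in the full-connection (accept) queue.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 8080))   # hypothetical address and port
    srv.listen(2)                   # accept-queue length of roughly 2

    clients = []
    for i in range(6):
        c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        c.settimeout(3)
        try:
            # The kernel finishes the three-way handshake for the first few,
            # then starts dropping new SYNs once the queue is full.
            c.connect(("127.0.0.1", 8080))
            print(f"client {i}: connected, parked in the accept queue")
        except socket.timeout:
            print(f"client {i}: timed out, SYN most likely dropped")
        clients.append(c)

While something like this runs, the netstat counters above should start moving.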

Flow control packet loss

Application-layer software can send network packets. If all of that data charged straight into the NIC without any control, the NIC couldn't take it. So what do we do? We make the data queue up and get processed according to certain rules; that is qdisc (Queueing Disciplines), the flow control mechanism we often talk about.

Where there's queuing there's a queue, and a queue has a length.

In the output of the ifconfig command below, the number 1000 after txqueuelen is exactly the length of this flow control queue.

When data is sent too fast and the flow control queue's txqueuelen is not big enough, packet loss happens easily.

▲ qdisc packet loss

You can check the dropped field under TX in the ifconfig output below; when it is greater than 0, flow control packet loss may have occurred.

    # ifconfig eth0
    eth0: flags=4163  mtu 1500
            inet 172.21.66.69  netmask 255.255.240.0  broadcast 172.21.79.255
            inet6 fe80::216:3eff:fe25:269f  prefixlen 64  scopeid 0x20
            ether 00:16:3e:25:26:9f  txqueuelen 1000  (Ethernet)
            RX packets 6962682  bytes 1119047079 (1.0 GiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 988919  bytes 2072511384 (1.9 GiB)
            TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

When this happens, we can try to increase the length of the flow control queue. For example, raise the flow control queue length of the eth0 NIC from 1000 to 1500 like so.

    # ifconfig eth0 txqueuelen 1500
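As a side note (my own sketch, not from the article), the same two numbers are also exposed under sysfs, so you can read them without parsing ifconfig output. The interface name eth0 is an assumption here, and the tx_dropped counter includes, but is not limited to, drops at the qdisc.

    from pathlib import Path

    iface = "eth0"                                  # assumed interface name
    base = Path("/sys/class/net") / iface

    # Length of the flow control (qdisc) queue, the same 1000 ifconfig shows.
    tx_queue_len = int((base / "tx_queue_len").read_text())
    # Packets counted as dropped on the transmit side.
    tx_dropped = int((base / "statistics" / "tx_dropped").read_text())

    print(f"{iface}: txqueuelen={tx_queue_len}, TX dropped={tx_dropped}")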

Packet loss caused by the NIC and its driver

Packet loss at the NIC and its driver is also common, and for many reasons, such as a poor-quality network cable or a loose connection. Beyond that, let's go over a few common scenarios.

Too small RingBuffer leads to packet loss

As mentioned above, received data is parked in the RingBuffer receive buffer and then waits for the kernel to trigger a soft interrupt and slowly collect it. If this buffer is too small and the data arrives too fast, it can overflow, and packets are lost.

▲ RingBuffer is full resulting in packet loss

We can use the following command to see if this has ever happened.

    # ifconfig
    eth0:   RX errors 0  dropped 0  overruns 0  frame 0

Look at the overruns metric above; it records the number of overflows caused by the RingBuffer being too short.

Of course, you can also view it with the ethtool command.

    # ethtool -S eth0 | grep rx_queue_0_drops

Note that a NIC can have more than one RingBuffer, so the 0 in rx_queue_0_drops above is the number of packets dropped by the 0th RingBuffer. For multi-queue NICs, that 0 can be changed to other numbers. But my family circumstances don't allow me to care about the drops in the other queues, so the command above is enough for me.
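If your family circumstances do stretch to multiple queues, here is a small Python sketch of my own that sums every per-queue counter instead of checking them one by one. It assumes the interface is called eth0 and that the driver names its counters rx_queue_<n>_drops, as in the command above; other drivers may use different counter names.

    import re
    import subprocess

    iface = "eth0"  # assumed interface name
    out = subprocess.run(["ethtool", "-S", iface],
                         capture_output=True, text=True, check=True).stdout

    # Collect every rx_queue_<n>_drops counter, one per RingBuffer/queue.
    drops = {int(m.group(1)): int(m.group(2))
             for m in re.finditer(r"rx_queue_(\d+)_drops:\s*(\d+)", out)}

    for q in sorted(drops):
        print(f"queue {q}: {drops[q]} drops")
    print("total:", sum(drops.values()))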

When you find this type of packet loss, you can use the following command to view the configuration of the current network card.

    # ethtool -g eth0
    Ring parameters for eth0:
    Pre-set maximums:
    RX:             4096
    RX Mini:        0
    RX Jumbo:       0
    TX:             4096
    Current hardware settings:
    RX:             1024
    RX Mini:        0
    RX Jumbo:       0
    TX:             1024

The output above means the RingBuffer supports a maximum length of 4096, but only 1024 is actually in use.

To change this length, execute ethtool -G eth0 rx 4096 tx 4096, which sets both the receive and send RingBuffer lengths to 4096.

With a bigger RingBuffer, packet loss caused by insufficient capacity can be reduced.

Insufficient performance of network card

Being hardware, a NIC has a limited transmission speed. When the network transfer rate is too high and hits the NIC's upper limit, packets get dropped. This is typically seen in stress-testing scenarios.

We can get the maximum speed the current NIC supports by giving ethtool the NIC's name.

    # ethtool eth0
    Settings for eth0:
            Speed: 1000Mb/s

You can see that the NIC I'm using here supports a maximum transmission speed of Speed: 1000Mb/s.

That is what's commonly called a gigabit NIC, but note that the unit is Mb, where b means bit, not Byte. 1 Byte = 8 bits, so 1000Mb/s still has to be divided by 8; theoretically, the NIC's maximum transfer speed is 1000/8 = 125MB/s.

We can analyze the sending and receiving of data packets from the network interface level through the sar command.

    # sar -n DEV 1
    Linux 3.10.0-1127.19.1.el7.x86_64     2022-07-27     _x86_64_     (1 CPU)

    08:35:39    IFACE   rxpck/s   txpck/s    rxkB/s      txkB/s   rxcmp/s   txcmp/s  rxmcst/s
    08:35:40     eth0      6.06      4.04      0.35   121682.33      0.00      0.00      0.00

Here txkB/s is the total kilobytes sent per second, and rxkB/s is the total kilobytes received per second.

When the two added together get close to that 125MB/s ceiling, as the txkB/s above nearly does, the NIC has reached its performance limit and packet loss begins.
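To make the arithmetic concrete, here is a tiny Python sketch of my own, plugging in the numbers from the sar sample above (treating 1 kB as 1000 bytes, which is close enough for a sanity check):

    # Gigabit NIC: 1000 Mb/s, and 1 Byte = 8 bits.
    link_speed_mbps = 1000
    ceiling_mb_s = link_speed_mbps / 8            # 125 MB/s theoretical ceiling

    # Values taken from the sar output above, in kB/s.
    rx_kb_s, tx_kb_s = 0.35, 121682.33
    used_mb_s = (rx_kb_s + tx_kb_s) / 1000        # roughly 121.7 MB/s

    print(f"ceiling ~= {ceiling_mb_s:.0f} MB/s, "
          f"current ~= {used_mb_s:.1f} MB/s "
          f"({used_mb_s / ceiling_mb_s:.0%} of the NIC's limit)")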

When you hit this problem, first check whether your service really has that much genuine traffic. If it does, consider splitting the service, or just grit your teeth, pay up, and upgrade the hardware.

Receive buffer packet loss

We usually do network programming with TCP sockets, and the kernel allocates each socket a send buffer and a receive buffer.

When we want to send a packet, we call send(msg) in the code. At that moment the packet does not immediately shoot out through the NIC; instead, the data is copied into the kernel send buffer and the call returns. When, and how much, data actually gets sent is up to the kernel to decide. There is a more detailed introduction in my earlier article, "After send() returns successfully, has the data actually been sent?"

▲ tcp_sendmsg logic

The receive buffer plays a similar role: packets received from the outside network are parked there, waiting for the user-space application to come and collect them.

Both buffers are limited in size, and you can view them with the commands below.

    # view the receive buffer
    # sysctl net.ipv4.tcp_rmem
    net.ipv4.tcp_rmem = 4096 87380 6291456

    # view the send buffer
    # sysctl net.ipv4.tcp_wmem
    net.ipv4.tcp_wmem = 4096 16384 4194304

For both the receive buffer and the send buffer you can see three values, corresponding to the buffer's minimum, default and maximum sizes (min, default, max). The buffer is tuned dynamically between min and max.
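On top of those system-wide sysctl values, each socket can be asked what the kernel actually gave it. A minimal Python sketch of my own:

    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

    # Current buffer sizes for this particular socket, in bytes.
    rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    snd = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    print(f"receive buffer: {rcv} bytes, send buffer: {snd} bytes")

    # Explicitly setting a size turns off the kernel's dynamic tuning between
    # tcp_rmem's min and max for this socket (and Linux doubles the value you
    # pass, to leave room for bookkeeping).
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)
    print("after setsockopt:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
    s.close()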

So the question is, what happens if the buffer is set too small?

For the send buffer, when send is called: if it's a blocking call, it waits until there is free space in the buffer before writing the data in.

▲ send blocking

If it's a non-blocking call, an EAGAIN error is returned immediately, which means Try again: the application should come back and retry later. In this case packet loss generally does not occur.

▲ send non-blocking
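Here is a minimal sketch of the non-blocking case, my own illustration using a local Unix socket pair for simplicity (a non-blocking TCP socket behaves the same way): keep writing while nobody reads, and once the send buffer is full, send() fails with EAGAIN instead of silently dropping anything.

    import socket

    # a stands in for the chat app, b for the peer that never reads.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    a.setblocking(False)

    sent_total = 0
    try:
        while True:
            sent_total += a.send(b"x" * 4096)    # nobody reads from b...
    except BlockingIOError as e:                 # ...so the send buffer fills up
        # errno EAGAIN/EWOULDBLOCK: "try again later", nothing was lost
        print(f"send buffer full after {sent_total} bytes: {e}")
    finally:
        a.close()
        b.close()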

Things are different when the receive buffer is full. The TCP receive window drops to 0, the so-called zero window, and the receiver uses win=0 in its packets to tell the sender, "I can't catch any more, stop serving." In principle the sender should then stop sending, but if data does arrive at this point, it gets dropped.

▲ recv_buffer packet loss

We can check whether this kind of packet loss has happened via the TCPRcvQDrop counter in the output of the following command.

    # cat /proc/net/netstat
    TcpExt: SyncookiesSent TCPRcvQDrop SyncookiesFailed
    TcpExt: 0 157 60116

But sadly, we usually can't see TCPRcvQDrop at all, because it's a counter introduced in kernel 5.9, while our servers typically run something around 2.x~3.x.
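If your kernel is new enough to have the counter, a small Python sketch of my own can fish it out of /proc/net/netstat, where each TcpExt section is a header line of counter names followed by a line of values:

    # Parse /proc/net/netstat: a names line, then a values line, per prefix.
    counters = {}
    with open("/proc/net/netstat") as f:
        lines = f.read().splitlines()

    for names_line, values_line in zip(lines[::2], lines[1::2]):
        prefix, names = names_line.split(":", 1)
        _, values = values_line.split(":", 1)
        for name, value in zip(names.split(), values.split()):
            counters[f"{prefix}.{name}"] = int(value)

    # Only present on kernels that expose this counter (5.9+ per the text above).
    print("TcpExt.TCPRcvQDrop =", counters.get("TcpExt.TCPRcvQDrop", "not available"))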

You can find out which Linux kernel version you're running with the following command.

    # cat /proc/version
    Linux version 3.10.0-1127.19.1.el7.x86_64

Network packet loss between the two ends

The packet loss discussed above all happens inside the machines at the two ends. Beyond that, the long link between the two ends belongs to the external network, full of routers, switches and optical cables, and packet loss happens there very often as well.

Those losses happen on machines somewhere in the middle of the link, which we certainly have no permission to log in to. But we can still observe the connectivity of the whole link with a few commands.

Checking packet loss with the ping command

For example, say the destination's domain name is baidu.com and you want to know whether packets are being lost between your machine and the baidu server. You can use the ping command.

▲ ping check lost packets

The second-to-last line shows 100% packet loss, meaning every packet was lost.

But this only tells you whether there is packet loss between your machine and the destination machine.

What if you want to know which node on the link between you and the destination is dropping the packets? Is there a way?

Yes.

Mtr command

The mtr command can show you the packet loss at each node between your machine and the destination machine.

Execute the command as follows.

▲ mtr_icmp

Here -r stands for report: it prints the results in report form.

Look at the Host column, which lists each hop along the link, and the Loss column, which is the packet loss rate at that hop.

Note that some hops in the middle can't be displayed. That's because mtr uses ICMP packets by default, and some nodes restrict ICMP, so they don't show up properly.

We can add -u to the mtr command, i.e. use UDP packets instead, and then see the IPs behind some of those missing hops.

▲ mtr-udp

Putting the ICMP results and the UDP results together gives you a fairly complete map of the link.

One more small detail about the Loss column: in the ICMP case, look at the last line. If it shows 0%, it doesn't matter whether earlier hops show 100% or 80% loss; those are false alarms caused by nodes restricting ICMP.

But if the last line shows, say, 20%, and the few lines just above it are also around 20%, then the loss starts at the nearest of those hops, and if it stays that way for a long time, that hop very likely has a problem. If it's on your company's intranet, take this clue to your network colleagues; if it's on the public internet, be patient and wait, the other side's engineers will be more anxious than you are.

What to do about packet loss

I've said all of this just to tell you one thing: packet loss is very common and almost unavoidable.

But then the problem is, what do we do if a packet is lost?

That one's easy: use the TCP protocol for the transmission.

What is TCP?

The two ends of a TCP connection establish the connection first. After sending data, the sender waits for the receiver to reply with an ack packet, whose whole purpose is to tell the sender the data really arrived. If a packet is lost somewhere along the link, the sender won't see that ack for a while and will retransmit. This ensures every packet really reaches the receiver.

Suppose the network goes down and we keep sending messages in the chat app anyway. The app will use TCP to retry the transmission over and over; if the network recovers during the retries, the data goes through normally. But if it keeps failing until the retries time out, you get the red exclamation mark.

At this point, the problem comes again.

Suppose the little green chat app does use the TCP protocol.

Then why, when the girl's boyfriend from the beginning of the article replied, did the message still get lost? After all, a lost packet gets retransmitted, and if that ultimately fails there should be a red exclamation mark.

So the question becomes: does using TCP mean packets can never be lost?

Does using the TCP protocol mean you will never lose packets?

We know that TCP sits at the transport layer, and above it are all kinds of application-layer protocols, such as the familiar HTTP or various RPC protocols.

▲ The four-layer network model

The reliability TCP guarantees is transport-layer reliability. In other words, TCP only makes sure that data travels reliably from machine A's transport layer to machine B's transport layer.

Whether the data makes it from the receiver's transport layer up to the application layer is not something TCP cares about.

Suppose we type a message and send it from the chat box; it enters the transport layer's TCP send buffer. Whether or not packets are lost along the way, retransmission guarantees it reaches the other side's TCP receive buffer. The receiver then replies with an ack, and once the sender gets the ack it throws the message away from its send buffer. At this point, TCP's mission is complete.

TCP's mission is complete, but the chat app's is not.

The chat app still has to read the data out of the TCP receive buffer. If at exactly that moment the app crashes and force-closes, because the phone ran out of memory or for any of a dozen other reasons...

The sender believes the message was delivered, yet the receiver never actually saw it.

And just like that, the message is lost.

▲ Using the TCP protocol, yet the message is still lost
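Here is a minimal Python sketch of my own that plays out exactly this scenario on one machine (the loopback port 9000 is an arbitrary assumption): TCP happily delivers the bytes into the peer's receive buffer, send() reports success, and then the "app" dies without ever calling recv().

    import socket
    import threading
    import time

    def receiver():
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.bind(("127.0.0.1", 9000))    # hypothetical port
        srv.listen(1)
        conn, _ = srv.accept()
        time.sleep(0.5)                  # the data is ACKed into conn's receive
        conn.close()                     # buffer, but the "app" crashes without
        srv.close()                      # ever calling conn.recv()

    t = threading.Thread(target=receiver)
    t.start()
    time.sleep(0.2)                      # give the listener time to come up

    cli = socket.create_connection(("127.0.0.1", 9000))
    n = cli.send(b"in a meeting, reply later")
    print(f"send() returned {n}: TCP got the bytes to the other side,")
    print("but the receiving app never read them, so the message is gone.")
    cli.close()
    t.join()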

The probability is tiny, but it happened to happen.

Perfectly reasonable, logically self-consistent.

So from this I forcefully concluded that my reader had in fact replied to the girl's message, and she didn't receive it only because of packet loss. And the packet loss happened because her phone's chat app crashed at the exact moment the message arrived.

Hearing this, and realizing she had wronged her boyfriend, the girl tearfully said she would make him buy her the latest iPhone, one that doesn't crash.

Uh...

Brothers, if you think I did the right thing, please drop a "positive energy" in the comments section.

How do we solve this kind of packet loss?

The story ends here. Now that we've all been moved, let's have a heart-to-heart.

Actually, everything I said above is true; none of it was a lie.

But a chat app as mature as the little green one, how could it not have considered this?

Remember, at the beginning of the article we said that for simplicity's sake we would omit the server, turning three-party communication into two-party communication, and that's exactly where this packet loss problem came from.

Now let's add the server back.

▲ Three-party communication in the chat app

Have you noticed that sometimes, after chatting for a while on your phone, you log in to the desktop version and it syncs the latest chat history over? In other words, the server probably keeps a record of the data we've sent recently. Suppose every message carries an id; the server and the chat app compare the id of the latest message each time to know whether the two sides are consistent, much like reconciling accounts.

For the sender, as long as it checks in with the server regularly, it knows which message failed to get through and simply resends it.

If the receiver's chat app crashes, then after restarting and talking to the server a little it knows which pieces of data are missing and syncs them down, so the packet loss described above is no longer a problem.

As you can see, TCP only guarantees transport-layer message reliability, not application-layer message reliability. If we also want the application layer to be reliable, the application layer has to implement that logic itself.
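Here is a back-of-the-napkin Python sketch of my own of that reconciliation idea, deliberately oversimplified: every message carries an increasing id, the sender keeps an outbox of what it has tried to send, and on every check-in it asks the server for the latest id it has seen and resends the gap. The class and method names are made up for illustration, not anything a real chat app exposes.

    class ChatServer:
        """Remembers every message it has received, keyed by message id."""
        def __init__(self):
            self.messages = {}

        def store(self, msg_id, text):
            self.messages[msg_id] = text

        def latest_id(self):
            return max(self.messages, default=0)

    class ChatClient:
        """Sender side: keeps an outbox and reconciles with the server."""
        def __init__(self):
            self.next_id = 1
            self.outbox = {}                       # id -> text sent so far

        def send(self, server, text, lost=False):
            msg_id, self.next_id = self.next_id, self.next_id + 1
            self.outbox[msg_id] = text
            if not lost:                           # simulate the rare loss above
                server.store(msg_id, text)

        def reconcile(self, server):
            # "Which id have you got up to?" Resend everything after it.
            last = server.latest_id()
            for msg_id in sorted(self.outbox):
                if msg_id > last:
                    server.store(msg_id, self.outbox[msg_id])

    server, client = ChatServer(), ChatClient()
    client.send(server, "good morning")
    client.send(server, "talk tonight?", lost=True)   # this one vanished
    client.reconcile(server)                          # the periodic check-in
    assert server.latest_id() == 2                    # the lost message was resent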

So here's the question: the two ends could also reconcile with each other directly, so why bring in a third party, the server?

There are three main reasons.

First, with pure end-to-end communication, if you have 1000 friends in your chat app you would have to maintain 1000 connections. With a server in the middle, you only need one connection, to the server; the app consumes fewer resources and your phone saves more battery.

Second, security. With end-to-end communication, anyone could come "reconcile accounts" with you and you would sync your chat history to them, which clearly won't do; if the other party has bad intentions, your information leaks. A third-party server makes it easy to do all kinds of authentication and verification.

Third, software versions. Once the app is installed on a user's phone, whether it ever gets updated is up to the user. With end-to-end communication, if the two sides' versions drift too far apart, all kinds of compatibility problems appear. With a server in the middle, you can force the oldest versions to upgrade before they can keep using the app, and for most compatibility issues it's enough to add compatibility logic on the server, with no need to force users to update at all.

So by the time you've read this far, you should understand that I didn't drop the server merely to keep things simple.

Summary

From the sender to the receiver, the link is very long, and packet loss can happen almost anywhere; you could say packet loss is practically unavoidable.

Usually you don't have to worry about packet loss. Most of the time, the retransmission mechanism of TCP ensures the reliability of messages.

When you notice a service behaving abnormally, for example an interface with high latency that keeps failing, you can use the ping or mtr commands to check whether the intermediate links are dropping packets.

TCP only guarantees the message reliability of the transport layer, not the message reliability of the application layer. If we still want to ensure the message reliability of the application layer, we need the application layer to implement the logic itself.

Finally, let me leave you a question. How does the mtr command know the IP address of each hop?

References

"Linux Kernel Technology practice"-- geek time

"panoramic Guide to packet loss Fault location in Cloud Network"-- geek rebirth

This article comes from the WeChat official account "rookie debug" (ID: xiaobaidebug). Author: Xiaobai.
