What are the three-way handshake and four-way wave of the TCP protocol in Python

2025-04-05 Update From: SLTechnology News&Howtos

This article explains the three-way handshake and four-way wave of the TCP protocol, with a small Python socket example. Many people run into trouble with these concepts in practice, so let's walk through them step by step.

I. The difference between the TCP and UDP protocols

Before we introduce the difference between the two, we need to understand a concept: the TCP/IP protocol family. The definition is as follows:

At present, the mainstream protocol family used on the Internet is the TCP/IP protocol family, which is a layered, multi-protocol communication system. (From "Linux High Performance Programming".)

The keywords are: layered, multi-protocol, and communication. In other words, it has several layers, each layer has its own protocols, and the layers cooperate through these protocols to achieve network communication.

Layering should be a familiar idea. The TCP/IP protocol family is a four-layer system; from bottom to top the layers are the data link layer, the network layer, the transport layer, and the application layer. The TCP and UDP protocols discussed here belong to the transport layer. (The role of each layer and its related protocols will not be covered here.)

Let's return to the title: the difference between TCP and UDP protocols. To sum up, the main points of the answer to this question are as follows:

1. First of all, both are transport-layer protocols. The transport layer provides end-to-end communication between two hosts, that is, from A to B.

2. TCP is reliable, while UDP is not. Reliability here means whether data sent from A to B is guaranteed to actually reach B. TCP uses acknowledgments and timeout retransmission to ensure that packets are delivered correctly to the destination. UDP gives no such guarantee: if a datagram is lost in transit, or the destination discards it because the checksum shows it is corrupted, UDP does not retransmit it. An application that uses UDP must therefore implement acknowledgment, timeout retransmission, and similar logic itself.

3. TCP is connection-oriented and UDP is connectionless. This is easy to understand, because a TCP connection requires the "three-way handshake and four-way wave".

4. TCP is stream-based, while UDP is datagram-based. Stream-based data has no boundaries (no length limit). In a datagram-based service, each UDP datagram has a length, and the receiver must read the whole datagram in one call, with that length as the minimum unit.

5. When the sender performs several write operations, the TCP module first puts the data into the TCP send buffer. When the TCP module actually starts sending, the data waiting in the send buffer may be packed into one or more TCP segments. The number of TCP segments sent therefore has no fixed relationship with the number of write operations performed by the application. Similarly, when the receiver gets one or more TCP segments, the TCP module places the data into the TCP receive buffer in sequence-number order (the sequence number is described in the TCP header structure below) and notifies the application that data is available. The application can read the buffer in one call or in several (depending on the size of the read buffer it supplies), so the number of reads has no fixed relationship with the number of segments sent either. To sum up, for a TCP connection there is no numerical relationship between the number of writes performed by the sender and the number of reads performed by the receiver; this is the defining property of a stream-based service. For UDP, every write by the sender is packed into one UDP datagram and sent, and the receiver must read each datagram in full and in time, otherwise data is lost. For UDP communication, therefore, the number of writes by the sender equals the number of reads by the receiver; this is the defining property of a datagram-based service.

6. A TCP connection is one-to-one, so applications based on broadcast or multicast cannot use TCP, whereas UDP is well suited to broadcast and multicast.
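Points 4 and 5 can be observed directly on the loopback interface. The following is only a minimal sketch: the helper names `udp_reads` and `tcp_reads` and the two test messages are made up for illustration, and it assumes loopback delivers the two datagrams in order without loss.

```python
import socket

MESSAGES = (b"hello", b"world!")  # two separate writes by the sender


def udp_reads():
    """Send two datagrams over loopback and read them back one at a time."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    rx.settimeout(2.0)
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for msg in MESSAGES:
            tx.sendto(msg, rx.getsockname())
        # Each recvfrom() call returns exactly one datagram.
        return [rx.recvfrom(1024)[0] for _ in MESSAGES]
    finally:
        tx.close()
        rx.close()


def tcp_reads():
    """Send the same two messages over TCP and read until end of stream."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    cli = socket.create_connection(srv.getsockname())
    conn, _ = srv.accept()
    conn.settimeout(2.0)
    try:
        for msg in MESSAGES:
            cli.sendall(msg)
        cli.close()  # end of stream: recv() returns b"" once all data is read
        chunks = []
        while True:
            chunk = conn.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)
        return chunks  # may be one chunk or several; only the bytes are fixed
    finally:
        conn.close()
        srv.close()
```

With UDP the two writes come back as two reads. With TCP only the concatenated byte sequence is guaranteed; how it is split across `recv()` calls may vary from run to run.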

To sum up with a definition:

The TCP protocol (Transmission Control Protocol) provides a reliable, connection-oriented, stream-based service to the application layer. The UDP protocol (User Datagram Protocol) is the opposite: it provides an unreliable, connectionless, datagram-based service to the application layer.

II. TCP header structure

A TCP segment is divided into a header and a data part. The header structure matters here because its flag bits are used in the "three-way handshake and four-way wave" described later, so a basic understanding of it makes the rest easier to follow.

The function of each header field is described below:

16-bit source port number and 16-bit destination port number: these are self-explanatory and need no further comment.

32-bit sequence number: while a connection is being established (or closed), the sequence number acts as a placeholder. When A sends a connection request to B, the request carries a sequence number (a randomly chosen value called the ISN, the initial sequence number). When B confirms the connection, it replies with that sequence number plus 1 as the acknowledgment, together with its own sequence number. Once the connection is established, the sequence number of a segment is the ISN plus the offset, within the whole byte stream, of the first byte of data the segment carries. For example, if a TCP segment carries bytes 100 to 200 of the byte stream, its sequence number is ISN + 100. In short, during connection setup or teardown the sequence number is a placeholder, and after the connection is established it marks the first byte of the data carried by the current segment.
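The "ISN plus offset" rule is plain modular arithmetic, since the field is 32 bits wide and wraps around. A small sketch, where the function name `segment_seq` is hypothetical:

```python
SEQ_SPACE = 2 ** 32  # the sequence number field is 32 bits wide


def segment_seq(isn: int, stream_offset: int) -> int:
    """Sequence number of a segment whose first data byte sits at
    `stream_offset` in the byte stream, wrapping around at 2**32."""
    return (isn + stream_offset) % SEQ_SPACE
```

For instance, `segment_seq(1000, 100)` gives 1100, while an ISN close to the top of the range wraps around to a small value.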

4-bit header length: the number of 32-bit words in the TCP header. Because the field is only 4 bits wide, its maximum value is 15, so a TCP header can be at most 15 × 4 = 60 bytes long.

6-bit flag bits, including:

URG flag: indicates whether the urgent pointer is valid.

ACK flag: indicates that the acknowledgment number is valid. A TCP segment carrying the ACK flag is usually called an acknowledgment segment.

PSH flag: tells the receiving application to read the data from the TCP receive buffer immediately, making room for subsequent data (if it is not read, the data stays in the buffer).

RST flag: asks the other party to re-establish the connection. A TCP segment carrying the RST flag is usually called a reset segment.

SYN flag: requests that a connection be established. A TCP segment carrying the SYN flag is usually called a synchronization segment.

FIN flag: announces that this side is closing the connection. A TCP segment carrying the FIN flag is usually called a finish segment.

These flag bits indicate the purpose of the current segment, that is, what it is meant to do.

16-bit window size: indicates how many bytes of data the current TCP receive buffer can hold, so that the sender can control the speed of sending data, which is a means of TCP flow control.

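16-bit checksum: used to detect whether the segment was corrupted in transit. The sender fills it in and the receiver verifies it. The algorithm is the standard Internet checksum (the one's complement of the one's-complement sum of the 16-bit words). The check covers not only the TCP header but also the data part.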
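The 16-bit Internet checksum used by TCP is described in RFC 1071. The sketch below computes it over an arbitrary byte string; the function name `internet_checksum` is made up here, and a real TCP implementation also feeds a pseudo-header (source and destination IP addresses, protocol, and TCP length) into the same sum, which this simplified version leaves out.

```python
def internet_checksum(data: bytes) -> int:
    """One's complement of the one's-complement sum of 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

A receiver can verify a segment by running the same sum over the data with the checksum field included: an undamaged segment yields 0.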

16-bit urgent pointer: a positive offset which, added to the value of the sequence number field, gives the sequence number of the byte following the last byte of urgent data. The TCP urgent pointer is the mechanism by which the sender delivers urgent data to the receiver.

TCP header options: variable-length optional information. This part holds at most 40 bytes, because the maximum TCP header length is 60 bytes and the fixed part occupies 20 bytes. The details are omitted here; see section 3.2.2 of "Linux High Performance Programming".
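To make the field layout concrete, here is a hedged sketch that unpacks the fixed 20-byte part of a TCP header with the standard `struct` module. The function name `parse_tcp_header` and the sample values (ports 50000 and 9999, sequence number 223) are illustrative assumptions. The fixed header also contains a 32-bit acknowledgment number right after the sequence number, which appears later as the "confirmation sequence number" in the handshake.

```python
import struct

# Flag bit positions in the low 6 bits of the 16-bit offset/flags word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    (src_port, dst_port, seq, ack,
     off_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    header_len = (off_flags >> 12) * 4  # 4-bit field counts 32-bit words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": header_len,
        "options_len": header_len - 20,  # whatever exceeds the fixed part
        "flags": [name for name, bit in FLAG_BITS.items() if off_flags & bit],
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }


# A hand-built SYN header: header length 5 words (20 bytes), only SYN set.
sample = struct.pack("!HHIIHHHH", 50000, 9999, 223, 0,
                     (5 << 12) | FLAG_BITS["SYN"], 65535, 0, 0)
header = parse_tcp_header(sample)
```

Here `header["header_len"]` is 20 and `header["flags"]` contains only `"SYN"`, matching the first segment of the handshake described later.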

III. The three-way handshake and four-way wave in detail

(Figure: diagram of the three-way handshake and four-way wave.)

Let's first explain the three-way handshake:

1. The sender sends a connection request with the SYN flag set and its own sequence number, for example 223. (Because no data is being transmitted, the sequence number does not represent a byte offset here; it only acts as a placeholder.)

2. The receiver gets the request, agrees to connect, and sends its reply with the SYN and ACK flags set. The acknowledgment number is 224 (the sender's sequence number plus 1), and the reply carries the receiver's own sequence number, for example 521 (again only a placeholder, since no data is transmitted).

3. The sender gets the reply and sends a confirmation back to the receiver, meaning "I have received your confirmation". This segment has the ACK flag set, and its acknowledgment number is 522.
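The numbers in the three steps above follow a fixed pattern, which the sketch below reproduces. It does no networking; the function name `three_way_handshake` and the tuple layout (direction, flags, sequence number, acknowledgment number) are illustrative choices, not part of any library.

```python
SEQ_SPACE = 2 ** 32


def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three segments as (direction, flags, seq, ack) tuples."""
    return [
        # 1. The SYN carries the client's ISN; nothing to acknowledge yet.
        ("client -> server", "SYN", client_isn, None),
        # 2. SYN+ACK: the server's own ISN, acknowledging client ISN + 1.
        ("server -> client", "SYN+ACK", server_isn, (client_isn + 1) % SEQ_SPACE),
        # 3. ACK: the client's next sequence number, acknowledging server ISN + 1.
        ("client -> server", "ACK", (client_isn + 1) % SEQ_SPACE,
         (server_isn + 1) % SEQ_SPACE),
    ]


segments = three_way_handshake(223, 521)
```

With the example values from the text (223 and 521), the acknowledgment numbers come out as 224 and 522.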

A related question: why three handshakes, rather than four or two?

First, why not four. With four handshakes the process would go like this:

Sender: I want to connect to you.

Receiver: OK.

Receiver: I'm ready, you can connect.

Sender: OK.

Clearly, the receiver's "I agree" and "I'm ready" can be merged into one message, which makes connection setup more efficient.

Next, why not two. This is also easy to understand. TCP is full-duplex and reliable: opening and closing must be performed by both sides, and each side must be sure that the other has done so. With only two handshakes the process would go like this:

Sender: I want to connect to you.

Receiver: OK.

Here the receiver does not know, and cannot guarantee, that the sender has received the "OK" message. If the sender never receives it, the result is a one-sided connection: the receiver believes the connection is established, while the sender keeps retrying its connection request until it actually receives the "OK". With three handshakes, the receiver does not treat the connection as established until the sender's final confirmation arrives; if it does not arrive, the receiver resends its reply.

Next, let's look at the four-way wave:

1. The sender sends a close request with the FIN flag set and its own sequence number (again, since no data is transmitted, the sequence number is only a placeholder rather than a byte offset).

2. After receiving the request, the receiver replies with a confirmation: the ACK flag is set and the acknowledgment number is the request's sequence number plus 1.

3. The receiver also decides to close the connection and sends its own close notification with the FIN flag set. It carries the same acknowledgment as in step 2 (the ACK flag and acknowledgment number) together with the receiver's own sequence number.

4. The sender replies with a confirmation: the ACK flag is set and the acknowledgment number is the receiver's sequence number plus 1.
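The four steps can be traced with the same kind of bookkeeping as the handshake. In this sketch the function name `four_way_wave` and the starting sequence numbers are hypothetical, and it assumes the receiver sends no data between its ACK (step 2) and its FIN (step 3), so its sequence number does not advance in between.

```python
SEQ_SPACE = 2 ** 32


def four_way_wave(sender_seq: int, receiver_seq: int):
    """Return the four closing segments as (direction, flags, seq, ack) tuples."""
    sender_next = (sender_seq + 1) % SEQ_SPACE      # a FIN consumes one number
    receiver_next = (receiver_seq + 1) % SEQ_SPACE
    return [
        ("sender -> receiver", "FIN", sender_seq, None),              # step 1
        ("receiver -> sender", "ACK", receiver_seq, sender_next),     # step 2
        ("receiver -> sender", "FIN+ACK", receiver_seq, sender_next), # step 3
        ("sender -> receiver", "ACK", sender_next, receiver_next),    # step 4
    ]


wave = four_way_wave(1000, 5000)
```

In a real trace the first FIN usually has the ACK flag set as well, acknowledging earlier data; that detail is left out to keep the sketch aligned with the four steps as described.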

A related question: why four waves instead of three?

With three waves the process would go like this:

Sender: I won't send you any more data.

Receiver: OK, I won't send you any more either.

Sender: OK, bye.

The reason is that when the receiver gets the close request, all it can do immediately is acknowledge it. What it acknowledges is the closing of the sender-to-receiver direction only: the sender will send no more data to the receiver, but it can still receive data that the receiver sends to it. When the receiver closes its own direction ("sending data to the sender") is decided on the receiving side; it may, for example, sleep for a few seconds before closing. If the two messages were merged, making three waves, the sender might not get the acknowledgment of its close request in time and could start timeout retransmission. That is why four waves are needed.
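The half-closed state described above can be reproduced with `socket.shutdown(socket.SHUT_WR)`, which closes only the sending direction. A minimal loopback sketch follows; the function name `half_close_demo` and the message contents are made up for illustration.

```python
import socket


def read_until_eof(sock):
    """Read from sock until the peer closes its sending direction."""
    data = b""
    while True:
        chunk = sock.recv(1024)
        if not chunk:
            return data
        data += chunk


def half_close_demo():
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    srv.listen(1)
    cli = socket.create_connection(srv.getsockname())
    conn, _ = srv.accept()
    cli.settimeout(2.0)
    conn.settimeout(2.0)
    try:
        cli.sendall(b"request")
        cli.shutdown(socket.SHUT_WR)      # waves 1 and 2: "I will send no more"
        received = read_until_eof(conn)   # the server sees end of stream
        conn.sendall(b"late reply")       # the other direction is still open
        conn.close()                      # waves 3 and 4: the server's own FIN
        reply = read_until_eof(cli)
        return received, reply
    finally:
        conn.close()
        cli.close()
        srv.close()
```

The server can still deliver `b"late reply"` after the client has finished sending, which is exactly the window between the second and third wave.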

IV. What is the TIME_WAIT state

First, let's look at some code (after so much theory, it is finally time for code). Here is a simple example of TCP communication using Python's socket module:

Server:

socket_server_test.py

    # -*- coding: utf-8 -*-
    """
    @Time: 2019-6-26 4:58 PM
    @Author: Demon
    @File: socket_server_test.py
    @Desc:
    """
    import socket

    HOST = '127.0.0.1'  # standard loopback address (localhost)
    PORT = 9999         # listening port (non-system port: greater than 1023)

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Third argument: 0 means the address cannot be reused, 1 means it can
    # s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((HOST, PORT))
    s.listen()
    conn, addr = s.accept()
    with conn:
        print('Connected by', addr)
        while True:
            data = conn.recv(1024)
            print("data:", data)
            if not data:
                break  # client closed without sending; do not spin on empty reads
            # Close from the server side as soon as data arrives, without
            # echoing it back, so that the server is the side that closes first.
            print("close")
            s.close()
            break

Client:

    # -*- coding: utf-8 -*-
    """
    @Time: 2019-6-26 4:55 PM
    @Author: yrr
    @File: socket_client_test.py
    @Desc: test self._socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    """
    import socket

    HOST = '127.0.0.1'  # hostname or IP address of the server
    PORT = 9999         # port used by the server

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((HOST, PORT))
    s.sendall(b'Hello, world')
    data = s.recv(1024)
    print('Received', repr(data))

Execution effect:

We find that after the server has actively closed the connection, running the server program again fails with an error saying the port is still in use. This looks strange: the connection has clearly been closed, so why is the port still occupied? Checking with the command netstat -an | grep 9999 shows that the connection is in the TIME_WAIT state.

Let's talk about the TIME_WAIT state. When one party actively closes the connection, it does not go straight to the CLOSED state after the final acknowledgment; it first enters the TIME_WAIT state. In this state it must wait for 2MSL (MSL is the Maximum Segment Lifetime, the longest time a segment can survive in the network) before it is completely closed.
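The path each side takes through the TCP states during a normal close can be written down as two short transition lists, which shows that only the side that closes first passes through TIME_WAIT. This is a sketch of the standard state names; the variable and function names are illustrative, and simultaneous close is ignored.

```python
# (event, next state) pairs, starting from ESTABLISHED.
ACTIVE_CLOSER = [
    ("send FIN", "FIN_WAIT_1"),
    ("recv ACK", "FIN_WAIT_2"),
    ("recv FIN, send ACK", "TIME_WAIT"),
    ("2MSL timer expires", "CLOSED"),
]
PASSIVE_CLOSER = [
    ("recv FIN, send ACK", "CLOSE_WAIT"),
    ("send FIN", "LAST_ACK"),
    ("recv ACK", "CLOSED"),
]


def states(transitions, start="ESTABLISHED"):
    """List the states visited, in order, when following the transitions."""
    return [start] + [next_state for _event, next_state in transitions]


active_path = states(ACTIVE_CLOSER)
passive_path = states(PASSIVE_CLOSER)
```

In the example above the server closes first, so it is the server's port 9999 that shows up in TIME_WAIT.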

Related questions:

1. Why does the TIME_WAIT state need to exist?

Put simply, there are two reasons:

a. When the sender sends its final confirmation, there is still no guarantee that the receiver gets it. If it is lost, the receiver retransmits its close notification; if the sender had already closed completely by then, that retransmission could not be acknowledged properly.

b. If the sender closed completely right after sending the final confirmation, it could open a new connection on the same addresses and ports while segments of the old connection were still in flight. That would cause confusion: the new connection has barely been established when it receives a stale closing message that belongs to the old one.

2. Why is the duration 2MSL?

This is fairly easy to understand. I send the final confirmation, which takes at most one MSL to reach you. If you do not receive it, you retransmit your close notification, which takes at most one more MSL to reach me. So I wait 2MSL: if no retransmitted request has arrived by then, you must have received my confirmation normally.

Because of the TIME_WAIT state, it is usually recommended that the client close first rather than the server. So how do we avoid the port remaining occupied after the server closes (that is, how do we fix the problem in the code example above)? Quite simply, with a single line of code, placed before the call to bind():

    # SO_REUSEADDR allows the port of a closed connection to be reused
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
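To see the effect of that option end to end, the sketch below starts a listener, lets the server side close the connection first, and then immediately listens on the same port again. The helper names `listen_on` and `restart_after_server_close` are hypothetical, the port is chosen by the OS instead of the fixed 9999, and the behaviour described is the usual one on Linux and macOS; other platforms give `SO_REUSEADDR` somewhat different semantics.

```python
import socket


def listen_on(port: int) -> socket.socket:
    """Create a listening TCP socket on loopback with SO_REUSEADDR enabled."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind() to have any effect on the bind.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("127.0.0.1", port))
    s.listen(1)
    return s


def restart_after_server_close():
    srv = listen_on(0)                   # port 0: let the OS pick a free port
    port = srv.getsockname()[1]
    cli = socket.create_connection(("127.0.0.1", port))
    cli.settimeout(2.0)
    conn, _ = srv.accept()
    conn.close()                         # the server closes first
    cli.recv(1)                          # returns b"" once the server's FIN arrives
    cli.close()                          # the server side can now enter TIME_WAIT
    srv.close()
    restarted = listen_on(port)          # "restart" the server on the same port
    rebound_port = restarted.getsockname()[1]
    restarted.close()
    return port, rebound_port
```

Without the `setsockopt` call in `listen_on`, the second `bind()` would typically fail with an "address already in use" error on Linux while the old connection sits in TIME_WAIT, which is the error seen when rerunning the server example.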
