Deep and simple TCP/IP protocol stack

2025-07-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

The TCP/IP stack is a family of network protocols that forms the backbone of network communication. It defines how devices connect to the Internet and how data moves between them. The stack has four layers: the application layer, the transport layer, the network layer, and the link layer. Each layer relies on the services of the layer directly below it to do its own job. Most of us spend our time at the application layer and rarely need to think about the layers beneath it. The protocol family is also large and complex, with a high barrier to entry, so the way TCP/IP actually works can be hard to pin down. Let's walk through it:

0. Physical media

Physical media are the physical means of connecting computers. Common examples are optical fiber, twisted-pair cable, and radio waves. The medium determines how the signals that represent 0s and 1s are transmitted, and with it the bandwidth, rate, reach, and noise immunity of the link.

The TCP/IP stack is divided into four layers, and each layer talks to its counterpart on the other host through a specific protocol. Whatever the protocols exchange is ultimately converted into signals of 0s and 1s that travel over the physical medium to the other computer, so the physical medium is the foundation of network communication. Network communication is like express delivery. The goods a user buys are wrapped layer by layer, and the labels (the protocols) describe the parcel's size, recipient, contact information, and delivery address, while the delivery vehicle is the physical medium. For remote destinations the parcel cannot go direct and must be forwarded along the way; the labels then tell each stop where to forward it and who the recipient is. That is why TCP/IP needs so many protocols. Here is a quick overview of the TCP/IP data flow:

When a user sends a request over HTTP, the application, transport, network, and link layer protocols wrap the request in turn, each adding its own header, until the link layer produces an Ethernet frame. The frame travels over the physical medium to the other host, which unwraps it layer by layer with the matching protocols and finally hands the application layer data to the application for processing.

With that overview in place, the sections below describe the job of each layer and the protocols involved:
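The wrapping and unwrapping described above can be pictured with a toy sketch. The bracketed text headers are made up for illustration; real headers are binary structures, and the link layer also appends a trailer, which this sketch leaves out.

```python
# Toy encapsulation: each lower layer prepends a placeholder header.
# The header contents are hypothetical and exist only for this sketch.
LAYERS = ["transport", "network", "link"]  # order applied on the way down


def encapsulate(app_data: bytes) -> bytes:
    """Wrap application data with one placeholder header per lower layer."""
    packet = app_data
    for layer in LAYERS:
        packet = f"[{layer}]".encode() + packet
    return packet


def decapsulate(frame: bytes) -> bytes:
    """Strip the headers outermost first, as the receiving host would."""
    packet = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not packet.startswith(header):
            raise ValueError(f"missing {layer} header")
        packet = packet[len(header):]
    return packet


request = b"GET / HTTP/1.1"
frame = encapsulate(request)
print(frame)
print(decapsulate(frame))
```

The header added last (link) ends up outermost, which is why the receiver removes it first.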

1. Link layer

Network communication means sending meaningful data to the other side over a physical medium. A bare stream of 0s and 1s carries no meaning, so the bits must be grouped, each group must be labeled with what it contains, and the groups must be sent in order. Ethernet calls one such group of signals a frame, and the protocol that lays down this rule is the Ethernet protocol. A complete Ethernet frame is laid out as follows:

The frame has three parts: header, data, and trailer. The header is fixed at 14 bytes and holds the destination MAC address, the source MAC address, and the type. The data is at least 46 bytes and at most 1500 bytes; longer data must be split across several frames. The trailer is fixed at 4 bytes and holds the frame check sequence, which is used to detect whether the frame was damaged in transit. In this way the Ethernet protocol groups signals into frames, which are then sent over the physical medium to the receiver. So how does Ethernet identify the recipient?
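As a rough sketch of this layout, the following builds and parses a frame with Python's struct module. The function names are hypothetical. The check sequence is computed with zlib.crc32, which uses the same CRC-32 polynomial as Ethernet; the exact bit ordering on the wire is glossed over here, so treat the trailer as illustrative.

```python
import struct
import zlib

HEADER_FMT = "!6s6sH"                      # destination MAC, source MAC, type
HEADER_LEN = struct.calcsize(HEADER_FMT)   # 14 bytes
MIN_DATA, MAX_DATA = 46, 1500


def build_frame(dst: bytes, src: bytes, eth_type: int, data: bytes) -> bytes:
    """Assemble header + data + 4-byte check sequence."""
    if len(data) > MAX_DATA:
        raise ValueError("data must be split across several frames")
    data = data.ljust(MIN_DATA, b"\x00")   # pad short data up to 46 bytes
    body = struct.pack(HEADER_FMT, dst, src, eth_type) + data
    fcs = struct.pack("<I", zlib.crc32(body))
    return body + fcs


def parse_frame(frame: bytes):
    """Verify the check sequence, then split header fields from data."""
    body, fcs = frame[:-4], frame[-4:]
    if struct.pack("<I", zlib.crc32(body)) != fcs:
        raise ValueError("frame damaged in transit")
    dst, src, eth_type = struct.unpack(HEADER_FMT, body[:HEADER_LEN])
    return dst, src, eth_type, body[HEADER_LEN:]


frame = build_frame(b"\xff" * 6, bytes.fromhex("4c0f6e12d219"), 0x0800, b"hi")
print(len(frame))  # 64 = 14-byte header + 46-byte padded data + 4-byte trailer
```

Note that the receiver gets the padded data back; in practice the length carried by the upper-layer header tells it where the real data ends.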

The Ethernet protocol requires every device on the network to have a network adapter, that is, a network interface card (NIC), and frames always travel from one NIC to another. The addresses of the two NICs are the source and destination addresses of the frame, that is, the MAC addresses in the frame header. A MAC address is the identity of a NIC, much like the number on a person's ID card, and is intended to be globally unique. It is 6 bytes long and written in hexadecimal: the first three bytes identify the manufacturer and the last three are the card's serial number, for example 4C-0F-6E-12-D2-19.

Given the MAC address, Ethernet delivers the frame by broadcast to every host in the subnet. Each host that receives the frame reads the destination MAC address in the header and compares it with its own. If they match, the host processes the frame further; if not, it discards the frame. (This describes classic shared-medium Ethernet; a modern switch learns which port each MAC address is on and, where it can, forwards a frame only to that port.)

So the main job of the link layer is to group the signals into frames with a defined meaning and send them over the physical medium to the receiver by broadcast.
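The MAC address format and the destination check can be sketched as follows. The helper names are hypothetical, and the all-ones broadcast address FF-FF-FF-FF-FF-FF is a standard Ethernet detail that the text above does not mention.

```python
def parse_mac(text: str) -> bytes:
    """Turn '4C-0F-6E-12-D2-19' (or the colon-separated form) into 6 raw bytes."""
    parts = text.replace(":", "-").split("-")
    if len(parts) != 6:
        raise ValueError("a MAC address has exactly 6 bytes")
    return bytes(int(p, 16) for p in parts)


BROADCAST = b"\xff" * 6  # every host accepts frames sent to this address


def accepts(own_mac: bytes, frame_dst: bytes) -> bool:
    """A host keeps a frame addressed to itself or to the broadcast address."""
    return frame_dst == own_mac or frame_dst == BROADCAST


mac = parse_mac("4C-0F-6E-12-D2-19")
vendor, serial = mac[:3], mac[3:]          # manufacturer part, serial part
print(vendor.hex(), serial.hex())          # 4c0f6e 12d219
```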

2. Network layer

For the above process, there are several details worth considering:

How does the sender know the MAC address of the receiver?

How does the sender know whether the receiver is on the same subnet as itself?

If the receiver is not on the same subnet, how does the sender get packets to it?

To solve these problems, the network layer introduces three protocols: the IP protocol, the ARP protocol, and the routing protocol.

2.1 IP Protocol

From the previous section we know that a MAC address depends only on the manufacturer and says nothing about the network the host is in, so MAC addresses cannot tell us whether two hosts belong to the same subnet.

The network layer therefore introduces the IP protocol, which defines a new set of addresses that let us tell whether two hosts belong to the same network. These network addresses are called IP addresses.

An IPv4 address is 32 bits long and is usually written as four decimal numbers separated by dots. The IP protocol divides the 32 bits into two parts: the first identifies the network, and the second identifies the host within that network. Where the split falls depends on the address class. Take the class C address 192.168.24.1 as an example: the first 24 bits are the network address and the last 8 bits are the host address. If two IP addresses are in the same subnet, their network addresses must be identical. To extract the network address from an IP address, the IP protocol introduces the subnet mask: a bitwise AND of the IP address and the subnet mask yields the network address.

Since the IP addresses of the sender and receiver are both known (the application layer passes them in), we can AND each of them with the subnet mask and compare the results to determine whether the two hosts are in the same subnet.
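The AND-and-compare check can be written out in a few lines. This is a minimal sketch using Python's standard ipaddress module only to convert between dotted notation and integers; the function names and the second and third addresses are made up for the example.

```python
import ipaddress


def network_address(ip: str, mask: str) -> str:
    """Bitwise AND of an IPv4 address and its subnet mask."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_int & mask_int))


def same_subnet(ip_a: str, ip_b: str, mask: str) -> bool:
    """Two hosts share a subnet when their network addresses are equal."""
    return network_address(ip_a, mask) == network_address(ip_b, mask)


print(network_address("192.168.24.1", "255.255.255.0"))               # 192.168.24.0
print(same_subnet("192.168.24.1", "192.168.24.200", "255.255.255.0"))  # True
print(same_subnet("192.168.24.1", "192.168.25.1", "255.255.255.0"))    # False
```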

2.2 ARP Protocol

Address Resolution Protocol (ARP) is a network layer protocol that finds the MAC address for a given IP address. It works as follows:

ARP first builds a request packet whose header contains the IP address of the target host. The link layer wraps this packet in an Ethernet frame and broadcasts it to every host in the subnet. Each host that receives it extracts the target IP address and compares it with its own: if they match, the host replies with its MAC address; if not, it discards the packet. From the reply, ARP learns the MAC address of the target. ARP also stores the returned MAC address with its IP address in the local ARP cache for a certain time, so the next lookup can be answered from the cache without another broadcast. Running arp -a at a command prompt lists the locally cached ARP entries.

2.3 Routing protocol

As the description of ARP shows, ARP can only resolve MAC addresses within a single subnet, so the network layer introduces the routing protocol. The sender first uses the IP protocol to determine whether the two hosts are in the same subnet. If they are, it looks up the receiver's MAC address with ARP and sends the packet to it within the subnet. If they are not, the packet is sent to the subnet's gateway for routing. Gateways are the bridges between subnets on the Internet, so a packet may be forwarded by several gateways before it reaches the subnet that contains the target IP address. There, ARP supplies the MAC address of the target machine, and the packet is finally delivered to the receiver.
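The lookup-then-broadcast behaviour can be simulated without touching the network. In this sketch the ArpCache class, the resolve function, the 60-second lifetime, and the dictionary standing in for the hosts on the subnet are all assumptions made for illustration; real cache lifetimes vary by operating system.

```python
import time


class ArpCache:
    """IP -> MAC table whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self.entries = {}  # ip -> (mac, time the entry was stored)

    def lookup(self, ip, now=None):
        now = time.monotonic() if now is None else now
        entry = self.entries.get(ip)
        if entry and now - entry[1] < self.ttl:
            return entry[0]
        self.entries.pop(ip, None)  # drop a missing or expired entry
        return None

    def store(self, ip, mac, now=None):
        now = time.monotonic() if now is None else now
        self.entries[ip] = (mac, now)


def resolve(ip, cache, subnet_hosts, now=None):
    """Answer from the cache if possible, otherwise 'broadcast' a request."""
    mac = cache.lookup(ip, now)
    if mac is not None:
        return mac
    # Every host compares the target IP with its own; only the owner replies.
    for host_ip, host_mac in subnet_hosts.items():
        if host_ip == ip:
            cache.store(ip, host_mac, now)
            return host_mac
    return None  # nobody on the subnet owns this IP


hosts = {"192.168.24.7": "4C-0F-6E-12-D2-19"}
cache = ArpCache(ttl_seconds=60)
print(resolve("192.168.24.7", cache, hosts))
```

Passing now explicitly makes the expiry behaviour easy to exercise without waiting for real time to pass.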

The physical device that carries out routing is the router. In a complicated network the router acts as the traffic hub: it chooses a route according to the state of the links and forwards each packet along the best available path.
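From the sending host's point of view, the decision described in this section comes down to choosing where the next frame goes. The sketch below assumes a single default gateway; the next_hop function and the gateway address 192.168.24.254 are hypothetical, and real routers consult a routing table with many entries.

```python
import ipaddress


def next_hop(src_ip: str, dst_ip: str, mask: str, gateway_ip: str) -> str:
    """Return the IP whose MAC address the sender must resolve with ARP."""
    local_net = ipaddress.IPv4Network(f"{src_ip}/{mask}", strict=False)
    if ipaddress.IPv4Address(dst_ip) in local_net:
        return dst_ip      # same subnet: deliver directly
    return gateway_ip      # different subnet: hand the packet to the gateway


print(next_hop("192.168.24.1", "192.168.24.50", "255.255.255.0", "192.168.24.254"))
print(next_hop("192.168.24.1", "10.0.0.8", "255.255.255.0", "192.168.24.254"))
```

Note that the destination IP address in the packet never changes along the way; only the destination MAC address of each frame is set to the next hop.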

2.4 IP packets

Packets wrapped at the network layer are called IP packets. The structure of an IPv4 packet is as follows:

An IP packet consists of a header and data. The header is at least 20 bytes long (up to 60 bytes with options) and holds, most importantly, the source and destination IP addresses; the destination IP address is what gateways route on. With a 20-byte header the data part can be at most 65515 bytes, because the total length of an IP packet is limited to 65535 bytes. An Ethernet frame, however, carries at most 1500 bytes of data. A larger IP packet must be fragmented and sent in several frames.

So the main work of the network layer is to define network addresses, distinguish network segments, resolve MAC addresses within a subnet, and route packets between subnets.
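The fragmentation arithmetic can be worked through in a few lines. This sketch assumes a 20-byte header on every fragment and adds one detail the text does not cover: IPv4 measures fragment offsets in 8-byte units, so every fragment except the last carries a multiple of 8 bytes. The function name is hypothetical.

```python
IP_HEADER = 20   # header size without options
MTU = 1500       # largest IP packet that fits in one Ethernet frame


def fragment_sizes(data_len: int, mtu: int = MTU, header: int = IP_HEADER):
    """Return the number of data bytes carried by each fragment."""
    per_fragment = (mtu - header) // 8 * 8   # round down to a multiple of 8
    sizes = []
    remaining = data_len
    while remaining > per_fragment:
        sizes.append(per_fragment)
        remaining -= per_fragment
    sizes.append(remaining)                  # last fragment takes the rest
    return sizes


sizes = fragment_sizes(65515)
print(len(sizes), sizes[0], sizes[-1])       # 45 1480 395
```

So the largest possible IP packet needs 45 Ethernet frames: 44 fragments carrying 1480 bytes each and a final one carrying 395.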

3. Transport layer

The link layer gives each host an identity, the MAC address, and the network layer adds the IP address, which identifies the network segment the host is in. With these two addresses a packet can travel from one host to another. In reality, though, a packet is sent by an application on one host and received by an application on the other. A computer can run many applications at once, so when a packet arrives the host cannot tell, from the addresses alone, which application should receive it. To identify applications, UDP defines ports: each application on a host is given its own port number, and every packet on the network must carry port information. When the packet arrives, the host uses the port number to find the right application. Packets defined by UDP are called UDP packets and have the following structure:

A UDP packet consists of a header and data. The header is 8 bytes long and holds, most importantly, the source port and the destination port. The length field allows up to 65535 bytes for the whole packet, so the data can be at most 65527 bytes. (Carried in an IPv4 packet with a 20-byte header, the practical limit for the data is 65507 bytes.)

UDP is simple and easy to implement, but it has no acknowledgement mechanism: once a packet is sent, the sender cannot know whether the other side received it, so reliability is poor. TCP was created to solve this problem. TCP, the Transmission Control Protocol, is a connection-oriented, reliable, byte-stream-based protocol. Loosely speaking, TCP does what UDP does and adds acknowledgement: data that is sent must be acknowledged, and if a packet is lost and no acknowledgement arrives, the sender retransmits it.

To make transmission reliable, TCP requires a three-step exchange, the three-way handshake, before any data is sent; that is, a reliable connection must be established with the other side first. The full process is fairly involved, so here is a simplified picture:
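The 8-byte header can be sketched with struct. The four 16-bit fields are source port, destination port, length, and checksum; the function names and port numbers are made up for the example. The checksum is left at 0, which over IPv4 means "not computed", so this is a layout illustration and not a complete implementation.

```python
import struct

UDP_HEADER_FMT = "!HHHH"   # source port, destination port, length, checksum
UDP_HEADER_LEN = struct.calcsize(UDP_HEADER_FMT)   # 8 bytes


def build_udp(src_port: int, dst_port: int, data: bytes) -> bytes:
    """Prefix data with a UDP header (checksum deliberately left at 0)."""
    length = UDP_HEADER_LEN + len(data)
    if length > 65535:
        raise ValueError("UDP packet may not exceed 65535 bytes")
    return struct.pack(UDP_HEADER_FMT, src_port, dst_port, length, 0) + data


def parse_udp(packet: bytes):
    """Return (source port, destination port, data)."""
    src, dst, length, _checksum = struct.unpack(
        UDP_HEADER_FMT, packet[:UDP_HEADER_LEN]
    )
    return src, dst, packet[UDP_HEADER_LEN:length]


packet = build_udp(50000, 53, b"hello")
print(len(packet), parse_udp(packet))   # 13 (50000, 53, b'hello')
```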

Host A: I want to send you data, can I?

Host B: Yes, when will you send it?

Host A: I'll send it right away, you take it!

After these three exchanges, host A starts sending real data to host B. UDP, by contrast, is a connectionless protocol: it does not establish a connection with the other side and simply sends the packet. TCP detects lost packets and retransmits them, but this comes at a price. Compared with UDP, TCP is more complex to implement, uses more resources per connection, and is slower.

A TCP packet, like a UDP packet, consists of a header and data. One difference is that TCP places no limit on the total amount of data sent over a connection, since the data is treated as a continuous byte stream. A single TCP packet, however, still has to fit inside an IP packet, and for efficiency it is usually kept small enough that it does not have to be split further on the way.
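The three-step exchange between host A and host B can be written down as data. This sketch adds the standard TCP detail that each side picks an initial sequence number and acknowledges the other side's number plus one; the function name and the numbers 100 and 300 are arbitrary choices for illustration, and no real network traffic is involved.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three segments as (sender, flags, seq, ack) tuples."""
    syn = ("A", "SYN", client_isn, None)                     # A asks to connect
    syn_ack = ("B", "SYN+ACK", server_isn, client_isn + 1)   # B agrees
    ack = ("A", "ACK", client_isn + 1, server_isn + 1)       # A confirms
    return [syn, syn_ack, ack]


for sender, flags, seq, ack in three_way_handshake(100, 300):
    print(f"Host {sender}: {flags:8} seq={seq} ack={ack}")
```

Each acknowledgement number tells the other side which byte is expected next, which is the basis for detecting lost data later in the connection.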

To summarize, the main job of the transport layer is to define ports and identify application identities. TCP also ensures the reliability of data transmission.

4. Application layer

In theory, the three layers above are already enough to move data from an application on one host to an application on another. What arrives, however, is a raw byte stream that a program cannot readily interpret or work with. The application layer therefore defines protocols that standardize data formats, such as HTTP, FTP, and SMTP. HTTP is one of the most common application layer protocols and is mainly used for communication in browser/server (B/S) architectures. Its message format is as follows:

The request headers can state the data formats the client accepts (Accept), and the Content-Type header states the format of the data a message carries. With this convention, the receiver of a request knows which format to parse it in, processes the request, and returns data in a format the requester asked for. The requester then interprets the response according to the stated format.
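To make the role of these headers concrete, here is a minimal sketch that builds a request as text and reads the headers of a response. The helper names, the host example.com, and the sample response are made up for illustration; a real client would use an HTTP library and handle many more cases.

```python
def build_request(host: str, path: str, accept: str) -> str:
    """Compose a bare HTTP/1.1 GET request with an Accept header."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"Accept: {accept}",
        "",   # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


def parse_headers(message: str) -> dict:
    """Return the headers of a request or response, keyed by lowercase name."""
    head, _, _body = message.partition("\r\n\r\n")
    headers = {}
    for line in head.split("\r\n")[1:]:   # skip the request or status line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


request = build_request("example.com", "/users", "application/json")
response = "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n{}"

print(parse_headers(request)["accept"])          # application/json
print(parse_headers(response)["content-type"])   # application/json
```

Here the server's Content-Type matches what the client said it accepts, so the client knows to interpret the body as JSON.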

So the main job of the application layer is to define the data format and interpret the data according to the corresponding format.

5. Summary

Finally, let's review the responsibilities of each layer:

Link layer: groups 0s and 1s into data frames, identifies hosts by physical address, and transmits the data;

Network layer: defines IP addresses, determines the network location of a host, resolves MAC addresses from IP addresses, and routes and forwards packets bound for other networks;

Transport layer: defines ports, confirms the identity of applications on the host, and delivers packets to the corresponding applications;

Application layer: defines the data format and interprets the data according to the corresponding format.

To put these responsibilities in plain terms: when you enter a URL and press Enter, the application layer protocol first defines the format of the request. The transport layer protocol then adds the port numbers of both sides to identify the two applications. Next, the network layer protocol adds the IP addresses of both sides to identify their network locations. Finally, the link layer protocol adds the MAC addresses of both sides to identify the physical machines, groups the data into frames, and sends them over the transmission medium by broadcast. If the two hosts are on different network segments, the packet first goes to the gateway router and reaches the target host after being forwarded several times. The target host reassembles the data from the frames it receives, unwraps it layer by layer with the matching protocols, and finally the application layer protocol parses it and hands it to the server application for processing.

The above is a brief introduction to the four-layer TCP/IP model. In practice each layer contains many protocols, and each protocol does a great deal. Still, it helps to start with a clear overall structure and a grasp of the basic role of each layer, and then fill in the details.
