2025-01-31 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains what happens between entering a URL and the web page being displayed. Many people run into difficulty with this topic in practice, so let's walk through it step by step.
As an example, let's explore what happens in the simplified network topology below.
Simple network model
01 Lonely Little Brother--HTTP
The browser's first step is to parse the URL and generate the request message to send to the Web server.
Let's look at what each element in a long URL represents, as shown in the following figure:
HTTP message format
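The parsing step can be sketched in a few lines of Python. This is only an illustration: the path /dir1/file1.html is a made-up example, and the request built here is a minimal one, not what a real browser sends.

```python
from urllib.parse import urlsplit

def build_http_request(url: str) -> str:
    """Parse a URL and build a minimal HTTP/1.1 GET request message."""
    parts = urlsplit(url)
    path = parts.path or "/"          # an empty path means the root resource
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",       # request line: method, path, version
        f"Host: {parts.netloc}",      # Host header carries the server name
        "Connection: close",
        "",                           # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)

# Hypothetical URL on the example host used in this article.
request = build_http_request("http://www.server.com/dir1/file1.html")
print(request)
```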
A lonely HTTP packet says: "I am such a small packet, with no friends, sent straight out into the vast network. Who will know me? Who can give me a lift? Who can protect me? Where is my destination?" Full of questions, it did not hesitate and still set out on its journey!
02 Real address query--DNS
After the browser parses the URL and generates the HTTP message, it needs to ask the operating system to send the message to the Web server.
But before sending, there is one more thing to do: look up the IP address that corresponds to the server's domain name, because when the browser asks the operating system to send a message, it must provide the IP address of the communication peer.
For example, when we make a phone call we must know the other party's phone number, but because phone numbers are hard to remember, we usually save the name together with the number in the address book.
So there is a kind of server that specifically stores the mapping between Web server domain names and IP addresses: the DNS server.
Hierarchy of domain names
Domain names in DNS are separated by periods, such as www.server.com, where periods represent boundaries between different levels.
In a domain name, the further to the right a label is, the higher its level.
After all, domain names were invented abroad, so the order is the opposite of the Chinese way of thinking. For example, when giving a location, other countries tend to go from small to large (such as XX Street, XX District, XX City, XX Province), while China goes from large to small (such as XX Province, XX City, XX District, XX Street).
The root domain is at the top level, the next level below it is the com top-level domain, and below it is server.com.
So the hierarchy of domain names resembles a tree structure:
Root DNS server
Top-level domain DNS server (com)
Authoritative DNS server (server.com)
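To make the right-to-left hierarchy concrete, here is a small sketch that lists the levels of a domain name from the root downward. The function name domain_levels is made up for this illustration.

```python
def domain_levels(name: str) -> list[str]:
    """Return the levels of a domain name, from the root down to the full name."""
    labels = name.rstrip(".").split(".")
    levels = ["."]                            # the root domain sits at the top
    for i in range(len(labels) - 1, -1, -1):  # walk the labels from right to left
        levels.append(".".join(labels[i:]))
    return levels

levels = domain_levels("www.server.com")
print(levels)  # ['.', 'com', 'server.com', 'www.server.com']
```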
Domain name resolution workflow
The DNS resolution process is quite interesting. It resembles asking for directions in daily life: each server only points the way, it does not lead you there.
The packet says: "Big Brother DNS is awesome, we have found our destination! But I am still confused. I want to be sent out, so who do I need next?"
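The "ask for directions" idea can be sketched with a toy simulation. Everything here is hypothetical: the server names, the lookup tables, and the address 192.0.2.10 (taken from a range reserved for documentation). Real resolution goes over the network and involves caching, which this sketch ignores.

```python
# Toy "servers": each one only knows who to ask next (hypothetical data).
ROOT = {"com": "tld-com"}
SERVERS = {
    "tld-com": {"server.com": "auth-server-com"},
    "auth-server-com": {"www.server.com": "192.0.2.10"},
}

def resolve(name: str) -> tuple[str, list[str]]:
    """Follow referrals from the root to the authoritative server."""
    labels = name.split(".")
    asked = ["root"]
    # The root server points to the top-level domain server.
    tld_server = ROOT[labels[-1]]
    asked.append(tld_server)
    # The top-level domain server points to the authoritative server.
    auth_server = SERVERS[tld_server][".".join(labels[-2:])]
    asked.append(auth_server)
    # The authoritative server finally answers with the IP address.
    return SERVERS[auth_server][name], asked

ip, asked = resolve("www.server.com")
print(ip, asked)
```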
03 Good helper guide--protocol stack
After obtaining the IP address through DNS, the work of transmitting the HTTP message can be handed over to the protocol stack in the operating system.
The protocol stack is internally divided into several parts, each responsible for different tasks. The parts follow a fixed relationship: an upper part delegates work to the part below it, and the lower part receives the delegated work and carries it out.
TCP Header Format
First of all, the source port number and the destination port number are indispensable. Without these two port numbers, there is no way to know which application the data should be delivered to.
Then there is the sequence number of the packet, which solves the problem of packets arriving out of order.
There is also an acknowledgment number, whose purpose is to confirm whether the other party has received the data. If it has not been received, the data is sent again until it is delivered. This solves the problem of packet loss.
Then there are some flag bits. For example, SYN initiates a connection, ACK acknowledges, RST resets the connection, and FIN ends a connection. TCP is connection-oriented, so both parties maintain the state of the connection, and sending packets that carry these flags changes the state on both sides.
Another important field is the window size. TCP performs flow control: each side declares a window (its buffer size) to indicate its current processing capacity. Don't send too fast and overwhelm me, and don't send too slowly and starve me.
In addition to flow control, TCP also performs congestion control. It can do nothing about the actual traffic jam on the network; the only thing it can control is itself, that is, its own sending rate. If you can't change the world, change yourself.
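The fields described above can be packed into a 20-byte TCP header with Python's struct module. This is a minimal sketch: the port 50000 and the sequence number 1000 are arbitrary example values, TCP options are omitted, and the checksum is left at zero rather than computed.

```python
import struct

# Flag bits in the TCP header.
FIN, SYN, RST, ACK = 0x01, 0x02, 0x04, 0x10

def build_tcp_header(src_port, dst_port, seq, ack, flags, window):
    """Pack a TCP header without options (checksum not computed)."""
    data_offset = 5                                 # header length in 32-bit words
    offset_and_flags = (data_offset << 12) | flags  # 4-bit offset, then the flag bits
    return struct.pack(
        "!HHIIHHHH",
        src_port, dst_port,      # which applications are talking
        seq,                     # sequence number: restores packet order
        ack,                     # acknowledgment number: confirms receipt
        offset_and_flags,
        window,                  # window size: flow control
        0,                       # checksum (left as 0 in this sketch)
        0,                       # urgent pointer
    )

# A SYN segment from a browser port (example value) to HTTP port 80.
header = build_tcp_header(50000, 80, 1000, 0, SYN, 65535)
src, dst, seq, ack, off_flags, window, _, _ = struct.unpack("!HHIIHHHH", header)
print(len(header), src, dst, bool(off_flags & SYN))
```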
Before HTTP can transmit data, a TCP connection must first be established. Establishing a TCP connection is usually called the three-way handshake.
This so-called "connection" is just a state machine maintained on both computers. The state changes on both sides while the connection is being established are shown in the timing diagram below.
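As a rough illustration of "a state machine on both sides", the sketch below steps a client and a server through the three-way handshake. The state names follow the usual TCP naming; the event strings and the handshake function are invented for this example, and no real packets are sent.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("CLOSED", "send SYN"): "SYN_SENT",
    ("LISTEN", "recv SYN, send SYN+ACK"): "SYN_RCVD",
    ("SYN_SENT", "recv SYN+ACK, send ACK"): "ESTABLISHED",
    ("SYN_RCVD", "recv ACK"): "ESTABLISHED",
}

def handshake():
    """Walk both endpoints through the three-way handshake."""
    state = {"client": "CLOSED", "server": "LISTEN"}
    steps = [
        ("client", "send SYN"),                # first handshake
        ("server", "recv SYN, send SYN+ACK"),  # second handshake
        ("client", "recv SYN+ACK, send ACK"),  # third handshake
        ("server", "recv ACK"),
    ]
    trace = []
    for side, event in steps:
        state[side] = TRANSITIONS[(state[side], event)]
        trace.append((side, event, state[side]))
    return state, trace

final_state, trace = handshake()
for side, event, new_state in trace:
    print(f"{side}: {event} -> {new_state}")
```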
TCP split data
If the HTTP request message is longer than the MSS, TCP needs to send HTTP data in pieces rather than sending all the data at once.
packet dividing
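Splitting by MSS amounts to cutting the data into fixed-size pieces. In the sketch below, the MSS of 1460 bytes is an assumed value (a common one on Ethernet), and the 3000-byte message is dummy data.

```python
def split_into_segments(data: bytes, mss: int) -> list[bytes]:
    """Cut the data into pieces of at most mss bytes each."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]

http_message = b"x" * 3000   # dummy HTTP message longer than one MSS
segments = split_into_segments(http_message, 1460)
print([len(s) for s in segments])  # [1460, 1460, 80]
```

Each piece then gets its own TCP header before being handed to the layer below.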
TCP message generation
A TCP segment carries two port numbers: one for the browser (usually randomly generated) and one for the Web server (the default port is 80 for HTTP and 443 for HTTPS).
After the two parties establish a connection, the data part of the TCP segment holds the HTTP header and data. Once the TCP segment is assembled, it is handed over to the network layer below for processing.
So far, the network packet message is as shown in the figure below.
IP Header Format
IP protocol requires source IP address and destination IP address:
The source IP address, that is, the IP address of the client sending the packet;
The destination IP address, that is, the IP address of the Web server obtained through DNS resolution.
Because HTTP is transmitted through TCP, the protocol number in the IP header should be filled in as 06 (hexadecimal), indicating that the protocol is TCP.
Suppose the client has multiple network cards and therefore multiple IP addresses. Which IP address should be used as the source address in the IP header?
With multiple network cards, filling in the source IP address means deciding which address to use. This decision is equivalent to deciding which network card should send the packet.
At this point, the routing table rules determine which network card's IP address is used as the source.
On Linux operating systems, we can use the route -n command to view the routing table of the current system.
routing rule judgment
First, the destination IP address is ANDed with the subnet mask (Genmask) of the first entry. The result is 192.168.10.0, but the Destination of the first entry is 192.168.3.0. The two differ, so the match fails.
Next, the destination IP address is ANDed with the subnet mask of the second entry. The result is 192.168.10.0, which matches the second entry's Destination of 192.168.10.0, so the IP address of the eth2 network card is used as the source address in the IP header.
Now suppose the Web server's address is 10.100.20.100. Applying the same routing table rules, it ends up matching the third entry.
The third entry is special. Its destination address and subnet mask are both 0.0.0.0, which denotes the default gateway. If no other entry matches, this entry matches automatically. The packet is then sent to the router, and the Gateway column holds the router's IP address.
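The matching procedure can be sketched as follows. The table is an assumed reconstruction: only the Destination values 192.168.3.0, 192.168.10.0 and 0.0.0.0 and the interface eth2 come from the text, while the 255.255.255.0 masks, the eth1 interface, the gateway 192.168.10.1 and the sample destination 192.168.10.200 are hypothetical. The sketch checks entries from top to bottom as described above; a real kernel prefers the most specific matching route.

```python
import ipaddress

# (Destination, Genmask, Gateway, Iface); partly hypothetical values.
ROUTES = [
    ("192.168.3.0", "255.255.255.0", "0.0.0.0", "eth1"),
    ("192.168.10.0", "255.255.255.0", "0.0.0.0", "eth2"),
    ("0.0.0.0", "0.0.0.0", "192.168.10.1", "eth2"),   # default gateway
]

def to_int(ip: str) -> int:
    return int(ipaddress.IPv4Address(ip))

def lookup(dest_ip: str):
    """Return the first entry whose Destination equals dest_ip AND Genmask."""
    dest = to_int(dest_ip)
    for destination, genmask, gateway, iface in ROUTES:
        if (dest & to_int(genmask)) == to_int(destination):
            return destination, gateway, iface
    return None

print(lookup("192.168.10.200"))  # matches the second entry
print(lookup("10.100.20.100"))   # falls through to the default gateway
```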
IP message generation
So far, the network packet message is shown in the figure below.
MAC Header Format
The MAC header requires the sender MAC address and the receiver destination MAC address for transmission between the two points.
Generally, in TCP/IP communication, the protocol type of MAC header only uses:
0800: IP Protocol
0806: ARP Protocol
How are the sender and receiver MAC addresses determined?
The sender's MAC address is relatively simple to obtain. It is written into the network card's ROM when the card is manufactured, so it only needs to be read out and written into the MAC header.
The receiver's MAC address is a bit more complicated. As long as we tell Ethernet the other party's MAC address, Ethernet will deliver the packet for us, so the other party's MAC address clearly belongs here.
So first we have to figure out who to send the packet to, which we can find out by checking the routing table. Find the matching entry in the routing table and send the packet to the IP address in its Gateway column.
Now that we know who to send it to, how do we get that party's MAC address?
Don't know the other party's MAC address? Then just shout and ask.
This is where the ARP protocol comes in: it helps us find the MAC address of the router.
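The "shout" is an ARP request broadcast to the whole subnet. The sketch below builds the bytes of such a request, combining the MAC header (protocol type 0806) with the ARP payload. The sender MAC and both IP addresses are made-up example values, and the frame is only constructed, not sent.

```python
import struct

def mac_to_bytes(mac: str) -> bytes:
    return bytes.fromhex(mac.replace(":", "").replace("-", ""))

def ip_to_bytes(ip: str) -> bytes:
    return bytes(int(part) for part in ip.split("."))

def build_arp_request(src_mac: str, src_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP request (who has target_ip?)."""
    broadcast = b"\xff" * 6                 # ask everyone on the subnet
    mac_header = broadcast + mac_to_bytes(src_mac) + struct.pack("!H", 0x0806)
    arp = struct.pack("!HHBBH",
                      1,        # hardware type: Ethernet
                      0x0800,   # protocol type: IP
                      6, 4,     # MAC and IP address lengths
                      1)        # operation: request
    arp += mac_to_bytes(src_mac) + ip_to_bytes(src_ip)
    arp += b"\x00" * 6 + ip_to_bytes(target_ip)   # target MAC is still unknown
    return mac_header + arp

# Hypothetical addresses: the client asks for the router's MAC address.
frame = build_arp_request("00-16-3E-00-00-01", "192.168.10.200", "192.168.10.1")
print(len(frame), frame[12:14].hex())
```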
MAC layer message
With its MAC header added, the packet is very grateful and says: "Thank you, MAC boss, now I know where to go next! I have plenty of header brothers with me now, and I believe I can reach my final destination!" Accompanied by its many header brothers, the packet is finally ready to go out.
07 Exit--NIC
The network packet generated by the IP module is just a string of binary digital information stored in memory, and it cannot be sent to the other party directly. We therefore need to convert the digital information into electrical signals before it can travel over the network cable. This is the real data transmission process.
The network card performs this operation, and controlling the network card relies on the network card driver.
After the network card driver gets the packet from the IP module, it copies it into a buffer inside the network card, then adds a preamble and a start frame delimiter at the beginning and a frame check sequence (FCS) for error detection at the end.
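What gets added around the frame can be sketched in Python. Ethernet's FCS is a CRC-32, which zlib.crc32 computes, and the preamble and start frame delimiter are fixed bit patterns. The function names here are made up, the payload is dummy data, and in practice this work is done by the network card hardware.

```python
import zlib

PREAMBLE = b"\x55" * 7   # alternating bits that let the receiver synchronize
SFD = b"\xd5"            # start frame delimiter: the frame begins after this

def add_fcs(frame: bytes) -> bytes:
    """Append the 4-byte frame check sequence (CRC-32) to the frame."""
    fcs = zlib.crc32(frame) & 0xFFFFFFFF
    return frame + fcs.to_bytes(4, "little")

def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC on the receiving side and compare it with the FCS."""
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return (zlib.crc32(body) & 0xFFFFFFFF).to_bytes(4, "little") == fcs

def to_wire(frame: bytes) -> bytes:
    return PREAMBLE + SFD + add_fcs(frame)

frame = b"MAC header + IP header + TCP header + HTTP data"   # dummy content
checked = add_fcs(frame)
corrupted = bytes([checked[0] ^ 0x01]) + checked[1:]          # flip one bit
print(fcs_ok(checked), fcs_ok(corrupted))  # True False
```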
the switch MAC address table
For example, if the MAC address of the receiver of the received packet is 00-02-B3-1C-9C-F9, it matches the third row in the table in the figure. According to the information in the port column, we know that this address is located on port 3, and then the packet can be sent to the corresponding port through the switching circuit.
So, the switch looks up the MAC address from the MAC address table and sends the signal to the appropriate port.
What happens when the MAC address table cannot find the specified MAC address?
The specified MAC address could not be found in the address table. This may be because the device with the address has not sent packets to the switch, or because the device has not been working for a while and the address has been removed from the address table.
In this case, the switch cannot determine which port to forward the packet to, but can only forward the packet to all ports except the source port, regardless of which port the device is connected to.
This is not a problem because Ethernet is designed to send packets throughout the network, and then only the appropriate receiver receives the packet, while other devices ignore it.
Some people ask: "Won't this send extra packets and cause congestion?"
In fact, there is no need to worry too much. After the packet is sent, the target device responds, and as soon as the response packet comes back, the switch can write the device's address into the MAC address table. The next time, it no longer needs to send the packet to all ports.
A local area network can transmit thousands of packets per second, so one or two extra packets won't hurt.
Also, if the receiver MAC address is a broadcast address, the switch sends the packet to all ports except the source port.
The following two are broadcast addresses:
FF:FF:FF:FF:FF:FF in MAC addresses
255.255.255.255 in IP address
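The switch behaviour described above (look up the table, flood when the address is unknown or is the broadcast address, learn from the frames that arrive) can be sketched as a small class. The Switch class and the client address 00-16-3E-00-00-01 are made up for illustration; the address 00-02-B3-1C-9C-F9 on port 3 is the one from the example above.

```python
BROADCAST = "FF-FF-FF-FF-FF-FF"

class Switch:
    def __init__(self, num_ports: int):
        self.ports = list(range(1, num_ports + 1))
        self.mac_table = {}                      # MAC address -> port

    def receive(self, in_port: int, src_mac: str, dst_mac: str) -> list[int]:
        """Return the ports the frame is forwarded to."""
        self.mac_table[src_mac] = in_port        # learn where the sender lives
        if dst_mac != BROADCAST and dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]     # known address: one port only
        # Unknown or broadcast address: all ports except the source port.
        return [p for p in self.ports if p != in_port]

sw = Switch(4)
client, server = "00-16-3E-00-00-01", "00-02-B3-1C-9C-F9"
first = sw.receive(1, client, server)    # server not in the table yet: flood
reply = sw.receive(3, server, client)    # the response teaches the switch port 3
second = sw.receive(1, client, server)   # now forwarded to port 3 only
print(first, reply, second)  # [2, 3, 4] [1] [3]
```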
The packet is forwarded by the switch and arrives at the router, ready to leave the local subnet. At this point, the packet says to the switch: "Thank you, switch brother, for forwarding me to the exit gate. I am off on a long journey!"
09 Exit Gate--Router
The difference between routers and switches
After passing through the switch, the network packet now arrives at the router, where it is forwarded to the next router or destination device.
The forwarding in this step works similarly to a switch: the router also determines where to forward the packet by looking up a table.
However, in how they actually operate, routers and switches differ.
A router is designed around IP and is commonly called a Layer 3 network device. Each port of a router has both a MAC address and an IP address.
A switch is designed around Ethernet and is commonly called a Layer 2 network device. The ports of a switch do not have MAC addresses.
Router Basics
A router port has a MAC address, so it can act as both a sender and a receiver on Ethernet. It also has an IP address, and in this sense it is the same as a computer's network card.
When forwarding a packet, the router port first receives the Ethernet frame addressed to it, then the router looks up the forwarding target in the routing table, and then the corresponding port sends the Ethernet frame out as the sender.
Packet receiving operation of router
First, the electrical signal arrives at the network interface, and a module in the router converts the electrical signal into a digital signal, which is then checked for errors using the FCS at the end of the packet.
If there is no problem, the router checks the receiver MAC address in the MAC header to see whether the packet is addressed to itself. If it is, the packet is put into the receive buffer; otherwise it is discarded.
In short, each router port has a MAC address. The port only receives packets that match its own address and discards those that do not.
Query routing table to determine output port
After the packet is received, the router removes the MAC header at the beginning of the packet.
The MAC header is used to send packets to the router, where the MAC address of the receiver is the MAC address of the router port. Therefore, when the packet arrives at the router, the MAC header's job is done, and the MAC header is discarded.
Next, the router forwards the packet based on the contents of the IP header behind the MAC header.
The forwarding operation is divided into several stages. The first is to query the routing table to determine the forwarding target.
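Putting the receiving and forwarding steps together, a router's handling of one frame might be sketched like this. Frames and packets are modelled as plain dictionaries, and every address, port number and table entry below is hypothetical; the point is only the order of operations: check the MAC address, drop the MAC header, look up the routing table, and send with a new MAC header.

```python
import ipaddress

def forward(frame: dict, router: dict):
    """Handle one received frame; return the outgoing frame or None if dropped."""
    # 1. Accept only frames addressed to the receiving port's MAC address.
    if frame["dst_mac"] != router["port_macs"][frame["in_port"]]:
        return None
    # 2. The MAC header has done its job: keep only the IP packet.
    packet = frame["ip_packet"]
    # 3. Query the routing table with the destination IP address.
    dst = ipaddress.ip_address(packet["dst_ip"])
    for network, gateway, out_port in router["routes"]:
        if dst in ipaddress.ip_network(network):
            next_hop = gateway or packet["dst_ip"]   # no gateway: deliver directly
            # 4. The output port sends the packet as the sender, with a new MAC header.
            return {
                "out_port": out_port,
                "src_mac": router["port_macs"][out_port],
                "dst_mac": router["arp_cache"][next_hop],
                "ip_packet": packet,
            }
    return None

router = {
    "port_macs": {1: "AA-AA-AA-AA-AA-01", 2: "AA-AA-AA-AA-AA-02"},
    "routes": [("192.168.10.0/24", None, 1), ("0.0.0.0/0", "203.0.113.1", 2)],
    "arp_cache": {"203.0.113.1": "BB-BB-BB-BB-BB-01"},
}
incoming = {"in_port": 1, "dst_mac": "AA-AA-AA-AA-AA-01",
            "ip_packet": {"src_ip": "192.168.10.200", "dst_ip": "10.100.20.100"}}
outgoing = forward(incoming, router)
print(outgoing)
```

Note that the IP packet itself is unchanged; only the MAC header is replaced at each hop.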
Layer-by-layer unwrapping model
After the packet arrives at the server, the server first opens the MAC header of the packet to see whether it matches the server's own MAC address. If it matches, the packet is accepted.
It then opens the IP header of the packet and finds that the IP address matches. From the protocol field in the IP header, it knows that the upper layer is TCP.
So it opens the TCP header, which contains a sequence number. The server checks whether this is the sequence number it expects. If so, it puts the data into the buffer and returns an ACK; if not, it discards the packet. The TCP header also contains the port number, and the HTTP server is listening on that port.
The server therefore knows that the HTTP process wants this packet and hands it over to the HTTP process.
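The unwrapping can be sketched at the byte level. The peel function below is an illustration only: it assumes an Ethernet frame without FCS, leaves all checksums unverified, skips the sequence number check, and uses made-up addresses and ports apart from the example values that appear in this article.

```python
import struct

def peel(frame: bytes, my_mac: bytes, my_ip: bytes):
    """Unwrap the MAC, IP and TCP headers; return (dst_port, seq, data) or None."""
    # MAC header: receiver MAC (6 bytes), sender MAC (6 bytes), protocol type (2 bytes).
    dst_mac, _src_mac, proto_type = struct.unpack("!6s6sH", frame[:14])
    if dst_mac != my_mac or proto_type != 0x0800:
        return None                       # not for this machine, or not IP
    ip = frame[14:]
    ip_header_len = (ip[0] & 0x0F) * 4    # header length is given in 32-bit words
    total_len = struct.unpack("!H", ip[2:4])[0]
    protocol, dst_ip = ip[9], ip[16:20]
    if dst_ip != my_ip or protocol != 6:  # 6 means the upper layer is TCP
        return None
    tcp = ip[ip_header_len:total_len]
    _src_port, dst_port, seq = struct.unpack("!HHI", tcp[:8])
    tcp_header_len = (tcp[12] >> 4) * 4
    return dst_port, seq, tcp[tcp_header_len:]

# Build a sample frame to unwrap (hypothetical client addresses and ports).
server_mac = bytes.fromhex("0002b31c9cf9")
server_ip, client_ip = bytes([10, 100, 20, 100]), bytes([192, 168, 10, 200])
http = b"GET / HTTP/1.1\r\nHost: www.server.com\r\n\r\n"
tcp_hdr = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | 0x18, 65535, 0, 0)
ip_hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40 + len(http), 0, 0, 64, 6, 0,
                     client_ip, server_ip)
mac_hdr = server_mac + bytes.fromhex("00163e000001") + struct.pack("!H", 0x0800)
result = peel(mac_hdr + ip_hdr + tcp_hdr + http, server_mac, server_ip)
print(result)
```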
The HTTP process on the server sees that the request is for a page, so it wraps the page in an HTTP response message.
The HTTP response message also needs to put on TCP, IP and MAC headers, but this time the source address is the server's IP address and the destination address is the client's IP address.
With its headers on, the packet leaves through the network card and is handed to the switch, which forwards it to the router at the city gate. The router sends the response packet to the next router, and so it hops from router to router.
Finally it reaches the router guarding the client's city gate. The router opens the IP header, finds that the packet is looking for someone inside the city, and sends it to the switch in the city, which forwards it to the client.
When the client receives the response packet from the server, it is just as happy: it can finally open its parcel!
So the client peels the layers off the received packet until only the HTTP response message is left, and hands it to the browser to render the page. That is how one very special parcel ends up on display!
Finally, when the client wants to leave, it goes through TCP's four-way wave with the server, and the connection between the two parties is closed.