2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article walks through what happens after you enter a URL in the browser address bar, step by step, from parsing the URL to rendering the page.
1. Parsing URL
Some readers confuse the concepts of domain name and URL. The URL is the full address we type into the address bar, and it contains the domain name. For example, www.baidu.com/veal98 is a URL, and www.baidu.com is the domain name of the server.
A URL is made up of the following elements (the path of the requested file can be omitted):
The file path on the target server requested by this URL is:
So the browser's first step is to parse the URL: it separates the domain name from the requested resource so that it knows which server to contact and which resource to request from it.
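As a rough illustration of this parsing step, the sketch below uses Python's standard `urllib.parse.urlsplit`. The `https://` scheme, the `lang=en` query string, and the helper name `parse_url` are assumptions added for the example; a real browser's parser handles many more cases.

```python
from urllib.parse import urlsplit

def parse_url(url: str) -> dict:
    """Split a URL into the pieces needed before a request can be sent."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # protocol, e.g. "https"
        "host": parts.hostname,      # domain name of the server
        "port": parts.port,          # None when the URL does not give one
        "path": parts.path or "/",   # resource requested on the server
        "query": parts.query,        # parameters after "?"
    }

result = parse_url("https://www.baidu.com/veal98?lang=en")
print(result)
```

Note that `urlsplit` only recognizes the host when the URL includes a scheme, which is why the example adds one.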
2. Browser encapsulates the HTTP request message
After parsing the URL, the browser knows the target server and the file name. It then needs to "encapsulate" this information into an HTTP request message to send out. Here is an example of an HTTP request message:
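A minimal sketch of what such a message can look like, built by hand in Python. The function name `build_http_request` and the `User-Agent` value are made up for illustration; real browsers send many more headers.

```python
def build_http_request(host: str, path: str = "/", method: str = "GET") -> bytes:
    """Build a minimal HTTP/1.1 request message as raw bytes."""
    lines = [
        f"{method} {path} HTTP/1.1",        # request line: method, path, version
        f"Host: {host}",                    # required header in HTTP/1.1
        "User-Agent: example-client/0.1",   # hypothetical client name
        "Accept: */*",
        "Connection: close",
    ]
    # Each line ends with CRLF; an empty line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

request = build_http_request("www.baidu.com", "/veal98")
print(request.decode("ascii"))
```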
❝
For more information on the HTTP protocol, see the earlier article on the past and present of the HTTP protocol. I will not repeat it here.
❞
A word on "encapsulation", a concept that runs through the whole of computer networking. When the sender passes data down through the layers, each layer prepends its own header to the data. Conversely, when the receiver passes data up through the layers, each layer strips the header that belongs to it.
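The idea can be sketched in a few lines of Python. The bracketed text headers such as `[TCP]` are placeholders invented for this example; real headers are binary structures with many fields.

```python
# Hypothetical one-word headers, listed from the top of the stack down.
LAYERS = ["TCP", "IP", "ETH"]

def encapsulate(payload: bytes) -> bytes:
    """Sender side: each layer prepends its own header on the way down."""
    for layer in LAYERS:
        payload = f"[{layer}]".encode() + payload
    return payload

def decapsulate(frame: bytes) -> bytes:
    """Receiver side: each layer strips its own header on the way up."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame

frame = encapsulate(b"GET / HTTP/1.1")
print(frame)                # outermost header belongs to the lowest layer
print(decapsulate(frame))   # original application data
```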
3. DNS domain name resolution to obtain IP address
After the HTTP request message has been encapsulated, one more preparation remains: obtaining the IP address of the target server.
In theory, once the domain name has been parsed out of the URL, the browser already knows who the target server is. In practice, the domain name is not the real address of the target server. Every computer on the Internet is identified by an IP address, but IP addresses are hard to remember, which is why domain names were designed.
So the domain name must be resolved to obtain the IP address of the target server. Otherwise, all we have is an easy-to-remember name and no idea where to send the request. Obtaining the IP address from a domain name is the job of the DNS protocol, as follows:
❝
For more details on DNS, such as what domain names and domain name servers are and how recursive and iterative queries work, see the detailed article on DNS protocol resolution. Only the DNS resolution process is listed here.
❞
1) First, search the "browser's DNS cache", which maintains a table mapping domain names to IP addresses.
2) On a miss, continue with the "operating system's DNS cache".
3) If that also misses, the operating system sends the domain name to the "local domain name server", which checks its own DNS cache and returns the result on a hit (note: the query between the host and the local domain name server is a "recursive query").
4) If the local domain name server's cache also misses, it queries the higher-level domain name servers with an "iterative query" as follows (note: queries between the local domain name server and other domain name servers are iterative, to avoid putting excessive load on the root domain name servers):
The local domain name server first sends a request to the "root domain name server". The root domain name server sits at the highest level; it does not return the IP address for the domain name directly, but returns the address of the "top-level domain name server", in effect pointing the local domain name server to where it should look next. The local domain name server then queries the top-level domain name server, which returns the address of the "authoritative domain name server". Finally, the local domain name server sends a request to the authoritative domain name server and obtains the IP address for the domain name.
5) The local domain name server returns the IP address to the operating system and caches it.
6) The operating system returns the IP address to the browser and caches it.
7) At this point, the browser has the IP address for the domain name and caches it.
Follow the figure below for intuitive understanding:
Note that DNS queries are normally carried over UDP, which means the requests above are forwarded using the connectionless UDP protocol.
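The lookup order described in this section can be sketched as an in-memory simulation. Everything here is made up for illustration: the server names, the domain `www.example.com`, and the address `192.0.2.10` (taken from a range reserved for documentation). No real network queries are performed.

```python
# Hypothetical server tables: root -> top-level -> authoritative.
ROOT_SERVER = {"com": "tld-server"}
TLD_SERVERS = {"tld-server": {"example.com": "auth-server"}}
AUTH_SERVERS = {"auth-server": {"www.example.com": "192.0.2.10"}}

def iterative_query(domain: str) -> str:
    """Local DNS server asks root, then top-level, then authoritative server."""
    labels = domain.split(".")
    tld_server = ROOT_SERVER[labels[-1]]
    auth_server = TLD_SERVERS[tld_server][".".join(labels[-2:])]
    return AUTH_SERVERS[auth_server][domain]

def resolve(domain: str, caches: list) -> str:
    """caches is ordered: browser, operating system, local DNS server."""
    missed = []
    for cache in caches:
        if domain in cache:
            ip = cache[domain]
            break
        missed.append(cache)
    else:
        ip = iterative_query(domain)   # every cache missed
    for cache in missed:               # each level that missed stores the answer
        cache[domain] = ip
    return ip

browser, os_cache, local_dns = {}, {}, {}
print(resolve("www.example.com", [browser, os_cache, local_dns]))
print(browser)   # the answer is now cached at the browser level too
```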
4. Establish a TCP connection
After obtaining the IP address of the target server, the browser knows where to send the request and can start sending the encapsulated HTTP request message. Before any data is sent, TCP must establish a reliable connection between the browser and the server through a three-way handshake, which "confirms that both sides are able to send and receive reliably".
❝
Here is another classic interview question: the TCP three-way handshake and four-way wave. For details, see the article on the TCP three-way handshake and four-way wave, which gives a full-marks answer.
❞
The three-way handshake process is shown below:
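As a rough sketch of the exchange, the function below lists the three segments and how the sequence and acknowledgment numbers relate. The initial sequence numbers 100 and 300 are arbitrary values chosen for the example; real stacks pick them unpredictably.

```python
def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments exchanged to open a TCP connection."""
    # 1) Client -> server: SYN carrying the client's initial sequence number.
    syn = {"from": "client", "flags": "SYN", "seq": client_isn, "ack": None}
    # 2) Server -> client: SYN+ACK, acknowledging client_isn + 1.
    syn_ack = {"from": "server", "flags": "SYN+ACK",
               "seq": server_isn, "ack": syn["seq"] + 1}
    # 3) Client -> server: ACK, acknowledging server_isn + 1.
    ack = {"from": "client", "flags": "ACK",
           "seq": syn["seq"] + 1, "ack": syn_ack["seq"] + 1}
    return [syn, syn_ack, ack]

segments_sent = three_way_handshake(client_isn=100, server_isn=300)
for seg in segments_sent:
    print(seg)
```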
5. Browser sends request
After the TCP three-way handshake is completed, a reliable virtual channel is established between the browser and the target server, so the browser can send its own HTTP request.
Note that HTTP request and response messages travelling over the TCP connection can be fairly large. To transmit them conveniently, accurately, and reliably, "TCP splits the HTTP message into several segments identified by sequence number and adds a TCP header to each before sending them separately". After receiving the segments, the receiver reassembles the HTTP message in the original order according to the sequence numbers.
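A simplified sketch of splitting and reassembly is shown below. It assumes sequence numbers are plain byte offsets starting at 0 and uses a tiny segment size of 8 bytes so the effect is visible; real TCP starts from a negotiated initial sequence number and uses much larger segments.

```python
import random

def segment(message: bytes, mss: int) -> list:
    """Split a message into (sequence_number, chunk) pairs of at most mss bytes."""
    return [(offset, message[offset:offset + mss])
            for offset in range(0, len(message), mss)]

def reassemble(segments: list) -> bytes:
    """Restore the original byte order by sorting on sequence number."""
    return b"".join(chunk for _, chunk in sorted(segments))

message = b"GET /veal98 HTTP/1.1\r\nHost: www.baidu.com\r\n\r\n"
segments = segment(message, mss=8)
random.shuffle(segments)           # simulate out-of-order arrival
restored = reassemble(segments)
print(restored == message)
```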
6. IP protocol responsible for transmission
In fact, whether TCP is establishing a connection with the three-way handshake, closing it with the four-way wave, or sending and receiving data (TCP segments) while the connection is open, everything is carried by the IP protocol. The IP protocol adds an IP header to the data from each of these stages and encapsulates it into an IP datagram for transmission.
The header of the IP datagram contains a "source IP address" and a "destination IP address". The source IP address is the sender's IP address; the destination IP address is the IP address of the target server obtained through DNS resolution.
In fact, "the network layer, where the IP protocol lives, determines the path (route) by which the datagram reaches the other computer and delivers it to the other side." If this sentence is unclear, a detailed explanation follows shortly; read on.
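To make the header layout concrete, here is a sketch that packs a 20-byte IPv4 header (no options) with the standard `struct` and `socket` modules and reads the two addresses back. The addresses come from documentation-only ranges, the helper names are made up, and the checksum is left at 0, so this is not a header a real network stack would accept.

```python
import socket
import struct

# version/IHL, DSCP/ECN, total length, identification, flags/fragment offset,
# TTL, protocol, checksum, source address, destination address.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")

def build_ipv4_header(src: str, dst: str, payload_len: int) -> bytes:
    version_ihl = (4 << 4) | 5            # IPv4, header length 5 * 4 = 20 bytes
    return IPV4_HEADER.pack(
        version_ihl,
        0,                                # DSCP/ECN
        IPV4_HEADER.size + payload_len,   # total length of the datagram
        0,                                # identification
        0,                                # flags and fragment offset
        64,                               # time to live
        socket.IPPROTO_TCP,               # payload is a TCP segment
        0,                                # checksum omitted in this sketch
        socket.inet_aton(src),            # source IP address
        socket.inet_aton(dst),            # destination IP address
    )

def read_addresses(header: bytes) -> tuple:
    """Return (source IP, destination IP) from a packed IPv4 header."""
    fields = IPV4_HEADER.unpack(header[:IPV4_HEADER.size])
    return socket.inet_ntoa(fields[8]), socket.inet_ntoa(fields[9])

header = build_ipv4_header("192.0.2.1", "198.51.100.7", payload_len=100)
print(read_addresses(header))
```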
7. Using the ARP protocol to communicate with MAC addresses
❝
For more information about the IP protocol, IP addresses, MAC addresses, and so on, see the article on no longer being afraid of the IP protocol (ten thousand words, image-heavy).
❞
As mentioned above, the job of the IP protocol is to deliver packets to the other side. To make sure they really arrive, several conditions must be met, and two of the most important are the IP address and the MAC address.
The MAC address also uniquely identifies a device connected to the network. Some readers may ask: since the network layer already has the IP address as a unique identifier, why is a MAC address still needed?
On a network, "the two communicating sides are rarely on the same local area network; reaching the other side usually requires passing through several computers and network devices." At each hop, the MAC address of the next device is used to find the next target.
The network layer specifies which host the data comes from ("source IP address") and which host it goes to ("destination IP address"). "The source IP address and destination IP address do not change during transmission."
The data link layer transmits hop by hop according to MAC addresses. On each hop, the starting address is the "source MAC address" and the ending address is the "destination MAC address". As the data travels, "the source MAC address and destination MAC address keep changing."
For example, "the network layer gives the route 1-2-3, that is, it indicates the IP addresses of these routers." The data link layer then finds 1, 2, and 3 through the MAC addresses corresponding to those IP addresses and transfers the data between them.
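The contrast between fixed IP addresses and changing MAC addresses can be sketched with a hypothetical path of three routers. All names and addresses below are invented, and each device is given a single MAC address for simplicity, whereas a real router has one per interface.

```python
def forward(src_ip: str, dst_ip: str, path: list) -> list:
    """path: (name, mac) pairs from the sender, through routers, to the receiver."""
    frames = []
    # One frame per hop: MAC addresses describe only the current hop.
    for (_, src_mac), (_, dst_mac) in zip(path, path[1:]):
        frames.append({"src_ip": src_ip, "dst_ip": dst_ip,
                       "src_mac": src_mac, "dst_mac": dst_mac})
    return frames

path = [
    ("host A",   "aa:aa:aa:aa:aa:01"),
    ("router 1", "aa:aa:aa:aa:aa:11"),
    ("router 2", "aa:aa:aa:aa:aa:12"),
    ("router 3", "aa:aa:aa:aa:aa:13"),
    ("host B",   "aa:aa:aa:aa:aa:02"),
]
frames = forward("192.0.2.1", "198.51.100.7", path)
for frame in frames:
    print(frame)
```

Each printed frame shows the same pair of IP addresses, while the MAC pair is different on every hop.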
To take a more vivid example: think of the data link layer as a passenger travelling by high-speed rail from Suzhou to Nanjing, then from Nanjing to Beijing, and then from Beijing to Xizang. The network layer is then the staff at each station: "at each transfer on the data link layer, the network layer issues a ticket marked with the next MAC address". So even if the passenger (data link layer) does not know the final destination, the staff (network layer) will guide it.
This guidance performed by the network layer is called "routing control", and it involves routing protocols such as OSPF.
The protocol that "translates IP addresses into MAC addresses", so that data can be delivered precisely at the data link layer, is the "ARP protocol".
ARP determines the MAC address using two types of packets, the "ARP request" and the "ARP response". Each host also has an "ARP cache", which holds the "IP address to MAC address mapping table" for the hosts and routers on the local area network.
As shown in the following figure, assume that host A wants to send an IP datagram to host B on the same link. Host A knows the IP addresses of both hosts, but the hosts do not know each other's MAC addresses:
1) To obtain host B's MAC address, host A first checks its own ARP cache for a record of host B.
2) If host A's ARP cache has no mapping from host B's IP address to a MAC address, host A "broadcasts" an "ARP request packet" (carrying its own IP address and MAC address as well as the IP address of the target host) to say that it wants host B's MAC address.
3) Because a broadcast request is received by every host and router on the same link, any host or router on the link whose IP address matches the target IP address in the ARP request packet puts its own MAC address into an "ARP response packet" and returns it to host A.
❝
The ARP response packet is sent as a unicast. After all, the ARP request packet already contains host A's addresses, so host B knows exactly where the response should be sent.
Most network protocols are designed with great restraint: they cut out unnecessary interactions, merge information that can be merged, and use unicast rather than broadcast wherever possible, in order to save bandwidth and keep the network fast.
❞
4) After receiving the ARP response packet from host B, host A writes the mapping from host B's IP address to its MAC address into its ARP cache.
Of course, cache entries are only valid for a limited time, after which they are removed. This means packets can still be delivered correctly even if the mapping between an IP address and a MAC address has changed.
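The ARP steps above, including cache expiry, can be sketched as a small simulation. The class `ArpCache`, the function `resolve_mac`, the 60-second lifetime, and all addresses are assumptions made for the example; the dictionary `link_hosts` stands in for the broadcast request and unicast reply on the link.

```python
import time

class ArpCache:
    """IP -> MAC table whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}   # ip -> (mac, expiry time)

    def lookup(self, ip: str):
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[ip]        # expired entries are dropped
            return None
        return mac

    def store(self, ip: str, mac: str) -> None:
        self.entries[ip] = (mac, self.clock() + self.ttl)

def resolve_mac(cache: ArpCache, target_ip: str, link_hosts: dict):
    """Check the cache first; on a miss, simulate an ARP request and reply."""
    mac = cache.lookup(target_ip)
    if mac is not None:
        return mac
    # Broadcast: every host sees the request, only the owner of target_ip replies.
    mac = link_hosts.get(target_ip)
    if mac is not None:
        cache.store(target_ip, mac)
    return mac

hosts_on_link = {"192.168.1.20": "aa:bb:cc:dd:ee:02"}   # hypothetical host B
arp_cache = ArpCache(ttl_seconds=60)
print(resolve_mac(arp_cache, "192.168.1.20", hosts_on_link))
print(arp_cache.lookup("192.168.1.20"))   # now answered from the cache
```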
8. Server responds to request
The browser's HTTP request message is split into several segments and sent to the server over the connection established by the TCP three-way handshake. After receiving the segments, the server reassembles the HTTP request message in the original order by sequence number, processes it, and returns an HTTP response. The HTTP response message goes through the same process as the HTTP request message.
Take a look at the picture below to review:
9. Disconnect TCP connection
Once the browser and the server no longer need to exchange data, the TCP connection is closed with a four-way wave. For details, see the article on the TCP three-way handshake and four-way wave.
10. Browser display interface
After receiving the HTTP response message returned by the server, the browser renders the data according to its rendering mechanism and displays the page.
This concludes the walkthrough of what happens when you enter a URL in the browser address bar. Thank you for reading.