Source: Shulou (Shulou.com), SLTechnology News&Howtos, 06/02 report. Updated 2025-02-25.
This article explains what happens after you type a URL into the browser and press Enter. Many people are unsure about the steps involved, so the editor has gone through various materials and organized them into a simple walkthrough.
General process
URL parsing.
DNS query.
TCP connection.
The server processes the request.
The client receives the HTTP message response.
Render the page.
Key questions:
How should TCP's three-way handshake and four-way wave be understood? What state are the client and the server in at each step?
Why does the closing side wait 2MSL? What problems arise when a large number of sockets are in the TIME_WAIT or CLOSE_WAIT state?
What is the process of the three-way handshake and the four-way wave?
What is the message format of HTTP?
Read on for the answers from Code Brother Byte. Search WeChat for "code brother byte" to follow the official account for more hard-core content.
URL parsing
Address resolution: the browser first decides whether the input is a valid URL or a search keyword, then performs operations such as auto-completion and character encoding on it.
HSTS: because plain HTTP carries security risks, the browser checks HSTS (HTTP Strict Transport Security) and, where it applies, forces the page to be accessed over HTTPS. For details, see "HSTS that you don't know".
Other operations: the browser also performs additional steps such as security checks and access restrictions (some domestic browsers previously restricted 996.icu).
Check the cache
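The address-resolution step can be sketched in a few lines of Python. This is only a rough illustration, not how any real browser decides: the function name classify_input, the "contains a dot and no space" rule, and the search.example search URL are all assumptions made for the example.

```python
from urllib.parse import quote, urlsplit


def classify_input(text: str) -> str:
    """Return the URL to load, or a search URL when the input looks like keywords."""
    text = text.strip()
    parts = urlsplit(text)
    # A complete URL with an http(s) scheme and a host is used as-is.
    if parts.scheme in ("http", "https") and parts.netloc:
        return text
    # No scheme, but shaped like "host.tld/path": auto-complete the scheme.
    if " " not in text and "." in text:
        return "https://" + text
    # Otherwise treat the input as keywords and percent-encode them
    # into a hypothetical search URL.
    return "https://search.example/?q=" + quote(text)
```

For example, classify_input("example.com") completes the scheme, while classify_input("tcp handshake") falls through to the search branch with the space encoded as %20.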
DNS query
Browser cache: the browser first checks its own DNS cache; on a miss, it calls a system library function to query.
Operating system cache: the operating system also has its own DNS cache. Before a query is sent to a DNS server, the system also checks whether the domain name is listed in the local hosts file.
Router cache: the router may keep its own DNS cache as well.
ISP DNS cache: the ISP's DNS server is usually the preferred DNS server configured on the client, and in most cases it has the record cached.
Root domain name server query
If none of the caches above holds the record, the local DNS server forwards the request to a root name server on the Internet. The following figure illustrates the whole process:
It is important to note that:
Recursive query: the client asks once and receives only the final result; the server it asked does all the follow-up work. This is how the browser (through the operating system) queries the local DNS server.
Iterative query: the local DNS server queries the root name server, then the top-level domain server, then the authoritative server in turn, each of which refers it to the next server to ask.
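The layered cache lookup described in this section can be modeled as a chain of lookups tried in order. The sketch below is a simulation only: the layer names, their exact order, and the sample records (taken from the 192.0.2.0/24 documentation range) are assumptions for illustration, and no real DNS query is made.

```python
from typing import Callable, Optional

Lookup = Callable[[str], Optional[str]]


def resolve(name: str, layers: list[tuple[str, Lookup]]) -> tuple[str, str]:
    """Try each layer in order and return (layer_name, ip) from the first hit."""
    for layer_name, lookup in layers:
        ip = lookup(name)
        if ip is not None:
            return layer_name, ip
    # A real resolver would now start iterative queries at a root name server.
    raise LookupError(f"no cache layer could resolve {name}")


# Made-up sample data for each layer.
browser_cache: dict[str, str] = {}
hosts_file = {"intranet.example": "192.0.2.10"}
os_cache: dict[str, str] = {}
router_cache: dict[str, str] = {}
isp_cache = {"www.example.com": "192.0.2.80"}

layers = [
    ("browser cache", browser_cache.get),
    ("hosts file", hosts_file.get),
    ("OS cache", os_cache.get),
    ("router cache", router_cache.get),
    ("ISP DNS cache", isp_cache.get),
]
```

Calling resolve("www.example.com", layers) walks past the empty layers and reports that the ISP DNS cache answered; a name found in no layer raises LookupError, which is the point where the root query would begin.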
What is DNS hijacking?
Front-end dns-prefetch optimization
TCP connection establishment and disconnection
TCP/IP is divided into four layers. When sending data, each layer encapsulates the data:
Application layer: sending HTTP requests
The browser obtains the server IP for the host in the address bar (through the DNS query above), and then constructs an HTTP message that includes:
Request header (Request Header): request method, target address, protocol version, and so on.
Request body: the request parameters, such as parameters carried in the body.
Transport layer: TCP transport message
The transport layer initiates a TCP connection to the server. To make transmission manageable, the data is split into segments and numbered so that the server can reassemble the message accurately when it receives it. The TCP three-way handshake is performed first to establish the connection.
Network layer: IP routing and MAC address lookup
The segment is wrapped in a packet that carries the source and destination IP addresses, and this layer is responsible for finding the route. It checks whether the destination address is on the same network as the current address. If so, the packet is sent directly to the destination's MAC address; otherwise the routing table is used to find the next-hop address. In both cases the ARP protocol is used to look up the MAC address.
Link layer: Ethernet protocol
Under the Ethernet protocol, data is sent in units called "frames", and each frame has two parts:
Header: the sender, the receiver, and the data type of the packet
Data: the contents of the packet
MAC address
Ethernet requires every device connected to the network to have a network interface card. Packets travel from one network card to another, and the address of a network card is its MAC address. Each MAC address is intended to be unique, so it identifies exactly one card.
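The network layer's "same network or next hop" decision can be sketched with Python's standard ipaddress module. The function name next_hop and the addresses in the usage note are illustrative assumptions; a real host consults its full routing table rather than a single gateway.

```python
import ipaddress


def next_hop(src_iface: str, dst: str, gateway: str) -> str:
    """Return the IP address whose MAC address would be looked up with ARP."""
    iface = ipaddress.ip_interface(src_iface)  # address plus prefix, e.g. "192.168.1.20/24"
    dst_ip = ipaddress.ip_address(dst)
    if dst_ip in iface.network:
        # Same network: deliver directly to the destination host.
        return str(dst_ip)
    # Different network: hand the packet to the gateway (the next hop).
    return gateway
```

With the interface 192.168.1.20/24 and gateway 192.168.1.1, a destination of 192.168.1.99 is delivered directly, while any address outside 192.168.1.0/24 is sent to the gateway.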
The main request process:
The browser obtains the server's IP address and port number from the URL in the address bar.
The browser establishes a connection to the server through the TCP three-way handshake.
The browser sends the request message to the server.
The server receives and processes the request, then sends a response message back to the browser.
The browser parses the response and renders it to the page.
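The request process above can be tried end to end on the loopback interface with Python's socket module. This is a minimal sketch under stated assumptions: the helper names serve_once, fetch and demo are made up for the example, the "server" answers a single request with a fixed body, and the handshake itself is performed by the operating system inside create_connection and accept.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection, read the request, and send a fixed response."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(4096)  # read (and ignore) the small request
        body = b"hello"
        response = b"\r\n".join([
            b"HTTP/1.1 200 OK",
            b"Content-Length: %d" % len(body),
            b"Connection: close",
            b"",  # blank line between header and body
            body,
        ])
        conn.sendall(response)


def fetch(host: str, port: int) -> bytes:
    """Connect, send a GET request, and read until the server closes."""
    with socket.create_connection((host, port), timeout=5) as sock:
        request = b"GET / HTTP/1.1\r\nHost: " + host.encode() + b"\r\nConnection: close\r\n\r\n"
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the peer closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo() -> bytes:
    """Run the tiny server on 127.0.0.1 and fetch one response from it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    worker.start()
    try:
        return fetch("127.0.0.1", port)
    finally:
        worker.join(timeout=5)
        listener.close()
```

Calling demo() returns the raw response bytes, starting with the status line and ending with the body; rendering is left out because it is the browser's job.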
Three-way handshake
Before data is transmitted at the transport layer, a connection must be established, that is, a three-way handshake is performed to create a reliable connection.
First, the server must be listening on a port before any connection can be established, so its initial state is LISTEN. When the client is ready to connect, it sends a SYN (synchronize) segment, after which the client's connection state becomes SYN_SENT. When the server receives the SYN and agrees to establish the connection, it replies with an ACK to the client.
Because TCP is full-duplex, the server also sends its own SYN to the client to request a connection in the server-to-client direction. In practice the ACK and the SYN travel in a single SYN+ACK segment. After sending it, the server's connection state becomes SYN_RCVD.
When the client receives the server's SYN+ACK, its connection state becomes ESTABLISHED. At the same time, the client sends an ACK back to the server in reply to the server's SYN.
When the server receives the client's ACK, its connection state also becomes ESTABLISHED. The connection is now complete, and both sides can transmit data at any time.
In an interview, make clear that the three-way handshake establishes a two-way connection, and remember how the connection state changes on the client and on the server. When answering questions about connection establishment, you can also mention the cause of the SYN flood attack: after receiving SYN requests, the server sends SYN+ACK, but the client never replies, leaving a large number of connections on the server in the SYN_RCVD state and affecting the handling of normal requests. You can set tcp_synack_retries = 0 to speed up the recycling of half-open connections, or increase tcp_max_syn_backlog to cope with a small SYN flood.
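The state changes described above can be written down as a small transition table. The sketch below only simulates the bookkeeping; it sends no packets, and the event strings and function names are invented for the illustration.

```python
# (current state, event) -> next state, for connection establishment only.
TRANSITIONS = {
    ("CLOSED", "send SYN"): "SYN_SENT",                      # client, 1st handshake
    ("LISTEN", "recv SYN, send SYN+ACK"): "SYN_RCVD",        # server, 2nd handshake
    ("SYN_SENT", "recv SYN+ACK, send ACK"): "ESTABLISHED",   # client, 3rd handshake
    ("SYN_RCVD", "recv ACK"): "ESTABLISHED",                 # server sees the final ACK
}


def step(state: str, event: str) -> str:
    """Apply one event; an event that is invalid for the state raises KeyError."""
    return TRANSITIONS[(state, event)]


def handshake() -> tuple[str, str]:
    """Walk client and server through the three-way handshake."""
    client, server = "CLOSED", "LISTEN"
    client = step(client, "send SYN")
    server = step(server, "recv SYN, send SYN+ACK")
    client = step(client, "recv SYN+ACK, send ACK")
    server = step(server, "recv ACK")
    return client, server
```

In a SYN flood, the last two steps never happen for the attacker's connections, so in this model the server side would simply stay in SYN_RCVD.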
Four-way wave
We only need to follow the disconnection process between port 80 and port 13743. The browser sends [FIN, ACK] from port 13743. Is this different from what you have seen elsewhere on the Internet?
In fact, the [FIN] segment sent by the client also carries an [ACK] that acknowledges the last data it received.
The server then responds with [ACK] from port 80, and immediately afterwards sends [FIN, ACK] to indicate that its data transfer is finished and the connection can be closed.
Finally, the browser sends an [ACK] to the server from port 13743, and the connection between the client and the server is closed.
The specific process is shown in the following figure:
Connection states in the three-way handshake and four-way wave
Client:
SYN_SENT: after the client sends the first handshake segment, the connection state is SYN_SENT while it waits for the server kernel to reply. If the server cannot process it in time (for example, its backlog queue is full), connections can be observed in this state.
ESTABLISHED: the connection is in its normal state and can be used for data transfer. After the client receives the SYN+ACK from the server, it acknowledges the server's SYN (the third handshake segment), and the connection enters the ESTABLISHED state. After receiving that third segment, the server side also enters the ESTABLISHED state.
FIN_WAIT_1: after the client sends a FIN to close the connection, it waits for the server's ACK.
FIN_WAIT_2: the client has closed its side and is waiting for the server to close. The server has acknowledged the client's FIN with an ACK but has not yet sent its own FIN.
TIME_WAIT: after both sides have closed the connection normally, the client stays in TIME_WAIT for a while to make sure the last ACK reaches the server. The wait lasts twice the MSL (maximum segment lifetime), which is about 60 seconds in total on Linux. This is why thousands of TIME_WAIT connections can usually be seen on a server that frequently opens short-lived connections.
Server:
LISTEN: the program is listening on a port.
SYN_RCVD: after receiving the first handshake segment, the server enters the SYN_RCVD state, replies with a SYN+ACK (the second handshake segment), and waits for the acknowledgement.
ESTABLISHED: the connection is in its normal state and can be used for data transfer. Once the TCP three-way handshake is complete, the connection enters the ESTABLISHED state.
CLOSE_WAIT: the client has closed the connection, but the local side has not closed it yet. Sometimes the client program has already exited, but the server program never calls close() because of an exception or a bug; the connection then stays in CLOSE_WAIT on the server even though it no longer exists on the client.
LAST_ACK: the server has sent its own FIN and is waiting for the client's final ACK.
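The closing states listed above can be traced with the same kind of transition table. This is a simulation of a normal close initiated by the client, written as an illustration only; the event strings and the function name four_way_close are assumptions, and simultaneous close and other corner cases are left out.

```python
# (current state, event) -> next state, for a normal client-initiated close.
CLOSE_TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",            # active closer (client)
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",  # passive closer (server)
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",               # server calls close()
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("LAST_ACK", "recv ACK"): "CLOSED",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
}


def four_way_close() -> list[tuple[str, str]]:
    """Return every (client_state, server_state) pair seen during the close."""
    client = server = "ESTABLISHED"
    trace = [(client, server)]
    script = [
        ("client", "send FIN"),
        ("server", "recv FIN, send ACK"),
        ("client", "recv ACK"),
        ("server", "send FIN"),
        ("client", "recv FIN, send ACK"),
        ("server", "recv ACK"),
        ("client", "2MSL timeout"),
    ]
    for side, event in script:
        if side == "client":
            client = CLOSE_TRANSITIONS[(client, event)]
        else:
            server = CLOSE_TRANSITIONS[(server, event)]
        trace.append((client, server))
    return trace
```

If the server never performs the "send FIN" step, for example because the program does not call close(), the model stays at (FIN_WAIT_2, CLOSE_WAIT), which matches the stuck CLOSE_WAIT case described above.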
Reasons for the existence of the TIME_WAIT status:
Key points:
Reliably terminating the full-duplex TCP connection. In the four-way close, the final ACK is sent by the side that closes actively. If that ACK is lost, the server retransmits its FIN, so the client must keep the connection state in order to resend the final ACK. Without that state, the client would answer with an RST, which the server interprets as an error (in Java, a SocketException for connection reset is thrown). To terminate a full-duplex TCP connection cleanly, the loss of any of the four segments in the closing sequence must be handled, so the actively closing client has to keep state by entering TIME_WAIT.
Letting old duplicate segments expire in the network. A TCP segment may go astray because of a router fault. Meanwhile the sender may retransmit it after an acknowledgement timeout, and the stray segment may still be delivered once the router recovers. Such a segment is called a lost duplicate. If a new TCP connection between the same IP addresses and ports is established immediately after the old one closes, the new connection is called an incarnation of the previous one, and a lost duplicate from the previous connection could arrive after it has ended and be mistaken for data belonging to the new incarnation. To avoid this, TCP does not allow a connection in TIME_WAIT to start a new incarnation. Because TIME_WAIT lasts 2MSL, by the time a new connection is established, duplicate segments from its previous incarnation have expired in the network.
When answering questions about closing connections, you can also mention that real applications may run into large numbers of sockets in TIME_WAIT or CLOSE_WAIT. Generally speaking, enabling tcp_tw_reuse and tcp_tw_recycle can speed up the recovery of TIME_WAIT sockets, although tcp_tw_recycle is known to break clients behind NAT and was removed in Linux 4.12. A large number of CLOSE_WAIT sockets usually points to a bug on the passive closing side, where the code does not close the connection correctly.
To put it simply,
Ensure that the full-duplex connection of the TCP protocol can be reliably closed
Ensure that the duplicate segments of this connection disappear from the network to prevent data confusion when the port is reused.
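To check whether a machine really has an unusual number of TIME_WAIT or CLOSE_WAIT sockets, the states can be counted from a connection listing. The sketch below assumes lines shaped like the output of netstat -ant (protocol, two queue sizes, local address, foreign address, state); the sample lines are made up, and the function name count_states is hypothetical.

```python
from collections import Counter

# Made-up sample in the column layout assumed above.
SAMPLE = """\
tcp        0      0 10.0.0.5:80      10.0.0.9:51234   TIME_WAIT
tcp        0      0 10.0.0.5:80      10.0.0.9:51240   TIME_WAIT
tcp        0      0 10.0.0.5:80      10.0.0.7:40022   CLOSE_WAIT
tcp        0      0 10.0.0.5:80      10.0.0.8:40100   ESTABLISHED
"""


def count_states(text: str) -> Counter:
    """Count TCP connection states, taking the state from the sixth column."""
    counts: Counter = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Skip headings and anything that is not a six-column tcp line.
        if len(fields) == 6 and fields[0].startswith("tcp"):
            counts[fields[5]] += 1
    return counts
```

A high TIME_WAIT count on a server that opens many short connections is expected behaviour, while a growing CLOSE_WAIT count usually means the program is not closing its sockets.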
The server processes the request and responds to HTTP messages
Let's take a closer look at what an HTTP message is. The actual transmission of data is handled by the underlying TCP/IP protocols, which HTTP basically does not have to worry about. So the "Hypertext Transfer Protocol" does not really do the "transferring" itself. What, then, is the core of HTTP?
Compare it with a TCP segment, which prepends a 20-byte header to the actual data. The header stores the additional information that the TCP protocol needs, such as the sender's port number, the receiver's port number, the sequence number, the flag bits, and so on.
With this TCP header, the packet can be delivered correctly. At the destination, the header is removed to recover the real data. The idea is simple: a start point and an end point are set, each protocol attaches its own header, and at the corresponding destination the header is stripped to extract the data.
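The "attach a header, strip it at the destination" idea can be made concrete with Python's struct module. The sketch below packs the 20-byte fixed TCP header layout (ports, sequence number, acknowledgement number, data offset, flags, window, checksum, urgent pointer) in front of a payload and then removes it again. The function names are made up, the window value is arbitrary, and the checksum is left at zero, so this is an illustration of the layout rather than a segment a real stack would accept.

```python
import struct

# Fixed TCP header without options: 20 bytes in network byte order.
TCP_HEADER = struct.Struct("!HHIIBBHHH")

SYN = 0x02
ACK = 0x10


def build_segment(src_port: int, dst_port: int, seq: int, ack: int,
                  flags: int, payload: bytes) -> bytes:
    """Prepend a 20-byte header to the payload (checksum left as 0)."""
    data_offset = 5 << 4  # header length of 5 32-bit words, stored in the high 4 bits
    header = TCP_HEADER.pack(src_port, dst_port, seq, ack,
                             data_offset, flags, 65535, 0, 0)
    return header + payload


def split_segment(segment: bytes) -> tuple[dict, bytes]:
    """Strip the header and return (header fields, payload)."""
    fields = TCP_HEADER.unpack(segment[:TCP_HEADER.size])
    src_port, dst_port, seq, ack, offset_byte, flags = fields[:6]
    header_len = (offset_byte >> 4) * 4  # 32-bit words converted to bytes
    info = {"src_port": src_port, "dst_port": dst_port,
            "seq": seq, "ack": ack, "flags": flags}
    return info, segment[header_len:]
```

Building a segment with build_segment(13743, 80, 1, 0, SYN, b"hi") and passing it to split_segment returns the same ports and flags together with the original payload.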
Like TCP/UDP, HTTP attaches header data before the payload. The difference is that HTTP is a "plain text" protocol: all header fields are ASCII text, so they are easy to read.
The request message and the response message have basically the same structure, which consists of the following parts:
Start line: describes the basic information of the request or response.
Header: describes the message in more detail in key-value form.
Blank line.
Message body (Entity): the transmitted data, such as pictures, video, or text.
The first two parts, the start line and the header fields, are together often called the "request header" or "response header". The message body is also called the "entity", but as the counterpart of "header" it is often simply called the "body".
Important:
The HTTP protocol stipulates that a message must contain a header, and that the header must be followed by a "blank line", that is, "CRLF", hexadecimal "0D0A". The body, however, may be absent.
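The role of the blank line can be shown by splitting a raw message at the first empty line. This is a minimal sketch: the function name split_message and the sample request are assumptions for illustration.

```python
CRLF = b"\r\n"  # hexadecimal 0D 0A


def split_message(raw: bytes) -> tuple[bytes, bytes]:
    """Split a raw HTTP message into (head, body) at the first blank line."""
    head, separator, body = raw.partition(CRLF + CRLF)
    if not separator:
        raise ValueError("missing blank line after the header")
    return head, body


# A request with a start line and one header field, but no body.
request = b"GET / HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
```

For this request, split_message returns the start line and header as the head and an empty bytes object as the body, which is exactly the "header plus blank line, no body" case.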
The structure of the message is shown in the following figure:
To intercept a message:
Request header: start line
The request line consists of three fields separated by spaces: the request method, the URL, and the HTTP protocol version. For example, GET / HTTP/1.1.
The request methods of the HTTP protocol include GET, POST, HEAD, PUT, DELETE, OPTIONS, TRACE, and CONNECT.
GET is the request method, "/" is the target resource, and "HTTP/1.1" is the protocol version.
GET / HTTP/1.1 translates roughly as: "Hello, server. I would like to request the default file in the root directory, using HTTP version 1.1."
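Parsing the request line is a matter of splitting on single spaces and checking the three fields. The sketch below uses the method list given above; the names RequestLine and parse_request_line are invented for the example, and the validation is deliberately simple.

```python
from typing import NamedTuple

METHODS = {"GET", "POST", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE", "CONNECT"}


class RequestLine(NamedTuple):
    method: str
    target: str
    version: str


def parse_request_line(line: str) -> RequestLine:
    """Split 'METHOD target HTTP/x.y' into its three fields."""
    parts = line.split(" ")
    if len(parts) != 3:
        raise ValueError("request line must have exactly three fields")
    method, target, version = parts
    if method not in METHODS:
        raise ValueError(f"unknown method: {method}")
    if not version.startswith("HTTP/"):
        raise ValueError(f"bad protocol version: {version}")
    return RequestLine(method, target, version)
```

parse_request_line("GET / HTTP/1.1") yields the method GET, the target "/" and the version HTTP/1.1, matching the reading given above.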
Header
The second part is the header, which is made up of key: value pairs. When using custom headers, note the following:
Field names are case-insensitive and are conventionally written with each word capitalized.
Field names must not contain spaces. "-" can be used; "_" should be avoided, since some servers reject or drop such fields.
The field name must be followed immediately by ":" with no space in between, but ":" may be followed by spaces.
The order of the fields is not significant.
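These rules translate directly into a small header parser. The sketch below is an illustration under simplifying assumptions: the function name parse_headers is made up, names are lowercased to get case-insensitive lookup, and repeated fields simply overwrite each other, which real implementations handle more carefully.

```python
def parse_headers(lines: list[str]) -> dict[str, str]:
    """Parse 'Name: value' lines into a dict keyed by lowercased field name."""
    headers: dict[str, str] = {}
    for line in lines:
        name, colon, value = line.partition(":")
        if not colon:
            raise ValueError(f"missing ':' in header line: {line!r}")
        # No spaces are allowed in the name or between the name and ':'.
        if not name or " " in name:
            raise ValueError(f"invalid field name: {name!r}")
        # Spaces after ':' are allowed and are not part of the value.
        headers[name.lower()] = value.strip()
    return headers
```

With this parser, "Host: www.example.com" and "CONTENT-TYPE:text/html" are both accepted and can be looked up as "host" and "content-type", while "Host : x" is rejected because of the space before the colon.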
The browser receives the response and renders data
After the browser receives the response from the server, it analyzes it. It first looks at the response header and acts according to the status code (for example, following a redirect). If the response is compressed (for example with gzip), it is decompressed. The response is then cached where appropriate. Next, the content is parsed according to the MIME [3] type declared in the response (HTML and images, for example, are parsed differently).
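The decompress-then-dispatch steps can be sketched with Python's gzip module. The function names and the handler labels are assumptions for the example, and the headers are expected as a dict with lowercased keys; a real browser supports many more encodings and MIME types.

```python
import gzip


def decode_body(headers: dict[str, str], body: bytes) -> bytes:
    """Decompress the body when Content-Encoding says it is gzip-compressed."""
    if headers.get("content-encoding", "").lower() == "gzip":
        return gzip.decompress(body)
    return body


def choose_handler(headers: dict[str, str]) -> str:
    """Pick a (hypothetical) handler name from the MIME type in Content-Type."""
    content_type = headers.get("content-type", "application/octet-stream")
    mime = content_type.split(";")[0].strip().lower()  # drop parameters such as charset
    if mime == "text/html":
        return "html parser"
    if mime.startswith("image/"):
        return "image decoder"
    return "download"
```

A response with "content-encoding: gzip" and "content-type: text/html; charset=utf-8" is first decompressed by decode_body and then routed to the HTML branch by choose_handler.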
The received data is then rendered. Browsers differ in the details, but the process is roughly the same: