
From Entering a URL to the Browser Finally Rendering the Page: A Process Analysis



Shulou (Shulou.com) 05/31 Report --

This article gives a detailed analysis of the process from entering a URL to the browser finally rendering the page content. The editor finds it very practical and shares it here for your reference; I hope you will get something out of it after reading.

Preparation

When you type a URL (such as www.coder.com) into the browser and hit Enter, the first thing the browser must do is obtain the IP address of coder.com. It does this by sending a UDP packet to a DNS server, which returns the IP of coder.com. The browser usually caches the IP address so that the next visit is faster.

In Chrome, for example, you can inspect the DNS cache via chrome://net-internals/#dns.
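
Most languages expose this lookup through their standard library. Here is a minimal sketch in Java (purely for illustration; the host name is just our running example), and the JVM, like the browser, caches successful lookups:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class DnsLookup {
    public static void main(String[] args) throws UnknownHostException {
        // Ask the resolver (which ultimately queries a DNS server over UDP)
        // for all IP addresses registered for the host name.
        InetAddress[] addresses = InetAddress.getAllByName("www.coder.com");
        for (InetAddress addr : addresses) {
            System.out.println(addr.getHostAddress());
        }
        // The JVM caches successful lookups, so a second call here
        // normally does not hit the network again.
    }
}
```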

With the server's IP in hand, the browser can make HTTP requests, but HTTP Requests/Responses must be sent and received over a TCP "virtual connection".

To establish this "virtual" TCP connection, the TCP postman needs to know four things: (local IP, local port, server IP, server port). So far we only know the local IP and the server IP. What about the two ports?

The local port is simple: the operating system just assigns the browser a random one. The server port is even simpler: a "well-known" port is used, which for the HTTP service is 80, and we simply tell the TCP postman.

After the three-way handshake, the TCP connection between client and server is established! We can finally send an HTTP request.
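
A small Java sketch (hypothetical, for illustration) makes the four-tuple visible: we name only the server host and its well-known port, the OS picks the local port, and the constructor returning means the three-way handshake has completed:

```java
import java.io.IOException;
import java.net.Socket;

public class TcpTuple {
    public static void main(String[] args) throws IOException {
        // Connect to the well-known HTTP port 80; the constructor blocks
        // until the three-way handshake completes.
        try (Socket socket = new Socket("www.coder.com", 80)) {
            System.out.println("local IP:    " + socket.getLocalAddress());
            System.out.println("local port:  " + socket.getLocalPort()); // chosen by the OS
            System.out.println("server IP:   " + socket.getInetAddress());
            System.out.println("server port: " + socket.getPort());      // 80, well-known
        }
    }
}
```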

(Figure: the TCP connection is drawn as a dotted line because the connection is virtual.)

Web server

The HTTP GET request has crossed a thousand mountains and rivers, been forwarded by many routers, and finally arrived at the server (along the way the HTTP packet may be fragmented and transmitted by the lower layers; we will not go into that here).

The Web server now has to deal with it, and it can handle it in three ways:

(1) Process all requests with a single thread, one at a time. This structure is easy to implement but causes serious performance problems.

(2) Assign one process/thread to each request. But when there are too many connections, the server-side processes/threads consume a great deal of memory, and the process/thread switching overwhelms the CPU.

(3) Use I/O multiplexing. Many Web servers adopt this multiplexed structure: for example, all connections are monitored through epoll, and when a connection's state changes (say, there is data to read), a process/thread handles that connection; after processing, it goes back to monitoring, waiting for the next state change. In this way, thousands of connection requests can be handled with a handful of processes/threads (a minimal sketch follows below).
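
Here is a minimal sketch of approach (3) in Java, whose Selector sits on top of epoll on Linux: one thread watches many connections and only touches those whose state has changed. This illustrates the technique; it is not the code of any real Web server:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class MultiplexedServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();      // backed by epoll on Linux
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buffer = ByteBuffer.allocate(4096);
        while (true) {
            selector.select();                    // block until some connection changes state
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {         // a new connection arrived
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {    // data is ready on an existing connection
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    if (client.read(buffer) == -1) {
                        client.close();           // peer closed; stop watching it
                    }
                    // ...parse the request and write a response here...
                }
            }
        }
    }
}
```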

We will take Nginx, a very popular Web server, and continue the story with it.

Nginx uses epoll to read the HTTP GET request, and then determines whether it is a static request or a dynamic request.

If it is a static request (an HTML file, a JavaScript file, a CSS file, an image, and so on), Nginx may handle it by itself (depending on the Nginx configuration, it may also be forwarded to a caching server): it reads the relevant file from the local disk and returns it directly.

If the request is dynamic and must be processed by a back-end server (such as Tomcat) before a response can be produced, it needs to be forwarded to Tomcat. If there is more than one Tomcat in the back end, one of them must be selected according to some policy.

For example, Nginx supports the following strategies (a small sketch follows the list):

Round-robin (polling): forward to the back-end servers one by one, in order.

Weight: assign a weight to each back-end server, which is in effect the probability of being forwarded to that server.

ip_hash: hash the client IP and use the result to pick a server, so that the same client IP is always forwarded to the same back-end server.

fair: allocate requests according to the back-end servers' response times, with shorter response times given priority.
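
To make the idea concrete, here is a hypothetical Java sketch of the ip_hash idea: hashing the client IP and taking it modulo the number of servers always maps the same client to the same backend. This is not Nginx's actual implementation (Nginx hashes parts of the address), just an illustration; note also that fair requires a third-party Nginx module:

```java
import java.util.List;

public class IpHashBalancer {
    private final List<String> backends;

    public IpHashBalancer(List<String> backends) {
        this.backends = backends;
    }

    // The same client IP always hashes to the same index,
    // so it is always forwarded to the same back-end server.
    public String pick(String clientIp) {
        int index = Math.floorMod(clientIp.hashCode(), backends.size());
        return backends.get(index);
    }

    public static void main(String[] args) {
        IpHashBalancer lb = new IpHashBalancer(
                List.of("tomcat-1:8080", "tomcat-2:8080", "tomcat-3:8080"));
        System.out.println(lb.pick("203.0.113.7"));  // always the same backend
        System.out.println(lb.pick("203.0.113.7"));
    }
}
```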

Whichever algorithm is used, a back-end server is eventually selected. Nginx then forwards the HTTP Request to that back-end Tomcat and forwards the HTTP Response produced by Tomcat back to the browser.

As you can see, Nginx plays the role of a proxy in this scenario.

Application server

The HTTP Request finally arrives at Tomcat, a container written in Java that can handle Servlets/JSP, and our own code runs inside this container.

Like a Web server, Tomcat may assign a thread to each request to process it, commonly known as the BIO mode (Blocking I/O mode).

It may also use the I/O multiplexing technique, with only a few threads handling all requests, namely the NIO mode.

Either way, the HTTP Request is eventually handed to a Servlet, which converts the HTTP Request into the parameter format used by the framework and then dispatches it to a Controller (if you are using Spring) or an Action (if you are using Struts).
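
To give a feel for what "our code" inside the container looks like, here is a minimal, hypothetical Servlet. Tomcat calls doGet the same way regardless of whether the request was read in BIO or NIO mode (the jakarta.servlet package is for Tomcat 10+; older versions use javax.servlet):

```java
import java.io.IOException;

import jakarta.servlet.annotation.WebServlet;
import jakarta.servlet.http.HttpServlet;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;

// Runs inside the Tomcat container; mapped to the /hello path.
@WebServlet("/hello")
public class HelloServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        // Tomcat has already parsed the raw HTTP request into req;
        // we only produce the body of the HTTP Response.
        resp.setContentType("text/html;charset=UTF-8");
        resp.getWriter().println("<html><body><h1>Hello, coder.com</h1></body></html>");
    }
}
```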

The rest of the story is relatively simple (no, for programmers it is actually the most complex part): executing the create, read, update, and delete logic that programmers write all the time. Along the way the code will likely deal with caches, databases, and other back-end components, and eventually return an HTTP Response. Since the details depend on the business logic, they are omitted here.

In our example, this HTTP Response is an HTML page.

The way back home

Tomcat happily sends the HTTP Response to Nginx.

Nginx happily sends the HTTP Response on to the browser.

Can the TCP connection be closed after sending?

If HTTP/1.1 is used, the connection is keep-alive by default, which means it cannot be closed;

If it is HTTP/1.0, check whether the earlier HTTP Request header contained Connection: keep-alive; if so, it cannot be closed either.
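
Here is that decision as a hypothetical Java sketch, the way a server might make it:

```java
public class KeepAlive {
    // The rule described above (illustrative only, not any real server's code).
    static boolean keepConnectionOpen(String httpVersion, String connectionHeader) {
        if ("HTTP/1.1".equals(httpVersion)) {
            // HTTP/1.1: persistent by default, unless the client asked to close.
            return !"close".equalsIgnoreCase(connectionHeader);
        }
        // HTTP/1.0: persistent only if the client sent Connection: keep-alive.
        return "keep-alive".equalsIgnoreCase(connectionHeader);
    }

    public static void main(String[] args) {
        System.out.println(keepConnectionOpen("HTTP/1.1", null));          // true
        System.out.println(keepConnectionOpen("HTTP/1.0", "keep-alive"));  // true
        System.out.println(keepConnectionOpen("HTTP/1.0", null));          // false
    }
}
```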

The browser gets to work again

The browser receives the HTTP Response, reads the HTML page from it, and begins preparing to display it.

But this HTML page may reference a large number of other resources, such as JavaScript files, CSS files, and images. These resources are also located on the server side, possibly even under a different domain name, such as static.coder.com.

The browser has no choice but to download them one by one, starting again with a DNS lookup to get the IP and repeating everything it did before. The difference is that these downloads no longer involve an application server such as Tomcat.

If there are many external resources to download, the browser creates multiple TCP connections and downloads them in parallel.

However, the number of simultaneous requests to the same domain must not be too large, or the server would receive more traffic than it can bear. So the browser limits itself; for example, under HTTP/1.1 Chrome downloads at most six resources per host in parallel.
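
As a rough illustration of that per-host limit, the fixed-size pool below is a hypothetical stand-in for the browser's connection pool (six is Chrome's HTTP/1.1 value; the URLs are made up):

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelDownloads {
    public static void main(String[] args) {
        List<String> resources = List.of(
                "http://static.coder.com/a.js", "http://static.coder.com/b.css",
                "http://static.coder.com/c.png", "http://static.coder.com/d.js",
                "http://static.coder.com/e.css", "http://static.coder.com/f.png",
                "http://static.coder.com/g.js");

        // At most six downloads to the same host run at once;
        // the seventh waits for a free slot, as in Chrome under HTTP/1.1.
        ExecutorService pool = Executors.newFixedThreadPool(6);
        for (String url : resources) {
            pool.submit(() -> System.out.println("downloading " + url));
        }
        pool.shutdown();
    }
}
```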

When the server sends files such as JS and CSS to the browser, it tells the browser when these files expire (using Cache-Control or Expires), and the browser can cache them locally. When the same file is requested a second time and has not yet expired, it can be served straight from the local cache.

If it has expired, the browser can ask the server whether the file has been modified (based on the Last-Modified and ETag values the server sent last time). If it has not been modified, the cache can still be used (304 Not Modified); otherwise the server sends the latest file back to the browser.

Of course, if you press Ctrl+F5, a forced GET request is issued that ignores the cache entirely.
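
A minimal sketch of such a conditional request from the client side (the URL and validator values are hypothetical; java.net.HttpURLConnection is used for brevity):

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class ConditionalGet {
    public static void main(String[] args) throws IOException {
        URL url = new URL("http://static.coder.com/app.js");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Validators remembered from the previous response (made up here).
        conn.setRequestProperty("If-None-Match", "\"abc123\"");
        conn.setRequestProperty("If-Modified-Since", "Tue, 01 Oct 2024 00:00:00 GMT");

        if (conn.getResponseCode() == HttpURLConnection.HTTP_NOT_MODIFIED) {
            System.out.println("304 Not Modified: serve the cached copy");
        } else {
            System.out.println("200 OK: the server sent a fresh copy");
            // ...read conn.getInputStream() and update the cache...
        }
    }
}
```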

Note: in Chrome, you can inspect the cache via chrome://view-http-cache/.

The browser has now obtained three important things:

1. HTML, which the browser turns into a DOM Tree

2. CSS, which the browser turns into a CSS Rule Tree

3. JavaScript, which can modify the DOM Tree

From the DOM Tree and the CSS Rule Tree, the browser generates the so-called "Render Tree", computes the position and size of each element, lays them out, and then calls operating-system APIs to paint them. This is a very complex process, and we will not go into it here.

At this point, we finally see the content of www.coder.com in the browser.

This is the end of the article on the process analysis from entering a URL to the browser finally rendering the page content. I hope the above has been of some help and that you have learned something from it; if you think the article is good, please share it so that more people can see it.
