How does the browser work?

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains how the browser works: its process architecture, what happens during navigation, how a page is rendered, and how input events are handled.

Browser architecture

Before we talk about browser architecture, we need to understand two concepts: process and thread.

A process is one execution of a program. It is a dynamic concept, and it is the basic unit by which resources are allocated and managed while the program runs. A thread is the basic unit of CPU scheduling and dispatching. It shares the resources owned by its process with the other threads of that process.

To put it simply, a process can be understood as an application that is executing, while a thread can be understood as the executor of the code in that application. Threads run inside a process: a process may contain one or more threads, and a thread belongs to exactly one process.

The browser is an application, and launching an application means that the computer starts a process. Once the process has started, the operating system allocates memory for it, and the process then uses threads to carry out the work of the application.

To meet its functional needs, a running process may create new processes to handle other tasks. Each new process has its own independent memory space and cannot share memory with the original process. If these processes need to communicate with each other, they do so through IPC (Inter-Process Communication).

Many applications work in this multi-process way. Because processes are independent of each other, one process dying does not affect the others; you only need to restart the crashed process to resume running.
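As a rough illustration of why separate processes need IPC, the sketch below models two processes as plain objects and passes a message between them by serializing it, so the receiver gets its own copy rather than a shared reference. `ToyProcess`, `send` and `receive` are hypothetical names for this illustration only; a real browser uses operating-system channels, not in-process objects.

```javascript
// Toy model only: two "processes" that can exchange data solely as serialized messages.
class ToyProcess {
  constructor(name) {
    this.name = name;
    this.inbox = []; // messages this process has received
  }

  // IPC stand-in: the message crosses the boundary as a string, never as a shared object.
  send(target, message) {
    const wire = JSON.stringify({ from: this.name, body: message });
    target.receive(wire);
  }

  receive(wire) {
    this.inbox.push(JSON.parse(wire));
  }
}

const browserProcess = new ToyProcess('browser');
const rendererProcess = new ToyProcess('renderer');

const response = { url: 'https://example.com/', html: '<p>Hello</p>' };
browserProcess.send(rendererProcess, response);

// Changing the sender's object afterwards does not touch the receiver's copy.
response.html = '<p>Changed</p>';
const received = rendererProcess.inbox[0].body;
console.log(received.html);
```

The point of the copy is isolation: if the renderer side crashed or corrupted its data, the browser side would still hold an intact object of its own.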

Multi-process architecture of the browser

If we were to develop a browser, its architecture could be either a single-process, multi-threaded application or a multi-process application whose processes communicate over IPC.

Different browsers use different architectures. The following mainly takes Chrome as an example to introduce the multi-process architecture of browsers.

In Chrome, there are four main processes:

Browser process (Browser Process): responsible for the browser's tab forward and backward navigation, the address bar and the bookmark bar, and for some of the browser's invisible low-level operations, such as network requests and file access.

Rendering process (Renderer Process): responsible for display-related work within a tab; it is also known as the rendering engine.

Plug-in process (Plugin Process): responsible for controlling the plug-ins used by the web page.

GPU process (GPU Process): responsible for handling GPU tasks for the entire application.

What is the relationship between these four processes?

When we want to browse a web page, we enter a URL in the browser's address bar. The Browser Process sends a request to that URL, obtains the HTML content, and hands the HTML to the Renderer Process, which parses it. When the Renderer Process encounters resources that must be requested over the network, it goes back to the Browser Process to load them, and it also notifies the Browser Process when the Plugin Process is needed to load plug-in resources and execute plug-in code. After parsing is complete, the Renderer Process computes the image frames and hands them to the GPU Process, which converts them into the image shown on the screen.

Benefits of a multi-process architecture

Why does Chrome use a multi-process architecture?

First, higher fault tolerance. In today's web applications, HTML, JavaScript and CSS are becoming more and more complex. Code running in the rendering engine frequently hits bugs, and some bugs crash the rendering engine outright. With a multi-process architecture, each rendering engine runs in its own process and is isolated from the others, so when one page crashes, the other pages keep running normally.

Second, higher security through sandboxing. Rendering engines often encounter untrusted or even malicious code from the network, which may try to exploit vulnerabilities to install malicious software on your computer. To address this, the browser grants different permissions to different processes and runs them in a sandboxed environment, which makes them more secure and reliable.

Third, better responsiveness. In a single-process architecture, the various tasks compete with each other for CPU resources, which makes the browser slower to respond; a multi-process architecture avoids this disadvantage.

Multi-process architecture optimization

As we said earlier, the Renderer Process is responsible for display-related work within a tab, which means that each tab has its own Renderer Process. These processes cannot share memory, so the memory of different processes often has to hold copies of the same content.

Process mode of the browser

To save memory, Chrome provides four process modes (Process Models). Different process modes treat tab processes differently.

Process-per-site-instance (default): one process per site-instance

Process-per-site: one process per site

Process-per-tab: one process per tab

Single process: all tabs share one process

Here we need to give the definition of site and site-instance.

A site is the combination of a registered domain name (e.g. google.com, bbc.co.uk) and a scheme (e.g. https://). For example, a.baidu.com and b.baidu.com belong to the same site. Note that this is different from the same-origin policy, where an origin also takes the subdomain and the port into account.
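The difference between a site and an origin can be sketched with the standard `URL` class. The `siteOf` helper and the tiny `MULTI_LABEL_SUFFIXES` set are assumptions made for this sketch; a real browser consults the full Public Suffix List to find the registered domain.

```javascript
// Hypothetical, incomplete suffix list for this sketch; real browsers use the Public Suffix List.
const MULTI_LABEL_SUFFIXES = new Set(['co.uk', 'com.cn']);

// Returns scheme + registered domain, e.g. "https://baidu.com".
function siteOf(urlString) {
  const url = new URL(urlString);
  const labels = url.hostname.split('.');
  const lastTwo = labels.slice(-2).join('.');
  const suffixLabels = MULTI_LABEL_SUFFIXES.has(lastTwo) ? 2 : 1;
  const registeredDomain = labels.slice(-(suffixLabels + 1)).join('.');
  return `${url.protocol}//${registeredDomain}`;
}

const a = 'https://a.baidu.com/news';
const b = 'https://b.baidu.com:8443/map';

console.log(siteOf(a) === siteOf(b));                 // true: same site
console.log(new URL(a).origin === new URL(b).origin); // false: different origins
```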

A site-instance is a set of connected pages from the same site. Here "connected" means that the pages can obtain references to each other in script code. A new page belongs to the same site-instance as the old page when both belong to the same site as defined above and the new page was opened in one of the following two ways:

A new page that the user opens by clicking a link (for example, a link with target="_blank")

A new page opened by JS code (such as window.open)

With these concepts in place, we can explain the four process modes.

The first is Single process: as the name implies, all tabs use the same process. Then there is Process-per-tab, which creates a new process for each tab that is opened. With Process-per-site, if you open a.baidu.com and then open b.baidu.com, the two tabs use the same process because both pages belong to the same site. As a consequence, if one of the tabs crashes, the other crashes too.

Process-per-site-instance is the most important, because it is Chrome's default mode and therefore the mode used by almost all users. When you open a tab to visit a.baidu.com and then open another tab to visit b.baidu.com, the two tabs use two processes. If instead the b.baidu.com page is opened from a.baidu.com through JS code, the two tabs use the same process.
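The four modes can be compared with a small model that maps each tab to a "process key"; tabs with the same key share a process. The tab objects, the `instanceId` field and the `processKey` function are hypothetical and exist only for this sketch; they are not how Chrome stores this information.

```javascript
// Returns a key identifying the renderer process a tab would use in this toy model.
function processKey(model, tab) {
  switch (model) {
    case 'single-process':
      return 'shared';
    case 'process-per-tab':
      return `tab:${tab.id}`;
    case 'process-per-site':
      return `site:${tab.site}`;
    case 'process-per-site-instance':
      // Tabs opened from another tab of the same site join its site-instance.
      return `site-instance:${tab.site}#${tab.instanceId}`;
    default:
      throw new Error(`unknown model: ${model}`);
  }
}

// Hypothetical tabs: tab 3 was opened by tab 1's script, so it shares instanceId 1.
const tabs = [
  { id: 1, site: 'https://baidu.com', instanceId: 1 }, // a.baidu.com
  { id: 2, site: 'https://baidu.com', instanceId: 2 }, // b.baidu.com, opened independently
  { id: 3, site: 'https://baidu.com', instanceId: 1 }, // b.baidu.com, opened via window.open
];

function countProcesses(model) {
  return new Set(tabs.map((tab) => processKey(model, tab))).size;
}

for (const model of ['single-process', 'process-per-tab', 'process-per-site', 'process-per-site-instance']) {
  console.log(model, countProcesses(model));
}
```

For these three tabs the model yields one process in single-process and process-per-site mode, three in process-per-tab mode, and two in process-per-site-instance mode.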

Default mode selection

So why does Chrome use Process-per-site-instance as the default process mode?

Process-per-site-instance balances performance and ease of use, so it is a moderate, general-purpose model.

Compared with Process-per-tab, it opens far fewer processes, which means a smaller memory footprint.

Compared with Process-per-site, it better isolates unrelated tabs under the same domain name and is therefore more secure.

What happened during navigation?

Earlier, we talked about the multi-process architecture of the browser, the various benefits of the multi-process architecture, and how Chrome optimizes the multi-process architecture. Let's learn more about how processes and threads render our website pages from the simple scenario of users browsing the web.

Web page loading process

As we mentioned earlier, most of the work outside the tab is done by the Browser Process, which divides this work among several threads:

UI thread: controls the buttons and input boxes of the browser

Network thread: handles network requests and fetches data from the internet

Storage thread: controls access to files, etc.

Step 1: process input

When we press Enter in the browser's address bar, the UI thread determines whether the input is a search query or a URL. If it is a search query, the browser navigates to the default search engine's URL for that query. If the input is a URL, the browser starts requesting that URL.
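The decision in this step can be sketched as a small function. The heuristics below are deliberately naive and are an assumption of this sketch, not Chrome's actual rules; `SEARCH_URL` is a made-up search endpoint.

```javascript
// Hypothetical default search engine endpoint, used only for this sketch.
const SEARCH_URL = 'https://search.example/?q=';

function resolveInput(text) {
  const input = text.trim();
  const looksLikeUrl = /^https?:\/\//i.test(input);         // full http(s) URL
  const looksLikeHost = /^[^\s.]+(\.[^\s.]+)+$/.test(input); // no spaces, has a dot
  if (looksLikeUrl || looksLikeHost) {
    try {
      return new URL(looksLikeUrl ? input : `https://${input}`).href;
    } catch {
      // Not a valid URL after all: fall through and treat it as a search query.
    }
  }
  return SEARCH_URL + encodeURIComponent(input);
}

console.log(resolveInput('https://example.com/a'));
console.log(resolveInput('example.com'));
console.log(resolveInput('how browsers work'));
```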

Step 2: start navigation

After Enter is pressed, the UI thread sends the URL (either the search URL for the query or the URL that was typed) to the network thread and shows the loading indicator on the tab. The network thread then carries out a series of operations, such as the DNS lookup and establishing a TLS connection, to request the resource. If it receives a 301 redirect response from the server, it tells the UI thread about the redirect and then initiates a new network request.
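The redirect handling described above amounts to a loop that keeps requesting until it gets a non-redirect response. The sketch below uses a canned `FAKE_SERVER` table instead of real network calls; the URLs, the table and the redirect cap are assumptions made for illustration.

```javascript
// Canned responses standing in for real servers; all URLs are made up for this sketch.
const FAKE_SERVER = {
  'http://example.com/': { status: 301, headers: { location: 'https://example.com/' } },
  'https://example.com/': { status: 200, headers: { 'content-type': 'text/html' } },
};

function navigate(startUrl, maxRedirects = 20) {
  const visited = [];
  let url = startUrl;
  for (let hop = 0; hop <= maxRedirects; hop += 1) {
    visited.push(url);
    const response = FAKE_SERVER[url];
    if (!response) throw new Error(`no response for ${url}`);
    const isRedirect = response.status >= 300 && response.status < 400 &&
      Boolean(response.headers.location);
    if (!isRedirect) return { finalUrl: url, response, visited };
    // Resolve the Location header against the current URL and request again.
    url = new URL(response.headers.location, url).href;
  }
  throw new Error('too many redirects');
}

const result = navigate('http://example.com/');
console.log(result.visited.join(' -> '));
```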

Step 3: read the response

After receiving the response from the server, the network thread begins to parse the HTTP response message and determines the media type (MIME type) of the response body from the Content-Type field in the response header. If the media type is HTML, the response data is handed over to the rendering process (renderer process) for the next step. If it is a zip file or some other kind of file, the data is handed to the download manager.
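A minimal sketch of this dispatch, assuming the only rule is "HTML goes to the renderer, everything else is downloaded" as described above. The function names and return values are hypothetical, and the sketch ignores cases where the Content-Type header is missing or wrong.

```javascript
// Extracts the bare MIME type from a Content-Type value such as "text/html; charset=utf-8".
function mimeTypeOf(contentType) {
  return (contentType || '').split(';')[0].trim().toLowerCase();
}

// Decides who handles the response body in this toy model.
function routeResponse(headers) {
  const mime = mimeTypeOf(headers['content-type']);
  if (mime === 'text/html') return 'renderer';
  return 'download-manager';
}

console.log(routeResponse({ 'content-type': 'text/html; charset=utf-8' }));
console.log(routeResponse({ 'content-type': 'application/zip' }));
```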

At the same time, the browser performs a Safe Browsing security check. If the domain name or the requested content matches a known malicious site, the network thread triggers a warning page. In addition, the network thread performs a CORB (Cross Origin Read Blocking) check to make sure that sensitive cross-site data is not sent to the rendering process.

Step 4: find the rendering process

Once all the checks are done and the network thread is confident that the browser can navigate to the requested page, it tells the UI thread that the data is ready, and the UI thread finds a renderer process to render the page.

To speed up this step, and because a network request takes time to return a response, the browser already looks for and starts a rendering process at the beginning of step 2. If everything goes as expected, the rendering process is already waiting when the network thread receives the data. If a redirect occurs, however, the prepared rendering process may not be usable, and a new rendering process is started instead.

Step 5: commit the navigation

At this point, the data and the rendering process are both ready, and the Browser Process sends an IPC message to the Renderer Process to commit the navigation. The browser process also passes the prepared data to the rendering process. After the rendering process receives the data, it sends an IPC message back to the browser process to say that the navigation has been committed and that the page has begun to load.

At this time, the address bar is updated, the security indicator (the small lock in front of the address) is updated, and the session history of the tab is updated, which is what lets you go back and forward between pages.

Step 6: initial load complete

Once the navigation has been committed, the rendering process starts loading resources and rendering the page (described below). When the page has finished rendering (the onload event has fired on the page and on all of its iframes), the rendering process sends an IPC message to the browser process to notify it. The UI thread then stops displaying the loading icon on the tab.

Principle of web page rendering

After the navigation process is completed, the browser process gives the data to the rendering process, which is responsible for everything within the tab. The core purpose is to convert the HTML/CSS/JS code into web pages that users can interact with. So how does the rendering process work?

The rendering process contains the following threads:

One main thread (main thread)

Multiple worker threads (worker thread)

One compositor thread (compositor thread)

Multiple raster threads (raster thread)

Different threads have different job responsibilities.

Build DOM

When the rendering process receives the commit message for the navigation and starts to receive data from the browser process, the main thread parses the data and converts it into a DOM (Document Object Model).

The DOM is both the browser's internal representation of the page and the data structure and API through which web developers interact with the page from JavaScript.
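To make the idea of turning markup into a tree concrete, here is a toy parser. It handles only well-formed, attribute-free tags and text, and it is not the parsing algorithm of the HTML specification, which also has to recover from malformed markup; `buildDom` and the node shape are assumptions of this sketch.

```javascript
// Toy parser: well-formed, attribute-free tags and text only. Not the HTML spec algorithm.
function buildDom(html) {
  const root = { tag: '#document', children: [] };
  const stack = [root];
  const tokens = html.match(/<\/?[a-z][a-z0-9]*>|[^<]+/gi) || [];
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.startsWith('</')) {
      stack.pop(); // closing tag: return to the parent node
    } else if (token.startsWith('<')) {
      const node = { tag: token.slice(1, -1).toLowerCase(), children: [] };
      parent.children.push(node);
      stack.push(node); // opening tag: descend into the new node
    } else if (token.trim() !== '') {
      parent.children.push({ tag: '#text', text: token.trim(), children: [] });
    }
  }
  return root;
}

const dom = buildDom('<html><body><h1>Title</h1><p>Hello</p></body></html>');
console.log(JSON.stringify(dom, null, 2));
```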

Subresource loading

While building the DOM, the parser comes across resources such as images, CSS and JavaScript, which have to be fetched from the network or from the cache. The main thread could request them one by one as it encounters them, but to improve efficiency the browser also runs a preload scanner concurrently. If there are tags such as img or link in the HTML, the preload scanner sends the corresponding requests to the network thread of the Browser Process so that the resources are downloaded early.

Download and execute JavaScript

While building the DOM, if the parser encounters a <script> tag, it stops parsing the HTML and loads and executes the JS code first, because the JS code may change the structure of the DOM (for example through an API such as document.write()).

However, developers have several ways to tell the browser how to handle a resource. For example, if the async or defer attribute is added to the <script> tag, the browser loads the JS code asynchronously without blocking HTML parsing.

Style calculation

The DOM tree is only the structure of the page. To know what the page looks like, we also need the style of each DOM node. When the main thread encounters a <style> tag or the CSS resource of a <link> tag while parsing the page, it loads the CSS code and uses it to determine the computed style of each DOM node.

The computed style is the concrete style that each DOM element should have, as worked out by the main thread from the CSS selectors. Even if your page does not define any styles of its own, the browser still applies its default styles.
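A very reduced sketch of style calculation: start from browser defaults and apply matching author rules on top. Only tag selectors are supported, and the `UA_DEFAULTS` table is a small hypothetical stand-in for the browser's default stylesheet; real style calculation also handles specificity, inheritance and many more selector types.

```javascript
// Hypothetical user-agent defaults; real default stylesheets are far larger.
const UA_DEFAULTS = {
  h1: { display: 'block', 'font-size': '2em', 'font-weight': 'bold' },
  p: { display: 'block', 'font-size': '1em' },
};

// Author rules in source order; only tag selectors are supported in this sketch.
const authorRules = [
  { selector: 'p', declarations: { color: 'gray' } },
  { selector: 'h1', declarations: { 'font-size': '3em' } },
];

function computedStyle(tag) {
  const style = { ...(UA_DEFAULTS[tag] || {}) };
  for (const rule of authorRules) {
    if (rule.selector === tag) Object.assign(style, rule.declarations); // later rules win
  }
  return style;
}

console.log(computedStyle('h1'));
console.log(computedStyle('p'));
```

Note that `h1` keeps its default bold weight even though the author only set the font size, which mirrors the point that default styles apply when the page does not override them.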

Layout

Once the DOM tree and the computed styles are available, we still need to know where each node sits on the page. Layout is the process of working out the geometry of all the elements.

The main thread walks the DOM and the computed styles of its elements and builds a layout tree (sometimes called the render tree) that holds the page coordinates of each element and the size of its box. During this traversal, hidden elements (display: none) are skipped. Conversely, pseudo-elements do not appear in the DOM, but they do appear in the layout tree.
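The traversal can be sketched with a toy block layout in which every visible node is a full-width box stacked below the previous one. The node shape (`tag`, `style`, `children`) and the numeric `height` values are assumptions of this sketch; real layout has to handle inline content, floats, flexbox and much more.

```javascript
// Toy block layout: every visible node is a full-width box stacked vertically.
function buildLayoutTree(node, viewportWidth, y = 0) {
  if (node.style.display === 'none') return null; // hidden subtree is skipped entirely
  const box = { tag: node.tag, x: 0, y, width: viewportWidth, height: 0, children: [] };
  let cursor = y;
  for (const child of node.children || []) {
    const childBox = buildLayoutTree(child, viewportWidth, cursor);
    if (childBox) {
      box.children.push(childBox);
      cursor += childBox.height;
    }
  }
  // A leaf uses its own height; a container is as tall as its stacked children.
  box.height = box.children.length > 0 ? cursor - y : node.style.height || 0;
  return box;
}

const styledDom = {
  tag: 'body', style: {}, children: [
    { tag: 'h1', style: { height: 40 } },
    { tag: 'div', style: { display: 'none', height: 500 } },
    { tag: 'p', style: { height: 20 } },
  ],
};

const layout = buildLayoutTree(styledDom, 800);
console.log(layout.children.map((box) => `${box.tag}@y=${box.y}`)); // [ 'h1@y=0', 'p@y=40' ]
```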

Paint

After layout, we know the structure, the style and the geometry of the elements, but to draw the page we also need to know the order in which the elements are painted. In the paint stage, the main thread walks the layout tree and generates a series of paint records. A paint record can be seen as a note that records the order in which the elements are painted.
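Why paint order is not simply document order can be sketched with z-index. In the sketch below, the `zIndex` and `name` fields and the two-step "background, then text" records are assumptions for illustration; the real rules for paint order are considerably more detailed.

```javascript
// Produces an ordered list of paint steps from a flat list of laid-out boxes.
// Assumption: only z-index and document order matter here.
function buildPaintRecords(boxes) {
  return boxes
    .map((box, documentOrder) => ({ box, documentOrder }))
    .sort((a, b) =>
      ((a.box.zIndex || 0) - (b.box.zIndex || 0)) || (a.documentOrder - b.documentOrder))
    .flatMap(({ box }) => [`background of ${box.name}`, `text of ${box.name}`]);
}

const records = buildPaintRecords([
  { name: 'dialog', zIndex: 10 }, // first in the document, but painted last
  { name: 'header', zIndex: 0 },
  { name: 'article', zIndex: 0 },
]);
console.log(records);
```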

Compositing

We now have the document structure, the style of each element, the geometry of the elements and the paint order. To draw the page, this information has to be converted into pixels on the display; this conversion is called rasterizing.

The simplest way to draw a page is to rasterize only the content inside the viewport. If the user scrolls the page, the browser moves the rastered frame and rasterizes more content to fill in the missing part.

[Figure: the simplest rasterization process]

The first version of Chrome used this simple approach. Its drawback is that every time the page scrolls, the raster thread has to rasterize the new content that moves into view, which costs performance. To optimize this, Chrome adopted a more sophisticated approach called compositing.

So, what is compositing? Compositing is a technique that divides a page into several layers, rasterizes them separately, and finally merges them into a single page in a separate thread, the compositor thread. When the user scrolls the page, all the layers have already been rasterized, so all the browser needs to do is composite a new frame to show the effect of scrolling. Animations work in a similar way: the browser just moves layers and composites a new frame.

To implement compositing, the elements have to be assigned to layers. The main thread traverses the layout tree to create a layer tree (Layer Tree). An element with the will-change CSS property is treated as a separate layer; for elements without it, the browser decides case by case whether to put them in a separate layer.

You may be tempted to give every element on the page its own layer, but once the number of layers exceeds a certain point, compositing the layers becomes slower than rasterizing a small portion of the page in each frame. This is why it is important to measure the rendering performance of your application.

Once the layer tree has been created and the paint order determined, the main thread passes this information to the compositor thread, which then rasterizes each layer. A layer can be as large as the entire page, so the compositor thread splits layers into tiles and sends the tiles to a set of raster threads for rasterization. The raster result of each tile is stored in GPU memory.

To optimize the display experience, the compositor thread can prioritize the raster work so that tiles in or near the viewport are rasterized first.
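Tiling and prioritization can be sketched as two small functions: one cuts a layer into tiles, the other orders the tiles by their distance from the viewport. The tile size of 256 pixels, the vertical-only distance and all function names are assumptions of this sketch rather than Chrome's actual values.

```javascript
// Splits a layer into fixed-size tiles; edge tiles are clipped to the layer bounds.
function tileLayer(layerWidth, layerHeight, tileSize = 256) {
  const result = [];
  for (let y = 0; y < layerHeight; y += tileSize) {
    for (let x = 0; x < layerWidth; x += tileSize) {
      result.push({
        x,
        y,
        width: Math.min(tileSize, layerWidth - x),
        height: Math.min(tileSize, layerHeight - y),
      });
    }
  }
  return result;
}

// Vertical distance from a tile to the viewport; 0 means the tile overlaps the viewport.
function distanceToViewport(tile, viewport) {
  if (tile.y + tile.height <= viewport.top) return viewport.top - (tile.y + tile.height);
  if (tile.y >= viewport.bottom) return tile.y - viewport.bottom;
  return 0;
}

// Orders raster work so that tiles in or near the viewport come first.
function prioritizeTiles(tileList, viewport) {
  return [...tileList].sort(
    (a, b) => distanceToViewport(a, viewport) - distanceToViewport(b, viewport));
}

const tiles = tileLayer(512, 1000);
const queue = prioritizeTiles(tiles, { top: 520, bottom: 760 });
console.log(tiles.length, queue[0], queue[queue.length - 1]);
```

With a 512x1000 layer this produces eight tiles; the two tiles that overlap the viewport are queued first and the tiles at the very top of the page last.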

When the tiles of a layer have been rasterized, the compositor thread gathers information about the tiles, called draw quads, and uses it to build a compositor frame.

Draw quad: contains information such as the location of the tile in memory and the position of the tile on the page after layer compositing.

Compositor frame: a collection of draw quads that represents one frame of the page.

After all the above steps are completed, the compositor thread commits a rendered frame to the browser process through IPC. At this point, another compositor frame may be added by the UI thread of the browser process to change the browser's own UI. These compositor frames are sent to the GPU and displayed on the screen. If the compositor thread receives a scroll event, it builds another compositor frame and sends it to the GPU to update the page.

The advantage of compositing is that the main thread is not involved, so the compositor thread does not have to wait for style calculation or JavaScript execution to finish. This is why compositing-only animations (https://www.html5rocks.com/en/tutorials/speed/high-performance-animations/) are the smoothest. If an animation involves layout or paint changes, the main thread has to recalculate, which is naturally much slower.

Browser handling of events

When the page has been rendered, an interactive web page is displayed in the tab, and the user can move the mouse, click on the page, and so on. How does the browser handle these events?

Take the click event as an example. When the user clicks on the page, the Browser Process is the first to receive the event, but it only knows the type and location of the event. Handling the click is the job of the Renderer Process for that tab. So the Browser Process passes the event information on to the rendering process, which finds the target object (target) from the coordinates of the event and runs the listener bound to the click event of that target.

The compositor thread in the rendering process receives the events

As mentioned earlier, the compositor thread can create compositor frames from rasterized layers independently of the main thread. Take page scrolling: if no event listener is bound to the scroll, the compositor thread creates the frame on its own; if one is bound, the compositor thread has to wait for the main thread to handle the event before it can create the frame. So how does the compositor thread decide whether an event needs to be routed to the main thread?

Because executing JS is the main thread's job, when the page is composited the compositor thread marks the areas of the page that have event handlers attached as non-fast scrollable regions. If an event occurs inside one of these regions, the compositor thread sends the event information to the main thread and waits for it to handle the event. If the event occurs outside these regions, the compositor thread composites the new frame directly without waiting for the main thread.
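The routing decision boils down to a point-in-rectangle check against the marked regions. In this sketch the regions are plain rectangles and `routeInput` returns a label; both are hypothetical simplifications of what the compositor thread tracks.

```javascript
function contains(rect, point) {
  return point.x >= rect.x && point.x < rect.x + rect.width &&
         point.y >= rect.y && point.y < rect.y + rect.height;
}

// Decides, per input event, whether the compositor can proceed alone.
function routeInput(point, nonFastScrollableRegions) {
  const mustAskMainThread = nonFastScrollableRegions.some((rect) => contains(rect, point));
  return mustAskMainThread ? 'main-thread' : 'compositor-only';
}

// Hypothetical page: only a 300x200 widget in the top-left corner has a touch handler.
const regions = [{ x: 0, y: 0, width: 300, height: 200 }];
console.log(routeInput({ x: 50, y: 50 }, regions));  // main-thread
console.log(routeInput({ x: 50, y: 900 }, regions)); // compositor-only
```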

Because of how non-fast scrollable regions are marked, developers need to be careful with globally bound event handlers. For example, with event delegation we let the root element body handle the events of a target element. The code is as follows:

document.body.addEventListener('touchstart', event => {
    // area: the element whose touches we want to intercept (defined elsewhere)
    if (event.target === area) {
        event.preventDefault();
    }
});

From the developer's point of view this code is fine. From the browser's point of view, however, it attaches an event listener to the body element, which means that the entire page is marked as a non-fast scrollable region. The compositor thread then has to communicate with the main thread and wait for its answer every time the user triggers the event, even in areas of the page that have no handler of their own. The smooth mode in which the compositor thread builds frames independently no longer applies.

This situation is easy to deal with: pass passive: true in the options when adding the event listener. It tells the browser that you still want to listen to the event on the main thread, but that the compositor thread can go ahead and create a new frame without waiting for it. Note that a passive listener cannot cancel the event, so the preventDefault() call below has no effect.

document.body.addEventListener('touchstart', event => {
    if (event.target === area) {
        event.preventDefault(); // ignored: the listener is passive
    }
}, { passive: true });

Find the target object of the event (event target)

When the compositor thread receives the event information and determines that the event lies in a non-fast scrollable region, it sends the event information to the main thread. The first thing the main thread does with it is to find the target object of the event through a hit test. The hit test walks the paint records generated in the paint stage to find the element that contains the coordinates at which the event occurred.
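A hit test over paint records can be sketched as a reverse walk: whatever was painted last is on top, so it is checked first. The record shape (`element`, `rect`) is an assumption of this sketch.

```javascript
// Walks paint records from last painted (topmost) to first and returns the first hit.
function hitTest(paintRecords, x, y) {
  for (let i = paintRecords.length - 1; i >= 0; i -= 1) {
    const { element, rect } = paintRecords[i];
    const inside = x >= rect.x && x < rect.x + rect.width &&
                   y >= rect.y && y < rect.y + rect.height;
    if (inside) return element;
  }
  return null; // nothing was painted at this point
}

const paintRecords = [
  { element: 'body', rect: { x: 0, y: 0, width: 800, height: 600 } },
  { element: 'button', rect: { x: 100, y: 100, width: 80, height: 30 } },
];

console.log(hitTest(paintRecords, 120, 110)); // button
console.log(hitTest(paintRecords, 10, 10));   // body
```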

Optimization of events by browser

Generally speaking, the screen refreshes 60 times per second (60fps), but some events fire more often than that, such as wheel, mousewheel, mousemove, pointermove and touchmove. These continuous events are typically triggered 60 to 120 times per second. If every one of them were sent to the main thread, the main thread would run far more hit tests and JS code than the screen can display, which wastes performance.

To optimize this, the browser coalesces these continuous events and delays dispatching them until just before the next requestAnimationFrame, that is, until the next frame is about to be rendered.

[Figure: the same event timeline, with the events coalesced and delayed]

Discrete events, such as keydown, keyup, mousedown, mouseup, touchstart and touchend, are dispatched to the main thread immediately.
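The coalescing idea can be sketched as a pure function over a list of timestamped events. The event shape, the fixed 60fps frame length and the choice to flush pending continuous events before a discrete one (to preserve ordering) are assumptions of this sketch, not a description of Chrome's scheduler.

```javascript
const CONTINUOUS = new Set(['wheel', 'mousewheel', 'mousemove', 'pointermove', 'touchmove']);

// Returns the events the main thread would see.
// Continuous events keep only the latest one per type per frame; discrete events all pass.
function coalesce(events, frameMs = 1000 / 60) {
  const delivered = [];
  const pending = new Map(); // type -> latest continuous event in the current frame
  let currentFrame = -1;

  const flush = () => {
    delivered.push(...pending.values());
    pending.clear();
  };

  for (const event of events) {
    const frame = Math.floor(event.time / frameMs);
    if (frame !== currentFrame) {
      flush(); // frame boundary: deliver the merged events
      currentFrame = frame;
    }
    if (CONTINUOUS.has(event.type)) {
      pending.set(event.type, event); // overwrite the older event of the same type
    } else {
      flush(); // keep ordering: pending moves go out before the discrete event
      delivered.push(event);
    }
  }
  flush();
  return delivered;
}

const raw = [
  { type: 'mousemove', time: 1, x: 10 },
  { type: 'mousemove', time: 5, x: 12 },
  { type: 'mousemove', time: 9, x: 15 },
  { type: 'mousedown', time: 10, x: 15 },
  { type: 'mousemove', time: 20, x: 18 },
];
console.log(coalesce(raw).map((e) => `${e.type}@${e.time}`));
// [ 'mousemove@9', 'mousedown@10', 'mousemove@20' ]
```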
