What happens to the page after entering a URL

2025-07-04 Update. From: SLTechnology News&Howtos, Shulou (Shulou.com), 06/01 Report

This article explains what the browser does with a page after a URL is entered: how it builds the DOM tree, computes styles, and generates the layout tree.

Build a DOM tree

Browsers cannot work directly with an HTML string, so they convert the incoming byte stream into a meaningful, easy-to-manipulate data structure: the DOM tree. The DOM tree is essentially an n-ary tree whose root node is document.

So how is the HTML parsed?

The nature of HTML grammar

First of all, we should be clear that the grammar of HTML is not a context-free grammar.

It is therefore worth discussing what a context-free grammar is.

Compiler theory gives a precise definition:

If every production rule of a formal grammar G = (N, Σ, P, S) has the form V -> w, where V ∈ N and w ∈ (N ∪ Σ)*, the grammar is called context-free.

The components of G = (N, Σ, P, S) are:

N is the set of nonterminal symbols (as the name implies, symbols that are not final and can still be rewritten; the same applies below).

Σ is the set of terminal symbols.

P is the set of productions, such as S -> aSb.

S is the start symbol, and it must belong to N, that is, it is a nonterminal.

Informally, a grammar is context-free when the left-hand side of every production is a single nonterminal.

If that is still a little confusing, an example helps.

For example:

A -> B

In this grammar the left-hand side of every production is a single nonterminal, so the grammar is context-free. Whatever x and y are, xAy can always be rewritten to xBy; equivalently, xBy can always be reduced to xAy.
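To make the idea concrete, here is a minimal sketch of a recognizer for the production S -> aSb mentioned above. The empty alternative (S -> ε) is an assumption added so that derivations can terminate, and the function names are hypothetical. The point to notice is that the rule for S is applied without ever looking at the symbols around it.

```typescript
// Recursive-descent recognizer for the context-free grammar
//   S -> a S b | ε
// (the ε alternative is an assumption added so that derivations terminate).
// Returns the position reached after matching S starting at `pos`.
function matchS(input: string, pos: number): number {
  // Try the production S -> a S b.
  if (input.charAt(pos) === "a") {
    const afterInner = matchS(input, pos + 1);
    if (input.charAt(afterInner) === "b") {
      return afterInner + 1;
    }
  }
  // Otherwise fall back to S -> ε, which consumes nothing.
  return pos;
}

// A string belongs to the language when S consumes all of it.
function isInLanguage(input: string): boolean {
  return matchS(input, 0) === input.length;
}
```

With these definitions, isInLanguage("aabb") is true and isInLanguage("aab") is false.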

Now a counterexample:

aA -> B

Aa -> B

This grammar is not context-free. Here A can be rewritten only when an a stands immediately to its left or right. Whether a rule applies therefore depends on the surrounding symbols; in other words, the grammar is context-dependent.

As for why HTML is not context-free: standard, well-formed HTML does fit a context-free grammar, and the non-context-free behavior shows up in how non-standard markup is handled. A single counterexample is enough to show this.

Take the form tag. When the parser scans a form start tag, a context-free grammar would simply create the DOM object for the form. That is not what happens in real HTML5 parsing: the parser checks the context of the form tag. If the parent of the form tag is also a form, the current form tag is skipped; otherwise the DOM object is created.
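The context check can be sketched as follows. This is a hypothetical helper, not code from any browser, and it assumes the parser can see the names of the currently open ancestor tags. It also treats any open form ancestor as a reason to skip, not only the direct parent, which is a simplifying assumption of the sketch.

```typescript
// Hypothetical helper, not taken from any browser's source.
// openTagNames lists the currently open ancestors, outermost first.
function shouldCreateElement(tagName: string, openTagNames: string[]): boolean {
  // Assumption of this sketch: any open form ancestor counts,
  // not only the direct parent.
  if (tagName === "form" && openTagNames.indexOf("form") !== -1) {
    return false;
  }
  return true;
}
```

The decision for the same form token differs depending on what is already open, which is exactly what a context-free rule cannot express.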

Conventional programming languages are context-free, but HTML is not. Because of this, an HTML parser cannot be built with the parsing techniques used for conventional programming languages and has to take a different approach.

Parsing algorithm

The HTML5 specification describes the parsing algorithm in detail. The algorithm has two stages:

Tokenization.

Tree construction.

These two stages correspond to lexical analysis and syntax analysis.

Tokenization algorithm

This algorithm takes HTML text as input and produces HTML tokens as output, so it is also called the tokenizer. It is implemented as a finite state machine: in the current state, receiving one or more characters moves the machine to the next state.

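A simple example demonstrates the tokenization process:

<html><body>Hello sanyuan</body></html>

On encountering <, the state becomes tag open. On receiving a character in [a-z], the tokenizer enters the tag name state and stays there until it encounters >, which indicates that the tag name is complete; the state then returns to the data state.

The body tag is then handled in the same way.

At this point the html and body start tags have both been recorded.

At the > of <body>, the tokenizer enters the data state and stays in it while receiving the following characters, Hello sanyuan.

Next it receives the < of </body> and returns to the tag open state. The following / marks an end tag, so an end tag token is created. The tokenizer then enters the tag name state, and on > it returns to the data state.

</html> is then handled in the same way.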
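The walkthrough above can be sketched as a small state machine. This is a deliberately reduced illustration, not the specification's tokenizer: it assumes only three states (data, tag open, tag name), ignores attributes, comments, doctypes, and malformed input, and the type and function names are hypothetical.

```typescript
type HtmlToken =
  | { kind: "startTag"; tagName: string }
  | { kind: "endTag"; tagName: string }
  | { kind: "text"; data: string };

type TokenizerState = "data" | "tagOpen" | "tagName";

function tokenizeHtml(html: string): HtmlToken[] {
  const tokens: HtmlToken[] = [];
  let state: TokenizerState = "data";
  let textBuffer = "";
  let tagBuffer = "";
  let isEndTag = false;

  for (let i = 0; i < html.length; i++) {
    const ch = html.charAt(i);
    switch (state) {
      case "data":
        if (ch === "<") {
          // Flush any pending text before a tag starts.
          if (textBuffer.length > 0) {
            tokens.push({ kind: "text", data: textBuffer });
            textBuffer = "";
          }
          state = "tagOpen";
        } else {
          textBuffer += ch;
        }
        break;
      case "tagOpen":
        if (ch === "/") {
          // A slash right after "<" marks an end tag.
          isEndTag = true;
        } else {
          tagBuffer = ch;
          state = "tagName";
        }
        break;
      case "tagName":
        if (ch === ">") {
          // The tag name is complete: emit the token, return to data.
          tokens.push(
            isEndTag
              ? { kind: "endTag", tagName: tagBuffer }
              : { kind: "startTag", tagName: tagBuffer }
          );
          tagBuffer = "";
          isEndTag = false;
          state = "data";
        } else {
          tagBuffer += ch;
        }
        break;
    }
  }
  // Emit text that was still buffered when the input ended.
  if (textBuffer.length > 0) {
    tokens.push({ kind: "text", data: textBuffer });
  }
  return tokens;
}
```

For the input <html><body>Hello sanyuan</body></html>, this sketch produces five tokens: two start tags, one text token, and two end tags.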

Tree construction algorithm

As mentioned earlier, the DOM tree is an n-ary tree with document as its root node, so the parser first creates a document object. The tokenizer sends each token to the tree builder. When the builder receives a token, it creates the corresponding DOM object and then does two things:

Add the DOM object to the DOM tree.

Push the corresponding element onto the stack of open elements (open here means that the closing tag has not been seen yet).

Take the following example:

<html><body>Hello sanyuan</body></html>

First, the builder is in the initial state.

When it receives the html start tag from the tokenizer, the state changes to before html. The builder creates an HTMLHtmlElement, adds it to the document root object, and pushes it onto the stack of open elements.

The state then automatically changes to before head. The next token from the tokenizer is the body start tag, which indicates that there is no head, so the builder automatically creates an HTMLHeadElement and adds it to the DOM tree.

The builder now enters the in head state and then jumps straight to after head.

The tokenizer now delivers the body start tag; the builder creates an HTMLBodyElement, inserts it into the DOM tree, and pushes it onto the stack of open elements.

The state then changes to in body, and the builder receives the following characters: Hello sanyuan. When the first character arrives, a Text node is created with that character and inserted under the body element in the DOM tree. Subsequent characters are appended to the same Text node.

Next, the tokenizer delivers the body end tag, and the state changes to after body.

Finally, the tokenizer delivers the html end tag, the state changes to after after body, and parsing ends.
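The two actions of the builder (add to the tree, push onto the stack of open elements) and the implied head can be sketched as below. This is a simplified illustration under stated assumptions: it does not model the insertion modes named above, it pops one element per end tag without checking the name, and all type and function names are hypothetical.

```typescript
type BuilderToken =
  | { kind: "startTag"; tagName: string }
  | { kind: "endTag"; tagName: string }
  | { kind: "text"; data: string };

interface SketchNode {
  nodeName: string;
  text: string; // only meaningful for "#text" nodes
  children: SketchNode[];
}

function makeNode(nodeName: string, text = ""): SketchNode {
  return { nodeName, text, children: [] };
}

function buildTree(tokens: BuilderToken[]): SketchNode {
  const documentNode = makeNode("#document");
  const openElements: SketchNode[] = [documentNode];
  let current: SketchNode = documentNode;
  let sawHead = false;

  for (const token of tokens) {
    if (token.kind === "startTag") {
      if (token.tagName === "head") {
        sawHead = true;
      }
      if (token.tagName === "body" && !sawHead) {
        // body arrived without a head: create an implied, empty head.
        current.children.push(makeNode("head"));
        sawHead = true;
      }
      const element = makeNode(token.tagName);
      current.children.push(element); // 1. add to the DOM tree
      openElements.push(element); // 2. push onto the stack of open elements
      current = element;
    } else if (token.kind === "endTag") {
      // Simplification: close the innermost open element.
      if (openElements.length > 1) {
        openElements.pop();
      }
      current = openElements[openElements.length - 1] || documentNode;
    } else {
      // Characters extend the trailing Text node, or create a new one.
      const last = current.children[current.children.length - 1];
      if (last !== undefined && last.nodeName === "#text") {
        last.text += token.data;
      } else {
        current.children.push(makeNode("#text", token.data));
      }
    }
  }
  return documentNode;
}

// Compact textual outline of a tree, used here only for inspection.
function outline(node: SketchNode): string {
  if (node.children.length === 0) {
    return node.nodeName;
  }
  return node.nodeName + "(" + node.children.map((c) => outline(c)).join(",") + ")";
}
```

Feeding it the tokens of the example (html start, body start, the text, body end, html end) gives the outline #document(html(head,body(#text))), with the head element created even though the input never contained one.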

Fault tolerance

No discussion of the HTML5 specification is complete without its very forgiving error handling. Opinions on this leniency differ, but I think a senior front-end engineer should know what the HTML parser does to tolerate errors.

Below are some classic examples of fault tolerance in WebKit; additions are welcome.

1. Using </br> instead of <br>

if (t->isCloseTag(brTag) && m_document->inCompatMode()) {
    reportError(MalformedBRError);
    t->beginTag = true;
}

Every </br> is converted to the <br> form.

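2. Stray tables

A stray table is a table nested directly inside another table rather than inside a table cell:

<table>
  <table>
    <tr><td>inner table</td></tr>
  </table>
  <tr><td>outer table</td></tr>
</table>

WebKit automatically converts this to:

<table>
  <tr><td>outer table</td></tr>
</table>
<table>
  <tr><td>inner table</td></tr>
</table>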
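The effect of this conversion can be sketched as a transformation on an already-built tree. Note the assumption: WebKit performs the fix while parsing, not as a separate pass, so the code below only illustrates the resulting structure. The node type and function name are hypothetical.

```typescript
interface TableSketchNode {
  nodeName: string;
  children: TableSketchNode[];
}

// Moves every table that is a direct child of another table out of it
// and re-inserts it as the next sibling of the outer table.
function hoistStrayTables(parent: TableSketchNode): void {
  let index = 0;
  while (index < parent.children.length) {
    const child = parent.children[index];
    if (child.nodeName === "table") {
      const stray = child.children.filter((n) => n.nodeName === "table");
      child.children = child.children.filter((n) => n.nodeName !== "table");
      // Hoisted tables land right after the outer table, so the loop
      // visits them next and can fix deeper nesting as well.
      parent.children.splice(index + 1, 0, ...stray);
    }
    hoistStrayTables(child);
    index += 1;
  }
}
```

A table nested inside a td is left alone, because it is not a direct child of the outer table.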

3. Nested form elements

A form nested inside another form is simply ignored.

Style calculation

CSS styles generally come from three sources:

Style sheets referenced by link tags

Styles in style tags

The inline style attribute of an element

Formatting the style sheets

Browsers cannot work directly with CSS text either, so the first thing the rendering engine does when it receives CSS text is convert it into a structured object, styleSheets.

This formatting process is complex, and different browsers apply different optimization strategies, so it is not covered here.

The resulting structure can be inspected in the browser console through document.styleSheets. It lists the style sheets loaded through link tags and style tags (inline style attributes are exposed on each element's style property instead) and provides the basis for the style computations that follow.
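To give a feel for what "converting CSS text into a structured object" means, here is a toy sketch. It is not how any rendering engine parses CSS: it assumes flat rules of the form selector { property: value; }, with no comments, at-rules, or nesting, and the names ParsedRule and parseCssText are hypothetical.

```typescript
interface ParsedRule {
  selector: string;
  declarations: Record<string, string>;
}

// Toy conversion of CSS text into structured rules.
// Assumes flat "selector { prop: value; }" rules only.
function parseCssText(cssText: string): ParsedRule[] {
  const rules: ParsedRule[] = [];
  const rulePattern = /([^{}]+)\{([^{}]*)\}/g;
  let match: RegExpExecArray | null = rulePattern.exec(cssText);
  while (match !== null) {
    const selector = match[1].trim();
    const declarations: Record<string, string> = {};
    for (const part of match[2].split(";")) {
      const colon = part.indexOf(":");
      if (colon === -1) {
        continue; // skip empty fragments such as trailing whitespace
      }
      declarations[part.slice(0, colon).trim()] = part.slice(colon + 1).trim();
    }
    rules.push({ selector, declarations });
    match = rulePattern.exec(cssText);
  }
  return rules;
}
```

For example, parseCssText("p { color: red; font-weight: bold }") yields one rule whose selector is p and whose declarations map color to red and font-weight to bold.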

Standardizing style attributes

Some CSS values are not easy for the rendering engine to work with, so they are standardized before styles are computed, for example em -> px, red -> #ff0000, bold -> 700.
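The three conversions mentioned above can be sketched as a single function. This is an illustration only: the lookup tables are tiny, the function name is hypothetical, and the em conversion assumes the caller supplies the reference font size in pixels.

```typescript
// Deliberately small lookup tables; real engines know many more keywords.
const NAMED_COLORS: Record<string, string> = {
  red: "#ff0000",
  white: "#ffffff",
  black: "#000000",
};

const FONT_WEIGHT_KEYWORDS: Record<string, string> = {
  normal: "400",
  bold: "700",
};

// Standardizes one CSS value. referenceFontSizePx is an assumption of this
// sketch: the font size, in pixels, that em units are resolved against.
function standardizeValue(property: string, value: string, referenceFontSizePx: number): string {
  const emMatch = /^(\d*\.?\d+)em$/.exec(value);
  if (emMatch !== null) {
    return `${parseFloat(emMatch[1]) * referenceFontSizePx}px`; // em -> px
  }
  if (property === "font-weight") {
    const weight = FONT_WEIGHT_KEYWORDS[value];
    if (typeof weight === "string") {
      return weight; // bold -> 700
    }
  }
  if (property === "color") {
    const hex = NAMED_COLORS[value];
    if (typeof hex === "string") {
      return hex; // red -> #ff0000
    }
  }
  return value; // already in a standard form, or unknown to this sketch
}
```

With a reference font size of 16, standardizeValue("font-size", "2em", 16) returns 32px, standardizeValue("color", "red", 16) returns #ff0000, and standardizeValue("font-weight", "bold", 16) returns 700.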

Computing the specific style of each node

Once the styles have been formatted and standardized, the specific style of each node can be computed.

The computation itself is not complicated. It follows two main rules: inheritance and cascading.

By default, each child node inherits the inheritable style properties of its parent node (color and font-size, for example); if a property is not found on the parent, the browser default style, also known as the user agent style, is used. This is the inheritance rule, and it is easy to understand.

Then there is the cascading rule. The defining feature of CSS is the cascade: the final style is the combined effect of many declarations, which sometimes produces surprising results. Readers of the book "CSS World" will be familiar with this. The detailed cascading rules belong to an in-depth study of CSS and are not covered here.

Note, however, that once the styles have been computed, the final values can be read through window.getComputedStyle, so the computed style of any element is available from JS.
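The two rules can be sketched on a toy tree. Several simplifications are assumed here: every property is treated as inherited, the cascade is reduced to "later declarations override earlier ones" with no specificity or origin handling, and the types, the default values, and the function name are hypothetical.

```typescript
type StyleMap = Record<string, string>;

interface StyledSketchNode {
  tagName: string;
  // Declarations that matched this node, lowest priority first.
  matchedDeclarations: StyleMap[];
  children: StyledSketchNode[];
}

interface ComputedSketchNode {
  tagName: string;
  computed: StyleMap;
  children: ComputedSketchNode[];
}

// Assumed stand-in for the user agent style sheet.
const USER_AGENT_DEFAULTS: StyleMap = { color: "#000000", "font-size": "16px" };

function computeStyleTree(node: StyledSketchNode, inherited: StyleMap): ComputedSketchNode {
  // Inheritance: start from the values computed for the parent.
  let computed: StyleMap = { ...inherited };
  // Cascading (simplified): later declarations override earlier ones.
  for (const declarations of node.matchedDeclarations) {
    computed = { ...computed, ...declarations };
  }
  const finalStyle = computed;
  return {
    tagName: node.tagName,
    computed: finalStyle,
    children: node.children.map((child) => computeStyleTree(child, finalStyle)),
  };
}
```

Calling computeStyleTree(root, USER_AGENT_DEFAULTS) on a body that sets color to #ff0000 gives every descendant that color unless one of its own declarations overrides it, while font-size falls back to the assumed default.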

Generating the layout tree

Now that the DOM tree and the computed styles are available, the next step is for the browser's layout system to determine the position of each element, that is, to generate a layout tree (Layout Tree).

Layout tree generation roughly works as follows:

Traverse the nodes of the generated DOM tree and add them to the layout tree.

Compute the coordinates of each layout tree node.

Note that the layout tree contains only visible elements. The head element and elements with display: none are not placed in it.
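The traversal step, including the filtering of invisible nodes, can be sketched as below. Computing coordinates is left out because it depends on the full layout algorithm. The node shapes and the function name are hypothetical, and each input node is assumed to already carry its computed display value.

```typescript
interface RenderSourceNode {
  tagName: string;
  display: string; // computed value of the display property
  children: RenderSourceNode[];
}

interface LayoutSketchNode {
  tagName: string;
  children: LayoutSketchNode[];
}

// Copies visible nodes into a layout tree; returns null for skipped nodes.
function buildLayoutTree(node: RenderSourceNode): LayoutSketchNode | null {
  if (node.tagName === "head" || node.display === "none") {
    return null; // the whole subtree is left out of the layout tree
  }
  const children: LayoutSketchNode[] = [];
  for (const child of node.children) {
    const layoutChild = buildLayoutTree(child);
    if (layoutChild !== null) {
      children.push(layoutChild);
    }
  }
  return { tagName: node.tagName, children };
}
```

For an html node containing a head and a body whose children are a hidden div and a visible p, the result keeps only html, body, and p.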

Some people say that a Render Tree, or rendering tree, is generated first. That was the case before 2016; since then the Chrome team has done a lot of refactoring, and there is no longer a separate step that generates a Render Tree. The layout tree now carries all of that information and serves the purpose the Render Tree used to serve.

Layout details are not covered here because they are complex and would bloat the article; in most cases it is enough to know what the layout stage does. To go deeper into how it is done, I highly recommend the Renren FED team's article on how browsers lay out pages, as seen from the Chrome source code.
