How to probe deeply into browser parsing and XSS

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains in detail how the browser parses and decodes HTML, URLs, and JavaScript, and what that means for XSS. It is shared here for reference; after reading it you should have a working understanding of the topic.

1. Basic knowledge you need to know

Why encode?

Mainly because some data is not suitable for transmission as-is. There are several reasons: the data may be too large, or it may contain private information. Another important reason is that some characters can cause ambiguity.

For URLs: & is used to separate multiple parameters. If a parameter is written as name=v&lue, it is ambiguous, because the value v&lue of the name parameter itself contains &. Therefore the & in the value must be URL-encoded (as %26). After URL encoding, the server treats the byte written after "%" as ordinary data, not as a delimiter between parameters or key-value pairs.
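The ambiguity can be reproduced with the standard URLSearchParams and encodeURIComponent functions. This is a minimal sketch; the variable names are illustrative and the query string is the name=v&lue case from the text.

```javascript
// The parameter value from the text, containing the delimiter character.
const rawValue = "v&lue";

// Without encoding, the parser splits at "&", so the value is cut short.
const naiveName = new URLSearchParams("name=" + rawValue).get("name");

// With encoding, "&" becomes "%26" and is carried as plain data.
const encodedQuery = "name=" + encodeURIComponent(rawValue);
const safeName = new URLSearchParams(encodedQuery).get("name");

console.log(naiveName);    // v
console.log(encodedQuery); // name=v%26lue
console.log(safeName);     // v&lue
```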

For HTML: when the browser encounters the closing delimiter of a tag, it recognizes it as the end of the element. If an attribute value of the tag itself contains >, this can also cause ambiguity. Therefore the > in the attribute value needs to be HTML-encoded, that is, written as a character entity (&gt;).
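As a rough illustration of HTML encoding, the sketch below replaces the characters that are significant in markup with character entities. The escapeHtml helper is hypothetical (it is not from the article), and the set of five characters is a common choice rather than a complete rule for every context.

```javascript
// Hypothetical helper: replace markup-significant characters with entities.
function escapeHtml(text) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

// An attribute value that contains ">" and quotes.
const encodedValue = escapeHtml('a>b "quoted"');
console.log(encodedValue); // a&gt;b &quot;quoted&quot;
```

The "&" character is escaped as well, so that text which already looks like an entity is not decoded a second time.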

2. Three kinds of encoding

2.1 HTML encoding (character entities)

A character entity is a predefined escape sequence. There are two representations of character entities:

1. A named character entity: & followed by a predefined entity name and a terminating semicolon (;), for example &lt; for <.

3.2 JavaScript parser

Whether Unicode escape sequences of the form \uXXXX (or hex escapes) are decoded depends on where they appear. There are three places in JavaScript where a Unicode escape sequence can appear:

1. In strings

When a Unicode escape sequence appears in a string, it is interpreted as an ordinary character and does not break out of the string context.

For example, alert("\u0031\u0030")

The escaped part is 10. It is inside a string, so it is decoded normally and the JS code executes.
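The string case can be checked directly without a browser. This is a small sketch; the variable names are illustrative.

```javascript
// Escapes inside a string literal decode to ordinary characters.
const escapedDigits = "\u0031\u0030";
console.log(escapedDigits === "10"); // true

// An escaped double quote becomes data: it does not terminate the string.
const withQuote = "a\u0022b";
console.log(withQuote.length); // 3
console.log(withQuote[1]);     // "
```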

2. In identifier names

If a Unicode escape sequence appears in an identifier, that is, in a name such as a variable or function name, it is decoded.

For example, \u0061\u006c\u0065\u0072\u0074(10)

The escaped part is alert, which is the function name. This is the identifier case, so it is decoded normally and the JS code executes.
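The identifier case can be tried outside a browser with an ordinary function. In this sketch, greet is a hypothetical function standing in for alert, and its name is written with \uXXXX escapes at the call site.

```javascript
// Hypothetical function standing in for alert.
function greet(n) {
  return "called with " + n;
}

// \u0067\u0072\u0065\u0065\u0074 spells "greet"; the parser decodes the
// escapes and resolves the same identifier.
const viaEscapes = \u0067\u0072\u0065\u0065\u0074(10);

console.log(viaEscapes); // called with 10
```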

3. In control characters

If a Unicode escape sequence stands in for a control character, it is not interpreted as a control character; it is read only as part of an identifier or as a character in a string. Control characters here are ', ", ( and ), and so on.

For example, alert\u0028"xss");

Here the escaped ( is no longer treated as the opening parenthesis of a call. It is read as part of the identifier (alert followed by the character ( ), which is not a valid name, so the code fails to parse.

Therefore control characters such as the parentheses of a function call cannot be interpreted normally once they are written as Unicode escapes.

Summary: Unicode escape sequences cannot be used in place of control characters; if they are, the code cannot be interpreted.
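One way to observe the difference is to ask the JavaScript engine whether a snippet parses at all. The parses helper below is hypothetical; it relies on new Function compiling its body without running it, so no alert function is needed.

```javascript
// Hypothetical helper: true if the source parses as a function body.
function parses(source) {
  try {
    new Function(source); // compiles the body, does not execute it
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// String.raw keeps the backslashes, so the engine sees the escapes as written.
const escapedName = parses(String.raw`\u0061lert("xss");`);
const escapedParen = parses(String.raw`alert\u0028"xss");`);

console.log(escapedName);  // true
console.log(escapedParen); // false
```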

Example 1:

\u0061\u006c\u0065\u0072\u0074\u0028\u0031\u0031\u0029

The encoded text is alert(11).

The JS in this example will not execute, because the control characters (the parentheses) are encoded.

Example 2:

\u0061\u006c\u0065\u0072\u0074(\u0031\u0032)

The encoded parts are alert and the 12 inside the parentheses.

In this example the JS will not execute, because the encoded part inside the parentheses cannot be interpreted properly. Even though it decodes to digits, it is treated as text rather than as a number (the 12 here is not the integer 12). Either write the number with ASCII digits, or wrap the escapes in "" or '' to make them a string, where they are decoded as ordinary characters.

Example 3:

alert('13\u0027)

The encoded character is '.

The JS in this example will not execute, because a control character is encoded: the decoded ' becomes part of the string and is no longer interpreted as a control character. The string is therefore incomplete, because there is no ' to terminate it.

Example 4:

alert('14\u000a')

The JS in this example will execute, because the encoded part is inside the string. It is interpreted only as an ordinary character (a line feed) and does not break out of the string context.
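The four examples can be run outside a browser by passing a stub in place of alert. This is a sketch under that assumption: the run helper and the stub are hypothetical, and new Function is used only to compile and call each snippet.

```javascript
// Hypothetical runner: compile the snippet with a stub bound to the name alert.
function run(source) {
  const calls = [];
  const stubAlert = (value) => { calls.push(value); };
  try {
    new Function("alert", source)(stubAlert);
    return { ok: true, calls };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

// String.raw preserves the backslashes exactly as written in the examples.
const examples = {
  example1: String.raw`\u0061\u006c\u0065\u0072\u0074\u0028\u0031\u0031\u0029`,
  example2: String.raw`\u0061\u006c\u0065\u0072\u0074(\u0031\u0032)`,
  example3: String.raw`alert('13\u0027)`,
  example4: String.raw`alert('14\u000a')`,
};

const results = {};
for (const [name, source] of Object.entries(examples)) {
  results[name] = run(source);
}

// Examples 1 to 3 fail with a SyntaxError; only example 4 calls the stub.
console.log(results);
```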

Example 5:

This example cannot be executed. Let's look at it from the browser's perspective, starting with what the browser reads first.

In this example, the HTML parser first decodes the character entities in the UserInput part.

Then the JavaScript parser parses the JS in the onclick attribute and executes it.

When the JS runs, the argument of window.open('UserInput') is used as a URL, so the UserInput part is then decoded by the URL parser.

So the parsing order is: HTML parsing -> JavaScript parsing -> URL parsing.
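The order HTML -> JavaScript -> URL can be imitated with three small decoders applied in sequence. This is a simplified sketch, not a model of a real browser: decodeNumericEntities and decodeUnicodeEscapes are hypothetical helpers that handle only numeric character references and \uXXXX escapes, and the input value is made up for illustration.

```javascript
// HTML layer: decode numeric character references (&#NN; and &#xHH;) only.
function decodeNumericEntities(text) {
  return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) =>
    String.fromCodePoint(hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10))
  );
}

// JavaScript string layer: decode \uXXXX escapes only.
function decodeUnicodeEscapes(text) {
  return text.replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

// Hypothetical UserInput: "&#92;" hides a backslash, which together with
// "u0025" hides "%", which together with "20" hides a space.
const userInput = "a&#92;u002520b";

const afterHtml = decodeNumericEntities(userInput); // HTML parser
const afterJs = decodeUnicodeEscapes(afterHtml);    // JavaScript parser
const afterUrl = decodeURIComponent(afterJs);       // URL parser

console.log(afterHtml); // a\u002520b
console.log(afterJs);   // a%20b
console.log(afterUrl);  // a b
```

Because each layer only undoes its own encoding, a value meant to survive the whole chain has to be encoded in the reverse order: URL encoding first, then JavaScript escapes, then HTML entities.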

Example 3:

In this example, the HTML parser first decodes the character entities in the UserInput part.

Then the value of the href attribute is parsed by the URL parser.

Because the scheme is javascript, the result is then parsed by the JavaScript parser.

When the parsed JS executes, the argument of window.open('UserInput') is used as a URL, so it is parsed by the URL parser again.

So the parsing order is: HTML parsing -> URL parsing -> JavaScript parsing -> URL parsing.

Comprehensive example:

First, the HTML parser runs. When it reaches the value of the href attribute, the state machine enters the attribute value state, in which character entities are decoded.

The value is then parsed and decoded by the URL parser.

Finally, because the scheme is javascript, the result is parsed and decoded by the JavaScript parser. The encoded part is the function name, which is an identifier, so it is decoded and interpreted normally.

After these three rounds of parsing and decoding, the browser ends up with the plain, fully decoded script.
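The three rounds can be sketched with a hypothetical href value. The payload below is made up for illustration and is not the article's original example; decodeNumericEntities is a hypothetical helper, and a stub replaces alert so the sketch also runs outside a browser.

```javascript
// Decode numeric character references only (&#NN; and &#xHH;).
function decodeNumericEntities(text) {
  return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) =>
    String.fromCodePoint(hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10))
  );
}

// Hypothetical href value: entities hide "j" and ":", percent-encoding hides
// each backslash, and \uXXXX escapes hide the function name.
const hrefAttribute =
  "&#x6a;avascript&#x3a;%5cu0061%5cu006c%5cu0065%5cu0072%5cu0074(15)";

// Round 1: the HTML parser decodes character entities in the attribute value.
const afterHtml = decodeNumericEntities(hrefAttribute);

// Round 2: the URL parser sees the javascript scheme and percent-decodes the rest.
const scheme = "javascript:";
const afterUrl = afterHtml.startsWith(scheme)
  ? decodeURIComponent(afterHtml.slice(scheme.length))
  : null;

// Round 3: the JavaScript parser decodes the escapes in the identifier
// and calls the function (here, a stub bound to the name alert).
const alertCalls = [];
if (afterUrl !== null) {
  new Function("alert", afterUrl)((value) => { alertCalls.push(value); });
}

console.log(afterHtml);  // javascript:%5cu0061%5cu006c%5cu0065%5cu0072%5cu0074(15)
console.log(afterUrl);   // \u0061\u006c\u0065\u0072\u0074(15)
console.log(alertCalls); // [ 15 ]
```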

This concludes the in-depth look at browser parsing and XSS. I hope the content above is helpful.
