
How to prevent XSS attacks

2025-04-27 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains how XSS attacks work and how to prevent them, with worked examples and analysis from a practitioner's point of view.

Front-end security

With the rapid development of the Internet, information security has become one of the top concerns for enterprises, and the front end is a high-risk area where many enterprise security problems originate. In the mobile Internet era, besides traditional security problems such as XSS and CSRF, front-end developers often run into new ones such as network hijacking and illegal invocation of Hybrid APIs. Browsers keep evolving and introducing new technologies such as CSP and Same-Site Cookies to strengthen security, but many potential threats remain, so front-end developers need to keep finding and plugging the gaps.

In recent years, as Meituan's business has grown rapidly, its front end has faced many security challenges, and the team has accumulated a good deal of practical experience. We have sorted out the common front-end security problems and their solutions into a series, hoping to help front-end developers prevent and fix security vulnerabilities in their daily work.

This article covers XSS, in the following parts:

1. Introduction to XSS attacks

2. Classification of XSS attacks

3. Prevention and detection of XSS attacks

4. Summary of XSS attacks

5. XSS attack cases

Introduction to XSS attacks

Before we begin, here is a question. Please judge whether the following two statements are correct:

1. XSS prevention is the responsibility of back-end developers (RD). The back end should escape sensitive characters in every interface where users submit data before doing anything else with it.

2. Any data to be inserted into a page can be made safe by passing it through one escape function that filters out the common sensitive characters.

If you are not sure about the answers, keep these questions in mind as you read on; we will work through them step by step.

How XSS vulnerabilities arise and how to fix them

An XSS attack means that malicious code has been injected into a page. To make this more concrete, we will use a story about a developer named Xiaoming.

An example

One day, the company needed a search page that displays the search keyword taken from a URL parameter. Xiaoming quickly finished the page and put it online. The code is as follows:

    <input type="text" value="<%= getParameter("keyword") %>">
    <button>Search</button>
    <div>
      The keyword you searched for is: <%= getParameter("keyword") %>
    </div>

However, shortly after launch, Xiaoming received a mysterious link from the security team:

    http://xxx/search?keyword="><script>alert('XSS');</script>

With a sense of foreboding, Xiaoming clicked the link [do not imitate this; only click links you know to be safe]. Sure enough, a dialog box saying "XSS" popped up on the page.

"Damn, I've been hit!" Xiaoming frowned and worked out what had happened:

When the browser requests http://xxx/search?keyword="><script>alert('XSS');</script>, the server parses the request parameter keyword, gets "><script>alert('XSS');</script>, splices it into the HTML, and returns it to the browser. The resulting HTML is:

    <input type="text" value=""><script>alert('XSS');</script>">
    <button>Search</button>
    <div>
      The keyword you searched for is: "><script>alert('XSS');</script>
    </div>

The browser cannot tell that <script>alert('XSS');</script> is malicious code, so it executes it.

Here the payload is injected not only into the content of the div but also into the value attribute of the input, so the alert pops up twice.

How should we defend against this?

The root of the problem is simply that the browser executes the user's input as a script. So we just need to tell the browser that this content is text.

Xiaoming soon found a fix for the vulnerability:

    <input type="text" value="<%= escapeHTML(getParameter("keyword")) %>">
    <button>Search</button>
    <div>
      The keyword you searched for is: <%= escapeHTML(getParameter("keyword")) %>
    </div>

escapeHTML() escapes characters according to the following rules:

| Character | Escaped character |
|-|-|
| & | &amp; |
| < | &lt; |
| > | &gt; |
| " | &quot; |
| ' | &#x27; |
| / | &#x2F; |

After the escape function has processed the input, the final response received by the browser is:

    <input type="text" value="&quot;&gt;&lt;script&gt;alert(&#x27;XSS&#x27;);&lt;&#x2F;script&gt;">
    <button>Search</button>
    <div>
      The keyword you searched for is: &quot;&gt;&lt;script&gt;alert(&#x27;XSS&#x27;);&lt;&#x2F;script&gt;
    </div>

The malicious code has been escaped and is no longer executed by the browser, and the search term is displayed on the page exactly as typed.
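As an illustration, an escape function following the six rules listed earlier might look like the JavaScript sketch below. The page in the story is a server-side template, so the implementation and the name HTML_ESCAPES are assumptions made for this example, not the code Xiaoming actually used.

```javascript
// Illustrative only: maps each of the six characters to its HTML entity.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#x27;',
  '/': '&#x2F;',
};

function escapeHTML(input) {
  // A single pass, so the "&" produced for one entity is never escaped again.
  return String(input).replace(/[&<>"'\/]/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHTML(`"><script>alert('XSS');</script>`));
// &quot;&gt;&lt;script&gt;alert(&#x27;XSS&#x27;);&lt;&#x2F;script&gt;
```

In a real project, the template engine's built-in escaping is usually a better choice than a hand-written function.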

From this incident, Xiaoming learned the following:

Pages usually display user input as text inside a fixed container or attribute.

Attackers use these user-input fragments to splice in specially formatted strings that break out of the original position and form code fragments.

By injecting scripts into the target website so that they run in the user's browser, attackers create a potential risk.

XSS attacks can be prevented with HTML escaping. [Of course it's not that simple! Please read on.]

Watch out for special HTML attributes and JavaScript APIs

Since that incident, Xiaoming has carefully escaped all data inserted into pages. He also discovered that most template engines have an escape setting that escapes all inserted data by default. With that, he no longer has to worry about accidentally leaving a variable unescaped, and his work gradually becomes easier.

However, as the director of this story, I cannot let Xiaoming fix bugs that easily and happily.

Before long, Xiaoming received another mysterious link from the security team: http://xxx/?redirect_to=javascript:alert('XSS'). Not daring to be careless, Xiaoming hurriedly opened the page. This time, though, no evil "XSS" dialog popped up automatically.

Xiaoming opened the source code of the page and found the following:

    <a href="<%= escapeHTML(getParameter("redirect_to")) %>">Jump...</a>

With this code, when the attack URL is http://xxx/?redirect_to=javascript:alert('XSS'), the server response becomes:

    <a href="javascript:alert(&#x27;XSS&#x27;)">Jump...</a>

Although the code does not run immediately, once the user clicks the a tag the browser pops up "XSS".

Damn, caught out again...

Here the user's data does not break out of its position; it is still a well-formed href attribute. But its content is not of the type we expected.

It turns out that it is not only special characters: even a string like javascript: can trigger an XSS attack if it appears in a particular position.

Xiaoming frowned and came up with a solution:

    // Forbid URLs that start with "javascript:"
    xss = getParameter("redirect_to").startsWith('javascript:');
    if (!xss) {
      <a href="<%= escapeHTML(getParameter("redirect_to")) %>">Jump...</a>
    } else {
      <a href="/404">Jump...</a>
    }

As long as the URL doesn't start with javascript:, is it safe?

The security team tossed over another link: http://xxx/?redirect_to=jAvascRipt:alert('XSS')

That works too?... Well, browsers are just that forgiving.

Xiaoming was close to tears. He changed the check so that the user input is first converted to lowercase and then compared against javascript:.

However, as the saying goes, "virtue rises one foot, vice rises ten." Faced with Xiaoming's defence, the security team constructed this link:

    http://xxx/?redirect_to=%20javascript:alert('XSS')

After URL decoding, %20javascript:alert('XSS') becomes " javascript:alert('XSS')", a string that begins with a space. The browser ignores the leading space when it follows the link, so the attacker bypasses the back-end keyword rule and completes the injection.
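To see why the prefix blacklist keeps failing, the following JavaScript sketch replays the story's check against a few inputs. The function name naiveIsSafe and the payload list are made up for this illustration; the last payload (an embedded tab) is an extra case not mentioned in the story.

```javascript
// Naive blacklist, as in the story: lowercase the input, reject a "javascript:" prefix.
function naiveIsSafe(redirectTo) {
  return !redirectTo.toLowerCase().startsWith('javascript:');
}

// Hypothetical attack inputs, already URL-decoded by the server.
const payloads = [
  "javascript:alert('XSS')",
  "jAvascRipt:alert('XSS')",
  " javascript:alert('XSS')",   // leading space (from %20)
  "java\tscript:alert('XSS')",  // embedded tab, which browsers' URL parsing strips
];

for (const p of payloads) {
  console.log(JSON.stringify(p), '->', naiveIsSafe(p) ? 'accepted' : 'rejected');
}
```

The first two inputs are rejected, but the last two are accepted even though a browser would still treat them as javascript: links. Each new variant needs another rule, which is why a blacklist is hard to get right.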

In the end, Xiaoming adopted a whitelist to fix the vulnerability for good:

    // Filter according to the needs of the project: forbid "javascript:" links, illegal schemes, and so on
    allowSchemes = ["http", "https"];
    valid = isValid(getParameter("redirect_to"), allowSchemes);
    if (valid) {
      <a href="<%= escapeHTML(getParameter("redirect_to")) %>">Jump...</a>
    } else {
      <a href="/404">Jump...</a>
    }
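The article does not show how isValid() is implemented. One possible sketch in JavaScript, assuming the standard URL class is available and that only absolute URLs should be accepted, is:

```javascript
// Hypothetical helper: accept only absolute URLs whose scheme is on the whitelist.
function isValid(redirectTo, allowSchemes) {
  let url;
  try {
    // The URL parser trims surrounding whitespace and lowercases the scheme.
    url = new URL(redirectTo);
  } catch (err) {
    return false; // not an absolute URL; this sketch rejects relative paths too
  }
  const scheme = url.protocol.slice(0, -1); // "https:" -> "https"
  return allowSchemes.includes(scheme);
}

const allowSchemes = ['http', 'https'];
console.log(isValid('https://example.com/home', allowSchemes)); // true
console.log(isValid("jAvascRipt:alert('XSS')", allowSchemes));  // false
console.log(isValid(" javascript:alert('XSS')", allowSchemes)); // false
```

Letting a real URL parser extract the scheme avoids re-implementing the case and whitespace handling that defeated the blacklist. A project that needs relative redirects would have to extend this, for example by resolving against its own origin.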

From this incident, Xiaoming learned the following:

1. Escaping HTML does not mean you can rest easy.

2. For links and redirects, such as <a href="xxx"> or location.href="xxx", check the content and forbid links that start with javascript: as well as other illegal schemes.

Adopt different escape rules according to the context

One day, to speed up page loading, Xiaoming inlined some data into the HTML as JSON:

    <script>
    var initData = <%= data.toJSON() %>
    </script>

escapeHTML() cannot be used where the JSON is inserted, because escaping would break the JSON format.

However, the security team found that inlining JSON like this is not safe either:

1. When the JSON contains the character U+2028 or U+2029, it cannot be used as a JavaScript literal in engines that predate ES2019; a syntax error is thrown.

2. When the JSON contains the string </script>, the current script tag is closed, and the browser parses the rest of the string as HTML; adding another <script> tag after it completes the injection.

So we need to implement an escapeEmbedJSON() function to escape the inlined JSON.

The escape rules are as follows:

| Character | Escaped character |
|-|-|
| U+2028 | \u2028 |
| U+2029 | \u2029 |
| < | \u003c |

After the fix, the code looks like this:

    <script>
    var initData = <%= escapeEmbedJSON(data.toJSON()) %>
    </script>
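A minimal JavaScript sketch of escapeEmbedJSON(), assuming its input is a string that is already valid JSON, could apply the three rules like this (the sample data object is invented for the example):

```javascript
// Escapes the characters that are dangerous when JSON is inlined in a <script> block.
// The result is still valid JSON, because \uXXXX escapes are legal inside JSON strings.
function escapeEmbedJSON(json) {
  return json
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029')
    .replace(/</g, '\\u003c');
}

const data = { comment: "</script><script>alert('XSS')</script>", note: 'line\u2028break' };
const embedded = escapeEmbedJSON(JSON.stringify(data));

console.log(embedded.includes('</script>'));                 // false
console.log(JSON.parse(embedded).comment === data.comment);  // true
```

Because "<" can only appear inside string values in valid JSON, replacing it everywhere does not change what the data means once it is parsed.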

From this incident, Xiaoming learned the following:

1. HTML escaping is complicated, and different situations call for different escape rules. Using the wrong rule is likely to leave a hidden XSS risk.

2. Avoid writing your own escape library wherever possible; use a mature, widely adopted one instead.

Vulnerability summary

Xiaoming's story ends here. Let's take a systematic look at the ways XSS can be injected:

In text embedded in HTML, malicious content is injected as a script tag.

In inline JavaScript, spliced data breaks out of its original confines (strings, variables, method names, and so on).

In tag attributes, malicious content contains quotation marks, breaking out of the attribute value to inject other attributes or tags.

In the href, src, and similar attributes of a tag, the content contains executable code such as javascript:.

In event handlers such as onload, onerror, and onclick, uncontrolled code is injected.

In style attributes and tags, the content contains code such as background-image:url("javascript:..."); (newer browsers already guard against this).

In style attributes and tags, the content contains CSS expression code such as expression(...) (newer browsers already guard against this).

In short, if developers insert user-entered text into HTML without filtering it properly, injection vulnerabilities follow easily. Attackers can exploit them to construct malicious instructions and then use that code to compromise data security.

Classification of XSS attacks

Through the examples above, we have gained some understanding of XSS.

What is XSS?

Cross-Site Scripting, abbreviated XSS, is a code injection attack. The attacker injects a malicious script into the target website so that it runs in the user's browser. With such scripts, attackers can obtain sensitive user information such as cookies and session IDs, and thereby endanger data security.

To distinguish it from CSS, the first letter of the abbreviation was changed to X, hence XSS.

The essence of XSS is that malicious code is not filtered and gets mixed in with the site's normal code; the browser cannot tell which scripts are trustworthy, so the malicious script is executed.

Because it runs directly on the user's device, malicious code can read the user's information directly, or use that information to impersonate the user and send attacker-defined requests to the website.

In some cases input limits force the injected script to be short. It can still carry out a more complex attack by loading an external script for the browser to execute.

That raises a question: by what routes can a malicious script be "injected"?

Not only user-generated content (UGC) can carry an injection; parameters in the URL can also be an attack source. None of the following can be trusted when processing input:

UGC from users

Links from third parties

URL parameters

POST parameters

Referer (may come from an untrusted source)

Cookie (may be injected from another subdomain)

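XSS classification

According to the source of the attack, XSS attacks can be divided into three types: stored, reflected, and DOM-based.

| Type | Storage area | Insertion point |
|-|-|-|
| Stored XSS | Back-end database | HTML |
| Reflected XSS | URL | HTML |
| DOM-based XSS | Back-end database / front-end storage / URL | Front-end JavaScript |

Storage area: where the malicious code is stored.

Insertion point: who retrieves the malicious code and inserts it into the web page.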

Stored XSS

Attack steps for stored XSS:

1. The attacker submits malicious code to the target website's database.

2. When a user opens the target website, the web server reads the malicious code from the database, splices it into the HTML, and returns it to the browser.

3. The user's browser receives the response, parses it, and executes it; the malicious code mixed into it is executed as well.

4. The malicious code steals user data and sends it to the attacker's website, or impersonates the user and calls the target website's interfaces to perform actions chosen by the attacker.

This kind of attack is common on websites that store user-submitted data, such as forum posts, product reviews, and private messages.

Reflected XSS

Attack steps for reflected XSS:

1. The attacker constructs a special URL that contains malicious code.

2. When a user opens the URL, the web server takes the malicious code out of the URL, splices it into the HTML, and returns it to the browser.

3. The user's browser receives the response, parses it, and executes it; the malicious code mixed into it is executed as well.

The difference between reflected and stored XSS is that in stored XSS the malicious code lives in the database, while in reflected XSS it lives in the URL.

Reflected XSS vulnerabilities are common in features that pass parameters through the URL, such as site search and redirects.

Because the user has to open the malicious URL for the attack to work, attackers often combine several tricks to lure users into clicking.

POST content can also trigger reflected XSS, but the conditions are demanding (the attacker has to build a form submission page and lure the user into clicking), so it is very rare.

DOM-based XSS

Attack steps for DOM-based XSS:

1. The attacker constructs a special URL that contains malicious code.

2. The user opens the URL.

3. The user's browser receives the response, parses it, and executes it; the front-end JavaScript takes the malicious code out of the URL and executes it.

DOM-based XSS differs from the other two types in that the malicious code is extracted and executed entirely in the browser. It is a security flaw in the front-end JavaScript itself, whereas the other two are security flaws on the server side.

Prevention of XSS attacks

As the introduction above shows, an XSS attack has two key elements:

1. The attacker submits malicious code.

2. The browser executes the malicious code.

For the first element: can we filter out malicious code while the user is entering it?

Input filtering

When the user submits, the front end filters the input and then sends it to the back end. Is this feasible?

The answer is no. Once an attacker bypasses the front-end filter and constructs the request directly, the malicious code can still be submitted.

So change when the filtering happens: the back end filters the input before writing it to the database and then returns the "safe" content to the front end. Is this feasible?

Take an example: a normal user enters 5 < 7. Before being written to the database it is escaped and becomes 5 &lt; 7.

The problem is that at submission time we do not know where the content will be output. This has two implications:

1. The user's input may be served to both the front end and native clients, and once it has passed through escapeHTML(), the clients display garbled content (5 &lt; 7).

2. Within the front end, different positions require different encodings.

When 5 &lt; 7 is spliced into the page as HTML, it displays normally:

    <div title="comment">5 &lt; 7</div>

When 5 &lt; 7 is returned through Ajax and assigned to a JavaScript variable, the string the front end receives is the escaped one. It cannot be used directly in templates such as Vue, nor for computing the content length, nor for title, alert, and the like.

Therefore, filtering on the input side can solve specific XSS problems in some cases, but it introduces a lot of uncertainty and garbled text. It should be avoided as a defence against XSS.
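The 5 < 7 example can be reproduced in a few lines of JavaScript. The escapeHTML() below is a simplified stand-in written for this illustration, and the variable names are hypothetical:

```javascript
// Simplified stand-in for the server-side escape function.
function escapeHTML(input) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };
  return String(input).replace(/[&<>"']/g, (ch) => map[ch]);
}

const userInput = '5 < 7';

// Escaping on input: this is what ends up in the database.
const stored = escapeHTML(userInput);
console.log(stored);                           // 5 &lt; 7
console.log(userInput.length, stored.length);  // 5 8

// A template that escapes on output (as Vue does by default) escapes it a second time.
const rendered = escapeHTML(stored);
console.log(rendered);                         // 5 &amp;lt; 7
```

The stored value no longer has the length the user typed, and any layer that escapes on output turns it into visible garbage. Storing the raw value and escaping at the point of output avoids both problems.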

Of course, input filtering is still necessary for inputs of a well-defined type, such as numbers, URLs, phone numbers, and e-mail addresses.

Since input filtering is not entirely reliable, we have to prevent XSS by "stopping the browser from executing malicious code". This falls into two categories:

1. Prevent injection in HTML.

2. Prevent malicious code from being executed when JavaScript runs.

Prevent stored and reflected XSS attacks

In both stored and reflected XSS, the server takes the malicious code out and inserts it into the response HTML; the "data" the attacker deliberately wrote is embedded in the "code" and executed by the browser.

There are two common ways to prevent these two kinds of vulnerability:

1. Switch to pure front-end rendering, separating code from data.

2. Escape HTML thoroughly.

Pure front-end rendering

The pure front-end rendering process:

1. The browser first loads a static HTML page that contains no business data.

2. The browser then executes the JavaScript in the HTML.

3. The JavaScript loads business data through Ajax and calls DOM APIs to update the page.

With pure front-end rendering, we tell the browser explicitly whether the content being set is text (.innerText), an attribute (.setAttribute), a style (.style), and so on. The browser is not easily tricked into executing unexpected code.
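As a rough sketch of step 3, the function below writes untrusted data only through text and attribute APIs. The names renderComment and fakeElement and the shape of the comment object are hypothetical; in a browser, container would be a real element (for example one returned by document.getElementById), and a small stand-in object is used here only so the sketch runs outside a browser.

```javascript
// Writes untrusted data through APIs that treat it strictly as text or as an attribute value.
function renderComment(container, comment) {
  container.textContent = comment.text;            // never parsed as HTML
  container.setAttribute('title', comment.author); // stored as an attribute value, not markup
}

// Stand-in for a DOM element so the sketch runs outside a browser.
const fakeElement = {
  textContent: '',
  attributes: {},
  setAttribute(name, value) {
    this.attributes[name] = String(value);
  },
};

renderComment(fakeElement, { text: "<script>alert('XSS')</script>", author: 'Xiaoming' });
console.log(fakeElement.textContent);
```

With a real element, the payload would appear on the page as literal text, because textContent never creates child elements from its value.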

However, pure front-end rendering must still avoid DOM-based XSS vulnerabilities (such as onload events and javascript:xxx in href; see the section on preventing DOM-based XSS attacks below).

Pure front-end rendering suits many internal and management systems well. For pages with high performance or SEO requirements, though, we still have to deal with splicing HTML.

Escape HTML

If splicing HTML is unavoidable, use a suitable escape library to thoroughly escape every insertion point in the HTML template.

Commonly used template engines, such as doT.js, ejs, and FreeMarker, usually have just one rule for HTML escaping: escape the characters & < > " ' /. This does provide some XSS protection, but it is not complete:

| XSS vulnerability | Does simple escaping protect? |
|-|-|
| HTML tag text content | Yes |
| HTML attribute value | Yes |
| Inline JavaScript | No |
| Inline JSON | No |

Therefore, to protect against XSS properly, we need a more complete and finer-grained escaping strategy.

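For example, in Java projects a commonly used escape library is org.owasp.encoder. The following code is based on the official documentation of org.owasp.encoder.

    <!-- Text content inside an HTML tag -->
    <div><%= Encode.forHtml(UNTRUSTED) %></div>
    <!-- HTML attribute value -->
    <input value="<%= Encode.forHtml(UNTRUSTED) %>" />
    <!-- CSS property value -->
    <div style="width:<%= Encode.forCssString(UNTRUSTED) %>">
    <!-- CSS URL -->
    <div style="background:<%= Encode.forCssUrl(UNTRUSTED) %>">
    <!-- Inline JavaScript block -->
    <script>
      var msg = "<%= Encode.forJavaScript(UNTRUSTED) %>";
      alert(msg);
    </script>
    <!-- JSON embedded in an inline JavaScript block -->
    <script>
      var __INITIAL_STATE__ = JSON.parse('<%= Encode.forJavaScript(data.toJSON()) %>');
    </script>
    <!-- Inline event listener on an HTML tag -->
    <button onclick="alert('<%= Encode.forJavaScript(UNTRUSTED) %>');">Click me</button>
    <!-- URL parameter -->
    <a href="/search?value=<%= Encode.forUriComponent(UNTRUSTED) %>&order=1#top">
    <!-- URL path -->
    <a href="/page/<%= Encode.forUriComponent(UNTRUSTED) %>">
    <!-- Full URL: also validate the scheme, forbidding "javascript:" links and other illegal schemes -->
    <a href='<%= Encode.forHtml(UNTRUSTED) %>'>Link</a>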

As you can see, encoding for HTML output is complex, and each context needs its own escape rules.
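To make the point concrete, the JavaScript sketch below encodes the same untrusted string for three different contexts. The three functions are simplified illustrations written for this example; they are not the org.owasp.encoder API, and forJavaScriptString() is only meant for a quoted string inside a script block, not for inline event-handler attributes.

```javascript
// Illustrative encoders, one per output context. These are NOT the OWASP API.
function forHtml(s) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };
  return String(s).replace(/[&<>"']/g, (ch) => map[ch]);
}

function forUriComponent(s) {
  return encodeURIComponent(String(s));
}

function forJavaScriptString(s) {
  // JSON.stringify escapes double quotes, backslashes and control characters;
  // the extra replacements stop the value from closing the <script> element
  // or a single-quoted string.
  return JSON.stringify(String(s))
    .slice(1, -1)
    .replace(/[<>&'\u2028\u2029]/g,
      (ch) => '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0'));
}

const untrusted = `"><script>alert('XSS')</script>`;
console.log(forHtml(untrusted));
console.log(forUriComponent(untrusted));
console.log(forJavaScriptString(untrusted));
```

Each output is safe only in its own context: the URI-encoded form would display as gibberish in HTML text, and the HTML-escaped form would not be safe inside a script string.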

Prevent DOM-based XSS attacks

A DOM-based XSS attack is really a case of the website's own front-end JavaScript not being rigorous enough and executing untrusted data as code.

Be especially careful when using .innerHTML, .outerHTML, and document.write(): do not insert untrusted data into the page as HTML; use .textContent, .setAttribute(), and the like wherever possible.

If you use the Vue/React stack and avoid v-html/dangerouslySetInnerHTML, you sidestep the XSS risks of innerHTML and outerHTML at the front-end render stage.

Inline event listeners in the DOM, such as location, onclick, onerror, onload, and onmouseover, the href attribute of the a tag, and JavaScript's eval(), setTimeout(), and setInterval() can all run strings as code. If untrusted data is spliced into a string and passed to these APIs, it easily creates a security risk, so this must be avoided.

    <!-- Inline event listeners containing malicious code -->
    <img onclick="UNTRUSTED" onerror="UNTRUSTED" src="data:image/png,">
    <!-- Link containing malicious code -->
    <a href="UNTRUSTED">1</a>
    <script>
    // Malicious code invoked through setTimeout() / setInterval()
    setTimeout("UNTRUSTED")
    setInterval("UNTRUSTED")
    // Malicious code invoked through location
    location.href = 'UNTRUSTED'
    // Malicious code invoked through eval()
    eval("UNTRUSTED")
    </script>

If these are used in your project, be sure to avoid concatenating untrusted data in strings.
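Two of these APIs have straightforward safer alternatives, sketched below in JavaScript. The helper names parseUntrustedJSON and scheduleGreeting are invented for this example.

```javascript
// Parse untrusted text as data instead of handing it to eval().
function parseUntrustedJSON(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    return null; // not valid JSON, so it is rejected rather than executed
  }
}

// Pass a function to the timer instead of a string built from untrusted data.
function scheduleGreeting(name, callback) {
  return setTimeout(() => callback('Hello, ' + name), 0);
}

console.log(parseUntrustedJSON('{"name":"Xiaoming"}').name); // Xiaoming
console.log(parseUntrustedJSON("alert('XSS')"));             // null

// The payload stays an ordinary string value; it is never evaluated as code.
scheduleGreeting("'); alert('XSS'); ('", (message) => console.log(message));
```

In both cases the untrusted value is handled only as data, so there is no string for the JavaScript engine to interpret as code.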

Other XSS precautions

Although careful escaping when rendering pages and executing JavaScript can prevent XSS, relying on developer caution alone is not enough. The following general measures reduce the risk and consequences of XSS.

Content Security Policy

A strict CSP helps prevent XSS in the following ways:

It forbids loading code from external domains, which prevents complex attack logic.

It forbids submitting to external domains, so that user data is not leaked to them after the site is attacked.

It forbids inline script execution (a strict rule; GitHub is known to use it).

It forbids unauthorized script execution (a newer feature; Google Map's mobile version uses it).

With reporting enabled, it helps detect XSS promptly so the problem can be fixed quickly.

For more about CSP, see later articles in this front-end security series.
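As a rough illustration of how the points above map onto CSP directives, the JavaScript sketch below builds a header value. The function name buildCspHeader, the /csp-report endpoint, and the exact policy are assumptions for this example; a real policy has to be tuned to the site.

```javascript
// Builds a Content-Security-Policy header value.
// The nonce must be freshly generated (random) for every response.
function buildCspHeader(nonce) {
  const directives = {
    'default-src': ["'self'"],                    // no code or resources from external domains
    'script-src': ["'self'", `'nonce-${nonce}'`], // no 'unsafe-inline'; only scripts carrying the nonce
    'connect-src': ["'self'"],                    // no Ajax submission to external domains
    'form-action': ["'self'"],                    // no form submission to external domains
    'report-uri': ['/csp-report'],                // hypothetical reporting endpoint
  };
  return Object.entries(directives)
    .map(([name, values]) => `${name} ${values.join(' ')}`)
    .join('; ');
}

console.log(buildCspHeader('r4nd0m'));
// default-src 'self'; script-src 'self' 'nonce-r4nd0m'; connect-src 'self'; form-action 'self'; report-uri /csp-report
```

The server would send this value in the Content-Security-Policy response header and put the same nonce on each script tag it trusts.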

Input length control

Untrusted input should be limited to a reasonable length. This cannot prevent XSS entirely, but it makes XSS attacks harder.

Other security measures

HTTP-only Cookie: JavaScript is forbidden from reading certain sensitive cookies, so an attacker cannot steal them even after a successful XSS injection.

CAPTCHA: prevents scripts from impersonating the user to submit dangerous operations.
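For the HTTP-only cookie, the protection comes from the HttpOnly attribute on the Set-Cookie response header. A small JavaScript sketch of building such a header value follows; the function name and the choice of the other attributes (Path, Secure, SameSite) are assumptions for this example.

```javascript
// Builds a Set-Cookie header value for a session cookie.
// HttpOnly hides the cookie from document.cookie, so injected scripts cannot read it.
function buildSessionCookie(name, value) {
  return `${name}=${encodeURIComponent(value)}; Path=/; HttpOnly; Secure; SameSite=Lax`;
}

console.log(buildSessionCookie('sid', 'abc123'));
// sid=abc123; Path=/; HttpOnly; Secure; SameSite=Lax
```

HttpOnly limits what a successful injection can steal, but the injected script can still send requests with the user's session, so it complements escaping rather than replacing it.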

Detection of XSS

After these experiences, Xiaoming has learned a lot: how to prevent and fix XSS vulnerabilities, and to keep security in mind during daily development. But how can code that is already online be checked for XSS vulnerabilities?

After some searching, Xiaoming found two methods:

1. Use generic XSS attack strings to detect XSS vulnerabilities manually.

2. Use scanning tools to detect XSS vulnerabilities automatically.

In the article Unleashing an Ultimate XSS Polyglot, Xiaoming found a string like this:

    jaVasCript:/*-/*`/*\`/*'/*"/**/(/* */oNcliCk=alert() )//%0D%0A%0d%0a//...

After receiving the response, the browser loads and executes the malicious script //xxxx.cn/image/t.js. Using the user's logged-in session, the script follows accounts, posts on Weibo, and sends private messages. Those posts and messages carry the attack URL, luring more people to click and steadily widening the attack. Publishing malicious content under the victim's identity and spreading the attack layer by layer in this way is called an "XSS worm".

Extended reading: Automatic Context-Aware Escaping

As we said above:

1. Proper HTML escaping effectively prevents XSS vulnerabilities.

2. A complete escape library needs rules for many contexts, such as HTML attributes, HTML text content, HTML comments, redirect links, inline JavaScript strings, and inline CSS stylesheets.

3. Developers have to choose the right escape rule for the context of each insertion point.

In general, an escape library cannot determine the context of an insertion point (it is not context-aware), so the responsibility for applying the right rule falls on developers: each one has to understand the various XSS scenarios and make sure every insertion point uses the correct rule.

This approach is labour-intensive and depends on manual care, so it easily leads to XSS vulnerabilities, and it is hard for security staff to spot the hidden risks.

In 2009, Google proposed a concept called Automatic Context-Aware Escaping.

Context-aware means that when the template engine parses a template string, it parses the template syntax, works out the context of each insertion point, and automatically selects the appropriate escape rule. This reduces developers' workload and the chance of human omission.

In a template engine that supports Automatic Context-Aware Escaping, developers can define a template like this, without applying escape rules by hand:

    <html>
      <head>
        <meta charset="UTF-8">
        <title>{{.title}}</title>
      </head>
      <body>
        <a href="{{.url}}">{{.content}}</a>
      </body>
    </html>

After parsing, the template engine knows the context of each of the three insertion points and automatically selects the corresponding escape rules:

    <html>
      <head>
        <meta charset="UTF-8">
        <title>{{.title | htmlescaper}}</title>
      </head>
      <body>
        <a href="{{.url | urlescaper | attrescaper}}">{{.content | htmlescaper}}</a>
      </body>
    </html>

Template engines that currently support Automatic Context-Aware Escaping include:

1. Go html/template

2. Google Closure Templates
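To give a feel for the idea, here is a toy JavaScript sketch of context-aware escaping using a tagged template. It recognises only two contexts (an href attribute and everything else) by looking at the text just before each value, which is far cruder than what the engines listed above do; all function names are made up for this example.

```javascript
function escapeHtml(s) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };
  return String(s).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Only absolute http/https URLs pass; anything else becomes a harmless placeholder.
function filterUrl(s) {
  try {
    const scheme = new URL(String(s)).protocol;
    return scheme === 'http:' || scheme === 'https:' ? String(s) : '#blocked';
  } catch (err) {
    return '#blocked';
  }
}

// Tagged template: inspects the literal text before each value to guess its context.
function html(literals, ...values) {
  let out = '';
  literals.forEach((literal, i) => {
    out += literal;
    if (i < values.length) {
      const inHref = /href\s*=\s*["']$/i.test(literal);
      out += inHref ? escapeHtml(filterUrl(values[i])) : escapeHtml(values[i]);
    }
  });
  return out;
}

const page = html`<a href="${"javascript:alert('XSS')"}">${'<b>click</b>'}</a>`;
console.log(page);
// <a href="#blocked">&lt;b&gt;click&lt;/b&gt;</a>
```

The template author writes plain placeholders, and the rule applied to each value depends on where it lands, which is the essence of the approach.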
