XSS Prevention Cheatsheet
How to prevent XSS in Java, Python, Node.js, C#, Go, and Scala
XSS, or Cross-Site Scripting, is one of the most common vulnerabilities found in applications. In bug bounty programs of different organizations, XSS consistently ranks as the most common vulnerability found. Today, let’s learn how these attacks work, how they manifest in code, and how to prevent them in your programming language. Let’s dive right in!
Anatomy of an XSS attack
XSS happens whenever an attacker can execute malicious scripts on a victim’s browser.
Applications often use user input to construct web pages. For example, a site might have a search functionality where the user can input a search term, and the search results page will include the term at the top of the results page. If a user searches “abc”, the source code for that page might look like this:
But what if that application cannot tell the difference between user input and the legitimate code that makes up the original web page?
Attackers might be able to submit executable scripts and get that script embedded on a victim’s webpage. These malicious scripts can be used to steal cookies, leak personal information, change site contents, or redirect the user to a malicious site. There are three main types of XSS attacks: reflected XSS, stored XSS, and DOM XSS.
For example, if the application also allows users to search via URLs:
If an attacker can trick victims into visiting this URL:
The script in the URL will become embedded in the page the victim is visiting, making the victim’s browser run the JS code contained within the <script> tags. This is called a “reflected XSS” attack.
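To make the reflected case concrete, here is a minimal Python sketch. The URLs, the "search" parameter name, and the helper functions are hypothetical stand-ins for the example site, not code from any real application. It assumes the server builds the results page by pasting the search term straight into the markup, which is the flaw that lets a script in the URL reach the page.

```python
from urllib.parse import urlparse, parse_qs

def render_results(term: str) -> str:
    # Vulnerable on purpose: the search term is pasted into the markup as-is.
    return f"<h2>You searched for {term}; here are the results!</h2>"

def handle_request(url: str) -> str:
    # Pull the (hypothetical) "search" parameter out of the query string.
    # parse_qs also percent-decodes the value.
    query = parse_qs(urlparse(url).query)
    term = query.get("search", [""])[0]
    return render_results(term)

benign = handle_request("https://example.com/search?search=abc")
malicious = handle_request(
    "https://example.com/search?search=%3Cscript%3Ealert(1)%3C%2Fscript%3E"
)
print(benign)
print(malicious)
```

In the second response the script tags arrive intact, so a browser rendering that page would run the attacker’s code.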
During a stored XSS attack, the attacker places the malicious script into a database before it gets returned to the victim. Let’s say that example.com also allows users to post status updates for others to see. An attacker can post this status update:
This malicious script will become embedded on the attacker’s profile page, attacking anyone who visits the attacker’s profile page.
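The difference from the reflected case is persistence, which the following Python sketch tries to illustrate. A plain list stands in for the database, and the function names are assumptions made for this example only. The payload is saved once and then served to every later visitor.

```python
# A list stands in for the database table of status updates (hypothetical).
status_updates: list[str] = []

def post_status(text: str) -> None:
    # The payload is saved exactly as submitted.
    status_updates.append(text)

def render_profile() -> str:
    # Vulnerable on purpose: stored text is written into the page unencoded.
    items = "".join(f"<li>{status}</li>" for status in status_updates)
    return f"<ul>{items}</ul>"

post_status("Having a great day!")
post_status("<script>alert(1)</script>")

# Every visitor who loads the profile receives the stored script.
page_for_alice = render_profile()
page_for_bob = render_profile()
print(page_for_alice)
```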
Finally, DOM-based XSS is similar to reflected XSS, except that in DOM-based XSS, the user input never leaves the user’s browser. Since the malicious input is never sent to the server, this type of XSS is harder to detect and prevent.
As in reflected XSS, attackers submit DOM-based XSS payloads via the victim’s user input. Unlike reflected XSS, a DOM-based XSS script doesn’t require server involvement, because it executes when user input modifies the source code of the page in the browser directly. Say a website allows the user to change their locale by submitting it via a URL parameter:
The URL parameter isn’t submitted to the server. Instead, it’s used to change the language of the webpage by a client-side script of the application. But if the website doesn’t validate the user-submitted parameter, an attacker can trick victims into visiting a URL like this one:
The site will embed the payload on the user’s web page, and the victim’s browser will execute the malicious script.
The key to preventing XSS is output encoding. You should never insert user-submitted data directly into an HTML document. Instead, you should encode any untrusted input that ends up on an HTML page so that browsers know the input should be treated as content and not raw HTML. This will make sure that attackers cannot influence the way browsers interpret the information on the page by submitting dangerous characters or character sequences. For example, if someone submits <script>alert(1)</script>, browsers should treat <script> and </script> as user content, not HTML script tags.
To prevent XSS, you should encode characters that have special meaning in HTML, such as the & character, angle brackets, single and double quotes, and the forward-slash character. In our example, the left and right angle brackets can be encoded into the HTML entities &lt; and &gt; to prevent browsers from treating the content as script tags.
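As a rough illustration of that encoding step, the Python sketch below maps each special character to an HTML entity. The function name and the table are written for this example and are not taken from any particular framework. In practice a maintained encoder is usually the better choice; Python’s standard html.escape covers the same characters except the forward slash.

```python
# Characters with special meaning in HTML and the entities that replace them.
HTML_ENTITIES = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#x27;",
    "/": "&#x2F;",
})

def encode_for_html(untrusted: str) -> str:
    # translate() maps each input character exactly once, so the "&" that
    # starts a produced entity is never encoded a second time.
    return untrusted.translate(HTML_ENTITIES)

encoded = encode_for_html("<script>alert(1)</script>")
print(encoded)  # &lt;script&gt;alert(1)&lt;&#x2F;script&gt;
```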
The prevention of DOM-based XSS requires a different approach. Since the malicious user input won’t pass through the server, sanitizing the data that enters and departs from the server won’t work. Instead, you should avoid rewriting the HTML document based on user input, and implement client-side input validation before user input is inserted into the DOM.
Defense in Depth
You can also take measures to mitigate the impact of XSS flaws if they do happen. First, set the HttpOnly flag on sensitive cookies that your site uses. This stops scripts running in the page from reading those cookies, so attackers cannot steal them via XSS. You should also implement the
Preventing XSS in your Programming Language
Now, let’s talk about how you can prevent XSS vulnerabilities in your programming language!
If you are using JavaServer Pages (JSP), you need to be aware that JSP templates do not escape dynamic content by default. Let’s say that you want to display a message this way:
You will need to use the <c:out> tag or the fn:escapeXml() function to escape potentially dangerous content in untrusted input:
Most Python template languages will take care of output encoding for you. For example, Jinja2 will automatically encode any input placed within double curly braces when autoescaping is enabled, as it is by default for HTML templates in Flask. This should prevent XSS in most cases.
This will output in HTML:
The Razor template language used with C# automatically escapes dynamic content and protects against most potential XSS attacks:
This code will render the HTML code:
The Go package html/template will automatically escape dynamic content, protecting you from most XSS:
This code will write the escaped string &lt;script&gt;alert(1)&lt;/script&gt; to the output variable.
On the other hand, the text/template package does not offer this protection.
Most template languages in Scala will also encode dynamic content by default. For instance, in the Play framework, this user input will be displayed safely:
Overriding Safe Defaults
For each template language that automatically encodes dynamic content, there are ways of overriding the safe default, which might render the application vulnerable. And if you are constructing HTML code manually from user input and not using templates to render dynamic content, then you’ll need to find a way to encode the user input before inserting it into HTML strings. See our vulnerability database for some example scenarios that might make an application vulnerable and for some tips for escaping content manually in different programming languages.
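For the manual case, a minimal Python sketch might look like the following. The render_greeting function and its parameters are hypothetical; the only library call used is the standard html.escape. It assumes untrusted values land either in element text or inside a double-quoted attribute, and it encodes each value at the point where it is written into the HTML string.

```python
from html import escape

def render_greeting(name: str, tagline: str) -> str:
    # escape() with its default quote=True also encodes " and ',
    # which matters when a value lands inside a quoted attribute.
    return f'<p title="{escape(tagline)}">Hello, {escape(name)}!</p>'

html_out = render_greeting(
    name="<script>alert(1)</script>",
    tagline='" onmouseover="alert(1)',
)
print(html_out)
```

This kind of encoding is only suited to HTML text and quoted attribute values. Input placed inside a script block, a URL, or an unquoted attribute needs encoding rules for that context instead.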