For some time I’ve been seeing the Angler exploit kit pop up and infect clients through malvertising campaigns without a referer being present when the landing page is visited. This is interesting because it makes it a lot harder to track down the malicious creative IDs, which the advertisement operator could otherwise disable. Being able to do so is key in trying to fight active malvertising campaigns. In this short article I’ll go through the current setup the Angler exploit kit uses to break the referer chain by losing it in a two-step system.





Initial infection chain for Angler

Angler is currently using a method that allows them to break the referer chain. Breaking this chain makes it hard to track down the malicious advertisement associated with them. As an example, here is a malvertising case involving Angler in which the advertiser was about 5 layers down in the chain. The following screenshot is from the Fiddler session:

In the image I have commented all the requests with their purpose; this was done after analyzing the chain.



We see an advertiser, ‘hrd-marketing[.]com’, being used and a JavaScript file being requested. A bit after the advertiser-related request(s) there is a POST to an odd-looking HTML file:



This POST request does not contain any referer. In the response of the server we see a meta refresh snippet in a fake 404 page: the status is 200 OK, but the body contains the text ‘Page not found’. If we look for the URL from the meta refresh in the rest of the Fiddler session we find a request going towards it:



Interestingly, again there is no referer. It seems the Angler guys are pulling some tricks to break the referer chain; let’s have a look.
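Going through a large capture by hand gets tedious, so the referer-less requests can also be listed mechanically. Below is a minimal sketch in Python, assuming the session has been exported to HAR (HTTP Archive) format and loaded as a dictionary. The function name and the sample session with its `.example` hostnames are made up for illustration. Keep in mind that a missing referer is not suspicious by itself; a URL typed into the address bar has none either.

```python
def requests_without_referer(har: dict) -> list:
    """Return (method, url) for every HAR entry that carries no Referer header."""
    missing = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry["request"]
        # Header names are case-insensitive, so compare them in lower case.
        names = {h["name"].lower() for h in request.get("headers", [])}
        if "referer" not in names:
            missing.append((request["method"], request["url"]))
    return missing


# Hypothetical three-request session, shaped like a HAR export.
SAMPLE_HAR = {
    "log": {
        "entries": [
            {"request": {"method": "GET", "url": "http://advertiser.example/ad.js",
                         "headers": [{"name": "Referer",
                                      "value": "http://publisher.example/"}]}},
            {"request": {"method": "POST", "url": "http://redirector.example/page.html",
                         "headers": [{"name": "Host",
                                      "value": "redirector.example"}]}},
            {"request": {"method": "GET", "url": "http://landing.example/",
                         "headers": []}},
        ]
    }
}
```

With a real export, the dictionary would come from `json.load()` on the exported file; the sample above only mirrors the shape of the chain described in this post.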





Analysis

If you look at the actual referers you find there are none on either the redirector or the landing page. Let’s first look at the advertiser. I already took the trouble to filter out the others and pinpoint the one serving the Angler redirect, to spare you 2/3 of a blog post about tracking it down in the insane number of advertiser chains it appeared in. For those wondering, the initial site was an adult content website for videos.

Let’s start at the advertiser JavaScript. The advertiser gives back the following blob, of which the second part (after the newline) is of interest to us. The top part is the real advertisement:

If we take it out and beautify it a bit we get the following:



The script is heavily obfuscated, especially the part of the target URL for the POST request. If you clean it up you end up with this (note: I removed the target URL from the pastebin post to not get it flagged):



So how does their trick work? It’s actually quite simple, but in my opinion it relies on a bug in modern web browsers.

On the page it first adds a new DIV. Inside this DIV it puts an iframe which does not have a ‘src’ attribute, meaning it is not loading anything from a remote site. However, most browsers treat the body of the iframe (normally the page it loads based on the ‘src’ attribute) as a separate page with its own context; this is pretty much how the Angler referer-less request trick works. In the next step they generate a form with a hidden input (carrying some unique tokens; I’m not sure what they do, but they might mean something on their backend). This form is put in the body of the iframe context, after which they submit it. Because the form is submitted from within the iframe context, which does not have an actual page loaded, the resulting request has no referer; quite a nasty trick.
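The steps described above leave a recognisable footprint once a script has been cleaned up. As a rough illustration, here is a crude heuristic in Python that reports which of those steps appear in a piece of deobfuscated JavaScript. The regular expressions, the function names and the sample script are all assumptions made for this sketch; they are not taken from the Angler code, they will miss obfuscated variants, and they do not check that the iframe really lacks a ‘src’ attribute.

```python
import re

# One illustrative pattern per step of the trick; all of them are guesses.
STEP_PATTERNS = {
    "creates iframe": re.compile(r"createElement\(\s*['\"]iframe['\"]\s*\)", re.I),
    "reaches into iframe document": re.compile(
        r"contentDocument|contentWindow\s*\.\s*document", re.I),
    "creates form": re.compile(r"createElement\(\s*['\"]form['\"]\s*\)", re.I),
    "submits form": re.compile(r"\.submit\s*\(\s*\)"),
}


def matched_steps(script: str) -> list:
    """Return the names of the steps whose pattern occurs in the script."""
    return [name for name, pattern in STEP_PATTERNS.items() if pattern.search(script)]


def looks_like_refererless_post(script: str) -> bool:
    """True only when every step of the trick is present in the script."""
    return len(matched_steps(script)) == len(STEP_PATTERNS)


# Hypothetical cleaned-up script following the steps described above.
SAMPLE_SCRIPT = """
var d = document.createElement('div');
document.body.appendChild(d);
var f = document.createElement('iframe');
d.appendChild(f);
var doc = f.contentWindow.document;
var form = doc.createElement('form');
form.method = 'POST';
form.action = 'http://redirector.example/page.html';
doc.body.appendChild(form);
form.submit();
"""
```

A match is only a hint that a script deserves a closer look; plenty of legitimate scripts build iframes and forms dynamically.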

The small piece of script I created can be used to request pages without referers in the latest versions of Chrome, Firefox and Internet Explorer as of the 7th of May 2015. See it as a POC.



The next question is why the landing page is being requested without a referer. Here’s how the landing page request looks in Wireshark:

This one is slightly harder to explain and I don’t have 100% certainty but I assume this is how browsers are processing it:

When a browser performs a POST request as seen here, the response is (in most cases) some external resource being loaded in the page. In the case of the redirector we see the response is HTML containing a meta refresh tag, which changes the page location to the specified URL. My guess is that the browser follows the refresh but, due to a bug, doesn’t send a referer with it. However, I only got a POC of this to work on Internet Explorer and could not reproduce it on Chrome or Firefox.
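For anyone hunting for this kind of redirector in captured traffic, the response itself is fairly distinctive: a 200 OK whose body claims the page was not found while also carrying a meta refresh. Below is a minimal sketch in Python, using only the standard library, that extracts the refresh target and flags such a response. The function names, the ‘not found’ text match and the sample body are assumptions for illustration, not the actual Angler response.

```python
import re
from html.parser import HTMLParser
from typing import Optional


class _MetaRefreshParser(HTMLParser):
    """Remembers the URL of the first <meta http-equiv="refresh"> tag seen."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://host/path".
        match = re.search(r"url\s*=\s*['\"]?([^'\">\s]+)",
                          attr.get("content", ""), re.IGNORECASE)
        if match:
            self.target = match.group(1)


def meta_refresh_target(body: str) -> Optional[str]:
    """Return the URL a meta refresh points to, or None if there is none."""
    parser = _MetaRefreshParser()
    parser.feed(body)
    return parser.target


def looks_like_fake_404(status: int, body: str) -> bool:
    """True for a 200 response that says 'not found' yet redirects via meta refresh."""
    return (status == 200
            and "not found" in body.lower()
            and meta_refresh_target(body) is not None)


# Hypothetical redirector response body.
SAMPLE_BODY = """<html><head>
<meta http-equiv="refresh" content="0; url=http://landing.example/abc">
<title>404 Not Found</title></head>
<body><h1>Page not found</h1></body></html>"""
```

Combining this with a check for a missing referer on the request to the extracted target gives a reasonable starting point for spotting the two-step chain in a capture.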

And there we go, two tricks (of which one works on all browsers) to lose referers and frustrate automated systems and researchers like me. Knowing the Angler history, it is interesting that they come up with these kinds of tricks; it certainly helps the lifetime of their malvertising campaigns.





Following the RFC

If we look at the RFC it actually specifies that unless a user manually enters a URL it should always make a referer chain except when coming from an SSL connection:



The Referer[sic] request-header field allows the client to specify, for the server’s benefit, the address (URI) of the resource from which the Request-URI was obtained (the “referrer”, although the header field is misspelled.) The Referer request-header allows a server to generate lists of back-links to resources for interest, logging, optimized caching, etc. It also allows obsolete or mistyped links to be traced for maintenance. The Referer field MUST NOT be sent if the Request-URI was obtained from a source that does not have its own URI, such as input from the user keyboard.

You can read it here: http://tools.ietf.org/html/rfc2616#section-14.36

Following the RFC it would appear this referer trick is actually a bug in the browsers. In a discussion on this subject with the Google Chrome developers (read it [here]) they linked me to the following specification: the W3C Referrer Policy, in which the concept of ‘Nested Browsing Contexts’ is described:

Certain elements (for example, iframe elements) can instantiate further browsing contexts. These are called nested browsing contexts. If a browsing context P has a Document D with an element E that nests another browsing context C inside it, then C is said to be nested through D, and E is said to be the browsing context container of C. If the browsing context container element E is in the Document D, then P is said to be the parent browsing context of C and C is said to be a child browsing context of P. Otherwise, the nested browsing context C has no parent browsing context. A browsing context A is said to be an ancestor of a browsing context B if there exists a browsing context A’ that is a child browsing context of A and that is itself an ancestor of B, or if the browsing context A is the parent browsing context of B.

You can interpret the above text as follows:

You have a website in which you embed an iframe pointing to an HTTP website, call this iframe’d website example-iframe.com

From your initial website you embed a form in the body context of the iframe, so in the example-iframe.com website context.

You submit this form which will happen inside the context of example-iframe.com

What you would expect is:

The iframe towards example-iframe.com will have your initial website in its referer

The form submitted from example-iframe.com will have example-iframe.com in its referer

The fact that an iframe without a ‘src’ attribute doesn’t have any website context is understandable, especially from a programmer’s point of view. The issue here is that I’d expect a src-less iframe to be a sort of ‘local’ frame (as it doesn’t load external content). Any request coming from it would then have the initial website that created the src-less iframe as its ‘parent’ and thus as its referer.
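The difference between what happens and what I’d expect can be put into a toy model. The Python sketch below is not browser code and does not follow the wording of any specification; the `BrowsingContext` class and both functions are hypothetical names that only mirror the argument above. A context without a URL stands for the src-less iframe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BrowsingContext:
    url: Optional[str]  # None models an iframe without a 'src' attribute
    parent: Optional["BrowsingContext"] = None


def observed_referer(ctx: BrowsingContext) -> Optional[str]:
    """Behaviour described in this post: only the context's own URL counts."""
    return ctx.url


def expected_referer(ctx: BrowsingContext) -> Optional[str]:
    """Behaviour argued for above: fall back to the nearest ancestor with a URL."""
    current = ctx
    while current is not None:
        if current.url is not None:
            return current.url
        current = current.parent
    return None


# Hypothetical publisher page containing a src-less iframe.
page = BrowsingContext(url="http://publisher.example/")
frame = BrowsingContext(url=None, parent=page)
```

Here `observed_referer(frame)` gives `None`, which is the referer-less request seen in the capture, while `expected_referer(frame)` gives the publisher URL and would keep the chain intact.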



I’m currently awaiting a reply from the developers. The same bug is also present in Firefox and Internet Explorer. I feel a small rewrite for this special case is in order; it currently works as intended but not as expected.