Disclaimer: Everything that I say in this blog post about Shape Security and their ShapeShifter product is based on their YouTube video, the product description on their pages, and an article on PandoDaily. As such, the product may be much more sophisticated than I think it is (if so, they really should have given better examples), in which case what you read below may not be 100% correct.

On one of my daily Twitter rounds, I noticed someone talking about this great product that will revolutionize web security. What does this product do, you ask? It uses “polymorphism” to constantly change a page’s HTML code so that bots cannot interact with the page properly. According to the company, this means that all the bad things bots do (automatic signups, spam, credential testing, etc.) will cease. At the same time, the page looks identical to human users, who will have no problems interacting with it. Here’s the presentation of their product:





Workings

Once you remove the buzzwords, what you’re left with is essentially a rewrite of form metadata (element ids, element names and form targets) with random text. Their device does this rewriting on the code’s way out and then, I assume, rewrites the fields back to their original names and IDs on the way in, so that your web application can read the GET and POST parameters it expects. So a form sent out by the web app, like this:

<html>
  <body>
    <form action="login.php">
      <input id="username" type="text" name="username" />
      <input id="password" type="password" name="password" />
      <input type="submit" />
    </form>
  </body>
</html>

Will arrive at the user (or bot) like this:

<html>
  <body>
    Please login:
    <form action="234dsf234">
      <input id="lkj546llkj456" type="text" name="sad7683432h" />
      <input id="xcvcx98xcv" type="password" name="123fdxf123" />
      <input type="submit" />
    </form>
  </body>
</html>

The code will also be different every time you load the page, so an attacker cannot hard-code the new random names into his bot.
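To make the idea concrete, here is a minimal sketch of what such a rewriting step could look like. This is my guess at the general technique, not Shape Security's implementation: the function names (randomToken, rewriteOutbound, restoreInbound) are hypothetical, the rewrite is a naive string replacement on name attributes only, and Math.random stands in for whatever source of randomness a real device would use.

```javascript
// Hypothetical sketch of per-response field renaming (not the vendor's code).

// Build a random lowercase alphanumeric token of the given length.
function randomToken(length) {
  var alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  var token = "";
  for (var i = 0; i < length; i++) {
    token += alphabet.charAt(Math.floor(Math.random() * alphabet.length));
  }
  return token;
}

// Outbound: swap each known field name for a fresh token and remember
// the token -> original name mapping for this response.
function rewriteOutbound(html, fieldNames) {
  var mapping = {};
  var rewritten = html;
  fieldNames.forEach(function (name) {
    var token = randomToken(12);
    mapping[token] = name;
    rewritten = rewritten
      .split('name="' + name + '"')
      .join('name="' + token + '"');
  });
  return { html: rewritten, mapping: mapping };
}

// Inbound: translate randomized parameter names back to the originals
// so the web application sees the names it expects.
function restoreInbound(params, mapping) {
  var restored = {};
  Object.keys(params).forEach(function (key) {
    var known = Object.prototype.hasOwnProperty.call(mapping, key);
    restored[known ? mapping[key] : key] = params[key];
  });
  return restored;
}
```

Calling rewriteOutbound twice on the same form gives two different sets of names, which is the whole "polymorphism" claim. Note that nothing here touches the structure of the form or the type attributes.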

1,000 ways around it

If your bot is a wget/Perl/Python script with hardcoded parameter names, then indeed this will give you trouble. But, guess what… if this gives you trouble, then maybe you shouldn’t be writing bots in the first place!

Today, there are headless browsers like PhantomJS and HtmlUnit that actually read in the HTML together with the JavaScript, run the JavaScript (oh no!!!) and construct a DOM which you can inspect/modify using JavaScript. As a matter of fact, I remember at least one article about a DDoS attack carried out by an army of PhantomJS browsers running on compromised machines.

So your PhantomJS-based botnet has received the above page. Now what?

#1 Find the form using document.forms

No need to mess around with pesky names. How about this, instead?

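//PhantomJS code
// "elements" lists only the form controls, so whitespace text nodes
// between the tags do not shift the indexes (as they would with childNodes)
document.forms[0].elements[0] //This is your username field
document.forms[0].elements[1] //This is your password field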
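If you don't even want to rely on the order of the fields, note that in the rewritten page above the type attributes are left untouched. A minimal sketch, assuming that stays true (findLoginFields is a name I made up for illustration):

```javascript
// Hypothetical helper: pick the login fields by their "type" attribute,
// which the rewritten example page leaves intact.
function findLoginFields(form) {
  var found = { username: null, password: null };
  // form.elements is the collection of the form's controls.
  for (var i = 0; i < form.elements.length; i++) {
    var el = form.elements[i];
    if (el.type === "text" && !found.username) {
      found.username = el;   // first text box: assumed to be the username
    } else if (el.type === "password" && !found.password) {
      found.password = el;   // first password box
    }
  }
  return found;
}
```

Inside the page you would call findLoginFields(document.forms[0]), set .value on the two fields that come back, and submit the form.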

And now, I can hear you thinking out loud: what if they add a random number of forms before and after the right form? How will you then find the proper form? Well, the answer lies in the fact that the page should still look the same to real users (as they claim it will). Thus these extra forms have to be invisible/hidden… and you can test for that 🙂

//PhantomJS code
for (var i = 0; i < document.forms.length; i++) {
  if (document.forms[i].offsetHeight == 0 && document.forms[i].offsetWidth == 0) {
    // This is fake/invisible, skip it
    continue;
  } else {
    // This is a real form, visible to the user.
    ...
  }
}

#2 Find user-readable text

A login form on a website should still have meaningful textual cues next to it, so that a legitimate user knows what to type in each box (“Username”, “Password”, etc.). Use an XPath expression to find the element containing the text that sits next to the input field you’re after, and then use “.nextSibling” to get the input element right next to it (skipping any whitespace text nodes in between). If they are adding fake elements, test the visibility of each sibling again until you find a visible one.
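A rough sketch of that approach follows. It assumes markup where the cue text lives in its own element (say, a label) that is a sibling of the input; the example page above has no such labels, so this layout and the helper names (findCue, inputAfter, isVisible) are my assumptions.

```javascript
// Treat an element as visible if it takes up any space on the page.
function isVisible(el) {
  return el.offsetWidth > 0 || el.offsetHeight > 0;
}

// Find the first element that has a text node containing the cue text.
// The literal 9 is the value of XPathResult.FIRST_ORDERED_NODE_TYPE.
function findCue(doc, text) {
  var xpath = '//*[text()[contains(., "' + text + '")]]';
  return doc.evaluate(xpath, doc, null, 9, null).singleNodeValue;
}

// Walk forward from the cue through its siblings and return the first
// visible <input>, skipping text nodes and hidden decoy inputs.
function inputAfter(cueNode) {
  var node = cueNode.nextSibling;
  while (node) {
    // nodeType 1 means "element"; anything else is text or a comment.
    if (node.nodeType === 1 &&
        node.tagName.toUpperCase() === "INPUT" &&
        isVisible(node)) {
      return node;
    }
    node = node.nextSibling;
  }
  return null;
}
```

In the page context, inputAfter(findCue(document, "Username")) would then hand you the username box regardless of what its name or id was rewritten to.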

#3 If all else fails…

This won’t happen, but let’s assume, for the sake of argument, that it does. Get a screenshot of the loaded page using your PhantomJS browser (a handful of lines, I kid you not):

//PhantomJS code
var page = require('webpage').create();
page.open('http://github.com/', function () {
  page.render('github.png');
  phantom.exit();
});

Send the screenshot to a CAPTCHA sweatshop in China and have a poor guy who currently makes $3/day typing CAPTCHAs take a break and just click wherever he sees an input field on the image. You will probably not pay more than 0.5 cents (not dollars) per screenshot (the current CAPTCHA buying price is $1-2 per 1,000 CAPTCHAs) and in return, you get back an array of <x,y> coordinates. Use Selenium, CasperJS, whatever you want, click on the coordinates, fill in the text fields and submit.
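Sticking with PhantomJS, the last step could look roughly like the sketch below. It uses the webpage object's sendEvent method to deliver native clicks and key presses. The shape of the points array (one {x, y} object per text field, in order, followed by the submit button) is purely my assumption about what the human solver sends back, and fillAndSubmit is a hypothetical name.

```javascript
// page:   a PhantomJS webpage object (anything with a sendEvent method)
// points: [{x: .., y: ..}, ...] clicked by the human solver; assumed to
//         list the text fields in order, followed by the submit button
// values: the strings to type into the text fields, in the same order
function fillAndSubmit(page, points, values) {
  if (points.length < values.length + 1) {
    throw new Error("need one point per field plus one for submit");
  }
  for (var i = 0; i < values.length; i++) {
    page.sendEvent("click", points[i].x, points[i].y); // focus the field
    page.sendEvent("keypress", values[i]);             // type the text
  }
  var submit = points[values.length];
  page.sendEvent("click", submit.x, submit.y);         // press the button
}
```

Since the events are delivered by coordinates, it makes no difference what the fields are called in the HTML.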

Conclusion

In my humble opinion, this HTML-polymorphism thing will do little more than produce a few laughs and high-fives among the people writing today’s bots. If you still want to use this technology, I will, for half the price of whatever it will sell for, give you a box that will paint your forms green and brown, so that they can hide in bushes where bots can’t find them.

Till next time

Nick Nikiforakis

Update – 22 January 2014

I just received the following email from Marc Hansen of Shape Security:

Hi Dr. Nikiforakis, I appreciated your analysis of the ShapeShifter. Given the amount of information you had to evaluate, I agree with your conclusions. However, as you suspected, it is actually more sophisticated. We simplified the public explanation for several reasons including the full explanation is lost on most readers; some patent filings are not completed; and most importantly, we didn’t want to scoop the academic paper on subject we just submitted for consideration at an academic conference. However, even this first paper is just a subset of what we are fully doing. As a matter of strategy we expect to continue to disclose the real details of our concepts via academic channels and use our web site for more simplified explanations. We are engaged in a very serious and well funded effort to find the best solutions to protect web sites from automated attack. We are assembling a team of the best people to help and have nice start, but we are always interested in top talent. I have copied Dr. Ariya Hidayat and Dr. Xinran Wang on this email. Ariya leads our team building the countermeasures and Xinran leads our security research. You might recognize Ariya’s name, as he was the developer of PhantomJS–so, yes, this vector had occurred to us 🙂 If you have any interest in collaborating with our team to help find the best solutions, please drop us a note. Best regards,

Marc Hansen

Given that Ariya Hidayat, the developer of PhantomJS, is part of the team, I am really eager to see what extra tricks are in place to stop the easy attacks that I mentioned above. Only time will tell.