Web apps are impossible to secure

At the end of the 1990s a horrible realisation was dawning on the software industry: security bugs in C/C++ programs weren’t rare one-off mistakes that could be addressed with ad-hoc processes. They were everywhere. People began to realise that if a piece of C/C++ was exposed to the internet, exploits would follow.

We can see how innocent the world was back then by reading the SANS report on Code Red from 2001:

“Representatives from Microsoft and United States security agencies held a press conference instructing users to download the patch available from Microsoft and indicated it as “a civic duty” to download this patch. CNN and other news outlets following the spread of Code Red urged users to patch their systems.”

Windows did have automatic updates, but if I recall correctly they were not switched on by default. The idea that software might change without the user’s permission was something of a taboo.

[Image: First signs of a Blaster infection]

The industry began to change, but only with lots of screaming and denial. Back then it was conventional wisdom amongst Linux and Mac users that this was somehow a problem specific to Microsoft … that their systems were built by a superior breed of programmer. So whilst Microsoft accepted that it faced an existential crisis and introduced the “secure development lifecycle” (a huge retraining and process program), its competitors did very little. Redmond added a firewall to Windows XP and introduced code signing certificates. Mobile code became restricted. As it became apparent that security bugs were bottomless, “Patch Tuesday” was introduced. Clever hackers kept discovering that bug types once considered benign were nonetheless exploitable, and that exploit mitigations once considered strong could be worked around. The Mac and Linux communities slowly woke up to the fact that they were not magically immune to viruses and exploits.

The final turning point came in 2008 when Google launched Chrome, a project notable for the fact that it had put huge effort into a complex but completely invisible renderer sandbox. In other words, the industry's best engineers were openly admitting they could never write secure C++ no matter how hard they tried. This belief and design has become a de-facto standard.

Now it’s the web’s turn

Unfortunately, the web has not led us to the promised land of trustworthy apps. Whilst web apps are kind of sandboxed from the host OS, and that’s good, the apps themselves are hardly more robust than Windows code was circa 2001. Instead of fixing our legacy problems for good the web just replaced one kind of buffer overflow with another. Where desktop apps have exploit categories like “double free”, “stack smash”, “use after free” etc, web apps fix those but then re-introduce their own very similar mistakes: SQL injection, XSS, XSRF, header injection, MIME confusion, and so on.

This leads to a simple thesis:

I put it to you that it’s impossible to write secure web apps.

Let’s get the pedantry out of the way. I’m not talking about literally all web apps. Yes you can make a secure HTML Hello World, good for you.

I’m talking about actual web apps of decent size, written under realistic conditions, and it’s not a claim I make lightly. It’s a belief I developed during my eight years at Google, where I watched the best and brightest web developers ship exploitable software again and again.

The Google security team is one of the world’s best, perhaps the best, and they put together this helpful guide to some of the top mistakes people make as part of their internal training program. Here’s their advice on securely sending data to the browser for display:

To fix, there are several changes you can make. Any one of these changes will prevent currently possible attacks, but if you add several layers of protection (“defense in depth”) you protect against the possibility that you get one of the protections wrong and also against future browser vulnerabilities. First, use an XSRF token as discussed earlier to make sure that JSON results containing confidential data are only returned to your own pages. Second, your JSON response pages should only support POST requests, which prevents the script from being loaded via a script tag. Third, you should make sure that the script is not executable. The standard way of doing this is to append some non-executable prefix to it, like “ ”. A script running in the same domain can read the contents of the response and strip out the prefix, but scripts running in other domains can't. NOTE: Making the script not executable is more subtle than it seems. It’s possible that what makes a script executable may change in the future if new scripting features or languages are introduced. Some people suggest that you can protect the script by making it a comment by surrounding it with /* and */ , but that's not as simple as it might seem. (Hint: what if someone included */ in one of their snippets?)

Reading this ridiculous pile of witchcraft and folklore always makes me laugh. It should be a joke, but it’s actually basic stuff that every web developer at Google is expected to know, just to put some data on the screen.
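For a sense of what the third step in that advice amounts to, here is a minimal Python sketch. It is not Google's code: the prefix value and the function names protect and unprotect are assumptions chosen for illustration, and a real deployment would do the stripping in browser JavaScript rather than Python.

```python
import json

# Assumed prefix, picked only for illustration: it is not valid JavaScript,
# so a response loaded through a script tag fails to parse instead of running.
PREFIX = ")]}',\n"


def protect(payload):
    """Server side: serialise the payload and prepend the non-executable prefix."""
    return PREFIX + json.dumps(payload)


def unprotect(body):
    """Same-origin client: check for the prefix, strip it, then parse the JSON."""
    if not body.startswith(PREFIX):
        raise ValueError("missing non-executable prefix")
    return json.loads(body[len(PREFIX):])


body = protect({"balance": 42})
data = unprotect(body)
```

The sketch only covers the prefix. The XSRF token and the POST-only rule from the same advice still have to be implemented separately, and correctly, for the layers to add up.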

Actually you can do all of that and it still doesn’t work. The HEIST attack allows data to be stolen from a web app that implements even all the above mitigations and it doesn’t require any mistakes. It exploits unfixable design flaws in the web platform itself. Game over.

Not really! It gets worse! Protecting REST/JSON endpoints is only one of many different security problems a modern web developer must understand. There are dozens more (here’s an interesting example and another fun one).

My experience has been that attempting to hire a web developer who has even heard of all these landmines always ends in failure, let alone one who can reliably avoid them. Hence my conclusion: if you can’t hire web devs that understand how to write secure web apps then writing secure web apps is impossible.

The core problem

Virtually all security problems on the web come from just a few core design issues:

Buffers that don’t specify their length

Protocols designed for documents not apps

The same origin policy

Losing track of the size of your buffers is a classic source of vulnerabilities in C programs and the web has exactly the same problem: XSS and SQL injection exploits are all based on creating confusion about where a code buffer starts and a data buffer ends. The web is utterly dependent on textual protocols and formats, so buffers invariably must be parsed to discover their length. This opens up a universe of escaping, substitution and other issues that didn’t need to exist.
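To make the code/data confusion concrete, here is a small sketch using Python's built-in sqlite3 module. The table, the user names and the attack string are all made up for illustration; the point is only that the first function lets the data decide where the data ends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def lookup_unsafe(name):
    # The value is spliced into the SQL text, so a quote character inside it
    # ends the "data buffer" early and the rest is parsed as code.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(name):
    # The value is passed separately from the SQL text, so nothing inside it
    # is ever scanned for delimiters.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)
    ).fetchall()


attack = "nobody' OR '1'='1"
leaked = lookup_unsafe(attack)   # the WHERE clause now matches every row
nothing = lookup_safe(attack)    # no user has that literal name
```

Parameterised queries fix this one case, but they are an opt-in convention layered on top of a textual format. Every other place where strings get glued together needs its own equivalent.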

The fix: All buffers should be length-prefixed, all the way from the database to the frontend server to the user interface. There should never be a need to scan something for magic characters to determine where it ends. Note that this requires binary protocols, formats and UI logic throughout the entire stack.
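As a rough sketch of what length prefixing looks like in practice, the Python below frames each field with its size and decodes by counting bytes. The 4-byte big-endian header and the names encode_fields and decode_fields are assumptions for illustration, not any particular wire protocol.

```python
import struct


def encode_fields(fields):
    """Frame each field as a 4-byte big-endian length followed by its raw bytes."""
    return b"".join(struct.pack(">I", len(f)) + f for f in fields)


def decode_fields(buf):
    """Walk the buffer using the declared lengths; field contents are never scanned."""
    fields = []
    pos = 0
    while pos < len(buf):
        if pos + 4 > len(buf):
            raise ValueError("truncated length header")
        (size,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        if pos + size > len(buf):
            raise ValueError("truncated field")
        fields.append(buf[pos:pos + size])
        pos += size
    return fields


# Quotes, angle brackets and NUL bytes are just payload here.
fields = [b"bob", b"'; DROP TABLE users; --", b"</script>\x00"]
wire = encode_fields(fields)
roundtrip = decode_fields(wire)
```

Because the decoder never looks inside a field, there is nothing to escape and nothing to forget to escape. The decoder still has to reject lengths that run past the end of the buffer, which is the one check this design cannot skip.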

HTTP and HTML were designed for documents. When Egor Homakov was able to break Authy’s 2-factor authentication product by simply typing “../sms” inside the SMS code input field, he succeeded because like all web services Authy is built on a stack designed for hypertext, not software. Path traversal is helpful if what you’re accessing is an actual set of directories with HTML files in them, as Sir Tim intended. If you’re presenting a programming API as “documents” then path traversal can be fatal.
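The mechanics can be sketched in a few lines of Python. Everything here is hypothetical: the /api/verify and /api/sms paths and the function names are invented for illustration and are not Authy's actual API. posixpath.normpath stands in for whichever server or proxy layer resolves ".." segments.

```python
import posixpath
from urllib.parse import quote


def build_path_unsafe(code, user_id):
    # User input is pasted straight into a path, as if it were a file name.
    return "/api/verify/" + code + "/" + user_id


def build_path_safe(code, user_id):
    # Percent-encode every reserved character, including "/", so each input
    # stays inside a single path segment.
    return "/api/verify/" + quote(code, safe="") + "/" + quote(user_id, safe="")


def normalise(path):
    # Stand-in for a layer that collapses "." and ".." segments.
    return posixpath.normpath(path)


hijacked = normalise(build_path_unsafe("../sms", "1234"))
contained = normalise(build_path_safe("../sms", "1234"))
```

In the unsafe version the code the user typed silently rewrites which endpoint is called. The encoded version only holds up on the assumption that no layer between client and router decodes the segment before the path is resolved, which is exactly the kind of detail that gets lost in a stack built for documents.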

REST was bad enough when it returned XML, but nowadays XML is unfashionable and instead the web uses JSON, a format so badly designed it actually has an entire section in its wiki page just about security issues.

The fix: Let’s stop pretending REST is a good idea. REST is a bad idea that twists HTTP into something it’s not, only to work around the limits of the browser, another tool twisted into being something it was never meant to be. This can only end in tears. Taking into account the previous point, client/server communication should be using binary protocols that are designed specifically for the RPC use case.

The same origin policy is another developer experience straight out of a Stephen King novel. Quoth the wiki:

The behavior of same-origin checks and related mechanisms is not well-defined in a number of corner cases … this historically caused a fair number of security problems. In addition, many legacy cross-domain operations predating JavaScript are not subjected to same-origin checks. Lastly, certain types of attacks, such as DNS rebinding or server-side proxies, permit the host name check to be partly subverted.

The SOP is a result of Netscape bolting code onto a document format. It doesn’t actually make any sense and you wouldn’t design an app platform that way if you had more than 10 days to do it in. Still, we can’t really blame them as Netscape was a startup working under intense time pressure, and as we already covered above, back then nobody was thinking much about security anyway. For a 10 day coding marathon it could have been worse.

Regardless of our sympathy it’s the SOP that lies at the heart of the HEIST attack, and HEIST appears to break almost all real web apps in ways that probably can’t be fixed, at least not without breaking backwards compatibility. That’s one more reason writing secure web apps is impossible.

The fix: apps need a clear identity and shouldn’t be sharing security tokens with each other by default. If you don’t have permission to access a server you shouldn’t be able to send it messages. Every platform except the web gets this right.

There are a bunch of other design problems in the web that make it hard to secure, but the above examples are hopefully enough to convince.