HAProxy’s high-performance security capabilities are utilized as a key line of defense by many of the world’s top enterprises. Application-layer DDoS attacks are aimed at overwhelming an application with requests or connections, and in this post we will show you how an HAProxy load balancer can protect you from this threat.

This post describes techniques that use ACLs, maps, and stick tables. For introductions to those topics, see our blog posts Introduction to HAProxy ACLs, Introduction to HAProxy Maps, and Introduction to HAProxy Stick Tables.

Put any website or application up these days and you’re guaranteed to be the target of a wide variety of probes and attacks. Your website is a boat that must constantly weather the storm of various threats, including distributed denial-of-service (DDoS) attacks, malicious bots, and intrusion attempts. Over the years, HAProxy has evolved to thrive in these perilous waters through the development of flexible building blocks that can be combined to mitigate nearly any type of threat. These building blocks include high-performance ACLs and maps, real-time tracking with stick tables, a performant SSL/TLS stack, a WAF, and much more. Even with all these added capabilities, it maintains the best-in-class performance that it’s known for.

The spectrum of companies benefiting from HAProxy’s advanced security capabilities ranges from small mom-and-pop shops to large enterprises, managed hosting companies, and load balancer-as-a-service platforms serving millions of requests per second. Top websites include GitHub, which uses HAProxy to protect its network from application-layer DDoS attacks, and StackExchange, which uses it to detect and protect against bot threats. Furthermore, Booking.com chose HAProxy as a core component in its edge infrastructure for its superior performance after comparing it with other software load balancers on the market.

In this blog post, we’ll demonstrate how the HAProxy load balancer protects you from application-layer DDoS attacks that could otherwise render your web application dead in the water, unreachable by ordinary users. In particular, we’ll discuss HTTP floods. An HTTP flood operates at the application layer and inundates your servers with web requests, with the attacker hoping to overwhelm your application’s capacity to respond.

HTTP Flood

The danger of HTTP flood attacks is that they can be carried out by just about anyone. They don’t require a large botnet and tools for orchestrating the attack are plentiful. This accessibility makes it especially important that you have defenses in place to repel these assaults.

These attacks can come in a few different forms, but the most commonly seen pattern consists of attackers requesting one or more of your website’s URLs with the highest frequency they are able to achieve. A shotgun approach will be to request random URLs, whereas more sophisticated attackers will profile your site first, looking for slow and uncached resources that are more vulnerable. For example, they may target search pages.

In order to evade detection for longer, the attack may consist of many different source IP addresses. It may be carried out by bots or by groups of real users working in unison to bring down your site. That was the case with the Low Orbit Ion Cannon (LOIC) and High Orbit Ion Cannon (HOIC) attacks carried out by the hacktivist collective Anonymous. The seemingly widespread range of source IPs is what characterizes the distributed nature of the attack.

HAProxy comes with features for mitigating HTTP floods and will play a vital part in your overall defense strategy. In this blog post, we will be using HAProxy Enterprise because of its additional security features, which we’ll talk about later in this article. However, many of the solutions you will see will work for the Community Edition as well with minimal adjustments to names and paths.

Manning the Turrets

The ideal place to stop an HTTP flood is at the edge of your network. Stopping threats here protects your upstream web applications by minimizing the traffic and system load that could impact them, as well as other sites and services running on those servers. It also prevents unnecessary confusion during attack identification by drawing a clear frontline to the battle.

The HAProxy load balancer receives requests from the Internet and passes them to your web servers. This lets you guard the perimeter.

The other network devices that sit between HAProxy and the Internet, including routers and firewalls, are typically operating at too low a level to allow for request inspection.

Did You Know? ALOHA, the HAProxy plug-and-play appliance, can protect you from low-level, protocol-based attacks, such as SYN floods, at line rate with our mitigation solution called PacketShield. PacketShield is powered by NDIV, an open-source network traffic processing framework that we’ve been working on since 2013. We have since been working closely with the XDP team to bring some NDIV features to XDP and make NDIV work on top of XDP.

With HAProxy, you have two methods that are very effective at classifying malicious requests. The first is to monitor the rate at which a user is making requests. The second is to flag HTTP requests that have signatures you wouldn’t expect to see from ordinary users.

For the best results, you should combine the two methods. Setting request rate limits lets you block clients that access your website’s resources too frequently, while denying requests that contain anomalous data narrows the field of possible attackers.

Setting Request Rate Limits

Tracking user activities across requests requires in-memory data storage that can identify returning clients and correlate their actions. This is key to setting rate-limiting thresholds—being able to track how many requests someone is making. HAProxy enables you to do this through an extremely flexible and high-performance data storage called stick tables, a feature that is unique to HAProxy.

Stick tables provide a generic key-value store and can be used to track various counters associated with each client. The key can be based on anything found within the request. Typically, it will be the user’s IP address, although it can also be something more specific like the IP+UserAgent. Commonly tracked values are the request count and request rate over a period of time.

Stick tables were developed in collaboration with StackExchange, the network of Q&A communities that includes Stack Overflow, who initially approached HAProxy in 2010 about implementing rate limiting based on traffic patterns. Stick tables are an extremely mature and proven technology within HAProxy, enabling many of its advanced features. If you’re new to stick tables, you can learn more by reading our blog post Introduction to HAProxy Stick Tables.

Defining the Storage

Create a stick table by adding a stick-table directive to a backend or frontend. In the following example, we use a placeholder backend named per_ip_rates. Dedicating a backend to holding just a stick-table definition allows you to reference it in multiple places throughout your configuration.

Consider the following example:
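
A minimal sketch of such a definition, assuming a table keyed on the client IP address. The backend name per_ip_rates and the http_req_rate(10s) counter come from this post; the size and expire values are illustrative assumptions.

```haproxy
backend per_ip_rates
    # Key on the client IP; hold up to 1m entries (assumed value)
    # and drop entries that have been idle for 10 minutes (assumed value)
    stick-table type ip size 1m expire 10m store http_req_rate(10s)
```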



This sets up the storage that will keep track of your clients by their IP addresses. It initializes a counter that tracks each user’s request rate. Begin tracking a client by adding an http-request track-sc0 directive to a frontend section, as shown:
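
A minimal sketch of the tracking rule, assuming the frontend listens on port 80 and forwards to a backend named be_website that is defined elsewhere in the configuration:

```haproxy
frontend fe_mywebsite
    bind :80
    # Store each client's source IP in the per_ip_rates table
    # and update its counters on every request
    http-request track-sc0 src table per_ip_rates
    default_backend be_website
```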



With this configuration in place, all clients visiting your website through HAProxy via the fe_mywebsite frontend will be stored in the per_ip_rates stick table. All of the counters specified in the stick table definition will be automatically maintained and updated by HAProxy.

Next, let’s see how to put this data to good use.

Limiting Request Rates

Let’s say that you wanted to block any client making more than 10 requests per second. The http_req_rate(10s) counter that you added will report the number of requests over 10 seconds. So, to cap requests at 10 per second, set the limit to 100.

In the following example, we add the http-request deny directive to reject clients that have gone over the threshold:
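
A sketch of the rule under the assumptions above (port 80 is illustrative). The threshold of 100 follows from 10 requests per second multiplied by the 10-second window.

```haproxy
frontend fe_mywebsite
    bind :80
    http-request track-sc0 src table per_ip_rates
    # 100 requests per 10-second window = 10 requests per second on average
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }
```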



This rule instructs HAProxy to deny all requests coming from IP addresses whose stick table counters are showing a request rate of over 10 per second. When any IP address exceeds that limit, it will receive an HTTP 429 Too Many Requests response and the request won’t be passed to any HAProxy backend server.

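These requests will be easy to spot in the HAProxy access log, as they will have a termination state of PR--. The P indicates that the proxy aborted the session, in this case because a deny rule matched, and the R indicates that this happened at the request stage, before anything was sent to a server.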

If you’d like to define rate limit thresholds on a per URI basis, you can do so by adding a map file that pairs each rate limit with a URL path. See our blog post Introduction to HAProxy Maps for an example.

Maybe you’d like to rate limit POST requests only? It’s simple to do by adding a statement that checks the built-in ACL, METH_POST.
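
One way to sketch this, assuming the same per_ip_rates table and the same threshold of 100 as before. Because only POST requests are tracked, only they count toward the rate.

```haproxy
# Inside frontend fe_mywebsite
# Track, and therefore count, POST requests only
http-request track-sc0 src table per_ip_rates if METH_POST
http-request deny deny_status 429 if METH_POST { sc_http_req_rate(0) gt 100 }
```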



You can also tarpit abusers so that their requests are rejected with an HTTP 500 status code after a configurable delay. The duration of the delay is set with the timeout tarpit directive. Here, you’re delaying any response for five seconds:
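
A minimal sketch, assuming the same tracking rule and threshold used for the deny example:

```haproxy
frontend fe_mywebsite
    bind :80
    # Hold tarpitted requests for five seconds before answering
    timeout tarpit 5s
    http-request track-sc0 src table per_ip_rates
    http-request tarpit if { sc_http_req_rate(0) gt 100 }
```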



When the timeout expires, the response that the client gets back after being tarpitted is 500 Internal Server Error, making it more likely that they’ll think that their assault is working.

Slowloris Attacks

Before getting into our second point about DDoS detection, identifying odd patterns among users, let’s take a quick look at another type of application-layer attack: Slowloris.

Slowloris involves an attacker making requests very slowly to tie up your connection slots. Contrary to other types of DDoS, the volume of requests needed to make this attack successful is fairly low. However, as each request only sends one byte every few seconds, they can tie up many request slots for several minutes.

An HAProxy load balancer can hold a greater number of connections open without slowing down than most web servers. As such, the first step towards defending against Slowloris attacks is setting maxconn values. First, set a maxconn in the global section that leaves enough headroom so that your server won’t run out of memory even if all the connections are filled, per the sizing guide. Then inside the frontend or a defaults section, set a maxconn value slightly under that so that if an attack saturates one frontend, the others can still operate.
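
As a rough sketch of that layout, with purely hypothetical numbers that you would replace with values sized for your own hardware:

```haproxy
global
    # Process-wide ceiling (hypothetical value); size it so that memory
    # is not exhausted even when every connection slot is in use
    maxconn 100000

defaults
    # Ceiling inherited by each frontend (hypothetical value),
    # set slightly below the global limit
    maxconn 95000
```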

Next, add two lines to your defaults section:
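
A sketch of those two lines, using the five-second value described in the next paragraph:

```haproxy
defaults
    # Give clients five seconds to send a complete request
    timeout http-request 5s
    # Wait for the request body too, so the timeout covers it
    option http-buffer-request
```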



The first line causes HAProxy to respond with an HTTP 408 Request Timeout error to any client that takes more than five seconds to send its request, from the first byte to the last. Normally, this only applies to the HTTP request line and headers and doesn’t include the body of the request. However, with option http-buffer-request, HAProxy will store the request body in a buffer and apply timeout http-request to it as well.

Blocking Requests by Static Characteristics

You’ve seen how to block requests that surpass a maximum number of HTTP requests. The other way to identify and stop malicious behavior is by monitoring for messages that match a pattern. Patterns are set in HAProxy using access control lists (ACLs). Read our blog post Introduction to HAProxy ACLs if you’re new to using them.

Let’s see some useful ACLs for stopping DDoS attacks.

Using ACLs to Block Requests

A number of attacks use HTTP/1.0 as the protocol version because that’s the version supported by some bots. It’s easy to block these requests using the built-in ACL, HTTP_1.0:
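
A one-line sketch, placed in the frontend that receives the traffic:

```haproxy
# Inside frontend fe_mywebsite
# HTTP_1.0 is a predefined ACL that matches requests using HTTP/1.0
http-request deny if HTTP_1.0
```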



You can also reject requests that have non-browser User-Agent headers, such as curl.
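
A sketch of such a rule, using the req.hdr fetch with a case-insensitive substring match:

```haproxy
# Inside frontend fe_mywebsite
http-request deny if { req.hdr(user-agent) -i -m sub curl }
```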



This line will deny the request if the User-Agent request header contains the string curl anywhere in it, which is what the -m sub (substring) match method checks. The -i makes it case-insensitive. You might also check for other strings such as phantomjs and slimerjs, which are two scriptable, headless browsers that could be used to automate an attack.
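
Several patterns can be listed in the same rule; the request is denied if any one of them matches:

```haproxy
# Inside frontend fe_mywebsite
http-request deny if { req.hdr(user-agent) -i -m sub curl phantomjs slimerjs }
```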



If you have many strings that you’re checking, consider saving them to a file—one string per line—and referencing it like this:
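
A sketch using the -f flag to load patterns from a file. The path /etc/haproxy/badagents.acl is a hypothetical location; adjust it to match your installation.

```haproxy
# Inside frontend fe_mywebsite
# badagents.acl (hypothetical path) holds one substring per line
http-request deny if { req.hdr(user-agent) -i -m sub -f /etc/haproxy/badagents.acl }
```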



At other times, an attacker who is using an automated tool will send requests that don’t contain a User-Agent header at all. These can be denied too, as in the following example:
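
A sketch that uses the found match method, which is true only when the header is present:

```haproxy
# Inside frontend fe_mywebsite
http-request deny unless { req.hdr(user-agent) -m found }
```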



Even more common is for attackers to randomize the User-Agent strings that they send in order to evade detection for longer. Oftentimes, these come from a list of genuine values that a true browser would use and make it harder to identify malicious users.

This is where the HAProxy Enterprise Fingerprint Module comes in handy. It uniquely identifies clients across requests, even when they change their User-Agent string. It works by triangulating many data points about a client to form a signature specific to them. Using this information, you can then ID and dynamically block the abusers.

Blacklisting and Greylisting

Another characteristic that you might use to filter out potentially dangerous traffic is the client’s source IP address.

Whether intentionally or unintentionally, China seems to be the origin of much DDoS traffic. You may decide to blacklist all IPs coming from a particular country by researching which IP blocks are assigned to it and denying them en masse.

Use the src fetch method to get a client’s source IP address. Then, compare it to a file containing all of the IP address ranges that you wish to block.
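
A minimal sketch, assuming the list is stored at the hypothetical path /etc/haproxy/blacklist.acl:

```haproxy
# Inside frontend fe_mywebsite
# Deny clients whose source IP falls within any listed address or range
http-request deny if { src -f /etc/haproxy/blacklist.acl }
```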



Your blacklist.acl file might look like this:
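
For illustration, here is a file with one address or CIDR range per line. The ranges shown are reserved documentation ranges used as placeholders, not real attack sources.

```text
192.0.2.0/24
198.51.100.0/24
203.0.113.17
```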



To streamline this, you can use a GeoIP database like MaxMind or Digital Element. Read our blog post, Using GeoIP Database within HAProxy to see how to set this up. Alternatively, these lookups can happen directly from within HAProxy Enterprise using a native module that allows for live updates of the data and doesn’t require extra scripts to translate to map files. The native modules also result in less memory consumption in cases where lookups need to be granular, for example, on a city basis.

If you don’t like the idea of banning entire ranges of IP addresses, you might take a more lenient approach and only greylist them. Greylisting allows those clients to access your website, but enforces stricter rate limits for them.

The following example sets a stricter rate limit for clients that have IP addresses listed in greylist.acl:
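
A sketch of a greylist rule. The file path and the threshold of 5 requests per 10-second window are assumptions chosen for illustration. Both conditions must be true for the request to be denied.

```haproxy
# Inside frontend fe_mywebsite, after the track-sc0 rule
http-request deny deny_status 429 if { src -f /etc/haproxy/greylist.acl } { sc_http_req_rate(0) gt 5 }
```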



If you are operating two or more instances of HAProxy for redundancy, you’ll want to make sure that each one has the list of the IP addresses that you’ve blacklisted and that they are each updated whenever you make a change. Here’s a place where using HAProxy Enterprise gives you an advantage. By using a module called lb-update, you can host your ACL file at a URL and have each HAProxy instance fetch updates at a defined interval.

In the next example, we’re using lb-update to check for updates every 60 seconds:
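
A hedged sketch of what this might look like. The dynamic-update section and update line reflect the lb-update syntax as we understand it, and the file path and URL are hypothetical; confirm the exact syntax and the module loading step against the HAProxy Enterprise documentation for your version.

```haproxy
dynamic-update
    # Refresh the in-memory copy of the ACL file from the URL every 60 seconds
    # (the path and URL are hypothetical)
    update id /etc/haproxy/blacklist.acl url https://192.0.2.1/blacklist.acl delay 60s
```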



Protecting TCP (non-HTTP) Services

So far, we’ve primarily covered protecting web servers. However, HAProxy can also help in protecting other TCP-based services such as SSH, SMTP, and FTP. The first step is to set up a stick-table that tracks conn_cur and conn_rate:
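
A minimal sketch of such a table. The backend name per_ip_connections, the size, the expiry, and the one-minute rate window are assumptions for illustration.

```haproxy
backend per_ip_connections
    # conn_cur: connections currently open from this IP
    # conn_rate(1m): connections opened by this IP over the last minute
    stick-table type ip size 1m expire 1m store conn_cur,conn_rate(1m)
```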



Next, create or modify a frontend to use this table by adding track and reject rules:
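
A sketch for an SMTP frontend, assuming a table named per_ip_connections that stores conn_cur and conn_rate. The frontend name, timeout, and the rate threshold of 5 are illustrative.

```haproxy
frontend fe_smtp
    mode tcp
    bind :25
    option tcplog
    timeout client 1m
    tcp-request content track-sc0 src table per_ip_connections
    # Reject a client with more than one open connection,
    # or more than 5 new connections in the rate window
    tcp-request content reject if { sc_conn_cur(0) gt 1 } || { sc_conn_rate(0) gt 5 }
    default_backend be_smtp
```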



With the usual backend:
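
A sketch of a plain TCP backend. The server name, the placeholder address 192.0.2.10, and the maxconn value are hypothetical.

```haproxy
backend be_smtp
    mode tcp
    timeout server 1m
    option tcp-check
    server smtp1 192.0.2.10:25 maxconn 50 check
```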



Now, each client can establish one SMTP connection at a time. If they try to open a second one while the first is still open, the connection will be immediately closed again.

Delaying Connections

With e-mail and other server-speaks-first protocols (where the server sends a message as soon as a client connects instead of waiting for the client to say something, as with HTTP) we can delay connections as well by adding the following after the rules we added to block:
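
A sketch of the delay rules, assuming the client is already tracked with track-sc0 in a table that stores conn_rate:

```haproxy
# Inside the TCP frontend, after the track and reject rules
# Wait up to 10 seconds for the rules below to reach a verdict
tcp-request inspect-delay 10s
# A client with fewer than two recent connections passes straight through
tcp-request content accept if { sc_conn_rate(0) lt 2 }
# A client that sends data before the server has spoken is rejected
tcp-request content reject if { req_len gt 0 }
```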



This will immediately connect any client that has made only one connection within the last minute. A threshold of less than two is used so that we’re able to accept one connection, but it also makes it easy to scale that threshold up. Other connections from this client will be held in limbo for 10 seconds, unless the client sends data down that second pipe, which we check with req_len. In that case, HAProxy will close the connection immediately without bothering the backend.

This type of trick is useful against spam bots or SSH bruteforce bots, which will often launch right into their attack without waiting for the banner. With this, if they do launch right in, they get denied, and if they don’t, they have to hold the connection in memory for an additional 10 seconds. If they open more connections to get around that rate limit, the conn_cur limits from the previous section will stop them.

The Stick Table Aggregator

Using active-active HAProxy load balancers in front of your websites increases your redundancy, protecting you in case one load balancer goes offline. It also provides extra capacity when weathering an application-based DDoS attack. You can learn how to set it up by watching our on-demand webinar, Building Highly Scalable ADC Clusters Using BGP Equal-cost Multi-path Routing.

In a standard HAProxy Community configuration, each individual instance of HAProxy only sees the requests coming to it. It does not see the requests that are received by other load balancer instances. This gives an attacker more time to stay under the radar.

If you’re using HAProxy Enterprise, enabling the Stick Table Aggregator module solves this problem. It allows HAProxy servers to aggregate all of their request rates and statistics and make decisions based on the sum of data.

The illustration below depicts how multiple load balancers can be peered to share information. Note how by adding tiers of stick table aggregators, you can collect data from many instances of HAProxy. Contact us to learn how to set this up.

The reCAPTCHA and Antibot Modules

HAProxy isn’t limited to just flat-out blocking a request. Sometimes, you’ll deal with situations where things are less certain: Is it a bot or is it a bunch of visitors that appear with the same IP only because they are behind a NAT? More adaptable responses are in order.

Using a Lower Priority Backend

If you want to serve suspicious requests normally when loads are low, but restrict them when loads start increasing (or dedicate a cheap VM to suspicious requests, divert traffic to a static version of your site, etc.), using another backend can help.

To do this, create a backend with the new servers and then use a use_backend line to direct requests to it:
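
A sketch of both pieces. The backend name be_website_bots appears later in this post; the server name, the placeholder address, and the threshold of 100 are assumptions.

```haproxy
# Inside frontend fe_mywebsite, after the track-sc0 rule
use_backend be_website_bots if { sc_http_req_rate(0) gt 100 }

# A separate backend section for the suspicious traffic
backend be_website_bots
    server bots1 192.0.2.20:80 check
```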



This will typically go after the http-request deny rules, which would have a higher threshold like 200, so that an overly abusive bot will still get direct error responses, while ones with a lower request rate can get the be_website_bots backend instead. If returning errors even at the higher rates concerns you, you can add { be_conn(be_website) gt 3000 } to only outright deny requests if there are more than 3,000 currently active connections to the backend.
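
Combining the two conditions might look like the following sketch, which keeps the threshold of 200 and the 3,000-connection limit mentioned above:

```haproxy
# Inside frontend fe_mywebsite
# Deny outright only when the client is very aggressive
# and the main backend is already busy
http-request deny deny_status 429 if { sc_http_req_rate(0) gt 200 } { be_conn(be_website) gt 3000 }
```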

Sending a JavaScript Challenge

The HAProxy Enterprise Antibot module provides a way to make clients generate a key to enter the site, which will help identify individual users behind a NAT and separate the clients that support JavaScript from the ones that don’t.

The Antibot module asks the client to solve a dynamically generated math problem. It works off of the idea that many automated DDoS bots aren’t able to parse JavaScript. Or, if they are, doing so slows them down. Spending CPU time on solving the puzzle often consumes an attacker’s resources that they’re paying for by the minute and, frustrated, they will often go elsewhere in search of an easier target.

View our on-demand webinar, DDoS Attack and Bot Protection with HAProxy Enterprise, to learn more and see a demo of the Antibot module in action.

Challenging a Visitor to Solve a Captcha

The reCAPTCHA module presents the client with a Google reCAPTCHA v2 challenge that a bot won’t be able to complete. This is helpful for cases where a bot is taking advantage of a full-fledged browser such as headless Chrome or Selenium. This, like the Antibot module, weeds out illegitimate users, either stopping them in their tracks or slowing them down to the point where it’s unfavorable for them to continue the assault.

View our on-demand webinar, DDoS Attack and Bot Protection with HAProxy Enterprise, to learn more and see a demo of the reCAPTCHA module in action.

Silently Dropping Requests

When your rules clearly indicate that a bot is a bot and it is just generating too much traffic, the best thing to do is to try and overload it.

In order to make requests, the bot needs to keep track of the TCP connections, and normally so does HAProxy. Thus, both sides are tied up, except that HAProxy also has to answer other visitors at the same time. With silent-drop, HAProxy will tell the kernel to forget about the connection and conveniently forget to notify the client that it did so. Now, HAProxy doesn’t need to track that connection. This leaves the client waiting for a reply that will never come and it will have to keep the connection in its memory, using one of its source ports, until it times out. To do this, add http-request silent-drop, like so:
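
A minimal sketch, assuming the same tracking rule as before and an illustrative threshold of 100:

```haproxy
# Inside frontend fe_mywebsite, after the track-sc0 rule
# Drop the connection without sending anything back to the client
http-request silent-drop if { sc_http_req_rate(0) gt 100 }
```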



The main downside to this is that, presuming that the rules are set such that no legitimate clients will get this treatment, any stateful network devices (namely firewalls) will be confused by this, as they too won’t get a notification that the connection has closed. This will cause these devices to keep track of connections that HAProxy is no longer thinking about and, in addition, consume memory on the stateful firewall. Be mindful of this if you are using such a device.

Conclusion

In this blog post, you’ve learned how to defend your websites from application-layer attacks like HTTP floods and Slowloris by using features built into HAProxy for rate limiting and flagging suspicious clients. This safeguards your web servers and prevents malicious traffic from entering your network.

HAProxy Enterprise will give you some must-have features for aggregating stick table data and challenging suspicious clients with either JavaScript or reCAPTCHA puzzles. These extras will ensure that you’re getting the full picture of your traffic and that regular users aren’t locked out by false positives.

If you’d like to implement a DDoS attack protection solution using HAProxy, backed by enterprise support and the unique insights of the HAProxy Technologies staff, then you can request a trial of HAProxy Enterprise right away or contact us to learn more.
