Use rate limiting in HAProxy to stop clients from making too many requests and promote fair usage of your services.

Rate limiting in HAProxy stops a client from making too many requests during a window of time. You might have a policy that stipulates how many requests a client can make, just as a matter of keeping resource usage fair. Or, you may want to put rate limiting in place to guard against certain types of attacks like application-layer DDoS attacks.

There are several ways for you to turn on rate limiting. Each technique uses the flexible building blocks of the HAProxy configuration language, combining access control lists (ACLs), stick tables, and maps, to compose a slightly different solution meant for a particular use case.

In this blog post, we’ll zero in on limiting the number of HTTP requests that a client can make. We’ll save other interesting scenarios, such as limiting the number of connections, the bytes flowing in, the bytes flowing out, and the number of errors, for another time. We’ll also avoid itemizing every way that you can react to misbehaving clients and simply focus on denying them. However, there are, in fact, many actions you can take in HAProxy when you see a client exceeding a rate limit. For example, you can tarpit them, send them to a different pool of servers, or ban them for some extended period of time. HAProxy Enterprise adds even more options, such as the ability to present reCAPTCHA and JavaScript challenges.

Setting the Maximum Connections

Before diving into actual rate limiting, note that you can achieve a level of fairness by enabling queuing. Queuing means that you can store excess connections in HAProxy until your servers are freed up to handle them. HAProxy is designed to hold onto lots of connections without a sharp increase in memory or CPU usage. However, queuing has to be turned on before you’ll see the benefit.

Use the maxconn parameter on a server line to cap the number of concurrent connections that will be sent. Here’s an example that sends up to 30 connections at a time to each server. After all servers reach their maximum, the connections queue up in HAProxy:
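A minimal sketch of such a backend follows. The backend name servers, the server names, and the IP addresses are placeholders chosen for illustration; only the maxconn 30 setting comes from the description above.

```haproxy
backend servers
    # Each server receives at most 30 concurrent connections.
    # Connections beyond that wait in HAProxy's queue.
    server s1 192.168.30.10:80 check maxconn 30
    server s2 192.168.30.11:80 check maxconn 30
    server s3 192.168.30.12:80 check maxconn 30
```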

If all 30 connections are being used on all three servers, or in other words 90 connections are active, then new connections will have to wait in line for a slot to free up. This means that the servers themselves won’t become overloaded.

In all likelihood, a server will become available fast enough that the client will never even know the difference. You can define how long clients should be queued by adding the timeout queue setting, like this:
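As a sketch, the same hypothetical backend with a queue timeout might look like the following. The 10-second value is an assumption; pick whatever wait your clients can tolerate.

```haproxy
backend servers
    # A request that waits in the queue longer than 10 seconds
    # is dropped and the client receives a 503 response.
    timeout queue 10s
    server s1 192.168.30.10:80 check maxconn 30
    server s2 192.168.30.11:80 check maxconn 30
    server s3 192.168.30.12:80 check maxconn 30
```

If timeout queue is not set, HAProxy falls back to the value of timeout connect.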

The idea behind setting a timeout is that it’s better to let some clients receive a 503 Service Unavailable error than to allow your servers to become buried under the load. Or, from the client’s perspective, it’s better to get an error and deal with it (programmatically, of course), than to wait an extended amount of time and possibly cause errors that are more difficult to resolve.

Sliding Window Rate Limiting

Let’s look at the most straightforward case of rate limiting. In this scenario, you want to limit the number of requests that a user can make within a certain period of time. The period is a sliding window. So, if you set it to allow no more than 20 requests per client during the last 10 seconds, HAProxy will count the requests that the client made during the most recent 10 seconds. Consider this HAProxy configuration:
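Here is a minimal sketch of such a configuration. The frontend name website, the backend name servers, the bind address, and the 30-second expiry are assumptions made for illustration; the 10-second window, the limit of 20 requests, and the 429 status match the scenario described here.

```haproxy
frontend website
    bind :80
    # Key on the client IP (type ipv6 also holds IPv4 addresses) and
    # store each client's HTTP request rate over a 10-second window.
    stick-table type ipv6 size 100k expire 30s store http_req_rate(10s)
    # Add the client to the table and start counting its requests.
    http-request track-sc0 src
    # Deny with 429 once the client exceeds 20 requests in the window.
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 20 }
    default_backend servers
```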

The stick-table directive creates a key-value store for storing counters like the HTTP request rate per client. The key is the client’s IP address, as configured by the type parameter, which is used to store and aggregate that client’s number of requests. The http-request track-sc0 src line adds the client as a record in the stick table. The counters begin to be recorded as soon as the IP is added.

A stick table record expires and is removed after a period of inactivity by the client, as set by the expire parameter. That’s just a way of freeing up space. Without an expire parameter, the oldest records are evicted when the storage becomes full. Here, we’re allowing 100,000 records.

The http-request deny line sets the rate limit threshold and the action to take when someone exceeds it. Here, we’re allowing up to 20 requests during the last 10 seconds and denying additional ones with a 429 Too Many Requests response until the rate drops below the threshold again. Other actions include forwarding the client to a dedicated backend or silently dropping the connection. The sc_http_req_rate fetch method returns the client’s current request rate.

You can play with the time period or the threshold. Instead of counting requests over 10 seconds, you might extend it to something like 1000 requests during the last 24 hours. Simply change the counter specified on the stick-table line from http_req_rate(10s) to http_req_rate(24h). Then update the http-request deny line to allow no more than 1000 requests.

We covered a similar example in our blog post, Bot Protection with HAProxy. In that post, we demonstrated how to track a client’s error rate, which can be used to detect vulnerability scanners.

Rate Limit by Fixed Time Window

Suppose you wanted to allow up to 1000 requests per day. In the last example, we used a sliding window. So, if a person makes 500 requests late on Monday and another 500 early on Tuesday, both batches fall within the same 24-hour window and their combined total counts towards the 1000-request limit. If, instead, you decided that a person should be allowed 1000 requests per calendar day, with the count resetting at midnight, then you’d have to go about it differently.

Rather than using the http_req_rate counter, which takes a time period, you’d use http_req_cnt, which increments forever until reset or until the expiration is hit. You would then use the HAProxy Runtime API to clear all records at exactly midnight.

First, update your frontend to look like this:
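A sketch of that frontend, again assuming the illustrative names website and servers and a 24-hour expiry as a safety net:

```haproxy
frontend website
    bind :80
    # http_req_cnt is a plain counter with no time window of its own.
    stick-table type ipv6 size 100k expire 24h store http_req_cnt
    http-request track-sc0 src
    # Deny with 429 once the client has made more than 1000 requests.
    http-request deny deny_status 429 if { sc_http_req_cnt(0) gt 1000 }
    default_backend servers
```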

Now, when a client makes request 1001, they will be denied. However, you need a way to reset this status at the end of each day. Enable the Runtime API by adding a stats socket directive to the global section of your HAProxy configuration:
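For example, the following exposes the Runtime API on a UNIX socket. The socket path and file mode are assumptions; adjust them to your system.

```haproxy
global
    # level admin permits commands that modify state, such as clear table.
    stats socket /run/haproxy/admin.sock mode 660 level admin
```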

Next, install the socat utility and use it to invoke the clear table Runtime API command to clear all records from the stick table:
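A small shell sketch of that call follows. The socket path and the table name website are assumptions: the path must match your stats socket line, and the table is named after the frontend that declares it. The helper function clear_table is hypothetical and only wraps the command for convenience.

```shell
# Path to the Runtime API socket; must match the stats socket directive.
HAPROXY_SOCKET="${HAPROXY_SOCKET:-/run/haproxy/admin.sock}"

# Remove every record from the stick table named by the first argument.
clear_table() {
    echo "clear table $1" | socat stdio "$HAPROXY_SOCKET"
}
```

Calling clear_table website then empties the table that belongs to the website frontend.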

You could set up a cron job to do this automatically each day. Set it and forget it. If you need to clear a single record as a one-off, you can include the client’s IP address, as shown:
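Sketched in shell, with the same assumed socket path and a hypothetical helper named clear_client, the key argument of clear table restricts the command to one record:

```shell
# Path to the Runtime API socket; must match the stats socket directive.
HAPROXY_SOCKET="${HAPROXY_SOCKET:-/run/haproxy/admin.sock}"

# Remove only the record whose key (here, a client IP) is given as the
# second argument, from the table given as the first argument.
clear_client() {
    echo "clear table $1 key $2" | socat stdio "$HAPROXY_SOCKET"
}
```

For instance, clear_client website 192.168.50.10 resets the counters for that one address, which is a placeholder here.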

Rate Limit by URL

Some pages require more processing time than others, such as pages that query a database to render a report. They might need a stricter rate limit. In that case, you might decide to set the limit threshold depending on the page. In this scenario, we’ll check the URL path as an added dimension.

First, add a file called rates.map to the /etc/haproxy directory. This map file will associate URL paths with their rate limits. Add the following to it, in which three paths are associated with various thresholds:
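The paths and thresholds below are placeholders for illustration. Each line holds a URL path prefix followed by the number of requests allowed for it.

```text
/urla  10
/urlb  20
/urlc  30
```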

Next, update your HAProxy configuration to look like this:
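A sketch of this setup follows, assuming the frontend and its table are named website, the backend is named servers, and the window is 10 seconds. The key length of 20 bytes is chosen so that the 4-byte hash plus an IPv6 source address fits.

```haproxy
frontend website
    bind :80
    stick-table type binary len 20 size 100k expire 10s store http_req_rate(10s)

    # Track the client by Host header + URL path + source IP
    http-request track-sc0 base32+src

    # Look up the limit for the requested path; default to 20 if not found
    http-request set-var(req.rate_limit) path,map_beg(/etc/haproxy/rates.map,20)

    # Read the client's current request rate for this page from the table
    http-request set-var(req.request_rate) base32+src,table_http_req_rate(website)

    # Limit minus current rate; a negative result means the limit is exceeded
    acl rate_abuse var(req.rate_limit),sub(req.request_rate) lt 0

    http-request deny deny_status 429 if rate_abuse
    default_backend servers
```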

Instead of keying off of IP addresses in the stick table, we’ve specified a type of binary. The key is produced by the base32+src fetch method, which concatenates a 32-bit hash of the HTTP Host header and URL path with the client’s source IP address. The http-request track-sc0 base32+src directive stores it in the table. That way, you can track a client’s request rate separately for each webpage.

The first http-request set-var line finds the request rate threshold in the rates.map file for the current URL path being requested. If the requested URL is not found in the map file, a default of 20 is used. It stores the result in a variable named req.rate_limit. The next http-request set-var line sets a variable named req.request_rate to the client’s current request rate for the page.

To compare the allowed limit with the client’s request rate, we subtract the request rate from the limit and check whether the result is negative. If it is, the client has surpassed the threshold for that page and we deny the request.

Rate Limit by URL Parameter

Here’s a slight variation on rate limiting by URL path: rate limiting by URL parameter. You might use this if your clients include an API token in the URL to identify themselves.
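A sketch of such a configuration, assuming the frontend and its table are named website, the backend is named servers, and the URL parameter is called token:

```haproxy
frontend website
    bind :80
    stick-table type string size 100k expire 24h store http_req_rate(24h)

    # True when the request carries a token URL parameter
    acl has_token url_param(token) -m found

    # True when this token has made more than 1000 requests in 24 hours
    acl exceeds_limit url_param(token),table_http_req_rate(website) gt 1000

    # Count the request against the token, unless it is already over the limit
    http-request track-sc0 url_param(token) unless exceeds_limit

    # Deny requests without a token or over the limit
    http-request deny deny_status 429 if !has_token or exceeds_limit
    default_backend servers
```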

Here, we’re using a sliding window of 24 hours, during which time a client can make up to 1000 requests. The stick table’s type is a string and we’re using the http-request track-sc0 line to store a URL parameter named token as the key in the table. So, a user might request a page like this:

http://yourwebsite.com/api/v1/does_a_thing?token=abcd1234

The has_token ACL ensures that a token is included in the URL. The exceeds_limit ACL finds the current request count for the last 24 hours. The http-request deny line denies the request if the client has exceeded the limit or didn’t give a token. Note that we’ve added an unless exceeds_limit clause to the end of the http-request track-sc0 line since there’s no point in continuing to increment the counter after they’ve exceeded the limit. It also prevents the client from being perpetually blocked and lets the entry expire.

You may wonder when you should use the http_req_rate(24h) counter vs the http_req_cnt counter in conjunction with an expire parameter set to 24h. The former is a sliding window over the last 24 hours. The latter begins when the user sends their first request and increments from then on until the expiration. However, unless you’re manually clearing the table every 24 hours via the Runtime API, the http_req_cnt could stay in effect for a long time while the client stays active. That’s because the expiration is reset whenever the record is touched.

HAProxy Enterprise reCAPTCHA Module

HAProxy Enterprise adds several security-related modules that help you correctly identify bots and respond intelligently. One is the reCAPTCHA module. When an attacker launches a denial-of-service attack, oftentimes they’ll deploy a legion of bots to throw requests at you. When you detect that a client has exceeded your rate limit, rather than just denying them, you can send them a reCAPTCHA challenge.

The benefit of a reCAPTCHA is that it lowers the risk of false positives. Maybe a legitimate user got caught by the limit. The module lets them prove that they’re a human. Those malicious bots will be stopped, or at least slowed to the point that it’s inconvenient for them to keep attacking your service, but true, human visitors will be able to pass the test and continue.

Conclusion

In this blog post, you learned several ways to enable rate limiting in HAProxy. Using its building blocks—ACLs, stick tables, and maps—various sophisticated techniques are not only possible, but easy to implement. A common approach is to track users over a sliding window of time. However, you can use the Runtime API to clear stick table records to achieve fixed-time-period rate limiting, as in midnight-to-midnight. Also, because you have access to all of the information inside of the HTTP request, it’s possible to base your rate limit on the URL’s path or parameters.
