When I started at Reverb, we were using Loggly for our log aggregation. Loggly is great, and I recommend them for startups just getting air under their wings. However, there comes a point where you really need to dig into your data and create custom dashboards tuned to your organization.

Enter the ELK stack.

Using the combination of these three technologies has helped us understand our traffic patterns, the diversity of requests (GET vs. POST), and who the bad actors are. Once we had the ELK stack in place, we really started to see the picture our production NGINX logs were painting.

Fig A: Good Actor

You’ll notice that the blue bar represents GET requests, while the orange bar represents POST requests. When browsing a site from a browser, a phone, or any other device, a GET request will almost always precede a POST request.

This is someone browsing our site, signing up, going away, and then coming back a few hours later. You might also notice that the requests never spike above 70 within a ten-minute period, which indicates a pretty average user.

Fig B: Bad Actor

Fig B paints a very different picture. See all those POST requests being made without any preceding GET requests? Yeah, that’s a bot: a bot injecting garbage into our application endpoints, attempting to SQL inject us.

How do we mitigate this?

First, we need to implement rate limiting on the endpoints they’re abusing. As with all security mitigation techniques, this is not a magic bullet that will stop these bad actors entirely, but it is a good first step.

It won’t block every abusive request, but it will keep most of them from reaching your application, freeing it to handle legitimate requests instead.

To do this, we’ll use the limit_req module in NGINX. It is built into the core of NGINX, so you won’t need to recompile to enable it.

In the http context of your NGINX configuration, add these lines:
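A minimal sketch of what those lines can look like (the zone name post_limit and the 1r/s rate here are illustrative; tune them for your own traffic):

```nginx
# Only POST requests get a non-empty key; requests whose key
# evaluates to an empty string are not rate limited at all.
map $request_method $limit {
    default "";
    POST    $binary_remote_addr;
}

# A 10MB shared-memory zone keyed by $limit (the client's binary IP),
# allowing an average of one request per second per client.
limit_req_zone $limit zone=post_limit:10m rate=1r/s;
```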

First, we configure NGINX to rate limit only POST requests. We can do this by mapping POST requests to a variable called $limit in a map block.

Then, with limit_req_zone, we create a 10MB (:10m) zone in shared memory and use the value of the $limit variable (the client’s binary IP) as the key stored in the zone.

Now that you’ve told NGINX to create a place to store IPs, let’s apply the rate limiting to different configuration contexts using limit_req.

Something I really like about NGINX is the ability to set configuration options in different contexts. You can specify the limit_req directive at the http, server, or location level. What this really means for us is the ability to rate limit globally, per vhost, or per location.

Here are some examples of rate limiting in different contexts:

Rate limiting for the entire NGINX process:
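A sketch, assuming the $limit zone created earlier has been named post_limit (the burst value is illustrative):

```nginx
http {
    # Applied in the http context, this limit covers every
    # server and location this NGINX process serves.
    limit_req zone=post_limit burst=5;

    # ... the rest of your http configuration ...
}
```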

Please note that your application very likely generates more than one request per page. Unless you test this option thoroughly, I recommend placing limit_req in a location block instead.

Rate limiting per VHost:
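Along these lines (the server_name and zone name are illustrative):

```nginx
server {
    listen 80;
    server_name shop.example.com;  # hypothetical vhost

    # Applied in the server context, this limit covers
    # every location in this vhost only.
    limit_req zone=post_limit burst=5;
}
```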

Again, this will limit your entire vhost to 1r/s.

And finally, what we use here at Reverb:
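A hedged sketch of the per-location approach, not our literal config: the zone names, rates, burst values, and the listing-submission path are illustrative (only /my/messages appears in our example below), but the shape is what we rely on.

```nginx
# Assumes zones defined in the http context, e.g.:
#   limit_req_zone $limit zone=messages:10m rate=2r/s;
#   limit_req_zone $limit zone=listings:10m rate=5r/s;

location /my/messages {
    # Messaging traffic is naturally low-volume.
    limit_req zone=messages burst=5;
}

location /listings {
    # Submitting a listing fires more requests per user action,
    # so it gets its own zone with a more generous rate.
    limit_req zone=listings burst=10;
}
```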

This configuration allows us to target specific endpoints with different zones and different rate limits. While 1–2 r/s is a normal traffic pattern for /my/messages, that same limit might cause a legitimate user to hit a 503 while submitting a listing.

As with most things, testing goes a long way here. Now, when our bad actor exceeds this limit, NGINX returns a 503, effectively telling them to go away. In our next post, we’ll cover how to use the 503s NGINX returns to our advantage with HAProxy.

It is all too easy to fall victim to our own good intentions in the realm of security. User experience should never be sacrificed in the name of security.

If we intend to block bad actors to prevent a bad site-wide experience, we must be conscious of our mitigation techniques so that we do not create a bad experience for the good ones.

Til next time,

@atom_enger