The naughtiness score

A simple algorithm to prevent jerks from ruining rawgithub.com for everyone

Update: RawGit (formerly rawgithub) no longer behaves quite the same way as this post describes, but the core concept of the naughtiness score is still used to throttle and blacklist excessive traffic.

Last week, the developer of a popular Chrome extension released an update that requested a JavaScript file from rawgithub.com every time a user of the extension visited any web page in Chrome.

This kind of sucked for rawgithub.

I spent most of my Sunday working to mitigate the ensuing flood of HTTP requests. The problem wasn’t the load (it turns out Node and Nginx were more than up to the task of handling the flood); it was the sheer bandwidth consumed by all those incoming requests.

At the time, rawgithub was hosted on Amazon EC2. In a typical month it cost me a paltry $30 in AWS fees, roughly half of that being bandwidth costs. This flood was well on the way to ballooning those costs to over $1,000 a month, which was an unhappy prospect for my bank account.

After trying several things, including temporarily null-routing rawgithub.com’s DNS (suboptimal, since it meant discarding legitimate traffic) and moving the server to DigitalOcean (which has more generous bandwidth pricing than EC2), the most effective solution ended up being to respond to the abusive requests with a simple JavaScript file:

alert('Stop it.');while(1){}

For users of the abusive extension, this made Chrome completely unusable by freezing it on startup and on any attempt to load a page. Those users either uninstalled the extension or stopped using Chrome altogether, instantly reducing the flood to manageable levels.
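For the curious, here is a rough sketch of how a Node server could swap in that script for abusive requests. It is not rawgithub's actual code: the path-based blacklist, the placeholder path, and the names BLACKLISTED_PATHS, responseFor, and createServer are all assumptions made up for illustration.

```javascript
// Hypothetical sketch; none of these names come from the real rawgithub server.
const http = require('node:http');

// Assumption: abusive traffic is recognized by the exact path it requests.
const BLACKLISTED_PATHS = new Set([
  '/example-user/example-repo/master/extension-helper.js', // placeholder path
]);

// The script from the post: show an alert, then spin forever.
const NAUGHTY_BODY = "alert('Stop it.');while(1){}";

// Decide what to send for a request path.
// Returns null when the request should be handled normally.
function responseFor(path) {
  if (!BLACKLISTED_PATHS.has(path)) return null;
  return {
    status: 200,
    headers: { 'Content-Type': 'application/javascript' },
    body: NAUGHTY_BODY,
  };
}

// Wrap the normal request handler so blacklisted paths never reach it.
function createServer(handleNormally) {
  return http.createServer((req, res) => {
    const path = new URL(req.url, 'http://localhost').pathname;
    const naughty = responseFor(path);
    if (naughty === null) return handleNormally(req, res);
    res.writeHead(naughty.status, naughty.headers);
    res.end(naughty.body);
  });
}
```

Something like createServer(normalHandler).listen(8080) would start it. A nice side effect is that the replacement body is only a few bytes, so each abusive request costs almost nothing in bandwidth.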

Turns out being really annoying is the best way to prevent abuse of rawgithub.com.

As an added bonus, those angry users also complained loudly to the extension’s author and began leaving one-star reviews of the extension. The author quickly came to his senses and released an update.

The flood abated and things went back to normal, but I wasn’t happy with the prospect of having to fight a fire any time some doofus decided to use rawgithub irresponsibly. Clearly, the process of manually blacklisting abusers wasn’t going to scale.

Thus, the naughtiness score was born.