Nginx: a caching, thumbnailing, reverse proxying image server?

A month or two ago, I decided to remove Varnish from my site and replace it with Nginx's built-in caching system. I was already using Nginx to proxy to my Python sites, so getting rid of Varnish meant one less thing to fiddle with. I spent a few days reading up on how to configure Nginx's cache and overhauling the various config files for my Python sites (so much for saving time). In the course of my reading I bookmarked a number of interesting Nginx modules to return to, among them the Image Filter module.

I thought it would be neat to combine Nginx's reverse proxying, caching, and image filtering to create a thumbnailing server for my images hosted on S3. If you look closely at the <img> tag below (and throughout this site), you can see Nginx in action.

In this post I'll describe how I configured Nginx to efficiently and securely serve thumbnails for images hosted on S3. As a bonus, I'll also show how I'm using the Secure Links module to prevent people from maliciously generating thumbnails.

Getting started

In order for all the pieces to work, your Nginx needs to be compiled with the image filter, proxy, and secure links modules. You can check which modules you have by running nginx -V. If you're using Ubuntu, an easy fix is to install the nginx-extras package.

Once Nginx is ready, we can start in on the configuration.

Configuration

The first thing we'll want to declare is our proxy cache. This declaration goes in the http section of the nginx.conf file and describes the file-based cache that will store our generated thumbnails. Because a cache miss means fetching the full image from S3 and then resizing it, we want to configure the cache to be large enough to hold most of our thumbnails. For my sites I just estimated that 200 MB would be sufficient.

To define your cache, add this line somewhere in the http section of the nginx config:

```nginx
# Nginx will create a cache capable of storing 16MB of keys and 200MB of data.
proxy_cache_path /tmp/nginx-thumbnails levels=1:2 keys_zone=thumbnail_cache:16M inactive=60d max_size=200M;
```
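As a rough sanity check on the 200 MB figure, here is a back-of-the-envelope estimate. The ~20 KB average thumbnail size is an assumption for illustration; measure your own images to get a real number:

```python
# Rough estimate of how many thumbnails a 200 MB cache can hold,
# assuming an average thumbnail size of ~20 KB (an assumption, not a
# measurement -- check your own generated thumbnails).
AVG_THUMBNAIL_KB = 20
CACHE_MB = 200

capacity = (CACHE_MB * 1024) // AVG_THUMBNAIL_KB
print(capacity)  # 10240 thumbnails
```

If that comfortably exceeds the number of distinct thumbnails your site serves, the cache will rarely evict anything.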

Now we need to create two server definitions: a caching server and a resizing server. The resizing server will act as a reverse proxy to S3, generating and serving the resized images. The caching server will sit in front of the resizing server, caching and serving the resized images. I initially didn't think two servers would be necessary, but my cache wasn't being populated; after a bit of googling around, I came across several posts indicating that this split is required.

The caching server

The caching server will be the one that is exposed to the public (I put mine at m.charlesleifer.com ). Because the sole responsibility of this server is to cache responses from the resizing server, the configuration is pretty minimal. Here is how I've set mine up:

```nginx
server {
    listen 80;
    server_name m.charlesleifer.com;

    location / {
        proxy_pass http://localhost:10199;
        proxy_cache thumbnail_cache;
        proxy_cache_key "$host$document_uri$is_args$arg_key";
        proxy_cache_lock on;
        proxy_cache_valid 30d;  # Cache valid thumbnails for 30 days.
        proxy_cache_valid any 15s;  # Everything else gets 15s.
        proxy_cache_use_stale error timeout invalid_header updating;
        proxy_http_version 1.1;
        expires 30d;
    }
}
```

Whenever a request comes in to the caching server, the "thumbnail_cache" is checked first. If no match is found, the request is proxied back to the resizing server, which is running on localhost. Valid responses from the resizing server are cached for 30 days, while anything else is cached for 15 seconds.
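It's worth noting that the signature ends up inside the cache key, so signed and unsigned requests for the same image cache separately. Here is a sketch (with a made-up request) of how Nginx assembles the `proxy_cache_key` value above:

```python
# Sketch of how Nginx composes "$host$document_uri$is_args$arg_key" for
# a request like GET /t/200x-/logo.png?key=abc123 on m.charlesleifer.com.
# The path and key value here are made up for illustration.
host = 'm.charlesleifer.com'        # $host
document_uri = '/t/200x-/logo.png'  # $document_uri
is_args = '?'                       # $is_args: '?' if a querystring is present
arg_key = 'abc123'                  # $arg_key: the `key` querystring argument

cache_key = '%s%s%s%s' % (host, document_uri, is_args, arg_key)
print(cache_key)  # m.charlesleifer.com/t/200x-/logo.png?abc123
```

Because `$arg_key` is part of the key, a request with a bad signature can never poison the cached entry for a correctly signed URL.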

The resizing server

All the interesting stuff lives in the resizing server. The job of the resizing server is to fetch images from S3 and resize them on-the-fly, based on dimensions specified in the URL. Additionally, the resizing server checks that a security key is present with each request to prevent people from generating arbitrary thumbnails.

There are a couple of distinct sections in the server config block, so let's start with the one we've seen already: proxying.

```nginx
server {
    listen 10199;
    server_name localhost;

    set $backend 'your.s3.bucket_name.s3.amazonaws.com';
    resolver 8.8.8.8;  # Use Google for DNS.
    resolver_timeout 5s;

    proxy_buffering off;
    proxy_http_version 1.1;
    proxy_pass_request_body off;  # Not needed by AWS.
    proxy_pass_request_headers off;

    # Clean up the headers going to and from S3.
    proxy_hide_header "x-amz-id-2";
    proxy_hide_header "x-amz-request-id";
    proxy_hide_header "x-amz-storage-class";
    proxy_hide_header "Set-Cookie";
    proxy_ignore_headers "Set-Cookie";

    proxy_set_header Host $backend;
    proxy_method GET;
}
```

There's really not too much going on here besides telling our server how to talk to S3, so let's keep going. The next thing we'll want to configure is the Nginx image filter module. It only takes a couple of directives, some of which we will define at the server level.

Below the proxy_ settings, add the following image_filter settings:

```nginx
server {
    # ...
    image_filter_jpeg_quality 85;  # Adjust to your preferences.
    image_filter_buffer 12M;
    image_filter_interlace on;
}
```

Lastly, we'll define a location block that will:

1. Look for well-formed URLs.
2. Validate the request signature.
3. Extract the dimensions from the URL.
4. Fetch the image from S3 and load it into the image_filter_buffer.
5. Resize and respond.
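The URL matching and dimension extraction can be previewed in Python. The pattern below is the same regex the location block will use, applied to a hypothetical request path:

```python
import re

# The same pattern used by the location block: /t/<width>x<height>/<path>.
# A '-' for either dimension tells the image filter to resize
# proportionally, preserving the aspect ratio.
THUMBNAIL_RE = re.compile(r'^/t/([\d-]+)x([\d-]+)/(.*)')

match = THUMBNAIL_RE.match('/t/200x-/photos/kitten.jpg')
width, height, image_path = match.groups()
print(width, height, image_path)  # 200 - photos/kitten.jpg
```

The three capture groups correspond to `$1`, `$2`, and `$3` in the nginx config.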

Item number 2 is particularly interesting. The author of a similar blog post used Lua to validate a request signature, but that seems like a lot of work. The Nginx secure_link extension was surprisingly easy to get working.

The secure_link module works by checking an MD5 hash of the requested image's URI concatenated with a secret string known only to your app. Because of hash length extension attacks, we append our secret to the URI rather than prepending it. Since you know the secret, you can generate valid hashes whenever you wish to display a thumbnail in your application.
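Here is a minimal sketch of the check Nginx performs for each request, assuming the secret is "my-secret-key" as in the config below: MD5 the URI plus the secret, base64-encode the digest URL-safely, strip the padding, and compare the result against the `key` querystring argument.

```python
import base64
import hashlib

SECRET = 'my-secret-key'  # Placeholder; use your own secret string.

def expected_key(uri):
    # Mirrors: secure_link_md5 "$uri my-secret-key"; (note the space).
    digest = hashlib.md5(('%s %s' % (uri, SECRET)).encode('utf-8')).digest()
    # Nginx compares against URL-safe base64 with the '=' padding removed.
    return base64.urlsafe_b64encode(digest).decode('ascii').rstrip('=')

def is_valid(uri, key):
    return key == expected_key(uri)

uri = '/t/200x-/photos/kitten.jpg'
key = expected_key(uri)
print(is_valid(uri, key))      # True
print(is_valid(uri, 'bogus'))  # False
```

If the computed key doesn't match, `$secure_link` is empty and the request can be rejected.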

Here is the final piece of configuration:

```nginx
server {
    # ...
    error_page 404 =404 /empty.gif;

    location ~ ^/t/([\d-]+)x([\d-]+)/(.*) {
        secure_link $arg_key;  # The hash is stored in the `key` querystring arg.
        secure_link_md5 "$uri my-secret-key";

        if ($secure_link = "") {
            # The security check failed, invalid key!
            return 404;
        }

        set $image_path '$3';
        image_filter resize $1 $2;
        proxy_pass http://$backend/$3;
    }
}
```

And that's all there is to it!

Generating hashes

If you are using Python, here is the code I wrote to generate the hash, given a particular thumbnail URI:

```python
import base64
import hashlib

def thumbnail_url(filename, width, height='-'):
    uri = '/t/%sx%s/%s' % (width, height, filename)
    md5_digest = hashlib.md5((uri + ' my-secret-key').encode('utf-8')).digest()
    key = base64.b64encode(md5_digest).decode('ascii')
    # Make the key look like Nginx expects: URL-safe base64, no padding.
    key = key.replace('+', '-').replace('/', '_').rstrip('=')
    return 'http://m.charlesleifer.com%s?key=%s' % (uri, key)
```

Thanks for reading

Thanks for taking the time to read this post, I hope you found it interesting. Feel free to leave a comment if you have any questions and I'll do my best to answer. If you notice that I'm doing something wrong in the above configuration, please let me know as well, and I'll update the post.
