If you’ve watched TV recently, you’ve perhaps seen the introduction to Elementary, the latest in a string of Sherlock Holmes-inspired shows. The intro depicts a machine in which a series of events set off by a marble results in the caging of a figurine. Watching it, you might question the complexity of the events, but that is the essence of a Rube Goldberg machine: something that is intentionally over-engineered to perform what would normally be a simple task.

But what does this have to do with shipping?

A few months ago, one of our Redis machines became overloaded. We were able to quickly kill the offending script, work on a fix, and relaunch without further problems.

While the relaunch may have been uneventful (as we hope all launches are), the story of how we ended up in this situation is quite interesting. We hope that by sharing what we learned we can help others avoid building the same Rube Goldberg machine.

To follow along, you should know that our development stack is in Python, and this particular case involved Redis, Memcached and Celery—the Python library for dealing with queues and background workers.

To provide shipping rate quotes as quickly as possible, we built our own rating engine. This rating engine breaks down the entire United States by zip code to return prices for shipping to and from any location, resulting in a potential 498,501 entries (or keys) in our system.

The zip code entries are contained in a flat file, and we needed to get this data into Memcached for our production system to query every time we make a shipping rate request. Our tests showed that the round-trip to Memcached is ~10ms, meaning it would take 13 hours every time we wanted to seed Memcached with that data. We needed a faster solution to seed the cache.

We attempted to speed this up by building what became a bit of a Rube Goldberg machine:

1. Write a script to read the flat file in batches

2. Create Celery tasks with the data from each batch

3. Have the Celery task push the data to Redis

4. Have Redis push the batch to Memcached

But as Mike Tyson once said, “everyone has a plan until they get punched in the mouth.”

It turns out that we had a bug in our batching code:

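    batch_size = 500  # number of lines to batch together

    zones_table = load_zones_file()  # reads the flat file

    # number of batches, rounding up so a final partial batch is included
    num_batches = (len(zones_table) + batch_size - 1) // batch_size

    for i in range(num_batches):  # i is the batch index
        batch = zones_table[i:(i+1)*batch_size]
        # sends the task to Redis to be read & processed by a worker
        process_batch.delay(batch)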

That code has a tiny and very costly bug. Think about the size of batch: at i=0 it works fine, but when i>0 the size of batch increases with i! Specifically, the slice runs from i to (i+1)*batch_size, so it holds roughly (i+1)*batch_size lines when it is supposed to hold just batch_size.

Hence the total size of all batches sent to Redis is about (1+n)*n/2*batch_size, where n is the number of batches (the number of lines in the flat file divided by batch_size). So we sent on the order of n^2*batch_size lines to Redis instead of n*batch_size. That was a pretty costly bug. The fix is simply to multiply i by batch_size in the start index:

    batch = zones_table[i*batch_size:(i+1)*batch_size]
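To get a feel for the cost, here is a minimal sketch that counts how many lines each version of the slice would enqueue. It assumes i counts batches, as the explanation above does, and it only does the arithmetic, so nothing is sent to Celery or Redis. The helper names batch_slices and lines_sent are made up for this illustration; the entry count is the 498,501 figure quoted earlier.

```python
def batch_slices(total_lines, batch_size, buggy=False):
    """Yield (start, end) slice bounds per batch, with i as the batch index."""
    num_batches = (total_lines + batch_size - 1) // batch_size  # ceiling division
    for i in range(num_batches):
        # The buggy version starts the slice at i; the fix starts at i * batch_size.
        start = i if buggy else i * batch_size
        # Python slices silently stop at the end of the list, so cap the end.
        end = min((i + 1) * batch_size, total_lines)
        yield start, end


def lines_sent(total_lines, batch_size, buggy=False):
    """Total number of lines enqueued across all batches."""
    return sum(end - start
               for start, end in batch_slices(total_lines, batch_size, buggy))


TOTAL_LINES = 498501  # entry count quoted earlier in the post
BATCH_SIZE = 500

fixed_total = lines_sent(TOTAL_LINES, BATCH_SIZE)
buggy_total = lines_sent(TOTAL_LINES, BATCH_SIZE, buggy=True)
print(fixed_total, buggy_total)
```

Under these assumptions the fixed slice sends each line exactly once, while the buggy slice sends roughly 249 million lines, about 500 times as many.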

While we were looking for the bug, we also discovered that Memcached clients support a set_multi operation, which Django exposes as django.core.cache.cache.set_many.

According to the Django source code, this is much more efficient than calling set repeatedly, for obvious reasons (I’d imagine reusing the connection to Memcached, for one).
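As a rough sketch of what seeding with set_many could look like, the helper below writes key/value pairs in chunks, one set_many call per chunk. The names chunked, seed_cache and DictCache, the chunk size, and the key format are all assumptions made for this example, not our production code. DictCache is only an in-memory stand-in so the sketch runs without Django or Memcached; in a Django project you would pass django.core.cache.cache instead.

```python
from itertools import islice


def chunked(pairs, chunk_size):
    """Yield dicts of at most chunk_size key/value pairs from an iterable."""
    iterator = iter(pairs)
    while True:
        chunk = dict(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def seed_cache(cache, pairs, chunk_size=500, timeout=None):
    """Write all pairs via cache.set_many, one call per chunk.

    Returns the number of set_many calls made. With Django's cache,
    timeout=None means the keys never expire.
    """
    calls = 0
    for chunk in chunked(pairs, chunk_size):
        cache.set_many(chunk, timeout)
        calls += 1
    return calls


class DictCache:
    """In-memory stand-in (hypothetical) exposing the same set_many shape."""

    def __init__(self):
        self.store = {}

    def set_many(self, data, timeout=None):
        self.store.update(data)
        return []  # Django's set_many returns the keys that failed to be set


demo_cache = DictCache()
demo_pairs = (("zone:%05d" % n, n) for n in range(1234))  # made-up keys
demo_calls = seed_cache(demo_cache, demo_pairs, chunk_size=500)
print(demo_calls, len(demo_cache.store))
```

Chunking keeps each request to the cache at a bounded size while still replacing hundreds of single-key round-trips with one call.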

So what did we learn?

1. Look for library-supported ways of batching or improving performance. Oftentimes others have run into similar problems, and there’s probably a right solution there for you. In this case, if we had known about django.core.cache.cache.set_many, we could have avoided the whole situation. This is actually what we ended up employing as our solution, and it’s been working great.

2. Redis’ DEL command can be very handy. It is a destructive operation, so be careful with it, but it came in handy when we decided to go with Django’s set_many and get rid of the old tasks.

3. Redis will not always free up (return) memory to the OS when keys are removed. Hence the OS’s memory metrics can be misleading. Use the Redis CLI or monitors that use it (ElastiCache’s “Bytes Used For Cache”) to monitor actual memory usage.
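To illustrate the last point, here is a small sketch that compares Redis’ own accounting with what the OS sees, using the used_memory and used_memory_rss fields from the INFO memory output. The function name summarize_memory and the sample numbers are invented for this example. With the redis-py client, the input dict would come from something like redis.Redis().info("memory").

```python
def summarize_memory(info):
    """Compare Redis' own memory accounting with the OS view.

    `info` is a dict of INFO memory fields, such as the one returned by
    redis.Redis().info("memory") in redis-py.
    """
    used = info["used_memory"]      # bytes allocated by Redis for its data
    rss = info["used_memory_rss"]   # bytes the OS reports for the process
    mb = 1024.0 * 1024.0
    return {
        "used_mb": round(used / mb, 1),
        "rss_mb": round(rss / mb, 1),
        "rss_to_used": round(rss / float(used), 2),
    }


# Made-up numbers: keys were deleted, but the OS still sees the old footprint.
sample_info = {
    "used_memory": 200 * 1024 * 1024,
    "used_memory_rss": 900 * 1024 * 1024,
}
summary = summarize_memory(sample_info)
print(summary)
```

A large gap between the two values, as in this made-up sample, is the situation where OS-level graphs suggest Redis is still full even though most keys are gone.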

We’re hiring at Shippo! If building simple systems is up your alley, and shipping code that sends millions of packages around the world piques your interest, check out our jobs page!

Shippo is a multi-carrier API and web app that helps retailers, marketplaces and platforms connect to a global network of carriers. Businesses use Shippo to get real-time rates, print labels, automate international paperwork, track packages and facilitate returns. Shippo provides the tools to help businesses succeed through shipping.