Extending the redis-rb client library

Andrew Berls

We're huge fans of Redis here at Sutro. It's simple, powerful, and crazy fast, making it ideal for a wide range of use cases. For KnowMyRankings, we use it heavily for everything from caching to job scheduling infrastructure and tracking metrics such as user logins.

While Redis ships with a wide range of data structures and commands to manipulate them, we found ourselves doing a number of operations repetitively and wishing for native support in the redis-rb client library. So, we set about extending the client library and adding a few useful functions of our own, packaged up into our first open-source gem, redis-client-extensions.

Functions

Right now, the library is very small and only adds a few functions. We tried not to go over the top, and only add extensions for things we felt we really needed.

hmultiset

Redis hashes are great for storing key-value data. We use hash structures extensively in our backend scheduling infrastructure and found ourselves wishing for an easy way to take a Ruby hash and dump it into a Redis hash. However, the syntax for the HMSET command is:

HMSET key field value [field value ...]

This is a little awkward to express in Ruby. Enter #hmultiset! A call to hmultiset(key, hash) converts the hash into arguments suitable for redis-rb and does exactly what you'd expect.

$redis.hmultiset("my-hash", { name: "tom", color: "red" })
$redis.hget("my-hash", "name")  # => "tom"
$redis.hget("my-hash", "color") # => "red"
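Under the hood, there isn't much magic required. Here's a hypothetical sketch of how such an extension could be written (the module name is ours, not the gem's; the actual implementation may differ): HMSET expects alternating field/value arguments, so the hash just needs to be flattened into that shape.

```ruby
# Hypothetical sketch; module name is assumed, not from the gem's source.
module HmultisetExtension
  def hmultiset(key, hash)
    # { name: "tom", color: "red" } flattens to [:name, "tom", :color, "red"],
    # which is exactly the field/value list HMSET wants.
    hmset(key, *hash.to_a.flatten)
  end
end

# In an initializer you would mix this into the client:
# Redis.include(HmultisetExtension)
```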

cache_fetch

The first step in optimizing any web application is almost always reducing the number of queries made to the database. We use Postgres as our primary datastore, and our dashboards are heavy on rollups and computed values. Redis is insanely fast as a key-value store, so we wanted to use it as a cache for some of our more expensive computations. #cache_fetch is a simple get/set wrapper for caching expensive computations with an expiration period. It works as follows:

# Initial cache miss, block evaluated to store value
# (expiration in seconds)
ret = $redis.cache_fetch('my-key', expires_in: 60) do
  'cached-operation'
end
# => 'cached-operation'

# Calling again retrieves cached value, block will not be called
ret = $redis.cache_fetch('my-key', expires_in: 60) do
  'something-else' # Not called!
end
# => 'cached-operation'
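For the curious, a get/set wrapper like this can be sketched in a few lines. This is our illustration, not the gem's actual source; in particular, we inline Marshal here, while the real function goes through its own serialization helpers.

```ruby
# Minimal sketch of a cache_fetch-style wrapper (assumed implementation).
module CacheFetchExtension
  def cache_fetch(key, expires_in:)
    cached = get(key)
    return Marshal.load(cached) if cached       # cache hit: skip the block
    value = yield                               # cache miss: compute the value...
    setex(key, expires_in, Marshal.dump(value)) # ...and store it with a TTL
    value
  end
end
```

SETEX sets the value and its expiration atomically, so there's no window where the key exists without a TTL.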

But wait, there's more: Redis serializes values to strings, so we found ourselves coercing types around our Redis calls, which gets cumbersome. This leads us to our next two extension functions:

Note: we're aware of the redis-store cache adapter, but decided we prefer this simpler solution for now.

mdump / mload

As mentioned, coercing types on their way out of Redis was annoying, so we chose to integrate Ruby's Marshal library to serialize objects to byte streams. The output looks like garbage to humans, but allows Ruby to reconstruct the original value and type using Marshal.dump and Marshal.load. We added the mdump and mload functions to redis-rb to simply let you store and retrieve marshalled values at a given key.

$redis.mdump('my-key', [1,2,3])
$redis.get('my-key')
# => "\x04\b[\bi\x06i\ai\b" (unreadable marshalled bytes)
$redis.mload('my-key')
# => [1,2,3]
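The pair is thin wrappers around GET/SET plus Marshal. A plausible sketch (our names and nil-handling, not necessarily the gem's):

```ruby
# Hypothetical sketch of mdump/mload-style helpers.
module MarshalExtension
  def mdump(key, value)
    # Store the marshalled byte stream as an ordinary string value
    set(key, Marshal.dump(value))
  end

  def mload(key)
    raw = get(key)
    # Return nil when the key is missing or expired, like GET does
    Marshal.load(raw) if raw
  end
end
```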

While Marshal works with any type of value, we tend to avoid serializing 'big' objects like ActiveRecord::Base instances and mostly use it as a convenience for lower-level types. As hinted at previously, cache_fetch uses mdump and mload internally so your types will be preserved.

It's worth noting that marshalled byte stream values obviously only work with Ruby; however, we're a Ruby-only shop (for now), so we haven't had any concerns with cross-platform serialization.

find_keys_in_batches

The SCAN command introduced in Redis 2.8 allows for incremental iteration over collections of elements (including keys matching a pattern). We wanted to use it for our internal analytics, which involve storing a large number of patterned keys and then running rollups as a batch job. However, SCAN returns a 'cursor' on each iteration and requires the caller to keep calling it until iteration terminates (assuming you want to loop through all matches, of course). We wrote #find_keys_in_batches to abstract away the cursor and let us easily loop over keys matching some pattern. It yields each batch of keys returned from the SCAN command to a user-defined block. For example:

$redis.find_keys_in_batches(match: "visits*", count: 100) do |keys|
  puts "Got batch of #{keys.count} keys"
  puts "Values for this batch: #{$redis.mget(keys)}"
end
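The cursor loop it hides is straightforward. A sketch of the idea (our module name and default count, not the gem's): redis-rb's #scan returns a [next_cursor, keys] pair, you start with cursor "0", and SCAN signals completion by returning "0" again.

```ruby
# Hypothetical sketch of the SCAN cursor loop behind find_keys_in_batches.
module BatchScanExtension
  def find_keys_in_batches(match:, count: 1000)
    cursor = "0"
    loop do
      cursor, keys = scan(cursor, match: match, count: count)
      yield keys unless keys.empty? # SCAN may return empty batches
      break if cursor == "0"        # iteration is complete when cursor wraps to 0
    end
  end
end
```

Note that count is only a hint to Redis about how much work to do per call, so batch sizes are not guaranteed to be exact.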

Wrapping up

We deliberately kept the number of extension functions to a minimum. Some developers have strong negative feelings about monkey patching like this (aka "freedom patching"), so we carefully added only the functions we really needed, and feel this is the cleanest solution. We thought about adding functions to deal with serializing JSON values, but have been fine just using #hmultiset and #hgetall so far.

In the meantime, we'd love to hear the community's thoughts on this and welcome any contributions to the project. All of the code and installation instructions are on GitHub for your viewing pleasure!

Related Reading

Light-Speed Analytics with Redis Bitmaps and Lua

Fast, easy, realtime metrics using Redis bitmaps