Rendezvous Hashing: an alternative to Consistent Hashing
Aniruddha · Jan 7

In any kind of stateful distributed system, the problem of mapping a key to a set of machines is common. Even if a distributed system is stateless, you might still want to map a key to the same set of machines for better locality of processing. In essence, this is the same problem hash tables solve: map a set of K keys to n buckets.

The simplest way to do this is with the modulo operation. Hash your key to get a fixed-length value, then take it modulo n and pick the machine in that slot. For a uniform hash function, this works well if the number of endpoints doesn't change very frequently and if the cost of re-mapping keys between endpoints is low. If either of those is not true, this performs very poorly: when the size of the list changes, almost all of your keys can get remapped.
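A minimal sketch of the hash-mod-n approach, assuming FNV-1a as the hash function (the article doesn't specify one), that counts how many of a handful of keys change buckets when the bucket count grows from 4 to 5:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucketFor maps a key to one of n buckets via hash-mod-n.
func bucketFor(key string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32() % n
}

func main() {
	keys := []string{"a", "b", "c", "d", "e", "f", "g", "h"}
	moved := 0
	for _, k := range keys {
		// Compare each key's assignment before and after adding a fifth bucket.
		if bucketFor(k, 4) != bucketFor(k, 5) {
			moved++
		}
	}
	fmt.Printf("%d of %d keys remapped\n", moved, len(keys))
}
```

In practice most keys move, since a key stays put only when its hash happens to land in the same slot under both moduli.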

These days, the standard way to limit the number of keys being re-mapped is to use consistent hashing. Most major distributed databases use it in some form or another. Consistent hashing is a special kind of hashing where, on average, K/n keys are remapped whenever the list of endpoints changes (K is the total number of keys). The term first appeared in the literature in 1997 (Karger et al.). In consistent hashing, both the keys and the buckets are hashed onto a circle. A key maps to the first bucket encountered in the clockwise direction (or counter-clockwise; it doesn't really matter). Finding the bucket responsible for a key is simple: precompute the hash values for all buckets and sort them, hash the key, then run a binary search (in O(log n)) to find the lowest bucket hash that is at least the hash of the key. When the set of buckets changes, some keys move over to the closest new bucket. On average, the number of keys that need to move is K/n, which is ideal.
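The lookup described above can be sketched as follows. This is a minimal illustration, not a production implementation: it assumes FNV-1a for the hash, ignores hash collisions, and uses `sort.Search` for the binary search:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// ring holds the bucket hashes sorted ascending, plus a map back
// from each hash to its bucket name.
type ring struct {
	hashes  []uint32
	buckets map[uint32]string
}

func newRing(buckets []string) *ring {
	r := &ring{buckets: make(map[uint32]string)}
	for _, b := range buckets {
		h := hash32(b)
		r.hashes = append(r.hashes, h)
		r.buckets[h] = b
	}
	sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
	return r
}

// get finds the first bucket clockwise from the key's hash in O(log n).
func (r *ring) get(key string) string {
	h := hash32(key)
	i := sort.Search(len(r.hashes), func(i int) bool { return r.hashes[i] >= h })
	if i == len(r.hashes) { // wrap around the circle
		i = 0
	}
	return r.buckets[r.hashes[i]]
}

func main() {
	r := newRing([]string{"node-a", "node-b", "node-c"})
	fmt.Println(r.get("some-key"))
}
```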

One of the biggest drawbacks of consistent hashing is that keys can be imbalanced across buckets. This is mainly because of how resizing is handled. For example, if a bucket is removed, all keys mapped to that bucket move over to the next one (and similarly when a bucket is added). Ideally, these keys would be distributed equally across all the remaining buckets. To overcome this problem, most implementations divide each physical machine into multiple virtual nodes. Even then, a departing machine's keys spread out over only as many virtual nodes as you assigned to it, instead of the ideal state of the load spreading out over all remaining machines. Unless the number of virtual nodes is much higher than the number of machines, the load can still be distributed unevenly.
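A sketch of the virtual-node idea, under the assumption (not from the article) that virtual nodes are derived by suffixing an index onto each machine's name. The ring is then built over the labels, and a lookup maps the winning label back to its physical machine:

```go
package main

import "fmt"

// virtualNodes expands each physical machine into v ring labels
// ("node-a#0", "node-a#1", ...). Each label hashes to a different
// point on the circle, so one machine occupies v positions.
func virtualNodes(machines []string, v int) map[string]string {
	labels := make(map[string]string, len(machines)*v)
	for _, m := range machines {
		for i := 0; i < v; i++ {
			labels[fmt.Sprintf("%s#%d", m, i)] = m
		}
	}
	return labels
}

func main() {
	labels := virtualNodes([]string{"node-a", "node-b"}, 3)
	fmt.Println(len(labels)) // prints 6: 2 machines x 3 virtual nodes
}
```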

Rendezvous hashing predates consistent hashing by a year and takes a very different approach to solving these problems, while maintaining the K/n re-mapping invariant. Unfortunately, it's not as well known as consistent hashing. It's also called Highest Random Weight (HRW) hashing, after how it's implemented. Conceptually and practically, it's much simpler to understand and implement: hash the key and the machine together, then pick the machine with the highest hash value.

```go
type router struct {
	endpoints []*Endpoint
}

// Get returns the endpoint with the highest combined hash for the key.
func (r *router) Get(key string) *Endpoint {
	var ep *Endpoint
	var hashVal uint64
	for _, e := range r.endpoints {
		h := hash(key, e)
		if ep == nil || h > hashVal {
			ep = e
			hashVal = h
		}
	}
	return ep
}
```

With a uniform hash function, when the buckets change, the affected keys (on average, K/n of them) get spread out over all other buckets instead of just one, or just the virtual nodes that were assigned to a machine. The biggest drawback of rendezvous hashing is that each lookup runs in O(n) instead of O(log n). However, because you don't typically have to break each node into multiple virtual nodes, n is usually not large enough for the run-time to be a significant factor.

We actually used this at Twitter in our internal pub/sub platform, EventBus. EventBus was modeled similarly to Kafka: there were topics, and topics had subscriptions. A group of clients together consumed a subscription; we called this smallest unit a stream. Unlike Kafka, EventBus had separate storage and serving layers, so you could scale out the serving layer horizontally. More importantly, any machine could serve a stream. Also unlike Kafka, we supported a mode where all clients within a subscription could choose to receive a full copy of the stream and implement their own filtering.

Initially, we randomly assigned these streams to different serving machines. This worked fine when the number of streams was in the low hundreds. However, over time, some of our most popular topics (like the one with tweets) gathered streams numbering in the tens of thousands, many with client-side filtering enabled. Because the serving layer kept a local cache of events for each stream, and different streams could be reading data at different offsets, every machine started keeping a large amount of data in memory, leading to horrendous GC pressure. We needed an easy way for a group of clients to independently converge on the same serving machine for a particular stream, so that an item, once cached, could be sent to multiple clients. We used rendezvous hashing to do this, with pretty good results. The clients would select a machine when starting consumption and then periodically rebalance every 5–10 minutes until the throughput across different machines stabilized.

Sometimes, elegant but obscure algorithms outperform conventional wisdom.