Consistent Hashing illustrated in simple words

Introduction

Imagine you are designing a scalable backend for an e-commerce website like Amazon. The data volume will be huge and will keep growing every year. Do you think you will be able to store and manage all of it on a single server? The answer is no.

When your data can’t fit on a single machine, you will have to spin up more machines. One of the most significant design objectives would be to efficiently distribute the data among the servers. At the same time, it’s also essential to optimize the retrieval of data from a cluster of servers.

Social media websites such as Instagram, Facebook or Twitter store lots of data on their servers’ file systems. They use cache servers to quickly serve requests for data that is accessed frequently, e.g. viral posts. But even in-memory caches have a limit on the amount of data they can hold. Hence, it becomes necessary to horizontally scale the caching layer as well.

In this post, we will walk through the Consistent Hashing algorithm, which solves the above challenges. We will start with a very simplistic approach to the data partitioning problem and see how Consistent Hashing overcomes its bottlenecks. The algorithm is employed by many open-source systems such as Cassandra, Riak and Amazon’s Dynamo.

Data Distribution Strategies

Linear Distribution

We are given a set of servers and we want to come up with a strategy to distribute the data among them. Let’s start with a very naive solution: we fill up the servers one after another, i.e. we start writing data to the next server only when the current server becomes full.

In the following illustration, we have a simple server that can store only 4 records at a time. When a server becomes full, we add a new server and new data is added to it.

Linearly storing data among servers

This approach works perfectly fine while writing data. But what happens when you are asked to read a particular record? You need to identify the server that stores it and then fetch it. How do you identify the server? Would you walk through all the servers and scan each one linearly? That would hamper the read performance.

For example, in the above case, if you are asked to look up “New York”, there is no direct mapping between the key and the server, so you will have to scan all the servers linearly to find it.
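The steps above can be sketched in a few lines of Python. This is only an illustration: the four-record capacity and the helper names are this sketch's own choices, and each "server" is simply a dict.

```python
# Naive linear distribution: each "server" is a dict holding at most
# CAPACITY records. Writes fill servers in order; reads must scan them all.
CAPACITY = 4

def write(servers, key, value):
    # Spin up a new server when the current (last) one is full.
    if not servers or len(servers[-1]) >= CAPACITY:
        servers.append({})
    servers[-1][key] = value

def read(servers, key):
    # O(N) in the number of servers: there is no key-to-server mapping.
    for server in servers:
        if key in server:
            return server[key]
    raise KeyError(key)

servers = []
for city in ["Tokyo", "Delhi", "Paris", "Cairo", "New York"]:
    write(servers, city, city.upper())

print(len(servers))               # 2: the first server filled up after 4 records
print(read(servers, "New York"))  # NEW YORK (found only after scanning server 0)
```

Note that `read` has to touch every server in the worst case, which is exactly the problem the next section addresses.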

We will have a look at an approach which will solve this problem for us in the next section.

Hashing

In the previous section, we saw that if we have N servers, the time complexity to fetch a record is O(N). We want to read and write the data efficiently, in O(1). The first thing that strikes our mind is the HashMap data structure, which provides O(1) lookups and writes.

Let’s see if hashing can solve our problem. Assume that we have N servers that store the data and an application that decides where each record goes. The approach is similar to the one used by a HashMap: hash the key, then identify the bucket where the data will go. Here, the application hashes the key and determines the target server by computing hash(key) % N.

The above formula gives the server number where the data will be written. While retrieving the data, the application uses the same logic to get the server number and fetch the record. Both reads and writes complete in O(1).

Let’s go through an illustration. Assume we have three servers named S0, S1 and S2. Our keys are world city names. Using hashing, we compute the bucket or the server to which the key needs to be assigned.

Hashing & Computing bucket of the keys

Allocation of Keys
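This scheme can be sketched in Python. Since Python's built-in `hash()` is randomized between runs for strings, the sketch uses md5 to get a stable hash; the resulting server numbers will not match the diagram above, because the article's hash function is unspecified.

```python
import hashlib

def stable_hash(key: str) -> int:
    # md5 is stable across runs, unlike Python's built-in hash() for strings.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def server_for(key: str, n_servers: int) -> int:
    # The core of the scheme: hash(key) % N picks the server.
    return stable_hash(key) % n_servers

N = 3  # servers S0, S1 and S2
for city in ["London", "New York", "Madrid"]:
    print(city, "-> S%d" % server_for(city, N))
```

Every reader and writer that uses the same hash function and the same N will agree on the server for a given key.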

But will this always work in distributed systems? We will run into the following problems:

If we add more servers, the value of hash(key) % N will change for most keys. That means we will have to redistribute a large part of the data whenever a new server is added.

We will run into the same problem if one of the servers is removed. Since the number of servers N is variable here, nearly all the keys can get impacted.

Following is an illustration of what happens when a new server is added. As the number of servers grows from 3 to 4, the bucket computation changes to hash(key) % 4.

Old & New Allocation of Keys

On adding a new server, we observe that two of the three keys got impacted. For example, the bucket of the key “Madrid” becomes 0 (S0) instead of 1 (S1), so we have to move this key to server S0 to ensure that our application can still find it. Hence, we have to rehash the existing keys and assign them to different servers. In the worst case, this can impact all the keys in the system.
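We can verify this fragility empirically. The sketch below (using made-up keys) counts how many of 1,000 keys change servers when N grows from 3 to 4. A key keeps its server only when hash % 3 equals hash % 4, which happens for about a quarter of the keys, so roughly three quarters of them move.

```python
import hashlib

def stable_hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

keys = ["key-%d" % i for i in range(1000)]  # made-up keys for the experiment

# A key keeps its server only when hash % 3 == hash % 4,
# i.e. when hash % 12 is 0, 1 or 2 -- about a quarter of the keys.
moved = sum(1 for k in keys if stable_hash(k) % 3 != stable_hash(k) % 4)
print("moved %d of %d keys" % (moved, len(keys)))
```

Running this shows that the vast majority of keys are displaced by adding a single server, which is unacceptable at scale.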

What is Consistent Hashing?

Consistent Hashing in Action

Consistent Hashing solves our problem when we want to dynamically add or remove servers. In the case of simple hashing, the addition or removal of a server impacts almost all of the M keys stored in the system. Consistent Hashing, however, ensures that, on average, only M/N keys are affected, where N is the number of servers.

Consistent Hashing makes the distribution of the keys independent of the number of servers used by the system. Thus, we can scale up or down without impacting the overall system.

Fundamentally, Consistent Hashing uses a hash ring. The algorithm maps every server to a point on a circle: it takes the server’s IP address, calculates its hash and assigns the server a point (angle) on the circle. Following is a simple illustration of how the angle is calculated for the three servers S0, S1 and S2:

Assigning servers to points on the Hash Ring

Further, every key is hashed using the same hashing algorithm and assigned a point on the circle. For every hashed key, we move in a clockwise direction, find the nearest server and assign the key to it.

Assigning keys to points on the Hash Ring

We get the following allocation for the above set of keys:

Allocation of Keys to Servers

Following is a pictorial representation of the above key allocation on the hash ring:

Allocation of Keys to the server on a Hash Ring

As you can see from the above diagram, we move in a clockwise direction from every key to find its server.
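The whole ring can be sketched with a sorted list and binary search. This is a minimal illustration of the technique, not a production implementation; the `HashRing` class, the server names and the use of md5 are this sketch's own choices.

```python
import bisect
import hashlib

def stable_hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    """Minimal consistent-hash ring (no virtual nodes yet)."""

    def __init__(self):
        self._points = []     # sorted hash positions of the servers
        self._server_at = {}  # hash position -> server name

    def add_server(self, name):
        point = stable_hash(name)
        bisect.insort(self._points, point)
        self._server_at[point] = name

    def remove_server(self, name):
        point = stable_hash(name)
        self._points.remove(point)
        del self._server_at[point]

    def get_server(self, key):
        # First server at or after the key's position, wrapping around
        # the circle when the key hashes past the last server.
        i = bisect.bisect_right(self._points, stable_hash(key))
        return self._server_at[self._points[i % len(self._points)]]

ring = HashRing()
for name in ["S0", "S1", "S2"]:
    ring.add_server(name)

print(ring.get_server("New York"))  # one of S0, S1, S2
```

The modulo in `get_server` implements the "wrap around" of the circle: a key that hashes past the last server point is assigned to the first one.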

Scaling & Adding a new server

As described in the previous section, we first calculate the hash of the new server’s IP address and find its location on the circle. For example, suppose we add a server S3 and find that it lands between S2 and S0 on the circle. We then reassign to S3 those keys of S0 that have an angle less than or equal to that of S3, in other words, the keys that occur before S3 on the circle.

The diagram below illustrates this process: a new server S3 is added between S2 and S0. Initially, the key “Mumbai” was assigned to the server S0. After adding S3, the first server encountered from the key “Mumbai” in the clockwise direction is S3, hence we assign this key to S3.

Adding a new server S3

As seen above, the addition of a new server doesn’t impact all the keys. Only the keys that fall between the new server and its predecessor on the hash ring need reallocation.

Removal of an existing server

When an existing server is removed, only the keys belonging to that server need to be reassigned. For each such key, we move clockwise on the hash ring to find the next server and allocate the key to it.

The following picture illustrates the process of removing an existing server:

Removal of Server S1

In the above diagram, server S1 is removed. The key “New York” was assigned to S1. On removal of S1, we move clockwise from the key “New York” and reach server S2. Hence, the key “New York” is reassigned to server S2.

Unlike simple hashing, the removal of a server doesn’t require rehashing all the keys. Only the keys of the removed server have to be reassigned.
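A compact sketch of this property: after removing a server, every key that changes owner must have belonged to the removed server. The server names, keys and md5 hashing are illustrative; here the hash of the server name stands in for the hash of its IP address.

```python
import bisect
import hashlib

def stable_hash(key):
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def get_server(points, owner, key):
    # Next server clockwise from the key, wrapping around the circle.
    i = bisect.bisect_right(points, stable_hash(key))
    return owner[points[i % len(points)]]

# Build a ring with three servers.
owner = {stable_hash(s): s for s in ["S0", "S1", "S2"]}
points = sorted(owner)

keys = ["key-%d" % i for i in range(200)]
before = {k: get_server(points, owner, k) for k in keys}

# Remove S1 by dropping its point from the ring.
gone = stable_hash("S1")
points.remove(gone)
del owner[gone]

moved = [k for k in keys if get_server(points, owner, k) != before[k]]
# Every key that moved must have belonged to the removed server.
assert all(before[k] == "S1" for k in moved)
print("%d keys moved, all of them formerly on S1" % len(moved))
```

Keys that did not belong to S1 keep the same successor point on the ring, so their assignment is untouched.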

Virtual Nodes

We saw that when a node is removed, all the keys assigned to it move to the next node on the hash ring. Often, removing a node makes the data distribution uneven and increases the load on one of the remaining nodes.

In the above case, if we remove S0 from the system, the key “London” will get mapped to server S2. We then find S2 handling three keys while S1 manages only one. Thus the distribution of data becomes uneven.

Removal of S0 places more load on S2

In an ideal case, when there are M keys and N servers, every server should hold close to M/N keys, and the addition or removal of a node should impact at most around M/N keys. To get close to this ideal distribution, we introduce virtual nodes into the system. Every physical node is represented by multiple virtual nodes on the hash ring.

We use multiple hash functions to find the positions of the virtual nodes on the hash ring. Every virtual node is denoted by Sij, where i is the physical server number and j is the index of its virtual copy. For example, the virtual copies of the first server are S00, S01, S02, S03, etc. We use a different hash function to compute the position of every virtual copy.

We get the following allocation of virtual servers in our example:

Virtual servers on a hash ring

As seen from the above diagram, the virtual copies of server S1, such as S10, S12 and S13, are spread across the ring, and the same applies to server S0. This results in a near-uniform distribution of data among the nodes.

For a given key, if the next server encountered on the hash ring is the virtual node S12, the key is stored on the physical server S1. To generalize, a key assigned to a virtual node Sij is stored on the physical server Si.
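Virtual nodes can be sketched as below. Instead of several independent hash functions, this version hashes a labelled copy such as "S0-2", a common simplification of the multi-hash approach described above; the class name, copy count and md5 hashing are this sketch's own choices.

```python
import bisect
import hashlib
from collections import Counter

def stable_hash(key):
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

VIRTUAL_COPIES = 4  # j in Sij runs from 0 to 3, as in the example above

class VirtualRing:
    def __init__(self):
        self._points = []
        self._owner = {}  # hash position -> physical server name

    def add_server(self, name):
        # Place several labelled virtual copies ("S0-0", "S0-1", ...) on
        # the ring; hashing the label stands in for separate hash functions.
        for j in range(VIRTUAL_COPIES):
            point = stable_hash("%s-%d" % (name, j))
            bisect.insort(self._points, point)
            self._owner[point] = name  # a key landing on S0-2 is stored on S0

    def get_server(self, key):
        i = bisect.bisect_right(self._points, stable_hash(key))
        return self._owner[self._points[i % len(self._points)]]

ring = VirtualRing()
for name in ["S0", "S1", "S2"]:
    ring.add_server(name)

counts = Counter(ring.get_server("key-%d" % i) for i in range(3000))
print(counts)  # with 4 virtual copies each, the spread is noticeably more even
```

Real systems use many more virtual copies per node (often in the hundreds) to smooth out the distribution further.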

Applications of Consistent Hashing

Consistent Hashing was introduced in 1997 and has since found traction in many distributed systems. It’s used as the partitioning component of Amazon’s Dynamo. Further, open-source systems such as Apache Cassandra and Voldemort use it for data partitioning.

