Most of us have used a URL shortening service like Bitly or TinyURL at some point.

For an engineer, using such a service is one thing; designing a system that handles billions of clicks per month is another, and it is a question many people are curious about.

Today I am writing this article to dig deeper into those systems and see how they are designed.

Purpose of the article:

Give you an overview of how to design a system for millions of users and billions of clicks per month, from initial thinking to problem solving.

Help you build a URL shortening system yourself, similar to Bitly or TinyURL.

Give you useful experience for system design interviews.

What is a URL shortening system?

Certainly, some readers have never used a link shortening service, so let me briefly explain what URL shortening is.

URL shortening is a service that converts a long original link into a much shorter one.

For example, our original link is: https://medium.com/jp-tech/docker-image-in-production-1gb-or-100mb-a455ed5eb461?source=your_stories_page---------

After shortening, the link becomes something like this: shorturl.at/bvzO3

Now if we open the shorturl.at/bvzO3 link in the browser, it will redirect to the original link.

Why do we need shortened links?

This is probably a question many people ask. For example, why not just send the original link to the reader instead of spending time shortening it? And these days, does anyone still have to type a link in by hand? …

It is not wrong to ask those questions.

The main purpose of using shortened links is:

Short links simply look better.

We can count the number of people who click on a link, then analyze and evaluate the results. This is useful for marketing.

We can hide affiliate marketing links that are used to make money.

System requirements

Most link shortening systems must meet the following requirements:

Functional requirements:

Given an original link as input, the system shortens it into a shorter, unique link.

When a user accesses the shortened link, the system redirects them to the original link.

Users can customize their shortened links if they wish.

Shortened links expire after a default period of time, but users can adjust this period.

Non-functional requirements:

High availability. Why do we need this? Because if the system goes down, every shortened link goes down with it.

When a user clicks a shortened link, redirection to the original link must happen with minimal latency.

Shortened links should not be guessable.

Extended requirements:

Analytics: how many times has a shortened link been clicked?

Provide an API that third parties can use.

System analysis

In this section, I will show you how to estimate the number of monthly requests, the disk space, the memory usage, the network bandwidth consumption, and so on.

Most link shortening systems handle quite a high volume of traffic.

Assume the system we design has a read:write ratio of 100:1. (Remember this ratio, as it is used throughout the article.)

The read rate here is the number of people who click on shortened links.

The write rate is the number of people who create shortened links.

Traffic Estimation

Suppose our system creates 500 million shortened links in a month.

With a read:write ratio of 100:1, the number of reads will be: 500M * 100 = 50B (M is million, B is billion).

How many writes happen in 1 second?

500M / (30 days * 24 hours * 3600 seconds) ≈ 200 URLs/s

How many reads happen in 1 second?

200 * 100 = 20K URLs/s (because the read:write ratio is 100:1)

Storage Estimation

Suppose we store every shortened link for 5 years. Since we create 500M shortened links per month, over 5 years we will have:

500M * 12 months * 5 years = 30B URLs

Assume each shortened link takes 500 bytes of storage. The disk space needed to store these URLs for 5 years will be:

30B * 500 bytes = 15TB

Bandwidth Estimation (network bandwidth)

First, let me explain what bandwidth is (network bandwidth)

Network bandwidth is the amount of data transferred (data size) per second.

Data transfer includes two types: incoming and outgoing. Incoming data is the amount of data sent to the server (like an upload); outgoing data is the amount of data returned from the server to users (like a download). Because our system receives 200 new URLs per second:

total incoming data = 200 * 500 bytes = 100 KB/s

For read requests, our system serves 20K URLs/s, so:

total outgoing data = 20K * 500 bytes = 10 MB/s

Memory Estimation

In order for the system to run faster, the best solution is to cache short links that many users click. So how much memory will we need?

If we follow the 80/20 rule, 20% of the shortened links generate 80% of the system's traffic. (Simply put, only 20% of the shortened links are accessed heavily, while the remaining 80% receive very little traffic; that is why 20% of the links generate 80% of the traffic.)

Since we serve a total of 20K URLs/s (i.e., 20K requests/s), one day will have:

20K * 3600 seconds * 24 hours ≈ 1.7B requests/day

To cache 20% of these requests, we will need:

0.2 * 1.7B * 500 bytes = 170GB memory

Summary of system size

Our system creates 500M URLs a month and has a read:write ratio of 100:1. Its specification is as follows (a small script reproducing these numbers follows the list):

200 URLs generated every second

Number of accesses: 20K requests/s

Incoming data (uploads): 100 KB/s

Outgoing data (downloads): 10 MB/s

Disk capacity over 5 years: 15 TB

Memory capacity for the cache: 170 GB
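For anyone who wants to double-check these numbers, here is a small back-of-envelope script that reproduces the arithmetic above. The constants are just the assumptions we made, not measurements:

```python
# Back-of-envelope numbers for the system described above.
NEW_URLS_PER_MONTH = 500_000_000   # 500M shortened links per month (assumption)
READ_WRITE_RATIO = 100             # 100 reads per write (assumption)
BYTES_PER_URL = 500                # assumed storage per shortened link
SECONDS_PER_MONTH = 30 * 24 * 3600

writes_per_sec = NEW_URLS_PER_MONTH / SECONDS_PER_MONTH     # ~193, rounded to 200
reads_per_sec = writes_per_sec * READ_WRITE_RATIO           # ~19.3K, rounded to 20K

total_urls_5y = NEW_URLS_PER_MONTH * 12 * 5                 # 30B URLs over 5 years
storage_tb = total_urls_5y * BYTES_PER_URL / 1e12           # 15 TB

incoming_kb_per_s = writes_per_sec * BYTES_PER_URL / 1e3    # ~100 KB/s
outgoing_mb_per_s = reads_per_sec * BYTES_PER_URL / 1e6     # ~10 MB/s

requests_per_day = reads_per_sec * 24 * 3600                # ~1.7B requests/day
cache_gb = 0.2 * requests_per_day * BYTES_PER_URL / 1e9     # ~170 GB for the hot 20%

print(f"{writes_per_sec:.0f} writes/s, {reads_per_sec:.0f} reads/s")
print(f"{storage_tb:.0f} TB disk, {cache_gb:.0f} GB cache")
```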

API design

We can use SOAP or REST to design the system's API. From the requirements above, our system needs at least the following two APIs:

createURL

First, we need an API to create a shortened link:

createURL(api_dev_key, original_url, custom_alias=None, expire_date=None)

api_dev_key (string): the API developer key of a registered user account. This key identifies the user and limits the number of requests they can make (aka rate limiting).

original_url (string): the original link.

custom_alias (string, optional): a custom key for the URL.

expire_date (string, optional): the expiration date of the shortened link.

Return value (string):

If successful, the system inserts a record into the database and returns the shortened link.

If it fails, it returns an error code.

deleteURL

The second API, also quite necessary, deletes a registered shortened link (a sketch of both endpoints follows below).

deleteURL(api_dev_key, url_key)

api_dev_key (string): the API developer key of a registered user account.

url_key (string): the shortened link to be deleted.

Return value (string):

If successful, the shortened link will be deleted.

If it fails, it returns an error code.
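To make the contract concrete, here is a minimal Python sketch of what these two endpoints might look like. The url_store dictionary, the API_KEYS set, and the key generation are hypothetical stand-ins for the real database and Encoding Service:

```python
import secrets
from datetime import datetime, timedelta

url_store = {}            # hypothetical in-memory stand-in for the URL table
API_KEYS = {"demo-key"}   # hypothetical set of registered developer keys

def create_url(api_dev_key, original_url, custom_alias=None, expire_date=None):
    """Shorten original_url and return the key, or raise on error."""
    if api_dev_key not in API_KEYS:
        raise PermissionError("invalid api_dev_key")
    # Placeholder key generation; the real encoder is designed later.
    key = custom_alias or secrets.token_urlsafe(4)[:6]
    if key in url_store:
        raise ValueError("alias already taken")
    url_store[key] = {
        "original_url": original_url,
        "expire_date": expire_date or datetime.utcnow() + timedelta(days=365),
    }
    return key

def delete_url(api_dev_key, url_key):
    """Delete a shortened link identified by url_key."""
    if api_dev_key not in API_KEYS:
        raise PermissionError("invalid api_dev_key")
    if url_store.pop(url_key, None) is None:
        raise KeyError("unknown url_key")
```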

How to prevent hackers?

Hackers can use the API to create far more shortened links than the system is designed for, with the goal of putting our system to “Rest In Peace”.

For example, our current system is designed to handle 500 million URLs generated per month.

If an attacker creates 100 times that amount, about 50 billion URLs, the system will consume far more resources: more memory, more disk. The system will surely go down, and every shortened link will go down with it.

So how do we solve this problem? The simplest way is to limit the number of API calls per api_dev_key (a technique called rate limiting, which companies like Grab use). For example, each api_dev_key might be allowed to create only about 100 shortened links per day.

It is not a 100% perfect solution, but it does limit the damage.
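For illustration, here is a minimal fixed-window rate limiter sketch, assuming the hypothetical limit of 100 createURL calls per key per day. A real system would keep the counters in something like Redis rather than process memory:

```python
import time
from collections import defaultdict

LIMIT = 100                 # max createURL calls per key per window (assumption)
WINDOW_SECONDS = 24 * 3600  # one day

_counters = defaultdict(lambda: [0, 0.0])  # api_dev_key -> [count, window_start]

def allow_request(api_dev_key):
    """Fixed-window counter: True if this key may create another link."""
    count, window_start = _counters[api_dev_key]
    now = time.time()
    if now - window_start >= WINDOW_SECONDS:
        _counters[api_dev_key] = [1, now]   # start a fresh window
        return True
    if count < LIMIT:
        _counters[api_dev_key][0] += 1
        return True
    return False                            # over quota: reject the call
```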

Database design

Our database requirements will be as follows:

Billions of records need to be saved

Each stored object is small (less than 1 KB)

There is no need for relationships between records.

The system has a high read rate

Database schema:

We will need two main tables: one to store user information, and one to store URL information.
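The exact columns are up to you; here is one plausible version of the two tables, sketched as Python dataclasses. The field names are my assumptions:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class User:
    user_id: str        # primary key
    name: str
    email: str
    api_dev_key: str    # developer key used for rate limiting
    created_at: datetime

@dataclass
class ShortURL:
    url_key: str        # primary key, e.g. "bvzO3"
    original_url: str
    user_id: str        # creator (no enforced foreign key in NoSQL)
    created_at: datetime
    expire_date: datetime
```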

What kind of database should be used?

Because we expect to store billions of records, and the tables have no relationships between them, a NoSQL key-value store is probably the best option. DynamoDB and Cassandra, for example, would both work well.

Algorithm and basic system design

The problem to solve here is how to create a shortened link that is unique, given only the original one.

In the first part, I gave an example of a shortened link: shorturl.at/bvzO3

In this part, we will design how to create the shortened part, which is bvzO3.

Encoding URL

We can use a hash function (like MD5 or SHA256) to hash the input URL, then use an encoding function to display the result, for example base36 ([a-z, 0-9]), base62 ([a-z, A-Z, 0-9]), or base64 ([a-z, A-Z, 0-9, -, .]).

The question is, what key length should we use: 6, 8, or 10 characters?

If we use base64 with 6 characters, we have 64^6 ≈ 68.7B possible URLs.

If we use base64 with 8 characters, we have 64^8 ≈ 281 trillion possible URLs.

Because our system generates 500M URLs each month, over 5 years it will have a total of:

500M * 12 months * 5 years = 30B URLs

So 68.7B keys (6 characters) is enough for 5 years of use.

If we use MD5 as the hash function, it generates a 128-bit hash value. Base64-encoding that hash produces at least 21 characters (because each base64 character encodes 6 bits of the hash: 128 / 6 ≈ 21.3).

Meanwhile, our key only needs 6 characters. So how do we choose them? We can simply take the first 6 characters. Collisions can occur, but the probability is only about 1 / (64^6), which is very small and should be acceptable.

To be safe, every time we generate a key we check the database to see whether it already exists. If not, we are done; if it does, we prepend a random string to the URL and repeat until a unique key is generated.
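Putting the whole idea together, here is a minimal sketch: hash with MD5, base64-encode, take the first 6 characters, and retry with a random salt on collision. The exists_in_db check is a hypothetical database lookup:

```python
import base64
import hashlib
import secrets

def exists_in_db(key):
    """Hypothetical database lookup; always False in this sketch."""
    return False

def shorten(original_url, key_length=6):
    """MD5-hash the URL, base64-encode it, keep the first 6 characters;
    retry with a random salt if the key already exists."""
    candidate = original_url
    while True:
        digest = hashlib.md5(candidate.encode()).digest()        # 128-bit hash
        key = base64.urlsafe_b64encode(digest).decode()[:key_length]
        if not exists_in_db(key):
            return key
        candidate = secrets.token_hex(4) + original_url          # salt and retry

print(shorten("https://medium.com/jp-tech/docker-image-in-production"))
```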


What is the problem with our solution?

Many users may submit the same original link, so the shortened link will be duplicated. This is not acceptable.

What if part of the URL is encoded? For example, http://example.com/index.php?id=design and http://example.com/index.php%3Fid%3Ddesign are effectively the same URL, but a portion of the second has been percent-encoded, so the two hash to different values.

Solution

There are two approaches that can solve this problem.

We can use an incrementing integer and prepend it to each original link. This guarantees the input is unique: even if many people submit the same link, the shortened link will always be different. After each shortened link is created, the integer increases by 1. The problem is that if the number grows forever, the integer will eventually overflow. Moreover, maintaining the increment also affects system performance.

Alternatively, we can prepend the user_id to each URL. However, if a user who is not logged in wants to create a shortened link, we have to ask them for a key instead, and that key must be unique (if they enter a non-unique key, we ask them to re-enter it until it is unique).

And here is the flow of the system (a code sketch follows the steps):

First, enter the link you want to shorten, and press Enter. The request will then be sent to the server.

The server receives the request and forwards it to the component that specializes in shortening links. Let's call it the Encoding Service.

The Encoding Service handles the shortening:

If the URL does not yet exist in the system, it saves the shortened link into the database and returns the result to the server.

If the URL already exists in the system (i.e., someone has already shortened it), it prepends a sequence number (the incrementing integer) to the URL, shortens that, saves the shortened link to the database, and returns the result to the server.

The server receives the result and returns it to the user.
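A compact sketch of this flow, reusing the shorten function from the earlier sketch. The db dictionary, seen_urls set, and counter are hypothetical stand-ins for real storage:

```python
db = {}            # hypothetical store: url_key -> original_url
seen_urls = set()  # hypothetical index of already-shortened URLs
_counter = 0       # incrementing integer used to de-duplicate inputs

def handle_shorten_request(original_url):
    """Encoding Service flow: de-duplicate, shorten, persist, return."""
    global _counter
    input_url = original_url
    if original_url in seen_urls:                # someone already shortened it
        _counter += 1
        input_url = f"{_counter}{original_url}"  # prepend the sequence number
    key = shorten(input_url)                     # from the earlier sketch
    db[key] = original_url
    seen_urls.add(original_url)
    return key
```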

Data Partitioning and Replication

If we store all 30 billion URLs in one DB and hit it with up to 20K requests/s, the DB will be under heavy load and may go down. There are two solutions to this problem:

Partition the data in the database (Data Partitioning). This means we split the DB into many smaller DBs, each holding one slice of the data.

Cache the URLs to minimize queries to the DB (I'll explain this in the next section).

For data partitioning, there are two common types:

Range Based Partitioning

This type of partitioning divides the data based on the first letter of the URL or hash key.

For example, a URL (ignoring the https:// or http:// prefix) that starts with the letter “a” goes into database “a”, a URL that starts with “b” goes into database “b”, and so on. Partitioning on the first letter requires 26 different databases (a through z).

But this approach can be problematic. Suppose we put all URLs starting with the letter “f” into database “f”, and it happens that these are the most heavily accessed URLs. Then database “f” ends up under disproportionately heavy load.

Note: partitioning by the first letter is just an example; you can devise your own algorithm to partition the data properly and efficiently. You do not have to use the first letter.

Hash-Based Partitioning

In this type, we take a hash of the object being stored and use it to compute which partition the object belongs to. We can hash the primary key, or the original link, to determine which partition stores the data.
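A minimal sketch of hash-based partition selection, assuming a hypothetical fixed count of 16 partitions:

```python
import hashlib

NUM_PARTITIONS = 16  # assumed; real systems often use consistent hashing instead

def partition_for(url_key):
    """Map a key to a partition by hashing it and taking the modulo."""
    digest = hashlib.md5(url_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

print(partition_for("bvzO3"))  # which shard stores this key
```

One caveat of the plain modulo approach: changing NUM_PARTITIONS reshuffles almost every key, which is why large systems often prefer consistent hashing.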

Cache

For a system of billions of clicks a month, a cache server is indispensable.

Why do we need a cache server?

The standard flow will be:

Step 1: a user accesses a shortened link.

Step 2: we query the DB to get the original link from the shortened link.

Step 3: we redirect the user to the original link.

If there is no cache server, every request has to go to the DB for the result, putting the DB under heavy load. To minimize queries to the DB, we cache the result of each query. The next time a user accesses that shortened link, we simply read it from the cache, without having to query the DB again.

Because a cache server stores its data in memory, fetching a result from the cache is much faster than fetching it from the DB.
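This is the classic cache-aside pattern. A minimal sketch, where the cache dictionary stands in for a real cache server and db_lookup for the real database query:

```python
cache = {}  # hypothetical in-memory cache (in production: Redis or Memcached)

def db_lookup(url_key):
    """Hypothetical database query that returns the original URL, or None."""
    return None  # stand-in for a real query against the URL table

def resolve(url_key):
    """Cache-aside read: try the cache first, fall back to the DB on a miss."""
    original = cache.get(url_key)
    if original is None:                 # cache miss
        original = db_lookup(url_key)    # go to the DB once
        if original is not None:
            cache[url_key] = original    # keep it for the next request
    return original                      # caller redirects the user here
```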

Which Cache Server do we need to use?

Currently there are many cache servers, such as Redis and Memcached. Both are well known and widely used in large systems around the world.

I previously worked at a game company whose system used Redis, and I found it quite good, with support for many features: automatic ranking of results, syncing data between memory and disk to prevent data loss, and so on.

So if you do not know which one to use, I recommend learning and implementing Redis.

How much memory do we need?

In the earlier estimate, I calculated that this system needs up to 170 GB of memory to cache 20% of the URLs. Nowadays a server can have 256 GB of memory, which is more than enough for this.

Alternatively, we can combine many small servers (say, 8 GB of memory each) to cache those URLs.

What happens if the cache fills up?

Most cache systems follow an eviction mechanism such as LRU (Least Recently Used) or LFU (Least Frequently Used).

LRU (Least Recently Used): discards the items that have gone unused for the longest time.

LFU (Least Frequently Used): discards the items that are used least often.

Thanks to these mechanisms, the cache keeps evicting stale entries and never fills up permanently.
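For illustration, here is a tiny LRU cache sketch built on Python's OrderedDict. Real cache servers implement this internally, so you would never write it yourself:

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is reached."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used
```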

Load Balancer

With a high-traffic system like this one, a single web server may not be able to handle the load. To solve this, we use many web servers, each taking a portion of the user requests.

The question is: how do we automatically distribute requests across the different web servers?

And Load Balancer was born to solve this problem.

For example, suppose several web servers sit behind the load balancer. The first client request that reaches the load balancer is forwarded to web server 1, the second to web server 2, and so on (round robin).
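That round-robin policy is simple enough to sketch in a few lines; the server names are placeholders:

```python
from itertools import cycle

# Hypothetical pool of web servers behind the load balancer
servers = cycle(["web-server-1", "web-server-2", "web-server-3"])

def route_request(request_id):
    """Forward each incoming request to the next server in rotation."""
    target = next(servers)
    return f"request {request_id} -> {target}"

for i in range(4):
    print(route_request(i))  # servers 1, 2, 3, then back to 1
```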

Currently, cloud providers like AWS, Google Cloud, and Azure offer managed load balancers, so you don't need to worry about building one yourself. Just set it up and use it.

Conclusion

Having read this article, you should now have a sense of how to think about designing a large system that serves millions of users.

If you encounter a similar system, I think you now have enough knowledge and skill to tackle it.

Many new graduates, and those who have not worked on large systems, don't know where to start or which technologies to use. I hope this article helps answer those questions.