Have you ever wondered how people create URL shortening websites? They just use common sense. You heard that right. I too thought it was a very big task, but after thinking about it a bit I realized that simple mathematical concepts can be used to write beautiful applications. What is the link between mathematics and URL shortening? That is what we are going to unveil in this article.

In a single statement, a URL shortening service is built upon two things.

A string mapping algorithm that maps long strings to short strings (Base62)

A simple web framework (Flask, Tornado) that redirects a short URL to the original URL

There are two obvious advantages of URL shortening.

Short URLs are easy to remember and easy to maintain.

They can be used where text length is restricted, e.g. on Twitter.

Technique of URL shortening

There is no such thing as a URL shortening algorithm. Under the hood, every record stored in the database is assigned a primary key (PK). That PK is passed to an algorithm that generates a short string from it. The short string is thereby indirectly mapped to the URL that the customer registers with us.

Suppose I visit the Bit.ly website and give it my blog link, http://www.impythonist.wordpress.com. It hands me back a short link.

Here a question comes to mind: how do they reduce a lengthy string to a short one? They are not actually reducing the size of the original link. They just add a layer of abstraction. The steps everyone follows are:

Insert a record with URL into database

Use the record ID returned to generate the short string

Pass it back to Customer

Whenever a request arrives, extract the short string from the URL and regenerate the database record ID -> fetch the URL -> redirect to the website

That’s it. It is very simple to generate a short string from a given large number using Base62 encoding. Whenever a request comes to our website, we can get the number back by decoding the short string in the URL. We then use that ID to fetch the record from the database and redirect to its URL.
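The steps above can be sketched without any web framework or database. This is only a minimal illustration: the `records` list stands in for the database table, and the names `encode`, `decode`, `shorten` and `resolve` are hypothetical helpers made up for this sketch, not part of the project.

```python
import string

# 62 symbols: 0-9, a-z, A-Z
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def encode(record_id):
    """Turn a non-negative integer ID into a Base62 string."""
    if record_id == 0:
        return ALPHABET[0]
    chars = []
    while record_id:
        record_id, remainder = divmod(record_id, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))

def decode(short_string):
    """Recover the integer ID from a Base62 string."""
    record_id = 0
    for ch in short_string:
        record_id = record_id * 62 + ALPHABET.index(ch)
    return record_id

# Hypothetical stand-in for the database: position + 1 acts as the primary key.
records = []

def shorten(url):
    records.append(url)           # insert a record with the URL
    record_id = len(records)      # the ID the "database" handed back
    return encode(record_id)      # ID -> short string

def resolve(short_string):
    record_id = decode(short_string)    # short string -> ID
    return records[record_id - 1]       # fetch the URL to redirect to

code = shorten("http://www.example.org")
print(code, "->", resolve(code))
```

Nothing about the long URL itself is compressed here. The short string is just the record ID written in a different base.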

Let us build one such URL shortener in Python

Code for this project is available at my git repo. https://github.com/narenaryan/Pyster

There are three ingredients in preparing a URL shortening service.

Base62 Encoder and Decoder

Flask for handling requests and redirects

SQLite3 as the database

If you already know how to convert Base10 to Base64 or Base62 (or any other base), you can follow along with me. Otherwise, read up on base conversions here.

http://tools.ietf.org/html/rfc3548.html

I am interested only in Base62 here because I need to generate strings made up of characters from [a-z], [A-Z] and [0-9]. The encoder maps an integer to a string. The decoder recovers the integer from a given string. They are a function and its inverse. This is the Base62 code for the encoder and decoder in Python:

import string

def toBase62(num, b = 62):
    # Encode a non-negative integer as a string in base b
    if b <= 0 or b > 62:
        return 0
    base = string.digits + string.ascii_lowercase + string.ascii_uppercase
    r = num % b
    res = base[r]
    q = num // b  # integer division keeps q an int
    while q:
        r = q % b
        q = q // b
        res = base[r] + res
    return res

def toBase10(num, b = 62):
    # Decode a base-b string back to the integer it represents
    base = string.digits + string.ascii_lowercase + string.ascii_uppercase
    res = 0
    for ch in num:
        res = b * res + base.find(ch)
    return res
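To see what the encoder does on a concrete number, here is a small trace of the repeated division by 62. The helper name `base62_steps` and the sample value 1000 are my own choices for illustration. The alphabet order (digits, then lowercase, then uppercase) is the one used in the code above.

```python
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def base62_steps(num):
    """Return the (quotient, remainder, symbol) triple from each division by 62."""
    steps = []
    while True:
        num, remainder = divmod(num, 62)
        steps.append((num, remainder, ALPHABET[remainder]))
        if num == 0:
            return steps

steps = base62_steps(1000)
for quotient, remainder, symbol in steps:
    print(f"quotient={quotient:>3}  remainder={remainder:>2}  symbol={symbol}")

# Symbols come out least significant first, so reverse them to get the string.
encoded = "".join(symbol for _, _, symbol in reversed(steps))
print(encoded)  # g8, because 16 * 62 + 8 == 1000
```

Reading the result back is the decoder's job: 'g' is symbol 16 and '8' is symbol 8, so 16 * 62 + 8 gives 1000 again.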

Now let me create a database called urls.db using the following command.

$ sqlite3 urls.db

Now I am creating main.py for the Flask app, along with a template file.

# main.py (Python 3)
from flask import Flask, request, render_template, redirect
from sqlite3 import OperationalError
from urllib.parse import urlparse
import string, sqlite3

host = 'http://localhost:5000/'

# Assuming urls.db is in your app root folder
def table_check():
    # SQLite accepts AUTOINCREMENT only on an INTEGER PRIMARY KEY column
    create_table = """
        CREATE TABLE WEB_URL(
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        URL TEXT NOT NULL
        );
        """
    with sqlite3.connect('urls.db') as conn:
        cursor = conn.cursor()
        try:
            cursor.execute(create_table)
        except OperationalError:
            # The table already exists
            pass

# Base62 Encoder and Decoder
def toBase62(num, b = 62):
    if b <= 0 or b > 62:
        return 0
    base = string.digits + string.ascii_lowercase + string.ascii_uppercase
    r = num % b
    res = base[r]
    q = num // b
    while q:
        r = q % b
        q = q // b
        res = base[r] + res
    return res

def toBase10(num, b = 62):
    base = string.digits + string.ascii_lowercase + string.ascii_uppercase
    res = 0
    for ch in num:
        res = b * res + base.find(ch)
    return res

app = Flask(__name__)

# Home page where user should enter
@app.route('/', methods=['GET', 'POST'])
def home():
    if request.method == 'POST':
        original_url = request.form.get('url')
        if urlparse(original_url).scheme == '':
            original_url = 'http://' + original_url
        with sqlite3.connect('urls.db') as conn:
            cursor = conn.cursor()
            # Bind the URL as a parameter instead of formatting it into the SQL
            insert_row = 'INSERT INTO WEB_URL (URL) VALUES (?)'
            result_cursor = cursor.execute(insert_row, (original_url,))
            encoded_string = toBase62(result_cursor.lastrowid)
        return render_template('home.html', short_url=host + encoded_string)
    return render_template('home.html')

@app.route('/<short_url>')
def redirect_short_url(short_url):
    decoded_string = toBase10(short_url)
    # Fall back to the home page if no record matches
    redirect_url = 'http://localhost:5000'
    with sqlite3.connect('urls.db') as conn:
        cursor = conn.cursor()
        select_row = 'SELECT URL FROM WEB_URL WHERE ID=?'
        result_cursor = cursor.execute(select_row, (decoded_string,))
        try:
            redirect_url = result_cursor.fetchone()[0]
        except Exception as e:
            print(e)
    return redirect(redirect_url)

if __name__ == '__main__':
    # This code checks whether database table is created or not
    table_check()
    app.run(debug=True)
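The two handlers hinge on two database calls: an INSERT whose `lastrowid` becomes the short string, and a SELECT by ID. Here is a minimal sketch of just those calls, run against an in-memory database so it leaves no file behind. The helper names `save_url` and `load_url` are hypothetical. The `?` placeholders hand the URL to SQLite as data, so a URL containing a quote character cannot alter the statement.

```python
import sqlite3

# In-memory database for the sketch; the article's app uses urls.db instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE WEB_URL (ID INTEGER PRIMARY KEY AUTOINCREMENT, URL TEXT NOT NULL)"
)

def save_url(url):
    """Insert a URL and return the row ID SQLite assigned (hypothetical helper)."""
    cursor = conn.execute("INSERT INTO WEB_URL (URL) VALUES (?)", (url,))
    conn.commit()
    return cursor.lastrowid

def load_url(row_id):
    """Return the stored URL for row_id, or None if there is no such row."""
    row = conn.execute("SELECT URL FROM WEB_URL WHERE ID = ?", (row_id,)).fetchone()
    return row[0] if row else None

first_id = save_url("http://www.example.org")
tricky_id = save_url("http://example.org/?q=it's")  # the quote is stored verbatim

print(first_id, load_url(first_id))
print(tricky_id, load_url(tricky_id))
print(load_url(999))  # no such row
```

Returning None for a missing row lets the caller decide what to do with an unknown short string, for example redirecting to the home page.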

Let me explain what is going on here.

We have a Base62 encoder and decoder

We have two view functions: one is home, the other is redirect_short_url

The home function (‘/’) returns the home page and also inserts the original URL into the database

The redirect_short_url function (‘/<short_url>’) receives the redirect request and redirects the shortened URL to the original URL. If you read the code carefully, you can easily follow what is going on.

You can also take a look at the template here: https://raw.githubusercontent.com/narenaryan/Pyster/master/templates/home.html

The project structure looks this way: main.py and urls.db sit in the app root, and home.html lives in the templates folder.

Run the flask app on port 5000.

$ python main.py
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat......

If you visit http://localhost:5000 in your browser you will see the home page.

Now enter a URL to shorten and click submit. It posts the data to the database and generates a short string. In my case it is http://localhost:5000/f . The string looks very short, but as the number of registered URLs increases, the string gradually gets longer, e.g. 11Qxd.
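How gradually does the string grow? Each extra character multiplies the number of representable IDs by 62. The small sketch below counts the characters needed for a few sample IDs. The function name `base62_length` and the sample values are mine, chosen to sit on either side of the powers of 62.

```python
def base62_length(record_id):
    """Number of Base62 characters needed to write a non-negative integer ID."""
    length = 0
    while record_id:
        record_id //= 62
        length += 1
    return max(length, 1)  # ID 0 still needs one character

for record_id in (15, 61, 62, 3843, 3844, 1_000_000, 62 ** 4):
    print(record_id, base62_length(record_id))
```

So the first 61 URLs get a one-character string, and a five-character string such as 11Qxd only shows up once the record ID reaches 62**4 = 14,776,336.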

Now if we click that link, it takes us to http://www.example.org

So this is how URL shortening works. For the entire code, just clone my repo and give it a try. https://github.com/narenaryan/Pyster

I hope you enjoyed the article. Please comment if you have any questions. You can also mail me at narenarya@live.com