Your boss tells you to learn partitioning topics to scale your databases or understand the database design at your company. You search the internet and then you arrive here. Welcome! Here is a gentle introduction to these key points of partitioning database tables:

Database Table Partitioning:

vertical vs horizontal partitioning

First, we will need to look at how databases (DBs) are designed typically. DBs are composed of tables, which have columns and rows — each can have many or few. Partitioning a database (for our sake let’s say our DB only consists of one large table) is where very large tables are divided into multiple smaller tables. In doing so, queries access only a chunk of the data and can thus run faster because there is less data to search. The mission of partitioning is to assist in maintenance of large tables and to shave off the overall response time in reading and loading data for SQL functions.

Vertical Partitioning vs Horizontal Partitioning

Vertical Partitioning stores tables &/or columns in a separate database or tables.

Horizontal Partitioning (sharding) stores rows of a table in multiple database clusters.

Sharding makes it easy to generalize our data and allows for cluster computing (distributed computing). Sharding is needed if a data set is too large to be stored in a single DB. Most importantly, sharding allows a DB to scale in line with its data growth. It also reduces table size (index size more specifically) which improves search performance.

Common rules/guidelines for sharding (horizontally partitioning):

Each database server must be the same structurally. Data records are sensibly divided in a sharded DB. Each data record exists in only one shard (other than audit tables/backups)

Partitioning Pros: Scalability, increased speed for large databases

Cons: Makes programming and operations more convoluted