Optimizing and Scaling a MySQL Web Application

Or How to Stop Worrying When You See Timeouts and Learn to Love These Techniques

What Does it Mean to Optimize and Scale a MySQL Web Application?

Ask an application developer, a database admin, and a DevOps engineer what it means to “optimize and scale MySQL” and you’ll probably get three different answers. This article presents techniques from all three of these perspectives. As a heuristic, we use the following to define what it means to “scale MySQL”:

Your MySQL-backed web application’s traffic just increased 100x (yay!) but now you’re starting to see timeouts (no!). What should you do?

The techniques we discuss fall into these categories:

Application Level Optimizations MySQL Level Optimizations Scaling Through Infrastructure

Note most of these techniques are also applicable to PostgreSQL and other SQL-RDBMSs. For simplicity, we colloquially use MySQL to refer to these other databases as well.

Application Level Optimizations

These optimization techniques occur at the application level.

Cache Queries in Redis

Caching database queries in an in-memory database like Redis will lead to fewer MySQL queries. Not every query should be cached, just those queries which are frequently accessed and/or slow.

As with all caching solutions, this will increase code complexity and introduce potentially hard-to-find bugs. This complexity comes primarily from keeping the cache up-to-date with MySQL. A common technique is when a table is updated, all its caches are deleted/invalidated. The next time a cache-able query is requested, it sees the cache is absent/invalid (aka cache miss), so it will query MySQL and cache the result.

This type of caching works great for high-read-low-write caches, such as queries which populate the content for the front page of a popular media site. But in high-write cases, caches are frequently missed, defeating the purpose of caching.

Prefetching Records to Prevent n+1 Queries

The n+1 query is defined as a query that fetches n records from the database, then runs an addition query for each of those records. If this definition is abstract and confusing, an example will clarify:

Suppose, we want to fetch all 5 users from an account as well as each user’s expenses. Here are the 5+1 queries

SELECT * FROM users WHERE account_id = 42;

# returns users 3,5,27,38,99 # Now select the expenses for each user

SELECT * FROM expenses WHERE user_id = 3;

SELECT * FROM expenses WHERE user_id = 5;

SELECT * FROM expenses WHERE user_id = 27;

SELECT * FROM expenses WHERE user_id = 38;

SELECT * FROM expenses WHERE user_id = 99;

Prefetching (aka eager-loading) means if we expect the application to also query for the expenses, then we batch query them by user_id , resulting in just 2 queries instead of 5.

SELECT * FROM users WHERE account_id = 42; SELECT * FROM expenses WHERE user_id IN (3,5,27,38,99);

Many ORM frameworks already support prefetching such as ActiveRecord’s include method, so just be sure to use them.

Caching Query Results as Variables

If a query result is used multiple times during an operation, be sure to cache it as a variable. Here’s an example in Ruby:

We cache the order query as instance variable, so we don’t run the query 3 times.

Cache Columns and Summary Tables

A cache column on a table stores certain frequently accessed aggregate (ie. COUNT SUM MIN MAX AVG ) data. For example, if a messages table is quite large and the number of unread messages for users are accessed frequently, it may be a good idea to store unread_message_count in the users table as a column. As such, we would need to update this unread_message_count every time a message gets read or new messages come in.

Similarly, a summary table is a table that holds a variety of aggregate queries. For example, a CRM dashboard might show the number of sales leads and closes by month by salesperson. Rather than query this data every time the dashboard is loaded, save the data beforehand in a summary table. This example works especially well with cases like monthly historical data, so it will be rarely updated.

Supplement Queries With a Search Engine Like ElasticSearch

Take the load off database by moving complex or heavy-text queries into a search engine like Elasticsearch. The cost here will be similar to the caching solutions mentioned above, primarily increased code complexity from keeping the search indices up-to-date with the database.

Archiving Old Rows in Large Tables

If a table becomes extremely large (ie. such as the messages or activities table on a social networking site), consider archiving old rows in a separate database. As such, the old data is not readily available on the UI and the user needs to explicitly request more data.

MySQL Level Optimizations

These optimization techniques involve changes at the MySQL level.

Select Only the Required Columns

A query using * will return all columns, including joined tables. For example,

SELECT *

FROM posts

JOIN post_tags ON posts.id = post_tags.post_id

JOIN tags ON post_tags.tag_id = tags.id

will return all columns from all three tables adding unnecessary I/O, memory and CPU load to the MySQL server. Instead, return only the needed columns.

Use ENUM for categorical data types

Use ENUM for categorical data types. For example

# Instead of

education VARCHAR(10) NOT NULL # Use ENUM

education ENUM ('highschool', 'college', 'postgrad') NOT NULL

Aside from a data integrity check, an ENUM column only uses 1 byte for up to 255 distinct values.

Add Needed Indexes and Remove Unused Indexes

An index is a data structure (usually a b-tree) that speeds up certain queries for a specific table. An index works by sorting the table by one (or more) of its column(s). This sorting allows lookups by that column to take O(log_2(N)) time since its a binary search instead of O(N) time as a linear search.

For queries that use multiple columns, use composite indexes which are indexes that sort a table by more than one column. For more details on how single and composite indexes work, check out this article.

To find missing indexes, use MySQL’s EXPLAIN tool. When EXPLAIN is prepended to a query, it runs the query and returns information like

mysql> EXPLAIN SELECT * FROM users WHERE account_id = 1 LIMIT 5\G;

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: users

type: ref

possible_keys: index_users_on_account_id_and_uid

key: index_users_on_account_id_and_uid

key_len: 5

ref: const

rows: 104

Extra: Using where

Here is a good write-up from Sitepoint on how to read the EXPLAIN output , but the main thing to watch out for is when the type field has the value ALL , which means the query performed a full table scan. A full table scans means there’s probably a missing index, so be sure to add them.

As always, indexes don’t come free. They increase the database write time and use more disk space, so be sure to add only indexes that will be used and remove un-used indexes.

Sometimes Many Simple Queries are Better Than a Single Complex Query

Sometimes breaking up a single complex query into multiple simple queries is better. For example, instead of

SELECT users.full_name FROM users

JOIN user_locations ON users.id = user_locations.user_id

JOIN locations ON locations.id = user_locations.location_id

WHERE locations.id = 4231;

Rewrite it as

SELECT locations.id FROM locations WHERE id = 4231;

SELECT user_locations.id FROM user_locations WHERE location_id = 21;

SELECT users.full_name FROM users WHERE id IN (123,456,789,101);

The reason why this is the case is because:

The MySQL query cache can cache these simple queries for other uses. A complex join is less likely to be used in other contexts. But note the query cache has been deprecated in MySQL 8.0. Simple queries are less likely to be subject to lock contention. Multiple simple queries can be run across different replicas instead of being confined to a single master (more on replicas/master below). In earlier days when network speeds were slower, it was better to make only one trip to the database. This isn’t the case nowadays. Network speeds are much faster, especially when your app and database servers are hosted on the same cloud provider region.

Check The Slow Query Log

Now that we know how to optimize queries, where do we find slow queries to optimize?

In the (aptly named) Slow Query Log which records all queries that exceed a user-defined threshold called long_query_time . You can enable the Slow Query Log with

SET GLOBAL slow_query_log = 'ON';

SET GLOBAL long_query_time = 5; # seconds

SET GLOBAL slow_query_log_file = '/path/filename';

Scaling Through Infrastructure Changes

Scaling MySQL through infrastructure essentially means adding more database servers or using more powerful database servers, called horizontal scaling or vertical scaling, respectively.

Use Replica Databases For Read Operations (Horizontal)

MySQL has built-in support for replicas, which as the name implies, are database copies of the master database. The way replication works is that data changes to the master database are saved to a binary log which the replica replays to sync itself with the master. As such a replica is not always synced up with master and this delay is called replica latency.

Replicas can be to take the load off the master by handling read queries that tolerate a certain level of replica latency. Furthermore, keeping reads and writes on separate databases will prevent read-write-locking issues.

Also, replicas can be used for backups (ie. when a new dev accidentally deletes client data on his first day).

Functional Partitioning and Micro-services (Horizontal)

Some applications can easily be partitioned. For example, an e-commerce app can be partitioned into a consumer-oriented database and a merchant-oriented database. Most joins should be within the same database but joins across databases can be done at the application level.

When the database is partitioned along functionality, this leads naturally to micro-services. In our case, the consumer-facing functionality can be its own micro-service talking with the consumer-oriented database and the merchant-facing functionality can be its own micro-service talking with the merchant-oriented database.

Table Sharding (Horizontal)

There seems to be some ambiguity of the term sharding, but we here we will define table sharding as the partitioning of a table by rows.

The are many performance issues with a table having too many rows, such as:

queries are slower

backups take longer

migrations lock up the database for a longer duration

Table sharding partitions a single big table into many smaller tables (all with the same schema). This partitioning can occur along functional lines. For example, a users table can be partitioned along geographic lines. Maybe an enterprise CRM has one huge customer who accounts for 50% of all the records in a table, so partitioning by this one customer and every one else is a good strategy. The rule of thumb with table sharding is that the resulting partitions should be roughly the same size. Also note these partitioned tables don’t even necessarily have to be on the same database server.

MySQL supports table sharding which it calls user defined partitioning.

Upgrading Server Hardware (Vertical)

Vertical scaling means upgrading the database server to a machine with more memory, CPU speed/cores, I/O capacity, etc. Nowadays cloud service providers like Heroku or AWS make this pretty easy.

For example, on Heroku, to upgrade the database server, first create a new (and more powerful) replica to follow the master. Once the replica is caught up, swap the ENV vars and make this replica the new master.

Conclusion

We looked at a variety of techniques to optimize and scale a MySQL application, but if the database is still not scaling even after all these techniques have been employed, it might be worthwhile to look at Non-SQL databases.

If your data models and relations are very graph-like with lots of queries being graph traversals, then a graph database like Neo4j and would be the obvious choice.

Another popular non-SQL database is MongoDB which is a document database, meaning there is no schema, no relations, and the entire database can thought of as a collection of JSONs. But from a scaling perspective, MongoDB automatically shards data.

Thanks for reading! Feel free to leave your own suggestions, corrections or other comments below — S