4 Simple Steps to Detect & Fix Slow Rails Requests

UPDATE 2016: We’ve released Scout App Monitoring, which automates these steps for you.

In Blink, Malcolm Gladwell’s book on split-second decisions, Gladwell tells the story of how the Emergency Department at Chicago’s Cook County Hospital changed its process for diagnosing chest pain.

Dr. Brendan Reilly instituted a simple test to determine whether a patient was suffering from a heart attack. It combined just 4 questions with the results of an ECG. This simple test was 70% better than the barrage of questions previously asked by hospital staff to identify patients that weren’t having a heart attack and was nearly 20% better at identifying patients that were having heart attacks.

More information on the patient’s symptoms often led to an incorrect diagnosis. It distracted doctors from the real issues.

I’ve seen this many times with developers trying to debug performance issues in Rails applications. They look at outliers instead of the obvious culprits. It’s part of the reason I’ve never felt a need for a deep, detailed Rails monitoring application (i.e. – benchmarks from the controller to the database level on every request).

The majority of the time, our performance problems have nothing to do with the Rails framework (and we’ve worked through a lot of issues since we started building Rails apps in 2005). Why benchmark the entire request cycle when the vast majority of issues are isolated at the database layer? After I’ve ruled out the database, I can see benchmarking a single request (there’s a great free tool below), but I simply don’t want the other, often irrelevant information clouding my mind.

The root symptom we want to avoid in our apps is slow requests. Our Scout plugin for analyzing slow Rails requests has been installed nearly 250 times, so we’re not alone there.

First, it’s probably not Rails

Contrary to what you heard on the Interweb, it’s probably not Rails itself that’s making your app slow. We conducted an internal survey of the Highgroove Studios team to see where we’ve encountered performance issues and the root cause:

The database layer has a huge edge on all other issues. In fact, almost all of the performance problems could happen in any framework in any language. Issues like missing database indexes, not using joins correctly, loading too many records into memory, manipulating too many records through iteration (#map, #each), and memory leaks occur in many languages.
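To make the iteration problem concrete, here is a minimal sketch in plain Ruby. FakeDb is a hypothetical in-memory stand-in that only counts queries (it is not a real database or an ActiveRecord API), and the posts and authors are made-up sample data. It contrasts one lookup per record inside #map with a single batched lookup.

```ruby
# FakeDb is a hypothetical in-memory stand-in for a database; it only counts queries.
class FakeDb
  attr_reader :query_count

  def initialize(authors)
    @authors = authors # { id => name }
    @query_count = 0
  end

  # One "query" per call, like a lookup made inside a loop.
  def find_author(id)
    @query_count += 1
    @authors[id]
  end

  # One "query" for the whole batch, like a join or an IN (...) clause.
  def find_authors(ids)
    @query_count += 1
    @authors.select { |id, _name| ids.include?(id) }
  end
end

posts = [
  { title: "A", author_id: 1 },
  { title: "B", author_id: 2 },
  { title: "C", author_id: 1 }
]
authors = { 1 => "Ann", 2 => "Bob" }

# Iteration: one query per post.
slow_db = FakeDb.new(authors)
slow_names = posts.map { |post| slow_db.find_author(post[:author_id]) }

# Batched: one query up front, then in-memory lookups.
fast_db = FakeDb.new(authors)
by_id = fast_db.find_authors(posts.map { |post| post[:author_id] }.uniq)
fast_names = posts.map { |post| by_id[post[:author_id]] }

puts "per-row: #{slow_db.query_count} queries, batched: #{fast_db.query_count} query"
```

Both approaches produce the same names; only the number of round trips differs, and that gap grows with every record in the loop.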

Having performance issues isn’t a bad thing (it means your web app is growing), but it’s a problem if they aren’t fixed quickly.

1. Monitor for slow web requests

First, we want to be aware of slow web requests ASAP. We use Scout's Slow Rails Requests plugin for real-time notification of slow requests because:

We have a very fast release cycle, and it’s important that we’re aware of any side-effects of a new release ASAP

We could analyze our log files weekly, but it’s too easy to push off a task that isn’t done automatically.

We like knowing about it before our clients
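The idea behind this kind of monitoring can be sketched as a scan of the production log. The sketch below is not the Scout plugin's implementation. The log line shape is an assumption (it varies by Rails version, so adjust the pattern to match your logs), and the 1-second cutoff and example.com URLs are made up for illustration.

```ruby
# Assumed log line shape (varies by Rails version):
#   Completed in 2.41300 (0 reqs/sec) | Rendering: 0.10200 (4%) | DB: 2.21000 (91%) | 200 OK [http://example.com/reports]
COMPLETED = /Completed in (?<total>\d+\.\d+).*?DB: (?<db>\d+\.\d+).*?\[(?<url>[^\]]+)\]/

SLOW_THRESHOLD = 1.0 # seconds; a hypothetical cutoff, not a Scout default

SlowRequest = Struct.new(:url, :total, :db) do
  # Fraction of the request time spent in the database.
  def db_share
    total.zero? ? 0.0 : db / total
  end
end

# Returns the requests in `lines` that took longer than `threshold` seconds.
def slow_requests(lines, threshold = SLOW_THRESHOLD)
  lines.each_with_object([]) do |line, slow|
    match = COMPLETED.match(line)
    next unless match

    total = match[:total].to_f
    next unless total > threshold

    slow << SlowRequest.new(match[:url], total, match[:db].to_f)
  end
end

sample = [
  "Completed in 0.04500 (22 reqs/sec) | Rendering: 0.01000 (22%) | DB: 0.02000 (44%) | 200 OK [http://example.com/]",
  "Completed in 2.41300 (0 reqs/sec) | Rendering: 0.10200 (4%) | DB: 2.21000 (91%) | 200 OK [http://example.com/reports]"
]

slow_requests(sample).each do |req|
  puts format("%s took %.2fs (%d%% in DB)", req.url, req.total, (req.db_share * 100).round)
end
```

Reporting the database share next to the total time makes it easy to see at a glance whether a slow request points at the database layer.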

Once this plugin is installed, we’ll quickly be alerted to slow requests. Now, let’s monitor a couple of key metrics that can impact the performance of our applications.

2. Monitor Server Load

Most *nix servers measure a form of server health called the Server Load. Usually, the Server Load is given in Load Averages over 3 different time periods.

Your Server's Load is essentially a rough idea of the number of queued processes waiting for a resource to become available. This resource is generally CPU time, but could also include a number of other factors like Memory, swap space, disk, etc. A lower number is a good indicator of your overall system health and responsiveness.

The 3 averages are for the last minute, the last 5 minutes, and the last 15 minutes. Using these averages, we can see how busy your server really is.

If you take a look at the "top" program's output on the server, you will see that this server is not busy at all! In fact, this server is currently at 0.00 load on all three load averages. This is ideal, and indicates an idle server, waiting for a process to handle.

It’s common to see that when the load reaches a certain threshold (perhaps 3.0), processes can slow to a crawl and your Rails app may stop responding. We typically generate an alert through Scout's Load Average plugin if the load exceeds 3.00.
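A check like this can be sketched in a few lines of Ruby. This is not the Load Average plugin's code; it assumes a Linux server, where /proc/loadavg starts with the three load averages, and falls back to a made-up idle sample elsewhere. The 3.0 threshold is the one mentioned above.

```ruby
LOAD_ALERT_THRESHOLD = 3.0 # the alert level mentioned above

# Parses the three load averages from /proc/loadavg-style text,
# e.g. "0.42 1.10 3.25 2/345 6789".
def parse_load_averages(text)
  one, five, fifteen = text.split.first(3).map { |value| Float(value) }
  { one_minute: one, five_minutes: five, fifteen_minutes: fifteen }
end

# Returns the periods whose load average exceeds the threshold.
def overloaded_periods(averages, threshold = LOAD_ALERT_THRESHOLD)
  averages.select { |_period, load| load > threshold }.keys
end

# On Linux, /proc/loadavg holds the current values; otherwise use an idle sample.
text = File.readable?("/proc/loadavg") ? File.read("/proc/loadavg") : "0.00 0.00 0.00 1/100 1234"
averages = parse_load_averages(text)
hot = overloaded_periods(averages)

if hot.empty?
  puts "Load is fine: #{averages.inspect}"
else
  puts "ALERT: load above #{LOAD_ALERT_THRESHOLD} for #{hot.join(', ')}"
end
```

Checking all three periods separately helps distinguish a brief spike (only the 1-minute average is high) from sustained pressure (the 15-minute average is high too).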

Why

A slow web request could cause a spike in the load or it could be slow because a background job is using a lot of the CPU, a large number of requests are coming through, etc. Tracking the load helps us figure out these issues.

3. Monitor Memory Usage

On the memory-side, there are 2 things we typically monitor on our Rails setups:

The memory usage of our Mongrel processes & associated processes (like a Ferret server)

The memory usage of the system, most importantly the swap space usage.

It is important to note that as processes use resident memory, their use of virtual memory increases in step. Processes will often appear to consume more “virtual memory” than the system has physical memory. This is perfectly normal, since most operating systems can manage in-memory paging and sharing of resources. It is only when this virtual memory begins “paging” to disk, using swap space on the hard drive to simulate physical memory, that we experience slowness or, worse, out-of-memory problems.

Think about it this way. If you worked in a restaurant and I gave you a big load of dishes (your processes) and 5 really fast dish-washing machines (resident / physical memory), and 5 really slow dish-washers (hard drive / swap space), you would do best to try and optimize all your dishes to be handled by the fast machines. Only when you really needed to, would you utilize those slow dish-washers, and only if you couldn’t handle all the dishes coming in.

Many Rails applications – either the apps themselves or third party libraries – suffer from memory leaks. As the leaking processes use more and more memory, both their resident memory and virtual memory grow. Eventually they begin to use the hard drive as swap space for virtual memory, which is far slower than physical memory. This can dramatically slow performance of the entire system, and thus, all requests. We generate an alert through the Process Usage Plugin if our Mongrel processes exceed a given threshold (usually around 100 MB), and through the Memory Profiler Plugin if the percentage of swap space used exceeds a given threshold (usually around 60%).
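Both checks can be sketched in Ruby. This is a rough illustration rather than the plugins' code: it assumes Linux-style /proc/meminfo text and `ps -eo rss,command`-style rows (RSS in kB), and the sample values and the /mongrel/ pattern are made up. The 100 MB and 60% thresholds come from the paragraph above.

```ruby
SWAP_ALERT_PERCENT = 60.0  # swap threshold mentioned above
PROCESS_ALERT_MB   = 100.0 # per-process threshold mentioned above

# Percentage of swap in use, from /proc/meminfo-style text (values in kB).
def swap_used_percent(meminfo)
  values = meminfo.scan(/^(SwapTotal|SwapFree):\s+(\d+) kB/).to_h
  total = values.fetch("SwapTotal").to_f
  free  = values.fetch("SwapFree").to_f
  return 0.0 if total.zero? # no swap configured

  (total - free) / total * 100
end

# Processes over the limit, from `ps -eo rss,command`-style rows (RSS in kB).
# Returns [command, megabytes] pairs for commands matching `pattern`.
def heavy_processes(ps_output, pattern, limit_mb = PROCESS_ALERT_MB)
  ps_output.each_line.filter_map do |row|
    rss_kb, command = row.strip.split(/\s+/, 2)
    next unless command && command =~ pattern && rss_kb =~ /\A\d+\z/

    megabytes = rss_kb.to_i / 1024.0
    [command, megabytes] if megabytes > limit_mb
  end
end

# Made-up sample data; on a real server, read /proc/meminfo and run ps instead.
meminfo = "MemTotal: 2048000 kB\nSwapTotal: 1000000 kB\nSwapFree: 300000 kB\n"
ps_rows = "  RSS COMMAND\n 51200 mongrel_rails start -p 8000\n143360 mongrel_rails start -p 8001\n"

swap = swap_used_percent(meminfo)
puts "ALERT: #{swap.round(1)}% of swap in use" if swap > SWAP_ALERT_PERCENT
heavy_processes(ps_rows, /mongrel/).each do |command, megabytes|
  puts "ALERT: #{command} is using #{megabytes.round} MB"
end
```

Passing the pattern as an argument lets the same check cover associated processes, such as a Ferret server, with a different regular expression.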

Why

This is often an easy problem to fix: if finding the leak is hard (and it usually is), you can do a scheduled restart. If you are constantly using a lot of swap space, you probably need more memory (that’s cheap compared to development hours).

4. Fixing slow requests

So, Scout sends you an alert regarding a slow web request – now what?

Install the Query Reviewer Rails Plugin

As stated earlier, most of our performance issues are related to the database, and the Query Reviewer Plugin does a tremendous job of finding issues with MySQL and benchmarking the entire request cycle. The key feature of this plugin is that the query information is embedded directly in the view.

The Optimization Process

We use the following process when Scout identifies a slow web request:

1. Log in to Scout and view the data across the slow Rails requests, CPU load, and memory usage plugins. If the CPU load is high, the memory usage of our Mongrel processes is high, or the % of swap space used is unreasonably high, other issues could be impacting this slow request. We may restart our Mongrel processes or check on any background jobs that are running and re-run the request.

2. Re-run the slow request in our local environment, seeing if we can replicate the issue. Make sure the MySQL Query Reviewer plugin is enabled.

3. Review the information provided by the MySQL Query Reviewer plugin and massage your SQL queries.

4. Repeat steps 2 and 3 until performance is acceptable.

Summary

We’ve seen lots of people waste time tracing the Rails stack for performance issues when the cause is usually much simpler. Look at the obvious places first before digging through the Rails stack.
