Like many companies, we build and iterate quickly, releasing new or improved features almost every week.

With each change, we try to be thoughtful about adding indexes, deleting dead code, and refactoring. But performance is not the primary objective; improving and testing the product is. We’re OK with a little speed-debt creeping in, and we combat it by taking a few days every month to focus on speed.

Our goal last week was to spend two days halving average load times from 350ms to ~180ms.

We started with our most important page, the questions page. The code revealed so many opportunities! We added missing indexes, deleted unnecessary database lookups, and deferred loading of some components till after the page load.

A few hours of work sped up this one page by 20%. Great, but far from our goal.

We turned to New Relic, and the real problem became clear. Here’s a breakdown of the most time-consuming function calls for rendering the page:

A breakdown of the 10 most time-consuming functions on this page

Seven of the ten relate to compiling templates. In all, this page uses 65 separate templates, 22 of which are compiled in more than 1 out of every 5 requests.

A separate template for each potentially reusable bit of UI is great for DRY and consistency, but apparently not for performance.

We discovered that Flask inherits Jinja2's default limit of 50 templates for the template cache. With scores of separate templates required by each view, each page would result in dozens of cache misses and recompilations.
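To see why a 50-entry limit hurts so much with 65 templates, here is a minimal simulation of a least-recently-used cache. It is a simplified model, not Jinja2's actual cache code: the `TemplateCache` class, the template names, and the assumption that every request touches the same 65 templates in the same order are all illustrative.

```python
from collections import OrderedDict


class TemplateCache:
    """Toy LRU cache standing in for a compiled-template cache (hypothetical)."""

    def __init__(self, capacity=None):
        self.capacity = capacity  # None means unbounded, like a plain dict
        self._items = OrderedDict()
        self.misses = 0

    def get(self, name):
        if name in self._items:
            self._items.move_to_end(name)  # mark as most recently used
            return self._items[name]
        # Cache miss: this is where a real cache would recompile the template.
        self.misses += 1
        compiled = f"compiled:{name}"
        self._items[name] = compiled
        if self.capacity is not None and len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used
        return compiled


def misses_for(capacity, template_count=65, requests=5):
    """Count misses when each request renders the same templates in order."""
    cache = TemplateCache(capacity)
    for _ in range(requests):
        for i in range(template_count):
            cache.get(f"template_{i}.html")
    return cache.misses


print(misses_for(capacity=50))    # 325: every lookup misses
print(misses_for(capacity=None))  # 65: only the first request misses
```

In this model the bounded cache misses on every single lookup, because each template is evicted shortly before it is needed again, while the unbounded cache only misses during the first request. Real traffic is less regular, so actual miss rates will differ, but the thrashing pattern is the same.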

To confirm that the template cache was the problem, we set up a profiler. Switching routes consistently triggered scores of calls into `jinja2/[nodes|lexer|compiler].py`, while loading further pages within the same route did not.
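A check along these lines can be sketched with the standard library's `cProfile`. This is one possible approach, not necessarily the setup described above: `count_calls_in` is a hypothetical helper, and the commented usage assumes a Flask test client and a `/questions` route.

```python
import cProfile
import pstats


def count_calls_in(func, path_fragment):
    """Run func under cProfile and count calls into files matching path_fragment."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stats = pstats.Stats(profiler)
    total = 0
    # stats.stats maps (filename, line, function name) to
    # (primitive calls, total calls, total time, cumulative time, callers).
    for (filename, _line, _name), (_cc, ncalls, *_rest) in stats.stats.items():
        if path_fragment in filename:
            total += ncalls
    return total


# Hypothetical usage with a Flask test client:
# client = app.test_client()
# print(count_calls_in(lambda: client.get("/questions"), "jinja2"))
```

A count that stays high on repeated requests to different routes, but drops when the same route is requested again, would be consistent with templates being evicted and recompiled.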

We decided to change the cache limit. Unfortunately, contrary to the usual ease of use of Flask, we found no clean and easy way to do this. Instead, we removed the cache limit entirely by adding this line before `app.run()`:

app.jinja_env.cache = {}

This one line replaces the cache with a plain dictionary, which removes the limit on the number of cached templates.
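For a bounded alternative, Flask exposes a `jinja_options` attribute whose entries are passed to the Jinja2 `Environment` when it is first created, and `cache_size` is one of that constructor's parameters. The sketch below is an alternative to the one-liner above, not the change described in this post; it assumes the options are set before anything touches `app.jinja_env`, and the value 200 is an arbitrary placeholder.

```python
from flask import Flask

app = Flask(__name__)

# Assumption: this runs before the first access to app.jinja_env,
# because the environment is built lazily and only once.
# 200 is an arbitrary example value; a cache_size of -1 removes the limit.
app.jinja_options = {**app.jinja_options, "cache_size": 200}

# The environment now uses an LRU cache with the requested capacity.
print(app.jinja_env.cache.capacity)
```

Keeping a limit, just a larger one, puts a ceiling on memory use if the number of templates grows, at the cost of having to pick a number that comfortably exceeds what the busiest pages need.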

The profiler immediately showed that templates would only be compiled the first time a route was loaded.

We deployed the change and got the results below (ignore the average ms on each chart; it covers the entire time range):

The questions page

The landing page

The course page

This one line brought us below our goal, without any of the other changes we made to the code.