We’ve all seen articles that explain why server X or library Y is the best thing since sliced bread. You’ve heard your colleagues talk about how everyone not using gevent is plain old silly. You’ve seen articles showing Bjoern serve over 100,000 requests per second, yet once you deploy it you have trouble serving more than ten. And is ten even bad?

How fast is “fast”?

Reading through piles of synthetic benchmarks is only useful if you ever intend to build a WSGI stack of your own. There is no single definition of fast. Optimizing things before they become bottlenecks is pointless. The only fast you should care about is fast enough.

Bjoern is lightning fast but only if your application is I/O-bound and only needs a handful of CPU cycles to complete. This is also where other libev-, gevent- and asyncio-based servers shine.

Django templates, however, are quite CPU-hungry. In a real-world Django application the only URL light enough to reach those speeds is probably your health check endpoint. And while serving health checks at one million requests per minute is tempting, it’s probably not a business requirement.

Before you can test if your site is fast you need to understand your traffic and choose what to test.

Fix your code first

Before you decide that you absolutely need to switch your production environment to the most popular stack of the month, make sure your application is not the problem. Unless you’re using Django’s runserver command in production, chances are your current platform is not the only thing to blame.

Install the Opbeat agent and learn which views contribute the most to your overall response time. Are they slow because your database server is slow? Or maybe you just execute far more queries than you need. Make extensive use of tools such as select_related and prefetch_related and know when to use which. It’s more complex than “foreign key vs. many-to-many.”
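
To make the distinction concrete, here is a minimal sketch (the Author, Tag and Book models are made up for illustration): select_related follows single-valued relations such as foreign keys with a SQL join, while prefetch_related fetches multi-valued relations such as many-to-many fields in a separate query.

```python
from django.db import models


class Author(models.Model):
    name = models.CharField(max_length=100)


class Tag(models.Model):
    name = models.CharField(max_length=50)


class Book(models.Model):
    title = models.CharField(max_length=200)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)
    tags = models.ManyToManyField(Tag)


def naive_listing():
    # One query for the books, plus one query per book for the author and
    # another per book for the tags: the classic N+1 problem.
    for book in Book.objects.all():
        print(book.author.name, [tag.name for tag in book.tags.all()])


def efficient_listing():
    # select_related joins the author in the same query; prefetch_related
    # pulls all tags in one extra query and matches them up in Python.
    books = Book.objects.select_related("author").prefetch_related("tags")
    for book in books:
        print(book.author.name, [tag.name for tag in book.tags.all()])
```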

Use New Relic to profile a running production instance and understand where it spends most of its processing time.

Pushing everything to a cache server may look tempting but it should be your last resort. Caching causes bugs that are hard to reproduce. Hard-to-reproduce bugs cause frustration, make you lose customers and hurt your productivity. There are only two hard things in computer science and one of them is cache invalidation.
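
If you do reach for the cache, keep invalidation explicit and visible. Below is a minimal sketch using Django’s low-level cache API; the Product model, the view and the timeout are all hypothetical.

```python
from django.core.cache import cache
from django.shortcuts import get_object_or_404, render

from .models import Product  # hypothetical model, for this sketch only

CACHE_TIMEOUT = 15 * 60  # seconds; an arbitrary value for illustration


def product_detail(request, product_id):
    # Cache the expensive part (the rendered context), keyed per product.
    key = "product-detail-%s" % product_id
    context = cache.get(key)
    if context is None:
        product = get_object_or_404(Product, pk=product_id)
        context = {"name": product.name, "price": product.price}
        cache.set(key, context, CACHE_TIMEOUT)
    return render(request, "product_detail.html", context)


def rename_product(product, new_name):
    product.name = new_name
    product.save()
    # The hard part: every code path that changes a product has to remember
    # to drop the stale entry, or users see old data until it expires.
    cache.delete("product-detail-%s" % product.pk)
```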

Once you’re sure the application is well behaved and certainly not a performance problem in itself, you can move to load testing.

Load testing

Use Google Analytics to understand what your traffic peaks are and which pages are accessed the most during your busiest hours. This will give you an idea of your current traffic. Are you planning a large sale? How many impressions do you plan to buy for the upcoming ad campaign? You need to understand the traffic you’re dealing with before you can tell if your site is fast enough.

Pick the most representative URL and use Siege or wrk. Or use tools like Locust to create a stress scenario that reflects your actual traffic. Choosing a good testing scenario is important to get meaningful answers. If you run an online store your checkout page is going to see far less traffic than your product list or your product details page.
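
For example, a minimal Locust scenario might look like the sketch below; the URLs and task weights are placeholders that you should replace with whatever your analytics say people actually visit.

```python
# locustfile.py - URLs and weights are placeholders, not recommendations.
from locust import HttpUser, task, between


class StoreVisitor(HttpUser):
    # Simulated visitors pause between requests, like real ones do.
    wait_time = between(1, 5)

    @task(10)
    def browse_products(self):
        self.client.get("/products/")

    @task(5)
    def view_product(self):
        self.client.get("/products/1/")

    @task(1)
    def checkout(self):
        self.client.get("/checkout/")
```

Run it with locust -f locustfile.py --host https://staging.example.com (pointing at your own test host, of course) and pick user counts that reflect your busiest hour rather than a number you’d like to brag about.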

Make sure you test your target platform and not your local machine. Your shiny quad-core laptop with 32 GB of RAM can handle significantly higher traffic than a middle-tier AWS EC2 instance. The loopback virtual network interface handles traffic and buffering differently than a real network card.

Make sure the machine you’re using to test has the same architecture, number of CPU cores and amount of RAM as your production servers. An extra CPU core can result in much higher threading performance even if your application is single-process and bound by the GIL.

When comparing frameworks, make sure you’re comparing them on a level playing field. Avoid using a reverse proxy even if your production environment is behind a load balancer. Auto-scaled load balancers like the ones provided by AWS will limit your performance unless your tests last long enough for the scaling mechanism to trigger. It’s ok to test over the internet as long as your connection is reliable and your network latency does not change between test runs.

Start with your expected traffic. Make sure you don’t overshoot, as congestion will result in much lower measured throughput and longer response times. Check the response times and make sure the WSGI server is configured correctly. If you’re unsure how many workers to use (be it threads, processes, eventlets, greenlets or what have you), start with one and slowly increase, repeating the test and observing the results after each change.
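
If your stack happens to be Gunicorn, that step-by-step tuning can live in a small config file. The sketch below is a starting point with deliberately conservative values, not a recommendation:

```python
# gunicorn.conf.py - a starting point for incremental tuning, not a recipe.
bind = "0.0.0.0:8000"

# Start with a single worker, run your load test, then raise this one step
# at a time and repeat the exact same test after every change.
workers = 1

# The worker class is part of the experiment too: "sync", "gthread" and
# "gevent" behave very differently under the same load.
worker_class = "sync"

# Log requests to stdout so you can correlate latency spikes with
# individual slow requests while testing.
accesslog = "-"
```

Start it with gunicorn -c gunicorn.conf.py yourproject.wsgi (substituting your own WSGI module) and keep everything else fixed between runs so the worker count is the only variable.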

Never trust any magic formulas like “2 × CPU cores + 1.” No two applications are the same and strangers on Stack Overflow can’t possibly predict how your code behaves. If there was a single best way we’d only have one WSGI server and it would not take any parameters.

What to look for

Throughput is not everything. Serving an average of fifty requests per second is useless if your average latency is five seconds.
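
Also look beyond the average: most load testing tools report latency percentiles, and the tail is where problems hide. A tiny sketch with made-up numbers shows why:

```python
# Made-up response times in seconds; in practice these come from your
# load testing tool's output.
latencies = sorted([0.09, 0.11, 0.12, 0.14, 0.15, 0.18, 0.22, 0.35, 1.80, 4.90])

average = sum(latencies) / len(latencies)
p95 = latencies[int(0.95 * (len(latencies) - 1))]

# The average looks acceptable while the slowest requests are painful.
print("average: %.2fs  p95: %.2fs  worst: %.2fs" % (average, p95, latencies[-1]))
```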

Different applications require different service levels. It’s probably ok if your payment form takes two seconds to process. It’s most likely not ok if your online painting application takes two seconds to place a dot.

If you can handle your expected traffic while keeping latency under a second then you’re probably already fast enough. If you can handle twice that traffic while keeping the latency under a second then you can stop reading this and find yourself a real problem to tackle.

Lastly, if you have the budget to scale, consider solving your performance problems with money. Scaling vertically will only allow you to grow so much but horizontal scaling lets you evenly distribute load between many servers even if none of them are particularly powerful. Multiple servers increase robustness and provide fault tolerance. Additional resources may not improve your code but they will allow you and your team to keep using the tools you’re comfortable with. This includes being able to hire people who have no idea what PyPy or Meinheld are.

Join me next time in part 2 where I’ll use Saleor to compare a number of popular WSGI stacks and Python versions. I’ll also talk about the steps you should take before you decide to change your platform.