HTTP/1.1 and HTTP/2: A Performance Comparison for Python

If you don't pay any attention to my Twitter feed, you might have missed the fact that I have spent the last few months working on a client-side HTTP/2 stack for Python, called hyper. This project has been a lot of fun, and a gigantic amount of work, but it has finally begun to reach a stage where some of the more egregious bugs have been worked out.

For this reason, I think it's time to begin analysing the relative performance of HTTP/1.1 and HTTP/2 in some example use-cases, to get an idea of where things stand.

Like any good scientist, I don't want to just dive in and explore: I first want to establish what I expect to see. These expectations come from two places: familiarity with hyper, and familiarity with HTTP in general.

My expectation is that hyper is, in its current form, going to compare to the standard Python HTTP stack as follows:

- hyper will be more CPU intensive
- hyper will be slower
- hyper will increase the amount of data sent on the network for workloads involving a small number of HTTP requests
- hyper will decrease the amount of data sent on the network for workloads involving a large number of HTTP requests

This is for the following reasons. Firstly, hyper will consume more CPU because it has substantially more work to do than a standard HTTP stack. hyper needs to process each HTTP/2 frame (of which there will be at least 4 per request-response cycle), burning CPU all the while to do so. Conversely, the standard HTTP/1.1 stack in Python can do relatively little work, reading headers line-by-line and then the body in one go, requiring almost no transformation between wire format and in-memory representation.
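To make the per-frame work concrete, here is a minimal sketch of the kind of header parsing an HTTP/2 stack has to do for every frame, assuming the 8-byte frame header used by the h2 drafts (a 14-bit length, a type byte, a flags byte, and a 31-bit stream identifier). The function name is illustrative, not hyper's actual internals:

```python
import struct

def parse_frame_header(header: bytes):
    """Split an 8-byte draft-HTTP/2 frame header into its fields."""
    length_and_r, frame_type, flags, stream_id = struct.unpack("!HBBL", header)
    length = length_and_r & 0x3FFF   # top two bits of the length are reserved
    stream_id &= 0x7FFFFFFF          # top bit of the stream ID is reserved
    return length, frame_type, flags, stream_id

# A DATA frame header: 5-byte payload, type 0x0, END_STREAM flag, stream 1.
header = struct.pack("!HBBL", 5, 0x0, 0x1, 1)
print(parse_frame_header(header))  # (5, 0, 1, 1)
```

Every request-response cycle involves at least four of these frames, and this parsing is pure CPU work that an HTTP/1.1 stack simply never does.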

Secondly, hyper will be slower because it has to cross from user-space to kernel-space and back again twice per frame read. This is because hyper needs to read 8 bytes from the wire (to find out the frame length), followed by the data for the frame itself. This context-switching is expensive, and not something that needs to be done in quite the same way for HTTP/1.1.
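The two-reads-per-frame pattern described above can be sketched as follows. The in-memory stand-in for a socket is just there to make the example self-contained; with a real socket the two `recv_exact` calls are the two trips into the kernel:

```python
import io

def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv may return short reads."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        data += chunk
    return data

def read_frame(sock):
    # First read: the fixed 8-byte header tells us the payload length.
    header = recv_exact(sock, 8)
    length = int.from_bytes(header[:2], "big") & 0x3FFF
    # Second read: only now do we know how much payload to ask for.
    payload = recv_exact(sock, length)
    return header, payload

class BytesSock:
    """Minimal stand-in for a socket, reading from an in-memory buffer."""
    def __init__(self, data):
        self._buf = io.BytesIO(data)
    def recv(self, n):
        return self._buf.read(n)

frame = bytes([0, 5, 0, 1, 0, 0, 0, 1]) + b"hello"
header, payload = read_frame(BytesSock(frame))
print(payload)  # b'hello'
```

An HTTP/1.1 body, by contrast, can be slurped in large reads once the headers are parsed, so far fewer of these round-trips into the kernel are needed.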

For workloads involving a small number of requests, HTTP/2 does not provide particular bandwidth savings or improve network efficiency. The bandwidth savings provided by HTTP/2 come from header compression, which is at its most effective when sending and receiving multiple requests/responses with very similar headers. For small numbers of requests, this provides little saving. The network efficiency savings come from having long-lived TCP connections resize their connection window appropriately, but this benefit will be lost when sending relatively small numbers of requests. As the cherry on top of this cake, there's some additional HTTP/2 overhead in the form of framing and window management which will lead to HTTP/2 needing to send more bytes than HTTP/1.1 does.
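A back-of-the-envelope sketch of that fixed framing overhead, assuming the draft's 8-byte frame headers and the minimum of four frames per request-response cycle mentioned above (the real total is higher still, since it ignores the connection preface, SETTINGS frames, and window updates):

```python
FRAME_HEADER_BYTES = 8       # draft h2 frame header size
MIN_FRAMES_PER_EXCHANGE = 4  # HEADERS + DATA in each direction

def framing_overhead(num_requests):
    """Fixed bytes added purely by frame headers, ignoring connection setup."""
    return num_requests * MIN_FRAMES_PER_EXCHANGE * FRAME_HEADER_BYTES

print(framing_overhead(1))    # 32
print(framing_overhead(100))  # 3200
```

For a single request those 32 bytes are pure loss relative to HTTP/1.1; the bet HTTP/2 makes is that header compression claws this back, and more, as the request count grows.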

HTTP/2's major win should be in the area of workloads with large numbers of requests. Here, HTTP/2's header compression and long-lived connections should be expected to provide savings in network usage.
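HTTP/2's header compression gets most of its savings by indexing headers it has already seen, so a repeated header costs only a tiny back-reference. The following is a toy model of that idea, not HPACK's actual wire format, but it shows why savings grow with the number of similar requests:

```python
class ToyHeaderCompressor:
    """Toy indexed-header model: a repeated header costs one index byte."""
    def __init__(self):
        self.table = {}

    def encode(self, headers):
        out = bytearray()
        for name, value in headers:
            key = (name, value)
            if key in self.table:
                out.append(self.table[key])  # 1-byte back-reference
            else:
                self.table[key] = len(self.table) + 1
                literal = f"{name}: {value}".encode()
                out.append(0)                # marker: literal follows
                out.append(len(literal))
                out += literal
        return bytes(out)

enc = ToyHeaderCompressor()
headers = [(":method", "GET"), (":path", "/"), ("user-agent", "hyper")]
first = enc.encode(headers)
second = enc.encode(headers)  # same headers again: fully indexed
print(len(first), len(second))  # 43 3
```

The first request pays full price for its headers; every identical request after it costs a few bytes, which is exactly why the win only materialises over long-lived connections carrying many requests.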

These are my expectations. Let's dive in and see what we can see.

The Set Up

First, I need to install hyper. Because of some ongoing issues with upstream dependencies, I will be running this test in Python 3.4 using the h2-10 branch of hyper (which, despite its name, implements the h2-12 implementation draft of HTTP/2). As such, I went away and installed that branch using pip.