3x faster Flask apps

Quart as an upgrade to Flask

Python has evolved since Flask was first released around 8 years ago, particularly with the introduction of asyncio. Asyncio has allowed for the development of libraries such as uvloop and asyncpg that are reported (here, and here) to improve performance far beyond what was previously possible. Sadly, Flask is not easily combined with asyncio or these libraries. However, the Flask API can be used with asyncio via the Quart framework.

Quart provides the easiest transition for Flask apps to asyncio, as it shares the Flask API. This means that existing Flask apps can evolve into Quart apps with very little effort and then use these new libraries to gain performance improvements that are not possible with Flask.

This document details the transition from Flask to Quart for a typical production CRUD app and shows the performance improvements for a typical production deployment.

tl;dr

Upgrading this Flask-psycopg2 app to a Quart-asyncpg app gives a performance speedup of roughly 3x (2x to 3.5x depending on the route) without requiring a major rewrite or adjustment of the code; see the full diff. Summarised results:

Route            | Flask [req/s] | Quart [req/s] | Ratio (Quart/Flask)
-----------------------------------------------------------------------
GET /films/995/  | 330           | 1160          | 3.5
GET /films/      | 99            | 195           | 2.0
POST /reviews/   | 325           | 1114          | 3.4

The App

For this comparison I’m going to consider a simple app that provides a RESTful interface to a database. This is a common use case in a microservices architecture and makes for a very simple codebase to compare.

The app has two blueprints consisting of three routes in total. These routes are meant to represent typical CRUD usage, i.e. GET /films/<int:id>/ fetches a single resource, GET /films/ fetches all resources, and POST /reviews/ creates a new resource.

The source code is available at https://github.com/pgjones/faster_than_flask_article. Note that there are two commits: a Flask version and a Quart version.

Evolution from Flask to Quart

Evolving from Flask to Quart is meant to be easy and to require a minimal set of changes: specifically, imports from flask are changed to imports from quart, and functions become async. An example from the full diff is,

def add_review():
    data = request.get_json()
    ...

which becomes,

async def add_review():
    data = await request.get_json()
    ...
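The benefit of the await is that a handler waiting on I/O yields to other handlers rather than blocking them. The following is a minimal, framework-free sketch of that behaviour; StubRequest is a hypothetical stand-in for a request object (it is not Quart's class), the 50 ms delay is an invented stand-in for network I/O, and asyncio.run assumes Python 3.7 or later.

```python
import asyncio


class StubRequest:
    """Hypothetical stand-in for a framework request; not Quart's class."""

    def __init__(self, payload):
        self._payload = payload

    async def get_json(self):
        # Simulate waiting for the request body to arrive (invented delay).
        await asyncio.sleep(0.05)
        return self._payload


async def add_review(request):
    # While this handler awaits, the event loop can run other handlers.
    data = await request.get_json()
    return {"film_id": data["film_id"], "rating": data["rating"]}


async def handle_many(count):
    requests = [StubRequest({"film_id": i, "rating": 5}) for i in range(count)]
    # All handlers wait on their (simulated) I/O concurrently.
    return await asyncio.gather(*(add_review(r) for r in requests))


if __name__ == "__main__":
    results = asyncio.run(handle_many(20))
    print(len(results), "reviews handled")
```

Handled one after another, the 20 simulated requests would spend about a second waiting in total; handled concurrently, the waits overlap.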

Evolution from psycopg2 to asyncpg

Evolving the psycopg2 code to use asyncpg is a little more involved, as the two are used differently. To simplify the diff, a PoolWrapper is used in the Flask app to give a context-managed psycopg2 connection with the same API as asyncpg, i.e. the Flask app writes with pool.acquire() as connection: to obtain a connection. This allows asyncpg to be used by changing the with to async with.
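A wrapper of this kind could look roughly like the sketch below. This is not the article's actual PoolWrapper, which is best read in the repository; it assumes only that the wrapped pool exposes getconn() and putconn(), as psycopg2's pool classes do, and FakePool is a hypothetical stand-in so the sketch runs without a database.

```python
from contextlib import contextmanager


class PoolWrapper:
    """Hypothetical sketch; the article's real PoolWrapper may differ."""

    def __init__(self, pool):
        # `pool` is assumed to expose getconn() and putconn(connection).
        self._pool = pool

    @contextmanager
    def acquire(self):
        connection = self._pool.getconn()
        try:
            yield connection
        finally:
            # Always hand the connection back, even if the body raised.
            self._pool.putconn(connection)


class FakePool:
    """Stand-in pool (not psycopg2) holding placeholder connections."""

    def __init__(self):
        self.free = ["conn-1", "conn-2"]

    def getconn(self):
        return self.free.pop()

    def putconn(self, connection):
        self.free.append(connection)


if __name__ == "__main__":
    pool = PoolWrapper(FakePool())
    with pool.acquire() as connection:
        print("using", connection)
```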

Aside from the connection, asyncpg and psycopg2 also differ in cursor usage, transactions, execution arguments, and query formatting. These are mostly differing conventions, and the details are best seen in the diff.
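As one illustration of the query formatting difference: psycopg2 uses %s placeholders with the arguments passed as a single sequence, whereas asyncpg uses PostgreSQL's native $1, $2, ... placeholders with the arguments passed positionally. The helper below is a hypothetical, deliberately naive sketch of the translation (it is not part of either library and does not handle named %(name)s placeholders).

```python
import re


def to_asyncpg_placeholders(query):
    """Rewrite psycopg2-style %s placeholders as $1, $2, ... (naive sketch)."""
    counter = 0

    def replace(match):
        nonlocal counter
        if match.group(0) == "%%":
            # psycopg2 escapes a literal percent sign as %%.
            return "%"
        counter += 1
        return f"${counter}"

    return re.sub(r"%%|%s", replace, query)


if __name__ == "__main__":
    psycopg2_query = "INSERT INTO review (film_id, rating) VALUES (%s, %s)"
    print(to_asyncpg_placeholders(psycopg2_query))
    # psycopg2 style: cursor.execute(psycopg2_query, (film_id, rating))
    # asyncpg style:  await connection.execute(converted_query, film_id, rating)
```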

Deployment

Exposing a Flask app directly to production traffic is unlikely to scale and doesn’t really represent a typical production environment. This is because Flask on its own handles only a single request at a time. Instead, a WSGI server is typically used in combination with some kind of asynchronous worker, for example Gunicorn with eventlet.

Quart is also best deployed with Gunicorn, which allows the same command to be used to run both the Flask and Quart apps,

$ gunicorn --config gunicorn.py 'run:create_app()'

with the config file differing only in terms of which worker is used (eventlet for Flask and the Quart-UVLoop worker for Quart).
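A Gunicorn config file is itself Python, so the difference between the two deployments could be as small as the sketch below. The values are assumptions rather than the article's actual gunicorn.py: the bind address is chosen to match the wrk commands used later, and the dotted path of the Quart worker is an assumption about Quart 0.3.x that should be checked against the installed version.

```python
# gunicorn.py (sketch): only worker_class is meant to differ between the apps.

# Assumed bind address, chosen to match the wrk commands in this article.
bind = "localhost:5000"

# Flask app: Gunicorn's eventlet worker.
worker_class = "eventlet"

# Quart app: use the line below instead. The dotted path is an assumption
# about Quart 0.3.x; verify it against the Quart version in use.
# worker_class = "quart.worker.GunicornUVLoopWorker"
```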

The performance measurements were taken with both the Flask and Quart apps running behind Gunicorn.

Database

The PostgreSQL sample database is used to give the apps something to act against in a CRUD manner. It is unchanged save for the addition of a simple review table,

CREATE TABLE review (
    film_id INTEGER REFERENCES film(film_id),
    rating INTEGER
);
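To see what the foreign key on film_id implies for POST /reviews/, here is a small self-contained sketch that uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The film table is reduced to two hypothetical columns and the film title is invented; only the review table mirrors the definition above.

```python
import sqlite3


def build_demo_db():
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    # Simplified, hypothetical film table (the sample database has more columns).
    conn.execute("CREATE TABLE film (film_id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute(
        """CREATE TABLE review (
            film_id INTEGER REFERENCES film(film_id),
            rating INTEGER
        )"""
    )
    conn.execute("INSERT INTO film (film_id, title) VALUES (?, ?)", (995, "Example Film"))
    conn.commit()
    return conn


def add_review(conn, film_id, rating):
    """Insert a review; return False if the film does not exist."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO review (film_id, rating) VALUES (?, ?)",
                (film_id, rating),
            )
    except sqlite3.IntegrityError:
        return False
    return True


if __name__ == "__main__":
    db = build_demo_db()
    print(add_review(db, 995, 4))  # review for an existing film
    print(add_review(db, 1, 5))    # review for a film that is not in the table
```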

Performance Measurement

To measure the performance of the apps, wrk is used. It is configured to use 20 connections to match the database connection pool size (this should ensure the highest throughput, and 20 is a typical value I’ve used). To measure the GET requests, the command is,

$ wrk --connections 20 --duration 5m http://localhost:5000/${PATH}/

whereas for POST requests the command is,

$ wrk --connections 20 --duration 5m --script post.lua http://localhost:5000/${PATH}/

with the post.lua file defined here.

Test system

The database (Postgres 9.5.10), wrk (4.0.0), and the apps (Python 3.6.3, with requirements asyncpg 0.13.0, Flask 0.12.2, Gunicorn 19.7.1, psycopg2 2.7.3.2, and Quart 0.3.1) all run on a single AWS c4.large machine.

Results

The full results are shown in the table below,

Route            | Flask [req/s] | Quart [req/s] | Flask latency [ms] | Quart latency [ms]
-------------------------------------------------------------------------------------------
GET /films/995/  | 330.22        | 1160.27       | 60.55              | 17.23
GET /films/      | 99.39         | 194.58        | 201.14             | 102.76
POST /reviews/   | 324.49        | 1113.81       | 61.60              | 18.22

Note that Quart serves between 2 and 3.5 times more requests per second, with a corresponding 2 to 3.5 times reduction in the average latency.
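The throughput and latency figures move together because wrk keeps a fixed number of connections busy: by Little's law, requests per second is approximately the number of connections divided by the average latency. The sketch below recomputes the ratios from the table above and checks this relation, assuming all 20 connections were kept busy for the whole run.

```python
CONNECTIONS = 20  # from the wrk --connections flag used in this article

# (requests per second, average latency in ms), copied from the results table.
RESULTS = {
    "GET /films/995/": {"flask": (330.22, 60.55), "quart": (1160.27, 17.23)},
    "GET /films/": {"flask": (99.39, 201.14), "quart": (194.58, 102.76)},
    "POST /reviews/": {"flask": (324.49, 61.60), "quart": (1113.81, 18.22)},
}


def predicted_rps(latency_ms, connections=CONNECTIONS):
    """Little's law: throughput = requests in flight / time per request."""
    return connections / (latency_ms / 1000)


def speedups(route):
    """Return (throughput ratio, latency ratio) of Quart relative to Flask."""
    flask_rps, flask_ms = RESULTS[route]["flask"]
    quart_rps, quart_ms = RESULTS[route]["quart"]
    return quart_rps / flask_rps, flask_ms / quart_ms


if __name__ == "__main__":
    for route, frameworks in RESULTS.items():
        throughput_ratio, latency_ratio = speedups(route)
        print(f"{route:<16} throughput x{throughput_ratio:.2f}, latency x{latency_ratio:.2f}")
        for name, (rps, latency_ms) in frameworks.items():
            print(f"  {name}: measured {rps:.0f} req/s, "
                  f"predicted {predicted_rps(latency_ms):.0f} req/s")
```

The predicted values land within a couple of percent of the measured ones, which is consistent with the requests being limited by time spent waiting per request rather than by the load generator.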

Conclusion

The evolution of a Flask app to a Quart app is quite easy, as the shared API mostly means the effort is writing async & await in the correct places. The evolution from psycopg2 to asyncpg is, however, more involved and likely troublesome if SQLAlchemy (or another ORM) is used.

The performance of the app is noticeably improved. This improvement is mostly due to the usage of asyncpg and uvloop, which Quart enables. Quart alone is estimated to offer a more modest 1.5 times speedup.

In conclusion, the transition from a Flask-psycopg2 app to a Quart-asyncpg app is fairly low effort and gives a very reasonable performance improvement. This is likely to extend to any of the other asyncio-based libraries, meaning that Quart enables a minimal-effort transition for Flask apps to the asyncio ecosystem.