Python allows for several different approaches to parallel processing; the main challenge is knowing their limitations. Broadly, we want to parallelise either IO operations or CPU-bound tasks like image processing. The first use case is what we focused on at the recent Python Weekend*, and this article provides a summary of what we came up with.

Before Python 3.5, there were two ways of parallelising IO-bound operations. The native method was to use multithreading, while the non-native method involved frameworks like Gevent that schedule concurrent tasks as micro-threads. Then Python 3.5 brought native async/await syntax for coroutines on top of asyncio. I was curious to see how each of these would perform in terms of memory footprint. Find out the results below 👇

Prepare a testbed

For this purpose, I created a simple script. Even though it doesn't have a lot of functionality, it still demonstrates a real use case: it downloads bus ticket prices from a webpage for the next 100 days and prepares them for processing. Memory usage was measured with the memory_profiler module. The code is available in this GitHub repository.
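The preparation step can be sketched roughly like this: build one URL per departure date for the next 100 days. The endpoint and query parameter here are hypothetical, not the ones used in the actual repository.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Hypothetical endpoint -- the real script queries a bus-ticket price page.
BASE_URL = "https://example.com/prices"

def build_urls(start, days=100):
    """Return one URL per departure date, covering the next `days` days."""
    return [
        f"{BASE_URL}?{urlencode({'date': (start + timedelta(days=i)).isoformat()})}"
        for i in range(days)
    ]

urls = build_urls(date(2019, 1, 1))
```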

Let’s test!

Synchronous

I executed a single-threaded version of the script to act as a benchmark for the other solutions. Memory usage was pretty stable throughout the execution, and the obvious drawback was execution time: without any parallelism, the script took about 29 seconds.
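The synchronous baseline boils down to a plain loop that blocks on each request in turn. This is a minimal sketch, not the repository's code; the `fetch` parameter is an assumption added here so the download step can be swapped out.

```python
import urllib.request

def fetch_price(url):
    """Download one page synchronously (the call blocks on network IO)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def fetch_all(urls, fetch=fetch_price):
    """Sequential baseline: one request at a time, total time ~= sum of latencies."""
    return [fetch(url) for url in urls]
```

With 100 URLs and roughly 290 ms per round trip, the ~29 second total falls out directly from the sequential sum.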

Sequential memory usage

ThreadPoolExecutor

Multithreading is part of the standard library toolbox. Since Python 3.2, it is easily accessible via concurrent.futures.ThreadPoolExecutor, which provides a very simple API for parallelising existing code. Using threads comes with some drawbacks, however, and one of them is higher memory usage. On the other hand, a significant increase in the speed of execution is the reason we'd want to use them in the first place. The execution time of this test was ~17 sec, a big difference compared to ~29 sec for synchronous execution. The size of that difference depends on the speed of the IO operations, in this case network latency.