Multiple processes make the world go faster

DISCLAIMER: This is my first play with threads and processes. I am probably doing things wrong. All data used is collected by me from my own code. Your experience with multithreading and multiprocessing WILL be different to mine.

This is more a technical write-up of some work I did today.

So the plan was to use multithreading to have a few things going on at once in Legend of Tenebrae’s dungeon generation. As it turns out, Python and threads don’t play well together because of the Global Interpreter Lock (GIL), which only lets one thread execute Python bytecode at a time. The lock is needed because Python’s memory management isn’t thread safe. Multithreading is still possible, and I did get a multithreaded dungeon generator working, but it was taking longer than my benchmark standard of about 1.1 seconds (11 ticks). More on that later.
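To make the shared-memory side of threading concrete, here’s a minimal sketch. The `carve` function and the 3*3 grid are made-up stand-ins for illustration, not the real generator:

```python
import threading

def carve(section):
    # stand-in for the real generator: mark one tile as floor
    section[0][0] = 1

grid = [[0] * 3 for _ in range(3)]
worker = threading.Thread(target=carve, args=(grid,))
worker.start()
worker.join()
# threads share one memory space, so the main thread sees the change
print(grid[0][0])  # 1
```

Threads see each other’s memory for free, which is handy, but thanks to the GIL only one of them is actually executing Python at any moment, which is why this approach didn’t speed things up.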

I then decided to have a go at using multiple processes to split the work up. After a little fiddling around, the results were great! I’d reduced the time taken to 0.8 seconds (8 ticks), but a blank dungeon was being created. This confused the hell out of me. In Python, passing a list to a function passes a reference, so edits made through that reference show up in the original list. I was relying on this behaviour to generate the dungeons before I started multithreading/multiprocessing, and it still worked when I was using multithreading.
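In plain single-process Python, that reference behaviour looks like this (toy names, not the actual generator):

```python
def fill_row(dungeon, row):
    # edits made through the reference are visible to the caller
    dungeon[row] = [1, 1, 1]

dungeon = [[0, 0, 0] for _ in range(3)]
fill_row(dungeon, 1)
print(dungeon)  # [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
```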

Not, however, for multiprocessing. I was passing the array into the process, but the original differed from the “copy”. Each process gets its own memory space, so the child receives a copy of the list and any changes stay in the child. Apparently you can’t share the list without using a Manager, which, as its name suggests, manages shared objects, including lists. It does not, however, handle a list of lists, which is how I’m generating dungeons (only the top-level list will be managed; any changes to the lower lists will be ignored). The solution? Use the queue.
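Here’s a sketch of the copy problem, again with toy stand-ins, and assuming a Unix-style fork start method so it runs as-is:

```python
import multiprocessing as mp

def carve(section):
    # this runs in the child process, which only has a copy of the list
    section[0][0] = 1

def demo():
    grid = [[0] * 3 for _ in range(3)]
    child = mp.Process(target=carve, args=(grid,))
    child.start()
    child.join()
    # the parent's grid is untouched: the child mutated its own copy
    return grid

if __name__ == "__main__":
    print(demo())  # still all zeros
```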

The queue is a structure that processes (including the main process) can put things into and get things out of, and it can even be shared between processes! If each worker puts its generated part of the dungeon into the queue, I can retrieve everything when all sections are done. I cannot, however, guarantee what order the results will arrive in, and because it’s a queue, I can only grab the oldest data in it. To fix this, I’m using a hacky little trick: each worker adds a list of [<NameOfProcess>, <PartOfDungeonGenerated>] to the queue, and I can then check what data goes where using the name of the process, which is something I manually give.
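Roughly, the trick looks like this. `generate_section` is a dummy that just fills a slice with the seed value; the real generator is obviously more involved, and the two-worker split mirrors the tests below:

```python
import multiprocessing as mp

def generate_section(name, seed, results):
    # build this worker's half of the dungeon (dummy fill for illustration)
    section = [[seed] * 4 for _ in range(2)]
    # tag the result with the process name so the parent can route it
    results.put([name, section])

def build_dungeon():
    results = mp.Queue()
    workers = [
        mp.Process(target=generate_section, args=("top", 1, results)),
        mp.Process(target=generate_section, args=("bottom", 2, results)),
    ]
    for w in workers:
        w.start()
    # results arrive in whatever order the workers finish, so route by name
    parts = {}
    for _ in workers:
        name, section = results.get()
        parts[name] = section
    for w in workers:
        w.join()
    # stitch the named sections back together in the right order
    return parts["top"] + parts["bottom"]

if __name__ == "__main__":
    print(build_dungeon())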

Here’s a table of some quick tests I did to determine the fastest approach to generating dungeons. All dungeons are generated using the same seed and the same algorithm. Multithreading/processing splits the work between 2 threads/processes.

Size of dungeon (tiles)   Control (seconds)   Multithreading (seconds)   Multiprocessing (seconds)
10*10                     0.117               0.158                      0.524
100*100                   1.163               1.573                      0.704
500*500                   5.895               7.965                      5.328

As you can see from the table, multiprocessing is generally faster, but slower on the smallest dungeon (starting processes has a fixed overhead, which dominates when there’s little work to do). Most dungeons will be about 100*100 in size, so that’s the main test to be looking at – roughly 0.46 seconds faster than the control and less than half the multithreading time. I think that my MacBook was on the way out with the 500*500 test, as it froze and crashed for 600*600, which is probably why the times are so close. My next test will be using more processes to split the workload further, and find the optimal number of processes to use.
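For that next round of tests, a small timing helper along these lines should do. `perf_counter` is wall-clock timing; the generator being benchmarked is whatever function takes the dungeon size and process count (hypothetical here):

```python
import time

def benchmark(fn, *args, repeats=3):
    # run fn a few times and report the best wall-clock time in seconds
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best
```

Usage would be something like `for n in range(1, 9): print(n, benchmark(generate_dungeon, 100, n))`, where `generate_dungeon` stands in for the real generator. Taking the best of a few repeats smooths out one-off stalls like whatever my MacBook was doing at 500*500.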

Overall, I’m happy with the results!