I have recently been rewriting an old Objective-C game in Swift, Apple’s new language du jour. If you are familiar with Swift and how Apple have been promoting it, then you’ll know that Apple are pushing it as a protocol-oriented language and are espousing the use of value semantics rather than reference semantics where appropriate.

There have been a few presentations at WWDC on this topic. Last year’s talks included Protocol-Oriented Programming in Swift and Building Better Apps with Value Types in Swift, followed up this year by Protocol and Value Oriented Programming in UIKit Apps. All of these are worth a look if you’re getting stuck into Swift.

I’m not going to go over the benefits and challenges of a broad value-based approach here. Instead, I’m going to look at one specific aspect of value types that can hurt your performance significantly and perhaps unexpectedly, especially when coming from an object-based world.

My game, Mazy, has several maze-generating algorithms for the player to choose between, one of which is Kruskal. A quick outline looks like this:

Start with a grid of maze tiles where there are walls between every tile.

Assign each tile in the maze a unique ‘maze identifier’.

Choose a random tile from the maze.

If the selected tile is adjacent to a tile with a different ‘maze identifier’ then remove the wall between the tiles. Then update all tiles on either side of the wall to have the same identifier.

Repeat until all tiles have the same maze identifier.
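The outline above can be sketched in a few lines. This is a hypothetical illustration rather than the code from Mazy: the names Wall and generateMaze, the row-major tile index (y * width + x) and the use of Int.random (Swift 4.2 or later) are all assumptions. It relabels tiles with a plain scan over the whole grid on every merge.

```swift
// Hypothetical sketch of the outline above; not the code from Mazy.
struct Wall: Hashable {
    let a: Int  // index of one tile, computed as y * width + x
    let b: Int  // index of the adjacent tile (always greater than a)
}

func generateMaze(width: Int, height: Int) -> (identifiers: [Int], removedWalls: Set<Wall>) {
    // Every tile starts fully walled in, with its own unique identifier.
    var identifiers = Array(0..<(width * height))
    var removedWalls = Set<Wall>()
    var distinctIdentifiers = identifiers.count

    while distinctIdentifiers > 1 {
        // Choose a random tile, then one of its in-bounds neighbours.
        let x = Int.random(in: 0..<width)
        let y = Int.random(in: 0..<height)
        let candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
            .filter { $0.0 >= 0 && $0.0 < width && $0.1 >= 0 && $0.1 < height }
        guard let neighbour = candidates.randomElement() else { continue }

        let tile = y * width + x
        let other = neighbour.1 * width + neighbour.0
        let keep = identifiers[tile]
        let replace = identifiers[other]
        if keep == replace { continue }  // already connected, so leave the wall up

        // Remove the wall and relabel one side with a full scan of the grid.
        removedWalls.insert(Wall(a: min(tile, other), b: max(tile, other)))
        for i in identifiers.indices where identifiers[i] == replace {
            identifiers[i] = keep
        }
        distinctIdentifiers -= 1
    }
    return (identifiers, removedWalls)
}
```

Each merge reduces the number of distinct identifiers by one, so a grid of n tiles always ends with n - 1 walls removed.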

Here’s a video of Kruskal in action, showing the identifier for each tile in the maze:

Kruskal Maze Generation

The key here is that when a wall is removed the tiles on one side of the wall need to be updated to have the same maze identifier as the other side. We could scan over all the tiles in the maze to find matching identifiers, but this is rather expensive, particularly as we may have to do this several thousand times. To increase performance we need a mapping from identifier to tile positions.

When I first implemented this I took this simple approach for storing the mappings:

typealias MazeIdentifier = Int

struct MazePoint {
    let x: Int
    let y: Int
}

var tileMapping: [MazeIdentifier: [MazePoint]] = [:]

Which is to say, a simple dictionary (a.k.a. hash or map) from an integer to an array of x,y co-ordinates.
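As a rough sketch of how such a mapping might be seeded, every tile can start as the only member of its own group. The function name initialMapping and the row-major identifier (y * width + x) are assumptions made for illustration; Equatable is added only so the result is easy to check.

```swift
typealias MazeIdentifier = Int

struct MazePoint: Equatable {
    let x: Int
    let y: Int
}

// Hypothetical helper: gives each tile its own identifier and a one-element list.
func initialMapping(width: Int, height: Int) -> [MazeIdentifier: [MazePoint]] {
    var mapping: [MazeIdentifier: [MazePoint]] = [:]
    for y in 0..<height {
        for x in 0..<width {
            mapping[y * width + x] = [MazePoint(x: x, y: y)]
        }
    }
    return mapping
}
```

With this in place, finding every tile that carries a given identifier is a single dictionary lookup rather than a scan of the grid.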

Simple enough, but there’s hidden danger here!

Each time we break down a wall, we append the MazePoints of the tiles whose identifier has changed to the existing list for their new identifier (generationId in the code):

tileMapping[generationId]?.append(newMazePoint)

Looks fine? Well… Kinda… It works… but the performance is totally dreadful when adding large numbers of points.

If you’d like to follow the rest of this post with a project then clone this git repo.

Let’s start by appending 50,000 times to a single ID:

tileMapping[0] = []  // the key must exist first, or the optional chain below does nothing
for _ in 0..<50_000 {
    tileMapping[0]?.append(MazePoint(x: 1, y: 1))
}

If you’ve got the project click ‘Dictionary/Array Append’ to trigger this.

Hmm… that’s taking a while…

Oh... Dear…

15.7 seconds for 50,000 array appends. Hmm, that can’t be right, what’s going on here?
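If you don’t have the project to hand, a rough way to time the loop yourself might look like the sketch below. The helper name secondsTaken is an assumption made for illustration, and the figure you get will vary with the machine and the Swift version; newer compilers may well not show the slowdown at all.

```swift
import Foundation

typealias MazeIdentifier = Int
struct MazePoint { let x: Int; let y: Int }

// Hypothetical helper: runs a closure once and returns the wall-clock seconds it took.
func secondsTaken(_ work: () -> Void) -> TimeInterval {
    let start = Date()
    work()
    return Date().timeIntervalSince(start)
}

// The key is created up front so that the optional chain actually appends.
var tileMapping: [MazeIdentifier: [MazePoint]] = [0: []]
let elapsed = secondsTaken {
    for _ in 0..<50_000 {
        tileMapping[0]?.append(MazePoint(x: 1, y: 1))
    }
}
print("50,000 appends took \(elapsed) seconds")
```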

Let’s take a look at the official documentation (always a good idea!) for Array.append.

Complexity: Amortized O(1) unless self’s storage is shared with another live array; O(count) if self does not wrap a bridged NSArray; otherwise the efficiency is unspecified.

Interesting. It certainly looks like we’re not hitting O(1) here!

Let’s dig into what’s happening in more detail before jumping to conclusions. Instruments to the rescue! Fire up the time profiler, run the function and then invert the call tree. You can clearly see where the time has gone:

Hmm, interesting, there’s a lot of memory copying going on here. ‘Array_makeUniqueAndReserveCapacityIfNotUnique’ seems to be the key. Swift is open source, so we can see exactly what is going on here.

Fundamentally our problem appears to be that our ownership of the array is not unique, and Swift’s copy-on-write semantics cause a copy of the array to be made when we append to it.
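Copy-on-write is easy to see in isolation. In this minimal sketch (the variable names are just for illustration), assigning an array to a second variable shares the storage, and the first mutation through either variable triggers the copy:

```swift
let original = [1, 2, 3]
var shared = original   // no copy yet: both variables refer to the same storage
shared.append(4)        // the storage has two owners, so it is copied before the append

print(original)         // [1, 2, 3]
print(shared)           // [1, 2, 3, 4]
```

The copy costs O(count), which is harmless once but painful if it happens on every single append.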

But why is our ownership not unique? We’re just appending to an array, right? Doesn’t .append append in place?

My first approach to fixing this problem was to turn back to reference based semantics and replace the array with an NSMutableArray. This works as you’d expect:

But it’s pretty ugly, as you can’t add a struct to an NSArray; rather, you have to convert it to an object and back again (see MazePointObject in the git repo).

That’s just ignoring the cause of the issue anyway. Why are we getting this array copying, and can we avoid it?

Looking back at our append:

tileMapping[0]?.append(MazePoint(x:1, y:1))

Notice the ? there. That’s necessary because a dictionary lookup returns an Optional. This allows you to perform lookups on a dictionary for keys that might not be there. For example in the Swift REPL:

1> var myDict = ["a": 1234]
myDict: [String : Int] = 1 key/value pair {
  [0] = {
    key = "a"
    value = 1234
  }
}
2> print(myDict["a"])
Optional(1234)
3> print(myDict["b"])
nil

However, I think this is the source of our problem! tileMapping[0] does not return the value at ‘0’ itself, but rather an Optional wrapper around it. When we subsequently append to it, Swift doesn’t know that the value contained in the Optional wrapper is the very value stored in the dictionary, and so it must copy it.

I’m not sure if this is something that could be enhanced in the compiler one day…
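One way to picture this hypothesis is to write the append out by hand as a read, a mutation and a write back. This is only an illustration of the idea, not a claim about what the compiler literally emits:

```swift
typealias MazeIdentifier = Int
struct MazePoint { let x: Int; let y: Int }

var tileMapping: [MazeIdentifier: [MazePoint]] = [0: []]

// A hand-written get / mutate / set version of tileMapping[0]?.append(...).
if var points = tileMapping[0] {          // read: points shares storage with the dictionary's array
    points.append(MazePoint(x: 1, y: 1))  // the storage has two owners, so it is copied first
    tileMapping[0] = points               // write the modified array back into the dictionary
}
```

While points is alive, the dictionary still holds its own reference to the same storage, which is exactly the "shared with another live array" case from the documentation.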

What to do? Accessing arrays by index doesn’t return an Optional. If you access an array with an index that doesn’t exist your application will crash.

As the key of our mapping is an Integer, it fits easily into an Array model. If we redefine our mapping like this:

var tileMappingArray: [[MazePoint]] = []

Then we add arrays at each array index and finally our append looks like this:

tileMappingArray[0].append(MazePoint(x:1, y:1))

Executing that (Array/Array Append in the git project), we get:

Result! A fast, value based, solution to our problem!
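For completeness, here is a sketch of the setup step that the snippets above gloss over: the outer array needs one empty inner array per identifier before anything can be appended by index. The constant identifierCount is a hypothetical value, and Array(repeating:count:) is the Swift 3 and later spelling.

```swift
struct MazePoint { let x: Int; let y: Int }

let identifierCount = 4  // hypothetical: one slot per maze identifier
var tileMappingArray: [[MazePoint]] = Array(repeating: [], count: identifierCount)

for _ in 0..<50_000 {
    // No Optional is involved here: the subscript addresses the inner array directly.
    tileMappingArray[0].append(MazePoint(x: 1, y: 1))
}
```

The trade-off is that an index outside 0..<identifierCount crashes instead of returning nil.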

I hope you found this post interesting. If you have any comments or better ideas on how to solve this problem then I’d love to hear them!

Update: /u/goatbag on reddit suggested a couple of alternative ways of addressing this issue.

The first is to use a pure Swift array held by reference, rather than getting burdened with NSArray. The second is to reduce the number of appends to the dictionary by batching them into their own array, where each append is O(1), and then performing a single append to the dictionary entry, which is O(length of result). This complexity is fine as that append is only performed once per maze generation step, rather than once per updated tile per step. I.e.:

tileMapping[0] = []
var batchArray: [MazePoint] = []
for _ in 0..<50_000 {
    batchArray.append(MazePoint(x: 1, y: 1))
}
// Swift 2 spelling; in Swift 3 and later this is append(contentsOf:)
tileMapping[0]?.appendContentsOf(batchArray)

The performance looks to be pretty much identical to the Array/Array solution; however, I much prefer this as it can use any Hashable value as the dictionary key and is not restricted to integers. Additionally, the risk of an out-of-range crash is eliminated. Thanks /u/goatbag :)

The git repo has been updated with the above, click Dictionary/Batch Array Append.
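The first suggestion, a pure Swift array held by reference, isn’t shown above. A minimal sketch of how it might look follows; the class name MazePointList is an assumption made for illustration and may not match what /u/goatbag or the repo actually use.

```swift
typealias MazeIdentifier = Int
struct MazePoint { let x: Int; let y: Int }

// Hypothetical wrapper: a class, so the dictionary stores a reference to it.
final class MazePointList {
    var points: [MazePoint] = []
}

let tileMapping: [MazeIdentifier: MazePointList] = [0: MazePointList()]
for _ in 0..<50_000 {
    // The lookup hands back a reference to the wrapper; the array inside it
    // has a single owner, so the append should not need to copy it.
    tileMapping[0]?.points.append(MazePoint(x: 1, y: 1))
}
```

This keeps the dictionary and its arbitrary Hashable keys, at the cost of giving up value semantics for the lists themselves.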