Well, we’re still waiting for coroutines in Swift several years in. In the meantime, we have many concurrency mechanisms to choose from, so how do we know which one to pick? Let’s examine each one and compare their performance.

Serial Queue

We can use a Grand Central Dispatch serial queue to limit access to a shared resource. This is the most common approach, the easiest to implement, and the slowest. Let’s create a wrapper that synchronizes access to a generic value:

```swift
struct SynchronizedSerial<Value> {
    private let mutex = DispatchQueue(label: "com.basememara.SynchronizedSerial")
    private var _value: Value

    init(_ value: Value) {
        self._value = value
    }

    /// Returns the value synchronously.
    var value: Value { mutex.sync { _value } }

    /// Submits a block for synchronous, exclusive mutation of the value.
    mutating func value<T>(execute task: (inout Value) throws -> T) rethrows -> T {
        try mutex.sync { try task(&_value) }
    }
}
```

To use this wrapper, initialize it with an initial value, then call .value to read and .value { ... } to mutate:

```swift
var temp = SynchronizedSerial(0)
temp.value             // 0
temp.value { $0 += 1 } // 1
```

Now let’s test this bad boy out. But how? We can kick off a million concurrent tasks that each increment the variable.

```swift
func testSynchronizedSerialWritePerformance() {
    var temp = SynchronizedSerial(0)

    measure {
        temp.value { $0 = 0 } // Reset

        DispatchQueue.concurrentPerform(iterations: 1_000_000) { _ in
            temp.value { $0 += 1 }
        }

        XCTAssertEqual(temp.value, 1_000_000)
    }
}
```

Let’s also test the performance for just reads, with sporadic writes in between. This should simulate real-world usage:

```swift
func testSynchronizedSerialReadPerformance() {
    var temp = SynchronizedSerial(0)
    let iterations = 1_000_000
    let writeMultipleOf = 1_000

    measure {
        temp.value { $0 = 0 } // Reset

        DispatchQueue.concurrentPerform(iterations: iterations) {
            guard $0.isMultiple(of: writeMultipleOf) else { return }
            temp.value { $0 += 1 }
        }

        XCTAssertGreaterThanOrEqual(temp.value, iterations / writeMultipleOf)
    }
}
```

The test is considered to be successful if:

1. The operation does not crash from multiple threads writing to the same memory. Such a race would produce a bad-access memory crash.
2. The final incremented value matches the number of concurrent operations. In our example, temp should be 1,000,000 since it was incremented 1,000,000 times. If it falls short, a task executed against a stale value, which is a race condition that leads to corrupt data.
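The stale-value problem in the second criterion doesn’t even need real threads to understand. Here’s a deterministic sketch (not from the test suite) of the interleaving that loses an update:

```swift
// Two "tasks" each perform a read-modify-write, but their reads interleave.
var counter = 0
let readByTaskA = counter // task A reads 0
let readByTaskB = counter // task B also reads 0, which is now stale
counter = readByTaskA + 1 // task A writes 1
counter = readByTaskB + 1 // task B overwrites with 1, so A's increment is lost
assert(counter == 1)      // two increments happened, but the value is only 1
```

Scale that interleaving up to a million tasks and the final count can fall far short of a million.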

The test succeeds: no crash, and the result is 1,000,000! 🎉 The million tasks took 4.16 seconds for writes and 4.12 seconds for reads.

Concurrent Barrier Queue

This time we will use a concurrent queue instead of a serial one, with the .barrier flag so reads can run concurrently while a write gets exclusive access (you can read my previous detailed post about this).

```swift
struct SynchronizedBarrier<Value> {
    private let mutex = DispatchQueue(
        label: "com.basememara.SynchronizedBarrier",
        attributes: .concurrent
    )
    private var _value: Value

    init(_ value: Value) {
        self._value = value
    }

    var value: Value { mutex.sync { _value } }

    mutating func value<T>(execute task: (inout Value) throws -> T) rethrows -> T {
        try mutex.sync(flags: .barrier) { try task(&_value) }
    }
}
```

Same API and test again:

```swift
func testSynchronizedBarrierWritePerformance() {
    var temp = SynchronizedBarrier(0)

    measure {
        temp.value { $0 = 0 } // Reset

        DispatchQueue.concurrentPerform(iterations: 1_000_000) { _ in
            temp.value { $0 += 1 }
        }

        XCTAssertEqual(temp.value, 1_000_000)
    }
}

func testSynchronizedBarrierReadPerformance() {
    var temp = SynchronizedBarrier(0)
    let iterations = 1_000_000
    let writeMultipleOf = 1_000

    measure {
        temp.value { $0 = 0 } // Reset

        DispatchQueue.concurrentPerform(iterations: iterations) {
            guard $0.isMultiple(of: writeMultipleOf) else { return }
            temp.value { $0 += 1 }
        }

        XCTAssertGreaterThanOrEqual(temp.value, iterations / writeMultipleOf)
    }
}
```

No crashes and the result is also 1,000,000! It completes in 3.4 seconds for writes and 1.19 seconds for reads.
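For readers new to the .barrier flag, here’s a minimal standalone sketch (not from the original wrapper; the queue label and key names are made up) of the same pattern applied to a dictionary:

```swift
import Dispatch

let queue = DispatchQueue(label: "com.example.barrier-demo", attributes: .concurrent)
var store: [String: Int] = [:]

// Reads execute concurrently with other reads.
func read(_ key: String) -> Int? {
    queue.sync { store[key] }
}

// Writes take the barrier: they wait for in-flight reads and block new work.
func write(_ key: String, _ value: Int) {
    queue.sync(flags: .barrier) { store[key] = value }
}

write("total", 1)
DispatchQueue.concurrentPerform(iterations: 100) { i in
    write("key-\(i)", i)
    _ = read("total")
}

assert(read("total") == 1)
assert(store.count == 101) // "total" plus key-0 through key-99
```

Because reads vastly outnumber writes in most real apps, letting them run in parallel is where the barrier queue earns its read throughput.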

Semaphores

This Grand Central Dispatch object uses a counter mechanism to block a thread. You can declare that a semaphore can handle one or more tasks simultaneously, but usually you’d want to set this to one for mutual exclusivity:

```swift
struct SynchronizedSemaphore<Value> {
    private let mutex = DispatchSemaphore(value: 1)
    private var _value: Value

    init(_ value: Value) {
        self._value = value
    }

    var value: Value { mutex.lock { _value } }

    mutating func value(execute task: (inout Value) -> Void) {
        mutex.lock { task(&_value) }
    }
}

private extension DispatchSemaphore {
    /// Blocks until the semaphore is acquired, executes the task,
    /// then releases the semaphore.
    func lock<T>(execute task: () throws -> T) rethrows -> T {
        wait()
        defer { signal() }
        return try task()
    }
}
```

The API is the same as before, but using the GCD semaphore under the hood. The test is similar as well:

```swift
func testSynchronizedSemaphoreWritePerformance() {
    var temp = SynchronizedSemaphore(0)

    measure {
        temp.value { $0 = 0 } // Reset

        DispatchQueue.concurrentPerform(iterations: 1_000_000) { _ in
            temp.value { $0 += 1 }
        }

        XCTAssertEqual(temp.value, 1_000_000)
    }
}

func testSynchronizedSemaphoreReadPerformance() {
    var temp = SynchronizedSemaphore(0)
    let iterations = 1_000_000
    let writeMultipleOf = 1_000

    measure {
        temp.value { $0 = 0 } // Reset

        DispatchQueue.concurrentPerform(iterations: iterations) {
            guard $0.isMultiple(of: writeMultipleOf) else { return }
            temp.value { $0 += 1 }
        }

        XCTAssertGreaterThanOrEqual(temp.value, iterations / writeMultipleOf)
    }
}
```

No crashes and the result is also 1,000,000! It completes in 2.85 seconds for writes and 2.1 seconds for reads.
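As mentioned above, a semaphore’s counter can also admit more than one task at a time. Here’s a hedged sketch (the label, sleep duration, and bookkeeping are illustrative) showing a semaphore initialized with 2 capping concurrency at two tasks:

```swift
import Foundation

// A semaphore initialized with 2 admits at most two tasks at once.
let gate = DispatchSemaphore(value: 2)
let tracker = DispatchQueue(label: "com.example.tracker") // serializes bookkeeping
var current = 0
var peak = 0

DispatchQueue.concurrentPerform(iterations: 20) { _ in
    gate.wait() // acquire one of the 2 slots
    tracker.sync {
        current += 1
        peak = max(peak, current)
    }
    usleep(1_000) // simulate a bit of work
    tracker.sync { current -= 1 }
    gate.signal() // release the slot
}

assert(peak <= 2) // concurrency never exceeded the semaphore's count
```

With a count of 1, as in our wrapper, this degenerates into plain mutual exclusion.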

NSLock

Another concurrency technique on my list is NSLock. It’s a Foundation class built on POSIX threads and works similarly to a semaphore in that it locks and unlocks around a critical section:

```swift
struct SynchronizedNSLock<Value> {
    private let mutex = NSLock()
    private var _value: Value

    init(_ value: Value) {
        self._value = value
    }

    /// Returns the value.
    var value: Value { mutex.lock { _value } }

    /// Submits a block for synchronous execution with this lock.
    mutating func value<T>(execute task: (inout Value) throws -> T) rethrows -> T {
        try mutex.lock { try task(&_value) }
    }
}

private extension NSLocking {
    /// Acquires the lock, blocking the thread until it is available,
    /// executes the task, then relinquishes the lock.
    func lock<T>(execute task: () throws -> T) rethrows -> T {
        lock()
        defer { unlock() }
        return try task()
    }
}
```

The API and tests are identical to before:

```swift
func testSynchronizedNSLockWritePerformance() {
    var temp = SynchronizedNSLock(0)

    measure {
        temp.value { $0 = 0 } // Reset

        DispatchQueue.concurrentPerform(iterations: 1_000_000) { _ in
            temp.value { $0 += 1 }
        }

        XCTAssertEqual(temp.value, 1_000_000)
    }
}

func testSynchronizedNSLockReadPerformance() {
    var temp = SynchronizedNSLock(0)
    let iterations = 1_000_000
    let writeMultipleOf = 1_000

    measure {
        temp.value { $0 = 0 } // Reset

        DispatchQueue.concurrentPerform(iterations: iterations) {
            guard $0.isMultiple(of: writeMultipleOf) else { return }
            temp.value { $0 += 1 }
        }

        XCTAssertGreaterThanOrEqual(temp.value, iterations / writeMultipleOf)
    }
}
```

The test is successful, and it’s by far the fastest so far at 0.422 seconds for writes and 0.509 seconds for reads! I was hoping GCD would win since it’s more modern, but who knew 🤷‍♂️

OSLock

The last lock we will test is os_unfair_lock. It’s an even lower-level C primitive than NSLock, but it follows the same lock/unlock concept:

```swift
final class SynchronizedOSLock<Value> {
    // A class is used so the lock has a stable memory address;
    // os_unfair_lock must never be copied or moved.
    private var mutex = os_unfair_lock_s()
    private var _value: Value

    init(_ value: Value) {
        self._value = value
    }

    /// Returns the value.
    var value: Value { lock { _value } }

    /// Submits a block for synchronous execution with this lock.
    func value<T>(execute task: (inout Value) throws -> T) rethrows -> T {
        try lock { try task(&_value) }
    }
}

private extension SynchronizedOSLock {
    /// Acquires the lock, blocking the thread until it is available,
    /// executes the task, then relinquishes the lock.
    func lock<T>(execute task: () throws -> T) rethrows -> T {
        os_unfair_lock_lock(&mutex)
        defer { os_unfair_lock_unlock(&mutex) }
        return try task()
    }
}
```

The API and tests are identical to before:

```swift
func testSynchronizedOSLockWritePerformance() {
    let temp = SynchronizedOSLock(0)

    measure {
        temp.value { $0 = 0 } // Reset

        DispatchQueue.concurrentPerform(iterations: 1_000_000) { _ in
            temp.value { $0 += 1 }
        }

        XCTAssertEqual(temp.value, 1_000_000)
    }
}

func testSynchronizedOSLockReadPerformance() {
    let temp = SynchronizedOSLock(0)
    let iterations = 1_000_000
    let writeMultipleOf = 1_000

    measure {
        temp.value { $0 = 0 } // Reset

        DispatchQueue.concurrentPerform(iterations: iterations) {
            guard $0.isMultiple(of: writeMultipleOf) else { return }
            temp.value { $0 += 1 }
        }

        XCTAssertGreaterThanOrEqual(temp.value, iterations / writeMultipleOf)
    }
}
```

The test succeeds and is the fastest of all! The time it took to complete is 0.2 seconds for writes and 0.354 seconds for reads 😳

Case Closed?

Performance Test Results

| Mechanism | Writes (s) | Reads (s) |
| --- | --- | --- |
| GCD serial queue | 4.16 | 4.12 |
| GCD concurrent barrier queue | 3.40 | 1.19 |
| GCD semaphore | 2.85 | 2.10 |
| NSLock | 0.422 | 0.509 |
| OSLock | 0.20 | 0.354 |

Not so fast… OSLock and NSLock are stupid fast, but there is a huge trade-off happening behind the scenes. As you’d imagine, throwing a million concurrent tasks at a device is intense. Let’s look at the CPU usage for each:

GCD Serial Queue

GCD Concurrent Barrier Queue

GCD Semaphore

NSLock

OSLock

With any of the Grand Central Dispatch mechanisms, CPU usage stays pegged at around 100% for the duration of the test. With OSLock or NSLock, though, the results are shocking… over 1,000% 😱

OSLock and NSLock are low-level primitives that run close to the metal, but we do not want to sacrifice battery for speed at such a disproportionate rate. This is just silly!

Grand Central Dispatch on the other hand is known to manage resources efficiently while still being relatively fast. Even in the Apple docs, it says:

Dispatch semaphores call down to the kernel only when the calling thread needs to be blocked. If the calling semaphore does not need to block, no kernel call is made.

GCD is managing resources from many sides. Indeed, DispatchSemaphore is the fastest of the GCD family, but notice that the DispatchQueue with the concurrent .barrier flag is nearly twice as fast for reads and only a bit slower for writes. Since reads dominate in realistic scenarios, that balance matters most. So although it’s not the outright fastest, the DispatchQueue with the concurrent .barrier flag is the all-around winner 🏆

What About Property Wrappers?

We could integrate our thread-safe solution with Swift 5.1’s Property Wrappers. Let’s find out what happens:

```swift
@propertyWrapper
struct Atomic<Value> {
    private var value: Value
    private let mutex = DispatchSemaphore(value: 1)

    // Reuses the DispatchSemaphore.lock extension from the semaphore section.
    var wrappedValue: Value {
        get { mutex.lock { value } }
        set { mutex.lock { value = newValue } }
    }

    init(wrappedValue value: Value) {
        self.value = value
    }
}

final class AtomicTests: XCTestCase {
    @Atomic var atomicTemp = 0

    func testSynchronizedPropertyWrapper() {
        measure {
            atomicTemp = 0 // Reset

            DispatchQueue.concurrentPerform(iterations: 1_000_000) { _ in
                atomicTemp += 1
            }

            XCTAssertEqual(atomicTemp, 1_000_000)
        }
    }
}
```

The test…. FAILS!

When I ran the test, the 1,000,000 launched tasks did not increment the counter 1,000,000 times. It does not crash, since each individual read and write is guarded by the semaphore, but atomicTemp += 1 is really two separately guarded operations: a get followed by a set. Another thread can slip in between them, so the set operates against stale data:

XCTAssertEqual failed: (“115786”) is not equal to (“1000000”)

This is a perfect example of how dangerous concurrency can be when it is only partially managed. The code does not crash, yet only about 10% of the increments survive. A bug like this can go unnoticed in production for months!

We have to wait for coroutines for property wrappers to handle this transparently.
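In the meantime, one common workaround (a sketch, not from the original post; the Model type, mutate method, and iteration count are illustrative) is to expose an explicit atomic read-modify-write through the wrapper’s projectedValue, trading the transparent += syntax for correctness. The wrapper is a class here so every access shares the same lock instance:

```swift
import Dispatch

@propertyWrapper
final class Atomic<Value> {
    private var value: Value
    private let mutex = DispatchSemaphore(value: 1)

    init(wrappedValue: Value) {
        self.value = wrappedValue
    }

    // Plain reads and writes are individually guarded...
    var wrappedValue: Value {
        get { mutex.wait(); defer { mutex.signal() }; return value }
        set { mutex.wait(); defer { mutex.signal() }; value = newValue }
    }

    // ...but compound updates must go through an explicit atomic mutation,
    // reachable via the projected value (the $-prefixed name).
    var projectedValue: Atomic<Value> { self }

    func mutate(_ transform: (inout Value) -> Void) {
        mutex.wait()
        defer { mutex.signal() }
        transform(&value)
    }
}

final class Model {
    @Atomic var counter = 0
}

let model = Model()
DispatchQueue.concurrentPerform(iterations: 10_000) { _ in
    model.$counter.mutate { $0 += 1 } // read-modify-write under one lock
}
assert(model.counter == 10_000)
```

Because the entire get-increment-set happens while the semaphore is held, no increments are lost, unlike the wrappedValue-only version above.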

Conclusion

Handling concurrent operations is vital for any application. Left unhandled, your users end up with crashes or corrupt data that are extremely hard to reproduce and track down. Things may change when coroutines are introduced in Swift; until then, it looks like Grand Central Dispatch is our best option.

The source code is available here.

Also, thanks to @kylnew for asking me to do a video cast on this blog post:

Further Reading

Happy Coding!!