We were in search for some C++ reader/writer locks implementation that allows a thread to acquire a lock and then optionally pass it on to another thread. The C++11 and C++14 standard library lock implementations std::mutex and shared_mutex do not allow that (it would be undefined behaviour – by the way, it’s also undefined behaviour when doing this with the pthreads library).

Additionally, we were looking for locks that would neither prefer readers nor writers, so that there will be neither reader starvation nor writer starvation. And then, we wanted concurrently queued read and write requests that compete for the lock to be brought into some defined execution order. Ideally, queued operations that cannot instantly acquire the lock should be processed in approximately the same order in which they queued.

Interested in trying out ArangoDB? Fire up your cluster in just few clicks with ArangoDB Oasis: the Cloud Service for ArangoDB. Start your free 14-day trial here.

We finally found some nice paper describing a variety of locks, and we were intrigued by the properties of Phase-fair RW locks (section 3.2.4, page 9 in the paper):

“A RW lock is phase-fair iff it has the following properties:

PF1 reader phases and writer phases alternate;

PF2 writers are subject to FIFO ordering, but only with regard to other writers;

PF3 at the start of each reader phase, all currently-unsatisfied read requests are

satisfied (exactly one write request is satisfied at the start of a writer phase); and

PF4 during a reader phase, newly-issued read requests are satisfied only if there are

no unsatisfied write requests pending”

So we decided to build a C++ implementation based on what is described in the paper.

The paper provides some pseudo-code for a ticket-based lock-less implementation using multiple atomic variables. We tried that out, and it used to work fine in practice. But then we ran into the problem that we would not only need the lock and unlock operations: we also needed an operation that tries to lock and return if the lock cannot be acquired instantly ( try_lock ). And we needed that combined with a configurable timeout after which try_lock would give up. And we wanted this try_lock operation to properly queue until it can acquire the lock successfully or times out.

That turned out to be rather complex to implement with the multiple atomic variables approach suggested in the paper: the approach there suggests that for acquiring the lock in write mode, two atomic variables need to be changed (page 14 in the paper):

1 2 3 4 5 6 7 8 9 procedure writelock ( L : ˆ pftlock ) var ticket , w : unsigned integer ; ticket : = fetch and add ( &L−>win, 1); await ticket = L − > wout ; w : = PRES or ( ticket and PHID ) ; ticket : = fetch and add ( &L−>rin, w); await ticket = L − > rout

For the try_lock operation the lock acquisition will fail if the lock cannot be acquired. Ideally, a try_lock operation will not change the state of the lock if it cannot be acquired instantly. But with the multiple atomic variable approach, this will become tricky, given that there are several atomic variables that will need to be read and modified for entering the lock in write mode.

So we decided that even if it is 2018 and lock-free programming is definitely on the advance, sticking with a completely lock-free implementation will make the lock code a lot more complex. And for a lock implementation that controls critical parts of the application, it is definitely an advantage to have a solution that is more simple and robust than having a more efficient but complex solution.

So we went with an implementation that uses some std::mutex under the hood, and std::condition_variable to wake up already queued waiters. So we could rely on the correctness of std::mutex and std::condition_variable for all our modifications of the internal lock state and notifications. Given the fact that it is a read-write lock and may lock out other operations anyway (in case there is another writer), we thought this would be a good initial tradeoff.

That approach allowed us to come up with rather simple code for the lock and unlock operations. And the try_lock implementation was only a bit more complicated because we needed try_lock to work with a configurable timeout, and have it run a configurable callback function to notify the application in case the lock cannot be acquired instantly but will go into a waiting state.

We have run some initial experiments with this, and it seems that it indeed provides the semantics (especially the fairness) we desire. We will now run additional tests with that lock implementation to see how it performs.

Our current lock implementation code is in an experimental feature branch, which can be found on Github.

As the rest of the ArangoDB source code, the code is open source and licensed under the Apache 2 license, so please feel free to use it in other projects or modify it.

Read here the comparison: Lockless programming with atomics in C++ 11 vs. mutex and RW-locks

We are also interested in comments from other users of this code and eager to get feedback on how it worked out and performs in other contexts! Let us know on Community Slack.