The two major problems in concurrent programs are data races and deadlocks. Races occur when two or more threads are accessing shared memory without proper synchronization. Deadlocks occur when synchronization is based on locking and multiple threads block each other's progress. In a typical deadlock scenario, thread A locks the mutex X, and thread B locks the mutex Y. Now, in lockstep, thread A tries to lock Y and B tries to lock X. Neither can make progress, waiting forever for the other thread to release its lock. The program freezes.

In a sense, data races are the opposite of deadlocks. The former result from not enough synchronization, the latter usually happen when there's too much synchronization and it gets out of hand.

There are sophisticated schemes to prevent races and/or deadlocks, but they either require special language support or strict discipline on the part of programmers. Here I present a solution based on a well-known deadlock-avoidance protocol and show how it can be enforced by the compiler. It can be applied to programs in which the number of locks is fixed and known up front.

This scheme could probably be implemented in C++ by somebody skilled in template metaprogramming. I chose the D language (www.digitalmars.com/D) insteada language from the same curly-brace family, but with more user-friendly support for metaprogramming. Most code samples in this article should be easy to understand if you're familiar with C++, Java, or C#.

Lock Ordering

The crucial observation is that deadlocks occur when threads take locks in different order. In the previous example, thread A tried to take locks X and then Y, while thread B tried to take lock Y, then X. If you could ensure that all threads that take both locks take them in the same order, you'd know they will not deadlock (at least not on those two locks).

In some programs all locks are known at compile time and you can prescribe the order in which they are to be taken. Strictly speaking, the ordering doesn't have to be totalpartial order is sufficient (for example, the way nodes are ordered in a tree using parent/child relationship). If two locks are never taken together, they don't have to be ordered. In any case, given a partial order, there always is a way to sort the locks into some linear sequence (it's called "topological sort"). So without loss of generality, I assume that we have such a sequence of locks in our program.

Now it's easy to convince yourself that if all threads take locks in the order compatible with this sequence, there will be no deadlocks. (Notice that it's okay to skip some locks in the sequence.)

Even the best protocol doesn't amount to much if it cannot be enforced. The first thing that comes to mind is to try to enforce the scheme at runtime. Every time you take a lock you would add it to a thread-local list and, if you detected an out of order attempt, you'd throw an exception. Releasing a lock would take it off the list.

The problem with runtime enforcement is that concurrency bugs are hard to reproduce. The program may go through testing with flying colors only to deadlock months or years later on a client's machine. I'll concentrate on ways to enforce deadlock freedom at compile time.

A compile-time scheme is by necessity more restrictive; it might reject perfectly valid programs. But once the program compiles or, more precisely, type-checks, you have the guarantee that it will never deadlock.

I realized that the biggest problem in statically enforcing a lock-ordering scheme is that you don't know which locks may be taken by the functions you are calling. Suppose your function takes lock number 6 in the global sequence. As long as you hold onto it, you shouldn't be able to call any function that may take any lower-numbered locks. Suppose that the information about what locks may be taken by a functionits lock optionscould be encoded in its type signature. Trying to call a function that is declared to take lock 5 from a function that takes lock 6 would result in a type error.

How can you encode, using the type system, the information that, say, lock 6 has been taken? It turns out that there is a trick to that.