The volatile type qualifier in C/C++ means roughly that accesses to the qualified object happen on the actual machine as they do in the abstract machine. I’ve written about volatile pretty extensively, so won’t repeat myself.

An interesting problem with volatile is that in practice, compilers fail to respect it: they add, remove, and reorder accesses in ways that break the rules. This happens because the rules for translating accesses to volatile-qualified objects are very different from the rules for accessing regular C variables: the compiler has nearly complete freedom to add, remove, and reorder non-volatile variable accesses so long as this doesn’t change the program’s externally observable behavior. Thus, many optimization passes require special cases like this:

if (!var->is_volatile()) transform_code();

The problem is that every optimization developer must add these extra safety checks every time — any omission is likely to break the properties that volatile is intended to preserve.

A few years ago Eric Eide and I observed that the rules for accessing volatile objects are very similar to the rules for manipulating function calls. In other words, when the compiler lacks any special knowledge about a called function, it must not add, remove, or reorder function calls. This lead to the idea that if compiler writers would simply model volatile accesses as function calls, all of those special cases in the optimization passes would go away. We tested this idea by writing a source-to-source transformer for C code that turned accesses to volatiles into calls to automatically generated helper functions. In other words, if x is defined as “volatile int x;” then this code:

y=x;

becomes:

y=__volatile_read_int(&x);

Our idea mostly worked. What I mean is that many, but not all, problems in miscompilation of accesses to volatiles went away after transforming programs. Our hypothesis was that when wrapper functions didn’t work, it was always because the compiler was performing a regular miscompilation (i.e., generating the wrong code in a way that has nothing to do with volatile). But we couldn’t back this up since we lacked a correct compiler and we also didn’t have time to manually inspect thousands of failed test cases.

So far this is all old news, but there has been a very nice new development in volatile-land. As of recently, CompCert implements a proved volatile semantics like this:

Accesses to volatile-qualified objects are turned into function calls in the frontend. The target-independent optimization passes totally ignore these function calls. In the backend, the function calls are turned into inline code that performs the accesses.

This is basically the strategy that Eric and I came up with, but with a nice improvement: much of the overhead associated with actual function calls is avoided due to the late inline substitution. The overhead of function calls — particularly in terms of code size — would be significant for small embedded applications that consist mainly of accesses to device registers. Some overhead will likely remain due to the calling convention and because CompCert must pessimistically assume that a helper function has updated any global variables that happen to be cached in registers. Hopefully the CompCert folks will quantify the overheads of various alternatives in a paper sometime.

We looked for volatile bugs in CompCert and found a few: they were in, for example, the unproved frontend code that expands C-level struct assignments into Clight. After Xavier fixed these bugs (I think there were two of this ilk) we can no longer find volatile bugs in CompCert. Now we finally come to the point of this post:

Out of the dozen-odd compilers we have tested, there is only one — CompCert — that reliably respects C’s volatile qualifier.

This is an impressive result.

Update from March 1 2011:

My description of CompCert is a bit out of date. Xavier Leroy explains:

One little correction, though: the handling of volatiles you describe is that of CompCert 1.7. In the latest release CompCert 1.8, I improved the generated code by, in essence, inlining the _volatile_read and _volatile_write functions after optimizations are done, but before register allocation. (In reality, it’s done a bit differently, through a notion of “inlinable builtin functions” of which the volatile operations is an instance.) This way, the generated code isn’t constrained by the calling conventions: the compiler knows that the caller-save registers are not destroyed by the _volatile_* functions, and that these “functions” can take their arguments in any register, not just those dictated by the calling conventions.

This sounds like exactly the right solution: not only does it give us correct code and optimization passes that are free of volatile-related clutter, but the performance and size of the generated code should be very good.