GCC and static analysis

LWN.net needs you! Without subscribers, LWN would simply not exist. Please consider signing up for a subscription and helping to keep LWN publishing

Concurrency tends to make programming hard. Kernel development obviously involves dealing with a lot of concurrency, but there is also a lot of multi-threaded user-space development that suffers from the same issues. It would be nice if the computer could help developers avoid race conditions and other problems that arise in concurrent environments. Some developers at Google have been working on just such a project for some time, but they have just relocated the project from GCC to the LLVM Clang compiler, saying that GCC is not suited to the work they want to do. The result has been a sort of wake-up call for GCC developers. Is the GCC compiler suite not well suited to the creation of static analysis tools?

Like kernel programming, multi-threaded user-space programming involves creating and using locks to prevent concurrent access to shared data. In a properly designed and implemented locking scheme, code will never see inconsistent views of shared data, and that data will not change when the code is not expecting changes. Getting to that point is hard, though, and the bugs that result from locking mistakes can be hard to reproduce and hard to diagnose. There are few things more frustrating than a highly intermittent bug that seemingly makes no sense and defies efforts to track it down.

In 2008, Google developer Le-Chun Wu announced a project to add support for "thread safety annotations" in C and C++ programs to the GCC compiler. Since then, work has progressed to the point that the developers have a useful system that is employed by a number of internal projects. The ideas are relatively straightforward. Shared data requiring a lock is annotated with something like:

Mutex data_mutex; struct shared_data Data GUARDED_BY(data_mutex);

Functions that manipulate locks are annotated with:

class some_class { Mutex mutex; void get_lock() EXCLUSIVE_LOCK_FUNCTION(mutex) { ... } void put_lock() UNLOCK_FUNCTION(mutex) { ... } /* ... */ };

There are also annotations for shared locks and "trylock" functions that may not succeed. If a function expects a given lock to be held when it is called, it can be annotated with EXCLUSIVE_LOCKS_REQUIRED() ; there is also a LOCKS_EXCLUDED() annotation for functions that will acquire non-nesting locks themselves. Finally, this construction can be used for lock ordering constraints:

class some_class { Mutex m1; Mutex m2 ACQUIRED_AFTER(m1); /* ... */ };

Some other annotations exist; they can be seen in this slide deck [PDF], which is where your editor stole the above examples from.

The GCC implementation sets itself up as an early optimization pass. It builds a representation of the code from the GIMPLE internal representation, tracking which locks are held at each point through the function. When problems or inconsistencies are found, the compiler raises the alarm, hopefully causing the problem to be fixed before it gets into production code and bites somebody.

This code is available in a branch in the GCC repository and appears to be useful, so it came as a bit of a surprise when, on April 19, Google developer Diego Novillo announced that the project had been terminated, and that the group was now working to implement the same functionality in the LLVM Clang compiler instead. When asked why the group was making this change, developer Delesley Hutchins responded:

The gcc version has been difficult to support and maintain, due mainly to the fact that the GIMPLE intermediate language was never designed for static analysis. The abstract syntax tree provided by Clang is an easier data structure to work with for front-end analyses of this kind.

The response added that the GCC implementation has "some issues" that would make it hard to merge into the GCC mainline, while the Clang implementation has been in the trunk all along.

The GCC developers, naturally, would like to understand what it is about their compiler that made this project hard. The Google team was hesitant to respond, seemingly unwilling to criticize GCC or to cause a compiler flame war. Eventually, though, Delesley posted a more detailed description of the kinds of difficulties they had run into. There is a lot of information there, but it seems to come down to two crucial points:

The GIMPLE representation loses information about the structure of the original program that the static analysis pass really needs to have. LLVM, instead, builds an abstract syntax tree that much more closely matches the original C/C++ syntax; additional structures are then based on that tree. LLVM's tree is evidently well suited to the task of static analysis.

The GIMPLE representation changes significantly from one compiler release to the next, causing a lot of things to break. That adds a maintenance cost that the Google developers are unwilling to pay - especially if basing on LLVM instead makes that cost go away.

There were other things that were better for their purposes in GCC, but, in the balance, the developers concluded that LLVM is the better platform for their work. They seem determined to make the shift and feel that attempts to fix the problems they encountered in GCC would create difficulties elsewhere in the compiler. So this move appears to be a done deal.

In one sense, this decision lacks serious implications for the free software community. LLVM, too, is free software and the static analysis code will be part of it. That said, it is worrisome if GCC's internal structure truly turns out to be poorly suited to static analysis tasks. There is a lot of interesting work being done in the static analysis area; it offers the prospect of finding large numbers of bugs early in the development process. Static analysis tools will almost certainly be an increasingly important part of many developers' tool boxes. If those tools cannot easily be implemented with GCC, that compiler's future may not be as bright as it otherwise should be.

That said, it is dangerous to extrapolate from one project to the whole field of static analysis tools. The GCC plugin mechanism is just beginning to mature; it really has not had the time to turn into a platform that complex tools can be built on top of. So, while this episode is a warning that the GCC developers should hear (and evidently have heard quite well), it should not be seen as evidence of fatal design flaws. Likely as not, it is just an indication of a problem in need of solution - something the GCC community is good at.

