In my group’s recent compiler testing paper we wrote:

We have never seen an “interesting” split vote where randomized differential testing of a collection of C compilers fails to produce a clear consensus answer

Randomized differential testing is just a fancy way of describing this process:

Randomly generate a test input Run it through several different implementations of the same specification Check if all implementations produced equivalent output

Today we saw our first split vote using a program generated by Csmith. The reduced test case is:

#include <stdio.h> struct S0 { unsigned f1 : 1; }; struct S0 s; int main (void) { int x = -3; int y = x >= (0, s.f1); printf ("%d

", y); return 0; }

GCC, KCC, and CompCert all print “0

”. MSVC 2010, Intel CC 12.0.2, and today’s development snapshot of Clang (all for x86-64) print “1

”. All compilers have optimizations turned off. There are two possibilities:

The test case is ill-formed or ambiguous. Three of the six tools are wrong.

I’m pretty sure that (using the C99 standard) the test case is fine and the correct value for y is 0. The reasoning is that s.f1, an unsigned value, is promoted to signed int before the comparison is performed, making the comparison operator signed, resulting in false, or zero. The type and value of the left operand to the comma operator should be irrelevant.

There are a few interesting things going on here:

Two of the three (apparently) correct results were produced by relatively lightly-tested research compilers.

Most compiler bugs are in the optimizers. Therefore, problems like this that show up with optimizations disabled are relatively rare.

C is not the simple “portable assembly language” that people like to claim it is. Nobody gets all of its corner cases right, even for something relatively simple like integers.

Just yesterday, Xuejun — the main Csmith hacker — added support for the comma operator. Most compilers’ implementations of it are probably not very well tested.

Intuitively, n-version programming should work. Knight and Leveson famously showed that it may not.

Related previous posts from this blog are here and here.

Update: I should have added that I’m interested to see if there are any other compilers that get this wrong. If you have access to a compiler not on my list above and wouldn’t mind running the test and posting the result in a comment, I’d appreciate it.

Update from July 13: The behavior of this program is a bit more subtle than my explanation above indicates. John McCall’s comment has the best explanation I’ve seen so far.

Update from July 14: In C, a global variable DOES NOT need an explicit initializer. It is automatically initialized to zero.