I built a version of OCaml with some instrumentation for reporting errors in using the C language’s integers. Then I used that OCaml to build the Coq proof assistant. Here’s what happens when we start Coq:

[regehr@gamow ~]$ ~/z/coq/bin/coqtop
intern.c:617:10: runtime error: left shift of 255 by 56 places cannot be represented in type 'intnat' (aka 'long')
intern.c:617:10: runtime error: left shift of negative value -1
intern.c:167:13: runtime error: left shift of 255 by 56 places cannot be represented in type 'intnat' (aka 'long')
intern.c:167:13: runtime error: left shift of negative value -1
intern.c:173:13: runtime error: left shift of 234 by 56 places cannot be represented in type 'intnat' (aka 'long')
intern.c:173:13: runtime error: left shift of negative value -22
intern.c:173:13: runtime error: left shift of negative value -363571240
interp.c:978:43: runtime error: left shift of 2 by 62 places cannot be represented in type 'long'
interp.c:1016:19: runtime error: left shift of negative value -1
interp.c:1016:12: runtime error: signed integer overflow: -9223372036854775807 + -2 cannot be represented in type 'long'
interp.c:936:14: runtime error: left shift of negative value -1
ints.c:721:48: runtime error: left shift of 1 by 63 places cannot be represented in type 'intnat' (aka 'long')
ints.c:674:48: runtime error: signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long'
compare.c:307:10: runtime error: left shift of 1 by 63 places cannot be represented in type 'intnat' (aka 'long')
str.c:96:23: runtime error: left shift of negative value -1
compare.c:275:12: runtime error: left shift of negative value -1
interp.c:967:14: runtime error: left shift of negative value -427387904
interp.c:957:14: runtime error: left shift of negative value -4611686018
interp.c:949:14: runtime error: signed integer overflow: 31898978766825602 * 65599 cannot be represented in type 'long'
interp.c:949:14: runtime error: left shift of 8059027795813332990 by 1 places cannot be represented in type 'intnat' (aka 'long')
Welcome to Coq 8.4pl1 (February 2013)
unixsupport.c:257:20: runtime error: left shift of negative value -1
Coq <

This output means that Coq, via OCaml, is executing a number of C's undefined behaviors before it even asks for any input from the user. The problem with undefined behaviors is that, according to the standard, they destroy the meaning of the program that executes them. We do not wish for Coq's meaning to be destroyed because a proof in Coq is widely considered to be a reliable indicator that the result is correct. (Also, see the update at the bottom of this post: the standalone verifier coqchk has the same problems.)

In principle all undefined behaviors are equally bad. In practice, some of them might only land us in purgatory ("Pointers that do not point into, or just beyond, the same array object are subtracted") whereas others (such as a store to an out-of-bounds array element) place us squarely into the ninth circle. To which category do the undefined behaviors above belong? To the best of my knowledge, the left shift problems are, at the moment, benign. What I mean by "benign" is that all compilers that I know of will take a technically undefined construct such as 0xffff << 16 (on a machine with 32-bit, two's complement integers) and compile it as if the arguments were unsigned, giving the intuitive result of 0xffff0000. This compilation strategy could change.
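For illustration, here is a minimal sketch of how such a shift can be written so that it does not rely on the compiler's goodwill: perform the shift on an unsigned type, where C defines the result modulo 2^32, and convert back. The helper name shl32_wrapping is hypothetical and does not come from the OCaml sources. The sketch assumes a 32-bit int and a compiler (GCC and Clang both document this) that converts out-of-range unsigned values to signed by wrapping.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the OCaml runtime.
   Shifts the bit pattern of a signed 32-bit value left by n places.
   The shift itself happens on uint32_t, where the result is defined
   modulo 2^32, so no signed-shift undefined behavior is executed. */
static int32_t shl32_wrapping(int32_t x, unsigned n)
{
    uint32_t bits = (uint32_t)x << (n & 31u); /* count kept in 0..31 */
    /* Converting back is implementation-defined rather than undefined;
       this sketch assumes the usual two's complement wraparound. */
    return (int32_t)bits;
}
```

With this helper, shl32_wrapping(0xffff, 16) yields the bit pattern 0xffff0000 and shl32_wrapping(-1, 1) yields -2, which are the results the undefined versions produce in practice today.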

If we forget about the shift errors, we are still left with three signed integer overflows:

-9223372036854775807 + -2

-9223372036854775808 - 1

31898978766825602 * 65599

Modern C compilers are known to exploit the undefinedness of such operations in order to generate more efficient code. Here's an example where two commonly-used C compilers evaluate (INT_MAX+1) > INT_MAX to both 0 and 1 in the same program, at the same optimization level:

#include <stdio.h>
#include <limits.h>

int foo (int x) {
  return (x+1) > x;
}

int main (void) {
  printf ("%d\n", (INT_MAX+1) > INT_MAX);
  printf ("%d\n", foo(INT_MAX));
  return 0;
}

$ gcc -w -O2 overflow.c
$ ./a.out
0
1
$ clang -O2 overflow.c
$ ./a.out
0
1

Here's a longish explanation of the reasoning that goes into this kind of behavior. One tricky thing is that integer undefined behaviors can even affect statements that precede them.

Realistically, what kinds of consequences can we expect from signed integer overflows that do not map to trapping instructions, using today's compilers? Basically, as the example shows, we can expect inconsistent results from the overflowing operations. In principle a compiler could do something worse than this, such as deleting the entire statement or function which contains the undefined behavior, and I would not be too surprised to see that kind of thing happen in the future. But for now, we might reasonably hope that the effects are limited to returning wrong answers.

Now we come to the important question: Is Coq's validity threatened? The short answer is that this seems unlikely. The long answer requires a bit more work.

What do I mean by "Coq's validity threatened"? Of course I am not referring to its mathematical foundations. Rather, I am asking about the possibility that the instance of Coq that is running on my machine (and similar ones running on similar machines) may produce an incorrect result because a C compiler was given a licence to kill.

Let's look at the chain of events that would be required for Coq to return an invalid result such as claiming that a theorem was proved when in fact it was not. First, it is not necessarily the case that the compiler is bright enough to exploit the undefined integer overflow and return an unexpected result. Second, an incorrect result, once produced, might never escape from the OCaml runtime. For example, maybe the overflow is in a computation that decides whether it's time to run the garbage collector, and the worst that can happen is that the error causes us to spend too much time in the GC. On the other hand, a wrong value may in fact propagate into Coq.

Let's look at the overflows in a bit more detail. The first, at line 1016 of interp.c, is in the implementation of the OFFSETINT instruction which adds a value to the accumulator. The second overflow, at line 674 of ints.c, is in a short function called caml_nativeint_sub(), which I assume performs subtraction of two machine-native integers. The third overflow, at line 949 of interp.c, is in the implementation of the MULINT instruction which, as far as I can tell, pops a value from the stack and multiplies it by the accumulator. All three of these overflows fit into a pattern I've seen many times where a higher-level language implementation uses C's signed math operators with insufficient precondition checks. In general, the intent is not to expose C's undefined semantics to programs in the higher-level language, but of course that is what happens sometimes. If any of these overflows is exploited by the compiler and returns strange results, OCaml will indeed misbehave. Thus, at present the correctness of OCaml programs such as Coq relies on a couple of things. First, we're hoping the compiler is not smart enough to exploit these overflows. Second, if a compiler is smart enough, we're counting on the fact that the resulting errors will be egregious enough that they will be noticed. That is, they won't just break Coq in subtle ways.
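To make the pattern concrete, here is a minimal sketch of one way a language runtime can get wraparound arithmetic without executing signed overflow: do the operation on the unsigned counterpart of the type and convert back. This is not the code in interp.c or ints.c. The type name native_int and the wrap_* helpers are hypothetical, the sketch assumes that modular (wrapping) results are what the higher-level language intends, and it assumes a compiler that converts out-of-range unsigned values to signed by wrapping, as GCC and Clang do.

```c
#include <stdint.h>

/* Hypothetical stand-in for a runtime's native integer type. */
typedef int64_t native_int;

/* Each helper computes on uint64_t, where overflow is defined to wrap
   modulo 2^64, then converts the bit pattern back to the signed type.
   The final conversion is implementation-defined, not undefined. */
static native_int wrap_add(native_int a, native_int b)
{
    return (native_int)((uint64_t)a + (uint64_t)b);
}

static native_int wrap_sub(native_int a, native_int b)
{
    return (native_int)((uint64_t)a - (uint64_t)b);
}

static native_int wrap_mul(native_int a, native_int b)
{
    return (native_int)((uint64_t)a * (uint64_t)b);
}
```

Under these assumptions, the first two overflowing operations reported above, -9223372036854775807 + -2 and -9223372036854775808 - 1, both wrap around to 9223372036854775807, and the compiler no longer has any undefined behavior to exploit.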

If these bugs are in the OCaml implementation, why am I picking on Coq? Because it makes a good example. The Coq developers have gone to significant trouble to create a system with a small trusted computing base, in order to approximate as nearly as possible the ideal of an unassailable system for producing and checking mathematical proofs. This example shows how even such a careful design might go awry if its chain of assumptions contains a weak link.

These results were obtained using OCaml 3.12.1 on a 64-bit Linux machine. The behavior of the latest OCaml from SVN is basically the same. In fact, running OCaml's test suite reveals signed overflows at 17 distinct locations, not just the three shown above; additionally, there is a division by zero and a shift by -1 bit positions. My opinion is that a solid audit of OCaml's integer operations should be performed, followed by some hopefully minor tweaking to avoid these undefined behaviors, followed by aggressive stress testing of the implementation when compiled with Clang's integer overflow checks.
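As a rough idea of what the "minor tweaking" might look like for the division and shift cases, here is a sketch of explicit precondition checks. The function names checked_div and checked_shl are hypothetical, and how a real runtime should react when a check fails (raise an exception, return a fixed value) is a design decision this sketch leaves to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: divides a by b only when the C division is defined.
   Returns false for division by zero and for INT64_MIN / -1, whose
   mathematical result does not fit in int64_t. */
static bool checked_div(int64_t a, int64_t b, int64_t *out)
{
    if (b == 0)
        return false;
    if (a == INT64_MIN && b == -1)
        return false;
    *out = a / b;
    return true;
}

/* Hypothetical helper: shifts x left by n only when the shift count is
   in range for a 64-bit operand. A count of -1 or 64 is rejected. */
static bool checked_shl(uint64_t x, int64_t n, uint64_t *out)
{
    if (n < 0 || n >= 64)
        return false;
    *out = x << n;
    return true;
}
```

A caller would test the return value and only read the output parameter on success, which keeps the undefined cases from ever reaching the C operator.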

UPDATE: psnively on Twitter suggested something that I should have done originally, which is to look at coqchk, the standalone proof verifier. Here's the output:

[regehr@gamow coq-8.4pl1]$ ./bin/coqchk
intern.c:617:10: runtime error: left shift of 255 by 56 places cannot be represented in type 'intnat' (aka 'long')
intern.c:617:10: runtime error: left shift of negative value -1
intern.c:167:13: runtime error: left shift of 255 by 56 places cannot be represented in type 'intnat' (aka 'long')
intern.c:167:13: runtime error: left shift of negative value -1
intern.c:173:13: runtime error: left shift of 234 by 56 places cannot be represented in type 'intnat' (aka 'long')
intern.c:173:13: runtime error: left shift of negative value -22
intern.c:173:13: runtime error: left shift of negative value -363571240
interp.c:978:43: runtime error: left shift of 2 by 62 places cannot be represented in type 'long'
interp.c:1016:19: runtime error: left shift of negative value -1
interp.c:1016:12: runtime error: signed integer overflow: -9223372036854775807 + -2 cannot be represented in type 'long'
interp.c:936:14: runtime error: left shift of negative value -1
ints.c:721:48: runtime error: left shift of 1 by 63 places cannot be represented in type 'intnat' (aka 'long')
ints.c:674:48: runtime error: signed integer overflow: -9223372036854775808 - 1 cannot be represented in type 'long'
compare.c:307:10: runtime error: left shift of 1 by 63 places cannot be represented in type 'intnat' (aka 'long')
Welcome to Chicken 8.4pl1 (February 2013)
compare.c:275:12: runtime error: left shift of negative value -1
Ordered list:
Modules were successfully checked
[regehr@gamow coq-8.4pl1]$