The C programming language, by design, permits programmers to work at both low and high levels. In particular, for applications like operating system development, cryptography, numerical methods, and network processing, programmers can go back and forth between treating data as representing abstract types and as machine-addressable sequences of bits, between abstract operations and low-level, machine-dependent operations. Capturing this sophisticated approach in a formal standard is not simple, and often the writers of the ISO C Standard have thrown up their hands and labeled the effects of non-portable and potentially non-portable operations “undefined behavior”, for which they provided only a fuzzy guideline. Unfortunately, the maintainers of the gcc and clang C compilers have increasingly ignored the guideline and ignored well-established, well-understood practice, producing often bizarre and dangerous results and a flock of complaints.

A CERT security alert.

Extremely negative reviews from the Linux developers (“the C standard is _clearly_ bogus shit” – Linus Torvalds), which drove the GCC developers to provide an ad-hoc collection of compiler flags implementing an unstable, non-Standard dialect of C.

A warning from the chief Clang architect that compiler changes mean that “huge bodies of C code are land mines just waiting to explode” (followed later by a suggestion that programmers switch to other languages).

A paper from the foremost researcher in cryptography asking for a version of C that would be “safe” from dangerous “optimizations”.

A well thought out and completely ignored academic proposal for a “friendly” C compiler and standard.

A less polite academic analysis, also ignored.

As an example of what provoked these reactions, consider a small C program that, when passed through the current gcc and Clang compilers at their highest optimization levels, can’t – without a warning – recognize -2147483648 as a negative number. The same program works correctly when compiled at lower levels of “optimization”, with earlier versions of clang and gcc, or even currently with the Intel C compiler. And consider the following idiomatic C code, transformed into an infinite loop that never reaches the emergency shutdown.

// keep opening valve until pressure down or do emergency_shutdown
for (int i = 0; i >= 0; i++) {      // try it a bunch of times
    if (getpressure() > DANGER)
        openvalve();
}
if (getpressure() > DANGER)
    emergency_shutdown();

You can see how this works in the gcc8-generated machine code for the x86 – a processor family whose native arithmetic wraps on overflow and so does the right thing. The increment, the check, and the call to emergency_shutdown are all deleted by the compiler – as an “optimization”.

// Machine code output of gcc8 at the -O3 optimization level.

f:
        sub     rsp, 8
.L4:
        call    getpressure
        cmp     eax, 10
        jle     .L4
        call    openvalve
        jmp     .L4





Of course, this code is perhaps not the best C code, but it accords with the customary and expected behavior of the language over decades of practice. Earlier versions of gcc compile the code correctly, as does the Intel compiler. So tested, working, safety-critical code, moved to a new version of the same compiler, will fail with no warning.

The ISO C Standards Committee is oblivious

When I visited the ISO WG14 C Standards committee meeting in Pittsburgh this winter to ask for some corrective action, I found a peculiar lack of urgency. The most interesting comments were from a long-time committee member who seemed to believe that C had become a hopeless cause, and from another attendee who suggested that fixing contradictions in the standard was not urgent because there are so many of them. The gcc-specific carve-out that makes Linux development even possible is unstable and fragile, relying on often poorly documented escapes that will be under constant pressure from the design trajectory of both the gcc and clang compilers (which appear to be excessively focused on artificial benchmarks). As more sophisticated compilation methods, such as link-time optimization, become more prevalent, the opportunities for catastrophic silent transformations will proliferate.

Already, gcc has created a serious security violation in Linux by silently deleting an explicit programmer check for a null pointer (after which both GCC and Clang were forced to add a specific flag disabling that “optimization”). The Linux approach of stumbling into dangerous “optimizations” and then protecting against them is neither reliable nor extensible to projects with fewer or less sophisticated users and testers. Given institutional unwillingness to think about the problem at either the standards or compiler level, some outside intervention is necessary if anything is going to be fixed. Fortunately, the Standard can be mostly repaired by a very simple change (although it would be good if there were some continuing active involvement in the Standard from people involved in operating systems development).

The ISO C standard includes a long-standing, well-intentioned but botched effort to strengthen the C type system in a way that has produced a contradictory and complex set of rules under which a vast body of C code can be considered to embody “undefined behavior” (often with no remedy in the Standard). Not only is it not possible to access “device memory”; it is impossible to write a memory allocator, let alone a virtual memory system, check for arithmetic overflow, reliably check for null or invalid pointers, checksum network packets, encrypt arbitrary data, catch hardware interrupts, copy arbitrary data structures, use type punning, or, most likely, implement threads. Application programs written in C face the same problems and worse. For example, the standard POSIX function mmap appears to be intrinsically “undefined behavior”. Everything, however, worked for decades until recent compiler changes. The primary open source compiler developer groups have seized on the sloppy definition of undefined behavior in the Standard to justify a series of increasingly dangerous silent code transformations that break working code. For reasons explained below, I will call these beyond-the-guideline transformations (BTG transformations).

The innovation: Beyond the Guideline Transformations

The key passage in the C11 Standard is the following (emphasis mine):

undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

2 NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

3 EXAMPLE An example of undefined behavior is the behavior on integer overflow.

The compiler developers have seized on “imposes no requirements” to claim that they are entitled to do anything, no matter how destructive, when they encounter undefined behavior, even though the guideline in Note 2 captures the customary interpretation. At one time, it was generally understood that the undefined status of signed integer overflow meant that the native arithmetic operations of the target processor determined the result: so that often “i + j” in C can be compiled to a single machine instruction. If the result of the addition overflows the processor representation of a signed integer, anything could happen; the burden was on the programmer, processor, and execution environment to detect and handle the problem. Almost every modern mainstream processor simply rolls over as expected, although there are some processors that generate traps and some embedded devices with things as odd as saturating arithmetic. None of that is a problem under the customary compilation of the language or within the suggested guideline of “Note 2”. The compiler can just leave it up to the underlying machine, the execution environment, and the programmer to sort out. But gcc/clang developers, with the tacit acquiescence of the ISO standards committee, now claim the compiler is free to assume undefined behavior “can’t happen” and that the compilers can silently remove programmer checks for it – even while generating code that produces the condition.

This “beyond the guideline” (BTG) approach is relatively new, still unknown to many practitioners, and conflicts with the ANSI C definition in the second edition of the Kernighan and Ritchie book, which is the definitive specification. You can follow Linux’s lead and block the overflow “optimization” with -fwrapv, which forces wrapping semantics (in some places), but that’s not even necessarily what programmers want, and there are many other places where C code is vulnerable to similar transformations. For example, on machines with linear addressing, it is not unusual to check that a pointer is in a range with code like:

if (p < startptr || p >= endptr) error();

This code is straightforward and easy to compile on the dominant processor architectures of the day – ARM, x86, MIPS, and similar. It might be a problem on, say, segmented architectures or elaborately typed architectures such as the unlamented Itanium. It’s also possible that, depending on how the pointer p is generated, it might violate the complex, contradictory, and hard-to-understand type rules for pointer comparisons and fall into the “undefined behavior” bucket. In that case, the compiler developers claim that they can silently delete the check – after all, undefined behavior “can’t happen”. All the code that depends on these types of checks – checks that are reasonable C code and that have worked for decades – may encounter an unpleasant surprise at any future time when the compilers get around to “optimizing” those checks away. And, as usual, there is no guaranteed way in the Standard to “fix” this check. Suggestions involving converting pointers to integers run into other undefined behavior and the optional status of the required data type.

Since common sense has not sufficed, both the compiler development groups and the Standard need an intervention. The decision to ignore the guideline is easily repaired, mostly by removing a paragraph mark – which, in the arcane world of ISO standards, transforms a “non-normative” guideline into a mandatory rule.

undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements other than that: Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). It is not acceptable to change the semantics of correct program structures due to undefined behavior elsewhere.

I’d add the following note as well

NOTE: Undefined behavior, particularly due to non-portable constructs, is a normal component of C programs. Undefined behavior due to errors that the translator can detect should produce a diagnostic message.

Undefined behavior is, currently, unavoidable in the C standard

One of the common arguments made by some of the compiler developers who champion BTG transformations is based on a false dichotomy: that without license for BTG transformations, compilers would have to provide an intrusive run-time that guaranteed predictable results and the Standard would have to spell out every possible behavior. This speaks to a fundamental misunderstanding of the purpose and use of C.

How can the compiler distinguish between an erroneous read of uninitialized memory, which will produce unspecified results (or even a trap on some architectures), and a syntactically identical chunk of code that reads a memory-mapped device control register which cannot be written? The designers of C solved the problem by leaving it up to the programmer and the execution environment: the compiler produces a load operation, and the semantics are not its problem – they are “undefined behavior”.

How can the compiler or standard distinguish between a mistaken use of an integer pointer to access a floating point number and a numerical methods expert’s shortcut divide by 2 which involves changing the binary representation of the exponent? That’s not the responsibility of the compiler.

How can the compiler tell the difference between a null pointer dereference in user space where it will, usually, cause a trap on memory mapped systems, and one where the zero page is accessible memory? It cannot: the programmer and operating system can handle it. The best the compiler can do is warn.

It would be nice if the compilers could warn about possible errors and if the Standard provided opt-in methods of locking down access methods and preventing common errors like array-boundary and type violations. C developers increasingly rely on sophisticated static analysis tools and test frameworks to detect common errors, and have embraced methodologies to limit the kinds of haphazard programming that were common earlier. But all of that is very different from retroactive, silent language changes based on opaque and often ill-considered rules.

The developers of the C Standard were forced to add a hurried character-pointer exception to their type rules when they discovered they had inadvertently made it impossible to write the fundamental memcpy function in C. The exception makes the type rules ineffective while still being cumbersome (Brian Kernighan made exactly this criticism of Pascal decades ago). And it certainly did not fix the problem. For example, the Standard only makes malloc/free/calloc and similar functions work within the type system by treating them as special exceptions to the type rules (in a footnote, no less), and apparently nobody realized that those functions could not then be implemented in Standard C. There is no excuse for compilers to punitively enforce such nonsense through BTG transformations.

Clang architect Chris Lattner seems to believe there is nothing the compiler developers can do:

The important and scary thing to realize is that just about *any* optimization based on undefined behavior can start being triggered on buggy code at any time in the future. Inlining, loop unrolling, memory promotion and other optimizations will keep getting better, and a significant part of their reason for existing is to expose secondary optimizations like the ones above. To me, this is deeply dissatisfying, partially because the compiler inevitably ends up getting blamed, but also because it means that huge bodies of C code are land mines just waiting to explode.

Lattner is absolutely correct about the “land mines” which this approach to “optimization” produces, but he is wrong to imply that the compilers are required to adopt the BTG approach to undefined behavior, or that code triggering undefined behavior is necessarily “buggy”. C compilers could optimize on undefined behavior, as they always have, without violating the guideline suggested in the standard – as Intel’s ICC still does. The choice to go beyond the guideline is an engineering choice.

The invisible advantages of BTG optimizations

Some of the defenders of BTG optimizations argue that code producing undefined behavior is not actually C code and so doesn’t have a meaning. That kind of moronic legalism has no support in the history of the language, the behavior of compilers (which have no trouble compiling and producing expected behavior from the same code prior to “optimization”), or the language of the standard – it’s just a pompous way of excusing crappy engineering. The C Standard is a train wreck on its own, but it does not mandate the bizarre, paradoxical undefined behavior GCC and Clang are choosing to haphazardly implement.

The supposed motivation for BTG transformations is that they enable optimizations. There are, however, only a bunch of dubious anecdotes to support the claim of great optimizations – not a single example of careful measurement. And there are studies that show the opposite. Here’s the justification from one of the LLVM developers:

This behavior enables an analysis known as “Type-Based Alias Analysis” (TBAA) which is used by a broad range of memory access optimizations in the compiler, and can significantly improve performance of the generated code. For example, this rule allows clang to optimize this function:

float *P;
void zero_array() {
    int i;
    for (i = 0; i < 10000; ++i)
        P[i] = 0.0f;
}

into “memset(P, 0, 40000)”. This optimization also allows many loads to be hoisted out of loops, common subexpressions to be eliminated, etc. This class of undefined behavior can be disabled by passing the -fno-strict-aliasing flag, which disallows this analysis. When this flag is passed, Clang is required to compile this loop into 10000 4-byte stores (which is several times slower), because it has to assume that it is possible for any of the stores to change the value of P, as in something like this:

int main() {
    P = (float*)&P;   // cast causes TBAA violation in zero_array.
    zero_array();
}

This sort of type abuse is pretty uncommon, which is why the standard committee decided that the significant performance wins were worth the unexpected result for “reasonable” type casts.

This is a ridiculously contrived example, of course. The program, as written, makes no sense and will crash on any conceivable system. If not set up to fail, any half-decent C programmer could have used a local copy of the pointer (in which case -fno-strict-aliasing has no effect) or called memset directly. These are the kinds of things that C programmers find and fix using profiling. Oddly, the compiler misses the big optimization, which is to do nothing – since nothing is done with the array before the program exits. So here we have the usual elements of an example of why C compilers supposedly need to make “unexpected” BTG results out of undefined behavior:

1. a minor or unnecessary optimization
2. of a contrived example that shows nothing about actual applications
3. that is possibly an undefined-behavior optimization but not a BTG transformation
4. that is accompanied by absolutely zero in the form of measurement
5. that ignores actual optimization possibilities.

A talk on this topic by Google’s Clang expert, Chandler Carruth, has an even worse example: an array-indexing example that has all of the first four characteristics above, plus the bonus that, if you look at the generated code, the supposedly much better form taking advantage of undefined behavior for optimization is not really any better than the original. He also explains how the compiler cannot determine whether programs are correct (!) and makes an unfortunate analogy between compiler behavior and APIs. Imagine if operating systems, for example, were to caution developers that “anything can happen” if the parameters passed to an OS system call are not as expected. Programming languages should minimize surprise.

In the real world, the justification for BTG transformations is convenience for the compilers – which are increasingly driven not only by artificial benchmarks, but by their support for C++ and Java, languages that have very different semantics. Operating on intermediate code that has been stripped of C-specific semantics, it is difficult for the compilers to safely apply optimization passes without making dangerous C transformations. Rather than addressing the weakness of this internal representation, it’s easier to find loopholes that allow the customary semantics to be ignored.

C is not Java

Here’s something more from Lattner.

Violating Type Rules: It is undefined behavior to cast an int* to a float* and dereference it (accessing the “int” as if it were a “float”). C requires that these sorts of type conversions happen through memcpy: using pointer casts is not correct and undefined behavior results. The rules for this are quite nuanced and I don’t want to go into the details here (there is an exception for char*, vectors have special properties, unions change things, etc). This behavior enables an analysis known as “Type-Based Alias Analysis” (TBAA) which is used by a broad range of memory access optimizations in the compiler, and can significantly improve performance of the generated code.

To me, “The rules for this are quite nuanced and I don’t want to go into the details here (there is an exception for char*, vectors have special properties, unions change things, etc)” means “we mucked up the standard and we are going to cause many systems to fail as these nuanced rules confuse and surprise otherwise careful and highly expert programmers”. Unexpected results are a catastrophe for a C programmer. Limitations on compiler optimizations based on second-guessing the programmer are not catastrophes (and nothing prevents compiler writers from adding suggestions about optimizations). Programmers who want the compiler to optimize their algorithms using clever transformations should use programming languages that are better suited to large-scale compiler transformations, where type information is a clear indication of purpose. As Chris Lattner notes:

It is worth pointing out that Java gets the benefits of type-based optimizations without these drawbacks because it doesn’t have unsafe pointer casting in the language at all.

Java hides the memory model from the programmer to make programming safer and also to permit compilers to do clever transformations, because the programmer is not permitted to interact with the low-level system. Optimization strategies for C compilers should take into account that C does not put the programmer inside a nice abstract machine. The C compiler doesn’t need to force C to become a fifth-rate Pascal or Swift. Unexpected results are a far more serious problem for C than missed minor optimizations.