A discussion of undefined behaviour and compiler optimisation, particularly in regards to signed integer overflow.

C (and C++) compilers are becoming notorious for exploiting the notion of undefined behaviour – the idea that certain things a program might do have no behaviour proscribed by the language standard, and that the compiler can assume the program doesn’t do these things when it is generating object code. Quite a few people have been objecting to this, since it can result in the generated code not doing what the programmer intended; the problem is becoming more noticeable over time, as compilers introduce more sophisticated optimisation techniques which are more likely to exploit the notion.

One prominent example is that of signed integer overflow. Most C programmers are developing for machines which use a 2’s complement representation of integers; addition and subtraction, with such a representation, is implemented in exactly the same way as for unsigned arithmetic. If the addition of two positive signed integers overflows – that is, if the result is larger than can be represented – the processor will produce a number that, when interpreted as a 2’s complement signed integer, will appear to be negative. This is called “wrapping” because the value has “wrapped around” from the high end of the numeric range to the low end.

For this reason, you occasionally see C code that looks something like this:

int b = a + 1000; if (b < a) { // overflow puts("input too large!"); return; }

The “if” statement is designed to detect the overflow condition (in this case from adding 1000 to the value from the variable ‘a’) and report an error. The problem is that, in C, signed integer overflow is one case of undefined behaviour. Compilers, for some time now, have performed an analysis which shows that the condition can never be true: if I add 1000 (or any positive number) to another value, the result cannot be smaller than the original value; if overflow occurred, that is undefined behaviour, and it is the programmer’s responsibility (arguably) to ensure that their program never exhibits such behaviour. Therefore, the compiler may decide that the entire if statement can be removed as an optimisation (it can never be true, it can never have an effect, it may as well not be there).

The problem with this compiler optimisation, in this case, is that it has removed the test that the programmer specifically used in an attempt to detect the overflow situation and handle it. An example of this with a real compiler can be seen here. (Side note: the godbolt.org site on which that example is hosted is great! you can edit the code and see the compiled form with a wide range of compilers. Play with it!). Observe that the overflow check is not removed if the type is changed to an unsigned integer, since unsigned overflow has defined behaviour in C (or rather, more accurately, unsigned arithmetic is defined to wrap and thus the overflow does not actually occur).

So is this wrong? Some have argued that it is, though it’s clear that many compiler vendors feel that it’s legitimate. The main arguments made by proponents of (edit: implementation-defined)wrapping overflow behaviour, if I understand them correctly, boil down to variants of the following:

Wrapping on overflow is a useful behaviour.

Wrapping is the behaviour expected by programmers.

Exploiting undefined behaviour semantics on overflow gives no significant benefit.

The C language standard, in regards to undefined behaviour, gives license for implementations “ignoring the situation completely, with unpredictable results”, but this doesn’t allow optimisations to assume that the situation for which the undefined behaviour is proscribed will not come about.

Let’s look at these one by one:

Wrapping on overflow is a useful behaviour?

The main utility for a wrapping behaviour is to be able to detect overflow after it occurs. (If there are other uses, that could not be handled using unsigned integers instead, I am not immediately unable to think of any and suspect they are rare). While this would indeed simplify the problem of avoiding the use of erroneously overflowed results, it certainly doesn’t help in all cases (consider multiplication, or addition of two unknown quantities with unknown sign).

For the trivial case where wrapping behaviour does allow simply detecting overflow after it occurs, it is also straightforward to determine whether overflow would occur, before it actually does so. The example above can be rewritten as follows:

if (a > INT_MAX - 1000) { // would overflow puts("input too large!"); return; } int b = a + 1000;

That is, you can perform a check to see whether the result of an addition will exceed the maximum representable value, rather than performing the addition and then trying to determine whether that overflow occurred by checking if the result is mathematically inconsistent. (If the sign of both operands is unknown, the check becomes significantly more complicated, but this is also true when checking for overflow after the operation with wrapping overflow semantics).

With this in mind, I’m not really convinced that wrapping overflow is generally useful.

Wrapping is the behaviour expected by programmers?

It’s more difficult to argue against this point, since clearly at least some C programmers have written code which expects wrapping semantics for signed integer overflow. However, I don’t think that this alone is a strong argument for implementing wrapping semantics by default (note that several compilers implement options for wrapping overflow, if it really is desired).

An obvious mitigation for the problem of programmers expecting this particular behaviour is for the compiler to issue a warning when it optimises based on the alternative undefined-behaviour-is-assumed-not-to-occur semantics. Unfortunately as we see in the godbolt.org link above, compilers don’t always do so (Gcc 7.3 does but 8.1 does not, so this appears to be a regression).

Exploiting undefined behaviour semantics on overflow gives no significant benefit?

If true in all cases this would be a compelling argument for having compilers default to wrap-on-overflow, since it is probably better to allow the “detect overflow after it occurs” mechanism described above to work even if it is technically incorrect – if only because that mechanism may be in use in code which is arguably broken.

I suspect that with typical C programs the benefit of this particular optimisation (removing checks for mathematically impossible conditions) is usually negligible, because C programs tend to be written by programmers who are seeking good performance and who tend to hand-optimise their code anyway: that is, if it’s obvious that particular “if” statement has a condition that can never be true, the programmer would likely have removed the statement themselves. Indeed, a search reveals a few studies where the effectiveness of this optimisation has been questioned, tested, and found to be mostly insignificant for the particular benchmarks under test. However, while in many cases there is no benefit for C, the code generation engines and optimisers in compilers are commonly general and could be used for other languages where the same might not be so generally true; consider C++, where it is somewhat idiomatic in templated code to rely on the optimiser from removing redundant code, rather than doing it manually. There is also the case of languages being transpiled to C and relying on the C compiler to optimise away redundant code.

Also, even without overflow check elimination, it is not necessarily correct to assume that wrapping integers has minimal direct cost even on machines which use 2’s complement representation. The Mips architecture, for example, can perform arithmetic operations only in registers, which are fixed size (32 bit). A “short int” is generally 16 bits and a “char” is 8 bits; if assigned to a register, the underlying width of a variable with one of these types will expand, and forcing it to wrap according to the limit of the declared type would require at least one additional operation and possibly the use of an additional register (to contain an appropriate bitmask). I have to admit that it’s been a while since I’ve had exposure to any Mips code and so I’m a little fuzzy on the precise cost involved, but I’m certain it is non-zero and other RISC architectures may well have similar issues.

The language standard does not allow for signed integer overflow not to wrap, if that’s what the underlying architecture does?

This argument is particularly weak when examined. It essentially states that there is a requirement that “undefined behaviour” actually grants only limited license to the implementation (compiler), by the text of the standard. What the text that proponents latch on to says precisely is the following, as part of the definition of undefined behaviour:

NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, …

The claim is that “ignoring the situation completely” would not allow for assuming that the situation leading to the undefined behaviour – overflowing addition, for example – could not happen, but rather that, if it does happen, the implementation must carry on as if it did not happen but must respect the result it would obtain from asking the processor to perform such an operation (putting it another way: as if the translation from source to machine code was direct and naive).

First, we should observe that this text is in a NOTE and therefore non-normative (may not proscribe behaviour), according to the ISO directive mentioned in the foreword of the same document:

In accordance with Part 3 of the ISO/IEC Directives, this foreword, the introduction, notes, footnot es, and examples are also for information only.

Given that the “possible undefined behaviour” appears in such a note, it is not proscriptive. Note that the actual definition text for “undefined behavior” reads:

behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.

I have added emphasis on the important part: there are no requirements for undefined behaviour; the list of “possible undefined behaviors” in the note contains merely examples and cannot be definitive. “Imposes no requirements” is unambiguous.

Some extend the argument to say that, regardless of what the text actually says, the intention of the language committee when those words were drafted was that the behaviour should in general match that of the underlying hardware, as closely as possible, assuming a naive translation to machine code. This may be true, though I’ve not seen any evidence (such as historical text) that supports it. Even if it were true, however, it would not necessarily apply to the current incarnation of the text.

Final thoughts

The arguments for wrapping on overflow are mostly flawed. Probably the strongest argument that can be made is a combination: it is occasionally expected by less experienced programmers (who do not understand the nuances of C and of its undefined behaviour), and is not particularly harmful to performance – however, the latter is not true in all cases, and the former is a somewhat dubious reasoning when considered by itself.

Personally, I feel that I would much rather have trap on overflow than wrap. That is, I would rather that a program crash instead of continuing with either undefined behaviour or a potentially incorrect value, either of which could be a security issue. This would certainly have a slight performance impact on most(?) architectures, particularly x86, but on the other hand it would immediately flag overflow bugs rather than allowing them to be exploited or to produce incorrect results further down the line. It could also in theory allow the compiler to safely remove redundant comparisons following a potential overflow, because it ensures that they really can’t happen, though I note that Clang and Gcc both apparently fail to take advantage of this.

Fortunately, both trapping and wrapping options are implemented by the compiler I use most often, which is Gcc. The “-ftrapv” and “-fwrapv” command line arguments can be used to enable each respectively.

There are of course a number of other causes of undefined behaviour; integer overflow is only one. I don’t necessarily think that all of these are useful and I do think that there are plenty of specific cases where the semantics should be defined by the language, or at least classified as implementation-defined. And I’m wary of compiler vendors being too liberal in their interpretation: if the compiler behaves in ways that are counter-intuitive, especially for someone who has read the language specification themselves, there is always the risk of real software bugs resulting; if the opportunities for optimisation that such an interpretation opens up are negligible, it is hardly worthwhile to adopt it. An examination of some issues around this area may be the topic of a future post.

Addendum (24 Aug 2018)

I’ve realised that much of the above could be better written. To briefly summarise, clarify, and add some minor points:

I was not trying to argue that undefined behaviour on overflow is preferable to wrapping, but rather that wrapping is not much better than undefined behaviour on overflow in practice. In particular, you can get security issues from wrapping behavior in much the same way as you can with undefined behaviour – and I’d argue that many security issues resulting from unchecked integer overflow, other than those which come from the compiler removing erroneous post-overflow checks, actually come from the fact that the value has wrapped around rather than any other undefined behaviour associated with the overflow.

The only real benefit of wrap-on-overflow is that it doesn’t cause post-overflow checks to be removed. While that might eliminate some attack vectors, it leaves open the possibility that some overflows won’t be checked for at all (i.e. the programmer did not include an overflow check) and will be uncaught.

If security is not a concern but speed of execution is, undefined behaviour on overflow may allow better optimisation possibilities and provide a performance benefit, at least in some cases. On the other hand if security is a concern, wrap-on-overflow potentially leaves holes open.

This means that between trap-on-overflow, wrap-on-overflow, and undefined-behaviour-on-overflow, I see very few cases where wrap-on-overflow should be the preferred choice.

In regards to post-overflow checks, I have concerns that leaving them in place could lead to the wrong perception (that post-overflow checks work and are guaranteed to work). Trap-on-overflow avoids this problem. Good warnings help to alleviate it.

I think that any programmer writing security-sensitive code will ideally have a good grasp of the semantics of, and the potential pitfalls of, the language they are writing in. For C, this means understanding the overflow semantics and nuances of undefined behaviour. It’s unfortunate that some C programmers still don’t seem to have that level of understanding.

I have seen a claim that “most C programmers expect wrapping behaviour”, but I’m not aware of any evidence that this is true. (I’ve said “some” above since I’ve seen anecdotal evidence and I doubt this would be disputed anyway).

There are two separate issues: one is what the C language standard should require, and another is what the compilers should implement. I’m (somewhat) ok with the language standard specifying undefined behaviour for overflow. This post is arguing for what the behaviour of compilers should be.

Trap-on-overflow need not require that every operation is checked for overflow; ideally it would only mandate that the program either behaves in a mathematically consistent manner or aborts, which allows for “temporary overflow” that doesn’t generate an incorrect result. This would allow optimising “a + b – b” to “a” (which wrapping also does) and “(a * b) / b” to “a” (which wrapping doesn’t).

Addendum (21 Nov 2018)

It turns out that the GCC implementation (circa 8.2) of -ftrapv is terrible – it doesn’t detect all overflow cases, and it causes a large performance degradation due to farming out most arithmetic operations to support functions (i.e. it imposes a function call overhead for a simple addition). The Clang/LLVM implementation is much better.