Additional discussion: Hacker News, Reddit

I ran into a really fun bug at work yesterday, where I discovered that my C program was branching down logically inconsistent code paths. After drinking another cup of coffee and firing up GDB I realized that somehow, a boolean variable in my code was simultaneously testing as both true and not true.

While I cannot reproduce the actual source code here, the effect was that code like

bool p ; /* ... */ if ( p ) puts ( "p is true" ); if ( ! p ) puts ( "p is false" );

would produce the output:

p is true p is false

So what’s going on here?

Well it turns out that the authors of the C language specification (and the people who went on to implement compilers for it) were serious about the concept of undefined behavior. In particular, the result of attempting to use an uninitialized variable is undefined.

And in this case that’s exactly what happened: I had failed to properly initialize some memory. Easy, bug fixed. But what I think is interesting are the reasons this code failed in precisely the way that it did. In order to investigate that, we need to get specific.

On 64-bit Linux (Ubuntu 12.04), compiling the following program:

#include <stdio.h> #include <stdbool.h> int main ( int argc , char * argv []) { volatile bool p ; if ( p ) puts ( "p is true" ); else puts ( "p is not true" ); if ( ! p ) puts ( "p is false" ); else puts ( "p is not false" ); return 0 ; }

with GCC 4.6.3, using the command line:

$ gcc bool1.c -g0 -O0 -fno-dwarf2-cfi-asm -masm=intel -S -o bool1.asm

produces this (truncated) assembly language:

.file "bool1.c" .intel_syntax noprefix .section .rodata .LC0: .string "p is true" .LC1: .string "p is not true" .LC2: .string "p is false" .LC3: .string "p is not false" .text .globl main .type main , @ function main: .LFB0: push rbp .LCFI0: mov rbp , rsp .LCFI1: sub rsp , 32 .LCFI2: mov DWORD PTR [ rbp - 20 ], edi mov QWORD PTR [ rbp - 32 ], rsi movzx eax , BYTE PTR [ rbp - 1 ] test al , al je .L2 mov edi , OFFSET FLAT : .LC0 call puts jmp .L3 .L2: mov edi , OFFSET FLAT : .LC1 call puts .L3: movzx eax , BYTE PTR [ rbp - 1 ] xor eax , 1 test al , al je .L4 mov edi , OFFSET FLAT : .LC2 call puts jmp .L5 .L4: mov edi , OFFSET FLAT : .LC3 call puts .L5: mov eax , 0 leave .LCFI3: ret

To perform the test if ( p ) here, first the stack variable is loaded into a 32-bit register with movzx eax, BYTE PTR [rbp-1] , and then we use the instruction test al, al which sets the zero flag (ZF) if the lower eight bits of this value are zero. Next we execute the conditional jump je .L2 , which jumps to print “p is not true” if ZF was set; otherwise we don’t jump, and “p is true” gets printed instead.

Next let’s examine the second test, if ( ! p ) , at label .L3 . This starts out the same by loading the boolean variable into register eax , but notice how the negation is handled. Rather than reorder the jumps or use jne instead of je , the compiler explicitly negates the boolean by performing a bitwise exclusive-or: xor eax, 1 .

Normally this would be fine — a bool variable is only supposed to contain a value of zero or one, in which case its value can be negated by XOR with 1. When you cast to a bool at runtime, the compiler generates code to ensure only one or zero gets stored. For instance, the cast in this program:

#include <stdbool.h> volatile char c = 0xff ; volatile bool p ; int main ( int argc , char * argv []) { p = ( bool ) c ; return 0 ; }

is implemented as the following four instructions:

movzx eax , BYTE PTR c [ rip ] test al , al setne al mov BYTE PTR p [ rip ], al

wherein setne sets the register al to exactly 1 if the char contained any nonzero value, before saving the register’s 8-bit value to the boolean variable.

But the compiler affords us no such protection if we accidentally use an uninitialized value as a boolean. It doesn’t have to, it’s not the compiler’s responsibility; the result of using an uninitialized stack variable is undefined. And so if we somehow wind up with a value of e.g. 0x60 stored at the address of a bool variable (as I saw during my troubleshooting yesterday), both the variable and its negation (via exclusive or with 1) will be nonzero, and therefore test as true.

Interestingly, enabling optimization ( -O2 ) in GCC causes the compiler to factor out the XOR and instead reorder the jumps, meaning this program actually behaves more robustly under compiler optimization (for certain definitions of “robust” anyway):

.file "bool1.c" .intel_syntax noprefix .section .rodata.str1.1 , "aMS" , @ progbits , 1 .LC0: .string "p is true" .LC1: .string "p is not true" .LC2: .string "p is false" .LC3: .string "p is not false" .section .text.startup , "ax" , @ progbits .p2align 4 ,, 15 .globl main .type main , @ function main: .LFB22: sub rsp , 24 .LCFI0: movzx eax , BYTE PTR [ rsp + 15 ] test al , al je .L2 mov edi , OFFSET FLAT : .LC0 call puts .L3: movzx eax , BYTE PTR [ rsp + 15 ] test al , al je .L7 mov edi , OFFSET FLAT : .LC3 call puts .L5: xor eax , eax add rsp , 24 .LCFI1: ret