It’s hard to know when a piece of software has been sufficiently tested. Code coverage metrics are a partial answer. Basically, a coverage metric is a function that maps each execution to a set of coverage targets. For example, if we are performing function coverage, then the universe of targets is comprised of functions from the source code (or object code) and an execution covers every function from which it causes at least one statement or instruction to execute.

Can a bug hide from function coverage? This isn’t hard:

void foo (void) { if (!x) bug(); }

As long as foo() is called with x having a non-zero value, the function will be covered but the bug will not fire. We can fix this particular problem by tightening up the coverage metric, for example by insisting on statement coverage, where each statement in the program is a separate coverage target.

Let’s talk now about branch coverage, which insists that every conditional branch executes both ways. This is very similar to statement coverage except that it fixes a major hole in statement coverage where you can get 100% coverage without executing the false condition of any conditional that lacks an else branch.

Branch coverage is interesting because it’s about the strongest coverage metric that makes sense in practice. We can easily define metrics that have more coverage targets, such a path coverage, but achieving full coverage is extremely difficult and also it forces us to confront hard questions about the feasibility of paths. Here’s a page that provides a quick overview of the common and some no-so-common coverage metrics.

The rest of this piece is a partial list of ways that bugs can hide from branch coverage, and also some discussion of how we might go about finding these bugs.

Bug-Hiding Strategy #1: Concurrency

Hiding a bug using multithreading is really easy. For example:

void called_by_thread_1 () { p = NULL; ... do something quick ... p = q; ... do something slow ... } int called_by_thread_2 () { return *p; }

Assume that p is an atomic variable and that q is never NULL. We can easily get full branch coverage without testing the case where called_by_thread_2() dereferences a null pointer, and in fact that odds are this is what will happen. Concurrent programs want specialized coverage metrics; here’s a paper that I like.

Bug-Hiding Strategy #2: Math Overflow

This code looks innocent:

int average (int x, int y) { return (x+y)/2; }

But unfortunately, on a platform with 32-bit ints, average (1000000000, 2000000000) returns -647483648, despite the fact that the actual result is perfectly representable in 32-bit two’s complement. The problem is that the intermediate value x+y overflows. This is undefined behavior in C/C++. In practice we expect wrapping behavior from this kind of overflow, but even wrapping produces the wrong answer here.

Common test coverage metrics, including very difficult ones like path coverage, do not help us find math overflow bugs. One possibility would be to add overflow checks to the source code and then aim for branch coverage, but this is horrible. One could also add overflow checks using the compiler and then aim for branch coverage of the object code; this probably only makes sense in conjunction with test-case generation tools such as Klee.

The real solution to math overflow bugs is systematic documentation and enforcement of constraints on integer values, supported by a tool such as Frama-C. For example, the average() function would be preceded by this annotation:

/*@ requires x + y <= INT_MAX; */

Adding x and y inside the annotation does not cause overflow problems because integer operations in the annotation language are the mathematical ones, not the C ones. The problem with writing these annotations is that it is very hard work.

Bug-Hiding Strategy #3: Loop Error

There is much potential for hiding bugs inside loops:

int factorial (int n) { assert (n < 13); int f = 1; do { f *= n; n--; } while (n>0); return f; }

This code looks pretty good and, in fact, it returns the correct result for 0<n<13. The assertion protects against integer overflow on 32-bit machines. The remaining interesting case is n=0, for which this function is wrong: it returns 0 instead of 1. Branch coverage doesn’t help find this kind of bug.

Test coverage for loops is tricky; some sources recommend forcing each loop to execute 0 times, 1 time, and more than once, when possible.

Bug-Hiding Strategy #4: Lookup Table

Certain kinds of codes get a good speedup by precomputing some sub-results and storing these in a lookup table. This is commonly seen for checksums, math library functions, and crypto codes. For example, this AES implementation has several lookup tables. They cannot be verified by casual inspection. If any entry in such a table is incorrectly computed or gets corrupted somehow, the resulting code will be wrong and branch coverage again does not help find the problem. The famous Pentium fdiv bug was precisely this kind of error.

We might insist that every entry in each lookup table is accessed during testing, or we might simply double check that each table contains that proper values.

Bug-Hiding Strategy #5: Array Index

Code like this is pretty easy to write:

int update (int index, value_t value) { if (index > incorrect_table_size) return ERROR; table[index] = value; return OK; }

Here we’ll need to test the OK and ERROR conditions, but if we’re comparing against the wrong table size, RAM corruption can result from code that received 100% branch coverage. The table size is unlikely to be wrong for a statically sized table, but I’ve seen real programs where a piece of code failed to properly notice when a dynamically sized table changed size.

Bug-Hiding Strategy #6: Path-Sensitive Conditionals

This code is highly contrived, but it illustrates the problem:

int silly (int a) { int *p = &x; if (a > 10) p = NULL; if (a < 20) return *p; return 0; }

Assume x is a global int. If we test this code with parameter values of 5 and 25, branch coverage is satisfied, and yet we will not trigger the undefined behavior that happens when a==15. Besides going for path coverage, I don’t know of a good way to find these bugs.

Bug-Hiding Strategy #7: Heap Shape

Tons of problems can hide in heap data structures. For example, assume that we have a red-black tree implementation where the delete() operation leaves the tree in a broken state where subsequent insert() calls will fail. We could easily end up with 100% branch coverage of insert(), delete(), and the other operations without even actually testing the case where a node is deleted and then inserted.

Conclusions

100% branch coverage is often attainable at the unit level, but almost never is at the system level. Regardless, it is insufficient for finding any number of kinds of bugs (the list here is just off the top of my head; if you have any favorites that I missed, please let me know).

There’s no general-purpose way to find these bugs, but rather a variety of partial solutions. First, we can pay close attention to idioms such as lookup tables that can subvert branch coverage. Second, we can add extra paths to the code in an attempt to make implicit predicates explicit, such as integer overflow checks. Third, we can play with the coverage metric, for example by augmenting branch coverage with additional loop coverage targets. Finally, we can add annotations to our code to express necessary preconditions, postconditions, and loop invariants, and then use a tool to check these.