Abstractions For Binary Search, Part 10: Putting It All Together

After the discussions of the last few weeks, we are finally ready to build our test cases.

We shall begin by writing code to implement the strategy for choosing test cases that we discussed a few weeks ago. In order to verify that our code searches an n-element array correctly, we must create every possible n-element array in which each element other than the first differs from its predecessor by either 0 or 2. Once we have chosen an element's value, the following element has two possible values. Therefore, there are 2^(n-1) possible relevant n-element arrays to test.
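To make the counting argument concrete, here is a sketch of a helper (hypothetical, not part of the series' code) that materializes all 2^(n-1) relevant arrays by counting up rather than down. For n = 3 it produces the four arrays 1 1 1, 1 3 3, 1 1 3, and 1 3 5.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: enumerate every "relevant" n-element array, i.e.
// one that starts at 1 and in which each later element equals its
// predecessor or its predecessor plus 2.  Each bit of the counter
// selects which of the two possibilities the corresponding element takes.
std::vector<std::vector<int>> relevant_arrays(unsigned n)
{
    std::vector<std::vector<int>> result;
    unsigned long count = (n == 0) ? 1UL : 1UL << (n - 1);
    for (unsigned long bits = 0; bits != count; ++bits) {
        std::vector<int> a(n);
        int v = 1;
        for (unsigned j = 0; j != n; ++j) {
            a[j] = v;
            if (bits & (1UL << j))
                v += 2;
        }
        result.push_back(a);
    }
    return result;
}
```

Note that for n = 0 there is exactly one relevant array, the empty one, which is why the count is 1 rather than 2^(0-1).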

We shall assume that an unsigned long has at least n bits, and leave it as an exercise to generalize this code to arrays that require more bits than will fit in an unsigned long. With that assumption, we can count through every possible (n-1)-bit number and use each bit in these numbers to cover all the possibilities for relevant arrays. If we assume that array has at least n elements, we can start by writing

unsigned long bits = (n == 0) ? 1UL : 1UL << (n - 1);

Here, we have made bits equal to 2^(n-1) except in the case where n is zero; we shall count down from there to zero. The loop looks like this:

do {
    --bits;
    unsigned i = 1;
    for (unsigned j = 0; j != n; ++j) {
        array[j] = i;
        if (bits & (1UL << j))
            i += 2;
    }
    for (unsigned x = 0; x <= i + 1; ++x) {
        assert(binary_search(x, array, n) ==
               linear_search(x, array, n));
    }
} while (bits != 0);

We decrement bits at the beginning of the loop because the first time through, all the bits we care about are zero. Each time through the outer loop, we create a new n-element array by starting with 1, then making each subsequent array element equal to either the previous element or that element plus 2, depending on the value of one of the bits in bits.

Once we have this array, we search for every value in the array, as well as for a value smaller than the smallest array element, one larger than the largest element, and one in between every pair of unequal adjacent elements. Because adjacent pairs of elements differ either by 0 or by 2, we can cover the appropriate range by starting with 0 and ending with (and including) a value one greater than the last element of the array. So, for example, if we are testing four-element arrays, and the array that we are currently using is 1 1 3 3, it is sufficient to search for 0, 1, 2, 3, and 4.
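For readers who want to run the loop end to end, here is a self-contained sketch. The linear_search and binary_search below are minimal stand-ins, not the instrumented versions developed earlier in the series; both return the number of elements less than the sought value, which makes their results directly comparable. The test_searches wrapper and the fixed 64-element buffer are assumptions made for this sketch.

```cpp
#include <cassert>

// Minimal stand-in: count elements of array[0..n-1] that are < x
// by scanning left to right.
unsigned linear_search(int x, const int array[], unsigned n)
{
    unsigned i = 0;
    while (i != n && array[i] < x)
        ++i;
    return i;
}

// Minimal stand-in: the same count, found by repeated halving.
unsigned binary_search(int x, const int array[], unsigned n)
{
    unsigned begin = 0, end = n;
    while (begin != end) {
        unsigned mid = begin + (end - begin) / 2;
        if (array[mid] < x)
            begin = mid + 1;
        else
            end = mid;
    }
    return begin;
}

// Exercise every relevant n-element array, searching for every value
// from 0 through one more than the last element, as described in the
// text.  Assumes n is at most 64 and at most the number of bits in
// an unsigned long.
void test_searches(unsigned n)
{
    int array[64];
    unsigned long bits = (n == 0) ? 1UL : 1UL << (n - 1);
    do {
        --bits;
        int i = 1;
        for (unsigned j = 0; j != n; ++j) {
            array[j] = i;
            if (bits & (1UL << j))
                i += 2;
        }
        for (int x = 0; x <= i + 1; ++x)
            assert(binary_search(x, array, n) ==
                   linear_search(x, array, n));
    } while (bits != 0);
}
```

Calling test_searches(4) runs the 1 1 3 3 example above, along with the other seven relevant four-element arrays.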

We have already seen the code for linear_search and the instrumented code for binary_search, so we're pretty much done. However, this code leaves open several general questions about testing that deserve careful consideration:

Are assert statements really appropriate here? What's nice about an assert statement is that if it fails, most C++ implementations produce a diagnostic message that makes it clear precisely where the failure is. If you are expecting the code to work correctly under ordinary circumstances, this behavior of assert statements makes it unnecessary to devise a comprehensible diagnostic message on one's own. What's not so nice is that if the preprocessor variable NDEBUG is defined, assert statements have no effect. This property of assert statements means that they are not a reliable way of ensuring that tests are actually run; the tester must ensure that NDEBUG is not defined.

We made several assumptions while we were writing this code. For example, we did not show any tests for the code that we just developed. Of course, we can inspect how it behaves in simple test cases and ensure that it does what we expect. Moreover, if all of our test cases do work, and they consist of performing what we expect to be the same computation in two different ways, then we are more confident that the code does what we expect than we would be if we did not test it. Nevertheless, I think that it is easy not to pay enough attention to the question of how to test code that we write for testing purposes.
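A minimal illustration of the hazard: when NDEBUG is defined before <cassert> is included, each assert expands to nothing and a "passing" test run may have checked nothing at all. One defensive alternative, sketched below with a hypothetical check function (an assumption of this sketch, not code from the series), is a test helper that does not depend on NDEBUG.

```cpp
#include <cstdio>
#include <cstdlib>

// A check that always runs, regardless of whether NDEBUG is defined.
// On failure it reports which check failed and aborts, much as a
// failing assert would.
void check(bool ok, const char* what)
{
    if (!ok) {
        std::fprintf(stderr, "test failed: %s\n", what);
        std::abort();
    }
}
```

The tradeoff is that check cannot report the file and line number automatically the way the assert macro does, though a small macro wrapper around it could restore that.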

We did not add code to the binary_search function to verify that its input array is correctly sorted. Our rationale is that the arrays are constructed that way, so it should not be necessary to test. But suppose we stay skeptical — suppose we take the view that even though we may have thought we constructed a properly sorted test case, we might not have done so. In that case, why are we willing to accept uncritically that choosing arrays that consist only of odd integers is good enough? Perhaps we should be testing our code on arrays of other types as well, or on other integer values as well. In general, it's hard to know how much time we should spend devising tests for conditions that "can't happen."
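If we did want to be skeptical on this point, the check is cheap. The following is_relevant_array helper is hypothetical, not part of the series' code; it verifies that a generated array really has the promised shape: it starts at 1, and each successive difference is 0 or 2 (which implies that it is sorted).

```cpp
#include <cstddef>
#include <vector>

// Verify the structural property the test generator is supposed to
// guarantee: the first element is 1, and every adjacent pair of
// elements differs by exactly 0 or 2.
bool is_relevant_array(const std::vector<int>& a)
{
    if (a.empty())
        return true;
    if (a[0] != 1)
        return false;
    for (std::size_t j = 1; j != a.size(); ++j) {
        int d = a[j] - a[j - 1];
        if (d != 0 && d != 2)
            return false;
    }
    return true;
}
```

Running this inside the outer test loop, just after the array is built, would turn the "can't happen" assumption into a checked one at negligible cost.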

How far should we go toward testing the underlying implementation? I once had a piece of code fail horribly because a C++ implementation did not implement unsigned integer multiplication correctly: Instead of yielding a result that was correct modulo the word size, whenever an unsigned integer multiplication overflowed, the result would be zero. As another example, consider the hardware floating-point division bug that caused certain processors to give ever so slightly wrong answers to certain floating-point division problems. Or the bug that caused one processor I encountered to compute 2^256 + 2^(-256) as zero. Or, for that matter, the computer I once encountered that occasionally yielded double-precision floating-point results in which the low-order 32 bits were all zero, because the contacts on the circuit board that computed those bits had become corroded. If you assume that your implementation is always correct, you will get into trouble sooner or later. But if you can't assume that your implementation is correct, how do you test anything?
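We cannot test everything, but a few cheap sanity checks of the implementation itself are possible. As an illustration (a sketch written for this discussion, not code from the series): C++ guarantees that unsigned arithmetic wraps modulo 2^N, so squaring the largest unsigned long must yield exactly 1, since (2^N - 1)^2 = 2^(2N) - 2^(N+1) + 1, and the first two terms vanish modulo 2^N.

```cpp
#include <limits>

// A tiny sanity check of the implementation: unsigned arithmetic is
// required to wrap modulo 2^N.  On the broken implementation described
// in the text, which yielded zero on overflow, this function would
// have returned false.
bool unsigned_multiply_wraps()
{
    unsigned long big = std::numeric_limits<unsigned long>::max();
    return big * big == 1UL;     // (2^N - 1)^2 is congruent to 1 mod 2^N
}
```

Checks like this do not prove the implementation correct, of course; they merely catch a particular failure we have seen before.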

Part of the reason that testing is so tricky is that it is one of at least three interlocking parts of writing software that works. The first part is deciding what problem you are trying to solve. This decision sounds easy, but even people who are paying for software to be written do not always know exactly what they want. For example, in reviewing specifications for one system, I encountered a client who said that a particular variable would always have one of two values. I asked what the system should do if it has neither of those values, and she said that she didn't care: That state of affairs was impossible. But when I offered to make the code delete the entire customer database if that variable had an "impossible" value, she protested. Apparently the "impossible" state of affairs wasn't all that impossible. Being human, we do not always anticipate perfectly everything that might happen.

Once we know what problem to solve, the next problem is to state that knowledge precisely. Such statements are harder than they look. For example, an innocent-sounding phrase such as "the largest element in a sequence" can fail in several ways: The sequence might be empty, or its elements might have a type without a well-defined order relation, or it might contain some elements with values, such as floating-point NaN (not a number), that defy comparison even when ordinary values can be compared just fine.

Once we have a precise statement of the problem, we can translate that statement into code. And again we have problems, because there might be errors in the translation. Much of the thought behind software testing concentrates on this third part of writing software. However, I think that the first two parts are at least as important.

All three of these parts have one thing in common: They are abstractions. Remember that an abstraction is always a form of selective ignorance: We are ignoring certain parts of whatever it is we're thinking about in order to be able to reduce what's left to a form about which we can reason clearly. Every time we form an abstraction, we run the risk of ignoring something important. Perhaps the most important part of creating abstractions, therefore, is to understand what we are abstracting well enough that we can be sure that the abstraction that results is one that captures the parts of the whole that we actually care about at the moment.

-------------------------------------------------------------------------

It is time for me to bid you farewell. This is my last article for Dr. Dobb's, which is ceasing publication at the end of 2014. I've had great fun writing these articles, and interacting with readers.

I have every intention of continuing to write: I have several ideas for future projects, which I shall announce in due course. Meanwhile, I would like to invite you to follow me on Facebook or Twitter.