I’m struggling with my article numbering now :-)

Regular readers will have noticed that I am running a whole bunch of series in parallel on this blog: one reviewing programming books, one on the issue of simplicity in programming, one reviewing Series 5 of Doctor Who, a related one on Russell T Davies’s contributions, and one on learning Lisp. To my huge surprise, what started out as a review of Jon Bentley’s book Programming Pearls [amazon.com, amazon.co.uk] has led through a sequence of digressions and become a series of its own on binary search. This is part 4 of that series: when I’ve finished part 4, I will — at last! — progress to the actual review. But since part 4 is about the distinctly non-trivial subject of writing correct code, it will itself have to be composed of multiple sub-parts. So this article on invariants is both part 1 of the new writing-correct-code series and part 4a of the binary-search series. Hope that’s clear.

A few people have complained about the sushi pictures on this blog, so before I plough into the article proper, here is something completely different.

Tools for thinking about code

In previous articles, I’ve complained about using testing as a crutch (note: not complained about testing), and rather self-righteously claimed that people should think about their code. What exactly do I mean by that? In this mini-series, I want to draw attention to a few concepts that will be patronisingly familiar to most of you who have CS degrees, but which will hopefully be helpful to any avocational programmers hoping to tighten up their code. Anyone who attempted the binary-search challenge and didn’t get it right first time (modulo syntax errors and other such trivia) might find something of use here.

The three tools I have in mind are:

Invariants

Bound functions

Preconditions/postconditions

These are powerful concepts, and can be used at different degrees of formality and rigour. At one extreme, some computer-science researchers might try to use these concepts to mathematically prove the correctness of a piece of code — perhaps with the proof being an order of magnitude longer than the code. At the other extreme, a seasoned professional developer probably always has at least a fuzzy and informal notion of an invariant in the back of his head whenever he writes a loop. I think that we would often benefit from thinking more explicitly about invariants. But note that in this article I will go rather overboard for the purposes of explanation, and use a level of detail that I would never normally use for a program as trivial as a binary search. So: I am not suggesting that we should invest this level of analysis in every loop we ever write.

What is an invariant?

An invariant is a property that remains true throughout the execution of a piece of code. It’s a statement about the state of a program — primarily the values of variables — that is not allowed to become false. (If it does become false, then the code is wrong.) Choosing the correct invariant — one that properly expresses the intent of an algorithm — is a key part of the design of code (as opposed to the design of APIs); and ensuring that the invariant remains true is a key part of the actual coding. Roughly speaking, if your invariant properly expresses the intent of the algorithm, and if your code properly maintains the invariant, then that is enough for you to be confident that the code, if it terminates, yields the correct answer. (Ensuring that it does in fact terminate is the job of the bound function, which I will talk about next time.)

As usual, an ounce of example is worth a ton of explanation, so let’s go back to the problem of binary search, and see if we can build a routine whose correctness we’re confident of even before we run our tests.

A brief recap of linear and binary search

Here’s a quick reminder of what binary search is about. The goal is to find a value in an array. If the array contains one or more instances of the value, then the search function should return one of the indexes of the sought value, otherwise it should return -1. (To help us concentrate on the algorithm, we’ll assume that the values are all integers — we could do this with templates and generics and whatnot, but for our current purposes that would only obscure the algorithmic issues that we want to concentrate on.)

In the general case of searching in an array, you can’t do better than the trivial linear search code (expressed in C; the code is pretty much identical in C++ or Java). Here we call the array a, it has size elements, and the sought value is called val:

int linear_search(const int a[], const int size, const int val)
{
    int i;
    for (i = 0; i < size; i++)
        if (a[i] == val)
            return i;
    return -1;
}

This function takes, on average, size/2 probes to find an element that is present, and size probes to determine that one is absent. We say that it is O(n), which means simply that its run-time is proportional to n.

But if we are allowed to assume that the array is sorted (in ascending order) then we can do much better. We can use the binary search algorithm: here, we cut the search space in half at each step rather than reducing it by one. We guess that the middle element of the array might be equal to val; if it is, we’re done, otherwise we can narrow the search to either the top half of the array (if the mid-point is less than val) or the bottom half (otherwise). Then we probe the midpoint of the new, smaller, range, and continue in this fashion until we either find the element or discover that the remaining range is empty.

Because binary search cuts the search-space in half at each step, it converges on the desired value much more quickly than linear search — in log2(n) steps. For very small arrays that difference is negligible; but when searching in an array of a million elements, binary search needs 20 probes rather than 500,000 on average for a linear search.
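That figure of 20 is easy to check for yourself: a range of n elements is emptied after at most floor(log2(n)) + 1 halvings. Here is a tiny helper of my own (not from the book) that simply counts the halvings:

```c
#include <assert.h>

/* Counts how many times a range of n elements can be halved before it
   becomes empty -- i.e. the worst-case number of probes binary search
   needs.  For n >= 1 this is floor(log2(n)) + 1. */
int max_probes(int n)
{
    int probes = 0;
    while (n > 0) {
        n /= 2;     /* each probe at worst halves the range */
        probes++;
    }
    return probes;
}
```

For a million elements this gives 20, matching the figure above; for a thousand it gives 10.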

What is the invariant for binary search?

In informal terms, the invariant for this algorithm is: “if the sought value is present in the array at all, then it is present in the current range”.

To make it useful for deriving correct code, though, we need to formalise that invariant in terms of specific variables and values. And before we can do this, we need to decide on the representation of the range under consideration. There are several candidate representations, none of them greatly better or worse than the others: we could keep track of the highest and lowest array indexes that might hold val, or the lowest index and the size of the range; or use asymmetric indexes, where we maintain the index to the base of the current range and the index that points past the end. For this version of the routine, I’m going to arbitrarily choose to represent the range by two variables, lower and upper, which contain the lowest and highest indexes into a that might contain val. With this representation, we can formalise the invariant as:

if val is at any position i in a, (i.e. a[i]==val) then lower <= i <= upper.

So long as we ensure this invariant is kept true, we can be confident that our code will not fail to find val if it’s present. (This doesn’t show that the program will terminate, but we’ll look at that next time.)

Armed with this invariant, we can write the code with some confidence that it’s right (and indeed this C function that I just wrote passed all tests first time):

int binary_search(const int a[], const int size, const int val)
{
    int lower = 0;
    int upper = size-1;

    /* invariant: if a[i]==val for any i, then lower <= i <= upper */
    while (lower <= upper) {
        int i = lower + (upper-lower)/2;
        if (val == a[i]) {
            return i;
        } else if (val < a[i]) {
            upper = i-1;
        } else {        /* val > a[i] */
            lower = i+1;
        }
    }
    return -1;
}
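While developing, the invariant can also be checked mechanically at run time. The sketch below is an instrumented variant of my own, not the version above: check_invariant scans the whole array, an O(n) cost per call, so this is strictly a testing aid, never production code.

```c
#include <assert.h>

/* Testing aid (a sketch, not the final version of the routine):
   verify that every position holding val lies within [lower, upper]. */
static int check_invariant(const int a[], int size, int val,
                           int lower, int upper)
{
    int i;
    for (i = 0; i < size; i++)
        if (a[i] == val && (i < lower || i > upper))
            return 0;   /* a position holding val escaped the range */
    return 1;
}

int binary_search_checked(const int a[], const int size, const int val)
{
    int lower = 0;
    int upper = size - 1;

    assert(check_invariant(a, size, val, lower, upper));
    while (lower <= upper) {
        int i = lower + (upper - lower) / 2;
        if (val == a[i])
            return i;
        else if (val < a[i])
            upper = i - 1;
        else                /* val > a[i] */
            lower = i + 1;
        /* the invariant must survive each narrowing of the range */
        assert(check_invariant(a, size, val, lower, upper));
    }
    return -1;
}
```

If a narrowing step ever excludes the sought value from the range, the assertion fires at the exact iteration where the reasoning went wrong.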

Informal proof of the code

Here is the reasoning that led to the code above, expressed pretty informally.

The first thing to do is establish the invariant that’s going to hold true for the rest of the function, so we set the variables lower and upper to appropriate values (i.e. the lowest and highest indexes within the whole of a).

We have now ensured that the invariant is true when we first enter the loop. To show that it stays true throughout the running of the function, we need to show that whenever it’s true at the top of the loop, it’s also true at the bottom.

The first statement of the loop (assigning to i) does not affect any of the variables referenced by the invariant, so it can’t possibly cause the invariant to stop being true. (Note that a and val can never be changed as we declared them const, so we only need to worry about lower and upper.)

What follows is a three-way IF statement: we need to show that each of the three branches maintains the invariant.

The first branch covers the case where we have found the sought element. At this point, we’re returning from the function (and therefore breaking out of the loop) so we don’t really care about the invariant any more; but for what it’s worth, it remains true, as we don’t change the values of lower or upper.

The second branch (val < a[i]) is the first time we need to use non-trivial reasoning. If we’re in this branch, we know that the condition guarding it was true, i.e. val < a[i]. But because a is sorted, we also know that for all j > i, a[j] >= a[i]. This means that val < a[j] for all j >= i, so the highest position val can occupy is i-1. Knowing this, we can set upper to i-1, and know that the invariant still holds good with the new, more restrictive value of upper. Notice what has happened here: we have shrunk the range to half its previous size or less, but simple reasoning about the code persuades us that we still know where val must be (if it’s anywhere in a). We are confident that we have not inadvertently excluded it from the range.

The third branch follows the same form as the second: since we know that val > a[i] and that a[i] >= a[j] for all j < i, we can conclude that the lowest position val can occupy is i+1, so we adjust lower accordingly and maintain the invariant.

Since we’ve verified that all three branches of the IF maintain the invariant, we know that the invariant holds on exiting that IF.

That means the invariant is true at the bottom of the loop, which means it will be true at the start of the next time around the loop.

And by induction we deduce that it always remains true.

Finally, we break out of the loop when the candidate range is empty, i.e. when it’s been shrunk to nothing, so that lower > upper. At this point, the invariant tells us that if val were at any position i in a, then lower <= i <= upper; since no index can satisfy that when lower > upper, val cannot be anywhere in a, and we return the out-of-band value -1.
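Incidentally, the reasoning above is not tied to the inclusive lower/upper representation we chose. The asymmetric representation mentioned earlier supports exactly the same style of argument; here is a sketch of my own (not the version derived above), where the loop condition and the narrowing steps change but the invariant plays the same role:

```c
#include <assert.h>

/* A sketch using the asymmetric ("half-open") representation: lower is
   the first index that might hold val, and upper points one past the
   last.  The invariant becomes:
   if a[i]==val for any i, then lower <= i < upper. */
int binary_search_halfopen(const int a[], const int size, const int val)
{
    int lower = 0;
    int upper = size;               /* one past the end of the array */

    while (lower < upper) {         /* i.e. while the range is non-empty */
        int i = lower + (upper - lower) / 2;
        if (val == a[i])
            return i;
        else if (val < a[i])
            upper = i;              /* val, if present, is below index i */
        else
            lower = i + 1;          /* val, if present, is above index i */
    }
    return -1;
}
```

Note how the second branch sets upper to i rather than i-1: with half-open bounds, index i is already excluded from the range. Each representation has its own off-by-one hazards, which is exactly why writing the invariant down first pays off.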

Does this sound like an awful lot of work? When it’s spelled out step by step in this way, yes; but in practice, you never need to go this slowly and carefully unless you’re writing avionics software, a life-support system’s firmware, or a Reinvigorated Programmer article. The point is not necessarily to go through the code in small pedantic steps like this, but to spend some time up front understanding what the invariant is, to keep that invariant in mind while coding, and to understand the resulting code to whatever depth seems appropriate.

It takes much longer to read about this (and much longer to write about it!) than it does to actually do it. Writing the C function above took two or three minutes; just being explicitly aware of the invariant was nine tenths of the battle.

This code is good

I’m going to claim that this code is objectively better than most of the solutions that were posted in response to the original challenge — better, even, than most of the correct solutions. [Yes, I realise that I am setting myself up for a really big fall if someone finds a bug in it!] Here’s why:

It’s short, which makes it easier to understand and maintain.

It has no special cases — arrays of length 0 and 1 are handled by the main code.

The single comment is genuinely informative.

In short, the code obeys the golden rule from The Elements of Programming Style: “Say what you mean simply and directly”.

Through the use of a simple but powerful technique, I was able to write compact, clear code in a short time; and be pretty confident, even before I tested it, that it was correct.

That last point is important not as some kind of macho posturing, but because it means that the tests, when I run them, are giving a second distinct line of sight on the problem. I’m not using tests to make my code right (even if that were possible, which I’ve argued is not the case). Instead, I have written code that I already think is right, and I’m using the tests as an independent line of verification. You may remember in the last post that I said: “writing tests lets you triangulate”; I was trying to be conciliatory towards all the test-driven developers out there, but actually that’s not true most of the time — because if the tests are the development then all you’re getting when you run them is validation that doing the same thing again gives you the same result.
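To make that second line of sight concrete, here is one way such an independent check might look (a sketch of my own, with the array sizes and trial counts chosen arbitrarily): use the dumb-but-obviously-correct linear search as an oracle for the binary version, on many small random sorted arrays.

```c
#include <assert.h>
#include <stdlib.h>

/* The two search functions from the article. */
int linear_search(const int a[], const int size, const int val)
{
    int i;
    for (i = 0; i < size; i++)
        if (a[i] == val)
            return i;
    return -1;
}

int binary_search(const int a[], const int size, const int val)
{
    int lower = 0, upper = size - 1;
    while (lower <= upper) {
        int i = lower + (upper - lower) / 2;
        if (val == a[i])
            return i;
        else if (val < a[i])
            upper = i - 1;
        else
            lower = i + 1;
    }
    return -1;
}

static int cmp_int(const void *p, const void *q)
{
    return *(const int *)p - *(const int *)q;
}

/* Cross-check: the two searches must agree on whether val is present,
   and when binary_search claims a hit, the returned index must really
   hold val.  (They need not return the same index when duplicates are
   present, so we compare presence, not indexes.)  Returns 1 on success. */
int cross_check(int trials)
{
    int t;
    for (t = 0; t < trials; t++) {
        int a[20], size = rand() % 21, j, val, lin, bin;
        for (j = 0; j < size; j++)
            a[j] = rand() % 10;                 /* small range => duplicates */
        qsort(a, size, sizeof a[0], cmp_int);   /* binary search needs sorted input */
        val = rand() % 12;
        lin = linear_search(a, size, val);
        bin = binary_search(a, size, val);
        if ((lin == -1) != (bin == -1))
            return 0;                           /* disagree on presence */
        if (bin != -1 && a[bin] != val)
            return 0;                           /* "found" the wrong element */
    }
    return 1;
}
```

The trivial search is too simple to get wrong, so agreement here genuinely verifies the clever one rather than merely repeating it.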

Let me try an analogy to explain what I mean. If I am paying the bill at a restaurant and I want to add up the amounts for the various dishes, I add from top to bottom, then verify by adding from bottom to top. By approaching the same problem from two different directions, I give myself the best chance of avoiding mistakes; whereas if I add in the same direction both times, I am liable to repeat a “favourite mistake” like adding 8+7 and making the result 13 rather than 15. Or I might verify by using another different technique such as column-at-a-time-and-carry. The point is to use multiple techniques that have different points of weakness.

But when testing and development are the same thing, then I can’t use testing to verify my development.

Why does no-one talk about invariants?

The great mystery to me is why no-one seems to talk about invariants any more. You will of course find them in The Elements of Programming Style, Programming Pearls, and other books of similar vintage, but they are not to be found in, say, Refactoring.

Here is an amazing statistic: the original binary-search challenge post has now attracted more than seven hundred comments. Of those seven hundred, only three comments [Mihai, dave, Darius Bacon] so much as mention invariants. (All three comments are candidate solutions to the binary-search challenge: Mihai’s and dave’s use invariant assertions and Darius’s mentions the invariant in comments.) That is astonishing to me, given that we’ve all been trying to solve a problem that is known to be subtle — I warned about its deceptive difficulty up front — and one that is so amenable to analysis using an invariant. I can only assume that they don’t teach invariants in CS degrees any more. (Surely that’s not actually true? Anyone graduated recently care to comment?)

Here is my best guess on how such a useful technique has become so unfashionable: I think it’s collateral damage from the huge popularity of object-oriented programming. The ability to reason rigorously about a program depends on having solid knowledge of its relevant state, and what is an “object” but a big opaque ball of mutable state that can change under our feet? Yes, objects “hide” state by encapsulating it; but it’s still there — everyone is naked under their clothes. Changes to object state cause changes to its behaviour, which makes objects hard to reason about. (This is why object-oriented techniques don’t work well in concurrent systems, hence in part the growth of interest in functional languages as multi-core processors become increasingly ubiquitous.)

[Note well — and I say this because almost everything I write on this blog seems to get misinterpreted somewhere or other: I am not, repeat not, saying object-orientation is A Bad Thing. I am saying that it has a specific and important drawback which ought to be taken into account, alongside its benefits, when deciding where its use is appropriate.]

Finally, I think there is a practical point to be made here: where languages have facilities that let us mark objects as immutable, or mark methods as pure queries, we should take advantage of them whenever possible — as I did in the code above, where a, size and val were all marked const. And where languages lack those features, we should grind our teeth in fruitless despair and cry out to the heavens. (Although I love Ruby, I hate the fact that it has no way to talk about these things.)

There you have it: invariants! They’re easy to use, very powerful, and let you think clearly about subtle code such that your program has a good chance of running right first time; and they provide a different axis of analysis that allows your tests to be more informative. Get in the habit of using them!

Update: links to this whole series