There are some programming books that I’ve read from cover to cover repeatedly; there are others that I have dipped into many times, reading a chapter or so at a time. Jon Bentley’s 1986 classic Programming Pearls is a rare case where both of these are true, as the scuffs at the bottom of my copy’s cover attest:

(I have the First Edition [amazon.com, amazon.co.uk], so that’s what I scanned for the cover image above, but it would probably make more sense to get the newer and cheaper Second Edition [amazon.com, amazon.co.uk] which apparently has three additional chapters.)

I’ll review this book properly in a forthcoming article (as I did for Coders at Work, The Elements of Programming Style, Programming the Commodore 64 and The C Programming Language), but for now I want to look at just one passage from the book, and consider what it means. One astounding passage.

Only 10% of programmers can write a binary search

Every single time I read Programming Pearls, this passage brings me up short:

Binary search solves the problem [of searching within a pre-sorted array] by keeping track of a range within the array in which T [i.e. the sought value] must be if it is anywhere in the array. Initially, the range is the entire array. The range is shrunk by comparing its middle element to T and discarding half the range. The process continues until T is discovered in the array, or until the range in which it must lie is known to be empty. In an N-element table, the search uses roughly log₂ N comparisons. Most programmers think that with the above description in hand, writing the code is easy; they’re wrong. The only way you’ll believe this is by putting down this column right now and writing the code yourself. Try it. I’ve assigned this problem in courses at Bell Labs and IBM. Professional programmers had a couple of hours to convert the above description into a program in the language of their choice; a high-level pseudocode was fine. At the end of the specified time, almost all the programmers reported that they had correct code for the task. We would then take thirty minutes to examine their code, which the programmers did with test cases. In several classes and with over a hundred programmers, the results varied little: ninety percent of the programmers found bugs in their programs (and I wasn’t always convinced of the correctness of the code in which no bugs were found). I was amazed: given ample time, only about ten percent of professional programmers were able to get this small program right. But they aren’t the only ones to find this task difficult: in the history in Section 6.2.1 of his Sorting and Searching, Knuth points out that while the first binary search was published in 1946, the first published binary search without bugs did not appear until 1962. — Jon Bentley, Programming Pearls (1st edition), pp. 35-36.

A couple of hours! Ninety percent! Dude, SRSLY! Isn’t that terrifying?

One of the reasons I’d like to see a copy of the Second Edition is to see whether this passage has changed — whether the numbers improved between 1986 and the Second-Edition date of 1999. My gut tells me that the numbers must have improved, that things can’t be that bad; yet logic tells me that in an age when programmers spend more time plugging libraries together than writing actual code, core algorithmic skills are likely if anything to have declined. And remember, these were not doofus programmers that Bentley was working with: they were professionals at Bell Labs and IBM. You’d expect them to be well ahead of the curve.

And so, the Great Binary Search Experiment

I would like you, if you would, to go away and do the exercise right now. (Well, not right now. Finish reading this article first!) I am confident that nearly everyone who reads this blog is already familiar with the binary search algorithm, but for those of you who are not, Bentley’s description above should suffice. Please fire up an editor buffer, and write a binary search routine. When you’ve decided it’s correct, commit to that version. Then test it, and tell me in the comments below whether you got it right first time. Surely — surely — we can beat Bentley’s 10% hit-rate?

Here are the rules:

1. Use whatever programming language you like.
2. No cutting, pasting or otherwise copying code. Don’t even look at other binary search code until you’re done.
3. I need hardly say, no calling bsearch(), or otherwise cheating :-)
4. Take as long as you like: you might finish, and feel confident in your code, after five minutes; or you’re welcome to take eight hours if you have the time to spare.
5. You’re allowed to use your compiler to shake out mechanical bugs such as syntax errors or failure to initialise variables, but …
6. NO TESTING until after you’ve decided your program is correct.
7. Finally, the most important one: if you decide to begin this exercise, then you must report whether you succeeded, failed or abandoned the attempt. Otherwise the figures will be skewed towards success.
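
Once you have committed to a version, you need some way to test it. The following is a minimal sketch of one approach, not part of the original exercise: compare your routine against a plain linear scan on many small sorted arrays. It assumes a convention the article does not specify, namely that a search returns the index of a matching element, or -1 when the value is absent. The names check_search, reference_search, placeholder_search and always_missing are all hypothetical. The placeholder candidate leans on Python's standard bisect module (which would count as cheating under the rules) purely so that the harness has something to run; swap in your own function.

```python
import random
from bisect import bisect_left


def reference_search(items, target):
    """Linear scan: slow, but simple enough to trust as an oracle."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1


def check_search(search, trials=2000, seed=0):
    """Return the first (items, target, result) that looks wrong, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        # Keep arrays tiny so that sizes 0, 1 and 2 come up often.
        size = rng.randint(0, 12)
        items = sorted(rng.randint(0, 20) for _ in range(size))
        # Probe values below, inside and above the range of stored values.
        for target in range(-1, 22):
            got = search(items, target)
            present = reference_search(items, target) != -1
            if present:
                # With duplicates, any index holding the target is accepted.
                ok = (isinstance(got, int)
                      and 0 <= got < len(items)
                      and items[got] == target)
            else:
                ok = got == -1
            if not ok:
                return items, target, got
    return None


def placeholder_search(items, target):
    """Stand-in candidate built on the standard library; replace with yours."""
    i = bisect_left(items, target)
    return i if i < len(items) and items[i] == target else -1


def always_missing(items, target):
    """Deliberately wrong candidate, to show that failures are reported."""
    return -1


if __name__ == "__main__":
    print(check_search(placeholder_search))          # prints None
    print(check_search(always_missing) is not None)  # prints True
```

A candidate that loops forever will simply hang this harness, which is itself a result worth reporting.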

(For the purposes of this exercise, the possibility of numeric overflow in index calculations can be ignored. That condition is described here but DO NOT FOLLOW THAT LINK until after writing your program, if you’re participating, because the article contains a correct binary search implementation that you don’t want to see before working on your clean-room implementation.)
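
For readers wondering what kind of overflow is meant, here is a small sketch of one common form of it, offered as my assumption rather than a summary of the linked article. It shows index arithmetic only, not a search routine. Python integers never overflow, so the hypothetical helper to_int32 simulates the wraparound of a 32-bit signed integer; naive_mid and careful_mid are likewise illustrative names.

```python
def to_int32(n):
    """Wrap n the way a 32-bit two's-complement signed integer would."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


def naive_mid(lo, hi):
    """Midpoint as (lo + hi) / 2, with the sum wrapped to 32 bits."""
    return to_int32(lo + hi) // 2


def careful_mid(lo, hi):
    """Midpoint as lo + (hi - lo) / 2; the difference fits when 0 <= lo <= hi."""
    return lo + to_int32(hi - lo) // 2


# Both indices fit in a signed 32-bit integer, but their sum does not.
lo, hi = 1_500_000_000, 2_000_000_000
print(naive_mid(lo, hi))    # prints -397483648
print(careful_mid(lo, hi))  # prints 1750000000
```

With arrays of ordinary size the two calculations agree, which is why the issue can be set aside for this exercise.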

If your code does turn out to be correct, and if you wish, you’re welcome to paste that code into your comment … But if you do, and if a subsequent commenter points out a bug in it, you need to be prepared to deal with the public shame :-)

For extra credit: those of you who are really confident in your programming chops may write the program, publish it in a comment here and then test it. If you do that, you’ll probably want to mention the fact in your comment, so we cut you extra slack when we find your bugs.

I will of course summarise the results of this exercise — let’s say, in one week’s time.

Let’s go!

Update (an hour and a half later)

Thanks for the many posted entries already! I should have warned you that the WordPress comment system interprets HTML, and so eats code fragments like

if a[mid] < value

The best way to avoid this is to wrap your source code in {source}…{/source} tags, but using square brackets rather than curly. (The first time I tried to tell you all this, I used literal square brackets, and my markup-circumvention instructions were themselves marked up. D’oh!) Do not manually escape < and > as &lt; and &gt;; the {source} wrapper deals with these. Doing it this way also has the benefit of preserving indentation, which no other method seems to do.

And an apology for WordPress: I really, really wish that this platform allowed commenters to preview their comments and/or edit them after posting, so that all the screwed-up source code could have been avoided. I’ve tried to go and fix some of them myself, but — arrgh! — it turns out that WordPress not only displays code with < symbols wrongly, it actually throws away what follows, so there’s nothing for me to restore.

Update 2 (four hours after the initial post)

Wow, you guys are amazing. Four hours, and this post already has more comments than the previous record holder (Whatever Happened to Programming, with 206 comments at the time of writing).

For anyone who’d like to see more discussion, there are some good comments at Hacker News and perhaps some slightly less insightful comments at Reddit, where actually writing code is seen as “elitism”.

Update 3: links to this whole series