Book review: Retro debugging

This is the fifth in a series of reviews of debugging books.

The Frozen Keyboard: Living With Bad Software (Boris Beizer, TAB Books, 275 pp., 1988)

Secrets of Software Debugging (Truck Smith, TAB Books, 276 pp., 1984)

How to Debug Your Personal Computer (Jim Huffman and Robert C. Bruce, Reston Publishing, 157 pp., 1980)

Software Debugging for Microcomputers (Robert C. Bruce, Reston Publishing, 351 pp., 1980)

Program Style, Design, Efficiency, Debugging, and Testing, 2nd Ed. (Dennie Van Tassel, Prentice-Hall, 323 pp., 1978)

Program Debugging (A.R. Brown and W.A. Sampson, American Elsevier Computer, 166 pp., 1973)

Debugging Techniques in Large Systems (edited by Randall Rustin, Prentice-Hall, 148 pp., 1971)

Part of the reason I started this series of book reviews was to get a feel for how approaches to debugging have changed over the years. Similarly, I’m interested in seeing what has stayed the same, perhaps parading around in different guises.

To that end, I collected some debugging books that were published before 1990. I used the following methodology, which is totally and completely based on rigorous techniques practised in academic settings:

Search for books published before 1990 with “debug” in the title. If the combination of title, cover, and author seemed interesting—and the price was not ridiculous—obtain a copy and read it.

There isn’t any theme to the content aside from that, save for Beizer’s book, which was a recommendation.

As it happens, the books fall into two different computing eras: mainframe computing and personal/home computing. These also happen to line up with the publication dates: 1970s for mainframe, 1980s for personal. You won’t glean much about debugging for the current day and age if you read them, but you will find some enlightening contrasts and parallels.

(A note about Beizer’s book: it’s really a general book for the home computer user and not a book about the process of software development or debugging. It should not be considered part of the home computing books. Although it does discuss how software is developed, particularly how it is (or should be) tested, it is more about the state of software and computers at the time. I’ve included it in this review because it is both a good book and helps provide context for the era.)

Aside from the books’ specific content—which is amusingly out of date in almost all respects—the primary difference they demonstrate from today’s world is speed. Every single aspect of development comes across as slower. Making a change in the code and just running it, let alone deploying it, could take on the order of days. Brown and Sampson’s book describes a case study, run in 1971, that contains the specification for a program that today would be, at best, an assignment for a first-year student. Competent developers would probably be able to write it in under a day, with unit tests. In the case study it takes two professional programmers, each with six to seven years of experience, nearly two weeks of effort to get a working program.

As ludicrous as that may seem, a big reason for it is the lack of machine time. The case study contains a detailed program development log for one of the programmers. That programmer logged 13.75 days of development time and only 37 minutes of machine time. It takes him days to get to the point where he even starts writing code, which in all likelihood ends up on cards or some other tedious, error-prone input mechanism. All the mainframe era books allude to the fact that CPU time cost money and that “the computer” is often used for many other things. In other words, developers almost never had a computer to work on and initially wrote most of the code at their desk on paper. There were certainly terminals and time-sharing operating systems in the 1970s, so it is not accurate to assume that all programmers were using punch cards to submit programs in batch to the computer. Nevertheless, there was significantly less computing power and storage space than today and it was much more expensive. Hence, the mainframe era books consistently push the notion that you should aim to “get it right the first time” so as to avoid wasting CPU cycles.

The books for home computing don’t go to this extent for the obvious reason that you had a computer right in front of you, although it would be a stretch to say that development was a speedy affair. Truck Smith’s book contains the programming journals he recorded while writing three different programs on an Apple II. One of them is for a simple diff-like program written in BASIC that ends up being a little under 300 lines, including comments. Here he is, reflecting on the effort:

Reading the blow by blow descriptions [about 50 pages worth] gives the impression that it took a long time to write this program and get it running. Actually it took me about a week, working nights (but not every night of the week).

The home computing books are devoid of discussion on software tools for development and debugging, including something as rudimentary as a text editor. They assume the use of the BASIC interpreter and primitive (that is, non-multitasking) operating system that comes with the computer to input and manage a program. Smith also has journals for a Pascal and assembly language program, but there is no hint as to how to go about actually using those languages on the computer. The assumption is that the user has the knowledge and can obtain the necessary software to use a different programming language. Bruce’s books centre entirely on the limited capabilities of BASIC on the Sol 20 microcomputer. The home computing world comes with more immediate computing power, but very little in the way of tools for taking advantage of it.

This lack of speed goes hand-in-hand with debugging being talked about as a distinct stage of software development. It is almost exclusively portrayed as an offline activity. This is explicitly stated in the mainframe books. It is also true in the home computing books, although the definition shifts slightly.

When you look at the debugging tools and techniques the books describe there is one item that rules them all: printed output. Nearly everything is geared toward making program listings or diagnostic output more efficient and useful. Like most things in these books, this will strike readers in this day and age as ridiculous. Still, it is entirely reasonable given the conditions of the time. Recall that in the mainframe domain CPU time was precious: running automated tests or using tools to query output (something like grep) may have been difficult or impossible. In the world of home computing merely inspecting your program on a monitor, without obtaining extra software, was likely a chore. There was very limited memory (be it disk or RAM) for holding output so even if there was a pager program, only so much could be stored. The cheapest, most abundant source of “memory” for storing output for the purposes of inspection was paper.

Given that paper is characterized as the preferred method for debugging it’s not surprising that debugging is talked about as something separate from programming. This is probably the starkest difference from today’s methods, where program output is easily stored and examined, making debugging practically synonymous with development. Bruce talks about the “program development and shakedown” stages. Van Tassel says that one should plan for debugging to take four times as long as any of planning, writing, and testing. Brown and Sampson’s book starts out decrying the fact that so much time is spent on debugging. Practically every paper in the collection edited by Rustin assumes debugging is independent from the act of programming. This is not surprising: when output is removed from the runtime environment, the development process feels more delineated and less fluid.

The debugging techniques, however, are not markedly different. They’re just smaller in scope to the point where in modern environments we don’t really think of them as techniques. Everything that is prescribed for instrumenting a program to tease out information about its state is still applicable, although it may not be necessary.

The first is desk checking, that is, reading the program listing, possibly making a flow chart, and running through examples manually. Exactly when this step is recommended depends on whether you are reading the mainframe books (before you write the code) or the home computing books (after you run the code). The need to make a flow chart is pretty rare these days since we can quickly navigate code online to understand its control flow. Thanks to an abundance of computing power and storage space it’s also easy to execute example code and store the results (automated testing, anyone?). Desk checking is a last resort today; it was a primary technique of the past.

The other technique, talked about at great length, is careful and judicious placement of print statements. When it comes to tools used in the online computing environment, print statements are the only one mentioned in the home computing books. There is no online debugger: no breakpoints, no program halting, and no memory inspection. There are also no system tools since the whole computer is devoted to running the program you’re writing. The mainframe books do talk about system-level monitors and online debugging, but in a way that suggests they are an exotic luxury. It’s more or less “use ’em if you got ’em.” Any such tools are described as system specific, so a treatment of them would be out of place in a book about general debugging. The mainframe books also cover memory dumps, noting that they are hard to work with (you probably wouldn’t examine them online) and that you’ll probably want to avoid them. In the end, the only reliable technique that you can use anywhere is a well-placed print statement. The advice on how to place them is largely the same as today, with more attention paid to avoiding extensive output since this would use a lot of paper.
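The shape of that advice translates directly to modern languages. As a rough sketch (the flag and all names below are my own invention, not taken from any of the books), guarding terse, labelled prints behind a switch is the same discipline that once kept the line printer from burying you in paper:

```python
# A minimal sketch of careful print-statement placement. The DEBUG flag
# and all names here are invented for illustration, not from the books.
DEBUG = True

def trace(label, value):
    """Emit a terse, labelled diagnostic only when debugging is enabled."""
    if DEBUG:
        print(f"[debug] {label}={value!r}")

def average(numbers):
    trace("input", numbers)   # state on entry
    total = sum(numbers)
    trace("total", total)     # the intermediate value under suspicion
    return total / len(numbers)

print(average([2, 4, 9]))
```

Keeping each diagnostic line short and labelled is the modern analogue of the books’ warning against sending a hundred pages of output to the printer.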

As mentioned, the mainframe books see debugging as a stage of software development. The home computing books, by contrast, barely discuss the notion of process at all. Bruce’s books completely ignore how a program got written and dive into how you can fix an existing one; any programs provided come into existence through a contrived narrative for the sake of presentation. Smith’s book spends limited space on any development formalities and buries all the advice in enjoyable, haphazard experience reports. Debugging BASIC programs is something that gets done in fast edit-run cycles, the caveat being that the paucity of features in the environment makes it less appealing than it sounds. This is probably why the home computing books push offline debugging so much.

The lack of rigour in the home computing books carries over into testing. While testing is not the same as debugging, they are closely related and are frequently discussed together. So it’s a bit surprising that there is no serious mention of it in the home computing book set. Smith does touch on it, but doesn’t talk much about how to test; Bruce and Huffman completely ignore the subject. It’s embarrassing that Beizer provides more insight on how to test software, even superficially, than books directly aimed at programmers. Even more shameful, Beizer goes into some detail—with wonderful, witty diatribes—and demonstrates a deep knowledge of the subject. The lack of testing discussion in the home computing books relegates them to the realm of amateurs.

The mainframe books, on the other hand, are serious about testing and how it acts as a “debugging aid”. There is a clear emphasis on isolating parts of the program and testing them on a regular basis, especially after a change has been made. The practice of “testing units” or testing modules is mentioned often. This process is very drawn out in the description from Brown and Sampson as it involves multiple people: those who write the program, those who type it, and those who run it. Van Tassel says that using automated testing is the only way to handle large code bases and that storing test input so it can be reused is a necessity. Keeping a log of tests and test results is recommended if you desk check a routine. Testing plays a key role in managing how one debugs a program. (Included in the testing discussions is a lot of hopefulness on one day proving programs correct, complete with a healthy dose of skepticism and no serious expectation of it happening soon.)
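Van Tassel’s point about storing test input for reuse is, in effect, a description of a modern regression suite. A minimal sketch, with invented names and an invented example routine of roughly module size for the era:

```python
# Stored, reusable test inputs in the spirit of Van Tassel's advice: keep
# the cases with their expected results so the whole set can be re-run
# after every change. All names here are invented for illustration.

def simple_interest(principal, rate, years):
    """The kind of small, equation-sized routine a 1970s module implemented."""
    return principal * rate * years

# The stored test input, kept alongside the code for reuse.
CASES = [
    ((1000, 0.05, 1), 50.0),
    ((1000, 0.05, 2), 100.0),
    ((0, 0.05, 10), 0.0),
]

def run_tests():
    """Re-run every stored case; return how many passed."""
    for args, expected in CASES:
        result = simple_interest(*args)
        assert abs(result - expected) < 1e-9, f"{args} -> {result}, wanted {expected}"
    return len(CASES)

print(f"{run_tests()} stored cases passed")
```

The difference is only one of cost: re-running every stored case after each change is now effectively free, where in 1971 it competed for billed machine time.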

Again, what differs from today is the scope. For example, merely checking types at module/function boundaries, let alone within a function, is referred to as “runtime overhead” that may require multiple compilers since optimizing compilers probably wouldn’t verify types. There is an emphasis on getting the compiler to check as much as possible if the resources are available (such as having a compiler that checks those things in the first place!). As usual, this is because the resources to run tests were constrained and compilation itself took a significant amount of time. Descriptions of what is tested by the programmer are generally limited to a single module, which is something on the order of implementing a few equations or file operations.
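For contrast, the “runtime overhead” being debated amounts to checks most environments now perform as a matter of course. A hedged sketch of a module-boundary type check in Python (the function and messages are mine, not from the books):

```python
# The module-boundary type verification the mainframe books priced as
# "runtime overhead". Names and messages are invented for illustration.
def apply_rate(amount, rate):
    # Checking at the boundary catches misuse early, at a small runtime cost.
    if not isinstance(amount, (int, float)):
        raise TypeError(f"amount must be numeric, got {type(amount).__name__}")
    if not isinstance(rate, (int, float)):
        raise TypeError(f"rate must be numeric, got {type(rate).__name__}")
    return amount * rate
```

In the 1970s this kind of verification might have required a separate checking compiler; today it costs almost nothing.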

All in all, debugging on mainframes in the 1970s and home computers in the early 1980s comes across as a tedious exercise. In the mainframe books you can practically feel the authors pining for more power. There is all this potential and nothing to realize it. It’s mostly today’s development process played out in super-slow motion, but at least it is methodical. The home computing books, on the other hand, have a severe lack of discipline. This is probably in large part due to the use of BASIC as the grounding point, although Truck Smith tries hard to branch out a little.

This collection of books shows that, in the past, a lot more attention was paid to things that we now mostly ignore. And those things are found in the periods of time that emerge from having to wait for anything to happen. There is a clear advantage to having a wealth of computing power available when it comes to debugging. Being able to capture gigabytes of output from any number of tools and analyze it in short order, all on the same machine on which the program is running, is significantly better than worrying about whether a print statement will send a hundred pages of output to the line printer. The introspection tools that our current execution environments offer are a realization of what was yearned for in the mainframe books, and we get to reap the rewards. What is revelatory is the seeming lack of formality in home computing. It’s as if everyone was so enamoured by the fact there was a computer in their home that they forgot about the computers at work. The complaints in Beizer’s book may very well be attributable to the lessons found in Smith, Bruce, and Huffman’s work. There isn’t enough in the books to explain this discrepancy in thoroughness, but there’s certainly something to explore.

One thing is clear: the more things change, the more they stay the same; they just get faster.