Some Programs Are More Correct Than Others



There's a saying that the difference between theory and practice is that in theory, there is no difference between theory and practice; but in practice, there is.

In theory, either a program is correct or it isn't. In practice, different programs can be incorrect in different ways. For example, consider a program that always either produces a correct result or crashes. Another program sometimes produces a correct result, but sometimes quietly yields an incorrect result. Clearly neither of these programs is correct — and yet because of their different kinds of incorrectness, it is easy to imagine circumstances in which either program might be useful and the other one useless.

As another example, consider a chess-playing program competing in a tournament with rules that say that if a program takes too long to move, it forfeits the game. In such a tournament, making a bad but legal move within the time limit would be much more useful than not moving at all. More generally, many programs solve problems for which the best possible answer is not known — a state of affairs that makes testing much more difficult. Chess is one example of such a problem; forecasting the weather is another. A weather-forecasting program might contain serious bugs, but might still produce useful results. Weather forecasts are never completely accurate, so whether a forecast is useful is not a yes-no question. Moreover, a program that produces a useful forecast well in advance is certainly more useful than a program that takes so long to run that it forecasts only weather that has already happened.
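The tournament rule suggests one way to structure such a program: hold a legal move in hand before doing any expensive work, and improve on it only while time remains. The following is a minimal sketch of that idea, not a real chess engine. The names `choose_move`, `evaluate`, and `time_limit_s` are hypothetical, and the sketch assumes that a single call to `evaluate` is short compared with the time limit.

```python
import time


def choose_move(legal_moves, evaluate, time_limit_s, clock=time.monotonic):
    """Return the best move found before the deadline.

    A legal move is always returned, even if the deadline passes
    before any move has been evaluated.
    """
    if not legal_moves:
        raise ValueError("no legal moves to choose from")

    deadline = clock() + time_limit_s
    best_move = legal_moves[0]      # legal fallback, possibly a bad move
    best_score = float("-inf")

    for move in legal_moves:
        if clock() >= deadline:
            break                   # out of time: keep the best move so far
        score = evaluate(move)      # assumed to be quick relative to the limit
        if score > best_score:
            best_move, best_score = move, score

    return best_move
```

The deadline is checked only between evaluations, so a slow `evaluate` could still overrun the limit. A program that really must not forfeit would need a stricter mechanism than this sketch provides.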

In both of these examples, the incorrect result is no worse than having no result at all. In effect, running the program does something useful, even if that something falls short of what was desired. Theoreticians may divide programs into two categories — correct and incorrect — but in practice, some incorrect programs are useful anyway, while others are worse than useless.

This three-way division — correct, incorrect but not harmful, and worse than useless — suggests that both our system design and our test strategies should often be aimed more strongly at avoiding the third category than they are at ensuring that programs are in the first. For example, although it is obviously important for a computer-controlled medical x-ray source to produce the right dosage, it is even more important to design the entire system so that a software error will not kill the patient.
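One way to picture the x-ray example in code is a separate check that sits between the dose calculation, which may be buggy, and the hardware. The sketch below assumes a hypothetical hard limit `MAX_SAFE_DOSE` and a hypothetical `compute_dose` function; both are illustrative only. A real medical device would also rely on safeguards outside the software.

```python
class UnsafeDoseError(Exception):
    """Raised instead of delivering a dose outside the safe range."""


MAX_SAFE_DOSE = 2.0  # hypothetical hard limit, in arbitrary units


def checked_dose(compute_dose, prescription, max_safe=MAX_SAFE_DOSE):
    """Run the (possibly buggy) dose calculation, then verify its result.

    The check does not make compute_dose correct. It only keeps an
    incorrect result from becoming a harmful one.
    """
    dose = compute_dose(prescription)
    # The comparison is false for NaN, negative values, and values above
    # the limit, so all of those are rejected.
    if not (0.0 <= dose <= max_safe):
        raise UnsafeDoseError(f"refusing to deliver dose {dose!r}")
    return dose
```

The design choice here is to fail closed: when the check cannot confirm that the result is within bounds, the program refuses to act rather than guessing.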

In addition to asking whether a program can ever be worse than useless, we can also ask how easy it is to recover after a program has misbehaved. As a simple example, a program that processes an input file and produces an output file is more seriously broken if it can destroy its input file than if the worst it can do is produce incorrect output. More generally, if a program that manipulates a database fails in a way that corrupts the database, that failure is worse if the damage cannot be repaired than if there is a way of rebuilding the database.
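The file-processing example can be made concrete. The sketch below, under stated assumptions, never opens the input for writing and builds the output in a temporary file that replaces the destination only after the transformation has finished. The function name `transform_file` and its parameters are hypothetical, and the sketch assumes text files small enough to read into memory.

```python
import os
import tempfile


def transform_file(src_path, dst_path, transform):
    """Write transform(contents of src_path) to dst_path.

    If transform fails partway through, the input file and any
    existing output file are both left as they were.
    """
    if os.path.abspath(src_path) == os.path.abspath(dst_path):
        raise ValueError("refusing to overwrite the input file")

    with open(src_path, "r", encoding="utf-8") as src:
        data = src.read()           # the input is only ever read

    # Create the temporary file next to the destination so that the
    # final rename stays on one filesystem.
    dst_dir = os.path.dirname(os.path.abspath(dst_path))
    fd, tmp_path = tempfile.mkstemp(dir=dst_dir)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(transform(data))
        os.replace(tmp_path, dst_path)   # swap in the finished output
    except BaseException:
        # Clean up the partial output, then report the original failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

This does nothing to make `transform` correct. It only limits the damage an incorrect `transform` can do, so that rerunning a fixed program on the same input remains possible.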

In effect, the theoretical notion that programs are either correct or incorrect is not enough to model how programs are used in practice. The theory fails to answer the practical question: We have a program that, despite our best efforts, is incorrect; and we did not learn about its incorrectness until we ran it. Now what? The answer to this question may well depend on aspects of our design that the theory does not take into account. Instead, we may have to use pragmatic strategies to contain likely kinds of errors, and to deal with them after the fact.

Next week, I'll discuss some concrete examples of such strategies.