Code Quality Warnings: Highlighting Likely Bugs before They Bite



In reading a recent AdaCore blog entry about “Low Hanging Bugs” (http://blog.adacore.com/going-after-the-low-hanging-bug), it struck me that these sorts of errors bite me frequently. Wouldn't it be nice if some Janus/Ada tool pointed them out before I wasted hours debugging the cause?

Ada is already a great language for detecting bugs early. Many things that would cause debugging sessions in C are outright illegal in Ada – so getting the code to compile detects and eliminates many classes of potential bugs. Indeed, many of us “old Ada hands” tend to let the compiler find our mistakes; it's not worth our time to try to find them before compiling. (In extreme cases, we might even let the compiler tell us what code needs to be changed; the recently completed expression tree overhaul in Janus/Ada was accomplished that way.) Anything that extends that to catch even more bugs builds on an Ada strength.

My thoughts turned to how such a tool would fit into my usual workflow. Our rules require that all code to be checked in has been compiled successfully, and that code is checked in no later than the day after it is checked out. Thus all of our work tends to be compiler-centric – even large changes are compiled as quickly as possible. In addition, we prefer (but don't require) that checked-in code has been tested (testing can sometimes take several days, and delaying check-ins by that much could impact other work). The effect is that most work is accomplished in a tight edit-build-test cycle.

Running a separate tool at check-in, or once per day, doesn't work for checks that are intended to point out likely stupid bugs. By the time the tool was run, the bug probably would have been encountered and tracked down the hard way. Such a tool could be part of an IDE. It couldn't be always on, though, because in that case it would nag repeatedly during editing (especially during cut-and-paste coding).
One would probably get so used to ignoring the warnings that they'd be forgotten when the work was done. So such a tool would have to be run just before compiling.

JAWS (the original Janus/Ada text-mode IDE) had such a tool; it checked units directly in the editor's memory, saving the time needed to save the file and start a compilation just to find silly mistakes such as syntax errors. Back then, however, saving a file and running a compilation took 3-4 minutes. Now we get annoyed when a compilation isn't done in 3-4 seconds. The time savings of an embedded pre-compilation checker aren't worth the effort of developing it, especially as it would have to detect problems even in incorrect code.

As a separate tool wasn't looking appealing, I started wondering whether these checks would make sense as part of the compiler. After all, in that case, forgetting to run them is nearly impossible. What other reasons could there be for not putting these checks into a compiler?

Obviously, a lot of people build separate tools because they don't have a compiler to add things to. That is not an issue here at R.R. Software! Another reason to use a separate tool is that it is too expensive to have in the compiler. “Expensive” here could mean development time, runtime, or the amount of space used. Here, however, development time is likely to be less as part of a compiler, as one can run the checks on fully resolved expressions – there is no need to figure out the meanings of types or identifiers in expressions. The runtime of checks such as these is pretty insignificant, as they can be part of a Legality Check tree pass that is going to be run in any case. Moreover, most of the checks are reasonably local in an expression tree, involving only a handful of operations. The checks also have little effect on the space used by the compiler.

Two other possible reasons have to do with the messages themselves.
If there are a lot of useless messages (that is, messages for code that isn't buggy – so-called false positives), they could reduce the value of the compiler itself. That can be managed by choosing the precise checks implemented – it's necessary to omit checks that may or may not reflect a bug. Similarly, the compiler could output so many other messages that the important ones get lost. We've overhauled the warnings in Janus/Ada precisely to reduce this problem (see Avoiding Crying Wolf with Warnings); these checks should fit in well.

Thus, there aren't any important reasons for not adding these checks to the compiler. So we went ahead and did it; we call them Code Quality Warnings. They'll be available starting with the upcoming 3.2.0 preview. We did hedge a bit by providing a separate switch (/WQ) to turn them off or convert them to informational messages. Early experience (discussed below) hasn't shown significant issues with the Code Quality Warnings; we're planning to leave them on (along with all other warnings) when using the 3.2.0 compilers.

The basic idea is for Code Quality Warnings to flag code that is unlikely to be what was meant. Most such code falls into the category of “useless computation”, where a complicated expression is used to calculate a simple value (often a constant). It's important to remember that all of the code flagged by quality warnings is legal Ada code, and it might be intended for some obscure reason. Thus these are warnings, not errors.

The most basic useless computation is a relational operator with the same operand on both sides. For instance:

   if X = X then

The operand here can be any Ada subexpression that is the same on both sides. We used an existing routine that checks for conforming expressions to make the determination of “same”, so different expanded names and the like do not prevent the check from triggering.
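To make the shape of such a check concrete, here is a minimal illustrative sketch in Python (not Janus/Ada's actual code; all names here are hypothetical) of flagging a relational operator whose two operand subtrees are structurally identical:

```python
from dataclasses import dataclass
from typing import Optional

# A toy expression tree: leaves carry identifier/literal text,
# interior nodes carry an operator.
@dataclass(frozen=True)
class Expr:
    op: str                       # operator, or "" for a leaf
    value: str = ""               # identifier or literal text for leaves
    left: Optional["Expr"] = None
    right: Optional["Expr"] = None

def conforms(a: Optional[Expr], b: Optional[Expr]) -> bool:
    """Structural equality of two subtrees -- a stand-in for a real
    compiler's 'conforming expressions' routine."""
    if a is None or b is None:
        return a is b
    return (a.op == b.op and a.value == b.value
            and conforms(a.left, b.left) and conforms(a.right, b.right))

def useless_relational(e: Expr) -> bool:
    """Warn when a relational operator has the same operand on both sides."""
    return e.op in ("=", "/=", "<", "<=", ">", ">=") and conforms(e.left, e.right)

leaf = lambda name: Expr(op="", value=name)
same = Expr(op="=", left=leaf("Data(I).Wins"), right=leaf("Data(I).Wins"))
diff = Expr(op="=", left=leaf("Data(I).Wins"), right=leaf("Data(J).Wins"))
print(useless_relational(same))  # True  -> warning
print(useless_relational(diff))  # False -> no warning
```

A real compiler would run this over fully resolved expressions, so two different spellings of the same entity (expanded names, renamings) would still compare as "same".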
Obviously, one-character names are not very realistic, but consider the following:

   for I in Data'Range loop
      for J in Data'Range loop
         if Data(I).Wins = Data(I).Wins then ...

Here, we'll get a code quality warning on the if condition. Most likely, the second I was supposed to be J; as written, the condition does nothing interesting. False positives are unlikely in cases like this. If the expression evaluates to a different value each time it is evaluated (like the Random function), it might be intended and possibly even do something useful. But such code is tricky enough that it probably should be written some other way. (That advice applies to almost all of the code quality warnings.)

A similar check can be (and is) performed for operators like "-" and "/". An expanded check is made for the logical operators: a matching subexpression anywhere in a chain of logical operators is detected. For instance, both of the following will trigger a code quality warning:

   if X and X and Y then
      ...
   elsif Y or else X or else Y then

Another such check is for the DeMorgan mistake. The classic example is:

   if X /= Y or else X /= Z then

This expression is always True, regardless of the value of X (with some important caveats, discussed in a moment). That's because X can equal at most one of two different values, so at least one of X /= Y and X /= Z will be True, and or-ing True with anything gives True. The logical operator here needs to be and. This follows from DeMorgan's Law:

   not (X = Y or X = Z)

is the same as

   X /= Y and X /= Z

When I started testing these warnings, however, I got a number of occurrences of this warning that were clearly false positives. The first one was:

   if Ptr1 /= null or else Ptr2 /= null then

In this case, null is playing the role of X. This expression is True unless both Ptr1 and Ptr2 are null, which seems like a reasonable thing to test. The problem here was that I had ignored the caveats for the DeMorgan warning, and it turns out one of them is important.
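The claim that the or form is a tautology can be checked by brute force. The sketch below (Python, purely illustrative) exhausts a small domain of values and confirms that X /= Y or X /= Z can never be False when Y and Z differ, while the and version actually discriminates:

```python
from itertools import product

values = range(3)  # a tiny domain is enough to exhaust all the cases

# Whenever Y and Z are different, "X /= Y or X /= Z" cannot be False:
# X can equal at most one of two distinct values.
assert all((x != y or x != z)
           for x, y, z in product(values, repeat=3) if y != z)

# The "and" version is the one that tests something useful:
# it is False exactly when X equals both Y and Z.
results = {(x != y and x != z) for x, y, z in product(values, repeat=3)}
print(sorted(results))  # [False, True] -> the condition discriminates
```

The exception filtered out above (Y and Z possibly equal) is exactly the caveat that produced the false positives discussed next.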
The first caveat to any of these rules is that we assume that X evaluates to the same value each time it is evaluated. That's true of most Ada expressions, but not always true of function calls. (Think of Random for an extreme example.) For these rules, that possibility can be ignored, as any code depending upon it is really tricky and needs extensive comments (or, better yet, should be written some other way).

But the DeMorgan check has another caveat of its own: we're assuming that Y and Z evaluate to different values. This isn't always true – all of the false positives I've seen for the DeMorgan check stem from this assumption. It's possible to write an expression whose point is essentially to check that two values match a third value, and such an expression can end up looking just like the expression that started this discussion. The positive version is:

   if Ptr1 = null and Ptr2 = null then

It should be fairly obvious that an expression with X as a constant is more likely to be a false positive than a real bug. Unless Y and Z are such that they couldn't have the same value, the expression is probably just testing that the values of some variables equal some common value. That is likely what the programmer intended, and a warning would be harmful (after all, with too many false positives you'll just ignore or even suppress the warning). Thus, the DeMorgan quality warning was recoded so that it is not triggered if X turns out to be a constant. (Note that this means an actual constant or literal, not things with possibly changing values that Ada defines to be constant, like function results and loop parameters.)

One can still get false positives with this rule; one example I've seen was

   if Ptr1 = Ptr2 and Ptr2 = null then

which is just another way to test that both Ptr1 and Ptr2 are null.
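The constant exemption can be sketched as a filter in front of the warning. The following Python fragment is a hypothetical simplification (a real compiler would consult its symbol table rather than inspect text, and would distinguish true constants from Ada's constant-but-varying objects):

```python
def is_literal(operand: str) -> bool:
    """Treat 'null' and numeric literals as constants (a deliberate
    simplification; a real compiler would query its symbol table)."""
    return operand == "null" or operand.lstrip("-").isdigit()

def demorgan_warning(shared: str, other1: str, other2: str) -> bool:
    """Hypothetical check for 'shared /= other1 or else shared /= other2':
    warn only when the shared operand is not a constant, since a constant
    shared operand usually means 'are these variables both this value?'."""
    if is_literal(shared):
        return False   # e.g. Ptr1 /= null or else Ptr2 /= null: suppressed
    return True        # e.g. X /= Y or else X /= Z: likely a DeMorgan mistake

print(demorgan_warning("X", "Y", "Z"))           # True  -> warn
print(demorgan_warning("null", "Ptr1", "Ptr2"))  # False -> no warning
```

Note that the Ptr1 = Ptr2 false positive survives this filter, since the shared operand there (Ptr2) is a variable.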
I changed this expression into the previous one, as it is likely to be a bit cheaper in terms of generated code anyway, and it doesn't trigger a quality warning.

Note that we make the check regardless of the order of the operands. One could imagine making the check only if the left operands of the two equalities match, but I didn't do that because I personally don't always put the variable on the left – I tend to write whatever is most readable for a particular instance. Thus I didn't want the check to miss a bug just because I had written:

   if Z /= X or X /= Y then

Early experience with real code shows that this false positive doesn't happen often (more on this below), so for now the check is as I've described it here. We'll need to compile more code to see if there is a real problem with these false positives.

The last group of quality checks involves if statement conditions (and similarly if expression conditions, once those get implemented in Janus/Ada). If a condition is repeated within a single if statement, there is certainly a problem. For example:

   if X then
      Something1;
   elsif Y then
      Something2;
   elsif X then
      Something3;
   end if;

Something3 will never be executed (again assuming that X evaluates to the same value each time). This is very likely a mistake of some sort: either X was supposed to be some other expression, or Something3 is redundant code that can't happen (especially likely if Something1 and Something3 are the same code), or Something1 and Something3 need to be combined. In almost every case, something needs fixing.

This is the only one of these checks that changes how the compiler operates. In order to make the check, all of the previous conditions of an if statement have to be available; that wasn't true in Janus/Ada in the past. But modern machines have plenty of memory; saving copies of a few expressions for a short while isn't likely to cause any real memory usage issues.
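The mechanics of that last check are simple once the earlier conditions are retained. A minimal sketch in Python (conditions shown as canonical strings; a real compiler would compare the saved expression trees with its conformance routine):

```python
def repeated_condition_warnings(conditions):
    """Given the conditions of an if/elsif chain, report the indices of
    branches that repeat an earlier condition and so can never run."""
    seen, warnings = [], []
    for index, cond in enumerate(conditions):
        if any(cond == earlier for earlier in seen):
            warnings.append(index)
        seen.append(cond)   # retain every prior condition for later branches
    return warnings

# if X then ... elsif Y then ... elsif X then ... end if;
print(repeated_condition_warnings(["X", "Y", "X"]))  # [2]: third branch is dead
```

Retaining the `seen` list is the only extra state this requires, which is why the memory cost mentioned above is negligible on modern machines.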
(That wouldn't have been true in the early days of Janus/Ada; one could not save much, as doing so effectively decreased the size of the symbol table that could be processed. Of course, there's no problem running a small-memory compiler like Janus/Ada on a large-memory machine – doing the reverse would be impossible.)

Interestingly, this check would have detected a bug in the implementation of the quality checks themselves. For a while, the code of one of the checks contained if conditions that tested the left operand of an expression twice (and there was no code to check the right operand). Had the checks been operative when compiling the compiler, the error would have been immediately pointed out (rather than being noticed accidentally while fixing a different bug).

We ran the compiler enhanced with these quality warnings on a variety of Ada source code. We started with the ACATS test suite (since we use that as a continuous regression test). It turns out that the ACATS has lots of dubious code, especially A = A constructs. Given the purpose of the test suite, that doesn't really matter, so the ACATS was abandoned for this purpose.

Our in-house test suite (mostly tests created to prevent regression of user bug reports) similarly has lots of dubious constructs. Before abandoning checking of the warnings for it, however, I ran across two tests with code of the form:

   if Ada_Convention_Passed and Ada_Convention_Passed then

A check of the source code showed that there were two variables, Ada_Convention_Passed and C_Convention_Passed; clearly this check was supposed to test both, not the same one twice. However, the mistake would only matter if the test failed, which explains why it was never detected in the past.

The enhanced compiler was also used to compile Claw. It produced only one warning: a DeMorgan false positive, the Ptr1 = Ptr2 example previously discussed. The code was changed to eliminate the warning.
Other work required the runtime to be recompiled; two quality warnings appeared in it. One was a case of two if branches with identical code in an internal routine of Generic_Elementary_Functions. The routine in question is used to implement functions like Sinh, so it probably isn't used much, and the code in question deals with boundary cases. This looked like a case of leaving some unneeded code behind; it's hard to be sure, as there were no comments in that code.

The other bug was a classic DeMorgan mistake in the task supervisor. It appears that any conditional entry call using delay until would have raised Program_Error, as the check was backwards:

   if Status /= Abs_Timed_Call or else Status /= ATC_Abs_Delay then
      raise Program_Error;

This needs and then rather than or else. The warning here clearly saved a lengthy debugging session.

The early experience with the quality checks therefore produced some false positives, but it also found several real bugs and some redundant code in supposedly tested production code. And they would have found at least one bug in their own implementation (potentially saving a debugging session). So they do appear to be valuable. It will be interesting to find out what happens when you run them on your code.

We have a number of ideas for additional quality checks, but most of those will require assistance from the optimizer to avoid false positives. That will be a project for another time.

Undoubtedly these new checks will cause problems for someone. If that turns out to be you (especially if you have problems with false positives that can't be worked around), I'd like to hear from you. (And I don't mind praise, either. :-) Write me at randy@rrsoftware.com.