Today we are going to see a case study illustrating a bug. Not a very spectacular one: a typical bug one would encounter in everyday work. We will start with the symptoms, identify the root cause, and suggest measures to prevent similar things from happening in the future.

The program in question, among other things, displays times of day with minute resolution. Because there is a lot of information to be displayed, and the output needs to be compact, the program does not separate hours from minutes, like "11:30" , but glues them together, like "1130" . The interpretation is obvious and this format is common in our industry. Even when we see a three-digit number like "130" it is unambiguously identified as 01:30.

At some point it was observed that the program sometimes outputs different times than would be expected. Sometimes we get the wrong minute: "100" , expected "140" . Sometimes we get the correct minute but wrong hour "310" , expected "510" . Sometimes the time is correct, and sometimes everything is wrong.

If we list the examples along we may start to observe the pattern.

actual output expected output 045 045 1 00 140 1010 1650 12 25 2025

The actual output is always smaller or equal to the expected value. The difference grows as the actual time grows. The bug is reproducible and the difference appears to only depend on the value of the actual time of day. Therefore it should be easy to identify the place in the code responsible for the situation.

Indeed after a short while, we discover that someone decided to store the time of day as int denoting “minutes since midnight.” Not a bad idea: it optimizes the storage capacity and we can easily compute the difference between two times (on the same date). But the downside is that somewhere in the program we have the following piece of code:

struct flight_data { // ... date arrival_date; // no problem here int arrival_time; // minutes since midnight }; void display(const flight_data& f) { // ... cout << f.arrival_date; cout << f.arrival_time; }

This explains the discrepancy. The expected 01:40 a.m. is 100 minutes after midnight. Therefore we see "100" instead of "140" .

At this point, someone might just change function display to:

void display(const flight_data& f) { // ... cout << f.arrival_date; cout << minutes_as_HHMM(f.arrival_time); }

and consider the matter closed. It could be ‘closed’ in the sense that when we test again, the numbers will be reflecting the expected values, bug fixed, bug statistics improved…

But we can clearly see that since it was easy to make a mistake, it is not unlikely that in some other place in the code someone else also fell into the same trap. Even if there is none now, it is likely that it will occur in the future, because it is so easy to get it wrong.

Just fixing the bugs “for now” is not satisfactory. I strongly believe that we should aim at something more: design the code in such a way that the omissions like the one above are impossible, or difficult to achieve. The developer’s goal, as I see it, should be to not have bugs in the first place rather than to plant bugs and then fix them.

One thing we can fix immediately is to rename the member variable:

struct flight_data { // ... date arrival_date; int arrival_time_as_minutes_since_midnight; };

The variable name sends a clear message, plus, if we do it, the compiler will show us every place where it is referred to by the old name and we can decide if there are other places that need fixing.

However, function display in real life is not as short and simple as we have seen it above. It is more likely, or at least conceivable, that you would see something like this:

void display(const flight_data& flt, bool abbreviate, const supplementary_data& suppl) { bool do_vca = supl.do_any_tla; bool do_strict = suppl.force_strict || !suppl.aux_vec.empty(); if (!abbreviate || suppl.traffic && suppl.traffic->do_tse) do_strict = true; do_vca |= is_interline(flt); cout << boost::format("%1%;%2% %3%") % boost::str(boost::format("%1%%2%") % (do_vca ? flt.status : extend_status(flt.status) % suppl.alteration_str) % boost::str(boost::format("from %1% %2% %3%") % flt.departure_apt % flt.departure_date % (do_strict ? flt.arrival_time_as_minutes_since_midnight : flt.estimated_arrival_time)) % boost::str(boost::format("to %1% %2% %3%") % flt.arrival_apt % (flt.departure_date + flt.dep_offset) % (do_strict ? flt.departure_time : flt.estimated_departure_time)); }

If you ever need to read through what this function does, and working in a big project I really have to analyze such things, you will be distracted by a number of things. First, just the amount of symbols packed like this is distracting. Second, in line 10, the statement is indented as though someone wanted to execute it conditionally. But because there are no braces in this if-statement, it will not be the case. So, we have a bug either in the logic flow or in the indentation.

Next, you will notice I used boost::format . Do you know this library? If not, now you will be distracted. You will have to go out of your way in order to spend a couple of minutes checking what it does. If you do know it, you will wonder why it was used in this unconventional and suboptimal way. A bug, or was there some purpose?

In the end, you will be so distracted by these things that it may slip your attention that we are streaming an integer that we do not want displayed as an integer. You will have to rely on the assumption that if it compiles, hopefully it does the right thing.

Employ the type system

A simple way to avoid the above problem is to make use of the type system. Introduce a new type for storing information about minutes elapsed since midnight:

class Minutes_since_midnight { int minutes; // invariant: minutes >= 0 && minutes < 24 * 60 public: explicit Minutes_since_midnight(int m) : minutes(m) {} // requires: m >= 0 && m < 24 * 60 int as_int() const { return minutes; } // and relational operators ... };

This is something similar to an opaque typedef. It is more however: we can associate an invariant with it. Now the type, apart from being distinct, also conveys the information that we are only interested in the subset of values of int .

Now, we can use the type in our program:

struct flight_data { // ... date arrival_date; Minutes_since_midnight arrival_time; }; void display(const flight_data& f) { // ... cout << f.arrival_date; cout << f.arrival_time; // compile-time error }

An attempt to stream out Minutes_since_midnight will result in compiler error, which will offer us an opportunity to think how we want to proceed before the program is built. One tempting option is to define a dedicated streaming operator for our type, but personally I doubt there is one good default way of displaying such a type. I would rather rely on converting functions, like minutes_as_HHMM .

Is it worth the trouble?

Type Minutes_since_midnight does not add any run-time performance hit compared to an int , so one may think there is no practical trade-off involved here. But the truth is there is one trade-off at stake, and it looks like it is sometimes non-trivial. The safety measure we have seen above requires that we define a new class, whereas type int is available out of the box, for free. The cost of adding a new class is the developer’s additional time. We now have to consider all the guidelines related to defining classes: which members should be private; should we define a custom copy constructor? If the policy in your work requires adding a .h and .cpp file for each class, it means adding new files. Then, new includes, then documenting it, then new unit tests. Look, a class is additional code which could contain bugs…

So you might ask the question, is the potential gain in safety worth the investment?

Everyone will have their own answer. Mine is this. If you do not make mistakes in your programs, it is not worth the trouble. After all this technique is only about preventing mistakes (which are not guaranteed to be there). If you are writing a small toy or test program for yourself exclusively, you probably don’t mind the bugs. But if the program is for customers other than yourself, and if the consequence of the bug is potentially severe, ignoring this technique appears like a strange thing to do; unless you know a superior solution.