

Author: “No Bugs” Hare Follow: Job Title: Sarcastic Architect Hobbies: Thinking Aloud, Arguing with Managers, Annoying HRs,

Calling a Spade a Spade, Keeping Tongue in Cheek

So, there is no Wyoming. And remember: if they said it on television, it MUST BE TRUE! Garfield the Cat, 'It MUST BE TRUE!' episode —

There is a common (mis)perception out there that whatever is written in a book, is to be trusted implicitly. As we’ve seen in the previous part of this article, it is not always the case. In particular, we were able to identify 8 different mistakes, one style issue and one “almost-mistake”, all within 42 lines of code from the book, and presented as tutorial (!).

Today we continue with the further 2 pages (containing 58 lines of code) from the same book. This will bring the total number of lines to 100, and we’ll be able to answer the question from the title of these 2 articles – “how many mistakes can fit into 100 lines of book tutorial code”.

So, let’s take a look at the next 2 pages out of the very same book [ProgrammingMultiPlayerGames]:

NB:To see the image more clearly, please click on it

NB2: No books have been harmed in the production of these scans (the effect of the pages being torn out is photoshopped)

Mistake #9: Missing const qualifier

void dreamMessage::WriteString(char *s) //NOBUGS: missing 'const' in declaration { if(!s) { return; //NOBUGS: see Mistake #10 } else Write(s, strlen(s) + 1); }

In the declaration of the dreamMessage:writeString() above there is one thing missing: it is const qualifier (the function should have been declared as WriteString(const char *s)).

In general, const qualifiers should be used whenever semantically applicable (both in C and in C++), and WriteString() represents a very obvious case for it: parameter of WriteString() is not expected to be changed (neither it is changed), and marking its parameter as being const leads to the code being more clear both to compiler and to developer (and also prevents accidental changes which violate the const contract). Reasoning behind using const is widely available, and we won’t discuss it in detail here; those interested in the reasoning behind, can refer to [SutterAlexandrescu].

Actually, this is not the first place where this mistake (missing const qualifier) has occurred within the code: it has already been observed in dreamMessage::Write(void *d, int length).

Mistake #10: NULL parameter value causes Different Message Format

“While the detection of pointer parameter being NULL is almost universally a good idea, having it silently ignored might lead to difficult-to-find bugs.While the detection of pointer parameter being NULL is almost universally a good idea, having it silently ignored might lead to difficult-to-find bugs. This alone would qualify this code as an almost-mistake, but in the WriteString() function above, the situation is significantly worse.

If NULL value is passed to WriteString(), then a message is generated (with no indication of any error whatsoever), but the format of this message-when-WriteString()-parameter-is-NULL will be diffferent than the format-of-the-message-when-WriteString()-parameter is not NULL. More specifically, when parameter is NULL, there will be nothing written to the message, not even a ‘\0’ character, so when the parser on the receiving side calls ReadString() to parse the output of our WriteString() call, it will read the data beyond what-was-written-by-this-WriteString() call, with generally unpredictable results. In general, such a practice is extremely error-prone and will lead to very-difficult-to-find bugs.

Instead of silently returning here, good practice would be to throw an exception, or at least (in the extreme case, bordering with a mistake) to have an assert()1 or (once again bordering with a mistake) – to write a single ‘\0’ char to the message, to keep the message format correct regardless of the function input values.

Mistake #11: Implicit Prohibitions and Implicit Modes

void dreamMessage::BeginReading(int s) { size = s; readCount = 0; }

“It means that if I'm intermixing calls to Write*() and BeginReading() functions - which is intuitively ok from my perspective as a developer-using-dreamMessage - my code is likely to fail in an unexpected-to-me way.Within the dreamMessage::BeginReading(), there is a very subtle mistake, which originates from a mix-up between a message and its parsing (see Mistake #1 in the first part of the article). The problem here is that the data member size is re-used for both reading and writing the message; most of the time it has the semantics of “size of the message”; however, for the BeginReading(int) function this semantics becomes broken, and the message size begins to be controlled externally (and arbitrarily). It means that if I’m intermixing calls to Write*() and BeginReading() functions – which is intuitively ok from my perspective as a developer-using-dreamMessage2 – my code is likely to fail in an unexpected-to-me way. The failure (from my intuitive perspective) will be resulting from an unexpected behaviour that after the call to BeginReading(int), not only position-until-which-Read-will-parse-the-message will change, but also a point where new data will be appended to the message, which is a counterintuitive side-effect of BeginReading(int) call.

Turning on my mind-reading machine, I can guess that the author of class dreamMessage didn’t think that such a usage scenario is allowed, i.e. his assumption was that the the caller may not intermingle Writes and Reads. Such a restriction would be fine as such, but then it needs to be enforced, so that the developer who is doing something wrong with dreamMessage, gets notified about The Horrible Mistake in His Ways as soon as possible. With this restriction in mind, we can say that our message can be in one of two states – “being read” or “being written”, with certain operations prohibited in each of these states. As I’ve said above, the problem here is not about having this restriction, but about having it implicit and non-enforced. To make the dreamMessage solid in this regard, these requirements need to be made explicit, by having a separate data member state, and by asserting (or by throwing an exception) whenever a function-inappropriate-for-this-state is called.

In practice, to avoid this problem in the case of dreamMessage, a much better approach would be to address Mistake #1, and to separate the parser from the message (adding a parameter parse_size to the parser to provide functionality of BeginReading(int)). However, in some other cases, making a previously-implicit state explicit (and asserting when the contract between the class user and the class itself is violated) can be a Really Good Idea (which will save a lot of trouble for developers-using-your class down the road).

Mistake #12: Returning the Pointer to Static

char *dreamMessage::Read(int s) { static char c[2048]; if(readCount+s > size ) return NULL; else memcpy(&c, &data[readCount], s); //NOBUGS: potential memory corruption (see Mistake #13) readCount += s; return c;//NOBUGS: returning pointer-to-static }

Unlike the previous mistake, there is nothing subtle about this one: it is very obvious. As you can see, this function returns a pointer to the static array c. And returning a pointer to a static is a really bad coding practice which should be avoided (in most of the code, including network-related one).

One big reason to avoid returning a pointer to static, is that such code will break really badly as soon as you try to use it in a multi-threaded environment (and as the book is about multi-player(!) games, multi-threaded stuff might be needed, at least in the network part of the code). More generally, returning pointer-to-static means that using dreamMessage::Read() is not reentrant.

If the previous reason is not sufficient, another reason to avoid such practice is that it makes the code counter-intuitive and error-prone (for example, a second call to Read() will change the data-which-has-been-read by a previous call to Read()).

K&R C In 1978, Brian Kernighan and Dennis Ritchie published the first edition of The C Programming Language. This book, known to C programmers as 'K&R', served for many years as an informal specification of the language— Wikipedia —Bottom line: avoid returning pointers to the static data at all cost; while this practice has been used in times of K&R and since that point has found its place in the C standard library, it has been discouraged for at least the last 15 years or so (with reentrant versions of such functions, such as asctime_r() and strtok_r(), specified in more recent versions of POSIX).

In this particular case, the caller could have provided a buffer (and buffer size(!)) where the result should be placed, or a pointer to some kind of allocated buffer might be returned, or even could have used something like std::vector<uint8_t>.

This mistake will be repeated twice in the 100 lines of code under analysis.

Mistake #13: Potential Memory Corruption

In the dreamMessage:Read() above, if the parameter s is > 2048, a buffer overwrite (likely causing a memory corruption) will occur. To make things worse, this “s MUST be <= 2048” requirement is based on an arbitrary size of the buffer buried deep within function implementation.

If you really need to have a fixed-size buffer, have the decency to throw an exception or at least (bordering with a mistake) to assert()3 when its size is exceeded. Relying on the caller always being sane is a really bad idea (and makes life of the caller – when debugging her code – Much Much Worse than it should be).

Mistake #14: A Little Marvel: Abusing Return Values

char dreamMessage::ReadByte(void) { char c; if(readCount+1 > size) c = -1;//NOBUGS: '-1' is a legal value for a char else memcpy(&c, &data[readCount], 1); readCount++; return c; }

There is an error-handling code in this function, but there is a subtle but really bad problem with it. When the caller is trying to read beyond the size, the function detects it. So far, so good. But what does the function do then?

It returns -1, and the return type is char, which means that if there was a byte -1 written to the message by the sender, the return value of this function becomes indistinguishable from the error return value.

These return codes are normally used (elsewhere in the book) in a manner such as:

//from ReadString() function: char c = ReadByte(); if(c == -1 || c ==0)//NOBUGS: See Also Mistake #14a break;

It means that if the string contains a perfectly valid character with ‘-1’ value, the parser will interpret it as an end of the string. Moreover, it will leave the parser in an inconsistent state, causing further parsing to read invalid data.

In the C standard libraries, there is a practice of returning ‘-1’ on error, and it is fine – as long as ‘-1’ is not a valid return value. For example, getc() function does return ‘-1’ (more precisely – EOF), but the return type of getc() is int (not char), and all valid chars are returned as positive values, which makes the function sane (unlike dreamMessage::ReadByte() above).

While expanding the return type of ReadByte() would fix the bug, in C++ it’s better to use exceptions and avoid thinking about this kind of stuff (while simplifying error handling for the caller too).

This mistake will be repeated 4 times on our 100 lines of code torn out of the book.

Mistake #14a: Honourable Mention: Relying on Char being Signed

Actually, the code right above (with a comment “from ReadString() function”), has one more bug (thanks to Tobi for mentioning signed/unsigned char issue, which has lead to discovering this mistake). The problem with this code is that comparison if(c==-1) will work properly only if char c is a signed char. Otherwise (if char is unsigned), it will be cast to int, and after this cast it will be equal to 255, so that comparison with -1 will never work.

As C standard doesn’t specify whether char should be signed or unsigned (and that even worse, there is a compile-time switch in most compilers to affect char being signed or unsigned with default behaviour differing from one compiler to another one4), such code is incorrect. While this bug is not, strictly speaking, in our 100 lines of code under review, it still clearly deserves a honourable mention in our bug parade; on the other hand, it can be seen as a code issue even while staying strictly within ReadByte() function.

Fortunately, avoiding such bugs is easy – by using int8_t (or uint8_t if you prefer your type being unsigned) instead of char. In practice, I usually prefer to keep my own typedef byte8 as uint8_t.

Mistake #15: Life after Death

core dump In computing, a core dump (in Unix parlance), memory dump, or system dump consists of the recorded state of the working memory of a computer program at a specific time, generally when the program has terminated abnormally (crashed).— Wikipedia —In addition to the abuse-of-return-values described above, the very same dreamMessage::ReadByte() function has yet another potentially-coredumping mistake (though chances of this core dump happening are not too high in practice). More specifically, whenever the size is exceeded, ‘-1’ is returned. However, as the function doesn’t simply return ‘-1’, but also increments readCount. Then, if the caller repeats these calls to ReadByte() long enough, it will eventually cause readCount to wrap around. And after readCount wraps around, it will go into the negative area, so the error detection condition won’t fire, and the code will fall through to the line #7, likely causing a read from inaccessible memory (which in turn usually leads to core dump).

The best would be to throw a C++ exception in case of buffer overread.

This mistake will be repeated 4 times on our 4 book pages; note that the simple ‘return -1’ instead of ‘c = -1’ is not exactly correct for functions such as ReadShort() etc., as it might lead to reading the discarded data; for functions such as ReadShort(), the correct non-exception version would be something along the lines of { readCount = size; return -1; } – though I strongly recommend using exceptions instead.

Mistake #16: Magic Number

Magic Number The use of unnamed magic numbers in code obscures the developers' intent in choosing that number, increases opportunities for subtle errors, and makes it more difficult for the program to be adapted and extended in the future.— Wikipedia —In the dreamMessage:ReadByte() above, ‘-1’ is a typical “magic number”, and magic numbers are to be avoided. The reasons are numerous (see, for example, [Martin] and [Stroustrup]), and while I am myself particularly guilty of using magic numbers in my code, I admit that it is indeed a mistake (well, at least when the magic numbers go beyond the implementation details of one single class).

This mistake is repeated four times within 100 lines of code under review.

Conclusion

We’ve managed to find 16 different mistakes (with 39 instances of these mistakes), one style-issue and one almost-mistake (with 7 instances in total), all this within 100 lines of supposedly tutorial code. I think it perfectly illustrates my point that not everything from the books (or even my blog posts for that matter) should be taken as an example of “how to write software”.5

Acknowledgement

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.