Six ways to write more comprehensible code

How to keep your code from destroying you

I learned to write clear, maintainable code the hard way. For the last twelve years, I've made my living writing computer games and selling them over the Net using the marketing technique that was once charmingly known as shareware. What this means is that I start with a blank screen, start coding, and, a few tens of thousands of lines of code later, I have something to sell.

This means that, if I make a stinky mess, I'm doing it in my own nest. When I'm chasing down a bug at 3 a.m., staring at a nightmare cloud of spaghetti code, and I say, "Dear God, what idiot child of married cousins wrote this garbage?", the answer to that question is "Me."

So I have been well rewarded by learning about good, sane programming techniques. Some of these practices are described in this article. Many skilled, experienced, morally upright coders will already know this stuff by heart. All these people will get from this article is a chance to bathe in my delightful prose style and remember how horrible life was before they got the clean code religion.

But there are many who, like me, stumbled into programming in an unexpected or unusual way and never had anyone drill this stuff into them. These things are basic to many but, to others, they are invaluable techniques that nobody has told them about. So, to those who don't want to make a mess, this is for you.

The example case

For illustration purposes, our example program throughout this article is a hypothetical computer game called Kill Bad Aliens. In it, you will control a spaceship. It will move horizontally back and forth at the bottom of the screen, shooting bullets upwards. Your ship will be controlled by, say, the keyboard.

Figure 1. Our hypothetical game

The game will take place in periods of time called waves. During each wave, aliens will appear, one after another, at the top of the screen. They'll fly around and drop bombs. Aliens will appear a fixed period of time apart. After you kill a certain number of aliens, the wave ends.

Killing an alien gives you some points. When you finish a wave, you get a bonus number of points depending on how quickly you finished.

When a bomb hits you, your ship blows up and another appears. When you are blown up three times, the game is over. If you got a high score, you are valuable as a person. If your score is low, you are not.

So you sit down and start to write Kill Bad Aliens in C++. You define objects to represent the ship, your bullets, the enemies, and the enemy's bullets. You write the code to draw all the objects. You write the code to move them around as time passes. You write the game logic, the alien AI, the code to read the user's will from the keyboard and so on.

Now how do we do this so that, when the game is ready, the code is comprehensible, easily maintained, and, in general, not a mess?

Tip 1: Comment like a smart person.

Comment your code. Obviously. If you write a procedure, fail to comment it, and return to it a few months later to rework it (and you will), not having comments will cost you time. And time is your most valuable resource. Lost time can't ever be replaced.

But commenting, like anything else, is a skill. You get better with practice. There is good commenting, and there is bad commenting.

You don't want to write too much. Suppose you write comments for a function that, in the future, saves you ten minutes of time understanding your code. Great. But suppose your comments are so verbose that it takes five minutes to write them and then, later, five minutes to read them. Then you've saved yourself zero time. Not so good.

You don't want to write too little, either. If code goes on for a page or two without something breaking down what's going on, well, I hope that code is clear as crystal, because otherwise you're wasting future time.

And you don't want to comment in stupid ways. When people first start writing comments, they often get hyper and write things like:

// Now we increase Number_aliens_on_screen by one.
Number_aliens_on_screen = Number_aliens_on_screen + 1;

Uhmmm, duh. If something is so obvious, it doesn't need a comment. And if your code is such a tangle that you need a comment for every single line of it, you'd probably profit from making it simpler in other ways first. Comments don't just save time, they cost it. They take time to read, and they spread out the actual code on the screen, so you can have less of it on your monitor to inspect at one time.

And, while we're at it, don't ever do this:

short get_current_score()
{
    [insert a whole bunch of code here.]

    return [some value];

    // Now we're done.
}

Oh? We're done? Thanks for letting me know. That big right bracket and the infinite expanse of empty space beyond really didn't tip me off to that. And you don't need a comment before the return statement saying, "Now we return a value," either.

So, if you are writing code, in the absence of a boss or a company policy telling you what to do, how do you comment it? Well, what I do for code I am stuck with maintaining myself is write an introduction. When I return to a procedure I forgot that I wrote, I want to see an explanation for what is going on. Once I understand what the machinery is doing, it becomes infinitely easier to understand the actual coding. This generally involves:

- A few sentences before the procedure/function saying what it does.
- A description of the values being passed into it.
- If a function, a description of what it returns.
- Inside the procedure/function, comments that split the code up into shorter tasks.
- For chunks of code that seem thorny, a quick explanation of what is happening.

So we need a description at the beginning and a few signposts inside explaining the road taken. Doing this is very quick, and it saves a ton of time in the long run.

Here is an example from the theoretical Kill Bad Aliens. Consider the object representing the bullet the player fires. You will frequently have to call a function to move it upwards and see if it hits anything. I would probably code it something like this:

// This procedure moves the bullet upwards. It's called
// NUM_BULLET_MOVES_PER_SECOND times per second. It returns TRUE if the
// bullet is to be erased (because it hit a target or the top of the screen)
// and FALSE otherwise.
Boolean player_bullet::move_it()
{
    Boolean is_destroyed = FALSE;

    // Calculate the bullet's new position.

    [Small chunk of code.]

    // See if an enemy is in the new position. If so, call the enemy
    // destruction routine and set is_destroyed to TRUE.

    [Small chunk of code.]

    // See if bullet hits top of screen. If so, set is_destroyed to TRUE.

    [Small chunk of code.]

    // Change bullet's position.

    [Small chunk of code.]

    return is_destroyed;
}

If the code is clean enough, this sort of commenting should be sufficient. And it will save plenty of time the dozen times I return to this function to fix a dumb mistake I made.
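To make that outline concrete, here is one way the bracketed chunks might be filled in. Treat it as a sketch only: the Enemy and PlayerBullet structs, BULLET_SPEED, TOP_OF_SCREEN, and the crude same-column hit test are all invented for illustration, and it uses plain bool rather than the Boolean type above.

```cpp
#include <vector>

// Hypothetical values, invented for this sketch.
#define BULLET_SPEED 8    // pixels the bullet moves per call
#define TOP_OF_SCREEN 0   // y coordinate of the top edge

// Hypothetical enemy: just a position and an alive flag.
struct Enemy {
    int x, y;
    bool alive;
};

struct PlayerBullet {
    int x, y;

    // Moves the bullet upwards by BULLET_SPEED pixels.
    // enemies: every enemy in the game; any enemy that is hit is marked dead.
    // Returns true if the bullet is to be erased (because it hit an enemy
    // or the top of the screen) and false otherwise.
    bool move_it(std::vector<Enemy>& enemies)
    {
        bool is_destroyed = false;

        // Calculate the bullet's new position.
        int new_y = y - BULLET_SPEED;

        // See if a live enemy sits in the stretch of this column that the
        // bullet crosses during this move. If so, destroy it.
        for (Enemy& enemy : enemies) {
            if (enemy.alive && enemy.x == x && enemy.y >= new_y && enemy.y < y) {
                enemy.alive = false;
                is_destroyed = true;
            }
        }

        // See if the bullet hits the top of the screen.
        if (new_y <= TOP_OF_SCREEN)
            is_destroyed = true;

        // Change the bullet's position.
        y = new_y;

        return is_destroyed;
    }
};
```

Notice that the signposts inside the function are the same ones from the outline. The comments were written first, and the code was filled in underneath them.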

Tip 2: Use #define a lot. No, a LOT.

Suppose that, in our hypothetical game, we want the player to get ten points when he shoots an alien. There are two ways to do this. This is the bad way:

// We shot an alien.
Give_player_some_points(10);

This is the good way: In some global file, do this:

#define POINT_VALUE_FOR_ALIEN 10

Then, when we give up some sweet points, naturally, we'd write,

// We shot an alien.
Give_player_some_points(POINT_VALUE_FOR_ALIEN);

Most programmers know, on some level, to do things like this. But it requires discipline to do it enough. Just about every time you define a constant number, you should strongly consider defining it in one central place. For example, suppose you want the play area to be 800 by 600 pixels. Be sure to do this:

#define PIXEL_WIDTH_OF_PLAY_AREA 800
#define PIXEL_HEIGHT_OF_PLAY_AREA 600

If, at a later date, you decide to change the size of your game window (and you very well might), being able to change the value in this one spot will save you time twice. First, you don't need to search through all of your code for all of the times you said the play area was 800 pixels wide. (800! What was I thinking?) Second, you don't have to fix the bugs that will invariably be caused by the references you will invariably miss.
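Here is a small sketch of what code that leans on those constants might look like. The function clamp_ship_x and the PIXEL_WIDTH_OF_SHIP value are made up for this example. The point is only that the number 800 never appears in the logic itself.

```cpp
#define PIXEL_WIDTH_OF_PLAY_AREA 800
#define PIXEL_HEIGHT_OF_PLAY_AREA 600
#define PIXEL_WIDTH_OF_SHIP 32   // hypothetical value, not from the game

// Keeps the ship's left edge inside the play area.
// x: the position the player is trying to move the ship to.
// Returns the closest legal position.
int clamp_ship_x(int x)
{
    // The rightmost spot where the whole ship still fits on screen.
    const int max_x = PIXEL_WIDTH_OF_PLAY_AREA - PIXEL_WIDTH_OF_SHIP;

    if (x < 0)
        return 0;
    if (x > max_x)
        return max_x;
    return x;
}
```

If the play area later grows to 1024 pixels wide, this function needs no changes at all.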

When I'm working on Kill Bad Aliens, I need to decide how many aliens need to be killed to complete a wave, how many aliens will be on screen at once, and how fast they appear. For example, if I wanted every wave to have the same number of aliens all appearing at the same rate, I might be inclined to write something like this:

#define NUM_ALIENS_TO_KILL_TO_END_WAVE 20
#define MAX_ALIENS_ON_SCREEN_AT_ONCE 5
#define SECONDS_BETWEEN_NEW_ALIENS_APPEARING 3

Pretty clear. And later, if I think the waves are too short or the time between aliens is too fast, I can tweak these values and instantly rebalance the game.
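As a rough illustration of how the game logic might consume these three values, consider the two helper functions below. Their names and parameters are hypothetical. The real game would track this state however it likes.

```cpp
#define NUM_ALIENS_TO_KILL_TO_END_WAVE 20
#define MAX_ALIENS_ON_SCREEN_AT_ONCE 5
#define SECONDS_BETWEEN_NEW_ALIENS_APPEARING 3

// Decides whether a new alien should appear right now.
// aliens_on_screen: how many aliens are currently alive.
// seconds_since_last_alien: time since the last alien appeared.
// Returns true if it is time to add another alien.
bool time_for_new_alien(short aliens_on_screen, short seconds_since_last_alien)
{
    // Never go over the cap, no matter how long it has been.
    if (aliens_on_screen >= MAX_ALIENS_ON_SCREEN_AT_ONCE)
        return false;

    return seconds_since_last_alien >= SECONDS_BETWEEN_NEW_ALIENS_APPEARING;
}

// Returns true once enough aliens have died to end the wave.
bool wave_is_over(short aliens_killed)
{
    return aliens_killed >= NUM_ALIENS_TO_KILL_TO_END_WAVE;
}
```

Because neither function contains a bare number, rebalancing the game never means touching the logic.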

One nice advantage of setting up the game values like this, by the way, is that the ability to make changes so quickly can be a lot of fun and make you feel like a god. For example, if you change the above to this:

#define NUM_ALIENS_TO_KILL_TO_END_WAVE 20
#define MAX_ALIENS_ON_SCREEN_AT_ONCE 100
#define SECONDS_BETWEEN_NEW_ALIENS_APPEARING 1

Then you are one recompile away from seeing everything be funny and crazy.

Figure 2. Kill Bad Aliens before we crank up all of the constants

Figure 3. Kill Bad Aliens after we crank up all of the constants (it may not be a good game now, but it's fun to see)

By the way, you will note that I did not write any comments for the above values. That is because their meaning is obvious from the variable name. Which brings us to the next point.

Tip 3: Don't use variable names that will mock you.

The overall objective is simple: write code so that, if someone who has no idea what it's doing reads it, he or she can understand what's going on as quickly as possible.

One key strategy for achieving this goal is to give your variables, procedures, etc. good, descriptive names. If someone looks at a variable name and goes, "Yeah, I see what that is," that saves five minutes of searching through the program looking for clues to what the heck incremeter_side_blerfm is supposed to mean.

Look for a good middle balance here. Give things names that are long and clear enough that you understand what they are, but not so long and awkward that they make the code less readable.

For example, in real life, I probably wouldn't give constants names as long as I did in the previous section. I just did that so that you, the reader, would totally understand what they meant without any context. In the context of the program itself, instead of:

#define MAX_ALIENS_ON_SCREEN_AT_ONCE 5

I would almost undoubtedly write:

#define MAX_NUM_ALIENS 5

Any confusion caused by the shorter name would be cleared up very quickly, and the shorter name would lead to much more readable code.

Now consider the snippet of code I would call very frequently to move all the aliens around the screen. I would almost undoubtedly write it like this:

// move all the aliens
for (short i = 0; i < MAX_NUM_ALIENS; i++)
    if (aliens[i].exists()) // does this alien currently exist?
        aliens[i].move_it();

Note that the array of all of the aliens is just called aliens. This is perfect. It's exactly descriptive of what I want, but it's short enough that I can type it a thousand times without going mad. This is probably an array you'll be using a LOT. If you call it something like all_aliens_currently_on_screen, your code will be ten miles longer and less clear because of it.

As well, I simply called the loop variable, without any extra comment, i. When first getting into the whole descriptive variable name thing, it can be tempting to call it "counter" or something like that. Not necessary. The point of naming a variable is to enable the reader to instantly go, "Oh. I know what that does." If it's called "i", "j", or so on, everyone knows that's used for the loop. No explanation necessary.

Of course, it's possible to be much more careful about variable naming than this. For example, there is something called Hungarian Notation. There are lots of flavors of this, but the basic idea is that you put a tag at the beginning of a variable name saying what type it is. (So all unsigned long variables begin with ul, etc.) This is a bit fussier than I like to get, but it's something you should know about. It's possible to spend too much time making things clear, but it takes some effort.
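For the curious, here is roughly what that looks like next to plain names. The variables are hypothetical, and the prefixes follow one common flavor. Other flavors use different tags.

```cpp
// Hypothetical game variables named with Hungarian-style type prefixes.
unsigned long ulScore = 0;                // ul = unsigned long
int nLivesLeft = 3;                       // n  = integer
bool bGameOver = false;                   // b  = boolean
const char szPlayerName[] = "Player 1";   // sz = zero-terminated string

// The same data with plain names, leaving the type to the declaration.
unsigned long score = 0;
int lives_left = 3;
bool game_over = false;
const char player_name[] = "Player 1";
```

Both sets hold the same information. The only question is whether you want the type repeated every time the variable is used.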

Tip 4: Do error checking. You make errors. Yes, you.

If it's a decently sized program, it's going to have a lot of functions and procedures. As excruciating as it is, every one of them should have a bit of error checking.

When you create your procedure/function, you should always think, "Suppose some malevolent, insane person passed this all sorts of backwards, weird values. How can this poor, fluffy bit of code defend itself and keep the computer from exploding?" Then write your code to check for and protect itself from that weird data.

Here's an example. The main goal of our groovy space game is to kill aliens and accumulate points, so we'll need a procedure to alter the score. Furthermore, when we add points, we want to call a routine that draws pretty sparkles on the score. So here's a first pass:

void change_score(short num_points)
{
    score += num_points;

    make_sparkles_on_score();
}

So far, so good. Now ask yourself: What could go wrong here?

First, an obvious one. What if num_points is negative? Do we want to allow the player's score to go down? Well, we might. But there is nothing in the description of the game I gave earlier that mentions losing points. Plus, games should be fun, and losing points is never fun. So we will say a negative number of points is an error and must be caught.

That one was easy. But here is a more subtle problem (and one I deal with in my games all the time). What if num_points is 0?

This is a very plausible situation. Remember, we are giving a bonus at the end of every wave, based on how quickly the player completes it. What if the player is super slow? And we decide to give a bonus of 0 in that situation? Then it's entirely possible, at 3 a.m., to call change_score and pass a 0.

The problem here is that we probably don't want the scoreboard to flash with pretty colors when the number displayed doesn't change. So we want to catch that. Let's try this code:

void change_score(short num_points)
{
    if (num_points < 0)
    {
        // maybe some error message
        return;
    }

    score += num_points;

    if (num_points > 0)
        make_sparkles_on_score();
}

There. Much nicer.

And note that this was a very simple function. None of the fancy, newfangled pointers you crazy kids like to use. If you are passing arrays or pointers, then you REALLY better be watching for errors or bad data.
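To show the same habit applied to an array, here is a minimal sketch of a function that takes an index and refuses to trust it. The Alien struct, the on_screen flag, and destroy_alien are hypothetical names invented for this example.

```cpp
#define MAX_NUM_ALIENS 5

// Hypothetical alien record, invented for this sketch.
struct Alien {
    bool on_screen;
    int x, y;
};

Alien aliens[MAX_NUM_ALIENS];   // globals start out zeroed, so no aliens exist yet

// Marks one alien as destroyed.
// which_alien: index into the aliens array.
// Returns true if an alien was destroyed, false if the index was bad
// or that slot held no alien.
bool destroy_alien(short which_alien)
{
    // Refuse any index that would read or write outside the array.
    if (which_alien < 0 || which_alien >= MAX_NUM_ALIENS)
        return false;

    // Destroying an alien that isn't there is also a mistake worth catching.
    if (!aliens[which_alien].on_screen)
        return false;

    aliens[which_alien].on_screen = false;
    return true;
}
```

Two cheap checks, and no caller can scribble over memory through this function, however confused it is about which alien it meant.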

And the advantages of doing this don't end at keeping your program from exploding. Good error checking makes it faster to debug too. Suppose you know that you are writing data outside the bounds of some array, and you're going through your code looking for where that might happen. When you look at a procedure where all of your error checking is in order, you don't have to spend time picking through it to find the mistake.

This saves tons of time, and it bears repeating. Time is the most valuable resource we have.

Tip 5: "Premature optimization is the root of all evil." - Donald Knuth

I didn't make up the above sentence. But you can find it on Wikipedia, so it must be really smart.

Unless you are trying to make people suffer, your first goal, when writing code, should be clarity. Simple code is faster to write, faster to understand when you return to it later, and faster to debug.

Optimization is the enemy of clarity. Sometimes, though, you have to optimize. This is especially true in games. However, and this is the vital point, you almost never know what you need to optimize until you actually take your functioning code and test it with a profiler. (A profiler is a program that watches your program and figures out how much time it spends using different calls. These are awesome programs. Find one.)
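A real profiler does far more than this, but if you don't have one handy yet, timing a suspect call by hand gives a crude first measurement. The sketch below uses std::chrono from the C++11 standard library. The function move_all_aliens is a hypothetical stand-in for whatever code you suspect, and its loop is just busywork.

```cpp
#include <chrono>

// Hypothetical stand-in for whatever game code you suspect is slow.
long move_all_aliens()
{
    long total = 0;
    for (long i = 0; i < 100000; i++)
        total += i % 7;
    return total;
}

// Runs move_all_aliens num_calls times.
// Returns the average time one call took, in microseconds,
// or 0 if num_calls is not positive.
double average_microseconds_per_call(int num_calls)
{
    if (num_calls <= 0)
        return 0.0;

    volatile long sink = 0;   // keeps the compiler from throwing the work away

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < num_calls; i++)
        sink = sink + move_all_aliens();
    auto stop = std::chrono::steady_clock::now();

    std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / num_calls;
}
```

This is a stopwatch, not a profiler. It only tells you about the one call you thought to measure, which is exactly why a real profiler is worth finding.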

Every time I've optimized one of my games, I've invariably been blown away. The code I was most worried about was always fine. The code I'd never thought about was slow. Because I had no idea what was fast and what was slow, all optimization time I had spent before getting actual data was wasted. Worse than wasted, in fact, because it tangled up the code.

This is a hard rule to follow. Heck, if it were easy, it wouldn't have to be a rule. Good programmers tend to be offended by clumsy code that could be faster.

But rejoice! After preaching about how you should spend more time doing this and more time doing that, this is one rare, precious moment when I'm saying that it's okay to be lazy!

Write something that is clean and works. You have all the time in the world to ugly it up with optimization later. But don't do it until you're sure that you're doing the right thing.

And, speaking of doing things that hurt, here's one final bit of advice:

Tip 6: Don't be too clever by half.

There is something called the IOCCC. That is, the "International Obfuscated C Code Contest." You see, C and C++, whatever their considerable advantages, lend themselves to writing nightmarishly complicated code. This contest displays the value of clear code by celebrating the insane. It's pretty awesome.

It's worth a visit to see just how much damage you can do with an encyclopedic knowledge of a programming language, combined with a lack of shame. If you know enough, you can cram ten lines of code into one line. All it costs you is a complete inability to quickly fix the bug you put in it.

The lesson here is that if the code you are writing requires detailed knowledge of intricate rules of precedence or inspires you to look in the back chapters of some book to figure out what you're doing, you are being too clever by half.

Everyone has his or her own tolerance for complexity in code. I, personally, write my programs like stereotypical old grannies drive. As far as I'm concerned, if your C code requires you to know the difference between i++ and ++i, it is too complicated.
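Here is a small sketch of what I mean, using a hypothetical list of bullets. Both functions do the same job. The first only works if you remember that num_bullets++ hands back the old count.

```cpp
#define MAX_NUM_BULLETS 4

// Hypothetical bullet list, invented for this sketch.
int bullets[MAX_NUM_BULLETS];
int num_bullets = 0;

// Clever: stores the bullet and bumps the count in a single expression.
// It is correct only because num_bullets++ yields the OLD count.
void add_bullet_clever(int bullet)
{
    if (num_bullets < MAX_NUM_BULLETS)
        bullets[num_bullets++] = bullet;
}

// Plain: one step per line, so the order never has to be worked out.
void add_bullet_plain(int bullet)
{
    // No room left, so drop the bullet.
    if (num_bullets >= MAX_NUM_BULLETS)
        return;

    bullets[num_bullets] = bullet;
    num_bullets = num_bullets + 1;
}
```

Swap in ++num_bullets in the clever version and it skips slot 0 and eventually writes past the end of the array. The plain version has no such trap to fall into.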

You can think I'm a wimp if you want. You're right. But I spend a lot less time trying to comprehend my code than I would otherwise.

Conclusion

At this point, you may be thinking, "Wow. That was a big waste of time. All of this stuff is obvious and everyone knows it. Why did anyone write all of this?" I hope this is what you're thinking. Then you're already smart. Good for you.

But don't think that all of this is obvious to everyone. It's really not. Bad code gets written all the time. But it doesn't have to be that way.

If you are trying to wrangle tons of code and keep it from destroying you, I hope that some of this helps. Keep it simple, keep it clear, and you will save lots of time and screaming.
