Many people think that the readability of code has to do with the letters and symbols used. They believe it is the adding, removing, or changing of those symbols that makes code more readable. In some sense, they’re right. However, the underlying principle is:

Readability of code depends primarily on how space is occupied by letters and symbols.

What does that mean? Well, it means two things:

1. Code should have the proper amount of white space around it. Not too much, not too little.

2. There should be the proper amount of space within a line of code to separate out the different parts. Separate actions should generally be on separate lines. Indentation should be used appropriately to group blocks of code.

With this principle, it’s actually the absence of code that makes things readable. This is a general principle of life. For example, if there were no space at all between letters and words in a book, it would be hard to read. On the other hand, it’s easy to see the moon against the clear night sky, because there’s a lot of clear black space that isn’t the moon. Similarly, when your code has the right amount of space in it, you can easily tell where the code is and what it is.

For example, this code is hard to read:

x=1+2;y=3+4;z=x+y;print"hello world";print"z is"+z;if(z>y+x){print"error";}

Whereas with the proper spacing in, around, and between the lines, it becomes easy to read:

x = 1 + 2;
y = 3 + 4;
z = x + y;
print "hello world";
print "z is" + z;
if (z > y + x) {
    print "error";
}

However, there can also be too much space, or space in the wrong places. This code is also hard to read:

x = 1+ 2;
y = 3 +4;
z = x + y;
print "hello world" ;
print "z is " + z;
if (z > y+x) { print "error" ; }

Code itself should take up space in proportion to how much meaning it has.

Basically, tiny symbols that mean a lot make code hard to read. Very long names that don’t mean much also make code hard to read. The amount of meaning and the space taken up should be closely related to each other.

For example, this code is unreadable because the names are too small:

q = s(j, f, m);
p(q);

The space those names take up is very little compared to how much meaning they have. However, with appropriately sized names, it becomes more apparent what that block of code is doing:

quarterly_total = sum(january, february, march);
print(quarterly_total);

On the other hand, if the names are too long compared to how much meaning they represent, then the code becomes hard to read again:

quarterly_total_for_company_x_in_2011_as_of_today = add_all_of_these_together_and_return_the_result(january_total_amount, february_total_amount, march_total_amount);
send_to_screen_and_dont_wait_for_user_to_respond(quarterly_total_for_company_x_in_2011_as_of_today);

This principle applies just as well to entire blocks of code as it does to individual names. We could replace the entire block of code above with a single function call:

print_quarterly_total();

And that is even more readable than any of the previous examples. Even though the name we used, print_quarterly_total, is a bit longer than our other names for things, that’s okay because it represents more meaning than the other pieces of code do. In fact, it’s even more readable than our block of code was by itself. Why is that? Because the code block took up a lot of space for, effectively, very little meaning, and the function takes up a more reasonable amount of space for the same meaning.
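To make that concrete, here is a minimal sketch in Python of what might sit behind such a call. The monthly figures, the quarterly_total helper, and the choice to pass the months in as arguments are all assumptions made for illustration; the example above doesn't say where the numbers come from.

```python
def quarterly_total(january, february, march):
    """Return the sum of the three monthly totals."""
    return sum([january, february, march])


def print_quarterly_total(january, february, march):
    """Print the quarter's total, so the caller reads one meaningful name."""
    print(quarterly_total(january, february, march))


# Hypothetical monthly figures, purely for illustration.
print_quarterly_total(100, 250, 175)
```

The details of adding and printing still exist, but they now live in one place, and everywhere else the code takes up only as much space as its meaning deserves.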

If a block of code takes up a lot of space but doesn’t actually have much meaning, then it’s a good candidate for refactoring. For example, here’s a block of code that handles some user input:

x_pressed = false;
y_pressed = false;
if (input == "x") {
    print "You pressed x!";
    x_pressed = true;
}
else if (input == "y") {
    if (not y_pressed) {
        print "You pressed y for the first time!";
        y_pressed = true;
        if (x_pressed) {
            print "You pressed x and then y!";
        }
    }
}

If that were our whole program, that would probably be readable enough. However, if this is within a lot of other code, we could make it more readable like this:

x_pressed = false;
y_pressed = false;
if (input == "x") {
    handle_x(x_pressed);
}
else if (input == "y") {
    handle_y(x_pressed, y_pressed);
}

And we could make it even more readable by reducing it to this:

handle_input(input);

Reading “handle_input” in the middle of our code is much easier than trying to read that whole first block, above, because “handle_input” is taking up the right amount of space, and the block is taking up too much space. Note, however, that if we’d done something like h(input) instead, that would be confusing and unreadable because “h” is too short to properly tell us what the code is doing. Also, handle_this_input_and_figure_out_if_it_is_x_or_y_and_then_do_the_right_thing(input) would not only be annoying for a programmer to type, but would also make for unreadable code.
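For the curious, here is one rough way the code behind handle_input could be organized in Python. This is only a sketch under a few assumptions: the KeyTracker class is a hypothetical helper that isn't part of the example above, the pressed flags are kept between calls instead of being reset each time, and the handlers return their messages so the caller decides how to display them.

```python
class KeyTracker:
    """Remembers which keys have been pressed so far (hypothetical helper)."""

    def __init__(self):
        self.x_pressed = False
        self.y_pressed = False

    def handle_input(self, key):
        """Dispatch one key press and return the messages it produces."""
        if key == "x":
            return self._handle_x()
        if key == "y":
            return self._handle_y()
        return []

    def _handle_x(self):
        self.x_pressed = True
        return ["You pressed x!"]

    def _handle_y(self):
        # Only the first press of y produces any messages.
        if self.y_pressed:
            return []
        self.y_pressed = True
        messages = ["You pressed y for the first time!"]
        if self.x_pressed:
            messages.append("You pressed x and then y!")
        return messages


tracker = KeyTracker()
for key in ["x", "y", "y"]:
    for message in tracker.handle_input(key):
        print(message)
```

The loop at the bottom is all a reader of the surrounding code has to look at. The branching on x and y still takes up space, but only inside the class, where somebody who cares about it can go and find it.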

Naming Things

A famous programmer once said that naming things is one of the hardest problems in computer science. These principles of readability give us some good clues on how to name things, though. Basically, the name of a variable, function, etc. should be long enough to fully communicate what it is or does, without being so long that it becomes hard to read.

It’s also important to think about how the function or variable is going to be used. Once we start putting it into lines of code, will it make those lines of code too long for how much meaning they actually have? For example, if you have a function that is only called once, on one line all by itself, with no other code in that line, then it can have a fairly long name. However, a function that you’re going to use frequently in complex expressions should probably have a name that is short (though still long enough to fully communicate what it does).
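Here is a small Python sketch of that trade-off. Every name and number in it is made up for illustration: clamp is used twice inside one expression, so it gets a short name, while load_settings_and_apply_defaults is called once on a line by itself, so it can afford a longer one.

```python
def clamp(value, low, high):
    """Limit value to the range from low to high."""
    return max(low, min(value, high))


def load_settings_and_apply_defaults(overrides):
    """Start from default settings and apply the caller's overrides."""
    settings = {"volume": 50, "brightness": 80}
    settings.update(overrides)
    return settings


# Called once, alone on its line, so the long name costs very little.
settings = load_settings_and_apply_defaults({"volume": 140})

# Used twice in one expression, so a short name keeps the line readable.
level = (clamp(settings["volume"], 0, 100)
         + clamp(settings["brightness"], 0, 100)) / 2
```

If clamp had a name as long as the settings function, that last expression would take up far more space than its meaning deserves.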

-Max