My job has required me to read quite a bit of code written by other people, and over the years I've found that there are some problems that tend to recur again and again. The idea with this blog entry is to write up some tips on this once and for all, in the hope that it can help people out there who struggle to write clean code.

The main goal with these tips is readability. This is important because usually your code has to be changed or reviewed by other people, and writing readably makes these people's jobs a lot easier. That may be obvious. What's less obvious is that it also makes your own job easier, because working on simple code is a lot easier than working on complex code, even while you are writing it, and the more readable code is always simpler.

An important part of making your code readable is to always work to communicate your intent. The basic idea is that it's usually not difficult to see what a single line of code does. The tricky thing is to see why it's doing what it does. So the more you can make your code communicate the thinking behind the code, the better it is. You'll see this point recurring repeatedly below.

I should add that most people seem to be aware of most of these rules, in a general kind of way. The main point of this posting is really to convince people that these rules are important. So important, in fact, that you should try to always follow them, and that you should have a really bad conscience about any code that breaks them. So bad that you just never write any such code. Some of these rules require you to write a little extra code, but I think I can claim that none of them cause you to actually lose time, on any project.

1. Always know why you are catching an exception

This has to be the most common mistake I see people make, and it seems to me that exceptions is a feature many people haven't really learned how to use yet. One of the reasons for this is probably that before the advent of Java the languages people used at university did not have exceptions, and so no rules were taught about this. It's not a trivial feature to use, so it's hardly surprising that people struggle.

Anyway, the cardinal rule is: if you catch an exception you need to have a good reason. Reporting an expected error condition to the user in a more friendly way is a good reason. Handling an expected, but unusual, condition is another. But very often people just swallow the exceptions with an empty catch block (this is a real nono), or they do like this:

private static List readLines(String fileName) { String line; ArrayList file = new ArrayList(); try { BufferedReader in = new BufferedReader(new FileReader(fileName)); while ((line = in.readLine()) != null) file.add(line); in.close(); } catch (Exception e){ System.out.println(e); return null; } return file; }

The catch block here does nothing useful. In fact, it hides useful information if the program were to crash. If you don't catch the exception, you get a traceback. If you do catch it you get much less. Plus, it takes up 5 lines, and it makes the method almost twice as big as it needs to be. Doing without is just a lot better.

That would give us this:

private static ArrayList readLines(String fileName) throws IOException { String line; ArrayList file = new ArrayList(); BufferedReader in = new BufferedReader(new FileReader(fileName)); while ((line = in.readLine()) != null) file.add(line); in.close(); return file; }

2. Never declare implementation types

This is another thing I see very often: parameters, variables, or even return types that are really implementation types. You can see this in the code above, where ArrayList is used where List would do. There are several problems with this: ArrayList is longer, and you have to do a lot of work if you want to change what list implementation you are using.

The main thing, however, is that just using List communicates your intent that this be a list much more clearly. It says "I thought about this, and I really intend this collection to be a List, and not just any sort of collection". In other words: it tells the reader that you care about the order in this collection.

Another round of simplification brings us this:

private static List readLines(String fileName) throws IOException { String line; List file = new ArrayList(); BufferedReader in = new BufferedReader(new FileReader(fileName)); while ((line = in.readLine()) != null) file.add(line); in.close(); return file; }

3. Use descriptive variable names

This is a rule that can be hard to follow, but in general variables should be named so that they make it clear what a variable contains. Sometimes this is blindingly obvious, and in these cases short names like i for loop indexes, in for files is perfectly OK. But if you use a longer name, try to make it meaningful.

In the case of the method above, what does file really contain? Yes, it does contain the file, but what it really is is a list of all the lines in the file. So why not call it lines ? That would give us:

private static List readLines(String fileName) throws IOException { String line; List lines = new ArrayList(); BufferedReader in = new BufferedReader(new FileReader(fileName)); while ((line = in.readLine()) != null) lines.add(line); in.close(); return lines; }

Still not perfect, but arguably a lot more readable than the original.

4. Cut-and-paste code

One simple approach that's suggested quite often as a way to improve people's code is to disable the paste function on every developer machine. While the idea may be a little over the top, the basic thought is sound. If you have a snippet of, say, 7 lines of code that do one thing, and you want to do it again to a different set of variables, don't copy and paste the code. Instead, make a function (or a method, if you're stuck with Java).

There are several reasons for this. One is simple brevity. Another is that it makes your code much easier to read, since you would effectively be replacing 14 lines of code with just 2. This would obviously be at the expense of an additional function/method, but since these stand alone and isolated from the rest of the code, their impact on readability is very low. The main reason, however, is that by doing this you make life much easier for yourself. Even if this is a one-time script, you are still going to change the code around. This is easier if it's broken up into parts, like functions/methods.

5. Variables and communication

One of the hardest things to pick up when reading code is to understand the flow of values through the variables in the code. (This is one reason why functional programming is popular.) This has several implications for how the code should be written.

For one thing, this means that you should try to declare variables as close to where they are used as possible. Let's say you write the following:

String tmp = null; // 50 lines that don't touch tmp tmp = whatever; // ...

This means the reader has to scan 50 lines to see if tmp is used anywhere in there. Merging the assignment and the declaration would remove this problem completely, and would lose you nothing.

Another thing this means is that if a variable is only used inside a block (if statement, while loop, or whatever), then it really should be declared in there. Imagine you write this:

int tmp; for (...) { // code that uses tmp } // no more mentions of tmp below this point

In this case, you could have declared tmp inside the for loop. Not doing it sends the signal that "I'm going to keep using this variable" and forces the reader to try to figure out what happens with it later. Moving the declaration inside the loop means that you can't use the variable after the loop, which makes the code much easier to read.

A third consequence is that you should really work hard to keep your functions/methods short. If you have a function/method that's, say, 400 lines long it becomes almost impossible to track the data flow through it. There will just be too many variables and too many assignments to keep track of. So this is a real no-no. Just don't do it. And as usual with these rules, this is a rule that will also help you get the code right the first time, so this is worth doing even if you will just write the code and then discard it.

In fact, if you really do keep your methods short, a lot of the other rules here become less important, because the impact of breaking them will be lessened. The reason is of course that it's very hard for a short method to be really difficult to read. There's so little code in a short method that you can almost always digest it quite easily. So you could think of this as being the most important rule.

6. Don't preserve return values you don't use

This is also related to making it easier to track data flow. Let's say you write something like this:

boolean present = myCollection.remove(object);

You could have written it like this, however:

myCollection.remove(object);

You can safely assume that any reader of your code will know this, and therefore using the first form is a signal that "I care about this return value, and I'm going to use it for something". If you don't, you'll be confusing the reader, because they are assuming that they now need to figure out what you're going to do with this variable.

Of course, leaving it out saves space, too, so there's really no reason not to drop these variables.

7. Omit needless code!

It's quite common to find code that's commented out, but still hanging around. This is mainly bad because it bloats the code unnecessarily. People seem to do this because they want to have the possibility to bring the code back, either because they are writing an alternative they are not sure about, or (and this seems quite common) because they don't dare to delete it. However, in nearly all cases there is no real reason to keep such code around. You should be using version control, and that means you could always find any deleted code again.

Also, since this code is no longer compiled or executed there is no real difference between commenting it out or deleting it. You've still made the change. The difference is that now it's quite likely that this code will slowly move out of sync with the code around it, so that by the time you find you want it again it will no longer work. That actually makes the code dangerous, because it tempts you to include code without understanding what it does. (If you really understood it you could retype it quite quickly even if it was deleted.)

Of course, commenting out code while you are working on the code is fine. I do that a lot. However, I never check in code that is commented out, and whenever I find such code I delete it, without asking anyone or telling them. If they wanted the code they shouldn't have commented it out. And it will be in the version history, anyway.