C coding style guidelines

Scope

Attempts are frequently made to prescribe a ‘coding style’, often taking the form of a list of simple mechanical syntactic rules such as ‘operators shall be separated from their operands by a single space’ or ‘nested blocks shall be indented by precisely three spaces’. These attempts generally just reflect the particular preferences of the author, and his or her own ideas about what is ‘readable’ code and what is not. Readability, however, is in the eye of the beholder; what one programmer will regard as elegant and clear will be unusable to another. Any naïve set of rules, rigidly enforced, will alienate, annoy, and reduce the productivity of more people than it pleases: a recipe for disaster.

In this guide I shall try to show how a little intelligence can be applied to the complex question of code readability. What I present is not a simple-minded list of ‘thou shalt nots’; it is more a set of ideas for the construction and setting out of a program. The ideas are based on the way I write code and, of course, my own view of what is elegant and readable. Your opinion may differ and is precisely as valid as mine; I present my ideas here so that you can steal them if you wish.

The discussion does not cover, other than incidentally, issues related to reliability such as the use of assert(), or to the building of large programs from many files.

Examples are presented in C, although many of the ideas are applicable to other languages such as VHDL or HTML - and sometimes even English!

Layout

There are programs available, such as ‘cb’, that will rewrite code into a particular fixed format. The idea of this section is to look at ways of trying to add information to the code by the way it is laid out, rather than stripping out such information as ‘cb’ does.

Program context

One of my principles here is that a program should not occupy too much space on the screen when it is being edited. The brain is only capable of holding so much information, and the more context that can be displayed the easier the code will be to grasp. It is, however, wise to limit oneself to 80 characters per line: many printers do not make a good job of printing longer lines.

Indentation

Indentation is often a helpful way to indicate the nesting of blocks in code. But not always: I find

for(i=0;i<10;i++)
for(j=0;j<10;j++)
for(k=0;k<10;k++)
  printf("%d %d %d\n",i,j,k);

clearer than

for(i=0;i<10;i++)
  for(j=0;j<10;j++)
    for(k=0;k<10;k++)
      printf("%d %d %d\n",i,j,k);

Nested ‘if-then-else’ blocks can sometimes also be clearer if not indented.

Keeping the indentation increment small (I use 2 spaces) helps keep line lengths below 80 characters.

Clutter

If you’re going to use indentation to indicate structure, there is no need for markers such as ‘{’ and ‘}’ to be prominent: you can write

if(a>b) {
  u++;
  v++;}
if(c>d) {
  u--;
  v--;}

rather than

if(a>b) {
  u++;
  v++;
}
if(c>d) {
  u--;
  v--;
}

Very short functions can be conveniently written on one line:

int incircle(x,y) double x,y; {return x*x+y*y<1.0;}

Clutter can also be reduced by use of the comma operator thus:

if(a>b) u++,v++;
if(c>d) u--,v--;

Alignment

The fact that program code occupies a two-dimensional space on the page or on the screen can be used to good advantage as shown in the following examples:

if(t0&&t1    ) u++,v--    ;
if(    t1&&t2)     v++,w--;
if(t0&&    t2) u--,    w++;

a=((((((((((((((((0
   <<1     )|lmw)
   <<3     )|le )
   <<1     )|lsc)
   <<1     )|lsm)
   <<1     )|lsl)
   <<BLTE-4)|lb )
   <<5     )|lf )
   <<2     )|lm );

int isinf (s,e,m) int s,e,m; {return(      m==0&&e==0xff);}
int ispinf(s,e,m) int s,e,m; {return(s==0&&m==0&&e==0xff);}
int isminf(s,e,m) int s,e,m; {return(s==1&&m==0&&e==0xff);}
int isnan (s,e,m) int s,e,m; {return(      m!=0&&e==0xff);}
int isden (s,e,m) int s,e,m; {return(      m!=0&&e==0   );}

The examples show how much easier it is to see the symmetries of the code at a glance from the ‘two-dimensional’ structure: the code is therefore simpler to check and more likely to be correct.

Comments

A comment is not necessarily a good thing. Comments require maintenance, which requires time, effort and expense. It is worth considering carefully whether a comment is really called for before writing it. Ask: does it help to understand the code? Is it likely to go out of date? Would it be better to invest time in laying the code out so as to display its structure better?

In general, putting comments after the code to which they refer, on the same line, avoids ambiguity and saves space. Thus

p+=l;   /* add on length of new string */
q+=m;   /* allocate space for tokens */

is clearer than either

p+=l;
/* add on length of new string */
q+=m;
/* allocate space for tokens */

or

/* add on length of new string */
p+=l;
/* allocate space for tokens */
q+=m;

Comments which simply repeat the code, such as

a++;    /* increment a */
p+=l;   /* bump p */

are useless and best deleted.

There is no need to repeat others’ work. Don’t attempt to describe a difficult algorithm if you can cite a book or paper instead. Avoid citing URLs: the file won’t be there in a year’s time, and it’s easy to find an on-line copy of a paper from its title and author.

Implementation

Generality and boundary cases

Try to write code that is as general as possible. For example, if there are two otherwise equally good ways to express a sort algorithm, one of which will work correctly with empty input and one of which will not, prefer the first. Making the ‘zero case’ work is usually no extra effort in C (although it is harder in some other languages), and results in clearer, more concise code: you can write

sort(p,l);

instead of

if(l!=0) sort(p,l);

and the savings accumulate. Indexing things from zero (and that includes using a ‘little endian’ convention) often simplifies matters.

Failure to work correctly on empty input is frequently a symptom of a poorly thought out algorithm.
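
As an illustration of the ‘zero case’ coming for free, here is a minimal sketch of an insertion sort. The element type (int) and the signature of sort() are assumptions made for this example; the text above does not specify them.

```c
#include <stddef.h>

/* Insertion sort of the l ints starting at p, into ascending order.
   The outer loop body never runs when l is 0 or 1, so empty input
   needs no special case and p is never dereferenced. */
void sort(int *p, size_t l)
{
  size_t i, j;
  int v;

  for (i = 1; i < l; i++) {
    v = p[i];                       /* element to be inserted */
    for (j = i; j > 0 && p[j-1] > v; j--) p[j] = p[j-1];
    p[j] = v;
  }
}
```

Because the loop bounds alone decide whether any work is done, a caller can write sort(p,l) without first testing l.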

Naming

Avoid long, similar-looking variable names. It is tedious for the reader to have to check carefully to distinguish

a=array_index_upper_bound;

from

a=array_index_lower_bound;

and tedious for the programmer to ensure that each instance is correctly spelled.

Constants and macros

There is no need to ‘#define’ constants that are obvious, short, and never going to change. I have seen:

#define SECONDS_IN_A_MINUTE (60)

However, the following are worthwhile:

#define MAP_COLOURS (4)   /* colours needed for planar map */
#define PI (3.1415926536)
#define CURRENCY "GBP"

since they are respectively non-obvious, long, and conceivably liable to change.

Writing macros as ‘atomic’ and syntactically callable like functions reduces scope for error. Write

#define X (3+(Y))

rather than

#define X 3+Y

and

#define F(a) (..., ... (a), ...)

rather than

#define F(a) ...; ... (a); ...

so that

if(x) F(y); else ...

works as expected.
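
The two points about macros can be sketched concretely as follows. The names X_BAD, X_GOOD, BUMP, twice_bad(), twice_good() and step() are hypothetical, chosen only for this illustration, as is the value given to Y.

```c
#define Y      4            /* assumed value, for illustration only */
#define X_BAD  3+Y          /* not atomic: breaks up inside a larger expression */
#define X_GOOD (3+(Y))      /* atomic: always behaves as a single operand */

/* Statement-like macro written as one comma expression, so that
   'BUMP(u,v);' is a single statement.  Had it been written as
   '(a)++; (b)++' the 'else' in step() would no longer attach to
   the 'if', and step() would not compile. */
#define BUMP(a,b) ((a)++,(b)++)

int twice_bad (void) {return 2*X_BAD ;}   /* expands to 2*3+4     */
int twice_good(void) {return 2*X_GOOD;}   /* expands to 2*(3+(4)) */

void step(int t, int *u, int *v)
{
  if(t) BUMP(*u,*v); else (*u)--;
}
```

Here twice_bad() yields 10 rather than the intended 14, while twice_good() yields 14.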

Sources of information

Try to arrange for a single source for pieces of information. Write

#define L2SZ (10)       /* log base 2 of size */
#define SZ (1<<L2SZ)

rather than

#define L2SZ (10)       /* log base 2 of size */
#define SZ (1<<10)

since then things are more likely to stay consistent when changes are made.
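
A slightly fuller sketch of the same idea: everything that depends on the size is derived from L2SZ, so changing that one number changes all of it together. MASK, ring, put() and get() are hypothetical names introduced for this example.

```c
#define L2SZ (10)         /* log base 2 of size: the single source */
#define SZ   (1<<L2SZ)    /* derived from L2SZ */
#define MASK (SZ-1)       /* derived from SZ; valid because SZ is a power of 2 */

static int ring[SZ];      /* buffer whose size follows L2SZ */

/* Store and fetch with the index wrapped into the range 0..SZ-1. */
void put(unsigned i, int v) {ring[i&MASK]=v;}
int  get(unsigned i)        {return ring[i&MASK];}
```

If L2SZ is later changed to 12, the buffer, its size and the wrap-around mask all follow without further editing.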

Use of particular language features

Nothing is banned. Strictures of the form ‘never use goto’ are wrong-headed: there are cases where using a ‘goto’ is the simplest, clearest way to achieve what is required, for example when an error condition occurs inside a deeply-nested loop.

Observe also that the meaning of ‘break’ and ‘continue’ is not correctly preserved when a loop is added around a fragment of code. The meaning of ‘goto’, however, is correctly preserved. The ‘break’ and ‘continue’ statements should therefore be used with extreme caution.
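
The difference can be seen in a small sketch. Both functions below count the cells of an r-by-c grid (stored row by row) that are examined before the value v is met; the function names and the grid representation are assumptions made for this example.

```c
/* Using goto: the jump leaves both loops, however deeply nested. */
int visited_goto(const int *g, int r, int c, int v)
{
  int i, j, n = 0;

  for (i = 0; i < r; i++)
    for (j = 0; j < c; j++) {
      if (g[i*c+j] == v) goto done;
      n++;
    }
done:
  return n;
}

/* Using break: correct for a single row, but once the outer loop
   is added the break leaves only the inner loop, and counting
   carries on with the next row. */
int visited_break(const int *g, int r, int c, int v)
{
  int i, j, n = 0;

  for (i = 0; i < r; i++)
    for (j = 0; j < c; j++) {
      if (g[i*c+j] == v) break;
      n++;
    }
  return n;
}
```

For a single row the two agree; with more than one row, visited_break() goes on to count cells in the rows after the one where v was found.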

Some people deprecate the conditional operator; but there are contexts where it is a little clearer and less prone to error than the corresponding construction using ‘if’, such as:

a=(t0?   x1:x2)+
  (t1?x0:   x2)+
  (t2?x0:x1   );

It is worth bearing portability in mind, however, and the more arcane features of a language are best avoided unless there is a significant counterbalancing benefit. For example, local functions are supported by gcc; but many other compilers do not support them, and that part of the gcc compiler has many more bugs than the more established parts.

This page most recently updated Tue 14 Jul 10:08:00 BST 2020