The following is a list of notes taken on writing clean code, i.e. code that is maintainable and extensible.

Names

Naming is the hardest and most important part of writing clean code. Names should clearly express intent. This assumes that everyone working on the codebase shares the same cultural background, which is not always the case in practice. Some general tips:

Classes should be Nouns, e.g. User

Methods should be verbs, e.g. getById(), save()

No need for prefixes such as m_ or str in strongly typed languages

Pick one word per concept, e.g. fetch, retrieve and get are semantically equal, so use only one of them
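As a rough sketch of the naming tips above: the User class and UserRepository below are hypothetical names invented for illustration, not taken from any real library. The class is a noun, the methods are verbs, fields carry no type prefixes, and every lookup uses the single word "get".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain class: a noun, with no type prefixes on its fields.
final class User {
    private final long id;
    private final String name; // not strName or m_name

    User(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long getId() { return id; }

    String getName() { return name; }
}

// Hypothetical repository: every lookup says "get", never a mix of
// fetch / retrieve / get for the same concept.
final class UserRepository {
    private final Map<Long, User> usersById = new HashMap<>();

    void save(User user) {
        usersById.put(user.getId(), user);
    }

    Optional<User> getById(long id) {
        return Optional.ofNullable(usersById.get(id));
    }

    Optional<User> getByName(String name) {
        return usersById.values().stream()
                .filter(user -> user.getName().equals(name))
                .findFirst();
    }
}
```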

Functions

Functions or methods are the fundamental building blocks of programming. In fact, the internal operation of programs normally consists mostly of functions pushing data onto and popping data off the stack as they call each other. Sometimes memory needs to be allocated on the heap for data that must survive across function calls.

When a function is called, a stack frame is created to support the function's execution. The stack frame contains the function's local variables and the arguments passed to the function by its caller. The frame also contains housekeeping information that allows the called function (the callee) to return to the caller safely. The exact contents and layout of the stack vary by processor architecture and function call convention.

An example of a call stack for the DrawLine function, which is called by DrawSquare.
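The caller/callee relationship can also be observed from inside a running program. The sketch below is only an illustration: CallStackDemo is a made-up class, and it assumes the JVM reports every frame (the Java documentation allows a virtual machine to omit frames in some circumstances). It asks the current thread for its stack trace while drawLine is executing and keeps the frames that belong to the demo class, innermost first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class mirroring the DrawSquare -> DrawLine example.
final class CallStackDemo {
    // Returns the names of this class's methods that are currently on the
    // call stack, innermost (most recently called) first.
    static List<String> drawLine() {
        List<String> frames = new ArrayList<>();
        for (StackTraceElement frame : Thread.currentThread().getStackTrace()) {
            if (frame.getClassName().equals(CallStackDemo.class.getName())) {
                frames.add(frame.getMethodName());
            }
        }
        return frames;
    }

    // The caller: its frame sits directly below drawLine's frame.
    static List<String> drawSquare() {
        return drawLine();
    }
}
```

Calling CallStackDemo.drawSquare() should list drawLine before drawSquare, since the callee's frame is pushed on top of the caller's.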

Some general tips on writing functions:

Functions should be small and they should do 1 thing only:

Only have 1 level of indentation - highly nested functions should be refactored into sub-routines

No side-effects!

    public int sum(int a, int b) {
        int result = a + b;
        resetGui(); // this is untestable and introduces a hidden dependency!
        return result;
    }

Do not return null - the caller will always need to check for it, cluttering up the code; consider using special case return values

Don't pass null as a parameter value either

Prefer exceptions for error conditions, except in cases where a Nullable or Optional type is available

Should ideally return a value, especially for monadic (single-argument) functions, as this allows function chaining

Fewer arguments are better - the more arguments a function takes, the more complex it is and the more test cases need to be written
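A minimal sketch of a few of these tips, assuming a made-up Functions class and a made-up findFirstLongerThan helper: sum has no side effects, and the lookup returns an Optional rather than null so the "not found" case cannot be forgotten.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical holder class for the two example functions.
final class Functions {
    // Pure: the result depends only on the arguments and nothing else is touched.
    static int sum(int a, int b) {
        return a + b;
    }

    // Returns Optional.empty() instead of null when no word is long enough.
    static Optional<String> findFirstLongerThan(List<String> words, int length) {
        for (String word : words) {
            if (word.length() > length) {
                return Optional.of(word);
            }
        }
        return Optional.empty();
    }
}
```

Because a value is always returned, calls can be chained, e.g. Functions.findFirstLongerThan(words, 3).map(String::toUpperCase).orElse("").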

Object Oriented Programming

An important but subtle point to note in OOP is that objects hide their data behind abstractions and expose functions that operate on that data, whereas data structures expose their data and have no meaningful functions. Good OOP requires knowing when to use objects and when to use data structures. Consider the following example:

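    // Data structure: exposes its data and has no behaviour
    public class Point {
        public double x;
        public double y;
    }

    // Object: hides its representation behind an abstraction
    public interface Point {
        double getX();
        double getY();
        void setCartesian(double x, double y);
        double getR();
        double getTheta();
        void setPolar(double r, double theta);
    }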

In the second Point definition the coordinate system used by the implementation is not known; it need not be Cartesian or polar at all!
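To make the point concrete, here is one possible implementation of that interface. PolarPoint is a hypothetical name chosen for this sketch; it stores polar coordinates internally, yet callers can still work entirely in Cartesian terms and never learn which representation is used.

```java
// The interface from the example above, repeated so the sketch is self-contained.
interface Point {
    double getX();
    double getY();
    void setCartesian(double x, double y);
    double getR();
    double getTheta();
    void setPolar(double r, double theta);
}

// Hypothetical implementation that keeps polar coordinates internally.
final class PolarPoint implements Point {
    private double r;
    private double theta;

    @Override public double getX() { return r * Math.cos(theta); }

    @Override public double getY() { return r * Math.sin(theta); }

    @Override public void setCartesian(double x, double y) {
        r = Math.hypot(x, y);        // distance from the origin
        theta = Math.atan2(y, x);    // angle from the positive x-axis
    }

    @Override public double getR() { return r; }

    @Override public double getTheta() { return theta; }

    @Override public void setPolar(double r, double theta) {
        this.r = r;
        this.theta = theta;
    }
}
```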

Tips on writing clean OO code:

Classes should be small and follow the Single Responsibility Principle (SRP)

Classes should have high cohesion, i.e. operate on a small number of variables

Avoid using boundary interfaces, e.g. instead of returning a Map, wrap it in a class (Sensors) to encapsulate the implementation

Comments should only be used for clarification or amplification - avoid in general and let the code do the talking!

Prefer exceptions to error codes - error codes have the habit of spilling out into the entire system

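Use unchecked exceptions - checked exceptions break encapsulation because a new exception thrown at a low level forces signature changes all the way up the call chain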
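A possible shape for the Sensors wrapper mentioned above is sketched here. The Sensor class, the add and getById methods and the choice of IllegalArgumentException are assumptions made for illustration; only the Sensors name comes from the notes. The Map never leaks out of the class, and a missing sensor is reported with an unchecked exception rather than null or an error code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical element type held by the wrapper.
final class Sensor {
    private final String id;

    Sensor(String id) { this.id = id; }

    String getId() { return id; }
}

// Wraps the Map so that callers depend on this small API, not on Map itself.
final class Sensors {
    private final Map<String, Sensor> sensorsById = new HashMap<>();

    void add(Sensor sensor) {
        sensorsById.put(sensor.getId(), sensor);
    }

    // Throws an unchecked exception instead of returning null.
    Sensor getById(String id) {
        Sensor sensor = sensorsById.get(id);
        if (sensor == null) {
            throw new IllegalArgumentException("Unknown sensor: " + id);
        }
        return sensor;
    }
}
```

If the storage later changes from a Map to something else, only Sensors has to change.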

Finally, procedural code makes it hard to add new data structures because all the functions must change. OO code makes it hard to add new functions because all the classes must change. Again, writing clean code requires insight as to when to use which style of programming.

Test Driven Development

Unit tests should follow the F.I.R.S.T principle, i.e. they should be Fast, Independent of any external dependencies or manual setup, Repeatable, Self-validating (no manual verification) and Timely (written just before the production code). Some general tips on writing clean tests:

Tests should be readable above all else and this might mean relaxing certain production code restrictions on performance

Single concept per test

Create helper methods to simplify complicated setups

Convert multiple asserts into a single assert via a state pattern, e.g. by comparing against one value that encodes the whole expected state
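The sketch below illustrates two of these tips (single concept per test, helper methods for setup). It is written in plain Java with a hand-rolled assertEquals so that it stays self-contained; a real project would more likely use a test framework such as JUnit. ShoppingCart and all method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class under test.
final class ShoppingCart {
    private final List<Integer> pricesInCents = new ArrayList<>();

    void add(int priceInCents) {
        pricesInCents.add(priceInCents);
    }

    int totalInCents() {
        int total = 0;
        for (int price : pricesInCents) {
            total += price;
        }
        return total;
    }
}

final class ShoppingCartTest {
    // Helper method: hides the setup so each test reads as one sentence.
    private static ShoppingCart cartWith(int... pricesInCents) {
        ShoppingCart cart = new ShoppingCart();
        for (int price : pricesInCents) {
            cart.add(price);
        }
        return cart;
    }

    // One concept: the total is the sum of the item prices.
    static void totalIsSumOfItemPrices() {
        assertEquals(449, cartWith(250, 199).totalInCents());
    }

    // One concept: an empty cart totals zero.
    static void emptyCartTotalsZero() {
        assertEquals(0, cartWith().totalInCents());
    }

    // Minimal stand-in for a framework assertion.
    private static void assertEquals(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }
}
```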

The three laws of TDD:

First law: You may not write production code until you have written a failing test.

Second law: You may not write more of a test than is sufficient to fail, and not compiling is failing.

Third law: You may not write more production code than is sufficient to pass the currently failing test.

Not quite TDD.

System Design

Classes should follow the open-closed principle - open for extension but closed for modification. Consider the following example where we write an AreaCalculator which calculates the total area of a collection of rectangles.

    import java.util.Collection;

    public class Rectangle {
        public double width;
        public double height;
    }

    public class AreaCalculator {
        public double calculateArea(Collection<Rectangle> rectangles) {
            double result = 0;
            for (Rectangle r : rectangles) {
                result += r.width * r.height;
            }
            return result;
        }
    }

We would now like to extend this function to calculate the area of circles as well. Our new function now looks as follows:

    public abstract class Shape { }

    public class Rectangle extends Shape {
        public double width;
        public double height;
    }

    public class Circle extends Shape {
        public double radius;
    }

    public class AreaCalculator {
        public double calculateArea(Collection<Shape> shapes) {
            double result = 0;
            for (Shape s : shapes) {
                if (s instanceof Rectangle) {
                    Rectangle r = (Rectangle) s;
                    result += r.width * r.height;
                } else {
                    // anything that is not a Rectangle is assumed to be a Circle
                    Circle c = (Circle) s;
                    result += c.radius * c.radius * Math.PI;
                }
            }
            return result;
        }
    }

Extending this further to calculate the area of triangles now requires another modification to the calculateArea method, i.e. it is not open for extension. We can change this by introducing an area method on the Shape data structure. Our code now looks like the following:

    public abstract class Shape {
        abstract double area();
    }

    public class Rectangle extends Shape {
        public double width;
        public double height;

        @Override
        public double area() {
            return width * height;
        }
    }

    public class Circle extends Shape {
        public double radius;

        @Override
        public double area() {
            return radius * radius * Math.PI;
        }
    }

    public class AreaCalculator {
        public double calculateArea(Collection<Shape> shapes) {
            double result = 0;
            for (Shape s : shapes) {
                result += s.area(); // note the simplicity
            }
            return result;
        }
    }

Dependency Inversion Principle - depend upon abstractions and interfaces, not concrete implementations

When building large software systems, try to avoid doing a big design up front - use a dependency injection container to separate cross-cutting concerns like transactions, logging, etc. from business logic
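The dependency inversion idea can be sketched without any container, using plain constructor injection. All names below (MessageSender, SignupService, RecordingSender) are hypothetical and exist only for this illustration; a dependency injection container would do the wiring that is done by hand here.

```java
import java.util.ArrayList;
import java.util.List;

// The abstraction that the business logic depends on.
interface MessageSender {
    void send(String recipient, String message);
}

// Business logic: knows only the interface, never a concrete sender.
final class SignupService {
    private final MessageSender sender;

    SignupService(MessageSender sender) {
        this.sender = sender;
    }

    void signUp(String email) {
        sender.send(email, "Welcome!");
    }
}

// One concrete implementation; it records messages instead of sending them,
// which also makes it convenient as a stand-in during tests.
final class RecordingSender implements MessageSender {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String recipient, String message) {
        sent.add(recipient + ": " + message);
    }
}
```

Swapping RecordingSender for an implementation that sends real email would not require any change to SignupService.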
