Writing Testable Code

Isa Goksu, ThoughtWorks Inc, http://www.thoughtworks.com

Every domain has its own best practices and disciplines. Let's think about an English teacher. He has to follow certain practices and adopt certain disciplines to be successful in his job and write testable code. He cannot sing a song, or use the sign language to give his lecture at its best. There are certain methods and certain practices to follow. These practices come from different experiences, people and time. I'm not going to explain how an English teacher should teach, I'm sure there are many books about that. My point was about "the existence" of different methodologies, practices and disciplines to follow. These practices tell you not only how, they also tell you what, when, where and whom. This doesn't mean that you cannot create your own methodology. However, it's a fact that many people who tried to create new methodologies have failed.

I think programming is the art of translating business requirements into a computer language. And just like teaching, programming has its own practices to follow and disciplines to adopt. Test driven design, pair-programming, code-reviewing are just couple of them. There are many books that explain all these practices/disciplines. However, in this article, I would like to focus on programming from a different aspect: "Writing Testable Code". To me, this little concept is not getting enough attention. And I think it definitely needs more attention.

Problems of Dirty Code Base

Developing new software isn't just about to accomplish the business requirements within the given time frame. You have to think about the maintenance and support of the created software as well. So what happens when we write dirty code? Basically you end up with a mess. A mess which adding a new feature, changing the behavior of a component, replacing UI elements with the new ones, or any maintenance/upgrades that you could think of would be very hard.

Please notice that this problem has nothing to do with you being agile or your design skills. This is about whether someone else can understand your code after development or not. It's about making yourself indispensable or replaceable within the team, or how easy it is to introduce someone new into the team.

Isn't Unit Testing Enough?

Short answer is "No". Unfortunately we tend to think that if we have unit tests, it solves all the problems. Well, this is obviously not true. Unit tests are tools that help you to test your code whether it behaves the way you want or not. It does nothing else.

In our case, yes unit tests might help understanding the code base. However, the quality of the tests and the total test coverage in the code base is often case not good enough. It is very likely to end up having only 45-55% test coverage and 65-70% quality in the unit tests.

Besides, having unit tests doesn't prove that you followed TDD (Test Driven Design) practices. So by looking at this picture, we can presume that "Unit Testing" is actually not enough.

Unit Testing vs. Test Driven Design

Unfortunately many developers still don't know or don't think about the difference. The definition of testing methodologies differ greatly between these two designs. As I mentioned earlier, "Unit Testing" is the practice of testing your own code to understand whether it behaves the way you want or not. On the other hand Test Driven Design is the discipline of using "Unit Testing" when you design the software. This small difference actually makes big impact on your software design. Let me explain why:

If your development team is using "Unit Testing"; creating the code base first (even a single unit) and write the tests after creating the code base, it actually means that you're hanging yourself with "Unit Testing" rope. The reason is every time you need a change in the design that affects your unit; you will have to fix the existing unit tests too. In the beginning it seems like a small problem, but it will be a real big problem later in the development phase since you will have lots of unit/integration tests depend on certain design assumptions. And when this design changes, your tests will all have to be changed. After some time you will also notice that your development team is spending more time on fixing tests than developing the software itself. Moreover, it is very hard to identify "what to test" for an already written code than a code that you are about to write. This is also why many developers that follow "Unit Testing" start testing the methods in the unit than the unit behaviors.

However, if your development team is adopting the TDD discipline, you will notice all these design changes will be identified in the early phase [1] by developers. Writing tests first will allow you to discover the spots where new design is crucial, just by showing how hard/easy to write the test for that particular behavior.

On top of this, TDD will ensure that your unit/integration tests will be based on behaviors rather than methods. As we all know, most design changes are just the separation of concerns within the unit/module. Another way of saying this is, your responsibilities will shift to some other unit/module, but the behavior will still exist in your system. As you can see, instead of maintaining the existing unit/integration tests, you will be just moving them to the new unit.

I guess this difference is a subject for another article.

What Problem Exactly Are We Talking About?

Robert C. Martin has a really nice book about "Clean Code". This book covers lots of hints about agile software development. I strongly recommend you to read it. I want to quote something from the book:

"You are reading this book for two reasons. First, you are a programmer. Second, you want to be a better programmer. Good. We need better programmers." [2]

What is better programmer? I believe Uncle Bob is not talking about the programmers who can write one-liners (one line that does the whole behavior). Once you read the book, you will understand that he is actually talking about the programmers that can write clean, easy to understand code. I want to add one more thing to this, which was not in the book. Better programmers are always replaceable in the team.

So how do we ensure that our development team consists of better programmers? Well, this is a topic for a book unfortunately. However, I believe those are the programmers who can manage to write the simplest thing that could possibly work [3], and they are replaceable anytime within the team.

Let's look at the possible problems within a development team that are the symptoms of dirty code base:

When you introduce a new team member, it takes more than a day for him to start coding

When your team members are becoming more indispensable every new day

When you want to change the behavior of newly written code piece, it requires original developer to do it

When your team members are spending too much time on testing

When nobody could understand certain code pieces without rewriting them

When your team members don't want to work on certain area

When senior team members having hard time to mentor new team members

Yes, all these are the symptoms of dirty code base. Why is this dirty code? It is a dirty code because all these problems are the results of some concessions while coding. Concessions that make your code hard to understand, follow, change or replace. And when you make some concessions, next developer makes his own concession on top yours. So what are these concessions then?

Adding too many dependencies (create big object graphs) Hiding dependencies (i.e. within constructor, private/static methods, etc) Making objects hard to initialize Creating so many execution paths from a single point Using train wrecks (trying to use/access a method/property by chaining so many other objects) Using too many arguments for a single method Creating longer methods/classes Breaking SOLID [4] rules Repeating yourself Over-designing/thinking And many more...

In the next chapter, I will try to address some of these problems and solution proposals. I believe if we try to make our code testable, we can take a good big step towards being a "Better Programmer" and having a clean code base. And I don't think these are really hard things to do. Most of them are pretty simple and easy to accomplish by just paying a little more attention to details.

Testability

I would like to mention that this part of the article is not about software design. This is all about paying more attention to details and focusing on testability. All these items can be controlled while doing a code review.

Working with new code base and existing code base has different problems when it comes to testability. For new code base, you can pretty much apply all agile and XP (Extreme Programming) practices. On the other hand applying these practices to existing code base is not always possible. In this article I would like to focus on new code base only. However, I would like to write a separate article for existing code base and testability.

New code base is always fun to work with. As I mentioned earlier, you can pretty much apply all agile and XP practices. I will assume that you are already familiar with these practices. If you are not, you can still continue reading. However, I would definitely recommend you to look at these practices [5] in no time.

Too Many Dependencies

First rule of writing testable code to me is having a good dependency management within the code base, or at least having the least possible dependencies in your object/module relations. Basically, when you refer another object within your object, you depend on that object. When these object dependencies reach to a certain level, you would have unbreakable code pieces. Or in other terms, it would be impossible to understand this particular code piece without understanding lots of other code pieces. And sometimes this goes to deep and you would have transitive dependencies; you depend on a unit, and that unit depends on some other units. And this big chain of dependencies brings so much information overhead that nobody can easily understand the unit/module after a while.

Every dependency ties your unit to another unit/module. It means without having that unit/module, you cannot use your unit anywhere in the system. However, having no dependency is very theoretical situation. When you use database access, you would probably depend on the database layer in the application. When you modify the customer details, you would probably depend on the customer service in the application.

So is this a problem? In fact, it is not a problem. Having a centralized database layer or a customer service is always a good idea. However, if your unit has more than 3-4 of these dependencies, then it is a big problem. It means that, you are introducing more possible change reasons to your code. According to "Single Responsibility Principle" [6], your unit cannot have more than one reason to change.

Another aspect of this problem is the unit tests. When you have too many dependencies, you will find yourself trying very hard to instantiate your unit while writing the unit tests. Because you have too many dependencies, it will be very hard to instantiate the unit under test. Let me try to explain this with some code snippets:

class CreditCard {

PaymentService paymentService;

CustomerService customerService;

OrderService orderService;

EmailService emailService;

// more dependencies...



void charge(amount) { /* use these services... */ }

}

And a possible unit test for charging:

testChargeNothingWhenNegativeAmountIsGiven() {

CreditCard card = new CreditCard();

PaymentGateway paymentGateway = new PaymentGateway();

PaymentService paymentService = new PaymentService();

paymentService.setPaymentGateway(paymentGateway);

CustomerService customerService = new CustomerService();

OrderService orderService = new OrderService();

EmailService emailService = new EmailService(); card.setPaymentService(paymentService);

card.setCustomerService(customerService);

card.setOrderService(orderService);

card.setEmailService(emailService);

// many more setups // and your test case

}

As you see, after 3-4 dependencies your code starts becoming unreadable. So how do we solve this problem? Actually it's very simple. Keep the dependent units that have same reason to change together, and the rest separate. In above example, Email Service has definitely different reasons to change than our unit. On the other hand Payment Service, Order Service and Customer Service have more likely to have same reason to change. So separating different concerns out of our unit would make it cleaner, and easier to understand. And more importantly, it will reduce the dependencies of our unit.

To solidify above example, creating an asynchronous Email Job that uses the Email Service to take care of the emailing part will reduce the dependencies. And/or replacing the logic that uses Customer/Order Service to find out some information about the Customer/Order with the information itself that Credit Card unit needs, will definitely reduce the dependencies.

If we can pay a little more attention to this little dependency chain, our code will be clearer to understand.

Hidden Dependencies

Another problem that comes with dependencies is hidden dependencies. Hidden dependency is a dependency that you cannot see the dependency through the code easily and it is generally noticed in the run-time only. Let me give you an example of it:

// .. some code

creditCard.charge();

// .. some other code

When we run above code, we get an exception in the run-time that says: "Payment Service is null/not assigned". Then we realize that we forgot to set the payment service. And we set it:

// .. some code

PaymentService paymentService = new PaymentService();

creditCard.setPaymentService(paymentService);

creditCard.charge();

// .. some other code

When we try second time running above code, we get another exception in the run-time that says: "Payment Gateway is null/not assigned". Then again, we realize that we forgot to set the Payment Gateway for the Payment Service. And we set it too:

// .. some code

PaymentGateway paymentGateway = new PaymentGateway();

PaymentService paymentService = new PaymentService();

paymentService.setPaymentGateway(paymentGateway);

creditCard.setPaymentService(paymentService);

creditCard.charge();

// .. some other code

When we try one last time running above code, we get some other exception in the run-time that says: "PayPal Web Service is null/not assigned". After figuring out all the dependencies and set them properly, we can run our code without a problem:

// .. some code

PayPalWebService paypalWebService = new PayPalWebService();

PaymentGateway paymentGateway = new PaymentGateway();

paymentGateway.setClearingHouse(paypalWebService);

PaymentService paymentService = new PaymentService();

paymentService.setPaymentGateway(paymentGateway);

creditCard.setPaymentService(paymentService);

creditCard.charge() ;

// .. some other code

As you see, it takes 2-3 tries to find out all the minimum dependency settings for our unit. And the process of identifying these dependencies won't make a big change for even an experienced developer.

You can think that having a dependency injection framework solves all these problems. Yes, it will indeed solve our problem here. However, it won't make your code testable. When you try to test above piece of code, you will still need your dependency injection framework which is itself another dependency. Isolating your unit will be almost impossible. Besides, understanding the code is still another aspect of this problem. Even though you have a dependency injection framework, it doesn't mean that you code is easy to understand.

So how do we solve this problem? Well, making your dependencies obvious. To do so, we can use one of several alternative methods. My favorite is using Constructor Injection. Constructor Injection is a method of injecting your dependencies to your unit in the construction time. Let's see in an example:

class CreditCard {

private PaymentService paymentService;



public CreditCard(PaymentService paymentService) {

this.paymentService = paymentService;

}



// more code..

}

If we try to use this unit anywhere in the system, it is very obvious that this unit requires a Payment Service to work with. And when you try to test the unit, it is now very easy to inject a fake/stub/mock Payment Service which doesn't require any initialization.

Another possible solution to this problem is, using method level injection. Again, the idea stays same: make your dependencies obvious.

// some code..

public void charge(PaymentService paymentService) {

// use this service

}



// more code..

Dependency Lookups

Another dependency related problem is Dependency Lookups. Dependency Lookup is a technique that comes from several patterns like Service Locator Pattern, Repository Pattern, etc. The idea is very simple: Instead of instantiating the dependencies by yourself, keeping all dependencies in a central repository where every unit can ask their dependencies from this repository.

Although it sounds very good, it brings some complexity to your system later in the development phase. One of the major concerns about Dependency Lookup is they are not testable. Let's see in the example:

// some code..

Reader reader = Repository.load(Reader.class); // this part may vary

// more code..

So as you can see from above code, there is no dependency injection or hard-coded instantiation of the dependency in the code. However, when you try to test this code snippet, you will find yourself in need of using this very same repository. This is exact same problem with static method usages. Any static/global access in the code is always hard to test.

Another problem with Dependency Lookups is you cannot easily see which implementation of this dependency is coming from the repository. In above example, if you ask yourself: "Is this a FileReader, or StringReader or SocketReader?" you will see that it is hard to answer this question when all of these implementations implements the same Reader interface.

Possible solution to this problem is extracting this snippet as a separate method. And when you test this unit, overriding this extracted method and returning a stub/mock will do the magic. Let's see in an example:

// some code..

Reader reader = getSocketReader(); // extract method



protected Reader getSocketReader() {

return Repository.load(Reader.class);

}



// more code..

And when you want to stub this Reader in the test, all you need to do is overriding this method and returning a stub/mock equivalent.

class MyFakeUnit extends MyUnit {

@Override protected Reader getSocketReader() {

return new MyStubbedSocketReader();

}

}

Now you can use MyFakeUnit in the test instead of MyUnit to isolate this dependency from the test. Ideally you try to avoid using Dependency Lookups. However, if it is there because of some framework usage, above solution will do the trick.

Object Initializer Blocks

Object initialization has two aspects. First one is about the constructor/initialization block, and second one is about the usage of this object. There are certain things that we should avoid while coding the constructor like hidden dependencies, switch cases, if clauses, loops, etc. However, the most major thing that you should be aware of, to me is avoiding Singleton dependency (one instance in entire system), and static method usage. Developers mostly tend to load some configuration or a socket connection (database, HTTP, remote, etc) in the constructor:

class MyUnit {

private DBConnector db;

private Configuration config;



public MyUnit() {

db = DBConnector.getInstance();

config = Configuration.Load(Configurations.SYSTEM);

}

}

If you check the above code piece, you will see that db variable is using a Singleton object and config variable is using some static method to initialize itself. So what is wrong with it? Well the problem is isolation of this unit. As you see, it is impossible to inject any fake/mock DBConnector or Configuration class to this unit while we're doing unit testing. If your DBConnector instance is connecting to the production system, then every time you run your unit test, it will try to connect to the production system database which we don't want it for sure.

Another problem with this type of constructors is that it's also causing the "hidden dependency" problem that we described earlier. To solve all these problems, it is better to extract these dependencies and use the extracted methods instead:

// some code..

db = getDatabaseConnector();

config = getSystemConfiguration();



// and the extracted parts would be

public DBConnector getDatabaseConnector() { /* .. */ }

public Configuration getSystemConfiguration() { /* .. */ }

Second problem with object initialization blocks is their usages. It sometimes can be brutal because of the transitive dependencies. Let's see it in an example:

// some code..

public CreditCard(PaymentService paymentService, CustomerService customerService) {

// details..

}



// and usage of this unit somewhere in the system

PayPalWebService paypalWebService = new PayPalWebService();

PaymentGateway paymentGateway = new PaymentGateway();

paymentGateway.setClearingHouse(paypalWebService);

PaymentService paymentService = new PaymentService();

paymentService.setPaymentGateway(paymentGateway);



ProfileService profileService = new ProfileService();

CustomerService customerService = new CustomerService(profileService);



CreditCard card = new CreditCard(paymentService, customerService);

As you see, all these transitive dependencies tie you up so bad that you cannot even do a simple initialization. So how do we solve this problem? Well, if it would just be the matter of testing, we would inject fake/mock/stub services to the CreditCard. However, we are talking about the real usage of this CreditCard class. In this case, we have two options. We either eliminate the dependencies, or we will look for a dependency injection framework. My personal choice is focusing on the first one if possible. If you were building an enterprise level system, then going for second option would make more sense. Depending on the language that you are using for development, you can find lots of different frameworks that help you on injecting dependencies.

Unclear Execution Paths

Every time your program sees a conditional clause, it diverts the execution. These diversions are called execution paths. For instance, you would have two separate execution paths for the user who is logged in or not. The problem is not about having different execution paths. It is about abusing the usage of them. Sometimes developers create 4-5 different execution paths from a single point. And on top of this, they might even add some nested execution paths. Check the following example:

// some code..

if (lineItem != null && (lineItem.quantity > 0 && customer.isActive() && (card.expirityYear > 2010 || card.type == AMX)) || couponCode.isValid()) {

// details..

}

Can you tell how many execution paths this particular conditional case creates? I am not even sure if I typed correctly. When this developer leaves the team, this business logic will be very hard to identify. And moreover, how to test, or what to test here? As you see, it is really hard to decide.

"if" usages should be always under control. If we cannot control, it means that we cannot test it. Ideally it is always wise to avoid any of if usages, if it is possible. So I prepared a quick guideline for "if" usages:

Allow them when:

Condition can be expressed without "AND", "OR" operators:

if (myContainer.hasItems()) {

//...

}

Condition is about an encapsulated field:

if (_visibility == VISIBLE) {

//...

}

Condition is about non-float numerical comparison:

if (items.size() > 2) {

//...

}

Be cautious when:

There is more than one "AND", "OR" operator:

if (_isVisible && _isParent) {

//...

}



// refactor it:

if (canAddUser()) {

//...

}

There is a negative logic:

if (!_isParent) {

//...

}



// change your logic and make it positive if possible,

// positive statements are easier to understand

There is a very long condition because of variable/method names:

if (payPalWebService.isClearingHouseAvailableForPurchasing(myLineItem)) {

//...

}



// extract a method and give a short name that tells

// the purpose of this condition

if (clearingHouseReady(myLineItem)) {

//...

}

There is a "if/else" logic:

if (_isParent) {

number = generateItemNumber(this);

} else {

number = 0;

}



// ensure you forced everything to not have this situation.

// if you still have, try doing something like:

number = 0;

if (_isParent) {

number = generateItemNumber(this);

}

Don't allow when:

It is just a "NULL" check:

if (user != null) {

//...

}



// try not having a null user in the first place.

// if you don't know how to avoid, please refer to my other article [7]





There are more than two "AND", "OR" operators:

if ((_isVisible && !isParent) || iAmMaster || youCanPassIsSet) {

//...

}



// avoid having multiple conditions in one clause.

// if you already have, extract a method out of it:

if (isValid()) {

//...

}

There is nested "if/else" statement:

if (isThisTrue) {

if (checkIfHeIsTheMaster() && !hmmmThisSeemsWeird()) {

// do something...

} else {

// do something else...

}

} else {

// do some other thing...

}



// you should never have this situation.

// try using well known OOP technique called POLYMORPHISM,

// or if the operator is one of (>, <, ==) then various possible

// design patterns like Strategy (behavior modification),

// Visitor (context evaluation) or Specification (item selection).

// if you're not happy, but you don't know what to do then talk

// to someone you think he/she might know.

If you can follow above guideline, it will be really easy to test your code since identifying execution paths, and creating test coverage for them will be very obvious.

Train Wrecks

"Train Wrecks" is a term used to refer too many object messaging in one line. For instance:

object.message(); // this is a normal call

object.someOtherObject.someService.getMeSomeInfo(); // this is a train wreck

As you see, reading the second line and understanding the components of it takes some time. It also violates "Law of Demeter" [8] which has only 3 rules to watch:

Each unit should have only limited knowledge about other units: only units "closely" related to the current unit.

Each unit should only talk to its friends; don't talk to strangers.

Only talk to your immediate friends.

Even though having train wreck expressions seems useful, it could be a real pain when you get an exception in that particular line. Identifying the source of that exception always takes more time than a simple expression. Also your components become tightly coupled. Later in the design, untying these dependencies would be really hard.

So how do we solve this problem? Basically, avoid the train wrecks, and just follow the 3 rules above.

Conclusion

I hope this opens your vision a bit. It is very hard for me to cover everything here. I'm sure all the examples I gave might have exceptional cases depending on the context. However, all these things are all about paying more attention. As you can see from above examples, all of them can be easily done while doing pair programming or code reviews. Writing clean code, making your code testable is not hard; it is very easy indeed if you do pay a little attention to details. And it would make you much better developer. Although being indispensable in the team seems good to many of us, it is actually never good. It just increases your responsibilities, your tasks, and it forces you to work more and more. And it brings you at such a point where no one would want to work with you. So don't be indispensable.

For more information about all these topics, please refer to Refactoring Book [8], SOLID Principles, and Clean Code talks [9].

References

More Java and Software Testing Knowledge

Java Tutorials and Videos

Software Testing Magazine

Click here to view the complete list of archived articles

This article was originally published in the Spring 2010 issue of Methods & Tools