High Level Testing

Virgil Dupras

2010-03-03



Too often, developers new to unit testing see unit tests as a liability. They're hard to write, and when you want to deeply refactor your code (for example, when you want to reorganize your classes' respective responsibilities), you have to change your unit tests. This discourages you from refactoring, making you less agile. These developers will probably abandon unit testing and write on their blog that TDD sucks (because it's common knowledge that unit testing is good, so you can't say that it sucks. It's easier to attack TDD, a.k.a. too many unit tests).

You know what? They're right. I think we do way too much unit testing when we should be doing integration testing instead. If you have a class that is all neat and cute and unit tested, and a new requirement in your application forces you to split that class in two (to do it correctly), you either have to change all your tests or implement a kludge that lets you keep your former architecture (don't do that).

What I've been doing for about 2 years (for moneyGuru's development, to be precise) is what I call high level testing, which is like integration testing, but with little or no unit testing. It has some downsides, but it's been working great so far and has given me nearly unlimited refactoring potential on the whole project. I'm under the impression that a lot of developers have a narrow view of unit testing, which prevents them from even considering this possibility. I hope, with this article, to broaden their horizons a little bit.

The Accounting Example

We interrupt this program for a message of general interest. In the real world, nobody cares about implementing the Fibonacci sequence. Stop using it in your examples. Stop it.

The benefits of high level testing are better illustrated with an example. I'll use my own (simplified, for the purpose of the article) use case, which is an application that does accounting. Let's say that we want a little app that has accounts with entries in them. If we write the app using TDD and "normal" unit tests, we end up with something like:

    class App(object):
        def __init__(self):
            self.accounts = []

        def add_account(self, name):
            self.accounts.append(Account(name))


    class Account(object):
        def __init__(self, name):
            self.name = name
            self.entries = []

        def add_entry(self, description, amount):
            self.entries.append(Entry(description, amount))

        def balance(self):
            return sum(e.amount for e in self.entries)


    class Entry(object):
        def __init__(self, description, amount):
            self.description = description
            self.amount = amount

and unit tests looking like:

    from nose.tools import eq_
    from accounting import App, Account, Entry

    def test_entry():
        entry = Entry('foo', 42)
        eq_(entry.description, 'foo')
        eq_(entry.amount, 42)

    def test_account():
        account = Account('foo')
        account.add_entry('bar', 42)
        account.add_entry('baz', 43)
        eq_(account.entries[0].description, 'bar')
        eq_(account.entries[1].description, 'baz')
        eq_(account.balance(), 42+43)

    def test_app():
        app = App()
        app.add_account('foo')
        app.add_account('bar')
        eq_(app.accounts[0].name, 'foo')
        eq_(app.accounts[1].name, 'bar')

On top of that, you slap a nice GUI around your neat classes. Users love your app and the suggestions keep flowing in. Users need entries that represent a transfer of money between two accounts. They're tired of having to manually add/edit/delete both sides of that transfer in each account.

You know what's coming. There's a word for that kind of entry in the accounting world: a transaction. Implementing transactions is the right thing to do, but it would completely reverse your class hierarchy. Instead of having entries belonging to accounts, you'd have transactions moving money between accounts. All your tests would have to be rewritten, yeaouch! You could kludge around your current design by somehow linking entries together, but you know it would just delay the inevitable. The temptation to throw away these unit tests and freely refactor the code becomes so strong now... You should have used high level testing!

With high level testing, you would have first asked yourself how the user would interact with your software. In this case, that would be a list of account names on the left side with a "+" button underneath (for simplicity, we overlook all other functionality, like removal), and to the right of that list, a two-column table with a "+" button and a "Balance" label underneath. After that, you design a public API for these features and make sure that your tests only use this API. The resulting code for Account and Entry would stay the same, but your App class would get larger:

    class App(object):
        def __init__(self):
            self.accounts = []

        def add_account(self, name):
            self.accounts.append(Account(name))

        def account_names(self):
            return [a.name for a in self.accounts]

        def account_balance(self, account_index):
            account = self.accounts[account_index]
            return account.balance()

        def add_entry(self, account_index, description, amount):
            account = self.accounts[account_index]
            account.add_entry(description, amount)

        def entry_rows(self, account_index):
            account = self.accounts[account_index]
            return [(e.description, e.amount) for e in account.entries]

and your tests would only use App's public API:

    from nose.tools import eq_
    from accounting import App

    def test_entry_rows():
        app = App()
        app.add_account('')
        app.add_entry(0, 'foo', 42)
        app.add_entry(0, 'bar', 43)
        eq_(app.entry_rows(0), [('foo', 42), ('bar', 43)])

    def test_account_balance():
        app = App()
        app.add_account('')
        app.add_entry(0, 'bar', 42)
        app.add_entry(0, 'baz', 43)
        eq_(app.account_balance(0), 42+43)

    def test_account_names():
        app = App()
        app.add_account('foo')
        app.add_account('bar')
        eq_(app.account_names(), ['foo', 'bar'])

If you build your app this way, you are then completely free to re-organize your underlying code to accommodate a new requirement. First, just add a new test for your requirement:

    def test_transfer():
        app = App()
        app.add_account('first')
        app.add_account('second')
        app.add_entry(0, 'transfer', 42, transfer_index=1)
        eq_(app.entry_rows(0), [('transfer', 42)])
        eq_(app.entry_rows(1), [('transfer', -42)])

Then, after you've quickly made the test pass in your old architecture (because that's what you're supposed to do with TDD: make the test pass, then refactor), you're completely free to reorganize your code into something like the code below, all without touching your tests.

    class App(object):
        def __init__(self):
            self.accounts = []
            self.transactions = []

        def add_account(self, name):
            self.accounts.append(Account(name))

        def account_names(self):
            return [a.name for a in self.accounts]

        def account_balance(self, account_index):
            account = self.accounts[account_index]
            return account.balance()

        def add_entry(self, account_index, description, amount, transfer_index=None):
            account = self.accounts[account_index]
            if transfer_index is not None:
                transfer = self.accounts[transfer_index]
            else:
                transfer = None
            transaction = Transaction(description, amount, account, transfer)
            self.transactions.append(transaction)
            account.rebuild_entries_from_transactions(self.transactions)
            if transfer is not None:
                transfer.rebuild_entries_from_transactions(self.transactions)

        def entry_rows(self, account_index):
            account = self.accounts[account_index]
            return [(e.description, e.amount) for e in account.entries]


    class Account(object):
        def __init__(self, name):
            self.name = name
            self.entries = []

        def balance(self):
            return sum(e.amount for e in self.entries)

        def rebuild_entries_from_transactions(self, transactions):
            self.entries = []
            for txn in transactions:
                if txn.from_account is self:
                    self.entries.append(Entry(txn.description, -txn.amount))
                elif txn.to_account is self:
                    self.entries.append(Entry(txn.description, txn.amount))


    class Transaction(object):
        def __init__(self, description, amount, to_account, from_account):
            self.description = description
            self.amount = amount
            self.from_account = from_account
            self.to_account = to_account


    class Entry(object):
        def __init__(self, description, amount):
            self.description = description
            self.amount = amount

See? We've just completely changed the way our classes interact with each other without having to change our tests. Of course, using high level testing doesn't mean that you'll never have to change tests again, because a change in your public API is always possible. There was even one in the example: the addition of the transfer_index argument. However, such changes are usually much less frequent and, more importantly, much less radical. They usually only require simple search and replace in the tests.

NOTE: Technically, it was too soon to perform this refactoring because the code without transactions was simpler. However, if there was an editing feature and a deletion feature, the code with transactions would become much simpler. This refactoring was, again, for the purpose of the article.
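To make the note above concrete, here is one way the transfer test might have been made to pass in the old, entry-based architecture before any refactoring. This is only a sketch: mirroring the entry with the opposite sign is an assumption about what the "quick pass" could look like, not code taken from moneyGuru.

```python
class Entry(object):
    def __init__(self, description, amount):
        self.description = description
        self.amount = amount


class Account(object):
    def __init__(self, name):
        self.name = name
        self.entries = []

    def add_entry(self, description, amount):
        self.entries.append(Entry(description, amount))

    def balance(self):
        return sum(e.amount for e in self.entries)


class App(object):
    def __init__(self):
        self.accounts = []

    def add_account(self, name):
        self.accounts.append(Account(name))

    def add_entry(self, account_index, description, amount, transfer_index=None):
        self.accounts[account_index].add_entry(description, amount)
        if transfer_index is not None:
            # Assumed quick pass: mirror the entry on the other account
            # with the opposite sign. The two entries are not linked.
            self.accounts[transfer_index].add_entry(description, -amount)

    def entry_rows(self, account_index):
        account = self.accounts[account_index]
        return [(e.description, e.amount) for e in account.entries]
```

The mirrored entries know nothing about each other, so editing or deleting one side of a transfer would require extra bookkeeping. That is where the transaction-based design starts to pay off.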

The Downsides

This nearly unlimited refactoring potential comes at a price. The biggest downside is that it makes tests more complex. For every little thing that you want to test, you have to have the whole App set up. For example, if you want to add amount formatting to your app, you can't just add a format_amount() function and test it in isolation (well, you can, but that limits your refactoring potential). You have to create an App instance, add an account, add an entry and then test the formatting.

The solution to that is to build a testing framework specific to your application. You have lots of tests that create an app with an account with an entry in it? Build an app_with_entry(description, amount) helper function and use it in your tests. It might sound like a hackish way to solve the problem, but in the case of moneyGuru, a rather nicely designed helper suite emerged from the initially hackish code (I love the concept of code "emerging" and "evolving"...). The key to this is to keep test-writing convenient. If you start every test with a big sigh of total boredom, then you need to write yourself some helper code.
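A minimal sketch of such a helper might look like the following. The App class here is a compact stand-in written for this example (it stores entries as plain tuples) so that the block runs on its own, and the test uses plain assert statements instead of eq_. Only the app_with_entry(description, amount) name and signature come from the text above.

```python
class App(object):
    """Compact stand-in for the article's App, for illustration only."""

    def __init__(self):
        self.accounts = []  # list of (name, entries) pairs

    def add_account(self, name):
        self.accounts.append((name, []))

    def add_entry(self, account_index, description, amount):
        self.accounts[account_index][1].append((description, amount))

    def entry_rows(self, account_index):
        return list(self.accounts[account_index][1])

    def account_balance(self, account_index):
        return sum(amount for _, amount in self.accounts[account_index][1])


def app_with_entry(description, amount):
    """Return an App holding one unnamed account with a single entry."""
    app = App()
    app.add_account('')
    app.add_entry(0, description, amount)
    return app


def test_entry_rows():
    app = app_with_entry('foo', 42)
    assert app.entry_rows(0) == [('foo', 42)]


def test_account_balance():
    app = app_with_entry('bar', 42)
    app.add_entry(0, 'baz', 43)
    assert app.account_balance(0) == 42 + 43
```

Each test now starts from one line of setup, and the helper still goes through the public API only, so it survives the same refactorings the tests do.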

Another downside is the public API. If you make all your calls go through that App instance, you'll end up with a huge and ugly unit. The solution to this problem, again, is open-ended. For moneyGuru, interactions with the GUI are handled by individual public controllers. For each GUI element, a table for example, there's a controller that provides a public API for the view to use. This controller then talks to the "core" of the application. Tests only use these public controllers to manipulate the app. The end result is pretty nice looking and works very well for moneyGuru, but I'm sure that there are other ways to deal with this problem.
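To give a rough idea of the shape this can take in the accounting example, here is a sketch with one controller per GUI element. The names AccountListController and EntryTableController are hypothetical and invented for this illustration, and the App core is a compact stand-in. None of this is moneyGuru's actual code.

```python
class App(object):
    """Compact stand-in for the application "core"."""

    def __init__(self):
        self.accounts = []  # list of (name, entries) pairs

    def add_account(self, name):
        self.accounts.append((name, []))

    def account_names(self):
        return [name for name, _ in self.accounts]

    def add_entry(self, account_index, description, amount):
        self.accounts[account_index][1].append((description, amount))

    def entry_rows(self, account_index):
        return list(self.accounts[account_index][1])

    def account_balance(self, account_index):
        return sum(amount for _, amount in self.accounts[account_index][1])


class AccountListController(object):
    """Hypothetical public API for the account list view and its "+" button."""

    def __init__(self, app):
        self.app = app

    def add(self, name):
        self.app.add_account(name)

    def names(self):
        return self.app.account_names()


class EntryTableController(object):
    """Hypothetical public API for the entry table and "Balance" label."""

    def __init__(self, app, account_index):
        self.app = app
        self.account_index = account_index

    def add(self, description, amount):
        self.app.add_entry(self.account_index, description, amount)

    def rows(self):
        return self.app.entry_rows(self.account_index)

    def balance(self):
        return self.app.account_balance(self.account_index)


def test_entry_table():
    app = App()
    AccountListController(app).add('checking')
    table = EntryTableController(app, 0)
    table.add('foo', 42)
    table.add('bar', 43)
    assert table.rows() == [('foo', 42), ('bar', 43)]
    assert table.balance() == 42 + 43
```

In this sketch the tests talk to the controllers, each controller stays small, and the core behind them remains free to change.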

NOTE: Curious about moneyGuru? The code is publicly available.

Conclusion

As with cross-toolkit software, high level testing is something that works very well for me but seems to be overlooked by a lot of developers. A fair share of the criticism of TDD also centers on how time-consuming and error-prone it is to constantly change unit tests. Hopefully, this testing method will be considered more in the future and TDD critics will start to have more valid arguments.