Introduction to unittest

Starting Testing with Python

Note: In Python 2.7 and 3.2 a whole bunch of improvements to unittest will arrive. The major changes include new assert methods, clean up functions, assertRaises as a context manager, new command line features, test discovery and the load_tests protocol. unittest2 is a backport of the new features (and tests) to work with Python 2.4, 2.5 & 2.6. See "unittest2: improvements to the unittest module".

Introduction

As a dynamic language Python is substantially easier to test than other languages, which means that there is absolutely no excuse for not having good tests for your Python projects. This article is about testing in Python, mainly using the Python standard library testing framework unittest.

Testing is an important subject in the Python world and there are a huge number of different testing libraries and tools available. You can find a good collection of some of the popular libraries in the Python Testing Tool Taxonomy.

Testing

There are many different ways of categorizing tests, and many names for subtly different styles of testing. Broadly speaking, the three categories of tests are as follows:

Unit tests

Functional tests (black box tests, acceptance tests, integration tests)

Regression tests

Unit tests are for testing components of your code, usually individual classes or functions. The elements under test should be testable in isolation from other parts, which means eliminating dependencies. It isn't always obvious how to do this, but there are ways of handling these dependencies within your tests. Dependencies that particularly need to be managed include cases where your tests need to access external resources like databases or the filesystem. As well as looking at the basics of setting up a test framework we'll also be looking at some of the techniques you can use to control dependencies (mock objects).

Functional tests are higher-level tests that drive your application from the outside. This can be done with automation tools, by your test framework, or by providing hooks within your application. Functional tests mimic user actions and test that specific input produces the right output. As well as testing the individual units of code that the tests exercise, they also check that all the parts are wired together correctly - something that unit testing alone doesn’t achieve. For some ideas about functional testing techniques with Python see my article Functional Testing of GUI Applications.

Regression testing checks that bugs you’ve fixed don’t recur. Regression tests are basically unit tests, but your motivation for writing them is different. Once you’ve identified and fixed a bug, the regression test guarantees that it doesn’t come back.

The easiest way to start with testing in Python is to use the standard library module unittest.

The unittest Module

unittest has its origins in the Java testing framework JUnit, which itself was a port of the Smalltalk testing framework SUnit created by Kent Beck. As it is part of the xUnit family unittest is sometimes known as pyUnit.

unittest is an object oriented framework based around test fixtures. In unittest the test fixture is the TestCase class, and its basic usage is very simple:

    import unittest

    class MyTest(unittest.TestCase):

        def testMethod(self):
            self.assertEqual(1 + 2, 3, "1 + 2 not equal to 3")

    if __name__ == '__main__':
        unittest.main()

You create a new test fixture by subclassing TestCase and defining test methods whose names start with test. The test methods perform actions with your production classes and call assert methods to verify the expected behavior and results.

In large test frameworks it is common to subclass TestCase and provide methods useful for testing your specific project. Your test modules will then subclass your custom test fixture rather than directly inheriting from unittest.TestCase.

The block at the end of the example above calls unittest.main() when the test module is executed directly from the command line:

    python test_something.py

This executes all the tests in the test module, reporting failures and errors. Test passes are shown as a '.', failures with an 'F' and errors with an 'E'.

To run a test suite consisting of several test modules they must be collected before they can be run.
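The idea of a project-specific TestCase subclass can be sketched roughly as follows. The names ProjectTestCase and assertIsSortedAscending are made up for illustration and are not part of unittest; the only unittest features assumed are TestCase.fail, TestLoader and TextTestRunner.

```python
import io
import unittest


class ProjectTestCase(unittest.TestCase):
    """Hypothetical project-wide base class with a shared helper."""

    def assertIsSortedAscending(self, values, msg=None):
        # A custom assert built on the standard failure mechanism.
        values = list(values)
        if values != sorted(values):
            self.fail(msg or "%r is not sorted in ascending order" % (values,))


class SortingTest(ProjectTestCase):
    """Test modules inherit from the project base class, not TestCase."""

    def testSortedOutput(self):
        self.assertIsSortedAscending(sorted([3, 1, 2]))

    def testUnsortedInputIsRejected(self):
        # self.fail raises AssertionError, so the helper itself is testable.
        self.assertRaises(AssertionError, self.assertIsSortedAscending, [2, 1])


# Run the tests explicitly instead of through unittest.main(), so that the
# example can be imported without exiting the interpreter.
suite = unittest.TestLoader().loadTestsFromTestCase(SortingTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print("ran %d tests, success: %s" % (result.testsRun, result.wasSuccessful()))
```

Keeping helpers like this on one base class means every test module in the project gets them simply by inheriting from it.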

Loaders, runners and all that stuff

There are a whole bunch of other classes in unittest: test runners, test loaders, test suites, test results and all that stuff. My book IronPython in Action has a more detailed description of how they wire together. Code to execute tests in multiple test modules might look like this:

    import unittest

    import test_something
    import test_something2
    import test_something3

    loader = unittest.TestLoader()

    suite = loader.loadTestsFromModule(test_something)
    suite.addTests(loader.loadTestsFromModule(test_something2))
    suite.addTests(loader.loadTestsFromModule(test_something3))

    runner = unittest.TextTestRunner(verbosity=2)
    result = runner.run(suite)

The code above imports all the test modules separately (test_something, test_something2 ...) and turns them into a test suite before executing them with a runner.

The image below shows the interactions between the classes. All of these classes can be subclassed to customize their behavior. Fortunately there is also a simpler way of collecting and running all the tests in a project.

Automatic test discovery

An easier way of running all the tests in a project is to use automatic test discovery. This is a feature that has been in alternative Python testing frameworks, such as nose and py.test, for a long time. Test discovery has finally been added to unittest in what will become Python 2.7 and Python 3.2. The test discovery has been backported as a separate module that can be used with Python 2.4 or more recent, including IronPython.

The discover module: automatic test discovery for unittest

When you run discover.py from the command line it searches from the current directory, recursing into Python packages that it finds, running all the test modules that it is able to import.

The basic way of running test discovery is from the command line with the current directory at the top level of the project:

    python discover.py

Or if the discover module is on your default module path (either in a directory pointed to by the PYTHONPATH environment variable or a directory added to the path by site.py) then you can execute it with:

    python -m discover

discover identifies test modules as importable files (inside a Python package) matching the pattern 'test*.py'. You can configure the pattern, and options like the directory discover starts its search in, with command line parameters:

    > python -m discover -h
    Usage: discover.py [options]

    Options:
      -h, --help            show this help message and exit
      -v, --verbose         Verbose output
      -s START, --start-directory=START
                            Directory to start discovery ('.' default)
      -p PATTERN, --pattern=PATTERN
                            Pattern to match tests ('test*.py' default)
      -t TOP, --top-level-directory=TOP
                            Top level directory of project (defaults to start
                            directory)

If you want to build a more complex test framework, perhaps with a custom test runner that pushes results to a database, you can still use discovery by importing and using the DiscoveringTestLoader, which is a subclass of the standard unittest TestLoader.
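In the versions of Python where discovery is part of the standard library, the same thing is available programmatically as TestLoader.discover (and from the command line as python -m unittest discover). The sketch below uses that standard library method rather than the backported discover module; the temporary directory and the file name test_discovered_example.py are invented so that the example is self-contained.

```python
import io
import os
import tempfile
import textwrap
import unittest

# Source of a throwaway test module (hypothetical name and contents).
TEST_SOURCE = textwrap.dedent('''
    import unittest

    class DiscoveredTest(unittest.TestCase):
        def testSomething(self):
            self.assertTrue(True)
''')

with tempfile.TemporaryDirectory() as project_dir:
    # The file name matches the default 'test*.py' pattern.
    path = os.path.join(project_dir, "test_discovered_example.py")
    with open(path, "w") as handle:
        handle.write(TEST_SOURCE)

    # discover() imports every matching module under start_dir and
    # returns the tests it finds as an ordinary TestSuite.
    loader = unittest.TestLoader()
    suite = loader.discover(start_dir=project_dir, pattern="test*.py")

    # Any runner, including a custom one, can execute the discovered suite.
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)

print("discovered and ran %d test(s)" % result.testsRun)
```

Because discovery only produces a suite, it combines freely with whatever runner a larger framework already uses.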

The assert methods

We've looked at some of the plumbing behind creating a test framework for a project; now let's look at the different assert methods available on the TestCase class. These methods allow you to make different kinds of assertions about the behaviour of your objects. The four most common assert methods come in two pairs, with a positive and a negative variant: assertTrue and assertFalse, assertEqual and assertNotEqual.

    import unittest
    from mymodule import MyClass

    class MyTest(unittest.TestCase):

        def testTrue(self):
            myclass = MyClass()
            try:
                result = myclass.method()
                self.assertTrue(result)
            finally:
                myclass.close()

        def testFalse(self):
            myclass = MyClass()
            try:
                result = myclass.anotherMethod()
                self.assertFalse(result)
            finally:
                myclass.close()

        def testEqual(self):
            myclass = MyClass()
            try:
                first = myclass.methodOne()
                second = myclass.methodTwo()
                self.assertEqual(first, second)
            finally:
                myclass.close()

        def testNotEqual(self):
            myclass = MyClass()
            try:
                first = myclass.methodOne()
                third = myclass.methodThree()
                self.assertNotEqual(first, third)
            finally:
                myclass.close()

These methods should all be self-explanatory. They provide the basic building blocks for you to build your test infrastructure on.

In addition to these four asserts there are three additional ones. The first two of these are assertAlmostEqual and assertNotAlmostEqual for comparing floats. You specify the number of decimal places to compare them to:

    first = 3.1
    second = 3.2

    # this will pass
    self.assertAlmostEqual(first, second, 0)

    # this will fail
    self.assertAlmostEqual(first, second, 1)

In practice I don't find number of decimal places to be a fine enough granularity for comparison. Inevitably when comparing floats I actually compare against a delta:

    first = 3.1
    second = 3.2
    delta = 0.2

    difference = abs(first - second)
    self.assertTrue(difference < delta,
                    "difference: %s is not less than %s" % (difference, delta))

The final assert method is a bit more useful. Often when testing an API you need to test how it behaves under error conditions, for example you may want to test that given invalid input a method raises a specific type of exception. The assertRaises method is how we test for this. It takes an exception type as the first argument, followed by a callable (usually a function) along with any arguments it takes. The assert method calls the function and the assert passes if an exception of the correct type is raised. If an exception is not raised the assert fails, and if a non-matching exception is raised then it is not caught and the test fails with an error.

    def adder(a, b):
        return a + b

    self.assertRaises(TypeError, adder, 33, 'a string')

In the version of unittest that will be in Python 2.7 / 3.2 several new and useful assert methods have been added. The Python documentation for these versions has the details.

The assert statement

unittest assertions are based on the Python assert statement. The assert statement takes an expression and raises an AssertionError if the expression evaluates to False. The assert in the first example above could be written as:

    assert 1 + 2 == 3, "1 + 2 not equal to 3"

There are two reasons we use assert methods rather than plain asserts in our tests. The first is that a plain assert will fail if the expression evaluates to False, but the only error message we get is the message we provide to the assert statement:

    >>> assert a == b, "a != b"
    Traceback (most recent call last):
      ...
    AssertionError: a != b

If we use the assert methods then the failure message can include more useful information, especially about the objects being compared.

The second reason to use assert methods is that assert statements are disabled (not executed) when Python is run with the -O or -OO command line arguments (optimized mode).

assert statements can still be used to verify conditions in your code: a form of runtime design by contract.
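Two of the improvements mentioned for Python 2.7 / 3.2 bear directly on the examples above: assertAlmostEqual accepts a delta keyword, and assertRaises can be used as a context manager. A minimal sketch, assuming one of those versions or later (the test class name is invented):

```python
import io
import unittest


def adder(a, b):
    return a + b


class NewerAssertsTest(unittest.TestCase):

    def testDelta(self):
        # Passes when abs(first - second) <= delta, so there is no need
        # to compute the difference by hand.
        self.assertAlmostEqual(3.1, 3.2, delta=0.2)

    def testRaisesAsContextManager(self):
        # The code under test goes inside the with block.
        with self.assertRaises(TypeError) as context:
            adder(33, 'a string')
        # The caught exception is kept for any further checks.
        self.assertIsInstance(context.exception, TypeError)


suite = unittest.TestLoader().loadTestsFromTestCase(NewerAssertsTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print("success:", result.wasSuccessful())
```

The context manager form tends to read better than passing a callable and its arguments separately, particularly when the call under test is more than a single function call.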

setUp and tearDown

I'm sure you noticed that in the earlier examples all the test methods had some code in common. They all instantiated MyClass and closed the instance when the test completed. This not only violates DRY (Don't Repeat Yourself) but is tedious and error prone (bad things may happen if you forget to close the instance).

Those of you used to testing frameworks in other languages will not be surprised to hear that unittest has methods called setUp and tearDown to deal with these situations. If your test cases define a setUp method it will be called before every test. If there is an exception raised in the setUp then the appropriate error or failure will be recorded and the test method will not be run. setUp is particularly useful for functional / integration tests where setting up fixtures means establishing a lot of state.

In the case of our earlier tests we can rewrite using setUp:

    import unittest
    from mymodule import MyClass

    class MyTest(unittest.TestCase):

        def setUp(self):
            unittest.TestCase.setUp(self)
            self.myclass = MyClass()

        def testTrue(self):
            try:
                result = self.myclass.method()
                self.assertTrue(result)
            finally:
                self.myclass.close()

        ...

In setUp here MyClass is instantiated and stored as an instance variable on the TestCase instance. The test can access the instance created by setUp instead of having to create it itself.

The setUp method in TestCase does nothing (at least in the current versions of unittest) so strictly speaking it isn't necessary to call up to the parent class. When you inherit from a custom TestCase you will need to call up to the parent method, so it is a good habit to get into.

setUp has a corresponding method that is called after the test has run, tearDown. Like setUp, if an exception is raised in tearDown then the appropriate error or failure will be recorded for the test.
Currently our tests still have to close the MyClass instance after use. We can use tearDown to fix this:

    import unittest
    from mymodule import MyClass

    class MyTest(unittest.TestCase):

        def setUp(self):
            unittest.TestCase.setUp(self)
            self.myclass = MyClass()

        def tearDown(self):
            unittest.TestCase.tearDown(self)
            self.myclass.close()

        def testTrue(self):
            result = self.myclass.method()
            self.assertTrue(result)

        ...

See how using tearDown also simplifies our test. As tearDown is executed even if the test fails or an error occurs we no longer need to use a try: ... finally: to ensure the MyClass instance is closed.

We've now covered the major points of working with unittest itself, so let's look at some general Python testing techniques.

Duck typing and mock objects

One of the reasons that Python is so much easier to test than statically typed languages is that, thanks to the wonders of duck typing, we can substitute any object at runtime for another object that supports the same operations. This means we can swap out production classes with mock objects that record how they are used.

Let's look at how we might test this code in Python:

    class MyClass(object):

        def __init__(self):
            self.data = None

        def readData(self, source):
            self.data = source.read()
            source.close()

MyClass has a readData method that takes a data source, reads from it and then closes it. A real data source may be expensive (slow) to create, and in any case we want to test MyClass in isolation. We can create a mock data source that has the methods MyClass uses. Our tests can use the mock data source so that we can check MyClass uses it as it should. This is especially useful when a real data source is slow or uses an external resource like a database:

    class MockDataSource(object):

        def __init__(self):
            self.readFrom = False
            self.closed = False

        def read(self):
            self.readFrom = True
            return 'some data'

        def close(self):
            self.closed = True

The read method of our mock data source returns some known data when called. It also records that it has been read from, by setting readFrom to True, and records that close has been called by setting closed to True.

Using the MockDataSource to test MyClass:

    import unittest
    from mymodule import MyClass

    # MockDataSource is the class shown above, defined in this test module

    class TestMyClass(unittest.TestCase):

        def testConstructor(self):
            "Test the default state"
            myclass = MyClass()
            self.assertEqual(myclass.data, None)

        def testReadData(self):
            myclass = MyClass()
            source = MockDataSource()

            myclass.readData(source)

            self.assertEqual(myclass.data, 'some data')
            self.assertTrue(source.readFrom)
            self.assertTrue(source.closed)

Constructing custom mocks for all the production classes you need to test can be time consuming and painful. Fortunately we can make this easier by using one of the many Python mocking libraries that are available. My favourite is mock, which by coincidence I wrote and is particularly suited for use with unittest.

The main class in the mock library is Mock. Mock automatically creates methods and attributes as they are accessed and records how they are used. Mock instances have several useful methods and attributes to control their behavior and make assertions about how they have been used.

We can rewrite our test above to use Mock:

    import unittest
    from mock import Mock
    from mymodule import MyClass

    class TestMyClass(unittest.TestCase):

        def testConstructor(self):
            "Test the default state"
            myclass = MyClass()
            self.assertEqual(myclass.data, None)

        def testReadData(self):
            myclass = MyClass()
            source = Mock()
            source.read.return_value = 'some data'

            myclass.readData(source)

            self.assertEqual(myclass.data, 'some data')
            self.assertTrue(source.read.called)
            self.assertTrue(source.close.called)

The line source.read.return_value = 'some data' automatically creates the read method on our mock data source merely by accessing it and then sets the return_value to be 'some data'. The newly created read method is actually a new Mock instance. As Mock instances are callable they can behave just like methods of objects. Setting the return_value controls what is returned when the mock is called. If the mock is called with arguments you can use the assert_called_with method to verify that it has been called with the expected arguments.

A quick rundown of some of the useful members on Mock objects:

    >>> from mock import Mock
    >>> mock = Mock()
    >>> mock.method.return_value = 'foo'
    >>>
    >>> mock.method(1, 2, 3, 4)
    'foo'
    >>> mock.method.called
    True
    >>> mock.method.assert_called_with(8, 6)
    Traceback (most recent call last):
      ...
    AssertionError: Expected: ((8, 6), {})
    Called with: ((1, 2, 3, 4), {})

Mock objects can even raise exceptions or have other side effects when called:

    >>> mock = Mock()
    >>> mock.side_effect = Exception('Boom!')
    >>> mock()
    Traceback (most recent call last):
      ...
    Exception: Boom!
    >>> results = [1, 2, 3]
    >>> def side_effect(*args, **kwargs):
    ...     return results.pop()
    ...
    >>> mock.side_effect = side_effect
    >>> mock(), mock(), mock()
    (3, 2, 1)

There's lots more to mock so it is worth perusing the documentation. As well as the Mock class it has useful decorators for automatic monkey patching, which is another powerful testing technique.
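For readers on a recent Python, the mock library is also available in the standard library as unittest.mock (from Python 3.3), so the readData test can be run without installing anything. The sketch below makes that assumption, repeats MyClass inline so it is self-contained, and uses two Mock features that go slightly beyond the examples above: assert_called_once_with and an iterable side_effect. The test class name is invented.

```python
import io
import unittest
from unittest.mock import Mock  # standard library home of the mock library


class MyClass(object):
    """Same production class as in the article, repeated so this runs alone."""

    def __init__(self):
        self.data = None

    def readData(self, source):
        self.data = source.read()
        source.close()


class ReadDataTest(unittest.TestCase):

    def testReadData(self):
        source = Mock()
        source.read.return_value = 'some data'

        myclass = MyClass()
        myclass.readData(source)

        self.assertEqual(myclass.data, 'some data')
        # Stricter than checking .called: exactly one call, with no arguments.
        source.read.assert_called_once_with()
        source.close.assert_called_once_with()

    def testSideEffectSequence(self):
        # An iterable side_effect supplies one return value per call.
        source = Mock()
        source.read.side_effect = ['first', 'second']
        self.assertEqual(source.read(), 'first')
        self.assertEqual(source.read(), 'second')


suite = unittest.TestLoader().loadTestsFromTestCase(ReadDataTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print("success:", result.wasSuccessful())
```

Checking the exact calls rather than just the called flag catches mistakes such as a method being invoked twice or with unexpected arguments.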

Monkey patching

Monkey patching is a term that originated in the Python community to describe runtime modification (patching) of live objects. This can include replacing methods with a completely new implementation. This is generally regarded as being a bad thing to do in production code but is very useful for testing.

Note: Monkey patching is a term that started with the Python community but is now widely used (especially within the Ruby community). It seems to have originated with Zope programmers, who referred to guerilla patching. This evolved into gorilla patching and then into monkey patching.

We can illustrate this with some new methods on MyClass.

    class MyClass(object):

        def __init__(self):
            self.data = None

        def readData(self, source):
            self.data = source.read()
            source.close()

        def synchronise(self):
            source = self.getDataSource()
            self.readData(source)
            self.store()

The new method synchronise on MyClass fetches a data source, reads the data and then stores it. synchronise calls readData, which we have already worked with, and getDataSource and store, which for convenience aren't shown. We can test the synchronise method in isolation by monkey patching the three methods that it uses.

Note: In Python we can patch classes as well as instances. If we patch an instance then the changes only affect that instance, but changes to classes persist. If you patch a class in a test then you have to be very careful to restore the class to its original state or your changes will 'leak' and could affect future tests.

We can reuse the Mock class we have been working with to replace the methods that synchronise calls.

    import unittest
    from mock import Mock, sentinel
    from mymodule import MyClass

    class TestMyClass(unittest.TestCase):

        def testSynchronise(self):
            myclass = MyClass()

            # put the monkey patching in place
            myclass.getDataSource = Mock()
            myclass.getDataSource.return_value = sentinel.DataSource
            myclass.readData = Mock()
            myclass.store = Mock()

            # make the call
            myclass.synchronise()

            # assertions
            self.assertTrue(myclass.getDataSource.called)
            myclass.readData.assert_called_with(sentinel.DataSource)
            self.assertTrue(myclass.store.called)

As well as Mock this test uses another object provided by the mock module. sentinel is another object that creates attributes on demand. Every time you access the same attribute it returns the same object, so we can use sentinel to provide known values for our tests. It makes for nice readable tests when we need some value that we can test against. In this test we check that readData is called with the return value of getDataSource, sentinel.DataSource.

That method was easy to test, but methods that use external classes can be harder to test. With Python as well as patching instances we can patch objects at the module level. Because name lookup is done at runtime we can replace the implementation of an external class with another mock object.

As with directly patching classes any changes you make to modules will persist, so you have to be extremely careful about undoing any changes you make. The mock module has decorators for tests that can handle doing the patching and automatically undoing it once the test has completed.

Let's have a look at a potential implementation for the getDataSource method that synchronise calls.

    from datasource import DataSource

    class MyClass(object):

        def getDataSource(self):
            return DataSource()

It's a trivial piece of code but it could be very hard to test, especially if creating the DataSource is expensive or connects to external resources that may not be available in a test environment.

We can test it by patching out the DataSource name in mymodule. When getDataSource is called the DataSource name will be looked up in the module namespace. If we have replaced it with an alternative implementation then that will be used instead.
It is important to realise that we are patching the namespace where DataSource is used, which in our case is mymodule, and not patching the place where DataSource is defined.

The mock module provides a patch decorator that will do the patching for us, and as an added bonus it will patch it with a Mock object and pass the mock into our test method.

Here is a simple test for getDataSource using the patch decorator. The mock created by the patch decorator is the extra parameter (MockDataSource) to the testGetDataSource method:

    import unittest
    from mock import Mock, patch, sentinel
    from mymodule import MyClass

    class TestMyClass(unittest.TestCase):

        @patch('mymodule.DataSource')
        def testGetDataSource(self, MockDataSource):
            MockDataSource.return_value = sentinel.DataSource

            myclass = MyClass()
            source = myclass.getDataSource()

            self.assertEqual(source, sentinel.DataSource)

Instantiating a class is done by calling it (instantiation is actually done by the __call__ method of the class's metaclass - so instantiation is calling the class). As DataSource is instantiated inside the getDataSource call we control what is returned by setting the return value on the MockDataSource. getDataSource just returns the instance it creates, so we test that the return value of calling this method is the same object we set as the MockDataSource return value.

But there’s a potential problem with overusing monkey patching. Your tests become whitebox tests that know a great deal about the implementation of the objects under test. A pattern that can help reduce this coupling is dependency injection.
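As a rough illustration of what dependency injection can look like here, the sketch below passes the data source factory into MyClass instead of looking DataSource up in the module namespace. This is only one possible shape, not a definitive design: the source_factory parameter, the stand-in DataSource class and the simplified synchronise method (which leaves out store) are all assumptions made for the example, and Mock is imported from the standard library's unittest.mock.

```python
import io
import unittest
from unittest.mock import Mock


class DataSource(object):
    """Stand-in for an expensive production data source (hypothetical)."""

    def read(self):
        raise RuntimeError("not available in a test environment")

    def close(self):
        pass


class MyClass(object):

    def __init__(self, source_factory=DataSource):
        # The collaborator is passed in; production code uses the default.
        self.source_factory = source_factory
        self.data = None

    def getDataSource(self):
        return self.source_factory()

    def synchronise(self):
        # Simplified variant of the article's method (no store step).
        source = self.getDataSource()
        self.data = source.read()
        source.close()


class SynchroniseTest(unittest.TestCase):

    def testSynchronise(self):
        source = Mock()
        source.read.return_value = 'some data'

        # No patching needed: the test hands in a factory returning the mock.
        myclass = MyClass(source_factory=lambda: source)
        myclass.synchronise()

        self.assertEqual(myclass.data, 'some data')
        source.close.assert_called_once_with()


suite = unittest.TestLoader().loadTestsFromTestCase(SynchroniseTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print("success:", result.wasSuccessful())
```

The test now depends only on the constructor signature, not on which module namespace MyClass happens to look its collaborators up in, and nothing needs to be restored afterwards.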