Test Driven Development with pytest

Introduction

Good software is tested software. Testing our code can help us catch bugs or unwanted behavior.

Test Driven Development (TDD) is a software development practice that requires us to incrementally write tests for features we want to add. It leverages automated testing suites, like pytest - a testing framework for Python programs.

Automated Testing

Developers usually write code, compile it if necessary, and then run the code to see if it works. This is an example of manual testing. In this method we explore what features of the program work. If you would like to be thorough with your testing, you'll have to remember how to test the various outcomes of each feature.

What if a new developer began to add features to the project, would you have to learn their features to test it as well? New features sometimes affect older features, are you going to manually check that all previous features still function when you added a new one?

Manual testing can give us a quick boost in confidence to continue development. However, as our application grows it becomes exponentially harder and tedious to continuously test our code base manually.

Automated testing shifts the burden of testing the code ourselves and keep tracking of the results, to maintaining scripts that do it for us. The scripts run modules of the code with inputs defined by the developer and compares the output with the expectations defined by the developer.

The pytest Module

Python's Standard Library comes with an automated testing framework - the unittest library. While the unittest library is feature-rich and effective at its task, we'll be using pytest as our weapon of choice in this article.

Most developers find pytest easier to use than unittest . One simple reason is that pytest only requires functions to write tests, whereas the unittest module requires classes.

For many new developers, requiring classes for tests can be a bit off-putting. pytest also includes many other features that we'll use later in this tutorial which are not present in the unittest module.

What is Test-Driven Development?

Test-Driven Development is a simple software development practice that instructs you or a team of coders to follow these tree steps to create software:

Write a test for a feature that fails Write code to make the test pass Refactor the code as needed

This process is commonly referred to as the Red-Green-Refactor cycle:

You write an automated test for how the new code should behave and see it fail - Red

Write code in the application until your test passes - Green

Refactor the code to make it readable and efficient. There's no need to be worried that your refactoring will break the new feature, you simply need to rerun the test and ensure it passes.

A feature is complete when we no longer need to write code for its tests to pass.

Why use TDD to Create Applications?

The common complaint of using TDD is that it takes too much time.

As you become more efficient with writing tests, the time required by you to maintain them decreases. Furthermore, TDD provides the following benefits, which you can find worth the time tradeoff:

Writing tests require that you know the inputs and output to make the feature work - TDD forces us to think about the application interface before we start coding.

Increased confidence in codebase - By having automated tests for all features, developers feel more confident when developing new features. It becomes trivial to test the entire system to see if new changes broke what existed before.

TDD does not eliminate all bugs, but the likelihood of encountering them are lower - When trying to fix a bug, you can write a test for it to ensure it's fixed when done coding.

Tests can be used as further documentation. As we write the inputs and outputs of a feature, a developer can look at the test and see how the code's interface is meant to be used.

Code Coverage

Code coverage is a metric that measures the amount of source code that's covered by your test plan.

100% code coverage means that all the code you've written has been used by some test(s). Tools measure code coverage in many different ways, here are a few popular metrics:

Lines of code tested

How many defined functions are tested

How many branches ( if statements for example) are tested

It's important that you know what metrics are used by your code coverage tool.

As we make heavy use of pytest , we'll use the popular pytest-cov plugin to get code coverage.

High code coverage does not mean that your application will have no bugs. It's more than likely that the code has not been tested for every possible scenario.

Unit Test vs Integration Tests

Unit tests are used to ensure an individual module behaves as expected, whereas integration tests ensure that a collection of modules interoperate as we expect them too.

As we develop larger applications, we'll have to develop many components. While these individual components may each have their corresponding unit tests, we'll also want a way to ensure that these multiple components when used together fulfill our expectations.

TDD requires that we begin by writing a single test that fails with the current code base, then work towards its completion. It doesn't specify that it has be a unit test, your first test can be an integration test if you'd like.

When your first failing integration test is written, we can then start to develop each individual component.

The integration test will fail until each component is built and passes their tests. When the integration test passes, if crafted correctly we would have met a user requirement for our system.

Basic Example: Calculating the Sum of Prime Numbers

The best way to understand TDD is to put it in to practice. We'll begin by writing a Python program that returns the sum of all numbers in a sequence that are prime numbers.

We'll create two functions to do this, one that determines if a number is prime or not and another that adds the prime numbers from a given sequence of numbers.

Create a directory called primes in a workspace of your choosing. Now add two files: primes.py , test_primes.py . The first file is where we'll write our program code, the second file is where our tests will be.

pytest requires that our test files either begin with "test_" or end with "_test.py" (therefore, we could have also named our test file primes_test.py ).

Now in our primes directory, let's set up our virtual environment:

$ python3 -m venv env # Create a virtual environment for our modules $ . env/bin/activate # Activate our virtual environment $ pip install --upgrade pip # Upgrade pip $ pip install pytest # Install pytest

Testing the is_prime() Function

A prime number is any natural number greater than 1 that is only divisible by 1 and itself.

Our function should take a number and return True if it's prime and False otherwise.

In our test_primes.py , let's add our first test case:

def test_prime_low_number(): assert is_prime(1) == False

The assert() statement is a keyword in Python (and in many other languages) that immediately throws an error if a condition fails. This keyword is useful while writing tests because it points to exactly what condition failed.

If we enter 1 or a number smaller than 1 , then it can't be prime.

Let's now run our test. Enter the following in your command line:

$ pytest

For verbose output you can run pytest -v . Ensure that you virtual environment is still active (you should see (env) at the beginning of the line in your terminal).

You should notice output like this:

def test_prime_low_number(): > assert is_prime(1) == False E NameError: name 'is_prime' is not defined test_primes.py:2: NameError ========================================================= 1 failed in 0.12 seconds =========================================================

It makes sense to get a NameError , we haven't created our function yet. This is the "red" aspect of the red-green-refactor cycle.

pytest even logs failed tests in the color red if your shell is configured to show colors. Now let's add the code in our primes.py file to make this test pass:

def is_prime(num): if num == 1: return False

Note: It's generally good practice to keep your tests in separate files from your code. Aside from improved readability and separation of concerns as your codebase grows, it also keeps the developer of the test away from the internal workings of the code. Therefore, the tests use the application interfaces in the same way another developer would use it.

Now let's run pytest once more. We should now see output like this:

=========================================================== test session starts ============================================================ platform darwin -- Python 3.7.3, pytest-4.4.1, py-1.8.0, pluggy-0.9.0 rootdir: /Users/marcus/stackabuse/test-driven-development-with-pytest/primes plugins: cov-2.6.1 collected 1 item test_primes.py . [100%] ========================================================= 1 passed in 0.04 seconds =========================================================

Our first test passed! We know that 1 is not prime, but by definition 0 is not prime, nor is any negative number.

We should refactor our application to reflect that and change is_prime() to:

def is_prime(num): # Prime numbers must be greater than 1 if num < 2: return False

If we run pytest again, our tests would still pass.

Now let's add a test case for a prime number, in test_primes.py add the following after our first test case:

def test_prime_prime_number(): assert is_prime(29)

And let's run pytest to see this output:

def test_prime_prime_number(): > assert is_prime(29) E assert None E + where None = is_prime(29) test_primes.py:9: AssertionError ============================================================= warnings summary ============================================================= test_primes.py::test_prime_prime_number /Users/marcus/stackabuse/test-driven-development-with-pytest/primes/test_primes.py:9: PytestWarning: asserting the value None, please use "assert is None" assert is_prime(29) -- Docs: https://docs.pytest.org/en/latest/warnings.html ============================================== 1 failed, 1 passed, 1 warnings in 0.12 seconds ==============================================

Note that the pytest command now runs the two tests we've written.

The new case fails as we don't actually compute whether number is prime or not. The is_prime() function returns None as other functions do by default for any number greater than 1.

The output still fails, or we see red from the output.

Let's think about how we determine where a number is prime or not. The simplest method would be to loop from 2 until one less than the number, dividing the number by the current value of the iteration.

To make this more efficient, we can check by dividing numbers between 2 and the square root of the number.

If there's no remainder from the division, then it has a divisor that's neither 1 nor itself, and therefore not prime. If it does not find a divisor in the loop, then it must be prime.

Let's update is_prime() with our new logic:

import math def is_prime(num): # Prime numbers must be greater than 1 if num < 2: return False for n in range(2, math.floor(math.sqrt(num) + 1)): if num % n == 0: return False return True

Now we run pytest to see if our test passes:

=========================================================== test session starts ============================================================ platform darwin -- Python 3.7.3, pytest-4.4.1, py-1.8.0, pluggy-0.9.0 rootdir: /Users/marcus/stackabuse/test-driven-development-with-pytest/primes plugins: cov-2.6.1 collected 2 items test_primes.py .. [100%] ========================================================= 2 passed in 0.04 seconds =========================================================

It passes. We know that this function can get a prime number, and a low number. Let's add a test to ensure it returns False for a composite number greater than 1.

In test_primes.py add the following test case below:

def test_prime_composite_number(): assert is_prime(15) == False

If we run pytest we'll see the following output:

=========================================================== test session starts ============================================================ platform darwin -- Python 3.7.3, pytest-4.4.1, py-1.8.0, pluggy-0.9.0 rootdir: /Users/marcus/stackabuse/test-driven-development-with-pytest/primes plugins: cov-2.6.1 collected 3 items test_primes.py ... [100%] ========================================================= 3 passed in 0.04 seconds =========================================================

Testing sum_of_primes()

As with is_prime() , let's think about the outcomes of this function. If the function is given an empty list, then the sum should be zero.

That guarantees that our function should always return a value with valid input. After, we'll want to test that it only adds prime numbers in a list of numbers.

Let's write our first failing test, add the following code at the end of test_primes.py :

def test_sum_of_primes_empty_list(): assert sum_of_primes([]) == 0

If we run pytest we'll get the familiar NameError test failure, as we did not define the function yet. In our primes.py file let's add our new function that simply returns the sum of a given list:

def sum_of_primes(nums): return sum(nums)

Now running pytest would show that all tests pass. Our next test should ensure that only prime numbers are added.

We'll mix prime and composite numbers and expect the function to only add the prime numbers:

def test_sum_of_primes_mixed_list(): assert sum_of_primes([11, 15, 17, 18, 20, 100]) == 28

The prime numbers in the list we are testing are 11 and 17, which add up to 28.

Running pytest to validate that the new test fails. Now let's modify our sum_of_primes() so that only prime numbers are added.

We'll filter the prime numbers with a List Comprehension:

def sum_of_primes(nums): return sum([x for x in nums if is_prime(x)])

As is routine, we run pytest to verify we fixed the failing test - everything passes.

Once complete, let's check our code coverage:

$ pytest --cov=primes

For this package, our code coverage is 100%! If it wasn't, we can spend some time adding a few more tests to our code to ensure that our test plan is thorough.

For example, if our is_prime() function was given a float value, would it throw an error? Our is_prime() method does not enforce the rule that a prime number must be a natural number, it only checks that it's greater than 1.

Even though we have total code coverage, the feature being implemented may not work correctly in all situations.

Advanced Example: Writing an Inventory Manager

Now that we grasped the basics of TDD, let's dive deeper into some useful features of pytest which enable us to become more efficient at writing tests.

Just like before in our basic example, inventory.py , and a test file, test_inventory.py , will be our main two files.

Features and Test Planning

A clothing and footwear store would like to move management of their items from paper to a new computer the owner bought.

While the owner would like many features, she's content with software that could perform the following upcoming tasks immediately.

Record the 10 new Nike sneakers that she recently bought. Each is worth $50.00.

Add 5 more Adidas sweatpants that cost $70.00 each.

She's expecting a customer to buy 2 of the Nike sneakers

She's expecting another customer to buy 1 of the sweatpants.

We can use these requirements to create our first integration test. Before we get to writing it, let's flesh out the smaller components a bit to figure out what would be our inputs and output, function signatures and other system design elements.

Each item of stock will have a name, price and quantity. We'll be able to add new items, add stock to existing items and of course remove stock.

When we instantiate an Inventory object, we'll want the user to provide a limit . The limit will have a default value of 100. Our first test would be to check the limit when instantiating an object. To ensure that we don't go over our limit, we'll need to keep track of the total_items counter. When initialized, this should be 0.

We'll need to add 10 Nike sneakers and the 5 Adidas sweatpants to the system. We can create an add_new_stock() method that accepts a name , price , and quantity .

We should test that we can add an item to our inventory object. We should not be able to add an item with a negative quantity, the method should raise an exception. We also should not be able to add any more items if we're at the our limit, that should also raise an exception.

Customers will be buying these items soon after entry, so we'll need a remove_stock() method as well. This function would need the name of the stock and the quantity of items being removed. If the quantity being removed is negative or if it makes the total quantity for the stock below 0, then the method should raise an exception. Additionally, if the name provided is not found in our inventory, the method should raise an exception.

First Tests

Preparing to do our tests first has helped us design our system. Let's start by creating our first integration test:

def test_buy_and_sell_nikes_adidas(): # Create inventory object inventory = Inventory() assert inventory.limit == 100 assert inventory.total_items == 0 # Add the new Nike sneakers inventory.add_new_stock('Nike Sneakers', 50.00, 10) assert inventory.total_items == 10 # Add the new Adidas sweatpants inventory.add_new_stock('Adidas Sweatpants', 70.00, 5) assert inventory.total_items == 15 # Remove 2 sneakers to sell to the first customer inventory.remove_stock('Nike Sneakers', 2) assert inventory.total_items == 13 # Remove 1 sweatpants to sell to the next customer inventory.remove_stock('Adidas Sweatpants', 1) assert inventory.total_items == 12

At every action we do an assertion on the state of the inventory. It's best to assert after an action is done, so when you're debugging you'll know the last step that was taken.

Run pytest and it should fail with a NameError as no Inventory class is defined.

Let's create our Inventory class, with a limit parameter that defaults to 100, starting with the unit tests:

def test_default_inventory(): """Test that the default limit is 100""" inventory = Inventory() assert inventory.limit == 100 assert inventory.total_items == 0

And now, the class itself:

class Inventory: def __init__(self, limit=100): self.limit = limit self.total_items = 0

Before we move on to the methods, we want to be sure that our object can be initialized with a custom limit, and it should be set correctly:

def test_custom_inventory_limit(): """Test that we can set a custom limit""" inventory = Inventory(limit=25) assert inventory.limit == 25 assert inventory.total_items == 0

The integration continues to fail but this test passes.

Fixtures

Our first two tests required us to instantiate an Inventory object before we could begin. More than likely we'll have to do the same for all future tests. This is bit repetitious.

We can use fixtures to help solve this problem. A fixture is a known and fixed state that tests are run against to ensure that results are repeatable.

It's good practice for tests to run in isolation of each other. The results of one test case shouldn't affect the results of another test case.

Let's create our first fixture, an Inventory object with no stock.

test_inventory.py :

import pytest @pytest.fixture def no_stock_inventory(): """Returns an empty inventory that can store 10 items""" return Inventory(10)

Note the use of the pytest.fixture decorator. For testing purposes we can reduce the inventory limit to 10.

Let's use this fixture to add a test for the add_new_stock() method:

def test_add_new_stock_success(no_stock_inventory): no_stock_inventory.add_new_stock('Test Jacket', 10.00, 5) assert no_stock_inventory.total_items == 5 assert no_stock_inventory.stocks['Test Jacket']['price'] == 10.00 assert no_stock_inventory.stocks['Test Jacket']['quantity'] == 5

Observe that the name of the function is the argument of the test, they must be the same name for the fixture to be applied. Otherwise, you'd use it like a regular object.

To ensure that the stock was added we have to test a bit more than the total items stored so far. Writing this test has forced us to consider how we display a stock's price and remaining quantity.

Run pytest to observe that there are now 2 failures and 2 passes. We'll now add the add_new_stock() method:

class Inventory: def __init__(self, limit=100): self.limit = limit self.total_items = 0 self.stocks = {} def add_new_stock(self, name, price, quantity): self.stocks[name] = { 'price': price, 'quantity': quantity } self.total_items += quantity

You'll notice that a stocks object was initialized in the __init__ function. Again, run pytest to confirm that the test passed.

Parametrizing Tests

We mentioned earlier that the add_new_stock() method does input validation - we raise an exception if the quantity is zero or negative, or if it carries us over our inventory's limit.

We can easily add more test cases, using try/except to catch each exception. This also feels repetitious.

Pytest provides parametrized functions that allows us to test multiple scenarios using one function. Let's write a parametrized test function to ensure our input validation works:

@pytest.mark.parametrize('name,price,quantity,exception', [ ('Test Jacket', 10.00, 0, InvalidQuantityException( 'Cannot add a quantity of 0. All new stocks must have at least 1 item')) ]) def test_add_new_stock_bad_input(name, price, quantity, exception): inventory = Inventory(10) try: inventory.add_new_stock(name, price, quantity) except InvalidQuantityException as inst: # First ensure the exception is of the right type assert isinstance(inst, type(exception)) # Ensure that exceptions have the same message assert inst.args == exception.args else: pytest.fail("Expected error but found none")

This test tries to add a stock, gets the exception and then checks that it's the right exception. If we don't get an exception, fail the test. The else clause is very important in this scenario. Without it, an exception that wasn't thrown would count as a pass. Our test would therefore have a false positive.

We use pytest decorators to add a parameter to the function. The first argument contains a string of all the parameter names. The second argument is a list of tuples where each tuple is a test case.

Run pytest to see our test fail as InvalidQuantityException is not defined. Back in inventory.py let's create a new exception above the Inventory class:

class InvalidQuantityException(Exception): pass

And change the add_new_stock() method:

def add_new_stock(self, name, price, quantity): if quantity <= 0: raise InvalidQuantityException( 'Cannot add a quantity of {}. All new stocks must have at least 1 item'.format(quantity)) self.stocks[name] = { 'price': price, 'quantity': quantity } self.total_items += quantity

Run pytest to see that our most recent test now passes. Now let's add the second error test case, an exception is raised if our inventory can't store it. Change the test as follows:

@pytest.mark.parametrize('name,price,quantity,exception', [ ('Test Jacket', 10.00, 0, InvalidQuantityException( 'Cannot add a quantity of 0. All new stocks must have at least 1 item')), ('Test Jacket', 10.00, 25, NoSpaceException( 'Cannot add these 25 items. Only 10 more items can be stored')) ]) def test_add_new_stock_bad_input(name, price, quantity, exception): inventory = Inventory(10) try: inventory.add_new_stock(name, price, quantity) except (InvalidQuantityException, NoSpaceException) as inst: # First ensure the exception is of the right type assert isinstance(inst, type(exception)) # Ensure that exceptions have the same message assert inst.args == exception.args else: pytest.fail("Expected error but found none")

Instead of creating a whole new function, we modify this one slightly to pick up our new exception and add another tuple to the decorator! Now two tests are executed on a single function.

Parametrized functions cut down on the time it takes to add new test cases.

In inventory.py , we'll first add our new exception below InvalidQuantityException :

class NoSpaceException(Exception): pass

And change the add_new_stock() method:

def add_new_stock(self, name, price, quantity): if quantity <= 0: raise InvalidQuantityException( 'Cannot add a quantity of {}. All new stocks must have at least 1 item'.format(quantity)) if self.total_items + quantity > self.limit: remaining_space = self.limit - self.total_items raise NoSpaceException( 'Cannot add these {} items. Only {} more items can be stored'.format(quantity, remaining_space)) self.stocks[name] = { 'price': price, 'quantity': quantity } self.total_items += quantity

Run pytest to see that your new test case passes as well.

We can use fixtures with our parametrized function. Let's refactor our test to use the empty inventory fixture:

def test_add_new_stock_bad_input(no_stock_inventory, name, price, quantity, exception): try: no_stock_inventory.add_new_stock(name, price, quantity) except (InvalidQuantityException, NoSpaceException) as inst: # First ensure the exception is of the right type assert isinstance(inst, type(exception)) # Ensure that exceptions have the same message assert inst.args == exception.args else: pytest.fail("Expected error but found none")

Like before, it's just another argument that uses the name of a function. The key thing is to exclude it in the parametrize decorator.

Looking at the code some more, there's no reason why there needs to be two methods to get adding new stocks. We can test errors and success in one function.

Delete test_add_new_stock_bad_input() and test_add_new_stock_success() and let's add a new function:

@pytest.mark.parametrize('name,price,quantity,exception', [ ('Test Jacket', 10.00, 0, InvalidQuantityException( 'Cannot add a quantity of 0. All new stocks must have at least 1 item')), ('Test Jacket', 10.00, 25, NoSpaceException( 'Cannot add these 25 items. Only 10 more items can be stored')), ('Test Jacket', 10.00, 5, None) ]) def test_add_new_stock(no_stock_inventory, name, price, quantity, exception): try: no_stock_inventory.add_new_stock(name, price, quantity) except (InvalidQuantityException, NoSpaceException) as inst: # First ensure the exception is of the right type assert isinstance(inst, type(exception)) # Ensure that exceptions have the same message assert inst.args == exception.args else: assert no_stock_inventory.total_items == quantity assert no_stock_inventory.stocks[name]['price'] == price assert no_stock_inventory.stocks[name]['quantity'] == quantity

This one test function first checks for known exceptions, if none are found then we ensure the addition matches our expectations. The separate test_add_new_stock_success() function is now just executed via a tupled parameter. As we don't expect an exception to be thrown in the successful case, we specify None as our exception.

Wrapping up our Inventory Manager

With our more advanced pytest usage, we can quickly develop the remove_stock function with TDD. In inventory_test.py :

# The import statement needs one more exception from inventory import Inventory, InvalidQuantityException, NoSpaceException, ItemNotFoundException # ... # Add a new fixture that contains stocks by default # This makes writing tests easier for our remove function @pytest.fixture def ten_stock_inventory(): """Returns an inventory with some test stock items""" inventory = Inventory(20) inventory.add_new_stock('Puma Test', 100.00, 8) inventory.add_new_stock('Reebok Test', 25.50, 2) return inventory # ... # Note the extra parameters, we need to set our expectation of # what totals should be after our remove action @pytest.mark.parametrize('name,quantity,exception,new_quantity,new_total', [ ('Puma Test', 0, InvalidQuantityException( 'Cannot remove a quantity of 0. Must remove at least 1 item'), 0, 0), ('Not Here', 5, ItemNotFoundException( 'Could not find Not Here in our stocks. Cannot remove non-existing stock'), 0, 0), ('Puma Test', 25, InvalidQuantityException( 'Cannot remove these 25 items. Only 8 items are in stock'), 0, 0), ('Puma Test', 5, None, 3, 5) ]) def test_remove_stock(ten_stock_inventory, name, quantity, exception, new_quantity, new_total): try: ten_stock_inventory.remove_stock(name, quantity) except (InvalidQuantityException, NoSpaceException, ItemNotFoundException) as inst: assert isinstance(inst, type(exception)) assert inst.args == exception.args else: assert ten_stock_inventory.stocks[name]['quantity'] == new_quantity assert ten_stock_inventory.total_items == new_total

And in our inventory.py file first we create the new exception for when users try to modify a stock that doesn't exist:

class ItemNotFoundException(Exception): pass

And then we add this method to our Inventory class:

def remove_stock(self, name, quantity): if quantity <= 0: raise InvalidQuantityException( 'Cannot remove a quantity of {}. Must remove at least 1 item'.format(quantity)) if name not in self.stocks: raise ItemNotFoundException( 'Could not find {} in our stocks. Cannot remove non-existing stock'.format(name)) if self.stocks[name]['quantity'] - quantity <= 0: raise InvalidQuantityException( 'Cannot remove these {} items. Only {} items are in stock'.format( quantity, self.stocks[name]['quantity'])) self.stocks[name]['quantity'] -= quantity self.total_items -= quantity

When you run pytest you should see that the integration test and all others pass!

Conclusion

Test-Driven Development is a software development process where tests are used to guide a system's design. TDD mandates that for every feature we have to implement we write a test that fails, add the least amount of code to make the test pass, and finally refactor that code to be cleaner.

To make this process possible and efficient, we leveraged pytest - an automated test tool. With pytest we can script tests, saving us time from having to manually test our code every change.

Unit tests are used to ensure an individual module behaves as expected, whereas integration tests ensure that a collection of module interoperate as we expect them too. Both the pytest tool and the TDD methodology allow for both test types to be used, and developers are encouraged to use both.