Most of our unit tests work.

They might go something like this:

class Person(object):
    def __init__(self, name):
        self.name = name

def create_person(name):
    return Person(name)

def test_func():
    assert create_person('tom').name == 'tom'

This is fairly innocuous. We’ve given our innocent class some input to create a new instance, and the test confirms our expectation. This will be just fine.

We want this test to fail. Heck, you want this test to fail. It may seem counter-intuitive, and no one likes that red text, almost laughing at you: “haha, your tests failed, you idiot!”

If there’s a hell specifically for developers, this would be the welcome sign.

But we want to know how creating our innocent and pure Person class can go horribly, horribly wrong. Why? Well arguably that’s the whole point of software testing. Let’s approach this from another angle.

Let’s say you designed a car. It looks sleek, goes fast and has comfy bucket seats. Cool! You’ll make millions.

The only problem is, if you turn the volume knob on the stereo while adjusting the right mirror and lifting slightly off the seat, the rear tyres fall off. The deadline for release is tomorrow (for the sake of argument of course. I’m sure cars aren’t this easy to release into the wild. I’d be concerned if they were).

What do you tell the test drivers?

Please don’t use the volume knob while adjusting the right mirror and lifting slightly off the seat.

This is how we write unit tests as developers. We specify the inputs that we approve of subconsciously (or perhaps even consciously?), and then rejoice and celebrate when we see that endorphin-releasing green text telling us our tests passed.

Ah sweet relief

As programmers we’re generally looking for an automated way to do everything, so it’s only natural that someone (or in the world of open source, a group of someones) would come up with a way to automate the creation of comprehensive test cases.

Enter Hypothesis. It ends up solving both our problems:

1. Coming up with a load of test cases
2. Finding something that will break the test

Not only will Hypothesis do this for you, but importantly, it will find the shortest input that breaks a test, and in particular the kind of input that humans wouldn’t really think of.

A good example is an email field on a web application. The following is some example code which tests this with Hypothesis:
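A sketch of what that code might look like (the name validate_email and the deliberately strict regex here are illustrative assumptions, not the original snippet):

import re

from hypothesis import given
from hypothesis.strategies import emails

# Deliberately strict: letters only either side of the @ and the dot,
# i.e. it accepts the form "person@place.com" and little else.
EMAIL_REGEX = re.compile(r'^[a-zA-Z]+@[a-zA-Z]+\.[a-zA-Z]+$')

def validate_email(email):
    return EMAIL_REGEX.match(email) is not None

@given(emails())
def test_validate_email(email):
    assert validate_email(email)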

As you can see, we have a straightforward validation function that matches a string against some regex to check it has the form “person@place.com”, and a test to check that it works.

Basically, the @given decorator tells pytest: run this test against a whole range of examples drawn from the emails() strategy (there’s a whole host of different built-in strategies here). We run the test and ... it failed!

I’ll let the console output (albeit shortened for brevity!) describe what Hypothesis did and how it broke the test:
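Reconstructed rather than copied, so treat the exact shrunken value as illustrative, it looks something like this:

Falsifying example: test_validate_email(email='!@a.com')
...
E   AssertionError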

We see that Hypothesis found a test case that broke the validation function, then whittled it down to its bare essence. In this case it was a character not allowed by our regex that caused the test to fail.

The solution is to make our validation function more forgiving of email inputs, since the emails() strategy generates addresses that conform to the RFC on valid email addresses! For instance, a more basic regex like r'[^@]+@[^@]+\.[^@]+' would cause the test to pass.
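As a sketch, the fix amounts to swapping in that looser pattern (again using the hypothetical validate_email from above):

import re

# More forgiving: any non-@ characters, an @, more non-@ characters,
# then a dot somewhere in the domain.
EMAIL_REGEX = re.compile(r'[^@]+@[^@]+\.[^@]+')

def validate_email(email):
    return EMAIL_REGEX.match(email) is not None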

Returning to the obscure failure in our newly designed car: while we know that “blah@blah.com” is going to work, we don’t necessarily expect anyone to have an email address of “!@a.com”, despite it being perfectly valid. Hypothesis therefore takes the human aspect out of testing and allows us to focus on robustness in our code. This, after all, is arguably the main reason we test individual units.

At Reposit, we use Hypothesis to test creating objects in Django. For example, if we have a model with a name field, we might run Hypothesis over it to see if a particular character or series of characters causes any sort of problem we may not have thought of. This is especially helpful in integration testing, when you want to know how your other services deal with any strange behaviour at their periphery.
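A minimal sketch of what that can look like with Hypothesis’s Django integration (the Customer model and its name field are hypothetical):

from hypothesis import given
from hypothesis.strategies import text
from hypothesis.extra.django import TestCase

from myapp.models import Customer  # hypothetical model with a name CharField

class CustomerNameTests(TestCase):
    @given(text(min_size=1, max_size=255))
    def test_create_customer_with_any_name(self, name):
        # Whatever string Hypothesis generates should round-trip unchanged.
        customer = Customer.objects.create(name=name)
        assert customer.name == name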

I hope to write another post about our integration testing soon, and how you can really get the most out of pytest for more than just unit testing.

So why not give it a go? What have you got to lose?