Stateful testing¶

With @given, your tests are still something that you mostly write yourself, with Hypothesis providing some data. With Hypothesis’s stateful testing, Hypothesis instead tries to generate not just data but entire tests. You specify a number of primitive actions that can be combined together, and then Hypothesis will try to find sequences of those actions that result in a failure.

Note: This style of testing is often called model-based testing, but in Hypothesis is called stateful testing (mostly for historical reasons - the original implementation of this idea in Hypothesis was more closely based on ScalaCheck’s stateful testing where the name is more apt). Both of these names are somewhat misleading: you don’t really need any sort of formal model of your code to use this, and it can be just as useful for pure APIs that don’t involve any state as it is for stateful ones. It’s perhaps best not to take the name of this sort of testing too seriously. Regardless of what you call it, it is a powerful form of testing which is useful for most non-trivial APIs.

Hypothesis has two stateful testing APIs: A high level one, providing what we call rule based state machines, and a low level one, providing what we call generic state machines.

You probably want to use the rule based state machines - they provide a high level API for describing the sort of actions you want to perform, based on a structured representation of actions. However, the generic state machines are more flexible, and are particularly useful if you want the set of currently possible actions to depend primarily on external state.

You may not need state machines¶

The basic idea of stateful testing is to make Hypothesis choose actions as well as values for your test, and state machines are a great declarative way to do just that.

For simpler cases though, you might not need them at all - a standard test with @given might be enough, since you can use data() in branches or loops. In fact, that’s how the state machine explorer works internally. For more complex workloads though, where a higher level API comes into its own, keep reading!
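As a rough illustration of using data() in a loop, the sketch below lets Hypothesis pick both the number of steps and the action taken at each step. The test name, the choice of a plain list as the object under test, and the size counter used as a model are all hypothetical and chosen only for this example.

```python
from hypothesis import given, settings, strategies as st


# deadline=None and database=None only keep this sketch self-contained.
@settings(max_examples=50, deadline=None, database=None)
@given(st.data())
def test_list_behaves_like_stack(data):
    stack = []  # object under test: a plain list used as a stack
    size = 0    # trivial model: the number of items we expect to be stored

    # Hypothesis chooses how many steps to run...
    steps = data.draw(st.integers(min_value=0, max_value=20), label="steps")
    for _ in range(steps):
        # ...and which action each step performs.
        action = data.draw(st.sampled_from(["push", "pop"]), label="action")
        if action == "push":
            stack.append(data.draw(st.integers(), label="value"))
            size += 1
        elif stack:  # popping from an empty stack is simply skipped
            stack.pop()
            size -= 1
        assert len(stack) == size


# Calling a @given-decorated function with no arguments runs the whole test.
test_list_behaves_like_stack()
```

The labels passed to draw() are optional; they only make the printed failure report easier to read.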

Rule based state machines¶

Rule based state machines are the ones you’re most likely to want to use. They’re significantly more user friendly and should be good enough for most things you’d want to do.

class hypothesis.stateful.RuleBasedStateMachine

A RuleBasedStateMachine gives you a more structured way to define state machines.

The idea is that a state machine carries a bunch of types of data divided into Bundles, and has a set of rules which may read data from bundles (or just from normal strategies) and push data onto bundles. At any given point a random applicable rule will be executed.

A rule is very similar to a normal @given based test in that it takes values drawn from strategies and passes them to a user defined test function. The key difference is that where @given based tests must be independent, rules can be chained together - a single test run may involve multiple rule invocations, which may interact in various ways.

Rules can take normal strategies as arguments, or a specific kind of strategy called a Bundle. A Bundle is a named collection of generated values that can be reused by other operations in the test. They are populated with the results of rules, and may be used as arguments to rules, allowing data to flow from one rule to another, and rules to work on the results of previous computations or actions.

You can think of each value that gets added to any Bundle as being assigned to a new variable. Drawing a value from the bundle strategy means choosing one of the corresponding variables and using that value, and you can think of consumes() as a del statement for that variable. If you can replace the use of Bundles with instance attributes of the class, that is often simpler, but Bundles are often strictly more powerful.

The following rule based state machine example is a simplified version of a test for Hypothesis’s example database implementation.
An example database maps keys to sets of values, and in this test we compare one implementation of it to a simplified in-memory model of its behaviour, which just stores the same values in a Python dict. The test then runs operations against both the real database and the in-memory representation of it and looks for discrepancies in their behaviour.

    import shutil
    import tempfile
    from collections import defaultdict

    import hypothesis.strategies as st
    from hypothesis.database import DirectoryBasedExampleDatabase
    from hypothesis.stateful import Bundle, RuleBasedStateMachine, rule


    class DatabaseComparison(RuleBasedStateMachine):
        def __init__(self):
            super().__init__()
            self.tempd = tempfile.mkdtemp()
            self.database = DirectoryBasedExampleDatabase(self.tempd)
            self.model = defaultdict(set)

        keys = Bundle("keys")
        values = Bundle("values")

        @rule(target=keys, k=st.binary())
        def add_key(self, k):
            return k

        @rule(target=values, v=st.binary())
        def add_value(self, v):
            return v

        @rule(k=keys, v=values)
        def save(self, k, v):
            self.model[k].add(v)
            self.database.save(k, v)

        @rule(k=keys, v=values)
        def delete(self, k, v):
            self.model[k].discard(v)
            self.database.delete(k, v)

        @rule(k=keys)
        def values_agree(self, k):
            assert set(self.database.fetch(k)) == self.model[k]

        def teardown(self):
            shutil.rmtree(self.tempd)


    TestDBComparison = DatabaseComparison.TestCase

In this we declare two bundles - one for keys, and one for values. We have two trivial rules which just populate them with data (k and v), and three non-trivial rules: save saves a value under a key and delete removes a value from a key, in both cases also updating the model of what should be in the database. values_agree then checks that the contents of the database agree with the model for a particular key.
We can then integrate this into our test suite by getting a unittest TestCase from it:

    import unittest

    TestDBComparison = DatabaseComparison.TestCase

    # Or just run with pytest's unittest support
    if __name__ == "__main__":
        unittest.main()

This test currently passes, but if we comment out the line where we call self.model[k].discard(v), we would see the following output when run under pytest:

    AssertionError: assert set() == {b''}

    ------------ Hypothesis ------------

    state = DatabaseComparison()
    var1 = state.add_key(k=b'')
    var2 = state.add_value(v=var1)
    state.save(k=var1, v=var2)
    state.delete(k=var1, v=var2)
    state.values_agree(k=var1)
    state.teardown()

Note how it’s printed out a very short program that will demonstrate the problem. The output from a rule based state machine should generally be pretty close to Python code - if you have custom repr implementations that don’t return valid Python then it might not be, but most of the time you should just be able to copy and paste the code into a test to reproduce it.

You can control the detailed behaviour with a settings object on the TestCase (this is a normal Hypothesis settings object using the defaults at the time the TestCase class was first referenced). For example, if you wanted to run fewer examples with larger programs you could change the settings to:

    from hypothesis import settings

    DatabaseComparison.TestCase.settings = settings(max_examples=50, stateful_step_count=100)

This doubles the number of steps each program runs and halves the number of test cases that will be run.
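The same settings can also be supplied when a machine is run directly rather than through its TestCase. The sketch below assumes a hypothetical SetComparison machine (a dict-backed set checked against the built-in set) and uses run_state_machine_as_test, a function from hypothesis.stateful that is not described in the text above; treat it as an illustration rather than the recommended way to wire up a test suite.

```python
import hypothesis.strategies as st
from hypothesis import settings
from hypothesis.stateful import RuleBasedStateMachine, rule, run_state_machine_as_test


class SetComparison(RuleBasedStateMachine):
    """Hypothetical machine: a dict used as a set, compared with a real set."""

    def __init__(self):
        super().__init__()
        self.impl = {}      # "implementation": membership is stored as dict keys
        self.model = set()  # model: the built-in set

    @rule(x=st.integers())
    def add(self, x):
        self.impl[x] = True
        self.model.add(x)

    @rule(x=st.integers())
    def discard(self, x):
        self.impl.pop(x, None)
        self.model.discard(x)

    @rule()
    def agree(self):
        assert set(self.impl) == self.model


# Fewer examples, longer programs, mirroring the settings shown above.
# deadline=None and database=None only keep this sketch self-contained.
run_state_machine_as_test(
    SetComparison,
    settings=settings(
        max_examples=50, stateful_step_count=100, deadline=None, database=None
    ),
)
```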

Rules¶

As said earlier, rules are the most common feature used in RuleBasedStateMachine. They are defined by applying the rule() decorator on a function. Note that RuleBasedStateMachine must have at least one rule defined and that a single function cannot be used to define multiple rules (this is to avoid having multiple rules doing the same things). Due to the stateful execution method, rules generally cannot take arguments from other sources such as fixtures or pytest.mark.parametrize - consider providing them via a strategy such as sampled_from() instead.

hypothesis.stateful.rule(targets=(), target=None, **kwargs)

Decorator for RuleBasedStateMachine. Any name present in target or targets will define where the end result of this function should go. If both are empty then the end result will be discarded.

target must be a Bundle, or if the result should go to multiple bundles you can pass a tuple of them as the targets argument. It is invalid to use both arguments for a single rule. If the result should go to exactly one of several bundles, define a separate rule for each case.

kwargs then define the arguments that will be passed to the function invocation. If their value is a Bundle, or if it is consumes(b) where b is a Bundle, then values that have previously been produced for that bundle will be provided. If consumes is used, the value will also be removed from the bundle.

Any other kwargs should be strategies and values from them will be provided.

hypothesis.stateful.consumes(bundle)

When introducing a rule in a RuleBasedStateMachine, this function can be used to mark bundles from which each value used in a step with the given rule should be removed. This function returns a strategy object that can be manipulated and combined like any other.
For example, a rule declared with

    @rule(value1=b1, value2=consumes(b2), value3=lists(consumes(b3)))

will consume a value from Bundle b2 and several values from Bundle b3 to populate value2 and value3 each time it is executed.

hypothesis.stateful.multiple(*args)

This function can be used to pass multiple results to the target(s) of a rule. Just use return multiple(result1, result2, ...) in your rule.

It is also possible to use return multiple() with no arguments in order to end a rule without passing any result.
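To see consumes() and multiple() working together, consider the following minimal sketch. The TicketMachine class and its rules are hypothetical: one rule issues a batch of tickets into a bundle with multiple(), and another redeems a ticket that consumes() removes from the bundle, so the same ticket should never be drawn twice.

```python
import hypothesis.strategies as st
from hypothesis import settings
from hypothesis.stateful import (
    Bundle,
    RuleBasedStateMachine,
    consumes,
    multiple,
    rule,
    run_state_machine_as_test,
)


class TicketMachine(RuleBasedStateMachine):
    """Hypothetical example: tickets are issued in batches and redeemed once."""

    tickets = Bundle("tickets")

    def __init__(self):
        super().__init__()
        self.next_id = 0
        self.outstanding = set()

    @rule(target=tickets, count=st.integers(min_value=0, max_value=3))
    def issue(self, count):
        new = list(range(self.next_id, self.next_id + count))
        self.next_id += count
        self.outstanding.update(new)
        # Each element becomes a separate value in the bundle; when count is 0
        # this is multiple() with no arguments, which adds nothing.
        return multiple(*new)

    @rule(ticket=consumes(tickets))
    def redeem(self, ticket):
        # consumes() removed this ticket from the bundle when it was drawn,
        # so it cannot be handed to redeem() a second time.
        assert ticket in self.outstanding
        self.outstanding.remove(ticket)


# deadline=None and database=None only keep this sketch self-contained.
run_state_machine_as_test(
    TicketMachine,
    settings=settings(max_examples=50, deadline=None, database=None),
)
```

If redeem were declared with ticket=tickets instead, the same ticket could be drawn again after being redeemed and the assertion would be expected to fail.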

Initializes¶

Initializes are a special case of rules that are guaranteed to be run at most once at the beginning of a run (i.e. before any normal rule is called). Note that if multiple initialize rules are defined, they may be called in any order, and that order will vary from run to run.

hypothesis.stateful.initialize(targets=(), target=None, **kwargs)

Decorator for RuleBasedStateMachine.

An initialize decorator behaves like a rule, but the decorated method is called at most once in a run. All initialize decorated methods will be called before any rule decorated methods, in an arbitrary order.

Initializes are typically useful to populate bundles:

    import hypothesis.strategies as st
    from hypothesis.stateful import Bundle, RuleBasedStateMachine, initialize, rule

    name_strategy = st.text(min_size=1).filter(lambda x: "/" not in x)


    class NumberModifier(RuleBasedStateMachine):
        folders = Bundle("folders")
        files = Bundle("files")

        @initialize(target=folders)
        def init_folders(self):
            return "/"

        # parent is drawn from the folders bundle, which init_folders has populated
        @rule(target=folders, parent=folders, name=name_strategy)
        def create_folder(self, parent, name):
            return "%s/%s" % (parent, name)

        @rule(target=files, parent=folders, name=name_strategy)
        def create_file(self, parent, name):
            return "%s/%s" % (parent, name)
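Because an initialize method behaves like a rule, it can also draw arguments from ordinary strategies and store them on the instance instead of in a bundle. The sketch below assumes a hypothetical BoundedCounter machine and relies on the ordering guarantee described above: by the time increment runs, set_limit has already been called.

```python
import hypothesis.strategies as st
from hypothesis import settings
from hypothesis.stateful import (
    RuleBasedStateMachine,
    initialize,
    rule,
    run_state_machine_as_test,
)


class BoundedCounter(RuleBasedStateMachine):
    """Hypothetical example: a counter that saturates at a generated limit."""

    @initialize(limit=st.integers(min_value=1, max_value=10))
    def set_limit(self, limit):
        # Runs once per test case, before any @rule method.
        self.limit = limit
        self.value = 0

    @rule()
    def increment(self):
        # Reading self.limit is safe here because set_limit has already run.
        self.value = min(self.value + 1, self.limit)
        assert 0 < self.value <= self.limit


# deadline=None and database=None only keep this sketch self-contained.
run_state_machine_as_test(
    BoundedCounter,
    settings=settings(max_examples=50, deadline=None, database=None),
)
```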

Preconditions¶

While it’s possible to use assume() in RuleBasedStateMachine rules, if you use it in only a few rules you can quickly run into a situation where few or none of your rules pass their assumptions. Thus, Hypothesis provides a precondition() decorator to avoid this problem. The precondition() decorator is used on rule-decorated functions, and must be given a function that returns True or False based on the RuleBasedStateMachine instance.

hypothesis.stateful.precondition(precond)

Decorator to apply a precondition for rules in a RuleBasedStateMachine. Specifies a precondition for a rule to be considered as a valid step in the state machine. The given function will be called with the instance of RuleBasedStateMachine and should return True or False. Usually it will need to look at attributes on that instance.

For example:

    from hypothesis.stateful import RuleBasedStateMachine, precondition, rule
    from hypothesis.strategies import integers


    class MyTestMachine(RuleBasedStateMachine):
        state = 1

        @precondition(lambda self: self.state != 0)
        @rule(numerator=integers())
        def divide_with(self, numerator):
            self.state = numerator / self.state

This is better than using assume in your rule since more valid rules should be able to be run.

    from hypothesis.stateful import RuleBasedStateMachine, rule, precondition


    class NumberModifier(RuleBasedStateMachine):
        num = 0

        @rule()
        def add_one(self):
            self.num += 1

        @precondition(lambda self: self.num != 0)
        @rule()
        def divide_with_one(self):
            self.num = 1 / self.num

By using precondition() here instead of assume(), Hypothesis can filter the inapplicable rules before running them. This makes it much more likely that a useful sequence of steps will be generated.

Note that currently preconditions can’t access bundles; if you need to use preconditions, you should store relevant data on the instance instead.

Invariants¶

Often there are invariants that you want to ensure are met after every step in a process. It would be possible to add these as rules that are run, but they would be run zero or multiple times between other rules. Hypothesis provides a decorator that marks a function to be run after every step.

hypothesis.stateful.invariant()

Decorator to apply an invariant for rules in a RuleBasedStateMachine. The decorated function will be run after every rule and can raise an exception to indicate failed invariants.

For example:

    from hypothesis.stateful import RuleBasedStateMachine, invariant


    class MyTestMachine(RuleBasedStateMachine):
        state = 1

        @invariant()
        def is_nonzero(self):
            assert self.state != 0

A complete machine that combines a rule with an invariant looks like this:

    from hypothesis.stateful import RuleBasedStateMachine, rule, invariant


    class NumberModifier(RuleBasedStateMachine):
        num = 0

        @rule()
        def add_two(self):
            self.num += 2
            if self.num > 50:
                self.num += 1

        @invariant()
        def divide_with_one(self):
            assert self.num % 2 == 0


    NumberTest = NumberModifier.TestCase

Invariants can also have precondition()s applied to them, in which case they will only be run if the precondition function returns true.

Note that currently invariants can’t access bundles; if you need to use invariants, you should store relevant data on the instance instead.
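As a small sketch of an invariant guarded by a precondition, consider a hypothetical RunningMean machine that collects integers. The mean is undefined while the list is empty, so the invariant is only checked once at least one value has been added; without the precondition it would be expected to fail with a ZeroDivisionError on the empty list.

```python
import hypothesis.strategies as st
from hypothesis import settings
from hypothesis.stateful import (
    RuleBasedStateMachine,
    invariant,
    precondition,
    rule,
    run_state_machine_as_test,
)


class RunningMean(RuleBasedStateMachine):
    """Hypothetical example: the mean of the values seen lies within their range."""

    def __init__(self):
        super().__init__()
        self.values = []

    @rule(x=st.integers(min_value=-1000, max_value=1000))
    def add(self, x):
        self.values.append(x)

    # The invariant is skipped while the precondition returns False.
    @precondition(lambda self: len(self.values) > 0)
    @invariant()
    def mean_is_within_bounds(self):
        mean = sum(self.values) / len(self.values)
        assert min(self.values) <= mean <= max(self.values)


# deadline=None and database=None only keep this sketch self-contained.
run_state_machine_as_test(
    RunningMean,
    settings=settings(max_examples=50, deadline=None, database=None),
)
```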

Generic state machines¶

Warning: GenericStateMachine is deprecated and will be removed in a future version.

class hypothesis.stateful.GenericStateMachine

A GenericStateMachine is a deprecated approach to stateful testing.

In earlier versions of Hypothesis, you would define steps, execute_step, teardown, and check_invariants methods; and the engine would then run something like the following:

    @given(st.data())
    def test_the_stateful_thing(data):
        x = MyStatemachineSubclass()
        x.check_invariants()
        try:
            for _ in range(50):
                step = data.draw(x.steps())
                x.execute_step(step)
                x.check_invariants()
        finally:
            x.teardown()

We now recommend using rule-based stateful testing instead wherever possible. If your test is better expressed in the above format than as a rule-based state machine, we suggest “unrolling” your method definitions into a simple test function with the above control flow.