Foo

parse . render == id

parse . render == id

data MyError e a = Failure e | Success a

λ import Test.QuickCheck λ import Test.QuickCheck.Arbitrary.Generic λ let g = genericArbitrary :: Gen (MyError Int Bool) λ mapM_ print =<< generate (replicateM 20 g) Success True Success True Success False Success False Success True Failure 17 Failure 8 Failure (-29) Failure (-14) Success True Failure 22 Failure (-2) Success False Success True Success False Failure (-15) Success False Success True Failure (-5) Failure (-5)

Int

Failure

MyError

Int

Int

MyError Int Bool

Failure

Success

Int

Bool

-- Product type numCases (a, b) = numCases a * numCases b -- Sum type numCases (Either a b) = numCases a + numCases b

numCases (a, b) = max (numCases a) (numCases b) numCases (Either a b) = numCases a + numCases b

λ import Fake λ let g = unCoverage (gcover :: Coverage (MyError Int Bool)) λ mapM_ print =<< generate (sequence g) Failure 75 Success False

Think of a time you've written tests for (de)serialization code of some kind, say for a data structure called. If you were using the lowest level of sophistication you probably defined a few values by hand, serialized them, deserialized that, and verified that you ended up with the same value you started with. In Haskell nomenclature we'd say that you manually verified that. If you were a little more sophisticated, you might have used the QuickCheck library (or any of the numerous similar packages it inspired in other languages) to verify theproperty for a bunch of randomly generated values. The first level of sophistication is often referred to as unit testing. The second frequently goes by the term property testing or sometimes fuzz testing.Both unit testing and property testing have some drawbacks. With unit testing you have to write fairly tedious boilerplate of listing by hand all the values you want to test with. With property testing you have to write a precise specification of how to generate the different cases randomly. There is a package called generic-arbitrary that can automatically derive these generators for you generically, but that approach often ends up testing many values that aren't giving you any increase in code coverage. To see this, consider the example of testing this domain-specific data type:Here is what we get using generic-arbitrary to generate test values:It is testing the exact same value multiple times. But even if it had a mechanism for deduplicating the values, do you really need so many different test cases for the serialization of thein thecase? I'm testing my serialization code for thetype, so I probably don't care about testing the serialization of. I would take it as a given that the libraries I'm using got the serialization forcorrect. If we take this assumption as a given, then for the above case of testing, I would really only care about two cases, thecase and thecase. Testing serializations forandare out of the scope of my concern because from the perspective of my project they are primitives.What we really want to test is something I call "full constructor coverage". With this pattern of generation you only generate enough values to exercise each constructor your your type hierarchy once. This gets you better code coverage while testing many fewer values. If we wanted exhaustive testing, the number of values you'd need to test are given by this relation:This is precisely what the "algebraic" in "algebraic data types" is referring to. For product types the number of inhabitants is multiplied, for sum types it is added. However, if we're going for full constructor coverage, the number of values we need to test is reduced to the following relation:For complex types, replacing the multiplication with the max greatly reduces the number of values you need to test. Here's what full constructor coverage looks like for MyError:At this point readers might be thinking, "Who cares? Computers are fast." Well, QuickCheck defaults to generating 100 test cases for each property, but complex types can easily make this insufficient. You can increase how many test cases it generates, but that will make your test suite slower and you probably won't know what the right number is to get the level of coverage you are aiming for. In my experience slower test suites can cause tangible business impact in production systems in terms of possibly longer downtimes when problems are encountered in production and you have to wait for tests to pass before you can deploy a fix. At the end of the day, naive randomized test case generation very unlikely to score as well as the full constructor coverage method under the metric of lines of code covered per test case.The good news is that Haskell's algebraic data types give us exactly what we need to get full constructor coverage in a completely automatic way. It is a part of the fake package and you can use it today. You can see example uses in the test suite . In a dynamically typed language you don't have knowledge of the constructors and forms they take. All variables can hold any data, so you don't have any knowledge of structure to guide you. Thanks to types, Haskell knows that structure.