The round-trip property

... a hastily-written post about the power-to-weight champion of property testing.

Parse farce

Computers need to talk to one another, which means programmers need to agree on things. This is the worst, for various reasons, but mostly because it leaves us sitting around implementing parsers and printers for formats we barely care about. A good portion of my career thus far has been spent on parsers and printers!

Parsers and printers are a classic missed isomorphism: they're supposed to be dual, but in general we cannot derive both from the same specification. If you want some assurance that your implementation is correct, expect to be uncomfortably reliant on testing.

Unit tests are pretty much unavoidable here, and are especially handy when there's a simple specification or pre-existing test suite to compare against. However, the bugs encountered while writing a parser are generally not easy to pre-empt! Unit tests cannot bear the entire load. You'd end up shipping bugs or wasting time.

Tripping

We know that parsers and printers are supposed to be dual. We can simply treat this as a law, and write a property test to enforce it.

Using Hedgehog primitives, it might look a little like this:

prop_trip = do x <- forAll genAst parse (pretty x) === pure x

... although a wise Hedgehog programmer would probably use the supplied tripping function.

While the round-trip test is pretty well-known among functional programmers and property-based testing fans , I really want to hammer home three things:

It is unreasonably effective It is fairly easy to implement It should be in your testing toolkit

This is both the simplest useful property-test I can think of for a working engineer and the most likely to reliably identify bugs of consequence. It is property testing's trojan horse. Talk to your team lead about property testing today!

A worked example

Let's mock up a parser and pretty-printer, break them subtly, and use a generator to identify incompatibilities. This is especially easy when using a property-testing library like Hedgehog. If you're stuck in a language without a decent framework, this is a simple enough property to implement by hand using random primitives .

Tokens

Our pal today is a tiny subset of a C-like expression language. I'll only be implementing a scanner, since it should be sufficient to get the point across. Lexers / scanners are the simplest form of parser, and also the easiest to test. I'll befriend two birds with the one loaf here by also demonstrating a little bit of Parser Combinator Hell .

Here's a valid program (don't worry about how it executes):

x := 41 ; x += 2 ; x + 0.00001 ; abc := x; x++;

Here are examples of our tokens, in lieu of a real specification:

55555 // ints 1.222222 // floats, up to six points := // assignment + // addition += // plus equals abc // variables, [a-zA-Z]+ ( // left paren ) // right paren ++ // increment ; // separator

... and here's a corresponding datatype:

data Token = TInt Int | TFloat Double | TAssign | TPlus | TPlusEq | TVar Text | TLParen | TRParen | TIncrement | TSeparator deriving ( Eq , Ord , Show )

We don't care about any kind of whitespace. It can be peeled off by the lexer as we go, and can be inserted freely by the pretty-printer.

From this point on, code may be intentionally incorrect. Please don't cargo-cult anything without reading the full post.

Pretty-printing

Personally, I prefer to write the printer first, so I can get the property test in place before I even get started on the parser. This is a preference I developed over time by doing the precise opposite. Your mileage may vary.

A (very bad) pretty-printer for a stream of tokens, given the above caveats, is pretty straightforward. We just write a printer for an individual token...

renderToken :: Token -> Text renderToken t = case t of TInt x -> T.pack (show x) TFloat d -> T.pack (printf "%6f" d) TAssign -> ":=" TPlus -> "+" TPlusEq -> "+=" TVar x -> x TLParen -> "(" TRParen -> ")" TIncrement -> "++" TSeparator -> ";"

... and then add liberal amounts of whitespace. For this toy language, whitespace only really matters for disambiguation between ++ and + , x y and xy . Let's be lazy and apply it between each token. This is a hack job, after all!

pretty :: [ Token ] -> Text pretty = T.intercalate " " . fmap renderToken

λ > pretty [ TLParen , TVar "x" , TIncrement , TRParen , TSeparator ] "( x ++ ) ;"

Generating tokens

Here's a first attempt at a Hedgehog generator for tokens. Recall we had some very convenient restrictions on float precision and variable names. We'll take advantage of Hedgehog to enforce these.

genTokens :: Gen [ Token ] genTokens = Gen.list (Range.linear 0 10000 ) genToken genToken :: Gen Token genToken = Gen.choice [ TInt <$> Gen.int (Range.linear 0 maxBound) , TFloat <$> Gen.double (Range.linearFrac 0.0 9223372036854775807.9 ) , pure TAssign , pure TPlus , pure TPlusEq , TVar <$> Gen.text (Range.linear 0 20 ) Gen.alpha , pure TLParen , pure TRParen , pure TIncrement , pure TSeparator ]

Writing the property

To get the property in place, we'll need stubs for parse :

parse :: Text -> Either ParseError [ Token ] parse = undefined

Now we can write our property:

prop_trip :: Property prop_trip = property $ do toks <- forAll genTokens tripping toks pretty parse tests :: IO Bool tests = check prop_trip

... and now we can type tests into the repl to see how we're doing.

Parsing

I'll write the parser in a naive-but-reasonable way and write about each bug the property identifies. I'm using Parsec for this, since it makes it relatively easy to make mistakes.

Let's start by peeling off whitespace...

whitespace :: Parser () whitespace = void (many P.space)

... stubbing out a token parser...

parseToken :: Parser Token parseToken = P.choice [ ]

... and writing our top-level parsing functions, ensuring irrelevant whitespace is handled and we parse all the way to the end of the stream:

parse :: Text -> Either ParseError [ Token ] parse = -- You almost always want to parse until EOF when using Parsec! P.parse (parseTokens <* P.eof) "" parseTokens :: Parser [ Token ] parseTokens = do whitespace P.many (parseToken <* whitespace)

Most of our tokens are pretty boring string literals, so let's get those out of the way.

parseToken :: Parser Token parseToken = P.choice [ P.string ":=" *> pure TAssign , P.string "+" *> pure TPlus , P.string "+=" *> pure TPlusEq , P.string "(" *> pure TLParen , P.string ")" *> pure TRParen , P.string "++" *> pure TIncrement , P.string ";" *> pure TSeparator ]

Parsec doesn't give us numeric parsing primitives, so I'll write some slow ones of my own:

parseInt :: Parser Int parseInt = do xyz <- P.many1 P.digit case readMaybe xyz of Just x -> pure x Nothing -> empty parseDouble :: Parser Double parseDouble = do a <- P.many1 P.digit _ <- P.char '.' b <- P.many1 P.digit guard (length b <= 6 ) case readMaybe (a ++ "." ++ b) of Just x -> pure x Nothing -> empty

... and I'll need to recognise our alphanumeric variables, too:

parseVar :: Parser Text parseVar = fmap T.pack (P.many1 P.letter)

... so, stitching it all together, it looks like something like this should work:

parseToken :: Parser Token parseToken = P.choice [ TInt <$> parseInt , TFloat <$> parseDouble , TVar <$> parseVar , P.string ":=" *> pure TAssign , P.string "+" *> pure TPlus , P.string "+=" *> pure TPlusEq , P.string "(" *> pure TLParen , P.string ")" *> pure TRParen , P.string "++" *> pure TIncrement , P.string ";" *> pure TSeparator ]

... but does it? My first unit test passes. Looks like it!

> parse "xyz := 1 + 1;" Right [ TVar "xyz" , TAssign , TInt 1 , TPlus , TInt 1 , TSeparator ]

Looks like we're done! If the property passes, we can go home.

> tests ✗ < interactive > failed after 2 tests and 6 shrinks . ┏━━ RTT.hs ━━━ 152 ┃ prop_trip :: Property 153 ┃ prop_trip = 154 ┃ property $ do 155 ┃ toks <- forAll genTokens ┃ │ [ TFloat 0.0 ] 156 ┃ tripping toks pretty parse ┃ ^^^^^^^^^^^^^^^^^^^^^^^^^^ ┃ │ ━━━ Original ━━━ ┃ │ Right [ TFloat 0.0 ] ┃ │ ━━━ Roundtrip ━━━ ┃ │ Left (line 1 , column 2 ) : ┃ │ unexpected '.' ┃ │ expecting digit, letter, ":=" , "+" , "+=" , "(" , ")" , "++" , ";" or end of input This failure can be reproduced by running : > recheck ( Size 1 ) ( Seed 6054864962880355960 1567128570873129121 ) < property >

Weird failures on floats. That's weird, parseDouble worked fine when I tested it in isolation! Turns out the order inside a choice block really matters in Parsec. When rules are ambiguous, the most general one needs to be at the bottom, else it will always succeed in place of any more specific combinators. Flipping the order of the TInt and TFloat parsers is not enough, however! Parsec also allows failing generators to consume input, so we need to use try to stop parseDouble eating integers. Here's how we get away with it:

P.choice [ TFloat <$> P.try parseDouble , TInt <$> parseInt ... ]

The next counterexample is [TPlusEq] , which complains about an unexpected = . This looks like the same thing, the first + is picked up by TPlus and the = is meaningless. So, let's put Increment and PlusEq above Plus , and wrap them both in try :

parseToken :: Parser Token parseToken = P.choice [ TFloat <$> P.try parseDouble , TInt <$> parseInt , TVar <$> parseVar , P.try (P.string "++" ) *> pure TIncrement , P.try (P.string "+=" ) *> pure TPlusEq , (P.string "+" ) *> pure TPlus , (P.string ":=" ) *> pure TAssign , P.string "(" *> pure TLParen , P.string ")" *> pure TRParen , P.string ";" *> pure TSeparator ]

The next counterexample is [TVar ""] , which really shouldn't be happening. This is a generator bug, and not an intentional one. Whoops. Adjust the lower range parameter: TVar <$> Gen.text (Range.linear 1 20) Gen.alpha .

Here's our final counterexample:

Failed ( - Original / + Roundtrip ) │ - Right [ TFloat 2.2858544151012983e-283 ] │ + Right [ TFloat 0.0 ]

Looks like we're generating some floats that require more precision than six decimal places to render. They're being truncated during serialisation. This is another generator invariant. We can fix this by ensuring our generated values are truncated up front:

genToken :: Gen Token genToken = Gen.choice [ ( TFloat . round6) <$> Gen.double (Range.linearFrac 0.0 9223372036854775807.9 ) , ... ] round6 :: Double -> Double round6 x = (fromInteger $ round (x * ( 10 ^ 6 ))) / ( 10.0 ^^ 6 )

That's it, the property now passes!

✓ < interactive > passed 100 tests .

A straw man

I'm now reasonably confident that this parser works. This is a really simple language, so I could probably have done that with unit tests this time around, but it probably would have taken a similar amount of effort.

This was the simplest, least interesting serialisation problem I could think of. It intentionally has no recursive productions or context-sensitivity. I added several restrictions to make it easy to parse. It still picked up a couple of places where I was holding Parsec wrong, and a couple of spots where my generator, parser, and printer were implementing different languages.

Round-tripping is useful for pretty much any kind of parser-printer combo, from JSON or binary serialisation up to ridiculous layout-sensitive Haskell parsers. It pays off pretty much immediately, even for inane cases like this. It is hard to live without when implementing complicated formats. If you're solving this kind of problem, I implore you to give it a try.

All up

While you may still need other tests to ensure you haven't implemented some wildly incorrect language, round-trip testing can ensure your parser is complete and unambiguous. Care sometimes needs to be taken in writing a good printer and a good generator. This feels a lot like writing a specification! It's a much more productive process than thrashing your brain for interesting permutations.

Parser ambiguities are an extremely common class of bug that can wreak the worst kind of havoc. With a small amount of up-front due diligence, we can eliminate them in a semi-automated fashion. This is almost certainly in your interest.

This technique can be reused for any serialisation layer, as well as any internal functions that should be involutive, including isos.