Terminating Tricky Traversals

Posted on January 29, 2020

Just a short one today. I’m going to look at a couple of algorithms for breadth-first traversals with complex termination proofs.

Breadth-First Graph Traversal

In a previous post I talked about breadth-first traversals over graphs, and the difficulties that cycles cause. Graphs are especially tricky to work with in a purely functional language, because so many of the basic algorithms are described in explicitly mutating terms (i.e. “mark off a node as you see it”), with no obvious immutable translation. The following is the last algorithm I came up with:

bfs :: Ord a => (a -> [a]) -> a -> [[a]]
bfs g r = takeWhile (not . null) (map fst (fix (f r . push)))
  where
    push xs = ([],Set.empty) : [ ([],seen) | (_,seen) <- xs ]
    f x q@((l,s) : qs)
      | Set.member x s = q
      | otherwise = (x : l, Set.insert x s) : foldr f qs (g x)
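To see the function run on a graph with a cycle, here is a small self-contained sketch. The adjacency map `graph` and the lookup helper `neighbours` are made up for illustration; `fix` is taken from `Data.Function`.

```haskell
import Data.Function (fix)
import qualified Data.Map as Map
import qualified Data.Set as Set

bfs :: Ord a => (a -> [a]) -> a -> [[a]]
bfs g r = takeWhile (not . null) (map fst (fix (f r . push)))
  where
    -- Each level starts empty, inheriting the "seen" set of the level before it.
    push xs = ([], Set.empty) : [ ([], seen) | (_, seen) <- xs ]
    f x q@((l, s) : qs)
      | Set.member x s = q
      | otherwise      = (x : l, Set.insert x s) : foldr f qs (g x)

-- Hypothetical example graph: 1 -> 2, 3; 2 -> 4; 3 -> 4; 4 -> 1 (a cycle).
graph :: Map.Map Int [Int]
graph = Map.fromList [(1, [2, 3]), (2, [4]), (3, [4]), (4, [1])]

-- Hypothetical helper: nodes missing from the map have no successors.
neighbours :: Int -> [Int]
neighbours n = Map.findWithDefault [] n graph
```

On this graph, `bfs neighbours 1` yields `[[1],[2,3],[4]]`: node 4 is reached twice but emitted once, and the edge back to 1 is ignored.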

As difficult as it is to work with graphs in a pure functional language, it’s even more difficult to work in a total language, like Agda. Looking at the above function, there are several bits that we can see right off the bat won’t translate over easily. Let’s start with fix.

We shouldn’t expect to be able to write fix in Agda as-is. Just look at its Haskell implementation:

fix :: (a -> a) -> a
fix f = f (fix f)

It’s obviously non-total!

(This is actually a non-memoizing version of fix, which is different from the usual one.)
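For comparison, here is a sketch of the two variants side by side. The names `fixNoShare` and `fixShare` are invented here to keep them apart; the sharing one follows the `let`-based shape that `Data.Function.fix` uses.

```haskell
-- The version shown above: every unfolding builds a fresh call, so nothing is shared.
fixNoShare :: (a -> a) -> a
fixNoShare f = f (fixNoShare f)

-- The usual version: the result is named once and refers back to itself.
fixShare :: (a -> a) -> a
fixShare f = let x = f x in x

-- An infinite list of ones; with fixShare it is a single cyclic cons cell.
ones :: [Int]
ones = fixShare (1 :)

-- Either variant can tie a recursive function together.
factorial :: Integer -> Integer
factorial = fixNoShare (\rec n -> if n <= 1 then 1 else n * rec (n - 1))
```

Both definitions agree on results (`take 3 ones` is `[1,1,1]` and `factorial 5` is `120`); they differ only in how much work is shared.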

We can write a function like fix, though, using coinduction and sized types.

Coinductive types are the dual to inductive types. Totality-wise, a coinductive type must be “productive”; i.e. a coinductive list can be infinitely long, but it must be provably able to evaluate to a constructor (cons or nil) in finite time.

Sized types also help us out here: they’re quite subtle, and a little finicky to use occasionally, but they are invaluable when it comes to proving termination or productivity of complex (especially higher-order) functions. The canonical example is mapping over the following tree type:

The compiler can’t tell that the recursive call in the mapTree function will only be called on subnodes of the argument: it can’t tell that it’s structurally recursive, in other words. Annoyingly, we can fix the problem by inlining map.
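In Haskell terms, the shape of the problem might be sketched as follows. This assumes the tree in question is a rose tree (a node holding a list of subtrees); the constructor name `Node` and the function `mapTreeInlined` are chosen here for illustration. Haskell has no termination checker, so both versions are accepted, but only the second has the recursion laid out in a way a purely syntactic structural check can follow.

```haskell
-- Assumed shape of the tree: a value together with a list of subtrees.
data Tree a = Node a [Tree a]
  deriving (Eq, Show)

-- The recursive call is hidden behind `map`, so a syntactic checker cannot
-- see that it is only ever applied to subtrees of the argument.
mapTree :: (a -> b) -> Tree a -> Tree b
mapTree f (Node x ts) = Node (f x) (map (mapTree f) ts)

-- With `map` inlined as a local loop, every recursive call is visibly made
-- on a piece of the original argument.
mapTreeInlined :: (a -> b) -> Tree a -> Tree b
mapTreeInlined f (Node x ts) = Node (f x) (go ts)
  where
    go []       = []
    go (c : cs) = mapTreeInlined f c : go cs

sample :: Tree Int
sample = Node 1 [Node 2 [], Node 3 [Node 4 []]]
```

The two functions compute the same result on every tree; the difference is only in what a checker is able to see.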

The other solution is to give the tree a size parameter. This way, all subnodes of a given tree will have smaller sizes, which will give the compiler a finite descending chain condition it can use to prove termination.

So how do we use this stuff in our graph traversal? Well first we’ll need a coinductive Stream type:
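As a rough Haskell analogue, a stream is a list with no nil constructor. In this sketch laziness stands in for coinduction and there is no size index, so it illustrates only the data shape, not the productivity guarantee. The names `Stream`, `:<`, `iterateS`, and `takeS` are chosen here for illustration.

```haskell
infixr 5 :<

-- A stream always has a head and a tail; there is no way to end it.
data Stream a = a :< Stream a

-- Productive by construction: a constructor is emitted before recursing.
iterateS :: (a -> a) -> a -> Stream a
iterateS f x = x :< iterateS f (f x)

-- Observe a finite prefix of a stream.
takeS :: Int -> Stream a -> [a]
takeS n _ | n <= 0 = []
takeS n (x :< xs)  = x : takeS (n - 1) xs
```

For example, `takeS 4 (iterateS (* 2) 1)` gives `[1,2,4,8]`.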

And then we can use it to write our breadth-first traversal.

How do we convert this to a list of lists? For that, we would need to prove that the graph contains only finitely many elements. We could use Noetherian finiteness for this: though I have a working implementation, I’m still figuring out how to clean it up, so I will leave it for another post.

Traversing a Braun Tree

A recent paper (Nipkow and Sewell 2020) provided Isabelle/HOL proofs for some algorithms on Braun trees (Okasaki 1997), which prompted me to take a look at them again. This time, I came up with an interesting linear-time toList function, which relies on the following peculiar type:

newtype Q2 a = Q2 { unQ2 :: (Q2 a -> Q2 a) -> (Q2 a -> Q2 a) -> a }

Even after coming up with the type myself, I still can’t really make heads or tails of it. If I squint, it starts to look like some bizarre Church-encoded binary number (but I have to really squint). It certainly seems related to corecursive queues (Smith 2009).

Anyway, we can use the type to write the following lovely toList function on a Braun tree.

toList :: Tree a -> [a]
toList t = unQ2 (f t b) id id
  where
    f (Node x l r) xs = Q2 (\ls rs -> x : unQ2 xs (ls . f l) (rs . f r))
    f Leaf xs = Q2 (\_ _ -> [])

    b = Q2 (\ls rs -> unQ2 (ls (rs b)) id id)
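To try the function out, a minimal self-contained sketch follows. The `Tree` declaration and the `braun` helper (which builds the Braun tree of the numbers 1 to n by index arithmetic) are assumptions added for illustration; only `Q2` and `toList` come from above.

```haskell
-- Assumed tree shape: a plain binary tree with values at the nodes.
data Tree a = Leaf | Node a (Tree a) (Tree a)

newtype Q2 a = Q2 { unQ2 :: (Q2 a -> Q2 a) -> (Q2 a -> Q2 a) -> a }

toList :: Tree a -> [a]
toList t = unQ2 (f t b) id id
  where
    -- Emit the root, then queue the left child on `ls` and the right on `rs`.
    f (Node x l r) xs = Q2 (\ls rs -> x : unQ2 xs (ls . f l) (rs . f r))
    -- The first empty tree ends the traversal.
    f Leaf _ = Q2 (\_ _ -> [])

    -- End of a level: run the queued lefts, then the queued rights.
    b = Q2 (\ls rs -> unQ2 (ls (rs b)) id id)

-- Hypothetical helper: the Braun tree holding 1..n. The node for value i
-- at stride `step` has children i + step (left) and i + 2 * step (right).
braun :: Int -> Tree Int
braun n = go 1 1
  where
    go i step
      | i > n     = Leaf
      | otherwise = Node i (go (i + step) (2 * step)) (go (i + 2 * step) (2 * step))
```

With these definitions `toList (braun 15)` gives the numbers 1 to 15 in order, and the same holds for sizes that do not fill the last level, such as `braun 5`.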

So can we convert it to Agda?

Not really! As it turns out, this function is even more difficult to implement than one might expect. We can’t even write the Q2 type in Agda without getting in trouble.

Q2 isn’t strictly positive, unfortunately.

Apparently this problem of strict positivity for breadth-first traversals has come up before: Berger, Matthes, and Setzer (2019); Hofmann (1993).

Wait—Where did Q2 Come From?

Update 31/01/2020

Daniel Peebles (@copumpkin on twitter) replied to my tweet about this post with the following:

“Interesting! Curious how you came up with that weird type at the end. It doesn’t exactly feel like the first thing one might reach for and it would be interesting to see some writing on the thought process that led to it”
Dan P (@copumpkin), Jan 30, 2020

So that’s what I’m going to add here!

Let’s take the Braun tree of the numbers 1 to 15:

     ┌8
   ┌4┤
   │ └12
 ┌2┤
 │ │ ┌10
 │ └6┤
 │   └14
1┤
 │   ┌9
 │ ┌5┤
 │ │ └13
 └3┤
   │ ┌11
   └7┤
     └15

Doing a normal breadth-first traversal for the first two levels is fine (1, 2, 3): it starts to fall apart at the third level (4, 6, 5, 7). Here’s the way we should traverse it: “all of the left branches, and then all of the right branches”. So, we will have a queue of trees. We take the root element of each tree in the queue and emit it. Then we add all of the left children of the trees in the queue to one queue, all of the right children to another, concatenate the two into a new queue, and start again. Because of the structure of the Braun tree, we can stop whenever we hit an empty tree. Here’s an ASCII diagram to show what’s going on:

     ┌8  |    ┌8   |  ┌8   | 8
   ┌4┤   |  ┌4┤    | 4┤    |
   │ └12 |  │ └12  |  └12  | 9
 ┌2┤     | 2┤      |       |
 │ │ ┌10 |  │ ┌10  |  ┌9   | 10
 │ └6┤   |  └6┤    | 5┤    |
 │   └14 |    └14  |  └13  | 11
1┤     -->      ----->   -------->
 │   ┌9  |    ┌9   |  ┌10  | 12
 │ ┌5┤   |  ┌5┤    | 6┤    |
 │ │ └13 |  │ └13  |  └14  | 13
 └3┤     | 3┤      |       |
   │ ┌11 |  │ ┌11  |  ┌11  | 14
   └7┤   |  └7┤    | 7┤    |
     └15 |    └15  |  └15  | 15

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
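The procedure in the diagram can be transcribed fairly directly, before any cleverness with queue representations. This is a naive sketch, not the linear-time version: `Tree`, `braun`, `isNode`, and `levelOrder` are names chosen here for illustration, and the repeated list traversals are deliberately left unoptimised.

```haskell
-- Assumed tree shape: a plain binary tree with values at the nodes.
data Tree a = Leaf | Node a (Tree a) (Tree a)

-- Hypothetical helper: the Braun tree holding 1..n, built by index arithmetic.
braun :: Int -> Tree Int
braun n = go 1 1
  where
    go i step
      | i > n     = Leaf
      | otherwise = Node i (go (i + step) (2 * step)) (go (i + 2 * step) (2 * step))

isNode :: Tree a -> Bool
isNode Leaf = False
isNode _    = True

levelOrder :: Tree a -> [a]
levelOrder t = go [t]
  where
    go queue = case span isNode queue of
      -- The queue starts with an empty tree: nothing left to emit.
      ([], _)     -> []
      -- Every tree is non-empty: emit the roots, then lefts followed by rights.
      (nodes, []) -> roots nodes ++ go (lefts nodes ++ rights nodes)
      -- An empty tree appears part-way through: emit what precedes it and stop.
      (nodes, _)  -> roots nodes
    roots  ts = [x | Node x _ _ <- ts]
    lefts  ts = [l | Node _ l _ <- ts]
    rights ts = [r | Node _ _ r <- ts]
```

On the tree above, `levelOrder (braun 15)` gives the numbers 1 to 15 in order.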

If we want to do this in Haskell, we have a number of options for how we would represent queues: as ever, though, I much prefer to use vanilla lists and time the reversals so that they stay linear. Here’s what that looks like:

toList :: Tree a -> [a]
toList t = f t b [] []
  where
    f (Node x l r) xs ls rs = x : xs (l : ls) (r : rs)
    f Leaf _ _ _ = []

    b ls rs = foldr f b (reverse ls ++ reverse rs) [] []

Any place we see a foldr being run after a reverse or a concatenation, we know that we can remove a pass (in actual fact rewrite rules will likely do this automatically for us).

toList :: Tree a -> [a]
toList t = f t b [] []
  where
    f (Node x l r) xs ls rs = x : xs (l : ls) (r : rs)
    f Leaf _ _ _ = []

    b ls rs = foldl (flip f) (foldl (flip f) b rs) ls [] []
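That rewrite of b leans on two standard list identities: a foldr over a reversed list is a foldl with the arguments flipped, and a foldr over a concatenation is a foldr over the first list seeded with a foldr over the second. The sketch below isolates the step; `viaFoldr` and `viaFoldl` are hypothetical names for the two forms.

```haskell
-- The shape used before the rewrite: reverse both queues, append, fold.
viaFoldr :: (a -> b -> b) -> b -> [a] -> [a] -> b
viaFoldr f z ls rs = foldr f z (reverse ls ++ reverse rs)

-- The shape after the rewrite: no reversal and no append, just two left folds.
viaFoldl :: (a -> b -> b) -> b -> [a] -> [a] -> b
viaFoldl f z ls rs = foldl (flip f) (foldl (flip f) z rs) ls
```

For instance, with the queues stored in reverse as `[3,2,1]` and `[6,5,4]`, both `viaFoldr (:) []` and `viaFoldl (:) []` rebuild `[1,2,3,4,5,6]`.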

Finally, since we’re building up the lists with : (in a linear way, i.e. we will not use the intermediate queues more than once), and we’re immediately consuming them with a fold, we can deforest the intermediate list, replacing every : with f. (Actually, it’s a little more tricky than that: we replace the : with the reversed version of f, i.e. the one you would pass to foldr if you wanted it to act like foldl. This trick is explained in more detail in this post.)

toList :: Tree a -> [a]
toList t = f t b id id
  where
    f (Node x l r) xs ls rs = x : xs (ls . f l) (rs . f r)
    f Leaf _ _ _ = []

    b ls rs = ls (rs b) id id

Once you do that, however, you run into the “cannot construct the infinite type” error. To be precise:

• Occurs check: cannot construct the infinite type: a3 ~ (a3 -> c0) -> (a3 -> c1) -> [a2]

And this gives us the template for our newtype! It requires some trial and error, but you can see where some of the recursive calls are, and what you eventually get is the following:

newtype Q2 a = Q2 { unQ2 :: (Q2 a -> Q2 a) -> (Q2 a -> Q2 a) -> [a] }

(You can remove the list type constructor at the end; I did, as I thought it made it slightly more general.) And from there we get back to the toList function.

References