Haskell and other call-by-need languages use lazy evaluation as their default evaluation strategy. For beginners and advanced programmers alike this can sometimes be confusing. At Well-Typed a core part of our business is teaching Haskell, which we do through public courses (such as the upcoming Skills Matter courses Fast Track to Haskell, Haskell Performance and Optimization and The Haskell Type System), private in-house courses targeted at specific client needs, and of course through writing blog posts.

In order to help us design these courses, we developed a tool called visualize-cbn . It is a simple interpreter for a mini Haskell-like language which outputs the state of the program at every step in a human readable format. It can also generate a HTML/JavaScript version with “Previous” and “Next” buttons to step through a program. We released the tool as open source to github and Hackage, in the hope that it will be useful to others.

The README.md file in the repo explains how to run the tool. In this blog post we will illustrate how one might take advantage of it. We will revisit the infamous triple of functions foldr , foldl , foldl' , and show how they behave. As a slightly more advanced example, we will also study the memory behaviour of mapM in the Maybe monad. Hopefully, this show-rather-than-tell blog post might help some people understand these functions better.

Throughout this section we will use this definition of enumFromTo :

enumFromTo n m = n m if n <= m then n : enumFromTo (n + 1 ) m (n) m else [] []

so that, say, [1..3] corresponds to (enumFromTo 1 3) .

foldr / foldrl / foldl'

In this section we will examine the difference between these three functions. We will not study these functions directly, however, but study a slightly simpler variant in the form of three definitions of length on lists. For a more in-depth discussion of this triple of functions, see our earlier blog post on this topic.

foldr

Consider the naive definition of length :

length xs = xs case xs of xs -> 0 [] : xs') -> 1 + length xs' (xxs')xs'

This corresponds to defining length = foldr (\x n -> 1 + n) 0 .

Let’s consider what happens when we compute length [1..3] ; you can click on Prev and Next to step through the execution:

Prev Next (step Step, Status) Term Heap

When you step through this, notice what is going on:

We first apply the definition of length .

. Then in step 1, length needs to do a case analysis, which forces us apply enumFromTo , and evaluate it until we have a top-level Cons constructor (in step 4)

needs to do a case analysis, which forces us apply , and evaluate it until we have a top-level constructor (in step 4) At that point we can execute the pattern match and the process continues.

When we evaluate enumFromTo (add 1 1) 3 in step 6, note that the expression add 1 1 is only evaluated once, although it is used twice; this sharing of computation is what makes Haskell a true lazy (call-by-need) language (as opposed to a call-by-name language).

in step 6, note that the expression is only evaluated once, although it is used twice; this sharing of computation is what makes Haskell a true lazy (call-by-need) language (as opposed to a call-by-name language). In these animations these shared expressions are shown separately below the expression; you can think of this as the “heap”, and accordingly the animation also shows when these expressions are garbage collected (e.g. step 14).

Note as you step through that we have a build up of calls to add which are only resolved until the very end. This is the source of the memory leak in this definition of length (corresponding to foldr ).

foldl

In a foldl -style definition of length , we introduce an accumulator:

length acc xs = acc xs case xs of xs -> acc []acc x : xs' -> length ( 1 + acc) xs' xs'acc) xs'

This corresponds to defining length = foldl (

x -> 1 + n) 0 .

Unlike the previous definition, this is tail-recursive; however, it still suffers from a memory leak due to Haskell’s extremely lazy nature. You will see why when you step through the execution:

Prev Next (step Step, Status) Term Heap

Notice how we are still building up a chain of additions, except they are now in the accumulator instead. This chain is only resolved (and garbage collected) at the very end (step 26).

foldl'

In the final definition, we make sure to evaluate the accumulator as we go:

length acc xs = acc xs case xs of xs -> acc []acc x : xs' -> let acc' = add 1 acc in seq acc' ( length acc' xs') xs'acc'addaccacc' (acc' xs')

This corresponds to defining length = foldl' (

x -> 1 + n) 0 .

If you step through this note that we evaluate the accumulator immediately at each step, and moreover that garbage collection can now happen as we compute as well:

Prev Next (step Step, Status) Term Heap

mapM over the Maybe monad

As a final example of a slightly more advanced nature, try predicting what will happen when we run this in ghci :

case mapM return [ 1 .. ] of Just (x : _) -> x (x_)

If you try it out and the result is not what you expected, perhaps stepping through the following evaluation of mapM return [1..3] to weak-head normal form (whnf: when there is a constructor at the top-level) will help you understand:

Prev Next (step Step, Status) Term Heap

Note that this expression reduces to whnf only after the entire list has been evaluated, and moreover that this requires an O(n) nested pattern matches. The take-away point from this example is that mapM should not be applied to long lists in most monads, as this will result in a memory leak.

Conclusion