22nd February 2011, 12:42 pm

I’ve been playing with deriving efficient parallel, imperative implementations of "prefix sum" or more generally "left scan". Following posts will explore the parallel & imperative derivations, but as a warm-up, I’ll tackle the functional & sequential case here.

Folds You’re probably familiar with the higher-order functions for left and right "fold". The current documentation says: foldl , applied to a binary operator, a starting value (typically the left-identity of the operator), and a list, reduces the list using the binary operator, from left to right: foldl f z [x1, x2, ⋯, xn] ≡ (⋯((z `f` x1) `f` x2) `f` ⋯) `f` xn

The list must be finite. foldr , applied to a binary operator, a starting value (typically the right-identity of the operator), and a list, reduces the list using the binary operator, from right to left: foldr f z [x1, x2, ⋯, xn] ≡ x1 `f` (x2 `f` ⋯ (xn `f` z)⋯)

And here are typical definitions: foldl ∷ (b → a → b) → b → [a] → b

foldl f z [] = z

foldl f z (x : xs) = foldl f (z `f` x) xs



foldr ∷ (a → b → b) → b → [a] → b

foldr f z [] = z

foldr f z (x : xs) = x `f` foldr f z xs

Notice that foldl builds up its result one step at a time and reveals it all at once, in the end. The whole result value is locked up until the entire input list has been traversed. In contrast, foldr starts revealing information right away, and so works well with infinite lists. Like foldl , foldr also yields only a final value. Sometimes it's handy to also get to all of the intermediate steps. Doing so takes us beyond the land of folds to the kingdom of scans.

Scans The scanl and scanr functions correspond to foldl and foldr but produce all intermediate accumulations, not just the final one. scanl ∷ (b → a → b) → b → [a] → [b]



scanl f z [x1, x2, ⋯ ] ≡ [z, z `f` x1, (z `f` x1) `f` x2, ⋯]



scanr ∷ (a → b → b) → b → [a] → [b]



scanr f z [⋯, xn_1, xn] ≡ [⋯, xn_1 `f` (xn `f` z), xn `f` z, z]

As you might expect, the last value is the complete left fold, and the first value in the scan is the complete right fold: last ( scanl f z xs) ≡ foldl f z xs

head ( scanr f z xs) ≡ foldr f z xs

which is to say last ∘ scanl f z ≡ foldl f z

head ∘ scanr f z ≡ foldr f z

The standard scan definitions are trickier than the fold definitions: scanl ∷ (b → a → b) → b → [a] → [b]

scanl f z ls = z : ( case ls of

[] → []

x : xs → scanl f (z `f` x) xs)



scanr ∷ (a → b → b) → b → [a] → [b]

scanr _ z [] = [z]

scanr f z (x : xs) = (x `f` q) : qs

where qs @ (q : _) = scanr f z xs

Every time I encounter these definitions, I have to walk through it again to see what's going on. I finally sat down to figure out how these tricky definitions might emerge from simpler specifications. In other words, how to derive these definitions systematically from simpler but less efficient definitions. Most likely, these derivations have been done before, but I learned something from the effort, and I hope you do, too.

Specifying scans As I pointed out above, the last element of a left scan is the left fold over the whole list. In fact, all of the elements are left folds, but over prefixes of the list. Similarly, all of the elements of a right-scan are right folds, but over suffixes of the list. These observations give rise to very simple specifications for scanl and scanr : scanl f z xs = map ( foldl f z) (inits xs)

scanr f z xs = map ( foldr f z) (tails xs)

Equivalently, scanl f z = map ( foldl f z) ∘ inits

scanr f z = map ( foldr f z) ∘ tails

Here I'm using the standard inits and tails functions from Data.List , documented as follows: The inits function returns all initial segments of the argument, shortest first. For example, inits "abc" ≡ [ "" , "a" , "ab" , "abc" ]

The tails function returns all final segments of the argument, longest first. For example, tails "abc" ≡ [ "abc" , "bc" , "c" , "" ]

The definitions: inits ∷ [a] → [[a]]

inits [] = [[]]

inits (x : xs) = [[]] ++ map (x : ) (inits xs)



tails ∷ [a] → [[a]]

tails [] = [[]]

tails xs @ (_ : xs') = xs : tails xs'

This definition of inits is stricter than necessary, as it examines its argument before emitting anything but ends up emitting an initial empty list whether the argument is nil or a cons. Here's a lazier definition: inits xs = [] : case xs of

[] → []

(x : xs) → map (x : ) (inits xs)

This second version produces the initial [] before examining its argument, which helps to avoid deadlock in some recursive contexts. These specifications of scanl and scanr make it very easy to prove the properties given above that last ∘ scanl f z ≡ foldl f z

head ∘ scanr f z ≡ foldr f z

(Hint: use the fact that last ∘ inits ≡ id , and head ∘ tails ≡ id .) Although these specifications of scanl (via map , foldl / foldr , and inits / tails ) state succinctly and simply what scanl and scanr compute, they are terrible recipes for how, because they perform quadratic work instead of linear. But that's okay, because now we're going to see how to turn these inefficient & clear specifications into efficient & less clear implementations.

Deriving efficient scans To derive efficient scans, use the specifications and perform some simplifications. Divide scanl into empty lists and nonempty lists, starting with empty: scanl f z []

≡ {- scanl spec -}

map ( foldl f z) (inits [])

≡ {- inits def -}

map ( foldl f z) [[]]

≡ {- map def -}

[ foldl f z []]

≡ {- foldl def -}

[z]

scanl f z (x : xs)

≡ {- scanl spec -}

map ( foldl f z) (inits (x : xs))

≡ {- inits def -}

map ( foldl f z) ([] : map (x : ) (inits xs))

≡ {- map def -}

foldl f z [] : map ( foldl f z) ( map (x : ) (inits xs))

≡ {- foldl def -}

z : map ( foldl f z) ( map (x : ) (inits xs))

≡ {- map g ∘ map f ≡ map (g ∘ f) -}

z : map ( foldl f z ∘ (x : )) (inits xs)

≡ {- (∘) def -}

z : map (λ ys → foldl f z (x : ys)) (inits xs)

≡ {- foldl def -}

z : map (λ ys → foldl f (z `f` x) ys)) (inits xs)

≡ {- η reduction -}

z : map ( foldl f (z `f` x))) (inits xs)

≡ {- scanl spec -}

z : scanl f (z `f` x) xs

Combine these conclusions and factor out the common (z : ) to yield the standard "optimized" definition. Does scanr work out similarly? Let's find out, replacing scanl with scanr , foldl with foldr , and inits with tails : scanr f z []

≡ {- scanr spec -}

map ( foldr f z) (tails [])

≡ {- tails def -}

map ( foldr f z) ([[]])

≡ {- map def -}

[ foldr f z []]

≡ {- foldr def -}

[z]

The derivation for nonempty lists deviates from scanl , due to differences between inits and tails , but it all works out nicely. scanr f z (x : xs)

≡ {- scanr spec -}

map ( foldr f z) (tails (x : xs))

≡ {- tails def -}

map ( foldr f z) ((x : xs) : tails xs)

≡ {- map def -}

foldr f z (x : xs) : map ( foldr f z) (tails xs)

≡ {- scanr spec -}

foldr f z (x : xs) : scanr f z xs

≡ {- foldr def -}

(x `f` foldr f z xs) : scanr f z xs

≡ {- head/scanr property -}

(x `f` head ( scanr f z xs)) : scanr f z xs

≡ {- factor out shared expression -}

(x `f` head qs) : qs where qs = scanr f z xs

≡ {- stylistic variation -}

(x `f` q) : qs where qs @ (q : _) = scanr f z xs



Coming attractions The scan implementations above are thoroughly sequential, in that they thread a single linear chain of data dependencies throughout the computation. Upcoming posts will look at more parallel-friendly variations.

I’ve been playing with deriving efficient parallel, imperative implementations of "prefix sum" or more generally "left scan". Following posts will explore the parallel & imperative derivations, but as a warm-up,...