31st January 2011, 03:03 pm

As you might have noticed, I’ve been thinking and writing about memo tries lately. I don’t mean to; they just keep coming up.

Memoization is the conversion of functions to data structures. A simple, elegant, and purely functional form of memoization comes from applying three common type isomorphisms, which also correspond to three laws of exponents, familiar from high school math, as noted by Ralf Hinze in his paper Generalizing Generalized Tries.

In Haskell, one can neatly formulate memo tries via an associated functor, Trie , with a convenient synonym " k ↛ v " for Trie k v , as in Elegant memoization with higher-order types. (Note that I’ve changed my pretty-printing from " k :→: v " to " k ↛ v ".) The key property is that the data structure encodes (is isomorphic to) a function, i.e.,

k ↛ a ≅ k → a



In most cases, we ignore non-strictness, though there is a delightful solution for memoizing non-strict functions correctly.

My previous four posts explored use of types to statically bound numbers and to determine lengths of vectors.

Just as (infinite-only) streams are the natural trie for unary natural numbers, we saw in Reverse-engineering length-typed vectors that length-typed vectors (one-dimensional arrays) are the natural trie for statically bounded natural numbers.

BNat n ↛ a ≡ Vec n a



and so

BNat n → a ≅ Vec n a



In retrospect, this relationship is completely unsurprising, since a vector of length n is a collection of values, indexed by 0, . . . , n - 1.

In that same post, I noted that vectors are not only a trie for bounded numbers, but when the elements are also bounded numbers, the vectors can also be thought of as numbers. Both the number of digits and the number base are captured statically, in types:

type Digits n b = Vec n ( BNat b)



The type parameters n and b here are type-encodigs of unary numbers, i.e., built up from zero and successor ( Z and S ). For instance, when b ≡ S (S Z) , we have n-bit binary numbers.

In this new post, I look at another question of tries and vectors. Given that Vec n is the trie for BNat n , is there also a trie for Vec n ?

Edits:

2011-01-31: Switched trie notation to " k ↛ v " to avoid missing character on iPad.

A trie for length-typed vectors A vector is a trie over bounded numbers. What is a trie over vectors? As always, isomorphisms show us the way. Vec n a → b ≅ (a × ⋯ × a) → b

≅ a → ⋯ → a → b

≅ a ↛ ⋯ ↛ a ↛ b

≡ Trie a (⋯ ( Trie a b)⋯)

≅ ( Trie a ∘ ⋯ ∘ Trie a) b

So the trie (functor) for Vec n a is the n-ary composition of tries for a : type VTrie n a = Trie a :^ n

where for any functor f and natural number type n , f :^ n ≅ f ∘ ⋯ ∘ f -- (n times)



N-ary functor composition Since composition is associative, a recursive formulation might naturally fold from the left or from right. (Or perhaps in a balanced tree, to facilitate parallel execution.) Right-folded composition Let's look at each fold direction, starting with the right, i.e., f :^ Z ≅ Id

f :^ S n ≅ f ∘ (f :^ n)

Writing as a GADT: data ( :^ ) ∷ ( * → * ) → * → ( * → * ) where

ZeroC ∷ a → (f :^ Z ) a

SuccC ∷ IsNat n ⇒ f ((f :^ n) a) → (f :^ ( S n)) a

Functors compose into functors and applicatives into applicatives. (See Applicative Programming with Effects (section 5) and the instance definitions in Semantic editor combinators.) The following definitions arise from the standard instances for binary functor composition. instance Functor f ⇒ Functor (f :^ n) where

fmap h ( ZeroC a) = ZeroC (h a)

fmap h ( SuccC fs) = SuccC ((fmap∘fmap) h fs)

instance ( IsNat n, Applicative f) ⇒ Applicative (f :^ n) where

pure = pureN nat

ZeroC f ⊛ ZeroC x = ZeroC (f x)

SuccC fs ⊛ SuccC xs = SuccC (liftA2 (⊛) fs xs)

pureN ∷ Applicative f ⇒ Nat n → a → (f :^ n) a

pureN Zero a = ZeroC a

pureN ( Succ _) a = SuccC ((pure ∘ pure) a)

More explicitly, the second pure could instead use pureN : pureN ( Succ n) a = SuccC ((pure ∘ pureN n) a)

Some tidier definitions Using (⊔) , there are tidier definitions of fmap and (⊛) : fmap = inZeroC ( $ ) ⊔ inSuccC (fmap∘fmap)



(⊛) = inZeroC2 ( $ ) ⊔ inSuccC2 (liftA2 (⊛))

where the new combinators are partial functions that work inside of ZeroC and SuccC . inZeroC h ( ZeroC a ) = ZeroC (h a )



inSuccC h ( SuccC as) = SuccC (h as)

inZeroC2 h ( ZeroC a ) ( ZeroC b ) = ZeroC (h a b )



inSuccC2 h ( SuccC as) ( SuccC bs) = SuccC (h as bs)

This example demonstrates another notational benefit of (⊔) , extending the techniques in the post Lazier function definitions by merging partial values. Left-folded composition For left-folded composition, a tiny change suffices in the S case: f :^ Z ≅ Id

f :^ S n ≅ (f :^ n) ∘ f

which translates to a correspondingly tiny change in the SuccC constructor. data ( :^ ) ∷ ( * → * ) → * → ( * → * ) where

ZeroC ∷ a → (f :^ Z ) a

SuccC ∷ IsNat n ⇒ (f :^ n) (f a) → (f :^ ( S n)) a

The Functor and Applicative instances are completely unchanged.

Vector tries (continued) Using the analysis above, we can easily define tries over vectors as n-ary composition of tries over the vector element type. Again, there is a right-folded and a left-folded version. Right-folded vector tries instance ( IsNat n, HasTrie a) ⇒ HasTrie ( Vec n a) where

type Trie ( Vec n a) = Trie a :^ n

Conversion from trie to function is, as always, a trie look-up. Its definition closely follows the definition of f :^ n : ZeroC v `untrie` ZVec = v

SuccC t `untrie` (a :< as) = (t `untrie` a) `untrie` as

For untrie , we were able to follow the zero/successor structure of the trie. For trie , we don't have such a structure to follow, but we can play the same trick as for defining units above. Use the nat method of the IsNat class to synthesize a number of type Nat n , and then follow the structure of that number in a new recursive function definition. trie = trieN nat

where trieN ∷ HasTrie a ⇒ Nat n → ( Vec n a → b) → ( Trie a :^ n) b

trieN Zero f = ZeroC (f ZVec )

trieN ( Succ _) f = SuccC (trie (λ a → trie (f ∘ (a :< ))))

Left-folded vector tries The change from right-folding to left-folding is minuscule. instance ( IsNat n, HasTrie a) ⇒ HasTrie ( Vec n a) where

type Trie ( Vec n a) = Trie a :^ n



ZeroC b `untrie` ZVec = b

SuccC t `untrie` (a :< as) = (t `untrie` as) `untrie` a



trie = trieN nat

trieN ∷ HasTrie a ⇒ Nat n → ( Vec n a → b) → ( Trie a :^ n) b

trieN Zero f = ZeroC (f ZVec )

trieN ( Succ _) f = SuccC (trie (λ as → trie (f ∘ ( :< as))))

Right vs Left? There are two tries for Vec n a , but with Haskell we have to choose one as the HasTrie instance. How can we make our choice? The same sort of question arises earlier. The post Reverse-engineering length-typed vectors showed how to discover Vec n by looking for the trie functor for the BNat n . The derivation was Vec n a ≅ a × ⋯ × a

≅ ( 1 → a) × ⋯ × ( 1 → a)

≅ ( 1 + ⋯ + 1 ) → a

The question of associativity arises here as well, for both product and sum. The BNat n a and Vec n types both choose right-associativity: data BNat ∷ * → * where

BZero ∷ BNat ( S n)

BSucc ∷ BNat n → BNat ( S n)



data Vec ∷ * → * → * where

ZVec ∷ Vec Z a

( :< ) ∷ a → Vec n a → Vec ( S n) a

Since trie construction is type-driven, I see the vector trie based on right-folded composition as in line with these definitions. Left-folding would come from a small initial change, swapping the constructors in the BNat definition: data BNat ∷ * → * where

BSucc ∷ BNat n → BNat ( S n)

BZero ∷ BNat ( S n)

From this tweaked beginning, a BNat trie construction would lead to a left-associated Vec type: data Vec ∷ * → * → * where

( :< ) ∷ Vec n a → a → Vec ( S n) a

ZVec ∷ Vec Z a

And then left-folded composition for the Vec trie.

As you might have noticed, I’ve been thinking and writing about memo tries lately. I don’t mean to; they just keep coming up. Memoization is the conversion of functions to...