10th June 2009, 04:36 pm

Memoization takes a function and gives back a semantically equivalent function that reuses rather than recomputes when applied to the same argument more than once. Variations include not-quite-equivalence due to added strictness, and replacing value equality with pointer equality.

Memoization is often packaged up polymorphically:

memo :: (???) => (k -> v) -> (k -> v)

For pointer-based (“lazy”) memoization, the type constraint (“???”) is empty. For equality-based memoization, we’d need at least Eq k , and probably Ord k or HasTrie k for efficient lookup (in a finite map or a possibly infinite memo trie).

Although memo is polymorphic, its argument is a monomorphic function. Implementations that use maps or tries exploit that monomorphism in that they use a type like Map k v or Trie k v . Each map or trie is built around a particular (monomorphic) type of keys. That is, a single map or trie does not mix keys of different types.

Now I find myself wanting to memoize polymorphic functions, and I don’t know how to do it.

Flavors of polymorphism

If a recursively defined function f is polymorphic, then the recursion may still be monomorphic. That is, recursive calls may be restricted to the same type instance as the parent call. Most recursive polymorphic functions fit this form, because most polymorphic recursive data types are “regular”, meaning that a polymorphic data type is included in itself only at the same type instance. For instance, the usual polymorphic lists and trees are regular.

Example: GADTs

Among other places, non-regular, or “nested data types” arise in statically typed encodings of typed languages. For instance, here’s a GADT (generalized algebraic data type) for typed lambda calculus expressions:

-- Variables data V a = V String (Type a) -- Expressions data E :: * -> * where Lit :: a -> E a -- literal Var :: V a -> E a -- variable (:^) :: E (a -> b) -> E a -> E b -- application Lam :: V a -> E b -> E (a -> b) -- abstraction

The Type type is sort of like TypeRep , except that it is statically typed.

data Type :: * -> * where Bool :: Type Bool Float :: Type Float ... (:*:) :: Type a -> Type b -> Type (a,b) (:->:) :: Type a -> Type b -> Type (a->b)

These GADTs ( E and Type ) are both non-regular, and so recursive functions over them will involve more than one argument type.

So how can we memoize?

A first try

Let’s consider a specific case of polymorphic functions:

type k :--> v = ∀ a. HasType a => k a -> v a

HasType is to Typeable as Type is to TypeRep . The memo implementation I’m playing with relies on HasType .

The memoizer can have type

memo :: (k :--> v) -> (k :--> v)

which uses rank 2 polymorphism (because of the argument type’s ∀ , which cannot be moved to the outside).

My implementation of memo is similar to the discussion in Stretching the storage manager: weak pointers and stable names in Haskell, but modernized a bit and adapted for polymorphism. It uses an IntMap of lists of pairs of stable names (akin to pointers) for arguments and values for results. The idea is to first use hashStableName to get an Int to use as an IntMap key, and then linearly traverse the resulting list of binding pairs, comparing stable keys for equality. ( StableName has Eq but not Ord .) Although hashStableName can map different stable names to the same hash value, collisions are rare, so the StableBind lists rarely have more than one element and the linear search is cheap.

type SM k v = I.IntMap [StableBind k v] -- Stable map data StableBind k v = ∀ a. HasType a => SB (StableName (k a)) (v a)

The reason for hiding the type parameter a in StableName is so that different bindings can be for different types.

The key tricky bit is managing static typing while searching for a particular StableName a in a [StableBind] . Here’s my implementation:

blookup :: ∀ k v a. HasType a => StableName (k a) -> [StableBind k v] -> Maybe (v a) blookup stk = look where look :: [StableBind k v] -> Maybe (v a) look [] = Nothing look (SB stk' v : binds') | Just Refl <- tya `tyEq` typeOf2 stk', stk == stk' = Just v | otherwise = look binds' tya :: Type a tya = typeT

The crucial magic bit is

tyEq :: Type a -> Type b -> Maybe (a :=: b)

where a :=: b represents a proof that the types a and b are the same type. The proof type has a simple GADT representation:

data (:=:) :: * -> * -> * where Refl :: a :=: a

This simple type definition ensures that only valid type-equality proofs can exist. Well, except for ⊥. The guard’s pattern match with Just Refl will force evaluation, so that ⊥ can’t sneak by us. That match also informs the type-checker that stk and stk' are stable names of the same type in that clause, which then makes stk == stk' be well-typed, and makes Just v have the required type, i.e., Maybe (v a) .

Finally, the typeOf2 function is a simple helper that peels off two type constructors and extracts a Type :

typeOf2 :: HasType a => g (f a) -> Type a

Wishing for more

The type of memo above is too restrictive for my uses. It only handles polymorphic functions of type k a -> v a , and only with the single constraint HasType a .

The reason I want lazy memoization now is that I’m compressing expressions to maximize representation sharing, as John Hughes described in Lazy Memo-functions. Once sharing is maximized, pointer-based memoization works better, because equal values are pointer-equal. To compress an expression, simply use a memoized copy function, as John suggested.

compress :: HasType a => E a -> E a compress e = mcopy e where mcopy, copy :: HasType b => E b -> E b -- Memo version mcopy = memo copy -- Copier, with memo-copied components copy (u :^ v) = appM (mcopy u) (mcopy v) copy (Lam v b) = lamM v (mcopy b) copy e = e -- memoized constructors appM :: (HasType a, HasType b) => E (a -> b) -> E a -> E b appM = memo2 (:^) lamM :: (HasType a, HasType b) => V a -> E b -> E (a -> b) lamM = memo2 Lam

The memo2 function is defined in terms of memo , using some some newtype trickery. Its type:

memo2 :: HasType a => (k a -> l a -> v a) -> (k a -> l a -> v a)

But, sigh, the (:^) and Lam constructors I’m trying to memoize do not have the required types. My higher-order-polymorphic memo functions do not have flexible enough types.

And this is where I’m stuck.

I’d appreciate your ideas and suggestions.

Memoization takes a function and gives back a semantically equivalent function that reuses rather than recomputes when applied to the same argument more than once. Variations include not-quite-equivalence due to...