The categorical abstract machine Mk.2

The categorical abstract machine is a neat virtual machine designed in 1987 by Cousineau, Curien and Mauny.

This post extends the categorical abstract machine with typechecking and proof search.

De Bruijn's notation can be used to give simple denotational semantics to lambda calculus.

{0}(p,d) = d
{n+1}(p,d) = {n}p
{c}p = c
{MN}p = {M}p({N}p)
{λ.M}p d = {M}(p,d)
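These equations transcribe almost directly into Haskell. The following is a small sketch of my own (the names Lam, Val, den and so on are mine, not from the CAM paper); environments are nested pairs, exactly as in the equations above.

```haskell
-- A sketch of the denotational semantics above; all names here are my own.
data Lam = Ix Int          -- de Bruijn index {n}
         | Cst Int         -- constant {c}
         | App Lam Lam     -- application {MN}
         | Abs Lam         -- abstraction {λ.M}

data Val = VInt Int | VFun (Val -> Val) | VPair Val Val | VUnit

den :: Lam -> Val -> Val
den (Ix 0)    (VPair _ d) = d                               -- {0}(p,d) = d
den (Ix n)    (VPair p _) = den (Ix (n-1)) p                -- {n+1}(p,d) = {n}p
den (Cst c)   _           = VInt c                          -- {c}p = c
den (App m n) p           = case den m p of
                              VFun f -> f (den n p)         -- {MN}p = {M}p({N}p)
den (Abs m)   p           = VFun (\d -> den m (VPair p d))  -- {λ.M}p d = {M}(p,d)
```

For example, `den (App (Abs (Ix 0)) (Cst 42)) VUnit` evaluates the identity applied to a constant.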

CAM is a very simple virtual machine based on these semantics. It has only a few state transition rules that can be easily described in Haskell.

data Sem a = Nil | Cons (Sem a) (Sem a) | Closure (Sem a -> Sem a) | Syn a

data Tm = MPair [Tm] [Tm] | MAbs [Tm] | MConst | MFst | MSnd | MAp
          deriving (Show)

eval :: Sem a -> Tm -> Sem a
eval v (MAbs f)    = Closure (\w -> foldl eval (Cons v w) f)
eval v (MPair f g) = Cons (foldl eval v f) (foldl eval v g)
eval v MConst      = Nil
eval (Cons f _) MFst = f
eval (Cons _ g) MSnd = g
eval (Cons (Closure f) w) MAp = f w
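As a quick sanity check of the machine, here is a small standalone sketch (the definitions are reproduced so the snippet runs on its own): projecting out of a freshly built pair, and applying a closure, both land back at Nil.

```haskell
data Sem a = Nil | Cons (Sem a) (Sem a) | Closure (Sem a -> Sem a) | Syn a

data Tm = MPair [Tm] [Tm] | MAbs [Tm] | MConst | MFst | MSnd | MAp
          deriving (Show)

eval :: Sem a -> Tm -> Sem a
eval v (MAbs f)    = Closure (\w -> foldl eval (Cons v w) f)
eval v (MPair f g) = Cons (foldl eval v f) (foldl eval v g)
eval v MConst      = Nil
eval (Cons f _) MFst = f
eval (Cons _ g) MSnd = g
eval (Cons (Closure f) w) MAp = f w

-- Build a pair of units and project its first component.
demo1 :: Sem ()
demo1 = foldl eval Nil [MPair [MConst] [MConst], MFst]

-- Pair the closure λ.snd with an argument and apply it.
demo2 :: Sem ()
demo2 = foldl eval Nil [MPair [MAbs [MSnd]] [MConst], MAp]
```

Both demos reduce to Nil.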

Described in this way, it's an obvious implementation of normalisation by evaluation. eval constructs an initial semantic interpretation of a term.

Throughout this blog post I use the following "type information". We use the categorical terms "terminal", "product" and "exponent" to refer to units, tuples and functions. There are also two sets of type variables, some for me, and some for you.

data Ty = Terminal | Prod Ty Ty | Exp Ty Ty | Some Int | Var Int deriving (Show, Eq)

Ty values have a similar role as Sem values, but on a different level. They are values produced by interpreting functors, whereas Sem values are produced by interpreting natural transformations.

Since we have an NbE, we should be able to reify semantics back into syntactic terms, and reflect syntactic terms into semantics.

reify :: Ty -> Sem [Tm] -> [Tm]
reify Terminal Nil = [MConst]
reify (Prod a b) (Cons x y) = [MPair (reify a x) (reify b y)]
reify (Exp a b) (Closure f) = [MAbs (reify a (f (reflect b (scope a))))]
    where scope (Exp a _) = MFst : scope a
          scope _         = [MSnd]
reify (Some _) (Syn t) = t
reify (Var _) (Syn t) = t

reflect :: Ty -> [Tm] -> Sem [Tm]
reflect Terminal _ = Nil
reflect (Prod a b) t = Cons (reflect a (t++[MFst])) (reflect b (t++[MSnd]))
reflect (Exp a b) t = Closure (\x -> reflect a [MPair t (reify b x), MAp])
reflect (Some _) t = Syn t
reflect (Var _) t = Syn t

nbe :: Ty -> Ty -> [Tm] -> [Tm]
nbe a b t = reify b (foldl eval (reflect a (scope b)) t)
    where scope (Exp e _) = MFst : scope e
          scope _         = []

Given a well-typed term, reify should be a total function. I'm not entirely convinced that this is correct, because the reconstruction of the scope in reify seems funky, but examining how the evaluator produces the closure suggests it is necessary.

*CAM2> nbe (Some 0) (Some 0) []
[]
*CAM2> nbe (Prod (Some 0) Terminal) (Some 0) [MFst]
[MFst]
*CAM2> nbe (Prod (Some 0) Terminal) Terminal [MSnd]
[MConst]
*CAM2> nbe Terminal (Exp (Some 0) (Some 0)) [MAbs [MSnd]]
[MAbs [MSnd]]
*CAM2> nbe Terminal (Exp (Exp (Some 0) (Some 1)) (Some 0)) [MAbs [MAbs [MFst,MSnd]]]
[MAbs [MAbs [MFst,MSnd]]]

The computation yields us a normal form, and it's very interesting to examine how normal forms are shaped. If you don't share this opinion, watch some of Frank Pfenning's logic lectures.

We can work out what the normal form is in two ways. One is to look at the evaluator and think about where it would get stuck. The other is to examine the reify/reflect functions we just described.

data Norm = Pair Norm Norm | Abs Norm | Const | Neut Neut
            deriving (Show, Eq)

data Neut = Id | Fst Neut | Snd Neut | Ap Neut Norm
            deriving (Show, Eq)

There's something interesting happening with the neutral form of "apply". It gets this weird [MPair t (reify b x), MAp] form; that is, it gets stuck with a pair that is "loaned" to it.

Next we define nreify and nreflect.

nreify :: Ty -> Sem Neut -> Norm
nreify Terminal Nil = Const
nreify (Prod a b) (Cons x y) = Pair (nreify a x) (nreify b y)
nreify (Exp a b) (Closure f) = Abs (nreify a (f (nreflect b (scope a Id))))
    where scope (Exp a _) x = scope a (Fst x)
          scope _         x = Snd x
nreify (Some _) (Syn t) = Neut t
nreify (Var _) (Syn t) = Neut t

nreflect :: Ty -> Neut -> Sem Neut
nreflect Terminal _ = Nil
nreflect (Prod a b) t = Cons (nreflect a (Fst t)) (nreflect b (Snd t))
nreflect (Exp a b) t = Closure (\x -> nreflect a (Ap t (nreify b x)))
nreflect (Some _) t = Syn t
nreflect (Var _) t = Syn t

nnbe :: Ty -> Ty -> [Tm] -> Norm
nnbe a b t = nreify b (foldl eval (nreflect a (scope b Id)) t)
    where scope (Exp e _) x = scope e (Fst x)
          scope _         x = x

Though now we have normal forms, what do they mean as terms? That can be written down, like this.

meaning :: Norm -> [Tm]
meaning (Pair a b) = [MPair (meaning a) (meaning b)]
meaning (Abs a) = [MAbs (meaning a)]
meaning Const = [MConst]
meaning (Neut Id) = []
meaning (Neut (Fst a)) = meaning (Neut a) ++ [MFst]
meaning (Neut (Snd b)) = meaning (Neut b) ++ [MSnd]
meaning (Neut (Ap f n)) = [MPair (meaning (Neut f)) (meaning n), MAp]
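For instance, the normal form of the K combinator maps back to the CAM code it came from. Here is a standalone check (definitions reproduced; I add a deriving Eq to Tm, which the original lacks, purely so results can be compared):

```haskell
-- Tm gets an extra Eq here, only so we can compare terms.
data Tm = MPair [Tm] [Tm] | MAbs [Tm] | MConst | MFst | MSnd | MAp
          deriving (Show, Eq)

data Norm = Pair Norm Norm | Abs Norm | Const | Neut Neut deriving (Show, Eq)
data Neut = Id | Fst Neut | Snd Neut | Ap Neut Norm deriving (Show, Eq)

meaning :: Norm -> [Tm]
meaning (Pair a b) = [MPair (meaning a) (meaning b)]
meaning (Abs a) = [MAbs (meaning a)]
meaning Const = [MConst]
meaning (Neut Id) = []
meaning (Neut (Fst a)) = meaning (Neut a) ++ [MFst]
meaning (Neut (Snd b)) = meaning (Neut b) ++ [MSnd]
meaning (Neut (Ap f n)) = [MPair (meaning (Neut f)) (meaning n), MAp]
```

For example, `meaning (Abs (Abs (Neut (Snd (Fst Id)))))` gives back `[MAbs [MAbs [MFst,MSnd]]]`, the K combinator from the earlier nbe transcript.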

Now we get normal forms as structures that cannot be filled by anything else but a normal form.

*CAM2> nnbe (Some 0) (Some 0) []
Neut Id
*CAM2> nnbe (Prod (Some 0) Terminal) (Some 0) [MFst]
Neut (Fst Id)
*CAM2> nnbe (Prod (Some 0) Terminal) Terminal [MSnd]
Const
*CAM2> nnbe Terminal (Exp (Some 0) (Some 0)) [MAbs [MSnd]]
Abs (Neut (Snd Id))
*CAM2> nnbe Terminal (Exp (Exp (Some 0) (Some 1)) (Some 0)) [MAbs [MAbs [MFst,MSnd]]]
Abs (Abs (Neut (Snd (Fst Id))))
*CAM2> nnbe Terminal (Exp (Exp (Some 0) (Some 1)) (Prod Terminal (Some 0))) [MAbs [MAbs [MFst,MSnd,MSnd]]]
Abs (Abs (Neut (Snd (Snd (Fst Id)))))

Why come up with such a structure? Well, there are some things we can do with it.

Normal forms are shaped such that they can be typechecked without including annotations inside them. We can use a bidirectional typechecking algorithm for that:

check :: Ty -> Ty -> Norm -> Bool
check c Terminal Const = True
check c (Prod a b) (Pair f g) = check c a f && check c b g
check c (Exp a b) (Abs f) = check (Prod c b) a f
check a c@(Some _) (Neut n) = maybe False (==c) (infer a n)
check a c@(Var _) (Neut n) = maybe False (==c) (infer a n)
check _ _ _ = False

infer :: Ty -> Neut -> Maybe Ty
infer a (Fst n) = fmap fst (infer a n >>= valid_product)
infer a (Snd n) = fmap snd (infer a n >>= valid_product)
infer a (Ap n f) = do
    (c,d) <- infer a n >>= valid_exponent
    guard (check a d f) >> pure c
infer a Id = Just a

valid_product :: Ty -> Maybe (Ty,Ty)
valid_product (Prod a b) = Just (a,b)
valid_product _ = Nothing

valid_exponent :: Ty -> Maybe (Ty,Ty)
valid_exponent (Exp a b) = Just (a,b)
valid_exponent _ = Nothing

guard :: Bool -> Maybe ()
guard True = pure ()
guard False = Nothing

Note that the right-hand type of a Neut structure is supposed to stay abstract. The Id is there because we could substitute it away if we knew its type. For the structure to be in the specific normal form that we want, this property is therefore required.

Let's try this out on something, just to verify it's not entirely wrong.

*CAM2> check Terminal (Exp (Exp (Some 0) (Some 1)) (Prod Terminal (Some 0))) (Abs (Abs (Neut (Snd (Snd (Fst Id))))))
True
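The checker also rejects things, which is just as important. A standalone sketch (checker reproduced verbatim; the negative case, a mismatched result type, is my own example):

```haskell
data Ty = Terminal | Prod Ty Ty | Exp Ty Ty | Some Int | Var Int deriving (Show, Eq)
data Norm = Pair Norm Norm | Abs Norm | Const | Neut Neut deriving (Show, Eq)
data Neut = Id | Fst Neut | Snd Neut | Ap Neut Norm deriving (Show, Eq)

check :: Ty -> Ty -> Norm -> Bool
check c Terminal Const = True
check c (Prod a b) (Pair f g) = check c a f && check c b g
check c (Exp a b) (Abs f) = check (Prod c b) a f
check a c@(Some _) (Neut n) = maybe False (==c) (infer a n)
check a c@(Var _) (Neut n) = maybe False (==c) (infer a n)
check _ _ _ = False

infer :: Ty -> Neut -> Maybe Ty
infer a (Fst n) = fmap fst (infer a n >>= valid_product)
infer a (Snd n) = fmap snd (infer a n >>= valid_product)
infer a (Ap n f) = do (c,d) <- infer a n >>= valid_exponent
                      guard (check a d f) >> pure c
infer a Id = Just a

valid_product :: Ty -> Maybe (Ty,Ty)
valid_product (Prod a b) = Just (a,b)
valid_product _ = Nothing

valid_exponent :: Ty -> Maybe (Ty,Ty)
valid_exponent (Exp a b) = Just (a,b)
valid_exponent _ = Nothing

guard :: Bool -> Maybe ()
guard True = pure ()
guard False = Nothing

-- λ.snd checks against a → a, but not against a → b.
accepts, rejects :: Bool
accepts = check Terminal (Exp (Some 0) (Some 0)) (Abs (Neut (Snd Id)))
rejects = check Terminal (Exp (Some 1) (Some 0)) (Abs (Neut (Snd Id)))
```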

This is useful, but in practice you may want to store programs that aren't in normal form. It is also possible to form arbitrary terms by annotating their types into the leaves.

To understand how to produce internally verified programs, you need to take a little trip into category theory. The computation rules described here can be derived from adjoint functors. Adjointness is a rich condition providing plenty of structure that can be relied upon.

For functors to be adjoint, they must form unit/counit natural transformations. For example, pair/(fst*snd) forms a unit/counit pair. The pair functor is a right adjoint of the diagonal functor.

Functors form structures that behave like polymorphic types. Though, natural transformations are a lot like functors themselves.

There's a condition that a natural transformation behaves like a structure-preserving transformation from one functor to another. This means that units and counits allow things to "pass through".

f ∘ fst = fst ∘ (f * g)
g ∘ snd = snd ∘ (f * g)
(f * f) ∘ pair = pair ∘ f
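These naturality squares can be spot-checked pointwise in Haskell. Here cross plays the role of (f * g) and dup plays pair; the names and sample morphisms are mine:

```haskell
-- (f * g) acting on products, and the diagonal "pair"; names are my own.
cross :: (a -> c) -> (b -> d) -> (a, b) -> (c, d)
cross f g (x, y) = (f x, g y)

dup :: a -> (a, a)
dup x = (x, x)

-- Sample morphisms to instantiate the squares with.
f :: Int -> Int
f = (+1)

g :: Int -> Int
g = (*2)

-- f ∘ fst = fst ∘ (f * g), checked at a sample point
lhs1, rhs1 :: Int
lhs1 = (f . fst) (3, 4)
rhs1 = (fst . cross f g) (3, 4)

-- (f * f) ∘ pair = pair ∘ f, checked at a sample point
lhs2, rhs2 :: (Int, Int)
lhs2 = (cross f f . dup) 3
rhs2 = (dup . f) 3
```

Both sides agree at the sample points, as the laws promise they do at every point.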

If we recognize this property, we can conclude that natural transformations can be treated as if they took parameters. This also makes them self-identifying.

fst f g
snd f g
pair f

After all, natural transformations are families of morphisms. If you identify the parameters they take, then you can identify individual morphisms by filling in the parameters.

Next I construct the natural transformation presentation. It's a little bit neat.

data NT = NSym | NTerm
        | NComp NT NT
        | NProd NT NT | NExp NT NT
        | NConst NT | NPair NT | NCur NT NT
        | NFst NT NT | NSnd NT NT | NAp NT NT
        deriving (Eq, Show)

The validation is surprisingly simple.

nt_env :: NT -> Maybe (NT, NT)
nt_env NSym = pure (NSym, NSym)
nt_env NTerm = pure (NTerm, NTerm)
nt_env (NComp f g) = do
    (a,b) <- nt_env f
    (c,d) <- nt_env g
    guard (b == c)
    pure (a,d)
nt_env (NProd x y) = do
    (a,b) <- nt_env x
    (c,d) <- nt_env y
    pure (NProd a c, NProd b d)
nt_env (NExp x y) = do
    (a,b) <- nt_env x
    (c,d) <- nt_env y
    pure (NProd a d, NProd b c)
nt_env (NConst x) = do
    (_,c) <- nt_env x
    pure (NTerm, c)
nt_env (NPair f) = do
    (a,b) <- nt_env f
    pure (NProd a a, b)
nt_env (NCur f g) = do
    (a,b) <- nt_env f
    (c,d) <- nt_env g
    pure (NExp (NProd a d) c, b)
nt_env (NFst f g) = do
    (a,b) <- nt_env f
    (_,d) <- nt_env g
    pure (a, NProd b d)
nt_env (NSnd f g) = do
    (_,b) <- nt_env f
    (c,d) <- nt_env g
    pure (c, NProd b d)
nt_env (NAp f g) = do
    (a,b) <- nt_env f
    (c,d) <- nt_env g
    pure (a, NProd (NExp b c) d)

You might ask: why does this work? What are we doing differently here? The trick would appear to be that we end up correctly describing the full shape of fst, snd and const.

Now, as a reminder that we aren't going to abandon lambda calculus, let's leave these structures here.

data LC = LVar Int | LConst | LLam LC | LAp LC LC | LPair LC LC | LFst LC | LSnd LC deriving (Show)

It's trivial to build functions relating these structures to the main content of this blogpost. For now I leave them out.

There's still one thing about those normal forms that I didn't cover: we can use them to drive a proof search. Actually, if you look at those natural transformations, notice that they allow you to treat type inference as a form of proof search. This is a relatively cool thing and I'm going to cover it after I've introduced how to do proof search with these structures.

First we're going to need a logic programming environment.

class Unifiable a where
    var :: Int -> a
    as_var :: a -> Either Int a
    rewrite :: (Int -> Maybe a) -> a -> a
    overlap :: a -> a -> Maybe [(a,a)]
    occurs :: Int -> a -> Bool

And we have to lay it out over our type declarations.

instance Unifiable Ty where
    var i = Var i
    as_var (Var i) = Left i
    as_var a = Right a
    rewrite _ Terminal = Terminal
    rewrite f (Prod x y) = Prod (rewrite f x) (rewrite f y)
    rewrite f (Exp x y) = Exp (rewrite f x) (rewrite f y)
    rewrite _ (Some i) = Some i
    rewrite f x@(Var i) = case f i of
        Nothing -> x
        Just v -> v
    overlap Terminal Terminal = pure []
    overlap (Some i) (Some j) | (i == j) = pure []
    overlap (Var i) (Var j) | (i == j) = pure []
    overlap x@(Var _) y = pure [(x, y)]
    overlap x y@(Var _) = pure [(x, y)]
    overlap (Prod x y) (Prod z w) = do
        xs <- overlap x z
        ys <- overlap y w
        pure (xs ++ ys)
    overlap (Exp x y) (Exp z w) = do
        xs <- overlap x z
        ys <- overlap y w
        pure (xs ++ ys)
    overlap _ _ = Nothing
    occurs i (Var j) = (i == j)
    occurs i Terminal = False
    occurs i (Prod x y) = any (occurs i) [x,y]
    occurs i (Exp x y) = any (occurs i) [x,y]
    occurs i (Some j) = False

You can think of unification as a coequalizer in a suitable category where morphisms form substitutions.

unify :: Unifiable a => a -> a -> Map Int a -> Maybe (Map Int a)
unify x y m = unify' (as_var $ subs m x) (as_var $ subs m y) m

unify' :: Unifiable a => Either Int a -> Either Int a -> Map Int a -> Maybe (Map Int a)
unify' (Left i) (Left j) m =
    if i == j then pure m else pure (Map.insert i (var j) m)
unify' (Left i) (Right x) m =
    if occurs i x then Nothing else pure (Map.insert i x m)
unify' (Right x) (Left i) m =
    if occurs i x then Nothing else pure (Map.insert i x m)
unify' (Right x) (Right y) m =
    overlap x y >>= (foldl (>>=) (pure m) . fmap (uncurry unify))

subs :: Unifiable a => Map Int a -> a -> a
subs m k = rewrite (\i -> Map.lookup i m >>= pure . subs m) k
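To see what the substitution maps look like concretely, here is a simplified standalone unifier of my own over Ty, sketching the same idea without the type class machinery (walk, occursIn and unifyTy are my names, not the blog's):

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

data Ty = Terminal | Prod Ty Ty | Exp Ty Ty | Some Int | Var Int deriving (Show, Eq)

-- Push the substitution map through a type.
walk :: Map Int Ty -> Ty -> Ty
walk m (Var i) = maybe (Var i) (walk m) (Map.lookup i m)
walk m (Prod x y) = Prod (walk m x) (walk m y)
walk m (Exp x y) = Exp (walk m x) (walk m y)
walk _ t = t

occursIn :: Int -> Ty -> Bool
occursIn i (Var j) = i == j
occursIn i (Prod x y) = occursIn i x || occursIn i y
occursIn i (Exp x y) = occursIn i x || occursIn i y
occursIn _ _ = False

unifyTy :: Ty -> Ty -> Map Int Ty -> Maybe (Map Int Ty)
unifyTy x y m = go (walk m x) (walk m y)
  where
    go (Var i) (Var j) | i == j = Just m
    go (Var i) t
        | occursIn i t = Nothing                 -- occurs check
        | otherwise    = Just (Map.insert i t m)
    go t v@(Var _) = go v t
    go Terminal Terminal = Just m
    go (Some i) (Some j) | i == j = Just m
    go (Prod a b) (Prod c d) = unifyTy a c m >>= unifyTy b d
    go (Exp a b) (Exp c d) = unifyTy a c m >>= unifyTy b d
    go _ _ = Nothing
```

Unifying Var 0 → 1 with (Some 1 * Some 2) → 1 binds 0 to the product, while the occurs check rejects Var 0 against Prod (Var 0) Terminal.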

This is very much an implementation of μKanren. You treat the logic program as a stream of results.

type State a = (Map Int a, Int)

data Stream a b = Empty
                | Delay (Stream a b)
                | Filled (State a) b (Stream a b)
                | Suspend (Stream a b)

stream_head :: Stream a b -> Maybe (State a, b)
stream_head Empty = Nothing
stream_head (Delay tail) = stream_head tail
stream_head (Filled st x tail) = Just (st, x)
stream_head (Suspend s) = stream_head s

stream_list :: Stream a b -> [(State a, b)]
stream_list Empty = []
stream_list (Delay tail) = stream_list tail
stream_list (Filled st x tail) = (st, x) : stream_list tail
stream_list (Suspend s) = stream_list s

The Suspend differs from the usual μKanren implementation. I was not sure how to implement this feature properly so I extended the definition of stream to include it.

A suspended stream is empty if it appears in an empty context. Suspensions are meant to stack, so that we have layers of traps that determine what the context is in each case.

For now this is likely overconstrained and does not produce every proof. I suppose Suspend would need De Bruijn levels to work properly.

The implementation wraps goals so that they can be put into a monad. This is done to make it convenient to write logic programs.

newtype Goal a b = Goal {enumerate :: State a -> Stream a b} -- Enumeration of solutions

empty_state :: State a
empty_state = (Map.empty, 0)

any_solution :: Goal a b -> Maybe (State a, b)
any_solution g = stream_head (enumerate g empty_state)

all_solutions :: Goal a b -> [(State a, b)]
all_solutions g = stream_list (enumerate g empty_state)

-- It is nice for the REPL to display a few solutions for me.
instance (Show a, Show b) => Show (Goal a b) where
    show g = let some = fmap snd (take 20 (all_solutions g))
             in case some of
                [] -> "Nope"
                _ -> "Solutions:\n" ++ (foldr (\x y -> x ++ "\n" ++ y) "" $ fmap show some)

The following code implements the usual μKanren environment.

infixr 7 ≡
infixr 6 ∧
infixr 3 ∨

fresh :: Unifiable a => (a -> Goal a x) -> Goal a x
fresh f = Goal fresh'
    where fresh' (m,i) = enumerate (f (var i)) (m,i+1)

(≡), eq :: Unifiable a => a -> a -> Goal a ()
(≡) = eq
eq x y = Goal eq'
    where eq' (st,i) = case unify x y st of
                       Just a -> Filled (a,i) () Empty
                       Nothing -> Empty

false :: Goal a b
false = Goal (\st -> Empty)

mplus :: Stream a b -> Stream a b -> Stream a b
mplus Empty g = g
mplus (Delay f) g = mplus g f
mplus (Filled st x f) g = Filled st x (mplus f g)
mplus (Suspend f) (Suspend g) = Suspend (mplus g f)
mplus (Suspend f) g = mplus g (Suspend f)

bind :: Stream a t -> (t -> State a -> Stream a b) -> Stream a b
bind Empty g = Empty
bind (Delay f) g = Delay (bind f g)
bind (Filled st x f) g = mplus (g x st) (bind f g)
bind (Suspend s) g = Suspend (bind s g)

mutate :: (x -> y) -> Stream a x -> Stream a y
mutate fn Empty = Empty
mutate fn (Delay f) = Delay (mutate fn f)
mutate fn (Filled st x f) = Filled st (fn x) (mutate fn f)
mutate fn (Suspend s) = Suspend (mutate fn s)

(∧), conj :: Goal a x -> Goal a y -> Goal a (x,y)
(∧) = conj
conj f g = Goal impl'
    where impl' st = bind (enumerate f st) (comma (enumerate g))
          comma h x = mutate (\y -> (x,y)) . h

(∨), disj :: Goal a x -> Goal a x -> Goal a x
(∨) = disj
disj f g = Goal disj'
    where disj' st = mplus (enumerate f st) (enumerate g st)

The stream trapping is provided by "suspend" and "trap" -functions.

suspend :: Goal u a -> Goal u a
suspend g = Goal (\st -> Suspend (enumerate g st))

trapshutter :: Stream a b -> Stream a b
trapshutter Empty = Empty
trapshutter (Delay s) = Delay (trapshutter s)
trapshutter (Filled st x s) = Filled st x (trapshutter_open s)
trapshutter (Suspend s) = Empty

trapshutter_open :: Stream a b -> Stream a b
trapshutter_open Empty = Empty
trapshutter_open (Delay s) = Delay (trapshutter_open s)
trapshutter_open (Filled st x s) = Filled st x (trapshutter_open s)
trapshutter_open (Suspend s) = s

trap :: Goal u a -> Goal u a
trap g = Goal (\st -> trapshutter (enumerate g st))

kreify is used for retrieving results from the logic program. You get quite useful results this way.

-- Apply all substitutions to allow readout
kreify :: Unifiable a => a -> Goal a a
kreify x = Goal (\st -> Filled st (subs (fst st) x) Empty)

Delaying gives another stream a turn to produce values. Despite this feature, the proof search does not appear to produce a balanced stream of values.

-- Delay recursive expressions.
delay :: Stream a b -> Stream a b
delay Empty = Empty
delay (Delay s) = Delay (delay s)
delay (Filled st x s) = Filled st x (Delay (delay s))
delay (Suspend s) = Suspend (delay s)

zzz :: Goal a x -> Goal a x
zzz f = Goal (\st -> delay (enumerate f st))

Finally, this is just some convenience to let us write logic programs a bit more easily in Haskell.

-- Provide fresh variable as output
freshvar :: Unifiable a => Goal a a
freshvar = fresh pure

instance Functor (Goal a) where
    fmap f x = Goal (mutate f . enumerate x)

instance Applicative (Goal a) where
    pure x = Goal (\st -> Filled st x Empty)
    (<*>) x y = Goal (\st -> bind (enumerate x st) impl')
        where impl' f = mutate f . enumerate y

instance Monad (Goal a) where
    (>>=) f g = Goal (\st -> bind (enumerate f st) (\y -> (enumerate (g y))))

The proof search structure can be derived from the checking routine, though there's one thing that needs to be taken into account.

If you convert the normal-form type checking into a search program, you will find it gets stuck if you give it (a → a) → a .

That is the type of the fixed point, and you may guess what happens. The proof search engine finds the result f _ , but it wants to fill the hole, so it fills it with f _ , resulting in f (f _) , and it keeps going without ever returning a result.

Some proof search engines have a search depth limit to fix this. It's a bit of an inconvenience though, because the proof search must then take the depth limit as an argument.

Technically, the engine searching ever deeper isn't an issue in itself. Consider the following example: (a → a) → a → a .

Actually let me show you the difference.

*CAM2> fix_test
Nope
*CAM2> nat_test
Solutions:
Neut (Snd Id)
Neut (Ap (Fst Id) (Neut (Snd Id)))
Neut (Ap (Fst Id) (Neut (Ap (Fst Id) (Neut (Snd Id)))))
Neut (Ap (Fst Id) (Neut (Ap (Fst Id) (Neut (Ap (Fst Id) (Neut (Snd Id)))))))

I'll show these programs later. You can see that the second one produces results because there's a base case.

This is achieved by constraining the neutral form search. Before the procedure is allowed to search for f (_:a) : a , it must produce some x : a first. If the base case is available, it can fill in f _ as well.

Ok, here is the search program.

proofsearch :: Ty -> Ty -> Goal Ty Norm
proofsearch a b = search_norm [] a b

search_norm :: [Ty] -> Ty -> Ty -> Goal Ty Norm
search_norm ctx a b = (neut ∨ term ∨ zzz exp ∨ zzz prod)
    where term = do
              b ≡ Terminal
              atom_constraint_check ctx
              pure Const
          exp = do
              c <- freshvar
              d <- freshvar
              b ≡ Exp c d
              atom_constraint_check ctx
              x <- search_norm ctx (Prod a d) c
              pure (Abs x)
          prod = do
              c <- freshvar
              d <- freshvar
              b ≡ Prod c d
              atom_constraint_check ctx
              x <- search_norm ctx a c
              y <- search_norm ctx a d
              pure (Pair x y)
          neut = do
              b' <- kreify b
              atomic_term b'
              capture_recursives suspend ctx b'
              trap $ do
                  trace <- prooftrace ctx a b'
                  neut <- search_neut a (b':ctx) trace Id
                  pure (Neut neut)

I should probably reimplement this with μKanren extended into a constraint solver. The atom_constraint_check is everywhere because we have to ensure those variables stay abstract.

To see how the suspension may be implemented wrong for now, think about how search_norm [x] produces different results than search_norm [] under multiple traps.

Before the trap is set up, we have to set the trapping variable. I guess there's a better way, but for now we trace a spine.

data ProofTrace = PT0 | PT1 | PTA Ty deriving (Show)

prooftrace :: [Ty] -> Ty -> Ty -> Goal Ty [ProofTrace]
prooftrace ctx a b = (atomic ∨ zzz prod0 ∨ zzz prod1 ∨ zzz ap)
    where atomic = do
              kreify a >>= atomic_term
              a ≡ b
              atom_constraint_check ctx
              pure []
          prod0 = do
              x <- freshvar
              y <- freshvar
              a ≡ Prod x y
              kreify b >>= atomic_term
              atom_constraint_check ctx
              rest <- prooftrace ctx x b
              pure (PT0:rest)
          prod1 = do
              x <- freshvar
              y <- freshvar
              a ≡ Prod x y
              kreify b >>= atomic_term
              atom_constraint_check ctx
              rest <- prooftrace ctx y b
              pure (PT1:rest)
          ap = do
              x <- freshvar
              y <- freshvar
              a ≡ Exp x y
              kreify b >>= atomic_term
              atom_constraint_check ctx
              rest <- prooftrace ctx x b
              pure (PTA y:rest)

search_neut fills the spine up with the neutral values it requires.

search_neut :: Ty -> [Ty] -> [ProofTrace] -> Neut -> Goal Ty Neut
search_neut _ _ [] neut = pure neut
search_neut a ctx (PT0:tail) neut = search_neut a ctx tail (Fst neut)
search_neut a ctx (PT1:tail) neut = search_neut a ctx tail (Snd neut)
search_neut a ctx (PTA b:tail) neut = do
    norm <- search_norm ctx a b
    search_neut a ctx tail (Ap neut norm)

The capture_recursives actually suspends the values if they hit the context.

capture_recursives :: (Goal Ty () -> Goal Ty ()) -> [Ty] -> Ty -> Goal Ty ()
capture_recursives _ [] _ = pure ()
capture_recursives susp (x:xs) y = do
    x' <- kreify x
    atomic_term x'
    if x' == y
    then susp (pure ())
    else capture_recursives (suspend . susp) xs y

Proof search tends to get stuck if we hit branches with few or no results in them. This happens if we check a trivial constraint too late: the constraint ends up discarding the whole branch, but we still search through it.

One such trivially result-removing constraint is our requirement that we look for η-long normal forms. Variables that end up on the right side of Neut must be constrained to stay as variables.

atom_constraint_check :: [Ty] -> Goal Ty ()
atom_constraint_check [] = pure ()
atom_constraint_check (x:xs) = do
    kreify x >>= atomic_term
    atom_constraint_check xs

atomic_term :: Ty -> Goal Ty ()
atomic_term (Var _) = pure ()
atomic_term (Some _) = pure ()
atomic_term _ = false

There are a few things I could do to improve the proof search here. But the whole thing is just an ongoing experiment, a proof-of-concept, at this stage.

I was going to present how type inference is just a form of proof search. To do that, we have to count occurrences of variables in the type.

occurrences :: Ty -> Map Int Int -> Map Int Int
occurrences Terminal m = m
occurrences (Var x) m = Map.alter increment x m
    where increment Nothing = Just 1
          increment (Just n) = Just (n+1)
occurrences (Some x) m = Map.alter increment (-x) m
    where increment Nothing = Just 1
          increment (Just n) = Just (n+1)
occurrences (Exp x y) m = occurrences x (occurrences y m)
occurrences (Prod x y) m = occurrences x (occurrences y m)

each_var_occurs_only_once t = Map.foldr f True (occurrences t Map.empty)
    where f count j = (j && (count == 1))
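A quick standalone check of the occurrence counter (definitions reproduced, with Map.foldr in place of the older containers' Map.fold): the type a → b passes the once-only test, while the diagonal a → a fails it.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

data Ty = Terminal | Prod Ty Ty | Exp Ty Ty | Some Int | Var Int deriving (Show, Eq)

occurrences :: Ty -> Map Int Int -> Map Int Int
occurrences Terminal m = m
occurrences (Var x) m = Map.alter increment x m
    where increment Nothing = Just 1
          increment (Just n) = Just (n+1)
occurrences (Some x) m = Map.alter increment (-x) m
    where increment Nothing = Just 1
          increment (Just n) = Just (n+1)
occurrences (Exp x y) m = occurrences x (occurrences y m)
occurrences (Prod x y) m = occurrences x (occurrences y m)

each_var_occurs_only_once :: Ty -> Bool
each_var_occurs_only_once t = Map.foldr f True (occurrences t Map.empty)
    where f count j = j && (count == 1)
```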

If a variable occurs more than once, we discard the result. We still have the problem that I described earlier, though, which means discarding a lot of results.

But it still produces some results that we can look at.

type_inference_demonstration = do
    x <- freshvar
    prog <- proofsearch x x
    x' <- kreify x
    if each_var_occurs_only_once x'
    then pure (x', prog)
    else false

*CAM2> type_inference_demonstration
Solutions:
(Var 0,Neut Id)
(Terminal,Const)
(Exp (Var 5) (Var 8),Abs (Neut (Ap (Fst Id) (Neut (Snd Id)))))
(Prod (Var 5) (Var 6),Pair (Neut (Fst Id)) (Neut (Snd Id)))

Do these seem familiar? They're descriptions of identity natural transformations: natural transformations that form an identity morphism for a functor.

It was quite an interesting thing to find out. I thought type inference was some weird thing, the opposite of proof search. Turns out it's not; it's just a simple case of proof search.

The next interesting thing to do with a proof search engine is to check whether the routines are right.

all_programs :: Goal Ty (Norm, Ty, Ty, Bool)
all_programs = do
    x <- freshvar
    y <- freshvar
    prog <- proofsearch x y
    x' <- kreify x
    y' <- kreify y
    pure (prog, x', y', check x' y' prog)

Like I said, the search is not quite balanced, so it returns neutral terms only when the search happens to be entirely unconstrained.

*CAM2> all_programs
Solutions:
(Neut Id,Var 1,Var 1,True)
(Neut (Fst Id),Prod (Var 1) (Var 3),Var 1,True)
(Neut (Snd Id),Prod (Var 2) (Var 1),Var 1,True)
(Neut (Ap Id Const),Exp (Var 1) Terminal,Var 1,True)
(Neut (Ap Id (Abs (Neut (Snd Id)))),Exp (Var 1) (Exp (Var 4) (Var 4)),Var 1,True)
(Neut (Ap Id (Pair Const Const)),Exp (Var 1) (Prod Terminal Terminal),Var 1,True)
(Neut (Ap Id (Abs (Neut (Fst (Snd Id))))),Exp (Var 1) (Exp (Var 4) (Prod (Var 4) (Var 9))),Var 1,True)
(Neut (Ap Id (Pair Const (Abs (Neut (Snd Id))))),Exp (Var 1) (Prod Terminal (Exp (Var 6) (Var 6))),Var 1,True)
(Neut (Ap Id (Abs (Neut (Snd (Snd Id))))),Exp (Var 1) (Exp (Var 4) (Prod (Var 8) (Var 4))),Var 1,True)

Though those neutral terms seem to typecheck, like they should.

Once you have a proof search, a simple check is to have it find the SKI combinators. The proof search is entirely broken if it cannot do this.

i_program :: Goal Ty Norm
i_program = do
    proofsearch Terminal (Exp (Some 0) (Some 0))

k_program :: Goal Ty Norm
k_program = do
    let x = Terminal
        y = (Exp (Exp (Some 0) (Some 1)) (Some 0))
    proofsearch x y

s_program :: Goal Ty Norm
s_program = do
    let a = Some 0
        b = Some 1
        c = Some 2
        x = Terminal
        y = (Exp (Exp (Exp c a) (Exp b a)) (Exp (Exp c b) a))
    proofsearch x y

*CAM2> i_program
Solutions:
Abs (Neut (Snd Id))
*CAM2> k_program
Solutions:
Abs (Abs (Neut (Snd (Fst Id))))
*CAM2> s_program
Solutions:
Abs (Abs (Abs (Neut (Ap (Ap (Snd (Fst (Fst Id))) (Neut (Snd Id))) (Neut (Ap (Snd (Fst Id)) (Neut (Snd Id))))))))

Some types have a unique program. In those cases the proof could even be omitted.

swap_program :: Goal Ty Norm
swap_program = do
    let a = Some 0
        b = Some 1
        x = Prod a b
        y = Prod b a
    proofsearch x y

prod_programs :: Goal Ty Norm
prod_programs = do
    let a = Some 0
        x = Prod a a
        y = Prod a a
    proofsearch x y

*CAM2> swap_program
Solutions:
Pair (Neut (Snd Id)) (Neut (Fst Id))
*CAM2> prod_programs
Solutions:
Pair (Neut (Fst Id)) (Neut (Fst Id))
Pair (Neut (Fst Id)) (Neut (Snd Id))
Pair (Neut (Snd Id)) (Neut (Fst Id))
Pair (Neut (Snd Id)) (Neut (Snd Id))

There are 4 programs of form (a*a) → (a*a) but only 1 program with the form a*b → b*a .

Of course, once we have a nice way to generate programs, we can generate programs that fit together and then run them.

composition_test :: Goal Ty (Ty, Ty, Norm, Norm)
composition_test = do
    x <- freshvar
    y <- freshvar
    z <- freshvar
    prog1 <- proofsearch x y
    prog2 <- proofsearch y z
    x' <- kreify x
    z' <- kreify z
    pure (x', z', prog1, prog2)

composition_test_run = do
    (x, y, prog1, prog2) <- composition_test
    pure (nnbe x y (meaning prog1 ++ meaning prog2))

*CAM2> composition_test
Solutions:
(Var 2,Var 2,Neut Id,Neut Id)
(Prod (Var 2) (Var 4),Var 2,Neut Id,Neut (Fst Id))
(Prod (Var 3) (Var 2),Var 2,Neut Id,Neut (Snd Id))
(Exp (Var 2) Terminal,Var 2,Neut Id,Neut (Ap Id Const))
(Exp (Var 2) (Exp (Var 5) (Var 5)),Var 2,Neut Id,Neut (Ap Id (Abs (Neut (Snd Id)))))
(Exp (Var 2) (Prod Terminal Terminal),Var 2,Neut Id,Neut (Ap Id (Pair Const Const)))
(Exp (Var 2) (Exp (Var 5) (Prod (Var 5) (Var 10))),Var 2,Neut Id,Neut (Ap Id (Abs (Neut (Fst (Snd Id))))))
*CAM2> composition_test_run
Solutions:
Neut Id
Neut (Fst Id)
Neut (Snd Id)
Neut (Ap Id Const)
Neut (Ap Id (Abs (Neut (Snd Id))))
Neut (Ap Id (Pair Const Const))
Neut (Ap Id (Abs (Neut (Fst (Snd Id)))))
Neut (Ap Id (Pair Const (Abs (Neut (Snd Id)))))

Here's the "nat" and "fix" program generators.

nat_test :: Goal Ty Norm
nat_test = do
    let x = Prod (Exp (Some 0) (Some 0)) (Some 0)
        y = Some 0
    proofsearch x y

fix_test :: Goal Ty Norm
fix_test = do
    let x = Exp (Some 0) (Some 0)
        y = Some 0
    proofsearch x y

By the way, if you added a fixed point to the language, every type would have a trivial program.

I think we aren't far away from having a programming language founded mainly on category theory. Two things are missing.

We'd need some way to abstract over arithmetic operations, and it's not obvious how to do that outright. The other is a way to translate from denotational structures into machine datatypes and APIs (you can use linear logic to model interactions and mutable state).

Note that the logic programs written here were very difficult to construct. They easily get stuck if you don't write them just right. Still, the results are impressive enough that it was worthwhile.

You can download the CAM2.hs and fiddle around with it.