The setup

Consider a type of terms parameterized over a type of function symbols node and a type of variables var :

data Term node var = VarTerm !var | FunTerm !node !(Vector (Term node var)) deriving (Eq, Ord, Show) instance Functor (Term node) where fmap f (VarTerm var) = VarTerm (f var) fmap f (FunTerm n cs) = FunTerm n (Vector.map (fmap f) cs) instance Monad (Term node) where pure = VarTerm join (VarTerm term) = term join (FunTerm n cs) = FunTerm n (Vector.map join cs)

This is a useful type, since we encode open terms with Term node Var , closed terms with Term node Void , and contexts with Term node () .

The goal is to define a type of substitutions on Term s in the most pleasant possible way. Here's a first stab:

newtype Substitution (node ∷ Type) (var ∷ Type) = Substitution { fromSubstitution ∷ Map var (Term node var) } deriving (Eq, Ord, Show)

Now let's define some auxiliary values related to Substitution :

subst ∷ Substitution node var → Term node var → Term node var subst s (VarTerm var) = fromMaybe (MkVarTerm var) (Map.lookup var (fromSubstitution s)) subst s (FunTerm n ts) = FunTerm n (Vector.map (subst s) ts) identity ∷ Substitution node var identity = Substitution Map.empty -- Laws: -- -- 1. Unitality: -- ∀ s ∷ Substitution n v → s ≡ (s ∘ identity) ≡ (identity ∘ s) -- 2. Associativity: -- ∀ a, b, c ∷ Substitution n v → ((a ∘ b) ∘ c) ≡ (a ∘ (b ∘ c)) -- 3. Monoid action: -- ∀ x, y ∷ Substitution n v → subst (y ∘ x) ≡ (subst y . subst x) (∘) ∷ (Ord var) ⇒ Substitution node var → Substitution node var → Substitution node var s1 ∘ s2 = Substitution (Map.unionWith (λ _ _ → error "this should never happen") (Map.map (subst s1) (fromSubstitution s2)) ((fromSubstitution s1) `Map.difference` (fromSubstitution s2)))

Clearly, (Substitution n v, ∘, identity) is a monoid (ignoring the Ord constraint on ∘ ) and (Term n v, subst) is a monoid action of Substitution n v .

Now suppose that we want to make this scheme encode substitutions that change the variable type. This would look like some type SubstCat that satisfies the module signature below:

data SubstCat (node ∷ Type) (domainVar ∷ Type) (codomainVar ∷ Type) = … ∷ Type subst ∷ SubstCat node dom cod → Term node dom → Term node cod identity ∷ SubstCat node var var (∘) ∷ (Ord v1, Ord v2, Ord v3) → SubstCat node v2 v3 → SubstCat node v1 v2 → SubstCat node v1 v3

This is almost a Haskell Category , but for the Ord constraints on ∘ . You might think that if (Substitution n v, ∘, identity) was a monoid before, and subst was a monoid action before, then subst should now be a category action, but in point of fact category actions are just functors (in this case, a functor from a subcategory of Hask to another subcategory of Hask).

Now there are some properties we'd hope would be true about SubstCat :

SubstCat node var Void should be the type of ground substitutions. SubstCat Void var var should be the type of flat substitutions. instance (Eq node, Eq dom, Eq cod) ⇒ Eq (SubstCat node dom cod) should exist (as well as similar instances for Ord and Show ). It should be possible to compute the domain variable set, the image term set, and the introduced variable set, given a SubstCat node dom cod . The operations I have described should be about as fast/space-efficient as their equivalents in the Substitution implementation above.

The simplest possible approach to writing SubstCat would be to simply generalize Substitution :

newtype SubstCat (node ∷ Type) (dom ∷ Type) (cod ∷ Type) = SubstCat { fromSubstCat ∷ Map dom (Term node cod) } deriving (Eq, Ord, Show)

Unfortunately, this does not work because when we run subst it may be the case that the term we are running substitution on contains variables that are not in the domain of the Map . We could get away with this in Substitution since dom ~ cod , but in SubstCat we have no way to deal with these variables.

My next attempt was to deal with this issue by also including a function of type dom → cod :

data SubstCat (node ∷ Type) (dom ∷ Type) (cod ∷ Type) = SubstCat !(Map dom (Term node cod)) !(dom → cod)

This causes a few problems, however. Firstly, since SubstCat now contains a function, it can no longer have Eq , Ord , or Show instances. We cannot simply ignore the dom → cod field when comparing for equality, since the semantics of substitution change depending on its value. Secondly, it is now no longer the case that SubstCat node var Void represents a type of ground substitutions; in fact, SubstCat node var Void is uninhabited!

Érdi Gergő suggested on Facebook that I use the following definition:

newtype SubstCat (node ∷ Type) (dom ∷ Type) (cod ∷ Type) = SubstCat (dom → Term node cod)

This is certainly a fascinating prospect. There is an obvious category for this type: the Kleisli category given by the Monad instance on Term node . I am not sure if this actually corresponds to the usual notion of substitution composition, however. Unfortunately, this representation cannot have instances for Eq et al. and I suspect it could be very inefficient in practice, since in the best case it will end up being a tower of closures of height Θ(n) , where n is the number of insertions. It also doesn't allow computation of the domain variable set.

The question