Prerequisites:

This article assumes the reader already knows about:

Higher Kinded Type (HKT)

Fix Point Type

Fix Point Type is a type that generically describes recursive data. By abstracting over a recursive data structure, we can define a generic operation that works on any recursive data structure.

To learn more about HKT and Fix Point Type, check these links:

Setup

You can run all code snippets in ammonite-repl, for ADT definition, you need to enclose them in a curly brace like { ...adt definition } before pasting to ammonite repl.

What this post is about?

This post aims to document how to retain type information of Generalized Algebraic Data Type (GADT) with Fix Point Type. I will illustrate the technique by refactoring a GADT with direct recursion into another GADT without recursion and implement the catamorphism method for it.

This technique can be useful when the type param of your original recursive GADT propagates through the layers of your recursive data structure. It might not make sense now, so I suggest carry on reading :).

The following code snippet shows a GADT that describes how to query a recursive data (e.g., JSON). It is super simple; it can only query String or Boolean by path. But we can add things like QueryByWithCondition later (not in the scope of this article).

// simplified sealed trait Query[A]

case object QueryString extends Query[String]

case object QueryBool extends Query[Boolean]

case class QueryPath[A](path: String, next: Query[A]) extends Query[A] // sample data // {

// "oh": {

// "my": "zsh"

// }

// }

// expression to query _.oh.my from JSON above val expression = QueryPath("oh", QueryPath("my", QueryString))

As you might have noticed, the Query ADT is recursive, and the type param A is recursive too, when constructing QueryPath which extends Query , the type param is determined by the next argument of QueryPath .

def typeMatch[A, B](a: A, b: B)(implicit eq: A=:=B) = () val queryString = QueryPath("my", QueryString)

val queryNestedString = QueryPath("my", QueryPath("oh", QueryString)) typeMatch(queryString, queryNestedString) // compiles

Having a type param is useful because, for any recursive query, we can know the result type statically without traversing the tree, which is a runtime property.

The following steps are common when applying recursion schemes:

Refactor Recursive ADT to abstract recursion away. Construct recursive data structure using Fix Point Type. Define a Functor instance for the new Non-Recursive ADT. Implement (or use them from a library) operations like catamorphism and anamorphism on our recursive data structure.

(For common use cases, we can use operations from recursion schemes library, e.g. Matryoshka.)

We are going to follow the same steps here.

Step 1: Refactor GADT to remove recursion

The goal of this step is to remove recursion from our ADT, in our case, QueryPath should no longer refer to Query in its definition

sealed trait QueryF[+F[_], A]

case object QueryStringF extends QueryF[Nothing, String]

case object QueryBoolF extends QueryF[Nothing, Boolean]

case class QueryPathF[F[_], A](path: String, next: F[A]) extends QueryF[F, A]

Above is the GADT after transformation. They are suffixed by F to differentiate with the previous GADT. A few things to note:

We added a Higher Kinded Type (HKT) param F[_] to Query base trait, and make it a covariant. On QueryStringF , we specify Nothing on the F[_] position, so that QueryStringF is a subtype of QueryF , same for QueryBoolF. We removed the recursion on QueryPathF , by abstracting next: QueryF[A] into next: F[A] , this is why we have to introduce F[_] on QueryF , so that we can abstract over the original Query[A] which is a HKT.

Now let’s see how to form a query using the new ADT.

// compiles with `-Ypartial-unification` compiler flag

val expression = QueryPathF("oh", QueryPathF("my", QueryStringF))

As you can see, it’s similar to our previous example, which is a hint that our new generic representation is as powerful as the previous, more concrete representation in terms of expressivity.

Step 2: Construct recursive data structure using Fix Point Type

Let’s revise what Fix Point Type is. fix is a structure that conforms to the following rule:

fix(f) = f(fix(f)) for all f

The rule above is abstract, in the sense that it can apply to many different domains. For example, when f is a function, then fix is commonly known as Y-combinator which describes a recursive function, when f is a type then we get Fix Point Type , which describes a recursive type.

Below is a straightforward way to define Fix point type in Scala:

case class Fix[F[_]](unfix: F[Fix[F]])

It follows the rule above in that every Fix[F] is equivalent to F[Fix[F]] for all F.

Unfortunately, we cannot use this definition of Fix to construct a recursive version of QueryF .

// does not compile

Fix(QueryPathF("key", Fix(QueryStringF))

Because the type signature does not match, the Fix constructor takes a type param with * -> * kind (commonly pronounced as Star to Star), encoded as F[_] in Scala, but QueryF has kind of (* -> *) -> * -> * , as it takes a F[_] and A and returns a proper type (i.e. a Star Type).

The solution to this problem is to create a new HFix type that matches the type signature of QueryF

case class HFix[F[_[_], _], A](unfix: F[HFix[F, ?], A])

This looks a bit scary, let’s try to break it down.

The first type param F[_[_], _] is a type constructor that takes 2 type params, a HKT * -> * and a proper type *

is a type constructor that takes 2 type params, a HKT and a proper type The second type param is just a proper type, bound to the symbol A

unfix becomes F[HFix[F, ?], A] , remember F takes a * -> * on the first type param, which is the kind of HFix[F, ?] . The second type param would then be A so that we propagate it across layers of recursion

It’s worthwhile to emphasize two points:

It allows us to relate an extra type param between layers (i.e. both HFix[F[_[_], _], A] and F[HFix[F, ?], A] contains type A ). It abstracts over a type with this shape (* -> *) -> * -> * , encoded as F[_[_], _] , which match our QueryF type.

Sample usage of HFix :

def queryString = HFix(QueryStringF: QueryF[HFix[Query,?], String])

def queryBool = HFix(QueryBoolF: Query[HFix[QueryF,?], Boolean])

def queryPath[A](p: String, next: HFix[QueryF, A]) = HFix(QueryPathF(p, HFix(next))) queryPath(

"oh",

queryPath(

"my",

queryString

)

) res32: HFix[QueryF, String] = HFix(QueryPathF("oh", HFix(QueryPathF("my", HFix(QueryStringF)))))

Cool, so we can create a recursive structure that expresses querying a string at the path oh.my , note the type of this structure is HFix[QueryF, String] , which is great because we retain the knowledge that it results in a String .

Step 3: Define Functor instance for GADT

Defining a Functor instance is essential, because it describes how to transform a recursive data structure. Unfortunately, due to the usage of GADT, a standard Functor wouldn’t work, it’s similar to the issue with standard Fix Point Type.

trait Functor[F[_]]

Above is the interface of Functor, we can define Functor for any type with kind * -> * , but Query has a type of (* -> *) -> * -> * as we’ve seen earlier.

The solution is similar to how to solve the issue with Fix; we need to define another Functor like structure that fit our type.

When using ordinary ADT, we want a Functor instance so that we can transform the type param of our ADT, eg. F[A] => F[B] . But now we have QueryF[F[_], A] , which contains two type params. What should we transform?

The most generic way would be to transform both type params but to simplify the problem, we are only transforming the first type param, e.g., QueryF[F[_], A] => QueryF[G[_], A]

import $ivy.`org.typelevel::cats-core:1.6.0` import cats.~> trait HFunctor[F[_[_], _]] {

def hmap[I[_], J[_]](nt: I ~> J): F[I, ?] ~> F[J, ?]

}

This typeClass describes the ability to transform the first type param of an arbitrary type F, where F has kind of (* -> *) -> * -> * , which is what we need.

Next, we need to implement an instance of HFunctor for QueryF :

implicit val queryHFunctor: HFunctor[QueryF] = new HFunctor[QueryF] {

def hmap[I[_], J[_]](nt: I ~> J): QueryF[I, ?] ~> QueryF[J, ?] = {

new (QueryF[I, ?] ~> QueryF[J, ?]) {

def apply[A](a: QueryF[I, A]): QueryF[J, A] = {

a match {

case QueryStringF => QueryStringF

case QueryBoolF => QueryBoolF

case query: QueryPathF[I, A] => QueryPathF(query.path, nt(query.next))

}

}

}

}

}

The implementation is relatively straightforward. We try to handle every case of GADT and change the F[_] type param. For bottom type (i.e. when F is Nothing), we don't have to change anything because we define F as covariant.

Step 4: Implement catamorphism

Finally, after all the ceremony, we are in the position to define catamorphism .

At a high level, catamorphism is an operation that collapses a recursive structure into a single value by collapsing each layer of the data structure recursively.

It is generic in that user can define how to collapse each layer.

It looks like this:

type HAlgebra[F[_[_], _], G[_]] = F[G, ?] ~> G def hCata[F[_[_], _], G[_], I](alg: HAlgebra[F, G],hfix: HFix[F, I])(implicit F: HFunctor[F]): G[I] = {

val inner = hfix.unfix

val nt = F.hmap(

new (HFix[F, ?] ~> G) {

def apply[A](fa: HFix[F, A]): G[A] = hCata(alg, fa)

}

)(inner)

alg(nt)

}

The flow of hCata is similar to normal cata method. By trying to peel off the outer layer of HFix[F, I] , we recursively dive into the innermost layer of the recursive structure. Then we apply HAlgebra while traversing back to the top layer.

Note: You can try to implement anamorphism and other operations :Dl

Now that we’ve defined everything we need, let’s try to use it.

We’ll go through two examples, using the query we created earlier.

1) Print recursive query in a human-readable format:

val nestedQuery = queryPath(

"oh",

queryPath(

"my",

queryString

)

) // a trick to fold into String, this is interesting as it shows that

// generalized type constructor is super powerful; it can represent a

// more specialized type easily type JustString[A] = String // important part: convert each layer of query into a string

val print: HAlgebra[QueryF, JustString] = new HAlgebra[QueryF, JustString] {

override def apply[A](fa: QueryF[JustString, A]): JustString[A] = {

fa match {

case QueryStringF => "as[String]"

case QueryBoolF => "as[Bool]"

case q: QueryPathF[JustString, A] => s"${q.path}.${q.next}"

}

}

} hCata(print, nestedQuery) // res44: JustString[String] = "oh.my.as[String]"

2) Convert our query description to a Circe Decoder:

import $ivy.`io.circe::circe-parser:0.10.0` import io.circe.Decoder val toDecoder: HAlgebra[QueryF, Decoder] = new HAlgebra[QueryF, Decoder] {

override def apply[A](fa: QueryF[Decoder, A]): Decoder[A] = fa match {

case QueryBoolF => Decoder.decodeBoolean

case QueryStringF => Decoder.decodeString

case q: QueryPathF[Decoder, A] =>

Decoder.instance { cursor =>

cursor.get(q.path)(q.next)

}

}

} val decoder = hCata(toDecoder, nestedQuery) val json = parse("""

{

"oh": {

"my": "20202"

}

}

""").right.get decoder.decode(json) // Right("20202")

I tried to use simpler examples. For more complicated examples, check this project:

Conclusion:

We’ve walked through how to apply recursion schemes on a GADT. The basic idea is similar to applying recursion schemes on ordinary ADT. The main difference is that we need to operate on higher order types. For example:

Functor that takes a function ( * -> * ) becomes HFunctor that takes a natural transformation (* -> *) -> (* -> *) .

that takes a function ( ) becomes HFunctor that takes a natural transformation . Fix with shape (* -> *) -> * becomes HFix with shape ((* -> *) -> (* -> *)) -> (* -> *) .

with shape becomes with shape . Algebra with (* -> *) -> * -> * becomes HAlgebra ((* -> *) -> (* -> *)) -> (* -> *) -> (* -> *) .

Notice there’s a common theme: the number of stars doubled in the process. I haven’t fully understood why but I hope this article is still useful.

Credit

I learned most of the techniques described here from Xenomorph by Nuttycom. I find it interesting and thought would be good to write it down.

Thanks, Nuttycom, for sharing your idea with the open source world! I also wish to thank my co-worker Olivier Melois (Baccata on github)for introducing me to the idea of recursion scheme, and Alex for correcting me on notation of Higher Kinded Type.