forall a. ((a->a)->a->a)

Boehm-Berarducci encoding is often confused with Church encoding. First of all, Church encoding, which represents data types in an untyped lambda-calculus, is not tight. Without types, it is impossible to separate lambda-terms that encode some data type, from the terms that represent no value of that data type. The main distinction between the two approaches is subtle. In a word, Church encoding only describes introductions whereas Boehm and Berarducci also define the elimination, or pattern-matching, on encoded data types. For example, the list [1,2] is represented in either encoding as the term \f x -> f 1 (f 2 x) . It is easy to write the function head , which applied to the above lambda-term returns 1 . It is much harder to define tail , which applied to the above term returns \f x -> f 2 x , the encoding of [2] . The reader may wish to pause and try writing this function. The solution, which is the generalization of the predecessor on Church numerals, seems ad hoc. It is not clear how to extend it to the encodings of other data structures and to make it automatic. Boehm and Berarducci were the first to show `the trick', the systematic method of writing pattern-matching functions on encodings of any algebraic data type.

The Boehm and Berarducci's paper has many great insights. Alas, the generality of the presentation makes the paper very hard to understand. It has a Zen-like quality: it is incomprehensible unless you already know its results. I understood the representation of pattern-matching because I had re-discovered `the trick' independently and was specifically looking for it in the paper. This article is an attempt to explain and illustrate the main ideas of the paper on a simple example.

Recall, the Boehm-Berarducci encoding translates algebraic data types and the operations on them into System F, which contains only abstractions and applications and base type constants. The encoding applies only to what is now called `strictly positive' data types: there are no functions in the domain of constructors. (See Remark 1.1 (d) in their paper). For example, the encoding does not apply to the higher-order abstract syntax. The encoding represents recursive (inductive, to be precise) data types as System F terms with non-recursive, but higher-rank types. Informally, the encoding replaces the type-level primitive recursion with the term-level primitive recursion and the higher-rank type, of the order at most two. The encoding can be implemented in OCaml or Haskell, both of which support higher-rank types. As we shall see, the encoding has an inherent severe performance problem due to repeated re-encoding.