guest post by Georgios Bakirtzis and Christian Williams

We continue the Applied Category Theory Seminar with a discussion of the paper Introduction to Turing Categories by Hofstra and Cockett. Thank you to Jonathan Gallagher for all the great help in teaching our group, and to Daniel Cicala and Jules Hedges for running this seminar.

Introduction

Turing categories enable the categorical study of computability theory, and thereby abstract and generalize the classical theory of computable functions on the natural numbers.

A Turing category is a category $C$ equipped with:

cartesian products – to pair (the codes of) data and programs,

a notion of partiality – to represent programs (morphisms) which do not necessarily halt,

and a Turing object $A$ – to represent the “codes” of all programs.

The Turing object acts as a weak exponential for all pairs of objects in the category. While this is impossible in most “set-theoretical” categories, it is natural in computer science, because primitive formal languages—the “machine code” of the higher languages—are untyped (really one-typed).

For example, the untyped $\lambda$-calculus is represented by a reflexive object $A$ in a cartesian closed category: $A^A$ is a retract of $A$, via a retraction $app: A\to A^A$, which sends a $\lambda$-term $M$ to the action induced by applying it, and a section $lam: A^A\to A$, which sends a term $M(-)$ of function type to the lambda term $\lambda x.M(x)$. The morphism $\bullet_A := eval\circ (app\times A): A\times A\to A$ is a Turing morphism, meaning that every morphism $f: A^n\to A$ has a “code” $(f): 1\to A$ such that $f = \bullet_A^{n} \circ ((f)\times A^n)$.

Turing categories are closely connected with partial combinatory algebras, magma objects with a certain form of “computational completeness”. The paradigmatic example is the “SK combinator calculus”, a variable-free representation of the $\lambda$-calculus.

Turing categories and partial combinatory algebras (PCAs) are in a certain sense equivalent, which constitutes the main result of this work and justifies the notion of Turing category. (This is made into a precise equivalence of categories between relative PCAs and Turing categories over a fixed base category in the paper Categorical Simulations.)

Our purpose is to introduce the reader to some basic ideas in computability, and demonstrate how they can be expressed categorically.

Computability Theory

In the early 1900s, we focused intently on determining “what can we prove (with formal systems)?”, or analogously, “what can we compute (with computers)?”. This simple question inspired great advances in logic, mathematics, and computer science, and motivated their mutual development and deepening interconnection.

Three main perspectives emerged independently in order to answer this question:

Alan Turing thought about “hardware”, with Turing machines;

Alonzo Church thought about “software”, with the $\lambda$-calculus; and

Kurt Gödel thought about logic, with general recursive functions.

The Church-Turing thesis states that these are equivalent notions of computation: a function $f:\mathbb{N}\to \mathbb{N}$ is Turing computable iff $f$ can be written as a closed $\lambda$-term, iff $f$ is general recursive. This has led computer scientists to believe that the concept of computability is accurately characterized by these three equivalent notions.

Stephen Kleene, a student of Church, developed and unified these ideas in his study of “metamathematics”, focusing on the perspective of partial recursive functions (Introduction to Metamathematics – S.C. Kleene), which are equivalent to Gödel’s.

We can generate the class of partial recursive functions from basic functions and operators, and utilize the idea of Gödel numbering to consider the natural numbers both as the space of all data and all programs. The two main theorems which characterize an effective enumeration of all such functions—the Universality and Parameter theorems described below—are the original motivation for the definition of Turing categories.

Recursive Functions

Computability is humbly rooted in simple arithmetic. Using only three kinds of primitive functions, we can build up all computable functions, the partial recursive functions:

(1)
$$\begin{array}{lll} \text{constant: } & C_a(x_1,\dots ,x_n) & = a\\ \text{successor: } & S(x) & = x+1\\ \text{projection: } & P_i^k(x_1,\dots ,x_k) & = x_i \end{array}$$
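These three families are easy to realize concretely. A minimal Python sketch (the helper names `constant`, `successor`, `projection` are our own):

```python
# The three basic function families, as Python closures (names are ours).
def constant(a):
    """C_a: the constant function returning a, at any arity."""
    return lambda *xs: a

def successor(x):
    """S(x) = x + 1."""
    return x + 1

def projection(i):
    """P_i^k(x_1, ..., x_k) = x_i  (1-indexed, as in the text)."""
    return lambda *xs: xs[i - 1]
```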

The set of primitive recursive functions, $R$, is generated by two operators. First, we of course need composition:

(2)
$$\text{composition: } \quad (g_i:\mathbb{N}^k\to \mathbb{N})\in R \,\wedge\, (h:\mathbb{N}^n\to \mathbb{N})\in R \implies h\circ (g_1,\dots,g_n) \eqqcolon S_k^n(h,g_1,\dots ,g_n)\in R.$$

Second, to derive real computational power we need the eponymous recursion:

(3)
$$\begin{array}{ll} \text{recursion: } & (g:\mathbb{N}^k\to \mathbb{N})\in R \,\wedge\, (h:\mathbb{N}^{k+2}\to \mathbb{N})\in R\\ & \implies R^k(g,h) \eqqcolon f\in R,\\ & \quad f(0,\overline{x}) = g(\overline{x})\\ & \quad f(y+1,\overline{x}) = h(y,f(y,\overline{x}),\overline{x}) \end{array}$$

This can be understood as a “for loop”—$g$ represents the initial parameters, $y$ is the counter, and $h$ describes a successive evaluation of the accumulated values—these are combined into the function $f$:

(4)
$$\begin{array}{l} f(n,\overline{x})\coloneqq\\ \{\; y = g(\overline{x});\\ \;\;\text{for } (i = 0;\, n)\\ \;\;\;\;\{\; y = h(i, y, \overline{x}) \;\}\;\} \end{array}$$

Now we have enough to do arithmetic, and we’re off. For example, we can define addition: numbers are sequences of successors, and adding is simply a process of peeling the $S$’s off of one number and appending them to the other, until the former reaches 0:

(5)
$$\begin{array}{ll} \text{base step: } & add(0,a) = g(a) = a = P_1^1(a)\\ \text{induction: } & add(S(b),a) = S(add(b,a)), \quad h(b,c,a) \coloneqq S(P_2^3(b,c,a))\\ \text{recursion: } & add := R^2(P_1^1,\,S_1^3(S,P_2^3)) \end{array}$$

As a simple demonstration:

(6)
$$\begin{array}{ll} add(2,1) & = S_1^3(S,P_2^3)(1,add(1,1),1)\\ & = S(add(1,1))\\ & = S(S_1^3(S,P_2^3)(0,add(0,1),1))\\ & = S(S(P_1^1(1)))\\ & = S(S(S(0))) = 3. \end{array}$$
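The recursion operator and the derivation of `add` can be checked directly. A minimal Python sketch, with `recursion` standing in for $R^k$ implemented as the “for loop” described above (function names are ours):

```python
# R(g, h): f(0, xs) = g(xs); f(y+1, xs) = h(y, f(y, xs), xs).
def recursion(g, h):
    def f(y, *xs):
        acc = g(*xs)
        for i in range(y):
            acc = h(i, acc, *xs)
        return acc
    return f

# add = R(P_1^1, S o P_2^3): peel successors off the first argument,
# appending them to the second.
add = recursion(lambda a: a,                  # g = P_1^1
                lambda y, prev, a: prev + 1)  # h = S(P_2^3(y, prev, a))
```

Unfolding `add(2, 1)` reproduces the evaluation above, ending at 3.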

Of course, not all computations actually halt, and a diagonalization argument shows that not all Turing-computable functions are primitive recursive. To complete the equivalence, the more general partial recursive functions are obtained by adding a minimization operator, which takes a $(k+1)$-ary function $f$ to the $k$-ary function that searches for a minimal zero of $f$ in its first coordinate, and returns only if one exists:

(7)
$$\begin{array}{l} \text{minimization: } \quad (f:\mathbb{N}^{k+1}\to \mathbb{N})\in R \,\wedge\, (f \text{ total}) \implies \mu(f)\in R,\\ \mu(f)(x_1,\dots ,x_k) = z \iff (f(z,x_1,\dots ,x_k) = 0) \,\wedge\, (\forall i \leq z-1 \;\; f(i,x_1,\dots ,x_k) \geq 1). \end{array}$$

This allows for “unbounded minimization of safe predicates”, to check the consistency of rules on infinitary data structures. For example, with this operator we can explicitly construct a universal Turing machine. The outputs of $\mu$ are certainly partial in general, because we can construct a predicate whose minimal zero would decide the halting problem.
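A sketch of $\mu$ in Python. Since a genuine search may diverge, we model “undefined” with a cutoff; the `bound` parameter is an artifact of the illustration, not of the theory:

```python
def minimization(f, bound=10_000):
    """mu(f)(xs): least z with f(z, *xs) == 0, scanning upward.
    The search is genuinely partial; we truncate it at `bound`
    and return None for "undefined"."""
    def mu_f(*xs):
        for z in range(bound):
            if f(z, *xs) == 0:
                return z
        return None
    return mu_f

# Example: ceiling of the square root, as an unbounded search.
ceil_sqrt = minimization(lambda z, x: 0 if z * z >= x else 1)
```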

All Turing-computable functions are represented by partial recursive functions, translating the complexity of Turing machines to a unified mathematical perspective of functions of natural numbers.

Gödel Numbering

Upon first hearing the Church-Turing thesis, one may ask “why do we only care about functions of natural numbers?” By the simple idea of Gödel numbering, we need only consider (the partial recursive subset of) the hom-set $Set(\mathbb{N}, \mathbb{N})$ to think about the big questions of computability.

Kurt Gödel proved that a formal language capable of expressing arithmetic cannot be both consistent, meaning it does not deduce a contradiction, and complete, meaning every sentence in the language can be either proven or disproven. The method of proof is remarkably simple: as a language uses a countable set of symbols, operations, and relations, we can enumerate all terms in the language. We can then use this enumeration to create self-referential sentences which must be independent of the system.

The idea is to use the fundamental theorem of arithmetic. Finite lists of natural numbers can be represented faithfully as single natural numbers, using an encoding with prime powers: order the primes $\{p_i\}$, and define the embedding

(8) $f:\mathbb{N}^k\to \mathbb{N}$

(9) $[n_1,\dots,n_k]\mapsto \prod_i p_i^{n_i}.$

This can be applied recursively, to represent lists of lists of numbers, giving encoding functions enc for any depth and length of nested lists. The decoding, dec, is repeated factorization, forming a tree of primes. For example,

(10) $enc([[1,4,8,2],[1,3]]) = 2^{2\cdot 3^4\cdot 5^8\cdot 7^2}\cdot 3^{2\cdot 3^3}$

(11) $dec(3553) = [0,0,0,0,1,0,1,1].$
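The prime-power encoding and its inverse are directly programmable. A sketch, where `enc` handles nested lists and `dec` decodes one level:

```python
def next_prime(p):
    """Smallest prime greater than p (trial division; fine for small inputs)."""
    q = p + 1
    while any(q % d == 0 for d in range(2, int(q ** 0.5) + 1)):
        q += 1
    return q

def enc(lst):
    """Godel-encode a (possibly nested) list of naturals as prod_i p_i^(n_i)."""
    out, p = 1, 2
    for n in lst:
        out *= p ** (enc(n) if isinstance(n, list) else n)
        p = next_prime(p)
    return out

def dec(n):
    """One level of decoding: the exponent of each successive prime."""
    lst, p = [], 2
    while n > 1:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        lst.append(e)
        p = next_prime(p)
    return lst
```

For instance, `dec(3553)` recovers the example above, since $3553 = 11\cdot 17\cdot 19$.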

We can then enumerate the partial recursive functions: because they are generated by three basic functions and three basic operators, we can represent them as lists of natural numbers by the following recursive encoding $\#$:

(12)
$$\begin{array}{ll} \#C_k & = [1,k]\\ \#S & = [2]\\ \#P_i^k & = [3,k,i]\\ \#S_k^n(h,g_1,\dots,g_n) & = [4,\#h,\#g_1,\dots ,\#g_n]\\ \#R^k(g,h) & = [5,k,\#g,\#h]\\ \#\mu(f) & = [6,\#f] \end{array}$$

Then, a partial recursive function gives a list of lists, which can be encoded with Gödel numbering. For example,

(13)
$$\begin{array}{ll} add & = R^2(P_1^1,S_1^3(S,P_2^3))\\ \#add & = [5,2,[3,2,2],[4,[2],[3,4,3]]]\\ enc(\#add) & = 2^5\cdot 3^2\cdot 5^{2^3\cdot 3^2\cdot 5^2}\cdot 7^{2^4\cdot 3^{2^2}\cdot 5^{2^3\cdot 3^4\cdot 5^3}}. \end{array}$$
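The $\#$ encoding can likewise be written as a recursion over syntax trees. A sketch, where the tuple-based tree representation is our own:

```python
# "#" as a recursive function on tagged syntax trees (tags follow the table).
def code(t):
    tag, *args = t
    if tag == 'const':  return [1, args[0]]
    if tag == 'succ':   return [2]
    if tag == 'proj':   return [3, args[0], args[1]]        # [3, k, i]
    if tag == 'comp':   return [4] + [code(a) for a in args]
    if tag == 'rec':    return [5, args[0], code(args[1]), code(args[2])]
    if tag == 'min':    return [6, code(args[0])]
    raise ValueError(tag)

# add, with the indices exactly as printed above:
add_tree = ('rec', 2, ('proj', 2, 2),
            ('comp', ('succ',), ('proj', 4, 3)))
```

Here `code(add_tree)` yields the nested list `[5, 2, [3, 2, 2], [4, [2], [3, 4, 3]]]`.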

So why do we want to enumerate all partial recursive functions? In contrast to the incompleteness theorems, this gives the theory of computability a very positive kind of self-reference, in the form of two theorems which characterize a “good” enumeration of partial recursive functions—this “good” means “effective” in a technical sense, namely that the kinds of operations one performs on functions, such as composition, can be computed on the level of codes.

Kleene’s Recursion Theorems

There is an enumeration of partial recursive functions $R = \{\phi_i\}$, such that the following theorems hold.

Universality: The functions $\phi^{(n)}$ are partial recursive: $\phi^{(n)}(e,x_1,\dots ,x_n) = \phi_e(x_1,\dots ,x_n)$.

Parameter: There are total recursive functions $S_m^n$ such that $\phi^{(n+m)}(e,x_1,\dots ,x_m,u_1,\dots ,u_n)= \phi^{(n)}(S_m^n(e,x_1,\dots ,x_m),u_1,\dots ,u_n).$

The first says $\mathbb{N}$ is a “universal computer”—we have a partial $\phi^{(k)}$ for each arity, which takes (the code of) a program and (the code of) input data, and computes (the code of) the result. The second says this computer supports generalized currying—there are total $S_m^n$ for each arity, which transform $(n+m)$-ary programs into $n$-ary ones, absorbing the code $e$ and the first $m$ arguments into the code of a new program.
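The Parameter theorem really is “currying at the level of codes”. A loose Python analogy, with closures standing in for codes (this elides the genuine content, which is that $S_m^n$ acts on numbers):

```python
# s-m-n as partial application: fix the first arguments of a program,
# returning a "new program" (a closure, standing in for its code).
def smn(program, *fixed):
    return lambda *rest: program(*fixed, *rest)

add3 = lambda x, y, z: x + y + z
add_ten = smn(add3, 4, 6)   # a "code" for the unary function z -> 10 + z
```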

This is a remarkable fact—imagine having a computer which comes preloaded with every conceivable program, and one simply inputs the number of the program, and the number of the data. Furthermore, if one does not know which program to use, one can determine even that one step at a time.

Turing categories are meant to provide a minimal setting in which these two theorems hold. The Universality Theorem describes the universal property of the Turing object/morphism pair, and the Parameter Theorem describes the fact that the factorizations of the $n$-ary functions are determined by that of the unary ones, via the Turing morphism.

Restriction Categories

We have already discussed that a notion of partiality is necessary to model computation, because not all computations halt on all inputs. This notion is concretely captured by partial functions, and described categorically through restriction categories.

Formally, a restriction category is a category $C$ endowed with a combinator $\overline{(-)}$, sending

(14) $(f:A\to B) \mapsto (\overline{f}:A\to A)$

such that the following axioms are satisfied.

(15)
$$\begin{array}{lll} \text{1. } & f\overline{f} = f &\\ \text{2. } & \overline{f}\,\overline{g} = \overline{g}\,\overline{f} & \quad dom(f)=dom(g)\\ \text{3. } & \overline{g\overline{f}} = \overline{g}\,\overline{f} & \quad dom(f)=dom(g)\\ \text{4. } & \overline{g}f = f\overline{gf} & \quad cod(f)=dom(g) \end{array}$$

The first axiom states that precomposing a map with its own domain does not change it. The second states that restriction idempotents on a common domain commute, behaving like intersection of domains. The third states that restricting $g\overline{f}$ is the same as intersecting the domains of $g$ and $f$. The fourth states that postcomposing $f$ with the domain of $g$ is the same as first restricting $f$ to where the composite $gf$ is defined.
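We can sanity-check the four axioms in $\mathsf{Par}$, modelling partial functions on a small set as Python dicts (a key being absent means “undefined”):

```python
# Partial functions as dicts; composition and restriction in Par.
def compose(g, f):
    """g after f, defined where f is defined and f(x) lands in dom(g)."""
    return {x: g[f[x]] for x in f if f[x] in g}

def bar(f):
    """The restriction f-bar: identity on the domain of f."""
    return {x: x for x in f}

f = {0: 1, 1: 2}   # undefined at 2
g = {1: 0, 2: 2}   # undefined at 0

assert compose(f, bar(f)) == f                                # axiom 1
assert compose(bar(f), bar(g)) == compose(bar(g), bar(f))     # axiom 2
assert bar(compose(g, bar(f))) == compose(bar(g), bar(f))     # axiom 3
assert compose(bar(g), f) == compose(f, bar(compose(g, f)))   # axiom 4
```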

Some examples of restriction categories are:

(16)
$$\begin{array}{ll} \mathsf{Par} & \text{sets and partial functions; } \overline{f}(x) = x \text{ if } f(x) \text{ is defined, else } \bot.\\ \mathsf{Par}(C,M) & \text{objects of } C \text{ and left-monic spans.}\\ \mathsf{PR} & \text{the sets } \mathbb{N}^k \text{ and partial recursive functions.} \end{array}$$

If $\overline{f}=1_A$, then $f$ is total. Note that all monics are total. The total maps form a subcategory $Tot(C)$, which is of interest in the accompanying paper Total Maps of Turing Categories by Hofstra, Cockett, and Hrubeš. If $e=\overline{e}$, then $e$ is a domain of definition. For any $a\in C$, denote by $O(a)$ the set of domains of definition of $a$. For these examples:

(17)
$$\begin{array}{ll} \mathsf{Par} & O(X) = \{ \text{subsets of } X\}.\\ \mathsf{Par}(C,M) & O(c) = \{ \text{subobjects of } c\}.\\ \mathsf{PR} & O(\mathbb{N}^k) = \{ \text{recursively enumerable subsets of } \mathbb{N}^k\}. \end{array}$$

Enrichment and Cartesian Structure

The notion of restriction entails a partial order on homsets—given $f,g:A\to B$, define $(f\leq g) \coloneqq (f=g\overline{f})$. Intuitively this means that the domain of $f$ is contained in the domain of $g$, and that the two maps agree there. In this enrichment, $O(X)$ forms a meet semilattice.

Accordingly, to define the cartesian structure we need to weaken the terminal object and product to the poset-enriched case.

Partial Terminal Object. Once we have partial maps, the terminal object acts like a “domain classifier”. Rather than having a unique map from each object, we have a natural bijection $C(A,1)\simeq O(A)$: maps into $1$ correspond to domains of definition. For partial recursive functions over $\mathbb{N}$, these are precisely the recursively enumerable subsets.

The restriction terminal object $1$ thus has a weaker universal property: for each $A$ there is a unique total map $!_A: A \to 1$ with $!_1 = 1$, and for each $f: A \to B$ this family of maps must satisfy $!_B f = !_A \overline{f}$.

Partial Product. A pairing $\langle f, g \rangle$ into the partial product $A \times B$ has domain contained in the domains of both $f$ and $g$. This accounts for the fact that computing $f$ and $g$ simultaneously halts only where both halt; the domain of $\langle f, g \rangle$ is therefore the intersection of the domains of $f$ and $g$.

Splitting Idempotents

Turing categories are essentially meant to describe collections that can be computed by the maps of the category. This can be expressed with idempotent splitting. For example, in the “Turing category of computable maps on $\mathbb{N}$”, explained later, a restriction idempotent $e:\mathbb{N}\to \mathbb{N}$ corresponds to a recursively enumerable set. Therefore the splitting of $e$, if it exists, makes this subset explicit as an object of the category. Hence, the idempotent completion of $Comp(\mathbb{N})$ yields the category whose objects are all recursively enumerable sets of natural numbers. This will imply that these categories of computable maps are determined uniquely up to Morita equivalence.

There is, however, a big difference between split restriction categories and general restriction categories. In the former, all domains are “out in the open”, rather than “hiding” as idempotents on some object. If we completely understood the idempotent completion of $Comp(\mathbb{N})$, we would know all recursively enumerable subsets of $\mathbb{N}$; one could argue that the study of computability would be complete. However, Turing categories are meant to abstract away from the natural numbers, in order to distinguish what is essential about computation, and remove anything about arithmetic that might be contingent.

Turing Categories

Turing categories provide an abstract framework for computability: a “category with partiality” equipped with a “universal computer”, whose programs and codes thereof constitute the objects of interest.

Let $C$ be a cartesian restriction category, and let $\tau_{X,Y}:A\times X\to Y$ be a morphism. A morphism $f:Z\times X\to Y$ admits a $\tau_{X,Y}$-index when there exists $h: Z\to A$ such that the following commutes.

The map $\tau_{X, Y}$ can be thought of as an evaluation, and $h$ as the function $\lambda^*(f)$. When every $f: Z \times X \to Y$ admits such a $\tau_{X, Y}$-index, we say $\tau_{X, Y}$ is universal. If a single object $A$ supports a universal $\tau_{X, Y}$ for all $X, Y$, then $C$ is a Turing category and $A$ is a Turing object.

In such a category $C$, an applicative family for the object $A$ is a family of maps $\tau = \{\tau_{X, Y}: A \times X \to Y\}$, where $X$, $Y$ range over all objects of $C$. An applicative family for $A$ is universal if every $\tau_{X, Y}$ is universal; in that case $\tau$ is a Turing structure on $A$, and we can think of a Turing structure on $C$ as a pair $(A, \tau)$.

We can further express the universality of the Turing object: every object $X\in C$ is a retract of $A$, with section given by the $\tau_{1, X}$-index for the projection $\pi_X$.

This extends to all finite powers $1, A, A^2, A^3, \dots$, which indicates two important points about the Turing object:

The cardinality of the set of elements of $A$ is either one or infinite; and

There is an internal pairing operation $A\times A\to A$, which is a section.

The latter morphism generalizes and abstracts the Gödel pairing $\mathbb{N}\times \mathbb{N}\to \mathbb{N}$.

So given such a universal object $A$ and a Turing structure $\tau$, it is natural to wonder whether a universal self-application map $\tau_{A,A}$ is special. It is indeed, and it is called a Turing morphism $\bullet:A\times A\to A$. Such a map, plus the universal property of $A$, is in fact enough to reconstruct the entire Turing structure of $C$.

The classic example is Kleene application $\bullet: \mathbb{N}\times \mathbb{N}\to \mathbb{N}$. In general it can be thought of as a “universal Turing machine”, which takes in a program (a Turing machine) and a datum, and computes the result.

This Turing morphism is the heart of the whole structure—we can construct every universal application $\tau_{X, Y}: A \times X \to Y$ as the composite $A \times X \xrightarrow{A \times m_X} A \times A \xrightarrow{\bullet} A \xrightarrow{r_Y} Y$, where the pairs $(m_X, r_X)$ and $(m_Y, r_Y)$ exhibit $X$ and $Y$ as retracts of $A$.

To verify that this composite is a universal application, consider the corresponding commutative diagram: every object is a retract of $A$, so any morphism $f$ can be computed by representing it in $A$—just as computable functions are computed by representing their data in $\mathbb{N}$.

The important idea is that a Turing object represents the collection of (codes of) all programs in some formal system; the universal applicative family is a method of encoding, and the Turing morphism is the universal program which implements all others. The innovation of Turing categories is that they abstract away from $Set$, and can therefore describe programs which are not explicitly set-theoretical (not necessarily extensional, only intensional).

The axioms of a Turing structure correspond precisely to the Universality and Parameter theorems. This fact culminates in the Recognition Theorem: a cartesian restriction category $C$ is a Turing category iff there is an object $A$ of which every object is a retract, and for which there exists a Turing morphism $\bullet: A \times A \to A$.

Partial Combinatory Algebras

From the Recognition Theorem, we know that Turing categories are essentially generated by a single object equipped with a universal binary application. We can alternatively take the internal view, starting with a magma object $(A,\bullet)$ in a cartesian restriction category $C$, and considering maps $A^n \to A$ which are “computable” with respect to $\bullet$. These maps form a category, and indeed a Turing category, iff $A$ is combinatory complete—essentially meaning that $(A,\bullet)$ is expressive enough to represent the (partial) lambda calculus. First, we should understand combinators.

Combinatory Logic

The practice of variable binding is subtle: when a computation uses a set of symbols as variables, one must distinguish between bound and free variables—the former being placeholders, and the latter being data. In formal languages, preserving this distinction requires side conditions on the semantic rules, which can add baggage to otherwise simple systems. In the $\lambda$-calculus, beta reduction $(\lambda x.M\; N) \to M[x:=N]$ must preclude “variable capture”—accidentally substituting a free variable into a bound position, e.g. $(\lambda y.(\lambda z.(y\;z))\; z) \to \lambda z.(z\; z)$. One must rename the bound variables of $M$ and $N$ so they are distinct from variables in both terms: $(\lambda y.(\lambda z.(y\;z))\; z) \equiv (\lambda y.(\lambda a.(y\;a))\; z) \to \lambda a.(z\; a)$.

This problem was originally confronted by Moses Schonfinkel in the 1920s, who created a theory of combinators—primitive units of computation defined only by function application. Lambda terms can be translated into combinators by a process of abstraction elimination, which does away with variables completely. This can be understood by observing that we only need to handle the three basic things that occur within λ \lambda -abstraction: variables, application, and further abstraction. These are represented by the fundamental combinators S S and K K :

(18)
$$\begin{array}{ll} (((S\; x)\; y)\; z) & = ((x\; z)\; (y\; z))\\ ((K\; x)\; y) & = x \end{array}$$

We can then define a recursive translation of terms:

(19)
$$\begin{array}{lll} T[x] & = x & \\ T[(M\; N)] & = (T[M]\; T[N]) &\\ T[\lambda x.M] & = (K\; T[M]) & (x \text{ not free in } M)\\ T[\lambda x.x] & = ((S\; K)\; K) &\\ T[\lambda x.\lambda y.M] & = T[\lambda x.T[\lambda y.M]] & (x \text{ free in } M)\\ T[\lambda x.(M\; N)] & = ((S\; T[\lambda x.M])\; T[\lambda x.N]) & (x \text{ free in } M \text{ or } N) \end{array}$$

For example, $T[\lambda x.\lambda y. (y\; x)] = ((S\; (K\; (S\; I)))\; ((S\; (K\; K))\; I))$, where $I := ((S\;K)\;K)$. This translation preserves exactly the way the term acts in the $\lambda$-calculus. For more information, see To Mock a Mockingbird, and Eliminating Binders for Easier Operational Semantics.
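The combinators and this sample translation can be exercised directly with curried Python functions:

```python
# S and K as curried functions; I = SKK behaves as the identity.
S = lambda x: lambda y: lambda z: x(z)(y(z))
K = lambda x: lambda y: x
I = S(K)(K)

# T[lambda x. lambda y. (y x)]: applied to a then b, it computes b(a).
flip = S(K(S(I)))(S(K(K))(I))
```

For instance `flip(3)(lambda n: n + 1)` evaluates to `4`, i.e. $(y\;x)$ with $x = 3$ and $y$ the successor.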

Applicative Systems and Computable Maps

Let $C$ be a cartesian restriction category, and let $\mathbb{A}=(A,\bullet:A^2\to A)$ be a magma object, a.k.a. an applicative system, in $C$. A guiding example is Kleene application, $\phi^{(-)}:\mathbb{N}\times \mathbb{N}\to \mathbb{N}$. While we care about those which intuitively “act as universal Turing machines”, applicative systems are in fact a far more general concept. $\mathbb{A}$ will be seen to generate a “category of computations”.

Comp(A)

A morphism $f:A^n\to A$ is $\mathbb{A}$-computable, or simply computable, when there exists a total point $p:1\to A$ (total meaning $\overline{p}=id_1$) which provides the “code” for $f$ (like the $\tau$-index):

This code $p$ picks out a program—an input to the universal application—which implements the function $f$. When $n\geq 2$, we also require that $\bullet^{n-1}\circ (p\times 1): 1\times A^{n-1}\to A$ is total. A morphism $A^n \to A^m$ is computable if all of its components are, and the domain of a morphism $A^n\to A$ is computable if $\overline{f}$ is (as a map $A^n\to A^n$).

By weakening this diagram from commuting to being filled by a restriction 2-cell, i.e. $f \leq \bullet (p\times 1)$, we generalize from computability to the realizability of $f$, meaning that $f$ can be implemented or “realized” by restricting the code-application composite to the domain of $f$, even if the former is defined more generally.

Of course, in defining these morphisms, we intend to form a category. With this general definition of computability, however, categorical structure does not come for free—for $\mathbb{A}$-computable morphisms to include identities and to be closed under composition, $\mathbb{A}$ needs to have the right structure. This structure is directly connected to combinatory logic.

Let $Poly(\mathbb{A})$ be the sub-(cartesian restriction) category of $C$ generated by all total points $1\to A$ and application. These morphisms are called $\mathbb{A}$-polynomial morphisms; they represent “all computations that $\mathbb{A}$ knows how to do”. We should like all of these to be computable, i.e. for codes to exist for all of these “computations”. An applicative system is combinatory complete if every polynomial morphism is computable.

When is this true? Precisely when $\mathbb{A}$ has the combinators $S$ and $K$; that is, when the following morphisms are $\mathbb{A}$-computable.

(20)
$$\begin{array}{l} k_0 = \pi_0: A\times A\to A\\ \quad (K(a,b) = a)\\ s_0: A\times A\times A \xrightarrow{\langle \pi_{13},\, \pi_{23} \rangle} (A\times A)\times (A\times A) \xrightarrow{\bullet\times \bullet} A\times A \xrightarrow{\bullet} A\\ \quad (S(a,b,c) = ((a\bullet c)\bullet (b\bullet c))) \end{array}$$

Let’s think about why this is true. As in the original use of combinators, we give a translation from $\mathbb{A}$-polynomial morphisms to composites of $s_0$, $k_0$, application, and cartesian structure. This suffices because the generators of $Poly(\mathbb{A})$, the total points $p:1\to A$, are computable; the polynomial construction then essentially makes all possible application trees of these codes—and this is exactly what $s$ and $k$ do.

So, we define a partial combinatory algebra (PCA) to be an applicative system which is combinatory complete. We can now prove the simple categorical characterization of this completeness:

Theorem: For an applicative system $\mathbb{A} = (A,\bullet)$ in a cartesian restriction category $C$, the following are equivalent:

$\mathbb{A}$ is combinatory complete.

$\mathbb{A}$-computable morphisms form a cartesian restriction category.

(The forward direction: computable is a special case of being polynomial. The backward direction: the code of $id_A$ is the code of $\bullet$.)

What about the restriction structure? That is, given computable $f$, how do we know that $\overline{f}$ is computable? This is in fact given by the combinator $k$:

i.e., $\overline{f} = k_0\langle 1,f\rangle$.

Pairing a function with $id_A$ and projecting onto the first component gives the subdomain of $A$ on which the function is defined, and this operation is computable by the existence of the code $k$. These combinators are humble but very useful.

Hence, we have a direct equivalence between the classical notion of combinatory completeness and the general, abstract computability concept of cartesian restriction category. (Like many “generalizations” in abstract mathematics, it’s really more an extension than generalization, because we still depend on the original idea of combinatory completeness; but “grounding” the latter by seeing it as a natural structure in a fully abstract setting demonstrates essential understanding, and connects the concept to the rest of category theory and computer science.)

Examples

The most important example of a category of computable maps is $Comp(\mathbb{N})$, of natural numbers and Kleene application.

Another example is the (partial) lambda calculus, modelled by a reflexive object $A$ as mentioned in the introduction.

One can generalize these examples by adding further expressive power such as oracles, or by considering “basic recursive function theories” which are sets of functions satisfying the Universality and Parameter theorems.

Structure Theorem

Now, for the unsurprising and important conclusion: given a PCA $\mathbb{A}$, the computable maps $Comp(\mathbb{A})$ form not only a cartesian restriction category, but a Turing category. The application $\bullet:A\times A\to A$ is the Turing morphism, and the codes are the indices. The pairing operator $pair: A\times A\to A$ is represented by the lambda term $\lambda xyz.((z\,x)\,y)$, and the projections $fst$, $snd$ are given by applying a pair to $\lambda xy.x$ and $\lambda xy.y$ respectively. One can check that $fst \circ pair = \pi_0$ and $snd \circ pair = \pi_1$, so $(pair, \langle fst, snd\rangle): A\times A \triangleleft A$ exhibits the product as a retract of $A$, and this extends to $A^n$. Hence, $A$ is a Turing object, and $Comp(\mathbb{A})$ is a Turing category.

Thus, we have the main theorem:

Structure Theorem

Let $C$ be a cartesian restriction category with applicative system $\mathbb{A}=(A,\bullet)$.

When $\mathbb{A}$ is a PCA in $C$, then $Comp(\mathbb{A})$ is a Turing category with Turing object $A$. Conversely, if $C$ is a Turing category and $A$ is a Turing object, then $\mathbb{A}$ is a PCA in $C$ and $C$ is Morita-equivalent to $Comp(\mathbb{A})$.

Conclusion

We learned about the roots of computability theory, and discussed how Turing categories abstract these ideas, using the guiding example of the natural numbers and partial recursive functions. Every Turing category arises from a partial combinatory algebra, the prime example being the natural numbers with Kleene application; conversely, every partial combinatory algebra determines a Turing category. Turing categories remain an active area of research, and our group looks forward to studying and contributing to this work.