Moggi :: CT -> Hask

Blog » Moggi :: CT -> Hask

Posted on

The monad found its way into Haskell from E. Moggi’s research in the late 1980s integrating Category Theory into the domain of programming language semantics. He began by observing that there are three common approaches to proving equivalence of programs:

Operational semantics

Denotational semantics

Model theoretic semantics

All such methods attempt to bridge the syntax/semantics divide. They differ chiefly in the formalisms they adopt to represent the semantic domain, with attendant differences in the syntax-to-semantics mapping. Moggi’s innovation was to use Category Theory to represent the semantic domain. His motivation was the observation that the conventional denotational approach, which identifies type $\alpha \rightarrow \beta$ with a total function of the same signature, “wipes out completely behaviours like non-termination, non-determinism or side-effects, that can be exhibited by real programs.” The key word being "behaviors"; a direct mathematical mapping from terms in a language to objects in a semantic domain has no way of accounting for the behavior of terms like “print”, which trigger observable external behavior. Whatever mathematical semantics such terms may have, they also carry additional meaning that cannot be captured by a straightforward denotational representation.

The key term he uses is notion of computation:

…in order to interpret a programming language in a category C , we distinguish the object A of values (of type A) from the object T A of computations (of type A), and take as denotations of programs (of type A) the elements of T A. In particular, we identify the type A with the object of values (of type A) and obtain the object of computations (of type A) by applying an unary type-constructor T to A. We call T a notion of computation, since it abstracts away from the type of values computations may produce. There are many choices for T A corresponding to different notions of computations.(Notions of Computation and Monads)

He gives the following examples of "notion of computation": partiality, non-determinism, side-effects, exceptions, continuations, and interactive input/output. What these "notions" have in common is disruption of the semantic integrity of a program text. They violate the principle of referential transparency, rendering the text literally uninterpretable to the human reader. So "notions of computation" effectively means the class of opaque terms in programming languages.

In ordinary mathematical notation, as for any formal notation intended for human readers, there are no such opaque terms. There may be terms that are unfamiliar to the reader, but these are unknown, not opaque. They have definite denotation, whereas opaque terms only acquire a denotation when the program is executed as a process on a computing machine. Their semantics are contingent, in contrast to terms like "2 + 3", which have necessary semantics.

This appears to be an insurmountable and lethal problem for the formal semantics of functional programming languages. One goal for the development of functional languages is the elimination of process considerations from the semantics of the language. It should be possible to reason about the meaning of a program text without reference to the process associated with it at run-time. If the language includes opaque terms whose semantics are contingent on the process, then how can it be possible to eliminate processing logic from the semantics? How could one ever hope to prove two programs equivalent?

Moggi simply restated the problem with a different focus: opaque, contingent terms may be unknowns, but they are known unknowns. We know who and where they are in a text, so instead of asking how we can reason about them, we ask how we can represent and manage them in such a way that the semantic integrity of the text as a whole is retained. We know they denote, but we don't know their denotations; we know they behave, but since formal semantics is atemporal, we don't know the order in which they will acquire their behaviors. So the problem is twofold: we must find a way to represent them as ordinary mathematical values, and can we must find a way to ensure that their behavior is properly disciplined.

Moggi's brilliant solution was to use the strong typing of Category Theory as a representational device, and to use the monad as an organizing device. Categories are strongly typed; they provide us with a means to both represent and sequester opaque terms and their contingent values. Sequestration follows naturally from type discipline: objects of a category representing opaque terms with their contingent values cannot naturally mix with objects of any other category. Since they are ordinary objects, they are integrated into the formal semantics of the language as a whole. Functors are the means by which we "construct" such objects; The fact that they serve as proxies for behaviors does not disrupt this integration; they remain objects. In Moggi's formulation, it is the functor that encodes the notion of computation, each constructed object representing a different notion. This is a clever construction of functor as a mathematical operation with side effects, but it is indisputably a mere metaphor for expository purposes. Functors don't have side effects, but that need not stop us from using them to model executable operations that do.

Behavioral discipline follows from the structural properties of the monad. The Kleisli star of category theory and its Haskell counterpart »= use data dependency to impose an ordering on their arguments. The semantics of the program text remain atemporal, but when the text is represented as a process, the data dependency will ensure proper ordering of behaviors.

If the language of categories and functors makes you queasy, here is how Wadler et al. used Moggi's insights to engineer a solution in Haskell. The Haskell equivalents to category and functor are the type and type constructor. A type T A is the image of type A under the mapping of the functor T. What Moggi called notions of computation, and I have called opaque terms and their associated contingent denotations, others have called variously actions, computations, processes, programs, and so forth. Whatever we choose to call them, Haskell represents them as values of a constructed type - IO a and so forth. Haskell's strong typing serves to sequester them in their type, unable to interact with values of other types. They are first-class values in the semantic model of Haskell, with no side-effects within the language. At run-time, they may have observable behaviors; the Haskell bind operator »=, which implements the Kleisli star of CT, imposes an evaluation order on its arguments.

This use of category theory is an absolutely dazzling manoever. A very distinctive feature of category theory is that it erases the distinction between things and relations; "it is technically possible to dispense altogether with the objects and work only with arrows," in the words of its co-inventor Saunders Mac Lane. And thus it is possible to do this with "notions of computation": we can disregard them. The fact that, for example, values of type IO a have some kind of observable side-effect at run-time is irrelevant to the formal semantics of the language. Since they are represented as objects in purely categorical terms, using functors and natural transformations, we can dispense with them and transfer our attention to the morphisms - constructors, bind, etc. In other words, we can think of these things not as values (with additional behavior) but as a nexus of relationships encoded as arrows. So when asked, what is a value of type IO, we can justly respond with an imperious: "Faugh! It means nothing to me! But if you must know, go run the program!"

- Gregg Reynolds

Like this entry?