Type systems are a ubiquitous and essential part of modern programming languages. They help to ensure that separately developed pieces of a program “fit together” properly, and they help to statically prevent programs from exhibiting a large class of runtime errors.

So what can we actually prove about a type system to ensure that it is doing its job? The most common answer is type soundness (or type safety). Introduced originally by Robin Milner over 40 years ago, the type soundness theorem for a programming language L states that if a program P written in L passes L’s type checker, then P should be guaranteed to have well-defined behavior when executed. Often summarized via Milner’s slogan—“well-typed programs can’t go wrong”—type soundness has become the canonical property that type systems for “safe” programming languages are expected to satisfy.

However, as we will argue in this post, the most common syntactic approach to formulating type soundness omits an important aspect of why types are useful. In particular, as John Reynolds famously stated in 1983, “type structure is a syntactic discipline for enforcing levels of abstraction”, where “abstraction” refers to the ability of program components to hide information from other components. But syntactic type soundness has little to say about abstraction. In this post, we explain how, with a semantic formulation of type soundness, we can provide a more satisfactory and comprehensive account of what type systems are good for.

Syntactic Type Soundness

The syntactic approach to type soundness, pioneered by Wright and Felleisen and subsequently simplified by Harper, involves proving two theorems known as “progress” and “preservation”. A program is characterized as having well-defined behavior so long as its execution under a small-step operational semantics never gets stuck—i.e., never reaches a non-terminal state where there is no next step of execution to take. The preservation theorem states that a program remains well-typed as it executes, and the progress theorem states that well-typed programs are either in a terminal state or their next step of execution is well-defined. Together, these theorems imply that the execution of a well-typed program is well-defined—it never gets stuck.
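To make "getting stuck" concrete, here is a minimal sketch of our own (not taken from the post) of a small-step evaluator in Rust for a toy language with integers, booleans, and addition. The well-typed term 1 + 2 steps to a value, whereas the ill-typed term 1 + true reaches a state that is neither a value nor able to take a step—it is stuck.

```rust
// A toy term language: integers, booleans, and addition.
#[derive(Clone, Debug, PartialEq)]
enum Term {
    Int(i64),
    Bool(bool),
    Add(Box<Term>, Box<Term>),
}

use Term::*;

// Terminal states of execution.
fn is_value(t: &Term) -> bool {
    matches!(t, Int(_) | Bool(_))
}

// One step of the small-step operational semantics, if any applies.
fn step(t: &Term) -> Option<Term> {
    match t {
        Add(a, b) => match (&**a, &**b) {
            // Both operands are integers: reduce the addition.
            (Int(x), Int(y)) => Some(Int(x + y)),
            // Otherwise, step the leftmost non-value operand.
            (l, _) if !is_value(l) => step(l).map(|l2| Add(Box::new(l2), b.clone())),
            (_, r) if !is_value(r) => step(r).map(|r2| Add(a.clone(), Box::new(r2))),
            // An operand is a non-integer value: no rule applies.
            _ => None,
        },
        // Values take no step.
        _ => None,
    }
}

fn main() {
    // 1 + 2 is well-typed and steps to the value 3.
    assert_eq!(step(&Add(Box::new(Int(1)), Box::new(Int(2)))), Some(Int(3)));
    // 1 + true is not a value, yet no step applies: it is stuck.
    let bad = Add(Box::new(Int(1)), Box::new(Bool(true)));
    assert!(!is_value(&bad));
    assert!(step(&bad).is_none());
}
```

Progress for this toy language would say a well-typed term is either a value or can step; preservation would say stepping keeps it well-typed—and a type system ruling out terms like 1 + true makes both theorems hold.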

The syntactic approach to type soundness—along with the proof method of progress and preservation—is inarguably one of the “greatest hits” of programming languages research of the past 25 years. It is conceptually simple, it scales easily to handle a wide range of programming language features, and it plays a central organizing role in popular textbooks (e.g., Pierce and Harper). As a result, it has become one of the most widely known, widely taught, and widely applied formal methods in the entire area of programming language foundations.

Limitations of Syntactic Type Soundness

Unfortunately, syntactic type soundness suffers from two significant limitations that are not (in our experience) widely recognized.

The first limitation pertains to data abstraction. A primary function of many languages’ type systems is to give programmers a way of enforcing data abstraction boundaries, so that one can place invariants on the private data representations of modules/classes and be sure that client code will not violate those invariants. However, syntactic type soundness offers no guarantees about whether a programming language’s data abstraction facility actually works—it is extremely easy to prove syntactic soundness of a type system whose data abstraction mechanism is completely broken. For example, until relatively recently, Java’s reflection mechanism made it possible for clients of a class to break internal implementation invariants on its private fields. But as far as syntactic type soundness is concerned, unrestricted reflection is no problem at all.

The second limitation pertains to unsafe features. In practice, most “safe” languages provide unsafe escape hatches—e.g., Obj.magic in OCaml, unsafePerformIO in Haskell, unsafe blocks in Rust, or FFI (calls to code written in another language)—which enable programmers to perform potentially unsafe operations (such as unchecked type casts or array accesses) that the safe fragment of the language disallows. These unsafe escape hatches have proven indispensable, both for functionality—when the language’s safe type system is more restrictive than necessary—and for efficiency—when performance concerns demand the use of a lower-level unsafe abstraction. However, syntactic type soundness has nothing to say about programs that use unsafe features: it simply declares them out of scope.
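As a small concrete illustration (our own hypothetical example, in Rust), an unsafe block permits an unchecked reinterpretation of bits that the safe fragment of the language rejects:

```rust
// Reinterpret the bit pattern of a float as an integer -- an unchecked
// cast that safe Rust disallows, but an `unsafe` block permits.
fn f32_bits(x: f32) -> u32 {
    // Sound here because f32 and u32 have the same size, and every
    // 32-bit pattern is a valid u32.
    unsafe { std::mem::transmute::<f32, u32>(x) }
}

fn main() {
    // 1.0f32 is sign 0, biased exponent 127, mantissa 0: 0x3f80_0000.
    assert_eq!(f32_bits(1.0), 0x3f80_0000);
    println!("bits of 1.0f32: {:#010x}", f32_bits(1.0));
}
```

Note that this particular unsafe cast is itself an instance of encapsulation: the Rust standard library wraps it behind the safe function `f32::to_bits`.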

These two limitations are in fact closely connected, in that it is common to justify the “safe” use of unsafe features by appeal to data abstraction. Specifically, programmers often argue informally that their use of unsafe operations is harmless because said operations have been encapsulated behind a “safe API”. That is, they argue that, thanks to the abstraction boundary of the API, the implementation of the API can enforce invariants on its private data representation which ensure that its use of unsafe features does not lead to any undefined behavior. But of course, to make this reasoning formal, one needs to know whether the language is enforcing data abstraction properly, precisely one of the issues on which syntactic soundness is silent.

Semantic Typing

We have argued that syntactic type soundness is inadequate for judging whether a type system is really doing its job. But can we do any better?

Yes, we can! Over the past 15 years, we—together with a number of colleagues and collaborators—have been developing and refining a viable alternative to syntactic type soundness, one that offers a flexible foundation for reasoning about data abstraction as well as the safe use of unsafe features. Our approach has been deployed in a wide variety of different programming language safety proofs (several of them machine-checked), but perhaps most notably in the Foundational Proof-Carrying Code project at Princeton, as well as (more recently) the RustBelt project for verifying safety of the Rust programming language.

Our approach builds on the age-old idea of semantic type soundness, as exemplified by the formulation in Milner’s original paper. Semantic soundness addresses the central problem with syntactic soundness, namely that it identifies “safe” with “syntactically well-typed” and is thus unable to account for the safety of code that uses potentially unsafe features in a well-encapsulated way.

To overcome this problem, semantic type soundness instead models safety using a more liberal view of well-typedness, which we call semantic typing. Whereas syntactic typing—written ⊢ e : τ—dictates a fixed set of syntactic rules by which safe terms can be constructed, semantic typing—written ⊨ e : τ—merely requires that terms behave in a safe way when executed.

As a simple yet illustrative example, consider the following Rust module:

```rust
mod example { // Rust module = abstraction boundary
    // Invariant: The vector has length at least 1.
    pub struct NonEmptyVec { v: Vec<i32> }

    impl NonEmptyVec {
        pub fn new(v: Vec<i32>) -> NonEmptyVec {
            // Abort execution if v is empty.
            if v.len() == 0 {
                panic!("v should be nonempty");
            }
            NonEmptyVec { v }
        }

        pub fn get(&self) -> i32 {
            unsafe { *self.v.get_unchecked(0) }
        }
    }
}
```

Here, we define an abstract data type NonEmptyVec: it is implemented internally as a vector of integers, but this is not exposed to clients of the module, so we can privately enforce the invariant that the vector in a NonEmptyVec is always nonempty (i.e., has length at least 1). The constructor of NonEmptyVec ensures that this invariant is upheld when a NonEmptyVec is created. Consequently, when the get method is called, it can safely perform an unchecked access to the first element of the vector, since we know the vector must be nonempty.

In general, unchecked vector accesses are not safe operations, so Rust requires the use of such an operation in get to be enclosed in an unsafe block. However, as we have argued above, thanks to Rust's data abstraction, the unchecked vector access is guaranteed to succeed: the use of an unsafe operation has been safely encapsulated by NonEmptyVec's API. Thus, although get is syntactically ill-typed within the safe fragment of Rust, it is semantically well-typed.
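To see the encapsulation at work from the client's side, here is a hypothetical client of the NonEmptyVec API (the module is reproduced so the sketch is self-contained):

```rust
mod example { // Rust module = abstraction boundary
    // Invariant: The vector has length at least 1.
    pub struct NonEmptyVec { v: Vec<i32> }

    impl NonEmptyVec {
        pub fn new(v: Vec<i32>) -> NonEmptyVec {
            // Abort execution if v is empty.
            if v.len() == 0 {
                panic!("v should be nonempty");
            }
            NonEmptyVec { v }
        }

        pub fn get(&self) -> i32 {
            unsafe { *self.v.get_unchecked(0) }
        }
    }
}

fn main() {
    let nev = example::NonEmptyVec::new(vec![7, 8, 9]);
    // get() performs an unchecked access, but the constructor's
    // invariant guarantees the vector is nonempty, so it cannot fail.
    assert_eq!(nev.get(), 7);
    // Clients cannot bypass new(): the field `v` is private, so
    // `example::NonEmptyVec { v: vec![] }` would not compile here.
}
```

Because the field is private, the only way a client can construct a NonEmptyVec is via new, which is exactly why the module can maintain its invariant.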

The Structure of a Semantic Type Soundness Proof

If you are starting to believe that semantic typing may be a good fit for explaining what “safe encapsulation” means, you may be wondering: What does it mean to prove “semantic type soundness”? The high-level structure of a semantic type soundness proof is very simple.

Adequacy. First, we prove an adequacy theorem, which establishes that closed semantically well-typed terms—i.e., those for which we establish ⊨ e : τ—are indeed safe to execute. This theorem is typically almost trivial to prove because, as explained above, it is more or less baked into the definition of semantic typing.

Semantic typing rules. Second, and more interestingly, we prove semantic versions of the syntactic typing rules of the language. For example, suppose your language has a typing rule for function applications, of the form: if Γ ⊢ e₁ : τ → τ′ and Γ ⊢ e₂ : τ, then Γ ⊢ e₁ e₂ : τ′.

You would then need to prove the corresponding semantic typing rule for function applications: if Γ ⊨ e₁ : τ → τ′ and Γ ⊨ e₂ : τ, then Γ ⊨ e₁ e₂ : τ′.

Unlike the syntactic typing rule, which is part of the definition of syntactic typing, the semantic typing rule is a property that should hold of semantic typing. That is, however we end up formally defining the semantic typing relation, the premises of this rule should imply the conclusion. The semantic typing rules serve to demonstrate that semantic typing is compositional in the same way syntactic typing is.

One immediate consequence of the semantic typing rules is that syntactic typing implies semantic typing—i.e., ⊢ e : τ implies ⊨ e : τ—a property often referred to historically as the fundamental theorem or fundamental property. As a result, closed syntactically well-typed programs are also semantically well-typed, and by adequacy, they must be safe to execute. In other words, once we have proven semantic type soundness, syntactic type soundness falls out as a simple corollary.

But the main reason we care about semantic type soundness is that it is a strictly stronger result than syntactic type soundness: it shows that we can safely compose syntactically well-typed pieces of a program with other pieces that are syntactically ill-typed (e.g., use unsafe features) so long as those other pieces are semantically well-typed. For instance, once we prove that the example Rust module above is semantically well-typed, we will be able to deduce that if example is used by any syntactically (or semantically) well-typed client program, then the resulting whole program will be safe to execute.

The catch is that, in general, proving a module like example to be semantically well-typed may require an arbitrarily involved verification effort, because it will require one to formalize the module's internal invariants and ensure that they are upheld. But this is only to be expected: the goal of semantic soundness is precisely to help establish that modules are properly maintaining their internal invariants, a task which often amounts to proving full functional correctness of the code.

Next Step: Defining and Reasoning About Semantic Typing

Of course, the big question left open so far is: How do we actually define semantic typing, and what is involved in proving its required properties?

In the next post in this series, we will examine this question in detail and explore why previous approaches to defining semantic typing have not been more widely adopted. We will then show how recent developments in higher-order program logics have made it possible to define semantic typing and conduct machine-checked semantic soundness proofs with relative ease. Stay tuned!

In the meantime, you can check out Derek Dreyer’s POPL’18 keynote, which provides a broad overview of this topic, and some more details about how you might define semantic typing.

Derek Dreyer is a professor of computer science at the Max Planck Institute for Software Systems (MPI-SWS) and Saarland Informatics Campus in Saarbruecken, Germany, where he leads the Foundations of Programming group and the RustBelt project.

Amin Timany is a postdoctoral fellow of the Flemish Research Fund (FWO) working at the DistriNet research group of the computer science department of KU Leuven.

Robbert Krebbers is an assistant professor in programming languages at Delft University of Technology.

Lars Birkedal is professor and Villum Investigator at Aarhus University, where he heads the Logic and Semantics group and the Center for Basic Research in Program Verification.

Ralf Jung is a doctoral student at the Max Planck Institute for Software Systems (MPI-SWS) and Saarland Informatics Campus in Saarbruecken, Germany.

Disclaimer: These posts are written by individual contributors to share their thoughts on the SIGPLAN blog for the benefit of the community. Any views or opinions represented in this blog are personal, belong solely to the blog author and do not represent those of ACM SIGPLAN or its parent organization, ACM.