This is the blog version of a talk I did for the PLSE seminars at UofA.

In this post I am going to discuss the problems with adding mutation to an object-oriented programming language with depth-subtyping and propose a type-system which safely supports mutation while keeping a notion of depth-subtyping.

I will avoid formal definitions for this post and instead motivate ideas through examples. Many of these ideas will be familiar for anyone who has programmed in an object-oriented language before. I will also be very loose with terminology. For example, I might say objects are subtypes when I mean their types are subtypes.

What is Typing?

Before I can discuss depth-subtyping or even subtyping, I am going to discuss what I mean by typing. For this post I am going to focus on typing for values, rather than typing for program fragments.

So what are values? Values are things such as true , 1 , 1.3 , and the object {x: 2, y: 3} . This last example is an object with fields x and y which contain the numbers 2 and 3 .

If you are familiar with object-oriented languages you will recognize the above values as having the types Bool , Int , Float , and {x: Int, y: Int} .

For this post, we are going to think of typing as the following:

The type of a value represents the operations we are allowed to perform on the value.

For example, if we have an Int , we are allowed to add it to another Int , but we are not allowed to add it to an object:

1 + 3 is allowed

is allowed 1 + {x: 2, y: 3} is not allowed

What is Subtyping?

Let’s start with some examples:

Nat <: Int . We often think of natural numbers as being a subtype of integers.

. We often think of natural numbers as being a subtype of integers. {x: Int, y: Int, z: Int} <: {x: Int, y: Int} . If we have a object with fields x , y , and z are subtypes of objects with fields x and y .

But what does subtyping mean? Given the above notion of typing:

We say that T is a subtype of U (written T<:U ) if all operations allowed on values of type U are allowed on values of type T .

Width-subtyping

Before we discuss depth-subtyping, let’s discuss a contrasting idea: width-subtyping. Let’s go back to the last example:

{x: Int, y: Int, z: Int} <: {x: Int, y: Int}

Any operations allowed on {x: Int, y: Int} are also allowed on {x: Int, y: Int, z: Int} ; we just ignore the field z . This idea is known as width-subtyping: objects which are wider in fields are subtypes.

Most languages which support subtyping support width-subtyping in some way or the other. For example, in Java we can extend an object type with more fields.

What is Depth-subtyping?

Depth-subtyping is the idea that if we have subtypes T<:U , then objects containing T s are subtypes of objects containing U s: {field: T} <: {field: U} . Contrasted with width-subtyping, instead of going wider in fields, we go deeper in fields.

For example, since we had Nat<:Int , we have PointNat<:PointInt where PointNat and PointInt are defined as:

= { x : Nat , y : Nat} PointNatNatNat} = { x : Int , y : Int} PointIntIntInt}

And it can be depth-subtyping all the way down: objects containing {field: T} are subtypes of objects containing {field: U} , and so on.

We want Depth-subtyping!

Depth-subtyping can be very useful. Consider the following function:

diff (PointInt inp){ Int(PointInt inp){ return (inp . x - inp . y ) (inpinp }

diff takes an object which contains Int s in the field x and y , and subtracts the x value from the y value and returns the answer.

Now suppose we have a PointNat p .

= new PointNat ( 3 , 4 ) PointNat p

It is completely safe to pass p to the function diff , and depth-subtyping would allow us to reuse the diff function for PointNat s. Without depth-subtyping we would have to rewrite the function for PointNat s.

The diff function only relies on the input object having fields x and y containing values which can be treated as Int s. In general, it is safe to pass a PointNat to a function which expects its input to be a PointInt … unless the input is being mutated.

Mutation is Bad for Depth-subtyping!

To see why mutation is bad for depth-subtyping, we are going to introduce some more types.

= { x : Int , y : Int , z : Int} Point3DIntIntInt} = { field : T} Container[T]T}

Usual width-subtyping implies that Point3D <: PointInt . Consider the following program. We will explain it shortly.

1. Point3D p3d = new Point3D ( 1 , 2 , 3 ) Point3D p3d 2. PointInt p2d = new PointInt ( 1 , 2 ) PointInt p2d 3. Container[Point3D] c = new Container (p3d) Container[Point3D] c(p3d) 4. Container[PointInt] c2 = c Container[PointInt] c2 5. c2 . field := p2d c2p2d 6. Point3d p = c . field Point3d p 7. return p . x + p . y + p . z

We first create points p3d and p2d , the first being a 3d-point ( Point3D ) and the second being a 2d-point ( PointInt ). Then, we create a 3d-point container, c . Next, we use depth-subtyping to cast c to a 2d-point container c2 . Then we mutate the container with p2d using the reference c2 . Now we use the original reference c , which is treated as a 3d-point container, to read its field as a 3d-point. Then we read the fields of the object and sum them up.

If we run this code, it will throw a runtime exception when p.z is executed! This is because, we when we read c.field , it actually contains a 2d-point, so it does not have a z field. So trying to read the non-existent z field throws a runtime exception.

Fun fact: If we used Array instead of Container , Java would compile the above code! The Array upcasting in Java was well intentioned, it would allow a form of depth-subtyping for arrays, but it is ultimately unsafe.

What went Wrong?

So what went wrong? The problem was that the operation “assign 2d-point”:

should be allowed for 2d-point containers, but

should not be allowed for 3d-point containers.

So not all operations allowed for 2d-point containers should be allowed for 3d-point containers, even though 3d-points are subtypes of 2d-points.

This might lead us to believe that we cannot allow depth-subtyping in languages with field mutation. Fortunately, this turns out not be true! We can recover depth-subtyping, or at least a notion thereof, if we use something which I call bounded field-typing.

Bounded Field-typing

In bounded field-typing, fields have bounded types such as the following:

{fld: T..U}

We call the lower bound T the setter type of the field fld , and U the getter type. Usually, the setter type is a subtype of the getter type, T<:U .

If x has the above object type, reading the field fld produces a value of type U . Under bounded field-typing, field reads produce the getter type. For mutation, the assignment x.fld:=y is allowed if y has type T ; i.e. we allow assignments of the setter type.

Recovering Depth-subtyping

In the bounded setting, we allow depth-subtyping for getter-types:

S<:U implies

implies {fld: T..S}<:{fld: T..U} .

{fld: T..U} allows us to read a U from the field fld . Since all operations on U s are allowed on S s, we are allowed to read an S from a {fld: T..S} and treat it as an U .

We also allow depth-subtyping for setter-types, but in the other direction, i.e. subtyping is contravariant in the setter type.

S<:T implies

implies {fld: T..U}<:{fld: S..U} .

Notice how, S and T appear in opposite directions of the <: symbol. If we are allowed to write a value which allows the operations of T to be performed on it, then it is safe to write a value which allows more operations.

In the bounded setting, the previous types such as PointInt and PointNat are written with the setter and getter types being equal:

= { x : Nat .. Nat , y : Nat .. Nat } PointNatNatNat = { x : Int .. Int , y : Int .. Int } PointIntIntInt

Safe Depth-subtyping

In the bounded setting, PointNat is not a subtype of PointInt . Since we do not have that Int<:Nat , their setter types are not in the subtyping relationship required for depth-subtyping. However, since Nat<:Int we have:

<: { x : Nat .. Int , y : Nat .. Int } PointNatNatNat

Using the same subtyping relationship for the setter types for PointInt , we have:

<: { x : Nat .. Int , y : Nat .. Int } PointIntNatNat

Using this, we can write a function diff which works for both PointInt s and PointNat s.

diff ({ x : Nat .. Int , y : Nat .. Int } inp){ Int({NatNat} inp){ return (inp . x - inp . y ) (inpinp }

At this point, you will notice that the above, diff function is not quite general enough. If we had a value of the following type, we would not be able to use diff on it, even integers are neither subtypes nor supertypes of natural numbers.

= { x : EvenInt .. EvenInt , y : EvenInt .. EvenInt } PointEvenIntEvenIntEvenInt

For this reason, under bounded field-typing it is very useful to have a bottom type, ⊥ . There is no way to create a value of type ⊥ , but we assume that all operations are allowed on values of type ⊥ . More importantly, ⊥<:T for any type T . This allows us to write diff as the following.

diff ({ x : ⊥ .. Int , y : ⊥ .. Int } inp){ Int({} inp){ return (inp . x - inp . y ) (inpinp }

Notice that we use ⊥ for the setter types of our input parameter. We can now call the diff function with a PointEvenInt as its input, or any object as long as it contains field x and y which contain values which can be treated as Int s. So we have effectively recovered what we wanted from depth-subtyping!

3d-point Containers are not 2d-point Containers

Under bounded field-typing, we still have that 3d-points ( Point3D ) are subtypes of 2d-points ( PointInt ), since width-subtyping is still safe. As we discussed in the previous example, 3d-point Containers are not 2d-point Containers because they do not allow the operation “Assign 2d-point”. This is now reflected at the type level.

1. Point3D p3d = new Point3D ( 1 , 2 , 3 ) Point3D p3d 2. PointInt p2d = new PointInt ( 1 , 2 ) PointInt p2d 3. Container[Point3D] c = new Container (p3d) Container[Point3D] c(p3d) 4. Container[PointInt] c2 = c Container[PointInt] c2 5. c2 . field := p2d c2p2d 6. Point3d p = c . field Point3d p 7. return p . x + p . y + p . z

Here the assignment in line 4 is not allowed and caught by the type system. To make the assignment valid, we can change the container type to a bounded type.

1. Point3D p3d = new Point3D ( 1 , 2 , 3 ) Point3D p3d 2. PointInt p2d = new PointInt ( 1 , 2 ) PointInt p2d 3. Container[Point3D] c = new Container (p3d) Container[Point3D] c(p3d) 4. Container[Point3d .. PointInt ] c2 = c Container[Point3d] c2 5. c2 . field := p2d c2p2d 6. Point3d p = c . field Point3d p 7. return p . x + p . y + p . z

But now, the assignment in line 5 is not valid since it is only safe to assign 3d-points to c2 . If we change the original Container of c type to PointInt , then the assignment is allowed.

1. Point3D p3d = new Point3D ( 1 , 2 , 3 ) Point3D p3d 2. PointInt p2d = new PointInt ( 1 , 2 ) PointInt p2d 3. Container[PointInt] c = new Container (p3d) Container[PointInt] c(p3d) 4. Container[PointInt] c2 = c Container[PointInt] c2 5. c2 . field := p2d c2p2d 6. Point3d p = c . field Point3d p 7. return p . x + p . y + p . z

But now, the field read in line 6 is not allowed, since we are only allowed to assume that the field contains a PointInt . No matter what we do, the type system does not allow us to assign a 2d-point and read it back as a 3d-point.

Bounded field-typing really is safe.

This concludes the post. I hope this shed some light into why the “getter and setter methods” meme in Java exists. I’m going to leave you guys by saying that bounded field-typing really is safe! I formally proved it to be safe in an extension of the DOT calculus called κDot (officially kappa-dot, informally kay-dot). You can find the Coq code at here and the accompanying paper which I presented at the Scala Symposium here.

A friend pointed out that this post is being discussed on Reddit.