Sum Types, Defined in (Informal) Theory

The problems examined so far can be boiled down to two essential categories:

No consistent and expressive way to represent separate data status cases. “Not found” ought to be clearly different from “found, with this data.”

A high likelihood of error when consuming data that could be in one of several forms (i.e. polymorphic). Humans focus on the ordinary case and use variables assuming they will have certain properties or methods, forgetting that the actual value may sometimes be an exceptional case.

Sum types can improve both of the above. To understand sums, it helps to first examine scalar and product types.

Scalars

Two example scalar types. There are three scalar values of type A and two scalar values of type B.

A scalar type contains atomic values: single items which cannot be decomposed into parts. Examples of scalars in JavaScript include built-in types like Boolean, Number, and Null. For instance, the number 42 in JavaScript is a single value inhabiting the Number scalar type.

A scalar type has an intrinsic size (a.k.a. cardinality), which is just the number of values inhabiting that type.

The Boolean type has a size of 2, as it contains two values, true and false.

The Number type is of “infinite” size (if we ignore the limits of IEEE 754).

The Null type has a size of 1, as it contains only one value, null.

Products

A × B is a product of two different scalar types A and B. The size of A × B is the size of A times the size of B. Each value in A × B has both an A component and a B component.

Product types are composite — they consist of multiple (possibly different, possibly identical) types, grouped together in an “and” fashion. For example, a type whose values consist of two Booleans (one Bool and one Bool) is a product. In JavaScript, we can represent custom product types using arrays as tuples, with position serving to distinguish each member type.

// FoodFacts are tuples of 2 bools: [isYummy, isHealthy]

const saladFacts = [true, true]

const burgerFacts = [true, false]

const vitaminFacts = [false, true]

const chalkFacts = [false, false]

Why “product?” Because the size of this new composite FoodFacts type is determined by multiplying the sizes of its constituent types. We can easily see above that there are only four different values in the type, obtained by multiplying 2 options for the first Bool × 2 options for the second Bool.
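We can check the multiplication by enumeration. The following is a minimal sketch: `booleans`, `product`, and `allFoodFacts` are names invented for this illustration and are not part of any library.

```javascript
// All values of the Boolean scalar type.
const booleans = [true, false]

// Hypothetical helper: pairs every value of `as` with every value of `bs`,
// producing all values of the product type as 2-element tuples.
const product = (as, bs) => as.flatMap(a => bs.map(b => [a, b]))

// Every possible FoodFacts value: [isYummy, isHealthy]
const allFoodFacts = product(booleans, booleans)

console.log(allFoodFacts.length) // 4, i.e. 2 × 2
```

The four tuples produced are exactly the four values listed above, from `[true, true]` through `[false, false]`.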

Positional notation (e.g. chalkFacts[0]) is not very clear with respect to meaning. A more expressive way to represent multiple values grouped together in JS is with objects, which can label each member value. Technically the labels make objects record types rather than products, but we will overlook that in this article, in the interest of making it easier to write examples:

// Person = { name: String, age: Number, employed: Boolean }

const mark = { name: 'Mark', age: 67, employed: false }

const jin = { name: 'Jin', age: 34, employed: true }

const ford = { name: 'Ford', age: 19, employed: true }

const sian = { name: 'Sian', age: 24, employed: false }

...

The size of the Person type, ignoring the labels, is Infinity × Infinity × 2. That is, the infinite number of Strings, times the infinite number of Numbers, times the two possible Booleans. Clearly, we will not be able to list every possible value in the Person type.

Sums

A + B is a sum of two different scalar types A and B. The size of A + B is the size of A plus the size of B. Each value in A + B is either an A value or a B value.

If scalar type size is an intrinsic number, and the size of a product type is the product of its constituent type sizes, you will probably not be surprised to hear that the size of a sum type is the sum of its constituent type sizes. Sum types are composite like product types, but in an “or” fashion; a single value in the type is only ever one of the constituent types, not a grouping of them all.

// FinitePrimitive = Boolean | Null | Undefined

const finitePrimitive1 = true

const finitePrimitive2 = false

const finitePrimitive3 = null

const finitePrimitive4 = undefined

The FinitePrimitive type we define above has a size of 2 + 1 + 1 = 4. That is, the two Booleans, plus the single Null value, plus the single Undefined value. We can easily list out all four values, which we have done above. Notice that a value in this type belongs to only one of the constituent types.
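The same count can be reproduced by listing each constituent type and concatenating the lists. This is only a sketch with made-up variable names, and it relies on the three constituent types sharing no values.

```javascript
// The values of each constituent type, listed exhaustively.
const booleanValues = [true, false]
const nullValues = [null]
const undefinedValues = [undefined]

// A value of the sum type is a value of exactly one constituent,
// so listing the sum means concatenating the three lists.
const finitePrimitives = [...booleanValues, ...nullValues, ...undefinedValues]

console.log(finitePrimitives.length) // 4, i.e. 2 + 1 + 1
```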

As another example, consider a sum type composed of some larger types:

// InfinitePrimitive = String | Number | Symbol

const infinitePrimitive1 = 'hello'

const infinitePrimitive2 = 'goodbye'

const infinitePrimitive3 = 42

const infinitePrimitive4 = Symbol('hmm')

const infinitePrimitive5 = 314159

const infinitePrimitive6 = 'ok we get it, there are a lot of these'

...

The InfinitePrimitive type has a size of Infinity + Infinity + Infinity. It can be any one of the infinite strings, or one of the infinite numbers, or one of the infinite symbols.

A Sum of Products

A + (B × B) is a sum of a scalar type and a product type. The size of A + (B × B) is the size of A, plus the size of B times the size of B. Each value in A + (B × B) is either an A value or a B × B value. B × B values have both a B component and another B component.

Products and sums can consist of scalars, but we didn’t define them as needing to consist of scalar types — on the contrary, the constituent types may themselves be products and/or sums. Here is a (non-JavaScript) sum of both scalar and product types:

scalar type Ghost (contains one value, `ghost`)

product type Character { (contains 2 × 2 = 4 possible values)

afraidOfNoGhosts: Boolean,

ghostbuster: Boolean

}

sum type Entity = Ghost | Character

How many values does the Entity type have? Well, it’s 1 possible Ghost value + (2 × 2) possible Character values = 5 different Entity values. Let’s enumerate them:

entity1 = Ghost ghost

entity2 = Character { afraidOfNoGhosts: true, ghostbuster: true }

entity3 = Character { afraidOfNoGhosts: false, ghostbuster: true }

entity4 = Character { afraidOfNoGhosts: true, ghostbuster: false }

entity5 = Character { afraidOfNoGhosts: false, ghostbuster: false }
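For readers who prefer runnable code, here is one way the enumeration might be approximated in JavaScript. It is a sketch under our own assumptions: the single Ghost value is modeled as a Symbol and Character values as plain objects. Neither choice is dictated by the pseudocode above.

```javascript
const booleans = [true, false]

// The single Ghost value, modeled here (by assumption) as a unique Symbol.
const ghost = Symbol('ghost')

// All Character values: one object per combination of the two Boolean fields.
const characters = booleans.flatMap(ghostbuster =>
  booleans.map(afraidOfNoGhosts => ({ afraidOfNoGhosts, ghostbuster }))
)

// Entity = Ghost | Character
const entities = [ghost, ...characters]

console.log(characters.length) // 4, i.e. 2 × 2
console.log(entities.length) // 5, i.e. 1 + (2 × 2)
```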

Tag, You’re It

We’ve almost finished defining sum types, but we’re missing a crucial characteristic which distinguishes them from the (very similar) union type. Suppose we define a sum type for name parts as being either a first name or a last name, where each is a string:

// NamePart can be a first name string OR a last name string

sum type NamePart = String | String

namePart1 = 'Wilson' // is this a first or last name?

namePart2 = 'Ashley' // is this a first or last name?

When we encounter a value in the wild, the fact that we know its type is String isn’t quite enough to know whether it was supposed to be from the first choice of NamePart strings, or the second choice.

For that, we need to somehow label the value — with a “tag.” The value of interest will not consist of just the string on its own, but also be accompanied by a symbolic identifier that allows the developer to know unambiguously which of the constituent types it belongs to:

namePart1 = <LastName 'Wilson'>

namePart2 = <FirstName 'Ashley'>

Ah, now we know exactly what roles 'Ashley' and 'Wilson' each play.
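In plain JavaScript, one simple way to approximate a tagged value is an object that carries the tag alongside the data. The sketch below assumes `tag` and `value` as field names and `FirstName` / `LastName` as constructor functions. All of these are our own illustrative choices rather than an established API.

```javascript
// Hypothetical constructors: each pairs a string with a tag naming its role.
const FirstName = (value) => ({ tag: 'FirstName', value })
const LastName = (value) => ({ tag: 'LastName', value })

const namePart1 = LastName('Wilson')
const namePart2 = FirstName('Ashley')

// The underlying data is still just a string...
console.log(typeof namePart1.value) // string
// ...but the tag records which side of the sum it came from.
console.log(namePart1.tag) // LastName
console.log(namePart2.tag) // FirstName
```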

“This overall operation is called disjoint union. Basically, it can be summarized as ‘a union, but each element remembers what set it came from’.” — Waleed Khan, Union vs Sum Types

If you think about it, the combination of a tag and some data is itself a product, meaning we can reframe our example sum type as a sum of products, where every product includes a tag:

sum type NamePart = (FirstName & String) | (LastName & String)

Each tag is a unit type, that is, a type with exactly one value, so it doesn’t affect the size of the sum type. NamePart now has a size of (1 × Infinity) + (1 × Infinity), equivalent to its earlier size of Infinity + Infinity.
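Infinite sizes make this arithmetic hard to observe, so it can help to swap String for the two-valued Boolean. The following sketch uses hypothetical tags `Left` and `Right` and a made-up `tag` helper. It compares a plain union of Boolean with itself against a tagged sum of Boolean with itself.

```javascript
const booleans = [true, false]

// Plain (untagged) union of Boolean with Boolean: duplicates collapse,
// because a Set keeps each distinct value only once.
const untaggedUnion = new Set([...booleans, ...booleans])

// Hypothetical helper: returns a function that pairs a value with the given tag.
const tag = (name) => (value) => ({ tag: name, value })

// Tagged sum of Boolean with Boolean: each side keeps its own copies.
const taggedSum = [...booleans.map(tag('Left')), ...booleans.map(tag('Right'))]

console.log(untaggedUnion.size) // 2
console.log(taggedSum.length) // 4, i.e. (1 × 2) + (1 × 2)
```

The tags contribute only a factor of 1 each, yet they are what keep the two sides from merging into one.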

With tags, we can now discriminate between otherwise identical values of a given type; the tag is a minimal form of metadata. There is a preponderance of synonyms for the concept of sum types: discriminated unions, tagged unions, disjoint unions, choice types, variants, etc.

Sum types and product types, both being composite, are collectively known as algebraic data types.