September 13, 2018 at 05:40 Tags Go , Haskell

Algebraic data types (also known as variant types, sum types or discriminated unions) is a neat feature of some programming languages that lets us specify that a value might take one of several related types, and includes convenient syntax for pattern matching on these types at run-time. Here's a canonical binary tree example from Wikipedia, written in Haskell:

data Tree = Empty | Leaf Int | Node Tree Tree

A Tree can be either empty, or a leaf with one integer field, or a node with two Tree fields - its left and right children. Here's a function that finds the depth of such trees:

depth :: Tree -> Int depth Empty = 0 depth ( Leaf n ) = 1 depth ( Node l r ) = 1 + max ( depth l ) ( depth r )

It takes a Tree and returns an integer, and its workings are laid out by cases, depending on the run-time type of the parameter . If we pass anything that's not a Tree into depth , we'll get an error. Very concise and elegant!

"Why doesn't Go have algebraic data types" is a commonly asked question, and there's a FAQ entry about it. Even more details can be found in a Reddit AMA session the Go developers did in 2016 and in this proposal issue. The short version of the answer is that it isn't clear how this feature would interact with interfaces; how would one variant type discriminate between multiple interfaces that may have the same methods, etc. Indeed, it's a tricky issue and no one has found a satisfactory solution yet.

Another common answer is that Go already supports very similar functionality via a combination of interfaces and run-time type switches. In this post I want to show how it can be done, with some examples from synthetic to "real life". The full code for the samples in this post is available here.

Here's the same binary tree type, written in Go :

type Tree interface { } type Empty struct { } type Leaf struct { v int } type Node struct { left , right Tree }

Tree is an interface; Empty , Leaf and Node implement the interface. Note that the Tree interface is empty, so any type implements it, even the built-in string . We can easily restrict this by providing a method that would only be implemented by the interesting types:

type Tree interface { isTree () }

And we'll have to add empty imlementations for our types:

func ( _ Leaf ) isTree () {} func ( _ Node ) isTree () {} func ( _ Empty ) isTree () {}

Now only types with a isTree method will be allowed where the Tree interface is expected.

The depth function employs a type switch to do run-time type dispatching:

func depth ( t Tree ) int { switch nt := t .( type ) { case Empty : return 0 case Leaf : return 1 case Node : return 1 + max ( depth ( nt . left ), depth ( nt . right )) default : log . Fatalf ( "unexpected type %T" , nt ) } return 0 }

It's much more verbose than the Haskell pattern matching, but not less readable, and just as efficient. One obvious deficiency is having to manually reject other types in the default clause; Haskell's pattern matching does this automatically and also makes sure we've covered all cases.

A more realistic example Let's turn to a more realistic example that you could legitimately encounter in a Go program. Since I'm a compiler nerd this will be about trees again - abstract syntax trees. These data structures are a key layer in most compilers, produced by parsers and consumed by whatever comes after (type checking/inference, lowering, optimization, evaluation, etc). For this sample I wrote a complete evaluator for a simple calculator language with arithmetic operations, variables you can set and access, and if conditionals. The full code is in parser-evaluator.go; here I'll just focus on the AST nodes and how to "pattern match" them. This is the Node interface: type Node interface { isNode () String () string } It has the isNode guard method that all concrete node types will implement, along with a String method to make sure all nodes implement the fmt.Stringer interface for debugging. Here are a few sample concrete node types: type AssignStmt struct { Pos scanner . Position Name string Expr Node } type IfStmt struct { Pos scanner . Position Cond Node Then Node Else Node } type IntConstant struct { Pos scanner . Position Value int } If you look at the full code, all of these have a dummy isNode method, along with a String method. This is how pattern matching happens in the Eval function that evaluates an expression: func ( eval * Evaluator ) Eval ( n Node ) ( int , error ) { switch nt := n .( type ) { case IntConstant : return nt . Value , nil // ... // ... more cases // ... case AssignStmt : val , err := eval . Eval ( nt . Expr ) if err != nil { return 0 , err } eval . symbolTable [ nt . Name ] = val return 0 , nil case IfStmt : condVal , err := eval . Eval ( nt . Cond ) if err != nil { return 0 , err } // Lazily evaluate Then or Else based on the result of Cond. if condVal == 0 { elseVal , err := eval . Eval ( nt . Else ) if err != nil { return 0 , err } return elseVal , nil } else { thenVal , err := eval . Eval ( nt . Then ) if err != nil { return 0 , err } return thenVal , nil } } return 0 , fmt . Errorf ( "unmatched node %s" , n ) } Again, a type switch to cleanly discriminate between all the run-time types n could take.

An even more realistic example The evaluator shown in the previous section is something you could run into in real programs, and in fact you do. Go's own go/ast package uses the same idiom for its AST nodes . Looking in src/go/ast/ast.go in Go's source code, we see: // All statement nodes implement the Stmt interface. type Stmt interface { Node stmtNode () } Node is an embedded interface for some position-related methods, and stmtNode is a dummy method only Stmt types implement. Then, looking in src/go/types/stmt.go we find many examples of the by-now familiar type switch: // stmt typechecks statement s. func ( check * Checker ) stmt ( ctxt stmtContext , s ast . Stmt ) { // ... switch s := s .( type ) { case * ast . BadStmt , * ast . EmptyStmt : // ignore case * ast . IfStmt : check . openScope ( s , "if" ) defer check . closeScope () check . simpleStmt ( s . Init ) var x operand check . expr ( & x , s . Cond ) // ... // ... }