Many statically typed programming languages, such as Haskell or Swift, have a feature called “sum types”. Sum types (also known as tagged unions or variant types) allow a new type to be defined as the “union” of a set of other types and values, and allow users to “pattern match” on values to find out the underlying type.

But Go, the predominant language used at Pusher, does not support sum types. As former Haskell users, we miss them, so we want to describe some workarounds and alternative approaches, including the approach we recommend here at Pusher.

An example type definition in Haskell would be:

data Event = EventPublish PublishData | EventSubscribe SubscribeData

The Event type is the union of types PublishData and SubscribeData. If ("public-donuts", "2 for $5") is a PublishData, then EventPublish ("public-donuts", "2 for $5") is an Event.

A user of an Event value can use a “pattern match” to find out whether the value is a publish or a subscribe, and to do different things in each case. For example:

eventToString :: Event -> String
eventToString e = case e of
  EventPublish p -> publishDataToString p
  EventSubscribe s -> subscribeDataToString s

Sum types are useful in statically typed programming languages because they allow a function to accept a fixed set of types and handle each one differently. Examples include handlers for different types of event on a subscription or chan, traversals of recursive data structures, and expression evaluators.

Let’s see whether we can achieve something similar in Go…

Alternative 0: interface-and-switch

A common alternative, which we can call “interface-and-switch”, is to use an interface{} type for the sum type, and a type switch for the pattern match. Let’s see an example of this for a pub/sub message bus. The bus runs an event loop in its own goroutine, and clients interact with the event loop by writing different events onto its eventChan. Ideally the type of this eventChan would be a sum type over all types of events it can handle, but here we use an interface{}:

type subscribeEvent struct {
	messageChan chan<- string
}

type publishEvent struct {
	message string
}

type pubsubBus struct {
	subs      []chan<- string
	eventChan chan interface{} // Note the interface{} type for events
}

func (p *pubsubBus) Run() {
	for event := range p.eventChan {
		// This is not type safe because we might remove a handler, but
		// forget to remove the function which sends the now unhandled
		// event on the channel.
		switch e := event.(type) {
		case subscribeEvent:
			p.handleSubscribe(e)
		case publishEvent:
			p.handlePublish(e)
		default:
			panic(fmt.Sprint("Unknown event type"))
		}
	}
}

func (p *pubsubBus) handleSubscribe(subscribeEvent subscribeEvent) {
	p.subs = append(p.subs, subscribeEvent.messageChan)
}

func (p *pubsubBus) handlePublish(publishEvent publishEvent) {
	for _, sub := range p.subs {
		sub <- publishEvent.message
	}
}

A significant downside to the interface-and-switch approach is that it is not “type safe”. The handler might not handle all types that are passed in, leading to runtime errors. The consumer of the eventChan can only handle subscribeEvent and publishEvent values, but Go’s type system will allow the producer to pass in other values (e.g. 5 or true). As a result, the type switch needs a default case, which here panics at runtime. By contrast, notice that our Haskell pattern match needs no default case, because the type system verifies that no invalid values will enter the pattern match. Can we achieve similar type safety in Go?
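To make the problem concrete, here is a minimal sketch of how unexpected values slip past the compiler. The `describe` helper is hypothetical and not part of the bus above: it mirrors the type switch in `Run`, but returns an error instead of panicking so that the program keeps running.

```go
package main

import "fmt"

type subscribeEvent struct{ messageChan chan<- string }
type publishEvent struct{ message string }

// describe is a hypothetical stand-in for the type switch in Run.
// It returns an error for unknown types instead of panicking.
func describe(event interface{}) (string, error) {
	switch e := event.(type) {
	case subscribeEvent:
		return "subscribe", nil
	case publishEvent:
		return "publish: " + e.message, nil
	default:
		return "", fmt.Errorf("unknown event type %T", event)
	}
}

func main() {
	events := make(chan interface{}, 3)
	events <- publishEvent{message: "2 for $5"}
	events <- 5    // compiles: every value satisfies interface{}
	events <- true // compiles too
	close(events)

	for event := range events {
		name, err := describe(event)
		if err != nil {
			// Only discovered at runtime, never by the compiler.
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(name)
	}
}
```

Both `events <- 5` and `events <- true` are accepted by the compiler; the mistake only shows up when the default case is reached.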

Alternative 1: a “sum type” interface

To improve the situation, we can replace the interface{} with an interface with a single, dummy, private method. This means that there will be a type error if an unexpected type is used where our interface is expected:

+type event interface {
+	isEvent()
+}

 type subscribeEvent struct {
 	messageChan chan<- string
 }
+func (subscribeEvent) isEvent() {}

 type publishEvent struct {
 	message string
 }
+func (publishEvent) isEvent() {}

 type pubsubBus struct {
 	subs []chan<- string
-	eventChan chan interface{} // Note the interface{} type for events
+	eventChan chan event // Now only types which implement `event` can be sent
 }

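Runtime errors are still possible, however. For example, during a refactor a handler might be removed from the type switch while the type that implements the interface is left in place. Values of that type still compile as events, but now fall through to the default case.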
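The remaining gap can be sketched as follows. The `unsubscribeEvent` type and the `handle` function are hypothetical, introduced only for illustration: the type implements `event`, so it can be passed wherever an `event` is expected, yet the type switch has no case for it.

```go
package main

import "fmt"

type event interface{ isEvent() }

type subscribeEvent struct{ messageChan chan<- string }

func (subscribeEvent) isEvent() {}

type publishEvent struct{ message string }

func (publishEvent) isEvent() {}

// unsubscribeEvent is a hypothetical event whose handler was removed
// (or never written), while the type itself was left in place.
type unsubscribeEvent struct{ messageChan chan<- string }

func (unsubscribeEvent) isEvent() {}

// handle is a hypothetical stand-in for the type switch in Run.
// It reports whether the event matched a known case.
func handle(e event) bool {
	switch e.(type) {
	case subscribeEvent, publishEvent:
		return true
	default:
		// The compiler does not warn that unsubscribeEvent is missing.
		return false
	}
}

func main() {
	// handle(5) would no longer compile: int does not implement event.
	fmt.Println(handle(publishEvent{message: "2 for $5"}))
	fmt.Println(handle(unsubscribeEvent{}))
}
```

The dummy method rules out arbitrary values such as `5`, but the compiler still does not check that the switch is exhaustive.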

This approach is described in more detail by Jeremy Bowers.

Alternative 2: the visitor pattern

A fully type-safe way of solving this is to attach the handlers as “visit” methods to the types themselves. Then an interface is defined with a matching visit method, so we can call this in our event loop. This technique is known as the visitor pattern in OO languages.

 type event interface {
-	isEvent()
+	// Instances now implement the handler in this method
+	visit(*pubsubBus)
 }

 type subscribeEvent struct {
 	messageChan chan<- string
 }
-func (subscribeEvent) isEvent() {}
+func (sE subscribeEvent) visit(p *pubsubBus) {
+	p.handleSubscribe(sE)
+}

 type publishEvent struct {
 	message string
 }
-func (publishEvent) isEvent() {}
+func (pE publishEvent) visit(p *pubsubBus) {
+	p.handlePublish(pE)
+}

 type pubsubBus struct {
 	subs []chan<- string
-	eventChan chan event // Now only types which implement `event` can be sent
+	eventChan chan event
 }

 func (p *pubsubBus) Run() {
 	for event := range p.eventChan {
-		// This is not type safe because we might remove a handler, but
-		// forget to remove the function which sends the now unhandled
-		// event on the channel.
-		switch e := event.(type) {
-		case subscribeEvent:
-			p.subs = append(p.subs, e.messageChan)
-		case publishEvent:
-			for _, sub := range p.subs {
-				sub <- e.message
-			}
-		default:
-			panic(fmt.Sprint("Unknown event type"))
-		}
+		// Type switch is not required, so it's type-safe
+		event.visit(p)
 	}
 }

The disadvantage here is that the handlers are now coupled to the types. Ideally we want to define types independently of any particular way in which they are handled.

Alternative 3: decoupled visitor

We can decouple the handlers by having the visit() function take a struct of handler implementations. Each visit() implementation will call the corresponding handler in the struct.

 type event interface {
-	// Instances now implement the handler in this method
-	visit(*pubsubBus)
+	visit(v eventVisitor)
 }

+// The handlers for each event type are defined in instances of this struct
+type eventVisitor struct {
+	// Notice these can have different function signatures
+	visitSubscribe func(subscribeEvent)
+	visitPublish   func(publishEvent)
+}

 type subscribeEvent struct {
 	messageChan chan<- string
 }
-func (sE subscribeEvent) visit(p *pubsubBus) {
-	p.handleSubscribe(sE)
+// This is just boilerplate now; we do not need to provide a specific implementation
+func (sE subscribeEvent) visit(v eventVisitor) {
+	v.visitSubscribe(sE)
 }

 type publishEvent struct {
 	message string
 }
-func (pE publishEvent) visit(p *pubsubBus) {
-	p.handlePublish(pE)
+func (pE publishEvent) visit(v eventVisitor) {
+	v.visitPublish(pE)
 }

 type pubsubBus struct {
 	subs []chan<- string
 	eventChan chan event
 }

 func (p *pubsubBus) Run() {
 	for event := range p.eventChan {
-		// Type switch is not required, so it's type-safe
-		event.visit(p)
+		// Handler implementations are passed in here.
+		// Alternative handler implementations could be defined by creating an
+		// alternative version of this method.
+		event.visit(eventVisitor{
+			visitSubscribe: p.handleSubscribe,
+			visitPublish:   p.handlePublish,
+		})
 	}
 }

Because the handlers are decoupled from the types, it is easy to change the handler implementations. It also makes it possible for the handlers to have different type signatures. For example, here is an interpreter of arithmetic operations using this visitor pattern.

The disadvantage of the decoupled visitor is that there is now more boilerplate, particularly the visit method implementations.

Conclusion

It can be frustrating not to have sum types if you are used to them in other programming languages. Fortunately, with a slight change in the way of thinking, we can find decent alternatives. Unfortunately, there isn’t one technique that is objectively better than the others. It’s a tradeoff between type safety, complexity, and verbosity, and the “best” approach will depend on your use case and personal preferences.

That’s until Go 2 is released at least…

Thanks to Jim Fisher and James Lees for proofreading.