The problem: you are writing the backend of an e-commerce site. Your Order data type references a Customer and a list of Products. Do you represent the list of Products as a list of product ids? Or do you use fully realized Products? When rendering an invoice, you may need full customer data. But for data analysis, you may only need ids. How do we encode all this in a flexible and safe manner?

This post will show you a neat trick using Type Families to safely (and sanely) define your Order data type to accommodate all varieties of referential data.

First, let’s start with some extensions we’ll need. This post is written in literate Haskell.

> {-# LANGUAGE DataKinds #-}
> {-# LANGUAGE TypeFamilies #-}
>
> module SaneRef where
> import Network.URI (parseURI, URI)

Let’s define some base, non-referential data types (they don’t reference any of our other types).

> data Customer = Customer { customerId   :: Int
>                          , customerName :: String
>                          }
> data Product = Product { productId    :: Int
>                        , productName  :: String
>                        , productPrice :: Double
>                        }

Naively, our Order will reference Customers and Products and may look like one of these:

> data NaiveOrder1 = NaiveOrder1 { naive1Customer :: Int
>                                , naive1Products :: [Int]
>                                }
> data NaiveOrder2 = NaiveOrder2 { naive2Customer :: Customer
>                                , naive2Products :: [Product]
>                                }

Which one should we use? Perhaps we should use both and just move on? While the business decision about over-engineering a problem like this is outside the scope of this post, I believe there is a small solution that highlights a very practical usage of Haskell’s Type Families.

The basic idea: if we can tag referential data with a phantom reference, then we can use a type-level function to map a reference and a data type to a new type. For example, we might say: “when we have a Customer referenced in the database, it will manifest as an Int; when we have a customer referenced in our REST API, it will manifest as a URI; but when we have no reference to a customer, it’s a fully realized Customer object.”

How do we accomplish this? With type families!

We start by defining a universe of references:

> data Reference = Database
>                | REST
>                | NoRef -- same as "fully realized"

The DataKinds extension will promote the values Database, REST, and NoRef to types. It will also promote the Reference type to a kind.
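To make the promotion concrete, here is a minimal standalone sketch, separate from this post's module. The class KnownReference and its method referenceName are hypothetical names introduced only for illustration. The class is indexed by a type of kind Reference, so instances can only be written for the three promoted constructors.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

import Data.Proxy (Proxy (..))

data Reference = Database | REST | NoRef

-- | Hypothetical class (not part of the post's module): 'r' is a type of
-- kind Reference, so it can only be 'Database, 'REST or 'NoRef.
-- The leading tick marks a promoted constructor used as a type.
class KnownReference (r :: Reference) where
  referenceName :: Proxy r -> String

instance KnownReference 'Database where
  referenceName _ = "Database"

instance KnownReference 'REST where
  referenceName _ = "REST"

instance KnownReference 'NoRef where
  referenceName _ = "NoRef"
```

Asking GHCi for `:kind 'Database` with DataKinds enabled should report the kind Reference, which is the promotion described above.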

Next we define a type-level function to map a reference and type to its reference type:

> type family RefType (a :: *) (r :: Reference) :: * where
>   RefType Customer Database = Int
>   RefType Customer REST     = URI
>   RefType Product  Database = Int
>   RefType Product  REST     = URI
>   RefType a        NoRef    = a

Our RefType type family states, essentially: “if you’re referencing a Customer or Product from the database, you’re going to get an Int; if you’re referencing it via the REST API, you’re going to get a URI; otherwise, you’re going to get fully realized data.”
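One way to see the family reduce is to give ordinary values a RefType signature and let GHC check them. The sketch below is standalone and makes two assumptions: it covers only Customer, and it swaps Network.URI's URI for a small stand-in newtype so that it depends on nothing but base.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)

-- Stand-in for Network.URI's URI (assumption, keeps the sketch base-only).
newtype URI = URI String
  deriving (Eq, Show)

data Customer = Customer
  { customerId   :: Int
  , customerName :: String
  } deriving (Eq, Show)

data Reference = Database | REST | NoRef

type family RefType (a :: Type) (r :: Reference) :: Type where
  RefType Customer 'Database = Int
  RefType Customer 'REST     = URI
  RefType a        'NoRef    = a

-- Each definition type-checks only because its signature reduces
-- to the type named in the comment.
customerKey :: RefType Customer 'Database -- reduces to Int
customerKey = 13

customerLink :: RefType Customer 'REST -- reduces to URI
customerLink = URI "https://v1/customer/13"

customerFull :: RefType Customer 'NoRef -- reduces to Customer
customerFull = Customer 13 "Aaron Levin"
```

Swapping any two of the right-hand sides, for example assigning the URI to customerKey, should be rejected at compile time.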

How do we use this? Let’s define our Order data type!

> data Order (r :: Reference) =
>   Order { orderId       :: Int
>         , orderCustomer :: RefType Customer r
>         , orderProducts :: [RefType Product r]
>         }

We’ve defined Order in a way that requires programmers to tag values with a source. Did you get this Order from the database? Well, then you may need to do an application-level join to get a Customer or Products. Did you already do the join in the DB? Well, then you’re going to get a fully realized, unreferenced Customer and Products!
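The application-level join mentioned above could look roughly like the following standalone sketch. The function resolveOrder is a hypothetical name, and the in-memory lists stand in for whatever lookup the real backend would perform; both are assumptions, not something the post defines. The REST reference is left out to keep the sketch short.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)

data Customer = Customer
  { customerId   :: Int
  , customerName :: String
  } deriving (Eq, Show)

data Product = Product
  { productId    :: Int
  , productName  :: String
  , productPrice :: Double
  } deriving (Eq, Show)

data Reference = Database | NoRef

type family RefType (a :: Type) (r :: Reference) :: Type where
  RefType Customer 'Database = Int
  RefType Product  'Database = Int
  RefType a        'NoRef    = a

data Order (r :: Reference) = Order
  { orderId       :: Int
  , orderCustomer :: RefType Customer r
  , orderProducts :: [RefType Product r]
  }

-- | Hypothetical join: replace every id in a database-tagged order with
-- the record it refers to. Returns Nothing if any id has no match.
resolveOrder :: [Customer] -> [Product] -> Order 'Database -> Maybe (Order 'NoRef)
resolveOrder customers products (Order oid cid pids) = do
  customer <- lookup cid [(customerId c, c) | c <- customers]
  items    <- traverse (\pid -> lookup pid [(productId p, p) | p <- products]) pids
  return (Order oid customer items)
```

The signature carries the point: the input must be tagged Database and the output is tagged NoRef, so a caller cannot pass an already realized order or forget that the result still needs resolving.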

If we start using this Order type we will start noticing how handy the Reference type tag becomes. For example:

> -- | an Order we fetched from the database
> orderFromDB :: Order Database
> orderFromDB = Order 1 13 [141, 5594, 21]
>
> -- | an Order we fetched via our REST api.
> orderFromREST :: Maybe (Order REST)
> orderFromREST = do
>   customerUri <- parseURI "https://v1/customer/13"
>   product1Uri <- parseURI "https://v1/product/141"
>   product2Uri <- parseURI "https://v1/product/5594"
>   product3Uri <- parseURI "https://v1/product/21"
>   return (Order 1 customerUri [product1Uri, product2Uri, product3Uri])
>
> -- | a fully realized Order
> orderFull :: Order NoRef
> orderFull = Order 1
>                   (Customer 13 "Aaron Levin")
>                   [ Product 141 "Anne Briggs - ST" 299.99
>                   , Product 5594 "Anne Briggs - The Time Has Come" 399.99
>                   , Product 21 "Anne Briggs - The Complete Topic Recordings" 29.99
>                   ]

Above we have three different definitions of the same Order, and in each case we can infer from the type what to expect from our referenced data (customer and products). The code is slightly more readable, and we’ve explicitly stated our assumptions about the form of our referential data.

While it’s possible to write a function without specifying the Reference, GHC will then force us to make no assumptions about the type of the referential data! For example:

> getOrderId :: Order r -> Int
> getOrderId (Order i _ _) = i
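The contrast is easier to see next to a function that does need the referenced data. In the standalone sketch below, orderTotal is a hypothetical helper that is not part of this post's module. It reads product prices, so its argument has to be pinned to NoRef, while getOrderId stays polymorphic in the reference.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)

data Customer = Customer
  { customerId   :: Int
  , customerName :: String
  }

data Product = Product
  { productId    :: Int
  , productName  :: String
  , productPrice :: Double
  }

data Reference = Database | NoRef

type family RefType (a :: Type) (r :: Reference) :: Type where
  RefType Customer 'Database = Int
  RefType Product  'Database = Int
  RefType a        'NoRef    = a

data Order (r :: Reference) = Order
  { orderId       :: Int
  , orderCustomer :: RefType Customer r
  , orderProducts :: [RefType Product r]
  }

-- Works for every reference: it only touches the non-referential field.
getOrderId :: Order r -> Int
getOrderId (Order i _ _) = i

-- Hypothetical helper: needs realized Products, so r is fixed to 'NoRef.
orderTotal :: Order 'NoRef -> Double
orderTotal = sum . map productPrice . orderProducts

-- Generalising the signature is rejected, because for an unknown r
-- GHC cannot reduce RefType Product r to Product:
--
-- badTotal :: Order r -> Double
-- badTotal = sum . map productPrice . orderProducts
```

An order tagged Database has to be resolved into an order tagged NoRef before orderTotal will accept it, and the compiler enforces that step.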

And that’s it! This is by no means a perfect solution, and it’s definitely not the only solution, but hopefully it inspires you to investigate more simple, practical usages of type-level programming!

Enjoy!