24 Days of Hackage: haskelldb

I conjecture that the majority of bugs that programmers encounter occur at the boundaries between 2 different systems. In Haskell, we do everything we can to encode our assumptions of behaviour via the type system, and perform strict validation when bringing outside information into our cosy type-safe bubble. However, this doesn’t help if you’ve requested the wrong data in the first place - all you can do is throw a runtime exception.

For me, this is a frequent problem with my choice of database access - postgresql-simple . Too often I have queried for foo, bar in some_table , but elsewhere assumed I was querying for bar, foo - the order doesn’t agree. In today’s post, we will look at haskelldb , which is a high level abstract interface to databases.

Fundamentally, haskelldb takes an encoding of your database, and then uses this information to construct SQL for you, and deals with mapping data too and from the format the database. Kicking off today’s example, again using the Helping Santa project from earlier posts, we begin by defining our database interface. First of all, we need to define the columns:

data PresentName = PresentName instance FieldTag PresentName where = const "name" fieldName presentName :: Attr PresentName String = mkAttr PresentName presentNamemkAttr data ChildName = ChildName instance FieldTag ChildName where = const "name" fieldName childName :: Attr ChildName String = mkAttr ChildName childNamemkAttr data ChildLocationLat = ChildLocationLat instance FieldTag ChildLocationLat where = const "loc_lat" fieldName childLocLat :: Attr ChildLocationLat Double = mkAttr ChildLocationLat childLocLatmkAttr data ChildLocationLong = ChildLocationLong instance FieldTag ChildLocationLong where = const "loc_long" fieldName childLocLong :: Attr ChildLocationLong Double = mkAttr ChildLocationLong childLocLongmkAttr

For each column we introduce a unique type, we introduce an Attr for it (with its underlying Expr type), and also add FieldTag instances to map to and from the database column name. Now that we are armed with these column definitions, we can define our tables:

child :: Table ( RecCons ChildName ( Expr String ) ( RecCons ChildLocationLat ( Expr Double ) ( RecCons ChildLocationLong ( Expr Double ) RecNil ))) ))) = baseTable "child" childbaseTable $ hdbMakeEntry ChildName hdbMakeEntry # hdbMakeEntry ChildLocationLat hdbMakeEntry # hdbMakeEntry ChildLocationLong hdbMakeEntry present :: Table ( RecCons PresentName ( Expr String ) ( RecCons ChildName ( Expr String ) RecNil )) )) = baseTable "present" $ hdbMakeEntry PresentName presentbaseTablehdbMakeEntry # hdbMakeEntry ChildName hdbMakeEntry

We use baseTable to define the “base” tables in the database - the tables that actually have data in, and use hdbMakeEntry to add columns to each table.

It’s quite a bit of up front typing I’ll admit, but you could automate this typing with a custom step in your build process (easy with Shake!), or use Template Haskell. I’ve opted to write it all by hand as I find this helps further develop my understanding, as I only have to understand one thing at a time.

Now that we’ve represented the schema in code, let’s have a look at leveraging this to write some queries.

allPresents :: Query ( Rel ( RecCons PresentName ( Expr String ) RecNil )) )) = do allPresents <- table present allPresentstable present $ presentName << allPresents ! presentName projectpresentNameallPresentspresentName presentsFor :: String -> Query ( Rel ( RecCons PresentName ( Expr String ) RecNil )) )) = do presentsFor name <- table child childrentable child <- table present presentstable present $ children ! childName .==. presents ! childName restrictchildrenchildNamepresentschildName $ children ! childName .==. constant name restrictchildrenchildNameconstant name $ presentName << presents ! presentName projectpresentNamepresentspresentName

If you’ve only written SQL before this might look a little alien - haskelldb provides an API that is very close to relational algebra. In the first example we’ve taken the output of the entire present table and returned all rows. In the second example we’ve selected from two tables - a Cartesian join - and filtered this join on a predicate to find just the rows about a specific child (the child’s name is given by input).

The beauty of this interface is that haskelldb is free to rewrite your query in a more optimal form. You can have a look at the generated SQL by using ppSql :

> ppSql (presentsFor "Little Bobby Tables" ) ppSql (presentsFor SELECT name2 as name name2 as name FROM ( SELECT name as name2, name as name2, name as name2 FROM present as T1 ) as T1 , present as) as ( SELECT name as name1 name as name1 FROM child as T1 ) as T2 child as) as WHERE name1 = 'Little Bobby Tables' AND name1 = name2 name1name1name2

Coupled with PostgreSQL’s stunning query planer, the end result is query with the same speed as one I’d write by hand!

haskelldb gives us a extremely expressive API for querying databases, and doesn’t sacrifice performance to do so. This allows you to consume and compose the API in a way that best suits the needs of your application. If you frequently select from the result of a join you can simply store part of this query separately, and compose it later.

One downside of haskelldb is that documentation can be a little bit terse, and the API is quite large, so it can be hard to get going sometimes. I highly recommend Chris Done’s series of blog posts on haskelldb if you’re interested in learning more.

You can contact me via email at ollie@ocharles.org.uk or tweet to me @acid2. I share almost all of my work at GitHub. This post is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.