So you’ve started writing type-safe database queries using Beam or another Haskell database library. No more writing raw SQL strings into your code and praying that your integration tests don’t let things slip through the cracks. But maybe having just your queries be type-safe isn’t good enough for you. Long-lived application are going to have to change their database schema eventually. Can you manage those changes in Haskell as well? Make them type-safe and reliable? As it turns out, you can, with Beam’s built-in migration functionality.

If you don’t know about it, Beam is one of a number of Haskell libraries for providing type-safe, composable database access. Beam also provides migration functionality for managing your database schema. Unfortunately, there’s basically no documentation about how to use it. Rest assured, it is completely functional, if difficult to start using.

This post collects everything I learned while trying to get up and running with Beam’s migrations, and should help you get started as well.

There are actually a lot of different Haskell tools for managing migrations directly in pure Haskell, from the most barebones like postgresql-simple-migration and drifter up to the more heavyweight migration frameworks provided by Beam, Squeal, and Hasql. Some of these options will only work for specific DB backends, predominantly Postgres. If you’re looking to make a decision between the various options, this post won’t go into detailed comparisons between any of these options, but should give you enough information about Beam’s migrations to allow you to make your own comparison.

Before we dive into exploring the migration framework proper, we’ll need an example schema that we’re interested in. Let’s assume we’re working with Postgres. We’ll need the beam-core , beam-migrate , and beam-postgres packages. Since we’re working with migrations, we’ll build up the schema step by step.

Let’s say that we’re trying to set up a database to store orders for a flower shop. For now, we only care about storing information about each flower we stock, and what quantity of each flower is in each order. So we’ll have a table for flowers , a table for orders , and a many-to-many join table between orders and flowers.

When writing migrations in Beam, you first need Haskell datatypes for your entities and database. So let’s write those.

{-# LANGUAGE DeriveGeneric #-} {-# LANGUAGE DeriveAnyClass #-} {-# LANGUAGE TypeFamilies #-} import Data.Int ( Int32 , Int64 ) import Data.Text ( Text ) import Data.Scientific ( Scientific ) import Data.Time ( UTCTime ) import Database.Beam data FlowerT f = Flower { flowerID :: Columnar f Int32 , flowerName :: Columnar f Text , flowerPrice :: Columnar f Scientific } deriving ( Generic , Beamable ) data OrderT f = Order { orderID :: Columnar f Int32 , orderTime :: Columnar f UTCTime } deriving ( Generic , Beamable ) data LineItemT f = LineItem { lineItemOrderID :: PrimaryKey OrderT f , lineItemFlowerID :: PrimaryKey FlowerT f , lineItemQuantity :: Columnar f Int64 } deriving ( Generic , Beamable ) data FlowerDB f = FlowerDB { dbFlowers :: f ( TableEntity FlowerT ) f ( , dbOrders :: f ( TableEntity OrderT ) f ( , dbLineItems :: f ( TableEntity LineItemT ) f ( } deriving ( Generic , Database be) be) instance Table FlowerT where data PrimaryKey FlowerT f = FlowerID ( Columnar f Int32 ) deriving ( Generic , Beamable ) = FlowerID . flowerID primaryKeyflowerID instance Table OrderT where data PrimaryKey OrderT f = OrderID ( Columnar f Int32 ) deriving ( Generic , Beamable ) = OrderID . orderID primaryKeyorderID instance Table LineItemT where data PrimaryKey LineItemT f = LineItemID ( PrimaryKey OrderT f) ( PrimaryKey FlowerT f) f) (f) deriving ( Generic , Beamable ) = LineItemID primaryKey <$> lineItemOrderID lineItemOrderID <*> lineItemFlowerID lineItemFlowerID

Unfortunately, Beam’s migration framework seems to require these types to be written first, because they need to be used in the migration code as well. It would be nice to be able to autogenerate the datatype code as well, but alas.

Once we have this, we can write our first migration. All the major functions you’ll need to actually write migrations are in Database.Beam.Migrate.SQL.Tables .

{-# LANGUAGE OverloadedStrings #-} import Database.Beam.Migrate import Database.Beam.Backend import Database.Beam.Postgres -- It's unfortunate that we have to define this ourselves. utctime :: BeamSqlBackend be => DataType be UTCTime bebe = DataType (timestampType Nothing False ) utctime(timestampType initialSetup :: Migration Postgres ( CheckedDatabaseSettings Postgres FlowerDB ) = FlowerDB initialSetup <$> (createTable "flowers" $ Flower (createTable = field "id" { flowerIDfield int notNull unique = field "name" , flowerNamefield Just 20 )) notNull unique (nationalVarchar ()) notNull unique = field "price" , flowerPricefield Just ( 20 , Just ( 2 )))) notNull (numeric ()))) notNull }) <*> (createTable "orders" $ Order (createTable = field "id" { orderIDfield int notNull unique = field "order_time" , orderTimefield utctime notNull (defaultTo_ (cast_ currentTimestamp_ utctime)) }) <*> (createTable "line_items" $ LineItem (createTable = OrderID $ { lineItemOrderID "order_id" int notNull fieldint notNull = FlowerID $ , lineItemFlowerID "flower_id" int notNull fieldint notNull = field "quantity" , lineItemQuantityfield bigint notNull }) initialSetupStep :: MigrationSteps Postgres () ( CheckedDatabaseSettings Postgres FlowerDB ) = migrationStep initialSetupStepmigrationStep "initial_setup" ( const initialSetup) initialSetup)

Here we can see why we needed to write the datatypes before writing the migration itself; Beam uses the “shape” of the datatypes to store information about which field/table name corresponds to which attribute/record type. This pattern should be familiar if you’ve already used Beam’s query interface, where Beam uses the polymorphism of the datatype fields to store either literal data or expressions to compute said data. Here, it’s used to store table mapping information.

The various database types that can be used for columns are defined in Database.Beam.Query.DataTypes .

With this, we have enough to actually run this migration using the simple implementation Beam has defined in Database.Beam.Migrate.Simple .

import Database.Beam.Migrate.Simple import qualified Database.Beam.Postgres.Migrate as PG -- Beam's simple runner doesn't run destructive migrations -- by default, so we have to override that. allowDestructive :: Monad m => BringUpToDateHooks m = defaultUpToDateHooks allowDestructivedefaultUpToDateHooks = pure True } { runIrreversibleHook migrateDB :: Connection -> IO ( Maybe ( CheckedDatabaseSettings Postgres FlowerDB )) )) = runBeamPostgresDebug putStrLn conn $ migrateDB connrunBeamPostgresDebugconn bringUpToDateWithHooks allowDestructive PG.migrationBackend initialSetupStep

Once we have our working migration, we can also produce the DatabaseSettings that we need to actually run any queries. CheckedDatabaseSettings contains all the information we need, we just have to throw some of it away.

flowerDB :: DatabaseSettings Postgres FlowerDB = unCheckDatabase $ evaluateDatabase initialSetupStep flowerDBunCheckDatabaseevaluateDatabase initialSetupStep exampleQuery :: Connection -> IO [ FlowerT Identity ] = runBeamPostgres conn $ exampleQuery connrunBeamPostgres conn $ runSelectReturningList select (all_ (dbFlowers flowerDB))

If you’ve used Beam before, you might notice that we don’t need to handwrite DatabaseSettings to query the database, since we can just derive it from our migration. Which is… nice, except that we still have to define our custom datatypes before we can write our migrations.

We’ve got our initial database setup done, but the point of a migration framework is to manage changes over time. What kinds of things might change in our quaint little flower shop?

Perhaps the shop owner comes to us and says that they also want to display scientific names for their flowers alongside the common names. So we’ll need an extra column in our database, and we’ll want to rename our original name column to avoid confusion.

The idea is simple enough, but how do we implement it in Beam?

Once you start actually changing the database schema, Beam’s preferred way of representing those changes is as functions from one CheckedDatabaseSettings to another. To see what I mean, we’ll change the name of the original name column first.

import Control.Arrow changeCommonName :: CheckedDatabaseSettings Postgres FlowerDB -> Migration Postgres ( CheckedDatabaseSettings Postgres FlowerDB ) = do changeCommonName settings <- alterTable (dbFlowers settings) $ withColumnChangealterTable (dbFlowers settings) -> do \dbFlowers' <- renameColumnTo "common_name" (flowerName dbFlowers') commonNamerenameColumnTo(flowerName dbFlowers') pure $ dbFlowers' { flowerName = commonName } dbFlowers' { flowerNamecommonName } pure $ settings { dbFlowers = withColumnChange } settings { dbFlowerswithColumnChange } changeCommonNameStep :: MigrationSteps Postgres ( CheckedDatabaseSettings Postgres FlowerDB ) ( CheckedDatabaseSettings Postgres FlowerDB ) = migrationStep changeCommonNameStepmigrationStep "change_common_name" changeCommonName fullMigration :: MigrationSteps Postgres () ( CheckedDatabaseSettings Postgres FlowerDB ) = initialSetupStep >>> changeCommonNameStep fullMigrationinitialSetupStepchangeCommonNameStep

You can see that changeCommonName is the actual code, which we lift into a MigrationStep so that we can combine it with the setup step and run them. This is the basic pattern for each migration you’ll write: create a function that takes in an existing schema and returns a Migration , add the migration code in its body, and then lift and combine it.1

So far so good. Unfortunately, adding the scientific name column is a little harder. After all, our FlowerT datatype doesn’t have an attribute for it. We can’t add it to our existing datatype, since that would mess up our existing migrations; they wouldn’t have a database field to put into that attrtibute! We’ll need to create an entirely new datatype, which also means creating a new database type to hold our new table type.

data FlowerT' f = Flower' { flowerID' :: Columnar f Int32 , flowerCommonName' :: Columnar f Text , flowerScientificName' :: Columnar f Text , flowerPrice' :: Columnar f Scientific } deriving ( Generic , Beamable ) data FlowerDB' f = FlowerDB' { dbFlowers' :: f ( TableEntity FlowerT' ) f ( , dbOrders' :: f ( TableEntity OrderT ) f ( , dbLineItems' :: f ( TableEntity LineItemT ) f ( } deriving ( Generic , Database be) be) instance Table FlowerT' where data PrimaryKey FlowerT' f = FlowerID' ( Columnar f Int32 ) deriving ( Generic , Beamable ) = FlowerID' . flowerID primaryKeyflowerID

At this point you can probably see why Beam’s migration tooling can get very annoying; if we want to actually use the migration incrementally, we’ll have to keep creating new types, and keep updating which DB type is our most current one throughout our code. And we still haven’t written our migration yet.

retable :: Table tbl tbl => CheckedDatabaseEntity Postgres db ( TableEntity tbl) db (tbl) -> Migration Postgres ( CheckedDatabaseEntity Postgres db' ( TableEntity tbl)) db' (tbl)) = alterTable t pure retable talterTable t addScientificName :: CheckedDatabaseSettings Postgres FlowerDB -> Migration Postgres ( CheckedDatabaseSettings Postgres FlowerDB' ) = do addScientificName settings <- alterTable (dbFlowers settings) $ withColumnChangealterTable (dbFlowers settings) -> do \dbFlowers' <- addColumn (field "scientific_name" scientificNameaddColumn (field Just 20 )) unique) (nationalVarchar ()) unique) pure $ Flower' = flowerID dbFlowers' { flowerID'flowerID dbFlowers' = flowerName dbFlowers' , flowerCommonName'flowerName dbFlowers' = scientificName , flowerScientificName'scientificName = flowerPrice dbFlowers' , flowerPrice'flowerPrice dbFlowers' } <- retable (dbOrders settings) orders'retable (dbOrders settings) <- retable (dbLineItems settings) lineItems'retable (dbLineItems settings) pure $ FlowerDB' = withColumnChange { dbFlowers'withColumnChange = orders' , dbOrders'orders' = lineItems' , dbLineItems'lineItems' } addScientificNameStep :: MigrationSteps Postgres ( CheckedDatabaseSettings Postgres FlowerDB ) ( CheckedDatabaseSettings Postgres FlowerDB' ) = migrationStep addScientificNameStepmigrationStep "add_scientific_name" addScientificName fullMigration :: MigrationSteps Postgres () ( CheckedDatabaseSettings Postgres FlowerDB' ) = initialSetupStep fullMigrationinitialSetupStep >>> changeCommonNameStep changeCommonNameStep >>> addScientificNameStep addScientificNameStep

We can’t even reuse the tables from the original FlowerDB directly; we have to write our own retable function to get the type tetris to line up correctly. Frankly, once you have to, y’know, do actual migrations using Beam, it gets incredibly unwieldy.

Let’s say there’s one last change that we want to make. We want to reward loyal customers, so we want to give frequent customers a loyalty account, and add points for each order they make. For simplicity’s sake, we’ll just say there’s a single point per order.

I’m going to skip the datatype updates necessary; they’re pretty mechanical and not very interesting. We’ll have two new datatypes, OrderT' and LoyaltyAccountT , and a new DB type, FlowerDB'' . (I never said Beam’s migration framework was good.) The only thing of note is that the loyalty ID needs to be of type PrimaryKey LoyaltyAccount (Nullable f) , since not every order made might have an associated loyalty account; after all, not everyone will sign up before buying something.

addLoyaltyAccounts :: CheckedDatabaseSettings Postgres FlowerDB' -> Migration Postgres ( CheckedDatabaseSettings Postgres FlowerDB'' ) = do addLoyaltyAccounts settings <- alterTable (dbOrders' settings) $ withLoyaltyIDalterTable (dbOrders' settings) -> do \dbOrders' <- addColumn (field "loyalty_id" (maybeType int)) loyaltyIDaddColumn (field(maybeType int)) pure $ Order' = orderID dbOrders' { orderID'orderID dbOrders' = LoyaltyAccountID loyaltyID , orderLoyaltyID'loyaltyID = orderTime dbOrders' , orderTime'orderTime dbOrders' } <- createTable "loyalty_accounts" $ loyaltyAccountscreateTable LoyaltyAccount = field "id" { loyaltyAccountIDfield int notNull unique = field "customer_name" , loyaltyAccountCustomerNamefield Just 30 )) notNull (nationalVarchar ()) notNull } <- retable (dbFlowers' settings) flowers''retable (dbFlowers' settings) <- retable (dbLineItems' settings) lineItems''retable (dbLineItems' settings) pure $ FlowerDB'' = flowers'' { dbFlowers''flowers'' = withLoyaltyID , dbOrders''withLoyaltyID = lineItems'' , dbLineItems''lineItems'' = loyaltyAccounts , dbLoyaltyAccounts''loyaltyAccounts } addLoyaltyAccountsStep :: MigrationSteps Postgres ( CheckedDatabaseSettings Postgres FlowerDB' ) ( CheckedDatabaseSettings Postgres FlowerDB'' ) = migrationStep addLoyaltyAccountsStepmigrationStep "add_loyalty_accounts" addLoyaltyAccounts fullMigration :: MigrationSteps Postgres () ( CheckedDatabaseSettings Postgres FlowerDB'' ) = initialSetupStep fullMigrationinitialSetupStep >>> changeCommonNameStep changeCommonNameStep >>> addScientificNameStep addScientificNameStep >>> addLoyaltyAccountsStep addLoyaltyAccountsStep

And with that, we have a few examples of doing the most common schema updates. But phew, these migrations are rather a mouthful for how little we’re actually doing in them.

If you’re currently evaluating different options for managing your database schema, hopefully this post has helped you form your own opinion of Beam. My own personal opinion is that Beam’s current migration tooling isn’t very good, and I’ll probably either stick to doing migrations manually or look into some other tools.

The full migration code in this post is available here, for reference.

Found this useful? Got a comment to make? Talk to me!