Writing a domain specific language for Sharc assembly in Haskell

Want to discuss the content of this article? Join the conversation on Twitter!

Domain Specific Languages in Haskell

I made a presentation at one of our local meetup groups earlier this fall. Here’s my (text-based) slides from that presentation. It was an interactive presentation, so most of the code examples are meant to be evaluated when following along.

DOMAIN SPECIFIC LANGUAGE (DSL)

a computer programming language of limited expressiveness focussed on a particular domain

– Martin Fowler

A very good paper on Domain Specific Languages in Haskell is: Functional Programming for Domain-Specific Languages by Jeremy Gibbons

More reading material on Domain Specific Languages can be found on haskell.org:

A LITTLE BACKGROUND

Two main approaches to implementing DSLs

Stand-alone language Custom syntax, that can be tailored for the domain

Requires building parser, compiler etc -> Significant work Embedded DSL Leverages syntax and abstractions from a host language

The DSL is a library defining the domain specific semantic

Blurres the boundary between the host and DSL

We’re going to look at a specific form called deeply embedded DSL so called because terms in the DSL are implemented simply to construct and abstract syntax tree (AST).

Using Haskell’s Algebraic Datatypes (ADT) we can create ASTs like so

data DExp = LitInt Int | Add DExp DExp | Sub DExp DExp | Mul DExp DExp | LitBool Bool deriving Show

A value of the DExp type can be created with one of the above constructor functions, for example the constructor function LitInt takes an Integer as argument.

LitInt 5 : t LitInt -- => LitInt :: Int -> DExp Add ( LitInt 4 ) ( LitInt 12 ) : t LitBool True

Use functions and/or operators for construction. For example we can implement the Num type class.

: i Num -- => class Num a where ( + ) :: a -> a -> a ( * ) :: a -> a -> a ( - ) :: a -> a -> a negate :: a -> a abs :: a -> a signum :: a -> a fromInteger :: Integer -> a

instance Num DExp where a + b = Add a b a - b = Sub a b a * b = Mul a b fromInteger i = LitInt $ fromInteger i

-- Let's try out some expressions 1 + 4 * 9 :: DExp Mul 4 5

Add a couple of functions.

sqr x = x * x conjugate a b = sqr a - sqr b -- And try them out sqr 4 sqr 4 :: DExp sqr $ 1 + 4 * 9 :: DExp conjugate ( 9 * 3 ) ( 4 - 1 ) :: DExp

This DSL is unityped - everything is DExp , which means its possible to construct illegal ASTs.

Mul ( LitBool True ) 1 -- Bad

INTERPRET ALL THE THINGS

Pattern match on the different constructor functions and voila

eval :: DExp -> Int eval ( LitInt a ) = a eval ( Add a b ) = ( eval a ) + ( eval b ) eval ( Sub a b ) = ( eval a ) - ( eval b ) eval ( Mul a b ) = ( eval a ) * ( eval b )

-- Try these out yourself in the REPL eval ( LitInt 4 ) eval ( 2 + 6 * 4 :: DExp ) eval ( conjugate ( 9 * 7 ) ( 4 - 2 ) :: DExp )

But you can just as easily

COMPILE ALL THE THINGS

data Asm = Push Int | StackAdd | StackSub | StackMul deriving Show genByteCode :: DExp -> [ Asm ] genByteCode ( LitInt a ) = [ Push a ] genByteCode ( Add a b ) = ( genByteCode a ) ++ ( genByteCode b ) ++ [ StackAdd ] genByteCode ( Sub a b ) = ( genByteCode a ) ++ ( genByteCode b ) ++ [ StackSub ] genByteCode ( Mul a b ) = ( genByteCode a ) ++ ( genByteCode b ) ++ [ StackMul ]

-- Examples genByteCode ( 12 + sqr 4 - sqr 2 * 3 ) genByteCode $ conjugate ( 9 * 7 ) ( 4 - 2 )

SWITCHING GEARS A LITTLE BIT

Has anybody here done any assembly language coding?

I currently do consulting work for a client, working with embedded systems. Here I have been introduced to the SHARC processor and its assembly language.

Registers

R0-R15 Fixed point (Integer)

F0-F15 Floating point

Algebraic notation

R1 = R2 + R3 ; F2 = F0 * F1 ; F9 = MIN ( F2 , F14 ); F3 = F2 - F1 ;

With a DSL implementation of the SHARC assembly language you could explore interesting things like:

Faster feedback-loop, since you can run (interpret) code on your machine directly and not run on actual hardware. Use Haskell to create abstractions on top of the assembly, advance macros. Quickcheck testing and unit testing - Isolation testing an assembly function is a PITA.

data Rx = R0 | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 | R9 | R10 | R11 | R12 | R13 | R14 | R15 deriving ( Show , Eq , Ord ) data Fx = F0 | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10 | F11 | F12 | F13 | F14 | F15 deriving ( Show , Eq , Ord )

The SHARC asm is strongly typed, you can’t mix Rx and Fx registers. Illegal examples:

R2 = R0 + F1 F0 = R0 + R1

Both the expression on the rhs must be correctly typed, as well as the assignment.

We want the AST to only contain legal constructions. This is possible using Generalized Algebraic Data Types (GADTs).

Expr is now a polymorpic type, but only provides constructor functions for Expr Integer and Expr Float and not for other instances of the polymorphic type Expr a .

data Expr :: * -> * where LiteralInt :: Integer -> Expr Integer AddR :: Rx -> Rx -> Expr Integer LiteralFloat :: Float -> Expr Float AddF :: Fx -> Fx -> Expr Float

: t LiteralInt 3 : t LiteralFloat 4.9 : t AddF F0 F1 : t AddF F0 R1

THE SHARC ASM DSL

Here’s the whole shebang to play around with. Happy DSL hacking!

{-#LANGUAGE GADTs, KindSignatures, FlexibleInstances, FunctionalDependencies, NoMonomorphismRestriction #-} {-# LANGUAGE CPP #-} import Control.Monad.State ( State , execState , get , put , modify ) import qualified Data.Map as M import qualified Data.Bits as B infix 4 <~ infixl 6 +. infixl 6 -. infixl 6 *. data Rx = R0 | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 | R9 | R10 | R11 | R12 | R13 | R14 | R15 deriving ( Show , Eq , Ord ) data Fx = F0 | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10 | F11 | F12 | F13 | F14 | F15 deriving ( Show , Eq , Ord ) data Ix = I0 | I1 | I2 | I3 | I4 | I5 | I6 | I7 | I8 | I9 | I10 | I11 | I12 | I13 | I14 | I15 deriving ( Show , Eq , Ord ) data Mx = M0 | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 | M11 | M12 | M13 | M14 | M15 deriving ( Show , Eq , Ord ) data Cond = Eq | Ne | Gt | Lt | Ge | Le deriving ( Show , Eq , Ord ) data Expr :: * -> * where LitInt :: Integer -> Expr Integer RegR :: Rx -> Expr Integer AddR :: Rx -> Rx -> Expr Integer SubR :: Rx -> Rx -> Expr Integer MulR :: Rx -> Rx -> Expr Integer MinR :: Rx -> Rx -> Expr Integer MaxR :: Rx -> Rx -> Expr Integer AndR :: Rx -> Rx -> Expr Integer OrR :: Rx -> Rx -> Expr Integer XorR :: Rx -> Rx -> Expr Integer NegR :: Rx -> Expr Integer NotR :: Rx -> Expr Integer AbsR :: Rx -> Expr Integer PassR :: Rx -> Expr Integer IncR :: Rx -> Expr Integer DecR :: Rx -> Expr Integer LitFloat :: Float -> Expr Float RegF :: Fx -> Expr Float AddF :: Fx -> Fx -> Expr Float SubF :: Fx -> Fx -> Expr Float MulF :: Fx -> Fx -> Expr Float MinF :: Fx -> Fx -> Expr Float MaxF :: Fx -> Fx -> Expr Float NegF :: Fx -> Expr Float AbsF :: Fx -> Expr Float PassF :: Fx -> Expr Float RegI :: Ix -> Expr Integer RegM :: Mx -> Expr Integer -- By completely separating AddF from AddR we only allow the type safe -- constructions instance Show ( Expr Integer ) where show s' = case s' of LitInt n -> show n RegR n -> show n RegI n -> show n RegM n -> show n AddR a b -> show a ++ "+" ++ show b SubR a b -> show a ++ "-" ++ show b MulR a b -> show a ++ "*" ++ show b MinR a b -> "MIN(" ++ show a ++ "," ++ show b ++ ")" MaxR a b -> "MAX(" ++ show a ++ "," ++ show b ++ ")" AndR a b -> show a ++ " AND " ++ show b OrR a b -> show a ++ " OR " ++ show b XorR a b -> show a ++ " XOR " ++ show b NegR a -> "-" ++ show a NotR a -> "NOT " ++ show a AbsR a -> "ABS " ++ show a PassR a -> "PASS " ++ show a IncR a -> show a ++ "+1" DecR a -> show a ++ "-1" instance Show ( Expr Float ) where show s' = case s' of LitFloat n -> show n RegF n -> show n AddF a b -> show a ++ "+" ++ show b SubF a b -> show a ++ "-" ++ show b MulF a b -> show a ++ "*" ++ show b MinF a b -> "MIN(" ++ show a ++ "," ++ show b ++ ")" MaxF a b -> "MAX(" ++ show a ++ "," ++ show b ++ ")" NegF a -> "-" ++ show a AbsF a -> "ABS " ++ show a PassF a -> "PASS " ++ show a -- Custom implementation of Show for the Expr allows for printing the -- AST in the form the `real` assembly lang would look like -- > AddR R9 R10 -- > MinF F0 F1 data Stmt where AssignR :: Rx -> Expr Integer -> Stmt AssignF :: Fx -> Expr Float -> Stmt AssignI :: Ix -> Expr Integer -> Stmt AssignM :: Mx -> Expr Integer -> Stmt ModifyReg :: Ix -> Mx -> Stmt ModifyIm :: Ix -> Integer -> Stmt Para :: SharcProgram -> Stmt -- Statements enforces the same restrictions, only Rx registers can be -- assigned an Integer Expr and so on instance Show Stmt where show s' = case s' of AssignR r e -> show r ++ "=" ++ show e AssignF r e -> show r ++ "=" ++ show e AssignI r e -> show r ++ "=" ++ show e AssignM r e -> show r ++ "=" ++ show e ModifyReg i m -> "MODIFY(" ++ show i ++ "," ++ show m ++ ")" ModifyIm i m -> "MODIFY(" ++ show i ++ "," ++ show m ++ ")" Para l -> "[" ++ show l ++ "]" -- Previous examples in AST form -- > AssignR R1 $ AddR R2 R3 -- > AssignF F2 $ MulF F0 F1 -- > AssignF F9 $ MinF F2 F14 -- > AssignF F3 $ SubF F2 F1 -- Illegal examples -- AssignR R2 $ AddR R0 F1 -- AssignF F0 $ AddR R0 R1 -- FROM NOW ON ITS JUST ABOUT SYNTAX AND CONVENIENCE -- class ALU a b | a -> b where add :: a -> a -> b sub :: a -> a -> b mul :: a -> a -> b min_ :: a -> a -> b max_ :: a -> a -> b neg :: a -> b abs_ :: a -> b pass :: a -> b -- Polymorphic type class to be able to use the same operators on both -- Rx and Fx instance ALU Rx ( Expr Integer ) where add m n = AddR m n sub m n = SubR m n mul m n = MulR m n min_ m n = MinR m n max_ m n = MaxR m n neg m = NegR m abs_ m = AbsR m pass m = PassR m instance ALU Fx ( Expr Float ) where add m n = AddF m n sub m n = SubF m n mul m n = MulF m n min_ m n = MinF m n max_ m n = MaxF m n neg m = NegF m abs_ m = AbsF m pass m = PassF m class Asgn l r where ( <~ ) :: l -> r -> Sharc () class Mdfy m where modify_ :: Ix -> m -> Sharc () instance Asgn Rx ( Expr Integer ) where ( <~ ) r e = addStmt $ AssignR r e instance Asgn Fx ( Expr Float ) where ( <~ ) r e = addStmt $ AssignF r e instance Asgn Rx Rx where ( <~ ) r s = addStmt $ AssignR r $ RegR s instance Asgn Fx Fx where ( <~ ) r s = addStmt $ AssignF r $ RegF s instance Asgn Ix Ix where ( <~ ) r s = addStmt $ AssignI r $ RegI s instance Asgn Mx Mx where ( <~ ) r s = addStmt $ AssignM r $ RegM s instance Asgn Rx Integer where ( <~ ) r i = addStmt $ AssignR r $ LitInt i instance Asgn Fx Float where ( <~ ) r f = addStmt $ AssignF r $ LitFloat f instance Asgn Ix Integer where ( <~ ) r i = addStmt $ AssignI r $ LitInt i instance Asgn Mx Integer where ( <~ ) r i = addStmt $ AssignM r $ LitInt i instance Mdfy Mx where modify_ i m = addStmt $ ModifyReg i m instance Mdfy Integer where modify_ i m = addStmt $ ModifyIm i m and_ x y = AndR x y or_ x y = OrR x y xor_ x y = XorR x y not_ x = NotR x inc x = IncR x dec x = DecR x ( +. ) = add ( -. ) = sub ( *. ) = mul type SharcProgram = [ Stmt ] type Sharc = State ( Int , SharcProgram ) addStmt :: Stmt -> Sharc () addStmt a = modify $ \ ( n , p ) -> ( n , p ++ [ a ]) assemble :: Sharc () -> SharcProgram assemble program = snd $ execState program ( 0 , [] ) # define APRIME 11 # define SIX 6 fun :: Sharc () fun = do R1 <~ ( APRIME :: Integer ) R2 <~ ( 6 :: Integer ) R0 <~ R1 R2 <~ R1 +. R2 R3 <~ R1 *. R2 R5 <~ min_ R1 R2 R6 <~ max_ R1 R2 R4 <~ R1 R7 <~ neg R4 R8 <~ abs_ R4 R9 <~ R1 ` and_ ` R2 R9 <~ and_ R1 R2 -- State monad jiggery pokery to get the nice do syntax -- With the assemble function we run through the state and accumulate -- all instructions into a list, aka the COMPILER -- OR A SMALL INTERPRETER (only integer parts now :s) type Env = M . Map Rx Integer initialEnv = M . fromList [( R0 , 0 ), ( R1 , 0 ), ( R2 , 0 )] eval :: Env -> Stmt -> Env eval env ( AssignR r expr ) = M . insert r ( evalExpr env expr ) env evalExpr :: Env -> Expr Integer -> Integer evalExpr _ ( LitInt i ) = i evalExpr env ( RegR r ) = env M .! r evalExpr env ( AddR r s ) = ( env M .! r ) + ( env M .! s ) evalExpr env ( SubR r s ) = ( env M .! r ) - ( env M .! s ) evalExpr env ( MulR r s ) = ( env M .! r ) * ( env M .! s ) evalExpr env ( MinR r s ) = min ( env M .! r ) ( env M .! s ) evalExpr env ( MaxR r s ) = max ( env M .! r ) ( env M .! s ) evalExpr env ( NegR r ) = - ( env M .! r ) evalExpr env ( AbsR r ) = abs $ env M .! r evalExpr env ( IncR r ) = ( env M .! r ) + 1 evalExpr env ( DecR r ) = ( env M .! r ) - 1 evalExpr env ( AndR r s ) = ( env M .! r ) B ..&. ( env M .! s ) evalExpr env ( OrR r s ) = ( env M .! r ) B ..|. ( env M .! s ) evalExpr env ( XorR r s ) = B . xor ( env M .! r ) ( env M .! s ) evalExpr env ( NotR r ) = B . complement ( env M .! r ) run = foldl eval initialEnv ( assemble fun )

If you have any comments or feedback, send me an email or send me a message on Twitter @lexicallyscoped.