When I read Ben Hutchison’s OO/Imperative programmers: ‘Study Functional Programming or Be Ignorant’ I knew I had too much to say for the comments, so I figured I’d put in my 2 cents here.

Haskell is my go-to language, both for scripting, and for getting work done. This is not because of any particular allegiance to the language. Haskell and I have an open relationship, and the moment I find a language that out-Haskells Haskell, you can be sure I’ll move on.

Here I want to describe my favorite things about Haskell. You’ll note that they are all about the type system. I don’t feel too strongly one way or the other about laziness, or about monads (though I won’t give them up without first finding something to take their place). I don’t even particularly care that it’s a functional language, inasmuch as I can have these features in a non-functional environment.

Some of these features are already available elsewhere. This is wonderful! If you know of any examples of this, please tell me in the comments.

This is a list of my favorite things:

Separation of class and data definitions.

Haskell’s notion of classes is more like Java’s notion of interfaces. A class is a list of function prototypes, and any data type for which such functions can be defined is an instance of that class. One does not inherit from a parent class, but rather, one implements a class. It’s a weird distinction if you haven’t seen it before, but after I learned how to use it, I must say I prefer it.

The first example most people see is the Show class. Here is how it’s defined (to get this listing, I just asked ghci — the interactive GHC prompt — to give me the definition):



    Prelude> :info Show
    class Show a where
      showsPrec :: Int -> a -> ShowS
      show :: a -> String
      showList :: [a] -> ShowS
            -- Defined in GHC.Show



This says that any data type a which is an instance of Show provides functions with these signatures. (Edited: The first of these functions is used for implementing a Haskell idiom for fast string construction, while the last is related to a restriction in the unmodified Haskell 98 standard.)

When I define a new datatype, I can either ask Haskell to derive a Show instance for me automatically, or I can specify one myself:



    -- Show instance derived automatically
    data Car = Car { make :: String, year :: Int } deriving Show

    -- Show instance written by hand
    data Pet = Pet { name :: String, animal :: String, age :: Int }

    instance Show Pet where
      show p = "My pet is named " ++ name p ++ " and he is a " ++ animal p
            ++ " and he is " ++ show (age p) ++ " years old."



I understand that Google’s Go language has this notion of class (they call it interface), and that Scala provides this as well (they call it trait).
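To make the "separation" concrete, here is a minimal sketch. The class Describable and the Pet type are hypothetical names made up for illustration; the point is only that the class, the data type, and the instance are three separate declarations, so a type that already exists (like Bool) can be made an instance after the fact.

```haskell
import Data.List (intercalate)

-- A hypothetical class, declared on its own: just a list of signatures.
class Describable a where
  describe :: a -> String

-- Bool was defined in the Prelude long before this class existed;
-- it can still be made an instance without touching its definition.
instance Describable Bool where
  describe True  = "yes"
  describe False = "no"

-- A new type (also hypothetical), declared without mentioning the class...
data Pet = Pet { name :: String, age :: Int }

-- ...and opted in afterwards.
instance Describable Pet where
  describe p = name p ++ " (" ++ show (age p) ++ ")"

-- Instances can also be built from other instances.
instance Describable a => Describable [a] where
  describe = intercalate ", " . map describe
```

Loaded into ghci, describe works uniformly on a Bool, a Pet, or a list of either.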

Typed side-effects.

In Haskell, if a function wants to communicate with the environment, then the function’s type signature will document this fact. Want to print to the console? Open a socket? Read a file? Any of these actions will put your function into the IO monad, which is a red-flag to other programmers that the function communicates with the environment. When your application works with library code (and whose doesn’t?) this is a handy feature.

Haskell uses the monad design pattern as the underpinning of how it types side-effects. I don’t particularly care that it’s monads per se, I just like that there is something which statically documents which functions communicate with the environment.
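As a small illustration of what "statically documents" means here, consider the sketch below. The function names are made up for the example; the only thing to look at is the difference between the two type signatures.

```haskell
import Data.Char (toUpper)

-- Pure: nothing in the signature mentions IO, so this function
-- cannot print, read files, or open sockets.
shout :: String -> String
shout = map toUpper

-- Effectful: the IO in the result type tells every caller that
-- this function talks to the outside world.
shoutToConsole :: String -> IO ()
shoutToConsole msg = putStrLn (shout msg)

-- The compiler rejects an attempt to hide the effect behind a pure type:
--
--   sneaky :: String -> String
--   sneaky msg = putStrLn (shout msg)   -- type error: IO () is not String
```

A caller that only has shout in scope knows, without reading its body, that calling it has no effect on the environment.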

Why is this useful? One huge answer is concurrency. If your function has side-effects, it is not obviously thread-safe. If it has no side-effects, it is thread-safe. The monad design pattern provides a way to define application-specific notions of side-effects, which allows you to dial in the granularity on this as much as is appropriate for your application.

With respect to typed side-effects, a common Haskell idiom is to break up your program into different layers of state. For instance, in a web framework, you might have a “user-input” layer which is read-only, and on top of that a “logging” layer, and on top of that your application-specific stuff. (Each of these layers is a monad, or more precisely, a monad transformer.) Haskell allows you to statically track which functions rely on which layers, which is a useful thing if you want to call a function and be certain that it won’t modify some data out from under you.
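Here is a rough sketch of that layering idea, assuming the transformers package that ships with GHC. The layer names (InputLayer, LogLayer) and the functions are hypothetical, and a real web framework would look rather different; the sketch only shows how the type of a function says which layers it can touch.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (Reader, asks, runReader)
import Control.Monad.Trans.Writer.Strict (WriterT, runWriterT, tell)

-- Hypothetical request parameters supplied by the user.
type Params = [(String, String)]

-- Bottom layer: read-only access to the user input.
type InputLayer = Reader Params

-- Next layer up: logging on top of the read-only input.
type LogLayer = WriterT [String] InputLayer

-- Lives in InputLayer only, so it can read parameters but cannot log.
lookupParam :: String -> InputLayer (Maybe String)
lookupParam key = asks (lookup key)

-- Lives in LogLayer, so it may log and (via lift) read the input.
greet :: LogLayer String
greet = do
  name <- lift (lookupParam "name")
  tell ["looked up name"]
  return (maybe "Hello, stranger" ("Hello, " ++) name)

-- Peel the layers off from the outside in.
runApp :: Params -> LogLayer a -> (a, [String])
runApp params app = runReader (runWriterT app) params
```

Because lookupParam has type InputLayer (Maybe String), anyone calling it can be certain it does not write to the log, let alone modify the parameters.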

If you’re new to Haskell and monads, in my humble opinion this idiom is the real reason to give a damn about monads. But that’s just my perspective.

(And it’s certainly not obvious from the beginning, but a lot of bugs can be eliminated this way.)

Type safe macros.

No language is completely free from the occasional boilerplate. One way around this is to use macros.

In C, macros can be very tricky. The preprocessor takes all instances of a macro, replaces each with the corresponding text, then hands the result off to the compiler. If it turns out that you used the macro incorrectly, the compiler isn’t really there to help you out: after all, C macros are all about find-and-replace.

Haskell’s macro system is called Template Haskell. Macros written in Template Haskell are actually written in Haskell syntax. The compiler then takes this code, compiles it like it would any other Haskell, and then uses it to expand your usage of the macros. Everything is typed the whole way through, and if there are errors, the compiler can tell you where they are and why (with its usual level of precision, for better or worse).

When I recently ran into a scenario where (for some very long-winded reason) I had to define 20 essentially-identical datatypes, then give them all essentially the same class instances, I was able to quickly whip up some Template Haskell to do all the lifting for me. When I realized I needed to modify those class instances, it was as simple as modifying the Template Haskell that was generating them.
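This is not the code from that project, but a sketch in the same spirit: one splice that generates several essentially identical newtypes, each with the same derived instances. The type names are invented for illustration, and the sketch assumes a reasonably recent GHC, since the template-haskell API has shifted between releases.

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH

-- For each base name, generate:
--   newtype <Name> = <Name> Int deriving (Show, Eq)
-- The names below are hypothetical.
$(mapM
    (\base ->
        let name = mkName base
        in newtypeD (cxt []) name [] Nothing
             (normalC name
                [bangType (bang noSourceUnpackedness noSourceStrictness)
                          (conT ''Int)])
             [derivClause Nothing [conT ''Show, conT ''Eq]])
    ["Meters", "Seconds", "Kilograms"])
```

After the splice, Meters, Seconds and Kilograms are ordinary types as far as the rest of the module is concerned. Changing what gets derived for all of them means editing one line inside the splice.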

This is how macro-ing should be. Instead of a deal with the devil, it should be safe enough to be accepted practice.

Quasi-quoting.

This is one of the many fine ways to embed a language in Haskell. Here’s a typical use case: you’re writing a library and the most natural way for a developer to specify some options is in a simple configuration language. You could implement a function String -> MyLibOptions, but if they have any typos in their configuration string, you won’t be able to catch them until run-time. If the configuration isn’t known until run-time that’s fine, but if the configuration is known at compile-time, you’d like the error to be caught at compile-time. (I need to mention that quasi-quoting is able to mix run-time and compile-time data; I’m just simplifying things to describe this use case.)

Quasi-quoting to the rescue. I recently gave an example of Haskell’s quasi-quoting abilities in a post about how it can be used to provide an injection-proof form of string interpolation (via Interpolique). One of my favorite applications is Michael Snoyman’s Hamlet, a type-safe HTML generation library.

(If you’d like to see what it looks like to implement a quasi-quoter in Haskell, I’ve got some code up on github that demonstrates this in the case of string interpolation, as mentioned above.)
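For the configuration use case described earlier, a quasi-quoter might look roughly like the sketch below. Everything here is hypothetical: the MyLibOptions fields, the tiny key=value syntax, and the names parseOptions and myLibOptions are all made up for the example. The idea is simply that the same parser you would run at run-time is run by the compiler instead, and a parse failure becomes a compile error.

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Control.Monad (foldM)
import Data.Char (isDigit)
import Language.Haskell.TH.Quote (QuasiQuoter (..))

-- Hypothetical option record for an imaginary library.
data MyLibOptions = MyLibOptions
  { verbose :: Bool
  , retries :: Int
  } deriving (Show, Eq)

-- The ordinary run-time parser for a made-up "key=value" syntax.
parseOptions :: String -> Either String MyLibOptions
parseOptions = foldM applySetting (MyLibOptions False 0) . words
  where
    applySetting opts setting =
      case break (== '=') setting of
        ("verbose", "=true")  -> Right opts { verbose = True }
        ("verbose", "=false") -> Right opts { verbose = False }
        ("retries", '=' : digits)
          | not (null digits) && all isDigit digits ->
              Right opts { retries = read digits }
        _ -> Left ("bad setting: " ++ setting)

-- The same parser, run by the compiler. A Left becomes a compile error;
-- a Right is turned back into an expression that builds the record.
myLibOptions :: QuasiQuoter
myLibOptions = QuasiQuoter
  { quoteExp = \src ->
      case parseOptions src of
        Left err -> fail err
        Right (MyLibOptions v r) -> [| MyLibOptions v r |]
  , quotePat  = unsupported
  , quoteType = unsupported
  , quoteDec  = unsupported
  }
  where
    unsupported _ = fail "myLibOptions can only be used as an expression"
```

GHC requires a quasi-quoter to be defined in a different module from the one that uses it, so the use site would live in another file, with QuasiQuotes enabled, and look something like [myLibOptions| verbose=true retries=3 |]. A typo such as retires=3 would then be reported when that file is compiled.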

Quasi-quoting is basically syntactic sugar for Template Haskell. Consequently your quasi-quoters are able to reach into the environment and interact with the rest of the code (all in a type-safe, purely functional way, of course). In the string interpolation example above, for instance, the code



    author = "broker"
    content = "' or 1=1;"
    query = [$interpolique| insert into posts values(^^author , ^^content ); |]



sets query equal to the following



    *Test> query
    InterpoliquedString " insert into posts values(b64d(\"YnJva2Vy\"), b64d(\"JyBvciAxPTE7\")); "



which was generated by inspecting the values of author and content at run-time, encoding them in base64, and then interpolating them into the result you see here. The fact that author and content were strings was determined at compile-time, so there wasn’t any chance of any shenanigans when the code actually executed.

For instance, if I instead had the code



    author = 2 :: Int
    content = "' or 1=1;"
    query = [$interpolique| insert into posts values(^^author , ^^content ); |]



I’d get a compile-time error:



    Test.hs:9:23:
        Couldn't match expected type `String' against inferred type `Int'
        In the first argument of `InterpoliqueQQ.b64enc', namely `author'



which I think is pretty cool.

Type families and associated types.

I must admit that I only use the “associated types” half of this, although the feature is slightly more general. Anyway, I’ll describe the part that I use.

Type families give you a way to compute which type you want to use. Yes, sounds weird, but it’s amazing.

A typical first example of this is the associative array. Every modern language has these: it is just an array where the lookup key doesn’t need to be an Int (think HashMap and the like).

In Haskell this can be described like so:



    {-# LANGUAGE TypeFamilies #-}

    class GenericMap a where
      type Key a
      type Value a
      get :: a -> Key a -> Value a
      set :: a -> Key a -> Value a -> a



The first two parts of this class definition are the so-called “associated types.” The easiest way to see this in use is with an example of what an instance might look like. Here I’ll do something crazy and define the function type String -> Int as an instance of this class (the Haskell Wiki article on type families has other examples, some of which you might find more conventional):



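    -- Needs FlexibleInstances in addition to TypeFamilies, since the
    -- instance head is a concrete function type.
    instance GenericMap (String -> Int) where
      type Key (String -> Int) = String
      type Value (String -> Int) = Int
      get f k = f k
      set f k v' = \k' -> if k == k' then v' else f k'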



This instance works:



    sampleMap :: String -> Int
    sampleMap s = length s

    sampleMap' = set sampleMap "foo" 4

    ...

    *Main> get sampleMap "monkey"
    6
    *Main> get sampleMap "foo"
    3
    *Main> get sampleMap' "foo"
    4
    *Main> get sampleMap' "bar"
    3
    *Main> get sampleMap' "monkey"
    6



which is all well and good.
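To show where the associated types earn their keep, here is a sketch of a function written once against the class rather than against any particular map. The class and instance are repeated so the block stands alone; the helper adjust is a hypothetical addition, not part of the original example.

```haskell
{-# LANGUAGE TypeFamilies, FlexibleInstances #-}

class GenericMap a where
  type Key a
  type Value a
  get :: a -> Key a -> Value a
  set :: a -> Key a -> Value a -> a

instance GenericMap (String -> Int) where
  type Key (String -> Int) = String
  type Value (String -> Int) = Int
  get f k = f k
  set f k v = \k' -> if k == k' then v else f k'

-- Hypothetical helper: apply a function to the value stored at one key.
-- The signature never names a concrete key or value type; Key m and
-- Value m are computed from whichever map type m the caller picks.
adjust :: GenericMap m => (Value m -> Value m) -> Key m -> m -> m
adjust f k m = set m k (f (get m k))
```

With the function instance above, adjust (+1) "foo" applied to a String -> Int map bumps the value at "foo" and leaves every other key alone.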

Now, I haven’t yet given any reasons why this type families business is any good. The answer has to do with polymorphism: sometimes you want to write a function whose type signature is so damned flexible you just can’t figure out how to write it. You try a few examples, but each is too restrictive. But there’s a pattern to it. If you’re in this boat, type families can help.

I’d give an example of this, except I already did in polymorphic first class labels. (Which, by the way, is another feature I’d like to see in other languages.)

Another application of type families is type-level programming (functional dependencies can also be used for this, but as type families get better, my interest in seeing functional dependencies in other languages will dwindle). Type-level programming is an insane idea where you do computation in the type system at compile-time.

This actually can be helpful in situations where you have really complicated properties you want to express about your program statically. For instance, I had a situation where certain types had a “size” associated to them. I had functions that were polymorphic over arguments of a given size. Some of these functions would yield a new type that was twice the size as the input.

How do you express that statically, if the goal is to still be polymorphic? Type families can do it. I basically wrote a class whose sole job was to use type families to do arithmetic in the freaking type system. This basically looked like



    class HasSize a where
      type Size a

    class Doubler a where
      type Double a

    ...

    -- The ~ operator asserts type equality, so this next
    -- line basically reads "the size of b is `Double' the
    -- size of a."
    someFunction :: ( Size b ~ Double (Size a) ) => a -> b
    someFunction = ...

I would not describe it as pretty, but it solved my problem, and it gave me a compile-time guarantee that an important design invariant was being met. The syntax is easy to read as well. And if it looks like I’m applying functions to types, it’s because I am.
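For a version of this idea that actually compiles, here is a small sketch. It is a stand-in, not the code from that project: it uses a standalone type family rather than the associated-type form, encodes sizes as Peano-style types, and all of the names (Zero, Succ, Twice, Block) are made up for the example.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Type-level natural numbers; these types have no values.
data Zero
data Succ n

-- Arithmetic in the type system: Twice n is n + n.
type family Twice n
type instance Twice Zero = Zero
type instance Twice (Succ n) = Succ (Succ (Twice n))

-- A block of data tagged with its size. The tag is a phantom type,
-- so nothing here checks it against the actual list length.
newtype Block size = Block [Int] deriving (Show, Eq)

type One = Succ Zero
type Two = Succ One

oneBlock :: Block One
oneBlock = Block [7]

-- Polymorphic in the size, yet the result is statically twice as big.
double :: Block n -> Block (Twice n)
double (Block xs) = Block (xs ++ xs)

-- The compiler evaluates Twice One and agrees that it equals Two.
doubled :: Block Two
doubled = double oneBlock
```

Changing the annotation on doubled to Block One turns the mistake into a compile-time type error, which is the kind of guarantee described above.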

Rank-2 types.

You don’t often see this on the list of great things about Haskell, but I love them. To say that a type is “rank-2” is basically a statement about just how polymorphic it is. I use this feature in two different ways: the first is to solve a polymorphism problem, the second is to prevent tainted data from leaking into places it doesn’t belong (I’m in love with this second application and I have no clue how to statically do it in any other language — tell me in the comments if you do!).

Here is an example of how I use it to get some extra polymorphism:



    {-# LANGUAGE Rank2Types #-}

    useFoo :: (forall a . a -> [a]) -> b -> ([b], [String])
    useFoo f b = ( f b, f "bar" )

This function takes two arguments (another function and a value of some other type) and uses them to build a tuple (by applying that function twice). The forall asserts that the function we give must work for any type a, which is why we can apply it to the mystical input of type b or to an ordinary String.

If I were to rewrite this function without the forall I'd get a type error (two type errors, actually):



    useFoo1 :: (a -> [a]) -> b -> ([b], [String])
    useFoo1 f a = ( f a, f "bar" )



gives me



    temp.hs:14:18:
        Couldn't match expected type `[Char]' against inferred type `b'
          `b' is a rigid type variable bound by
              the type signature for `useFoo1' at temp.hs:13:25
        In the first argument of `f', namely `a'
        In the expression: f a
        In the expression: (f a, f "bar")

    temp.hs:14:23:
        Couldn't match expected type `a' against inferred type `[Char]'
          `a' is a rigid type variable bound by
              the type signature for `useFoo1' at temp.hs:13:12
        In the first argument of `f', namely `"bar"'
        In the expression: f "bar"
        In the expression: (f a, f "bar")

Absent the forall, the type checker assumes that the function I'm providing works for some type a, and attempts to determine just which type that happens to be. That is, the compiler is allowing me to be a little ambiguous with my type signature, figuring that there is a particular type I have in mind and that it will use type inference to determine what that would be. But then I try to use the function on two different types (b and String), and the type checker is therefore quite upset. (In fact, it is already upset because, the way I've written the signature for useFoo1, Haskell cannot assume that a and b are the same type, and in fact this is what those errors above are telling me: a is not the same as b, nor is it the same as String.)

While this application is nice, as I alluded above, in my mind the killer application is tracking tainted data. Here are two common scenarios where this is something you want to do:

You have some function which accepts untrusted user input, and you want to be certain that whatever value it returns has been scrubbed clean. This is handy for a function like, say, useUserInputToBuildSQLQuery. (There are many other ways to solve this problem, of course.)

You have a function which allocates some resources, uses them, then frees them, and you want to make sure it doesn't return a dangling handle. (I'm not aware of another way of solving this problem, and again would appreciate any comments with other ideas.)
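The first scenario can be sketched in a few lines. This is a toy, with made-up names (Tainted, Clean, scrub, withUserInput), and the "scrubbing" is a character filter rather than real SQL escaping. What it demonstrates is the rank-2 trick: the tainted value is tagged with a type variable s that exists only inside the callback, so no result type can mention it.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Untrusted input, tagged with a phantom scope variable s.
-- (In real code the constructor would be hidden behind a module boundary.)
newtype Tainted s = Tainted String

-- A value that has been through the scrubber.
newtype Clean = Clean String deriving (Show, Eq)

-- Toy scrubber for illustration only: drops quotes and semicolons.
scrub :: Tainted s -> Clean
scrub (Tainted raw) = Clean (filter (`notElem` "'\";") raw)

-- The callback must work for every s, and the result type a is chosen
-- outside the forall, so a cannot contain Tainted s.
withUserInput :: String -> (forall s. Tainted s -> a) -> a
withUserInput raw k = k (Tainted raw)

safe :: Clean
safe = withUserInput "' or 1=1;" scrub

-- Rejected by the type checker, because s would escape its scope:
--
--   leak = withUserInput "' or 1=1;" id
```

Here safe evaluates to Clean " or 1=1", while the commented-out leak does not compile.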

The best example of that second scenario is Haskell's ST monad. Code that executes with the ST monad is able to create mutable variables. If you have a function that is written in the ST monad, you can execute it using the runST function, whose signature is



    Prelude> :m +Control.Monad.ST
    Prelude Control.Monad.ST> :t runST
    runST :: (forall s. ST s a) -> a



The key to how this works is the forall in the signature of runST. In essence, it is preventing code in the ST monad from returning one of these mutable variables. So the following code works:



    {-# LANGUAGE Rank2Types #-}
    import Control.Monad.ST
    import Data.STRef

    exampleST :: ST s Int
    exampleST = do
      myMutableVar <- newSTRef 0
      modifySTRef myMutableVar (\n -> n+1)
      n <- readSTRef myMutableVar
      return n

    ...

    *Main> runST exampleST
    1



but the following code does not:



    {-# LANGUAGE Rank2Types #-}
    import Control.Monad.ST
    import Data.STRef

    exampleST1 :: ST s (STRef s Int)
    exampleST1 = do
      myMutableVar <- newSTRef 0
      modifySTRef myMutableVar (\n -> n+1)
      return myMutableVar

    ...

    *Main> runST exampleST1

    <interactive>:1:0:
        Inferred type is less polymorphic than expected
          Quantified type variable `s' escapes
        In the first argument of `runST', namely `exampleST1'
        In the expression: runST exampleST1
        In the definition of `it': it = runST exampleST1

People who read my blog will not be surprised when I mention that Oleg Kiselyov and Chung-chieh Shan have shown that this approach can be used to implement region based resource management with good granularity. (This is a paper I've been bringing up a lot recently, as it is the underpinning of memory management in Potential.)

Conclusion.

Haskell has a reputation for being hard to learn, though I feel this reputation is a bit dated now that we have good resources like Learn You a Haskell for Great Good and Real World Haskell. Certainly one of the hardest parts about learning Haskell is that so many of the examples of "good Haskell" that we hold up rely on many of the features I mentioned above, and most of them seem foreign to new Haskellers. That's hard to avoid: Haskell is, after all, a research testbed.

I don't know if you'll feel the same way as I do, but after gaining some experience with using these tools in my own code, it is frustrating to leave them behind when working in other languages. Every language has its seed of grace, elegance, and brilliance that, if it gets into you and grows, will make you into a zealot. I feel that these are Haskell's seeds.