January 30, 2018 at 05:15 Tags Haskell

In this brief post I want to discuss a fairly unusual feature of Haskell - functions that can be parameterized by their return type.

Parametric vs. ad-hoc polymophism It's worth beginning with a quick discussion of the two most common kinds of compile-time polymorphism present in Haskell: parametric polymophism and ad-hoc polymorphism. Parametric polymorphism is possible when we can define a certain operation to work similarly on any type. A simple example is the list type [a] , on which many operations are defined in a way that completely disregards what the actual type a is. For instance, the function length can find the length of the list without relying or caring about a . A slightly more interesting example is map , defined as follows for lists: map :: ( a -> b ) -> [ a ] -> [ b ] map _ [] = [] map f ( x : xs ) = f x : map f xs This will work on any list, regardless of what it contains - integers, other lists or some complicated user-defined type. The definition is written in a way that is oblivious to the actual type of a . This kind approach is commonly called generic programming; in C++, it's represented with templates. This approach is sometimes limiting, however, because we may actually want functions to do something slightly different for every type (or at least for some types). This brings us to ad-hoc polymoprhism, which is represented by either function overloading or template specialization in C++. Ad-hoc polymorphism in Haskell is achieved by using typeclasses. As an example, consider the built-in class Ord . If you define the <= operator for your type, it's then considered to comply with the Ord class and we can do some interesting things with it . For example, we can implement merge-sorting as follows: merge :: Ord a => [ a ] -> [ a ] -> [ a ] merge [] ys = ys merge xs [] = xs merge ( x : xs ) ( y : ys ) | x <= y = x : merge xs ( y : ys ) | otherwise = y : merge ( x : xs ) ys msort :: Ord a => [ a ] -> [ a ] msort [ x ] = [ x ] msort xs = merge ( msort left ) ( msort right ) where ( left , right ) = ( take halflen xs , drop halflen xs ) halflen = length xs ` div ` 2 Note that msort works on a list of some type a , similarly to map ; in this respect it's parametrically polymorphic, with a twist. The type constraint Ord a => tells the compiler that it's only legal to invoke msort on lists of type a if a implements the Ord class. Most built-in Haskell types implement the Ord ord class, so we can use msort right away: > msort [ 2 , 1 , 3 , 8 , 5 ] [ 1 , 2 , 3 , 5 , 8 ] However, if we want to use it on user-defined types, we'll need to implement the Ord class manually: data Person = Person { lastName :: String , firstName :: String } deriving ( Eq , Show ) -- Simple example of making Person sortable by defining Ord. instance Ord Person where ( Person lx fx ) <= ( Person ly fy ) = if lx == ly then fx <= fy else lx <= ly Note that this definition of <= relies on some semantic properties of the custom type (that last name is usually sorted before first name) that Haskell has no way of knowing. This is why ad-hoc is often necessary - per-type definitions describe a real-life domain in some way; for example, had Person had some unique ID it would be perhaps more correct to sort people by this ID. Having done this, we can now sort a list of Person s: > let folks = [(Person "Jones" "Jan"), (Person "Jones" "Areo"), (Person "Falcon" "Hugo"), (Person "Spearson" "Britney")] > msort folks [Person {lastName = "Falcon", firstName = "Hugo"}, Person {lastName = "Jones", firstName = "Areo"}, Person {lastName = "Jones", firstName = "Jan"}, Person {lastName = "Spearson", firstName = "Britney"}] It's important to reiterate that this example showcases both kinds of polymoprhism. msort is parametrically polymoprhic - it's the same code that will work for any type a , as long as a implements Ord . On the other hand, the <= operator is ad-hoc polymorphic - it works on different types, but each type should define its own version.

Return-type polymophism Now that we have the above covered, it's time to turn to the main topic of this post - return-type polymophism. Combined with type inference, this is a fairly unique and cool aspect of Haskell, especially if you come from the C++ world where templates provide some degree of parametricity but it's limited in certain other ways. Let's start with an example, the built-in function read : read :: Read a => String -> a The read function reads input from a string, which must be completely consumed by the input process. Something interesting is happening here: read is parameterized by type a , but this type is not one of its arguments; no, the argument of read is simply a String , which is a concrete type. a appears only in the return type. Let's try to parse an integer by using read : > read "1" <interactive>:46:1: No instance for (Read a0) arising from a use of `read' The type variable `a0' is ambiguous Possible fix: add a type signature that fixes these type variable(s) <...> Oops, what went wrong? The issue here is that 1 could be of several types, and Haskell doesn't know which one to choose. We can fix that with an explicit type annotation: > read "1" :: Int 1 But 1 can also be of other types: > read "1" :: Double 1.0 So read can truly return multiple types , depending on how it's being called. This is return-type polymorphism. Here's a more intriguing example: > putStrLn (take (read "2") (read "\"haskell\"")) ha Haskell didn't complain about read "2" , even though no type annotation was provided. What's going on? The answer is type inference. Haskell looks at that code and sees the result of read "2" being fed into take , as its first argument. The type of take is Int -> [a] -> [a] , so the result of read "2" is placed in an Int . This interaction of type inference with return-type polymophism is one of the most impressive features of Haskell, IMHO! Now let's turn to a slightly more interesting example - monoids. I've mentioned the Monoid type class before, in the context of folds. class Monoid a where The class of monoids (types with an associative binary operation that has an identity). To create a new Monoid , we have to define two functions: mempty :: a Identity of mappend mappend :: a -> a -> a An associative operation Note the type of mempty - it takes no arguments, but returns an arbitrary type a - return-type polymorphism. Here's how Haskell defines Monoid for lists: instance Monoid [ a ] where mempty = [] mappend = ( ++ ) mconcat xss = [ x | xs <- xss , x <- xs ] With this definition, we could use mempty for strings as follows: > mempty :: String "" For numbers, things are somewhat more interesting because there are two ways to define monoids on integers. One is using addition as the associative operation, another is using multiplication. In the former case, the zero element is 0, in the latter it's 1. Since there is no scenario which is clearly better than the other, Haskell doesn't define Monoid for Int : > mempty :: Int <interactive>:11:1: No instance for (Monoid Int) arising from a use of `mempty' Possible fix: add an instance declaration for (Monoid Int) Rather, it adds two new types that wrap Int : Sum and Product . Here's an example: > mempty :: Sum Int Sum {getSum = 0} > mappend (Sum 6) (Sum 7) Sum {getSum = 13} As before with read , note how mempty returns a different type based on what's expected from it. Type inference picks the right overload! In the following sample, type inference knows that mappend takes two arguments of the same type and the first one is a String , so it invokes the String -returning version of mempty : > mappend "Foobar" mempty "Foobar" Similarly, here the Sum -returning version is invoked due to type inference: > mappend (Sum 10) mempty Sum {getSum = 10}