January 23, 2011 — Mario Gleichmann

Welcome to another episode of Functional Scala!

If you’re coming from the object-oriented world, chances are good that you’ve already heard of a concept called polymorphism. You may also know that there are different kinds of polymorphism out there: most of us are familiar with so-called (sub-)type polymorphism and parametric polymorphism (you may call it generics).

Since functions take a central role within Functional Programming (I need to mention this, in case you’ve forgotten it), we want to take a closer look at whether Scala allows for so-called polymorphic functions, which belong to the area of parametric polymorphism. If you’re now asking what the heck we need polymorphic functions for (if anything), you’ve come to the right place!

Parametric polymorphism on types

Let’s start the journey and take a look at an example of a polymorphic type, coming from well-known ground: Lists …

You may know that a List can take elements of an arbitrary type. In fact, a raw List in and of itself isn’t a real type in Scala. Well, of course List is a type! By real type, I mean a type which doesn’t need to be parameterized with another type before you can work with it, such as String or Int. So every type for which you’re able to define some concrete values is what I will call a real type in the following sections.

    val names : List[String] = List( "Anne", "George", "Carla" )
    val primes : List[Int] = List( 11, 17, 19 )
    val things : List = List( "Anne", 17, true )

See that explicit type declaration at line 3? We said that things should be of raw type List. But since List isn’t a real type, the compiler complains with

type List takes type parameters

Well, so you always need to declare a concrete instance of a List in the light of the type of the List’s elements (like you see at lines 1 and 2) in order to create a real type (like List[String] or List[Int]). From that point of view, you can say that List is a so-called type constructor: it takes one type (e.g. String) and yields a real type (e.g. List[String]). If we now went on to abstract over types, we would land in the realm of so-called kinds, in which you could talk about the nature of types on a meta level (but that would be a topic for a whole episode in itself).

Just remember why we call List polymorphic: as a raw type, it doesn’t care what type the elements of a List have. This is expressed by giving List a type parameter (take a look at the definition of type List: there you’ll find something like List[T], variance aside), which abstracts over the concrete type of the elements. But as soon as you want to declare a concrete instance, you need to supply a type argument for List in order to get a real type. Sounds funny? At least if you haven’t played with those kinds of types. All in all, it’s a more or less complicated description of generic types (e.g. if you come from Java-land).
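To make the idea of a type constructor a bit more tangible, here’s a minimal sketch with a home-grown container. The names Box and TypeConstructorDemo are made up for this illustration and don’t come from the standard library.

```scala
object TypeConstructorDemo {
  // Box on its own is no real type: it needs a type argument first.
  final case class Box[T](content: T)

  // Applying the type constructor to String and Int yields two real types.
  val boxedName: Box[String] = Box("Anne")
  val boxedPrime: Box[Int] = Box(17)

  // val boxedThing: Box = Box(true)   // would not compile: Box takes type parameters
}
```

Just like List, Box only becomes usable for values once its type parameter is filled in.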

Parametric polymorphism on functions

If you look back at some earlier episodes, we’ve always defined functions on real types:

    val filter = ( predicate :Int => Boolean, xs :List[Int] ) => {
      for( x <- xs; if predicate( x ) ) yield x
    }
    ...
    val positiveNums = filter( SomeFilters.isPositive, List( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ) )

As you may see, filter only works on the real type List[Int]. If we wanted to reuse filter for, say, filtering a List of Strings, we’d be out of luck. So do we need to come up with another filter function for List[String] … ? Well, yes and no. To answer this, we’ll now take a short detour and come back to that question afterwards. Promised!

First of all, let’s take a look at a somewhat simpler example. Let’s write a function which can be applied to an arbitrary List instance and decides if that instance is empty or not. To decide this, our function doesn’t depend on the concrete type of the List elements. That said, take a look at the following function definition:
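Just to make the dilemma visible, here’s a sketch of what the naive route would look like: one function value per element type. The names MonomorphicFilters, filterInts and filterStrings are hypothetical and only serve this example.

```scala
object MonomorphicFilters {
  // Works for List[Int] only ...
  val filterInts = (predicate: Int => Boolean, xs: List[Int]) =>
    for (x <- xs; if predicate(x)) yield x

  // ... so a List[String] needs a copy that differs in nothing but the types.
  val filterStrings = (predicate: String => Boolean, xs: List[String]) =>
    for (x <- xs; if predicate(x)) yield x

  val positives = filterInts(_ > 0, List(1, -1, 2, -2))
  val as = filterStrings(_.startsWith("A"), List("Anne", "George"))
}
```

The two bodies are identical character for character, which is exactly the kind of duplication a polymorphic function should spare us.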

    val empty = ( xs :List[_] ) => xs == Nil
    ...
    val names : List[String] = List( "Anne", "George", "Carla" )
    val primes : List[Int] = List( 11, 17, 19 )
    ...
    empty( Nil ) // true
    empty( names ) // false
    empty( primes ) // false

The action goes on at line 1 – a neat function which says that it accepts an arbitrary instance of List, type parameterized for some type we’re not interested in. Aaah – and there goes the mighty underscore again, used to designate a so called Existential Type: we’re informing the compiler, that there must exist such a concrete type, but we don’t care which type it is – it might be a List of Strings or a List of Booleans, we simply don’t care.

If you would give our function an explicit type annotation, it would look like this:

val empty : List[_] => Boolean = ( xs :List[_] ) => xs == Nil ...

Wait wait wait, you may say. We’ve already learned that the function literal defined above comes down to a concrete implementation of trait Function1[-T,+R]. And you’re right! Remember? A function is a first-class value of a certain type. And like any other value, it needs a real type to come to life! So what type does our function have, then? Well, it turns out that the compiler will treat an existential type (within a function definition) like the most general real type there is, and you know what, it’s Any. (It works out this neatly because List is covariant in its element type, so every List[String] or List[Int] is also a List[Any].) So it would be perfectly legal to define our function like so:

val empty : List[Any] => Boolean = ( xs :List[_] ) => xs == Nil ...

This is all fine, as long as we don’t refer to an instance of that type or some special characteristics of that type within our function (or the other way around, it’s all good as long as you refer to an instance and treat it as Any).
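Here’s a small sketch of what "treating them as Any" means in practice. The names ExistentialDemo and describe are assumptions made for this example; the function only uses toString, which every value has.

```scala
object ExistentialDemo {
  // The elements may be of any type, so inside the function they are just Any.
  val describe = (xs: List[_]) => xs.map(x => x.toString).mkString(", ")

  val fromInts = describe(List(11, 17, 19))
  val fromMixed = describe(List("Anne", 17, true))

  // val lengths = (xs: List[_]) => xs.map(x => x.length)   // no: Any has no length
}
```

As soon as the body asks for something that Any doesn’t offer (like length), the compiler refuses.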

Now let’s write another function, accepting a concrete Pair (a Tuple2, which is also a Type Constructor) and returning the first component. As we just learned about existential types, let’s give it a go:

    val first :Tuple2[_,_] => _ = ( pair :Tuple2[_,_] ) => pair._1
    ...
    val person = ( "Anne", 27 )
    ...
    val name = first( person )

Looks easy! Function first takes any Pair with its first and second component of an arbitrary type, then picks and returns the first component. But of what type is the returned value? If you take a look at the example, you see that the first component is of type String, hence name should of course also be of type String! Ehm, remember what we said about how the compiler treats an existential type? Down in our function (which picks the first component within the function’s body), all we can state about the type of the pair’s first component is … right … that it’s of type Any. Hence the type annotation of our function could also be rewritten like this:

    val first :Tuple2[Any,Any] => Any = ( pair :Tuple2[Any,Any] ) => pair._1
    ...
    val name :Any = first( person )

That’s a pity! Although we already know the type of the first component, we can’t make use of it once the value has been extracted by our function (which kind of generalizes the type to Any). So what are our options?
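To see what losing the type costs us, consider the following sketch. LostTypeDemo and shout are hypothetical names; the pattern match is just one possible way to recover the String, and it only happens at runtime.

```scala
object LostTypeDemo {
  val first: Tuple2[Any, Any] => Any = (pair: Tuple2[Any, Any]) => pair._1

  val person = ("Anne", 27)
  val name: Any = first(person)

  // val loud = name.toUpperCase   // would not compile: name is only known as Any

  // To get the String back we have to check the type at runtime.
  val shout: String = name match {
    case s: String => s.toUpperCase
    case other     => other.toString
  }
}
```

The compiler can no longer help us here: whether name really holds a String is only found out when the program runs.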

Function classes

Unfortunately, as we’ve already seen, it’s kind of impossible to write polymorphic functions of that kind, since trait FunctionN always needs some concrete types when bringing function values to life. But wait another minute. What about writing a type-parameterized class which implements our Function trait? Let’s give it a try:

    class First[T] extends Function1[Tuple2[T,_],T]{
      def apply( pair :Tuple2[T,_] ) :T = pair._1
    }
    ...
    val firstString :Tuple2[String,_] => String = new First[String]
    val firstInt :Tuple2[Int,_] => Int = new First[Int]
    ...
    val name :String = firstString( person )
    val one :Int = firstInt( (1, 2 ) )

Ouch! As you may see, writing a function this way is anything but neat when it comes to function definition! It’s too much boilerplate code, hiding the essence! Secondly (and more important for our current quest), as all functions are values, we also need to create an individual instance of that function for every type it should operate on … there’s no escape from it …
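On the bright side, an instance of such a class is a full-blown function value, so it can be handed to any higher-order function. A minimal sketch, assuming a made-up list of people (FunctionClassDemo and people are names invented for this example):

```scala
object FunctionClassDemo {
  class First[T] extends Function1[Tuple2[T, _], T] {
    def apply(pair: Tuple2[T, _]): T = pair._1
  }

  val people = List(("Anne", 27), ("George", 31))

  // One instance, created once for String, reused for every element.
  val firstString = new First[String]
  val names: List[String] = people.map(firstString)
}
```

For a list of pairs starting with an Int we’d still have to create another instance, say new First[Int].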

Function Factories

So you may ask: if there’s no escape from always bringing functions to life which rely on concrete types, are there any other ways to define that kind of type-parameterizable function as we did by implementing trait Function1, only without all of that boilerplate code? That said, what we’re really looking for is essentially a way to define a function in the following form (not valid Scala code):

val first :Tuple2[A,_] => A = ( pair :Tuple2[A,_] ) => pair._1

As we know, it’s not possible to have an open (unparameterized) type parameter on a function as a first class value (an instance of a concrete, real type). But it turns out, that we could use ordinary methods which are allowed to take type parameters (since they are not values but only members of certain types), as long as a function pops out at the end of the day. Observe:

    ...
    def first[A] :Tuple2[A,_] => A = ( pair :Tuple2[A,_] ) => pair._1
    ...
    val name = first( person )

Now go and take another look at our invalid Scala code, where we tried to define a polymorphic function. It looks pretty much the same, except that the first one is a value definition which represents a function, whereas the second (valid) one is a method which produces a function. This is an important distinction! Every time you call that method, it will produce a new function (that is, a new instance)! (Depending on the compiler version and the JVM, a function that captures nothing from its surroundings may actually be reused behind the scenes, but that’s nothing you should rely on.)

So if you take a closer look at line 4, what you see here is a method call which returns a newly created function of type Tuple2[String,Any] => String (since the compiler kicks in and infers the type needed for applying it to person). The delivered function is then applied to person. So although the whole expression looks like an ordinary method call, it’s in fact a method call (to a factory method without any parameter) followed by a function call. If we wrote that expression twice, we would create two instances of the same function type!

The fact that a new function is created every time you call that method gives rise to some worrisome situations. Gaze at the following scenario, where we will fill a given List with a given value in a repetitive manner:
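The two steps hidden in that one expression can be spelled out explicitly. This is only a sketch: FactoryDemo, firstOfString and sameName are names made up for this example, and the type argument is written out by hand instead of leaving it to inference.

```scala
object FactoryDemo {
  def first[A]: Tuple2[A, _] => A = (pair: Tuple2[A, _]) => pair._1

  val person = ("Anne", 27)

  // Step 1: call the factory method with an explicit type argument ...
  val firstOfString: Tuple2[String, _] => String = first[String]
  // Step 2: ... then apply the function it handed back.
  val name: String = firstOfString.apply(person)

  // The same two steps written in one go.
  val sameName: String = first[String].apply(person)
}
```

Holding on to firstOfString also shows a way out of the repeated creation: once the function has been produced, it can be applied as often as you like.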

    def replicate[T] : ( Int, T, List[T] ) => List[T] = {
      ( n :Int, t :T, list :List[T] ) => {
        if( n <= 0 ) list else t :: replicate( n-1, t, list )
      }
    }
    ...
    val ys : List[String] = replicate( 4, "y", Nil ) // List( "y", "y", "y", "y" )
    val ones : List[Int] = replicate( 5, 1, Nil ) // List( 1, 1, 1, 1, 1 )

Spotted the catch? This innocent-looking function seems to call itself recursively while prepending elements to the given list. But it isn’t a recursive call at all! Instead of calling itself, the function calls our method replicate, which in turn produces a new function which then gets called. So for every assumed recursive function call, another function comes to life. Big party time for Garbage Collection!

In order to avoid that function inflation, especially in conjunction with recursive functions (and because it still seems to be en vogue), we could do some outsourcing. Well, sort of. What would happen if we consigned all type parameters to the next outer context, like this:

    trait Lists[T]{
      ...
      val replicate : ( Int, T, List[T] ) => List[T] = ( n :Int, t :T, list :List[T] ) => {
        if( n <= 0 ) list else t :: replicate( n-1, t, list )
      }
    }

Now our function isn’t anonymous anymore. Look, it’s defined in a polymorphic way without the compiler complaining! The type parameter is shifted to the surrounding trait. And since the function got a name again, the call to replicate inside the function definition refers to the function itself this time!
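If you want to watch the factory being hit again and again, you could instrument it with a counter. The following is only a sketch: the mutable factoryCalls and the object name ReplicateCallCount are additions for this experiment, and .apply is written out so that the two steps stay visible.

```scala
object ReplicateCallCount {
  // Mutable counter, used here only to make the factory calls observable.
  var factoryCalls = 0

  def replicate[T]: (Int, T, List[T]) => List[T] = {
    factoryCalls += 1
    (n: Int, t: T, list: List[T]) => {
      // replicate[T] runs the factory method again; .apply then calls its result.
      if (n <= 0) list else t :: replicate[T].apply(n - 1, t, list)
    }
  }

  val ys = replicate[String].apply(4, "y", Nil)
}
```

For a list of four elements the factory method runs five times: once for the outer call and once for each of the four supposedly recursive steps.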

We’re not escaping from the fact, that we need a concrete function value at runtime. So in order to use that function, we need a concrete instance for that trait Lists. In its simplest form, it could look like this:

    val ones = new Lists[Int]{}.replicate( 5, 1, Nil )
    val ys = new Lists[String]{}.replicate( 4, "y", Nil )

That again looks ugly. At least, we could let the compiler do some type inference. Watch this:

    def replicate[T] = new Lists[T]{}.replicate
    ...
    val ys = replicate( 4, "y", Nil )

Ok, that’s nothing else but putting another level of indirection between the function creation and the function call. You have to judge for yourself if it’s feasible for you! In any case, you always have to bear in mind that a new function is created with every call! Well, if that’s all too much heavy lifting for you, there’s another solution which takes advantage of a mechanism which we already examined in greater detail within the last episode (when it comes to its possible fields of application).
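Before moving on, here’s a sketch of how the creation could at least be pinned down to once per element type: instead of an anonymous instance per call, we keep one named instance of the trait around. IntLists, StringLists and SharedListsDemo are hypothetical names introduced for this example.

```scala
object SharedListsDemo {
  trait Lists[T] {
    val replicate: (Int, T, List[T]) => List[T] =
      (n: Int, t: T, list: List[T]) => {
        if (n <= 0) list else t :: replicate(n - 1, t, list)
      }
  }

  // One instance of the trait per element type, created exactly once.
  object IntLists extends Lists[Int]
  object StringLists extends Lists[String]

  val ones = IntLists.replicate(5, 1, Nil)
  val ys = StringLists.replicate(4, "y", Nil)

  // replicate is a val, so every access yields the very same function value.
  val sameInstance = IntLists.replicate eq IntLists.replicate
}
```

The price is obvious: we’re back to naming one holder per type, which isn’t too far away from the function classes we started with.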

Parameterized Methods

As you may know, Scala allows us to define parameterized Methods. So the above mentioned functions could also be directly written as a method (as a member of some type):

    object Tuples{
      ...
      def first[A]( pair :Tuple2[A,_] ) :A = pair._1
    }
    ...
    val name :String = Tuples.first( person )

Ok, this time we’ve defined a (type) parameterized method which introduces a type parameter for the pair’s first component (since we don’t care about the type of the second component, we denote it as an existential type). No functions on board, not even function creation: it’s all done within the method’s body.

But, but, but it’s a … method, not a function! What about all those situations where we are in need of a real function? For example, what about our beautiful higher order functions, where we might want to pass some other functions! Well, as we’ve already seen, Eta expansion to the rescue! Since Scala is able to coerce a method into a function, it’s ok to kind of simulate polymorphic functions with parameterized methods.
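A minimal sketch of that coercion at work. The names EtaExpansionDemo, people and firstOfString are made up for the example, and the type argument is spelled out by hand rather than left to inference.

```scala
object EtaExpansionDemo {
  object Tuples {
    def first[A](pair: Tuple2[A, _]): A = pair._1
  }

  val people = List(("Anne", 27), ("George", 31))

  // The expected type is a function type, so the compiler turns the
  // method into a function value (type argument spelled out for clarity).
  val firstOfString: ((String, Int)) => String = Tuples.first[String]

  val names: List[String] = people.map(firstOfString)
}
```

Note that the resulting function value is monomorphic again: it’s fixed to String, while the method it came from stays polymorphic.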

Back to the roots

Still remember our initial question? We now have some tools at hand for defining our filter function as a polymorphic one, allowing us to filter values of arbitrary type. In doing so, we’ll introduce a type parameter for the type of the List elements. Let’s do this in two flavors. First a polymorphic function definition, using outsourcing:

    trait Filters[T]{
      ...
      val filter = ( predicate :T => Boolean, xs :List[T] ) => {
        for( x <- xs; if predicate( x ) ) yield x
      }
    }
    ...
    object IntegerHandler extends Filters[Int]{
      ...
      val positives : List[Int] = filter( _ > 0, List( 1, -1, 2, -2 ) )
    }
    ...
    class StringHandler extends Object with Filters[String]{
      ...
      val as : List[String] = filter( _ startsWith( "A" ), List( "Anne", "George" ) )
    }

As you might see, we produce two concrete function instances, each one within a different environment (IntegerHandler and StringHandler), which takes care of creating a proper instance of the trait our polymorphic function lives in. In both cases, the creation of that trait instance is bound to the creation of the instance of the type which refers to the function (so we can ensure that it’s always the same function instance we refer to, at least within IntegerHandler or within every single instance of StringHandler).

Finally, let’s see how to define filter as a plain method:

    object Filters{
      ...
      def filter[T] ( predicate :T => Boolean, xs :List[T] ) = for( x <- xs; if predicate( x ) ) yield x
    }
    ...
    val positives : List[Int] = Filters.filter( _ > 0, List( 1, -1, 2, -2 ) )
    val as : List[String] = Filters.filter( _ startsWith( "A" ), List( "Anne", "George" ) )

Here, both times we’re referring to one and the same method, so filter isn’t a function anymore. Nonetheless, you are also allowed to pass other functions as arguments to a method, making it a kind of higher order method. For some future episodes this will become our natural choice when it comes to defining polymorphic functions, although it becomes very easy (not to say tempting) to alter or refer to state within the method’s surrounding type. In this case the quest for staying pure, that is the quest to do functional programming, has once more become a kind of commitment for you as the developer …
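How tempting that is can be sketched in a few lines. Everything here (StatefulFilters, threshold, raiseThreshold, aboveThreshold) is invented for illustration; the point is only the contrast between a method that peeks at the state of its surrounding object and one that doesn’t.

```scala
object StatefulFilters {
  // Mutable state in the surrounding object: exactly the temptation at hand.
  private var threshold = 0

  def raiseThreshold(by: Int): Unit = threshold += by

  // Impure: the result depends on hidden state, not only on the arguments.
  def aboveThreshold(xs: List[Int]): List[Int] = xs.filter(_ > threshold)

  // Pure: everything the method needs is passed in.
  def filter[T](predicate: T => Boolean, xs: List[T]): List[T] =
    for (x <- xs; if predicate(x)) yield x
}
```

Calling aboveThreshold twice with the same list may give two different answers if someone called raiseThreshold in between, whereas filter always returns the same result for the same arguments.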

Summary

Wow, what a journey. We saw that it’s not possible to directly define polymorphic functions in Scala. Instead there are some workarounds to kind of simulate them. It all comes down to the fact that a function is a value of a certain Function type, and before such a value can exist at runtime, all of that type’s parameters need to be bound to concrete types (only then do we get a real type, and with it a real value or instance of that type).

As long as we don’t care about a type, we could define a function which simply denotes an existential type, which comes down to the most general type Any, so that’s the best you can get from such functions. As soon as you want to keep track of a certain type, you need to introduce type parameters. Introducing them by directly giving an implementation for an appropriate Function trait seemed awkward, as it produced a lot of boilerplate code. To get rid of it, we switched over to methods as function factories, with the mentioned problems for recursive function calls. That problem was solved by shifting those type parameters to a surrounding type, but it couldn’t hide the fact that you finally need to create some concrete instances of your functions.

As a last resort, we saw ordinary methods as a kind of replacement for polymorphic functions. We don’t need to create an instance of such a method first in order to use it. But in fact the creation is only deferred to the creation of an instance of the surrounding type (of which the method is a member). In addition to that, when going with methods as functions, there’s always a higher risk of introducing impure functions, as methods can more easily refer to the state of the type they live in.

So as a last conclusion, defining polymorphic functions is possible, but the consequences might outweigh the benefits. As always, you need to be aware of the given risks and decide for yourself if it’s worth the trade-offs (which I hope to have shown to you) for your concrete problem area …