February 13, 2011 — Mario Gleichmann

Welcome to another episode of functional Scala!

In the last episodes, we discovered how to define so-called algebraic datatypes. So far, we’ve only implemented some of them and saw how to construct values of those types. In other words, we’ve only walked on the construction side of algebraic datatypes. What’s wrong with that? We’re able to define new datatypes and happily bring some of their values to life. Can’t ask for more! Well, what’s missing is how to operate on them! So in this episode we’re crossing the border and starting to deconstruct values of a given datatype, using pattern matching. We’ll see that pattern matching is a very powerful way of writing comprehensible, well-structured functions over algebraic datatypes by doing some case analysis.

Remember our hand-made, self-defined enumerated type Bool? There, we already wrote some simple boolean functions which operate on values of type Bool. Let me refresh your memory:

sealed abstract class Bool
case object True extends Bool
case object False extends Bool
...
val and : ( Bool, Bool ) => Bool = ( a :Bool, b :Bool ) => if( a == False ) False else b

Ok, we already did some case analysis, leveraging if-expressions. With pattern matching, you do almost the same: you think about the possible cases which can occur and give an expression (to which the function finally evaluates) for each of them. The only difference is the way you identify which case is at hand for the actual arguments. This is done by matching a value against a so-called pattern. Take a look:

val and : ( Bool, Bool ) => Bool = ( a :Bool, b :Bool ) => a match {
  case False => False
  case True => b
}

What have we done? We simply matched the first argument a against a bunch of possible cases for the argument’s actual value, using a so-called match expression (introduced by Scala’s keyword match). The actual value of a is compared against every provided case expression, from top to bottom, starting with the first one. More precisely, the value is compared against the pattern provided by each case expression. In this example, each pattern is a definite value (True or False). Hold on, we’ll see some other kinds of patterns to match against in further episodes!

As soon as there’s a match between the given value and a provided pattern, pattern matching stops. All following case expressions are ignored, because we’ve found an applicable case. So the first matching case always wins, and the whole match expression evaluates to the expression on the right side of that case.

In case of Bool, there are always only two cases: the value of a could be either False (so the first case would match) or True (so the second case would match). Since only these two values exist, a can’t be anything other than True once the first case has failed to match. So the second case could also be seen as the otherwise case, without explicitly matching against the value True:

val and : ( Bool, Bool ) => Bool = ( a :Bool, b :Bool ) => a match {
  case False => False
  case _ => b
}

Ah, and there it is again, our magical jack-of-all-trades, the underscore. Used as a pattern within a case expression, the underscore matches any value, so you should use it in a clever way, e.g. for collecting all cases which haven’t been considered so far. Placing the underscore at the very top of your line-up is not such a good idea, since it would win every time and none of the following cases could ever be reached.
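To make the ordering issue a bit more tangible, here’s a minimal sketch. The object WildcardOrder and the functions wildcardFirst and wildcardLast are hypothetical names, made up for this illustration only. They’re written as plain methods just to keep the sketch short:

```scala
object WildcardOrder {
  sealed abstract class Bool
  case object True extends Bool
  case object False extends Bool

  // Underscore at the top: it matches every value of a, so the
  // second case can never be reached (the compiler warns about that).
  def wildcardFirst( a :Bool, b :Bool ) :Bool = a match {
    case _ => b
    case False => False
  }

  // Underscore at the bottom: it only collects what is left over.
  def wildcardLast( a :Bool, b :Bool ) :Bool = a match {
    case False => False
    case _ => b
  }
}
```

With wildcardFirst, the call wildcardFirst( False, True ) results in True, which is clearly not what a logical and should do, while wildcardLast( False, True ) results in False as expected.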

In the above example, using the underscore instead of matching explicitly against value True is logically the same. Now, imagine we would extend our underlying datatype Bool by adding another value constructor, say Maybe (maybe for applying Bool within a probabilistic calculation model):

sealed abstract class Bool
...
case object Maybe extends Bool

Now what happens to the first version of our function definition, where we matched against True and False explicitly? Because we sealed Bool (Scala’s keyword sealed ensures that Bool can only be extended within the same source file, and not in other locations in a rather uncontrolled way), the compiler knows all values of the type and will inform us that the match is not exhaustive: we haven’t considered all possible cases any more! Of course we can ignore that warning, but then we may be rewarded with a MatchError at runtime! That’s the downside of algebraic datatypes: they cope very poorly with adding values afterwards, since all your functions which are based on pattern matching against a value of that type may become incomplete!
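Here’s a small sketch of what ignoring that warning can lead to. The wrapping object NonExhaustiveDemo and the value outcome are hypothetical names for this illustration. Try comes from Scala’s standard library and is only used to capture the error instead of letting the program crash:

```scala
import scala.util.Try

object NonExhaustiveDemo {
  sealed abstract class Bool
  case object True extends Bool
  case object False extends Bool
  case object Maybe extends Bool   // the newly added value

  // Unchanged first version: the compiler now warns that the match
  // may not be exhaustive, since Maybe isn't covered by any case.
  val and : ( Bool, Bool ) => Bool = ( a :Bool, b :Bool ) => a match {
    case False => False
    case True => b
  }

  // No case matches Maybe, so the call throws a scala.MatchError,
  // which Try captures as a Failure.
  val outcome :Try[Bool] = Try( and( Maybe, True ) )
}
```

The old cases still work as before. Only a call with the new value Maybe as first argument blows up.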

Regarding the second version of our definition (using the underscore for absorbing all cases which aren’t covered yet), the new value Maybe will also be absorbed by the otherwise case, so you can’t get into trouble with MatchErrors at runtime. But that may be a dearly bought victory, since the logic will behave exactly the same for True and for Maybe. Ask yourself if that’s how you want the function to behave.
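As a quick sketch of that silent absorption (the wrapping object WildcardAbsorbs is a hypothetical name, the rest follows the definitions from above):

```scala
object WildcardAbsorbs {
  sealed abstract class Bool
  case object True extends Bool
  case object False extends Bool
  case object Maybe extends Bool   // the newly added value

  // Unchanged second version: no warning and no MatchError,
  // but the underscore now stands for True *and* Maybe.
  val and : ( Bool, Bool ) => Bool = ( a :Bool, b :Bool ) => a match {
    case False => False
    case _ => b
  }
}
```

Here and( Maybe, False ) results in False and and( Maybe, True ) results in True, exactly as if the first argument had been True. The compiler stays quiet, so it’s up to you to decide whether that’s the intended meaning.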

In general, we have to ask whether our function still behaves the right way for every given case expression. The underlying semantics may have changed with the introduction of Maybe. Maybe we need a more appropriate tristate logic, which always needs to consider the combination of two values explicitly. In that case, we need to rewrite our case expressions anyway (in fact that’s what we wanna do in the next episode, where we’re gonna consider some more complex patterns).

If … you can do it yourself

Looking at the above example, we could use if-expressions or match-expressions interchangeably. In fact, if we had only pattern matching at hand, we could write our own if-expressions, which shows that pattern matching is at least as powerful as if-expressions. We’ll define our own if-construct as a higher-order function, which takes a value of type Bool and two other functions, one for each case. Unfortunately we can’t name our function if, since it’s a reserved word in Scala, so we’ll name it inCase instead. Never mind, it’s only a name, the essence remains:

def inCase[A] ( b :Bool, ifTrue : () => A, ifFalse : () => A ) : A = b match {
  case True => ifTrue()
  case _ => ifFalse()
}
...
val i = inCase( True, () => 1, () => -1 )

We just reproduced if-expressions by delegating their inner logic to pattern matching! Well, the form still looks a little clumsy, but what we’ve built here is in essence if in disguise: we can pass two functions, one which gets executed for the True-case and the other one for the else-case. Note that it’s essential that we pass functions as values, since we only want to apply one of the two. Because Scala is a strict language, we couldn’t simply pass an arbitrary expression for each of those two cases, since both expressions would be evaluated before they get passed to our function inCase, no matter which case applies!
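If you wanna watch strict evaluation at work, here’s a minimal sketch. The eager variant inCaseEager, the helper trace and the two run functions are hypothetical names invented for this comparison. The only purpose of the log is to record which branch expressions actually got evaluated:

```scala
import scala.collection.mutable.ListBuffer

object StrictnessDemo {
  sealed abstract class Bool
  case object True extends Bool
  case object False extends Bool

  // Hypothetical eager variant: takes plain values instead of functions.
  def inCaseEager[A] ( b :Bool, ifTrue :A, ifFalse :A ) : A = b match {
    case True => ifTrue
    case _ => ifFalse
  }

  // The version from above: takes functions and applies only one of them.
  def inCase[A] ( b :Bool, ifTrue : () => A, ifFalse : () => A ) : A = b match {
    case True => ifTrue()
    case _ => ifFalse()
  }

  // Both argument expressions are evaluated before inCaseEager is entered.
  def runEager() : ( Int, List[String] ) = {
    val log = ListBuffer.empty[String]
    def trace( name :String, value :Int ) :Int = { log += name; value }
    val result = inCaseEager( True, trace( "then", 1 ), trace( "else", -1 ) )
    ( result, log.toList )
  }

  // Only the function chosen by the match is applied.
  def runDeferred() : ( Int, List[String] ) = {
    val log = ListBuffer.empty[String]
    def trace( name :String, value :Int ) :Int = { log += name; value }
    val result = inCase( True, () => trace( "then", 1 ), () => trace( "else", -1 ) )
    ( result, log.toList )
  }
}
```

Both runs result in 1, but runEager() logs "then" and "else", while runDeferred() logs "then" only. With a harmless Int that doesn’t matter much, but imagine an else-branch which fails or never terminates.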

Also note that we’ve defined our function as a method, since we wanna make use of parametric polymorphism. The (result) type of the whole inCase-expression is polymorphic in A, depending on the result type of the functions which get executed in case of True or False. As long as they both result in the same type A, we can safely state the result of the whole expression to be of type A, too. So the following usage scenarios would be completely legal:

val name :String = inCase( True, () => "Ann", () => "Sue" )
val season :Season = inCase( False, () => Winter, () => Summer )
val shape :Shape = inCase( and( True, True ), () => Circle( 10.0F ), () => Rectangle( 5.0F, 5.0F ) )
val thing :Any = inCase( or( and( True, False ), False ), () => Color(230, 90, 150), () => "No Color!" )

There are two annoyances left. First, all three parts of our inCase-expression are mentioned within the same argument list. Don’t worry if you wanna separate them. We’ll soon see how to split an argument list with several arguments into several argument lists with only one argument each, when discovering so-called curried functions. Second, and even more bothersome, is the empty argument list of our case-functions. They don’t take any argument but only result in a value of type A, hence both functions are of type () => A. So you need to carry that empty argument list with you whenever you define a function of such a type. Don’t be sad: we’ll see how to get rid of empty argument lists when focusing on lazy evaluation in general and by-name parameters in particular.
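Just as a teaser, and without anticipating those episodes too much, here’s a rough sketch of how both annoyances might disappear. It’s only one possible form under my own assumptions (the object ByNamePreview is a hypothetical name), relying on Scala’s multiple argument lists and by-name parameters, written as => A:

```scala
object ByNamePreview {
  sealed abstract class Bool
  case object True extends Bool
  case object False extends Bool

  // One argument list per part. The two branches are by-name parameters,
  // so each of them is only evaluated if it is actually used in the match.
  def inCase[A] ( b :Bool )( ifTrue : => A )( ifFalse : => A ) : A = b match {
    case True => ifTrue
    case _ => ifFalse
  }

  val i :Int = inCase( True )( 1 )( -1 )
  val name :String = inCase( False ) { "Ann" } { "Sue" }

  // The else-branch would throw, but it is never evaluated here.
  val safe :Int = inCase( True )( 42 )( sys.error( "never evaluated" ) )
}
```

No empty argument lists any more, and the call site already looks a bit more like a native if-expression.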

Summary

This time, we finally operated on algebraic datatypes using some really basic pattern matching. We saw how to introduce a match-expression which consists of a bunch of case expressions. Each case expression provides a pattern, against which the value under observation is matched. In our case, the pattern came in its simplest form, which is a concrete value. We also noticed that a match expression is an expression in itself. The value of that match expression is the value of the right-hand expression of the first matching case.

Furthermore, sealing the base class of our algebraic datatype seems to be a good idea. It’s a way to minimize the risk of someone introducing additional values unnoticed after the datatype is defined and under heavy use. If that happens nevertheless, we saw that a match expression may result in a MatchError if it doesn’t cover all possible cases, or, on the other hand, that the new value may be absorbed by the otherwise case, expressed by the underscore pattern (which matches any value).

Finally, we saw how to simulate if-expressions, leveraging pattern matching and higher-order functions. Admittedly, the current form really looks ugly (and we’re going to beautify this), but it shows that pattern matching is at least as powerful as if-expressions (making native if-expressions kind of redundant). In fact, pattern matching is far more powerful (or at least far more expressive) when it comes to more complex cases and patterns. Those will be the topic of the next episode, since we haven’t yet seen how to operate efficiently on product types!

Hope to see you then …