Code by Any Other Name

Pattern matching in Scala for expressions

by Ian Robertson

January 31, 2010



Summary

Scala pattern matchers with side effects can have unexpected results in for expressions.


Scala has taken the traditional for-construct and put it on steroids. One of the interesting features is the ability to use pattern matching as part of a for expression. For example, the following:

case class Person(firstName:String, lastName: String);



val people = List(

Person("Jane", "Smith"),

Person("John", "Doe"),

Person("Jane", "Eyre"));



for (Person("Jane", last) <- people) yield "Ms. " + last;



last

yield

List("Ms. Smith", "Ms. Eyre")

object First {

def unapply[A](iter:Iterator[A]): Option[(A)] = {

if (iter.hasNext) Some(iter.next) else None;

}

}



val iters = List(Iterator.from(1), Iterator.from(3));

for(First(x) <- iters) yield x;

List(1,3)

List(2, 4)

will (effectively) iterate over all people with a first name of Jane, bindto each person's last name in turn, and then generate a list of the results of theblock, thus returningIn this example, we're using a case class to provide the pattern matching for us, but we can easily role our own. For example, consider the following matcher which will return the first element of an iterator, if there is one:We can try to get the first element of some iterators:However, this does not yieldas one might expect. Instead, it yields

So what is going on here? According to Programming in Scala, a pattern match in a for loop gets converted into a filter followed by a map. Specifically,

for (pat <- expr 1 ) yield expr 2

expr 1 filter {

case pat => true;

case _ => false;

} map {

case pat => expr 2 ;

}

2

next

is translated by the compiler toIn other words, first the input expression is filtered for those elements which match the pattern, and then in a separate pass, the pattern is applied for the purpose of evaluating. Unfortunately, there are a couple of issues with this. First, if the pattern match is computationally expensive, that work will be doubled for every match. Second, and more concerning, if the pattern match is not side effect free, then unexpected results like the one above can happen. In the example above, the pattern match, by callingon an iterator, changes the state of the iterator, so that a different result comes out in the second pass.

Interestingly, Programming Scala states that "it's guarannteed that a pattern-matching generator will never throw a MatchError ". In fact, this is not true. For example, executing

for (First(x) <- List(Iterator.single(1))) yield x;

scala.MatchError: empty iterator

at $anonfun$2.apply(<console>:6)

at $anonfun$2.apply(<console>:6)

at scala.List.map(List.scala:812)

will try to access a single-element iterator twice, resulting in:

One way to view this situation is that Scala's for construct is inherently functional, and should not be mixed with non-functional constructs such as iterators. However, it turns out that there is a completely functional fix to this problem. Simply change the compile-time pattern-match translation to something like:

expr 1 map {

case pat => Some(expr 2 );

case _ => None;

} filter {

case Some(_) => true;

case None => false;

} map {

case Some(x) => x;

}

expr 1 map {

pat.unapply(_);

} filter {

_.isDefined;

} map {

_.get;

}

Option

filterMap

filterForEach

Talk Back!

or, more succinctly:Of course, there is a cost to this approach: by introducing a third pass, extra an extra list ofobjects is created. Another fix would be to add new methodsandwhich cut down the work to a single pass. While less elegant, it would clearly provide a performance boost, and it would be largely hidden from view. In the mean time, one should use pattern matching in for expressions with care.

Have an opinion? Readers have already posted 6 comments about this weblog entry. Why not add yours?

RSS Feed

If you'd like to be notified whenever Ian Robertson adds a new entry to his weblog, subscribe to his RSS feed.

About the Blogger

Ian Robertson is an application architect at Verisk Health. He is interested in finding concise means of expression in Java without sacrificing type safety. He contributes to various open source projects, including jamon and pojomatic.

This weblog entry is Copyright © 2010 Ian Robertson. All rights reserved.