Cool Tools and Other Stuff

JavaOne 2010: Functional Programming, from Java to Scala

by Eric Armstrong

September 23, 2010



Summary

Dick Wall's talk turns out to be a treasure trove of useful tidbits, and a great introduction to Scala that whets my appetite, big time.


Dick Wall's talk was a revelation. The title was "Funky Java, Objective Scala". It sounded intriguing, but I didn't really know what to expect. It turned out that his talk was a treasure trove of useful tidbits that turned out to be a great introduction to Scala.

Contents

Background

Fortunately, I had recently read John Hughes' seminal paper, Why Functional Programming Matters, originally published in

1990. That paper did a really nice job of presenting two significant benefits of functional programming, in a slow, easy-to follow way:

Lazy Evaluation:

Let's say that function e() operates on entries produced by function g() . Take a game, for example. The function g() generates moves, while function e() evaluates them. But the game allows for an infinite number of moves. Pretty darn hard to generate them all and then evaluate them. But the evaluation function e() has cutoff threshholds. It knows when a move is"good enough". Lazy evaluation says to run e() whenever there is something to run on, and run g() one more step only when e() needs something else to process. Then you don't care if g() runs to infinite depth, because e() will find something acceptable and get back to you sometime before the end of the century.





Let's say that function operates on entries produced by function . Take a game, for example. The function generates moves, while function evaluates them. But the game allows for an infinite number of moves. Pretty darn hard to generate them all and then evaluate them. But the evaluation function has cutoff threshholds. It knows when a move is"good enough". Lazy evaluation says to run whenever there is something to run on, and run one more step only when needs something else to process. Then you don't care if runs to infinite depth, because will find something acceptable and get back to you sometime before the end of the century. Higher-Order Functions:

We know that modularity is a good thing. And we know that functions are useful. But when functions are "first class citizens" (which means you can pass one as an argument), then whole new realms of modularity open up. For example, having written one function that operates on entries in a list, you can pass other functions to it in order to do different things with the entries in that list. The wrapper function is now a "higher order" function that can be configured with worker-functions to do different kinds of stuff.



The result is a tribute to the inductive hypothesis. Having tested that a simple function passed to the list is properly applied to the members, and having tested that the function you pass is correct, you can be sure that result of processing will be correct. (That's the inductive hypothesis. If we can prove that proposition P is true for N+1 whenever P is true for N, and we can prove that P is true for 1, then we know P is true for all numbers. It's the same principle, applied to a list.)

Dick Wall's Talk

That paper turned out to be just the background I needed to fully appreciate Dick's talk. He started out by describing the advantages of functional programming and the situations in which it's helpful. Then he showed how to do functional programming in Java. That was educational in itself. But then he went a step further. He showed how much simpler the code was in Scala, and rattled off a list of Scala features that I typically associate with dynamic languages like Ruby.

To say I was intrigued is to put it mildly. The list of features missed a couple of incredibly powerful things that stem from fully dynamic classes, but at the same time those features can be dangerous, and they can make programs harder understand. So overall, it began to look as though Scala figured out how to strike a really useful balance between flexibility and readability.

Situations that Call for Functional Programming

When I was in school, I loved the fact that programming "recipes" (algorithms) gave me ways to solve problems that weren't well-suited ot a mathematical approach. Years later, object-oriented programming turned out to be a terrific tool for dealing with even more complex problems.

But as Dick pointed out, there are still many mathematical/scientific problems that are more easily expressed using combinations of mathematical functions, rather than algorithms and objects. So while objects and algorithms have been terrific for one class of problems typical of ecommerce, GUIs, and even development tools, it has been less than stellar in other arenas--arenas in which "functional decomposition" is an appropriate solution mechansim.

Dick's list of situations the benefit from the functional approach included:

Parallelism: One of the hallmarks of the functional approach is that functions have no side effects. They return a value, but internal state never changes. Because of that invariance, making things operate sequentially simply isn't as critical as it is with an algorithmic approach. And since order is close to inconsequential, many things can happen in parallel.

Science and Math Applications: As Dick pointed out, "functional composition" is second nature to mathematicians, and even to scientists. So while it seems weird to programmers at first, it's their natural way to do things. So when you are getting information on how to solve a problem, you're likely to get it functional form. And when you're describing how your program works, you're likely to be better understood if you're talking in functional terms.

Lots of nested if's, Massive for-loops, Monster methods, Flags for tracking state: These "code smells" suggest a level of complexity that doesn't break down well using the conventional object-oriented, algorithmic approach. Functional decomposition might be just the tool you need to pry the problem apart into manageable pieces.

Anything other than e-commerce: When you're doing network traffic analysis, for example, it might just be that mathematical functions are the ideal way to size up the situation and make real time routing decisions.

Functional Programming in Java

The next set of slides showed how to do functional programming in Java. The slides were too far away to read easily, but here's the gist of things:

A class that holds state has private final variables. Once assigned, they're fixed. So once created, the state of that object never changes.



To make a "change" you make a copy of the object with state variables modified.



To get an instance of the class, you use a Builder. The builder takes the state you want to set as arguments and returns an object for it.

[Note: For reasons that weren't quite clear, the builder had a chain of constructors, such that the first called the second, and so on, with a different state value passed as an argument in each call.]



[Note: For reasons that weren't quite clear, the builder had a chain of constructors, such that the first called the second, and so on, with a different state value passed as an argument in each call.] Take advantage of a variety of Java extensions: Guava - a Java collection library that provides predicates, functions, and constraints. (Although it's missing the map reduce operation which does sophisticated list processing, Lisp style.) Typesafe Pair<> and Triple<> tuples that let you pass back multiple results from a function, which is frequently desirable with a functional approach. Project Lombok - Which automatically adds getters and setters, as well as the equals and hashcode methods needed for comparisons and collections ordering. LamdaJ and JSR166y extras - Which have map.reduce . Parallel Array - A function that does a forks and joins under the covers to kick off a plethora of parallel processes.



Advantages of Functional Programming

The point here is simply that it's possible to do functional programming in Java. (In the talk, Dick showed a simplified version of a chromosome-analyzing risk-assessment algorithm, and showed how it fell apart rather nicely using functional decomposition.)

When you're used to it, according to Dick, and using it where it's appropriate, functional programs are:

Easier to write

Easier to read

Easier to debug

Easier to test

That last one really caught my attention, because I am a testing maniac. I've coded without tests, and I've coded with them. Coding with a comprehensive collection of tests is way more fun. And it's safer. Programs stay up longer, and when they do crash, they tend to avoid the ultimate sin of losing user data.

When Dick said that functional programs are easier to test, it really rang a bell. Because the first thing I need to do to make code testable is typically to break it up into teeny modules that can be independently tested. Therein lies the path to coding nirvana.

So right about this point in the talk, several things came together: My experience with testing, Dick's comments, and the description of higher order functions in John Hughes' paper. The paper has a marvelous series of examples where you create a list-processing function that takes other functions as an argument. One takes a "sum" function as an argument, and sums the list. Another takes a "product" function, and multiplies the list elements together. He goes on to give several more examples--all of which are variations on the map reduce theme.

The important thing, to circle back to a point made earlier, is that once you know the meta-function works properly on lists of arbitrary size, you only need to add a test for the new function you're adding (sum or product, for example). And all you have to prove with those tests is that the "seed" of the induction hypothesis holds true. You already know that the hypothesis holds for N+1 if it holds for N. All you have to prove is that it does indeed hold for N.

For example, you only need to prove that the sum function produces the correct result for two numbers. It's not hard to do that in a fairly comprehensive manner, given that the task is so small. You care what happens when there is a 1 and a null, or a null and a one, or a null and a null. Maybe half a dozen cases in all. You don't care what happens when there is a null in the beginning of a list, a null at the end, or if there are nulls scattered throughout the list. You don't have to test the list-processing code at the same time as the value-summing code. You divide and conquer.

As an additional benefit, debugging becomes easier, too, because you have a large collection of teensy functions. When any function has a bug, there is only a small amount of code to inspect. If function A combines functions B & C, and B and C are both known to work, then the bug has to be in the way A combines them. At any point in time, you're only looking at small amount of code, so debugging is drastically simplified.

Testing Java with Scala

Highly modular programs are easier to test, and Scala's capacity to create DSLs makes it possible to define a testing language that suits the problem. That's the same sort of thing that the Ruby community has been so succesful with. But since Scala works so well with Java, Dick pointed out that Bill Venners was able to make a tidy little DSL wrapper for JUnit and TestNG called ScalaTest.

So while highly functional Scala progams would be the ultimate in testability, it's possible to enjoy the goodness of Scala to make it easier to write tests for new and existing Java code--and to make those tests easier to read, as well. (see Inset: ScalaTest Your Java Code)

ScalaTest Your Java Code ScalaTest is modeled after my favorite Ruby test harness, RSpec. The ability to do testing in the “Behavior Driven Development” (BDD) style that RSpec allows is important, in my view, because the resulting “specs” can be easily read and reviewed--a highly significant feature that is often overlooked when the benefits of testing are enumerated. http://www.artima.com/weblogs/viewpost.jsp?thread=294984

Reviewable, Runnable Specifications Ensure Software Quality

http://www.scalatest.org/

--ScalaTest can be used to test Java classes

--Shows a readable, runnable test/spec in the BDD style

http://www.scalatest.org/quick_start

--Mentions that it works with JUnit and TestNG

--Mentions that the BDD style shown on the first page is a "trait" that can be enabled or disabled

--Mentions the ability to do functional and integration testing, as well

Scala Features

Having shown that functional programming is a valuable tool for a variety of problems, and having shown that it can be accomplished in Java, Dick Wall went on to show how much easier and simpler the code was in Scala. (It was too far away to read, but there was less of it!)

He cited these advantages of Scala, in particular. (He didn't have time to go into them so the comments that follow them are mine.)

Both Object-Oriented and Functional

It's not Lisp. And it's not Java. It's a really nice blend that lets you choose the approach you need for the problem at hand.

Fully integrated with Java.

Import Java libraries with virtually identical syntax. Use them the same way. Compile to Java classes.

Tuples

It's a highly useful construct that is used often, most frequently to return multiple values from a function. Apparently it is so useful that NetBeans has no less than 13 different varieties of typesafe Tuples.

Pattern Matching

Pattern matching is built in. That's handy, because it's useful. And having it built in keeps the syntax clean.

Closures

Closures are a marvelous thing. I got used to them in Ruby. You pass a block of code to a list, and it gets executed on every member of the list. For example: someList.each {|item| your code here} . Forget loops. You focus on the logic you want to execute for each item in the list--not the loop-processing logic.

DSLs

Scala makes it easy to write DSLs--both internal and external, using slightly different approaches. Any single-argument function can be used as an infix operator, functions can be named using symbols, and parentheses and dot-operators are optional. So you can write object.>=('a') , or object >= 'a' , where >= is defined in whatever way makes sense, using the same pattern you use for other functions: def =< (arg) {...}

Operator overloading

The >= function can be defined using speed for sports cars, hauling capacity for trucks, and weight for Sumo wrestlers.

Traits

A trait is like a class. It has fields and methods. But while it can't be instantiated, it can be included in any class that needs those operations.

Case Classes

This is a fascinating capability that I need to explore more, but early indications are that I'm going to love it. A "case class" is a class that can have multiple constructors, each with a different name. The processing for that class can now ask, "what type of myself am I?" and do the right thing, depending on the answer. To make it concrete, it would be possible to define a Vehicle class with constructors for car, boat, motorcycle, bicycle, and truck. Common methods like slow down and accelerate could be shared, but they could behave differently at times, too. (That is highly unlikely to be the best possible example. But it seems reasonable at the moment.)

Implicits

Scala's answer to Ruby's open classes. The ability to specify a wrapper type for a class, and automatically convert references to the original type into instantiations of your wrapper type. So if you create SuperString, all references to String s are autoconverted to SuperString(s). Too cool. Apparently, it also works when a String argument is passed to a method that specifies SuperString as the parameter type. Also: Default values for missing arguments. Another useful feature that comes in surprisingly handy when constructing DSLs.



And those are just the features he mentioned! Once I started looking around, I came across quite a few more:

Built-in XML handling

Given the prevalence of XML for data storage and interchange, that is a way good thing.

Built-in BNF handling

Great for building external DSLs.

All the cool types: List, HashMap, TreeMap, LinkedHashMap, Queue, Set, Stack, Tuple .

Rich Tree Processing mechansims.

TreeMap and Case Classes, for example, which turn out to be terrific for syntax trees.

Function "currying"

Iif x(a,b) is a function, than x(a,b,c) must mean that x(a,b) returns a function, so apply the returned function to c .

List comprehensions

(Something for the Pythonistas--algorithmically generated lists. You specify the algorithm, you get a list.)

So What's Missing in Scala?

Missing-method Handlers

Many Ruby DSLs do a lot of their processing when you invoke a method that isn't there. You add a handler and "do the right" thing, using the name of the method as your guide, and the fact that you were able to pass a block of code as an argument--a block of code that can easily contain calls to other methods that aren't there.



For example, that kind of handler makes it easy to create a "language" that generates XML. You called foo() and there is no method by that name? That must mean you should start an XML element by that name, so output <foo> . Now execute the code-block that was passed to foo(), if any, and generate anything you find in there. When you're done doing all of that, generate </foo> . Voila! Instant XML generation without having to pre-define a method for every tag you might want to generate.





Many Ruby DSLs do a lot of their processing when you invoke a method that isn't there. You add a handler and "do the right" thing, using the name of the method as your guide, and the fact that you were able to pass a block of code as an argument--a block of code that can easily contain calls to other methods that aren't there. For example, that kind of handler makes it easy to create a "language" that generates XML. You called and there is no method by that name? That must mean you should start an XML element by that name, so output . Now execute the code-block that was passed to foo(), if any, and generate anything you find in there. When you're done doing all of that, generate . Voila! Instant XML generation without having to pre-define a method for every tag you might want to generate. Open Classes

(Not a missing feature. Covered by Scala implicits My thanks to the folks who clued me in.)

Open classes let you add features at runtime. That feature is kind of like a chain saw, actually. It's really powerful. But it can also rip your leg off. And it's tricky to maintain. (What is that class doing *now*?) It's the kind of thing that would be driving programmers nuts by the end of a 20-year maintenance cycle. But it's a godsend when the developers of the original library weren't prescient enough (or didn't have the time) to develop everything you need.





(Not a missing feature. Covered by Scala My thanks to the folks who clued me in.) Pattern-Matching Operator

Scala has a cool match/case mechanism that looks like the Java switch statement, only with regular expressions. But so far, I haven't seen the simple match operator common to Icon and Ruby. In Ruby, the expression syntax is:

someString =~ /regularExpression/



The expression returns true if the string matches the regular expression, false otherwise. It's not as powerful as the match/case statement, but it's convenient and concise.

Of course, Scala makes it pretty easy to define such an operator, and use it the way you would expect to. (Any single-argument function can be used as an infix operator, function names can be symbols, and parentheses and dot-operators are optional. So the syntax comes out exactly the way you want to see it.) Definition: class regexTest(String s) { String local_s = s def @= (regEx re) { match local_s { case re: true case _: false } } } Like a true functional language, the last expression to be evaluated is returned. So if the match succeeeds, true is returned. Otherwise, false . Usage: regexTest(someString) @= /regularExpression/

You then use an "implicit SuperString" definition to make it available in every reference to the String type.

Bottom Line

With all of that functionality in place, what exactly is Scala missing? So far, I've only found a few things:

Ok. Scala is missing a couple of Ruby's cool features. But it's not missing all that much. And it has a lot going for it. (For other takes on the subject, see the Inset.)

Comparing Ruby and Scala In Things a Ruby Developer Needs to Know about Scala, a fellow Ruby-head waxes poetic about Scala's strengths.

Some highlights: "10-100 times faster...(with) built-in syntactic support for XML"

"Scala makes it easy (and intuitive) to write functional style code."

That's important because "Much of Ruby's beauty and productivity comes from its functional side"

"Much better facilities for writing DSLs. Scala's syntax is more flexible and expressive that Ruby's"

"Scala's parser library lets you have BNF-style expressions right in your code" See also this primer on constructing internal DSLs in Scala. It's a follow-on to an earlier paper that told how to write external DSLs. Highlights: "Implicits are perhaps the biggest powerhouse towards designing user friendly DSLs. (They) offer a safe way to extend existing abstractions. Implicits in Scala is perhaps one of the best examples of a meaningful compromise in language design between uncontrolled open classes of Ruby and the sealed abstractions of Java...."

"...And the best part is that the entire extension...is lexically scoped and will only be available within the scope of the

implicit definition function."

Resources



http://scala-lang.org/docu/files/ScalaTutorial.pdf

A clear, concise intro that builds on Java familiarity and shows off Scala's advantages. Brilliant.





A clear, concise intro that builds on Java familiarity and shows off Scala's advantages. Brilliant. http://www.ibm.com/developerworks/java/library/j-scala06278.html#N1019D

The busy Java developer's guide to Scala: Collection types.





The busy Java developer's guide to Scala: Collection types. http://www.artima.com/scalazine/articles/steps.html

12 steps to using Scala. Lots of good info.





12 steps to using Scala. Lots of good info. http://escalatesoft.com

A venture partnered by Dick Wall and Bill Venners, devoted to teaching Scala. They have free intro sessions from time to time. (Subscribe to their RSS feed to find out about them.)



Talk Back!

Have an opinion? Readers have already posted 11 comments about this weblog entry. Why not add yours?

RSS Feed

If you'd like to be notified whenever Eric Armstrong adds a new entry to his weblog, subscribe to his RSS feed.

About the Blogger

Eric Armstrong has been programming and writing professionally since before there were personal computers. His production experience includes artificial intelligence (AI) programs, system libraries, real-time programs, and business applications in a variety of languages. He works as a writer and software consultant in the San Francisco Bay Area. He wrote The JBuilder2 Bible and authored the Java/XML programming tutorial available at http://java.sun.com. Eric is also involved in efforts to design knowledge-based collaboration systems.

This weblog entry is Copyright © 2010 Eric Armstrong. All rights reserved.