The busy Java developer's guide to Scala

Dive deeper into Scala concurrency

Find out how actors offer a new way of modeling your application code

About this series: Ted Neward dives into the Scala programming language and takes you along with him. In this developerWorks series, you'll learn what all the recent hype is about and see some of Scala's linguistic capabilities in action. Scala code and Java™ code will be shown side by side wherever comparison is relevant, but (as you'll discover) many things in Scala have no direct correlation to anything you've found in Java, and therein lies much of Scala's charm! After all, if Java could do it, why bother learning Scala?

In the last article, I talked about the importance of building concurrent code (whether it's in Scala or not) and some of the problems and issues facing developers in doing so, including not locking too much, not locking too little, avoiding deadlocks, avoiding spinning up too many threads, and so on. Quite a depressing list.

Not being one to leave things in such a heavy state of despair, I then began to explore with you some of Scala's concurrency constructs, beginning with the basic use of the Java language's concurrency libraries directly from within Scala, then moving on to the MailBox type from the Scala API. And while both approaches are certainly viable, that's not where the real "hotness" of Scala and concurrency comes from.

Enter Scala actors ... stage left.

What's an "actor"?

An "actors" implementation, essentially speaking, is an approach that uses message passing between executing entities known as actors to coordinate work (notice the intentional refusal to use words like "process," "thread," or "machine"). If this sounds vaguely like an RPC mechanism, it is and it isn't. Where an RPC call (like a Java RMI call) blocks at the caller side until the server side has completed its processing and sent back some kind of response (return value or exception), a message-passing approach deliberately does not block the caller, which neatly sidesteps the deadlocks that come from callers waiting on one another.

In and of itself, simply passing messages around doesn't avoid all the problems of concurrency-incorrect code. But this approach also fosters a "share-nothing" style of programming in which different actors have no access to shared data structures (which incidentally helps encapsulate whether a given actor is local to this JVM or all the way across the world), and that removes the need for synchronization between them. After all, as we've seen before, if there is nothing shared, there is nothing we need to synchronize concurrent execution against.
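To make the idea concrete before touching the Scala Actors API, here is a minimal sketch of the two ingredients just described: a private mailbox and a non-blocking send. It deliberately uses only java.util.concurrent rather than scala.actors, and every name in it (MailboxSketch, TinyActor, Say, Stop) is made up for illustration; it is a toy, not how the library is implemented.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable.ListBuffer

object MailboxSketch {
  // Hypothetical message protocol, used by this sketch only.
  sealed trait Msg
  final case class Say(line: String) extends Msg
  case object Stop extends Msg

  // A toy "actor": a private mailbox drained by exactly one worker thread.
  final class TinyActor {
    private val mailbox = new LinkedBlockingQueue[Msg]()
    // Touched only by the worker thread, so it needs no lock.
    private val spoken = ListBuffer.empty[String]

    private val worker = new Thread(new Runnable {
      override def run(): Unit = {
        var running = true
        while (running) {
          mailbox.take() match { // blocks the worker, never the sender
            case Say(line) => spoken += line
            case Stop      => running = false
          }
        }
      }
    })
    worker.start()

    // Enqueue and return immediately; the sender does not wait for processing.
    def send(msg: Msg): Unit = mailbox.put(msg)

    // Wait for the worker to finish, then hand back what it recorded.
    def awaitResult(): List[String] = {
      worker.join()
      spoken.toList
    }
  }

  def demo(): List[String] = {
    val a = new TinyActor
    a.send(Say("Do ya feel lucky, punk?"))
    a.send(Say("Well, do ya?"))
    a.send(Stop)
    a.awaitResult()
  }
}
```

The only object the two threads share is the queue itself; the actor's state (spoken) is never visible to the sender while the worker is running, which is the "share-nothing" part.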

This is hardly a formal description of the actors model and undoubtedly those with a more formal background in computer science will find all kinds of ways that this description fails to capture the full details of actors. But for the purposes of this article, it serves as a good basis. More detailed and formal descriptions can be found on the Web, including several academic papers that detail the concepts behind actors (I'll leave it to your discretion to catch up on these later). For now, we're ready to take a look at the Scala actors API.

The Scala actors

Fundamentally, working with actors is not that difficult. The easiest way to start is to create an actor using the actor method of the Actor object, as shown in Listing 1:

Listing 1. And ... action!

    import scala.actors._, Actor._

    package com.tedneward.scalaexamples.scala.V4 {
      object Actor1 {
        def main(args : Array[String]) = {
          val badActor =
            actor {
              receive {
                case msg => System.out.println(msg)
              }
            }

          badActor ! "Do ya feel lucky, punk?"
        }
      }
    }

A couple of things are happening here simultaneously.

First, we import the Scala Actors library from its self-named package, and then we import the members of the Actor object from that library directly. This second step isn't strictly necessary, because we could go ahead and use Actor.actor instead of actor in the code later, but doing it this way gives the impression that actor is a built-in construct of the language and (in some opinions) makes the code more readable.

The next step is to create the actor itself, using the actor method, which takes a block of code as a parameter. In this case, the block of code executes a simple receive, which we'll get to in a second. The result is an actor, stored into a value reference, ready for use.

Remember that actors don't use methods for communication, but messages. It may be a bit counterintuitive when I say that the next line, using the !, is actually a method (in reality) that sends a message (figuratively) to the badActor. Under the hood, buried deeply in the Actor trait, is another of those MailBox elements that we looked at last time; the ! method takes the passed parameter (in this case, a String), dumps it into the mailbox, and returns immediately.

Once the message has been delivered to the actor, the actor processes the message by invoking its receive method, which does exactly as its name implies — it pulls the first available message from the mailbox and delivers it to an implicit pattern-matching block. Note that because you don't specify a type for the pattern-matched case, anything will match and the message is bound to the msg name (which you need in order to print it).

Take careful note of the fact that there aren't any restrictions on the type that can be sent; you're not restricted to just strings, as the previous example might imply. In fact, actors-based designs frequently use Scala case classes to convey the message itself: the type of the case class provides an implicit "command" or "action" to execute, and the members of the case class act as the parameters or data for that action.

For example, suppose you want the actor to carry out a couple of different actions in response to messages sent; the new implementation would look something similar to Listing 2:

Listing 2. Hey, I'm a director!

    object Actor2 {
      case class Speak(line : String)
      case class Gesture(bodyPart : String, action : String)
      // A message with no payload is modeled as a case object
      case object NegotiateNewContract

      def main(args : Array[String]) = {
        val badActor =
          actor {
            receive {
              case NegotiateNewContract =>
                System.out.println("I won't do it for less than $1 million!")
              case Speak(line) =>
                System.out.println(line)
              case Gesture(bodyPart, action) =>
                System.out.println("(" + action + "s " + bodyPart + ")")
              case _ =>
                System.out.println("Huh? I'll be in my trailer.")
            }
          }

        badActor ! NegotiateNewContract
        badActor ! Speak("Do ya feel lucky, punk?")
        badActor ! Gesture("face", "grimace")  // the actor appends the "s"
        badActor ! Speak("Well, do ya?")
      }
    }

So far, all is well and good, but when this runs, only the new contract is negotiated; after that, the JVM terminates. At first, it may feel like the spawned thread isn't responding to the messages quickly enough, but remember, in an actors model, you don't deal with threads, per se, just message passing. Instead, the problem here is much more straightforward: One receive begets one message, so the fact that you've queued up multiple messages doesn't matter because there's only been one receive and only one message delivered, regardless of how many might be queued up waiting for processing.
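The one-receive-one-message behavior can be seen in isolation with a plain queue standing in for the actor's mailbox. This is only an analogy built on java.util.concurrent, not the scala.actors mailbox, and the names OneReceiveSketch and receiveOnce are hypothetical:

```scala
import java.util.concurrent.LinkedBlockingQueue

object OneReceiveSketch {
  // Stand-in for a single receive: take exactly one message off the queue.
  def receiveOnce(mailbox: LinkedBlockingQueue[String]): String = mailbox.take()

  // Returns the one message that was handled and how many were left behind.
  def demo(): (String, Int) = {
    val mailbox = new LinkedBlockingQueue[String]()
    List("NegotiateNewContract", "Speak", "Gesture", "Speak")
      .foreach(m => mailbox.put(m))

    val handled = receiveOnce(mailbox)
    (handled, mailbox.size())
  }
}
```

Four messages go in, one comes out, and three sit in the queue with nobody left to read them, which is the situation the actor in Listing 2 ends up in.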

Fixing this requires the following changes to the code, shown in Listing 3:

Put the receive block inside of a near-infinite loop.

Create a new case class to indicate when the whole thing is done processing.

Listing 3. Now I'm a much better director!

    object Actor2 {
      case class Speak(line : String)
      case class Gesture(bodyPart : String, action : String)
      // Messages with no payload are modeled as case objects
      case object NegotiateNewContract
      case object ThatsAWrap

      def main(args : Array[String]) = {
        val badActor =
          actor {
            var done = false
            while (! done) {
              receive {
                case NegotiateNewContract =>
                  System.out.println("I won't do it for less than $1 million!")
                case Speak(line) =>
                  System.out.println(line)
                case Gesture(bodyPart, action) =>
                  System.out.println("(" + action + "s " + bodyPart + ")")
                case ThatsAWrap =>
                  System.out.println("Great cast party, everybody! See ya!")
                  done = true
                case _ =>
                  System.out.println("Huh? I'll be in my trailer.")
              }
            }
          }

        badActor ! NegotiateNewContract
        badActor ! Speak("Do ya feel lucky, punk?")
        badActor ! Gesture("face", "grimace")  // the actor appends the "s"
        badActor ! Speak("Well, do ya?")
        badActor ! ThatsAWrap
      }
    }

Whoever said making a movie is hard clearly didn't have Scala actors in the cast.

Acting concurrently

One thing that's not apparent from this code is where the concurrency (if any) is coming from — from what I've shown you so far, this could very well be another synchronous form of method calls and you wouldn't be able to tell the difference. (Technically, you could infer that there's some concurrency going on from the second example before I put the near-infinite loop in, but that's more an accidental proof and clearly not an iron-clad one.)

To prove that some threads are underneath all of this, take a deeper look at the earlier example:

Listing 4. I'm ready for my close-up, Mr. DeMille

    object Actor3 {
      case class Speak(line : String)
      case class Gesture(bodyPart : String, action : String)
      // Messages with no payload are modeled as case objects
      case object NegotiateNewContract
      case object ThatsAWrap

      def main(args : Array[String]) = {
        // Evaluated on every use, so it names whichever thread is running
        def ct = "Thread " + Thread.currentThread().getName() + ": "

        val badActor =
          actor {
            var done = false
            while (! done) {
              receive {
                case NegotiateNewContract =>
                  System.out.println(ct + "I won't do it for less than $1 million!")
                case Speak(line) =>
                  System.out.println(ct + line)
                case Gesture(bodyPart, action) =>
                  System.out.println(ct + "(" + action + "s " + bodyPart + ")")
                case ThatsAWrap =>
                  System.out.println(ct + "Great cast party, everybody! See ya!")
                  done = true
                case _ =>
                  System.out.println(ct + "Huh? I'll be in my trailer.")
              }
            }
          }

        System.out.println(ct + "Negotiating...")
        badActor ! NegotiateNewContract
        System.out.println(ct + "Speaking...")
        badActor ! Speak("Do ya feel lucky, punk?")
        System.out.println(ct + "Gesturing...")
        badActor ! Gesture("face", "grimace")  // the actor appends the "s"
        System.out.println(ct + "Speaking again...")
        badActor ! Speak("Well, do ya?")
        System.out.println(ct + "Wrapping up")
        badActor ! ThatsAWrap
      }
    }

When this new example runs, it becomes pretty clear that two different threads are involved:

The main thread (the same one that kicks every Java main off)

The Thread-2 thread, which was spun off by the Scala Actors library under the hood

So yes, fundamentally we have always been doing multithreaded execution when we fired up that first actor.

But getting used to this new model of execution can be a bit awkward, if only because it represents an entirely different way of thinking about concurrency. For example, consider the Producer/Consumer model from the last article. There was a fair amount of code there, particularly in the Drop class, leaving us with a pretty clear view of what was going on as the threads interacted with one another and with the monitors required to keep everything in sync. I repeat the V3 code from last article here for reference:

Listing 5. ProdConSample, v3 (Scala)

    package com.tedneward.scalaexamples.scala.V3 {
      import concurrent.MailBox
      import concurrent.ops._

      object ProdConSample {
        class Drop {
          private val m = new MailBox()

          private case class Empty()
          private case class Full(x : String)

          m send Empty()  // initialization

          def put(msg : String) : Unit = {
            m receive {
              case Empty() =>
                m send Full(msg)
            }
          }

          def take() : String = {
            m receive {
              case Full(msg) =>
                m send Empty(); msg
            }
          }
        }

        def main(args : Array[String]) : Unit = {
          // Create Drop
          val drop = new Drop()

          // Spawn Producer
          spawn {
            val importantInfo : Array[String] = Array(
              "Mares eat oats",
              "Does eat oats",
              "Little lambs eat ivy",
              "A kid will eat ivy too"
            );

            importantInfo.foreach((msg) => drop.put(msg))
            drop.put("DONE")
          }

          // Spawn Consumer
          spawn {
            var message = drop.take()
            while (message != "DONE") {
              System.out.format("MESSAGE RECEIVED: %s%n", message)
              message = drop.take()
            }
          }
        }
      }
    }

While it's interesting to see how Scala has simplified parts of that code, in truth it's not really all that different conceptually from the original Java version. But now let's look at what an actor-based version of the Producer/Consumer example might look like if you pared it down to its barest essentials:

Listing 6. Take 1. And ... Action! Produce! Consume!

    object ProdConSample1 {
      case class Message(msg : String)

      def main(args : Array[String]) : Unit = {
        val consumer =
          actor {
            var done = false
            while (! done) {
              receive {
                case msg =>
                  System.out.println("Received message! -> " + msg)
                  done = (msg == "DONE")
              }
            }
          }

        consumer ! "Mares eat oats"
        consumer ! "Does eat oats"
        consumer ! "Little lambs eat ivy"
        consumer ! "Kids eat ivy too"
        consumer ! "DONE"
      }
    }

This first version certainly wins in terms of brevity and, in some situations, maybe does all that needs to be done, but running the code and comparing it against the earlier versions reveals an important difference: the actors-based version is a multiplace buffer, not a single-slot drop like we'd been working with before. To some, this may seem like an enhancement, not a drawback, but let's make sure to compare like to like by creating an actors-based version of the Drop in which each call to put() must be balanced by a call to take().

Fortunately, the Scala Actors library can mimic this functionality pretty easily. Fundamentally, you want the Producer to block until the Consumer has received the message; that's most easily handled by having the Producer block until it receives an acknowledgement from the Consumer that the message has been received. In one sense, this is what the previous monitor-based code is doing using the monitor around the lock object to do that signaling.

The easiest way to do this in the Scala Actors library is to use the !? method, which blocks until a reply is received, instead of the ! method. (In case you're curious, in the Scala Actors implementation, every Java thread is already an actor, so the reply is coming to the mailbox that's implicitly associated with the main thread.) This means that the consumer needs to send some kind of acknowledgement; it does this using the reply method that it implicitly inherits (along with the receive method), as shown in Listing 7:

Listing 7. Take 2 ... Action!

    object ProdConSample2 {
      case class Message(msg : String)

      def main(args : Array[String]) : Unit = {
        val consumer =
          actor {
            var done = false
            while (! done) {
              receive {
                case msg =>
                  System.out.println("Received message! -> " + msg)
                  done = (msg == "DONE")
                  reply("RECEIVED")
              }
            }
          }

        System.out.println("Sending....")
        consumer !? "Mares eat oats"
        System.out.println("Sending....")
        consumer !? "Does eat oats"
        System.out.println("Sending....")
        consumer !? "Little lambs eat ivy"
        System.out.println("Sending....")
        consumer !? "Kids eat ivy too"
        System.out.println("Sending....")
        consumer !? "DONE"
      }
    }
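If the send-and-wait handshake feels like magic, the following sketch shows one way such a handshake can be built by hand. It is an assumption-laden stand-in, not the scala.actors implementation: each request carries its own reply slot (a java.util.concurrent.CompletableFuture), and the names SyncSendSketch, Request, and sendAndWait are invented for this example.

```scala
import java.util.concurrent.{CompletableFuture, LinkedBlockingQueue}

object SyncSendSketch {
  // Each request carries its own reply slot (hypothetical shape for this sketch).
  final case class Request(body: String, replyTo: CompletableFuture[String])

  final class Consumer {
    private val mailbox = new LinkedBlockingQueue[Request]()

    private val worker = new Thread(new Runnable {
      override def run(): Unit = {
        var done = false
        while (!done) {
          val req = mailbox.take()
          done = req.body == "DONE"
          // The acknowledgement: completing the future wakes the waiting sender.
          req.replyTo.complete("RECEIVED: " + req.body)
        }
      }
    })
    worker.start()

    // Send, then block the caller until the consumer has acknowledged.
    def sendAndWait(body: String): String = {
      val reply = new CompletableFuture[String]()
      mailbox.put(Request(body, reply))
      reply.get()
    }
  }

  def demo(): List[String] = {
    val consumer = new Consumer
    List("Mares eat oats", "Does eat oats", "DONE")
      .map(m => consumer.sendAndWait(m))
  }
}
```

Because the producer cannot send message N+1 until message N has been acknowledged, the mailbox never holds more than one item, which gives back the single-slot behavior of the original Drop.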

Or, if you prefer the version that uses spawn to kick off the Producer into a separate thread from main() (which is most closely mirroring the original), it might look like Listing 8:

Listing 8. Take 4 ... Action!

    object ProdConSampleUsingSpawn {
      import concurrent.ops._

      def main(args : Array[String]) : Unit = {
        // Spawn Consumer
        val consumer =
          actor {
            var done = false
            while (! done) {
              receive {
                case msg =>
                  System.out.println("MESSAGE RECEIVED: " + msg)
                  done = (msg == "DONE")
                  reply("RECEIVED")
              }
            }
          }

        // Spawn Producer
        spawn {
          val importantInfo : Array[String] = Array(
            "Mares eat oats",
            "Does eat oats",
            "Little lambs eat ivy",
            "A kid will eat ivy too",
            "DONE"
          );

          importantInfo.foreach((msg) => consumer !? msg)
        }
      }
    }

Whichever way you look at it, the actors-based version is a lot simpler than the original ... as long as the reader can keep the actors and the implicit mailboxes straight.

This isn't a trivial point. The actors model seriously upends the whole process of thinking about concurrency and thread-safety; it changes from a model where we focus on the shared data structures (data concurrency) to a model where we focus on the structure of the code itself acting on the data (task concurrency) and sharing as little of the data as possible. Notice that inversion in the Producer/Consumer samples. In the earlier, monitor-based examples, the concurrency was explicitly written around the Drop class (the bounded buffer). In the actor-based versions in this article, the Drop doesn't even make an appearance, and the focus remains on the two actors (threads) and their interaction via shared-nothing messages.

Naturally, it's still possible to build data-centric concurrency constructs with actors; you just have to take a slightly different approach to it. Consider this simple "counter" object that uses actor messages to convey the "increment" and "get" operations, as shown in Listing 9:

Listing 9. Take 5 ... Count!

    object CountingSample {
      case class Incr()
      case class Value(sender : Actor)
      case class Lock(sender : Actor)
      case class UnLock(value : Int)

      class Counter extends Actor {
        override def act(): Unit = loop(0)

        // The current count lives only in this parameter; nothing else can see it
        def loop(value: Int): Unit = {
          receive {
            case Incr() => loop(value + 1)
            case Value(a) => a ! value; loop(value)
            case Lock(a) =>
              a ! value
              receive { case UnLock(v) => loop(v) }
            case _ => loop(value)
          }
        }
      }

      def main(args : Array[String]) : Unit = {
        val counter = new Counter
        counter.start()
        counter ! Incr()
        counter ! Incr()
        counter ! Incr()
        counter ! Value(self)
        receive { case cvalue => Console.println(cvalue) }
        counter ! Incr()
        counter ! Incr()
        counter ! Value(self)
        receive { case cvalue => Console.println(cvalue) }
      }
    }
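What the counter buys you is easiest to see under contention. The sketch below is a hand-rolled analogue of the same idea using only java.util.concurrent (the names ConfinedCounterSketch, Get, and Stop are made up, and this is not the scala.actors API): several threads send increments at once, yet the count itself is an ordinary unsynchronized var, because only the counter's own thread ever touches it.

```scala
import java.util.concurrent.{CompletableFuture, LinkedBlockingQueue}

object ConfinedCounterSketch {
  // Hypothetical message protocol for this sketch.
  sealed trait Msg
  case object Incr extends Msg
  final case class Get(replyTo: CompletableFuture[Int]) extends Msg
  case object Stop extends Msg

  final class Counter {
    private val mailbox = new LinkedBlockingQueue[Msg]()

    private val worker = new Thread(new Runnable {
      override def run(): Unit = {
        var value = 0 // plain var: only this thread ever reads or writes it
        var running = true
        while (running) {
          mailbox.take() match {
            case Incr       => value += 1
            case Get(reply) => reply.complete(value)
            case Stop       => running = false
          }
        }
      }
    })
    worker.start()

    def send(msg: Msg): Unit = mailbox.put(msg)
  }

  def demo(senders: Int, perSender: Int): Int = {
    val counter = new Counter
    val threads = (1 to senders).map { _ =>
      new Thread(new Runnable {
        override def run(): Unit =
          (1 to perSender).foreach(_ => counter.send(Incr))
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join()) // every increment is now in the mailbox

    val reply = new CompletableFuture[Int]()
    counter.send(Get(reply)) // queued behind all of the increments
    val total = reply.get()
    counter.send(Stop)
    total
  }
}
```

The mailbox serializes the increments, so no update is lost even though the var is never locked.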

Or, to be more in line with the thrust of the Producer/Consumer example, Listing 10 is a version of Drop that uses actors internally (probably in order to allow other Java classes to use the Drop without having to worry about how to call the actors methods directly):

Listing 10. Drop, meet actors

    object ActorDropSample {
      class Drop {
        private case class Put(x: String)
        private case object Take
        private case object Stop

        private val buffer =
          actor {
            var data = ""
            loop {
              react {
                case Put(x) if data == "" =>
                  data = x; reply()
                case Take if data != "" =>
                  val r = data; data = ""; reply(r)
                case Stop =>
                  reply(); exit("stopped")
              }
            }
          }

        def put(x: String) { buffer !? Put(x) }
        def take() : String = (buffer !? Take).asInstanceOf[String]
        def stop() { buffer !? Stop }
      }

      def main(args : Array[String]) : Unit = {
        import concurrent.ops._

        // Create Drop
        val drop = new Drop()

        // Spawn Producer
        spawn {
          val importantInfo : Array[String] = Array(
            "Mares eat oats",
            "Does eat oats",
            "Little lambs eat ivy",
            "A kid will eat ivy too"
          );

          importantInfo.foreach((msg) => { drop.put(msg) })
          drop.put("DONE")
        }

        // Spawn Consumer
        spawn {
          var message = drop.take()
          while (message != "DONE") {
            System.out.format("MESSAGE RECEIVED: %s%n", message)
            message = drop.take()
          }
          drop.stop()
        }
      }
    }

As you can see, it requires more code (and additional threads because each actor acts inside of a thread pool), but this version is API-equivalent to the versions built before, putting all of the concurrency concerns inside of the Drop class where Java developers traditionally expect it to be.

There's more to actors, too.

In some cases, such as massively scaled systems, having each actor backed by a Java thread is going to be too heavy and wasteful, particularly if each actor spends more of its time waiting than processing. In those situations, an event-based actor might be appropriate; it effectively sits inside of a closure that captures the rest of the actor's actions, that is, a block of code (a function) that no longer has to be represented via thread state, registers, and such. The closure is fired once a message comes in to the actor (this obviously requires an active thread), so the closure borrows an active thread for the period of its activity, then either terminates or puts itself into another "wait" state by calling back into itself, effectively releasing the thread for other uses. (See the Haller/Odersky papers in the Related topics.)

Within the Scala Actors library, this is done with the react method instead of the receive method used in most of this article's listings. The key to using react is that, formally, react never returns, so the code inside the react block has to re-invoke the block of code containing the react. This is where the loop construct comes in handy, creating a near-infinite loop (as its name implies). This means that the Drop implementation in Listing 10 could actually operate purely by borrowing callers' threads, reducing the number of threads required to carry out all the operations required. (In practice, I've never seen this happen with a trivial example so I guess we'll have to take the Scala designers' word for it.)
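The thread-borrowing idea can be approximated without the library. The sketch below is not how react is implemented; it swaps in a different, simpler technique (a mailbox whose drain task is submitted to a shared thread pool only while messages are pending) to show that many actors can get by on very few threads. All names (EventActorSketch, EventActor, demo) and the pool size of two are assumptions made for the illustration.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch, ExecutorService, Executors, TimeUnit}
import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger}

object EventActorSketch {
  // An actor with no thread of its own: it borrows a pool thread
  // only while it has messages waiting.
  final class EventActor[A <: AnyRef](pool: ExecutorService)(handler: A => Unit) {
    private val mailbox = new ConcurrentLinkedQueue[A]()
    private val scheduled = new AtomicBoolean(false)

    def send(msg: A): Unit = {
      mailbox.add(msg)
      trySchedule()
    }

    // At most one drain task per actor is ever in flight.
    private def trySchedule(): Unit =
      if (!mailbox.isEmpty && scheduled.compareAndSet(false, true))
        pool.execute(new Runnable { override def run(): Unit = drain() })

    private def drain(): Unit = {
      try {
        var msg = mailbox.poll()
        while (msg != null) {
          handler(msg)
          msg = mailbox.poll()
        }
      } finally {
        scheduled.set(false) // give the borrowed thread back
        trySchedule()        // a message may have arrived after the last poll
      }
    }
  }

  // Runs many actors on a two-thread pool; returns how many messages were handled.
  def demo(actors: Int, perActor: Int): Int = {
    val pool = Executors.newFixedThreadPool(2)
    val handled = new AtomicInteger(0)
    val latch = new CountDownLatch(actors * perActor)

    val all = (1 to actors).map { _ =>
      new EventActor[String](pool)(_ => {
        handled.incrementAndGet()
        latch.countDown()
      })
    }
    for (a <- all; i <- 1 to perActor) a.send("msg-" + i)

    latch.await(10, TimeUnit.SECONDS)
    pool.shutdown()
    handled.get()
  }
}
```

Calling EventActorSketch.demo(1000, 3) exercises a thousand actors on two threads; a thread-per-actor design would need a thousand threads to do the same work.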

In some cases, you may choose to inherit from the base Actor trait (in which case the act method has to be defined or the class remains abstract) in order to create a new class that implicitly acts as an actor. Having said that, that idea is falling out of favor within the Scala community; in general, the approach I sketched out (using the actor method from the Actor object) is the preferred way to create a new actor.

Conclusion

Because programming in actors demands a slightly different style than programming in "traditional" objects, there are a couple of things to keep in mind when working with actors.

First, remember that much of the power of actors comes from the message-passing style rather than the blocking-invocation style that characterizes the rest of the imperative programming world. (Interestingly, object-oriented languages that use message passing as a core principle are out there. Two of the most widely recognized are Objective-C and Smalltalk, and there is also a newcomer on the block, Ioke, created by fellow ThoughtWorker Ola Bini.) If you create classes that extend Actor directly or indirectly, try to ensure that all invocations against said objects are done through message passing.

Second, because messages can be delivered at any given point in time and (more importantly) may come with considerable delay between sending and receiving, take care to make sure that messages carry all the state they might need in order to be handled correctly. This approach will result in:

Making the code easier to understand (because the message will be carrying all the state it needs to process)

Reducing the chance that the actor will be accessing shared state someplace else, thus reducing the chance of deadlock or other concurrency nightmare
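As a small illustration of a message that carries its own state, consider the sketch below. The pricing scenario and the names SelfContainedMessageSketch, PriceOrder, and handle are hypothetical; the point is only that the handler reads nothing except the immutable message it was given.

```scala
object SelfContainedMessageSketch {
  // Everything the handler needs travels inside the immutable message.
  final case class PriceOrder(itemId: String, unitPriceCents: Long, quantity: Int)

  // The handler consults no state outside the message.
  def handle(msg: PriceOrder): Long = msg.unitPriceCents * msg.quantity

  def demo(): Long = {
    var currentPriceCents = 250L
    // The price is snapshotted into the message at send time...
    val msg = PriceOrder("popcorn", currentPriceCents, 4)
    // ...so a later change on the sender's side cannot affect it,
    // no matter how long the message waits in a mailbox.
    currentPriceCents = 999L
    handle(msg)
  }
}
```

Had the message carried only the item ID, the handler would have needed to look the price up in shared state, and the answer would have depended on when the message happened to be processed.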

Third, although this probably should be somewhat obvious in the context of this conversation, it bears mentioning that actors shouldn't block. At its heart, blocking is what causes deadlock; the more your code can avoid blocking, the more opportunities for deadlock you'll avoid.

Interestingly enough, if you're familiar with the Java Message Service (JMS) API, you'll find strong parallels in the recommendations I'm making. After all, the actors message-passing style is just messages being passed between entities, exactly as JMS message passing is. The differences are that JMS messages tend to be larger in scale and operate at the level of tiers and processes, whereas actors messages tend to be smaller in scale and operate at the level of objects and threads. If you get JMS, you'll get actors.

Actors aren't a panacea to fix every concurrency problem your code might run into, but they definitely present a new way of modeling your application or library code, using constructs that look and act in a fairly simple and straightforward manner. That doesn't mean they'll always behave the way you expect, but some of that behavior is to be expected — after all, objects probably didn't behave the way you expected the first time you ran into them.

That's it for this installment; until next time, enjoy!

Related topics