In his article about state management in Clojure, Rich Hickey discusses his reasons for choosing not to use the Erlang-style actor model. While Erlang has made some implementation choices that can lead to problems, those problems are not intrinsic to the Actor Model. As the actor implementation with the longest history of success, Erlang naturally represents many people’s understanding of the nature of actors — just as Smalltalk represents many people’s understanding of objects. Patterns for actor-based problem solving are still emerging, but my experience with programming actors-in-the-small (i.e., fine-grained concurrency in a shared-memory multicore context) leads me to believe that there is great potential in this largely-misunderstood model. So with that in mind, let’s break down Rich’s reasons and address them one at a time.

It is a much more complex programming model, requiring 2-message conversations for the simplest data reads, and forcing the use of blocking message receives, which introduce the potential for deadlock.

Erlang’s nested (blocking) receive is not part of Hewitt’s original actor model [1] or Agha’s elaboration of it [2]. By introducing such a mechanism, a kind of deadlock can occur in Erlang. Of course, Erlang provides additional mechanisms, such as time-outs and supervision trees, for handling these failures. In the context of fault-tolerant components and distributed systems these mechanisms are very useful for creating reliable systems, but they are not required for shared-memory multiprocessing.

The actor model does require two messages to “read” data from an object/actor — a request message and its corresponding reply. This is actually what allows you to avoid blocking concurrent requests. The messages are asynchronous, so nothing really needs to be blocked. If the requestor is unable to proceed without the data from the reply, then the requestor may be logically blocked, but that is not a result of using actors, it’s a result of the pattern of interaction used in a particular design.

In most cases, there is much more potential concurrency to exploit in a particular system. Results may not even be “returned” to the requestor. Instead, results can be directed to the object/actor that needs the data. This leads to more of a flow-based approach to decomposing the system. Data flows asynchronously and concurrently to where it is needed. The actors in the system simply react to the arrival of new information in the form of messages representing work to do.
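This flow-based pattern can be sketched in Python (a toy illustration; the `Actor` helper and all names here are invented for this example, not part of any actor library). The request names a customer, and the reply flows directly to that customer rather than back to the requestor, so nothing in the design blocks:

```python
import queue
import threading

class Actor:
    """Toy actor: a FIFO mailbox drained by one daemon thread."""
    def __init__(self, behavior):
        self._mailbox = queue.Queue()
        self.behavior = behavior
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, msg):                      # asynchronous: returns immediately
        self._mailbox.put(msg)

    def _run(self):
        while True:
            self.behavior(self, self._mailbox.get())

done = queue.Queue()                          # lets the main thread observe the end

def cell_beh(value):
    def beh(self, msg):
        tag, customer = msg
        if tag == "read":
            customer.send(("value", value))   # reply goes to the customer
    return beh

# The "third party" that actually needs the data.
consumer = Actor(lambda self, msg: done.put(msg))

cell = Actor(cell_beh(42))
cell.send(("read", consumer))                 # requestor names consumer as customer
result = done.get(timeout=2)                  # only the test harness waits here
print(result)                                 # ('value', 42)
```

Note that the requestor goes on about its business immediately after sending the request; the data arrives where it is needed, when it is available.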

Programming for the failure modes of distribution means utilizing timeouts etc. It causes a bifurcation of the program protocols, some of which are represented by functions and others by the values of messages.

The key idea here is to focus on the protocol of messages. Think of “protocol” as a replacement for “interface” in designing loosely-coupled components. Components that can speak the same protocol can be used interchangeably and even safely upgraded or hot-swapped. Having appropriate strategies and mechanisms for handling distributed failure modes makes it possible to build extremely reliable and resilient systems. Erlang provides many valuable patterns for addressing these issues. However, these mechanisms are not required for communication within the same address space and are not intrinsic to the actor model.

The bifurcation encouraged by actor-based programming is between values and actors. Values remain constant over time. Actors may change their behavior based on messages (values) they receive, so they represent the changeable state of the system. Clojure encourages just the same bifurcation. Most of the language deals with values and functions on values. The “identity” concept is used to represent the changeable state of the system.

It doesn’t let you fully leverage the efficiencies of being in the same process. It is quite possible to efficiently directly share a large immutable data structure between threads, but the actor model forces intervening conversations and, potentially, copying.

The actor model does not force copying of data. Passing messages between address spaces is what forces copying. Actor model messages are always pure immutable data values, and thus can be safely shared within an address space. An efficient actor implementation will fully leverage the ability to share large immutable values (data structures) among multiple actors. When copying must occur (e.g.: between machines) then it happens safely and transparently, since neither the original nor the copy is allowed to change.
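A trivial Python illustration of this point (names invented): within one address space, an immutable value can appear in any number of messages by reference, so no copying occurs.

```python
# A large immutable data structure...
big = tuple(range(1_000_000))

# ...referenced by two different messages to two different actors.
msg_to_a = ("update", big)
msg_to_b = ("update", big)

# Both messages share the same underlying value; nothing was copied.
shared = msg_to_a[1] is msg_to_b[1]
print(shared)                                 # True
```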

Reads and writes get serialized and block each other, etc.

Actors implement a “shared nothing” data model. If you create an actor that has stateful behavior (such as a “storage cell”) then — and only then — you must define a protocol for access. Since messages are asynchronous, a sender never really blocks, not even to wait for the message to be received. If a response is generated, it is sent as a separate asynchronous message to whatever customer is specified in the request (which may not be the requestor). If there is a problem with “blocking” then either the protocol is poorly designed or the problem inherently requires synchronization. If synchronization is really needed, there are several good protocol patterns available. You’re not limited to the intrinsic synchronization assumed by sequential processing and call-return procedural protocols.
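As a concrete sketch of such an access protocol, here is a toy “storage cell” in Python (illustrative only; the `Actor` helper and the message tags are invented for this example). Reads and writes are serialized by the cell’s mailbox, but no sender ever blocks:

```python
import queue
import threading

class Actor:
    """Toy actor: a FIFO mailbox drained by one daemon thread."""
    def __init__(self, behavior):
        self._mailbox = queue.Queue()
        self.behavior = behavior
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, msg):                      # asynchronous, never blocks
        self._mailbox.put(msg)

    def become(self, behavior):               # new behavior for the NEXT message
        self.behavior = behavior

    def _run(self):
        while True:
            self.behavior(self, self._mailbox.get())

def cell_beh(value):
    """Storage-cell protocol: ('read', customer) and ('write', new_value)."""
    def beh(self, msg):
        tag, arg = msg
        if tag == "read":
            arg.send(value)                   # reply is a separate async message
        elif tag == "write":
            self.become(cell_beh(arg))        # new state = new behavior
    return beh

out = queue.Queue()
sink = Actor(lambda self, msg: out.put(msg))  # customer for the read reply

cell = Actor(cell_beh(0))
cell.send(("write", 7))                       # neither send blocks the sender
cell.send(("read", sink))
observed = out.get(timeout=2)
print(observed)                               # 7
```

The mailbox serializes access to the cell’s state, but that serialization is a property of this protocol, not of the model; senders continue concurrently.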

It reduces your flexibility in modeling – this is a world in which everyone sits in a windowless room and communicates only by mail.

On the contrary! The actor model is flexible enough to model the mechanisms of practically any other model of computation, including functional, logical, procedural and object-oriented. The basic mechanisms of the actor model, asynchronous communication of pure values among concurrent components and dynamic reconfiguration of state, provide a reliable and well-defined semantic foundation.

Thinking differently about the structure of your programs is required for scalable concurrent programming. Fortunately, we have examples all around us. The real world is concurrent. Change requires interaction. State is only observable through behavior. The actor model gives us the tools to represent this directly in our designs.

Programs are decomposed as piles of blocking switch statements.

This is specific to Erlang, which implements actors as tail-recursive functions that block on “receive”. But that is not the only possible implementation. Hewitt/Agha-style actors have no explicit “receive”. Instead, they are activated by the reception of a message. The behavior they execute on activation is finite, and they cannot block. In fact, there are really no “threads” at all, only reactive components that maintain their (passive) state between invocations (messages). All pending work in the system is represented by messages-in-transit.
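One way to picture this implementation style is a single dispatch loop over messages-in-transit (a minimal sketch with invented names; real implementations are far more sophisticated). There are no threads and no blocking receive; each activation runs to completion:

```python
from collections import deque

events = deque()                              # all pending work: messages-in-transit

def send(actor, msg):
    events.append((actor, msg))               # just enqueue; never blocks

class Actor:
    def __init__(self, behavior):
        self.behavior = behavior              # could be replaced ("become")

def ticker_beh(actor, msg):
    """Reacts to each message; creates more work by sending more messages."""
    n, log = msg
    log.append(n)
    if n > 0:
        send(actor, (n - 1, log))

log = []
ticker = Actor(ticker_beh)
send(ticker, (3, log))
while events:                                 # dispatch loop: run until quiescent
    actor, msg = events.popleft()
    actor.behavior(actor, msg)                # finite activation, cannot block
print(log)                                    # [3, 2, 1, 0]
```

When the event queue is empty, the system is quiescent; there is no blocked thread anywhere waiting on a “receive”.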

You can only handle messages you anticipated receiving.

And objects (in a traditional object-oriented language) can only handle messages they anticipated receiving. But both objects and actors can be designed to delegate “unanticipated” messages to another handler. Are all functions in Clojure “total”, or are they undefined for some “unanticipated” input values? In Humus, actors can choose to ignore, modify, redirect, or throw an exception when they receive a message they don’t want to handle directly.
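A sketch of that delegation idea in Python (all names invented for this example): an actor forwards any message outside its protocol to a fallback handler instead of failing.

```python
from collections import deque

events = deque()                              # messages-in-transit

def send(actor, msg):
    events.append((actor, msg))

class Actor:
    def __init__(self, behavior):
        self.behavior = behavior

unhandled = []
dead_letters = Actor(lambda actor, msg: unhandled.append(msg))

def counter_beh(delegate):
    """Handles 'inc' and ('report', out); delegates everything else."""
    count = [0]
    def beh(actor, msg):
        if msg == "inc":
            count[0] += 1
        elif isinstance(msg, tuple) and msg[0] == "report":
            msg[1].append(count[0])
        else:
            send(delegate, msg)               # forward the unanticipated message
    return beh

totals = []
counter = Actor(counter_beh(dead_letters))
send(counter, "inc")
send(counter, "mystery")                      # not part of the counter's protocol
send(counter, ("report", totals))
while events:                                 # simple dispatch loop
    actor, msg = events.popleft()
    actor.behavior(actor, msg)
print(totals, unhandled)                      # [1] ['mystery']
```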

Coordinating activities involving multiple actors is very difficult.

Programming with actors does require a different mental model, just like programming with functions, logic, procedures, or objects. That’s what makes it a model of computation, not just a new set of tools and patterns we can capture in a library. You should expect that a shift to actor-based thinking will be as much of a challenge as shifting to any new computational model.

You can’t observe anything without its cooperation/coordination – making ad-hoc reporting or analysis impossible, instead forcing every actor to participate in each protocol.

Two powerful mechanisms are available to address this issue. First, actors can be easily hidden behind proxies, adapters, or even a façade. Since you can only interact with an actor through its message protocol, you can interpose all kinds of reporting and analysis actors without the knowledge or consent of either the customers or the target actor. All kinds of aspects, monitoring, instrumentation, verification, and adaptation can be implemented this way.
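The first mechanism might look like this in Python (a toy sketch with invented names): a tracing proxy is interposed between customers and the target actor, and neither side’s code changes.

```python
from collections import deque

events = deque()                              # messages-in-transit

def send(actor, msg):
    events.append((actor, msg))

class Actor:
    def __init__(self, behavior):
        self.behavior = behavior

def adder_beh(actor, msg):                    # the target actor's protocol
    customer, a, b = msg
    send(customer, a + b)

def tracing_proxy_beh(target, trace):
    """Records every message, then forwards it unchanged to the target."""
    def beh(actor, msg):
        trace.append(msg)
        send(target, msg)
    return beh

results = []
sink = Actor(lambda actor, msg: results.append(msg))

trace = []
adder = Actor(adder_beh)
proxy = Actor(tracing_proxy_beh(adder, trace))

send(proxy, (sink, 2, 3))                     # customers hold a reference to the
while events:                                 # proxy, believing it is the adder
    actor, msg = events.popleft()
    actor.behavior(actor, msg)
print(results)                                # [5]
```

Since an actor reference is the only way to reach an actor, handing out the proxy’s reference instead of the adder’s is all the “interposition” required.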

Second, actors can be hosted in a heavily-instrumented meta-configuration which records the full history of all messages and the provenance of all actors in the configuration. The resulting event-trees can be combined with references to the actors’ behaviors for a full picture of any given execution. You can’t get more observable than that.

It is often the case that taking something that works well locally and transparently distributing it doesn’t work out – the conversation granularity is too chatty or the message payloads are too large or the failure modes change the optimal work partitioning, i.e. transparent distribution isn’t transparent and the code has to change anyway.

Properly modularized actor configurations can be distributed, and often replicated, without changing their fundamental operation. This does not make distribution “transparent”, partly for the reasons quoted. However, distributed programming is not the only application for actors. Safe concurrent applications, even on multiple processor cores sharing memory, can be created with actors. And extremely efficient actor implementations do exist.

Conclusion

I have nothing against Clojure. In fact, I think there are a lot of interesting ideas there. Focusing mostly on pure functions and providing explicit mechanisms for handling mutable state is a good idea. In a future article, I intend to explore the implementation of Software Transactional Memory, another interesting idea. I also respect the choice not to support actors. However, I do object to some of the reasons given for making that design decision. This rebuttal is intended to provide a counterpoint to Rich Hickey’s rationale and hopefully dispel some of the misconceptions relating to actor implementations.

References

[1] C. Hewitt. Viewing Control Structures as Patterns of Passing Messages. Journal of Artificial Intelligence, 8(3):323-364, 1977.
[2] G. Agha. Actors: A Model of Concurrent Computation in Distributed Systems. MIT Press, Cambridge, Mass., 1986.