Monkey Patching & Gorilla Engineering: Protocols In Clojure

We’re coming up on the release of another major milestone in Clojure, version 1.2. In general, Clojure is getting better, faster, more robust, and more reliable. But the language is still new enough that a few major features get added every iteration, and 1.2 is no exception to that rule. The most exciting feature we’re about to see hit mainstream in 1.2 is Protocols and Types, part of Clojure’s new single-dispatch calling convention.

Protocols and Types in Clojure are a way to try and solve the age-old problem: how do I associate behavior and data in a flexible way without totally shooting myself in the foot? They offer an interesting way to structure the relationship between behavior and data; they keep a quintessentially verb-centric outlook (similar to Lisp), but focus on single dispatch on classes (like Java or Ruby). Unlike both, inheritance of objects is totally decoupled from them; objects either conform to protocols or they do not, without any notion of hierarchy. And in this, Clojure developers get access to something not unlike Ruby’s infamous “monkey patching” without the danger.

Monkey Patching

Before we dive into what Clojure is doing, let’s review how other people have tried to solve this problem. Java and C# have gone the route of interfaces, which are basically abstract classes that you tie into the inheritance hierarchy. Python and Ruby go for a more dynamic what-you-find-is-what-you-get approach that is typical for dynamic typing, using message sending as the dispatching mechanism. This generally works, but has some serious issues.

In Ruby, this is called “Monkey Patching” and it’s considered fairly dangerous. For example:

In case that went by too fast, we re-opened the system’s string class and shoved a new method on there. All strings, both existing and newly created, would have the method. This is a very powerful technique that lets us extend classes that we know about, and it’s pretty impressive when you do a few clever monkeypatches and suddenly your code looks 2 lines long, but…

The danger of open classes and this sort of chicanery is that you can step on other people’s toes. Since a monkeypatch echoes around the code, conflicting monkeypatches can subtly and thoroughly hose your system. Ruby lore is full of examples of terrible consequences for monkeypatching. Every monkeypatch has to be carefully weighed against the risk that other libraries might want to modify that same behavior. Unpredictability in your base classes is generally not something to be encouraged, and so in many software shops the practice is outright forbidden or undertaken only with extreme caution.

Really, that’s a shame. People want to use the technique because it can take code and make it incredibly clear and succinct. System class writers can’t anticipate everything in advance, and the ability to project our verbs and behaviors onto existing classes would be a real boon.

Protocols

Enter Clojure with its Protocols and Types. Let’s start off with Protocols because they’re simple. A protocol is sort of like an interface in Java or C#, but isn’t associated with any inheritance. A protocol is just a list of methods tied to a name in a namespace. Here’s an example of Clojure code for a protocol:
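As a minimal sketch of what such a protocol could look like: the namespace and protocol name are the ones this post uses, and make-hash is the function name mentioned later on, but the single-argument signature and the docstrings are assumptions.

```clojure
(ns blogpost.bloomfilters)

;; Assumption: one method, make-hash, taking only the object being hashed.
(defprotocol BloomFilterable
  "Things that know how to hash themselves for a Bloom filter."
  (make-hash [this] "Returns a hash value suitable for a Bloom filter."))
```

Note that nothing here says which types implement the protocol. It only declares the name and the method signatures.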

So we’re writing a library that uses Bloom Filters to test for object membership. Every object may need to write its own logic for how to hash itself suitably for a Bloom filter (e.g., collections might want to hash each member individually or focus on their own identity). We start by making a protocol that describes what we’d like to see types do. This protocol is just a bundle of method signatures associated with the name “blogpost.bloomfilters.BloomFilterable”. Put that thought on hold for a minute; we’ll come back to it shortly.

Types

Types are the exact opposite of Protocols: they are just data, with no required implementation. The basic syntax for types is trivial:

The difference between defrecord and deftype is that defrecord supports keyed access (like hashmaps in Clojure) and some basic helper methods, while deftype just makes exactly what you specify. Simple, right?
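A minimal sketch of both forms follows. MyIntHolder is the type name used in this post; the field name value and the BareIntHolder type are made up for illustration.

```clojure
;; A record: a named type with one field (field name `value` is assumed).
(defrecord MyIntHolder [value])

;; A hypothetical bare type with the same field, for contrast.
(deftype BareIntHolder [value])

(def r (MyIntHolder. 42))
(def t (BareIntHolder. 42))

(:value r)               ; records support keyword lookup, like a map
(.value t)               ; a deftype exposes the field via plain field access
(= r (MyIntHolder. 42))  ; records also get value-based equality
```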

Connecting Protocols With Types

We have types, we have protocols, let’s connect our BloomFilterable to our MyIntHolder type:

Since the functions we’re calling in the protocol are namespaced to our blogpost.bloomfilters namespace, there isn’t any risk of anyone else tripping over that name. Every other namespace could have a make-hash function, each doing something totally different, and a careful Clojure programmer could successfully use all of them as intended.
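Here is a sketch of how that connection might be written with extend-type. The implementation body (reusing the built-in hash of the wrapped integer), the field name value, and the second namespace other.lib are all assumptions for illustration.

```clojure
(ns blogpost.bloomfilters)

(defprotocol BloomFilterable
  (make-hash [this]))

(defrecord MyIntHolder [value])

;; Hypothetical implementation: hash the wrapped integer.
(extend-type MyIntHolder
  BloomFilterable
  (make-hash [this] (hash (:value this))))

;; A hypothetical second namespace with its own, unrelated make-hash.
(ns other.lib)

(defn make-hash [x] (str "other:" x))

;; Both functions coexist; the namespace says which one you mean.
(make-hash 42)
(blogpost.bloomfilters/make-hash (blogpost.bloomfilters.MyIntHolder. 42))
```

The two make-hash functions never collide, because each lives in its own namespace and callers pick one explicitly.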

Performance & Safety At A Fraction Of The Cost

This approach is slightly more static than the monkey patching technique we described for Ruby; there are protocols laid out in advance and you write code to them. But in general, most Ruby mixins (and even Objective-C delegates) have “informal protocols”. There is usually a set of functions that the logic expects to be able to call on objects, at least to coerce them to known types. So writing them out in advance is probably a good idea. You’re going to put it in your documentation anyways.

For the small cost of writing out an agreed-upon contract in advance, Clojure gives you quite a bit. In terms of safety this approach is light-years ahead. And performance-wise, the Clojure compiler can make smart decisions about how to make calls, making a protocol call nearly as fast as a direct method invocation on an Object. But most of all, this gives you controlled extensibility. Library writers can write their code to protocols and generics that library users can safely use to slide their own types into place.
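To make the comparison with monkey patching concrete, here is a sketch of a protocol being extended to types we did not write, using extend-protocol. The implementations are assumptions chosen only to show the mechanism; the per-member hashing for vectors follows the idea mentioned earlier that collections might hash each member individually.

```clojure
(ns blogpost.bloomfilters)

(defprotocol BloomFilterable
  (make-hash [this]))

;; Hypothetical implementations for existing types.
(extend-protocol BloomFilterable
  String
  (make-hash [s] (hash s))

  clojure.lang.IPersistentVector
  (make-hash [v] (map make-hash v)))  ; hash each member individually

(make-hash "hello")
(make-hash ["a" "b"])
```

Unlike re-opening a class in Ruby, nothing is added to String itself. The new behavior lives behind blogpost.bloomfilters/make-hash, so code that never calls that function cannot be affected by it.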

It’s What I’m Not Saying

Protocols and Types are cool, but they’re only one of the mechanisms that Clojure provides for developers who want extensible interfaces. Clojure also provides a multiple dispatch facility with defmulti and defmethod, along with arbitrary ad-hoc type hierarchies. While slightly less efficient than the new protocol system, they allow you to get rid of degenerate implicit logic like the Visitor Pattern. And of course all the stuff everyone glows about in Clojure is getting better with every release.
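For a taste of that other mechanism, here is a small sketch of a multimethod that dispatches on both of its arguments and uses an ad-hoc hierarchy built with derive. The collide function and the ::ship, ::fighter, and ::asteroid keywords are invented for this example.

```clojure
;; Ad-hoc hierarchy: a fighter is a kind of ship (hypothetical keywords).
(derive ::fighter ::ship)

;; Dispatch on the :kind of BOTH arguments, not just the first.
(defmulti collide (fn [a b] [(:kind a) (:kind b)]))

(defmethod collide [::ship ::asteroid] [_ _] :ship-destroyed)
(defmethod collide [::asteroid ::ship] [_ _] :ship-destroyed)
(defmethod collide :default [_ _] :nothing-happens)

;; ::fighter matches the ::ship method through the derived hierarchy.
(collide {:kind ::fighter} {:kind ::asteroid})   ; => :ship-destroyed
(collide {:kind ::asteroid} {:kind ::asteroid})  ; => :nothing-happens
```

Because the dispatch function looks at both arguments, there is no need for the double-dispatch dance that the Visitor Pattern exists to fake.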