Jul 17, 2013

With the release of the core.async library the Clojure community has been exposed to the joys1 of Communicating Sequential Processes and Go-style asynchronousity 2. While this is all very exciting and new for the Clojure community at-large, a likewise interesting aspect of core.async’s release has been exposed — deep code-walking macros — a subject I will give an overview of herein.3

In chapter 17 of The Joy of Clojure we show an example of a macro called defformula that allows you to define spreadsheet-like cells in Clojure using watchers. For that example we intentionally required that the formulas defined be preceded by a binding vector, like so:

(defformula avg [at-bats ab hits h] (float (/ @hits @at-bats)))

This allowed us to tie formula-internal variable names with existing reference types without requiring our implementation to garner the ties programmatically. The point of that section had to do with the Observer Pattern, so we didn’t want to clutter the discussion with other concerns. However, we can use the same example to illustrate deep code-walking macros (DCWM) and doing so will lead to a different implementation. For this post I’ll talk about only two types4 of DCWMs, namely: additive transformers and in-place transformers.

Deep walking for additive transforms

The defformula macro in The Joy of Clojure was indeed an additive transformer. That is, the macro took an expression and augmented it with additional stuff, specifically calls to add-watch . However, having the binding vector allowed us to assume reference type dereferencing forms (things like @foo ). However, if I want to infer dereferencing forms within an expression then I will need to dive into it and find them. To do that I’d first like to have a function deref-form? that takes something and checks if it’s a dereferencing form, implemented below:

(ns formulas (:require [clojure.walk :as walk])) (defn deref-form? [form] (boolean (when (seq? form) ;; <1> (let [[op & _] form] (and op (symbol? op) (or ;; <2> (= op 'deref) (= op 'clojure.core/deref)))))))

Only work on seqs Check for deref calls

You’ll notice that the deref-form assumes that the form given is a read-expanded dereferencing form rather than a reader-macro for (i.e. (deref foo) rather than @foo ). Given that assumption, it’ll work as follows:

(deref-form? 42) ;;=> false (deref-form? [deref :foo]) ;;=> false (deref-form? (quote (deref foo))) ;;=> true (deref-form? `(deref foo)) ;;=> true

Seems right. Now that I can identify a dereferencing form I’d like to dive into an expression and gather up all of the names that serve as the target in said dereferencing forms. The function to do this very task named find-derefs is implemented below:

(defn find-derefs [expr] (if (coll? expr) (->> expr walk/macroexpand-all ;; <1> (tree-seq coll? seq) ;; <2> (filter deref-form?) ;; <3> (map second) ;; <4> set) ;; <5> #{}))

Expand all sub-forms Build a seq of all composite forms Take only the deref forms Take only their operands Make a set out of them

You’ll notice that with the use of walk/macroexpand-all I can ensure that any nested @foo s are turned into the equivalent (clojure.core/deref foo) calls. This lets me build up a set of the embedded dereferenced targets in expressions:

(find-derefs :a) ;;=> #{} (find-derefs 42) ;;=> #{} (find-derefs []) ;;=> #{} (find-derefs '(inc @c1)) ;;=> #{c1} (find-derefs '[{:foo @bar, :baz [@quux]}]) ;;=> #{quux bar}

The find-derefs function basically builds the information that the binding form in JoC provided explicitly. Now that I can get at the names embedded in an expression, I can now implement a macro to build spreadsheet-like formulas whose values are dependent on the embedded references within:

(defmacro formula [expr] `(let [formula# (agent ~expr) ;; <1> update-fn# (fn [key# ref# o# n#] ;; <2> (send formula# (fn [_#] ~expr)))] (doseq [r# ~(find-derefs expr)] ;; <3> (add-watch r# ;; <4> :update-formula update-fn#)) formula#))

Set the Agent’s initial value based on result of the formula Build a function to update the Agent with the new formula calculation For every internal dereference in expr … … build a call setting a watcher that updates the formula value on change

The body of a formula is just an expression that depends on zero or more embedded dereferences. Not only does formula deeply walk the expression to find the dereferences, but it also augments a normal call to agent (holding the result of the formula) with calls to add-watch on the embedded (i.e. its dependents) reference types that automatically update the formula ‘s value whenever any of its dependents change. That is, rather than just define an Agent that holds the result of some expression like (agent (float (/ @hits @at-bats))) the formula macro builds something like the following:

(let* [formula (agent (float (/ (deref hits) (deref at-bats)))) update-fn (fn [key ref o n] (send formula (fn [_] (float (/ (deref hits) (deref at-bats))))))] (doseq [r #{hits at-bats}] (add-watch r :update-formula update-fn)) formula)

That’s quite a transformation! Because of this transformation formula allows you to do the following:

(def hits (ref 25)) (def at-bats (ref 100)) (def avg (formula (float (/ @hits @at-bats)))) @avg ;;=> 0.25

As shown, the avg formula holds the result of the division of the numbers stored in the Refs hits and at-bats . What is so cool about DCWMs is that they allow me to define rich semantics for the simple formula expression and augment it with spreadsheet-like capabilities:

(dosync (alter at-bats inc) (alter hits inc)) @avg ;;=> 0.25742576

As shown, changing the values stored in the Refs automatically updates the value stored in the formula avg . This is cool stuff, but that’s not all as I’ll show below.

Deep walking for in-place transforms

Not only can I change the semantics of an expression using a DCWM, but I can also completely transform the expression as it’s traversed. Assume that for whatever reason I wanted to not deal directly with the embedded references inside of a formula. Instead, I want to refer to them through the Vars that hold them. That is, instead of doing this:

(float (/ (clojure.core/deref hits) (clojure.core/deref at-bats)))

I want to do this:

(float (/ @@#'hits @@#'at-bats))

What that odd looking carrier-lost-like menagerie means is:

@ @ #'hits deref deref get the Var holding the reference

Or:

Get the Var

Dereference it

Then dereference whatever it held

As an intermediate step toward transforming dereferencing calls into this indirect Var-lookup, I’ll use the clojure.walk/postwalk-replace function to insert some magic, like so:

(defn via-magic [expr] (walk/postwalk-replace ;; <1> '{clojure.core/deref magic ;; <2> deref magic} expr))

You can see what via-magic does below:

(via-magic (quote (float (/ @hits @at-bats)))) ;;=> (float (/ (magic hits) (magic at-bats)))

So it just replaces the symbol deref with the symbol magic — nothing drastic. The cool part is that I can implement magic as a macro that itself transforms a symbolic name5 into a series of calls like the one shown before:

(defmacro magic [ref-name] `(let [v# (resolve (quote ~ref-name))] ;; <1> (deref (deref v#)))) ;; <2>

Look up the Var by symbol Dereference the Var first, then dereference what it contained

The transformation above ensures that any time a dereference occurs it happens through the Var rather than directly. You’ll notice that if I use magic directly I’ll get the same answer:

(magic at-bats) ;;=> 101

Because magic will eventually get me the value stored in the Var-held reference type, I can interpose its use within a new version of the formula macro at the right place:

(defmacro formula [expr] `(let [formula# (agent ~expr) ;; <1> update-fn# (fn [key# ref# o# n#] ;; <2> (send formula# (fn [_#] ~(via-magic expr))))] ;; <3> (doseq [r# ~(find-derefs expr)] (add-watch r# ;; <4> :update-formula update-fn#)) formula#))

Assume Vars exist for the initial value, so don’t use magic Build the updating watch function like before, except… … use magic to look up through the Var Again add watches to all of the dereferencing sub-forms

You’ll notice that I only chose to lookup through the Vars in one place, namely, the point of update of the formula itself. There would be problems if I re-define either of the formula ‘s dependencies. That is, the update will break because the new embedded reference will not have the needed watchers. In any case, the new implementation of formula works like the old:

(def hits (ref 25)) (def at-bats (ref 100)) (def avg (formula (float (/ @hits @at-bats)))) @avg ;;=> 0.25 (dosync (alter hits #(+ 10 %)) @avg ;;=> 0.35

The new formula macro, in addition to walking the expression to look for dereferncing forms, changed the call-sites of the embedded deref s at the point of update only. Specifically it generated something like the following code:

(let* [formula (agent (float (/ (deref hits) ;; <1> (deref at-bats)))) update-fn (fn [key ref o n] (send formula (fn [_] (float (/ (let* [v (resolve 'hits)] @@v) ;; <2> (let* [v (resolve 'at-bats)] @@v))))))] (doseq [r #{hits at-bats}] ;; <3> (add-watch r :update-formula update-fn)) formula)

Non- magic expansion Using magic the Vars are looked up at the time of the formula’s update Assume that the Var will never be re-bound

I could make formula even more complicated by ensuring that even on a Var re-define the formula updates and the proper watches are applied to the new reference type, but I leave that as an exercise to the reader.6

I could probably go on7 and on about deep code-walking macros, but I hope that this introduction gives an idea not only why they’re so powerful, but also why people (myself included) in the Clojure community are so excited by this technique.

Happy hacking!

:F