This is a follow-up to my Getting acquainted with Clojure zippers post. That was more of an introduction, this article is going to more about some of the difficulties we run into when using zippers to modify tree structures. We are going to try to write a function to modify a tree, and along we will run into lots of problems that, once we deal with them, will eventually lead to zipper enlightenment. The code here isn’t meant as recipes or a perfect example, but just a way to work through some issues that come up when working with Clojure zippers.

The neat thing about zippers in Clojure is that they allow tree manipulation to be purely functional, Zipper manipulation functions can be pure : given the same zipper they will always return the same thing. This is true of movement functions like zip/down and zip/next, since position inside the tree is part of what we are feeding into the function: given the same tree, and the same position inside the tree, the result will alway be the same. The tree itself, obviously, doesn’t change just because we’ve moved. The zipper here is an abstraction that separates the tree itself from the notation of our location in the tree, and thus allows us to deal with both explicitly.

This is our standard zipper-walking format. Most of the work is just figuring out which locations need to be removed and passing them off to zip/remove.

insert-left: does just what it says. Let’s delete any numbers greater than 10:

zip/append-child is a good choice here because we are appending to the end of the vector, and that is what it always does. What about prepending? This time we’ll put a :aa at the beginning.

Let’s try inserting a node now. We’ll try to put a :c after the :b in the first subvector.

In an XML based tree we might have more complex editing needs. XML trees are map-based. Each node is a map with a :tag key, an :attrs key and a :content key. If we wanted to change a <ul> to an <ol> in an HTML document, the call to zip/edit would look something like this:

The first node in the tree, the 2, has been inced up from 1. Editing is possible!

zip/edit is interesting because its arguments are a location and a function to modify the current node. This kind of editing could go rather far, since the current node can be the top of subtree. In a simple case, though, it could look something like this:

It turns out that simple edits are simple. There are several functions in clojure.zip :

What about modifying the tree, though? What happens then?

Immutable gotchas : let’s wrap

Now let’s try something more complicated. We’re going to start over with a new vector tree.

( def z2 (zip/vector-zip [ 1 2 :a :b [ 3 4 :c :d 5 ] :e ]))

Our goal this time is to make sure that all keyword nodes are wrapped together in their own subtrees, with adjacent keywords wrapped together. We want to be able to take a tree like z2 and produce this:

[ 1 2 [ :a :b ] [ 3 4 [ :c :d ] 5 ] [ :e ]]

(I encountered this problem in an HTML document, trying to wrap adjacent, indented <p> elements inside <blockquote>s.)

Here we go. Strategy number one. We’ll look for keywords and, if we find one, we’ll wrap it and then start appending any right siblings.

First, let’s just try wrapping each keyword individually:

Strategy 1 - a: wrap and go…​ down ( defn strategy-one-a [loc] ( if (zip/end? loc) loc ( if ( keyword? (zip/node loc)) ( recur (zip/next (zip/edit loc #( vector %)))) ( recur (zip/next loc))))) ;; wait, don't do this: infinite loop

What? Why is this infinite? Well, when we edit the node, we create a new node, [:a], and then we say zip/next. Except that the next node in this case is down…​ to that :a again! Which we wrap again, and then go find again and again and again. next is relative to the new zipper, not the old one from before the edit. This is one of those immutable/pure-function mind warp things: you want to be able to grab the new state and hold on to the old one too. But here, the old one, where zip/next was going to logically lead to the right sibling, is no longer available to us. Or, to be more precise, it is still available, but it is now a parallel universe where the edit we want to do never really happened. We can’t have the new edit and the old position at the same time, because the old position only exists in a world where the edit never happened.

Back to reality: let’s introduce some logic to make sure we don’t try to wrap keywords that are already inside vectors containing only keywords. In other words, if we run into [:x :y :z] we leave it alone.

First, we write a little function that looks up at a keyword’s parent to see if all the siblings are already keywords:

( defn parent-ok? [loc] ( when (zip/up loc) ( every? keyword? (zip/children (zip/up loc)))))

(Note here that zip/children returns nodes, not locs, so we can feed the results directly to keyword? without having to call zip/node.)

And then we plug it into our previous tree walking function:

Strategy 1 b ( defn strategy-one-b [loc] ( if (zip/end? loc) loc ( if ( and ( keyword? (zip/node loc)) ( not (parent-ok? loc))) ( recur (zip/next (zip/edit loc #( vector %)))) ( recur (zip/next loc))))) ; [[1 2 [:a] [:b] [3 4 [:c] [:d] 5] [:e]] :end] ;; here we see the ended-out zipper

Good: no infinite recursion, and all of our keywords are wrapped, but individually, not grouped together like we want.

Let’s try grabbing them from the left-most keyword. As soon as we hit a keyword, we’ll try to gobble up any keywords to the right before moving on.

Strategy 1 c (starting to get ugly) ( defn strategy-one-c [loc] ( if (zip/end? loc) loc ( if ( and ( keyword? (zip/node loc)) ( not (parent-ok? loc))) ;; up to here, same as before ( let [right-keys ( take-while #( keyword? %) (zip/rights loc))] ( recur (zip/next ( reduce ( fn [z n] (zip/append-child z n)) (zip/edit loc #( vector %)) right-keys)))) ( recur (zip/next loc)))))

The basic structure is the same as before, except that now we try to incorporate keyword nodes to the right of the locations that need to be dealt with (ie. keyword nodes that are not in a parent node containing only keyword children, as established by the parent-ok? function we wrote earlier). Conveniently, the zip library provides zip/rights, which gives us a sequence of nodes (not locs, mind you) that we grab with take-while.

So far nothing extraordinary.

If we do have interesting siblings, we need to bring them into the node we are going to create. That is what we do with reduce here, and it is probably worth breaking down the logic there.

( reduce ( fn [z n] (zip/append-child z n)) (zip/edit loc #( vector %)) right-keys)

This is the three argument form of reduce, where the second argument provides an initial value. Here our starting value is going to be the original keyword node edited to be wrapped inside a new vector. Then, for each of the keyword siblings to the right, we are going to keep calling zip/append.

It would be the equivalent of this, if we could know ahead of time that right-keys contained :a, :b and :c:

(zip/append (zip/append (zip/append (zip/edit loc #( vector %)) :a ) :b ) :c )

So, we gather up some more courage and we run the thing:

(strategy-one-c z2) ; [[1 2 [:a :b] [:b] [3 4 [:c :d] [:d] 5] [:e]] :end]

Well, this kind of works, in that we are getting our keywords grouped together. Obviously, we haven’t done anything to remove the nodes that we’ve copied into our aggregating nodes. And look what happens if we have more successive keyword nodes in our input:

(strategy-one-c (zip/vector-zip [ 1 2 :a :b :c :d :e 3 ])) ; [[1 2 [:a :b :c :d :e] [:b :c :d :e] [:c :d :e] [:d :e] [:e] 3] :end]

Oh dear. We can’t just leave those siblings out there, because they end up getting siblingized themselves. Before we send our big reduced, zip/appended, zip/edited zipper back to zip/next, we need to prune the original keyword siblings from the zipper. Okay, one last try.

Works but…​ (Strategy 1 d) ( defn strategy-one-d [loc] ( if (zip/end? loc) loc ( if ( and ( keyword? (zip/node loc)) ( not (parent-ok? loc))) ( let [right-keys ( take-while #( keyword? %) (zip/rights loc)) loc-no-rights ( reduce ( fn [z _] (zip/remove (zip/right z))) loc right-keys) loc-w-incorporated-keys ( reduce ( fn [z n] (zip/append-child z n)) (zip/edit loc-no-rights #( vector %)) right-keys)] ( recur (zip/next loc-w-incorporated-keys))) ( recur (zip/next loc)))))

I’ve shifted things around a little here for clarity, but it really isn’t too different from the previous rendition. Inside the let, loc-w-incorporated-keys is the result of the same reduce call. The difference here is that before building our final zipper, we remove the siblings.

zip/remove, after blowing away the node, automatically moves to whatever the previous location would be, as determined by calling zip/prev. This could be a problem if we didn’t know that we were working on right siblings, because if we removed the leftmost node, we would end up at the parent node suddenly, which could confuse things considerably. That won’t be a problem here though. This behaviour also allows us to use the combination of zip/right and zip/remove for as many times as we like, because we just keep removing the next node over each time.

loc-no-rights is the intermediary version of the zipper. And because right-keys is holding nodes and not zipper locations, we can come back to them and append them before moving on with zip/next.

Deep breath…​

(strategy-one-d z2) ; [[1 2 [:a :b] [3 4 [:c :d] 5] [:e]] :end] (strategy-one-d (zip/vector-zip [ 1 2 :a :b :c :d :e 3 ])) ; [[1 2 [:a :b :c :d :e] 3] :end]