In part 1 of this series we looked at how to define the core of a window manager for X in Haskell. This provided the core logic of the xmonad tiling window manager. The specification of the window manager as a data structure and its accompanying API enabled us to write high level QuickCheck properties to automatically test the window manager, leading to good assurance that our implementation was correct. The window manager modeled was a very simple full screen window manager, where the focused window was always the only window visible.

In this article we reimplement the window manager core using a more sophisticated data structure — a “zipper” — which will allow us to track focus on each workspace automatically, by maintaining a cursor into an updateable workspace tree. This data structure will form the basis of a proper tiling window manager, where multiple windows are visible on the screen simultaneously. By using a data structure that tracks focus for us, we end up with much simpler code, and a clearer understanding of how focus is manipulated by the window manager.

The problem with a finite maps

The original StackSet data structure, which modeled a multiple-workspace fullscreen window manager, was quite simple:

data StackSet a = StackSet { current :: Int , stacks :: Map Int [ a ] } deriving Eq

This is the obvious way to associate stacks of windows with workspaces: use a lookup table to map workspace names (here just Int keys) to lists of windows. We chose to use a finite map, a purely functional lookup table. In order to track the current workspace, we store separately the key of the current workspace.

The simplicity of this data structure, however, limits its usefulness. The problem with the StackSet as defined is that it only allows us to track one focused window on each workspace: the window at the head of each window list. This is fine for very simple fullscreen window managers, as only one window is ever visible anyway. However, a true tiling window manager, such as xmonad, allows us tile many windows on the screen simultaneously, with focus available to any window on the screen. The windows are tiled to the screen like so:

In this picture, the current workspace contains 4 windows. The head of the window list, window #1, is tiled first, and is given the largest area. However, keyboard focus is actually on window #3. Our StackSet data structure implemented as a Map is not able to handle arbitrary focus as it stands. So the problem then is how best to extend the StackSet type to handle tiling of workspaces, with arbitrary focus.

The naive approach: more tables

One solution is to augment the workspace type with an additional table:

data StackSet a = StackSet { current :: Int , stacks :: Map Int [ a ] , focus :: Map Int a }

Now we maintain a separate mapping of workspaces, to the focused window on each workspace. This is simple enough, and works in practice (indeed, xmonad 0.1 does just this). However, there is a certain awkwardness to it: by adding a new table we have obligated ourselves to a second layer of bookkeeping. When focus changes we have two tables to update and to keep consistent with each other. The inelegance of this approach becomes apparent when we implement functions that affect focus and will require manipulating both tables.

One such operation is ‘delete’. For a basic fullscreen window manager, where focus is always the head of the list, delete has an obvious implementation. Given a window, a workspace and a StackSet, we can remove an arbitrary window from the window manager by removing it from the window list:

delete :: w -> i -> StackSet i w -> StackSet i w delete win wrk ss = ss { stacks = M . adjust ( delete win ) wrk ( stacks ss ) }

So ‘delete’ on the focused window simply calls the ‘delete’ function of the list library, and will move focus to the next element in the list. However, allowing focus on any element in the list complicates the logic — we have two data structures to keep up to date:

delete :: w -> i -> StackSet i w -> StackSet i w delete win wrk ss = ss { stacks = M . adjust ( delete win ) wrk ( stacks ss ) } , focus = M . update ( \ k -> if k == win then elemAfter k ( stacks ss ! wrk ) else Just k ) wrk ( focus ss ) }

So if we’re deleting the currently focused element, then focus moves to the element after the current element, otherwise focus is unchanged. The definition of ‘insert’ also changes in a similar way, as inserting a window should now insert to the right of the currently focused window.

In addition, the tracking of focus in a separate table means we need to make a policy decision about where focus goes in the presence of insert and delete. Under the simple workspace-as-list implementation, there was only one possible implementation, now we can set focus anywhere, so we are forced to state precisely what the behaviour should be. The obvious behaviour is that: delete . insert should be the identity, and leave focus unchanged. This would allows the user to delete transient popup windows, without losing focus on the window they’re working on. Ensuring this is the case complicates the code still further.

All this extra work is very unsatisfying! More book keeping means more code, and more chance for bugs. What we really need is a data structure that tracks focus by design. So let’s follow the functional programmer’s motto: write less, think more!

A Zipper for Trees

The zipper (pdf) is a lovely simple, pure data structure first described by Huet. A zipper is basically a cursor deep into a structure, allowing O(1) updates at the cursor site. By recording a path from the root to the cursor it is possible to navigate locally around the cursor site, while keeping track of the entire structure. A zipper can also be considered as a delimited continuation reified as data. In summary, it provides a natural interface for operations on a distinguished position in a data structure: exactly what we need for tracking focus in our window manager.

Let us consider a simple zipper for the following tree data type (from Huet’s paper):

data Tree a = Item a | Node [ Tree a ]

A zipper for this structure gives us a currently focused node, along with a path back up to a parent node, and to the left and right siblings. We can implement this as:

data Path a = Top | Point { left :: [ Tree a ] , right :: [ Tree a ] , up :: Path a } data Cursor a = Cursor { it :: Tree a , context :: Path a }

So, consider the tree representing the parse of the expression: (a * b) + (c * d):

expr :: Tree String expr = Node [ Node [ Item "a" , Item "*" , Item "b" ] , Item "+" , Node [ Item "c" , Item "*" , Item "d" ] ]

We can visualise this tree as:

Now, we can focus on a particular element of this tree, for example, the second “*” node, using a Cursor type. The following zipper gives us the binary tree with focus set on the “*” node.

subexpr :: Cursor String subexpr = Cursor { it = Item "*" , context = Point { left = [ Item "c" ] , right = [ Item "d" ] , up = Point { left = [ Item "+" , Node [ Item "a" , Item "*" , Item "b" ] ] , right = [ ] , up = Top } } }

Which we can visualise as a sort of unfolded tree, where the focused item is the top of the tree (the green branch represents the path back to the parent node):

It is as if we had picked up the tree by the “*” node, and let the other branches hang down.

The zipper gives us a natural set of O(1) operations on the focused node. To move focus to the left of the current node:

focusLeft :: Cursor a -> Cursor a focusLeft c @ ( Cursor t p ) = case p of Point ( l : ls ) rs up -> Cursor l ( Point ls ( t : rs ) up ) _ -> c

Applying ‘focusLeft’ to our subexpression relinks the local nodes, shifting focus to the item ‘c’:

> focusLeft subexpr Cursor { it = Item "c" , context = Point { left = [ ] , right = [ Item "*" , Item "d" ] , up = Point { left = [ Item "+" , Node [ Item "a" , Item "*" , Item "b" ] ] , right = [ ] , up = Top } } }

Which yields the tree:

Notice that the ‘up’ parts of the tree haven’t changed — the changes were local to the current subexpression, causing focus to cycle left. We can also move focus back up the tree to the parent, by following the ‘up’ path:

focusUp :: Cursor a -> Cursor a focusUp c @ ( Cursor t p ) = case p of Top -> c Point ls rs up -> Cursor ( Node ( reverse ls ++ t : rs ) ) up

Moving up the tree collapses the cursor path.

> focusUp subexpr Cursor { it = Node [ Item "c" , Item "*" , Item "d" ] , context = Point { left = [ Item "+" , Node [ Item "a" , Item "*" , Item "b" ] ] , right = [ ] , up = Top } }

Other important operations on the cursor are ‘insert’ and ‘delete’. Now that we have a precise notion of focus, we can specify the behaviour of focus when inserting a new node quite precisely. Either we insert to the left of the current node, and move focus to this new node:

insertLeft :: Tree a -> Cursor a -> Cursor a insertLeft a c @ ( Cursor t p ) = case p of Top -> undefined Point ls rs up -> Cursor a ( Point ls ( t : rs ) up )

Or we insert to the right:

insertRight :: Tree a -> Cursor a -> Cursor a insertRight a c @ ( Cursor t p ) = case p of Top -> undefined Point ls rs up -> Cursor a ( Point ( t : ls ) rs up )

Lovely! Our data structure gives us an exact account of how focus changes during an insert operation, with no extra book keeping. From here we should be able to map focus in the window manager, onto operations on the cursor position in the zipper, directly implementing the window manager policies we decide on.

‘delete’ is slightly more complex: we need to consider when deleting the current node whether to move focus to the left or right, or if the node is empty, to move focus up the tree:

delete :: Cursor a -> Cursor a delete ( Cursor _ p ) = case p of Top -> undefined Point ls ( r : rs ) up -> Cursor r ( Point ls rs up ) Point ( l : ls ) [ ] up -> Cursor l ( Point ls [ ] up ) Point [ ] [ ] up -> Cursor ( Node [ ] ) up

We see again how the data structure forces us to think about the behaviour of focus during critical operations, and also gives us simple, clear operations. Let’s see now how to use this cursor data type to our window manager.

A Zipper for a Window Manager

Clearly there is a possibility of adapting the zipper-based focus mechanism to our window manager problem. However, rather than there being just one point of focus in a tree, we actually have ‘n’ focused windows to track: one on each virtual workspace. Additionally, we need to track which workspace is currently on screen — yet another focus point. One simplification though, is that rather than having a tree of arbitrary depth, a window manager is a fixed height two level tree, with a fixed ‘n’ branches from the top level.

All this suggests that we actually have a two-level zipper! Firstly, the top level of the window manager tree, the StackSet type, is just a flat zipper, or cursor, onto the focused workspace:

data StackSet a = StackSet { current :: Workspace a , prev :: [ Workspace a ] , next :: [ Workspace a ] }

A workspace itself is just a numeric tag, and a stack of windows:

data Workspace a = Workspace { tag :: Int , stack :: Stack a }

Finally, a window stack on a workspace is a flat zipper giving us a cursor onto the currently focused window, or the empty stack:

data Stack a = Empty | Node { focus :: a , left :: [ a ] , right :: [ a ] }

Mmm… data structures are fun in Haskell. And we have a polymorphic window manager to boot: the StackSet is polymorphic in the window type. Note that our Stack and StackSet data types are really degenerate 1 level zippers, as they only provide operations to traverse left or right. We have no need to move focus up or down a two level tree.

We can visualise the window manager implemented as a zipper-based StackSet this way:

In this picture we have a window manager managing 6 virtual workspaces. Workspace 4 is currently on screen. Workspaces 1, 2, 4 and 6 are non-empty, and have some windows. The window currently receiving keyboard focus is window 2 on the workspace 4. The focused windows on the other non-empty workspaces are being tracked though, so when we view another workspace, we will know which window to set focus on. Workspaces 3 and 5 are empty.

With this structure, rather than a set of tables, we have a natural account of the behaviour of window focus during window manager operations. From the picture alone, we can see how we can ensure operations affect, for example, the current workspace only — such operations will affect only the current node of the workspace zipper. It is also obvious how shifting window focus in the window manager corresponds to shifting the cursor point on the current window zipper.

By using a data structure that tracks focus in this way, we hope to avoid annoying book keeping code, leading to a simpler implementation of our window manager, and, importantly, making it easier to ensure that the implementation of our window manager matches the specified behaviour. Keep it simple!

A zipper API for the window manager

Let’s now consider a natural API for this data structure:

new :: Int -> StackSet a peek :: StackSet a -> Maybe a index :: StackSet a -> [ a ] focusLeft , focusRight :: StackSet a -> StackSet a insert :: a -> StackSet a -> StackSet a delete :: -> StackSet a -> StackSet a

There are some additional operations we’d like to support, which operate on the workspace level:

viewLeft , viewRight :: StackSet a -> StackSet a shift :: Int -> StackSet a -> StackSet a

Now to implement this interface for our zipper-based StackSet.

A zipper implementation of a window manager

First, we can construct a new, empty window manager, with ‘n’ workspaces, using a list comprehension to unfold the Empty workspace list:

new :: Int -> StackSet a new n | n > 0 = StackSet t [ ] rs where ( t : rs ) = [ Workspace i Empty | i <- [ 0 .. n 1 ] ]

The data structure requires us to state which workspace has initial focus, so we pick the head of the list: Workspace 0. Good — the more policy that is obviously determined by the data structure, the less thinking we have to do!

Extracting the currently focused window is requires us to find the current workspace, then find the current window. We can use Haskell’s record access syntax for this:

peek s = case stack ( current s ) of Empty -> Nothing Node t _ _ -> Just t

However, since almost all operations are on the current workspace, we can abstract out this access pattern into a higher order function — ‘with’ — which will take a default value to return in case of ‘Empty, and a function to apply to a non-empty stack. It is like the function ‘maybe’ for Stacks, and can be written as:

with :: b -> ( Stack a -> b ) -> StackSet i a s -> b with d f s = case stack ( current s ) of Empty -> d v -> f v

Yielding the much nicer:

peek :: StackSet a -> Maybe a peek = with Nothing ( return . focus )

The lesson here is to refactor and abstract repetitive logic, so you get it right once, and then gain from simpler, more domain-specific code. Higher-order functions forever!

We can implement ‘index’ with similar ease:

index :: StackSet i a s -> [ a ] index = with [ ] $ \ ( Node t ls rs ) -> reverse l ++ t : rs

Modifying the window manager

As we see, read-only access is easy. Modifying the window manager is more interesting. For this we’ll want to provide a ‘modify’ higher-order function, similar to ‘with’, to abstract out the process of replacing the current workspace with a modified version:

modify :: Stack a -> ( Stack a -> Stack a ) -> StackSet i a s -> StackSet i a s modify d f s = s { current = ( current s ) { stack = with d f s } }

This just applies ‘with’ to the current workspace, and uses record update syntax to replace the current workspace, returning a modified StackSet. We can now implement the focus-changing functions. Firstly, focusLeft and focusRight. Both shift focus to the window immediately to the left or right of the cursor, or if we’re at the end of the stack, they wrap focus:

focusLeft , focusRight :: StackSet i a s -> StackSet i a s focusLeft = modify Empty $ \ c -> case c of Node _ [ ] [ ] -> c Node t ( l : ls ) rs -> Node l ls ( t : rs ) Node t [ ] rs -> Node x xs [ t ] where ( x : xs ) = reverse rs focusRight = modify Empty $ \ c -> case c of Node _ [ ] [ ] -> c Node t ls ( r : rs ) -> Node r ( t : ls ) rs Node t ls [ ] -> Node x [ t ] xs where ( x : xs ) = reverse ls

The implementation is straight forward, except for the wrapping case. Indeed, all operations are O(1), except for when we wrap, which is O(w), (‘w’ is the number of windows on the screen). We see here how the cursor-style data structure leads to a simple implementation of focus shifting in a window manager, and helps us think more clearly about the corner cases for focus movement. Pick the right data structures, and the algorithms will be obvious!

We can implement insertLeft and insertRight, which insert a window to the left or right of focus, and move focus there. The cases to consider for insert are:

Inserting into an empty workspace, yields a workspace with a single focused window.

Inserting into a non-empty workspace moves the current window to the left or right, and sets focus to the inserted window.

Yielding the obvious implementation:

insertLeft , insertRight :: a -> StackSet i a s -> StackSet i a s insertLeft a = modify ( Node a [ ] [ ] ) $ \ ( Node t l r ) -> Node a l ( t : r ) insertRight a = modify ( Node a [ ] [ ] ) $ \ ( Node t l r ) -> Node a ( t : l ) r

Again, we get a policy for free stating where focus moves on insert, avoiding tedious bookkeeping, and gaining strong assurances from the simpler code. The data structure hints at the natural policy our window manager should adopt.

We now consider how to delete the current window, and we will try to specify precisely where window manager focus goes on delete. Remember: tiling window managers need to have predictable behaviour, to be usable, so the more behaviour we can precisely specify, the more predictable our window manager will be.

There are four cases to consider:

delete on an Empty workspace leaves it Empty;

otherwise, try to move focus to the right;

otherwise, try to move focus to the left;

otherwise, a workspace with 1 element becomes Empty.

We implement this as:

delete :: StackSet a -> StackSet a delete = modify Empty $ \ c -> case c of Node t ls ( r : rs ) -> Node r ls rs Node t ( l : ls ) [ ] -> Node l ls [ ] Node _ [ ] [ ] -> Empty

And that’s it. The code just follows from the spec, and the core window manager’s implementation, with correct focus behaviour, falls out of the data structure.

Operations on workspaces

Navigating through virtual workspaces uses the level 1 zipper:

data StackSet a = StackSet { current :: Workspace a , prev :: [ Workspace a ] , next :: [ Workspace a ] }

The implementation is obvious, and follows the ‘focus’ functions on stacks:

viewLeft :: StackSet a -> StackSet a viewLeft ( StackSet t ( l : ls ) rs ) = StackSet l ls ( t : rs ) viewLeft t = t viewRight :: StackSet a -> StackSet a viewRight ( StackSet t ls ( r : rs ) ) = StackSet r ( t : ls ) rs viewRight t = t

The next, and hardest function to implement, is ‘shift’. shift moves the currently focused window to workspace ‘n’, leaving the current workspace focused on a new window, and with a new window focused on workspace ‘n’. We can implement it as a little `script’ using the primitives defined already, by folding the sequence of operations over our StackSet. One interesting point to note is that allowing indexing of workspaces requires us to perform sanity checks on the index value. If ‘shift’ only moved to the left or right workspaces, no bounds checking would be needed. Local operations are more natural for zippers.

shift :: Int -> StackSet a -> StackSet a shift new s | new >= 0 && new < max_workspaces && new /= old = maybe s go ( peek s ) | otherwise = s where old = tag ( current s ) go w = foldl ( $ ) s [ delete , view new , insertLeft w , view old ]

The ‘go’ script performs delete, view, insert, and view calls, and the ‘view’ function is a generalisation of viewLeft/viewRight to traverse to an arbitrary workspace ‘n’. To do this, it has to work out which direction to travel, given the current workspace tag, and the desired workspace:

view :: Int -> StackSet a -> StackSet a view i s @ ( StackSet ( Workspace n _ ) _ _ ) | i >= 0 && i < max_workspaces = foldr traverse s [ 1 .. abs ( i n ) ] | otherwise = s where traverse _ = if signum ( i n ) >= 0 then viewRight else viewLeft

And that’s it. Let’s now look at using QuickCheck to ensure we have the implementation right, and that our API makes sense.

QuickCheck Your Window Manager

The first thing to check is that the data invariants of the StackSet type hold for arbitrary StackSets QuickCheck generates. For the data structure we have defined, we no longer need to check for focus being valid — the data structure won’t let us focus on a window not on the current workspace! However there is one property not ensured directly by the data structure: uniqueness. It is required that no window appear more than once in the window manager. We can check that operations don’t violate this rule with the following property:

type T = StackSet Char invariant ( x :: T ) = nub ws == ws where ws = [ focus t : left t ++ right t | w <- current s : prev s ++ next s , let t = stack w , t /= Empty ]

Now we can check each operation on StackSets preserves this invariant:

prop_new ( n :: Positive Int ) = invariant $ new ( fromIntegral n ) prop_focusLeft ( n :: NonNegative Int ) ( x :: T ) = invariant $ foldr ( const focusLeft ) x [ 1 .. n ] prop_insertLeft n ( x :: T ) = invariant $ insertLeft n x prop_delete ( x :: T ) = invariant $ maybe x ( \ i -> delete i x ) ( peek x )

And so forth. Everything looks good so far:

$ runhaskell tests / Properties . hs new : invariant : OK , 100 tests . focusLeft : invariant : OK , 100 tests . insertLeft : invariant : OK , 100 tests . delete : invariant : OK , 100 tests .

Higher level properties

Rather than go through the basic properties tested in the last article, let’s now look at stronger properties. In particular, we will check that particular operations are ‘local’ to the area of the data structure they work on — that they do not change parts of the StackSet (and thus the screen!) they shouldn’t, and secondly, we will check particular policies regarding the movement of focus in the window manager.

A simple property to check is that moving focus on the current virtual workspace never touches the hidden workspaces. It is purely local to the workspace in focus:

prop_focus_local ( x :: T ) = hidden ( focusRight x ) == hidden x

This says that the hidden workspaces are entirely unchanged by the movement of window focus on the current workspace. ‘hidden’ is straightforward:

hidden ( x :: T ) = [ w | w <- prev x ++ next x ]

Similarly, insert and delete should be local. They should not change anything other than the current workspace:

prop_insert_local ( x :: T ) i = hidden x == hidden ( insertLeft i x ) prop_delete_local ( x :: T ) = hidden x == hidden ( delete x )

And QuickCheck agrees with us. Good! We are starting to build up some confidence that our window manager model behaves the way a window manager should.

Another interesting property to check is that shifting focus is reversible. That is, if you move focus to the left window, you can move it back again by moving focus to the right. The obvious implementation is:

prop_focus_right ( x :: T ) = focusRight ( focusLeft x ) == x

However, if we run this, we find:

focus right / left : Falsifiable after 0 tests : StackSet { current = Workspace { tag = 0 , stack = Node { focus = 'n' , left = [ ] , right = "d" } } , prev = [ ] , next = [ Workspace { tag = 1 , stack = Empty } , Workspace { tag = 2 , stack = Node { focus = 'e' , left = "" , right = "" } } ] }

What happened? If we look at the current workspace, we see that it has no windows to the left of the currently focused element: “left = []”, which means that focusLeft will cause a wrapping, which rests the distribution of windows to the right of the focused element. So while a window distribution with left, focused and right of:

[] 'n' ['d']

and:

['d'] 'n' []

will have the same window ordering modulo wrapping behaviour of focus, they do not compare as equal when tested by QuickCheck. We thus need to first normalise the window distribution. We choose to be right-biased here:

normal = modify Empty $ \ c -> case c of Node t ls rs -> Node t [ ] ( rs ++ reverse ls )

Which yields the property:

prop_focus_right ( x :: T ) = normal ( focusRight ( focusLeft x ) ) == normal x

Which does indeed hold.

Testing where focus goes

Finally, we can check that inserting and deleting a window leaves us with the original StackSet unmodified, including where focus is set:

prop_insert_delete n ( x :: T ) = delete n ( insertLeft n x ) == x prop_delete_insert ( x :: T ) = case peek x of Nothing -> True Just n -> insertLeft n ( delete x ) == x

And QuickCheck again confirms this:

insert is reversible : OK , 100 tests . delete is reversible : OK , 100 tests .

Wonderful! Having a data structure that tracks focus automatically gives us simple, clean semantics for the behaviour of focus when windows pop up and are deleted. In fact, all our properties are now able to assert the behaviour of focus will be unchanged, or will be changed in precisely specified ways, as equality on StackSets now means equality on workspaces, including focus. This is exactly what we require for an elegant and predictable implementation of a tiling window manager.

Conclusion

By moving more information into our pure data structure, we are able to make stronger statements about the behaviour of our code, on that information. By using a purely functional data structure, and QuickCheck, we can encode these statements as QuickCheck properties, and have them tested automatically. Finally, by using a data structure which more closely models the domain we’re looking at, the implementation becomes simpler, cleaner and more obviously correct, and with a strong, statically typed data structure, we get behaviour for free that previously required dynamic checking. The data type described here is the core of a branch of the xmonad window manager.

The zipper version of xmonad is running on my laptop as I write, and plans are to merge it into the main xmonad branch soon. Perhaps the most interesting result is that xmonad is now 50 lines shorter! By moving more logic (explicit focus handling) into the pure StackSet data structure, most of the event handling code is now just a shallow wrapper over the StackSet api. Smarter data structures, less code. It’s all good!

Thanks goes to Wouter Swiestra for the original idea of using a zipper to implement focus-tracking in the xmonad, and to Spencer Janssen, Jason Creighton and David Roundy for several discussions about the subtleties of focus behaviour.

For further information on the lovely zipper data type, see:

Huet made the first presentation of the zipper type in his functional pearl.

Ralf Hinze and Johan Jeuring extended this further in another functional pearl.

There are many interesting results for the zipper type, and an excellent introduction is available on the Haskell wikibook zipper article.

The relationship between zippers and delimited continuations has been described by Oleg Kiselyov.