I was look­ing for some more infor­ma­tion on how to con­struct lazy sequences and real­ized that there is lit­tle in the way of doc­u­men­ta­tion on lazy sequences in Clo­jure. These design notes were the best I could find on the sub­ject and while they are illu­mi­nat­ing, I thought the con­cept could be explained bet­ter.

Lazy Sequences are an inte­gral part of the Clo­jure pro­gram­ming lan­guage but one you don’t see in a lot of other pro­gram­ming lan­guages. A sequence, in Clo­jure, is a sort of metatype or pro­to­col. It sig­ni­fies a class of datas­truc­tures which store items in a log­i­cal sequence, such as arrays or lists. Clo­jure has sev­eral types of sequences includ­ing clas­sic Lisp-style lists, vec­tors (which cor­re­spond most closely to arrays in other lan­guages,) and lazy sequences.

Lazy sequences are inter­est­ing datas­truc­tures in the sense that they aren’t really datas­truc­tures. Instead they are objects which con­form to the sequence API, but which con­tain a func­tions which return the next item in the sequence. So while call­ing first on a Clo­jure list, returns the item stored in the first cons cell in the list, call­ing the same on a lazy sequence instead calls a func­tion which returns both a value and another func­tion. The value is returned by first. Here’s a small sub­set of the sequence pro­to­col func­tions and what they do:

Regular Sequence Lazy Sequence first returns the first item in the sequence calls a function which returns both a value and another function; returns the value rest returns a sequence with the first item removed calls a function which returns both a value and another function; returns the new function second returns the second item in the sequence calls the function returned by rest ; this function in turn returns both another value and another function; returns the function nth returns an item from the sequence n steps from the head executes a chain of n functions and returns the last value returned drop returns a new sequence equivalent to the current sequence with the first n items removed executes a chain of n functions and returns the last function returned last returns the last item in the sequence calls every chained function until no more functions are returned; returns the last value returned take returns a sequence of the first n items in the sequence executes a chain of n functions and returns a sequence of all the values returned

It should be clear after read­ing this that lazy sequences are anal­o­gous to old style Lisp cons cells. Whereas in a cons cell, the head con­tains an item and the tail con­tains a pointer to the next cell in the list, with a lazy sequence, the func­tion returns both an item and the next func­tion in the sequence.

In fact, in Clo­jure, one can mix cons cell based lists with lazy sequences. A one point in the list the tail of a cons cell is set to the head of a lazy sequence so that the first por­tion of the sequence is a con­crete list while the remain­der is an ephemeral lazy sequence. You actu­ally make this hap­pen with the code below:

user> (def foo (conj (lazy-seq [1 2 3]) 0)) #'user/foo user> (type foo) clojure.lang.Cons user> (type (drop 1 foo)) clojure.lang.LazySeq

It’s also worth not­ing that this can go the other way. A lazy sequence can also return a con­crete sequence rather than a lazy sequence pair. This allows struc­tural shar­ing with other sequence types.

This rather clearly sum­ma­rized what lazy sequences are, but it does­n’t say much about why one would want to use them. The short answer is that lazy sequence make things pos­si­ble that would be impos­si­ble with­out them and solve prob­lems that would oth­er­wise have to solved with state­full variables.

A Clo­jure lazy sequence allows for an algo­rith­mi­cally gen­er­ated log­i­cal sequence which does­n’t have to exist in mem­o­ry. Among other things, this makes it pos­si­ble to have infi­nite sequences, or to use sequences to rep­re­sent objects which are too large to load into mem­o­ry, (such as large files,) or even to rep­re­sent abstract sequences which don’t exist as such. This also makes it pos­si­ble to com­pose sequence oper­a­tions before actu­ally oper­at­ing on sequences, which allows for a lot of per­for­mance enhance­ments, such as par­al­leliz­ing an entire pipeline.

Take for example:

(iterate inc 1)

This will gen­er­ate a sequence of num­bers start­ing with one and incre­ment­ing by one onto infin­i­ty. You loop over this with another sequence and have index num­ber avail­able for each item in that sequence, like so:

user> (doall (map #(println (str %2 ": " %1)) persons (range))) 0: Irene 1: Joe 2: Steve 3: Fred 4: Thomas 5: Louise 6: Carlotta 7: Johnson (nil nil nil nil nil nil nil nil)

That’s a sim­ple exam­ple, but line-seq let’s one oper­ate on a file one line at a time with­out read­ing the whole file into mem­o­ry. tree-seq let’s one oper­ate on a tree struc­ture as a sequence, one node at a time, with­out hav­ing to con­vert the whole tree into a sequence.

You’ll notice that lazy sequences ful­fill some of the uses of an iter­a­tor object in other lan­guages.((In face Clo­jure has the iterator-seq func­tion which wraps Java iter­a­tors.) In that they cre­ate an abstract inter­face which allows one to loop over anything, with­out hav­ing to worry so much about mem­ory usage or per­for­mance. That’s what lazy sequences do.

So how does one gen­er­ate and use lazy sequence? Well the good news is that most Clo­jure which emit sequences emit lazy ones or will at least do so if they are passed a lazy sequence in the first place. This includes func­tions like map , remove , filter , take , drop , concat , and oth­ers. To gen­er­ate a lazy sequence from scratch you can use one of the gen­er­a­tor functions:

Some Clojure sequence generator functions range A lazy sequence of numbers with an option starting value, maximum value, and step size repeat An indefinitely repeating sequence of a constant repeatedly An indefinite sequence consisting of the output of repeated calls to the same function iterate A lazy sequence of output of repeated calls of a function, passed the output of the previous call to same function

Each of the func­tions gen­er­ates a lazy sequence from some pre­con­di­tion and one can expand on it using map or sim­i­lar func­tion.

In addi­tion, one can gen­er­ate lazy sequences from scratch using the lazy-seq macro. This macro takes a form which gen­er­ates a sequences and wraps it in a func­tion which is used as the tail of a lazy sequence. You can recur­sively gen­er­ate a lazy sequence then, with a recur­sive func­tion that wraps it’s body in a call to lazy-seq . For example:

(defn iterate [fn val] (cons val (lazy-seq (iterate fn (fn val)))))

Here, ‘ iterate ’ returns a lazy sequence, which when invoked, returns val and another call to iter­ate with (fn val) as the input. This is in fact equiv­a­lent to the built-in iterate func­tion. You can use lazy-seq like this as a wrap­per for any func­tion which recur­sively build­ings a list to instead cre­ate a lazy sequence.