Let’s now turn to our second problem: iterating over multiple sequences. For this we’ll consider the problem of transposing two vectors, i.e. given [a b c] and [0 1 2] we create [[a 0] [b 1] [c 2]] as the output. Not many functions in clojure.core can deal with multiple input sequences: as far as I can tell only map, mapv, pmap, sequence, interleave, lazy-cat and mapcat can. So if you need to iterate over multiple collections you have to use this limited set of functions.

Instead, using iterators we can easily do this:
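A minimal sketch of the idea, written with a plain loop over two java.util.Iterator objects rather than the loop-it macro. The name zip-vec is hypothetical, and the timings quoted in this post refer to the loop-it version, not to this sketch:

```clojure
(defn zip-vec
  "Eagerly pairs up the elements of xs and ys into a vector of
  two-element vectors. Stops as soon as either input is exhausted.
  Hypothetical helper, for illustration only."
  [xs ys]
  (let [^java.util.Iterator it-x (clojure.lang.RT/iter xs)
        ^java.util.Iterator it-y (clojure.lang.RT/iter ys)]
    ;; Accumulate into a transient vector to avoid intermediate copies.
    (loop [acc (transient [])]
      (if (and (.hasNext it-x) (.hasNext it-y))
        (recur (conj! acc [(.next it-x) (.next it-y)]))
        (persistent! acc)))))

(zip-vec '[a b c] [0 1 2])
;; => [[a 0] [b 1] [c 2]]
```

Both iterators advance in lockstep, so no intermediate lazy sequence is ever built.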


This runs in about 20ms for two vectors with 1 million elements. Using (mapv vector xs xs) runs in about 85ms. So over 4x faster when using iterators. Of course, it’s slightly unfair since we’re not providing a function f to call here. Doing so only slows down the function by about 1–2% however.

Side note: If you really need this to be fast you can also use arrays to get this down to 10ms.

There are many applications for loop-it. We can also use it to create a faster interleave function (not lazy, obviously):
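As a rough sketch under the same assumptions (plain loop and explicit iterators instead of loop-it, and a hypothetical name interleave-vec), an eager two-collection interleave could look like this:

```clojure
(defn interleave-vec
  "Eager interleave of two collections into a vector. Like
  clojure.core/interleave it stops at the shorter input.
  Hypothetical helper, for illustration only."
  [xs ys]
  (let [^java.util.Iterator it-x (clojure.lang.RT/iter xs)
        ^java.util.Iterator it-y (clojure.lang.RT/iter ys)]
    (loop [acc (transient [])]
      (if (and (.hasNext it-x) (.hasNext it-y))
        ;; Append one element from each input per step.
        (recur (-> acc
                   (conj! (.next it-x))
                   (conj! (.next it-y))))
        (persistent! acc)))))

(interleave-vec [1 2 3] [:a :b :c])
;; => [1 :a 2 :b 3 :c]
```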


This runs in about 26ms whereas (doall (interleave xs xs)) runs in about 64ms. A nice speedup.

Of course, I hear you say, map is much more flexible: it can take a variable number of sequences:
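For instance, a small illustration with three inputs (the data is made up):

```clojure
;; map calls the function with one element from each input per step.
(map vector [1 2 3] [:a :b :c] ["x" "y" "z"])
;; => ([1 :a "x"] [2 :b "y"] [3 :c "z"])
```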

Heck, it can even properly deal with infinite sequences:
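One illustration of this, assuming the point is about infinite input sequences: map stays lazy and stops at the shortest input.

```clojure
;; (range) is infinite; the result ends when the finite vector does.
(map vector [:a :b :c] (range))
;; => ([:a 0] [:b 1] [:c 2])

;; Two infinite inputs are fine as long as only a finite prefix is consumed.
(take 3 (map + (range) (iterate inc 10)))
;; => (10 12 14)
```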

Can we achieve the same with loop-it? Not in its current state: it requires you to give it, at compile time, the sequences you want to iterate over. For a variable number of sequences we can use a trick (copied from clojure.core): transform a variable number of iterators into a single iterator. The new iterator returns a collection for each call to it.next(). We call this a multi iterator. Since the class in clojure.core is private we have to rewrite it:
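A minimal sketch of such a multi iterator, built with reify and an object array of iterators. The name multi-iter is hypothetical, and returning each step as a vector is an assumption of this sketch, not necessarily what the original code does:

```clojure
(defn multi-iter
  "Combines the given collections into a single java.util.Iterator.
  Each call to .next returns a vector holding one element from every
  input; .hasNext is true only while all inputs have elements left.
  Eager in colls: every input iterator is created up front."
  ^java.util.Iterator [colls]
  (let [^objects iters (into-array Object (map #(clojure.lang.RT/iter %) colls))
        n (alength iters)]
    (reify java.util.Iterator
      (hasNext [_]
        ;; With zero inputs there is nothing to iterate over.
        (and (pos? n)
             (loop [i 0]
               (if (< i n)
                 (if (.hasNext ^java.util.Iterator (aget iters i))
                   (recur (unchecked-inc i))
                   false)
                 true))))
      (next [_]
        ;; Pull one element from each input into a fresh array.
        (let [out (object-array n)]
          (dotimes [i n]
            (aset out i (.next ^java.util.Iterator (aget iters i))))
          (vec out))))))

(iterator-seq (multi-iter [[1 2 3] [:a :b] ["x" "y" "z"]]))
;; => ([1 :a "x"] [2 :b "y"])
```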

Just like clojure.lang.MultiIterator it uses native arrays for speed. However, one problem remains: it cannot deal with an infinite number of sequences since it is eager. So let’s create another, lazy one:

Now colls can be an infinite sequence.

Let’s actually see how we can use it:

It’s a start. But before benchmarking, let’s combine and unroll some more to come up with a really fast mapv:

Note: I also changed the loop-it macro to call ensure-iter instead of clojure.lang.RT/iter.
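To make the idea of unrolling concrete, here is a rough sketch with dedicated arities for one and two collections and a generic fallback for more. It uses plain loops instead of loop-it, the name mapv-iter is hypothetical, and it is not the benchmarked implementation:

```clojure
(defn mapv-iter
  "Eager, iterator-based mapv. The one- and two-collection arities are
  unrolled by hand; the variadic arity handles any number of inputs.
  Stops at the shortest input, like clojure.core/mapv."
  ([f c1]
   (let [^java.util.Iterator it (clojure.lang.RT/iter c1)]
     (loop [acc (transient [])]
       (if (.hasNext it)
         (recur (conj! acc (f (.next it))))
         (persistent! acc)))))
  ([f c1 c2]
   (let [^java.util.Iterator it1 (clojure.lang.RT/iter c1)
         ^java.util.Iterator it2 (clojure.lang.RT/iter c2)]
     (loop [acc (transient [])]
       (if (and (.hasNext it1) (.hasNext it2))
         (recur (conj! acc (f (.next it1) (.next it2))))
         (persistent! acc)))))
  ([f c1 c2 & colls]
   (let [its (mapv #(clojure.lang.RT/iter %) (list* c1 c2 colls))]
     (loop [acc (transient [])]
       (if (every? #(.hasNext ^java.util.Iterator %) its)
         ;; mapv (not map) so every iterator is advanced exactly once
         ;; before f is called.
         (recur (conj! acc (apply f (mapv #(.next ^java.util.Iterator %) its))))
         (persistent! acc))))))

(mapv-iter vector [1 2 3] [:a :b :c] ["x" "y" "z"])
;; => [[1 :a "x"] [2 :b "y"] [3 :c "z"]]
```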

Let’s benchmark:

(mapv vector xs xs xs xs)

For 1 million elements this runs in about 1.28 seconds. Our new iteration based implementation runs this in only 147ms. Almost a 9x speedup and the implementation is pretty readable.

Other applications

1. Let’s write a really fast take-nth:
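A minimal sketch of what such a function might look like, using a plain loop over a java.util.Iterator instead of loop-it. The countdown-based skipping is an assumption of this sketch, and any timings in this post refer to the loop-it version:

```clojure
(defn take-nth-vec
  "Eagerly returns a vector of every nth element of coll, starting
  with the first. Assumes n is a positive integer."
  [n coll]
  (let [n (long n)
        ^java.util.Iterator it (clojure.lang.RT/iter coll)]
    ;; skip counts how many elements to drop before the next keeper.
    (loop [acc (transient [])
           skip 0]
      (if (.hasNext it)
        (let [x (.next it)]
          (if (zero? skip)
            (recur (conj! acc x) (dec n))
            (recur acc (dec skip))))
        (persistent! acc)))))

(take-nth-vec 5 (vec (range 23)))
;; => [0 5 10 15 20]
```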

This runs in 52ms for (take-nth-vec 5 xs-large) where xs-large has 10 million elements. (into [] (take-nth 5) xs-large) runs in 273ms and (doall (take-nth 5 xs-large)) runs in 290ms.

2. Partition your sequence into two sets: all-truthy-values, all-falsy-values:

This runs in about 176ms for the loop-it based implementation and 291ms for the reduce-based implementation. And IMO the loop is more readable.

Side note: This is how many other programming languages define partition.
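As a sketch of the two-accumulator approach (plain loop instead of loop-it, and a hypothetical name partition-by-pred):

```clojure
(defn partition-by-pred
  "Splits coll into [truthy falsy], where truthy holds the elements for
  which (pred x) is truthy and falsy holds the rest. Order is preserved."
  [pred coll]
  (let [^java.util.Iterator it (clojure.lang.RT/iter coll)]
    ;; Two transient accumulators are carried through a single pass.
    (loop [yes (transient [])
           no  (transient [])]
      (if (.hasNext it)
        (let [x (.next it)]
          (if (pred x)
            (recur (conj! yes x) no)
            (recur yes (conj! no x))))
        [(persistent! yes) (persistent! no)]))))

(partition-by-pred even? (range 5))
;; => [[0 2 4] [1 3]]
```

Carrying both accumulators as loop locals avoids allocating a wrapper value per step, which a reduce over a pair of vectors would typically need.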

3. Compute the sum of a collection.

From top to bottom: 177ms, 130ms, 59ms
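A minimal sketch of summing with a primitive long accumulator over an iterator. The name sum-iter is hypothetical, it assumes every element is an integer that fits in a long, and it is not necessarily one of the three measured variants:

```clojure
(defn sum-iter
  "Sums the elements of coll using primitive long arithmetic.
  Throws on overflow, because + on longs is checked."
  ^long [coll]
  (let [^java.util.Iterator it (clojure.lang.RT/iter coll)]
    ;; acc is a primitive long, so the additions themselves do not box.
    (loop [acc 0]
      (if (.hasNext it)
        (recur (+ acc (long (.next it))))
        acc))))

(sum-iter (vec (range 100)))
;; => 4950
```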

ClojureScript

All of this is also possible in ClojureScript. An iterator is created with iter; only the change to ensure-iter is a little trickier.

Dreams

An Iterator is pretty simple: it.hasNext() and it.next(). A really nice feature would be an improved iterator that also allows us to say “give me the rest of the sequence right now”. Something like it.restSeq(). This could allow us to speed up things like nthrest, which currently isn’t possible with iterators. One notable idea is to speed up apply, which isn’t all that fast. It currently iterates over the given sequence multiple times. See my ClojureScript ticket for ideas. If an Iterator had an it.restSeq() it could be used right there.

Conclusion

Don’t forget about Iterators in Clojure. They can often be worth it. Use them if you want to:

Accumulate multiple values

Do primitive math

Iterate over multiple sequences at the same time

As always, if performance matters: benchmark. There are times when using an Iterator is not faster. For instance, an eager mapcatv implementation was slower with a nested loop-it implementation (but faster using reduce for the inner loop).

All code is at: https://github.com/rauhs/clj-bench