When working with collections and Kotlin’s vast number of collection functions, we tend to forget about performance, assuming that these functions are well designed and perform extremely well. That is true of each function on its own, but when we chain several of them, performance can suffer.

Eagerly created intermediate collections

These are the collections that Kotlin creates in order to evaluate a chain of collection function calls.

For example, when a map is followed by a filter, Kotlin creates two lists: one for the result of the map and another for the result of the filter. This is fine for a small number of elements, but not for a big list.
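A chain like the one described above might look as follows. This is only a sketch: the function name, the sample numbers and the two lambdas are assumptions made for illustration, not the original sample.

```kotlin
// Hypothetical example: doubles every number, then keeps the multiples of 3.
fun eagerDoubledMultiplesOfThree(numbers: List<Int>): List<Int> =
    numbers
        .map { it * 2 }          // allocates a first list holding every doubled value
        .filter { it % 3 == 0 }  // allocates a second list holding the values that pass

fun main() {
    println(eagerDoubledMultiplesOfThree(listOf(1, 2, 3, 4, 5, 6)))  // prints [6, 12]
}
```

The first list is thrown away as soon as the filter has finished, which is the cost that grows with the size of the input.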

Sequences

In Kotlin you can use sequences to avoid the creation of the intermediate collections. The syntax is pretty easy.

With a sequence we get the same result without building the intermediate lists, which performs better on large inputs. We can convert any collection to a sequence with asSequence() and convert it back with toList().
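As a minimal sketch of that conversion, the map and filter chain can be wrapped between asSequence() and toList(). The function name and the lambdas are again hypothetical and chosen only for illustration.

```kotlin
// Hypothetical example: same logic as an eager map/filter chain,
// but evaluated lazily through a sequence.
fun lazyDoubledMultiplesOfThree(numbers: List<Int>): List<Int> =
    numbers
        .asSequence()            // wraps the list, nothing is copied
        .map { it * 2 }          // intermediate operation, no list is created
        .filter { it % 3 == 0 }  // intermediate operation, no list is created
        .toList()                // terminal operation, builds the single result list

fun main() {
    println(lazyDoubledMultiplesOfThree(listOf(1, 2, 3, 4, 5, 6)))  // prints [6, 12]
}
```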

With a sequence, every element of the list is first mapped and then filtered before the next element is processed. Without sequences, by contrast, the whole list is mapped first, and then the result of that map is filtered.
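The difference in evaluation order can be made visible by recording each lambda call. The sketch below uses two hypothetical helper functions, traceEager and traceLazy, that return the recorded calls in the order they happened.

```kotlin
// Hypothetical helper: records the order of lambda calls for the eager chain.
fun traceEager(numbers: List<Int>): List<String> {
    val log = mutableListOf<String>()
    numbers
        .map { log.add("map $it"); it * 2 }
        .filter { log.add("filter $it"); it % 3 == 0 }
    return log
}

// Hypothetical helper: records the order of lambda calls for the sequence chain.
fun traceLazy(numbers: List<Int>): List<String> {
    val log = mutableListOf<String>()
    numbers
        .asSequence()
        .map { log.add("map $it"); it * 2 }
        .filter { log.add("filter $it"); it % 3 == 0 }
        .toList()  // terminal operation that triggers the evaluation
    return log
}

fun main() {
    println(traceEager(listOf(1, 2, 3)))
    // prints [map 1, map 2, map 3, filter 2, filter 4, filter 6]
    println(traceLazy(listOf(1, 2, 3)))
    // prints [map 1, filter 2, map 2, filter 4, map 3, filter 6]
}
```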

Sequences are similar to Java 8 streams, but without the option to run the operations in parallel on multiple CPUs.

Lazy operations

There are two types of operations on sequences.

Intermediate operations: return another sequence: map(), filter()…

Terminal operations: return a result such as a collection, a number or an object: toList(), find(), count()…

The intermediate operations are executed only when the terminal operation is called.
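A small sketch can show this laziness. The function lazinessDemo and its event list are hypothetical and exist only to record what runs and when.

```kotlin
// Hypothetical helper: records when the map lambda actually runs.
fun lazinessDemo(): List<String> {
    val events = mutableListOf<String>()

    // Intermediate operation: only describes the work, runs nothing yet.
    val pipeline = sequenceOf(1, 2, 3)
        .map { events.add("map $it"); it * 10 }
    events.add("pipeline built")

    // Terminal operation: this call is what executes the map lambda.
    val total = pipeline.sum()
    events.add("sum = $total")
    return events
}

fun main() {
    println(lazinessDemo())
    // prints [pipeline built, map 1, map 2, map 3, sum = 60]
}
```

The "pipeline built" entry appears before any "map" entry, which shows that declaring the map did not execute it.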

Create a sequence

Creating a sequence directly from a lambda can be really helpful, and it is very easy to do. We use generateSequence with a “seed” value, 0 in this case, and a lambda, { it + 1 } in this sample, that computes each next element from the previous one.

Note that in a chain that ends with a sum, the earlier steps are not executed until the sum, the terminal operation of the chain, is called.
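The original sample is not reproduced here, so the following is a sketch under assumptions: it uses the seed 0 and the lambda { it + 1 } from the text, while the function name and the use of take() to bound the infinite sequence are illustrative choices.

```kotlin
// Hypothetical example: sums the first `count` natural numbers starting at 0.
fun sumOfFirstNaturals(count: Int): Int {
    val naturals = generateSequence(0) { it + 1 }  // 0, 1, 2, ... infinite, nothing computed yet
    val firstFew = naturals.take(count)            // still lazy, only limits the sequence
    return firstFew.sum()                          // terminal operation, runs the whole chain
}

fun main() {
    println(sumOfFirstNaturals(5))  // prints 10 (0 + 1 + 2 + 3 + 4)
}
```

Without a limiting step such as take(), calling sum() on this sequence would never finish, because the generator never returns null.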

Improving sequences

We can categorize collection functions into two groups.

The ones that always run on the whole collection: map(), filter()…

The ones that don’t need to run on all the data: find(), any()…

Given these two categories and the fact that, as mentioned before, in a sequence every element goes through the whole chain of operations before the next element is processed, sequences can improve performance even further when a chain ends with a function from the second group.

Consider a map followed by a find, run once on a list and once on a sequence. In the first chain, the map is executed on the whole list.

In the second one, because it is performed on a sequence, the map is executed on the first element, which is then passed to the find function, and the chain terminates. The find function returns as soon as it finds the first element that satisfies the lambda, which here happens to be the first element of our collection.
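The comparison above can be sketched by counting how many times the map lambda runs. The two function names, the sample data and the predicate are assumptions made for illustration.

```kotlin
// Hypothetical helper: counts map calls when the chain runs on a list.
fun mapCallsEager(numbers: List<Int>): Int {
    var calls = 0
    numbers
        .map { calls++; it * 2 }  // runs for every element before find starts
        .find { it > 0 }
    return calls
}

// Hypothetical helper: counts map calls when the chain runs on a sequence.
fun mapCallsLazy(numbers: List<Int>): Int {
    var calls = 0
    numbers
        .asSequence()
        .map { calls++; it * 2 }  // runs only until find gets a match
        .find { it > 0 }
    return calls
}

fun main() {
    val numbers = listOf(1, 2, 3, 4)
    println(mapCallsEager(numbers))  // prints 4
    println(mapCallsLazy(numbers))   // prints 1
}
```

Because the first mapped value already satisfies the predicate, the sequence version stops after a single map call, while the list version maps all four elements first.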

As a rule of thumb in Kotlin, use sequences whenever you chain operations on a large collection. On small collections, the regular eager operations perform efficiently.

Conclusion

When using chains of collection functions, we can code faster, rely on standard library functions and get readable code. Paying attention to how big our collections are, and using sequences when they are big, will help us avoid future performance problems.