Kotlin: Beware of Java Stream API Habits

Kotlin’s Collections API is extremely versatile and expressive but there’s one important thing to be aware of especially when migrating from Java.

Aching Design of Java Collections

In Java, although we could easily implement our own immutable data structures or simply use provided immutable views, using them on an everyday basis was challenging because of the Collections API design.

For example, imagine yourself implementing a new immutable data structure for Java and trying to implement the following methods:

boolean add(E e); boolean remove(Object o); void clear(); boolean removeAll(Collection<?> c); // ...

The only reasonable way is to simply forbid using them:

public E set(int index, E element) { throw new UnsupportedOperationException(); } public void add(int index, E element) { throw new UnsupportedOperationException(); } public E remove(int index) { throw new UnsupportedOperationException(); } // and so on ...

And this shall remain like this forever as long as Java embraces backward compatibility (although last releases were not very good at this).

This is one of the reasons why the only reasonable way to enrich the API, was to introduce a new abstraction – the Stream API. Streams aren’t collections so the API could be designed from scratch.

Stream API Laziness

Besides the new fancy functional API, the major selling point of java.util.stream.Stream was laziness – which made them nearly as performant as standard imperative solutions and made it possible to work with potentially infinite sequences.

Consider the following example:

Optional<Integer> resultStream = list.stream() .map(i -> i * 2) .filter(i -> i > 1) .findAny();

Initially, it might be tempting to think that the optimistic time complexity is O(2*N), while the imperative alternative provides O(1) while being less error-prone and less readable:

Optional<Integer> resultLoop = Optional.empty(); for (Integer i : list) { Integer integer = i * 2; if (integer > 1) { resultLoop = Optional.of(integer); break; } }

Even the huge improvement in code maintainability/readability wouldn’t be enough to justify the performance impact.

Luckily, thanks to lazy evaluation, both examples provide the optimistic time complexity of O(1) (the first encountered element matches) and pessimistic of O(N) (no elements match) – Streams simply take elements one by one and push them through the whole operation chain allowing short-circuiting as soon as the result is obtained.

A Trap of Kotlin’s Expressive Collections API

Kotlin’s Collections API is extremely versatile, expressive, and well-suited for working with immutable data structures – sounds great, right? But they surely can backfire if not used properly!

The great thing about collections in Kotlin is that they are immutable by default and feature a very rich functional API so we no longer need to turn to Stream API because our collections provide all this functionality already.

So, if we tried to recreate the example above, we could simply write:

val list = listOf(1, 2, 3, 4, 5) list .map(expensiveOperation()) .map(anotherExpensiveOperation()) .first()

Nice and clean but there’s a catch – let’s measure the execution time.

What do you think the result will be assuming that both expensiveOperation() and anotherExpensiveOperation() last exactly one second?

Kotlin’s collections are not lazy hence the operation took around 10 seconds which matches the optimistic time-complexity of O(2*N), where N = 5.

With great power comes great responsibility – an addition of every new map/flatMap/filter/… call, might negatively impact the time-complexity of your solution in certain scenarios because each of them simply creates a new instance of collection simply by iterating over the previous one – those also don’t support short-circuiting.

Bringing Laziness Back

Good news, everyone!

Naturally, Kotlin provided a native way of achieving laziness in situations like above by introducing lazy sequences. We can convert every Iterable to a lazy Sequence, which is a Kotlin’s Stream alternative, by using the asSequence() method:

val list = listOf(1, 2, 3, 4, 5).asSequence() list .map(expensiveOperation()) .map(anotherExpensiveOperation()) .first()

If we measure the execution time now, we can breathe a sigh of relief because the result is around 2 seconds regardless of the initial sequence size which matches the time-complexity of O(1) (2 seconds because of two map() calls).

The above code snippets can be found on GitHub.

The followup article goes into particular examples:

Key Takeaways

Kotlin Collections are not lazy in nature like Java Stream API is

If you want laziness, use asSequence()



