Problem

We have to implement the following interface to reverse an input list:

As an example of the expected result:

Input: 1, 2, 3

Output: 3, 2, 1

The only constraint is that the input list is immutable.

The current implementation is the following:

This solution will produce the expected result. Yet, from a performance perspective, there are few problems. Can you spot them? Would you have implemented it in a different way?

We consider accessing a given index of the input list is done in a constant time.

As a basis, running this function on Oracle JDK 9 and an Intel Core i7–7700 (3.6 GHz) with 1M elements results in the following response time:

problem avgt 403.000 us/op

Possible Solutions

Iteration Mode

First of all, let’s check the way to iterate on the input list. As we know, there are different alternatives to the classic for.

From Java 5, using the for-each loop:

Is it really faster? In Effective Java, Joshua Block says:

[The for-each loop] may offer a slight performance advantage over an ordinary for loop in some circumstances, as it computes the limit of the array index only once. While you can do this by hand, programmers don’t always do so.

Said differently, if we already pre-compute the list size (as we did in the initial implementation), there should not be any difference with a classic for.

Another option from Java 8 is to use the Stream API:

Again, this does not offer any major performance advantage. Running a benchmark with the for-each and the stream solution is giving more or less the same results:

problem avgt 403.000 us/op

for-each avgt 403.000 us/op

stream avgt 403.000 us/op

ArrayList Allocation

An ArrayList is a resizable-array implementation. Under the hood, it manages a capacity which is the size of the array used to store the elements.

Each ArrayList implementation has a growth policy which may vary depending on the JDK (for example, multiplying the capacity by 50% once the array is full).

If we want to get rid of this dynamic growth, we can just initialize the ArrayList with a given initial capacity. This is made possible as we already know the size of the output list.

This is done this way:

Let’s check the results:

problem avgt 403.000 us/op

initial-capacity avgt 403.000 us/op

Nothing really changed. How can we explain that?

Most likely, the runtime optimizations made by the JIT compiler leads to thinking that missing this initial capacity is somehow not that important. If we disable the warmup period, we can notice the response time difference between both tests is way more important.

Elements Shift

Let’s take a closer look at the way we insert elements in the produced ArrayList :

The time complexity of this simple expression is linear, not constant. Indeed, inserting an element at the position zero of an ArrayList requires to shift all the elements to the right.

This is the main problem of the implementation. So what are the different options?

The first option is to use a data structure having a constant time to insert an element at position zero. In Java, we can use a LinkedList as it manages a pointer on the first element.

Inserting at the first position is done using the method addFirst() :

Any major improvement this time?

problem avgt 403.000 us/op

linked-list avgt 541 us/op

Way better! 🍾

The second option is to keep an ArrayList structure and to insert the elements to the end (which is, this time, managed in a constant time complexity).

As a consequence, we have to iterate in the opposite order like this:

problem avgt 403.000 us/op

linked-list avgt 541 us/op

insert-last-pos avgt 281 us/op

The response time is even smaller than with the LinkedList solution.

Actually, using a LinkedList adds a small overhead due to dynamic memory allocation (every element is wrapped in a node object). So, if we already know the size of the structure, it is obviously faster to use a structure like ArrayList .

Internal Array Copy

ArrayList is backed by an array. What if we were trying to copy this internal array and doing an in-place inversion of this copy?

The swap() function simply swaps two indexes from a given array.

Any performance gain?

problem avgt 403.000 us/op

linked-list avgt 541 us/op

insert-last-pos avgt 281 us/op

copy-swap-array avgt 254 us/op

Interestingly enough, the solution is even faster than the previous ones. There are some possible explanations:

The array copy is managed by System.arraycopy() which is a native and optimized function

which is a native and optimized function The code is more CPU-cache friendly

View

The last solution relies on the fact that the input list is immutable. Hence, we could create a view on top on this list by creating our own AbstractList :

Obviously, in this case, comparing the reverseList() benchmarks is not that interesting as the output list is constructed lazily:

problem avgt 403.000 us/op

linked-list avgt 541 us/op

insert-last-pos avgt 281 us/op

copy-swap-array avgt 254 us/op

view avgt 0,002 us/op

There will be a tiny overhead when we will call get() as we need to process list.size() - 1 - index . In real life, it might be worth considering how the list will be used. If it is accessed very frequently, perhaps a copy would have been the best solution.

One last thing to mention. Some people tend to highlight that functional programming and immutability will always have a negative impact on the performance of an application. This statement cannot be generalized.

Actually, in our case, the best solution relies on the constraint of the problem itself.

Further Learning

10 Books to Prepare Technical Programming/Coding Job Interviews

10 Algorithm Books Every Programmer Should Read

Top 5 Data Structure and Algorithm Books for Java Developers

From 0 to 1: Data Structures & Algorithms in Java

Data Structure and Algorithms Analysis — — Job Interview

20+ String based coding problems from interviews

20+ linked list problems from interviews

20+ basic algorithms based problems from interviews