As we talked about last time, there are certain use cases where the Array performs horribly when compared to a List. So in this article, we'll explore how we can change the internals to improve those cases.

"Internals!?" you might say, and rightly so. The very word instill fear. I wouldn't blame you if you feel as petrified as if a wild Rhinoceros had just come home from a hard day at the swamp and found you wearing his pyjamas, smoking his cigars and in bed with his wife. But do not worry, I think I'm able to explain these malicious internals in a way that will keep the fear well under wraps.

In fact, I've already explained how Elm's Array implementation works at last years Elm Europe. Before reading the rest of the article, you really should see the ten minute explanation. Don't worry, I'll wait.

Back? Good. Let's begin.

What was the problem again?

My argument is that an Array is a better general purpose data structure than the List, and that the Array should replace the List as the default collection type in Elm. The goal is therefore to close the performance gap between these two structures in as many use cases as possible.

The List is ridiculously fast when modifying elements at the beginning of it. The Array is simply not suited for that particular use case. Why is that?

If you remember what I said last year at Elm Europe, an Elm Array is just a tree of javascript arrays. To navigate that tree we look at the bit representation of the index. The first five bits tells us what slot to look at in the root array, the next five bits tells us what slot to look at in the second level, and so on until we find the slot that contains the element we're looking for.

Retrieving any element in a given Array takes an equal amount of time, as we're doing the exact same navigation whatever the index might be. Replacing any element is a little bit slower, as we have to copy and modify every javascript array we touch. So if we want to replace the first element in an Array with 1024 elements, we would need to copy two arrays, both containing 32 elements each, and then modify two elements. This is much faster than copying all 1024 elements, but it still is a decent amount of copying.

A List, on the other hand, would be able to replace the first element with a single allocation, but the performance gets worse the farther you get away from the first element, so the Array will perform better in general.

The problem is when one wants to remove, or add, elements to the beginning of the Array. It's important to remember that the index is a way of navigating our tree, so if we remove, or add, elements from the beginning of our Array, then the path to the first element has to change. If the path to the first element has to change, the path to every other elements needs to change as well. This means the entire Array changes, which is going to be slow.

The reverse, however, is not so bad. If we add or remove elements from the end, only the effected javascript arrays needs to change. No one expects that the element at index 0 would switch places if the last element is removed, or if there's a new last element. So when removing, or adding, elements to the end of an Array, it's about as fast as replacing any element, which isn't so bad.

Ideally, we'd like to have the same performance in both cases. Adding an element to the beginning of an Array, should be just as fast as adding to the end of it.

How do we do that?

One way is to add some mechanism of altering the index in the case the array has been sliced or appended. So that, if you pass in index 0, we can translate this into index 2 if the Array has already removed the first two elements.

We can steal a technique from B-Trees for this. B-Trees are most commonly found in databases, like mysql or postgresql. The way they work is that they keep a record of what index range resides in a lower-level node. The way this would be implemented in Elm, is that we keep two javascript arrays for every level of the tree, instead of just one. The first javascript array contains the actual elements, or sub-trees, as before. The second javascript array contains the maximum index found in the sub-trees.

When looking for an element we would start to navigate as we've done before, but before actually descending to a lower level, we first check the record to see if the index we want actually resides in the sub-tree we're looking at. If it doesn't, we check if the neighbor contains the correct index range, and then descend into that sub-tree if it does.

Of course, if a record isn't needed (because the Array has never been sliced or appended) we don't need to keep a record around. That way, we can avoid the overhead of keeping our index records up to date unless we really need to, which will leave the performance unchanged for most use cases.

This idea isn't new. The implementation is called RRB-Trees and was invented by Bagwell & Rompf. You can read the paper here.

What performance can we expect?

In the best case, using this optimization will make adding things to the beginning just as fast as adding things to the end. This should also be the case for removals. I'm saying "best case" here, because there will be some overhead by keeping the index records up to date.

Below I've added some benchmark results of removing, and adding, elements from the beginning and the end of an Array, versus doing the same operations to a List. These benchmarks were run on a Macbook Air from mid-2012, using Chrome 65. You can find the code here.



# Initial collection size is 1,000 elements. # Removing the first element * List: 55,833,836 ops/sec * Array: 84,795 ops/sec * List is 65,744.93% faster # Adding a new first element * List: 58,044,193 ops/sec * Array: 94,506 ops/sec * List is 61,318.03% faster # Removing the last element * List: 57,275 ops/sec * Array: 3,024,220 ops/sec * Array is 5,180.17% faster # Adding a new last element * List: 116,999 ops/sec * Array: 8,866,400 ops/sec * Array is 7,478.14% faster

As you can see, List's are still faster for the specific use case of modifying the beginning of the collection, but List cannot retain the performance in the general case. As such, I'd argue that the Array would become a much better general purpose data structure, if we manage to implement the optimization we discussed in the last section.

Fantastic stuff! Let's get crackin'

Not so fast.

There are still things you can do with a List that you cannot do using an Array. Try implementing a find function to see what I mean. One can easily make an efficient implementation for the List, but there is no way of doing this for the Array without traversing the entire collection.

Tune in next time to explore two ways of solving this problem, along with benchmarks, code snippets and, of course, another reference to black adder.