Results

Which of these array memory layouts is fastest?

The answer is complicated: it seems to depend on the data size, the cache size, the cache line width, and the relative cache speed. In many settings, B-trees (with a properly chosen value of B) are best. In others, the Eytzinger layout wins. In still others, the van Emde Boas (vEB) layout is the winner (at least for large enough array sizes).
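
To make the comparison concrete, here is a minimal sketch of one of these layouts, the Eytzinger layout, written in Python for readability. It is not the benchmark code behind the graphs. The function names `eytzinger` and `eytzinger_search` are made up for this illustration, and the sketch assumes the 0-indexed convention in which the children of position i sit at positions 2i+1 and 2i+2.

```python
def eytzinger(sorted_vals):
    """Return the values of sorted_vals rearranged in Eytzinger (breadth-first) order."""
    n = len(sorted_vals)
    out = [None] * n
    it = iter(sorted_vals)

    def fill(i):
        # An in-order walk of the implicit tree visits positions in sorted order,
        # so handing out the sorted values one by one produces the layout.
        if i < n:
            fill(2 * i + 1)
            out[i] = next(it)
            fill(2 * i + 2)

    fill(0)
    return out


def eytzinger_search(layout, x):
    """Return the position in layout of the smallest value >= x, or -1 if there is none."""
    i, best = 0, -1
    n = len(layout)
    while i < n:
        if layout[i] >= x:
            best = i           # candidate answer; look for a smaller one on the left
            i = 2 * i + 1
        else:
            i = 2 * i + 2      # everything here and to the left is too small
    return best


layout = eytzinger([1, 2, 3, 4, 5, 6, 7])
pos = eytzinger_search(layout, 5)
```

The search reads positions 0, then 1 or 2, then one of 3 to 6, and so on, so the first few levels of the tree sit together at the front of the array. Whether that access pattern beats a B-tree or vEB layout on a given machine is exactly the question the measurements are meant to answer.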

For example, consider the following two graphs, generated by running the same code on two different Intel machines. In the left graph, the Eytzinger layout is almost as slow as a plain sorted array, while the van Emde Boas and B-tree layouts are more than twice as fast. In the right graph, the Eytzinger and B-tree layouts are the fastest, the sorted array is still the slowest, and the van Emde Boas (vEB) layout is somewhere in between (for array sizes).

This is why I need your help. I have some theories that may explain this complicated behaviour, but they need to be validated on a larger sample of hardware. If you have a Linux machine and would like to contribute to this effort, then please follow the instructions at the top of the page.