Overview

AVL & red-black trees have been compared for many years. When inserts are random, both will have nearly identical average node height and insertion time. For sequential input, AVL trees are significantly more efficient at building a more balanced tree – over 30% faster than red-black trees in the Java implementations described below. This is exactly the result predicted by theoretical computer science which gives worst case height bounds of ~1.44 and 2 for AVL and red-black trees respectively – a difference of 32.6%. Historically, the number of rotations was seen as important; however, profiling demonstrates that the number of rotations is not a worthwhile statistic to monitor for performance, number of comparisons per insert/lookup/delete or average node height are the key benchmarks to watch.

Profiling Data for One Million Random Inserts

For these tests the OpenJDK 1.8 TreeMap red-black implementation was compared against an AVL implementation available on GitHub WAVLTreeMap. The driver program for the comparison reads one million java.util.Random numbers from a file and inserts them into the tree building ten trees of size 100,000, clearing the data each time when the size reaches 100,000.

AVL inserts Red-black inserts

Without the overhead of JProfiler, the clock times for one million insertions was similar. In the profiling snapshots attached, the red-black tree spent 8640ms performing inserts (put() in Java’s TreeMap) and 86ms rotating left or right a total of 581,968 times. The AVL tree spent 5278ms inserting and 95ms rotating left or right a total of 697,913 times. Although total execution time varied with the profiler attached, the number of rotations and relative time spent rotating did not change from run to run. The WAVL insertion algorithm also spent 1822ms calling compareTo(). The number of comparisons is much more important than number of rotations!

Without the overhead of the profiler, both trees consistently complete 100,000 inserts in 56ms+/-3ms on my 2013 MacBook.

So AVL & Red-black Trees Perform the Same?

Absolutely not. The worst case height differences demonstrated by theoretical analysis is important to keep in mind. Performing sequential inserts, the AVL algorithm is 37.5% faster! For 100,000 sequential inserts the red-black code runs in 19ms+/-1ms and the AVL code takes 13ms+/-1ms. The reason for this is clear when the average node height and resulting trees are considered. Images for trees built by inserting elements 1-15 sequentially are below:

AVL Tree with 15 elements Red-black Tree with 15 elements

However, the AVL code performed even better than the 32.6% predicted by theoretical analysis. Part of that difference is likely due to code complexity.

Conclusion

If you are using any of the multiple red-black tree implementations in Java and your data is not randomly ordered consider changing.

Your feedback is welcome –David M.