Ruby 2.4.0 introduced a lot of great new features. One of them was open addressing for hash tables - the details of open addressing are a bit obscure, but Ruby hash tables are now faster. Everybody uses hash tables, so everybody gets extra speed. Awesome!

But how did that happen? There's an interesting story there. Let's tell that story and benchmark with Rails Ruby Bench, shall we? (Don't care about the story? Scroll down to the end for graphs of the speed differences.)

A Beginning and Some Dueling Banjos

Ruby's open addressing for hash tables is recorded by a truly wonderful bug report. If you don't care about my commentary, just go read it. Seriously.

It begins with Vladimir Makarov proposing open addressing for Ruby's hash tables and including a patch. That was very nice of him. Thank you, Vladimir! Open addressing keeps entries in one contiguous array instead of chaining colliding entries together in linked lists, as Ruby's previous method did, so it's a better match for modern multilevel CPU caches. (Here's his explanation of the hash table changes.)
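If open addressing sounds obscure, here's a toy sketch of the idea in plain Ruby. To be clear, this is not Vladimir's patch or how Ruby's C implementation works; the class name `ToyOpenHash`, the linear probing, and the "grow at half full" rule are all assumptions made for illustration. The point is only that colliding keys go into the next free slot of the same array rather than into a separate chain.

```ruby
# Toy open-addressing table with linear probing.
# Illustrative only: ToyOpenHash is a hypothetical class, not Ruby's real st.c.
class ToyOpenHash
  attr_reader :size

  def initialize(capacity = 8)
    @keys   = Array.new(capacity)
    @values = Array.new(capacity)
    @used   = Array.new(capacity, false)
    @size   = 0
  end

  # Insert or update. Growing at half full keeps probe sequences short
  # and guarantees the probe loop always finds an empty slot.
  def []=(key, value)
    grow if (@size + 1) * 2 > @keys.length
    i = slot_for(key)
    unless @used[i]
      @used[i] = true
      @keys[i] = key
      @size += 1
    end
    @values[i] = value
  end

  # Look up a key; returns nil when it is absent.
  def [](key)
    i = slot_for(key)
    @used[i] ? @values[i] : nil
  end

  private

  # Start at the hashed slot and step forward through neighbouring slots
  # until we find the key or an empty slot.
  def slot_for(key)
    i = key.hash % @keys.length
    i = (i + 1) % @keys.length while @used[i] && !@keys[i].eql?(key)
    i
  end

  # Double the arrays and reinsert every stored entry.
  def grow
    old_keys, old_values, old_used = @keys, @values, @used
    capacity = old_keys.length * 2
    @keys   = Array.new(capacity)
    @values = Array.new(capacity)
    @used   = Array.new(capacity, false)
    @size   = 0
    old_keys.each_index { |j| self[old_keys[j]] = old_values[j] if old_used[j] }
  end
end

table = ToyOpenHash.new
table[:ruby] = "2.4.0"
puts table[:ruby]
```

The sketch leaves out deletion entirely, which is one of the fiddly parts of open addressing, and it makes no attempt to preserve insertion order the way real Ruby hashes do.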

Is that the end of the story? Not so much.

Koichi pointed out that Vladimir's very first patch wasn't perfect and increased memory usage in some cases (true). Nobu and Yura Sokolov (funny_falcon) pointed out some other minor problems. Feedback happens, especially with a large patch or one that touches very common functionality like hash tables.

Vladimir responded, more back-and-forth ensued, and funny_falcon kept engaging and talking about how he'd have done it (for instance, he didn't think open addressing was necessary, and thought he could get similar results without it). Vladimir responded to him. A highly technical argument was going strong, mostly good-natured at first and eventually less so. It's easy for tempers to run hot in technical discussions -- I do the same thing, and they clearly understood what was going on. Isn't it wonderful to watch engineers doing what they feel passionate about, showing that they care but also acknowledging that we all want the same thing? I love watching that.

If you have time, read through the whole thread. The back-and-forth is wonderful and highly educational -- "you should use quadratic probing," "here's the Wikipedia article for...," "I disagree that this should be int32," "test large inserts, does the time grow linearly?" It's not just a great deep dive into hash tables. It's a great study in passionate disagreement between highly skilled engineers.
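That "does the time grow linearly?" question is one you can poke at yourself. Here's a rough sketch using Ruby's standard `Benchmark` module; the helper name `time_inserts` and the sizes are my own picks, not anything from the thread, and a single run like this is noisy (garbage collection alone can move the numbers around), so treat the output as a hint rather than a measurement.

```ruby
require "benchmark"

# Hypothetical helper: seconds taken to insert n integer keys into a fresh Hash.
def time_inserts(n)
  Benchmark.realtime do
    h = {}
    n.times { |i| h[i] = i }
  end
end

sizes = [10_000, 100_000, 1_000_000]
timings = sizes.map { |n| [n, time_inserts(n)] }.to_h

# If inserts scale linearly, the per-insert cost should stay roughly flat
# as n grows by 10x each step.
timings.each do |n, secs|
  puts format("%9d inserts: %.4fs (%.1f ns per insert)", n, secs, secs / n * 1e9)
end
```

If the per-insert cost climbs steeply as the table gets bigger, something other than plain linear growth is going on.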

It also involved Vladimir and Yura proposing and counter-proposing patches with different good and bad points, back and forth, and critiquing each other's code constantly. Who had the better hash table implementation?

Eventually Shyouhei and Koichi (prominent core Ruby committers) looked over the results and checked for errors. The patches continued to improve, and the edge cases kept getting fixed. Either Yura's or Vladimir's patch might win. Each had taken tricks from the other.

Nearly-final patches were prepared. Decisions were made about features like maximum hash size. Evaluations continued and intensified. Fixes were made. Yura's patch eventually adopted open addressing, and the two patches were very similar...

Koichi put together some great benchmarks and a wonderfully comprehensive report - and basically said the implementations were so close you could pick between them with a coin toss.