
Ruby's 'each_with_object' versus 'tap + each'

each_with_object and tap + each both return the object being built, which saves us from having to return it explicitly.

For example, take a look at the code below:

    def method_one(array)
      hash = {}
      array.each do |element|
        # do things with element, populate hash
      end
      hash
    end

    def method_two(array)
      {}.tap do |hash|
        array.each do |element|
          # do things with element, populate hash
        end
      end
    end

    def method_three(array)
      # each_with_object yields the element first, then the object being built
      array.each_with_object({}) do |element, hash|
        # do things with element, populate hash
      end
    end

The first method has to explicitly return the hash at the end, whereas the last two methods do not. Plus, the latter two offer blocks in which any new variables you create are block-scoped. Sometimes the same advantage can be had from variations of map, reduce, or inject, but with those you need to return the 'required object' at the end of the block.
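To make that last point concrete, here is a minimal sketch that builds the same kind of hash with reduce and with each_with_object. The method names are hypothetical and exist only for illustration. Note that reduce yields the accumulator first, while each_with_object yields it second.

```ruby
# Hypothetical helpers for illustration; neither appears in the original code.

def squares_with_reduce(array)
  array.reduce({}) do |hash, element|
    hash[element] = element ** 2
    hash # the block's value becomes the next accumulator, so it must come last
  end
end

def squares_with_each_with_object(array)
  array.each_with_object({}) do |element, hash|
    hash[element] = element ** 2 # the block's value is ignored here
  end
end

p squares_with_reduce([1, 2, 3])
p squares_with_each_with_object([1, 2, 3])
```

If the trailing `hash` line were dropped from the reduce version, the block would hand an integer to the next iteration, and the assignment into it would fail.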

Mainly for these reasons, many Ruby developers are attracted to each_with_object and tap + each.

But which one should be preferred?

At first glance, my answer to that was each_with_object. You can see from the code above that you save two entire lines with each_with_object. Furthermore, the meat of your code is less indented. The result is more succinct and more elegant.

My next step was to compare speed.

The test was to iterate over an array of integers and store each element's square as a value in an empty hash. The array was [1, 2, 3], so the resulting hash in each scenario was { 1 => 1, 2 => 4, 3 => 9 }. For each implementation, I ran one million iterations, which comes to three million hash writes.

The structure of each of my tests was:

    require 'benchmark'

    array = [1, 2, 3]
    iterations = 1_000_000

    Benchmark.bm do |x|
      x.report("each_with_object") do
        iterations.times do
          # code being tested goes here
        end
      end

      x.report("tap + each") do
        iterations.times do
          # code being tested goes here
        end
      end
    end

The code I tested for each_with_object was:

    array.each_with_object({}) do |integer, hash|
      hash[integer] = integer ** 2
    end

And the one for the combination of tap and each was:

    {}.tap do |hash|
      array.each do |integer|
        hash[integer] = integer ** 2
      end
    end
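Putting the harness and the two snippets together, a complete script might look like the sketch below. It is an assumption about how the pieces fit, not the exact script behind the numbers in this post. In particular, the two wrapper methods and their names are hypothetical; they add one method call per iteration to both variants, whereas the original test inlined the code.

```ruby
require 'benchmark'

ARRAY = [1, 2, 3].freeze
ITERATIONS = 1_000_000

# Hypothetical wrapper around the each_with_object snippet.
def squares_with_each_with_object(array)
  array.each_with_object({}) do |integer, hash|
    hash[integer] = integer ** 2
  end
end

# Hypothetical wrapper around the tap + each snippet.
def squares_with_tap_each(array)
  {}.tap do |hash|
    array.each do |integer|
      hash[integer] = integer ** 2
    end
  end
end

# The argument to bm is the label column width, which keeps the output aligned.
Benchmark.bm(16) do |x|
  x.report("each_with_object") do
    ITERATIONS.times { squares_with_each_with_object(ARRAY) }
  end

  x.report("tap + each") do
    ITERATIONS.times { squares_with_tap_each(ARRAY) }
  end
end
```

Timings will vary with the machine and the Ruby version, so expect the absolute numbers, and possibly the gap between the two, to differ from run to run.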

Test results were:

    #                        user     system      total        real
    # each_with_object   2.830000   0.010000   2.840000   2.835867
    # tap + each         2.580000   0.000000   2.580000   2.579430

If you would like to know what those headings mean, you should read up on benchmarking code in Ruby. To put it simply, the figure under the “real” heading, the elapsed wall-clock time, is what really matters.

From my results, I saw that the tap + each implementation took about 9% less time than the each_with_object implementation.
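The percentage comes straight from the two “real” figures reported above. As a quick check, the arithmetic can be sketched like this; the variable names are mine:

```ruby
# "real" timings copied from the results above, in seconds
each_with_object_real = 2.835867
tap_each_real = 2.579430

# How much less time tap + each took, relative to each_with_object
time_saved_pct = (each_with_object_real - tap_each_real) / each_with_object_real * 100

# The same gap expressed as extra work done per unit of time
speedup_pct = (each_with_object_real / tap_each_real - 1) * 100

puts "time saved: #{time_saved_pct.round(1)}%"       # prints "time saved: 9.0%"
puts "extra throughput: #{speedup_pct.round(1)}%"    # prints "extra throughput: 9.9%"
```

The two numbers differ only in which timing serves as the baseline, so either reading supports the same conclusion.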

With this information in hand, I would go with tap + each only when the objects were extremely large, which is rare; in such a case I would perhaps even stick with method_one for the best performance. In most scenarios, however, I would choose each_with_object for its visual appeal.

P.S. In case you are wondering, I deliberately used a small hash and a large number of iterations rather than the other way around. Why? Because as a Ruby hash grows, it pays a performance cost each time it rehashes. You can read more about this process here and in this excellent book. In our case, the cost of growing a large hash could have influenced the tests and yielded unreliable results, so I decided to play it safe and restrict the test to small objects.