In Part I we saw how instrumentation profilers work, and in Part II we learned how to measure execution time of programs using CPU time and Wall time. It’s recommended to read those first before you continue. In this part we’ll apply what we have learned so far and build a very basic instrumentation CPU profiler.

Part III. DIY Instrumentation CPU Profiler

Instrumentation in Ruby

In Part I, we learned that instrumentation profiling tools make use of hooks which can either be installed by the profiling tools or provided by the language itself to do their job. Luckily, Ruby already has these hooks baked into it and gives us the nifty TracePoint class to make use of these hooks.

TracePoint API

Ruby VM emits a bunch of events at various points while executing your code. Ruby allows you to enable callbacks for these events using the TracePoint class. The API documentation of TracePoint has a list of all the events you could listen to, but we’re interested in only two of those for now:

:call , which is triggered when a ruby method gets called :return , which is triggered when a ruby method returns.

We can use the TracePoint.new method to create a tracepoint object and pass it an event type to listen to and a block of code which should get executed when the event gets triggered. Here’s the code to create a tracepoint for :call and :return events:

trace = TracePoint . new ( :call , :return ) do | trace | puts "Method: #{ trace . method_id } , event: #{ trace . event } " end

The argument trace which is passed into the callback code block gives you access to a bunch of tracepoint attributes which can tell you more about the event that got triggered. In our example, we’re going to need the method_id attribute which gives us the method name and the event attribute which gives us the name of the event which gets triggered. The callback code block above will print the method name and the type of the triggered event (call/return), every time a Ruby method gets called or returned.

After you create the tracepoint, you need to enable it:

trace . enable

Once the tracepoint is enabled, we can create a method and call it, and see if the callback that we registered gets executed:

def hello end hello #=> Method: hello, event: call #=> Method: hello, event: return

Great, our callback got executed twice as expected.

diy_prof gem

It’s finally time to start putting together all that we’ve learned and build a RubyGem. We’re going to call this gem diy_prof . The source for this gem is on GitHub and you can find it here.

Now let’s build the gem from scratch using bundler:

bundle gem diy_prof

This creates a skeleton project which we’ll build upon. Next, we’ll create a TimeHelpers module which will contain the methods cpu_time and wall_time which we wrote in Part II:

# lib/diy_prof/time_helpers.rb module DiyProf::TimeHelpers # These methods make use of `clock_gettime` method introduced in Ruby 2.1 # to measure CPU time and Wall clock time. def cpu_time Process . clock_gettime ( Process :: CLOCK_PROCESS_CPUTIME_ID , :microsecond ) end def wall_time Process . clock_gettime ( Process :: CLOCK_MONOTONIC , :microsecond ) end end

We also need to require this file in our gem’s main file lib/diy_prof.rb :

# lib/diy_prof.rb require 'diy_prof/version' require 'diy_prof/time_helpers' # Rest of the original file ..

Next, we’ll use the TracePoint API in our gem to print the time (using our TimeHelpers) when a :call or :return event occurs.

# lib/diy_prof/tracer.rb module DiyProf class Tracer include TimeHelpers def initialize ( clock_type : :cpu ) @tracepoint = TracePoint . new ( :call , :return ) do | trace | time = clock_type == :wall ? wall_time : cpu_time printf ( "%-20s:%-20s%-20s

" , time , trace . event , trace . method_id ) end end def enable @tracepoint . enable end def disable @tracepoint . disable end end end

Also, we’ll require this file from our gem’s main file too:

# lib/diy_prof.rb require 'diy_prof/version' require 'diy_prof/time_helpers' require 'diy_prof/tracer' # Rest of the original file ..

Now let’s write an example script to test what we’ve built so far. We’ll use the same example we saw in Part I:

# examples/demo.rb $: << File . join ( File . dirname ( __FILE__ ), "../lib" ) require 'diy_prof' ### Begin sample program ### def main 3 . times do find_many_square_roots find_many_squares end end def find_many_square_roots 5000 . times { | i | Math . sqrt ( i )} end def find_many_squares 5000 . times { | i | i ** 2 } end ### End sample program ### tracer = DiyProf :: Tracer . new ( clock_type : :cpu ) tracer . enable main tracer . disable

If we run the above Ruby program, we should get the following output:

70508 :return enable 70587 :call main 70604 :call find_many_square_roots 73225 :return find_many_square_roots 73256 :call find_many_squares 74527 :return find_many_squares 74562 :call find_many_square_roots 77042 :return find_many_square_roots 77064 :call find_many_squares 77653 :return find_many_squares 77690 :call find_many_square_roots 79681 :return find_many_square_roots 79698 :call find_many_squares 80288 :return find_many_squares 80306 :return main 80326 :call disable

The first column of the output is the CPU time of the triggered events, the second column is the name of the event that got triggered(call/return) and the third column is the name of the method which got called or returned. This is similar to the output we saw in Part I where we learned how instrumentation profilers work. As you can see, we can easily recreate a call stack using this output because we know which method called which other methods. We also have the time of all call and return events, so we can also tell how much time was spent in each method. It wouldn’t be too difficult to write a script that parses this output and prints a list of methods ordered by their execution them. But that’s too boring, we can build something way more interesting than that. Watch out for the next part to learn what else we can do with this data.

Recap

We’ve learned about Ruby’s TracePoint API and how to use it to listen to method calls and returns. We’ve also taken the code we’ve written so far and put everything together to build a gem which allows you to print the execution trace of your Ruby program. In Part 4 we’ll learn how to make sense of the data that our gem collects and build some cool call graphs out of it. Thanks for reading! If you would like to get updates about subsequent blog posts in this DIY CPU profiler series, do follows us on twitter @codemancershq.

If you have any questions or feedback, feel free to drop us a mail at team@codemancers.com