TLDR: you can find it on GitHub here.

When it comes to understanding what your code is doing under the hood and what it's wasting its time on, stackprof is an incredible tool. We've been using it to slim down our APIs and ensure we're not burning compute time on inefficient code and mistakes — with great results. It's a fantastic companion to NewRelic: you look for slow aggregate traces in NR and then perform a deeper dive with stackprof.

stackprof generates some cool data like this:

» be stackprof /Users/callumj/Downloads/stackprof_weboutput.dump --text --sort-total
==================================
  Mode: cpu(1000)
  Samples: 60 (17.81% miss rate)
  GC: 2 (3.33%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
        57  (95.0%)           1   (1.7%)     SomeController#index
        57  (95.0%)           1   (1.7%)     Core.run
.......

In this stackprof output you can see that, over the sample, my code spent 95% of its time running something I wrote called Core.run. I wonder what that is doing down there?

» be stackprof /Users/callumj/Downloads/stackprof_weboutput.dump --text --method "Core.run"
Core.run (/Users/callumj/dev/scrap/rackapp/core.rb:4)
  samples:     1 self (1.7%)  /     57 total (95.0%)
  callers:
      57  (  100.0%)  SomeController#index
  callees (56 total):
      55  (   98.2%)  #<Module:0x00007f8b0b4b0490>.method_missing
       1  (    1.8%)  #<Module:0x00007f8b0b1b8620>.const_missing
  code:
                                   |     4  | def self.run
                                   |     5  |   foo = []
                                   |     6  |   100.times do |i|
                                   |     7  |     foo << i
                                   |     8  |   end
   55  (91.7%)                     |     9  |   x = Faraday.get("https://media.giphy.com/media/2wYxItOyFUtjcQxCtc/giphy.gif").body
    2   (3.3%) /     1   (1.7%)    |    10  |   Digest::SHA256.hexdigest(x)
                                   |    11  | end

Cool! stackprof is telling me that the offending Core.run is just really slow because of some weird call I make to Faraday.

So wouldn’t it be neat if this was in Sidekiq? There haven’t been many gems out there like the Rack middleware integration that stackprof offers (you just plug it in and it begins aggregating your web requests) but targeted at Sidekiq. I wanted something like this that you plug into Sidekiq so you can easily see what your workers are doing under the hood.
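For reference, stackprof's bundled Rack middleware is activated roughly like this (options follow the stackprof README; the save directory is an assumption for illustration):

```ruby
# config/application.rb in a Rails app — stackprof's built-in Rack middleware.
require "stackprof"

Rails.application.config.middleware.use(
  StackProf::Middleware,
  enabled: true,         # turn profiling on
  mode: :cpu,            # sample CPU time
  interval: 1000,        # sample every 1000 microseconds
  save_every: 5,         # write a dump to disk after every 5 requests
  path: "tmp/stackprof"  # assumed directory for the .dump files
)
```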

I’ve just released sidekiq_profiling_middleware, a server-side middleware that wraps your workers in StackProf or MemoryProfiler and outputs the respective profiles keyed by the worker name and a timestamp. You can target specific problematic workers so you can focus your investigations. It includes an S3 integration for deployments where your Sidekiq servers run anywhere, so you don’t have to collect dumps from a wild herd of servers. This should be useful if you run Sidekiq in a Docker environment like Kubernetes or ECS.

You can activate it like this:

require "sidekiq_profiling_middleware/stack_prof"

Sidekiq.configure_server do |config|

  ....

  config.server_middleware do |chain|
    chain.add(
      SidekiqProfilingMiddleware::StackProf,
      only: [ThisReallySlowWorker].to_set,
      s3_bucket: "cj-profiling",
      output_prefix: "stackprof/#{ENV["GIT_SHA"]}_",
    )
  end
end

In this case I’ve set up stackprof tracing of one specific worker, with the dumps saved to an S3 bucket under a key prefixed by stackprof/ and the Git SHA of the build I’m running (assuming I publish my Git SHA as an environment variable — which many do with Docker images). From there I can just download the dumps from S3 and analyze them offline with the stackprof binary provided in the regular stackprof repo.
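The offline analysis step might look like this (bucket and prefix match the config above; the local directory and dump filenames are placeholders):

```shell
# Pull every dump under the configured prefix down from S3,
# then inspect them locally with the stackprof CLI.
aws s3 cp s3://cj-profiling/stackprof/ ./dumps/ --recursive
stackprof ./dumps/*.dump --text --sort-total
```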

You’ll need to ensure the Sidekiq process you run this middleware on has its concurrency set to 1. This is because many of the Ruby tracing tools that expose the information stackprof relies on aren’t thread-aware, so if two different workers run at the same time in one Sidekiq process you’d see their samples mixed together. To avoid changing the concurrency of your main Sidekiq setup, you’ll want to run this middleware on a dedicated Sidekiq instance.
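One way to do that (a sketch — the queue name here is hypothetical) is to route the profiled worker to its own queue and start a separate single-threaded Sidekiq process that consumes only that queue:

```shell
# Dedicated Sidekiq process: one thread (-c 1), consuming only the
# queue the profiled worker is routed to ("profiling" is a made-up name,
# set via sidekiq_options queue: "profiling" on the worker class).
bundle exec sidekiq -c 1 -q profiling
```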

sidekiq_profiling_middleware is pretty fresh, so please feel free to try it out, see what breaks, and tell me what you’d want from it next.

P.S. you may also want to check out rbspy