Introduction

Locating the source of web application performance problems and response latency can be tricky in projects with complex infrastructure.

That is why having monitoring for all the services is crucial.

Sometimes performance degradation originates one step ahead of the main application, because the web server lacks capacity. Since Puma is the most popular web server for running Ruby web applications, let me explain how to implement and tune simple monitoring for it.

Puma control application

Puma has a built-in control application for managing the web server and querying its internal statistics. The control application is activated by the following code, called from the Puma configuration file puma/config.rb:



activate_control_app 'tcp://127.0.0.1:9000', auth_token: 'top_secret' # or without any params, just activate_control_app

To serve it, Puma runs a separate web server instance with a dedicated Rack backend application.

This backend returns the worker statistics data in JSON format:



# GET /stats?token=top_secret
{
  "workers" => 2,
  "phase" => 0,
  "booted_workers" => 2,
  "old_workers" => 0,
  "worker_status" => [
    {
      "pid" => 13,
      "index" => 0,
      "phase" => 0,
      "booted" => true,
      "last_checkin" => "2019-03-31T13:04:28Z",
      "last_status" => { "backlog" => 0, "running" => 5, "pool_capacity" => 5, "max_threads" => 5 }
    },
    {
      "pid" => 17,
      "index" => 1,
      "phase" => 0,
      "booted" => true,
      "last_checkin" => "2019-03-31T13:04:28Z",
      "last_status" => { "backlog" => 0, "running" => 5, "pool_capacity" => 5, "max_threads" => 5 }
    }
  ]
}
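To make the polling step concrete, here is a minimal sketch of fetching these statistics from Ruby using only the standard library. The helper names `puma_stats_uri` and `fetch_puma_stats` are hypothetical, the host and port mirror the example configuration above, and I assume the control application reads the auth token from the `token` query parameter, the way pumactl sends it.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Builds the URL of the control app stats endpoint.
# Host, port and token defaults are assumptions taken from the example config.
def puma_stats_uri(host: '127.0.0.1', port: 9000, token: nil)
  query = token ? URI.encode_www_form(token: token) : nil
  URI::HTTP.build(host: host, port: port, path: '/stats', query: query)
end

# Requests the stats and returns them as a parsed Hash.
# Raises if the control app does not answer with a 2xx status.
def fetch_puma_stats(**options)
  response = Net::HTTP.get_response(puma_stats_uri(**options))
  unless response.is_a?(Net::HTTPSuccess)
    raise "Puma control app returned HTTP #{response.code}"
  end

  JSON.parse(response.body)
end
```

With a running Puma instance, `fetch_puma_stats(token: 'top_secret')` should return a hash shaped like the output above.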

The exact response schema depends on the Puma configuration: when Puma runs in clustered mode (with more than one worker), the output describes each worker.

If Puma runs in non-clustered mode, the result describes only the single worker, as a flat structure:



{ "backlog" => 0, "running" => 5, "pool_capacity" => 4, "max_threads" => 5 }
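Since the two modes return different shapes, a monitoring script has to normalize them before use. Below is a minimal sketch of one way to do it; the method name `worker_metrics` and the choice to report the single process under index 0 are my own assumptions, not part of Puma.

```ruby
require 'json'

METRIC_KEYS = %w[backlog running pool_capacity max_threads].freeze

# Returns an array with one metrics hash per worker, whatever the Puma mode.
def worker_metrics(stats)
  if stats.key?('worker_status')
    # Clustered mode: the metrics live under each worker's "last_status".
    stats['worker_status'].map do |worker|
      worker['last_status'].slice(*METRIC_KEYS).merge('index' => worker['index'])
    end
  else
    # Non-clustered mode: the metrics are at the top level of the response.
    [stats.slice(*METRIC_KEYS).merge('index' => 0)]
  end
end
```

Either way, the caller gets a uniform list and can treat a single process as a cluster of one.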

Anyway, there are meaningful metrics for monitoring purposes, such as:

max_threads - preconfigured maximum number of worker threads

running - the number of running threads (spawned threads) for any Puma worker

pool_capacity - the number of requests that the server is capable of taking right now. More details are here.

backlog - the number of connections in that worker's "todo" set waiting for a worker thread

Using them, we can build an automated monitoring system that periodically checks these values for any Puma cluster and shows how the metrics change over time, as shown below:



A decreasing pool_capacity means that the load on the server is rising. This is the point at which request processing latency starts to grow because of capacity issues.

Yabeda framework

In the Ruby world, we have an extensible framework for collecting and exporting metrics, called Yabeda.

It provides a simple DSL for describing the metrics and fetching their values with a simple lambda function.

For now, the Yabeda framework provides out-of-the-box solutions for monitoring Rails and Sidekiq.

Prometheus

Monitoring implies periodically storing metric values for later analysis. One of the most popular and suitable solutions for that is Prometheus. Since Prometheus implements the "HTTP pull model," it expects the monitored service to expose an endpoint that serves the metric values in a specific format.

The Yabeda framework allows exporting metrics with the help of the Prometheus Exporter.

Yabeda for Puma

Now I am going to introduce one more monitoring solution in the Yabeda family: the Puma monitoring plugin.

All you need to do is load the yabeda-puma-plugin gem and configure the Puma web server with the following lines in the puma/config.rb file:



activate_control_app
plugin :yabeda

That's it. Once the Puma web server starts, the plugin does all the work of collecting the metrics.

Putting things together

Here is the overall architecture of the Puma monitoring solution:



The plugin gets all the metrics from the Puma control application statistics and registers them in the Yabeda framework. The values can then be exported by the Prometheus Rack middleware, which serves the /metrics path of the web application and provides the metric values in a Prometheus-friendly format. Here is a sample response of the metrics endpoint for Puma configured with two workers:



GET /metrics
puma_backlog{index="0"} 0
puma_backlog{index="1"} 0
puma_running{index="0"} 5
puma_running{index="1"} 5
puma_pool_capacity{index="0"} 1
puma_pool_capacity{index="1"} 5
puma_max_threads{index="0"} 5
puma_max_threads{index="1"} 5
puma_workers 2
puma_booted_workers 2
puma_old_workers 0
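To show how per-worker statistics map onto these lines, here is a small illustrative sketch that renders worker metrics in the same text format. It is not the exporter's actual implementation (a real exporter typically also emits HELP and TYPE comment lines), and the method name `render_puma_metrics` is hypothetical.

```ruby
# Renders per-worker metrics as Prometheus-style text lines,
# one line per metric and worker, labelled with the worker index.
def render_puma_metrics(workers)
  metric_names = %w[backlog running pool_capacity max_threads]

  metric_names.flat_map do |name|
    workers.map do |worker|
      %(puma_#{name}{index="#{worker['index']}"} #{worker[name]})
    end
  end.join("\n")
end

workers = [
  { 'index' => 0, 'backlog' => 0, 'running' => 5, 'pool_capacity' => 1, 'max_threads' => 5 },
  { 'index' => 1, 'backlog' => 0, 'running' => 5, 'pool_capacity' => 5, 'max_threads' => 5 }
]

puts render_puma_metrics(workers)
```

Each worker becomes a separate time series distinguished by the index label, which is what makes per-worker charts possible later.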

Visualization

Depending on your needs, the data can be visualized in many ways; here is an example of basic summarized metric values:



This diagram shows the overall metric values for all the Puma workers. The indicators can also be displayed separately for each worker or for each Puma cluster instance.

"Application busy" metric

Looking at all the raw Puma metrics is not a convenient way to get a quick overview of the system as a whole. A more suitable way is to calculate a composite metric that describes the overall workload of the web server as a percentage. Let's call it "Application busy" or just busy-metric. The formula evaluates the percentage of overall workload:



(1 - pool_capacity / max_threads) * 100
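As a worked example, here is a minimal sketch of the formula in Ruby. The method name `busy_percent` is hypothetical, and I assume that for a cluster both pool_capacity and max_threads are summed across workers before applying the formula.

```ruby
# Computes the "Application busy" percentage for a list of worker metrics.
# Assumption: capacity and thread limits are summed over all workers.
def busy_percent(workers)
  total_capacity = workers.sum { |w| w['pool_capacity'] }
  total_threads  = workers.sum { |w| w['max_threads'] }
  return 0.0 if total_threads.zero? # avoid dividing by zero when no worker reported yet

  (1 - total_capacity.to_f / total_threads) * 100
end

workers = [
  { 'pool_capacity' => 1, 'max_threads' => 5 },
  { 'pool_capacity' => 5, 'max_threads' => 5 }
]

puts busy_percent(workers).round(1)
```

For two workers with pool_capacity 1 and 5 and max_threads 5 each, this gives (1 - 6 / 10) * 100 = 40, so the cluster is 40% busy.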

This gives us a single chart instead of several:



The busy-metric is more informative for an overview of the system's health. It shows the actual workload of the whole Puma cluster in a friendlier way. When the busy-metric spikes, it means that the application is under high load and the Puma web server probably needs tuning.

The busy-metric makes it easy to detect a problem state, but for investigating a specific incident the raw metrics might be more helpful.

Metrics playground

The Yabeda framework supplies an example project with all the infrastructure set up for monitoring Sidekiq, Rails, and Puma. It is easy to run with docker-compose.

Wrapup

Setting up the monitoring infrastructure helps you build more stable and maintainable software, and sleep calmly at night.

Monitoring is made easy with the Yabeda framework.

Check out yabeda-puma-plugin to get ready to monitor Puma!