The Kamon tracing feature described in the previous post is useful, but in some cases it is not enough to analyze performance problems. In this part, I will monitor the lowest-level components of the application: actors and dispatchers.

With these low-level metrics, you need to know in advance what you want to monitor; otherwise, you will end up with a lot of useless data. What you actually monitor should depend on the technology stack and architecture of your application.

Let's examine some examples of the metrics that could be useful to observe.

Actors

Using Kamon, you can check the metrics of each individual actor in every actor system started in your application. Since actors are usually launched in huge numbers, it is impractical to observe them all. Fortunately, you can filter them with the includes and excludes parameters under kamon.metric.filters:

kamon.metric.filters.akka-actor {
  includes = ["sandbox-actor-system/**"]
  excludes = [
    "sandbox-actor-system/system/IO**",
    "sandbox-actor-system/user/Stream**",
    "sandbox-actor-system/system/transports**"
  ]
}
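To get a feeling for which actors such a filter selects, here is a minimal Java sketch of include/exclude matching. It is not Kamon's code: it assumes the usual glob convention where `*` stays within one path segment and `**` also crosses `/`, and that an actor is tracked when it matches at least one include and no exclude. The class name `ActorFilterSketch` and the actor paths in `main` are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: a simplified model of glob-based include/exclude filtering.
final class ActorFilterSketch {

    // Translate a glob into a regex: "**" matches across '/', "*" and "?" do not.
    static Pattern globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '*') {
                boolean doubleStar = i + 1 < glob.length() && glob.charAt(i + 1) == '*';
                if (doubleStar) {
                    regex.append(".*");
                    i++; // skip the second '*'
                } else {
                    regex.append("[^/]*");
                }
            } else if (c == '?') {
                regex.append("[^/]");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matchesAny(List<String> globs, String name) {
        for (String glob : globs) {
            if (globToRegex(glob).matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }

    // Assumed rule: tracked = matches some include and matches no exclude.
    static boolean isTracked(List<String> includes, List<String> excludes, String name) {
        return matchesAny(includes, name) && !matchesAny(excludes, name);
    }

    public static void main(String[] args) {
        List<String> includes = Arrays.asList("sandbox-actor-system/**");
        List<String> excludes = Arrays.asList(
                "sandbox-actor-system/system/IO**",
                "sandbox-actor-system/user/Stream**",
                "sandbox-actor-system/system/transports**");

        // Hypothetical actor paths, used only to exercise the filter.
        String[] actors = {
                "sandbox-actor-system/user/user-service",
                "sandbox-actor-system/system/IO-TCP",
                "sandbox-actor-system/user/StreamSupervisor-0"
        };
        for (String actor : actors) {
            System.out.println(actor + " tracked: " + isTracked(includes, excludes, actor));
        }
    }
}
```

With the filter from this post, only the first of the three hypothetical paths would be tracked; the other two fall under the IO** and Stream** excludes.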

For each actor you have access to 4 metrics:

errors

mailbox-size

processing-time

time-in-mailbox
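If the difference between these metrics is not obvious, the following Java sketch models a single actor with a plain queue. It is a conceptual illustration, not Kamon's instrumentation: it assumes time-in-mailbox is the time between enqueueing and dequeueing a message, processing-time is the time spent in the message handler, and errors counts handler failures. The class `ActorMetricsSketch` and the message names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

// Illustrative only: a toy actor that records the four per-actor metrics.
final class ActorMetricsSketch {

    private static final class Envelope {
        final String message;
        final long enqueuedAt;

        Envelope(String message, long enqueuedAt) {
            this.message = message;
            this.enqueuedAt = enqueuedAt;
        }
    }

    private final ArrayDeque<Envelope> mailbox = new ArrayDeque<>();
    private final LongSupplier clock; // injected so the timings can be reproduced

    final List<Long> timeInMailbox = new ArrayList<>();
    final List<Long> processingTime = new ArrayList<>();
    long errors = 0;

    ActorMetricsSketch(LongSupplier clock) {
        this.clock = clock;
    }

    // Stamp the message with its enqueue time.
    void enqueue(String message) {
        mailbox.add(new Envelope(message, clock.getAsLong()));
    }

    // mailbox-size: messages waiting to be processed.
    int mailboxSize() {
        return mailbox.size();
    }

    // Dequeue one message, run the handler, and record the timings.
    void processNext(Consumer<String> receive) {
        Envelope envelope = mailbox.poll();
        if (envelope == null) {
            return; // nothing to process
        }
        long start = clock.getAsLong();
        timeInMailbox.add(start - envelope.enqueuedAt);
        try {
            receive.accept(envelope.message);
        } catch (RuntimeException e) {
            errors++;
        } finally {
            processingTime.add(clock.getAsLong() - start);
        }
    }

    public static void main(String[] args) {
        ActorMetricsSketch actor = new ActorMetricsSketch(System::nanoTime);
        actor.enqueue("create-user");
        actor.enqueue("broken-message");
        System.out.println("mailbox-size: " + actor.mailboxSize());

        actor.processNext(msg -> { });
        actor.processNext(msg -> { throw new IllegalStateException(msg); });

        System.out.println("errors: " + actor.errors);
        System.out.println("time-in-mailbox (ns): " + actor.timeInMailbox);
        System.out.println("processing-time (ns): " + actor.processingTime);
    }
}
```

A growing mailbox-size together with a rising time-in-mailbox usually means messages arrive faster than the actor handles them, while a high processing-time points at the handler itself.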

Some of the metrics, like mailbox-size, could be presented globally (for all actors):

Actors with a strategic role deserve dedicated graphs:

Error count could be calculated for all actors in the cluster:

Dispatchers

The same rules apply to the dispatchers. For filtering you should use:

kamon.metric.filters.akka-dispatcher { includes = ["sandbox-actor-system/**"] excludes = [] }

This time I performed 3 different simulations:

Thread.sleep() enabled with 3,000 samples

Thread.sleep() disabled with 3,000 samples

Thread.sleep() disabled with 30,000 samples

Datadog makes it easy to analyze anomalies and trends, here for the dispatcher metrics:

You can clearly see that the number of running threads depends on whether Thread.sleep() is used. Also, with approximately the same number of running threads, Akka can process 10 times more samples (see the processed tasks in simulation 3).
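Why blocking calls show up so clearly in thread-level metrics can be illustrated with a plain java.util.concurrent thread pool. This is only an analogy, assuming a fixed-size pool; it is neither an Akka dispatcher nor the sandbox code, and the pool size and task count are arbitrary. A latch stands in for Thread.sleep() so the result does not depend on timing.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative only: blocked workers saturate a pool while the remaining tasks queue up.
final class BlockedDispatcherSketch {

    // Returns {activeThreads, queuedTasks, completedTasks}.
    // The first two values are sampled while every worker thread is blocked.
    static long[] run(int poolSize, int taskCount) {
        if (taskCount < poolSize) {
            throw new IllegalArgumentException("taskCount must be >= poolSize");
        }
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch allWorkersBusy = new CountDownLatch(poolSize);
        CountDownLatch release = new CountDownLatch(1);

        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                allWorkersBusy.countDown();
                try {
                    release.await(); // stands in for a blocking call such as Thread.sleep()
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            allWorkersBusy.await();               // every worker now holds a blocked task
            long active = pool.getActiveCount();  // threads busy doing nothing useful
            long queued = pool.getQueue().size(); // tasks that cannot start yet
            release.countDown();                  // unblock; the backlog drains quickly
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
            return new long[] { active, queued, pool.getCompletedTaskCount() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sampling the pool", e);
        }
    }

    public static void main(String[] args) {
        long[] result = run(4, 10);
        System.out.println("active threads while blocked: " + result[0]);
        System.out.println("queued tasks while blocked:   " + result[1]);
        System.out.println("completed tasks at the end:   " + result[2]);
    }
}
```

While the workers are blocked, all pool threads count as active and every other task waits in the queue; once nothing blocks, the same threads finish the whole backlog.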

Summary

This was the last part of the Akka monitoring series. I hope it helps you with the monitoring setup. One more thing to note about Kamon is that its architecture is very plugin-oriented. You can integrate many different technologies or write your own module, which can be easily connected to the Kamon core.

To play more with the above metrics, you can clone the repository and check out the part3 tag. Then, as in the instructions in part 2, run Vagrant to set up the cluster and the Gatling UserCreationInClusterSimulation to generate some load.

This blog post is part 3 of a 3-part series; see part 1 | part 2.