In a previous post, I wrote about how to implement client-side service discovery with Consul and ASP.NET Core. It's a very useful technique if you're doing anything with containers or microservices.

In addition to registering your services, one thing that would be pretty helpful is if the service registry could track the health of each instance of a service. If a client knows that a service instance is no longer available or is unhealthy, it can avoid sending requests to that location and dealing with the resulting errors. In this post, we will define what it means for a service to be healthy and see how we can implement health checks with ASP.NET Core and Consul.

Service Health

We can think about checking the health of a service instance in a few ways. At the application level, you might consider the service to be healthy if it's running, if it can process requests, if it's not throwing exceptions, and so on. Another thing to think about is the state of the machine (or VM or container) that the service is running on. You might want to know the memory consumption, the percentage of disk space in use, or the CPU utilization. A third category of health checks to consider is the status of external dependencies. For example, can the service communicate with the database, can it interact with the caching service, and can it connect to a given 3rd party vendor? Even though the service itself is up and running, it might not be able to function correctly if the external pieces it needs are not accessible.

Consul Health Checks

As I'm writing this post, the current version of Consul (0.8.0) supports 5 different types of checks.

HTTP Interval checks - An HTTP GET request will be made at a given interval to the specified URL. Responses with 2xx status codes are considered to be passing, while all others are considered problematic. This can be easily implemented with a /status or /health endpoint in your web application.

TCP Interval checks - An attempt is made to connect to a given host/IP address and port number. If a connection can be made, then the health check status is considered passing; otherwise the status is critical. This can be useful for your own or even 3rd party applications that are listening on a port.

Script Interval checks - With this type of check, an external application is invoked, performs the coded operation, exits with an appropriate exit code, and potentially generates some output. If you have more complex logic behind your checks, then this is a great option. You can provide a script (written in bash, PowerShell, Python, etc.) and Consul will even capture the output into its UI.

Docker Interval checks - An external application within the Docker container is invoked. It is expected to perform a health check of the service running inside the container, and exit with an appropriate exit code.

Time to Live (TTL) checks - The state of the check must be updated periodically over Consul's HTTP interface, typically by some external system. So instead of Consul initiating the health checks, it will expect the status of a service to be provided within a given time period. If the TTL expires, then the status will be set to critical.
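TTL checks are the one type where the work happens outside of Consul, so here is a minimal Python sketch of what that "external system" might look like. It assumes an agent at the default local address and a hypothetical check ID (school-api-ttl); the /v1/agent/check/pass, /warn and /fail endpoints belong to Consul's agent HTTP API, and PUT is used because current Consul releases require it. Building the request is kept separate from sending it so the example can be inspected without a running agent.

```python
import urllib.request
from urllib.parse import quote

CONSUL_AGENT = "http://localhost:8500"  # assumption: default local agent address

# Maps the state we want to report to the agent endpoint that sets it.
TTL_ENDPOINTS = {"passing": "pass", "warning": "warn", "critical": "fail"}


def build_ttl_update(check_id, status, note=""):
    """Build (but do not send) the request that reports a TTL check's state."""
    action = TTL_ENDPOINTS[status]
    url = f"{CONSUL_AGENT}/v1/agent/check/{action}/{quote(check_id, safe='')}"
    if note:
        # The optional note is stored as the check's output.
        url += f"?note={quote(note)}"
    return urllib.request.Request(url, method="PUT")


def send_ttl_update(check_id, status, note=""):
    """Send the update; call this more often than the TTL expires."""
    request = build_ttl_update(check_id, status, note)
    with urllib.request.urlopen(request, timeout=3) as response:
        return response.status
```

A heartbeat loop would call something like send_ttl_update("school-api-ttl", "passing") on a timer. If the calls stop arriving before the TTL runs out, Consul marks the check as critical on its own.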

Adding health checks to ASP.NET Core

Health checks can optionally be included at service registration time. For some context, here is what the registration code from the previous post looked like.

    public static IApplicationBuilder RegisterWithConsul(this IApplicationBuilder app,
        IApplicationLifetime lifetime)
    {
        // Retrieve Consul client from DI
        var consulClient = app.ApplicationServices
            .GetRequiredService<IConsulClient>();
        var consulConfig = app.ApplicationServices
            .GetRequiredService<IOptions<ConsulConfig>>();

        // Get server address information
        var features = app.Properties["server.Features"] as FeatureCollection;
        var addresses = features.Get<IServerAddressesFeature>();
        var address = addresses.Addresses.First();
        var uri = new Uri(address);

        // Register service with consul
        var registration = new AgentServiceRegistration()
        {
            ID = $"{consulConfig.Value.ServiceID}-{uri.Port}",
            Name = consulConfig.Value.ServiceName,
            Address = $"{uri.Scheme}://{uri.Host}",
            Port = uri.Port,
            Tags = new[] { "Students", "Courses", "School" }
        };

        // Remove any stale registration before registering again
        consulClient.Agent.ServiceDeregister(registration.ID).Wait();
        consulClient.Agent.ServiceRegister(registration).Wait();

        // Deregister when the application shuts down
        lifetime.ApplicationStopping.Register(() =>
        {
            consulClient.Agent.ServiceDeregister(registration.ID).Wait();
        });

        return app;
    }

The AgentServiceRegistration class has a Checks property that can be used to supply a list of health checks to be associated with the service registration. Let's take a look at adding a few of the Consul health checks to this application.

HTTP Health Check

    var registration = new AgentServiceRegistration()
    {
        ID = $"{consulConfig.Value.ServiceID}-{uri.Port}",
        Name = consulConfig.Value.ServiceName,
        Address = $"{uri.Scheme}://{uri.Host}",
        Port = uri.Port,
        Tags = new[] { "Students", "Courses", "School" },
        Checks = new[]
        {
            new AgentCheckRegistration()
            {
                HTTP = $"{uri.Scheme}://{uri.Host}:{uri.Port}/api/health/status",
                Notes = "Checks /api/health/status on localhost",
                Timeout = TimeSpan.FromSeconds(3),
                Interval = TimeSpan.FromSeconds(10)
            }
        }
    };

This check expects there to be a /api/health/status endpoint available in the application for Consul to issue HTTP GET requests to. Also notice that you can set the timeout and check interval.

Here's what the endpoint looks like. Of course, you can add more code in the HealthController to determine what conditions result in returning Ok.

    [Route("api/[controller]")]
    public class HealthController : Controller
    {
        [HttpGet("status")]
        public IActionResult Status() => Ok();
    }

TCP Health Check

    var registration = new AgentServiceRegistration()
    {
        ID = $"{consulConfig.Value.ServiceID}-{uri.Port}",
        Name = consulConfig.Value.ServiceName,
        Address = $"{uri.Scheme}://{uri.Host}",
        Port = uri.Port,
        Tags = new[] { "Students", "Courses", "School" },
        Checks = new[]
        {
            new AgentCheckRegistration()
            {
                TCP = "localhost:8000",
                Notes = "Runs a TCP check on port 8000",
                Timeout = TimeSpan.FromSeconds(2),
                Interval = TimeSpan.FromSeconds(5)
            }
        }
    };

With this TCP check in place, Consul attempts to open a connection to port 8000 on localhost. Please note that these types of checks aren't restricted to only local connections. They can be used to check if connections can be made to any application, listening on any port at a given IP address.
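If it helps to picture what a TCP check amounts to, the sketch below does roughly the same thing by hand in Python: try to open a connection within a timeout and report passing or critical. This only illustrates the idea and is not Consul's implementation; the function name tcp_check is my own.

```python
import socket


def tcp_check(host, port, timeout=2.0):
    """Return 'passing' if a TCP connection can be opened, else 'critical'."""
    try:
        # create_connection raises OSError on refusal, DNS failure or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return "passing"
    except OSError:
        return "critical"
```

Calling tcp_check("localhost", 8000) mirrors the registration above. Nothing is sent over the connection; being able to connect is the whole test, which is why this type of check works for 3rd party applications too.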

Script Health Check

    var registration = new AgentServiceRegistration()
    {
        ID = $"{consulConfig.Value.ServiceID}-{uri.Port}",
        Name = consulConfig.Value.ServiceName,
        Address = $"{uri.Scheme}://{uri.Host}",
        Port = uri.Port,
        Tags = new[] { "Students", "Courses", "School" },
        Checks = new[]
        {
            new AgentCheckRegistration()
            {
                Script = "/path/to/script/check.py",
                Notes = "Runs check.py in the project folder",
                Timeout = TimeSpan.FromSeconds(2),
                Interval = TimeSpan.FromSeconds(5)
            }
        }
    };

The code above sets up a health check that will invoke a Python script. If the script returns an appropriate exit code, then the check has succeeded.

Here's a simple script that I used to test with.

    #! /usr/bin/env python3
    import os, sys

    print(os.getcwd())
    sys.exit(os.EX_OK)
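For script checks, Consul reads the exit code as the status: 0 is passing, 1 is warning, and any other code is critical. Here is a slightly fuller sketch that uses all three, reporting on disk usage (one of the machine-level checks mentioned earlier). The thresholds and function names are hypothetical and only there for illustration.

```python
#!/usr/bin/env python3
import shutil
import sys

# Exit codes as Consul interprets them for script checks.
PASSING, WARNING, CRITICAL = 0, 1, 2

# Hypothetical thresholds, in percent of disk space used.
WARN_AT = 80.0
CRITICAL_AT = 95.0


def classify(percent_used):
    """Map a disk usage percentage to a Consul exit code."""
    if percent_used >= CRITICAL_AT:
        return CRITICAL
    if percent_used >= WARN_AT:
        return WARNING
    return PASSING


def disk_percent_used(path="/"):
    """Percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def main():
    percent = disk_percent_used()
    # Whatever is printed becomes the check output shown in the Consul UI.
    print(f"disk usage: {percent:.1f}%")
    return classify(percent)

# When saved as a check script, finish the file with: sys.exit(main())
```

Keeping the decision in classify makes the thresholds easy to test without touching a real disk.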

With these health checks in place, anyone in your organization can easily review the status of your registered services in the Consul UI.

For programmatic access to health check status, we can leverage Consul's HTTP API directly or make use of the Consul NuGet package.

    var _consulClient = new ConsulClient(c =>
    {
        var uri = new Uri(_configuration["consulConfig:address"]);
        c.Address = uri;
    });

    // List the services registered with the local agent
    var services = _consulClient.Agent.Services().Result.Response;
    foreach (var service in services)
    {
        // Look up the health checks for this service by name
        var checks = _consulClient.Health
            .Checks(service.Value.Service)
            .Result;
        foreach (var checkResult in checks.Response)
        {
            Console.WriteLine($"{checkResult.ServiceID} - {checkResult.Status.Status}");
        }
    }
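Going to the HTTP API directly works from any language. As a rough sketch in Python, the /v1/health/checks/<service> endpoint returns a JSON array of checks, and the fields used below (ServiceID, Name, Status) are part of that response. The agent address, the service name and the sample payload are assumptions made up for the example.

```python
import json
import urllib.request
from urllib.parse import quote

CONSUL_AGENT = "http://localhost:8500"  # assumption: default local agent address


def summarize_checks(payload):
    """Turn the JSON body of /v1/health/checks/<service> into (ServiceID, Name, Status) tuples."""
    return [(c["ServiceID"], c["Name"], c["Status"]) for c in json.loads(payload)]


def fetch_checks(service_name):
    """Query the health endpoint for one service (needs a running agent)."""
    url = f"{CONSUL_AGENT}/v1/health/checks/{quote(service_name, safe='')}"
    with urllib.request.urlopen(url, timeout=3) as response:
        return summarize_checks(response.read())


# Trimmed sample response with made-up values, for illustration only.
SAMPLE = """[
  {"Node": "dev-box", "CheckID": "service:school-api-5000",
   "Name": "Service 'school-api' check", "Status": "passing",
   "ServiceID": "school-api-5000", "ServiceName": "school-api"}
]"""

for service_id, name, status in summarize_checks(SAMPLE):
    print(f"{service_id} - {status}")  # prints "school-api-5000 - passing"
```

With an agent running, fetch_checks("school-api") would return the same kind of tuples for the live checks.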

Conclusion

In this post, we covered how to register health checks with Consul and the PlayFab Consul NuGet package.

Consul is just one of many options for setting up health checks in your application. As I'm writing this, it looks like the ASP.NET team is working on their own version of health check support.