Just a few weeks ago a coworker of mine sent me this conference talk by Kelsey Hightower. I recommend watching the talk and following him on Twitter. To summarize, health checks should live inside the app instead of just existing as external bash scripts, smoke tests, etc. Where better to determine the health of an app than in the app itself?

We can accomplish this by serving a /healthz endpoint from the service itself that responds with an HTTP status code: 200 for healthy or 500 for unhealthy. Kubernetes (or whatever else) can then use this to determine whether an instance needs to be restarted.

For the most part, this is accomplished using the method shown in this GitHub example repo. The /healthz endpoint is defined in the main package, and the handler function is imported. All of the health checks live inside this handler function and run every time the endpoint is requested.

func (h *handler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	response := Response{
		Hostname: h.hostname,
		Metadata: h.metadata,
	}

	statusCode := http.StatusOK
	errors := make([]Error, 0)

	err := h.dc.Ping()
	if err != nil {
		errors = append(errors, Error{
			Type:        "DatabasePing",
			Description: "Database health check.",
			Error:       err.Error(),
		})
	}

	response.Errors = errors
	if len(response.Errors) > 0 {
		statusCode = http.StatusInternalServerError
		for _, e := range response.Errors {
			log.Println(e.Error)
		}
	}

	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(statusCode)
	data, err := json.MarshalIndent(&response, "", " ")
	if err != nil {
		log.Println(err)
	}
	w.Write(data)
}

This method is simple and relatively elegant: if we can connect to the database from within our app, the health check succeeds. But now we’re repeating ourselves. Why try to connect to the database again in the handler function when our app already connected to the database on startup?

We’ve now moved the health checks into the same repository while still keeping them independent of our normal code execution. This might not be reverse-engineered, but it also doesn’t complete the thought of moving away from reverse-engineered bash scripts to gauge health.

My solution is to report the health of the app alongside error handling.

// run me locally
package main

import (
	"fmt"
	"os"

	"github.com/natethinks/healthz"
)

func main() {
	healthz.Serve("localhost:8080", "/healthz")

	_, err := os.Open("filename.ext")
	if err != nil {
		healthError := healthz.HealthError{
			Error:       err.Error(),
			Description: "File read health check.",
		}
		healthz.NewFatalError(healthError)
	}

	fmt.Println("Serving healthz on http://localhost:8080/healthz")

	// block forever to keep the endpoint running
	forever := make(chan bool)
	<-forever
}

First we call healthz.Serve(), which starts an HTTP server. Our /healthz endpoint will return a status 200 until a fatal error has been reported. Then we attempt to do something critical to our program; in this case, opening a file. If this fails, we record a new fatal error. The endpoint will then return a status 500 along with a JSON body containing an array of all reported errors.

Using this method, health checks are no longer independent of the code. A health check can be added in the error handling of any function we deem mission critical, and health status can be changed in real time, right alongside the normal execution of our code.

You can find the code for package healthz on GitHub. Feel free to submit issues or pull requests.

Follow me on Twitter: Nate Meyers