If you’re anything like me, you learn something new and you’re ready to completely revolutionize the world with what you’ve learned. I’ve worked with Docker and Kubernetes for the past year and have slept with a microservices architecture textbook under my pillow, so I’m ready to take any piece of software, turn it into a microservice, and deploy it!

With the rise of automation in container orchestration, it is easy to fall into the trap of believing “Kubernetes will handle it” when architecting your system. The wisdom I hope you take away from reading this is that Kubernetes can only do so much; as developers, we have to hold up our end of the agreement and give the system everything it needs to automate effectively. Our lesson learned, the first of hopefully many to come, involves the graceful termination of pods. Disclaimer: the terms pod and container are used interchangeably here; the assumption moving forward is that the pod is simple and has only a single user-defined container.

What happens when pods are terminated?

Terminating pods is a routine task on Kubernetes systems; scaling resources and updating deployments both involve terminating pods, for example. As for how pods are terminated, according to the Kubernetes documentation:

Because Pods represent running processes on nodes in the cluster, it is important to allow those processes to gracefully terminate when they are no longer needed (vs being violently killed with a KILL signal and having no chance to clean up). Users should be able to request deletion and know when processes terminate, but also be able to ensure that deletes eventually complete. When a user requests deletion of a Pod, the system records the intended grace period before the Pod is allowed to be forcefully killed, and a TERM signal is sent to the main process in each container. Once the grace period has expired, the KILL signal is sent to those processes, and the Pod is then deleted from the API server. If the Kubelet or the container manager is restarted while waiting for processes to terminate, the termination will be retried with the full grace period.

To summarize, Kubernetes sends the process running as PID 1 (the main process) in your pod’s container a SIGTERM signal. This is its end of the bargain. By default, Kubernetes waits for a grace period of 30 seconds for the process to end before sending it a SIGKILL. What we do from here is really up to us, but before we proceed,
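That 30-second default can be tuned per pod if your shutdown genuinely needs more time. As a sketch (the pod and image names here are illustrative, not from this article), the grace period lives in the pod spec:

```yaml
# Sketch: overriding the default 30-second termination grace period.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  # Kubernetes waits this long after SIGTERM before sending SIGKILL
  terminationGracePeriodSeconds: 60
  containers:
  - name: app
    image: example-app:latest
```

Raising this value only buys time; it does nothing for you unless the process actually handles the SIGTERM.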

What is the main process?

To understand what the main process is, it is useful to have some Docker knowledge. You’ve taken your app and packed it all into a Docker image. You run docker run on the image, but nothing happens. You realize that you need an ENTRYPOINT or CMD instruction defined in your Dockerfile that contains your startup logic. You try again, and like magic your container starts up. You just defined your container’s main process. Once you spin up your image as a container, you have the freedom to interact with it. Run docker exec -it <CONTAINER_NAME/ID> /bin/bash to get a shell inside the container, then run ps: whatever is running as PID 1 should be the startup logic you defined in your Dockerfile. Now, why is this important? Well firstly,
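As a minimal sketch (the base image, working directory, and file names here are assumptions, not from this article), a Dockerfile for a Node app might look like this:

```dockerfile
FROM node:18-alpine
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install
COPY . .
# This instruction defines the container's main process (PID 1)
CMD ["node", "index.js"]
```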

There is no wrong way to start a container

Containers are intended to be flexible environments from which you can run your application, and only you and your team know what is needed to run it. This is likely why Kubernetes itself doesn’t go beyond sending a SIGTERM to the main process of the container. This is where, as developers, we have to maintain the core fundamentals of software engineering. You need to make sure that when that SIGTERM comes, your application handles it and ensures that all connections and processes are closed gracefully. Let’s explore an example of a NodeJS server app and a few scenarios of how we can start up the container with its ENTRYPOINT or CMD instruction.

Scenario 1: node approach

Often enough, if you have a simple application, all it needs to get started is node index.js. This is often lauded as the safest method of starting your container because of its simplicity.
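One detail worth noting with this approach (a general Dockerfile behavior, not something specific to this article’s app): the exec form of CMD matters for signal delivery, because the shell form wraps your command in /bin/sh, which then becomes PID 1 instead of node:

```dockerfile
# Exec form: node itself runs as PID 1 and receives SIGTERM directly
CMD ["node", "index.js"]

# Shell form (avoid): /bin/sh -c "node index.js" becomes PID 1,
# and sh does not forward SIGTERM to the node child
# CMD node index.js
```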

Scenario 2: npm approach

Your application may need some logic completed at startup, so your team uses npm to accomplish this. Capturing what is necessary to start your NodeJS application in a package.json file is a very common practice. The start command could be bound to node index.js, and off you go.
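A minimal sketch of such a package.json (the package name is illustrative; node index.js is the start command from Scenario 1):

```json
{
  "name": "example-app",
  "version": "1.0.0",
  "scripts": {
    "start": "node index.js"
  }
}
```

The container’s CMD instruction then becomes ["npm", "start"], making npm the main process.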

Scenario 3: The bash script approach

This is the approach I would like to dive deeper into, because it covers a practice that my team learned to use carefully moving forward. Suppose your application requires some initialization at runtime that you capture in a bash script, start.sh. Your Dockerfile’s CMD instruction runs start.sh at startup, whose contents may be something like below,

#!/bin/bash
some_command
export IMPORTANT_VARIABLE=`some_other_command`
npm start

This is a fairly common practice because applications often need parameters or configuration at runtime. Let’s break down what we have just done. Your container boots up and starts your bash script as PID 1. It runs some initialization steps, then runs npm start as a child process, which in turn runs our node index.js process. Executing into your Kubernetes pod, you may see something like this if you run the ps command,

PID   USER      TIME    COMMAND
  1   microuse   0:00   /bin/bash /usr/src/app/start.sh
 27   microuse   0:00   npm
 38   microuse  11:49   node index.js
 56   microuse   0:00   bash
 62   microuse   0:00   ps

Ignoring the last two processes, which are related to running the ps command, your pod has three running processes. Let’s walk through what happens when the pod is terminated by Kubernetes. First, a SIGTERM signal is sent to PID 1. The bash script has an npm child process, and node is a child process of the npm process. But how does bash handle sending SIGTERMs to its child processes? Long story short, it doesn’t. Bash doesn’t forward SIGTERM to its children unless you explicitly set up signal traps. So what happens to your pod? PID 1 is terminated and nothing is done with the other processes running in your pod; Kubernetes sees that the PID 1 process has terminated, thinks everything is good to go, and terminates your pod, effectively killing your orphaned processes. This is a violent way to shut down your pod.
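If you did want to keep npm as a child process, bash can be taught to forward the signal itself with a trap. Below is a minimal, self-contained sketch of that technique; note that it is not this article’s start.sh: a sleep command stands in for the npm child, and the script sends itself a SIGTERM after one second to simulate what Kubernetes does to PID 1.

```shell
#!/bin/bash
# Sketch: manually forwarding SIGTERM to a child process via a trap.
sleep 30 &                        # stand-in for `npm start`
child=$!

# Forward any SIGTERM we receive to the child process
trap 'kill -TERM "$child"' TERM

# Simulate Kubernetes sending SIGTERM to PID 1 after one second
( sleep 1; kill -TERM $$ ) &

wait "$child" || true             # interrupted when the signal arrives
wait "$child" || true             # reap the child the trap just killed
echo "child terminated, exiting gracefully"
```

In practice the exec approach described below is simpler, since it removes the intermediate process entirely instead of relaying signals through it.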

But you need this type of startup behavior for your application, so what can be done? Now that you are mindful that SIGTERMs aren’t being sent to your node application, a quick solution can be something like the below,

#!/bin/bash
some_command
export IMPORTANT_VARIABLE=`some_other_command`
exec npm start

What you just did by adding the exec command is tell bash not to start npm as a child process, but to replace the existing process with it. After making this change, your processes may look something like the following,

PID   USER      TIME   COMMAND
  1   microuse   0:00  npm
 28   microuse   1:32  node index.js
 50   microuse   0:00  bash
 56   microuse   0:00  ps

And you celebrate, because npm will pass down the SIGTERM signal to its child processes, of which your application is one. So now what?

Your application has to handle SIGTERM signals

Having the signal passed into your node process isn’t enough to end things gracefully; you still have to do something with that signal. A loose outline of a basic index.js is below,

const app = require('express')();
const http = require('http');

const serverPort = 3000;

// Start the server
const app_server = http.createServer(app).listen(serverPort, function () {
  console.log('Your server is listening on port %d', serverPort);
});

NodeJS apps need to explicitly handle SIGTERMs; otherwise the process simply exits without any cleanup. We need to add code that listens for a SIGTERM, closes all open connections, and gracefully shuts down the node process. We can add the following logic to index.js,

const app = require('express')();
const http = require('http');
const signals = require('./signals.js');

const serverPort = 3000;

// Start the server
const app_server = http.createServer(app).listen(serverPort, function () {
  console.log('Your server is listening on port %d', serverPort);
});

process.on('SIGTERM', () => {
  signals.sigterm(app_server);
});

with signals.js defined as below,

exports.sigterm = function (server) {
  console.log('Received SIGTERM. Initiating graceful shutdown…');
  sigtermHelper(server);
};

function sigtermHelper(server) {
  server.getConnections(function (error, count) {
    if (error) {
      console.log(`Received error getting open connections: ${error}`);
    }
    if (count > 0) {
      console.log(`Connections still open: ${count}`);
      setTimeout(() => {
        sigtermHelper(server);
      }, 5000);
    } else {
      server.close(() => {
        console.log('No open connections. Closing gracefully.');
        process.exit(0);
      });
    }
  });
}

NodeJS also offers signal-handling libraries, but a hand-rolled handler keeps this example simple. When our app receives a SIGTERM, we check the number of open connections on our HTTP server. If there are open connections, we use setTimeout to wait 5 seconds and check again. For the purpose of this example, this retry loop is left unbounded: Kubernetes will eventually send a SIGKILL anyway, so rather than defining cutoffs in two places, we stick to the default grace period set in Kubernetes. That’s all there is to it!

A word on preStop Hooks

It’s important to acknowledge that there is another path worth considering when figuring out how to gracefully shut down your container: using preStop hooks. According to the Kubernetes documentation:

This hook is called immediately before a container is terminated due to an API request or management event such as liveness probe failure, preemption, resource contention and others. A call to the preStop hook fails if the container is already in terminated or completed state. It is blocking, meaning it is synchronous, so it must complete before the call to delete the container can be sent. No parameters are passed to the handler.

What this basically means is that before a SIGTERM is sent to the main process, the preStop hook is triggered and any logic within the hook is executed. This is useful for more complex containers, ones that perhaps have multiple parallel processes or runtime environments. For the purpose of this article I chose to keep the example simple so that I can drive home the key points, but this path is worth mentioning.
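For reference, a preStop hook is declared in the pod spec. The sketch below is illustrative (the names, image, and the sleep command are assumptions, not from this article); the hook runs and must complete before the SIGTERM is sent:

```yaml
# Sketch of a preStop hook; runs before SIGTERM reaches PID 1.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
  - name: app
    image: example-app:latest
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 5"]
```

Note that the hook’s execution time counts against the pod’s termination grace period.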

Concluding Thoughts

There are a number of things that have been discussed, but they can be broken down into a few key points to summarize:

Kubernetes will not do everything for you; you have to hold up your end of the bargain.

Ensure that the way your container starts up allows signals to be passed to child processes. With the flexibility of these environments, developers need to be aware of the system-level impacts of the changes they make to code.

Once your application is being passed signals within the container, handle them!

This has been just one exercise in holding up your end of the agreement when creating your Kubernetes application. Kubernetes has many methods of automation built in, but as developers we have to work within this environment and give the system everything it needs to automate effectively. If the application is a production service, customers expect uninterrupted service, and when designing solutions and offerings on Kubernetes, these details will impact the customer experience. Keep these points in mind and you’ve taken a few small steps toward making a reliable Kubernetes application. I hope you found this article useful, and hopefully there will be more to come as I learn more. In the meantime, happy sailing!
