Before I start, let me just say I thought I had running background processes down. Many times I've called nohup and let some code run on a server.

Well, having now spent two days debugging an issue on AegisBlade, I've learned that in the world of systemd a lot can go wrong when you try to run a background process over SSH.

In this post I'll go over the challenges I faced starting background processes from an SSH session on CoreOS. The post is specific to CoreOS and its nuances, but much of it applies to any distribution that uses systemd.

Create a Background Process

This is the typical step of nohup ... & or otherwise creating a background process.

In my case, I'm launching the process from inside Python, so I use the Unix double fork to create a process that is fully detached from its parent.

A side note here: if you fork a background process that inherits stdout from the parent, the SSH session will not close until the background process exits or closes that descriptor.
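
Here is a rough sketch of that double fork in Python, assuming a POSIX system. The function name spawn_detached is made up for illustration, and this is not necessarily the exact code I run. It also points stdin, stdout, and stderr at /dev/null so the background process does not hold the SSH session open.

```python
import os


def spawn_detached(argv):
    """Start argv as a detached background process using a double fork."""
    pid = os.fork()
    if pid > 0:
        # Original process: reap the short-lived first child and carry on.
        os.waitpid(pid, 0)
        return

    # First child: become a session leader with no controlling terminal.
    os.setsid()
    if os.fork() > 0:
        # Exit right away so the grandchild is re-parented away from us.
        os._exit(0)

    # Grandchild: replace stdin, stdout, and stderr with /dev/null so it
    # keeps no descriptor inherited from the SSH session.
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)
    if devnull > 2:
        os.close(devnull)
    try:
        os.execvp(argv[0], argv)
    finally:
        # Only reached if exec failed.
        os._exit(127)
```

Calling spawn_detached(["sleep", "600"]) returns almost immediately while sleep keeps running in its own session.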

Stop Logind from Killing your Processes on Logout

To be honest, I don't know what systemd-logind is or what it really does.

What I do know is that, by default, it kills your processes when the SSH session ends, and a lot of people are not happy about it. It breaks nohup, screen, and tmux.

This change was introduced in systemd version 230. You can check which version you are running with:

systemctl --version
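
If you want to make that check from a script, here is a small sketch in Python. It assumes the first line of the output starts with the word systemd followed by the version number, which is what I see on my machines. Both function names are made up for illustration.

```python
import re
import subprocess


def parse_systemd_version(version_output):
    """Return the systemd version number from `systemctl --version` output."""
    match = re.match(r"systemd (\d+)", version_output)
    if match is None:
        raise ValueError("unrecognized systemctl --version output")
    return int(match.group(1))


def logind_may_kill_on_logout():
    """True if the installed systemd is version 230 or newer."""
    output = subprocess.run(
        ["systemctl", "--version"], capture_output=True, text=True, check=True
    ).stdout
    return parse_systemd_version(output) >= 230
```

A True result only means the version is new enough to have the behavior. Whether it is actually turned on depends on the KillUserProcesses setting.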

Luckily, the solution is rather simple: a config value needs to be changed in /etc/systemd/logind.conf. You can open the file in vim and set the KillUserProcesses option to no, or run the following command to append the setting to the file:

echo 'KillUserProcesses=no' | sudo tee -a /etc/systemd/logind.conf
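
One thing to watch with the append is that running it twice leaves duplicate lines in the file. If you are scripting this, here is a sketch of an edit that can be repeated safely. It works on the file's text, and it assumes [Login] is the only section in logind.conf. The function name is hypothetical.

```python
import re


def set_kill_user_processes(conf_text, value="no"):
    """Return conf_text with exactly one active KillUserProcesses line."""
    setting = "KillUserProcesses=" + value
    # Matches the option whether it is active or commented out with '#'.
    pattern = re.compile(r"\s*#?\s*KillUserProcesses\s*=")
    lines = []
    replaced = False
    for line in conf_text.splitlines():
        if pattern.match(line):
            if not replaced:
                lines.append(setting)
                replaced = True
            # Any later duplicate of the option is dropped.
            continue
        lines.append(line)
    if not replaced:
        # Assumption: [Login] is the only section, so appending is safe.
        if not any(line.strip() == "[Login]" for line in lines):
            lines.append("[Login]")
        lines.append(setting)
    return "\n".join(lines) + "\n"
```

You would read /etc/systemd/logind.conf, pass its contents through this function, and write the result back as root.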

Now, reboot your system, and run the following command to ensure that the new config value was picked up:

busctl get-property org.freedesktop.login1 /org/freedesktop/login1 org.freedesktop.login1.Manager KillUserProcesses

The output should be:

b false
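
The same check can be done from Python. This sketch runs the busctl command above and turns its output into a bool. It assumes the output is the type code b followed by true or false, as shown. The function names are made up for illustration.

```python
import subprocess

BUSCTL_ARGS = [
    "busctl", "get-property",
    "org.freedesktop.login1",
    "/org/freedesktop/login1",
    "org.freedesktop.login1.Manager",
    "KillUserProcesses",
]


def parse_bool_property(busctl_output):
    """Parse busctl boolean output such as 'b false' into a Python bool."""
    type_code, value = busctl_output.split()
    if type_code != "b" or value not in ("true", "false"):
        raise ValueError("not a boolean property: %r" % busctl_output)
    return value == "true"


def logind_kills_user_processes():
    """Ask logind whether KillUserProcesses is currently in effect."""
    output = subprocess.run(
        BUSCTL_ARGS, capture_output=True, text=True, check=True
    ).stdout
    return parse_bool_property(output)
```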

Turn off Socket Activation

At this point, I was baffled. I had done all of the above, and my background processes were still getting killed as soon as the SSH session ended.

Using strace, I discovered that every process started from the SSH session, even detached ones several layers deep, was sent SIGTERM by PID 1 (which turns out to be systemd). I tried some clever hacks to ignore the signal, but I couldn't get them to propagate through the whole process tree.

By God's Grace, it seems, I landed on a page that described the situation.

The SSH service needs to be enabled.

By default, the SSH service is not running on CoreOS. Instead, systemd watches the SSH socket and starts the service anew each time you connect. That also means the service is terminated when the SSH session ends.

As it turns out, systemd by default tracks every process created by a service and forcefully terminates them all when the service stops.

Starting a service on demand like this is called socket activation, and it can be disabled via a CoreOS Ignition config.

{
  "ignition": { "config": {}, "timeouts": {}, "version": "2.1.0" },
  "networkd": {},
  "passwd": {},
  "storage": {},
  "systemd": {
    "units": [
      { "enable": true, "name": "sshd.service" },
      { "mask": true, "name": "sshd.socket" }
    ]
  }
}

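If you're not using Ignition configs, you can make the same change by hand with the following commands. As far as I can tell, mask persists across reboots unless you pass --runtime, while restart only starts the service for the current boot, so also run systemctl enable sshd.service if you want SSH to come back up after a reboot: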

systemctl mask --now sshd.socket

systemctl restart sshd.service
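
To confirm the change took effect, you can ask systemctl for the state of both units. Here is a sketch in Python. It assumes systemctl is-enabled prints masked for a masked unit and systemctl is-active prints active for a running one. The function names are hypothetical.

```python
import subprocess


def systemctl_state(verb, unit):
    """Run `systemctl <verb> <unit>` and return the state it prints."""
    # No check=True: these verbs exit non-zero for states such as "masked".
    result = subprocess.run(
        ["systemctl", verb, unit], capture_output=True, text=True
    )
    return result.stdout.strip()


def ssh_is_standalone(socket_state, service_state):
    """True when sshd runs as a plain service rather than per connection."""
    return socket_state == "masked" and service_state == "active"


def check_ssh_setup():
    """Query both units and report whether socket activation is off."""
    return ssh_is_standalone(
        systemctl_state("is-enabled", "sshd.socket"),
        systemctl_state("is-active", "sshd.service"),
    )
```

If check_ssh_setup() returns False, SSH is either still socket activated or sshd.service is not running.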

Alternate Solutions

The Stack Overflow link describes an alternate solution that changes the behavior of systemd, which may fit your scenario better.

Another alternative is to use the systemd user scope. I tried this out but hit some weird errors and didn't get too far.
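
For reference, here is roughly what the user scope approach looks like from Python. It wraps the command in systemd-run --user --scope. This is a sketch only: I have not confirmed that it works on CoreOS, and whether the process survives logout may also depend on your logind settings. The function names are made up for illustration.

```python
import subprocess


def user_scope_command(argv):
    """Build a systemd-run command line that wraps argv in a user scope."""
    return ["systemd-run", "--user", "--scope"] + list(argv)


def spawn_in_user_scope(argv):
    """Launch argv inside a transient user scope without waiting for it."""
    return subprocess.Popen(
        user_scope_command(argv),
        # Keep the child off the SSH session's stdin, stdout, and stderr.
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )
```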

Conclusion

If you found this useful, give us a shout on twitter @aegisbladeHQ.

Also, check out AegisBlade, where you can deploy and run any of your code with a single function call.

But they that wait upon the Lord shall renew their strength; they shall mount up with wings as eagles; they shall run, and not be weary; and they shall walk, and not faint. - Isaiah 40:31