
At work, most of our virtual machines run Linux. Being able to SSH into our servers in order to develop, automate, or troubleshoot services across different stacks is critical for us. Having said this, making sure that we do so in a secure way has been a top concern since day one.

In the following article, I share a few measures we have implemented along the way to keep SSH under control.

1: Getting rid of Key explosion with SSH client certificates

Everyone knows that SSH keys are the first thing to get out of control when busy engineers meet shared servers. Within just a few days, SSH keys start getting passed around through every channel: emailed, posted in Slack, copied onto servers, and sometimes even pasted into a Confluence page.

To prevent this we use SSH certificates. The concept is similar to TLS PKI: a third-party authority (a CA) is introduced to sign user keys. As a result, target servers no longer need to rely on an authorized_keys file, because they can instead validate the presented certificate on demand against the CA’s public key stored locally.

On top of this, SSH certificates can carry additional metadata about the client, like an identity (similar to a common name), allowed capabilities, expiration information, the Linux users it is allowed to log in as, etc.

https://ef.gy/hardening-ssh (scroll down to SSH certificates)
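As a rough illustration of the signing flow described above, the sketch below uses plain ssh-keygen to create a throwaway CA, sign a user key with it, and print the resulting certificate. The function name, file names, key type (ed25519), identity string, and one-day validity are assumptions chosen for the demo, not our production setup.

```shell
#!/bin/bash
# Sketch: sign a user key with a throwaway CA (all names here are hypothetical).
sign_demo_key() {
  local dir=$1

  # 1. The CA key pair; the private half would normally stay with the signing authority.
  ssh-keygen -q -t ed25519 -N '' -C demo-ca -f "$dir/ca" || return 1

  # 2. A user key pair, standing in for ~/.ssh/id_rsa.
  ssh-keygen -q -t ed25519 -N '' -C demo-user -f "$dir/id_demo" || return 1

  # 3. Sign the user's public key:
  #    -I  key identity, recorded in the server's auth log
  #    -n  principals (the Linux users the holder may log in as)
  #    -V  validity window, here one day from now
  ssh-keygen -q -s "$dir/ca" -I first.last@example.com -n ec2-user -V +1d \
    "$dir/id_demo.pub" || return 1

  # 4. Print the certificate metadata (identity, principals, validity, extensions).
  ssh-keygen -L -f "$dir/id_demo-cert.pub"
}
```

Calling sign_demo_key "$(mktemp -d)" writes id_demo-cert.pub next to the demo key. On the server side, sshd accepts certificates signed by this CA once the CA public key (ca.pub here) is referenced by the TrustedUserCAKeys option in sshd_config.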

2: Using HashiCorp Vault to provide an SSH certificate self-signing service

HashiCorp Vault offers an easy-to-set-up SSH secrets engine to handle the SSH certificate signing workflow. In our case, users log in to Vault using their AD or AWS credentials and then hand in their personal id_rsa.pub key for Vault to sign. The resulting certificate allows users to SSH into different environments, depending on their roles, for a maximum of one day.

# ------------------------------------
# cat /usr/local/bin/vault-ssh.sh
# ------------------------------------
#!/bin/bash
set -ue

USERNAME=first.last@ldapdomain.com
SSH_DIR=~/.ssh
export VAULT_ADDR=https://vault.shrd.internal.net

# Login to Vault using LDAP if not already logged in
if ! vault token lookup > /dev/null 2>&1; then
  # The Vault cli will prompt for a password
  vault login -method=ldap -field=token_policies username="$USERNAME"
fi

# Sign POC and DEV environment certificates
vault write -field=signed_key ssh-poc/sign/ec2-user "public_key=@${SSH_DIR}/id_rsa.pub" > "${SSH_DIR}/id_rsa-poc-cert.pub"
vault write -field=signed_key ssh-dev/sign/ec2-user "public_key=@${SSH_DIR}/id_rsa.pub" > "${SSH_DIR}/id_rsa-dev-cert.pub"
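For context, the server side of this workflow could be set up roughly as follows. This is a sketch based on the signed-certificates mode of the Vault SSH secrets engine, not a copy of our configuration: the function name, the role settings, the 8-hour default ttl, and the 24-hour max_ttl (derived from the one-day limit mentioned earlier) are all assumptions.

```shell
#!/bin/bash
# Sketch: mount an SSH secrets engine and create a signing role.
# Usage (hypothetical helper): setup_ssh_signer ssh-poc ec2-user
setup_ssh_signer() {
  local mount=$1 role=$2

  # Mount the SSH secrets engine at e.g. ssh-poc/
  vault secrets enable -path="$mount" ssh || return 1

  # Let Vault generate the CA key pair used to sign user keys
  vault write "$mount/config/ca" generate_signing_key=true || return 1

  # Role that signs user certificates for one Linux user, valid for at most a day
  # (the values below are assumptions, not our real role definition)
  vault write "$mount/roles/$role" - <<EOF
{
  "key_type": "ca",
  "allow_user_certificates": true,
  "allowed_users": "$role",
  "default_user": "$role",
  "default_extensions": { "permit-pty": "" },
  "ttl": "8h",
  "max_ttl": "24h"
}
EOF
}
```

Each target server then needs the CA public key, which vault read -field=public_key ssh-poc/config/ca returns, saved to a file that sshd_config references through TrustedUserCAKeys.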

To connect to a server we combine the signed certificate and the private key in the SSH Command arguments:

ssh -i id_rsa -i id_rsa-poc-cert.pub ec2-user@mybox.internal.net
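Since these certificates expire within a day, it can help to check the validity window before connecting. The following is a minimal sketch, assuming the "Valid: from ... to ..." line that ssh-keygen -L prints with local ISO-style timestamps; the function name cert_is_valid is hypothetical.

```shell
#!/bin/bash
# Sketch: succeed if the certificate description on stdin is still valid at "$1".
# stdin is the output of `ssh-keygen -L -f <cert>`; "$1" is a timestamp such as
# 2018-08-04T02:20:46 and defaults to the current local time.
cert_is_valid() {
  local now=${1:-$(date +%Y-%m-%dT%H:%M:%S)}
  local valid_line valid_to

  valid_line=$(grep -m1 'Valid:' || true)
  case $valid_line in
    *'Valid: forever'*) return 0 ;;                       # no expiry set
    *' to '*)           valid_to=${valid_line##* to } ;;  # keep the end timestamp
    *)                  return 1 ;;                       # no validity line found
  esac

  # Timestamps in this format sort lexicographically, so a string compare is enough.
  [ "$now" \< "$valid_to" ]
}
```

One possible use is to renew only when needed: ssh-keygen -L -f ~/.ssh/id_rsa-poc-cert.pub | cert_is_valid || /usr/local/bin/vault-ssh.sh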

3: Setting up SSH config files

SSH config files cut down SSH command-line verbosity by moving all the connection information into a file. The following is an example of a default SSH config file that lets me tunnel into my development Kafka server through a bastion by just typing ssh poc_kafka:

# ------------------
# cat ~/.ssh/config
# ------------------
Host poc_datastax
  Hostname datastax.poc.internal.net

Host poc_loopback
  Hostname loopback.poc.internal.net

Host poc_kafka
  Hostname kafka.poc.internal.net

Host poc_*
  IdentityFile ~/.ssh/id_rsa
  CertificateFile ~/.ssh/id_rsa-poc-cert.pub
  ProxyCommand ssh bastion_shrd nc %h %p 2> /dev/null
  User ec2-user

# Tunnel SSH connections through a single subnet
Host bastion_shrd
  Hostname bastion.shrd.internal.net
  IdentityFile ~/.ssh/id_rsa
  CertificateFile ~/.ssh/id_rsa-poc-cert.pub
  User ec2-user

But there is more: you can also split this information into multiple files and then use them to separate your environments:

ssh -F ~/.ssh/config-stg loopback

ssh -F ~/.ssh/config-prod loopback
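Because wildcard blocks such as Host poc_* are merged with the host-specific ones, it is worth checking what a given alias actually resolves to. Here is a small sketch using ssh -G, which prints the effective client configuration without connecting; the helper name and the choice of fields to display are assumptions.

```shell
#!/bin/bash
# Sketch: show the effective settings ssh would use for a host alias.
# Usage (hypothetical helper): ssh_effective poc_kafka [config-file]
ssh_effective() {
  local host=$1 config=${2:-$HOME/.ssh/config}
  # -G evaluates the matching Host blocks and prints the result; no connection is made.
  ssh -G -F "$config" "$host" |
    grep -E '^(hostname|user|identityfile|certificatefile|proxycommand) '
}
```

With the default config file from this section, ssh_effective poc_kafka should list kafka.poc.internal.net as the hostname, together with the user, key, and proxy command inherited from the Host poc_* block. The same check works per environment, for example ssh_effective loopback ~/.ssh/config-stg.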

4: Sharing SSH configs through source control

A problem we have experienced with SSH config files is that they get outdated pretty quickly, and it is hard to keep the latest SSH information in sync throughout the teams.

As a solution, we recently started to source control our SSH config files per project, so that it is easy to know where to look for the connection information of a specific workload: if I need to access the Kafka servers, I should be able to find the connection information in the Kafka git repo.

This solution can also be applied to Ansible projects so that SSH connection information is maintained next to the code:

# ------------------
# cat ./ansible.cfg
# ------------------
[defaults]
inventory = ./inventory.ini
remote_user = ec2-user
host_key_checking = false
retry_files_enabled = false
nocows = true

[privilege_escalation]
become = true
become_method = sudo

[ssh_connection]
control_path = %(directory)s/%%h-%%r
pipelining = true
ssh_args = -F ./ansible-ssh.cfg

And then we pack all the connection information in the local SSH config file as follows:

# ---------------------
# cat ./ansible-ssh.cfg
# ---------------------
Host bastion_shrd
  Hostname bastion.shrd.internal.net
  IdentityFile ~/.ssh/id_rsa
  CertificateFile ~/.ssh/id_rsa-poc-cert.pub
  User ec2-user

Host 10.90.?.*
  CertificateFile ~/.ssh/id_rsa-poc-cert.pub

Host 10.90.3?.*
  CertificateFile ~/.ssh/id_rsa-dev-cert.pub

Host 10.90.6?.*
  CertificateFile ~/.ssh/id_rsa-shrd-cert.pub

Host 10.90.*.*
  IdentityFile ~/.ssh/id_rsa
  ProxyCommand ssh -F ./ansible-ssh.cfg bastion_shrd nc %h %p 2> /dev/null
  User ec2-user
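The Host patterns in this file use shell-style wildcards: ? matches exactly one character and * matches any run of characters. The sketch below mimics that matching with a shell case statement to show which certificate a given address ends up with. The function name is hypothetical, and the mapping is simply read off the config above. Note that case stops at the first match, whereas ssh applies every matching block; since the three certificate patterns never overlap, the selected certificate is the same.

```shell
#!/bin/bash
# Sketch: which CertificateFile the Host patterns select for an address.
# ssh Host patterns and shell case globs share the same ? and * wildcards.
cert_for_host() {
  case $1 in
    10.90.?.*)  echo "id_rsa-poc-cert.pub" ;;
    10.90.3?.*) echo "id_rsa-dev-cert.pub" ;;
    10.90.6?.*) echo "id_rsa-shrd-cert.pub" ;;
    10.90.*.*)  echo "(no certificate: only the catch-all block matches)" ;;
    *)          echo "(not covered by this config)" ;;
  esac
}

cert_for_host 10.90.5.12    # id_rsa-poc-cert.pub
cert_for_host 10.90.35.7    # id_rsa-dev-cert.pub
cert_for_host 10.90.61.3    # id_rsa-shrd-cert.pub
```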

5: Tunnel SSH through 2-factor OpenVPN + a management network bastion

All our servers run in the cloud behind a private network and private DNS. This means there is no way to reach services from our desktops without some kind of networking magic. This is where OpenVPN comes in handy:

It serves as a NAT, allowing connected clients to reach private network IPs.

It can replace the desktop’s default DNS server with OpenVPN’s own so that private DNS records can be resolved.

It can integrate with an LDAP identity provider, add two-factor authentication, handle brute-force password attacks, control which user has access to which range of subnets, etc.

Having said this, not everybody in the network, human or machine, should have SSH capabilities: we don’t want a compromised server sniffing around other machines. To control this, we set ACLs at the network level to restrict SSH connectivity to a management subnet, which also hosts automation engines like Jenkins and Ansible Tower. We also place an SSH bastion in this subnet so that trusted users coming from OpenVPN can tunnel through it to reach other servers in the environment.

6: Monitor activity by reporting on /var/log/secure files

We haven’t done much work on this yet but, as the log excerpt below shows, SSH certificates leave us with user-to-connection information: each accepted login records the ID of the certificate that was used. For now we push these logs into a central repository; in the future we would like to build graphical analysis and metrics to understand the behaviour around our servers.

Aug 4 02:20:46 ip-x-x-x-x sshd[12437]: Accepted publickey for ec2-user from x.x.x.x port 55746 ssh2: RSA-CERT ID vault-ldap-first.last@ldapdomain.com.au-1c5ea7ca827f81296db79c94cc2b46acb908cc14d84eaa28701dfc9748794465 (serial 1112258391964872070) CA RSA SHA256:VXSUR8gWWDItRMTMMhgC2XrUxvMtk4b5qacG8RyUX3g
Aug 4 02:20:47 ip-x-x-x-x sshd[12437]: pam_unix(sshd:session): session opened for user ec2-user by (uid=0)
Aug 4 02:20:48 ip-x-x-x-x sshd[12440]: Received disconnect from x.x.x.x port 55746:11: disconnected by user
Aug 4 02:20:48 ip-x-x-x-x sshd[12440]: Disconnected from x.x.x.x port 55746
Aug 4 02:20:48 ip-x-x-x-x sshd[12437]: pam_unix(sshd:session): session closed for user ec2-user
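As a first step toward that kind of reporting, the certificate identity can be pulled out of the "Accepted publickey" lines with standard text tools. This is a minimal sketch, assuming the log format shown above (newer sshd versions may format this line differently); the function name is hypothetical.

```shell
#!/bin/bash
# Sketch: count accepted certificate logins per certificate ID and Linux user.
# Reads sshd log lines on stdin, e.g.: logins_by_cert < /var/log/secure
logins_by_cert() {
  # Capture the Linux user after "for" and the first token after "-CERT ID",
  # then print them as "<cert id> <linux user>" and count identical pairs.
  sed -n 's/.*Accepted publickey for \([^ ]*\) from .*-CERT ID \([^ ]*\).*/\2 \1/p' |
    sort | uniq -c | sort -rn
}
```

Each output line holds a login count, the certificate ID, and the Linux user it logged in as, with the most active certificate first.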

Wrapping up

As you can see, SSH management is not a simple task. It would be nice if cloud providers other than Google provided SSH management out of the box and made things simpler for all of us. Until then, I hope this article provides some ideas in this regard.

Cheers