The whole playbook from this blog post can be seen in this gist.

Ansible is a Python-based tool for automating application deployment and infrastructure setup. It's often compared with Capistrano, Chef, Puppet and Fabric. These comparisons don't always compare apples to apples as each tool has its own distinctive capabilities and philosophy as to how best to automate deployments and system setups. The flexibility and conciseness with which tasks can be described also varies widely between tools.

I've had to use every tool mentioned above at one time or another for various clients. What I like most about Ansible is that it keeps playbooks concise while not abstracting anything away. Some tools try to hide the differences between `apt install` and `yum install`, but I found these abstractions made for a steeper learning curve and made out-of-the-ordinary changes take longer to get working.

Ansible can be installed via pip and just needs an inventory file to begin being useful.

For this post I've tried to keep Ansible's files to a minimum. You can organise playbooks into separate files, set up Travis CI to test your playbooks and so on, but for the sake of simplicity I stuck to the task of getting a load-balanced, two-node Django cluster set up with as few lines of code as I could.

## A cluster of machines

I launched three Ubuntu 14.04 virtual machines on my workstation and configured them with the user `mark`, identical passwords and sudo access. I then added the three virtual machine IP addresses to my hosts file:

```
$ grep 192 /etc/hosts
192.168.223.131 web1
192.168.223.132 web2
192.168.223.133 lb
```

I copied my SSH public key to the `~/.ssh/authorized_keys` file on each VM:

```
$ ssh-copy-id web1
$ ssh-copy-id web2
$ ssh-copy-id lb
```

I then made sure each host's ECDSA key fingerprint was stored in my `~/.ssh/known_hosts` file by connecting to each host:

```
$ ssh lb uptime
$ ssh web1 uptime
$ ssh web2 uptime
```

## Running Ansible with an inventory

I installed Ansible:

```
$ pip install ansible==1.7.2
```

And created an inventory file:

```
$ cat inventory
[load_balancers]
lb ansible_ssh_host=lb

[app_servers]
web1 ansible_ssh_host=web1
web2 ansible_ssh_host=web2
```

The first token is the host alias used by Ansible; the second token is the connection instruction for Ansible. I had already created web1, web2 and lb in my /etc/hosts file so I simply referred to those hostnames.

With an inventory file in place it's possible to test that Ansible can connect to and communicate with each node in the cluster:

```
$ ansible all -i inventory -a uptime
lb | success | rc=0 >>
 09:07:18 up 7:18, 2 users, load average: 0.05, 0.03, 0.05

web2 | success | rc=0 >>
 09:07:18 up 7:19, 2 users, load average: 0.01, 0.02, 0.05

web1 | success | rc=0 >>
 09:07:19 up 7:23, 2 users, load average: 0.00, 0.01, 0.05
```
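To make the inventory's structure concrete, here's a small Python sketch (an illustration only, not Ansible's real parser) showing how each INI-style group header maps to the host aliases listed beneath it:

```python
# Toy parser for the INI-style inventory above: group headers in
# square brackets, one host per line, first token is the host alias.
def parse_inventory(text):
    groups, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups[current] = []
        elif current is not None:
            # The first token is the alias; the rest are connection vars
            groups[current].append(line.split()[0])
    return groups

inventory = """\
[load_balancers]
lb ansible_ssh_host=lb

[app_servers]
web1 ansible_ssh_host=web1
web2 ansible_ssh_host=web2
"""
print(parse_inventory(inventory))
```

This mirrors how `hosts: app_servers` in a play resolves to web1 and web2 while `hosts: all` covers every group.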

## A cluster of configuration files

Ansible will need to deploy configuration files, keys and certificates to our cluster. Below is the file and folder layout of this project:

```
$ tree
.
├── files
│   ├── nginx-app.conf
│   ├── nginx-load-balancer.conf
│   ├── nginx.crt
│   ├── nginx.key
│   ├── ntp.conf
│   ├── supervisord.conf
│   └── venv_activate.sh
├── inventory
└── playbook.yml
```

## SSL-terminating load balancer

To create files/nginx.crt and files/nginx.key I generated a self-signed SSL certificate using `openssl`. The certificate won't be of much use to HTTPS clients that verify certificates against trusted authorities but it's useful for demonstrating SSL termination by the load balancer in a local environment.

```
$ openssl req -x509 -nodes -days 365 \
    -newkey rsa:2048 \
    -keyout files/nginx.key \
    -out files/nginx.crt
...
Common Name (e.g. server FQDN or YOUR name) []:localhost
...
```

There are two distinct nginx configurations in this setup: the first is for the load balancer and the second is for the app servers. I chose nginx for the load balancer as it supports SSL and caching, and handles app servers restarting more gracefully than HAProxy.

```
$ cat files/nginx-load-balancer.conf
upstream app_servers {
    {% for host in groups['app_servers'] %}
    server {{ hostvars[host]['ansible_eth0']['ipv4']['address'] }} fail_timeout=5s;
    {% endfor %}
}

server {
    listen 80;
    server_name localhost;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443;
    server_name localhost;

    ssl on;
    ssl_certificate /etc/nginx/ssl/nginx.crt;
    ssl_certificate_key /etc/nginx/ssl/nginx.key;

    location / {
        proxy_pass http://app_servers;
    }
}
```

The nginx config for the app servers simply proxies requests to gunicorn:

```
$ cat files/nginx-app.conf
server {
    listen 80;
    server_name localhost;

    location / {
        proxy_pass http://127.0.0.1:8000;
    }
}
```
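The for loop in the upstream block is Jinja2, which Ansible's template module expands at deploy time using the facts it gathered from each app server. As a rough illustration of the result (the IPs are the VMs from this post; the function is a stand-in, not Ansible's renderer), the loop expands to one server line per app server:

```python
# Hypothetical host facts, standing in for
# hostvars[host]['ansible_eth0']['ipv4']['address'].
app_server_ips = {
    "web1": "192.168.223.131",
    "web2": "192.168.223.132",
}

def render_upstream(hosts):
    """Build the nginx upstream block the template loop expands to."""
    lines = ["upstream app_servers {"]
    for host, ip in hosts.items():
        lines.append("    server %s fail_timeout=5s;" % ip)
    lines.append("}")
    return "\n".join(lines)

print(render_upstream(app_server_ips))
```

Because the template is driven by the inventory's app_servers group, adding a third web server to the inventory and re-running the playbook would add its IP to the upstream block automatically.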

## Keeping clocks in sync

I wanted to keep the clocks on each node in the cluster in sync so I created an NTP configuration file. I chose the European pool but there are pools all around the world listed on the NTP Pool Project site.

```
$ cat files/ntp.conf
server 0.europe.pool.ntp.org
server 1.europe.pool.ntp.org
server 2.europe.pool.ntp.org
server 3.europe.pool.ntp.org
```

If you wanted to use the North American pool you could replace the contents with the following:

```
server 0.north-america.pool.ntp.org
server 1.north-america.pool.ntp.org
server 2.north-america.pool.ntp.org
server 3.north-america.pool.ntp.org
```

## Managing gunicorn and celeryd

I'll use supervisor to run the Django app server and the celery task queue. The files below contain a number of variables; Ansible will transform them when it deploys them.

```
$ cat files/supervisord.conf
[program:web_app]
autorestart=true
autostart=true
command={{ home_folder }}/.virtualenvs/{{ venv }}/exec gunicorn faulty.wsgi:application -b 127.0.0.1:8000
directory={{ home_folder }}/faulty
redirect_stderr=True
stdout_logfile={{ home_folder }}/faulty/supervisor.log
user=mark

[program:celeryd]
autorestart=true
autostart=true
command={{ home_folder }}/.virtualenvs/{{ venv }}/exec python manage.py celeryd
directory={{ home_folder }}/faulty
redirect_stderr=True
stdout_logfile={{ home_folder }}/faulty/supervisor.log
user=mark
```

I also need a bash file that will activate the virtualenv used by Django:

```
$ cat files/venv_activate.sh
#!/bin/bash
source {{ home_folder }}/.virtualenvs/{{ venv }}/bin/activate
$@
```

When venv_activate.sh is uploaded to each web server it'll sit at /home/mark/.virtualenvs/faulty/exec.

## Writing the playbook

I wanted to keep the number of Ansible files I was working with as low as I thought sensible for this blog post. Ansible is very flexible in terms of file organisation so a more broken-up and organised approach is possible, but for this example I'll just use one playbook. There's much more hardening that could have been done on these instances; for the sake of conciseness I kept the number of tasks down.

I've broken playbook.yml into sections and will walk through each one explaining what actions are taking place and the reasoning behind them. To see the whole playbook.yml file please see this gist.

## SSH tightening

The first tasks disable root's SSH account and remove support for password-based authentication. I've already stored my public SSH key in each server's authorized_keys file so there is no need to log in with password authentication. Once these configuration changes are made the SSH server will be restarted:

```
---
- name: SSH tightening
  hosts: all
  sudo: True
  tasks:
    - name: Disable root's ssh account
      action: >
        lineinfile
        dest=/etc/ssh/sshd_config
        regexp="^PermitRootLogin"
        line="PermitRootLogin no"
        state=present
      notify: Restart ssh

    - name: Disable password authentication
      action: >
        lineinfile
        dest=/etc/ssh/sshd_config
        regexp="^PasswordAuthentication"
        line="PasswordAuthentication no"
        state=present
      notify: Restart ssh

  handlers:
    - name: Restart ssh
      action: service name=ssh state=restarted
```

There are three dashes at the top of this file as it's a YAML convention.
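lineinfile is doing the heavy lifting here: with state=present it replaces a line matching regexp if one exists, otherwise it appends the line, which makes the task idempotent across repeated runs. A rough Python approximation of that behaviour (not Ansible's actual implementation):

```python
import re

def lineinfile(lines, regexp, line):
    """Rough sketch of lineinfile with state=present: replace the
    last line matching regexp, or append the line if none match."""
    pattern = re.compile(regexp)
    matches = [i for i, l in enumerate(lines) if pattern.search(l)]
    if matches:
        lines[matches[-1]] = line
    else:
        lines.append(line)
    return lines

sshd_config = ["Port 22", "PermitRootLogin yes"]
lineinfile(sshd_config, r"^PermitRootLogin", "PermitRootLogin no")
print(sshd_config)  # the root login line is replaced in place
```

Running it a second time with the same arguments leaves the file unchanged, which is why the notify handler only fires when something actually changed.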

## Package cache

Every system needs APT's cache updated. You can append update_cache=yes to specific package installations, but I found it was required for every package installation so I perform the update once per machine rather than per package.

```
- name: Update APT package cache
  hosts: all
  gather_facts: False
  sudo: True
  tasks:
    - name: Update APT package cache
      action: apt update_cache=yes
```

## Syncing to UTC

I then set the time zone on each machine to UTC and set up NTP to synchronise their clocks with the European NTP pool. Note that when dealing with dpkg-reconfigure you must pass --frontend noninteractive, otherwise Ansible will freeze while waiting for dpkg-reconfigure to accept input that Ansible isn't configured to capture interactively.

```
- name: Set timezone to UTC
  hosts: all
  gather_facts: False
  sudo: True
  tasks:
    - name: Set timezone variables
      copy: >
        content='Etc/UTC'
        dest=/etc/timezone
        owner=root
        group=root
        mode=0644
        backup=yes
      notify:
        - Update timezone
  handlers:
    - name: Update timezone
      command: >
        dpkg-reconfigure --frontend noninteractive tzdata

- name: Synchronise clocks
  hosts: all
  sudo: True
  tasks:
    - name: install ntp
      apt: name=ntp
    - name: copy ntp config
      copy: src=files/ntp.conf dest=/etc/ntp.conf
    - name: restart ntp
      service: name=ntp state=restarted
```

I also made sure each machine has unattended upgrades installed.

```
- name: Setup unattended upgrades
  hosts: all
  gather_facts: False
  sudo: True
  tasks:
    - name: Install unattended upgrades package
      apt: name=unattended-upgrades
      notify:
        - dpkg reconfigure
  handlers:
    - name: dpkg reconfigure
      command: >
        dpkg-reconfigure --frontend noninteractive -plow unattended-upgrades
```

## Setup the Django app servers

The Django app servers have the largest number of tasks. Here is a summarised list of what's being performed:

- Set up Uncomplicated Firewall (ufw) to block all incoming traffic except TCP ports 22 and 80. HTTP traffic should only come from the load balancer and SSH logins are rate-limited to slow down brute-force attacks.
- Install Python's virtualenv and development libraries.
- Install git and check out a public repo of a Django project hosted on Bitbucket.
- Copy over the virtual environment activation script which is used by every Django command.
- Install a virtual environment and the Django project's requirements.
- Set up and migrate Django's database. The database doesn't need to be shared between web servers for this application so it's just a local SQLite3 file on each individual server.
- Install supervisor and have it run the gunicorn app server and celery task runner.
- Launch an nginx reverse proxy which acts as a buffer between gunicorn and the load balancer.

```
- name: Setup App Server(s)
  hosts: app_servers
  sudo: True
  vars:
    home_folder: /home/mark
    venv: faulty
  tasks:
    - ufw: state=enabled logging=on
    - ufw: direction=incoming policy=deny
    - ufw: rule=limit port=ssh proto=tcp
    - ufw: rule=allow port=22 proto=tcp
    - ufw: >
        rule=allow
        port=80
        proto=tcp
        from_ip={{ hostvars['lb']['ansible_default_ipv4']['address'] }}

    - name: Install python virtualenv
      apt: name=python-virtualenv
    - name: Install python dev
      apt: name=python-dev
    - name: Install git
      apt: name=git

    - name: Checkout Django code
      git: >
        repo=https://bitbucket.org/marklit/faulty.git
        dest={{ home_folder }}/faulty
        update=no
    - file: >
        path={{ home_folder }}/faulty
        owner=mark
        group=mark
        mode=755
        state=directory
        recurse=yes

    - name: Install Python requirements
      pip: >
        requirements={{ home_folder }}/faulty/requirements.txt
        virtualenv={{ home_folder }}/.virtualenvs/{{ venv }}
    - template: >
        src=files/venv_activate.sh
        dest={{ home_folder }}/.virtualenvs/{{ venv }}/exec
        mode=755

    - command: >
        {{ home_folder }}/.virtualenvs/{{ venv }}/exec python manage.py syncdb --noinput
      args:
        chdir: '{{ home_folder }}/faulty'
    - command: >
        {{ home_folder }}/.virtualenvs/{{ venv }}/exec python manage.py migrate
      args:
        chdir: '{{ home_folder }}/faulty'

    - name: Install supervisor
      apt: name=supervisor
    - template: >
        src=files/supervisord.conf
        dest=/etc/supervisor/conf.d/django_app.conf
    - command: /usr/bin/supervisorctl reload
    - supervisorctl: name=web_app state=restarted
    - supervisorctl: name=celeryd state=restarted

    - name: Install nginx
      apt: name=nginx
    - name: copy nginx config file
      template: >
        src=files/nginx-app.conf
        dest=/etc/nginx/sites-available/default
    - name: enable configuration
      file: >
        dest=/etc/nginx/sites-enabled/default
        src=/etc/nginx/sites-available/default
        state=link
    - service: name=nginx state=restarted
```
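Once the play has run, it's worth smoke-testing that nginx on each app server is actually proxying to gunicorn. Here's a small hypothetical Python helper (not part of the playbook; the hostnames come from my /etc/hosts entries) you could run from the workstation:

```python
# Hypothetical smoke test: fetch / from an app server and report the
# HTTP status, or None if the server is unreachable.
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

def check_app_server(url, timeout=5):
    """Return the HTTP status code for url, or None on connection failure."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code   # server answered, but with an error status
    except URLError:
        return None       # couldn't connect at all

# e.g. check_app_server("http://web1/") and check_app_server("http://web2/")
```

A non-None status from each of web1 and web2 confirms the ufw rules, nginx and supervisor-managed gunicorn are all wired together.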

## The load balancer

The load balancer has a simpler task list:

- Block all incoming traffic except TCP ports 22, 80 and 443; rate-limit SSH.
- Install nginx and copy in the self-signed certificates.
- Copy in the load balancer configuration and launch nginx.

```
- name: Setup Load balancer(s)
  hosts: load_balancers
  sudo: True
  tasks:
    - ufw: state=enabled logging=on
    - ufw: direction=incoming policy=deny
    - ufw: rule=limit port=ssh proto=tcp
    - ufw: rule=allow port=22 proto=tcp
    - ufw: rule=allow port=80 proto=tcp
    - ufw: rule=allow port=443 proto=tcp

    - apt: name=nginx
    - name: copy nginx config file
      template: >
        src=files/nginx-load-balancer.conf
        dest=/etc/nginx/sites-available/default
    - copy: src=files/nginx.key dest=/etc/nginx/ssl/
    - copy: src=files/nginx.crt dest=/etc/nginx/ssl/
    - name: enable configuration
      file: >
        dest=/etc/nginx/sites-enabled/default
        src=/etc/nginx/sites-available/default
        state=link
    - service: name=nginx state=restarted
```