Django in Production: Part 3 - Automation & Monitoring

This is the third part in a series about Django in Production. If you haven't already, read part 1 and part 2 first.

In the first two posts of this series, I described the core stack which powers a Django application in production, and the Celery task queue which can be used to execute code asynchronously. In this third post, I'll describe how a production Django application can be monitored, and how common tasks such as deployment can be automated.

Monitoring Django Applications

There are many ways to monitor a Django application, but one that is particularly useful is Graphite.

Graphite

Graphite provides real-time visualization and storage of numeric time-series data on an enterprise level.

The key thing is not to be scared by the word enterprise. Graphite is a relatively simple 3-part system: Whisper is an efficient, pure-Python implementation of a round-robin database, and Carbon is a daemon which manages the Whisper database and provides caching. Finally, the Graphite "webapp" provides a Django frontend to the data stored in the Whisper database.
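To make the division of labour a little more concrete: Carbon accepts data points over a plaintext protocol, one "path value timestamp" line per measurement, conventionally on TCP port 2003. Here's a minimal sketch of what a client does. The host, the port default and the metric name are my assumptions for illustration, so check them against your own carbon.conf.

```python
import socket
import time


def format_datapoint(path, value, timestamp=None):
    """Build one line of Carbon's plaintext protocol (path, value, Unix time)."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, timestamp)


def send_datapoint(path, value, host="localhost", port=2003):
    """Open a TCP connection to Carbon and send a single data point.

    host and port are assumptions: 2003 is the conventional plaintext
    listener port, but your installation may differ.
    """
    line = format_datapoint(path, value)
    sock = socket.create_connection((host, port), timeout=5)
    try:
        sock.sendall(line.encode("ascii"))
    finally:
        sock.close()


if __name__ == "__main__":
    # Hypothetical metric name; this only prints the line that would be sent.
    print(format_datapoint("myproject.signups", 1, timestamp=1326000000), end="")
```

Carbon writes each point it receives into the Whisper file for that metric path, and the webapp reads from there.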

Graphite's web interface is, admittedly, hard to use. Its redeeming feature is a powerful URL-based API which allows you to compose graphs programmatically, which is in some ways easier than navigating the difficult menu system. Once mastered, though, Graphite can produce some amazingly detailed stats about the very deepest internals of your application.
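As a rough illustration of the URL-based API, a graph is just a request to the webapp's /render endpoint with one or more target parameters plus options such as from, format, width and height. The sketch below builds such a URL. The hostname and the metric name are hypothetical, and render_url is a helper I've made up for this example rather than anything Graphite ships.

```python
from urllib.parse import urlencode


def render_url(base, targets, since="-24h", fmt="png", width=800, height=400):
    """Compose a Graphite /render URL for the given metric targets."""
    # One "target" parameter per series to draw on the graph.
    params = [("target", target) for target in targets]
    params += [
        ("from", since),    # relative time, e.g. "-1h" or "-7d"
        ("format", fmt),    # "png" for an image, "json" for raw data
        ("width", width),
        ("height", height),
    ]
    return "%s/render?%s" % (base.rstrip("/"), urlencode(params))


if __name__ == "__main__":
    # Hypothetical Graphite host and metric name.
    print(render_url("http://graphite.example.com", ["stats.response.200"], since="-1h"))
```

Because the whole graph is described by the URL, it can be bookmarked, embedded in a dashboard page, or generated by a script.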

Clearly, the key to great graphs is lots (and lots) of data. In fact, 37signals report that their servers recorded an incredible 100,000,000 measurements in the first 10 days of 2012. Those measurements were made using statsd, which is a Node.js daemon for collecting statistics over a simple UDP protocol, and sending them to Graphite.

Statsd

The great thing about statsd is that it uses UDP to collect statistics. This means that a client can be written in almost any language, and when the server isn't running the client is completely unaffected. After all, statistics are important, but not important enough to knock the entire application offline when they aren't being collected.
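To show how little a client has to do, here's a minimal sketch of a counter increment sent by hand. The wire format for a counter is "name:value|c", and 8125 is statsd's usual UDP port. The function names and defaults are mine, and this isn't how django-statsd is implemented internally.

```python
import socket


def counter_packet(name, delta=1):
    """Encode a counter increment in the statsd wire format."""
    return ("%s:%d|c" % (name, delta)).encode("ascii")


def incr(name, delta=1, host="127.0.0.1", port=8125):
    """Send a counter increment over UDP, ignoring any failure.

    host and port are assumptions; 8125 is the conventional statsd port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # UDP is connectionless: this does not wait for, or need, a listener.
        sock.sendto(counter_packet(name, delta), (host, port))
    except OSError:
        # Metrics are best-effort, so never let them break the caller.
        pass
    finally:
        sock.close()


if __name__ == "__main__":
    incr("response.200")  # harmless whether or not statsd is running
```

There's no connection to set up and no reply to wait for, which is why the application carries on regardless when the statsd server is down.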

So far, there's been nothing Django-specific (save for the fact that Graphite is written in Django). In fact, these systems can be used to monitor just about anything.

django-statsd provides a very useful set of basic statistics for Django applications, though, and is highly recommended. Strangely, it's installed from PyPI with pip install django-statsd-mozilla, as the name django-statsd was already taken by another app.

Once installed, just enable a couple of middleware classes which record every request/response and their execution time to statsd. For more specific stats, though, the client library is very easy to use:

    from django_statsd.clients import statsd

    statsd.incr('response.200')

Interestingly, it also integrates with django-debug-toolbar, and can "monkeypatch" Django to enable model/cache statistics - but I'll leave that as an exercise for the reader.
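As for the middleware mentioned above, enabling it is a settings change. The fragment below is a sketch based on my reading of the django-statsd documentation. The middleware paths and the STATSD_* setting names should be checked against the version you install, and the existing middleware tuple and the prefix value are stand-ins for whatever your project already has.

```python
# settings.py (fragment)

# Stand-in for the middleware your project already lists.
MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
)

# Put the statsd middleware first so the timing covers the whole request.
# Class paths as I understand them from the django-statsd docs; verify
# them against the installed version.
MIDDLEWARE_CLASSES = (
    'django_statsd.middleware.GraphiteRequestTimingMiddleware',
    'django_statsd.middleware.GraphiteMiddleware',
) + MIDDLEWARE_CLASSES

# Where the client sends its UDP packets (assumed setting names and values).
STATSD_HOST = 'localhost'
STATSD_PORT = 8125
STATSD_PREFIX = 'myproject'  # hypothetical prefix for every metric name
```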

Automation

Often, in their quest for the least effort possible, programmers automate too much. In the case of deploying applications, though, which tends to be a simple but repetitive task, I think we're justified.

Fabric

If you haven't already, you should spend some time reading about Fabric. Essentially, it's a Python library that allows SSH commands to be scripted, whether they are on one or many remote hosts.

Fabric is a Python library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks.

Fabric uses a file at the top level of a project, called fabfile.py, to define the functions which can be used on the command line.

There are some very complete examples of "fabfiles" which can be used for Django deployment, such as this one by Gareth Rushgrove.

For my projects, which involve only one server, I use a much simpler version:

    import os
    from fabric.api import env, require, run, sudo, cd

    env.project_name = ''
    env.server_name = ''
    env.webapps_root = '/opt/webapps/'

    env.project_root = os.path.join(env.webapps_root, env.project_name)
    env.activate_script = os.path.join(env.project_root, 'env/bin/activate')
    env.wsgi_file = os.path.join(env.project_root, 'django.wsgi')
    env.repo_root = os.path.join(env.project_root, 'repository')
    env.search_index = os.path.join(env.project_root, 'search_index')
    env.requirements_file = os.path.join(env.repo_root, 'requirements.txt')
    env.manage_dir = os.path.join(env.repo_root, env.project_name)


    def production():
        env.hosts = [env.server_name]
    prod = production


    def virtualenv(command, use_sudo=False):
        if use_sudo:
            func = sudo
        else:
            func = run
        func('source "%s" && %s' % (env.activate_script, command))


    def update():
        require('hosts', provided_by=[production])
        with cd(env.repo_root):
            run('git pull origin master')


    def install_requirements():
        require('hosts', provided_by=[production])
        virtualenv('pip install -q -r %(requirements_file)s' % env)


    def manage_py(command, use_sudo=False):
        require('hosts', provided_by=[production])
        with cd(env.manage_dir):
            virtualenv('python manage.py %s' % command, use_sudo)


    def syncdb(app=None):
        require('hosts', provided_by=[production])
        manage_py('syncdb --noinput')


    def migrate():
        require('hosts', provided_by=[production])
        manage_py('migrate')


    def rebuild_index():
        require('hosts', provided_by=[production])
        manage_py('rebuild_index --noinput', use_sudo=True)
        sudo('chown -R www-data:www-data %(search_index)s' % env)


    def collectstatic():
        require('hosts', provided_by=[production])
        manage_py('collectstatic -l --noinput')


    def reload():
        require('hosts', provided_by=[production])
        sudo('supervisorctl status | grep %(project_name)s '
             '| sed "s/.*[pid ]\([0-9]\+\)\,.*/\\1/" '
             '| xargs kill -HUP' % env)


    def deploy():
        require('hosts', provided_by=[production])
        update()
        install_requirements()
        syncdb()
        migrate()
        collectstatic()
        reload()

As you can probably tell from the code, this performs the following operations at the push of a button:

Updates the server's Git repository

Installs any new requirements with pip

Synchronises the database to create any new tables

Executes any South migrations that haven't yet been run

Links any static media

Reloads the webserver (see part 1 for the Gunicorn setup)
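The reload step is the least obvious part of the fabfile, since it's a shell pipeline: it lists the processes Supervisor knows about, keeps the lines for this project, pulls out the process ID, and sends that process a HUP signal, which tells the Gunicorn master to reload. Here's a sketch of the same extraction in Python, just to show what the grep and sed stages are doing. The sample status line, the project name and the pids_for helper are all made up for illustration.

```python
import re

# Example line in the format printed by `supervisorctl status`;
# the program name, PID and uptime are invented.
SAMPLE_STATUS = "myproject                        RUNNING    pid 4321, uptime 2:13:07"

# A running program's status line contains "pid <number>,".
PID_PATTERN = re.compile(r"\bpid (\d+),")


def pids_for(project_name, status_output):
    """Return the PIDs of running programs whose status line mentions project_name."""
    pids = []
    for line in status_output.splitlines():
        if project_name not in line:
            continue  # same filter as `grep <project_name>`
        match = PID_PATTERN.search(line)
        if match:  # stopped programs have no pid field
            pids.append(int(match.group(1)))
    return pids


if __name__ == "__main__":
    print(pids_for("myproject", SAMPLE_STATUS))
```

One difference worth knowing: this sketch skips lines without a PID, whereas the sed expression in the fabfile passes non-matching lines through unchanged, so the pipeline relies on the project's process actually being in the RUNNING state.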