This post describes a deployment and maintenance method for Django projects that was designed with scalability in mind. The goal is to push new releases into production at any moment with minimal or no downtime. Upgrades can be performed with unprivileged access to the production server, and rollbacks are possible. I use Gunicorn, Fabric and Supervisord in the examples.

Dependency management

One important task when automating processes is to make them deterministic. This means that the outcome will always be the same, no matter when the process is started. Deploying a commit to staging should have exactly the same outcome as deploying it to production, even if new versions of dependencies were released in between. If deployments are not consistent, anything can happen. Most Django projects include a requirements.txt file and use pip with virtualenv to manage dependencies. This is easy, but managing the process manually takes too much time. Pip-tools is a great tool to make this easier. Everything starts with a requirements.in file:

Django<1.12
django-mptt
django-taggit
easy-thumbnails
gunicorn
Pillow==3.4.2

This is everything the project needs to run. Django itself stays on the 1.11 branch, the latest LTS release. Pillow is pinned to the version that's distributed with the OS, to avoid unnecessary builds during deployments. Running pip-compile against this file produces the following output:

django-mptt==0.8.7
django-taggit==0.22.1
django==1.11
easy-thumbnails==2.4.1
gunicorn==19.7.1
pillow==3.4.2   # via easy-thumbnails
pytz==2017.2    # via django

Great, now all packages are pinned to their latest compatible releases. Pip-tools makes it easier to have a deterministic deployment process, and pip-compile makes it very easy to upgrade all requirements at once.
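In day-to-day use the pip-tools workflow boils down to a few commands. This is a sketch, assuming pip-tools is installed inside the project's virtualenv; the pip-sync step is optional but keeps an environment exactly in sync with the pinned file:

```shell
pip install pip-tools                  # once per virtualenv
pip-compile requirements.in            # resolve requirements.in, write pins to requirements.txt
pip-compile --upgrade requirements.in  # re-pin everything to the latest compatible releases
pip-sync requirements.txt              # add/remove packages until the env matches the pins
```

Commit both requirements.in and the generated requirements.txt, so every deployment installs exactly the same versions.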

Choosing a deployment method

The deployment method I describe makes a few assumptions:

- The project's application is run by a dedicated user inside a date-based directory (e.g. /srv/www/project/20170420/)
- A symlink called current points to this directory (/srv/www/project/current/)
- A system service automatically restarts the application when it exits
- It is possible to access the user account remotely

I use Supervisord and SSH for the last two, but other configurations are possible. You can also name your directories however you like; I append the git tag to the date, for example. Next is an example of a Supervisord config I use. Notice that the project is always accessed through the current symlink, and that the pid file is in a known location:



[program:demo_wsgi]
command = /srv/www/demo/current/repository/virtualenv/bin/gunicorn demo.wsgi:application --chdir repository --bind 127.0.0.1:8001 --log-file demo-wsgi.log --pid demo.pid
directory = /srv/www/demo/current/repository/
user = demo
group = demo
autostart = true
autorestart = true
redirect_stderr = true

With this out of the way, let's have a look at the deployment process itself:

1. A new date-based directory is created in the user's home directory
2. The code repository is cloned into it
3. A virtualenv is created, and all the pinned requirements are installed
4. Static files are collected; database migrations and a few more management commands are run
5. The current symlink is renamed to previous, and a symlink named current pointing to the new date-based directory is created
6. The old app server process is killed; Supervisord notices this and starts the newly deployed code

Once the migrations run or the current symlink is updated, the application can break in various ways. The old version of the website might use static files that the webserver can't find any more, or the old code might not be compatible with the migrated database. Solutions for these short-lived problems are described below.
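Steps 5 and 6 are where the cutover happens, and the symlink dance is easy to get wrong. Here is a minimal sketch of just the swap, run in a throwaway directory; the paths (20170419, 20170420) are illustrative, not real deployments:

```shell
#!/bin/sh
# Minimal sketch of the current/previous symlink swap, in a sandbox.
set -e
home=$(mktemp -d)
mkdir "$home/20170419" "$home/20170420"   # old and new deployment directories
cd "$home"
ln -s "$home/20170419" current            # pretend 20170419 is the live release

# The swap: current becomes previous, the new directory becomes current.
rm -f previous
mv current previous
ln -s "$home/20170420" current

readlink current    # now points at the new deployment
readlink previous   # the rollback target
```

Because previous keeps pointing at the old directory, rolling back is just the same swap in reverse plus a restart of the app server.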

Picking an automation tool

The process I described above could be performed manually, and it's probably a good idea to try it like that a few times. Once familiar with the procedure, it's time to automate it. My primary tool for app-level automation in Django projects is Fabric. Any task runner, scripting language or config management tool would do, but Fabric has the advantage of being written in Python and of integrating nicely with virtualenv through fabric-virtualenv. It also doesn't need any special privileges: it can do anything your user can do. If you aren't using any task runner or automation tool yet, I'd recommend you look into Fabric. Fabric is not Python 3 ready yet, but as it's only used to push code and not by your Django project itself, that is tolerable; and as Raffaele pointed out in the comments, there is a Python 3 fork. Another possible tool is Ansible, but it is more complex than Fabric. Some basic tasks that can be automated as an exercise are:

- compiling new requirements files
- building documentation and reports
- pulling data snapshots and files from production into dev

Below is a Fabric script that performs the described deployment method.

import datetime
import os

from fabric.api import run, cd, settings
from fabvenv import make_virtualenv, virtualenv

GIT_REPO = 'user@example.com/path/to/project.git'
GIT_BRANCH = 'production'
HOME = '/srv/www/djangoproject/'


def deploy():
    """
    A basic deployment script for Django projects that minimizes downtime.

    The warn_only setting is used for steps that can fail the first time
    the script runs.
    """
    version = datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
    deploy_path = os.path.join(HOME, version)
    venv_path = os.path.join(deploy_path, 'virtualenv')
    repository_path = os.path.join(deploy_path, 'repository')
    # I have a src directory inside the git repository that contains
    # the actual Django project
    src_path = os.path.join(repository_path, 'src')
    pid_file = os.path.join(repository_path, 'demo.pid')

    # Create home directory if necessary
    with settings(warn_only=True):
        run('mkdir {}'.format(HOME))

    # Step 1: Create a new deployment directory
    run('mkdir {}'.format(deploy_path))

    # Step 2: Check out the source code
    run('git clone --branch {} {} {}'.format(GIT_BRANCH, GIT_REPO, repository_path))

    # Step 3: Create the virtualenv and install dependencies
    make_virtualenv(venv_path, system_site_packages=True)
    with cd(repository_path):
        with virtualenv(venv_path):
            run('pip install --upgrade pip')
            run('pip install -r requirements.txt')

    # Step 4: Run management commands
    with cd(src_path):
        with virtualenv(venv_path):
            run('python manage.py check')
            run('python manage.py collectstatic --noinput')
            run('python manage.py compilemessages')
            run('python manage.py migrate')

    # Step 5: Update the links to current and previous deployments
    with cd(HOME):
        with settings(warn_only=True):
            run('rm -f previous')
            run('mv current previous')
            run('ln -s {} current'.format(deploy_path))

    # Step 6: Force a restart
    # Kill the old worker so that supervisord starts the new one
    with settings(warn_only=True):
        run('kill -TERM `cat {}`'.format(pid_file))

This is a complete example of a fabfile.py; you can start the deployment process with fab -H example.com deploy.

Update 2019: Bash deploy script example

Ok, I'm not really proud of this, as going from Fabric to bash feels like a downgrade. But I have a few older projects that still need changes deployed, and I simply don't have the time to replace the now-obsolete Fabric with a proper technology. Well, it turns out the method I described in this article is simple enough that a bash script can do it. Here is one version I use at the moment.

#!/bin/bash

# Fixed settings
commit=$(git rev-parse HEAD)
date=$(date +%Y%m%d_%H%M%S)
name="${date}_${commit}"
src="~/${name}/git"
venv="~/${name}/virtualenv"
manage="${venv}/bin/python ${src}/src/manage.py"
manage_latest="~/latest/virtualenv/bin/python latest/git/src/manage.py"
archive="${name}.tar.gz"
previous="previous"
latest="latest"

# Dynamic settings
python=/usr/bin/python3.5
pidfile="${previous}/git/src/project.pid"
remote_suggestion="user@example.com"
compilemessages=1

# Arg "parsing"
cmd=$1
remote=${2:-${remote_suggestion}}

if [[ ! "${remote}" ]]; then
    echo "No remote given, aborting, try ${remote_suggestion}"
    exit 1
fi

if [[ ! "${cmd}" ]]; then
    echo No command given, aborting, try deploy remoteclean getdata
    exit 1
fi

if [[ "${cmd}" == "deploy" ]]; then
    set -e
    echo "Transfer archive..."
    git archive --format tar.gz -o "${archive}" "${commit}"
    scp "${archive}" "${remote}:"
    rm -f "${archive}"

    echo "Set up remote host..."
    ssh "${remote}" mkdir -p "${src}"
    ssh "${remote}" tar xzf "${archive}" -C "${src}"
    ssh "${remote}" virtualenv --quiet "${venv}" -p ${python}
    ssh "${remote}" "${venv}/bin/pip" install --quiet --upgrade pip setuptools
    ssh "${remote}" "${venv}/bin/pip" install --quiet -r "${src}/requirements.txt"

    echo "Set up django..."
    ssh "${remote}" "${manage} check --deploy"
    ssh "${remote}" "${manage} migrate --noinput"
    if [[ ${compilemessages} -gt 0 ]]; then
        ssh "${remote}" "cd ${src} && ${manage} compilemessages"
    fi
    ssh "${remote}" "${manage} collectstatic --noinput"

    echo "Switching to new install..."
    ssh "${remote}" rm -fv "${previous}"
    ssh "${remote}" mv -v "${latest}" "${previous}"
    ssh "${remote}" ln -s "${name}" "${latest}"

    echo "Killing old worker, pidfile ${pidfile}"
    ssh "${remote}" "test -f ${pidfile} && kill -15 \$(cat ${pidfile}) || echo pidfile not found"

    echo "Cleaning up..."
    ssh "${remote}" rm -f "${archive}"
    rm -f "${archive}"
    set +e
elif [[ "${cmd}" == "getdata" ]]; then
    echo "Dumping prod data"
    ssh "${remote}" "${manage_latest} dumpdata --format json --indent 2 --natural-foreign --natural-primary -o data.json"
    echo "Fetching prod data"
    rsync -avz --progress "${remote}:data.json" data/
fi

if [[ "${cmd}" == "deploy" || "${cmd}" == "remoteclean" ]]; then
    echo "Deleting obsolete deploys"
    ssh "${remote}" '/usr/bin/find . -maxdepth 1 -type d -name "2*" | ' \
        'grep -v "$(basename "$(readlink latest)")" | ' \
        'grep -v "$(basename "$(readlink previous)")" | ' \
        '/usr/bin/xargs /bin/rm -rf'
    ssh "${remote}" rm -fv 2*tar.gz
fi

This code was originally published on Simple bash deployment script for Django.

Things to keep in mind

I have used this method for a while now, and it does what it was designed to do. There are a few important things I haven't mentioned yet:

Static files and localization

I compile static files and translations on production, during deployment, inside the deployment directory. All those assets are in the source code repository, so this makes sense. However, it's perfectly fine to perform this step on a different machine and to transfer the compiled assets to production.

Media files

Media files should obviously not be inside the deployment directory, or they would be lost after an upgrade because the webserver doesn't know about their old location. I keep them in the user's home directory or put them on external storage.

Migrations and rollback

One potential source of conflicts during deployments is database migrations. If your new database schema is incompatible with the production code, your application will generate errors sooner or later: during the migrations, after rollbacks, etc. One way to avoid this problem is to only deploy non-destructive migrations when you roll out new features. Such a migration doesn't delete any data or rename existing fields and models; it just adds new fields and data. Doing this also has the benefit that rolling back your production code can be as simple as updating a symlink and restarting the application server. Once your new code has proven to be stable in production, you can create additional migrations to get rid of legacy data.

Caches

If you use caching you should think about potential cache conflicts. You can avoid them, for example, by running the clear_cache management command, or by adding a KEY_PREFIX to your cache config. Clearing the entire cache for every deployment seems a little aggressive though.

Keeping your code portable

You probably want your deployment scripts to be reusable in multiple projects, so think about ways to avoid hardcoding paths etc. inside your Fabric scripts (if that's what you use). I use a custom Fabric package.

Cleaning up

So far we have kept old deployment directories around, which makes rollback possible, but it's not necessary to keep all old deployments. Which cleanup process you choose depends on your requirements. Using the date in the directory name makes managing them easier.

Deploying secrets

Storing secrets like the SECRET_KEY, mail configuration and other sensitive information in the source repository should usually be avoided. Distributing them is another potential task for your script.

Feature switches

Being able to roll back releases is nice, but it's also nice to be able to enable and disable features with a simple configuration switch, or to perform A/B testing. Feature switches can also help to merge code more frequently.
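The find/grep/xargs pipeline that the bash script above uses for cleanup is dense, so here is the same idea in isolation, run in a sandbox directory with made-up deployment names so it can be tried safely:

```shell
#!/bin/sh
# Cleanup sketch: delete every date-based deployment directory except the
# two that `current` and `previous` point at. Sandbox paths, for illustration.
set -e
home=$(mktemp -d)
cd "$home"
mkdir 20170101 20170301 20170420          # three fake deployment generations
ln -s 20170420 current
ln -s 20170301 previous

keep1=$(basename "$(readlink current)")
keep2=$(basename "$(readlink previous)")
find . -maxdepth 1 -type d -name '2*' \
    | grep -v "$keep1" \
    | grep -v "$keep2" \
    | xargs rm -rf

ls -d 2*   # only the kept generations remain
```

Because the directory names start with the date, the `-name '2*'` pattern matches all deployment generations for the next few centuries while leaving files like `current` and `previous` alone.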

Reducing downtime: More app servers