Before Docker came along, virtualenv was pretty much the tool of choice for Python developers as it allows you to separate package dependencies for different applications you're working on without having to create separate virtual machines for each one. It worked very well for me.

However, there are still some annoyances developers have to deal with. One is having to install the additional applications and services a project requires, such as PostgreSQL, RabbitMQ, and Solr. Also, some Python packages won't install or compile properly without additional libraries on the developer's machine; Pillow and lxml are just two that come to mind. There is also the issue of predictability when you deploy to production: you may be developing on a Mac while your production server runs Ubuntu 12.04, so features that worked fine locally may have issues when deployed to the servers.

Of course, virtual machines can solve these issues. But VMs are heavy and take some time to start up. If you like to separate services like I do (say, different VMs for PostgreSQL and Solr to closely mimic your production setup), they can use up quite a bit of system resources, and if you develop on a laptop you will see a pretty significant reduction in battery life.

With Docker, these issues go away. You can have all these services in isolated Docker containers that are lightweight and start up very quickly. You can use base images for different Linux distros, preferably the same distro and version you use in production.

I've been using this setup since last year and have used it on 6 Django projects so far. I can't imagine going back to my old setup with virtualenv. I've even reinstalled the OS on my Ubuntu laptop and getting my projects up and running was as simple as installing Docker and Docker Compose, running pip install and the database migration command in the Docker container, and typing in docker-compose up.

I'm not going to go into detail on how Docker and Docker Compose (formerly known as Fig) work as the Docker website already covers these. But I will explain how I set this up for an open-source Django app I've built, YouTube Audio Downloader, so you can try it out as well.

Getting Started

To get the YouTube Audio Downloader app running:

1. Install Docker

2. Install Docker Compose

3. Clone the youtube-audio-dl project from GitHub:

git clone https://github.com/jcalazan/youtube-audio-dl.git

Note for Boot2Docker users:

In docker-compose.yml you may need to remove the ~ in ~/dockerfiles. Boot2Docker automatically mounts the user directory to the Boot2Docker VM and you may run into permission issues.

4. In the project root (where a file called docker-compose.yml is located), run this command to install required packages with pip:

docker-compose run django pip install -r requirements.txt

This will create a ~/dockerfiles directory on your machine where the Python files are stored.

5. Start up the containers in the background.

docker-compose up -d

6. Run the Django database migrations and access the site at http://localhost.

docker-compose run django python manage.py migrate

Note for Boot2Docker users:

To access the site, you'll need the IP address of the VM. You can find it by typing in boot2docker ip, then access the site at http://ip_of_boot2docker_vm.

To see a tail of the logs, type in docker-compose logs.

To see just the logs for a specific service/container, type in docker-compose logs service_name.

How It Works

Building Docker images from Dockerfiles and storing them in Docker Hub

I created a separate GitHub repo to store the Dockerfiles that I use for my side projects. I then created a Docker Hub account which is linked to this GitHub repo so the Docker images get built automatically every time I push a change to GitHub. I can then just do a docker pull repo_name to pull the images from Docker Hub.

To build the images manually, you can use the docker build command. For example, I can run this command where the PostgreSQL Dockerfile is located to build the PostgreSQL image:

docker build -t jcalazan/postgresql .

I can then push this image manually to Docker Hub:

docker push jcalazan/postgresql

The docker-compose.yml file

Docker Compose is a great tool for managing your containers during development. The docker-compose.yml file pretty much configures the Docker environment for you and is very readable. Without this tool, you will have to run some pretty complex Docker commands to specify volume mapping, linked containers, environment variables, port forwarding, etc.

Here's a sample docker-compose.yml file from the YouTube Audio Downloader project:

postgresql:
  image: jcalazan/postgresql
  environment:
    - POSTGRESQL_DB=youtubeadl
    - POSTGRESQL_USER=youtubeadl
    - POSTGRESQL_PASSWORD=password
  volumes:
    - ~/dockerfiles/youtube-audio-dl/postgresql:/var/lib/postgresql
  ports:
    - "5432:5432"

rabbitmq:
  image: jcalazan/rabbitmq
  ports:
    - "15672:15672"

# NOTES:
#   - The C_FORCE_ROOT variable allows celery to run as the root user.
celery:
  image: jcalazan/django
  environment:
    - C_FORCE_ROOT=true
    - DATABASE_HOST=postgresql
    - BROKER_URL=amqp://guest:guest@rabbitmq//
  working_dir: /youtube-audio-dl
  command: bash -c "sleep 3 && celery -A youtubeadl worker -E -l info --concurrency=3"
  volumes:
    - .:/youtube-audio-dl
    - ~/dockerfiles/youtube-audio-dl/python:/usr/local/lib/python2.7
    - ~/dockerfiles/youtube-audio-dl/bin:/usr/local/bin
  links:
    - postgresql
    - rabbitmq

# NOTES:
#   - The C_FORCE_ROOT variable allows celery to run as the root user.
flower:
  image: jcalazan/django
  environment:
    - C_FORCE_ROOT=true
    - DATABASE_HOST=postgresql
    - BROKER_URL=amqp://guest:guest@rabbitmq//
  working_dir: /youtube-audio-dl
  command: bash -c "sleep 3 && celery -A youtubeadl flower --port=5555"
  volumes:
    - .:/youtube-audio-dl
    - ~/dockerfiles/youtube-audio-dl/python:/usr/local/lib/python2.7
    - ~/dockerfiles/youtube-audio-dl/bin:/usr/local/bin
  ports:
    - "5555:5555"
  links:
    - postgresql
    - rabbitmq

django:
  image: jcalazan/django
  environment:
    - DATABASE_HOST=postgresql
    - BROKER_URL=amqp://guest:guest@rabbitmq//
  working_dir: /youtube-audio-dl
  command: bash -c "sleep 3 && python manage.py runserver_plus 0.0.0.0:80"
  volumes:
    - .:/youtube-audio-dl
    - ~/dockerfiles/youtube-audio-dl/python:/usr/local/lib/python2.7
    - ~/dockerfiles/youtube-audio-dl/bin:/usr/local/bin
  ports:
    - "80:80"
  links:
    - postgresql
    - rabbitmq

The project uses 5 services/containers:

1. postgresql

2. rabbitmq

3. celery

4. flower (celery monitoring)

5. django

One thing you'll notice is that the celery, flower, and django containers use the same image, as they're really all the same app run with different commands.

Now let's go over the different options:

image

This is just the image name/repo. If the image is not found on your local machine, Docker will look for it in Docker Hub and download it automatically. So in this case, if you didn't build the image manually, running a docker-compose command will simply pull the image I pushed to Docker Hub.

environment

This option sets additional environment variables (or overwrites existing ones) when the container starts. For example, if you want to use a different default Django settings file, you can set the DJANGO_SETTINGS_MODULE env var here.
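As a rough illustration of how the Django side could pick up the DATABASE_HOST and BROKER_URL variables set in docker-compose.yml, here is a minimal settings sketch. It is not the project's actual settings file: the build_databases helper and the DATABASE_NAME, DATABASE_USER, DATABASE_PASSWORD, and DATABASE_PORT variables are hypothetical, and the defaults simply mirror the values passed to the postgresql container above.

```python
import os


def build_databases(env=None):
    """Build a Django DATABASES dict from environment variables.

    Only DATABASE_HOST comes from the docker-compose.yml above; the other
    variable names are hypothetical and fall back to the sample values.
    """
    env = os.environ if env is None else env
    return {
        "default": {
            # Engine name used by older Django releases; newer releases
            # call it "django.db.backends.postgresql".
            "ENGINE": "django.db.backends.postgresql_psycopg2",
            "NAME": env.get("DATABASE_NAME", "youtubeadl"),
            "USER": env.get("DATABASE_USER", "youtubeadl"),
            "PASSWORD": env.get("DATABASE_PASSWORD", "password"),
            "HOST": env.get("DATABASE_HOST", "localhost"),
            "PORT": env.get("DATABASE_PORT", "5432"),
        }
    }


DATABASES = build_databases()

# Celery broker URL, falling back to a local RabbitMQ outside Docker.
BROKER_URL = os.environ.get("BROKER_URL", "amqp://guest:guest@localhost//")
```

Falling back to localhost means the same settings file would still work when the app is run outside of Docker.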

working_dir

This simply does a cd to the path you specify when the container starts.

command

This runs whatever command you want when the container starts up. In the case of the "django" service/container, runserver_plus is called so you can access the site right away. Note that I added a sleep to some of the containers because the containers start asynchronously; the sleep gives the services they depend on, such as PostgreSQL, time to be ready to accept connections before the command runs.
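A fixed sleep is a guess at how long the dependency needs. One possible alternative, sketched below, is to poll the dependency's TCP port until it accepts a connection. The wait_for_port function is a hypothetical helper, not part of the project, and an open port only shows that something is listening, not that PostgreSQL has finished initializing.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval` seconds and returns False if no connection
    could be made before `timeout` seconds have passed.
    """
    deadline = time.time() + timeout
    while True:
        try:
            conn = socket.create_connection((host, port), timeout=interval)
        except (socket.error, socket.timeout):
            # Not reachable yet: give up if out of time, otherwise retry.
            if time.time() >= deadline:
                return False
            time.sleep(interval)
        else:
            conn.close()
            return True
```

Inside the "django" container this could be called as wait_for_port("postgresql", 5432) before starting the server, since the linked container is reachable under its service name.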

volumes

The volumes option allows you to map a path from your local machine to the Docker container. Some examples of paths you might want to map are your code repo directory, database files, and log directories. Containers are ephemeral, so any changes inside the containers that aren't mapped to the host will disappear when the containers are removed or recreated. You'll notice in the example that I also mapped the bin and Python package directories. The main reason for this is so I can re-use the same django image for all my projects. When I run pip install inside the container, the files go to ~/dockerfiles/youtube-audio-dl on the host machine. When the container starts up, that path is then mounted to /usr/local/... of the container.
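To check which container directories pip actually writes to (and therefore which paths are worth mapping), you can ask Python itself. This is a small sketch using the standard sysconfig module; install_locations is a hypothetical helper name. Assuming the image's Python lives under /usr/local, as the volume mappings above suggest, both paths should fall under the mounted directories.

```python
import sysconfig


def install_locations():
    """Return where this interpreter installs packages and scripts."""
    paths = sysconfig.get_paths()
    return {
        # Directory for pure-Python packages (site-packages).
        "packages": paths["purelib"],
        # Directory for console scripts such as the celery command.
        "scripts": paths["scripts"],
    }


if __name__ == "__main__":
    for name, path in sorted(install_locations().items()):
        print("%s: %s" % (name, path))
```

If either path is outside the mapped directories, packages installed with pip would not survive the container being recreated.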

ports

This option exposes the container ports. You can also specify port forwarding from the host machine to the container here. For example, "8000:80" will forward port 8000 of the host to port 80 of the container. This is useful for avoiding port conflicts, for example when your environment involves multiple Django apps that need to run simultaneously and be accessible from your web browser.
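To make the conflict case concrete, here is a small sketch that takes port mappings in the "host:container" form used above and reports host ports claimed by more than one service. The function names and the "blog" and "shop" services are made up for illustration, and only the simple two-part form is handled.

```python
from collections import defaultdict


def split_mapping(mapping):
    """Split a "host:container" string such as "8000:80" into two ints."""
    host, container = mapping.split(":")
    return int(host), int(container)


def find_host_port_conflicts(services):
    """Return {host_port: [service, ...]} for ports used more than once.

    `services` maps a service name to its list of "host:container" strings.
    """
    users = defaultdict(list)
    for service, mappings in services.items():
        for mapping in mappings:
            host_port, _ = split_mapping(mapping)
            users[host_port].append(service)
    return {port: sorted(names) for port, names in users.items() if len(names) > 1}


# Two hypothetical Django apps that both want port 80 on the host.
services = {"blog": ["80:80"], "shop": ["80:80"]}
conflicts = find_host_port_conflicts(services)  # {80: ["blog", "shop"]}
```

Changing one of the mappings to "8000:80" clears the conflict while both containers keep listening on port 80 internally.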

links

Here you can define other containers that the service depends on. The "django" service/container in the example requires a connection to the "postgresql" container, so when you start up the "django" container, the "postgresql" container will also start automatically. Docker will add a host entry postgresql in /etc/hosts of the "django" container pointing to the IP address of the "postgresql" container. You can also set an alias here. For example, you can specify "postgresql:pgsql01.domain.com" and an additional host entry pgsql01.domain.com will be added to /etc/hosts, also pointing to the IP address of the "postgresql" container.
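Because a link is just a host entry, you can verify it from inside the container with an ordinary name lookup. The sketch below uses a hypothetical resolve_host helper; run inside the "django" container, resolve_host("postgresql") should return the linked container's IP address, and None would suggest the link is missing.

```python
import socket


def resolve_host(name):
    """Return the IPv4 address a host name resolves to, or None."""
    try:
        return socket.gethostbyname(name)
    except socket.gaierror:
        # The name is unknown to /etc/hosts and to DNS.
        return None
```

This is also a quick way to confirm that an alias such as pgsql01.domain.com points to the same address as the service name.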



See docker-compose.yml reference for all the available options.

Conclusion

This setup is especially useful for complex development environments. At my previous job, for example, it could take a new developer an entire day or two just to get his/her development environment running because the application required so many services to run properly. Most of them needed to be installed and configured manually (PostgreSQL, Mongo, Solr, RabbitMQ, a PHP app, a Tornado app, etc.), and installation instructions could vary depending on the developer's OS. On top of that, other developers lost time helping the new developer. With Docker, getting up and running was reduced to less than an hour, and most of that time was spent waiting for the Docker images to finish downloading from Docker Hub.