Over the last year we have written about getting several application stacks running on top of Docker, e.g. Magento, Jenkins, Prometheus and so forth. However, containerized deployment can be useful for more than just defining application stacks. In this series of articles we would like to cover an end-to-end development pipeline and discuss how to leverage Docker and Rancher in its various stages. Specifically, we're going to cover: building code, running tests, packaging artifacts, continuous integration and deployment, as well as managing an application stack in production. You can also download the entire series as an eBook beginning today.

To kick things off, we start at the pipeline ingress, i.e., building source code. When a project starts off, building/compilation is not a significant concern, as most languages and tools have well-defined and well-documented processes for compiling source code. However, as projects and teams scale and the number of dependencies increases, ensuring a consistent, stable build for all developers while maintaining code quality becomes a much bigger challenge. In this post we will cover some of the challenges around CI and testing, discuss best practices, and show how Docker can be used to implement them.

Challenges of Scaling Docker Build Systems

Before we get into best practices, let’s look at some of the challenges that arise in maintaining build systems. The first issue your project will face as it scales out is dependency management. As developers pull in libraries and integrate source code against them, it becomes important to: track the version of each library being used, ensure the same version is used by all parts of your project, test upgrades to library versions, and push tested updates to all parts of your project.

A related but slightly different problem is managing environment dependencies. This includes IDEs and IDE configurations, tool versions (e.g. Maven or Python versions) and configuration, e.g. static analysis rule files and code formatting templates. Environmental dependency management can get tricky because different parts of the project sometimes have conflicting requirements, and unlike conflicting code-level dependencies, it is often not possible or easy to resolve these conflicts. For example, in a recent project we used fabric for deployment automation and s3cmd for uploading artifacts to Amazon S3. Unfortunately, the latest version of fabric required Python 2.7, whereas s3cmd required Python 2.6. A fix required us to either switch to a beta version of s3cmd or an older version of fabric.

Lastly, a major problem that every large project faces is build times. As projects grow in scope and complexity, more and more languages get added (my current project uses Java, Groovy, Python and Protocol Buffers IDL). Tests get added for various components which are all interdependent. For example, if you have a shared database, then tests which mutate the same data cannot be run at the same time. In addition, we need to make sure that tests set up the expected state prior to execution and clean up after themselves when they finish. This leads to builds that can take anything from minutes to hours, which either slows down development or encourages the dangerous practice of skipping test runs.

Solutions and Best Practices

To solve all these problems a good build system needs to support the following requirements (among others):

- **Repeatability**: We must be able to generate/create similar (or identical) build environments with the same dependencies on different developer machines and automated build servers.
- **Centralized Management**: We must be able to control the build environment for all developers and build servers from a central code repository or server. This includes setting up the build environment as well as updates over time.
- **Isolation**: The various sub-components of the project must be built in isolation, apart from well-defined shared dependencies.
- **Parallelization**: We must be able to run parallel builds for sub-components.

To support the **Repeatability** requirement we must use centralized dependency management. Most modern languages and development frameworks have support for automated dependency management: Maven is used extensively in Java and a few other languages, Python uses pip, and Ruby has bundler. All these tools follow a very similar paradigm, where an index file (pom.xml, requirements.txt or Gemfile) is committed to source control. The tool can then be run to consume the file and download dependencies onto the build machine. Index files can be managed centrally: after testing them, you push out the change by updating the index in source control. However, there remains the issue of managing environmental dependencies. For example, the correct versions of Maven, Python and Ruby have to be installed, and we also need to ensure that developers actually run the tools. Maven automates the check for dependency updates, but for pip and bundler we must wrap our build commands in scripts which trigger a dependency update run.
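To make the index-file idea concrete, here is what a pinned pip index file might look like (the package versions below are illustrative, not taken from the project):

```
# requirements.txt -- committed to source control and consumed on every
# developer machine and build server via `pip install -r requirements.txt`.
# Pinning exact versions makes dependency resolution repeatable.
Fabric==1.10.2
s3cmd==1.6.0
requests==2.7.0
```

Because the file lives in source control, a version upgrade is tested on a branch and then pushed to every machine with a single commit.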

In order to set up the dependency management tools and scripts, most small teams just use documentation and leave the onus on developers. This, however, does not scale to large teams, especially if the dependencies are updated over time. Further complicating matters, installation instructions for these tools can vary by platform and OS of the build machines. You can use orchestration tools such as Puppet or Chef to manage the installation of dependencies and the setup of configuration files. Both Puppet and Chef allow for central servers, or shared configuration in source control, to enable centralized management; this lets you test configuration changes ahead of time and then push them out to all developers. However, these tools have drawbacks: installing and configuring Puppet or Chef is non-trivial, and full-featured versions of these tools are not free. In addition, each has its own language for defining tasks, which introduces another layer of management overhead for IT teams as well as developers. Lastly, orchestration tools do not provide isolation, so conflicting tool versions remain a problem and running parallel tests remains an open problem.

To ensure component isolation and reduce build times, we can use an automated virtualization system such as Vagrant. Vagrant can create and run virtual machines (boxes) which isolate the builds of the various components and also allow for parallel builds. The Vagrant configuration files can be committed into source control and pushed to all developers when ready, ensuring centralized management. In addition, boxes can be tested and deployed to an "Atlas" for all developers to download. This approach still has the drawback that you need a further layer of configuration to set up Vagrant, and virtual machines are a very heavyweight solution to this problem: each VM runs an entire OS and network stack just to contain a test run or compiler, and memory and disk resources need to be partitioned ahead of time for each VM.
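As a rough illustration of this approach, a Vagrantfile isolating two sub-components into separate boxes might look like the following (box name, machine names and memory sizes are ours, not from the project):

```ruby
# Hypothetical Vagrantfile: one box per sub-component, so each build runs in
# isolation and the two can run in parallel. Note that memory must be
# partitioned ahead of time for each VM.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/trusty64"

  config.vm.define "auth-build" do |auth|
    auth.vm.provider "virtualbox" do |vb|
      vb.memory = 1024
    end
    auth.vm.provision "shell", inline: "apt-get update && apt-get install -y golang"
  end

  config.vm.define "session-build" do |session|
    session.vm.provider "virtualbox" do |vb|
      vb.memory = 1024
    end
  end
end
```

Committing this file to source control gives every developer the same pair of build VMs, at the cost of running two full operating systems.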

Despite the caveats and drawbacks, using dependency management (Maven, pip, bundler), orchestration (Puppet, Chef) and virtualization (Vagrant), we can build a stable, testable, centrally managed build system. Not all projects warrant the entire stack of tools; however, any long-running, large project will need this level of automation.

Leveraging Docker for Build systems

Docker and its ecosystem of tools can help us meet the requirements above without the large investment of time and resources needed to support all the tools just mentioned. In this section we'll go through the steps below for creating containerized build environments for applications.

1. Containerizing your build environment
2. Packaging your application with Docker
3. Using Docker Compose for creating build environments

In order to illustrate the use of Docker in build pipelines, for this (and subsequent) articles we'll be using a sample application called go-messenger. To follow along, you can fetch the application from GitHub. The major data flows of the system are shown below. The application has two components: a RESTful authentication server written in Golang, and a session manager which accepts long-running TCP connections from clients and routes messages between them. For the purposes of this article, we will be concentrating on the RESTful authentication service (go-auth). This sub-system consists of an array of stateless web servers and a database cluster to store user information.

1. Containerizing your build environment

The first step in setting up the build system is to create a container image with all the tools required to build the project. The Dockerfile for our image is shown below and is also available here. Since our application is written in Go, we are using the official golang image and installing the godep dependency management tool. Note that if you are using Java for your project, a similar "build container" can be created from a Java base image with Maven installed instead of godep.

```dockerfile
FROM golang:1.4

# Install godep
RUN go get github.com/tools/godep

ADD compile.sh /tmp/compile.sh
CMD /tmp/compile.sh
```
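For comparison, a hypothetical Java equivalent of this build container might look like the following (the base image tag and the apt-based Maven install are our assumptions, not part of the project):

```dockerfile
# Sketch of a Java "build container": official Java base image plus Maven,
# with the build steps wrapped in a script, mirroring the go-builder layout.
FROM java:8
RUN apt-get update && apt-get install -y maven
ADD compile.sh /tmp/compile.sh
CMD /tmp/compile.sh
```

The pattern is the same regardless of language: base image, build tooling, and a single entry-point script that encodes the build steps.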

We then add a compile script which puts all the steps required to build and test our code in one place. The script shown below downloads dependencies using godep restore, standardizes formatting using the go fmt command, runs tests using the "go test" command and then compiles the project using go build.

```bash
#!/bin/bash
set -e

# Set directory to where we expect code to be
cd /go/src/${SOURCE_PATH}

echo "Downloading dependencies"
godep restore

echo "Fix formatting"
go fmt ./...

echo "Running Tests"
go test ./...

echo "Building source"
go build

echo "Build Successful"
```

To ensure repeatability, we have packaged all the tools required to build a component into a single, versioned container image. This image can be downloaded from Docker Hub or built from the Dockerfile. All developers (and build machines) can now use the container to build any Go project with the following command:

```shell
docker run --rm -it \
    -v $PWD:/go/src/github.com/[USERNAME]/[PROJECT]/[SUB-DIRECTORY]/ \
    -e SOURCE_PATH=github.com/[USERNAME]/[PROJECT]/[SUB-DIRECTORY]/ \
    usman/go-builder:1.4
```

In the above command we run the usman/go-builder image, version 1.4, mount our source code into the container using the -v switch, and specify the SOURCE_PATH environment variable using the -e switch. To test go-builder on our sample project, you can use the commands below to run all the steps and create an executable called go-auth in the root directory of the go-auth project.

```shell
git clone git@github.com:usmanismail/go-messenger.git
cd go-messenger/go-auth
docker run --rm -it \
    -v $PWD:/go/src/github.com/usmanismail/go-messenger/go-auth/ \
    -e SOURCE_PATH=github.com/usmanismail/go-messenger/go-auth/ \
    usman/go-builder:1.4
```

An interesting side-effect of isolating the source from the build tools is that we can easily swap out build tools and configuration. For example, in the commands above we have been using Golang 1.4; by changing go-builder:1.4 to go-builder:1.5 you can test the impact of Golang 1.5 on the project. To centrally manage the image used by all developers, we can push the latest tested version of the builder container to a fixed tag (i.e. latest) and make sure all developers use go-builder:latest to build the source code. Similarly, if different parts of our project use different versions of build tools, we can use different containers to build them without worrying about managing multiple language versions in a single build environment. For example, our earlier Python problem could be mitigated by using the official python image, which supports various Python versions.
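Sketching that last point: the conflicting interpreter requirement disappears if each tool gets its own container pinned to the Python version it needs (the Dockerfile below is our illustration, not from the project):

```dockerfile
# Hypothetical: isolate s3cmd in a Python 2.6 environment while fabric runs
# in a separate container based on python:2.7. Neither interpreter needs to
# be installed on the host.
FROM python:2.6
RUN pip install s3cmd
ENTRYPOINT ["s3cmd"]
```

Each tool's container is versioned and pushed like any other image, so the fabric/s3cmd conflict never has to be resolved on a shared machine.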

2. Packaging your application with Docker

If you would like to package the executable in a container of its own, add a Dockerfile with the content shown below and run "docker build -t go-auth .". In the Dockerfile we add the binary output from the previous step to a new container and expose port 9000 for the application to accept incoming connections. We also specify the entry point to run our binary with the required parameters. Since Go binaries are self-contained, we're using a stock Ubuntu image; however, if your project requires runtime dependencies, they can be packaged into the container as well. For example, if you were generating a war file you could use a Tomcat container.

```dockerfile
FROM ubuntu
ADD ./go-auth /bin/go-auth
EXPOSE 9000
ENTRYPOINT ["/bin/go-auth","-l","debug","run","-p","9000"]
```

3. Using Docker Compose for creating build environments

Now that we have our project building repeatably in a centrally managed container which isolates the various components, we can extend the build pipeline to run integration tests. This will also help us highlight Docker's ability to speed up builds through parallelization. One major reason tests cannot be parallelized is shared databases; this is especially true for integration tests, where we would not typically mock out external databases. Our sample project has exactly this issue: we use a MySQL database to store users. We would like to write a test which ensures that we can register a new user, and that a second registration attempt for the same user returns a conflict error. This forces us to serialize tests so that we can clean up registered users after one test completes before starting the next.
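The register/conflict semantics the integration test exercises can be sketched in Go with a hypothetical in-memory store standing in for the MySQL-backed user table (the userStore type and its method below are ours for illustration, not go-auth's actual implementation):

```go
package main

import "fmt"

// userStore is a hypothetical in-memory stand-in for the MySQL user table.
type userStore struct {
	users map[string]string
}

func newUserStore() *userStore {
	return &userStore{users: make(map[string]string)}
}

// register returns 200 for a new user and 409 if the user already exists,
// mirroring the HTTP status codes the service returns.
func (s *userStore) register(userid, password string) int {
	if _, exists := s.users[userid]; exists {
		return 409 // Conflict: user already registered
	}
	s.users[userid] = password
	return 200 // OK: user created
}

func main() {
	s := newUserStore()
	fmt.Println(s.register("alice", "secret")) // 200
	fmt.Println(s.register("alice", "secret")) // 409
}
```

Because the second register on the same store mutates shared state, two test runs sharing one database would interfere with each other, which is why the environments below must be isolated.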

To set up isolated parallel builds, we can define a Docker Compose template (docker-compose.yml) as follows. We define a Database service which uses the official MySQL image with the required environment variables. We then create a Goauth service from the container we built to package our application, and link the Database container to it.

```yaml
Database:
  image: mysql
  environment:
    MYSQL_ROOT_PASSWORD: rootpass
    MYSQL_DATABASE: messenger
    MYSQL_USER: messenger
    MYSQL_PASSWORD: messenger
  expose:
    - "3306"
  stdin_open: true
  tty: true
Goauth:
  image: go-auth
  ports:
    - "9000:9000"
  stdin_open: true
  links:
    - Database:db
  command:
    - "--db-host"
    - "db"
  tty: true
```

With this docker-compose template defined, we can bring up the application environment by running docker-compose up. We can then simulate our integration tests by running the following curl command; it should return 200 OK the first time and 409 Conflict the second time. Lastly, after running the tests, we can run docker-compose rm to clean up the entire application environment.

```shell
curl -i -silent -X PUT -d userid=USERNAME -d password=PASSWORD ${service_ip}:9000/user
```

In order to run multiple isolated versions of the application, we need to update the docker-compose template to add services Database1 and Goauth1 with configurations identical to their counterparts. The only change is that in Goauth1 the ports entry goes from 9000:9000 to 9001:9000, so that the publicly exposed port of the application does not conflict. The complete template is available here. When you run docker-compose up now, you can run two integration test runs in parallel. Something like this can be used effectively to speed up builds for a project with multiple independent sub-components, e.g., a multi-module Maven project.
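Based on the description above, the additional services might look like the following sketch (mirroring the originals, with only the published port changed; exact details are in the complete template):

```yaml
Database1:
  image: mysql
  environment:
    MYSQL_ROOT_PASSWORD: rootpass
    MYSQL_DATABASE: messenger
    MYSQL_USER: messenger
    MYSQL_PASSWORD: messenger
  expose:
    - "3306"
Goauth1:
  image: go-auth
  ports:
    - "9001:9000"   # host port changed so the two stacks do not collide
  links:
    - Database1:db
  command:
    - "--db-host"
    - "db"
```

Each Goauth service links to its own database, so the two test runs cannot see each other's data.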

```shell
curl -i -silent -X PUT -d userid=USERNAME -d password=PASSWORD ${service_ip}:9000/user
... 200 OK
curl -i -silent -X PUT -d userid=USERNAME -d password=PASSWORD ${service_ip}:9001/user
... 200 OK
curl -i -silent -X PUT -d userid=USERNAME -d password=PASSWORD ${service_ip}:9001/user
... 409 Conflict
curl -i -silent -X PUT -d userid=USERNAME -d password=PASSWORD ${service_ip}:9000/user
... 409 Conflict
```

Creating a Continuous Integration Pipeline with Docker and Jenkins

Now that we have the build set up, let's create a continuous integration pipeline for our sample application. This will help ensure that best practices are followed and that conflicting changes do not interact to cause problems. However, before we get into setting up continuous integration for our code, we will spend a little time talking about how to partition our code into branches.

Branching Model

As we automate our continuous integration pipeline, an important aspect to consider is the development model followed by the team. This model is often dictated by how the team uses the version control system. Since our application is hosted in a git repository, we're going to use the git-flow model for branching, versioning and releasing our application. It's one of the most commonly used models for git-based repositories. Broadly, the idea is to maintain two branches: a develop(ment) branch and a master branch. Whenever we want to work on a new feature, we create a new branch from develop, and when the feature work is complete, it is merged back into it. All feature branches are managed individually by the developers working on those features. Once code is committed to the develop branch, CI servers are responsible for making sure the branch always compiles, passes automated tests and is available on a server for QA testing and review. Once we're ready to release our work, a release is created from the develop branch and merged into the master branch, and the specific commit hash being released is tagged with a version number. Tagged releases can then be pushed to Staging/Beta or Production environments.

We are going to use the git-flow tool to help manage our git branches. To install git-flow, follow the instructions here. Once you have git-flow installed, you can configure your repository by running the git flow init command as shown below. Git-flow will ask a few questions, and we recommend going with the defaults. Once you execute the command, it will create a develop branch (if it didn't exist) and check it out as the working branch.

```shell
$ git flow init
Which branch should be used for bringing forth production releases?
   - master
Branch name for production releases: [master]
Branch name for "next release" development: [develop]

How to name your supporting branch prefixes?
Feature branches? [feature/]
Release branches? [release/]
Hotfix branches? [hotfix/]
Support branches? [support/]
Version tag prefix? []
```

Now, let's create a new feature using git-flow by typing git flow feature start [feature-name]. It's common practice to use the ticket/issue id as the name of the feature. For example, if you are using something like Jira and working on a ticket, the ticket id (e.g., MSP-123) can become the feature name. You'll notice that when you create a new feature with git-flow, it automatically switches to the feature branch.

```shell
$ git flow feature start MSP-123
Switched to a new branch 'feature/MSP-123'

Summary of actions:
- A new branch 'feature/MSP-123' was created, based on 'develop'
- You are now on branch 'feature/MSP-123'

Now, start committing on your feature. When done, use:

     git flow feature finish MSP-123
```

At this point you can do all the work needed for the feature, making as many commits as you need, and run your automated suite of tests to make sure that everything is in order. Once you are ready to ship your work, simply tell git-flow to finish the feature. For our purposes, we're just going to update the README file and finish off the feature by typing "git flow feature finish MSP-123".

```shell
Switched to branch 'develop'
Updating 403d507..7ae8ca4
Fast-forward
 README.md | 1 +
 1 file changed, 1 insertion(+)
Deleted branch feature/MSP-123 (was 7ae8ca4).

Summary of actions:
- The feature branch 'feature/MSP-123' was merged into 'develop'
- Feature branch 'feature/MSP-123' has been removed
- You are now on branch 'develop'
```

Note that git-flow merges the feature into 'develop', deletes the feature branch and takes you back to the develop branch. At this point you can push your develop branch to the remote repository (git push origin develop:develop). Once you commit to the develop branch, the CI server takes over and runs the continuous integration pipeline. Note that for a larger team, an alternative and more suitable model is to push feature branches to the remote before finishing them, getting them reviewed, and using pull requests to merge them into develop.

Creating CI pipeline with Jenkins

For this section we assume that you have a Jenkins cluster up and running; if not, you can read more about setting up a scalable Jenkins cluster in our earlier post. Once you have Jenkins running, you will need the following plugins and dependencies installed on your Jenkins server:

Once you have set up the requisite plugins, we can create the first three jobs in our build pipeline: compile, package and integration test. These will serve as the starting point of our continuous integration and deployment system.

Job 1: Build Go Auth Service

The first job in the sequence will check out the latest code from source control after each commit and ensure that it compiles. It will also run unit tests. To set up the first job for our sample project, select New Item > Freestyle Project. Select "This build is parameterized" to add a "Git Parameter" called GO_AUTH_VERSION as shown below. Next, configure the parameter to pick up any tags matching "v*" (e.g., v2.0) and default to develop (the branch) if no value is specified. This is quite useful for getting a list of version tags from Git and populating a selection menu for the job. If the job is triggered automatically and no value is specified, GO_AUTH_VERSION defaults to develop.

Next, in the Source Code Management section, add https://github.com/usmanismail/go-messenger.git as the repository URL, specify the branch as */develop, and set a poll interval, e.g., 5 minutes. With this, Jenkins will keep tracking our develop branch for changes and automatically trigger the first job in our CI (and CD) pipeline.

Now, in the Build section, select Add Build Step > Execute Shell and copy in the docker run command from earlier in the article. This will get the latest code from GitHub and build it into the go-auth executable.

Following the build step we need to add two post-build steps: Archive the Artifacts, to archive the go-auth binary built in this job, and Trigger parameterized builds, to kick off the next job in the pipeline as shown below. When adding the Trigger parameterized build action, make sure to add Current build parameters from Add Parameters; this makes all the parameters (e.g., GO_AUTH_VERSION) of the current job available to the next one. Note the name you use for the downstream job in the trigger parameterized build section, as we'll need it in the following step.

The log output from the build job should look something like the following. You can see that we use a dockerized container to run the build. The build uses go fmt to fix any formatting inconsistencies in our code and also runs our unit tests. If any tests fail, or if there are compilation failures, Jenkins will detect the failure. Furthermore, you should configure notifications via email or chat integrations (e.g. HipChat or Slack) to notify your team when the build fails so that it can be fixed quickly.

```shell
Started by an SCM change
Building in workspace /var/jenkins/jobs/build-go-auth/workspace
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/usmanismail/go-messenger.git # timeout=10
Fetching upstream changes from https://github.com/usmanismail/go-messenger.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress https://github.com/usmanismail/go-messenger.git +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/develop^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/develop^{commit} # timeout=10
Checking out Revision 89919f0b6cd089342b1c5b7429bca9bcda994131 (refs/remotes/origin/develop)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 89919f0b6cd089342b1c5b7429bca9bcda994131
 > git rev-list 7ae8ca4e8bed00cf57a2c1b63966e208773361b4 # timeout=10
[workspace] $ /bin/sh -xe /tmp/hudson1112600899558419690.sh
+ echo develop
develop
+ cd go-auth
+ docker run --rm -v /var/jenkins/jobs/build-go-auth/workspace/go-auth:/go/src/github.com/usmanismail/go-messenger/go-auth/ -e SOURCE_PATH=github.com/usmanismail/go-messenger/go-auth/ usman/go-builder:1.4
Downloading dependencies
Fix formatting
Running Tests
?   github.com/usmanismail/go-messenger/go-auth [no test files]
?   github.com/usmanismail/go-messenger/go-auth/app [no test files]
?   github.com/usmanismail/go-messenger/go-auth/database [no test files]
?   github.com/usmanismail/go-messenger/go-auth/logger [no test files]
ok  github.com/usmanismail/go-messenger/go-auth/user 0.328s
Building source
Build Successful
Archiving artifacts
Warning: you have no plugins providing access control for builds, so falling back to legacy behavior of permitting any downstream builds to be triggered
Triggering a new build of package-go-auth
Finished: SUCCESS
```

Job 2: Packaging Go Auth

Once we have compiled the code, we can package it into a Docker container. To create the package job, select New Item > Freestyle Project and give your second job a name matching what you specified in the previous job. As before, this job is also a parameterized build with a GO_AUTH_VERSION parameter. Note that for this and all subsequent jobs, GO_AUTH_VERSION is simply a string parameter with a default value of "develop"; the expectation is that its value will come from upstream.

As before, specify the GitHub project in the source code section and add an Execute Shell build step.

```shell
echo ${GO_AUTH_VERSION}
cd go-auth
chmod +x go-auth
chmod +x run-go-auth.sh
chmod +x integration-test.sh
docker build -t usman/go-auth:${GO_AUTH_VERSION} .
```

In order to build the Docker container we also need the executable built in the previous step, so we add a build step to copy artifacts from the upstream build. This makes sure the executable is available for the docker build command to package into the container. Note that we use the GO_AUTH_VERSION variable to tag the image we're building. By default, for changes in the develop branch, it will always build usman/go-auth:develop and overwrite the existing image. In the next article, we'll revisit this pipeline for releasing new versions of our application.

As before, use the Trigger parameterized builds post-build action (with Current build parameters) to trigger the next job in the pipeline, which will run integration tests using the Docker container we just built and the Docker Compose template detailed earlier in the article.

Job 3: Run integration tests

To run integration tests, create a new job. As with the package job, this job needs to be a parameterized build with the GO_AUTH_VERSION string variable, and it should copy artifacts from the build job. This time we will use the Docker Compose template above to bring up a multi-container test environment and run integration tests against our code. Integration tests (unlike unit tests) are typically kept entirely separate from the code being tested; to this end we use a shell script which runs HTTP queries against our test environment. In your Execute Shell command, change directory to go-auth and run integration-test.sh.

```shell
echo ${GO_AUTH_VERSION}
cd go-auth
chmod +x integration-test.sh
./integration-test.sh
```

The contents of the script are available here. We use docker-compose to bring up our environment and then use curl to send HTTP requests to the container we brought up. The logs for the job will be similar to the ones shown below. Compose will launch a database container and link it to the goauth container. Once the database is connected, you should see a series of "Pass: ..." lines as the various tests are run and verified. After the tests are run, the compose template cleans up after itself by deleting the database and go-auth containers.

```shell
Creating goauth_Database_1...
Creating goauth_Goauth_1...
04:02:52.122 app.go:34 NewApplication DEBUG Connecting to database db:3306
04:02:53.131 app.go:37 NewApplication DEBUG Unable to connec to to database: dial tcp 10.0.0.28:3306: connection refused. Retrying...
04:02:58.131 app.go:34 NewApplication DEBUG Connecting to database db:3306
04:02:58.132 app.go:37 NewApplication DEBUG Unable to connec to to database: dial tcp 10.0.0.28:3306: connection refused. Retrying...
04:03:03.132 app.go:34 NewApplication DEBUG Connecting to database db:3306
04:03:03.133 common.go:21 Connect DEBUG Connected to DB db:3306/messenger
04:03:03.159 user.go:29 Init DEBUG Created User Table
04:03:03.175 token.go:33 Init DEBUG Created Token Table
04:03:03.175 app.go:42 NewApplication DEBUG Connected to database
04:03:03.175 app.go:53 Run DEBUG Listening on port 9000
Using Service IP 10.0.0.29
Pass: Register User
Pass: Register User Conflict
Stopping goauth_Goauth_1...
Stopping goauth_Database_1...
Finished: SUCCESS
```

With the three jobs set up, you can create a new Build Pipeline view by selecting the + tab in the Jenkins view and choosing the build pipeline view. In the configuration screen that pops up, select your compile/build job as the initial job and select OK. You should now see your CI pipeline take shape; this gives a visual indication of how each commit progresses through your build and deployment pipeline.

When you make changes to the develop branch, you'll notice that the pipeline is triggered automatically by Jenkins. To trigger the pipeline manually, select your first (build) job and run it; Jenkins will ask you to select a value for the git parameter (GO_AUTH_VERSION), and not specifying one results in the default value, running the CI pipeline against the latest code in the develop branch. You can also just click 'Run' in the pipeline view; however, at the time of writing, there is an open bug in Jenkins which prevents it from starting the pipeline if the first job is a parameterized build. Let's quickly review what we've done so far. We created a CI pipeline for our application with the following steps:

- Use git-flow to add new features and merge them into develop
- Track changes on the develop branch and build our application in a containerized environment
- Package our application in a Docker container
- Spin up short-lived environment(s) using Docker Compose
- Run integration tests and tear down the environments

With the above CI pipeline, every time a new feature (or fix) is merged into the develop branch, all of the above steps are executed to create the "usman/go-auth:develop" Docker image. Further, as we build out a deeper pipeline in upcoming articles that integrates deployment, you will also be able to use this view to promote application versions to various deployment environments as they clear testing phases.

Conclusion

In this article we've seen how to leverage Docker to create a continuous integration pipeline for our project which is centrally managed, testable, and repeatable across machines and over time. We were able to isolate the environmental dependencies of the various components as needed. This forms the starting point of a longer Docker-based build and deployment pipeline which we'll continue to build out and document in this series of write-ups. The next step in our pipeline is continuous deployment: next week we will show how to use Rancher to deploy an entire server environment to run our code. We will also cover best practices for setting up a long-running testing environment and deployment pipeline for large-scale projects.

To get started with Rancher, please register for the Rancher beta. You can also download our free eBook "Continuous Integration and Delivery with Docker and Rancher", which covers all aspects of building a Docker-based development pipeline.