Recently, one of our clients was experiencing a lot of extra traffic, and we took the time to set them up with AWS autoscaling groups. Now, when there’s more traffic and load than usual, extra servers are automatically added to the pool to handle the increased requests. Pretty standard stuff, and it makes life a lot easier.

However, while autoscaling groups are great, they present some new issues, the biggest of which is deploying. Our previous deploy process was pretty simple: we had an Ansible script which connected to each server, updated the code, and restarted the app. We ran into a few issues right off the bat with this process:

1. Servers and IP addresses can change at any time.
2. When new servers are spun up, they use whatever is in the base image, which may not be the latest code; it’s just whatever the latest code was when you built the base AMI.

Solving #1 isn’t too tough: Ansible can build a dynamic inventory, pulling the current list of servers straight from EC2. It makes the deploy process a bit slower, but not enough to be a big deal.
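These days, one way to get a dynamic EC2 inventory is Ansible’s aws_ec2 inventory plugin; a minimal sketch, where the region and tag values are assumptions you’d adjust for your own setup:

```yaml
# aws_ec2.yml: dynamic inventory sketch for the amazon.aws.aws_ec2 plugin.
# The region and tag values are assumptions; adjust for your setup.
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1
filters:
  tag:Role: app        # only pull instances tagged as app servers
keyed_groups:
  - key: tags.Role     # build groups like role_app, role_job
    prefix: role
```

You’d then point your playbook at it with something like `ansible-playbook -i aws_ec2.yml deploy.yml`, and the host list is fetched fresh on every run.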

The second problem is trickier. If you’re using a typical deploy process, your live servers are constantly being deployed to, but the base AMI hasn’t been updated for a long time. The problem is, when the auto scaling group spins up a new server, it has that old code, which could be months out of date. This means your users could start seeing all sorts of weird behavior, unless you make sure no server is added without first pulling down the latest code. There are two main ways to do this:

1. Build a new AMI every time you deploy.
2. Add startup hooks that automatically pull down the latest code when a new box is added.

Method #1 has some obvious advantages: if your deploy process creates a new base image every time you deploy, you can be sure the base image is always up to date. That means whenever a new server is added, it’s ready to go without any changes, resulting in the fastest possible spin-up. And since no new code is pulled down or run when an instance spins up, it’s very unlikely you’ll hit any errors during the launch process.

The problem with #1 is the speed of deployment. Here’s what the deploy process for the baked AMI method would look like:

1. Spin up a new EC2 instance for building the AMI.
2. Deploy the latest code to the instance.
3. Build an AMI.
4. Create a new launch configuration based on the AMI.
5. Switch your auto scaling group to the new launch configuration.
6. Phase out old servers.
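Roughly, steps 3-6 map onto AWS CLI commands like the following sketch. The instance ID, AMI ID, instance type, and group name are all placeholders, and by default the script only prints the commands instead of running them:

```shell
#!/bin/bash
set -e
# Sketch of steps 3-6 with real AWS CLI commands; all IDs and names are
# placeholders. DRY_RUN=1 (the default here) prints each command
# instead of running it.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

BUILDER="i-0123456789abcdef0"            # builder instance from steps 1-2
AMI_NAME="app-base-$(date +%Y%m%d%H%M%S)"

# 3. Build an AMI from the builder instance. In a real script you'd
#    capture the ImageId this returns and use it below.
run aws ec2 create-image --instance-id "$BUILDER" --name "$AMI_NAME"

# 4. Create a new launch configuration based on the AMI
run aws autoscaling create-launch-configuration \
    --launch-configuration-name "lc-$AMI_NAME" \
    --image-id ami-00000000 --instance-type t2.small

# 5. Point the auto scaling group at the new launch configuration
run aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name app-asg \
    --launch-configuration-name "lc-$AMI_NAME"

# 6. Phase out old servers, e.g. by terminating them gradually so the
#    group replaces them with instances built from the new AMI.
```

Even glossing over AMI creation time (which alone can take several minutes), it’s easy to see why this path is slow.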

Best case, this process will take 5-10 minutes, and it will likely be longer. That’s a long time to wait for every deploy. That’s why, unless you have good reasons not to, I really like option #2.

With option #2, startup hooks, you build a base system AMI once, and then write deploy scripts to run whenever new boxes spin up. If you were doing this all by hand, it would be a little painful to set up, but thankfully, Amazon has a service pre-made to do exactly this: CodeDeploy.

With CodeDeploy, Amazon takes care of everything related to deploying to an auto scaling group. It does the stuff you’d expect: running a deploy will update all the existing servers. The best feature, though, is that it automatically sets up all of the startup hooks you need, without any extra work from you: when a new server gets spun up, it’s updated to the latest revision before being added to your auto scaling group. It’s awesome.
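The deploy itself is driven by an appspec.yml file in the root of your repository, which tells CodeDeploy where to put the files and which scripts to run at each lifecycle hook. A minimal sketch; the destination path and script names here are placeholders:

```yaml
# appspec.yml: minimal CodeDeploy sketch; paths and script names are
# placeholders for whatever your app actually needs.
version: 0.0
os: linux
files:
  - source: /
    destination: /opt/app-codedeploy
hooks:
  AfterInstall:
    - location: scripts/after_install.sh   # e.g. install deps, swap symlink
      timeout: 300
      runas: root
  ApplicationStart:
    - location: scripts/start_server.sh    # restart the app
      timeout: 300
      runas: root
```

The same hooks run on a regular deploy and when a fresh instance joins the group, which is exactly why new servers come up with current code.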

After having gone through the process of setting up CodeDeploy for the first time, I learned a bunch of little things that I wanted to share:

Lessons Learned

1. Your base AMI should be as complete as possible

This is really important. You want CodeDeploy to do as little work as possible during the actual deploy, so it’s as fast as possible. Build your base AMI with everything your app needs to run, so the deploy script just has to grab the latest code, maybe run a few basic setup scripts, and restart the app. There’s no need to be installing system packages or doing anything like that at deploy time.

On a related note, I found it very helpful to have separate AMIs for different types of servers. For instance, if you have app servers and job servers, they may be running the same code, but they might have completely different types of processes running. Your app servers will have the actual web server running, but job servers will likely just have job processes like Resque or Sidekiq. Build separate AMIs for each and have them in separate auto scaling groups, so they’re already preconfigured and ready to go when you install the latest code.

2. Use set -e in your bash scripts

When setting up CodeDeploy, you’ll likely be writing some bash scripts to set up your code. Make sure to use set -e in them! By default, bash scripts don’t stop when a single command fails, which means that if one of several commands in your script fails, you might never know. Using set -e ensures that the script exits immediately if any individual command fails. Just add it at the top of your script:

#!/bin/bash

set -e

And you’re set to go.
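Here’s a small demonstration of the difference, running the same three commands in a subshell with and without set -e:

```shell
# With set -e, the script stops at the first failing command.
with_e=$(bash -c 'set -e
echo step1
false          # simulate a failed command, e.g. a failed git pull
echo step2' || true)

# Without set -e, the failure is silently ignored and the script keeps going.
without_e=$(bash -c 'echo step1
false
echo step2')

echo "with set -e:    $with_e"
echo "without set -e: $without_e"
```

In the second case “step2” runs anyway, which in a deploy script might mean restarting the app even though the code never updated.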

3. Write a wrapper script for your deployments

CodeDeploy does have a web interface for running deployments, but it’s not that great. The best way to run deploys is with the AWS Command Line Interface, which has a super straightforward create-deployment command. I highly recommend writing a nice wrapper script around this and committing it to your repository. That way, someone can download the repo and run the script without having to know all the specifics of your actual deployment.

For my latest project, I ended up creating a simple Ruby script which I could run just by doing the following:

./deploy.rb production origin/master

It then calls CodeDeploy and starts the deploy—super simple. I recommend writing this in whatever makes sense for your development team: whether it’s bash, Ruby, or something else, you just need a script that’s easy for your developers to use.
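A bash version of such a wrapper might look like the sketch below. The application name and GitHub repository are placeholders, and the actual create-deployment call is left commented out:

```shell
#!/bin/bash
set -e
# deploy.sh: a hypothetical wrapper around the real `aws deploy
# create-deployment` command. "my-app" and "myorg/myrepo" are placeholders.

GROUP="${1:-production}"    # deployment group to target
REF="${2:-master}"          # git ref to deploy
# CodeDeploy wants a full commit id, so resolve the ref if we're in a
# repo; otherwise just pass it through.
SHA=$(git rev-parse "$REF" 2>/dev/null) || SHA="$REF"

CMD=(aws deploy create-deployment
  --application-name my-app
  --deployment-group-name "$GROUP"
  --github-location "repository=myorg/myrepo,commitId=$SHA")

echo "Deploying $SHA to $GROUP:"
echo "  ${CMD[*]}"
# Uncomment to actually kick off the deployment:
# "${CMD[@]}"
```

Anyone on the team can then run `./deploy.sh production master` without knowing the application or group names by heart.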

4. Don’t use separate deployment groups for the same server

When you set up a deployment configuration in CodeDeploy, you set up separate deployment groups. For the client project I was setting up, we have both app servers and job servers, and I thought I’d be clever by making it easy to deploy to just app servers, just job servers, or both, so I created a few separate groups: “Production-ALL”, “Production-JOB”, and “Production-APP”. However, that presented some problems.

When CodeDeploy spins up a new server, it checks all of the deployment groups the server belongs to and then runs a deployment for each of them. In this case, if a job server spun up, it would check both “Production-ALL” and “Production-JOB” and start the deploy process for both, which can lead to some really strange results, because the two deployments will likely be fighting with each other. While there are definite cases where you might want to do something like this, I highly recommend keeping it simple at first and having only one deployment group per server.

5. Consider using symlinks

One thing I noticed about CodeDeploy is that it’s not a big fan of overwriting existing files. When I was first setting things up, my base AMI had the application code in /opt/app, which is the same place I was having CodeDeploy deploy to. As soon as I ran the deployment, CodeDeploy complained, because it didn’t want to overwrite files it didn’t know about.

To fix this, I decided to use symlinks instead. Now, my base AMI has the app installed to /opt/app-base, and CodeDeploy deploys to /opt/app-codedeploy. In the after-deploy step, I run a quick ln -s call to point the symlink at the CodeDeploy directory. Nice and simple, and everyone’s happy.
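The swap itself is a one-liner. Here’s a sketch of what that hook does, using a scratch directory so it’s safe to run anywhere; in the real hook you’d operate on /opt directly:

```shell
#!/bin/bash
set -e
# Demonstrate the symlink swap in a throwaway directory that mirrors
# the /opt layout described above.
base=$(mktemp -d)
mkdir -p "$base/opt/app-codedeploy"
echo "new code" > "$base/opt/app-codedeploy/VERSION"

# -s: symbolic link, -f: replace an existing link, -n: treat an existing
# link to a directory as a file instead of descending into it
ln -sfn "$base/opt/app-codedeploy" "$base/opt/app"

cat "$base/opt/app/VERSION"    # the app path now serves the new release
```

The -n flag matters on repeat deploys: without it, ln would happily create the new link *inside* the directory the old link points to.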

6. Use CodeDeploy’s environment variables