How to configure a production-grade CI/CD workflow for infrastructure code

Learn how to set up a secure and fully automated CI/CD pipeline for your Terraform and Terragrunt code.

At Gruntwork, our mission is to make it 10x easier to understand, build, and deploy software. One of the main ways we do this is by providing a battle-tested, production-ready Infrastructure as Code Library that you can build your infrastructure on top of. This library has helped hundreds of organizations launch their apps in production in days.

Once your app is shipped to production, though, the focus shifts from battle-tested infrastructure libraries to defining workflows around the infrastructure code. How do you collaborate on infrastructure code? How do you handle merge conflicts? How should the repository be organized? What would CI/CD look like? Which branches should you run plan from? How about apply? How do you avoid giving admin credentials to your CI server?

Up until now, we didn’t have a good solution for these questions. There are many products in this space, from generic CI/CD servers (e.g., Jenkins, CircleCI, and GitLab) to dedicated platforms for Terraform (e.g., Atlantis and Terraform Cloud / Enterprise), but none of them met our needs in terms of security (i.e., not having to trust a CI server or third party with admin-like credentials), support for a variety of infrastructure code (Terraform, Terragrunt, etc.), and support for approval-based workflows.

To address these concerns, we’ve created a Gruntwork Infrastructure Pipeline solution. Here’s a quick .gif that shows you what it does:

For an extended version with audio commentary, see https://youtu.be/iYXghJK7YdU

Here’s how it works:

You make some changes to your Terraform or Terragrunt code, commit them to a branch, and open a PR.

When you open a PR, a CI server (in the demo we show CircleCI but any generic CI solution works, like Jenkins or GitLab) runs terraform plan on the code changes.

Your team member reviews the code and the plan output, and when everything looks good, merges the PR.

On merge, the CI server will rerun the plan action from the master branch to ensure the latest plan view is captured. This is to handle multiple merges happening at the same time.

Once the plan succeeds, the CI server will notify an admin (e.g., via a Slack notification) that an approval is ready for review, and pause the deployment.

When the admin approves the build, the CI server will proceed to run terraform apply.
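To make the order of these events concrete, here is a minimal Python sketch of the workflow as a small state holder. It is an illustration only, not the actual pipeline code: the Pipeline class, its method names, and the run_command and notify_admin callables are hypothetical stand-ins for whatever your CI server and notification tooling provide.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Pipeline:
    # Hypothetical stand-ins for the real CI and notification integrations.
    run_command: Callable[[str, str], bool]  # (command, git_ref) -> success
    notify_admin: Callable[[str], None]      # e.g. posts a Slack message
    log: List[str] = field(default_factory=list)
    awaiting_approval: bool = False

    def on_pull_request(self, branch: str) -> bool:
        """Run plan against the PR branch so reviewers can read the output."""
        self.log.append(f"plan {branch}")
        return self.run_command("plan", branch)

    def on_merge(self) -> bool:
        """Re-run plan from master, then pause and ask an admin to approve."""
        self.log.append("plan master")
        if not self.run_command("plan", "master"):
            return False
        self.awaiting_approval = True
        self.notify_admin("Plan for master is ready for review")
        return True

    def on_approval(self) -> bool:
        """Run apply only if a successful master plan is waiting for approval."""
        if not self.awaiting_approval:
            return False
        self.awaiting_approval = False
        self.log.append("apply master")
        return self.run_command("apply", "master")


if __name__ == "__main__":
    messages: List[str] = []
    pipeline = Pipeline(run_command=lambda cmd, ref: True,
                        notify_admin=messages.append)
    pipeline.on_pull_request("feature/add-vpc")
    pipeline.on_merge()
    pipeline.on_approval()
    print(pipeline.log)
    # prints ['plan feature/add-vpc', 'plan master', 'apply master']
```

The point of the sketch is that apply is reachable only through a successful plan on master followed by an explicit approval. Approving when nothing is waiting does nothing.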

Note that we don’t run any deployment actions (plan or apply) directly on the CI server itself. You don’t have to share your AWS credentials for your environments with a cloud CI server to implement this workflow! Instead, we run deployments off of an isolated, locked-down deploy server in your own AWS account and expose a limited trigger interface to the CI server. Best of all, the deploy server is serverless. The deployments run from ECS Fargate tasks, so you don’t need to maintain or secure your own servers to run the service.
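As a rough illustration of what a limited trigger interface could mean in practice, the Python sketch below validates a trigger request before anything runs. Everything in it is an assumption made for the example: the TriggerRequest fields, the allowed command set, and the rule that apply may only run from master are not taken from the actual deploy server, whose interface is described in the guide.

```python
from dataclasses import dataclass

# Assumed policy for illustration only: the commands the CI server may request,
# and the git refs from which apply is permitted.
ALLOWED_COMMANDS = {"plan", "apply"}
APPLY_REFS = {"master"}


@dataclass(frozen=True)
class TriggerRequest:
    command: str  # e.g. "plan" or "apply"
    git_ref: str  # branch to deploy from
    path: str     # module directory inside the infrastructure repo


def validate_trigger(request: TriggerRequest) -> None:
    """Raise ValueError unless the request fits the narrow allowed interface."""
    if request.command not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowed: {request.command!r}")
    if request.command == "apply" and request.git_ref not in APPLY_REFS:
        raise ValueError(f"apply is not allowed from ref {request.git_ref!r}")
    # Reject absolute paths and parent-directory segments.
    if request.path.startswith("/") or ".." in request.path.split("/"):
        raise ValueError(f"path must stay inside the repository: {request.path!r}")


if __name__ == "__main__":
    validate_trigger(TriggerRequest("plan", "feature/add-vpc", "dev/vpc"))
    validate_trigger(TriggerRequest("apply", "master", "prod/vpc"))
    try:
        validate_trigger(TriggerRequest("destroy", "master", "prod/vpc"))
    except ValueError as err:
        print(f"rejected: {err}")
```

The design idea is that the CI server never holds AWS credentials for your environments. It can only ask for one of a few named actions, and the deploy server decides whether the request is acceptable.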

In addition to the modules for setting up the pipeline, we wrote up a new Production Deployment Guide (How to configure a production-grade CI/CD workflow for infrastructure code) that walks through how to use the modules to assemble the pipeline for your environment.

Read the guide to learn more about CI/CD, the production-grade design, and the code you can use to set this up for yourself. Let us know what you think!