If you've worked with Terraform on a large enough scale you'll agree that a single repository with a flat file hierarchy of code is difficult to manage, navigate and execute. It's also difficult to create multiple instances of infrastructure from a single code base, so you end up forking it, and the fragmentation/split brain begins.

But there is a better way.

The primary problem I want to address and solve here is really simple to grasp: I want a single directory of code that I can deploy repeatedly, knowing it's identical every time, without having to create multiple copies of it.

For me all implementations (prod, nonprod, staging, uat, etc) of a design should be identical and managed from a single piece of code. I believe the organisational method I outline below achieves this.

Split it up

I've found a good starting point for organising one's Terraform code is to keep the "design" separated from the "implementation". I also have a "management" tier to support each "implementation".

The directory structure might look something like this:

.
|-- README.md
|-- .gitlab-ci.yml
|-- management
|   |-- gitlab-runner.tf
|   |-- outputs.tf
|   |-- providers.tf
|   |-- security_groups.tf
|   |-- state.tf
|   |-- subnets.tf
|   |-- terraform.tfvars
|   |-- inputs.tf
|   `-- vpc.tf
|-- production.tf
`-- design
    |-- main.tf
    |-- terraform.tfvars
    `-- variables.tf

Let me break all of that down a bit because frankly what I just said is confusing me, and I wrote it...

The Management Tier

This is potentially an optional thing to have. I'll explain what it means anyway and let you decide if it's something you want to employ.

In most cases you're going to build out some infrastructure (as code) and then you'll (ideally) want to implement a CI/CD stack to manage that code. The CI/CD solution needs to be stood up first before the design can be deployed.

Depending on your goals or your design, you likely want two environments in the same VPC (but with clear security boundaries), so the management tier stands up that VPC (and those security boundaries) for you.
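As a rough sketch of what that could look like, the management tier might define one VPC, a set of subnets per environment, and outputs that expose the subnet IDs. Everything here is an assumption for illustration: the resource names, the CIDR ranges, the two-subnets-per-environment layout and the output names are not taken from a real code base.

```hcl
# management/vpc.tf and management/subnets.tf (hypothetical contents)

# One shared VPC for every environment. The CIDR range is an assumption.
resource "aws_vpc" "shared" {
  cidr_block = "10.0.0.0/16"
}

# Two /24 subnets for production: 10.0.0.0/24 and 10.0.1.0/24.
resource "aws_subnet" "production" {
  count      = 2
  vpc_id     = aws_vpc.shared.id
  cidr_block = cidrsubnet(aws_vpc.shared.cidr_block, 8, count.index)

  tags = {
    Environment = "production"
  }
}

# Two /24 subnets for staging: 10.0.10.0/24 and 10.0.11.0/24.
resource "aws_subnet" "staging" {
  count      = 2
  vpc_id     = aws_vpc.shared.id
  cidr_block = cidrsubnet(aws_vpc.shared.cidr_block, 8, count.index + 10)

  tags = {
    Environment = "staging"
  }
}

# management/outputs.tf (hypothetical contents)

output "production_subnet_ids" {
  description = "Subnets the production environment is allowed to use"
  value       = aws_subnet.production[*].id
}

output "staging_subnet_ids" {
  description = "Subnets the staging environment is allowed to use"
  value       = aws_subnet.staging[*].id
}
```

The security boundaries themselves (security groups, NACLs) would sit alongside this in security_groups.tf; they are left out to keep the sketch short.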

It's also going to stand up the S3 buckets and DynamoDB tables used for state storage and state locking. Of course, you might be using AzureRM or Terraform Cloud/Enterprise instead.
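A minimal sketch of that state infrastructure on AWS might look like the following. The bucket and table names are placeholders I've made up; the one detail that is not arbitrary is the LockID hash key, which the S3 backend expects on its lock table.

```hcl
# management/state.tf (hypothetical contents)

# Bucket that holds the Terraform state files.
# The name is a placeholder; S3 bucket names must be globally unique.
resource "aws_s3_bucket" "terraform_state" {
  bucket = "example-terraform-state"
}

# Versioning keeps earlier copies of the state in case one needs recovering.
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Table used for state locking. The table name is a placeholder.
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "example-terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```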

Essentially it's designed to stand up the "meta" infrastructure that manages and supports your infrastructure. It means the design doesn't have to manage anything to do with networking outside of firewall rules and simple routing decisions.

It's entirely optional, though. You might want your design to implement its own VPC because you want a VPC per environment. That's fine too, and potentially a better idea depending on requirements.

The Design

When you have something to stand up in, say, AWS, you start architecting the solution on paper (or in draw.io). This includes the finer details like the VPC, subnets, EC2 instances, EFS, S3 buckets, and more. It's the actual "guts" of a solution. That's a design.

That design should be represented as a module in Terraform. That module should not have a provider{} or a terraform{} configuration. It should just be a module with inputs and outputs. The entire thing: a module.
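To make that concrete, here is a deliberately tiny sketch of such a module: inputs, resources and outputs, with no provider{} block and no backend. The variable names (environment, subnet_ids, ami_id, instance_type) and the single EC2 resource are assumptions chosen for illustration; a real design would contain far more.

```hcl
# design/variables.tf (hypothetical contents)

variable "environment" {
  description = "Name of the environment, e.g. production or staging"
  type        = string
}

variable "subnet_ids" {
  description = "Subnets to place the instances in, e.g. from the management tier"
  type        = list(string)
}

variable "ami_id" {
  description = "AMI used for the application instances"
  type        = string
}

variable "instance_type" {
  description = "EC2 instance type for the application instances"
  type        = string
  default     = "t3.micro"
}

# design/main.tf (hypothetical contents)

# One application instance per subnet that was passed in.
resource "aws_instance" "app" {
  count         = length(var.subnet_ids)
  ami           = var.ami_id
  instance_type = var.instance_type
  subnet_id     = var.subnet_ids[count.index]

  tags = {
    Name        = "${var.environment}-app-${count.index}"
    Environment = var.environment
  }
}

# Outputs could live in their own file; they are shown here for brevity.
output "instance_ids" {
  description = "IDs of the application instances created by this design"
  value       = aws_instance.app[*].id
}
```

Because nothing in the module names an environment, account or region, whoever imports it decides where it lands.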

The module is then imported into an "implementation" that represents some environment. This module is fed the outputs from the management tier (should you choose to fully adopt this model), such as subnet IDs, and then it does its thing.

Where is the design module imported?

One Design, Many (Identical) Environments

The design module is imported at the environment level. This can be implemented in one of two ways.

The above output from tree shows a production.tf file. This imports the ./design module. When being imported, the module can be fed information from the management tier (or not) about where it is to stand up its resources. From here on, the idea behind using a module to represent a design becomes apparent.

Next you might want a staging environment: staging.tf. You create that file, import the same module, but provide different inputs, such as different subnet IDs, and you've got a guaranteed identical setup to production.tf but in the staging part of the VPC.
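Sketched out, the two root-level files could look something like this. It assumes the management tier's state is read through a terraform_remote_state data source and that it publishes outputs named production_subnet_ids and staging_subnet_ids. Those output names, the module inputs, the bucket, key, region and AMI ID are all placeholders of mine, not something the layout dictates.

```hcl
# production.tf (hypothetical contents)

# Read the management tier's outputs from its remote state.
# Bucket, key and region are placeholders.
data "terraform_remote_state" "management" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"
    key    = "management/terraform.tfstate"
    region = "eu-west-1"
  }
}

module "production" {
  source = "./design"

  environment = "production"
  subnet_ids  = data.terraform_remote_state.management.outputs.production_subnet_ids
  ami_id      = "ami-0123456789abcdef0" # placeholder AMI ID
}

# staging.tf (hypothetical contents)

# Same module, same code, different inputs.
module "staging" {
  source = "./design"

  environment = "staging"
  subnet_ids  = data.terraform_remote_state.management.outputs.staging_subnet_ids
  ami_id      = "ami-0123456789abcdef0" # placeholder AMI ID
}
```

The only differences between the two module blocks are the inputs, which is the whole point: the design itself is never copied.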

But the above method assumes you're happy to keep every environment's state in a single state file (regardless of where it's stored or managed). This might not work for you or your organisation's auditing and security models.

Personally I'm OK with this because each environment is built off of the same design (they're identical), so really there's nothing too individualistic about each environment except their placement within your remote environment.

Should it bother you, however, or not fit in with how you want to work, you can simply take advantage of the fact Terraform lets us reference modules using relative paths.

Instead of root level files, we can create an environments/ directory which contains subdirectories for each environment. Inside of these we can now provide unique terraform{} configurations per environment, allowing you to keep their state (and locking capabilities) separate and isolated.

All you have to do now is reference the module as ../../design instead of ./design. That's easy enough to manage.
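A sketch of one such environment directory follows, assuming the module lives in design/ as in the tree shown earlier and that state is kept in S3 with DynamoDB locking. The file path, bucket, table, key, region and variable names are placeholders I've chosen for illustration.

```hcl
# environments/production/main.tf (hypothetical path and contents)
#
# Assumed layout:
#   design/
#   environments/
#     production/main.tf
#     staging/main.tf

# Each environment gets its own backend, so its state and lock are isolated.
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"
    key            = "environments/production/terraform.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "example-terraform-locks"
    encrypt        = true
  }
}

provider "aws" {
  region = "eu-west-1"
}

# Inputs for this environment, supplied via terraform.tfvars or the CI/CD job.
variable "subnet_ids" {
  description = "Subnets reserved for production"
  type        = list(string)
}

variable "ami_id" {
  description = "AMI used for the application instances"
  type        = string
}

# The same design as every other environment, referenced by relative path.
module "production" {
  source = "../../design"

  environment = "production"
  subnet_ids  = var.subnet_ids
  ami_id      = var.ami_id
}
```

A staging directory would be a copy of this file with a different backend key and different inputs. You'd then run terraform init and terraform apply from inside each environment's directory.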

Summary

The idea is to keep the design representative of the architecture you want to implement, minus a few details such as high level networking and security boundaries. This enables you to create multiple instances of your workload knowing they're perfectly identical every time.

What are your thoughts? How do you manage your code base?