IaC Patterns That Actually Work

infrastructure terraform iac devops

Opinionated Infrastructure as Code patterns from running Terraform at a fintech startup. Repo layout, modules, state management, and the stuff that burns you if you ignore it.

Most IaC advice reads like a textbook nobody asked for. Here’s what I’ve actually learned running Terraform at the fintech startup, where our entire AWS stack is codified and any manual console change gets treated like a bug.

Declarative or go home

Imperative scripts describe steps. Declarative config describes outcomes. Huge difference. Terraform handles ordering, diffs, idempotency – you just tell it what you want.

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  tags = {
    Name = "web"
  }
}
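
The ordering part is easiest to see once one resource references another. A minimal sketch, assuming the default VPC and a made-up HTTPS rule (the security group, its port, and the `ami_id` variable declaration are illustrative, not our real config):

```hcl
# Hypothetical example: names, ports, and CIDRs are illustrative only.
variable "ami_id" {
  type        = string
  description = "AMI to launch, supplied by the caller."
}

# No vpc_id set, so this lands in the region's default VPC (an assumption
# made to keep the sketch short).
resource "aws_security_group" "web" {
  name_prefix = "web-"

  ingress {
    description = "HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # Referencing the group's id is what creates the dependency edge:
  # Terraform creates the security group first, with no explicit depends_on.
  vpc_security_group_ids = [aws_security_group.web.id]
}
```

Nothing in the file says "create the group, then the instance." The reference is the ordering.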

At the fintech startup we stopped writing bash provisioning scripts the week we adopted Terraform. Never looked back.

Immutable over mutable. Always.

Replace, don’t patch. When config changes, tear down and rebuild. Sounds wasteful until you realize in-place mutations are where drift lives. Rollback becomes “redeploy the previous version” instead of “figure out which of 47 manual tweaks broke something.”
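
Here’s roughly what "replace, don’t patch" can look like for a single instance. This is a sketch under assumptions: the `bootstrap.sh` path is hypothetical, and a real setup would more likely sit behind an autoscaling group or load balancer so the cutover carries traffic.

```hcl
variable "ami_id" {
  type        = string
  description = "AMI to launch. Changing it forces a new instance."
}

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # Hypothetical bootstrap script living next to this file.
  user_data = file("${path.module}/bootstrap.sh")

  # Treat a user_data change as a reason to rebuild rather than
  # leaving the running instance on a stale bootstrap.
  user_data_replace_on_change = true

  lifecycle {
    # Bring up the replacement before destroying the old instance.
    create_before_destroy = true
  }
}
```

Rolling back is then a matter of reverting the AMI or script in git and applying again.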

If it’s not in git, it doesn’t exist

I mean this literally. We reject any infrastructure change that isn’t a PR. No exceptions. The diff is the documentation.

Repo layout

We use a monorepo. One repo, all infra. Refactors are easier, search is easier, onboarding is easier. The risk is blast radius – but that’s what small state files and good reviews are for.

Structure by concern, not by team:

terraform/
  global/
  network/
  data/
  services/
    api/
    web/

Each directory maps to its own state file. Smaller state means faster plans and smaller explosions when someone makes a mistake.
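
Splitting state means stacks have to read each other’s outputs somehow. One common way is the `terraform_remote_state` data source. A sketch, assuming the network stack stores its state under a `network/terraform.tfstate` key and exposes an output named `private_subnet_id` (both names are hypothetical):

```hcl
# services/api/main.tf (hypothetical file)
variable "ami_id" {
  type = string
}

# Read-only view of the network stack's outputs. Bucket, key, and region
# are assumptions and must match that stack's backend config.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "api" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # Only values the network stack declares as outputs are visible here.
  subnet_id = data.terraform_remote_state.network.outputs.private_subnet_id
}
```

The outputs become the contract between stacks, which is another reason to keep them stable.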

Modules that don’t suck

A few rules I enforce:

One job per module. A network module. A database module. Not an everything-module with 30 variables and a prayer.

Explicit inputs, focused outputs. If your module takes 15 required variables, it’s doing too much. Outputs should be stable values other modules can depend on without breaking.

Sensible defaults. Most consumers should be able to use your module with minimal config. Overrides available, not required.

Compose, don’t inherit. Build bigger systems by wiring small modules together. Adding flags to one giant module is how you get a 2000-line main.tf that nobody wants to touch.

Pin your versions. Unpinned modules are a time bomb. We learned this the hard way when an upstream module update silently changed our security group rules. Pin everything, upgrade deliberately.
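
To make those rules concrete, here’s a sketch of a small network module and a pinned call to it. Everything named here is hypothetical: the module repo URL, the `v1.4.0` tag, the variable names, and the version constraints are placeholders, not what we actually run.

```hcl
# --- module repo: network/variables.tf (hypothetical) ---
variable "name" {
  type        = string
  description = "Prefix for resource names. The only required input."
}

variable "cidr_block" {
  type        = string
  description = "VPC CIDR range."
  default     = "10.0.0.0/16" # sensible default, override if needed

  validation {
    condition     = can(cidrhost(var.cidr_block, 0))
    error_message = "The cidr_block value must be a valid CIDR range, such as 10.0.0.0/16."
  }
}

# --- module repo: network/main.tf ---
resource "aws_vpc" "this" {
  cidr_block = var.cidr_block
  tags       = { Name = var.name }
}

# --- module repo: network/outputs.tf ---
output "vpc_id" {
  description = "VPC ID, a stable value other modules can depend on."
  value       = aws_vpc.this.id
}

# --- caller: versions.tf ---
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # the exact version is recorded in .terraform.lock.hcl
    }
  }
}

# --- caller: main.tf ---
module "network" {
  # Pinned to a tag, so upstream changes only arrive when the ref is bumped.
  source = "git::https://github.com/example-org/terraform-modules.git//network?ref=v1.4.0"
  name   = "core"
}
```

One required input, one default with validation, one output, one pinned ref. Upgrading is a one-line PR you can actually review.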

Environments

One directory per environment. Shared modules underneath. Each environment gets its own state.

# environments/production/main.tf
module "app" {
  source         = "../../modules/app"
  environment    = "production"
  instance_count = 3
}

Environments differ by variables, not by code. Promotion goes dev, then staging, then production: same modules, different input values (module arguments or tfvars). Simple.

Workspaces? Fine for side projects. For a real team, explicit directories win. Workspaces share one configuration and one backend, and the active workspace is just CLI state, so differences stay hidden and it’s too easy to apply to the wrong environment.
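
If you’d rather keep the values out of main.tf, a per-directory tfvars file does the job. A sketch, assuming the `app` module accepts `environment` and `instance_count` as in the example above; the validation list is an assumption about which environments exist:

```hcl
# environments/production/variables.tf (hypothetical file)
variable "environment" {
  type = string

  validation {
    # Catches a typo'd environment name at plan time.
    condition     = contains(["dev", "staging", "production"], var.environment)
    error_message = "The environment value must be dev, staging, or production."
  }
}

variable "instance_count" {
  type    = number
  default = 1
}

# environments/production/main.tf
module "app" {
  source         = "../../modules/app"
  environment    = var.environment
  instance_count = var.instance_count
}

# environments/production/terraform.tfvars would then hold only values:
#   environment    = "production"
#   instance_count = 3
```

Terraform loads `terraform.tfvars` from the working directory automatically, so main.tf can be identical across environments and the diff between them is a handful of values.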

CI pipeline

Treat infra code like app code. Our pipeline:

terraform fmt -check
terraform init -input=false
terraform validate
terraform plan -no-color -input=false -out=tfplan

Plan output goes into the PR as a comment. Apply is gated behind manual approval for production. Fast automation, human control where it counts.

Drift detection

People will click things in the console. It happens. We run scheduled plans and alert on any diff. The goal isn’t punishment – it’s getting manual fixes back into code before they become invisible tech debt.
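
A scheduled job can run `terraform plan -detailed-exitcode`, which exits with code 2 when the plan contains changes. The catch is noise: some attributes legitimately change outside Terraform. One way to handle that is to tell Terraform which ones to ignore. A sketch, assuming an autoscaling group whose desired capacity is moved by scaling policies (the variables and sizes are made up):

```hcl
variable "subnet_ids" {
  type = list(string)
}

variable "launch_template_id" {
  type = string
}

resource "aws_autoscaling_group" "web" {
  name_prefix         = "web-"
  min_size            = 2
  max_size            = 6
  desired_capacity    = 2
  vpc_zone_identifier = var.subnet_ids

  launch_template {
    id      = var.launch_template_id
    version = "$Latest"
  }

  lifecycle {
    # Scaling policies change this at runtime. Ignoring it keeps the
    # scheduled plan from alerting on expected movement.
    ignore_changes = [desired_capacity]
  }
}
```

Keep the ignore list short. Every entry is a place where real drift can hide.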

Secrets and state

Three things I’m strict about:

Secrets never go in .tf files. Use AWS Secrets Manager, Vault, injected env vars. Whatever. Just not in your repo.
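
For example, a database password can be looked up at plan time instead of being written into the config. A minimal sketch, assuming a secret named `prod/db/password` already exists in Secrets Manager (the name and the database settings are hypothetical):

```hcl
# The secret is created and rotated outside this config.
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/db/password" # hypothetical secret name
}

resource "aws_db_instance" "main" {
  identifier        = "main"
  engine            = "postgres"
  instance_class    = "db.t3.micro"
  allocated_storage = 20
  username          = "app"

  # Nothing sensitive is committed to git, but the resolved value is
  # still written to the state file.
  password = data.aws_secretsmanager_secret_version.db_password.secret_string

  # Sketch only. You would normally want a final snapshot in production.
  skip_final_snapshot = true
}
```

This keeps the secret out of the repo, not out of state.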

State files are sensitive. They store resource attributes in plaintext, including things you’d rather keep private, like database passwords. Remote backend, encryption on, locking on.

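terraform {
  backend "s3" {
    bucket         = "terraform-state"        # S3 bucket names are globally unique; use your own
    key            = "prod/terraform.tfstate" # one key per state file
    region         = "us-east-1"
    encrypt        = true                     # server-side encryption for the state object
    dynamodb_table = "terraform-locks"        # state locking; Terraform 1.10+ also supports use_lockfile
  }
}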
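
The bucket and lock table behind that backend have to come from somewhere, usually a small bootstrap stack. A sketch of what that might contain, reusing the names from the backend block. Versioning and the public access block are my additions here, not something the backend requires:

```hcl
resource "aws_s3_bucket" "state" {
  bucket = "terraform-state"
}

# Versioning lets you recover an earlier state file after a bad write.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Default encryption at rest for every object in the bucket.
resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

resource "aws_s3_bucket_public_access_block" "state" {
  bucket                  = aws_s3_bucket.state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# The S3 backend's DynamoDB locking expects a string partition key
# named LockID.
resource "aws_dynamodb_table" "locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```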

Least privilege for CI. Your automation role shouldn’t be able to delete your account. Scope it tight to the resources it manages. Broad permissions hide mistakes until they become outages.
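
As a starting point, here’s a sketch of a policy that lets a CI role touch exactly one state file and the lock table, and nothing else. The bucket, key, and table names are taken from the backend example, and the policy name is hypothetical. Permissions for the resources the stack actually manages would be added with the same narrow scoping.

```hcl
data "aws_caller_identity" "current" {}

data "aws_iam_policy_document" "ci_state_access" {
  statement {
    sid       = "ListStateBucket"
    actions   = ["s3:ListBucket"]
    resources = ["arn:aws:s3:::terraform-state"]
  }

  statement {
    sid     = "ReadWriteProdState"
    actions = ["s3:GetObject", "s3:PutObject"]

    # One state key only, not the whole bucket.
    resources = ["arn:aws:s3:::terraform-state/prod/terraform.tfstate"]
  }

  statement {
    sid     = "StateLocking"
    actions = ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:DeleteItem"]

    resources = [
      "arn:aws:dynamodb:us-east-1:${data.aws_caller_identity.current.account_id}:table/terraform-locks",
    ]
  }
}

resource "aws_iam_policy" "ci_state_access" {
  name   = "ci-terraform-state-prod" # hypothetical policy name
  policy = data.aws_iam_policy_document.ci_state_access.json
}
```

A role with this policy can’t read another environment’s state, let alone delete it.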

The boring part is the point

Good IaC is boring. Small diffs, predictable plans, easy reviews, easy rollbacks. That’s the goal. At the fintech startup this approach has kept our infra stable through multiple scaling phases, and I’d use the same patterns anywhere else.