Readit News
xyzzy123 · 2 years ago
Here's my #1 tip, most important:

Try to keep your stateful resources / services in different "stacks" than your stateful things.

Absolutely 100% completely obvious, maybe too obvious? Because none of these guides ever mention it.

If you have state in a stack it becomes 10x more expensive and difficult to "replace your way out of trouble" (aka destroy and recreate as a last resort). You want as much as possible in stateless, disposable stacks. DON'T put that customer-exposed bucket or DB in the same state/stack as app server resources.

I don't care about your folder structure, I care about what % of the infra I can reliably burn to the ground and replace using pipelines and no manual actions.

coryrc · 2 years ago
You mean keep stateless separate from stateful?

Everyone else seems to be reading over the typo or I'm more confused than I thought.

wahnfrieden · 2 years ago
Yes
sverhagen · 2 years ago
Is a "stack" here a (root) folder on which you'd do a "terraform apply"? I've never know what to call those, surely they aren't "modules".

And, so, you're saying: try to have a separate deployment (stack then?) that contains the state, so you can wipe away everything else if you want to, without having to manage the state?

xyzzy123 · 2 years ago
It's not exactly about the folder, the IaC from a single folder / project can be instantiated in multiple places. Each time you do that, it has a unique state file, so I usually hear it referred to as a "state". In cfn you can similarly deploy the same thing lots of times and each instantiation is called a "stack", so stack/state tend to get used interchangeably.

And yes, that's a succinct rephrasing.

When you first use IaC it maybe seems logical to put your db and app server in the same "thing" (stack or state file), but now that thing is "pet-like" and you have to take care of it forever. You can't safely have a "destroy" action in your pipeline as a last resort.

If you put the stateful stuff in a separate stack you can freely modify the things in the stateless one with much less worry.
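A minimal sketch of that split (all names hypothetical): the stateful stack is the only one without a destroy action in its pipeline, and it publishes identifiers as outputs for the disposable stacks to consume.

```hcl
# stateful/main.tf -- the long-lived "pet" stack; no destroy in its pipeline
resource "aws_db_instance" "main" {
  identifier          = "app-db"
  engine              = "postgres"
  instance_class      = "db.t3.medium"
  allocated_storage   = 20
  deletion_protection = true # extra guard for the one stack you can't replace
  # (credentials and other required arguments omitted for brevity)
}

output "db_address" {
  value = aws_db_instance.main.address
}
```

The stateless app-server stack then reads `db_address` (via `terraform_remote_state` or a data source) and can be burned to the ground and recreated at will.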

raffraffraff · 2 years ago
I have the same issue. They're all modules, but the ones at the tip of the directory tree (right at the end of your env/region/stack) are called root modules. Which makes no sense because the term "root" always implies that they are at the beginning, not the tippy-toe end. So I call mine "stacks". But as another answer suggested, "states" is also fine. Even though the actual state isn't inside that directory, it's probably in an object store.

At the end of the day I don't care what other people call them.

GauntletWizard · 2 years ago
I have adopted the term "Root Module" vs "Submodule" because those line up with terraform's own definitions, but I agree that they're terribly, terribly named.
thwway23432 · 2 years ago
The "stack" nomenclature used here is jarring since it is unrepresented in Terraform HCL literature.

A CDK stack (assuming that's what's being used here) would be loosely equivalent to a Terraform HCL module.

robertlagrant · 2 years ago
Makes sense, but how do you connect the two so e.g. credentials from one are surfaced in the other?
dharmab · 2 years ago
Use Data Sources to reference resources in a different state: https://developer.hashicorp.com/terraform/language/data-sour...
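A sketch of that pattern, with hypothetical bucket/key names — the stateless stack reads outputs published by the stateful one:

```hcl
# In the stateless stack: read outputs published by the stateful stack.
data "terraform_remote_state" "stateful" {
  backend = "s3"
  config = {
    bucket = "my-tf-states"
    key    = "stateful/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0"
  instance_type = "t3.micro"
  # consume the stateful stack's output without managing its resources
  user_data = "DB_HOST=${data.terraform_remote_state.stateful.outputs.db_address}"
}
```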
no_circuit · 2 years ago
IMO you shouldn't be storing credentials in shared state, as suggested by the other comments, since that means that the principals able to read the state to deploy their service can also read the credentials for other services bundled in that state file. This could be the case if one had broken down the root modules into scopes/services like the linked page suggests.

It is reasonable to assume that if you are using Terraform to manage your infra, then your infra likely has access to a secrets manager from your infra vendor, e.g., AWS. Instead I'd recommend using a Terraform data resource to pull a credential from the secrets manager by name -- and the name doesn't even necessarily have to be communicated through Terraform state. Then the credentials can be fed directly into where they're needed, e.g., a resource like a Kubernetes Secret. One can even skip this whole thing if the service can use the secrets manager API itself. Finally, access to the credentials itself would be locked down with IAM/RBAC.
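A sketch of that approach (secret name hypothetical). One caveat: the fetched value still lands in the *consuming* stack's state file, so that state needs locking down too.

```hcl
# Pull the credential at plan/apply time; only the secret *name* needs
# to be known, not the value passed through state outputs.
data "aws_secretsmanager_secret_version" "db" {
  secret_id = "prod/app/db-password" # hypothetical name
}

resource "kubernetes_secret" "db" {
  metadata {
    name = "db-credentials"
  }
  data = {
    # the provider base64-encodes plain values for you
    password = data.aws_secretsmanager_secret_version.db.secret_string
  }
}
```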

paulddraper · 2 years ago
terraform_remote_state

The root module can have outputs just like any other module. These outputs can be accessed from other stacks from the backend.

And if you use CDKTF the references are handled transparently.

harha_ · 2 years ago
I've never used terraform, but I have used CloudFormation and AWS CDK. It's been a while though, is there a clear indication on the major cloud provider docs which resources are stateful? Or is it always obvious?
tyingq · 2 years ago
Difficult question, as people mean different things when they say state. One example might be a relatively simple AWS Lambda. Most people would say that's easily stateless.

But, what if that Lambda depends on a VPC with some specific networking config to allow it to connect to some partner company private network? And, it's difficult to recreate that VPC without service disruption for a variety of reasons that are out of your control. Well, now you have state because you need to track which existing VPC the Lambda needs if you tear the Lambda down and recreate it.
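One way to handle that situation is to treat the VPC as an input rather than a managed resource — look it up by tag with data sources (names hypothetical; the IAM role and security group are assumed to be defined elsewhere in the stack):

```hcl
# Look up the long-lived VPC instead of managing it here,
# so the Lambda stack itself stays disposable.
data "aws_vpc" "partner" {
  tags = {
    Name = "partner-interconnect" # hypothetical tag
  }
}

data "aws_subnets" "partner" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.partner.id]
  }
}

resource "aws_lambda_function" "sync" {
  function_name = "partner-sync"
  role          = aws_iam_role.lambda.arn # defined elsewhere
  runtime       = "python3.12"
  handler       = "main.handler"
  filename      = "lambda.zip"

  vpc_config {
    subnet_ids         = data.aws_subnets.partner.ids
    security_group_ids = [aws_security_group.lambda.id] # defined elsewhere
  }
}
```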

hansoolo · 2 years ago
Just stopped here because I had said XYZZY way too often in the last three hours xD
time0ut · 2 years ago
I’ve been using Terragrunt [0] for the past three years to manage loosely coupled stacks of Terraform configurations. It allows you to compose separate configurations almost as easily as you compose modules within a configuration. It's got its own learning curve, but it's a solid tool to have in the toolbox.

Gruntwork is a really cool company that makes other tools in this space like Terratest [1]. Every module I write comes with Terratest-powered integration tests. Nothing more satisfying than pushing a change, watching the pipeline run the tests, and then automatically releasing a new version that I know works (or at least that what I tested works).

[0] https://terragrunt.gruntwork.io/

[1] https://terratest.gruntwork.io/

mike_d · 2 years ago
They seem very insistent on keeping things DRY but not explaining why. Does Terraform tend to cause water leaks?
raffraffraff · 2 years ago
Terraform is supposed to let you write modular, reusable code. But because it's a limited DSL that lacks many "proper language" features (and occasionally breaks the rule of least surprise), there are several major impediments to fully data-driven Terraform. These ultimately result in copy/paste code, or tools like Terragrunt which essentially wrap Terraform and perform the copy/pasta behind your back by generating that code for you.

Some minor examples:

- calling a module multiple times using `for_each` to iterate over data works, except if the module contains a "provider" block

- if you are deploying two sets of resources by iterating over data, terraform can detect dependency cycles where there are not any
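The first limitation looks like this in practice (module path and provider alias hypothetical):

```hcl
# Works: iterating a module over data...
module "buckets" {
  source   = "./modules/bucket"
  for_each = toset(["logs", "assets", "backups"])
  name     = each.key
}

# ...but only if ./modules/bucket contains no provider block.
# A module with its own `provider "aws" { ... }` inside cannot be used
# with for_each or count; providers must be passed in from the caller:
module "replica" {
  source    = "./modules/bucket"
  providers = { aws = aws.us_west_2 }
  name      = "replica"
}
```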

SgtBastard · 2 years ago
DRY = Don’t Repeat Yourself.
waffletower · 2 years ago
While combining the word "best" with "Terraform" in a sentence is more than likely to result in an oxymoron, it is counter-productive not to attempt to organize and use Terraform as elegantly and DRY as possible. We interact with stacks (which we typically call projects) via Terragrunt and have a very large surface of modules, as we have a fair number of infrastructure pieces. We also try to expose Terraform infrastructure changes via Atlantis; though bulky, GitHub does provide a reasonable means to discuss and manage changes made by multiple teams. The use of modules also helps us encapsulate infrastructure, and state problems are rare with these approaches, but the data sprawl inherent to Terraform is very unwieldy regardless of so-called "best" practices. The language features are weak, awkward, and directly encourage repetition and specification bloat. We have had some success using Data Sources to move logic outside of Terraform and provide much-needed sanity when interacting with very verbose infrastructure such as Lake Formation.
thunfisch · 2 years ago
We're using Terragrunt with hundreds of AWS accounts and thousands of Terraform deployments/states.

I'll never want to do this without Terragrunt again. The suggested method of referencing remote states, and writing out the backends will fall apart instantly at that scale. It's just way too brittle and unwieldy.

Terragrunt with some good defaults that will be included, and separated states for modules (which makes partial applies a breeze) as well as autogenerated backend configs (let Terragrunt inject it for you, with templated values) is the way to go.

rvdginste · 2 years ago
We use a setup where we have multiple repos with Terraform configuration and thus multiple Terraform states. We then use Terraform remote state to link everything together. I am talking about 10-20 repos and states. Orthogonal to that, we use multiple workspaces to describe the infra in different environments.

The problems I have personally experienced with this approach are:

- if you update one of the root Terraform states, you need to execute a Terraform apply for every repo that depends on that Terraform state; developers do not do that because either they forget or they do know but are too lazy and subsequently are surprised that things are broken

- if you use workspaces for maintaining the infra in different environments, and certain components are only needed in specific environments, then the Terraform code becomes pretty ugly (using count makes a single thing suddenly a list of things, which you then have to account for in the outputs, which becomes very verbose)
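That count pattern, sketched with a hypothetical resource:

```hcl
# The component only exists in prod, so everything becomes a 0-or-1 list.
resource "aws_elasticache_cluster" "cache" {
  count           = terraform.workspace == "prod" ? 1 : 0
  cluster_id      = "app-cache"
  engine          = "redis"
  node_type       = "cache.t3.micro"
  num_cache_nodes = 1
}

output "cache_endpoint" {
  # One-element indexing plus a null fallback, repeated for every output.
  value = length(aws_elasticache_cluster.cache) > 0 ? aws_elasticache_cluster.cache[0].cache_nodes[0].address : null
}
```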

Is Terragrunt something that would help us? I do not know Terragrunt, and a quick look at the website did not make that clear for me.

ckdarby · 2 years ago
Have you spent any time with Pulumi?

I've kind of found Terraform is dying and encourages a lot of bad practices, but everyone goes along with them because of HCL, and the skills are transferable since most companies are just using TF.

RulerOf · 2 years ago
> I've kind of found terraform is dying

I don't think it's dying. The hype has worn off. Everybody uses it. It's very mature. There's a module for everything.

It's just not new and sexy anymore IMO.

DelightOne · 2 years ago
Do you need to chain multiple Terragrunt executions to first bring the Kubernetes cluster up and then the containers, or does Terragrunt fix that?
miduil · 2 years ago
Yes, with terragrunt you can do a `terragrunt run-all apply` and based on `output` to `variable` in each module data can be passed from one state/module to the next one, terragrunt knows how to run them in the right order so you can bootstrap your EKS cluster by having one module which bootstraps the account, then another one which bootstraps EKS, then one that configures the cluster, installs your "base pods" and then later everything else.
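A sketch of that dependency wiring (paths, module URL, and output names hypothetical):

```hcl
# eks/terragrunt.hcl -- this unit depends on the account unit's outputs.
terraform {
  source = "git::https://example.com/modules/eks.git?ref=v1.4.0"
}

dependency "account" {
  config_path = "../account"
}

inputs = {
  vpc_id     = dependency.account.outputs.vpc_id
  subnet_ids = dependency.account.outputs.private_subnet_ids
}
```

`terragrunt run-all apply` then walks the dependency graph and applies each unit in order.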
rcarr · 2 years ago
Genuine question for DevOps people:

Other than the fact it seems to be an industry standard so it's good for your job prospects, what are the benefits to Terraform over CloudFormation/CDK or whatever the equivalent is for your particular cloud provider?

Most companies/people pick a provider and then stick with it, and it doesn't seem like there's much portability between configurations if you do decide to switch providers later down the line, so I'm not sure what the benefits are. I haven't delved into Terraform yet, but I tried doing a project in Pulumi once and felt by the end of it that I might as well have just written it in AWS CDK directly.

abrookewood · 2 years ago
After you have waited 20 minutes for CloudFormation to fail and tell you that it can't delete a resource (but won't tell you why), and this is the third time it has happened in a week, you start looking at alternatives.
androidbishop · 2 years ago
This 1000%. Also recently discovered Google Deployment Manager is shit for the exact same reasons. I honestly don't get it.
solatic · 2 years ago
> your particular cloud provider

Just this week, I wrote a Terraform module that uses the GCP, Kubernetes, and Cloudflare providers to allow us to bring up a single business-need (that will be needed, hopefully, many times in the future) that spans those three layers of the stack. 200 lines of Terraform written in an afternoon replaced a janky 2,000 line over-engineered Python microservice (including much retry and failure-handling logic that Terraform gives you for free) whose original author (not DevOps) moved on to better pastures.

CDK is fine if you're all-in on AWS. It has its tradeoffs compared to Terraform. Pick the right tool for the job.

Centigonal · 2 years ago
I worked closely with the folks that wrote our platform's IaC, first in CDK, then in Terraform. I wrote a bit of CDK and zero TF myself, but here are some of the reasons we switched:

A big plus is that Terraform works outside of AWS land.

CDK is a nightmare to work with. You're writing with programming-language syntax, which tempts you to write dynamic stuff - but everything still compiles down to declarative CFN, which just makes the ergonomics feel limited. The L2 and L3 constructs have a lot of implicit defaults that came back to bite us later.

With CDK you get synth and deploy, which felt like a black box. Minor changes would do the same 8 minute long deploy process as large infrastructure refactors. Switching to TF significantly sped up our builds for minor commits. There might be a better way to do this with CDK (maybe deploying separate apps for each part of our infrastructure) and we may have just missed it.

androidbishop · 2 years ago
Terraform, and by extension HCL, is more powerful and flexible. It can be used across clouds. It has providers for all kinds of things, like kubernetes. It can be abstracted and modularized. It supports cool features like workspaces and junk, depending on how you want to use it.

Also recently I was forced to use Google Cloud Deployment Manager scripts for some legacy project we were migrating to Terraform, and I was shocked at how buggy and useless it was. Failed to create resources for no discernible reason, couldn't update existing resources with dependencies, couldn't delete resources, was just unfathomably shit all around. Finished the Terraform migration earlier this morning and everything went off without a hitch, plus we got more coverage for stuff Deployment Manager doesn't support. It's also organized much nicer now, with versioned modules and what-have-you.

Cloudformation is ugly and again, surprisingly isn't well supported by AWS. I don't understand how it's possible, but terraform providers seem to be more up to date with products and APIs. Maybe that's just me but I've seen others complain about the same thing.

wodenokoto · 2 years ago
Isn’t google cloud deployment just bash calls to the google cloud cli disguised as declarations by way of yaml?
yellowapple · 2 years ago
- In the event that you are working with different cloud providers, Terraform is one thing to learn that then applies to all of them, as opposed to learning each provider's bespoke infra-as-code offering. Most companies stick to one PaaS/IaaS, but individual personnel ain't necessarily as limited over the courses of their careers.

- Not all cloud providers have an infra-as-code offering of their own in the first place (especially true with traditional server hosts), whereas pretty much every provider with some sort of API most likely has a Terraform provider implemented for it.

- Terraform providers include more than just PaaS/IaaS providers / server hosts; for example, my current job includes provisioning Datadog metrics and PagerDuty alerts alongside applications' AWS infra in the same per-app Terraform codebase, and a previous job entailed configuring Keycloak instances/realms/roles/etc. via Terraform.

androidbishop · 2 years ago
Also pretty neat that there's a Terraform provider for Kubernetes native resources.
RulerOf · 2 years ago
I've got a lot of opinions here, but the only one I'll share is that HCL knocks the socks off of JSON and YAML. JSON is too rigid. YAML is too nested. HCL gets this just right.

Venturing away from opinions, the provider ecosystem with terraform enables some wonderful design options. For example, I have a module template that takes some basic container configs (e.g. ports, healthchecks) and a GitHub URL, then stands the service up on ECS and configures CI in the linked repo. CF can't do that.
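A hypothetical reconstruction of what calling such a module might look like — the point being that a single apply touches both ECS and GitHub, which CloudFormation can't do:

```hcl
module "service" {
  source = "./modules/ecs-service-with-ci" # hypothetical module

  name           = "billing-api"
  container_port = 8080
  health_check   = "/healthz"
  github_repo    = "acme/billing-api" # CI gets configured in this repo
}
```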

nuker · 2 years ago
I've been working with AWS for 10 years. I strongly prefer CloudFormation; just separate things smartly between stacks. It has export/import for stack outputs too. Just look at the "root module" mess in this discussion and you'll get why.
raffraffraff · 2 years ago
For me personally, I chose terraform because it can work with AWS and a heap of other 3rd party services and software (Cloudflare, PostgreSQL, Keycloak, Kubernetes/Helm, Github, Azure)
x3n0ph3n3 · 2 years ago
I have used both terraform and cloudformation substantially and they each have pros and cons. One thing terraform has over cloudformation is its rapid support for new services and features. AWS has done an awful job ensuring that cloudformation support is part of each team's definition of "done" for each release. It just doesn't get the support it really needs from AWS.
koolba · 2 years ago
CloudFormation is the ugly step child of AWS. It has bugs that have languished for years
dgrin91 · 2 years ago
Companies choose providers and tend to stick with them, but people don't always stick with companies. If I know TF there is a decent chance my skills will be applicable when I change companies.

Also some big corps run their own internal datacenters and have cloud-like interfaces with them. You can write TF providers for that (it's not going to be as nice as the public cloud ones, but still nice). Then you can utilize Terraform's multi-provider functionality to have 1 project manage deployments on multiple clouds that include on-prem.

Terraform's multi-provider functionality is also useful for non-AWS/Azure/GCP providers such as Cloudflare. As far as I know CDK does not support that.

Illotus · 2 years ago
> Other than the fact it seems to be an industry standard so it's good for your job prospects, what are the benefits to Terraform over CloudFormation/CDK or whatever the equivalent is for your particular cloud provider?

For me the killer feature is that both plan and apply show the actual diff of changes vs running infrastructure. It makes understanding effects of changes much easier.

Bellyache5 · 2 years ago
Agreed, Terraform does a good job of this. But CloudFormation & CDK can also do this via Change Sets and CDK diff.

https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGui...

https://blog.mikaeels.com/what-does-the-aws-cdk-diff-command...

thedougd · 2 years ago
Providers I regularly use, even mixed in a single project. There are others I could use if they were available.

AWS, GitHub, Opsgenie, Okta, Scalr, TLS, DNS

koolba · 2 years ago
You forgot the greatest escape hatch of all: null
jahsome · 2 years ago
Third-party integrations and the universality/reusability across multiple products and familiarity of HCL are big for me.
333throwaway342 · 2 years ago
> Most companies/people pick a provider and then stick with it and it doesn't seem like there's much portability between configurations if you do decide to switch providers later down the line so I'm not sure what the benefits are.

This smells like kubernetes

> Terraform over CloudFormation/CDK

They both work. It's more about which providers you need.

maccard · 2 years ago
We don't just have AWS resources. Our CI pipelines are managed by terraform [0], they communicate with GitHub [1]. I like that it's declarative and limited, it stops people trying to do "clever shit" with our infra, which is complicated enough as it is.

[0] https://buildkite.com/blog/manage-your-ci-cd-resources-as-co...

[1] https://registry.terraform.io/providers/integrations/github/...

cwp · 2 years ago
It's subtle and so difficult to see the differences at smaller scales. If you're going to provision a handful of EC2 instances, all the tools work fine.

I think HCL is an underappreciated aspect of Terraform. It was kinda awful for a while, but it's gotten a lot better and much easier to work with. It hits a sweet spot between data languages like JSON and YAML and fully-general programming languages like Python.

Take CloudFormation. The "native" language is JSON, and they've added YAML support for better ergonomics. But JSON is just not expressive enough. You end up with "pseudoparameters" and "function calls" layered on top. Attribute names doubling as type declarations, deeply nested layers of structure, and incredible amounts of repetitious complexity just to be able to express all the details needed to handle even moderate amounts of infrastructure.

So, ok, AWS recognizes this and they provide CDK so you can wring out all the repetition using a real programming language - pick your favourite one, a bunch are supported. That helps some, but now you've got the worst of both worlds. It's not "just JSON" anymore. You need a full programming environment. The CDK, let's say the Python version, has to run on the right interpreter. It has a lot of package dependencies, and you'll probably want to run it in a virtualenv, or maybe a container. And it's got the full power of Python, so you might have sources of non-determinism that give you subtle errors and bugs. Maybe it's daylight saving gotchas or hidden dependencies on data that it pulls in from the net. This can sound paranoid, but these things do start to bite if you have enough scale and enough time.

And then, all that Python code is just a front end to the JSON, so you get some insulation from it, but sometimes you're going to have to reason about the JSON it's producing.

HCL, despite its warts, avoids the problems with these extremes. It's enough of a programming language that you can just use named, typed variables to deal with configuration, instead of all the { "Fn::GetAtt" : ["ObjectName", "AttName"] } nonsense that CloudFormation will put you through. And the ability to create modules that can call each other is sooo important for wringing out all the repetition that these configurations seem to generate.
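For contrast, a sketch of that attribute reference in HCL (resource names hypothetical) — what CloudFormation spells as a nested Fn::GetAtt object is just a dotted expression:

```hcl
resource "aws_s3_bucket" "app" {
  bucket = "my-app-bucket"
}

resource "aws_iam_policy" "read_app" {
  name = "read-app-bucket"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = ["s3:GetObject"]
      # CloudFormation: { "Fn::GetAtt": ["AppBucket", "Arn"] }
      Resource = ["${aws_s3_bucket.app.arn}/*"]
    }]
  })
}
```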

On the other hand, it's not fully general, so you don't have to deal with things like loops, recursion, and so on. This lack of power in the language enables more power in the tools. Things like the plan/apply distinction, automatically tracking dependencies between resources, targeting specific resources, move blocks etc. would be difficult or impossible with a language as powerful as Python.

HCL isn't the only language in this space - see CUE and Dhall, for example - but it's undoubtedly the most widely used. And it makes a real difference in practice.

swozey · 2 years ago
This was a good read but really if you already follow the common best practices of IAC/terraform/aws multi-account I don't think you're going to learn much.

The comments in here kind of made me think I was going to hop in and take away some huge wins I hadn't considered. But I have been working with Terraform and AWS for a very long time.

If you're unfamiliar with AWS multi-account best practices this is a good read.

https://aws.amazon.com/organizations/getting-started/best-pr...

bcjordan · 2 years ago
I remember periodically coming across services/platforms that purport to make setting up secure AWS accounts / infra configuration easier and default secure — anyone know what I may be thinking of?
swozey · 2 years ago
Actually the article here is one of those options - https://substrate.tools/

I don't know how integrating this into an environment where you already have tons of AWS accounts would go but it's interesting. Thankfully I only have to make new accounts when we greenfield a service and that's maybe a yearly thing.

spicyusername · 2 years ago
Everybody in here is recommending Terragrunt, but I'm not sure what value it provides over regular Terraform.

After using it for a few months, it seems all of the features found in Terragrunt are now in Terraform.

jbjohns · 2 years ago
This is my impression as well. As far as I've understood, terragrunt was made back when terraform was missing a lot of key features (I think it maybe didn't even have modules yet) but when I was asked to evaluate it recently for a client I couldn't find a single reason to justify adding another tool.
linuxdude314 · 2 years ago
The primary thing terragrunt was designed to do was let you dynamically render providers.

Terraform still does not let you do this.

It becomes very problematic when using providers that are region specific, amongst other scenarios.

That being said I don’t like the extra complexity terragrunt adds and instead choose to adopt a hierarchical structure that solves most of the problems being able to dynamically render providers would solve.

Each module is stored in its own git repo.

Top layer or root module contains one tf file that is ONLY imports with no parameters.

The modules being imported are called “tenant modules”. A tenant module contains instantiations of providers and modules with parameters.

The modules imported by the tenant modules are the ones that actually stand up the infrastructure.

Variables are used, but no external parameters files are used at any level (except for testing).

All of the modules are versioned with git tagged releases so the correct version can easily be imported.

Couple this with a single remote state provider in the root module and throw it in a CI/CD pipeline and you have a gitops driven infrastructure as code pipeline.
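A hypothetical sketch of that layering:

```hcl
# root/main.tf -- the root module is ONLY versioned imports, no parameters.
module "tenant_networking" {
  source = "git::https://example.com/tenant-networking.git?ref=v2.1.0"
}

module "tenant_platform" {
  source = "git::https://example.com/tenant-platform.git?ref=v1.7.3"
}

# A tenant module (its own repo) then instantiates providers and the
# lower-level modules, with parameters, e.g. tenant-networking/main.tf:
#
#   provider "aws" { region = "eu-west-1" }
#   module "vpc" {
#     source = "git::https://example.com/vpc.git?ref=v3.0.0"
#     cidr   = "10.0.0.0/16"
#   }
```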

maccard · 2 years ago
We migrated from terragrunt to terraform as we thought the same thing. I'm in half a mind to go back.

Managing multiple environments is much easier in TG. State management in TF is kneecapped by the lack of variable support in backend blocks. I can only assume it's to encourage people into using terraform cloud for state management.
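The usual workaround (a sketch): keep the backend block partial and inject the varying values at init time.

```hcl
# backend.tf -- variables aren't allowed here, so leave it partial
terraform {
  backend "s3" {}
}
```

Each environment then supplies its own values, e.g. `terraform init -backend-config="bucket=my-tf-states" -backend-config="key=envs/dev/terraform.tfstate" -backend-config="region=us-east-1"`.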

yellowapple · 2 years ago
Terragrunt shines in cases where you have independent sets of Terraform state, especially if they are dependencies/dependents of one another.

For example, say you're using Terraform to manage AWS resources, and you've provisioned an Active Directory forest that you in turn want to manage with Terraform via the AD provider. Terraform providers can't dynamically pull things like needed credentials from existing state, so you end up needing two separate Terraform states: one for AWS (which outputs the management credentials for the AD servers you've provisioned) and one for AD (which accepts those credentials as config during 'terraform init').

Terragrunt can do this in an automated way within a single codebase, redefining providers and handling dependency/dependent relationships. I don't know of a way to do it in pure Terraform that doesn't entail manual intervention.

nwmcsween · 2 years ago
Ideally you decouple this and store the creds in a key vault or whatever; this way you have to explicitly grant the service principal access to the KV secret. Decoupling usually fixes other issues as well, such as expiring creds: the refresh from service A to service B can then get coded into Terraform.
badblock · 2 years ago
Some of this seems like old advice; instead of having directories per environment, you should be using workspaces to keep your environments consistent so you don't forget to add your new service to prod.
rcrowley · 2 years ago
(Hi, I’m one of the authors of the article at the root of this thread.)

I’ve gone back and forth on workspaces versus more root modules. On balance, I like having more root modules because I can orient myself just by my working directory instead of both my working directory and workspace. Plus, I feel better about stuffing more dimensions of separation into a directory tree than into workspace names. YMMV.

36chamber · 2 years ago
Do you always store modules in the same repo as the terraform itself?

Why not put them in separate repos that can be tagged and versioned and then referenced like below?

source = "git::https://bitbucket.org/foocompany/module_name.git?ref=v1.2"

dharmab · 2 years ago
What do you think about multiple backends? It seems to be working well for me to have a single root module but with a separate backend configuration per environment.