Pulumi 3.0 - Readit News

Glad to see all the innovation happening in the IaC space.

I like the multi-language premise of Pulumi but I have the feeling that over time one of the languages will gain a significant advantage over the others (better supported, more features, native...) and thus will become the primary language for Pulumi package authors and users (similar to HCL for Terraform and TypeScript for CDK).

Meaning that eventually all the "serious" Pulumi users will end up converging on one single language (similar to this comment of a CDK user migrating from Python to TypeScript). Feedback loops will only help amplify this effect (e.g. more Pulumi users on language A -> more examples online -> more package authors working in this language -> better established best practices -> more submitted issue tickets for language A -> language better supported by Pulumi -> more users...)

Curious about how Pulumi multi-language components [0] work under the hood. Isn't writing a "Pulumi schema" describing resources the same as writing declarative HCL at the end of the day?

[0] https://www.pulumi.com/blog/pulumiup-pulumi-packages-multi-l...

joeduffy · 4 years ago

Excellent topic. [caveat, Pulumi co-founder here]

Indeed what you say is true of many other "multi-language" platforms. I was an early engineer on .NET at Microsoft, and although it was multi-language from the outset (COBOL.NET was a thing!), the reality is most folks write in C# these days. And yet, you still see a lot of excitement for PowerShell, Visual Basic, and F#, each of which has a rich community, but uses that same common core. A similar phenomenon has happened in the JVM ecosystem with Java dominating most usage until the late 2000s, at which point my impression is that Groovy, Scala, and now Kotlin won significant mindshare.

I have reasons to be optimistic the infrastructure language domain will play out similarly. Especially as we are fundamentally serving multiple audiences -- we see that infrastructure teams often elect Python, while developers often go with TypeScript or Go, because they are more familiar with it. For those scenarios, the new multi-language package support is essential, since many companies have both sorts of engineers working together.

A "default language for IaC" may emerge with time, but I suspect that is more likely to be Python than, say, HCL. (And even then, I'm not so sure it will happen.) One of the things I'm ridiculously excited about, by the way, is bringing IaC to new audiences -- many folks learn Python at school, not so much for other infrastructure languages. Again, I'm biased. But, even if a default emerges, I guarantee there will be reasons for the others to exist. I for one am a big functional language fan and especially for simple serverless apps, I love seeing that written in F#. And we've had a ton of interest in PowerShell support since many folks working with infrastructure for the Microsoft stack know it. And Ruby due to the Chef and Puppet journeys.

I also won't discount the idea of us introducing a cloud infrastructure-specific language ;-). But it would be more general purpose than not. I worked a lot on parallel computing in the mid-2000s and that temptation was always there, but I'm glad we resisted it and instead just added tasks/promises and await to existing languages.

As to the Pulumi schema, you're right, that's a step we aim to remove soon. For TypeScript, we'll generate it off the d.ts files; for Go we'll use struct tags; and so on. Now that the basic runtime is in place, we are now going to focus there. This issue tracks it: https://github.com/pulumi/pulumi/issues/6804. Our goal is to make this as ridiculously easy as just writing a type/class in your language of choice.
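As a rough illustration of deriving a schema from a type instead of writing it by hand, here is a minimal Python sketch. `BucketArgs`, `TYPE_NAMES`, `derive_schema`, and the output layout are all invented for the example; this is not Pulumi's schema format or its generator.

```python
from dataclasses import MISSING, dataclass, fields

# Assumed mapping from Python types to schema type names (illustrative only).
TYPE_NAMES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class BucketArgs:
    """Hypothetical component inputs, written as an ordinary class."""
    name: str
    region: str
    versioned: bool = False


def derive_schema(cls):
    """Build a schema-like dict from the dataclass fields of `cls`."""
    properties = {}
    required = []
    for field in fields(cls):
        properties[field.name] = {"type": TYPE_NAMES[field.type]}
        # A field with no default must be supplied by the caller.
        if field.default is MISSING and field.default_factory is MISSING:
            required.append(field.name)
    return {"type": cls.__name__, "properties": properties, "required": required}


schema = derive_schema(BucketArgs)
```

The point is only that the class already carries the names, types, and required-ness a separate schema file would otherwise repeat.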

mtalantikite · 4 years ago

> There might be "a default language" but I suspect that is more likely to be Python than, say, HCL. One of the things I'm ridiculously excited about is bringing IaC to new audiences -- many folks learn Python at school, not so much for other infrastructure languages.

As someone that has worked largely on backend systems and infra for a long time, are colleges actually training these infra skills? Some app engineers at a place I consult for were just having this conversation last week, interested in how to train infra skills, and pretty much everyone had sort of fallen into infra roles over their careers, with no formal training in it. Most of us were just *nix hackers as kids, learned to program at some point, and now we’re here. Working on infra is quite a lot more than just knowing the language.

I’m not mad at having something other than HCL — years ago when I worked at Engine Yard we developed a cross platform cloud library that let us write things in Ruby which was nice. But when thinking of solving infra problems I’ve never once thought “you know, if I could just write this in Python these problems would go away”. Actually I personally hate Python as a language, I’d much prefer to write Go or Rust or TypeScript, and it does feel like a bonus that everyone touching infra sort of just has to use HCL which removes a lot of bike shedding.

Totally open to improvements! More isn’t always better though.

Deleted Comment

neopointer · 4 years ago

I couldn't agree more. In my company we've decided that everyone will use Typescript with Pulumi because we believe that the developer experience (IDE, code autocompletion, etc) will be the best + it's a language that arguably is more used here as frontenders will know it too + providing examples, etc.

JimDabell · 4 years ago

When I first started using Pulumi, I was using Python and it seemed like the Pulumi team focused on JavaScript and the rest of the languages were afterthoughts. But they’ve improved a lot on this front, so if anything the trend is in the opposite direction.

Can someone who has used both Pulumi and AWS CDK describe the differences between the two?

I am using Terraform 100% now, but sometimes wish I had more than the HCL (hashicorp configuration language) syntax available to use in my code.

rossmohax · 4 years ago

I used both in depth. CDK as it is now is a wrapper around CloudFormation and therefore suffers from the same limitations: AWS only, 'static' templates.

Pulumi on the other hand is closer to Terraform, except you can write in a language which actually doesn't stand in your way and allows you to programmatically access values available during execution only. With CDK you can only reference them in the CF template (let's say an instance ID which you just created), but these are never "lifted" to your code. Also CDK misses native 'datasources' and offers a limited mechanism for lookups.

Pulumi also has a killer Automation feature, where you can code infrastructure migrations, not unlike you'd do it for SQL. Neither TF nor CDK allows you to do that, and you'd need to code infrastructure state transitions in error-prone bash wrappers.
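To make the SQL-migration analogy concrete, here is a toy sketch in plain Python. It does not use the Pulumi Automation API. The state dict, the three step functions, and `migrate` are all hypothetical, and a real version would call a deployment tool at each step instead of editing a dict.

```python
# Toy model: infrastructure "state" is a dict of resource name -> properties.

def create_old_db(state):
    return {**state, "db_v1": {"engine": "postgres", "version": 11}}


def create_new_db(state):
    return {**state, "db_v2": {"engine": "postgres", "version": 13}}


def drop_old_db(state):
    return {name: props for name, props in state.items() if name != "db_v1"}


# Ordered, numbered steps, in the spirit of SQL schema migrations.
MIGRATIONS = [(1, create_old_db), (2, create_new_db), (3, drop_old_db)]


def migrate(state, applied_version, migrations):
    """Apply every step newer than `applied_version`, in version order."""
    for version, step in sorted(migrations, key=lambda m: m[0]):
        if version > applied_version:
            state = step(state)
            applied_version = version
    return state, applied_version


final_state, final_version = migrate({}, 0, MIGRATIONS)
```

Because the applied version is tracked, running `migrate` again with `final_version` changes nothing, which is the property that ad hoc shell wrappers tend to lack.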

As a developer, I quite liked Pulumi resource model, documentation and team responsiveness on Github and slack channel. I am working with CDK now, because of costs and I'd prefer Pulumi if there was a choice.

pm90 · 4 years ago

The migration as code feature sounds the most promising from your comment. I use Terraform heavily, and while it's good at capturing infra state, there's no good mechanism to automate infrastructure migration from one state to another in code. I think it would be incredibly valuable to capture that.

I must admit that I'm skeptical about how well this would actually work though. Infrastructure APIs tend to be flaky, and I'm not sure how reusable the migration code would actually be. If it's too flaky, people will just not use it.

fulafel · 4 years ago

The migration style changes sound really interesting.

CDK indeed is just another layer of abstraction on top of CloudFormation, itself already a leaky complex abstraction, and it bites you in a lot of ways.

tasssko · 4 years ago

When I looked, the SDK was only available in certain languages, which was an issue for me at the time. I just checked and they have Python and C#. Pulumi has the potential to reduce the verbosity of infrastructure configuration and enable engineers to tailor it more to the application on demand. Adding a new queue can be done as part of an app release and not handled in a separate pipeline. It isn’t fun to wrangle infrastructure pipelines and, when necessary, orchestrate changes.

junon · 4 years ago

Thanks for the writeup. Terraform has only ever worked against me, and I hadn't heard of Pulumi until now.

Will have to look into it now.

aequitas · 4 years ago

You could try Terraform CDK[0] which supports TypeScript, Python, Java, and C#.

[0] https://github.com/hashicorp/terraform-cdk

nhoughto · 4 years ago

CDK is essentially just a different way of writing CloudFormation: it's code that is transformed into CloudFormation templates, so it inherits all of CloudFormation's problems.

Pulumi is more like Terraform but without HCL. So imagine instead of having to find-the-HCL-way-to-do-things and work within those constraints to make Terraform work, you can use one of a few languages to build up resources (similar to how you would in HCL) and have Pulumi provision/act on them for you, including managing their state.

Quite different to CDK/CloudFormation and much more like Terraform without the handcuffs of HCL. (sometimes handcuffs are useful tho..)

mdeeks · 4 years ago

In addition to what others have said, it is important to point out that the CDK is AWS specific. Your "infrastructure" is rarely just "AWS" even if you're not multi-cloud. Terraform and Pulumi can operate on things outside of AWS like Github, Okta, Cloudflare, Datadog, Hashicorp Vault, etc.

It is nice to have one system or even repo for managing all of that.

https://www.pulumi.com/docs/intro/vs/cloud_template_transpil...

cmclaughlin · 4 years ago

> In addition to what others have said, it is important to point out that the CDK is AWS specific.

As mentioned in other comments here, CDK for Terraform is another option and it's not AWS specific - it should work with any Terraform provider and there are heaps available.

https://github.com/hashicorp/terraform-cdk

manigandham · 4 years ago

All those frameworks have their own DSL/config languages that lack expressiveness and inevitably become a really bad scripting language to support everything people want to do.

Pulumi started with actual programming languages instead which gives you full expressiveness, IDE integration, and more, without learning yet another new language. And since infrastructure is just represented as objects in the language, creating and combining them in complex ways is much more natural than wiring up configs.

carlosf · 4 years ago

I migrated from Terraform to AWS CDK and I'm very happy.

The good:

- Tooling for any major language (Typescript, Python)… is lightyears ahead of anything you will find for Terraform / HCL.

- You can write your code as declaratively as possible (like you would do using TF), but you always have the escape hatch of using all libs available to your language of choice.

- AWS CDK uses CloudFormation under the hood, so you get cool stuff like automatic rollbacks in case of failure.

- You have access to much more mature testing frameworks, compared to what is available to TF. Because AWS CDK synthesizes CF templates, you can also snapshot those for regression testing. Applying good software engineering practices is overall much easier. Most languages are much easier to extend than HCL.

- AWS is constantly releasing higher level libraries so you don't need to fiddle with low level API details. When using Terraform, you generally must understand your cloud provider API at a very fine level to implement IaC.
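The snapshot idea from the list above can be sketched in a few lines of Python. `synthesize` here is a hypothetical stand-in for a real synth step, not the CDK's API; it just returns a CloudFormation-shaped dict so the comparison has something to work on.

```python
import json


def synthesize(bucket_name, versioned):
    """Hypothetical synth step: return a CloudFormation-shaped template dict."""
    return {
        "Resources": {
            "Bucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "BucketName": bucket_name,
                    "VersioningConfiguration": {
                        "Status": "Enabled" if versioned else "Suspended"
                    },
                },
            }
        }
    }


def to_snapshot(template):
    """Serialize deterministically so two equal templates give equal text."""
    return json.dumps(template, indent=2, sort_keys=True)


def matches_snapshot(template, snapshot):
    return to_snapshot(template) == snapshot


# In a real test the snapshot would be read from a file checked into the repo.
stored_snapshot = to_snapshot(synthesize("logs", versioned=True))
```

A regression test then asserts that `matches_snapshot(synthesize(...), stored_snapshot)` still holds after a refactor, and any unintended change to the generated template fails the test.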

The bad:

- AWS CDK is written in Typescript and automatically translated to other languages. Even though you can use Python, C#, etc... Most examples and tutorials exist in Typescript. Also, you always must install npm + cdk + the libraries in the language you are using, so using any language other than Typescript means supporting two toolchains, which is a pain in the ass. I started with AWS CDK in Python and now I'm migrating to Typescript.

- Some modules only exist for Typescript.

- Since CF is used under the hood, it only supports resources supported by CF. Sometimes CF support takes a while. Since TF is just a wrapper for a Cloud Provider API, it generally implements new resources much quicker.

- It has its own vocabulary and ways to do stuff (L1/L2 Construct? Stack? Retention Policy?)... even if you've used a lot of TF and know AWS very well, there's a learning curve.

Some cool videos:

- An AWS dev shows how to create your own construct and test it (that's the least "toy example" tutorial I have found): https://www.youtube.com/watch?v=cTsSXYOYQPw

- Same guy shows how to contribute to AWS CDK. I've found that looking at their source code is a great way to learn about good practices and patterns: https://www.youtube.com/watch?v=OXQSSibrt-A

k__ · 4 years ago

The CDK synthesizes CloudFormation templates (for AWS) and Pulumi seems to use the AWS APIs to provision and doesn't touch CloudFormation.

bloopernova · 4 years ago

Right - I was wondering about the developer "experience" between the two.

satya71 · 4 years ago

CDK maintains the state with AWS itself, Pulumi operates their own state service (with option of self management). I have never used CDK, but I've heard that it's hard to recover from state corruption in CDK.

jen20 · 4 years ago

While this is true, the reason is not called out: CDK simply generates a CloudFormation Template, and thus uses CloudFormation for state management, rather than using the cloud APIs directly.

ynouri · 4 years ago

methyl · 4 years ago

Native Provider means it no longer uses Terraform under the hood and it excites me a lot. We were having a hard time with some more complex setups and it seems it can be a remedy for the issues that we faced. Stoked to check it out!

leg100 · 4 years ago

According to [1], "Native providers are built directly and automatically from the cloud provider’s API...", and are not hand-coded, unlike Terraform's providers.

I'm surprised this is possible. If it was, why didn't Terraform follow this approach a long time ago?

The cloud providers' APIs provide the endpoints for creating, updating, reading and deleting resources. Terraform and Pulumi's value is to provide an idempotent abstraction on top of that. And that abstraction is not straightforward to write, because it has to handle numerous nuances and anachronisms in the underlying APIs. For example, in the event of updating an existing storage bucket, the abstraction has to determine whether the bucket can be simply updated, or if it needs to be re-created (say if you were changing the name or location). And the underlying API will not necessarily reveal this kind of information.

Hence one would think completely avoiding handwritten code is incredibly difficult if not an insurmountable problem.
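The bucket example boils down to a decision like the one sketched below. This is a toy in plain Python, not how Terraform or Pulumi providers are implemented; `IMMUTABLE_PROPERTIES` and `plan_change` are made-up names, and the set of immutable properties is assumed to come from somewhere (hand-written code or API metadata).

```python
# Assumed: properties that cannot be changed in place for this resource type.
IMMUTABLE_PROPERTIES = {"name", "location"}


def plan_change(old, new, immutable):
    """Return "no-op", "update", or "replace" for a desired-state change."""
    changed = {key for key in old.keys() | new.keys() if old.get(key) != new.get(key)}
    if not changed:
        return "no-op"
    if changed & immutable:
        # Touching an immutable property forces delete-and-recreate.
        return "replace"
    return "update"


current = {"name": "assets", "location": "EU", "versioning": False}
enable_versioning = {**current, "versioning": True}
move_region = {**current, "location": "US"}
```

Here `plan_change(current, enable_versioning, IMMUTABLE_PROPERTIES)` gives `"update"`, while the `move_region` case gives `"replace"`. The hard part the comment points at is knowing what belongs in the immutable set when the API does not say.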

[1]: https://www.pulumi.com/blog/pulumiup-native-providers/

[caveat, Pulumi co-founder here]

You are right that it's not easy. Thankfully the cloud providers themselves have moved in the direction of auto-generation for their own SDKs, documentation, etc., which has forced this to get better over time. This is motivated by much the same reason we've done it -- keeping one's own sanity, ensuring high quality and timeliness of updates, and ensuring good coverage and consistency, when supporting so many cloud resource types across many language SDKs.

Microsoft, for instance, created OpenAPI specifications very early on, and standardized on them (AFAIK, they are required for any new service). Those API specifications contain annotations to describe what properties are immutable (as you say, the need to "re-create" the resource rather than update it in place). Google Cloud's API specifications similarly contain such information but it's based on the presence of PATCH support for resources and their properties. Etc.

The good news is that we've partnered with the cloud providers themselves to build these and we expect this to be increasingly the direction things go.

antoncohen · 4 years ago

The Terraform provider for Google Cloud uses partial autogeneration, here is the repo that does the autogeneration for multiple automation tools:

https://github.com/GoogleCloudPlatform/magic-modules

This is an excellent comment. I think the answer is just that the cloud provider APIs are getting more "stable" and user friendly with time, making the additional abstractions of e.g. Terraform not worth the overhead of learning that entire world. The response from the Pulumi co-founder seems to indicate as much: they're relying on the simplicity of the APIs to not get into a bad state (although I'm still skeptical that's possible).

Wouldn't this mean that Pulumi has better coverage than CloudFormation?

Some things in the AWS API aren't (yet) available in CFN.

octopoc · 4 years ago

The Pulumi Automation API lets you have an API to update infrastructure from your own process instead of via CLI commands that call your executable. I hope this makes it easy to implement multi-tenancy by spinning up separate infrastructure for different customers.

This would make it easier to build software with dynamic schemas on top of a relational database. Dynamic schema is one area where most libraries don't help you out much, but it's an increasingly important feature, especially for businesses.

giovannibonetti · 4 years ago

Some relational databases like Postgres and MySQL support JSON columns. This is useful if, for example, you want to create an ecommerce application with multiple product variations stored in a single table, but with different fields - that is called Single Table Inheritance (STI). You can have some regular columns for the common fields, and a JSON column for the specific ones.
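A small sketch of that layout, using Python's built-in `sqlite3` with the JSON stored as text, purely so the example runs anywhere. It stands in for a Postgres or MySQL JSON column, and the table, column names, and sample products are invented.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        kind TEXT NOT NULL,            -- which product variation this row is
        name TEXT NOT NULL,            -- common field
        price_cents INTEGER NOT NULL,  -- common field
        attrs TEXT NOT NULL            -- JSON document with kind-specific fields
    )
    """
)

sample_rows = [
    ("shirt", "Plain tee", 1500, {"size": "M", "color": "navy"}),
    ("ebook", "IaC Handbook", 900, {"pages": 212, "format": "epub"}),
]
conn.executemany(
    "INSERT INTO products (kind, name, price_cents, attrs) VALUES (?, ?, ?, ?)",
    [(kind, name, price, json.dumps(attrs)) for kind, name, price, attrs in sample_rows],
)


def load_products(kind):
    """Return rows of one kind with the JSON fields merged into each dict."""
    cursor = conn.execute(
        "SELECT name, price_cents, attrs FROM products WHERE kind = ? ORDER BY id",
        (kind,),
    )
    return [
        {"name": name, "price_cents": price, **json.loads(attrs)}
        for name, price, attrs in cursor
    ]
```

Shirts and ebooks share one table and the common columns, while `size`, `color`, `pages`, and `format` live only in the JSON document.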

If your app is built with Rails, you can use this library to help you on that (I'm not affiliated with it): https://github.com/jrochkind/attr_json

vdfs · 4 years ago

Is search in large tables slow in Postgres JSON columns? Can they be indexed?

mssundaram · 4 years ago

I am having trouble understanding what Pulumi does, since I rarely work in web architecture. Could someone explain it?

qbasic_forever · 4 years ago

It's infrastructure-as-code to the logical conclusion that you write real programming language code (Python, Javascript, Go, etc.) to build your infrastructure. Something like Terraform lets you declare infrastructure with HCL/JSON but doesn't give you a lot of flexibility to reason about or dynamically change things. If you want to spin up a frontend server for every backend for example, you have to write out the configs and tweak them to match the desired state. Pulumi and similar tools go the next step and let you write code to build your infrastructure--if you want your frontends and backends to be in sync, write some Python to make it happen instead of spelunking in a pile of configs.
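For a feel of the "frontend for every backend" case, here is a sketch in plain Python. It builds ordinary dicts and does not call any real Pulumi SDK; the service names, the `port` value, and `build_resources` are all made up.

```python
# Hypothetical list of backend services to deploy.
BACKENDS = ["orders", "billing", "search"]


def build_resources(backends):
    """Describe one backend and one matching frontend per service name."""
    resources = []
    for name in backends:
        backend = {"name": f"{name}-backend", "kind": "server", "port": 8080}
        # The frontend is derived from the backend, so the two cannot drift apart.
        frontend = {
            "name": f"{name}-frontend",
            "kind": "server",
            "upstream": backend["name"],
        }
        resources.extend([backend, frontend])
    return resources


resources = build_resources(BACKENDS)
```

Adding a fourth service is a one-word change to `BACKENDS`; the matching frontend comes along automatically.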

It's super useful and compelling at scale, but there are some potential downsides. For one, it's a real program and IMHO you should treat it as such with the same rigor of documentation, testing, etc. as the production code you're deploying. This might actually make your processes a bit slower or more risky. Because if you don't do that then you just risk building unmaintainable, untrustworthy deployment code that nobody wants to use.

The second downside is you're taking a big bet on Pulumi to stick around. At this point, it's probably a safe bet. But the more you use their SDK and system the more tightly coupled your production deployment is to the continued existence of Pulumi. If they go bust and can't maintain the SDK you might have a production system that can't be deployed anymore.

jacobmischka · 4 years ago

Thank you for this concise summary. While not a dedicated cloud services administrator, I keep up with new technologies and follow tech news fairly religiously, and this press release made my head hurt with all of the cloud-specific buzzwords and proper nouns.

Sounds like a neat and useful project for complex setups.

aetherspawn · 4 years ago

Thanks, I wanted to know the main difference between Terraform and this (ie why would I pay $50/mo), but this sums it up nicely.

Still, Terraform meets my needs at this stage and looks a lot more mature. But if I was a small web host or something this could be a really interesting integration.

It lets you provision and configure cloud infrastructure using code. Think about all the things you could do via cloud provider consoles, but now via code. So you can automate deployment, testing, migration. It's all well documented in one place, shareable, repeatable.

adflux · 4 years ago

Define your cloud infrastructure in code (python, typescript, other languages) and deploy it with ease. I think it beats other formats.

tomca32 · 4 years ago

It's a library around Terraform that lets you write infra stuff in Python, JavaScript, Go, and maybe some other langs.

For me the biggest benefit is the fact that I can write my infra with Types in TypeScript.

spooneybarger · 4 years ago

It's no longer built around terraform. It uses the cloud provider APIs directly now.

See native provider.

rswail · 4 years ago

A couple of years ago I started down the Pulumi path, but what was unclear at the time is that there are two phases of a pulumi run.

The first is on constructing the DAG of resources, the second is using that DAG to orchestrate the changes from the previous state.
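A minimal sketch of those two phases, using the standard library's `graphlib` (Python 3.9+). The resource names and the `DEPENDENCIES` mapping are invented, and this only models the ordering, not how the Pulumi engine actually schedules work.

```python
from graphlib import TopologicalSorter

# Phase one: the program only declares resources and what each depends on.
# Maps resource name -> set of resources that must exist first.
DEPENDENCIES = {
    "vpc": set(),
    "subnet": {"vpc"},
    "security_group": {"vpc"},
    "instance": {"subnet", "security_group"},
}


def plan_order(dependencies):
    """Phase two: walk the DAG so every resource comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = plan_order(DEPENDENCIES)
```

Nothing is created while `DEPENDENCIES` is being built, which is why concrete values such as IDs only exist once the second phase runs.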

The problem I had is that values that are only available in the second phase (eg a subnet ID) can't be accessed in the first phase directly. The subnet's ID is accessed via "my_subnet.id" but it's actually effectively a future.

You can't write something like "if my_subnet.id == '1234'" (this is an arbitrary example); effectively, a reference to a resource's attributes is only available on the "right hand side" of an expression, and pretty much as a simple assignment only.

To use the value of an attribute, you essentially need to access the value in the second phase of the pulumi process, and to do that (at the time) involved essentially pushing through source code to be "eval"ed on the "other side".
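The "effectively a future" behaviour can be modelled with a toy class. This is a plain-Python sketch, not Pulumi's implementation: the `Output` class below is a simplified stand-in, although the Pulumi SDKs do expose an `Output` type with an `apply` method built on the same idea.

```python
class Output:
    """Toy placeholder for a value that is only known after deployment."""

    def __init__(self):
        self.is_known = False
        self._value = None
        self._callbacks = []

    def apply(self, fn):
        """Return a new Output holding fn(value) once the value arrives."""
        result = Output()

        def run(value):
            result.resolve(fn(value))

        if self.is_known:
            run(self._value)
        else:
            self._callbacks.append(run)
        return result

    def resolve(self, value):
        """Called by the 'engine' in phase two when the real value exists."""
        self.is_known = True
        self._value = value
        for callback in self._callbacks:
            callback(value)
        self._callbacks.clear()

    def get(self):
        if not self.is_known:
            raise RuntimeError("value is not known until the resource exists")
        return self._value


# Phase one: the comparison is wrapped in apply, not written as a plain `if`.
subnet_id = Output()
is_target = subnet_id.apply(lambda sid: sid == "1234")
known_before_deploy = is_target.is_known

# Phase two: the engine creates the subnet and fills in the real ID.
subnet_id.resolve("1234")
```

The comparison does run, but only inside the callback once the ID exists, so a plain `if my_subnet.id == ...` during the first phase has nothing to compare against.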

I'll be interested to see what's changed :)

mbStavola · 4 years ago

I'm in love with Pulumi. Simply one of the best tools I have in my stack.

After about a year of use, I simply cannot go back to editing YAML files or clicking things in a web UI. It's ruined me completely.

TruthWillHurt · 4 years ago

I'm sorry, I really liked Pulumi, but it became too complex after a short while.

Since all infra resources are defined as code you end up defining them in a procedural manner (Python), and struggle to arrange/resolve dependencies.

happyrock · 4 years ago

I agree, I enjoyed using it and found it very powerful, but it was tricky to organize the code in a way that kept things simple, and there weren't many great examples to draw from in the documentation or elsewhere. I'd love to see what a large multi-provider Pulumi deployment looks like when it's done the right way.