Readit News
ltbarcly3 · a year ago
This seems nice at first glance, but let's think about it for a minute:

Before: You had a .sql file and if you messed up you had to revert manually. Maybe you would pre-write the revert script, maybe your site is down if you mess up. It's super easy to understand what is happening though.

Now: you use pgroll. An absolute heaping ton of magic happens behind the scenes. Every table is replaced by a view, and tables get tons of 'secret' columns holding old data. Every operation that made sense before (ALTER TABLE foo ADD COLUMN bar INTEGER NOT NULL) turns into some ugly JSON mess (look at the docs, it's horrible) that drops into the magic of pgroll to get turned into who-knows-what operations in the db, with the PROMISE that it can be reversed safely. Since all software has bugs, pgroll has bugs. When pgroll leaves you in a broken state, which it will sooner or later, you are just FUCKED. You will have to reverse engineer all its complicated magic and try to get it back on the rails live.
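For reference, the pgroll equivalent of that one-line ALTER looks roughly like this (a sketch based on the docs; the exact field names, and whether an `up` backfill expression is required for a NOT NULL column, should be checked against the current pgroll documentation):

```json
{
  "name": "02_add_bar",
  "operations": [
    {
      "add_column": {
        "table": "foo",
        "up": "0",
        "column": {
          "name": "bar",
          "type": "integer",
          "nullable": false
        }
      }
    }
  ]
}
```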

You're trading something simple that you understand, but that's maybe not super convenient, for something that is magically convenient and will eventually put a bullet in your head. It's a horrific product; don't use it.

erulabs · a year ago
What is good for the goose is not necessarily good for the gander.

Obviously for startups, you're 100% right. Just announce a brief downtime and/or do migrations after hours. Keep it simple; no one will care if their requests time out for 30 seconds once a week.

If your company has hundreds of developers making changes across every timezone and downtime (or developers being blocked waiting for scheduled merge windows) costs real money or creates real problems other than optics, something like this or Vitess (MySQL) is definitely worth it.

Engineering should not be a "one-size-fits-all" type of job, and while I do love postgres, my main gripe with the community is that the "keep it simple stupid" mentality persists well beyond its sell-by date in many cases.

ltbarcly3 · a year ago
The bigger and more important your company is, the less you should rely on a tool like this. You have more budget to invest in operations and less tolerance for being down for 3 days when a tool like this has a bug that takes your site down. You should hire DBAs and operations staff who understand how to apply db migrations safely, and have them review migrations beforehand and/or apply them during deployments. It's not hard to do manually, just a little bit of extra work (not that much). With a little feedback, engineers will learn the basics of writing safe migrations, and then the use case for this product is dramatically reduced anyway.

You keep things simple because you need to be able to understand what is going on to work with it later. pgroll is inherently complex and poorly designed, but even if it wasn't it is still bad to use something that you can't reasonably correct the problems it causes when it breaks.

emmelaich · a year ago
I've found in practice, over many years in many industries, that downtime is actually much easier to schedule than people assume, and that it makes many things far, far easier.
tudorg · a year ago
I think it is fair criticism that this adds complexity. However, I do have a couple of counter-arguments:

In order to avoid downtime and locking, you generally need multiple steps (e.g. some variation of: add another column, backfill the data, remove the previous column). You can codify this in long guidebooks on how to do schema changes (for example this one from GitLab [1]). You also need to orchestrate your app deployments in between those steps, and you often need some backwards-compatibility code.

This is all fine, but: 1. it slows you down, and 2. it's manual and error-prone. With pgroll, the process is always the same (start pgroll migration, deploy code, complete/rollback migration), so the team can exercise it often.
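The manual multi-step sequence above might look something like this in raw SQL (a sketch; the table and column names are made up for illustration):

```sql
-- Step 1 (expand): add the new column, nullable for now, so the
-- ALTER only takes a brief lock.
ALTER TABLE users ADD COLUMN full_name text;

-- Step 2: backfill in batches rather than one long transaction,
-- to avoid holding locks and bloating WAL.
UPDATE users SET full_name = name
WHERE id >= 1 AND id < 10001 AND full_name IS NULL;
-- ...repeat for subsequent id ranges...

-- Step 3: deploy app code that writes both columns and reads the
-- new one.

-- Step 4 (contract): once nothing reads the old column, tighten
-- the constraint and drop the old column.
ALTER TABLE users ALTER COLUMN full_name SET NOT NULL;
ALTER TABLE users DROP COLUMN name;
```

Each step has to be coordinated with an app deployment, which is exactly the orchestration burden being described.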

Second, while any software has bugs, it's worth noting that the main reason rolling back is quick and safe with pgroll is that it only has to drop views and any hidden columns. While the physical schema is changed, it is changed in a backwards-compatible way until you complete the migration, so you can always skip the views if you have to bypass whatever pgroll is doing.

[1]: https://docs.gitlab.com/ee/development/migration_style_guide...

hinkley · a year ago
I don't necessarily see friction as a bad thing. At my last job I had to explain, a lot, that yes, we did in fact do a whole bunch of work to smooth out a process, so why were we "only" seeing a 70% reduction in error rate per unit time? Because we were now using the process 3x as much. We reduced errors by 10x, which makes people more likely to use the process. Supply and demand.

A bit of friction on tasks that can result in massive problems can cause people to tap the brakes a bit.

rixed · a year ago
Pgroll shines if you are doing slow rollouts.

Recently, shopping for a tool to manage SQL migration patches with no need for slow rollouts, I reviewed many such tools, and the one that impressed me most was sqitch: https://github.com/sqitchers/sqitch

So if you are interested in this field and pgroll is not quite what you are looking for, I recommend you have a look at sqitch.

rswail · a year ago
Using sqitch in production, the main issue is that it effectively has no knowledge of what is happening beyond running psql files for you. So it's hard to keep track of what the actual resulting schema should look like after each change has been applied, in order to work out whether it's correct.

Getting people to write deploy/revert scripts is relatively easy. Asking them to write verify scripts that do more than check that a column has been added/removed is hard.
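(For context: a typical minimal sqitch verify script is just SQL that errors out if the change is missing, e.g. a zero-row probe like this hypothetical one, which confirms the column exists but says nothing about its semantics:)

```sql
-- verify/add_full_name.sql (hypothetical change name)
-- Fails with an "column does not exist" error if the deploy
-- script never ran; the WHERE FALSE keeps it from reading rows.
SELECT full_name FROM users WHERE FALSE;
```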

There are the "physical" issues of modifying a schema, and sqitch is great for that. But handling the "logical" issues of schema migration takes more than an automated tool for running scripts.

This tool (pgroll) allows you to actually test the modified schema for logical validity without impacting the ongoing operations, which to me, seems like a win.

rixed · a year ago
> Using sqitch in production, the main issue is that it effectively has no knowledge of what is happening except for running psql files for you.

I came to the conclusion that you cannot have anything more than that without compromising on how you can change your schema.

> So it's hard to keep track of what the actual resulting schema should look like after each change has been applied to work out whether it's correct.

At first sight, this is a gripe with SQL itself: the SQL to change a schema cannot trivially be inferred from the "before" and "after" schema definitions. But there is probably no way around that, in all generality.

One can store the current version of the schema in the source repository and make sure that the schema extracted after applying one or several migrations matches it.
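(A minimal version of that check, sketched and untested, assuming a dev database and a checked-in `db/schema.sql`; the paths and database name are made up:)

```sh
# After applying migrations to a scratch database, dump only the
# schema and diff it against the version checked into the repo.
pg_dump --schema-only --no-owner mydb > /tmp/actual-schema.sql
diff db/schema.sql /tmp/actual-schema.sql && echo "schema matches"
```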

lucideer · a year ago
> if Pgroll is not quite what you are looking for

If you don't need slow rollouts, what would you say the downsides of using Pgroll over Sqitch would be?

(I've used neither, but I got the impression from the op that slow rollouts was a feature, not a requirement)

rixed · a year ago
What Pgroll offers does not come for free: it puts constraints on your schema and on what migrations you can perform. Most "automatic migration" tools I've seen are designed for the easy migrations ("horizontal" changes, i.e. adding/removing/altering columns). But a schema migration is not, in general, equivalent to a view and a pair of triggers, even if in practice it's often limited to that. (Simplest, if a bit contrived, example I can come up with: imagine you have a short table with 3 columns and 4 rows, and you wish to use instead a transposition of that table, with 4 columns and 3 rows.)

It's also why I'm dubious of "revert patches" in general. If there exists a revert patch for a migration, that was an easy migration. Sqitch can use a revert patch, like it can use a verify statement, but only as a convenience; it does not require them.

Yet, 9 schema migrations out of 10 are easy ones that pgroll handles nicely. It's a bit like using an ORM: if you just need a DB to store objects manipulated only in your program, sure, go ahead and use an ORM; but if your DB is the core of your business, then you'd better not let an ORM anywhere near your schema.

At the end of the day, I'm under the impression that if one wants to handle the general case, then one has to keep the whole previous DB and apps in one hand and the new DB and apps in the other, and transition customers from the former to the latter. That's much less work if you don't need slow rollouts.

mdaniel · a year ago
My experience with Sqitch was "all the fun of git rebase with all the fun of manual rollback and verification code" :-( I would never wish that upon my enemies

I'm open to the possibility that we had just dragged legacy antipatterns into the project, since it was shoehorned into the team by a similarly strongly opinionated advocate.

filterfish · a year ago
I've been using sqitch for years and haven't had any problems with it. Though they haven't been big projects with lots of devs, so I may be using it in its sweet spot.
jokethrowaway · a year ago
I don't think it's that bad but I used sqitch from the start of the project - so maybe I am the problem

I also don't think rebasing is the nightmare everybody makes it out to be: git rebase -i, follow the instructions, and you're good to go.

hendiatris · a year ago
Sqitch is an incredibly underappreciated tool. It doesn't have a business pushing it like Flyway and Liquibase, so it isn't as widely known, but I vastly prefer it to comparable migration tools.
Arctic_fly · a year ago
We've looked into this tool a couple of times but haven't adopted it yet. The main issue is compatibility with Prisma, which we use for defining our db's schema and automating migrations. A slick integration between the two would go a long way toward convincing us to adopt it. In the meantime, we'll go with convenience and (slightly, almost immeasurably) lower reliability. We haven't had downtime due to a migration in months!
tudorg · a year ago
Yes, this is the biggest issue right now and we’re working on a solution.

What we want to do is add the ability to generate the pgroll migrations from the Prisma-generated migration files. Depending on the operation, you might need to add more info.

This will work fairly generally, not only for Prisma.

notpushkin · a year ago
Same, but with Alembic! I'm wondering how hard it would be to open pgroll up to third-party migration systems.
rtuin · a year ago
This seems like a great tool! I like how it "simply" works by putting a view on top of the actual table. This concept is widely used in the data engineering world (in dbt).

There must be an easier way to write migrations for pgroll though. I mean, JSON, really?

colemannerd · a year ago
Are migrations still specified in json?
tudorg · a year ago
Yes, but we're working on a converter from DDL sql to pgroll json.

The reason for JSON is that the pgroll migrations are "higher level". For example, let's say you are adding new unique columns whose data should be derived from an existing column (e.g. splitting `name` into `first_name` and `last_name`). The pgroll migration contains not only the info that new columns are added, but also how to backfill the data.

The sql2pgroll converter creates the higher-level migration files, but leaves placeholders for the "up" / "down" data migrations.
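For the name-splitting example above, such a migration might look roughly like this (a sketch; the exact operation shape and field names should be checked against the pgroll docs, and the `split_part` backfill expressions are illustrative):

```json
{
  "name": "03_split_name",
  "operations": [
    {
      "add_column": {
        "table": "users",
        "up": "split_part(name, ' ', 1)",
        "column": { "name": "first_name", "type": "text", "nullable": false }
      }
    },
    {
      "add_column": {
        "table": "users",
        "up": "split_part(name, ' ', 2)",
        "column": { "name": "last_name", "type": "text", "nullable": false }
      }
    }
  ]
}
```

The `up` expressions are the part a plain ALTER TABLE statement cannot express, which is the point being made about the higher-level format.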

The issue where sql2pgroll is tracked is this one: https://github.com/xataio/pgroll/issues/504

solatic · a year ago
I completely understand the need for a higher-level language above SQL, but straight JSON is a deal-breaker. It's not just the lack of comments; it's also that editors won't apply SQL syntax highlighting to SQL embedded in JSON strings, and highlighting is what helps people catch trivial typos. A configuration language that allows importing a file as a string would let users write the SQL in files with .sql extensions and get syntax highlighting.

pgroll is written in Go, so if you were to accept configuration written in CUE, you would get the best of all worlds:

* Besides support for comments, there's first-class support in Go for writing configuration in CUE and then importing it: https://cuelang.org/docs/concept/how-cue-works-with-go/#load...

* SQL can be written in files with .sql extensions and embedded in CUE with @embed: https://cuelang.org/docs/howto/embed-files-in-cue-evaluation...
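A rough sketch of what that could look like (untested; file embedding was gated behind the CUE_EXPERIMENT=embed flag last I checked, and the migration field names here are hypothetical, not pgroll's actual schema):

```cue
@extern(embed)

package migrations

migration: {
	name: "02_backfill_full_name"
	// Pull the statement in from a plain .sql file so editors
	// highlight and lint it as SQL.
	up: _ @embed(file="sql/02_up.sql", type=text)
}
```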

colemannerd · a year ago
Thank you so much! I’ll be watching
joelhaasnoot · a year ago
Lol, it's funny that you need a higher-level language and choose... JSON.
mdaniel · a year ago
https://github.com/xataio/pgroll/issues/281 may be worth a :+1: even though it seems from tudorg's comment that they're really wedded to .json :-(
tudorg · a year ago
Ha ha, we're not wedded to it; I was only explaining why a .sql file is not quite enough. Using some JSON equivalent should be fine. Thanks for pointing to the issue.
cpursley · a year ago
Here are a few other migration tooling options for Postgres: https://gist.github.com/cpursley/c8fb81fe8a7e5df038158bdfe0f...
brycethornton · a year ago
Can you add https://github.com/shayonj/pg-osc? It's my favorite PG migration tool.

Deleted Comment

paulryanrogers · a year ago
How many have you used?

At ORIS I wrote a Laravel wrapper for PTOSC, and I really miss it now that I'm back on PostgreSQL. These days I mostly put updatable views in front of modified tables, then drop-swap things later once any transitional backfilling is done.

shayonj · a year ago
Thank you Bryce <3
muratsu · a year ago
Unrelated to the OP's product, but I'm curious how people are solving this on a smaller scale with Next.js + Prisma/Drizzle. Do you just run the built-in migrate script with npm run?
maxf · a year ago
We use a CI job that runs drizzle migrate whenever the schema files change. Drizzle doesn't have rollbacks, though. Replacing this process with pgroll as a sort of drop-in replacement would be nice. Then orchestrate everything with a couple of CI jobs, and done!
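Something like this GitHub Actions sketch (untested; the workflow layout, paths, and secret name are assumptions, and `drizzle-kit migrate` requires a reasonably recent drizzle-kit):

```yaml
# .github/workflows/migrate.yml
name: migrate
on:
  push:
    branches: [main]
    paths: ["drizzle/**"]
jobs:
  migrate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      # Apply any pending migrations against the target database.
      - run: npx drizzle-kit migrate
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
```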
konradb · a year ago
Cool new site! I got "invalid invite" on the Discord link from the front page. I only mention it because I'd like to join.
tudorg · a year ago
Oops, fixing it now. This link should work: https://discord.com/invite/kvAcQKh7vm