Go ahead, self-host Postgres

Self-hosting is more a question of responsibility I'd say. I am running a couple of SaaS products and self-host at much better performance at a fraction of the cost of running this on AWS. It's amazing and it works perfectly fine.

For client projects, however, I always try and sell them on paying the AWS fees, simply because it shifts the responsibility of the hardware being "up" to someone else. It does not inherently solve the downtime problem, but it allows me to say, "we'll have to wait until they've sorted this out, Ikea and Disney are down, too."

Doesn't always work like that and isn't always a tried-and-true excuse, but generally lets me sleep much better at night.

With limited budgets, however, it's hard to accept the cost of RDS (and we're talking with at least one staging environment) when comparing it to a very tight 3-node Galera cluster running on Hetzner at barely a couple of bucks a month.

Or Cloudflare, titan at the front, being down again today and the past two days (intermittently) after also being down a few weeks ago and earlier this year as well. Also had SQS queues time out several times this week, they picked up again shortly, but it's not like those things ...never happen on managed environments. They happen quite a bit.

mattmanser · 2 months ago

Over 20 year I've had lots of clients on self-hosted, even self-hosting SQL on the same VM as the webserver as you used to in the long distant past for low-usage web apps.

I have never, ever, ever had a SQL box go down. I've had a web server go down once. I had someone who probably shouldn't have had access to a server accidentally turn one off once.

The only major outage I've had (2/3 hours) was when the box was also self-hosting an email server and I accidentally caused it to flood itself with failed delivery notices with a deploy.

I may have cried a little in frustration and panic but it got fixed in the end.

I actually find using cloud hosted SQL in some ways harder and more complicated because it's such a confusing mess of cost and what you're actually getting. The only big complication is setting up backups, and that's a one-off task.

paulryanrogers · 2 months ago

Disks go bad. RAID is nontrivial to set up. Hetzner had a big DC outage that lead to data loss.

Off site backups or replication would help, though not always trivial to fail over.

arwhatever · 2 months ago

Me: “Why are we switching from NoNameCMS to Salesforce?”

Savvy Manager: “NoNameCMS often won’t take our support calls, but if Salesforce goes down it’s in the WSJ the next day.”

dilyevsky · 2 months ago

This ignores the case when BigVendor is down for your account and your account only and support is mia, which is not that uncommon ime

TheNewsIsHere · 2 months ago

Just wait until you end up spending $100,000 for an awful implantation from a partner who pretends to understand your business need but delivers something that doesn’t work.

But perhaps I’m bitter from prior Salesforce experiences.

madeofpalk · 2 months ago

> but it allows me to say, "we'll have to wait until they've sorted this out, Ikea and Disney are down, too."

From my experience your client’s clients don’t care about this when they’re still otherwise up.

tjwebbnorfolk · 2 months ago

Yes but the fact that it's "not their fault" keeps the person from getting fired.

Don't underestimate the power of CYA

blitz_skull · 2 months ago

From my experience, this completely disavows you from an otherwise reputation damaging experience.

vb-8448 · 2 months ago

You can still outsource up to VM level and handle everything else on you own.

Obviously it depends on the operational overhead of specific technology.

bossyTeacher · 2 months ago

> Self-hosting is more a question of responsibility I'd say. I am running a couple of SaaS products and self-host at much better performance at a fraction of the cost of running this on AWS

It is. You need to answer the question: what are the consecuences of your service being down for lets say 4 hours or some security patch isn't properly applied or you have not followed the best practices in terms of security? Many people are technically unable, lack the time or the resources to be able to confidently address that question, hence paying for someone else to do it.

Your time is money though. You are saving money but giving up time.

Like everything, it is always cheaper to do it (it being cooking at home, cleaning your home, fixing your own car, etc) yourself (if you don't include the cost of your own time doing the service you normally pay someone else for).

bigstrat2003 · 2 months ago

> Like everything, it is always cheaper to do it (it being cooking at home, cleaning your home, fixing your own car, etc) yourself (if you don't include the cost of your own time doing the service you normally pay someone else for).

In a business context the "time is money" thing actually makes sense, because there's a reasonable likelihood that the business can put the time to a more profitable use in some other way. But in a personal context it makes no sense at all. Realistically, the time I spend cooking or cleaning was not going to earn me a dime no matter what else I did, therefore the opportunity cost is zero. And this is true for almost everyone out there.

PunchyHamster · 2 months ago

You can pay someone else to manage your hardware stack, there are literal companies that will just keep it running, while you just deploy your apps on that.

> It is. You need to answer the question: what are the consecuences of your service being down for lets say 4 hours or some security patch isn't properly applied or you have not followed the best practices in terms of security?

There is one advantage self hosted setup has here, if you set up VPN, only your employees have access, and you can have server not accessible from the internet. So even in case of zero day that WILL make SaaS company leak your data, you can be safe(r) with self-hosted solution.

> Your time is money though. You are saving money but giving up time.

The investment compounds. Setting up infra to run a single container for some app takes time and there is good chance it won't pay back for itself.

But 2nd service ? Cheaper. 5th ? At that point you probably had it automated enough that it's just pointing it at docker container and tweaking few settings.

It's cheaper if you include your own time. You pay a technical person at your company to do it. Saas company does that, then pays sales and PR person to sell it, then pays income tax to it, then it also needs to "pay" investors.

Yeah making a service for 4 people in company can be more work than just paying $10/mo to SaaS company. But 20 ? 50 ? 100 ? It quickly gets to point where self hosting (whether actually "self" or by using dedicated servers, or by using cloud) actually pays off

jbverschoor · 2 months ago

Yea I agree.. better outsource product development, management, and everything else too by that narrative

Thaxll · 2 months ago

That argument does not hold when there is aws serverless pg available, which cost almost nothing for low traffic and is vastly superior to self hosting regarding observability, security, integration, backup ect...

There is no reason to self manage pg for dev / environnement.

https://aws.amazon.com/rds/aurora/serverless/

starttoaster · 2 months ago

"which cost almost nothing for low traffic" you invented the retort "what about high traffic" within your own message. I don't even necessarily mean user traffic either. But if you constantly have to sync new records over (as could be the case in any kind of timeseries use-case) the internal traffic could rack up costs quickly.

"vastly superior to self hosting regarding observability" I'd suggest looking into the cnpg operator for Postgres on Kubernetes. The builtin metrics and official dashboard is vastly superior to what I get from Cloudwatch for my RDS clusters. And the backup mechanism using Barman for database snapshots and WAL backups is vastly superior to AWS DMS or AWS's disk snapshots which aren't portable to a system outside of AWS if you care about avoiding vendor lock-in.

jread · 2 months ago

This was true for RDS serverless v1 which scaled to 0 but is no longer offered. V2 requires a minimum 0.5 ACU hourly commit ($40+ /mo).

maccard · 2 months ago

Aurora serverless requires provisioned compute - it’s about $40/mo last time I checked.

gonzo41 · 2 months ago

Just use a pg container on a vm, cheap as chips and you can do anything to em.

> I'd argue self-hosting is the right choice for basically everyone, with the few exceptions at both ends of the extreme:

> If you're just starting out in software & want to get something working quickly with vibe coding, it's easier to treat Postgres as just another remote API that you can call from your single deployed app

> If you're a really big company and are reaching the scale where you need trained database engineers to just work on your stack, you might get economies of scale by just outsourcing that work to a cloud company that has guaranteed talent in that area. The second full freight salaries come into play, outsourcing looks a bit cheaper.

This is funny. I'd argue the exact opposite. I would self host only:

* if I were on a tight budget and trading an hour or two of my time for a cost saving of a hundred dollars or so is a good deal; or

* at a company that has reached the scale where employing engineers to manage self-hosted databases is more cost effective than outsourcing.

I have nothing against self-hosting PostgreSQL. Do whatever you prefer. But to me outsourcing this to cloud providers seems entirely reasonable for small and medium-sized businesses. According to the author's article, self hosting costs you between 30 and 120 minutes per month (after setup, and if you already know what to do). It's easy to do the math...

Nextgrid · 2 months ago

> employing engineers to manage self-hosted databases is more cost effective than outsourcing

Every company out there is using the cloud and yet still employs infrastructure engineers to deal with its complexity. The "cloud" reducing staff costs is and was always a lie.

PaaS platforms (Heroku, Render, Railway) can legitimately be operated by your average dev and not have to hire a dedicated person; those cost even more though.

Another limitation of both the cloud and PaaS is that they are only responsible for the infrastructure/services you use; they will not touch your application at all. Can your application automatically recover from a slow/intermittent network, a DB failover (that you can't even test because your cloud providers' failover and failure modes are a black box), and so on? Otherwise you're waking up at 3am no matter what.

molf · 2 months ago

> Every company out there is using the cloud and yet still employs infrastructure engineers

Every company beyond a particular size surely? For many small and medium sized companies hiring an infrastructure team makes just as little sense as hiring kitchen staff to make lunch.

scott_w · 2 months ago

> Every company out there is using the cloud and yet still employs infrastructure engineers to deal with its complexity. The "cloud" reducing staff costs is and was always a lie.

This doesn’t make sense as an argument. The reason the cloud is more complex is because that complexity is available. Under a certain size, a large number of cloud products simply can’t be managed in-house (and certainly not altogether).

Also your argument is incorrect in my experience.

At a smaller business I worked at, I was able to use these services to achieve uptime and performance that I couldn’t achieve self-hosted, because I had to spend time on the product itself. So yeah, we’d saved on infrastructure engineers.

At larger scales, what your false dichotomy suggests also doesn’t actually happen. Where I work now, our data stores are all self-managed on top of EC2/Azure, where performance and reliability are critical. But we don’t self-host everything. For example, we use SES to send our emails and we use RDS for our app DB, because their performance profiles and uptime guarantees are more than acceptable for the price we pay. That frees up our platform engineers to spend their energy on keeping our uptime on our critical services.

aranelsurion · 2 months ago

> still employs infrastructure engineers

> The "cloud" reducing staff costs

Both can be true at the same time.

Also:

> Otherwise you're waking up at 3am no matter what.

Do you account for frequency and variety of wakeups here?

spiralpolitik · 2 months ago

In-house vs Cloud Provider is largely a wash in terms of cost. Regardless of the approach, you are going need people to maintain stuff and people cost money. Similarly compute and storage cost money so what you lose on the swings, you gain on the roundabouts.

In my experience you typically need less people if using a Cloud Provider than in-house (or the same number of people can handle more instances) due to increased leverage. Whether you can maximize what you get via leverage depends on how good your team is.

US companies typically like to minimize headcount (either through accounting tricks or outsourcing) so usually using a Cloud Provider wins out for this reason alone. It's not how much money you spend, it's how it looks on the balance sheet ;)

matthewmacleod · 2 months ago

I don’t think it’s a lie, it’s just perhaps overstated. The number of staff needed to manage a cloud infrastructure is definitely lower than that required to manage the equivalent self-hosted infrastructure.

Whether or not you need that equivalence is an orthogonal question.

riedel · 2 months ago

Working in a university Lab self-hosting is the default for almost anything. While I would agree that cost are quite low, I sometimes would be really happy to throw money at problems to make them go away. Without having the chance and thus being no expert, I really see the opportunity of scaling (up and down) quickly in the cloud. We ran a postgres database of a few 100 GB with multiple read replica and we managed somehow, but actually really hit our limits of expertise at some point. At some point we stopped migrating to newer database schemas because it was just such a hassle keeping availability. If I had the money as company, I guess I would have paid for a hosted solution.

AYBABTME · 2 months ago

The fact that as many engineers are on payroll doesn't mean that "cloud" is not an efficiency improvement. When things are easier and cheaper, people don't do less or buy less. They do more and buy more until they fill their capacity. The end result is the same number (or more) of engineers, but they deal with a higher level of abstraction and achieve more with the same headcount.

strken · 2 months ago

I can't talk about staff costs, but as someone who's self-hosted Postgres before, using RDS or Supabase saves weeks of time on upgrades, replicas, tuning, and backups (yeah, you still need independent backups, but PITRs make life easier). Databases and file storage are probably the most useful cloud functionality for small teams.

If you have the luxury of spending half a million per year on infrastructure engineers then you can of course do better, but this is by no means universal or cost-effective.

erulabs · 2 months ago

Well sure you still have 2 or 3 infra people but now you don’t need 15. Comparing to modern Hetzner is also not fair to “cloud” in the sense that click-and-get-server didn’t exist until cloud providers popped up. That was initially the whole point. If bare metal behind an API existed in 2009 the whole industry would look very different. Contingencies Rule Everything Around Me.

cardanome · 2 months ago

You are missing that most services don't have high availability needs and don't need to scale.

Most projects I have worked on in my career have never seen more than a hundred concurrent users. If something goes down on Saturday, I am going to fix it on Monday.

I have worked on internal tools were I just added a postgres DB to the docker setup and that was it. 5 Minute of work and no issues at all. Sure if you have something customer facing, you need to do a bit more and setup a good backup strategy but that really isn't magic.

lucideer · 2 months ago

> at a company that has reached the scale where employing engineers to manage self-hosted databases is more cost effective than outsourcing.

This is the crux of one of the most common fallacies in software engineering decision making today. I've participated in a bunch of architecture / vendor evaluations that concluded managed services are more cost effective almost purely because they underestimated (or even discarded entirely) the internal engineering cost of vendor management. Black box debugging is one of the most time costuming engineering pursuits, & even when it's something widely documented & well supported like RDS, it's only really tuned for the lowest common denominator - the complexities of tuning someone else's system at scale can really add up to only marginally less effort than self-hosting (if there's any difference at all).

But most importantly - even if it's significantly less effort than self-hosting, it's never effectively costed when evaluating trade-offs - that's what leads to this persistent myth about the engineering cost of self-hosting. "Managing" managed services is a non-zero cost.

Add to that the ultimate trade-off of accountability vs availability (internal engineers care less about availability when it's out of there hands - but it's still a loss to your product either way).

bastawhiz · 2 months ago

> Black box debugging is one of the most time costuming engineering pursuits, & even when it's something widely documented & well supported like RDS, it's only really tuned for the lowest common denominator - the complexities of tuning someone else's system at scale can really add up to only marginally less effort than self-hosting (if there's any difference at all).

I'm really not sure what you're talking about here. I manage many RDS clusters at work. I think in total, we've spent maybe eight hours over the last three years "tuning" the system. It runs at about 100kqps during peak load. Could it be cheaper or faster? Probably, but it's a small fraction of our total infra spend and it's not keeping me up at night.

Virtually all the effort we've ever put in here has been making the application query the appropriate indexes. But you'd do no matter how you host your database.

Hell, even the metrics that RDS gives you for free make the thing pay for itself, IMO. The thought of setting up grafana to monitor a new database makes me sweat.

convolvatron · 2 months ago

its not. I've been in a few shops that use RDS because they think their time is better spend doing other things.

except now they are stuck trying to maintain and debug Postgres without having the same visibility and agency that they would if they hosted it themselves. situation isn't at all clear.

Nextgrid · 2 months ago

One thing unaccounted for if you've only ever used cloud-hosted DBs is just how slow they are compared to a modern server with NVME storage.

This leads the developers to do all kinds of workarounds and reach for more cloud services (and then integrating them and - often poorly - ensuring consistency across them) because the cloud hosted DB is not able to handle the load.

On bare-metal, you can go a very long way with just throwing everything at Postgres and calling it a day.

molf · 2 months ago

Interesting. Is this an issue with RDS?

I use Google Cloud SQL for PostgreSQL and it's been rock solid. No issues; troubleshooting works fine; all extensions we need already installed; can adjust settings where needed.

prisenco · 2 months ago

| self hosting costs you between 30 and 120 minutes per month

Can we honestly say that cloud services taking a half hour to two hours a month of someone's time on average is completely unheard of?

SatvikBeri · 2 months ago

I handle our company's RDS instances, and probably spend closer to 2 hours a year than 2 hours a month over the last 8 years.

It's definitely expensive, but it's not time-consuming.

esseph · 2 months ago

Very much depends on what you're doing in the cloud, how many services you are using, and how frequently those services and your app needs updates.

fhcuvyxu · 2 months ago

Self hosting does not cost you that much at all. It's basically zero once you've got backups automated.

npn · 2 months ago

I also encourage people to just use managed databases. After all, it is easy to replace such people. Heck actually you can fire all of them and replace the demand with genAI nowadays.

anal_reactor · 2 months ago

The discussion isn't "what is more effective". The discussion is "who wants to be blamed in case things go south". If you push the decision to move to self-hosted and then one of the engineers fucks up the database, you have a serious problem. If same engineer fucks up cloud database, it's easier to save your own ass.

jrochkind1 · 2 months ago

Agreed. As someone in a very tiny shop, all us devs want to do as little context switching to ops as possible. Not even half a day a month. Our hosted services are in aggregate still way cheaper than hiring another person. (We do not employ an "infrastructure engineer").

arevno · 2 months ago

> trading an hour or two of my time

pacman -S postgresql

initdb -D /pathto/pgroot/data

grok/claude/gpt: "Write a concise Bash script for setting up an automated daily PostgreSQL database backup using pg_dump and cron on a Linux server, with error handling via logging and 7-day retention by deleting older backups."

ctrl+c / ctrl+v

Yeah that definitely took me an hour or two.

solatic · 2 months ago

So your backups are written to the same disk?

> datacenter goes up in flames

> 3-2-1 backups: 3 copies on 2 different types of media with at least 1 copy off-site. No off-site copy.

Whoops!

mittermayr · 2 months ago

ZeroConcerns · 2 months ago

So, yeah, I guess there's much confusion about what a 'managed database' actually is? Because for me, the table stakes are:

-Backups: the provider will push a full generic disaster-recovery backup of my database to an off-provider location at least daily, without the need for a maintenance window

-Optimization: index maintenance and storage optimization are performed automatically and transparently

-Multi-datacenter failover: my database will remain available even if part(s) of my provider are down, with a minimal data loss window (like, 30 seconds, 5 minutes, 15 minutes, depending on SLA and thus plan expenditure)

-Point-in-time backups are performed at an SLA-defined granularity and with a similar retention window, allowing me to access snapshots via a custom DSN, not affecting production access or performance in any way

-Slow-query analysis: notifying me of relevant performance bottlenecks before they bring down production

-Storage analysis: my plan allows for #GB of fast storage, #TB of slow storage: let me know when I'm forecast to run out of either in the next 3 billing cycles or so

Because, well, if anyone provides all of that for a monthly fee, the whole "self-hosting" argument goes out of the window quickly, right? And I say that as someone who absolutely adores self-hosting...

thedougd · 2 months ago

It's even worse when you start finding you're staffing specialized skills. You have the Postgres person, and they're not quite busy enough, but nobody else wants to do what they do. But then you have an issue while they're on vacation, and that's a problem. Now I have a critical service but with a bus factor problem. So now I staff two people who are now not very busy at all. One is a bit ambitious and is tired of being bored. So he's decided we need to implement something new in our Postgres to solve a problem we don't really have. Uh oh, it doesn't work so well, the two spend the next six months trying to work out the kinks with mixed success.

arcbyte · 2 months ago

Slack is a necessary component in well functioning systems.

satvikpendem · 2 months ago

This would be a strange scenario because why would you keep these people employed? If someone doesn't want to do the job required, including servicing Postgres, then they wouldn't be with me any longer, I'll find someone who does.

marcosdumay · 2 months ago

IMO, the reason to self-host your database is latency.

Yes, I'd say backups and analysis are table stakes for hiring it, and multi-datacenter failover is a relevant nice to have. But the reason to do it yourself is because it's literally impossible to get anything as good as you can build in somebody's else computer.

andersmurphy · 2 months ago

Yup, often orders of magnitude better.

If you set it up right, you can automate all this as well by self hosting. There is really nothing special about automating backups or multi region fail over.

awestroke · 2 months ago

But then you have to check that these mechanisms work regularly and manually

odie5533 · 2 months ago

Self-host things the boss won't call at 3 AM about: logs, traces, exceptions, internal apps, analytics. Don't self-host the database or major services.

cube00 · 2 months ago

Depending on your industry, logs can be very serious business.

wahnfrieden · 2 months ago

Yugabyte open source covers a lot of this

graemep · 2 months ago

Which providers do all of that?

BoorishBears · 2 months ago

I don't know which don't?

The default I've used on Amazon and GCP both do (RDS, Cloud SQL)

jeffbee · 2 months ago

GCP Alloy DB

dangoodmanUT · 2 months ago

There should be no data loss window with a hosted database

Feom what I remember if AWS loses your data they are basically give you some credits and that's it.

jeremyjh · 2 months ago

That requires synchronous replication, which reduces availability and performance.

xboxnolifes · 2 months ago

Why is that?

donatj · 2 months ago

The author brings up the point, but I have always found surprising how much more expensive managed databases are than a comparable VPS.

I would expect a little bit more as a cost of the convenience, but in my experience it's generally multiple times the expense. It's wild.

This has kept me away from managed databases in all but my largest projects.

orev · 2 months ago

Once they convince you that you can’t do it yourself, you end up relying on them, but didn’t develop the skills you would need to migrate to another provider when they start raising prices. And they keep raising prices because by then you have no choice.

zbentley · 2 months ago

There is plenty of provider markup, to be sure. But it is also very much not a given that the hosted version of a database is running software/configs that are equivalent to what you could do yourself. Many hosted databases are extremely different behind the scenes when it comes to durability, monitoring, failover, storage provisioning, compute provisioning, and more. Just because it acts like a connection hanging off a postmaster service running on a server doesn’t mean that’s what your “psql” is connected to on RDS Aurora (or many of the other cloud-Postgres offerings).

citizenpaul · 2 months ago

I have not tested this in real life yet but it seems like all the argument about vendor lock in can be solved, if you bite the bullet and learn basic Kubernetes administration. Kubernetes is FOSS and there are countless Kubernetes as a service providers.

I know there are other issues with Kubernetes but at least its transferable knowledge.

ch2026 · 2 months ago

Wait, are you talking about cloud providers or LLMs?

nrhrjrjrjtntbt · 2 months ago

Yes if the DB is 5x the VM and the the VM is 10x the dedicated server from say OVH etc. then you are payng 50x.

tgtweak · 2 months ago

As someone who self hosted mysql (in complex master/slave setups) then mariadb, memsql, mongo and pgsql on bare metal, virtual machines then containers for almost 2 decades at this point... you can self host with very little downtime and the only real challenge is upgrade path and getting replication right.

Now with pgbouncer (or whatever other flavor of sql-aware proxy you fancy) you can greatly reduce the complexity involved in managing conventionally complex read/write routing and sharding to various replicas to enable resilient, scalable production-grade database setups on your own infra. Throw in the fact that copy-on-write and snapshotting is baked into most storage today and it becomes - at least compared to 20 years ago - trivial to set up DRS as well. Others have mentioned pgBackRest and that further enforces the ease with which you can set up these traditionally-complex setups.

Beyond those two significant features there isn't many other reasons you'd need to go with hosted/managed pgsql. I've yet to find a managed/hosted database solution that doesn't have some level of downtime to apply updates and patches so even if you go fully hosted/managed it's not a silver bullet. The cost of managed DB is also several times that of the actual hardware it's running on, so there is a cost factor involved as well.

I guess all this to say it's never been a better time to self-host your database and the learning curve is as shallow as it's ever been. Add to all of this that any garden-variety LLM can hand-hold you through the setup and management, including any issues you might encounter on the way.

heipei · 2 months ago

I still don't get how folks can hype Postgres with every second post on HN, yet there is no simple batteries-included way to run a HA Postgres cluster with automatic failover like you can do with MongoDB. I'm genuinely curious how people deal with this in production when they're self-hosting.

franckpachot · 2 months ago

It's largely cultural. In the SQL world, people are used to accepting the absence of real HA (resilience to failure, where transactions continue without interruption) and instead rely on fast DR (stop the service, recover, check for data loss, start the service). In practice, this means that all connections are rolled back, clients must reconnect to a replica known to be in synchronous commit, and everything restarts with a cold cache.

Yet they still call it HA because there's nothing else. Even a planned shutdown of the primary to patch the OS results in downtime, as all connections are terminated. The situation is even worse for major database upgrades: stop the application, upgrade the database, deploy a new release of the app because some features are not compatible between versions, test, re-analyze the tables, reopen the database, and only then can users resume work.

Everything in SQL/RDBMS was thought for a single-node instance, not including replicas. It's not HA because there can be only one read-write instance at a time. They even claim to be more ACID than MongoDB, but the ACID properties are guaranteed only on a single node.

One exception is Oracle RAC, but PostgreSQL has nothing like that. Some forks, like YugabyteDB, provide real HA with most PostgreSQL features.

About the hype: many applications that run on PostgreSQL accept hours of downtime, planned or unplanned. Those who run larger, more critical applications on PostgreSQL are big companies with many expert DBAs who can handle the complexity of database automation. And use logical replication for upgrades. But no solution offers both low operational complexity and high availability that can be comparable to MongoDB

Beyond the hype, the PostgreSQL community is aware of the lack of "batteries-included" HA. This discussion on the idea of a Built-in Raft replication mentions MongoDB as:

>> "God Send". Everything just worked. Replication was as reliable as one could imagine. It outlives several hardware incidents without manual intervention. It allowed cluster maintenance (software and hardware upgrades) without application downtime. I really dream PostgreSQL will be as reliable as MongoDB without need of external services.

https://www.postgresql.org/message-id/0e01fb4d-f8ea-4ca9-8c9...

abrookewood · 2 months ago

"I really dream PostgreSQL will be as reliable as MongoDB" ... someone needs to go and read up on Mongo's history!

Sure, the PostrgreSQL HA story isn't what we all want it to be, but the reliability is exceptional.

mfalcao · 2 months ago

The most common way to achieve HÁ is using Patroni. The easiest way to set it up is using Autobase (https://autobase.tech).

CloudNativePG (https://cloudnative-pg.io) is a great option if you’re using Kubernetes.

There’s also pg_auto_failover which is a Postgres extension and a bit less complex than the alternatives, but it has its drawbacks.

Be sure to read the Муths and Truths about Synchronous Replication in PostgreSQL (by the author of Patroni) before considering those solutions as cloud-native high availability: https://www.postgresql.eu/events/pgconfde2025/sessions/sessi...

tresil · 2 months ago

If you’re running Kubernetes, CloudNativePG seems to be the “batteries included” HA Postgres cluster that’s becoming the standard in this area.

CloudNativePG is automation around PostgreSQL, not "batteries included", and not the idea of Kubernetes where pods can die or spawn without impacting the availability. Unfortunately, naming it Cloud Native doesn't transform a monolithic database to an elastic cluster

monus · 2 months ago

We’ve recently had a disk failure in the primary and CloudNativePG promoted another to be primary but it wasn’t zero downtime. During transition, several queries failed. So something like pgBouncer together with transactional queries (no prepared statements) is still needed which has performance penalty.

jknoepfler · 2 months ago

I use Patroni for that in a k8s environment (although it works anywhere). I get an off-the-shelf declarative deployment of an HA postgres cluster with automatic failover with a little boiler-plate YAML.

Patroni has been around for awhile. The database-as-a-service team where I work uses it under the hood. I used it to build database-as-a-service functionality on the infra platform team I was at prior to that.

It's basially push-button production PG.

There's at least one decent operator framework leveraging it, if that's your jam. I've been living and dying by self-hosting everything with k8s operators for about 6-7 years now.

tempest_ · 2 months ago

We use patroni and run it outside of k8s on prem, no issues in 6 or 7 years. Just upgraded from pg 12 to 17 with basically no down time without issue either.

wb14123 · 2 months ago

Yeah I'm also wondering that. I'm looking for self-host PostgreSQL after Cockroach changed their free tier license but found the HA part of PostgreSQL is really lacking. I tested Patroni which seems to be a popular choice but found some pretty critical problems (https://www.binwang.me/2024-12-02-PostgreSQL-High-Availabili...). I tried to explore some other solutions, but found out the lack of a high level design really makes the HA for PostgreSQL really hard if not impossible. For example, without the necessary information in WAL, it's hard to enforce primary node even with an external Raft/Paxos coordinator. I wrote some of them down in this blog (https://www.binwang.me/2025-08-13-Why-Consensus-Shortcuts-Fa...) especially in the section "Highly Available PostgreSQL Cluster" and "Quorum".

My theory of why Postgres is still getting the hype is either people don't know the problem, or it's acceptable on some level. I've worked in a team that maintains the in house database cluster (even though we were using MySQL instead of PostgreSQL) and the HA story was pretty bad. But there were engineers manually recover the data lost and resolve data conflicts, either from the recovery of incident or from customer tickets. So I guess that's one way of doing business.

forinti · 2 months ago

I love Postgresql simply because it never gives me any trouble. I've been running it for decades without trouble.

OTOH, Oracle takes most of my time with endless issues, bugs, unexpected feature modifications, even on OCI!

dpedu · 2 months ago

This is my gripe with Postgres as well. Every time I see comments extolling the greatness of Postgres, I can't help but think "ah, that's a user, not a system administrator" and I think that's a completely fair judgement. Postgres is pretty great if you don't have to take care of it.

I manage Postgresql and the thing I really love about it is that there's not much no manage. It just works. Even setting up streaming replication is really easy.

christophilus · 2 months ago

I’ve been tempted by MariaDB for this reason. I’d love to hear from anyone who has run both.

IMO Maria has fallen behind MySQL. I wouldn't chose it for anything my income depends on.

(I do use Maria at home for legacy reasons, and have used MySQL and Pg professionally for years.)

Patroni, Zolando operator on k8s

groundzeros2015 · 2 months ago

Because that’s an expensive and complex boondoggle almost no business needs.

RDS provides some HA. HAProxy or PGBouncer can help when self hosting.

notaseojh · 2 months ago

it's easy to through names out like this (pgbackrest is also useful...) but getting them setup properly in a production environment is not at all straightforward, which I think is the point.

isuckatcoding · 2 months ago

Take a look at https://github.com/vitabaks/autobase

In case you want to self host but also have something that takes care of all that extra work for you

runako · 2 months ago

Thank you, this looks awesome.

I wonder how well this plays with other self hosted open source PaaS, is it just a Docker container we can run I assume?

yakkomajuri · 2 months ago

Just skimmed the readme. What's the connection pooling situation here? Or is it out of scope?

yoan9224 · 2 months ago

I've been self-hosting Postgres for production apps for about 6 years now. The "3 AM database emergency" fear is vastly overblown in my experience.

In reality, most database issues are slow queries or connection pool exhaustion - things that happen during business hours when you're actively developing. The actual database process itself just runs. I've had more AWS outages wake me up than Postgres crashes.

The cost savings are real, but the bigger win for me is having complete visibility. When something does go wrong, I can SSH in and see exactly what's happening. With RDS you're often stuck waiting for support while your users are affected.

That said, you do need solid backups and monitoring from day one. pgBackRest and pgBouncer are your friends.