Interesting, but unless you understand every piece of software running and how they interact and are deployed, running this sounds like a devops and security nightmare.
That’s my thing. I closed the docs as soon as I opened them because I couldn’t imagine convincing anyone this was a good idea well enough to convince myself it was one, if that makes any sense.
You need a lot of clout to get something like this out. Kubernetes is an example of something in this category. That baby needed to come out of Google to be both fully formed and in the "didn't get fired for choosing" bucket :-).
Speaking of which, why isn't this project delivered as a Helm chart?
However, when it comes to getting it up and running, the process is pretty straightforward, and the outcome is leagues ahead of running the raw kernel yourself.
For the extension part, maybe. We only use a very limited set of extensions in production: PostGIS, TimescaleDB, and pgvector will suffice for most cases, plus pg_repack, pg_cron, and wal2json for maintenance.
The core/RDS part (HA/PITR/IaC/Monitor) is quite robust: it has served our 25,000-core deployment for 3+ years and survived dozens of hardware failures.
It runs on or depends on so many disparate software packages, each with its own versioning strategy, security model, etc.
Usually when you install a package in a Linux distro, either the distro or the package's maintainer is responsible for ensuring it keeps working (with varying levels of effort depending on distro, support strategy, etc.).
With these sorts of amalgamation packages that becomes weird. Nobody is responsible, and it's built from many parts whose new versions were never tested in that configuration.
You need to either understand what you run or understand who you delegated that understanding to.
This is true for individual packages and even more for amalgamations like this.
In most of these amalgamations you delegated it to nobody and you don't realize until it breaks and you cannot fix it.
This looks very good, but the marketing copy (the README is written as such) is horrendous.
This HN thread is already full of people saying "this looks brittle", "maintenance nightmare", &c. because the README spends so long trying to convince you that this product is a huge, multifaceted kitchen sink. People want simplicity. No one is looking for a plethora of things they need to run, so why would you ever try to sell anything as that?
Most of the "features" they enumerate are either:
- optional pg extensions that are available in normal pg by default
- orchestration software that one might use to deploy HA pg clusters in a distributed / k8s type env anyway (i.e. it's not extra, it's just your underlying infra templates - e.g. Terraform, Patroni, &c.)
The only thing I really see there that is "not-really-needed-kitchen-sink-extras" is the observability stack (grafana, loki, &c.)
RDS is a managed service. This is a code repo. It is useless until you deploy it at which point... it's no longer free. Then you have to manage it yourself and deal with the overhead yourself.
This is like saying I can fire my gardener because you have a free gardener alternative, and then handing me a pair of scissors. I now have all the problems that I paid the gardener to solve.
Likewise, we pay AWS to manage all the headaches that deploying this would introduce. And trust me, if you are running databases at scale in production, then RDS feels very affordable considering the problems it solves.
The scale of operation is the crux here. For a modest number of cores, RDS is a congenial choice. However, when you're running PostgreSQL on hundreds or, as in our scenario, tens of thousands of cores, clinging to RDS is sheer lunacy.
I was the designated gardener to tackle this: architecting and managing a PostgreSQL deployment with 25K cores and 3M TPS. We've been shelling out a cool $1M annually, covering the whole thing: hardware, software, and DBAs. Meanwhile, RDS would cost an astronomical ten times our current expenditure, yet it comes with a lesser degree of availability and starkly crippled observability, among other things.
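For anyone who wants the per-core numbers, here is a rough back-of-the-envelope sketch using only the figures quoted above (25K cores, $1M a year all-in, and the claimed 10x RDS multiplier). The constant and function names are made up for illustration, and the 10x factor is the commenter's estimate, not an AWS price quote.

```python
# Back-of-the-envelope cost per core-year, using only figures quoted in the thread.
# Assumption: the 10x RDS multiplier is the commenter's own estimate.
CORES = 25_000
SELF_HOSTED_ANNUAL_USD = 1_000_000
RDS_MULTIPLIER = 10


def cost_per_core(annual_usd: float, cores: int) -> float:
    """Return the annual cost in USD attributed to a single core."""
    return annual_usd / cores


self_hosted = cost_per_core(SELF_HOSTED_ANNUAL_USD, CORES)
rds_claimed = cost_per_core(SELF_HOSTED_ANNUAL_USD * RDS_MULTIPLIER, CORES)

print(f"self-hosted:   ${self_hosted:.0f} per core-year")
print(f"RDS (claimed): ${rds_claimed:.0f} per core-year")
```

The self-hosted figure bundles hardware, software, and DBA salaries, so it is not a like-for-like comparison with a list price.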
This is cool and I applaud the effort but basically no one should use this.
Most folks should use RDS etc. (Yes, it's expensive and has limitations.)
If you need or want to self-host, you need to understand every moving part and how they fit together. The effort and knowledge required to assemble your own setup are essential, and outsourcing them to a magic bundle would be a mistake.
Yeah, the original title was "Show HN: OSS PostgreSQL RDS with Supabase,PostgresML,Vector,HA,PITR,Monitor,&100+ Extensions" ... now auto-renamed without the "RDS" part...
https://doc.pigsty.cc/#/PGSQL-ARCH?id=component-overview
https://doc.pigsty.cc/#/PGSQL-ADMIN
https://doc.pigsty.cc/#/SECURITY
https://doc.pigsty.cc/#/ARCH