Because the actual process of upgrading Postgres is terrible.
I say this as someone who absolutely loves using it, but upgrading Postgres takes significant downtime and is error-prone; you're often better off just dumping all the databases and reimporting them into a new data directory. (That's a good idea in general, since a dump-and-restore also vacuums and compacts indexes, folding a few maintenance routines into your upgrade path.)
It requires having the previous version of Postgres installed, which can clash with a number of distro policies (not to mention Docker, the most popular way to deploy the software that typically relies on Postgres), and unlike most software with that problem, Postgres is software you want to be managed by your distro.
Therefore, most people only upgrade by necessity - when their distro forces the upgrade or the version they're using reaches EOL.
MS SQL Server does better on both of these fronts. When running MS SQL Server in a container, an upgrade is just restarting the database with the new engine version: the DBMS sees the old files, upgrades them in place, and starts up seamlessly, 100% of the time.
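That container-based flow looks roughly like this; a sketch only, where the container name, volume name, and image tags are illustrative, not prescriptive:

```shell
# Hypothetical in-place upgrade of a containerized SQL Server.
# The data files live on a named volume that outlives the container.
docker stop mssql && docker rm mssql

# Start the same data volume under a newer engine version;
# SQL Server upgrades the on-disk files automatically at startup.
docker run -d --name mssql \
  -e ACCEPT_EULA=Y -e MSSQL_SA_PASSWORD='Example!Passw0rd' \
  -v mssql-data:/var/opt/mssql \
  -p 1433:1433 \
  mcr.microsoft.com/mssql/server:2022-latest
```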
The MS SQL Server protocol, TDS, supports native named parameters and executes the T-SQL procedural language inline with SQL. TDS also supports inline query cancellation because the protocol is framed, and both T-SQL and TDS support returning multiple result sets from a single query.
But yes, whenever I see PG docs saying, major versions require an export and re-load, I shudder.
As someone who has a decade of professional postgres experience (and built my company on it), and 6 years of SqlServer experience, I couldn't agree more with your comment. I really, really like the TDS wire protocol and what it enables.
I also really miss their query optimizer. It was just so damn smart.
This was solved by logical replication a few versions ago.
1. Leave your old database running
2. Start your new database with new version
3. Run logical replication and let it catch up
4. Decide you want to switch over.
5. Using a script: Stop your apps writing to the database (or switch to read only mode), let logical replication catch up one final time, stop logical replication, run a procedure to fix your sequences, and switch your apps over to the new database.
You are now done. Sure, it requires a one-time investment to work out the exact steps needed for your situation, but it's easy to do with minimal downtime.
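Sketched as shell around psql, the steps above might look like this. Ports, names, and the sequence-fixing step are placeholders for your own setup, not a drop-in script:

```shell
# The new cluster (port 5433) needs the database and schema first.
createdb -p 5433 mydb
pg_dump -p 5432 --schema-only mydb | psql -p 5433 -d mydb

# Steps 1-3: publish on the old side, subscribe on the new, let it catch up.
psql -p 5432 -d mydb -c "CREATE PUBLICATION upgrade_pub FOR ALL TABLES;"
psql -p 5433 -d mydb -c "CREATE SUBSCRIPTION upgrade_sub
  CONNECTION 'host=localhost port=5432 dbname=mydb'
  PUBLICATION upgrade_pub;"

# Step 5: after stopping writes and letting it catch up one final time:
psql -p 5433 -d mydb -c "ALTER SUBSCRIPTION upgrade_sub DISABLE;"
# Sequences are not replicated; advance them past the old cluster's
# values (e.g. with a loop over pg_sequences) before switching apps over.
```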
Of course if you don't care about downtime just shut down your database server, make a backup(!), and then do pg_upgrade.
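For the downtime-tolerant path, a minimal pg_upgrade run looks something like this. Paths and version numbers are illustrative (Debian-style layout assumed); check them against your own install:

```shell
# Both clusters must be stopped; back up the old data directory first!
systemctl stop postgresql
cp -a /var/lib/postgresql/16/main /backup/pg16-main-$(date +%F)

# Initialize the new cluster, then let pg_upgrade migrate in place.
/usr/lib/postgresql/17/bin/initdb -D /var/lib/postgresql/17/main
/usr/lib/postgresql/17/bin/pg_upgrade \
  -b /usr/lib/postgresql/16/bin -B /usr/lib/postgresql/17/bin \
  -d /var/lib/postgresql/16/main -D /var/lib/postgresql/17/main \
  --link   # hard-links files instead of copying: fast, but no going back
```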
> Of course if you don't care about downtime just shut down your database server, make a backup(!), and then do pg_upgrade.
I think this is the right answer for 99% of companies. Yes, everyone wants zero downtime, but I see incredibly few cases where it is actually worth the cost and effort.
This has not been solved by logical replication. There are features in PostgreSQL that logical replication doesn't support. One of the issues I'm dealing with: our product switched from Oracle to PostgreSQL, the ORM layer loves LOBs, and LOBs can't be logically replicated.
Financial incentives, likely, because PG is developed by companies that make money by providing support. It doesn't have to be an insidious plot, just: why fix something you know well and that makes you money? MSSQL wants people to pay to upgrade, so it behooves them to make it seamless.
I have always compiled from source so that I can have more than one version in place. The installation takes less than 1GB and is quite easy and fast to compile and I've never had any issues with upgrades. If you forget to compile an extension, you can do it later, copy the .so to the right folder, and you don't even have to restart the cluster.
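A sketch of the side-by-side source install this describes, with illustrative prefixes and version numbers:

```shell
# Build each major version into its own prefix so they can coexist.
cd postgresql-17.2
./configure --prefix=/opt/pgsql/17
make -j"$(nproc)" && make install

# An extension forgotten at build time can be added later:
cd contrib/pg_stat_statements
make && make install   # drops the .so into /opt/pgsql/17/lib
# CREATE EXTENSION then works without restarting the cluster.
```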
If you don't want to use pg_upgrade, you can dump from one cluster and pipe directly into another, with no need for a temporary file. It couldn't be easier.
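The dump-and-pipe upgrade really is one line, assuming the new cluster is already initialized and listening on another port:

```shell
# Stream everything (globals + all databases) from the old cluster (5432)
# straight into the new one (5433); no intermediate dump file needed.
pg_dumpall -p 5432 | psql -p 5433 -d postgres
```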
Can't say the same for Oracle. Even applying a patch can ruin several weekends.
I worked at a place that was running an 8-year-old install of Oracle 8i because upgrading was just too painful. Oracle 10g was out by that point. It's been over 15 years since... I wonder if they ever upgraded?
This!
I mentioned this a while back and was basically berated (by someone who seemed like a dev, or close to one) with the claim that the current setup is just fine and that, because of Postgres's complexity (extensions), it has to be done that way. And while I like Postgres a lot, it's quite annoying that the upgrade is such a terrible experience... :|
I'll confess - I have a project that uses Heroku's managed Postgres and my preferred upgrade method is to set the maintenance window to the middle of the night, create a backup, and be awake at 1am to make sure that nothing is broken after they force the upgrade. Their auto-upgrade process hasn't failed me so far, but there's no way to manually trigger it.
> Postgres is software you want to be managed by your distro.
Why?
For production systems I generally try to avoid using anything not built internally. When it comes to PostgreSQL, for example, if you want to submit benchmarks or report bugs, an important (sometimes mandatory) component is providing the compilation flags. Also, since every kind of storage can behave differently, tuning the higher-level storage software seems inevitable. And even though PostgreSQL exposes a lot of runtime configuration, some settings (e.g. block size) are compile-time options that can only be changed by recompiling.
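Block size is a good example: it is fixed at build time and can only be inspected, not changed, at runtime. The values shown are just the defaults:

```shell
# Settable at build time only (size in kB; 8 is the default):
./configure --with-blocksize=8

# At runtime you can merely read it back:
psql -c "SHOW block_size;"   # reports 8192 on a default build
```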
Depends on your scale. If you're a startup or even just a small side service, performance isn't going to be a bottleneck and you/the org wants the thing to be fire and forget, including bug fixes and especially security patches. A distro takes care of all of those generally and makes sure the dependencies have the same care taken.
And considering how many people use some sort of abstraction layer like an ORM, have databases installed as part of some deployed application, or just do basic CRUD stuff, it probably would have worked on a postgres install from the 90s.
I keep an eye out for vulnerabilities, and that's about it.
My opinion is that Postgres was designed by software developers, for software developers. The split on “which relational database to use” in my career has almost always been the same: the SWEs vehemently demanding pgsql for the feature set, and the sysadmins who have to support maintenance and production availability preferring MySQL.
One of the few things I’ve enjoyed with the move into devops and companies forcing previously “pure” developers into operational roles was their discovery that Postgres was utterly horrible to administer at a systems level. Apparently us lowly sysadmins may have had a point after all.
This is a bit tongue in cheek but really not far from my lived reality. When the focus is on features and “correctness” at the near total expense of sane systems tooling folks can develop some myopia on the subject. So many arguments with devs on my teams over this subject that were utterly horrified to find we were running MySQL for a given service.
Open source projects tend to fix the pain points their contributors experience, and I assume there were not many contributors wanting to deal with the boring work of making administration an easy task - it’s thankless “sideways” work that won’t result in many accolades or much personal satisfaction for most SWEs.
The end users are almost always developers, most of whose experiences in production entail either the equivalent of a docker container level scale system, or are simply given a connection string and the rest is a black box to them. Under those contexts I’d personally prefer Postgres as well and it wouldn’t even be close. When you get into backups, clustering, upgrades, and high availability under extreme load? IMO the story falls apart real fast.
To avoid slowing down development of new PostgreSQL features. Improving upgrades without either maintaining multiple versions of parts of the code or locking down internal interfaces - which today can change freely every major version, so they can be refactored and improved - is not a trivial task, maybe even an impossible one. Even the existence of pg_upgrade has to some degree limited what can be done to improve PostgreSQL. Obviously pg_upgrade is worth it, but hindering development even further might not be popular.
The PostgreSQL team simply does not have the resources to do this. At least not without significantly slowing down development of everything else which there is no political will for. Maybe someone will come up with a genius idea which solves this but I am doubtful. Usually there is no free lunch.
Maybe some core dev will correct me - I'm quite familiar with the PostgreSQL project, but not one of the core devs. :)
I think one really BIG factor is that built-in logical replication wasn't introduced until PostgreSQL 10 in 2017. Before that you only had physical replication for master-slave setups, and iirc that didn't work between major versions, so a "hot" upgrade was more or less impossible without third-party tools.
So even if it's available these days, the number of people still subjected to upgrades from older versions leaves an impression that it's really bad.
Lots of dogmatism in this discussion, it seems. A couple of things:
1. Most psql deployments are not exposed to the interwebz, they are typically only accessible to the applications that need them by virtue of network setup (firewalls etc). This limits the attack vector to whatever the application does. Good.
2. Distro vendors (RHEL et al) often stick to one major psql release for the lifecycle of the OS version. If the OS lives longer than the psql major version, they take on the responsibility of backporting critical security fixes.
3. While upgrades aren't hard, they're not easy either.
4. Psql is pretty much feature complete for many workloads, and pretty stable in general. For many people, there is little need to chase the latest major version.
> 4. Psql is pretty much feature complete for many workloads, and pretty stable in general. For many people, there is little need to chase the latest major version.
To drive this a little further, "latest and greatest" doesn't always apply. I've chosen software - even databases - for greenfield deployments one or two releases behind for their known characteristics.
Stability doesn't imply perfection, but rather, predictability.
> 3. While upgrades aren't hard, they're not easy either.
And in my experience, apart from ease of use there's also a major trust issue here. If you're upgrading your app server framework/language, it's easy enough to do a rollback. With databases, people are worried that they might not notice errors right away and then you have to merge the data accumulated since the upgrade with the last backup in case of a rollback.
Not saying that this is entirely rational...
Also, new features on the SQL level are hard to sell if all you're doing is lowest common denominator ORM ("New window functions and faster lateral joins? But we're doing all that in our code!").
It requires a lot of work, planned downtime, or some way to smear updates across the estate.
The cost of any failure is very high. The benefit of any major upgrade is also vanishingly small. Unless you need a specific feature, it's just not worth it.
5. If your IT department is spread thin already and that old version is running fine, the incentive to potentially create more work for yourself is not gigantic.
> 3. While upgrades aren't hard, they're not easy either
I guess it depends on scale? I was surprised how easy it was on Ubuntu. There was an automatic migration script, and it worked. Took less than 5 minutes to upgrade.
Sure, there was downtime, but I think most applications out there can live with scheduled downtime of a few minutes.
I think this is a really good take. It all boils down to "if it ain't broke don't fix it."
No matter how easy it is, it takes more effort and thought to do an upgrade than it does to not do an upgrade at all, and for most users the upside is basically invisible if it exists at all.
"What the application does" may not be what you think of, as it is dependent on how secure the application or the layers beneath it are. This is how people get everything pwned step by step. The database server may then reveal credentials to other apps etc.
Sure. Defense in depth is important. But I hope that your application is only able to talk TCP/5432 to psql. No amount of psql upgrading will protect you against SQL injections in shitty application code.
I've used postgresql in most of my previous jobs, but using mariadb in my current one, and I must say it's a joy to administer.
The replication and high availability that mariadb offers is rock solid, and much more advanced than what postgresql has to offer.
It works out of the box, no plugins or third-party shenanigans needed, and there is "one obvious way" to do it. Not a dozen options like in the postgresql ecosystem, each with its own quirks and drawbacks.
Also, upgrades are a dream. I did 4 major long term stable release upgrades so far and everything was fully automatic and smooth sailing all around.
All of that with about 10 seconds downtime only for each upgrade, despite being a chunky 6TB database, thanks to the user friendly replication process.
I respect postgresql a lot, but mariadb allows me to sleep much better at night thanks to replication, smooth upgrades and no VACUUM to worry about.
To be fair to PostgreSQL, it has multiple replication setups because they have different inherent properties. For instance, it comes with built in streaming and logical replication setups. Which should you use? It’s impossible to say without knowing your own exact use case.
By analogy: should a database cluster optimize consistency or availability? Answer: there’s no way its author can guess which is more important to how you want to use it.
MySQL/MariaDB have binary and logical replication too. They also support much more advanced topologies like multi-master out of the box. It's just that what you need for 99% of situations relies on a well trodden path and requires no configuration at all.
I have suspected for a long time that most people who criticize MySQL have never actually worked with it, or have done so a couple of decades ago. It's often the default choice if you don't need some of the PostgreSQL features (like PostGIS) and can work with either one.
> The replication and high availability that mariadb offers is rock solid, and much more advanced than what postgresql has to offer. It works out of the box, no plugins or third party shenanigans needed, and there is "one obvious way" to do it.
Upgrades are hard. There was no replication in the before times. The original block-level replication didn't work among different major versions. Slony was a painful workaround based on triggers that amplified writes.
Newer PostgreSQL versions are better. Yet still not quite as robust or easy as MySQL.
At a certain scale even MySQL upgrades can be painful. At least when you cannot spare more than a few minutes of downtime.
> At least when you cannot spare more than a few minutes of downtime.
I think it boils down to this. We used to constantly be running the obsolete pg version until it became an emergency mostly because upgrading with the tooling available at the time was very painful. Today however, we stay relatively up to date. Once you figure out the data replication, you can almost do blue green deployments on databases with almost no down time.
"MySQL is a popular relational database. We revisit Kleppmann’s 2014 Hermitage and confirm that MySQL’s Repeatable Read still allows G2-item, G-single, and lost update. Using our transaction consistency checker Elle, we show that MySQL Repeatable Read also violates internal consistency. Furthermore, it violates Monotonic Atomic View: transactions can observe some of another transaction’s effects, then later fail to observe other effects of that same transaction. We demonstrate violations of ANSI SQL’s requirements for Repeatable Read. We believe MySQL Repeatable Read is somewhat stronger than Read Committed. As a lagniappe, we show that AWS RDS MySQL clusters routinely violate Serializability."
"PostgreSQL is a widely-known relational database system. We evaluated PostgreSQL using Jepsen’s new transactional isolation checker Elle, and found that transactions executed with serializable isolation on a single PostgreSQL instance were not, in fact, serializable. Under normal operation, transactions could occasionally exhibit G2-item: an anomaly involving a set of transactions which (roughly speaking) mutually fail to observe each other’s writes. In addition, we found frequent instances of G2-item under PostgreSQL “repeatable read”, which is explicitly proscribed by commonly-cited formalizations of repeatable read. As previously reported by Martin Kleppmann, this is due to the fact that PostgreSQL “repeatable read” is actually snapshot isolation. This behavior is allowable due to long-discussed ambiguities in the ANSI SQL standard, but could be surprising for users familiar with the literature. A patch for the bug we found in serializability is scheduled for the next minor release, on August 13th, and the presence of G2-item under repeatable read could be readily addressed through documentation."
They’re not wrong. If you’ve ever spent meaningful time administering both, you’ll know that Postgres takes far more hands-on work to keep it going.
To be clear, I like both. Postgres has a lot more features, and is far more extensible. But there’s no getting around the fact that its MVCC implementation means that at scale, you have to worry about things that simply do not exist for MySQL: vacuuming, txid wraparound, etc.
Early MySQL versions made egregious design choices like quietly ignoring missing foreign keys and enum typos, truncating long strings, and randomly choosing rows from groups.
I am a database specialist and have worn the DBA hat for many years. I have run MySQL and Postgres in production, both self-hosted and on managed services. Postgres wins on every single dimension that matters, every time. Yes, MySQL is easier to set up for non-experts. That counts for nothing.
If you are sticking up for MySQL in this thread... I just don't even know, man.
I think so. The PostgreSQL logo is an elephant, and the community calls him "Slonik", probably derived from the Russian word for elephant. There is also a node.js driver called "slonik".
I've always wondered why Postgres is so insanely popular. I mean it has some nice things like very powerful support for a very comprehensive subset of SQL functionality, but most apps don't need all that.
It really feels like early 1990s vintage Unix software. It's clunky and arcane and it's hard to feel confident doing anything complex with it.
> I've always wondered why Postgres is so insanely popular.
In no particular order, my preference for postgres is driven by:
* Date / time functions that don't suck
* UTF-8 is really UTF-8
* 99% of a backup can be done live with nothing more than rsyncing the data directory and the WAL files
* Really comprehensive documentation
* LTREE and fuzzy string match extensions
* Familiarity from using it for years
MySQL/Maria I'm sure is fine, but it's one of those things where it's just different enough, and I haven't encountered a compelling use case for changing my preference.
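The live-backup bullet above is essentially the documented base-backup procedure. A hedged sketch: paths are illustrative, the function names are the PostgreSQL 15+ spellings, and pg_basebackup is the safer, more modern equivalent:

```shell
# Tell Postgres a base backup is starting, then copy the data directory.
psql -c "SELECT pg_backup_start('nightly');"
rsync -a /var/lib/postgresql/17/main/ /backup/base/
psql -c "SELECT pg_backup_stop();"

# Together with the archived WAL files, that copy can be restored to a
# consistent point in time. Or skip the ceremony entirely:
pg_basebackup -D /backup/base -X stream
```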
> I've always wondered why Postgres is so insanely popular.
Just another anecdote: MySQL lost data for me (2004). I spent some time evaluating the projects and Postgres’ development process seemed much more mature — methodical, careful, and focused on correctness. Boring, which I loved.
I didn’t need whatever perf advantage MySQL had so I switched to Postgres and never looked back. And then the Oracle drama and Monty’s behavior around it — not saying he was wrong or right, but it was the opposite of boring — just reinforced my decision.
I like to play with new tech in various spots of the stack, but for filesystems and databases I go boring all the way.
For me Postgres is 100% predictable and reliable. It's neither clunky nor arcane in my experience. I don't need to think about it, I just SQL it and that's about it. It quietly works in the background. At some scale there might be some issues, but there is always known path to solve things.
You might as well ask, why does anyone run an older version or anything? The reasons will be largely the same.
Most of the software on my machines are "old" because they are part of a Linux distribution that (aside from security issues) was frozen in time a year or two ago so that it could be tested, released, and maintained. I am quite happy to have a system that I know is not going to break (either itself, or my workflow) when I apply security updates.
People who MUST HAVE the latest version of everything I feel either have some deeper FOMO issues to work out, suffer from boredom, or look at their computers as hobbies themselves rather than tools. (Which is fine, just be honest about what it is.)
That said, much of my career has been spent working at companies who got so busy shipping features that upgrading infrastructure never makes it above the fold. You can tell the managers that working around old software adds costs that scale with the age of the infrastructure, but they don't always listen. I currently work at a company that still has loads of CentOS 7 hosts in production, and only fairly recently began upgrading them to RHEL 8. (Not 9!)
> These are the companies you want to be at IMHO. Provided the compensation is adequate, slow and stable > fast and pivot-y.
Absolutely...not.
Slow does not mean stable. Slow means the floor is rotting out from under you constantly.
Being prudent about when and where to upgrade is a very active, intentional process that the typical company simply doesn't have the stomach or skill for.
Not chasing shiny things is important, but when tech debt builds up this high, life is generally hell in terms of outages, inability to accomplish basic tasks, and dealing with a bunch of people who have NIH syndrome.
Older versions can also mean deprecated packages everyone's too scared to touch, failure to invest in maintenance and tech-debt reduction, or use of old technologies that have stopped receiving security updates.
I don’t necessarily need to be on the latest version, but I prefer to take many small upgrades rather than one big upgrade at least when it comes to databases. Frequent upgrades also forces an organization to get good at upgrading and managing the risk.
That and capitalism doesn’t reward slow and steady, keeping things stable and well oiled. It rewards delivering shoddy features at break neck speed even if they need to be recalled after two weeks. That’s what Wall Street and the shareholders want. Hence why senior management rarely cares about addressing tech debt.
“Show me the incentive, I’ll show you the outcome”
Once your version doesn’t receive security fixes you’re one CERT advisory away from having your whole week pre-empted by an emergency upgrade.
I’ve been there with products that were still internal at the time. I can only imagine how much fun that is with a public product. But then I do have a pretty vivid imagination. We changed to periodic upgrades after that to avoid the obvious problem staring us in the face.
Upgrading when multiple versions behind is significantly more risky than doing it when the update is relatively fresh.
Additionally, actions done frequently are less risky than actions done rarely, since you develop skills in performing that action as an organization - see high deployment frequency as a strategy of managing deployment risk.
This adds up to continuous upgrading being the least risky option in aggregate.
Why? If the implemented featureset meets your needs, and there are no unresolved bugs or security vulnerabilities relevant to your use cases, what further "maintenance" do you need?
In Oracle, ALTER TABLE MOVE in 8i was a godsend, finally enabling a table reorganization without export/import.
My timid management forbade an upgrade from Oracle 7.3.4 until 2013. It was agony to remain on that museum piece for as long as we did.
I am upgrade-minded, but my management is not. I always lose.
I am retiring in two years. I will not miss their problems, not at all.
Edit: Oracle 10g was the last release that (for us) brought must-have features. Sure, upgrading to 19 or 23 would be great, but it doesn't bring anything that I really want.
But obviously this is not suitable for all use cases. I don't know why pg doesn't invest in this area.
https://apt.postgresql.org/
https://yum.postgresql.org/
> It couldn't be easier.

If you have a trivial database, yes.

If you have a large, complicated database that takes hours/days to restore and has an HA SLA, hell no.
Every bit of the experience before getting it working feels like it was designed by an MBA who owns a Postgres hosting service.
Other than that, it's amazing.
https://devcenter.heroku.com/articles/heroku-postgres-mainte...
On all open-source distros, you can look into your distro's source and check their compile flags.
…or you're on AWS RDS, which will automatically bump your db cluster if it goes EOL and you ignore the notices for more than a year.
5. If your IT department is spread thin already and that old version is running fine, the incentive to potentially create more work for yourself is not gigantic.
And no, being old is not broken.
I guess it depends on scale? I was surprised how easy it was on Ubuntu. There was an automatic migration script, and it worked. Took less than 5 minutes to upgrade.
Sure, there was downtime, but I think most applications out there can live with scheduled downtime of a few minutes.
If you can't have downtime, then nothing is easy.
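For reference, the automatic migration script on Ubuntu/Debian is most likely `pg_upgradecluster` from the postgresql-common package; a typical run (version numbers and cluster name assumed) looks roughly like:

```
# Upgrade the "main" cluster from PostgreSQL 14 to 16 in place.
# The postgresql-16 packages must already be installed; the old
# cluster is stopped and left in place so it can be dropped later.
sudo pg_upgradecluster 14 main

# After verifying the new cluster works:
sudo pg_dropcluster 14 main
```

This wraps pg_upgrade (or dump/restore) plus the port and config shuffling, which is why it feels so painless at small scale.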
No matter how easy it is, it takes more effort and thought to do an upgrade than it does to not do an upgrade at all, and for most users the upside is basically invisible if it exists at all.
The replication and high availability that mariadb offers is rock solid, and much more advanced than what postgresql has to offer. It works out of the box, no plugins or third party shenanigans needed, and there is "one obvious way" to do it. Not a dozen options like in the postgresql ecosystem, each with its own quirks and drawbacks.
Also, upgrades are a dream. I did 4 major long term stable release upgrades so far and everything was fully automatic and smooth sailing all around.
All of that with about 10 seconds downtime only for each upgrade, despite being a chunky 6TB database, thanks to the user friendly replication process.
I respect postgresql a lot, but mariadb allows me to sleep much better at night thanks to replication, smooth upgrades and no VACUUM to worry about.
By analogy: should a database cluster optimize consistency or availability? Answer: there’s no way its author can guess which is more important to how you want to use it.
I have suspected for a long time that most people who criticize MySQL have never actually worked with it, or have done so a couple of decades ago. It's often the default choice if you don't need some of the PostgreSQL features (like PostGIS) and can work with either one.
Looking at https://mariadb.com/kb/en/standard-replication/, is the one obvious way binlog replication or Galera?
Newer PostgreSQL versions are better. Yet still not quite as robust or easy as MySQL.
At a certain scale even MySQL upgrades can be painful. At least when you cannot spare more than a few minutes of downtime.
I think it boils down to this. We used to constantly be running an obsolete pg version until it became an emergency, mostly because upgrading with the tooling available at the time was very painful. Today, however, we stay relatively up to date. Once you figure out data replication, you can do near-blue-green deployments on databases with almost no downtime.
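The replication-based upgrade path being described can be sketched with Postgres's built-in logical replication; hostnames, database, and object names here are illustrative, and the new cluster's schema is assumed to have been restored first (e.g. with `pg_dump --schema-only`):

```sql
-- On the old (publisher) cluster:
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- On the new-version (subscriber) cluster:
CREATE SUBSCRIPTION upgrade_sub
    CONNECTION 'host=old-db dbname=appdb user=repl'
    PUBLICATION upgrade_pub;

-- Once the subscriber has caught up, cut the application over to the
-- new cluster, then tear down the link:
DROP SUBSCRIPTION upgrade_sub;
```

The cutover window shrinks to however long it takes to repoint connections, which is where the "almost no downtime" claim comes from; sequences and DDL changes still need separate handling.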
https://jepsen.io/analyses/mysql-8.0.34 ( MySQL 8.0.34 2023-12-19 )
HN: https://news.ycombinator.com/item?id=38695750
"MySQL is a popular relational database. We revisit Kleppmann’s 2014 Hermitage and confirm that MySQL’s Repeatable Read still allows G2-item, G-single, and lost update. Using our transaction consistency checker Elle, we show that MySQL Repeatable Read also violates internal consistency. Furthermore, it violates Monotonic Atomic View: transactions can observe some of another transaction’s effects, then later fail to observe other effects of that same transaction. We demonstrate violations of ANSI SQL’s requirements for Repeatable Read. We believe MySQL Repeatable Read is somewhat stronger than Read Committed. As a lagniappe, we show that AWS RDS MySQL clusters routinely violate Serializability."
---------------
https://jepsen.io/analyses/postgresql-12.3 ( PostgreSQL 12.3 2020-06-12 )
HN: https://news.ycombinator.com/item?id=23498781
"PostgreSQL is a widely-known relational database system. We evaluated PostgreSQL using Jepsen’s new transactional isolation checker Elle, and found that transactions executed with serializable isolation on a single PostgreSQL instance were not, in fact, serializable. Under normal operation, transactions could occasionally exhibit G2-item: an anomaly involving a set of transactions which (roughly speaking) mutually fail to observe each other’s writes. In addition, we found frequent instances of G2-item under PostgreSQL “repeatable read”, which is explicitly proscribed by commonly-cited formalizations of repeatable read. As previously reported by Martin Kleppmann, this is due to the fact that PostgreSQL “repeatable read” is actually snapshot isolation. This behavior is allowable due to long-discussed ambiguities in the ANSI SQL standard, but could be surprising for users familiar with the literature. A patch for the bug we found in serializability is scheduled for the next minor release, on August 13th, and the presence of G2-item under repeatable read could be readily addressed through documentation."
To be clear, I like both. Postgres has a lot more features, and is far more extensible. But there’s no getting around the fact that its MVCC implementation means that at scale, you have to worry about things that simply do not exist for MySQL: vacuuming, txid wraparound, etc.
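For example, the wraparound pressure mentioned above is something a Postgres operator has to actively monitor, typically with a query along these lines:

```sql
-- How far each database is from the ~2 billion transaction ID ceiling;
-- autovacuum must freeze old tuples before age() gets anywhere close.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```

If `xid_age` keeps climbing because autovacuum can't keep up, the database eventually refuses writes to protect itself, a failure mode MySQL's InnoDB simply doesn't have.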
https://web.archive.org/web/20230922210124/https://grimoire....
MySQL was immortalized as the database in every LAMP stack. And just like PHP it improved considerably since then.
I am a database specialist and have worn the DBA hat for many years. I have run MySQL and Postgres in production, both self-hosted and using managed services. Postgres wins on every single dimension that matters, every time. Yes MySQL is easier to setup for non-experts. That counts for nothing.
If you are sticking up for MySQL in this thread... I just don't even know, man.
(which may actually be derived from the Turkish aslan, for lion, but somehow the animal got mixed up)
It really feels like early 1990s vintage Unix software. It's clunky and arcane and it's hard to feel confident doing anything complex with it.
In no particular order, my preference for postgres is driven by:
* comprehensive transaction model using different modes
* PostGIS and lots of other great extensions
* supports most of the current SQL standard and is clear on interpretation of edge-cases in the documentation
* support for writing stored procedures in any major programming language
* many useful functions regarding dates, sets, ranges, json, xml, ...
* custom datatypes
* extremely thought-out and standardized approach to development: if a feature is included it generally works well in interaction with everything else
* syntax, semantics and performance are all very predictable
* great documentation
MySQL/Maria I'm sure is fine, but it's one of those things where it's just different enough and I haven't encountered a compelling use case for changing my preference.
Regarding MySQL / MariaDB: MySQL optimized for performance first. Until 2010 the default storage engine, MyISAM, didn't even support transactions.
PostgreSQL always focused on correctness and stability and then made sure everything performed.
How software "feels" is subjective. Can you be more specific?
Just another anecdote: MySQL lost data for me (2004). I spent some time evaluating the projects and Postgres’ development process seemed much more mature — methodical, careful, and focused on correctness. Boring, which I loved.
I didn’t need whatever perf advantage MySQL had so I switched to Postgres and never looked back. And then the Oracle drama and Monty’s behavior around it — not saying he was wrong or right, but it was the opposite of boring — just reinforced my decision.
I like to play with new tech in various spots of the stack, but for filesystems and databases I go boring all the way.
Real answer: no licensing cost
Most of the software on my machines are "old" because they are part of a Linux distribution that (aside from security issues) was frozen in time a year or two ago so that it could be tested, released, and maintained. I am quite happy to have a system that I know is not going to break (either itself, or my workflow) when I apply security updates.
People who MUST HAVE the latest version of everything I feel either have some deeper FOMO issues to work out, suffer from boredom, or look at their computers as hobbies themselves rather than tools. (Which is fine, just be honest about what it is.)
That said, much of my career has been spent working at companies that got so busy shipping features that upgrading infrastructure never makes it above the fold. You can tell the managers that working around old software adds costs that scale with the age of the infrastructure, but they don't always listen. I currently work at a company that still has loads of CentOS 7 hosts in production, and only fairly recently began upgrading them to RHEL 8. (Not 9!)
Absolutely...not.
Slow does not mean stable. Slow means the floor is rotting out from under you constantly.
Being prudent about when and where to upgrade is a very active, intentional process that the typical company simply doesn't have the stomach or skill for.
Not chasing shiny is important, but when tech debt builds up this high, life is generally hell: constant outages, inability to accomplish basic tasks, and dealing with a bunch of people who have NIH syndrome.
So for fresh installs, yes; but for existing ones, not so much.
“Show me the incentive, I’ll show you the outcome”
Significant security vulnerability? Upgrade
Feature you need? Upgrade
All other reasons: Don't upgrade.
Upgrading takes effort and it is risky. The benefits must be worth the risks.
I’ve been there with products that were still internal at the time. I can only imagine how much fun that is with a public product. But then I do have a pretty vivid imagination. We changed to periodic upgrades after that to avoid the obvious problem staring us in the face.
Additionally, actions done frequently are less risky than actions done rarely, since you develop skills in performing that action as an organization - see high deployment frequency as a strategy of managing deployment risk.
This adds up to continuous upgrading being the least risky option in aggregate.
Thank you Amazon!
My timid management forbade an upgrade from Oracle 7.3.4 until 2013. It was agony to remain on that museum piece for as long as we did.
I am upgrade-minded, but my management is not. I always lose.
I am retiring in two years. I will not miss their problems, not at all.
Edit: Oracle 10g was the last release that (for us) brought must-have features. Sure, upgrading to 19 or 23 would be great, but it doesn't bring anything that I really want.