jhas78asd (u/jhas78asd)

jhas78asd commented on Ask HN: Could you share your personal blog here? · Posted by u/revskill

jhas78asd · 2 years ago

PostgreSQL and Ruby on Rails. https://andyatkinson.com/

jhas78asd commented on Ask HN: It's 2023, how do you choose between MySQL and Postgres? · Posted by u/debo_

jitl · 3 years ago

Postgres query planner suddenly deciding to do something silly in the middle of the night has taken Notion down a few times. It's quite annoying, and it's very frustrating to have no recourse.

jhas78asd · 3 years ago

Any posts on this? Are there bulk data loads that make table stats more stale and affect plans? I’m wondering what would suddenly make a plan selection change a lot that might be a contributing factor.

jhas78asd commented on Ask HN: It's 2023, how do you choose between MySQL and Postgres? · Posted by u/debo_

manv1 · 3 years ago

The main reason I prefer mysql over PostgreSQL is that mysql is just more consistent - in its commands, quirks, etc.

Postgres - is it pg, pgsql, psql, postgres, postgresQL? The answer is "yes."

Plus the case behavior for tables and column names drives me crazy. It's like some leftover VMS shit. I mean seriously fix it. Can you or can you not use a capital letter for a table/column name? I can never remember. Or you can, but you have to quote it? Fuck.

Until recently (which to be fair might be 8-10 years ago) postgres' performance monitoring tools sucked compared to mysql. I know at one point in the last 10 years they still used sunos4 as their base configuration because you know, the OS had been EOL for like a decade at that point.

MySQL is fire and forget. psql (or postgres or pg or postgresql?) is not fire and forget. It's twitchy and requires constant vigilance. I don't want a piece of infrastructure that requires constant vigilance.

That's not to say I won't use it. It's geo stuff is really great. It's JSON support is better than MongoDB's, from what I've heard. Row level security is awesome. But are those features good enough to overcome psql's quirks? Sometimes.

jhas78asd · 3 years ago

If you’re talking about the command line client that’s built in, it’s psql. If you can’t remember the command name to launch it or regularly type those other commands when you meant to type psql, you could add aliases to your shell that point to psql. :)

Learning any new CLI client is a bit daunting at first. With repetition and intention, I think the commands become very memorable. Eg “describe table” is “dt”.

jhas78asd commented on Ask HN: It's 2023, how do you choose between MySQL and Postgres? · Posted by u/debo_

craigkerstiens · 3 years ago

Having answered this a ton over the years, don't want to really take shots at MySQL. But Postgres stands in pretty unique ground.

1. It's solid as a really reach data platform (more than just a relational database). It's extension framework is quite unique compared to others. It's JSONB support was the first among other relational databases and is feature rich and performant. Multiple index types. Transactional DDL. The list goes on.

2. No central owner. A lot of open source is source code is open, but it's maintained by a central company.

3. I mentioned extensions, but really that is understated. It can do really advanced geospatial, full text search, time series, the list goes on.

Having explained this a ton of times first 10 years ago - https://www.craigkerstiens.com/2012/04/30/why-postgres/ and then again 5 years later with and updated version, most recently tried to capture more of this in an updated form on the Crunchy Data blog - https://www.crunchydata.com/why-postgres

jhas78asd · 3 years ago

Hi Craig! :) https://twitter.com/andatki

jhas78asd commented on Ask HN: It's 2023, how do you choose between MySQL and Postgres? · Posted by u/debo_

spudlyo · 3 years ago

It's also more opinionated than the MySQL query optimizer, in that you can't give it hints to prevent it from making a horrible mistake.

jhas78asd · 3 years ago

There is a way to provide some type of planner hints https://pghintplan.osdn.jp/pg_hint_plan.html

jhas78asd commented on Ask HN: It's 2023, how do you choose between MySQL and Postgres? · Posted by u/debo_

spacedcowboy · 3 years ago

I don't need it often enough to invest the time. I generally set up a database as backing store to some project, fiddle with it until I'm happy it's working at the scale/performance I want, and then move on to something else.

During those few weeks I'm actively using the database on the project, I can either get frustrated beyond belief with the CLI for Postgres, or just use what's at hand with MySQL. In fact, these days SQLite is getting more of my attention anyway, and I wrote a small CLI for it a decade or so back (before the sqlite3 client gave us most of the below) to provide:

- Timings for the queries (in fact I called it 'tsql')

- Aligned-column displays, with | separators between columns and index-indicators

- Ability to parse some blobs (Plists on the Mac, for example) and display

- Things like "SHOW DATABASES", "SHOW TABLES", "SHOW TABLES LIKE" etc.

Mainly I wrote it to do some benchmarking, but I eventually preferred it over sqlite3 as the CLI.

Note that all this is personal stuff - When I do commercial stuff, the answer is always "what is best understood by the people maintaining it afterwards"...

jhas78asd · 3 years ago

Each of the bullets you listed have very straightforward and memorable meta commands that I use on a regular basis with psql. It may be worth learning them just for when you use Postgres. There is also a built in help. These can also be saved into your dot files so you don’t need to memorize them. Happy to show you if you’re interested!

jhas78asd commented on Ask HN: It's 2023, how do you choose between MySQL and Postgres? · Posted by u/debo_

arp242 · 3 years ago

> I forget if there's an equivalent for the first one

Not really; PostgreSQL doesn't store the original query, so you'll need to re-create it from pg_class, pg_attribute, and all of that (which is really what \d and such in psql do). The easiest way is probably pg_dump, but it's best to just get used to \-commands because it's really just the same thing.

jhas78asd · 3 years ago

Find the SQL from meta commands. Example: https://dba.stackexchange.com/a/131031

jhas78asd commented on Ask HN: It's 2023, how do you choose between MySQL and Postgres? · Posted by u/debo_

herpderperator · 3 years ago

I find PostgreSQL permission management quite convoluted. In MySQL it is simple to query for what grants a user has, but in PostgresSQL you need to write 100 lines of SQL to do the same... and you can't run \du and other commands without psql. Why couldn't they just come up with `SHOW` shortcuts that work in any SQL client?

jhas78asd · 3 years ago

You can likely get the SQL for a meta command, and you could run the SQL from your preferred client if you don’t use psql. Here is one example: https://dba.stackexchange.com/a/131031

I also highly recommend investing in psql skills though if you are a Postgres user.

jhas78asd commented on Ask HN: It's 2023, how do you choose between MySQL and Postgres? · Posted by u/debo_

skunkworker · 3 years ago

Postgres for anything with a single database size < 4TB.

But if you need something that can handle 100TB+, go Vitess(mysql compatible).

jhas78asd · 3 years ago

What’s the reason though for Vitess? Postgres supports tables up to 32TB but hopefully you’re splitting them up using declarative partitioning in one or more ways before that. If you have tables that are smaller than a TB and a large memory DB (>1 TB RAM) Postgres should run ok right? I’d also imagine you’re splitting up your database into multiple databases and multiple instances (the writers) well before that as well right?

jhas78asd commented on Ask HN: It's 2023, how do you choose between MySQL and Postgres? · Posted by u/debo_

jpgvm · 3 years ago

Generally speaking it should always be PostgreSQL.

It's the default choice for a number of reasons but chief among them is just that it's higher quality. That is it's developed to higher standards due to community bar being really high (thanks Tom Lane, et al for your stewardship) and testing and analysis of changes to the database being ingrained into the culture.

By pursuing correctness first before performance for many years PostgreSQL has built a stronger foundation that is now paying dividends in terms of both performance but also ability to ship powerful features quickly. This has generally resulted in the gap between MySQL and PostgreSQL only continuing to widen over the last 10 years.

So when would you consider picking MySQL?

To me that comes down to exactly one set of use-cases and that is workloads that are fundamentally incompatible with VACUUM. The PostgreSQL MVCC system requires that table heap be maintained by the VACUUM process to both ensure safety (txid wraparound) and reclaim/reuse heap storage. This process is very expensive for workloads that do a lot of updates, especially on indexed columns (as indices need VACUUMing also), less of an issue for non-indexed columns if you can use HOT (heap only tuple) updates and tune the target fill ratio of heap pages appropriately.

In most cases it's highly unlikely your business is going to reach the level of write load where these deficiencies in write behaviour actually matter but it is possible. Uber famously migrated from PostgreSQL primarily because their experiences with write amplification and VACUUMing.

If for instance though your data consists of a smaller live hot set and a warm set that is less frequently updated and easily separable by a deterministic factor like time you can very easily use PostgreSQL table partitioning to isolate the two and continue to scale for a very very long time on pure PostgreSQL.

In practice this may be fixed in PostgreSQL one day, there was a project called zheap to implement an UNDO log style storage system for PostgreSQL (which would avoid all the VACUUM maintenance etc) but it stalled out, largely I believe because it wasn't able to provide obvious wins quick enough to stimulate further interest. However OrioleDB has picked up the torch now and does in fact seem to be showing very impressive results.

If such a storage engine is merged in my mind there will no longer any reason to consider MySQL for production workloads.

jhas78asd · 3 years ago

Thanks for calling out table partitioning. Besides implementing it at one level, multiple levels can be used simultaneously (eg list and range). Tables can be grouped and split out to their own database (aka functional sharding/vertical sharding) and again partitioned. This all takes more effort and investment but keeps you on PostgreSQL. As you said fillfactor can be tuned, more HOT updates. Even analyzing whether the Updates could be turned into inserts that are written at a high rate, not incurring bloat, and then fewer updates are made at a rate that does not outrun Vacuum.