eirikbakke · 3 months ago
The PostgreSQL data directory format is not very stable or portable. You can't just ZIP it up and move it to a different machine, unless the new machine has the same architecture and "sufficiently" similar PostgreSQL binaries.

In theory the data directory works with any PostgreSQL binaries from the same major version of PostgreSQL, but I have seen cases where this fails e.g. because the binaries were from the same major version but compiled with different build options.

OutOfHere · 3 months ago
I would never ever zip up a PostgreSQL data directory and expect it to restore elsewhere. I would use a proper export. If the export is too slow, it could help to use streaming replication to write intermediate files which can be moved to a backup location.
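The "proper export" route can be sketched like this with pg_dump/pg_restore (database and user names are placeholders; a reachable server is assumed, so treat this as a sketch rather than a runnable script):

```shell
# Hedged sketch: logical export/import instead of copying the data directory.
# "mydb" and "appuser" are placeholder names.
pg_dump -U appuser -Fc mydb > mydb.dump   # custom format, compressed by default

# ...move mydb.dump to the new machine, then:
createdb -U appuser mydb
pg_restore -U appuser -d mydb mydb.dump
```

The custom (`-Fc`) format is version- and architecture-independent, which is exactly what the raw data directory is not.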

Even with SQLite, for a simple file copy to work reliably, one has to set these three:

  "_pragma=synchronous(FULL)",
  "_pragma=checkpoint_fullfsync(ON)",
  "_pragma=fullfsync(ON)",

markusw · 3 months ago
Agreed about the Postgres export!

For sqlite, I would recommend "sqlite3 app.db '.backup backup.db'" though, if that's an option. Guaranteed consistent!
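The `.backup` route can be demonstrated end to end like this (throwaway file names; `.backup` uses SQLite's online backup API, so it is consistent even while another process has the database open):

```shell
# Create a throwaway database to back up.
sqlite3 app.db "CREATE TABLE notes(id INTEGER PRIMARY KEY, body TEXT);"
sqlite3 app.db "INSERT INTO notes(body) VALUES ('hello');"

# Take a consistent copy via the online backup API.
sqlite3 app.db ".backup backup.db"
# (VACUUM INTO 'backup.db' is a newer alternative that also compacts.)

# Sanity-check the copy.
sqlite3 backup.db "PRAGMA integrity_check;"   # prints "ok"
sqlite3 backup.db "SELECT body FROM notes;"   # prints "hello"
```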

tracker1 · 3 months ago
Yeah, I was going to mention, just upgrading between PG versions can be a bit of a pain. Dump/restore really seems like a less-than-stellar option if you have a LOT of data. I mean, you can stream through gzip/bzip2 to save space, but still.
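For large databases, the directory format with parallel workers takes some of the sting out of dump/restore; a sketch using standard pg_dump/pg_restore flags ("mydb" is a placeholder, and a live server is assumed):

```shell
# Parallel dump to a directory, compressed, using 8 worker jobs:
pg_dump -Fd -j 8 -f mydb.dir mydb

# Parallel restore on the target (requires directory or custom format):
pg_restore -j 8 -d mydb mydb.dir

# Or, as mentioned, stream a plain dump through gzip:
pg_dump mydb | gzip > mydb.sql.gz
```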

I often wish that Firebird had a license that people found friendlier to use as it always felt like a perfect technical option for many use cases from embedded to a standalone server. PG has clearly eclipsed it at this point though.

markusw · 3 months ago
On the other hand, and especially if migrating from SQLite, there's typically not _that_ much data. Even hundreds of GBs would probably be okay? Drives are so fast these days. Would be interesting to benchmark.
markusw · 3 months ago
Yeah, that really is a great thing about SQLite.

I wonder whether packaging everything in Docker (including a specific Postgres container identified by hash or whatever) and deploying on the same architecture would solve this?
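Pinning the image by digest would at least freeze the exact Postgres build across deployments; a docker-compose sketch (the digest below is a placeholder you would replace with the real digest of the image you tested):

```yaml
# docker-compose.yaml fragment: pin Postgres to an exact image digest.
# The sha256 value is a placeholder, not a real digest.
services:
  db:
    image: postgres@sha256:<digest-of-the-exact-build-you-tested>
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
```

This removes the "sufficiently similar binaries" variable, though the same-architecture requirement for the data directory still applies.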

edoceo · 3 months ago
I work on a "complex" web-app stack that we want to be self-hostable.

Our docker-compose.yaml has Postgres, Redis, OPA, Traefik and then our four services. Works a treat.

One thing we haven't solved for is how to have the PG upgrade work when we update major in the image. Nice that 16 should work for a long while.
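One version-agnostic route for that major upgrade is dump, swap image, restore; a hedged sketch (the service name "db", the superuser, and the volume layout are assumptions about the compose file):

```shell
# Hypothetical major-version upgrade for a docker-compose Postgres.
# 1. Dump everything from the old container:
docker compose exec db pg_dumpall -U postgres > dump.sql
docker compose down

# 2. Bump the image tag (e.g. postgres:16 -> postgres:17) and point the
#    service at a fresh data volume, then bring it back up:
docker compose up -d db

# 3. Restore into the new server:
docker compose exec -T db psql -U postgres -f - < dump.sql
```

pg_upgrade is faster for big clusters but needs both versions' binaries in one image, which is awkward with stock containers.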

oulipo2 · 3 months ago
So the alternative would be to just pg_dump / pg_restore the content? Is it an issue?
eirikbakke · 3 months ago
That is the correct way to do it for PostgreSQL.

It does rule out some common SQLite use cases, such as using the database for app persistence. The app would never be able to upgrade the major version of PostgreSQL, unless it bundled PostgreSQL binaries for every major version ever used.

chucky_z · 3 months ago
If it's more than even like 10GB this is going to take _a while_.

I love the pg_* commands but they ain't exactly the speediest things around.

amtamt · 3 months ago
Logical replication would be a good option, with temporarily extra hardware added.
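A sketch of that logical-replication move (host names, the "mydb" database, and auth details are placeholders; the source needs `wal_level=logical`, and this assumes live servers on both ends):

```shell
# On the source: publish all tables.
psql -h old-host mydb -c "CREATE PUBLICATION move_pub FOR ALL TABLES;"

# On the target: copy the schema first, then subscribe.
pg_dump -h old-host --schema-only mydb | psql -h new-host mydb
psql -h new-host mydb -c "CREATE SUBSCRIPTION move_sub
  CONNECTION 'host=old-host dbname=mydb' PUBLICATION move_pub;"

# Initial table sync runs, then changes stream continuously;
# cut over once the subscription has caught up.
```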
freedomben · 3 months ago
I've been hit by this too, so definitely a risk. I've also had permissions on the files get broken which is a bitch to debug
mutagen · 3 months ago
Some apps do, the most used I know of is Blackmagic's Davinci Resolve, the video editor with a relatively full featured free edition available. I think this has more to do with its roots being in a high end networked environment but still, the local desktop version installs Postgres.
markusw · 3 months ago
Oh, interesting! But that's more of a desktop application now, right? I was thinking of web servers when writing the article, but I can see how that's not totally clear. :-)
emschwartz · 3 months ago
I think this is a neat direction to explore. I've wondered in the past whether you could use https://pglite.dev/ as if it were SQLite.
tensor · 3 months ago
Someone is working on libpglite, based around pglite. I think this is what will provide an actual SQLite alternative:

https://github.com/electric-sql/pglite-bindings

It would still be missing the standardized disk format aspect of sqlite, but otherwise will be neat.

thruflo · 3 months ago
This is very much the point of https://pglite.dev

It's an embeddable Postgres you can run in process as a local client DB, just like SQLite but it's actually Postgres.

emschwartz · 3 months ago
For sure. I’m curious if anyone is using it in production as an alternative to SQLite and, if so, what the performance and experience are like.
OutOfHere · 3 months ago
I wish this were available for Go lang, perhaps via Wazero.
xarope · 3 months ago
You got my hopes up, but it's WASM for now, not something I could add into a golang [1] or python app and have running like sqlite. OK, still hoping...!

[1]https://github.com/electric-sql/pglite/issues/89

markusw · 3 months ago
Interesting! I'm going to look into that. :-) Thank you for sharing.
bob1029 · 3 months ago
I think this is a great idea for testing. MSSQL has LocalDB which is used a lot throughout the .NET ecosystem:

https://learn.microsoft.com/en-us/sql/database-engine/config...

For heavy-duty production use (i.e., pushing the actual HW limits), I would feel more comfortable with the SQLite vertical. Unix sockets are fast, but you are still going across process boundaries. All of SQLite can be run on the same thread as the rest of the application. This can dramatically reduce consumption of memory bandwidth, etc.

bluGill · 3 months ago
Memory bandwidth I don't worry about much - most of the time you should set up a small database with just enough data for that test, which hopefully is fast. However, sockets and processes are a worry, since they add places where things can go wrong unrelated to your test, and then you have flaky tests nobody trusts.
HackerThemAll · 3 months ago
> most of the time you should set up a small database with just enough data for that test, which hopefully is fast.

Yeah, this way you're not going to notice performance issues with your data model or application. Test databases should be of sizes comparable to what's expected in production, specifically after several years in production.

markusw · 3 months ago
Note: The last paragraph was about in production, not tests. :-)
markusw · 3 months ago
I am so tempted to benchmark some more… :D Would be great to get some numbers on this, super interesting.
srameshc · 3 months ago
When I want to treat my Postgres like SQLite, I use DuckDB via its PostgreSQL extension https://duckdb.org/docs/stable/core_extensions/postgres.html and this is just one of many extensions; there are more, and I love DuckDB for that
markusw · 3 months ago
While I really like DuckDB, I wouldn't use it for OLTP workloads.

I'm curious, when do you want to treat your Postgres like SQLite? :-) That's basically the opposite of what I was thinking of in the article.

srameshc · 3 months ago
One of our web apps needs a small subset of data from Postgres, and we use DuckDB WASM to keep it close to the front end. Yes, I agree with not using it for OLTP workloads. But instead of installing Postgres like the author mentions in the post, this is a light replacement that works perfectly if you need a Postgres-like DB in your app. Just have to run:

  INSTALL postgres;
  LOAD postgres;
  ATTACH 'dbname=mydb user=dbname host=25.25.25.25 password=aaaa' AS db (TYPE postgres, READ_ONLY);

You can then CREATE TABLE my_duck_table_stream AS SELECT * FROM db.table_stream;

It's just fun

oulipo2 · 3 months ago
In what kind of scenarios are you using DuckDB from Postgres? And does it improve over, say, having a ClickHouse instance nearby?
marcobambini · 3 months ago
Instead of sqlite-vec you can take a look at the new sqlite-vector extension: https://substack.com/home/post/p-172465902
markusw · 3 months ago
Uuuuh, that looks super interesting! I hadn't heard of that. Thank you for sharing!
theamk · 3 months ago
Note that the project is source-available and not compatible with existing open source licenses.
kissgyorgy · 3 months ago
The huge advantage of SQLite is not that it's on the same machine, but that it's in-process, which makes deployment and everything else just simpler.
markusw · 3 months ago
Yeah, totally agreed. An embedded Postgres would be sweet (see pglite elsewhere here in the comments, looks interesting).
kketch · 3 months ago
I've been using Postgres as a local database for one of my personal projects, a GUI app, and to run Python tests that depend on it without having to install it in my environment.

I created a Python package that downloads an embedded Postgres binary, sets up the database, and gives you a database URL: https://github.com/kketch/tinypg

It downloads pg binaries from this project: https://github.com/zonkyio/embedded-postgres-binaries. There are other similar projects listed on that page that provide this for Java, Go, Rust and Node

kketch · 3 months ago
When not using Python, I've been using this script to create ephemeral Postgres databases for tests, but also a persistent one in my dev containers: https://github.com/eradman/ephemeralpg

I've wrapped it with yet another shell script to make it usable just like this:

`export DB_URL=$(./pgtest.sh)`

This version (pgtest.sh), just creates a disposable ephemeral postgres: https://gist.github.com/kketch/d4e19a7fb6ebc1cfc265af44c1b41...

This version (pgdev.sh), starts a postgres instance (if not already running) and persists the DB. Also supports seeding it with a SQL script (dev_seed.sql): https://gist.github.com/kketch/88bb5b766994e247a9f2c37f13306...