Readit News
dkam commented on Litestream VFS   fly.io/blog/litestream-vf... · Posted by u/emschwartz
dkam · 2 months ago
Love the progress being made here. I've been really enjoying learning about another embedded database - DuckDB - the OLAP to SQLite's OLTP.

DuckDB has a lakehouse extension called "DuckLake" which generates "snapshots" for every transaction and lets you "time travel" through your database. It feels analogous to Litestream VFS PITR, and it's fascinating to see the different nomenclature used for similar features. The OLTP world calls it Point In Time Recovery, while the OLAP/data lake world calls it Time Travel and treats it as a first-class feature.

In SQLite Litestream VFS, you use `PRAGMA litestream_time = '5 minutes ago'` (or a timestamp) - and in DuckLake, you use `SELECT * FROM tbl AT (VERSION => 3);` (or a timestamp).

DuckDB (unlike SQLite) doesn't allow other processes to read while one process is writing to the same file - all processes get locked out during writes. DuckLake solves this by using an external catalog database (PostgreSQL, MySQL, or SQLite) to coordinate concurrent access across multiple processes, while storing the actual data as Parquet files. It's a clever architecture for "multiplayer DuckDB" - deliciously dependent on an OLTP database to manage its distributed, multi-user OLAP. Delta Lake, by contrast, manages its metadata with uploaded JSON files, skipping the OLTP database.
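
To make the catalog idea concrete, here is a toy sketch in Python using the standard-library `sqlite3` module. The table and function names (`snapshot`, `data_file`, `commit`, `files_at`) are made up for illustration and are not DuckLake's real schema; the point is only that each transaction becomes a snapshot row, data files are never rewritten, and "time travel" is a query against the catalog.

```python
import sqlite3

# Hypothetical catalog schema for illustration only; not DuckLake's real tables.
def open_catalog():
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE snapshot (id INTEGER PRIMARY KEY AUTOINCREMENT, note TEXT);
        CREATE TABLE data_file (
            path       TEXT NOT NULL,
            added_in   INTEGER NOT NULL,  -- snapshot that added the file
            removed_in INTEGER            -- snapshot that dropped it, if any
        );
    """)
    return con

def commit(con, add=(), remove=(), note=""):
    """Record one transaction as a new snapshot; data files are never modified."""
    with con:  # a single atomic catalog transaction
        snap = con.execute(
            "INSERT INTO snapshot (note) VALUES (?)", (note,)
        ).lastrowid
        con.executemany(
            "INSERT INTO data_file (path, added_in) VALUES (?, ?)",
            [(path, snap) for path in add],
        )
        con.executemany(
            "UPDATE data_file SET removed_in = ? "
            "WHERE path = ? AND removed_in IS NULL",
            [(snap, path) for path in remove],
        )
    return snap

def files_at(con, snap):
    """List the data files that were visible as of the given snapshot."""
    rows = con.execute(
        "SELECT path FROM data_file "
        "WHERE added_in <= ? AND (removed_in IS NULL OR removed_in > ?) "
        "ORDER BY path",
        (snap, snap),
    )
    return [row[0] for row in rows]
```

A "merge" in this toy model is just another commit that adds the merged file and marks the small ones as removed, so older snapshots can still resolve to the original files.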

Another interesting comparison is the Parquet files used in the OLAP world - they’re immutable, column oriented and contain summaries of the content in the footers. LTX files seem analogous - they’re immutable and stored on shared storage (S3), allowing multiple database readers. No doubt they’re row oriented, being from the OLTP world.

Parquet files (in DuckLake) can be "merged" together - with DuckLake tracking this in its PostgreSQL/SQLite catalog - and in SQLite Litestream, the LTX files get "compacted" by the Litestream daemon and read by the Litestream VFS client. They both use range requests on S3 to retrieve the file metadata (headers or footers) so they can efficiently download only the needed pages.
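
A rough Python sketch of the compaction side, under heavy assumptions: `PageFile`, `compact` and `read_page` are invented names, and the real LTX format and Litestream compaction levels are more involved than this. It only shows the general shape: immutable files that each cover a transaction range, a merge that keeps the newest copy of every page, and a point-in-time read that ignores files newer than the requested transaction. Range requests are left out entirely.

```python
from dataclasses import dataclass
from types import MappingProxyType

@dataclass(frozen=True)
class PageFile:
    """Toy stand-in for an immutable page file covering a transaction range."""
    min_txid: int
    max_txid: int
    pages: MappingProxyType  # page number -> page bytes (read-only view)

def make_file(min_txid, max_txid, pages):
    return PageFile(min_txid, max_txid, MappingProxyType(dict(pages)))

def compact(files):
    """Merge files into one new file, keeping only the newest copy of each page."""
    ordered = sorted(files, key=lambda f: f.min_txid)
    merged = {}
    for f in ordered:
        merged.update(f.pages)  # pages from later files overwrite earlier ones
    return make_file(ordered[0].min_txid, ordered[-1].max_txid, merged)

def read_page(files, page_no, as_of_txid):
    """Return the newest version of a page from files committed at or before as_of_txid."""
    best = None
    for f in sorted(files, key=lambda f: f.max_txid):
        if f.max_txid <= as_of_txid and page_no in f.pages:
            best = f.pages[page_no]
    return best
```

One trade-off the sketch makes visible: once files are compacted, a reader can only travel to the end of the compacted range, which is presumably why you would keep the finer-grained files around for as long as you want fine-grained recovery.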

Both worlds are converging on immutable files hosted on shared storage + metadata + compaction for handling versioned data.

I'd love to see more cross-pollination between these projects!

dkam commented on Show HN: I built a site that finds the cheapest place to buy a book   pagesonpages.com/... · Posted by u/shnksi
steven_noble · 3 years ago
I'm a huge fan of Booko, which has been doing exactly this for years
dkam · 3 years ago
Thanks for the shout out. First SVN(!) commit was July 2, 2007, making Booko a 15+ year old Rails app.
dkam commented on Show HN: Boook.link – Share a book with links to all stores   boook.link... · Posted by u/Cenk
ggop · 5 years ago
Australians can use https://booko.com.au and it includes prices as well.
dkam · 5 years ago
I built this to learn Rails in 2007. You can use https://booko.info/ if you want to type 2 fewer characters and aren't in Australia. Also, if you use DuckDuckGo you can type !booko A Series of Unfortunate Events and be taken right to the search results.
dkam commented on Show HN: Find the cheapest Amazon for your books   piranhas.co/... · Posted by u/k33l0r
nubbie · 13 years ago
You might find http://booko.com.au/ useful
dkam · 13 years ago
Hey, that's my site - thanks for mentioning it! There's also a UK, NZ and US site ( http://booko.us/ ) now.

Deleted Comment

u/dkam

Karma: 7 · Cake day: July 28, 2009