I have seen it used to mean WAL before, so I am taking this with a dose of skepticism.
- The order of log entries does not matter.
- Users of the log are peers. No client / server distinction.
- When appending a log entry, you can send a copy of the append to all your peers.
- You can ask your peers to refresh the latest log entries.
- When creating a new entry, it is a very good idea to include a nonce field. (I use a nano ID for this along with a timestamp; together they are probabilistically unique. See the sketch after this list.)
- If you want to do database-style queries of the data, load all the log entries into an in-memory database and query away (see the sqlite3 sketch below).
- You can append a log entry containing a summary of all log entries you have so far. For example: you’ve been given 10 new customer entries. You can create a log entry of “We have 10 customers as of this date.”
- When creating new entries, prepare the entry or list of entries in memory and let the user edit/revise them as a draft; when the user clicks “Save”, the entries enter the permanent record.
- To fix a mistake in an entry, create a new entry that “negates” it (also sketched below).
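A minimal sketch of what an entry, a negation, and an append might look like, assuming JSON-lines storage on disk; the function names are made up, and `secrets.token_urlsafe` plus a timestamp stands in for a nano ID:

```python
import json
import secrets
import time

def new_entry(payload: dict) -> dict:
    # The nonce (random token + timestamp) makes each entry
    # probabilistically unique, so peers can deduplicate copies
    # that arrive over multiple paths.
    return {
        "id": f"{secrets.token_urlsafe(12)}-{time.time_ns()}",
        "payload": payload,
    }

def negate(entry: dict) -> dict:
    # Never mutate the log: to fix a mistake, append a new entry
    # that points at the one it cancels.
    return new_entry({"negates": entry["id"]})

def append(log_path: str, entry: dict) -> None:
    # Append-only: one JSON object per line, never rewritten.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
```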
A lot of parallelism / concurrency problems just go away with this design.
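To make the “load everything into an in-memory database” step concrete, here is a sketch using Python’s built-in sqlite3. The entry shape and the `type` field are assumptions carried over from the sketch above, and `json_extract` requires a SQLite build with the JSON1 extension (the default in modern builds):

```python
import json
import sqlite3

def query_log(log_path: str) -> int:
    # Load every log entry into an in-memory database, then query away.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE entries (id TEXT PRIMARY KEY, payload TEXT)")
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            e = json.loads(line)
            # INSERT OR IGNORE deduplicates entries received from
            # several peers, courtesy of the nonce in "id".
            db.execute("INSERT OR IGNORE INTO entries VALUES (?, ?)",
                       (e["id"], json.dumps(e["payload"])))
    # Example query: count customer entries, e.g. to feed a summary
    # entry like "We have 10 customers as of this date."
    (count,) = db.execute(
        "SELECT count(*) FROM entries "
        "WHERE json_extract(payload, '$.type') = 'customer'").fetchone()
    return count
```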
I am wondering how portable the Parquet format is, and how interchangeable it is now.
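For what it’s worth, basic interchange already works well across engines. A minimal sketch, with made-up file and column names, writing with Arrow and reading the same file back with DuckDB:

```python
import duckdb
import pyarrow as pa
import pyarrow.parquet as pq

# Write a Parquet file with one engine...
pq.write_table(pa.table({"customer": ["a", "b", "a"],
                         "amount": [10, 20, 30]}), "sales.parquet")

# ...and query the same file with another, no conversion step needed.
print(duckdb.sql(
    "SELECT customer, sum(amount) AS total "
    "FROM 'sales.parquet' GROUP BY customer"))
```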
There are the likes of Comet and Blaze, which replace Spark's execution backend with DataFusion, and then you have single-process alternatives like Sail trying to settle into the "not so big data" category.
I am watching the evolution of projects powered by DataFusion and compatible with Spark with a keen eye. Early days, but quite exciting.
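For context, this is roughly what the single-process experience looks like via the datafusion Python package (file and table names here are made up); the Spark-compatible layers mentioned above build on an engine like this:

```python
from datafusion import SessionContext

# DataFusion runs entirely in-process: no cluster, no JVM.
ctx = SessionContext()
ctx.register_parquet("sales", "sales.parquet")
ctx.sql("SELECT customer, sum(amount) AS total "
        "FROM sales GROUP BY customer").show()
```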
I am quite curious about the plans for a Python dataframe-like API for DuckDB, and for the Python ecosystem in general.
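DuckDB's Python relational API already gives a lazy, dataframe-like feel; a small sketch with made-up names (treat the exact method names as an assumption about current duckdb releases, not a spec):

```python
import duckdb

# Build the query lazily, method by method, like a dataframe API...
rel = duckdb.read_parquet("sales.parquet")
rel = rel.filter("amount > 10").aggregate("sum(amount) AS total", "customer")
rel.show()

# ...or drop to SQL and pull the result out as a pandas DataFrame.
df = duckdb.sql("SELECT count(*) AS n FROM 'sales.parquet'").df()
```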