Readit News
tomasol commented on 100k TPS over a billion rows: the unreasonable effectiveness of SQLite   andersmurphy.com/2025/12/... · Posted by u/speckx
andersmurphy · 3 months ago
PRAGMA synchronous="normal" is fine if you are in WAL mode. Unlike rollback-journal mode, the database cannot be corrupted by power loss.

> The synchronous=NORMAL setting provides the best balance between performance and safety for most applications running in WAL mode. You lose durability across power loss with synchronous NORMAL in WAL mode, but that is not important for most applications. Transactions are still atomic, consistent, and isolated, which are the most important characteristics in most use cases.

tomasol · 3 months ago
fsync is the most expensive operation in a write path. NORMAL means you accept that the last ~100 ms of committed transactions may be lost after a power loss or OS crash. My suggestion is either to use synchronous="full" in SQLite or to disable `synchronous_commit` in Postgres, to avoid comparing apples to oranges.

Edit: Also, the example indicates financial transactions. Can you explain why you need serializability but not durability?
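A minimal sketch of the two settings being debated, using Python's stdlib `sqlite3` (the file path is illustrative):

```python
import os
import sqlite3
import tempfile

# WAL needs a file-backed database, so use a throwaway temp file.
path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)

conn.execute("PRAGMA journal_mode=WAL")    # safe from corruption on power loss
conn.execute("PRAGMA synchronous=NORMAL")  # fsync only at WAL checkpoints:
                                           # recent commits can be lost on power failure
# For per-commit fsync (durable, comparable to Postgres defaults), use instead:
# conn.execute("PRAGMA synchronous=FULL")

conn.execute("CREATE TABLE t(x)")
conn.execute("INSERT INTO t VALUES (1)")
conn.commit()

# PRAGMA synchronous reports 1 for NORMAL, 2 for FULL.
print(conn.execute("PRAGMA synchronous").fetchone()[0])  # 1
```

With NORMAL, a commit returns once the write reaches the OS, so a process crash alone loses nothing; only power loss or an OS crash can drop the tail of the WAL.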

tomasol commented on 100k TPS over a billion rows: the unreasonable effectiveness of SQLite   andersmurphy.com/2025/12/... · Posted by u/speckx
tomasol · 3 months ago
Author is setting PRAGMA synchronous="normal", meaning fsync is not issued as part of every write transaction, only at WAL checkpoints. To make the comparison fair, it should be set to "full".
tomasol commented on Show HN: Obelisk – a WASM-based deterministic workflow engine   obeli.sk/... · Posted by u/tomasol
AlotOfReading · a year ago
I get that, the nondeterminism would come from the completion order of the join set. If the children sleep appropriately, they'll race to be inserted after completing, and order of the result set will depend on the specifics of the implementation. It's possible this could happen deterministically, but probably not reasonably.
tomasol · a year ago
Sorry for the late reply.

The actual order in which child workflows finish and their results hit the persistence layer is indeed nondeterministic in real-time execution. Trying to force deterministic completion order would likely be complex and defeat the purpose of parallelism, as you noted.

However, this external nondeterminism does not affect the determinism the workflow needs for replay.

When the workflow replays, it doesn't re-run the race. It consumes events from the log. The `-await-next` operation during replay simply reads the next recorded result, based on the fixed order. Since the log provides the same sequence of results every time, the workflow's internal logic proceeds identically, making the same decisions based on that recorded history.

Determinism is maintained within the replay context by reading the persisted, ordered outcomes of those nondeterministic races.
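The replay semantics described above can be sketched in a few lines of Python (names like `live_run`, `replay`, and `await-next` behavior are illustrative, not Obelisk's actual API):

```python
import random

def live_run(log):
    """Children race to completion; each result is appended to the log
    in whatever order it happened to finish."""
    results = ["child-a", "child-b", "child-c"]
    random.shuffle(results)   # the nondeterministic race
    for r in results:
        log.append(r)         # persisted in completion order
    return list(log)

def replay(log):
    """Replay never re-runs the race: each await-next simply reads the
    next recorded event, so every replay sees the identical sequence."""
    return [event for event in log]

log = []
first = live_run(log)
assert replay(log) == first == replay(log)  # deterministic on every replay
```

The race is real, but it happens exactly once; afterwards the log is the single source of truth.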

tomasol commented on Show HN: Obelisk – a WASM-based deterministic workflow engine   obeli.sk/... · Posted by u/tomasol
AlotOfReading · a year ago
WASM isn't quite deterministic. An easy example is NaN propagation, which can be nondeterministic in certain circumstances. Obelisk itself seems to allow nondeterminism via the sleep() function. Just create a race condition among a join set. I imagine that might even get easier once the TODO to implement sleep jitter is completed.

It's certainly close enough that calling it deterministic isn't misleading (though I'd stop short of "true determinism"), but there's still sharp edges here with things like hashmaps (e.g. by recompiling: https://dev.to/gnunicorn/hunting-down-a-non-determinism-bug-...).

tomasol · a year ago
> Just create a race condition among a join set.

All responses and completed delays are stored in a table with an auto-incremented id, so the `-await-next` will always resolve to the same value.

As you mention, putting a persistent sleep and a child execution into the same join set is not yet implemented.
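The ordering point above can be sketched with `sqlite3` (the schema is illustrative, not Obelisk's real tables):

```python
import sqlite3

# Responses land in a table with an auto-incremented id, so consumption
# order is fixed by insertion order, whatever the real-time race produced.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE responses("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, value TEXT)"
)

# Suppose child-b happened to finish before child-a in this run...
for value in ("child-b", "child-a"):
    conn.execute("INSERT INTO responses(value) VALUES (?)", (value,))

# ...then await-next always reads by id, resolving identically on every replay.
rows = [v for (v,) in conn.execute("SELECT value FROM responses ORDER BY id")]
print(rows)  # ['child-b', 'child-a']
```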

u/tomasol

Karma: 83 · Cake day: March 9, 2023