digikata commented on What Does a Database for SSDs Look Like?   brooker.co.za/blog/2025/1... · Posted by u/charleshn
esperent · 13 hours ago
Don't some SSDs have 512b page size?
digikata · 12 hours ago
I would guess by now none have that internally. As a rule of thumb, every major flash density step (SLC, MLC, TLC, QLC) also tended to double the internal page size, and there were internal transfer performance reasons for larger sizes as well. Low-level 16k-64k flash "pages" are common, sometimes with even larger stripes of pages due to the internal firmware sw/hw design.
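(Illustrative arithmetic only - the 16 KiB page size below is an assumed stand-in, not something a drive generally exposes. The point is that a write smaller than the internal page forces a read-modify-write inside the FTL, which is why the mismatch with 512b logical sectors matters.)

```rust
// Sketch: round an I/O up to an assumed internal flash page size so a
// partial write doesn't force a read-modify-write inside the FTL.
// FLASH_PAGE is hypothetical; real drives don't report this directly.
const FLASH_PAGE: usize = 16 * 1024;

fn padded_len(len: usize) -> usize {
    len.div_ceil(FLASH_PAGE) * FLASH_PAGE
}

fn main() {
    assert_eq!(padded_len(512), FLASH_PAGE); // one 512b sector still burns a full page
    assert_eq!(padded_len(FLASH_PAGE + 1), 2 * FLASH_PAGE);
}
```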


digikata commented on MinIO is now in maintenance-mode   github.com/minio/minio/co... · Posted by u/hajtom
dardeaup · 17 days ago
I've done some preliminary testing with garage and I was pleasantly surprised. It worked as expected and didn't run into any gotchas.
digikata · 17 days ago
Garage is really good for core S3; the only thing I ran into was that it didn't support object tagging. That's maybe a more esoteric corner of the S3 API, but minio does support it. If you're just standing up a test S3 endpoint, though, object tagging is most likely an unneeded feature anyway.

It's a "Misc" endpoint in the Garage docs here: https://garagehq.deuxfleurs.fr/documentation/reference-manua...

digikata commented on MinIO stops distributing free Docker images   github.com/minio/minio/is... · Posted by u/LexSiga
digikata · 2 months ago
Incidentally, there is an open source S3 project in Rust that I have been following. About a year ago, I used Garage images to replace some minio instances in CI pipelines - lighter weight and faster to come up.

https://github.com/deuxfleurs-org/garage

digikata commented on Show HN: FeOx – Fast embedded KV store in Rust   github.com/mehrantsi/FeOx... · Posted by u/mehrant
emschwartz · 4 months ago
Sounds interesting, though that durability tradeoff is not one that I’d think most people/applications want to make. When you save something to the DB, you generally want that to mean it’s been durably stored.

Are there specific applications you’re targeting where latency matters more than durability?

digikata · 4 months ago
This seems to be around the durability that most databases can reach. Aside from more specialized hardware arrangements, a single-computer embedded database always has a window of data loss. The durability expectation is that some in-flight window of data will be lost, but on restart it should recover to a consistent state as of the last settled operation if at all possible.

A related question is whether the code base is mature enough, when configured for higher durability, to work as intended. Even with Rust, there needs to be some hard systems testing, and it's often not just a matter of sprinkling flushes around. Further optimization can try to close the window tighter - maybe with a transaction log - but then you obviously trade some speed for it.
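As a minimal sketch of what the durable end of that tradeoff costs (file path and record format are hypothetical), the core of a transaction-log append is just write-then-fsync:

```rust
use std::fs::OpenOptions;
use std::io::Write;

// An append that doesn't return until the OS has flushed to stable
// storage. The sync_all here is exactly the latency cost that
// relaxed-durability stores skip.
fn append_durable(path: &str, record: &[u8]) -> std::io::Result<()> {
    let mut log = OpenOptions::new().create(true).append(true).open(path)?;
    log.write_all(record)?;
    log.sync_all()?; // fsync; the in-flight loss window closes here
    Ok(())
}
```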

digikata commented on Using Podman, Compose and BuildKit   emersion.fr/blog/2025/usi... · Posted by u/LaSombra
digikata · 4 months ago
On Linux I'm using Colima with docker compose and buildx, and it seems to work ok for my limited cases.

On Mac it works ok too, but there are networking cases that Colima on mac doesn't handle - so OrbStack there.

digikata commented on Who Invented Backpropagation?   people.idsia.ch/~juergen/... · Posted by u/nothrowaways
mystraline · 4 months ago
> BP's modern version (also called the reverse mode of automatic differentiation)

So... Automatic integration?

Proportional, integral, derivative. A PID loop sure sounds like what they're talking about.

digikata · 4 months ago
There are large bodies of work on optimization in state-space control theory that I strongly suspect have a lot of crossover with AI, and that at least share a very similar mathematical structure.

e.g. optimizing state-space control coefficients looks something like training an LLM's weight matrices...
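To make the analogy concrete (notation is mine, just a sketch): tuning a feedback gain by gradient descent on a quadratic cost has the same shape as tuning a weight matrix against a loss.

```latex
% Discrete linear system with state feedback u_k = -K x_k:
%   x_{k+1} = A x_k + B u_k
% Gradient descent on the LQR-style cost J(K) mirrors gradient
% descent on a network loss L(W):
\[
  J(K) = \sum_k \left( x_k^\top Q\, x_k + u_k^\top R\, u_k \right),
  \qquad
  K \leftarrow K - \eta\, \nabla_K J
  \quad\text{vs.}\quad
  W \leftarrow W - \eta\, \nabla_W L .
\]
```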

digikata commented on The future of large files in Git is Git   tylercipriani.com/blog/20... · Posted by u/thcipriani
tempay · 4 months ago
> binary chunking and deduplication

Are there many binaries that people would store in git where this would actually help? I assume most files end up with compression or some other form of randomization between revisions making deduplication futile.

digikata · 4 months ago
I don't know; it's all down to the probabilities in the dataset, which make one optimization strategy better than another. Git annex, iirc, does file-level dedupe. That would take care of most of the problem if you're storing binaries that are compressed or encrypted. Going beyond that is a lot of work, which is probably one reason no one has bothered to do it for git yet. But borg and restic both do chunked dedupe, I think.
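For reference, chunked dedupe in the borg/restic style hinges on content-defined cut points. A toy version (constants are illustrative; the real tools use a proper rolling hash such as buzhash) looks like:

```rust
/// Toy content-defined chunker: emits cut points where a cheap hash of
/// the bytes since the last cut matches a bitmask, so cut positions
/// depend on content rather than offsets and survive insertions.
fn chunk_boundaries(data: &[u8]) -> Vec<usize> {
    const MIN_CHUNK: usize = 2048;
    const MASK: u32 = (1 << 13) - 1; // ~8 KiB average chunks
    let (mut cuts, mut hash, mut len) = (Vec::new(), 0u32, 0usize);
    for (i, &b) in data.iter().enumerate() {
        hash = hash.wrapping_mul(31).wrapping_add(b as u32);
        len += 1;
        if len >= MIN_CHUNK && (hash & MASK) == MASK {
            cuts.push(i + 1);
            hash = 0;
            len = 0;
        }
    }
    cuts
}
```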
digikata commented on The future of large files in Git is Git   tylercipriani.com/blog/20... · Posted by u/thcipriani
ks2048 · 4 months ago
I'm not sure binary diffs are the problem - e.g. for storing images or MP3s, binary diffs are usually worse than nothing.
digikata · 4 months ago
I would think that git would need a parallel storage scheme for binaries. Something that does binary chunking and deduplication between revisions, but keeps the same merkle referencing scheme as everything else.
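A rough sketch of that referencing idea (hypothetical types; git's actual object model differs): store chunks by content hash, so a file revision is just an ordered list of digests and unchanged chunks dedupe for free. Uses the sha2 crate.

```rust
use sha2::{Digest, Sha256};
use std::collections::HashMap;

/// Content-addressed chunk store: identical chunks across revisions
/// hash to the same key and are stored exactly once.
#[derive(Default)]
struct ChunkStore {
    chunks: HashMap<[u8; 32], Vec<u8>>,
}

impl ChunkStore {
    /// Store a chunk and return its digest; a file object would keep
    /// only the ordered list of digests, merkle-style.
    fn put(&mut self, chunk: &[u8]) -> [u8; 32] {
        let digest: [u8; 32] = Sha256::digest(chunk).into();
        self.chunks.entry(digest).or_insert_with(|| chunk.to_vec());
        digest
    }
}
```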
digikata commented on China is about to launch SSDs so small you insert them like a SIM card   theverge.com/news/759624/... · Posted by u/t-3
digikata · 4 months ago
(2016) https://news.samsung.com/global/samsung-mass-producing-indus...

The size is nothing new, but the packaging into some sort of slottable physical interface with good signal integrity is somewhat new.

u/digikata

Karma: 2433 · Cake day: September 19, 2010
About
Working in Rust and in pipeline streaming and processing with https://fluvio.io

Reach me on Fedi/Mastodon @digikata@fosstodon.org
