Readit News
AlexClickHouse commented on Dynamic Bird Migration Map   explorer.audubon.org/expl... · Posted by u/skadamat
AlexClickHouse · 3 months ago
I've recently implemented a similar service: https://adsb.exposed/?dataset=Birds&zoom=5 - a viewer for eBird data, where you can filter by various species and make visualizations with SQL.

Full write-up on how it was built: https://clickhouse.com/blog/birds

AlexClickHouse commented on Vector database that can index 1B vectors in 48M   vectroid.com/blog/why-and... · Posted by u/mathewpregasen
chatmasta · 3 months ago
I would like to see a “DataFusion for Vector databases,” i.e. an embeddable library that Does One Thing Well – fast embedding generation, index builds, retrieval, etc. – so that different systems can glue it into their engines without reinventing the core vector capabilities every time. Call it a generic “vector engine” (or maybe “embedding engine” to avoid confusion with “vectorized query engine.”)

Currently, every new solution is either baked into an existing database (Elastic, pgvector, Mongo, etc) or an entirely separate system (Milvus, now Vectroid, etc.)

There is a clear argument in favor of the pgvector approach, since it simply brings new capabilities to 30 years of battle-tested database tech. That’s more compelling than something like Milvus that has to re-invent “the rest of the database.” And Milvus is also a second system that needs to be kept in sync with the source database.

But pgvector is still _just for Postgres_. It’s nice that it’s an extension, but in the same way Milvus has to reinvent the database, pgvector needs to reinvent the vector engine. I can’t load pgvector into DuckDB as an extension.

Is there any effort to make a pure, Unix-style, batteries not included, “vector engine?” A library with best-in-class index building, retrieval, storage… that can be glued into a Postgres extension just as easily as it can be glued into a DuckDB extension?

AlexClickHouse · 3 months ago
USearch is this type of library: https://github.com/unum-cloud/usearch

It is used in ClickHouse and a few other DBMSs.

AlexClickHouse commented on Show HN: I made a small site to share text and files   dum.pt/... · Posted by u/MarsB
AlexClickHouse · 3 months ago
I implemented a similar site a few years ago, with one crucial difference that makes it even simpler: https://pastila.nl/

The difference is that there is no "share" button: you don't have to press anything, and you can just copy the page URL at any time.

AlexClickHouse commented on SQLite's File Format   sqlite.org/fileformat.htm... · Posted by u/whatisabcdefgh
alphazard · 3 months ago
SQLite is a great example of a single factor mattering more than everything else combined. A database contained in a single file is such a good idea that it outweighs a poorly designed storage layer, poorly designed column formats, and a terrible SQL implementation.

If craftsmanship is measured by the long tail of good choices that give something a polished and pristine feel, then SQLite was built with none of it. And yet, it's by far the best initial choice for every project that needs a database. Most projects will never need to switch to anything more.

AlexClickHouse · 3 months ago
Exactly as in MS Access, Interbase/Firebird, and dBase II.

u/AlexClickHouse

Karma: 93 · Cake day: January 30, 2025