alanwli (u/alanwli) - Readit News

alanwli commented on ANN v3: 200ms p99 query latency over 100B vectors turbopuffer.com/blog/ann-... · Posted by u/_peregrine_

alanwli · 16 days ago

Out of curiosity, how is the 92% recall calculated? For a given query, is recall measured against the true top-k over all 100B vectors, or is it measured at each of the N shards against that shard's own top-k?
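To make the distinction concrete, here is a minimal sketch of the two ways recall could be computed. Everything in it is hypothetical toy data: the shard names, ids, distances, and the hand-written `ANN_RESULTS` (which simulates one shard's index missing a true neighbor) are assumptions for illustration, not anything taken from the linked post.

```python
def exact_top_k(items, k):
    """Ids of the k (id, distance) items with the smallest distance to the query."""
    return [vid for vid, _ in sorted(items, key=lambda it: it[1])[:k]]


def recall(returned_ids, exact_ids):
    """Fraction of the exact neighbors that appear in the returned ids."""
    exact = set(exact_ids)
    return len(exact & set(returned_ids)) / len(exact)


def global_recall(shards, ann_results, k):
    """Merge per-shard ANN results, then compare against the top-k of the whole corpus."""
    dist = {vid: d for items in shards.values() for vid, d in items}
    candidates = [(vid, dist[vid]) for ids in ann_results.values() for vid in ids]
    merged = exact_top_k(candidates, k)
    return recall(merged, exact_top_k(list(dist.items()), k))


def mean_per_shard_recall(shards, ann_results, k):
    """Compare each shard's ANN result against that shard's own exact top-k, then average."""
    scores = [recall(ann_results[name], exact_top_k(items, k))
              for name, items in shards.items()]
    return sum(scores) / len(scores)


# Hypothetical data: (id, distance to the query) per shard.
SHARDS = {
    "A": [("a1", 0.1), ("a2", 0.2), ("a3", 0.9)],
    "B": [("b1", 0.5), ("b2", 0.6), ("b3", 0.7)],
}
# Simulated ANN output: shard A misses its true neighbor a2.
ANN_RESULTS = {"A": ["a1", "a3"], "B": ["b1", "b2"]}

print(global_recall(SHARDS, ANN_RESULTS, k=2))          # 0.5
print(mean_per_shard_recall(SHARDS, ANN_RESULTS, k=2))  # 0.75
```

In this toy case the same set of misses gives 0.75 when averaged per shard but 0.5 against the global top-k, which is why the definition matters when reading a headline recall number.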

alanwli commented on The Case Against PGVector alex-jacobs.com/posts/the... · Posted by u/tacoooooooo

alanwli · 3 months ago

I've seen a decent amount of production use of pgvector HNSW from our customers on GCP, but as the author noted, it is not without flaws, and these deployments are typically in the smallish range (0-10M vectors) because of the system characteristics he pointed out, i.e. build times and memory use. The tradeoffs to consider are whether you want to ETL data into yet another system and deal with operational overhead, eventual consistency, and application logic to join vector search with the rest of your operational data. Whether the tradeoffs are worth it really depends on your business requirements.

And if one needs transactional/consistency semantics, hybrid/filtered search, low latencies, etc., consider a SOTA Postgres system like AlloyDB with AlloyDB ScaNN, which has better scaling/performance (1B+ vectors), enhanced query optimization (adaptive pre-/post-/in-filtering), and improved index operations.

Full disclosure: I founded ScaNN in GCP databases and currently lead AlloyDB Semantic Search. And all these opinions are my own.

alanwli commented on Will Amazon S3 Vectors kill vector databases or save them? zilliz.com/blog/will-amaz... · Posted by u/Fendy

simonw · 5 months ago

This is a good article and seems well balanced despite being written by someone with a product that directly competes with Amazon S3. I particularly appreciated their attempt to reverse-engineer how S3 Vectors work, including this detail:

> Filtering looks to be applied after coarse retrieval. That keeps the index unified and simple, but it struggles with complex conditions. In our tests, when we deleted 50% of data, TopK queries requesting 20 results returned only 15—classic signs of a post-filter pipeline.

Things like this are why I'd much prefer if Amazon provided detailed documentation of how their stuff works, rather than leaving it to the development community to poke around and derive those details independently.
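The "fewer results than requested" symptom from the quoted passage can be reproduced with a small sketch of post-filtering versus pre-filtering. This is a toy brute-force model, not a description of how S3 Vectors is implemented: the data, the `is_live` predicate, and the `overfetch` knob are all hypothetical. That an overfetch of 1.5 happens to yield 15 results here is a property of this toy data, not evidence about the real system.

```python
def post_filter_search(items, k, is_live, overfetch=1.0):
    """Retrieve the nearest candidates first, then drop those failing the filter."""
    n_candidates = int(k * overfetch)
    candidates = sorted(items, key=lambda it: it[1])[:n_candidates]
    return [vid for vid, _ in candidates if is_live(vid)][:k]


def pre_filter_search(items, k, is_live):
    """Apply the filter first, then take the k nearest of what remains."""
    live = [it for it in items if is_live(it[0])]
    return [vid for vid, _ in sorted(live, key=lambda it: it[1])[:k]]


# Hypothetical corpus: 100 (id, distance) pairs, with every even id deleted (50%).
ITEMS = [(i, float(i)) for i in range(100)]


def is_live(vid):
    return vid % 2 == 1


print(len(post_filter_search(ITEMS, 20, is_live)))                 # 10
print(len(post_filter_search(ITEMS, 20, is_live, overfetch=1.5)))  # 15
print(len(pre_filter_search(ITEMS, 20, is_live)))                  # 20
```

Post-filtering can only return what survives from its candidate set, so a selective filter leaves it short of k unless it overfetches enough; pre-filtering always fills k when enough live items exist, at the cost of a more complex index or scan.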

alanwli · 5 months ago

The alternative is to find solutions that can reasonably support different requirements, because business needs change all the time, especially in the current state of our industry. From what I’ve seen, OSS Postgres/pgvector can adequately support a wide variety of requirements for millions to low tens of millions of vectors: low latencies, hybrid search, filtered search, the ability to serve out of memory and disk, and strong-consistency/transactional semantics with operational data. For further scaling/performance (1B+ vectors and even lower latencies), consider a SOTA Postgres system like AlloyDB with AlloyDB ScaNN.

Full disclosure: I founded ScaNN in GCP databases and am the lead for AlloyDB Semantic Search. And all these opinions are my own.

alanwli commented on Full text search over Postgres: Elasticsearch vs. alternatives blog.paradedb.com/pages/e... · Posted by u/philippemnoel

retakeming · 2 years ago

Good question. That was from a very old version of pg_bm25 (since renamed to pg_search). BM25 indexes are now strongly consistent.

alanwli · 2 years ago

Nice! I've seen other extensions that don't have transactional semantics, which runs counter to the norm for PG.

Since it was previously weakly consistent for performance reasons, how does strong consistency affect the latency of transactional inserts/updates?

alanwli commented on Full text search over Postgres: Elasticsearch vs. alternatives blog.paradedb.com/pages/e... · Posted by u/philippemnoel

alanwli · 2 years ago

Always great to see Postgres-based alternatives.

One clarifying question: the blog post lists "lack of ACID transactions and MVCC can lead to data inconsistencies and loss, while its lack of relational properties and real-time consistency makes many database queries challenging" as the downsides of Elasticsearch. What is pg_bm25's consistency model? It had been mentioned previously as offering "weak consistency" [0], which I interpret as having the same problems with transactions, MVCC, etc.?

[0]: https://news.ycombinator.com/item?id=37864089