Single-node Chroma is very easy to self-host for free: https://docs.trychroma.com/guides/deploy/docker
I see your email in your profile, so I'll reach out.
We work closely with any startup that wants help, ranging from whiteboarding architectures to a shared Slack channel.
I think there's a misconception among many non-technical people that they should "fine tune" models - to make an LLM sound like their brand or know their internal data. But the best practice today is "context engineering" - use a commodity LLM, and add proprietary information into the prompt.
The hard part of context engineering is knowing which data to incorporate. For example, stuffing every help doc into every context creates context rot [1], which hurts accuracy, so you need a way to find which information to put in which context. The solution is essentially an internal search engine, which is Chroma.
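As a rough illustration of that retrieve-then-prompt pattern, here's a minimal sketch with Chroma's Python client. The collection name and documents are made up, and the final prompt would go to whatever LLM API you already use:

```python
import chromadb

# In-memory client for the sketch; chromadb.PersistentClient(path=...) keeps data on disk
client = chromadb.Client()
docs = client.get_or_create_collection(name="help_docs")

docs.add(
    ids=["billing-1", "deploy-1"],
    documents=[
        "To update billing details, open Settings > Billing and edit the payment method.",
        "Deploy the app by pushing to the main branch.",
    ],
)

question = "How do I update my billing details?"

# Retrieve only the most relevant docs instead of stuffing every help doc into the context
hits = docs.query(query_texts=[question], n_results=1)
context = "\n\n".join(hits["documents"][0])

# Pass this prompt to whatever LLM API you use; no fine-tuning involved
prompt = f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"
print(prompt)
```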
Chroma gives AI multiple ways to search, and you can either hard-code which one works best for you - or let an agent write its own queries to conduct research by itself. Vector search is a generation ahead of old search infrastructure, and is useful for relatedness queries - like "help docs about billing". Full-text search is useful for proper nouns, like "Next.js". Regex search is useful for code and laws. Metadata search is more nuanced, but becomes really important in document search (e.g., PDFs). Chroma lets you run all of these search methods against private data, and you can even use it to include citations in results.
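Here's a hedged sketch of what those query modes look like with the Python client. The data is invented, and the `$regex` document filter only exists in newer Chroma releases, so treat that last line as version-dependent:

```python
import chromadb

client = chromadb.Client()
docs = client.get_or_create_collection(name="help_docs")
docs.add(
    ids=["d1", "d2"],
    documents=[
        "How to change the credit card on your invoice.",
        "Deploying a Next.js frontend to production.",
    ],
    metadatas=[{"doc_type": "billing"}, {"doc_type": "deployment"}],
)

# Vector search: relatedness queries like "help docs about billing"
docs.query(query_texts=["help docs about billing"], n_results=2)

# Full-text filter: exact matches on proper nouns like "Next.js"
docs.query(
    query_texts=["frontend deployment"],
    where_document={"$contains": "Next.js"},
    n_results=2,
)

# Metadata filter: narrow by structured fields (e.g. attributes extracted from PDFs)
docs.query(
    query_texts=["billing"],
    where={"doc_type": "billing"},
    n_results=2,
)

# Regex filtering ($regex) is available in newer Chroma versions; check yours before relying on it
# docs.query(query_texts=["version strings"], where_document={"$regex": r"v\d+\.\d+"}, n_results=2)
```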
So, the high-level answer is: Chroma enables you to incorporate your business or customer data into AI.
Chroma supports multiple search methods - including vector, full-text, and regex search.
Four quick ways Chroma is different from pgvector: better indexes, sharding, scaling, and object storage.
Chroma uses SPANN (Scalable Approximate Nearest Neighbor) and SPFresh (a freshness-aware ANN index). These are specialized algorithms not present in pgvector [1].
The core issue with scaling vector database indexes is that they don't handle `WHERE` clauses efficiently the way SQL does. In SQL you can ask `select * from posts where organization_id=7` and the b-tree gives good performance. But with vector databases, as the index grows it not only gets slower - it gets less accurate. Combining filtering with large indexes degrades both performance and accuracy.
The solution is to have many small indexes, which Chroma calls "Collections". So, instead of putting all user data in one table, you shard across collections, which improves both performance and accuracy.
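A minimal sketch of that sharding pattern with the Python client, assuming a made-up per-organization naming scheme:

```python
import chromadb

client = chromadb.Client()

# One small collection per tenant, rather than one giant index filtered by organization_id
def org_collection(org_id: int):
    # The "org_{id}_docs" naming scheme is invented for the example
    return client.get_or_create_collection(name=f"org_{org_id}_docs")

org_collection(7).add(ids=["a1"], documents=["Q3 invoice policy for org 7"])

# Queries hit only that tenant's small index, so they stay fast and accurate as other tenants grow
results = org_collection(7).query(query_texts=["invoice policy"], n_results=5)
```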
The third issue with using SQL for vectors is that the vectors quickly become a scaling constraint for the database. Writes become slow due to consistency requirements, vector indexes come to dominate disk usage, and CPU becomes clogged by constantly re-computing indexes. I've been there, and ultimately it hurts overall application performance for end-users. The solution for Chroma Cloud is a distributed system - which allows strong consistency, high write throughput, and low-latency reads.
Finally, Chroma is built on object storage - vectors are stored on AWS S3. This allows cold + warm storage tiers, so that you can have minimal storage costs for cold data. This "scale to zero" property is especially important for multi-tenant applications that need to retain data for inactive users.
Do your jobs tend to be for technical or non-technical customers? What are the characteristics of developers who succeed on your site?
Maybe there are millions of people in America who only keep their jobs for the health benefits rather than starting a 1-2 person business.
It just seems so silly.
Related: "My gift to USA" by Harald Eia (Norwegian comedian) https://www.youtube.com/watch?v=PguJ-lm4uLg
Takeaway of that video: "Can you really be free if you have to depend on somebody else?"