Readit News
amirkabbara commented on Show HN: Intent vectors for AI search and knowledge graphs for AI analytics   platform.papr.ai/... · Posted by u/amirkabbara
GraphNinja23 · 8 days ago
You might want to try using a low latency Graph Database like FalkorDB https://github.com/FalkorDB/falkordb
amirkabbara · 8 days ago
Yes, you can swap it into our open source version.
amirkabbara commented on Ask HN: What's your experience with using graph databases for agentic use-cases?    · Posted by u/mpetyak
amirkabbara · 3 months ago
At papr.ai, we've been building agentic 'RAG' pipelines for 3+ years and tried almost every new thing to enable agents to search info - keyword search/grep/regex/text2sql/BM25/semantic vector DBs/knowledge graphs/etc.

We validated the obvious thing - the best approach depends on your use case:

1. Keyword/grep/regex/BM25 works best (fastest, cheapest, most accurate) when you know exactly what you're looking for.
2. Semantic search works best with unstructured data when you're not exactly sure what you're looking for.
3. Text2SQL works best when you have a few pre-defined queries with limited joins the agent can use to fetch structured data.
4. Knowledge graphs work best when you need to find info across unstructured + structured data in ways that go beyond semantic similarity (e.g. find arXiv reports by author X that discuss novel knowledge graph methods, were published in the past 3 years, and don't mention neo4j).
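Those heuristics can be sketched as a toy query router. This is a minimal illustration, not Papr's actual implementation; the backend names and boolean flags are assumptions (in practice the flags would come from a query classifier rather than being passed in by hand).

```python
from enum import Enum

class Backend(Enum):
    KEYWORD = "keyword"     # grep/regex/BM25: exact terms are known
    SEMANTIC = "semantic"   # vector search: fuzzy intent, unstructured data
    TEXT2SQL = "text2sql"   # pre-defined structured queries, limited joins
    GRAPH = "graph"         # multi-hop questions across mixed sources

def route_query(query: str, has_exact_terms: bool,
                is_structured: bool, needs_multi_hop: bool) -> Backend:
    """Toy router mirroring the four heuristics above (illustrative only).

    The query text is unused here; a real system would classify it to
    derive the flags instead of taking them as arguments.
    """
    if needs_multi_hop:
        return Backend.GRAPH          # case 4: beyond semantic similarity
    if is_structured:
        return Backend.TEXT2SQL       # case 3: structured data fetch
    if has_exact_terms:
        return Backend.KEYWORD        # case 1: you know what you want
    return Backend.SEMANTIC           # case 2: fuzzy lookup fallback
```

The ordering matters: multi-hop questions are checked first because they subsume the simpler cases.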

So - we ended up building a simple add/search API that predicts where the data should come from and what the user will need this week/today, and caches it. It's accurate and it's fast.
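The predict-and-cache shape of that API could look something like the sketch below. Everything here is hypothetical (class name, TTL, the naive substring search standing in for real retrieval); it only illustrates the pattern of serving recurring queries from a pre-computed cache.

```python
import time

class PredictiveMemory:
    """Hypothetical sketch: add documents, cache results for queries
    expected to recur, and invalidate the cache when data changes."""

    def __init__(self, ttl_seconds: float = 86400.0):
        self.ttl = ttl_seconds
        self.store: list[str] = []
        # query -> (timestamp, results) so stale entries can expire
        self.cache: dict[str, tuple[float, list[str]]] = {}

    def add(self, doc: str) -> None:
        self.store.append(doc)
        self.cache.clear()  # new data may change cached answers

    def search(self, query: str) -> list[str]:
        entry = self.cache.get(query)
        if entry and time.time() - entry[0] < self.ttl:
            return entry[1]  # fast path: answer was computed earlier
        # slow path stand-in; a real system would route to a backend
        results = [d for d in self.store if query.lower() in d.lower()]
        self.cache[query] = (time.time(), results)
        return results
```

A real predictive layer would warm the cache ahead of the query (based on predicted need) rather than only after the first miss, but the read path is the same.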

amirkabbara commented on Everyone's engineering context, we're predicting it. Introducing Papr memory API   paprai.substack.com/p/int... · Posted by u/amirkabbara
amirkabbara · 4 months ago
Most AI systems today rely on vector search to find semantically similar information. This approach is powerful, but it has a critical blind spot: it finds fragments, not context. It can tell you that two pieces of text are about the same topic, but it can't tell you how they're connected or why they matter together.

To solve this, everyone is engineering context - trying to figure out what to put into the context window to get the best answer using RAG, agentic search, hierarchy trees, etc. At Papr we tested almost every option that exists. These methods work in simple use cases but not at scale. That's why MIT's report says 95% of AI pilots fail, and why we're seeing threads about vector search falling short.

Instead of humans engineering context, we've built a model to predict the right context. Our model ranks #1 on Stanford's STARK benchmark, which measures retrieval on complex real-world queries (not a needle-in-a-haystack benchmark). It's also super fast because the context is predicted in advance, which is essential for a ton of use cases like voice conversations. Try it out on papr.ai, our open source chat app, or use Papr's memory APIs to create your own experiences.

We've also developed a retrieval loss formula and shown that Papr's memory APIs get better with more data, not worse like other retrieval systems today. A similar pattern to LLMs - the more data, the better.

amirkabbara commented on In another AI push, China holds the first sports event for humanoid robots   nbcnews.com/world/asia/ch... · Posted by u/amirkabbara
amirkabbara · 4 months ago
About 500 robot athletes from 16 countries competed in Beijing as the United States and China race against each other to shape the future of AI.

u/amirkabbara

Karma: 92 · Cake day: May 11, 2013
About
Building papr.ai, the most accurate memory and context retrieval API for AI agents. Ex-Amazon, Microsoft, and Shopify.