Readit News
emil_sorensen commented on Is the doc bot docs, or not?   robinsloan.com/lab/what-a... · Posted by u/tobr
skrebbel · 8 months ago
Why RAG at all?

We concatenated all our docs and tutorials into a text file, piped it all into the AI right along with the question, and the answers are pretty great. Cost was, last I checked, roughly 50c per question. Probably scales linearly with how much documentation you have. This feels expensive, but compared to a human writing an answer it's peanuts. Plus (assuming the customer can choose to use the AI or a human), it's a great customer experience because the answer is there that much faster.

I feel like this is a no-brainer. Tbh with the context windows we have these days, I don't completely understand why RAG is a thing anymore for support tools.
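The approach described above can be sketched in a few lines. This is my own illustration of the idea (the file layout, prompt format, and helper names are assumptions, not what the commenter actually ran); it just builds the "all docs + question" prompt you'd hand to whatever model API you use:

```python
# Sketch of the "no-RAG" approach: concatenate every docs file into one
# blob and put it in the prompt alongside the user's question.
# Illustrative only -- the directory layout and prompt framing are assumed.
from pathlib import Path

def build_prompt(docs_dir: str, question: str) -> str:
    """Concatenate all Markdown docs under docs_dir into a single prompt."""
    docs = "\n\n".join(
        p.read_text(encoding="utf-8")
        for p in sorted(Path(docs_dir).glob("**/*.md"))
    )
    return (
        "Answer the question using only the documentation below.\n\n"
        f"=== DOCUMENTATION ===\n{docs}\n\n"
        f"=== QUESTION ===\n{question}\n"
    )
```

Since the whole corpus is re-sent on every question, cost scales with corpus size, which is exactly the linear-cost behavior described above.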

emil_sorensen · 8 months ago
Accuracy drops hard with context length still. Especially in more technical domains. Plus latency and cost.
emil_sorensen commented on Is the doc bot docs, or not?   robinsloan.com/lab/what-a... · Posted by u/tobr
emil_sorensen · 8 months ago
Docs bots like these are deceptively hard to get right in production. Retrieval is super sensitive to how you chunk/parse documentation and how you end up structuring documentation in the first place (see frontpage post from a few weeks ago: https://news.ycombinator.com/item?id=44311217).
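To make the chunking sensitivity concrete, here's a toy heading-based splitter (my illustration, not kapa.ai's pipeline; real systems are far more involved). Splitting on headings keeps each section self-contained, whereas naive fixed-size splits can cut a code sample or table in half and wreck retrieval:

```python
def chunk_by_heading(markdown: str) -> list[str]:
    """Split a Markdown doc into one chunk per heading-delimited section."""
    chunks: list[str] = []
    current: list[str] = []
    for line in markdown.splitlines():
        # Start a new chunk whenever a heading begins (and we have content).
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks
```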

You want grounded RAG systems like Shopify's here to rely strongly on the underlying documents, but also still sprinkle a bit of the magic of the latent LLM knowledge too. The only way to get that balance right is evals. Lots of them. It gets even harder when you are dealing with GraphQL schema like Shopify has since most models struggle with that syntax moreso than REST APIs.
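A minimal grounding eval in the spirit of the above might look like this. The scoring heuristic is my own toy illustration (real eval suites use LLM judges or labeled reference answers), but it shows the shape: for each answer, check how much of it is actually supported by the retrieved chunks:

```python
# Toy grounding check: fraction of answer sentences whose words mostly
# appear in at least one retrieved chunk. Illustrative heuristic only.

def grounding_score(answer: str, retrieved_chunks: list[str]) -> float:
    """Return the fraction of answer sentences grounded in some chunk."""
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    if not sentences:
        return 0.0
    grounded = 0
    for sent in sentences:
        words = set(sent.lower().split())
        for chunk in retrieved_chunks:
            chunk_words = set(chunk.lower().split())
            # Crude proxy: >=50% of the sentence's words occur in the chunk.
            if words and len(words & chunk_words) / len(words) >= 0.5:
                grounded += 1
                break
    return grounded / len(sentences)
```

Running a check like this over a large test set of real user questions is what lets you tune how strongly the bot leans on the documents versus the model's latent knowledge.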

FYI I'm biased: Founder of kapa.ai here (we build docs AI assistants for 200+ companies incl. Sentry, Grafana, Docker, the largest Apache projects, etc.).

emil_sorensen commented on Writing documentation for AI: best practices   docs.kapa.ai/improving/wr... · Posted by u/mooreds
MK_Dev · 9 months ago
How do you turn off dark mode on that site? Hurts my eyes
emil_sorensen · 9 months ago
Thanks for the feedback. We should definitely add that. :)
emil_sorensen commented on Writing documentation for AI: best practices   docs.kapa.ai/improving/wr... · Posted by u/mooreds
emil_sorensen · 9 months ago
OP here. It's kind of ironic that making the docs AI-friendly essentially just ends up being what good documentation is in the first place (explicit context and hierarchy, self-contained sections, precise error messages).
emil_sorensen commented on Evaluating modular RAG with reasoning models   kapa.ai/blog/evaluating-m... · Posted by u/emil_sorensen
aantix · a year ago
When aggregating data from multiple systems, how do you handle the case of only searching against data chunks that the user is authorized to view? And if those permissions change?
emil_sorensen · a year ago
We focus mainly on external use cases (e.g., helping companies like Docker and Monday.com deploy customer facing "Ask AI" assistants) so we don't run into much of that given all data is public.

For internal use cases that require user-level permissions, that's a freaking rabbit hole. I recently heard someone describe Glean as a "permissions company" more so than a search company for that reason. :)

emil_sorensen commented on Evaluating modular RAG with reasoning models   kapa.ai/blog/evaluating-m... · Posted by u/emil_sorensen
serjester · a year ago
We tried something similar and found much better results with o1 pro than o3 mini. RAG seems to require a level of world knowledge that the mini models don’t have.

This comes at the cost of significantly higher latency and cost. But for us, answer quality is a much higher priority.

emil_sorensen · a year ago
Super cool! Yep, a lot seems to get lost through distillation.
emil_sorensen commented on Evaluating modular RAG with reasoning models   kapa.ai/blog/evaluating-m... · Posted by u/emil_sorensen
zurfer · a year ago
Yes. Our main finding was that o3 mini especially is great on paper but surprisingly hard to prompt compared to non-reasoning models. I don't think it's a problem with reasoning, but rather with this specific model. I also suspect that o3 mini is a rather small model, so it can lack useful knowledge for broad applications. Especially for RAG, it seems that larger and fast models (e.g. gpt4o) perform better as of today.
emil_sorensen · a year ago
I suspect you're right here! Excited to get our hands on the non-distilled o3. :)
emil_sorensen commented on Evaluating modular RAG with reasoning models   kapa.ai/blog/evaluating-m... · Posted by u/emil_sorensen
mkesper · a year ago
Latency must be brutal here. This will not be possible for any chat application, I guess.
emil_sorensen · a year ago
Yep even with a small bump in performance (which we only saw for a subset of coding questions), it wouldn't be worth the huge latency penalty. Though that will surely go down over time.

u/emil_sorensen

Karma: 168 · Cake day: July 17, 2019
About
founder of kapa.ai (YC S23)