I am an AI researcher. Most actual AI researchers and engineers use very few of these tools; the only exceptions are model providers like the OpenAI API and the public clouds (AWS, Azure, GCP). The rest are infra-centric tools whose importance a16z is highly incentivized to inflate.
This does look like the sort of complex ecosystem that emerges right after an inflection point, before consolidation happens. It reminds me of adtech in the early 2010s.
That said, while much of this might not have any real traction long-term, looking at what researchers use seems to miss the mark a bit. It’s like saying network technology researchers aren’t using Vercel.
There are some other useful ones in there. Hugging Face jumps out. W&B. I haven't used Mosaic, but I could see myself using it for bigger projects; I know of at least two PIs at Stanford using them.
> So, agents have the potential to become a central piece of the LLM app architecture (or even take over the whole stack, if you believe in recursive self-improvement). ... There’s only one problem: agents don’t really work yet.
I really appreciate that they called out and separated hype from practice, specifically with regard to agents. This is something I keep hoping works better than it does, and in practice every attempt I've made in this direction has led to disappointment.
What vector DB are you using? What is the data structure that you're vectorizing? What is your chunk size? Have you implemented memory? What prompt or technique are you using (ReAct, CoT, few-shot, etc.)? Are you only using vector DBs? Do you use sequential chains? Does it need tools? Depending on your data, your business case, and what output you expect from the agent, there is no one-size-fits-all answer.
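A few of those knobs can be made concrete with a toy retrieval sketch: fixed-size chunking, a stand-in bag-of-words "embedding", and top-k retrieval stuffed into a prompt. Everything here (the toy `embed`, the sample `docs`) is illustrative; a real pipeline would call an embedding model and a vector DB instead.

```python
import math
from collections import Counter

def chunk(text: str, chunk_size: int = 50) -> list[str]:
    """Split text into fixed-size word chunks -- chunk size is one of the knobs above."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query -- the vector DB's job."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

docs = "billing questions go to finance. password resets go to IT. refunds need manager approval."
context = top_k("how do I reset my password", chunk(docs, chunk_size=6))
prompt = "Answer using only this context:\n" + "\n".join(context) + "\nQ: how do I reset my password"
```

Each question in the comment above maps to a swappable piece here, which is exactly why there's no one-size-fits-all configuration.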
This mirrors my experience as well. I've tried to use them for pretty straightforward support agent type tasks and found that they very often go down wrong paths trying to solve the problem.
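That failure mode falls out of the loop structure itself: each step's observation becomes the next step's context, so one bad tool choice early on steers everything after it. A toy sketch, with a hypothetical `pick_action` stub standing in for the model (a real agent would prompt an LLM ReAct-style here):

```python
# Toy tool-using agent loop. All tool names and the stub policy are illustrative.
TOOLS = {
    "search_kb": lambda q: "article 12: resetting passwords" if "password" in q else "no results",
    "escalate": lambda q: "ticket opened for a human",
}

def pick_action(query: str, history: list[str]) -> str:
    # Stub policy: search first, escalate if the search found nothing.
    # A real agent replaces this with an LLM call on the query + history.
    if not history:
        return "search_kb"
    return "escalate" if "no results" in history[-1] else "done"

def run_agent(query: str, max_steps: int = 4) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):
        action = pick_action(query, history)
        if action == "done":
            break
        # Each observation is appended to the context the next step sees,
        # which is how an early wrong turn compounds.
        history.append(TOOLS[action](query))
    return history
```

With an LLM in place of the stub, any misjudged `pick_action` pollutes `history` for every later step, which matches the "goes down wrong paths" experience.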
I am so glad that top VCs are thinking along these lines, of architectures that incorporate AI as part of the flow.
We've spun off a company to realize the vision of bringing an open-source, standardized framework to the PHP ecosystem, where we've been building apps for communities for over a decade. It's "AI for the rest of us", while also promoting positive collaboration between communities and AI. It also involves micropayments for tasks done by either an AI or a human agent.
And if you want to get involved in any capacity, whether as an investor or developer, please email me greg at the domain engageusers.ai -- this time around we are planning to take on venture capital funding for this project, and syndicate a round later this summer.
This blog post is way more complex than it needs to be. A lot of what most people are doing with LLMs right now boils down to using vector databases to provide the "best" info/examples for your prompt. This is a slick marketing page, but I'm not sure what they think they're providing beyond that.
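That pattern (use similarity search to pick the "best" stored examples and prepend them to the prompt) can be sketched in a few lines. The `EXAMPLES` data and the Jaccard similarity here are stand-ins for a real embedding model and vector database:

```python
# Hypothetical stored Q/A examples -- in practice these live in a vector DB.
EXAMPLES = [
    ("How do I cancel my plan?", "Go to Settings > Billing > Cancel."),
    ("Why was I charged twice?", "Contact support for a duplicate-charge refund."),
    ("How do I change my email?", "Go to Settings > Profile > Email."),
]

def jaccard(a: str, b: str) -> float:
    """Cheap word-overlap similarity, standing in for embedding cosine similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def build_prompt(query: str, k: int = 2) -> str:
    """Pick the k most similar examples and build a few-shot prompt."""
    best = sorted(EXAMPLES, key=lambda ex: jaccard(query, ex[0]), reverse=True)[:k]
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in best)
    return f"{shots}\nQ: {query}\nA:"
```

The resulting string is what actually gets sent to the model; the rest of the "emerging architecture" is largely plumbing around this step.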
This is what a16z does. A few years ago it was the "Modern Data Stack" and a few years before that it was "DevOps." For some reason, venture capitalists really like making these fancy charts to describe the obvious, and then mostly ignoring them during their investment decisions (or sometimes they make the investments, then they make the charts and put their portfolio companies in the boxes).
To protect their downside risk by seeding the broader marketplace with a narrative that later-stage investors and acquirers will be influenced by when deciding whether to invest or acquire.
Bit of self-promotion, but Milvus (https://milvus.io) is another open-source vector database option as well (I have a pretty good idea as to why it isn't listed in a16z's blog post). We also have milvus-lite, a pip-installable package that uses the same API, for folks who don't want to stand up a local service.
pip install milvus
Other than that, it's great to see the shout-out for Vespa.
The guys at Milvus raised a total of $113M according to Crunchbase, second only to Pinecone, which is funded by a16z. You're not going to highlight the main competitor of one of your portfolio companies.
I think this is a very well-articulated breakdown of the "LLM Core, Code Shell" (https://www.latent.space/p/function-agents#%C2%A7llm-core-co...) view of the world, but it undersells the potential by relegating the agents stuff to a three-paragraph "what about agents?" piece at the end. The emerging architecture of "Code Core, LLM Shell", which decentralizes and specializes the role of the LLM, will hopefully get more airtime in the December a16z landscape chart!
We actually purposefully left that part a bit sparse because we have something else coming up on the topic! I'm sure we will be chatting through it soon :)
This and other end-to-end architectures are offered in deepset/haystack, one of the best and most mature frameworks (it predates the GPT craze) for working with LLMs, doing retrieval augmentation, etc.
I do feel the article presents old concepts as "emerging".
Over a weekend I used deepset/haystack to build a Q&A engine over open-source communities' Slack and Discord threads that might contain an answer - it was a joy and a breeze to implement. If you have a question about Metaflow, K8s, Golang, Deepset, Deep Java Library, or some other tech, try asking your quick question on https://www.kwq.ai :-)
Microsoft Guidance is legit and useful. It's a bunch of prompting features piled on top of Handlebars syntax. (And it has its own caching: set temp to 0 and it caches - no need for LLM-specific caching libs :))
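The temp-0 caching trick generalizes beyond Guidance: if the completion is deterministic in the prompt, a plain memoizer is all you need. A minimal sketch with a hypothetical `fake_llm` placeholder (not Guidance's actual API):

```python
import functools

# Track how many times the "model" is actually invoked.
CALLS = {"count": 0}

def fake_llm(prompt: str, temperature: float = 0.0) -> str:
    """Placeholder for a real model call -- purely illustrative."""
    CALLS["count"] += 1
    return f"answer to: {prompt}"

@functools.lru_cache(maxsize=1024)
def cached_completion(prompt: str) -> str:
    # Safe to memoize on the prompt only because temperature is pinned to 0,
    # making the response (near-)deterministic.
    return fake_llm(prompt, temperature=0.0)
```

Repeated calls with the same prompt hit the cache instead of the model, which is the whole value proposition of the LLM-specific caching libraries.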
How prescient is the "Hidden Technical Debt" paper [1] from ~8 years ago compared to this? See the top of page 4 for a figure that I've personally found useful for explaining all the stuff necessary to put together a reasonable app using ML/DL (up until today, anyway).
I see all the same bits called out:
- Data collection
- Machine/resource management
- Serving
- Monitoring
- Analysis
- Process management
- Data verification
There are some new concepts that aren't quite captured in the original paper, though, like the "playground".
I've kind of been expecting a follow-up that shows an update to that original paper.
I feel like lots of papers are getting published and reviewed, which is good, as bad ideas don't get to propagate for ages.
If you're a VC or an expert in the space, I'd love to get feedback on this: https://engageusers.ai/ecosystem.pdf
The vector database space is the Wild West; keep at it.
If you are curious about building something quickly, you can jump into one of the tutorials: https://haystack.deepset.ai/tutorials
https://github.com/microsoft/guidance
[1] https://proceedings.neurips.cc/paper_files/paper/2015/file/8...