I am an AI researcher. Most actual AI researchers and engineers use very few of these tools; the only exceptions are model providers like the OpenAI API and the public clouds (AWS, Azure, GCP). The rest are infra-centric tools whose importance a16z is highly incentivized to inflate.
This does look like the sort of complex ecosystem that emerges right after an inflection point, before consolidation happens. It reminds me of adtech in the early 2010s.
That said, while much of this might not have any real traction long-term, looking at what researchers use seems to miss the mark a bit. It’s like saying network technology researchers aren’t using Vercel.
There are some other useful ones in there. Hugging Face jumps out. W&B. I haven't used Mosaic, but I could see myself using it for bigger projects; I know of at least two PIs at Stanford using them.
> So, agents have the potential to become a central piece of the LLM app architecture (or even take over the whole stack, if you believe in recursive self-improvement). ... There’s only one problem: agents don’t really work yet.
I really appreciate that they called out and separated hype from practice, specifically with regard to agents. This is something I keep hoping works better than it does, and in practice every attempt I've made in this direction has led to disappointment.
What vector DB are you using? What is the data structure that you're vectorizing? What is your chunk size? Have you implemented memory? What prompt or technique are you using (ReAct, CoT, few-shot, etc.)? Are you only using vector DBs? Do you use sequential chains? Does it need tools? Depending on your data, your business case, and what output you expect from the agent, there is no one-size-fits-all answer.
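A few of those knobs can be made concrete with a toy retrieval sketch: fixed-size chunking, a stand-in bag-of-words "embedding", and top-k retrieval stuffed into a prompt. Everything here (the toy `embed`, the sample `docs`) is illustrative; a real pipeline would call an embedding model and a vector DB instead.

```python
import math
from collections import Counter

def chunk(text: str, chunk_size: int = 50) -> list[str]:
    """Split text into fixed-size word chunks -- chunk size is one of the knobs above."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; stand-in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query -- the vector DB's job."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

docs = "billing questions go to finance. password resets go to IT. refunds need manager approval."
context = top_k("how do I reset my password", chunk(docs, chunk_size=6))
prompt = "Answer using only this context:\n" + "\n".join(context) + "\nQ: how do I reset my password"
```

Each question in the comment above maps to a swappable piece here, which is exactly why there's no one-size-fits-all configuration.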
This mirrors my experience as well. I've tried to use them for pretty straightforward support agent type tasks and found that they very often go down wrong paths trying to solve the problem.
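That failure mode falls out of the loop structure itself: each step's observation becomes the next step's context, so one bad tool choice early on steers everything after it. A toy sketch, with a hypothetical `pick_action` stub standing in for the model (a real agent would prompt an LLM ReAct-style here):

```python
# Toy tool-using agent loop. All tool names and the stub policy are illustrative.
TOOLS = {
    "search_kb": lambda q: "article 12: resetting passwords" if "password" in q else "no results",
    "escalate": lambda q: "ticket opened for a human",
}

def pick_action(query: str, history: list[str]) -> str:
    # Stub policy: search first, escalate if the search found nothing.
    # A real agent replaces this with an LLM call on the query + history.
    if not history:
        return "search_kb"
    return "escalate" if "no results" in history[-1] else "done"

def run_agent(query: str, max_steps: int = 4) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):
        action = pick_action(query, history)
        if action == "done":
            break
        # Each observation is appended to the context the next step sees,
        # which is how an early wrong turn compounds.
        history.append(TOOLS[action](query))
    return history
```

With an LLM in place of the stub, any misjudged `pick_action` pollutes `history` for every later step, which matches the "goes down wrong paths" experience.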
I am so glad that top VCs are thinking along these lines, of architectures that incorporate AI as part of the flow.
We've spun off a company to realize the vision of bringing an open-source, standardized framework to the PHP ecosystem, where we've been building apps for communities for over a decade. It's "AI for the rest of us", while also promoting positive collaboration between communities and AI. It also involves micropayments for tasks done by either an AI or a human agent.
And if you want to get involved in any capacity, whether as an investor or developer, please email me greg at the domain engageusers.ai -- this time around we are planning to take on venture capital funding for this project, and syndicate a round later this summer.
This blog post is way more complex than it needs to be. A lot of what most people are doing with LLMs right now boils down to using vector databases to provide the "best" info/examples for your prompt. This is a slick marketing page, but I'm not sure what they think they're providing beyond that.
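That pattern (use similarity search to pick the "best" stored examples and prepend them to the prompt) can be sketched in a few lines. The `EXAMPLES` data and the Jaccard similarity here are stand-ins for a real embedding model and vector database:

```python
# Hypothetical stored Q/A examples -- in practice these live in a vector DB.
EXAMPLES = [
    ("How do I cancel my plan?", "Go to Settings > Billing > Cancel."),
    ("Why was I charged twice?", "Contact support for a duplicate-charge refund."),
    ("How do I change my email?", "Go to Settings > Profile > Email."),
]

def jaccard(a: str, b: str) -> float:
    """Cheap word-overlap similarity, standing in for embedding cosine similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def build_prompt(query: str, k: int = 2) -> str:
    """Pick the k most similar examples and build a few-shot prompt."""
    best = sorted(EXAMPLES, key=lambda ex: jaccard(query, ex[0]), reverse=True)[:k]
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in best)
    return f"{shots}\nQ: {query}\nA:"
```

The resulting string is what actually gets sent to the model; the rest of the "emerging architecture" is largely plumbing around this step.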
This is what a16z does. A few years ago it was the "Modern Data Stack" and a few years before that it was "DevOps." For some reason, venture capitalists really like making these fancy charts to describe the obvious, and then mostly ignoring them during their investment decisions (or sometimes they make the investments, then they make the charts and put their portfolio companies in the boxes).
To protect their downside risk by seeding the broader marketplace with a narrative that later-stage investors and acquirers will be influenced by when deciding whether to invest or acquire.
Bit of self-promotion, but Milvus (https://milvus.io) is another open-source vector database option as well (I have a pretty good idea as to why it isn't listed in a16z's blog post). We also have milvus-lite, a pip-installable package that uses the same API, for folks who don't want to stand up a local service.
pip install milvus
Other than that, it's great to see the shout-out for Vespa.
The guys at Milvus raised a total of $113M according to Crunchbase, second only to Pinecone, which is funded by a16z. You're not going to highlight the main competitor of one of your portfolio companies.
I think this is a very well-articulated breakdown of the "LLM Core, Code Shell" (https://www.latent.space/p/function-agents#%C2%A7llm-core-co...) view of the world, but it undersells the potential by relegating the agents stuff to a three-paragraph "what about agents?" piece at the end. The emerging architecture of "Code Core, LLM Shell", which decentralizes and specializes the role of the LLM, will hopefully get more airtime in the December a16z landscape chart!
We actually purposefully left that part a bit sparse because we have something else coming up on the topic! I'm sure we will be chatting through it soon :)
This and other end-to-end architectures are offered in deepset/haystack, one of the best and most mature frameworks (it predates the GPT craze) for working with LLMs, doing retrieval augmentation, etc.
I do feel the article presents old concepts as "emerging".
Over a weekend I used deepset/haystack to build a Q&A engine over open-source communities' Slack and Discord threads that might contain an answer - it was a joy and a breeze to implement. If you have a question about Metaflow, K8s, Golang, Deepset, Deep Java Library, or some other tech, try asking your quick question on https://www.kwq.ai :-)
Microsoft Guidance is legit and useful. It's a bunch of prompting features piled on top of Handlebars syntax. (And it has its own caching: set temp to 0 and it caches - no need for LLM-specific caching libs :))
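The temp-0 caching trick generalizes beyond Guidance: if the completion is deterministic in the prompt, a plain memoizer is all you need. A minimal sketch with a hypothetical `fake_llm` placeholder (not Guidance's actual API):

```python
import functools

# Track how many times the "model" is actually invoked.
CALLS = {"count": 0}

def fake_llm(prompt: str, temperature: float = 0.0) -> str:
    """Placeholder for a real model call -- purely illustrative."""
    CALLS["count"] += 1
    return f"answer to: {prompt}"

@functools.lru_cache(maxsize=1024)
def cached_completion(prompt: str) -> str:
    # Safe to memoize on the prompt only because temperature is pinned to 0,
    # making the response (near-)deterministic.
    return fake_llm(prompt, temperature=0.0)
```

Repeated calls with the same prompt hit the cache instead of the model, which is the whole value proposition of the LLM-specific caching libraries.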
How prescient is the "Hidden Technical Debt" paper [1] from ~8 years ago compared to this? See the top of page 4 for a figure that I've personally found useful for explaining all the stuff necessary to put together a reasonable app using ML/DL (up until today, anyway).
I see all the same bits called out:
- Data collection
- Machine/resource management
- Serving
- Monitoring
- Analysis
- Process management
- Data verification
There are some new concepts that aren't quite captured in the original paper, though, like the "playground".
I've kind of been expecting a follow-up that shows an update to that original paper.
I feel like lots of papers are getting published and reviewed, which is good, as bad ideas don't get to propagate for ages.
If you're a VC or an expert in the space, I'd love to get feedback on this: https://engageusers.ai/ecosystem.pdf
The vector database space is the Wild West; keep at it.
If you are curious about building something quickly, you can jump into one of the tutorials: https://haystack.deepset.ai/tutorials
https://github.com/microsoft/guidance
[1] https://proceedings.neurips.cc/paper_files/paper/2015/file/8...