Ask HN: Examples of agentic LLM systems in production?

An anecdote that may help:

I do contracting work, and we're building a text-to-SQL automated business analyst. It's quite well-rounded: it tries to recover from errors, automatically creates appropriate visualisations, and has a generic "faq" component to help the user understand how to use the tool. The tool is available to some 10,000 B2B users.

It's just a bunch of prompts conditionally slapped together in a call graph.
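
To make that concrete, here is a minimal sketch of what such a hardcoded call graph could look like. It is not the actual tool: `run_query`, `fake_llm`, the prompts and the routing labels are all made up for illustration, and the model call is stubbed so the example runs on its own.

```python
from typing import Callable

# Hypothetical stand-in for a real model call: prompt in, text out.
LLM = Callable[[str], str]

def run_query(sql: str) -> list[tuple]:
    """Hypothetical database call; raises ValueError on obviously bad SQL."""
    if "FROM" not in sql.upper():
        raise ValueError("syntax error: missing FROM")
    return [("2024-01", 120), ("2024-02", 135)]

def answer(question: str, llm: LLM, max_retries: int = 2) -> dict:
    # Step 1: route. The branch is taken by plain code, not by the model.
    route = llm(f"Classify as 'faq' or 'data': {question}").strip().lower()
    if route == "faq":
        return {"kind": "faq", "text": llm(f"Answer this usage question: {question}")}

    # Step 2: generate SQL, feeding the error back into the prompt on failure.
    prompt = f"Write SQL for: {question}"
    for _ in range(max_retries + 1):
        sql = llm(prompt)
        try:
            rows = run_query(sql)
            break
        except ValueError as err:
            prompt = f"Write SQL for: {question}\nPrevious attempt failed: {err}"
    else:
        # The loop never hit `break`, so every attempt failed.
        return {"kind": "error", "text": "could not produce a working query"}

    # Step 3: pick a visualisation for the result.
    chart = llm(f"Pick 'bar', 'line' or 'table' for rows like: {rows[:1]}").strip()
    return {"kind": "data", "sql": sql, "rows": rows, "chart": chart}

def fake_llm(prompt: str) -> str:
    """Canned replies so the sketch runs without a model (illustration only)."""
    if prompt.startswith("Classify"):
        return "data"
    if prompt.startswith("Write SQL"):
        # First attempt is broken on purpose to exercise the retry path.
        if "Previous attempt failed" in prompt:
            return "SELECT month, total FROM sales"
        return "SELECT month, total"
    if prompt.startswith("Pick"):
        return "line"
    return "help text"

result = answer("monthly sales", fake_llm)
```

Every edge in the graph is an ordinary `if` or loop, which is why this kind of setup stays easy to debug.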

The client needed AGENTIC AI, without specifying exactly what this meant. I spent two weeks pushing back on it, stating that if you replace the hardcoded call graph with something that has """free will""", accuracy and interpretability go down whilst runtimes go up... but no, we must have agents.

So I did nothing, and called the current setup "constrained agentic ai". The result: high fives all around, everyone is happy.

Make of that what you will... ai agents are at least 90% hype.

bradarner · a year ago

The hype of Agentic AI is to LLMs what an MBA is to business. Overcomplicating something with language that is pretty common sense.

I've implemented countless LLM-based "agentic" workflows over the past year. They are simple. Each is a series of prompts that maintain state with a targeted output.

The common association with "a floating R2D2" is not helpful.

They are not magic.

The core elements I'm seeing so far are: the prompt(s), a capacity for passing in context, a structure for defining how to move through the prompts, integrating the context into prompts, bridging the non-deterministic -> deterministic divide, and callbacks or what-to-do-next.

The closest analogy that I find helpful is lambda functions.

What makes them "feel" more complicated is the non-deterministic bits. But, in the end, it is text going in and text coming out.
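
As a rough illustration of the "non-deterministic -> deterministic" bridge, here is a minimal sketch: free text comes back from the model, and plain code parses it, validates it against a closed set, retries, and falls back to a default. `choose_chart`, `ALLOWED_CHARTS` and the prompt wording are hypothetical names chosen for the example.

```python
import json
from typing import Callable

ALLOWED_CHARTS = {"bar", "line", "table"}  # hypothetical closed set of valid outputs

def choose_chart(llm: Callable[[str], str], context: dict, retries: int = 2) -> str:
    # Integrate the context into the prompt.
    prompt = 'Reply with JSON like {"chart": "bar"}. Columns: ' + ", ".join(context["columns"])
    for _ in range(retries + 1):
        raw = llm(prompt)
        try:
            value = json.loads(raw)["chart"]
        except (json.JSONDecodeError, KeyError, TypeError):
            continue  # not JSON, or not the shape we asked for: try again
        if isinstance(value, str) and value in ALLOWED_CHARTS:
            return value  # from here on the rest of the program sees a known value
    return "table"  # deterministic fallback when the model never complies

# Canned replies standing in for a model: prose, an invalid value, then a valid one.
replies = iter(["Sure! A bar chart.", '{"chart": "pie"}', '{"chart": "line"}'])
chart = choose_chart(lambda prompt: next(replies), {"columns": ["month", "total"]})
```

Whatever the model says, the caller only ever receives one of three strings, so the "what-to-do-next" step can be an ordinary branch.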

BrandiATMuhkuh · a year ago

Do you have some advice on how to build the structure on how to move from one prompt to the next?

Are you using a separate state manager + function calling so the LLM knows where it is?

beernet · a year ago

Am I the only one who finds these types of comments arrogant? I mean, we get it, you know better and have been doing this for a long time and so forth...Sometimes I feel like it's just about relativizing whatever tech is popular right now. Just to come back two years later and say "oh well I've been telling people about this cool tech two years ago!"

JTyQZSnP3cQGa8B · a year ago

Give a counter example then. I’ve been doing this for years: people want the hot new thing even if it’s the worst idea, you rebrand it, and everyone is happy. Then a few months later, people praise you for not having implemented that bad idea.

th0ma5 · a year ago

100% agree. I'm not sure what they're trying to convey even.

SebaSeba · a year ago

Sounds awesome. :D For real, the anecdote is hilarious and I find it easy to believe but also sounds cool what you are working on.

isoprophlex · a year ago

Well, you work in the field for a while, and you accumulate anecdotes of colleagues dropping tactical sleep(5000)'s so they can shave some milliseconds off the latency each week and keep the boss happy.

I love those stories but I could never do that with a straight face. However, the AI field is such an uphill battle against all the crap that LinkedIn influencers are pushing into the minds of the C-suite... I feel it's okay to get a bit creative to get a win-win here ;)

simonw · a year ago

Love that. Reminds me of a time I was asked to build a "machine learning algorithm" driven recommendation system... and eventually I realized that delivering a recommendation system based on one big BM25 search query was fine, and the people asking for it to use "machine learning" didn't actually understand or care about the difference.
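
For anyone who has not met BM25: a minimal sketch of the standard Okapi BM25 scoring in plain Python, used as a "recommender" by treating what the user just looked at as the query. The whitespace tokenizer, the `k1`/`b` defaults and the sample catalogue are assumptions for illustration, not what was actually shipped.

```python
import math
from collections import Counter

def bm25_rank(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[int]:
    """Return document indices ordered from best to worst BM25 score."""
    if not docs:
        return []
    tokenized = [d.lower().split() for d in docs]  # naive whitespace tokenizer
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1 and is normalised by length via b.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)

catalogue = ["red running shoes", "blue running jacket", "red wine glass"]
ranking = bm25_rank("red shoes", catalogue)
```

In practice a search engine does this for you; the point is only that there is no training step anywhere.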

isoprophlex · a year ago

Haha yes, the LLM era is "data science is the hottest new job" all over again.

I guess everything with an algorithm in it is AI if you look at it from enough of a distance...

arresin · a year ago

It's nice to combine the two but the ranking takes tuning.

philipodonnell · a year ago

I’ve been doing a lot of work on semantic data architecture that better supports LLM analytics, did you use any framework or methodology to decide how exactly to present the data/metadata to the LLM context to allow it to make decisions?

isoprophlex · a year ago

A pre-processing phase does a lot of heavy lifting, where we stuff the table and column comments, additional metadata, and some hand-tuned heuristics into a graph-like structure. Basically, we use LLMs themselves to preprocess the schema metadata.

Everything is very boring tech-wise, using vanilla postgres/pgvector and a few hundred lines of python. Every RAG-searchable text field (mostly column descriptions and a list of LLM-generated example queries) is linked to nodes holding metadata, at most 2 hops out. The tool is available to 10,000 users, but load is only a few queries per minute at peak... so performance-wise it's fine.
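
Roughly, the "at most 2 hops out" part can be sketched as a bounded breadth-first walk from the text fields that the vector search returned. This is a simplified in-memory version with made-up node names; the real thing lives in postgres, and the search step itself is assumed to have happened already.

```python
from collections import deque

def collect_context(graph: dict[str, list[str]], hits: list[str], max_hops: int = 2) -> list[str]:
    """Gather the matched nodes plus any metadata nodes within `max_hops` edges."""
    order = list(dict.fromkeys(hits))  # de-duplicate while keeping search order
    seen = set(order)
    queue = deque((node, 0) for node in order)
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append((neighbour, depth + 1))
    return order

# Hypothetical schema graph: description -> column -> table -> schema.
schema_graph = {
    "desc:orders.total": ["col:orders.total"],
    "col:orders.total": ["table:orders"],
    "table:orders": ["schema:sales"],
}
context_nodes = collect_context(schema_graph, ["desc:orders.total"])
```

Here the schema node sits three hops away, so it is left out of the context that goes into the prompt.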

moltar · a year ago

Is the tool public? We are looking for a solid text to sql tool that works with Athena.

isoprophlex · a year ago

Sadly, no, it's a walled-off customer facing tool integrated into one of my client's B2B business intelligence portals.

Hope you can find a tool; the big data players are of course jumping on this (snowflake, databricks, they all talk about their text-to-sql tools).

If you have the budget and want something bespoke built that has some magic sauce tuned to your exact problem field, send me an email!

furyofantares · a year ago

If you are looking for LLM agents that go off and do a bunch of work on their own, you will be supremely underwhelmed. Anyone who went straight to building agents without a human in some large loop found that they were trying to make the LLM do things it was extremely bad at.

The right approach to build toward agents is to start with something that gives pretty good responses to prompts and build up an agentic mode to let it do more and more in response to each prompt. It should be thought of as extending how much you get per prompt, and doing so by chaining together components you've already worked at making it good at.

Cursor (the LLM powered VS Code fork) has an agentic mode and they are doing this the right way. The normal chat window is good at producing changes to your code, and at applying them, at looking at lints, at suggesting terminal commands, at doing directory listings or RAG on your codebase. Agentic mode is tying those together to do more of the work you want with fewer prompts from you.

bluejay2387 · a year ago

As a side note, while I know of several language model based systems that have been deployed in companies, some companies don't want to talk about it:

1. It's still perceived as an issue of competitive advantage

2. There is a serious concern about backlash. The public's response to finding out that companies have used AI has often not been good (or even reasonable) -- particularly if there was worker replacement related to it.

It's a bit more complicated with "agents" as there are 4 or 5 competing definitions for what that actually means. No one is really sure what an 'agentic' system is right now.

ilaksh · a year ago

There is a very simple and obvious definition: it's agentic if it uses tool calls to accomplish a task.

This is the only one that makes sense. People want to conflate it with their random vague conceptions of AGI or ASI or make some kind of vague requirement for a certain level of autonomy, but that doesn't make sense.

An agent is an agent and an autonomous agent is an autonomous agent, but a fully autonomous agent is a fully autonomous agent. An AGI is an AGI but an ASI is an ASI.

Somehow using words and qualifiers to mean different specific things is controversial.

The only thing I will say to complicate it though is if you have a workflow and none of the steps give the system an option to select from more than one tool call, then I would suggest that should be called an LLM workflow and not an agent. Because you removed the agency by not giving it more than one option of action to select from.
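
Spelled out as code, the rule I'm proposing could look like this. It is only a sketch of my own definition; the input shape (one list of offered tool names per LLM step) is invented for the example.

```python
def classify(steps: list[list[str]]) -> str:
    """Label a pipeline from the tools offered to the model at each step.

    `steps` is a hypothetical description: one list of tool names per LLM step.
    """
    if not any(steps):
        return "no tool use"  # no step offers any tool at all
    if any(len(set(tools)) > 1 for tools in steps):
        return "agent"  # at least one step lets the model choose between tools
    return "LLM workflow"  # tools are used, but the model never gets a choice

fixed_pipeline = classify([["run_sql"], ["make_chart"]])
with_choice = classify([["run_sql", "search_docs"], ["make_chart"]])
```

The first pipeline comes out as an LLM workflow and the second as an agent, purely because one step offers a choice.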

authorfly · a year ago

Agentic AI comes out of historical AI, systems computing and further back biological/philosophical discussion. It's not about tool use although ironically, animal tool use is a fascinating subject not yet corrupted by the hype around intelligence.

I implore you to look into that to see how some people relate it to autonomy or AGI or ASI (wrongly, imo - I think shoehorning OOP and UML diagrams plus limited database-like memory/context is not a path to AGI. Clever use of final layers, embeddings and how you store/weight them (and even more interesting combinations) may yield interesting results because we can (buzzword warning) transcend written decoding paradigms - the human brain clearly does not rely on language).

However what gets marketed today is, as you say, not capable of any real agent autonomy like in academia - they are just self-recursive ChatGPT prompts with additional constraining limits. One day it might be more, but libraries now are all doing that from my eye. And recursion has pros but emphasizes the unreliability con/negative of LLMs.

maeil · a year ago

That's not the definition I see most people using. Plenty of tool calling going on purely to structure an output, which could also be achieved by prompting.

For me, agentic means that at least at some stage, one model is prompting another model.

austinkhale · a year ago

This has been my experience. Lots of companies are implementing LLMs but are not advertising it. There's virtually no upside to being public about it.

mejutoco · a year ago

Investors are throwing money at AI projects. That is one upside.

maeil · a year ago

Very accurate. So much of successful (from company PoV) real-world LLM use is about replacing employees. HN is still far too skeptical about how much this is in fact happening and is likely to accelerate.

th0ma5 · a year ago

This is exactly it except also, the use cases are so constrained as to be hardly using LLMs at all.

bronco21016 · a year ago

With all the agencies and the YouTube demos of n8n and Make.com they should be everywhere.

I look at my workplace and I see places where they might fit in but if the reliability isn’t 99.5% they won’t be trusted and I think that’s a problem.

I made a toy in n8n that collects transactions in YNAB via API and matches them to Amazon orders in GMail. It then uses GPT-4o with vision to categorize the product pictures according to my budget’s categories but I have to add the order link to the transaction memo and add a flag for human review because it’s only 80% or so. It has sped up the workflow for sure but nowhere near good enough to set it and forget it.
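
The matching step is roughly the following, written here as a plain Python sketch and not the actual n8n flow. The dataclasses, the 3-day window and the `categorize` callback (standing in for the vision model call) are assumptions for illustration; no real YNAB or Gmail API calls are shown.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Optional

@dataclass
class Transaction:
    amount_cents: int
    day: date
    memo: str = ""
    category: Optional[str] = None
    needs_review: bool = False

@dataclass
class Order:
    total_cents: int
    day: date
    link: str
    image_url: str

def match_and_categorize(
    txn: Transaction,
    orders: list[Order],
    categorize: Callable[[str], str],  # hypothetical wrapper around the vision model
    window_days: int = 3,
) -> bool:
    """Attach the first order with the same amount and a nearby date to `txn`."""
    for order in orders:
        same_amount = order.total_cents == txn.amount_cents
        close_date = abs(txn.day - order.day) <= timedelta(days=window_days)
        if same_amount and close_date:
            txn.category = categorize(order.image_url)
            txn.memo = order.link    # keep the order link for the human reviewer
            txn.needs_review = True  # always flagged: the model is right ~80% of the time
            return True
    return False

txn = Transaction(amount_cents=2499, day=date(2024, 11, 20))
orders = [
    Order(1999, date(2024, 11, 19), "https://example.com/order/1", "https://example.com/1.jpg"),
    Order(2499, date(2024, 11, 18), "https://example.com/order/2", "https://example.com/2.jpg"),
]
matched = match_and_categorize(txn, orders, categorize=lambda image_url: "Household")
```

The review flag is set unconditionally, which is exactly the part that keeps it from being set-and-forget.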

SebaSeba · a year ago

Interesting! To me an 80% hit rate actually sounds pretty good, and awesome if it actually improves productivity, though understandably not something that could be left to its own devices.

I had no idea about Make.com or n8n, they seem interesting. Thanks for the tip! Will check them out.

arresin · a year ago

also: nodered, windmill.

And non ui flow diagram but essentially the same thing: inngest, hatchet.

moyix · a year ago

We've been using them to find novel vulnerabilities in open source web apps. The past 4 posts here have details:

- Auth bypass/arbitrary file read in Scoold: https://xbow.com/blog/xbow-scoold-vuln/

- SSRF in 2FAuth: https://xbow.com/blog/xbow-2fauth-ssrf/

- Stored XSS in 2FAuth: https://xbow.com/blog/xbow-2fauth-xss/

- Path traversal in Labs.AI EDDI: https://xbow.com/blog/xbow-eddi-path/

Each of those has an associated agent trace so you can go read exactly what the agent did to find and exploit the vulnerability.

simonw · a year ago

If we're going to have a conversation about agents or agentic it is really important we agree on which definition of those terms we are using for the purpose of this conversation.

If you ask two different people in the AI space to define "agent" you almost always get two slightly (or significantly) different definitions!

Here are just some of the definitions I've seen over time: https://news.ycombinator.com/item?id=42216217#42228364

For the purpose of this thread the most cynical definition, "LLMs that do something useful", might actually be the best fit!

austinkhale · a year ago

I know of many, many LLM systems in production, since that's what I've been helping companies build since the start of the year. Mostly it's pretty rote automation work but the cost savings are incredible.

Agentic workflows are a much higher bar and are just barely starting to work. I can't speak to their efficacy, but here are a few of the starter-level agents that I've started seeing some companies adopt:

- https://www.intercom.com/fin

- https://www.rox.com/

- https://devin.ai/

- https://bolt.new/

- https://v0.dev/

remoquete · a year ago

Cost saving as in...? Hopefully not saving through making human employees redundant.

cpursley · a year ago

Before farming technology like tractors, 97% of people worked the fields in some capacity. Now it’s the inverse. Technology frees human potential from drudgery.

burningion · a year ago

The way I look at Agentic systems is that there are Tools an LLM can call out to, and do work with.

Last week Wednesday I participated in Anthropic's Model Context Protocol hackathon, and built a system with my team partner Zia to automatically search and find restaurants for your dietary preferences and group size.

It also automatically downloads social media of the restaurant to get a vibe for the place.

There's a video of it in action here: https://www.youtube.com/watch?v=c6vGrfHFyu8

And a Github repo here: https://github.com/zia-r/gotta-eat