> A2UI lets agents send declarative component descriptions that clients render using their own native widgets. It's like having agents speak a universal UI language.
(emphasis mine)
Sounds like agents are suddenly able to do what developers have failed at for decades: writing platform-independent UIs. Maybe this works for simple use cases, but beyond that I'm skeptical.
This isn't the right way to look at it. It's really server-side rendering where the LLM is doing the markup-language generation instead of a template. The custom UI is usually higher level. Airbnb has been doing this for years: https://medium.com/airbnb-engineering/a-deep-dive-into-airbn...
Nope, it's just a repackaging of the same problem, except in this case the problem is solved with APIs and CLIs rather than jumping through hoops to get the AI to do what humans do.
It's about accomplishing a task, not making a bot accomplish a task using the same tools and embodiment context as a human. There's no upside unless the bot is actually using a humanoid embodiment, and even then, using a CLI and service API is going to be preferable to doing things with a UI in nearly every possible case, except where you want to limit it to human-ish capabilities (as with gaming) or you want to deceive any monitors into thinking that a human is operating.
It's going to be infinitely easier to wrap a JSON get/push wrapper around existing APIs or automation interfaces than to universalize some sort of GUI interaction, because LLMs don't have the realtime memory you need to adapt to all the edge cases on the fly. It's incredibly difficult for humans: hundreds of billions of dollars have been spent trying to make software universally accessible and dumbed down for users, and it still ends up being either stupidly limited or fractally complex in the tail, and no developer can ever account for all the possible ways users interact with a feature in any moderately complex piece of software.
Just use existing automation patterns. If an AI picks up this capability alongside other advances, then awesome, but any sort of middleware is going to be a huge hack that frontier models immediately obsolete as a matter of course.
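For what it's worth, here is a rough sketch of that "JSON wrapper around an existing API, exposed as a tool" route (TypeScript; the endpoint, tool name, and schema are invented for illustration, nothing here is from A2UI):

    // Hypothetical example: exposing an existing REST endpoint as an
    // agent-callable tool instead of teaching the agent to drive a GUI.
    // Endpoint, tool name, and schema are invented for illustration.
    const createTicketTool = {
      name: "create_ticket",
      description: "Create a support ticket in the existing ticketing system",
      parameters: {
        type: "object",
        properties: {
          title: { type: "string" },
          priority: { type: "string", enum: ["low", "normal", "high"] },
        },
        required: ["title"],
      },
    };

    async function createTicket(args: { title: string; priority?: string }) {
      // A plain JSON push to the API the product team already maintains.
      const res = await fetch("https://example.internal/api/tickets", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(args),
      });
      return res.json();
    }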
Sure. HTML is a markup language (it's in the acronym). Markdown is also a markup language. LLMs are super good at Markdown, and just about every chatbot frontend now has a renderer built in.
A2UI is a superset, expanding into more element types. If we're going to have the origin of all our data streams be string-output generators, this seems like an OK way to go.
I've joined an effort inside Google to work in this exact space. What we're doing has no plans to become open source, but other groups are working on things like A2UI and we collaborate with them.
My career before this was nearly 20 years of native platform UI programming, and things like Flutter, React Native, etc. have always really annoyed me. But I've come around this year to accepting that, as long as LLMs on servers are where the applications of the future live, we need a client-OS-agnostic framework like this.
I've thought about how to write a platform-independent UI framework that doesn't care what language you write it in, and every time I find myself reinventing X.org, or at least my gut tells me I'm just reinventing a cross-platform X server implementation.
Well, it is open source, and they expect the community to add more renderers. So if you are a SvelteKit specialist, this could actually be an opportunity.
I see how useful a universal UI language working across platforms would be, but when I look at some examples from this protocol, I have the feeling it will eventually converge to what we already have: HTML. Instead of making all platforms support this new universal markup language, why not make them support HTML, which some already do, and which LLMs are already trained on?
A key challenge with HTML is client-side trust. How do I enable an agent platform (say Gemini, Claude, OpenAI) to render UI from an untrusted third-party agent that's integrated with the platform? This is a common scenario in the enterprise versions of these apps, e.g. I want to use the agent from (insert SaaS vendor) alongside my company's homegrown agents and data.
Most HTML is actually HTML+CSS+JS, and IMO accepting that is a code injection attack waiting to happen. By abstracting to JSON, a client can safely render the UI without this concern.
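As a sketch of what that buys you (a simplified shape, not the actual A2UI message format): the client maps a small vetted set of component types to widgets it constructs itself, so the agent's output is only ever treated as data:

    // Sketch only: a client-side renderer for a tiny vetted component set
    // (simplified shape, not the real A2UI format). Agent output stays data;
    // the client decides what actually executes.
    type Component =
      | { type: "Text"; text: string }
      | { type: "Button"; label: string; action: string };

    function render(c: Component): HTMLElement {
      if (c.type === "Text") {
        const p = document.createElement("p");
        p.textContent = c.text; // textContent, never innerHTML
        return p;
      }
      const b = document.createElement("button");
      b.textContent = c.label;
      // The client owns the handler; the agent only names an action.
      b.addEventListener("click", () => console.log("action:", c.action));
      return b;
    }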
If the JSON protocol in question supports arbitrary behaviors and styles, then you still have an injection problem even over JSON. If it doesn't support them, you don't need to support those in an HTML protocol either, and you can solve the injection problem the way we already do: by sanitizing the HTML to remove all or some (depending on your specific requirements) script tags, event listeners, etc.
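A minimal version of that sanitization, using only browser APIs (a real client would reach for a vetted sanitizer library rather than rolling its own):

    // Minimal sanitization sketch using only browser APIs. A production
    // client would use a vetted sanitizer library rather than rolling its own.
    function sanitize(html: string): string {
      const doc = new DOMParser().parseFromString(html, "text/html");
      // Drop elements that can execute or embed code outright.
      doc.querySelectorAll("script, iframe, object, embed").forEach((el) => el.remove());
      // Strip inline event handlers and javascript: URLs from what remains.
      doc.querySelectorAll("*").forEach((el) => {
        for (const attr of Array.from(el.attributes)) {
          const value = attr.value.trim().toLowerCase();
          if (attr.name.toLowerCase().startsWith("on") || value.startsWith("javascript:")) {
            el.removeAttribute(attr.name);
          }
        }
      });
      return doc.body.innerHTML;
    }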
Perhaps the protocol is then HTML/CSS/JS in a strict sandbox: the component has no access to anything outside its bounds (no network, no DOM/object access, no draw access, etc.).
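Something along these lines, perhaps (a sketch, not anything from the A2UI spec): the component's HTML runs in an iframe whose sandbox attribute keeps it off the parent's origin and whose CSP blocks network access:

    // Sketch: render untrusted component HTML inside a tightly sandboxed iframe.
    // "allow-scripts" without "allow-same-origin" keeps the frame off the
    // parent's origin; the CSP meta tag blocks network requests inside it.
    function renderSandboxed(componentHtml: string): HTMLIFrameElement {
      const frame = document.createElement("iframe");
      frame.sandbox.add("allow-scripts");
      frame.srcdoc = `
        <meta http-equiv="Content-Security-Policy"
              content="default-src 'none'; script-src 'unsafe-inline'; style-src 'unsafe-inline'">
        ${componentHtml}`;
      return frame;
    }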
We’ve had variations of “JSON describes the screen, clients render it” for years. The hard parts weren’t the wire format; they were versioning components, debugging state when something breaks on a specific client, and not painting yourself into a corner with a too-clever layout DSL.
The genuinely interesting bit here is the security boundary: agents can only speak in terms of a vetted component catalog, and the client owns execution. If you get that right, you can swap the agent for a rules engine or a human operator and keep the same protocol. My guess is the spec that wins won’t be the one with the coolest demos, but the one boring enough that a product team can live with it for 5-10 years.
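A sketch of that boundary (field names roughly mirror A2UI's id/component examples; the catalog contents are invented): the client validates every message against its own catalog before rendering, and it doesn't care whether an LLM, a rules engine, or a human produced it:

    // Sketch: the client accepts a UI message only if its component type is
    // in the client's vetted catalog, regardless of who produced the message.
    const CATALOG = new Set(["Text", "Button", "TextField", "Tabs"]);

    interface UiMessage {
      id: string;
      component: Record<string, unknown>; // e.g. { "TextField": { ... } }
    }

    function accept(msg: UiMessage): boolean {
      const types = Object.keys(msg.component);
      return types.length === 1 && CATALOG.has(types[0]);
    }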
I agree that it's annoying to have competing standards, but when dealing with a lot of unknowns it's better to allow divergence and exploration. It's a worse use of time to quibble over the best way to do things when we have no meaningful data yet to justify any decision. Companies need freedom to experiment on the best approach for all these new AI use cases. We'll then learn what is great/terrible in each approach. Over time, we should expect and encourage consolidation around a single set of standards.
> when dealing with a lot of unknowns it's better to allow divergence and exploration
I completely agree, though I'm personally sitting out all of these protocols/frameworks/libraries. In six months' time half of them will have been abandoned, and the other half will have morphed into something very different and incompatible.
For the time being, I just build things from scratch, which–as others have noted¹–is actually not that difficult, gives you understanding of what goes on under the hood, and doesn't tie you to someone else's innovation pace (whether it's higher or lower).
Sounds like a lot of people got paid because of it. That's a win for them. It wasn't their decision; it was the company's decision to take part in the race. Most likely there will be more than one winner anyway.
I'm one of these people. We have to start working on the problem many months before the competition announces that they exist, so we are all just doing parallel evolution here. Everyone agrees that sitting and waiting for a standard means you wouldn't waste energy, but you'd also have no influence.
Like you mentioned, it's a good time to be employed.
Unlike many of those approaches, which concern themselves with delivering human-designed static UI, this seems to be a tool designed to support generative UIs. I personally think that's a non-starter and much prefer the more incremental "let the agent call a tool that renders a specific pre-made UI" approach of MCP UI/Apps, the OpenAI Apps SDK, etc. for now.
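Roughly the shape of that pre-made-UI approach: the tool result points at a UI the developer built and shipped ahead of time, and the host renders it from a vetted bundle. Field names here are illustrative, not the exact MCP UI or Apps SDK schema:

    // Roughly the shape of the "pre-made UI" approach: the tool result refers
    // to a UI the developer built ahead of time, and the host renders it from
    // a vetted bundle. Field names are illustrative, not the exact schema.
    const toolResult = {
      content: [
        {
          type: "resource",
          resource: {
            uri: "ui://booking/flight-picker", // pre-built, versioned UI
            mimeType: "text/html",
            text: "<!-- host renders this from a shipped, reviewed bundle -->",
          },
        },
      ],
    };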
Oh yes. I send this all the time. And I also see the irony.
I can justify A2UI as doing something not otherwise accomplishable in the market today, but you saw how long the blog post was trying to explain that. :shrug:
I wouldn't want this anywhere near production, but for rapid prototyping this seems great. People famously can't articulate what they want until they get to play around with it. This lets you skip right to the part where you realize they want something completely different from what was first described, without having to build the first iteration by hand.
Honestly the point of this is not to help app developers—it's to replace the need for apps altogether.
The vision here is that you can chat with Gemini, and it can generate an app on the fly to solve your problem. For the visualized landscaping app, it could just connect to landscapers via their Google Business Profile.
As an app developer, I'm actually not even against this. The amount of human effort that goes into creating and maintaining thousands of duplicative apps is wasteful.
This sounds like the creators think that even more duplicative apps, where no one knows how they work or what the code even looks like... are a better idea?
How many times are users going to spin GPUs to create the same app?
In an ideal world, people would have implemented UI/UX accessibility in the first place, and a lot of those problems would already be solved.
But one can also hope that having the motivation to get agents running on those things could actually bring a lot of accessibility features to newer apps.
[1] https://a2ui.org/renderers/
Some examples from the documentation:

    {
      "id": "settings-tabs",
      "component": {
        "Tabs": {
          "tabItems": [
            {"title": {"literalString": "General"}, "child": "general-settings"},
            {"title": {"literalString": "Privacy"}, "child": "privacy-settings"},
            {"title": {"literalString": "Advanced"}, "child": "advanced-settings"}
          ]
        }
      }
    }

    {
      "id": "email-input",
      "component": {
        "TextField": {
          "label": {"literalString": "Email Address"},
          "text": {"path": "/user/email"},
          "textFieldType": "shortText"
        }
      }
    }
How many more variants are we going to introduce to solve the same problem? Sounds like a lot of wasted man-hours to me.
¹ https://fly.io/blog/everyone-write-an-agent/
https://www.copilotkit.ai/ag-ui-and-a2ui
Making an agent call a tool to manipulate a UI does feel like normal application development and an event-driven interaction... I get that.
What else drives your preference?
However, I'm happy it's happening because you don't need an LLM to use the protocol.