TS_Posts (u/TS_Posts)

TS_Posts commented on The Agent2Agent Protocol (A2A) developers.googleblog.com... · Posted by u/meetpateltech

cowpig · a year ago

I don't totally understand why we need an additional layer of abstraction over MCP at this point. Why can't an agent just be an MCP server? What is the fundamental difference between an MCP server "tool" and an agent "capability"?

This kind of feels to me like someone at google saw how successful MCP was becoming and said "we need something like that". I feel the same way about OpenAI's Agent SDK.

I think the word "Agent" appearing in any engineering project is a tell that it's driven by marketing rather than engineers' needs.

TS_Posts · a year ago

Hi there (I work on a2a) - reposting from above.

A2A works at a different level than MCP. We are working with partners on very specific customer problems. Customers are building individual agents in different frameworks OR are purchasing agents from multiple vendors. Those agents are isolated and do not share tools, or memory, or context.

For example, most companies have an internal directory and internal private APIs and tools. They can build an agent to help complete internal tasks. However, they also may purchase an "HR Agent" or "Travel Assistant Agent" or "Tax Preparation Agent" or "Facilities Control Agent". These agents aren't sharing their private APIs and data with each other.

It's also difficult to model these agents as structured tools. For example, a "Tax Preparation Agent" may need to evaluate many different options and ask for different documents and information based on an individual user's needs. Modeling this as 100s of tools isn't practical. That's where we see A2A helping. Talk to an agent as an agent.

This lets a user talk to only their company agent and then have that agent work with the HR Agent or Travel Booking Agent to complete complex tasks.
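To make the "talk to an agent as an agent" point concrete, here is a minimal sketch of a multi-turn exchange. Everything in it is hypothetical: `TaxPrepAgent`, `run_company_agent`, the `send` method, and the `input-required` / `completed` reply strings are illustration only and are not taken from the A2A spec or samples. The point is that the remote agent decides what to ask for next, so the caller does not need a separate tool per document type.

```python
from typing import Dict, List


class TaxPrepAgent:
    """Hypothetical stand-in for a purchased vendor agent.

    It is a closed system: it keeps its own state and decides which
    follow-up questions to ask. The caller only exchanges text with it.
    """

    def __init__(self) -> None:
        self._needed: List[str] = ["W-2", "1099-INT"]  # illustrative documents

    def send(self, text: str) -> str:
        # Mark any document mentioned in the incoming text as received.
        for doc in list(self._needed):
            if doc in text:
                self._needed.remove(doc)
        if self._needed:
            return f"input-required: please provide {self._needed[0]}"
        return "completed: return prepared"


def run_company_agent(
    remote: TaxPrepAgent,
    user_documents: Dict[str, str],
    request: str,
    max_turns: int = 10,
) -> List[str]:
    """Hypothetical company agent: forwards the request, then answers the
    remote agent's follow-up questions from the documents it can access."""
    transcript: List[str] = []
    reply = remote.send(request)
    transcript.append(reply)
    turns = 0
    while reply.startswith("input-required") and turns < max_turns:
        wanted = reply.split("please provide ", 1)[1]
        if wanted not in user_documents:
            break  # cannot satisfy the request; hand back to the user
        reply = remote.send(f"{wanted}: {user_documents[wanted]}")
        transcript.append(reply)
        turns += 1
    return transcript


if __name__ == "__main__":
    docs = {"W-2": "wages.pdf", "1099-INT": "interest.pdf"}
    for line in run_company_agent(TaxPrepAgent(), docs, "Please prepare my taxes"):
        print(line)
```

A real deployment would replace the in-process `send` call with a network request to the vendor's agent; that part is left out here on purpose.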

TS_Posts commented on The Agent2Agent Protocol (A2A) developers.googleblog.com... · Posted by u/meetpateltech

programd · a year ago

I largely agree with most of this. My only concern is that the spec is a bit underspecified.

For example I wish they'd specify the date format more tightly - unix timestamp, some specific ISO format, precision. Which is it?

The sessionID is not specified. You can put all sorts of crazy stuff in there, and people will. Not even a finite length is required. Just pick some UUID format already, or specify it has to be an incrementing integer.

Define some field length limits that can be found on the model card - e.g. how long can the description field be before you get an error? Might be relevant to context sizes. If you don't, you're going to have buffer overflow issues everywhere because vibe coders will never think of that.

Authentication methods are specified as "Open API Authentication formats, but can be extended to another protocol supported by both client and server". That's a recipe for a bunch of byzantine Enterprise monstrosities to rear their ugly heads. Just pick one or two and be done with it.

The lesson of past protocols is that if you don't tightly specify things you're going to wind up with a bunch of nasty little incompatibilities and "extensions" which will fragment the ecosystem. Not to mention security issues. I guess on the whole I'm against Postel's Law on this.
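As a rough illustration of the kind of tight specification asked for above, here is a sketch of strict validators for a session ID, a timestamp, and a description field. The choices are assumptions made for the example and are not what the A2A spec says: canonical hyphenated UUIDs, ISO 8601 timestamps with an explicit offset, and a hypothetical `MAX_DESCRIPTION_LENGTH` of 1024 characters. The function names are also made up.

```python
import uuid
from datetime import datetime, timezone

# Hypothetical limit chosen for illustration; not taken from the A2A spec.
MAX_DESCRIPTION_LENGTH = 1024


def validate_session_id(value: str) -> str:
    """Accept only canonical hyphenated UUID strings."""
    parsed = uuid.UUID(value)  # raises ValueError on malformed input
    if str(parsed) != value.lower():
        # uuid.UUID also accepts braces, URN prefixes, and bare hex; reject those.
        raise ValueError("session ID must be a canonical hyphenated UUID")
    return str(parsed)


def validate_timestamp(value: str) -> datetime:
    """Accept only ISO 8601 timestamps that carry an explicit UTC offset."""
    parsed = datetime.fromisoformat(value)  # raises ValueError on malformed input
    if parsed.tzinfo is None:
        raise ValueError("timestamp must include a timezone offset")
    return parsed.astimezone(timezone.utc)


def validate_description(value: str) -> str:
    """Reject descriptions longer than the agreed limit."""
    if len(value) > MAX_DESCRIPTION_LENGTH:
        raise ValueError(
            f"description exceeds {MAX_DESCRIPTION_LENGTH} characters"
        )
    return value


if __name__ == "__main__":
    print(validate_session_id("123e4567-e89b-12d3-a456-426614174000"))
    print(validate_timestamp("2025-04-09T12:00:00+02:00").isoformat())
    print(len(validate_description("Prepares tax returns.")))
```

Rejecting anything outside one narrow format is the opposite of Postel's Law: every implementation fails the same way on the same inputs, which is the property the comment is arguing for.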

TS_Posts · a year ago

(I work on a2a)

Thank you for the feedback! Would you consider writing up an issue on our GitHub with some more specifics? https://github.com/google/a2a

A2A is being developed in the open with the community. You are finding some early details that we are looking into and will be addressing. We have many partners who will be contributing and want this to be a truly open, collaborative endeavor. We acknowledge this is a little different than dropping a polished '1.0' version on GitHub on day 1. But that is intentional :)

TS_Posts commented on The Agent2Agent Protocol (A2A) developers.googleblog.com... · Posted by u/meetpateltech

vessenes · a year ago

OK, I’ve read the website, the spec, and the JavaScript and Python clients and servers. Here’s a quick initial reaction.

1. This is in the “embrace and extend” type area vis-a-vis MCP — if you implemented A2A for a project I don’t think you’d need to implement MCP. That said, if you have an MCP server, you could add a thin layer for A2A compliance.

2. This hits and improves on a bunch of pain points for MCP, with reasonable, relatively lightweight answers: it specs out how in-band and out-of-band data should get passed around, it has a sane (largely token-based) approach to security for function calling, and it has thought about discovery and security with a simple reliance on the DNS security layer, for instance.

3. The full UI demos imagine significantly more capable clients - ones that can at least implement iframes and reconnect to lost streaming connections, among other things. It’s not clear to me that there’s any UI negotiation baked into this right now, and it’s not clear to me what the vision is for non-HTML-capable clients. That said, they publish clients that are text-only in the example repo. It may be an area that isn’t fully fleshed out yet, or there may be a simple answer I didn’t see immediately.

Upshot - if you’re building an MCP server right now, great - you should read the A2A spec for a roadmap on some things you’ll care about at some point, like auth and out-of-band data delivery.

If you’re thinking about building an MCP server, I’m not sure I’d move ahead on vanilla MCP - I think the A2A spec is better specified, and if for some reason A2A doesn’t take off, it will only be because MCP has added support for a couple of these key pain points — it should be relatively easy to migrate.

I think any mid-size or better tool-calling LLM should be able to get A2A capability JSON and figure out what tool to call, btw.

One last thing - I appreciate the GOOG team here for their relatively clear documentation and explanation. The MCP site has always felt a little hard to understand.

Second last thing: notably, no OpenAI or Anthropic support here. Let’s hope we’re not in xkcd 927 land.

Upshot: I’d think of this as a sane superset of MCP and I will probably try it out for a project or two based on the documentation quality. Worst case, writing a shim for an exact MCP capable server is a) probably not a big deal, and b) will probably be on GitHub this week or so.

TS_Posts · a year ago

Hi there - I work on a2a. Thanks for the reaction - lots of good points here. We really do see a2a as different and complementary to MCP. I personally am working on both and see them in very different contexts.

I see MCP as vital when building an agent. An agent is an LLM with data, resources, tools, and services. However, our customers are building or purchasing agents from other providers - e.g. purchasing "HR Agent", "Bank Account Agent", "Photo Editor Agent", etc. All of these agents are closed systems and have access to private data, APIs, etc. There needs to be a way for my agent to work with these other agents when a tool is not enough.

Other comments you have are spot on - the current specification and samples are early. We are working on many more advanced examples and official SDKs and client/servers. We're working with partners, other Google teams, and framework providers to turn this into a stable standard. We're doing it in the open - so there are things that are missing because (a) it's early and (b) we want partners and the community to bring features to the table.

tldr - this is NOT done. We want your feedback and sincerely appreciate it!

TS_Posts commented on The Agent2Agent Protocol (A2A) developers.googleblog.com... · Posted by u/meetpateltech

bryan_w · a year ago

They are thinking about Enterprises. Shirley from accounting isn't going to install an MCP service to pull the receipt photos from Dropbox and upload them to SAP/Concur (expense reimbursement).

TS_Posts · a year ago

Yes, that is our assumption. Reposting from above:

We are working with partners on very specific customer problems. Customers are building individual agents in different frameworks OR are purchasing agents from multiple vendors. Those agents are isolated and do not share tools, or memory, or context.

For example, most companies have an internal directory and internal private APIs and tools. They can build an agent to help complete internal tasks. However, they also may purchase an "HR Agent" or "Travel Assistant Agent" or "Tax Preparation Agent" or "Facilities Control Agent". These agents aren't sharing their private APIs and data with each other.

It's also difficult to model these agents as structured tools. For example, a "Tax Preparation Agent" may need to evaluate many different options and ask for different documents and information based on an individual user's needs. Modeling this as 100s of tools isn't practical. That's where we see A2A helping. Talk to an agent as an agent.

This lets a user talk to only their company agent and then have that agent work with the HR Agent or Travel Booking Agent to complete complex tasks when they cannot be modeled as tools.

TS_Posts commented on The Agent2Agent Protocol (A2A) developers.googleblog.com... · Posted by u/meetpateltech

a_wild_dandan · a year ago

I suppose Google wants us to pretend that "agents" can't be "resources." MCP is already well established (Anthropic, OpenAI, Cursor, etc), so Google plastering their announcement with A2A endorsements just reeks of insecurity.

I figure this A2A idea will wind up in the infamous Google graveyard within 8 months.

TS_Posts · a year ago

Hi there (I work on a2a) - reposting from above.

We are working with partners on very specific customer problems. Customers are building individual agents in different frameworks OR are purchasing agents from multiple vendors. Those agents are isolated and do not share tools, or memory, or context.

For example, most companies have an internal directory and internal private APIs and tools. They can build an agent to help complete internal tasks. However, they also may purchase an "HR Agent" or "Travel Assistant Agent" or "Tax Preparation Agent" or "Facilities Control Agent". These agents aren't sharing their private APIs and data with each other.

It's also difficult to model these agents as structured tools. For example, a "Tax Preparation Agent" may need to evaluate many different options and ask for different documents and information based on an individual user's needs. Modeling this as 100s of tools isn't practical. That's where we see A2A helping. Talk to an agent as an agent.

This lets a user talk to only their company agent and then have that agent work with the HR Agent or Travel Booking Agent to complete complex tasks when they cannot be modeled as tools.

TS_Posts commented on The Agent2Agent Protocol (A2A) developers.googleblog.com... · Posted by u/meetpateltech

simonw · a year ago

OK, I have to ask: isn't this agents to agents idea kind of Science Fiction?

I absolutely get the value of LLMs calling tools and APIs. I still don't see much value in LLMs calling other LLMs.

Everyone gets really excited about it - "langchain" named their whole company after the idea of chaining LLMs together - but aside from a few niche applications (Deep Research style tools presumably fire off a bunch of sub-prompts to summarize content they are crawling, Claude Code uses multiple prompt executions to edit files) is it really THAT useful? Worth building an entire new protocol with a flashy name and a bunch of marketing launch partners?

LLMs are unreliable enough already without compounding their unreliability by chaining them together!

TS_Posts · a year ago

Hi there (I work on a2a) - reposting from above.

We are working with partners on very specific customer problems. Customers are building individual agents in different frameworks OR are purchasing agents from multiple vendors. Those agents are isolated and do not share tools, or memory, or context.

For example, most companies have an internal directory and internal private APIs and tools. They can build an agent to help complete internal tasks. However, they also may purchase an "HR Agent" or "Travel Assistant Agent" or "Tax Preparation Agent" or "Facilities Control Agent". These agents aren't sharing their private APIs and data with each other.

It's also difficult to model these agents as structured tools. For example, a "Tax Preparation Agent" may need to evaluate many different options and ask for different documents and information based on an individual user's needs. Modeling this as 100s of tools isn't practical. That's where we see A2A helping. Talk to an agent as an agent.

This lets a user talk to only their company agent and then have that agent work with the HR Agent or Travel Booking Agent to complete complex tasks when they cannot be modeled as tools.

TS_Posts commented on The Agent2Agent Protocol (A2A) developers.googleblog.com... · Posted by u/meetpateltech

phillipcarter · a year ago

A key difference between MCP and A2A that is apparent to me after building with MCP and now reading the material on A2A:

MCP is solving specific problems people have in practice today. LLMs need access to data that they weren't trained on, but that's really hard because there are a million different ways you could RAG something. So MCP defines a standard by which LLMs can call APIs through clients (and more).

A2A solves a marketing problem that Google is chasing with technology partners.

I think I can safely say which one will still be around in 6 months, and it's not the one whose contributors all work for the same company.

TS_Posts · a year ago

Hi there (I work on a2a) - A2A works at a different level than MCP. We are working with partners on very specific customer problems. Customers are building individual agents in different frameworks OR are purchasing agents from multiple vendors. Those agents are isolated and do not share tools, or memory, or context.

For example, most companies have an internal directory and internal private APIs and tools. They can build an agent to help complete internal tasks. However, they also may purchase an "HR Agent" or "Travel Assistant Agent" or "Tax Preparation Agent" or "Facilities Control Agent". These agents aren't sharing their private APIs and data with each other.

It's also difficult to model these agents as structured tools. For example, a "Tax Preparation Agent" may need to evaluate many different options and ask for different documents and information based on an individual user's needs. Modeling this as 100s of tools isn't practical. That's where we see A2A helping. Talk to an agent as an agent.

This lets a user talk to only their company agent and then have that agent work with the HR Agent or Travel Booking Agent to complete complex tasks.

TS_Posts commented on The Agent2Agent Protocol (A2A) developers.googleblog.com... · Posted by u/meetpateltech

jacobs123 · a year ago

It's shown in the link below. It's kind of crazy that they have this huge corporate announcement with 50 logos for something that under the hood seems sort of arbitrary and very fragile, and is probably very sensitive to things like exact word choice and punctuation. There will be effects like bots that say "please" and "thank you" to each other getting measurably better results.

https://google.github.io/A2A/#/documentation?id=multi-turn-c...

TS_Posts · a year ago

Hi there (I work on a2a) - can you explain the concern a bit more? We'd be happy to look.

A2A is a conduit for agents to speak in their native modalities. From the receiving agent implementation point of view, there shouldn't be a difference in "speaking" to a user/human-in-the-loop and another agent. I'm not aware of anything in the protocol that is sensitive to the content. A2A has 'Messages' and 'Artifacts' to distinguish between generated content and everything else (context, thoughts, user instructions, etc) and should be robust to formatting challenges (since it relies on the underlying agent).
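The Messages versus Artifacts split described above could be modeled roughly like this. This is only a sketch of the idea: the class names mirror the terms in the comment, but the fields (`role`, `text`, `name`, `content`) and the `TaskLog` container are hypothetical and are not the schema from the A2A spec.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Message:
    """Everything that is not generated output: context, thoughts, instructions."""
    role: str  # hypothetical field, e.g. "user" or "agent"
    text: str


@dataclass
class Artifact:
    """Generated content that the task produces as its result."""
    name: str
    content: str


@dataclass
class TaskLog:
    """Hypothetical container that keeps both kinds of items in arrival order."""
    items: List[Union[Message, Artifact]] = field(default_factory=list)

    def add(self, item: Union[Message, Artifact]) -> None:
        self.items.append(item)

    def messages(self) -> List[Message]:
        return [i for i in self.items if isinstance(i, Message)]

    def artifacts(self) -> List[Artifact]:
        return [i for i in self.items if isinstance(i, Artifact)]


if __name__ == "__main__":
    log = TaskLog()
    log.add(Message(role="user", text="Book a flight to New York"))
    log.add(Message(role="agent", text="Searching fares"))
    log.add(Artifact(name="itinerary", content="SFO to JFK, one stop"))
    print(len(log.messages()), len(log.artifacts()))
```

Because the two kinds are separate types, a receiving client can pick out results without parsing the free text of the conversation, which is the robustness to formatting that the comment refers to.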

TS_Posts commented on The Agent2Agent Protocol (A2A) developers.googleblog.com... · Posted by u/meetpateltech

zellyn · a year ago

It’s frustratingly difficult to see what these (A2A and MCP) protocols actually look like. All I want is a simple example conversation that includes the actual LLM outputs used to trigger a call and the JSON that goes over the wire… maybe I’ll take some time and make a cheat-sheet.

I have to say, the endorsements at the end somehow made this seem worse…

TS_Posts · a year ago

Hi there! If you load the CLI demo in the github repo (https://github.com/google/A2A/tree/main/samples/python/hosts...) you can see what the A2A servers are returning. Take a look!