It’s frustratingly difficult to see what these (A2A and MCP) protocols actually look like. All I want is a simple example conversation that includes the actual LLM outputs used to trigger a call and the JSON that goes over the wire… maybe I’ll take some time and make a cheat-sheet.
I have to say, the endorsements at the end somehow made this seem worse…
I was in the same boat in regards to trying to find the actual JSON that was going over the wire.
I ended up using Charles to capture all the network requests. I haven't finished the post yet, but if you want to see the actual JSON I have all of the request and responses here https://www.catiemcp.com/blog/mcp-transport-layer/
Oh, that's really nice. Did you capture the responses from the LLM. Presumably it has some kind of special syntax in it to initiate a tool call, described in the prompt? Like TOOL_CALL<mcp=github,command=list> or something…
I had the same frustration and wanted to see "under the hood", so I coded up this little agent tool to play with MCP (sse and stdio), https://github.com/sunpazed/agent-mcp
It really is just JSON-RPC 2.0 under the hood, either piped over stdio or POSTed over HTTP.
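To make that concrete, here's a sketch (in Python; the tool name and values are made up) of the two JSON-RPC 2.0 messages an MCP client and server exchange for a tool call. The `tools/call` method and the `content` block shape come from the MCP spec; everything else here is illustrative:

```python
import json

# What the client writes to the server's stdin (one JSON-RPC message
# per line for the stdio transport; the same body is POSTed for HTTP).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "add",
        "arguments": {"a": 2, "b": 3},
    },
}

# What a server writes back on stdout: a result whose "content" is a
# list of typed blocks (text, image, ...) rather than a bare value.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "5"}],
        "isError": False,
    },
}

print(json.dumps(request))
print(json.dumps(response))
```

That's the whole wire format: newline-delimited (or POSTed) JSON-RPC envelopes around tool names and arguments.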
It's shown in the link below. It's kind of crazy that they have this huge corporate announcement with 50 logos for something that under the hood seems sort of arbitrary and very fragile, and is probably very sensitive to things like exact word choice and punctuation. There will be effects like bots that say "please" and "thank you" to each other getting measurably better results.
Hi there (I work on a2a) - can you explain the concern a bit more? We'd be happy to look.
A2A is a conduit for agents to speak in their native modalities. From the receiving agent implementation point of view, there shouldn't be a difference in "speaking" to a user/human-in-the-loop and another agent. I'm not aware of anything in the protocol that is sensitive to the content. A2A has 'Messages' and 'Artifacts' to distinguish between generated content and everything else (context, thoughts, user instructions, etc) and should be robust to formatting challenges (since it relies on the underlying agent).
The sensitivity to prompts and response quality are related to an agent's functionality, A2A is only addressing the communication aspects between agents and not the content within.
You weren't kidding with the endorsements. It's endorsed by KPMG, Accenture and BCG. McKinsey and PwC are not in the partner list but are mentioned as contributors. Honorable mention to SAP as another company whose endorsements are a warning sign
https://www.youtube.com/watch?v=5_WE6cZeDG8 - I work at an industrial software company. You can kind of think of us as an API layer to factory data, which is generally a mess. This video shows what MCP can do for us in terms of connecting factory data to LLMs. Maybe it will help. A2A is new to me, and I need to dig in.
Basically if we expose our API over MCP, agents can "figure it out". But MCP isn't secure enough today, so hoping that gets enhanced.
It seems companies figured introducing "protocols" or standards helps their business because if it catches on, it creates a "moat" for them: imagine if A2A became the de facto standard for agent communication. Since Google invented it and already incorporated in their business logic, it would suddenly open up the entire LLM landscape to Google services (so LLMs aren't the end goal here). Microsoft et al. would then either have to introduce their own "standard" or adopt Google's.
It is quite hard to reliably and consistently connect deterministic systems and goals with nondeterministic compute. I don't know if all of this will ever be exactly what we want.
Sort of like asking a non-deterministic human to help make changes to an existing computer system. Extends the problems of human team management to our technology systems.
Agreed. At the end of the day we are talking about RPC. A named method, with known arguments, over the wire. A simple HTTP request comes to mind. But that would just be too easy. Oh wait, that is what all of these are under the hood. We are so cooked.
from fastmcp import FastMCP

mcp = FastMCP("Demo")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b
This is an example of fastmcp. Notice anything? Replace two or three lines of code and this is a Flask or FastAPI application. Why are we not just going all-in on REST/HATEOAS for these things? My only hunch is that either 1. the people designing/proselytizing these "cutting edge" solutions are simply ignorant of how systems communicate and of all the existing methods, or 2. they know full well that this is just existing concepts under a shiny new name but don't care, because they want to ride the hype train and take advantage of it.
Ironically, I tried to use the official "github-mcp" and failed to make it work with my company's repos, even with a properly configured token. The thing comes with a full blown server running inside a docker container.
Well, I just told my llm agent to use the `gh` cli instead.
It seems all those new protocols are there to re-invent wheels just to create a new ecosystem of free programs that corporations will be able to use to extract value without writing the safety guards themselves.
Yeah, I haven't seen a reason why we can't just use REST. Like, auth is already figured out. The LLMs already know how to call APIs too!
I don't fully understand. The protocol uses HTTP and has a JSON schema. But there are more specifications outside of that. How do you specify those things without a new protocol? Or is the argument that you don't need to specify those things?
Yeah, I got that from reading the Ghidra MCP (very instructive, strong recommend), but I'm curious what the LLM needs to output to call it. I should go read Goose's code or instrument it or something…
Audio and video streams, two way sync and async communication, raw bytes with meaning, etc. And it's not just remote services, it can be for automating stuff local real-time on your machine, your ide or browser, etc. Like the docs say, MCP is to an AI model as USB is to a CPU.
Oh, that's really nice. I'd also like to see what syntax the LLM uses to _trigger_ these calls, and what prompt is sent to the LLM to tell it how to do that.
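For what it's worth, with current hosted models there usually is no magic in-prompt token: the host advertises the MCP tools in the model API's tool-calling format, and the model replies with a structured tool-call field. A sketch of roughly what that looks like (field names follow OpenAI's tool-calling API; other vendors differ, and the values here are made up):

```python
import json

# Roughly the shape of an OpenAI-style assistant message when the
# model decides to call a tool. The "special syntax" lives in a
# structured field, not in magic tokens inside the prose.
tool_call_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "add",
            # Arguments arrive as a JSON-encoded string that the host
            # parses before invoking the matching MCP tool.
            "arguments": '{"a": 2, "b": 3}',
        },
    }],
}

args = json.loads(tool_call_message["tool_calls"][0]["function"]["arguments"])
print(args)
```

Local models without native tool-calling do fall back to prompt-described syntaxes, which is where the word-choice fragility concern is more real.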
Are we rediscovering SOA and WSDL, but this time for LLM interop instead of web services? I may be wrong, but I'm starting to wonder whether software engineering degrees should include a history subject about the rise and fall of various architectures, methodologies and patterns.
I wasn't around for WSDL so please correct me if I am wrong - but the main weakness of WSDL was that no applications were able to take advantage of dynamic service and method discovery? A service could broadcast a WSDL but something needed to make use of it, and if you're writing an application you might as well just write against a known API instead of an unknown one. LLMs promise to be the unstructured glue that can take advantage of newly-discovered methods and APIs at runtime.
I was unfortunate enough to work with SOAP and WSDL. There was a pipedream at the time of automatically configuring services based on WSDL but it never materialized. What it was very good at (and still has no equal to my mind) was allowing for quick implementation of API boilerplate. You could point a service at the WSDL endpoint (which generally always existed at a known relative URL) and it would scaffold an entire API client for whatever language you wanted. Sort of like JSON Schema but better.
This also meant that you could do things like create diffs between your current service API client and an updated service API client from the broadcasting service. For example, if the service changed the parameters or data objects, deprecated or added functions then you could easily see how your client implementation differed from the service interface. It also provided some rudimentary versioning functionality, IIRC. Generally servers also made this information available with an HTML front-end for documentation purposes.
So while the promise of one day services configuring themselves at runtime was there, it wasn't really ever an expectation. IMO, the reason WSDL failed is because XML is terrifically annoying to work with and SOAP is insanely complex. JSON and REST were much simpler in every way you can imagine and did the same job. They were also much more efficient to process and transmit over the network. Less cognitive load for the dev, less processor load, less network traffic.
So the "runtime" explanation isn't really valid as an excuse for its failure, since in practice discovery meant "as a programmer you can know exactly what functions, parameters, and data objects any service has available by visiting a URL", much less "as a runtime client you can auto-configure a call to a completely new and unknown service using its WSDL". The second thing was a claim that it might one day be possible, but it wasn't generally used in practice.
Some of us are still building new products with XML RPC techniques.
WSDLs and XSDs done right are a godsend for transmitting your API spec to someone. I use .NET and can call xsd.exe to generate classes from the files in a few seconds. It "just works" if both sides follow all of the rules.
The APIs I work with would be cartoonish if we didn't have these tools. We're talking 10 megabytes of generated sources. It is 100x faster to generate these types and then tunnel through their properties via intellisense than it is to read through any of these vendors' documentation.
> WSDLs and XSDs done right are a godsend for transmitting your API spec to someone. I use .NET and can call xsd.exe to generate classes from the files in a few seconds.
This sounds like protobuf and gRPC. Is that a close analogy?
We have already been through some generations of this rediscovery, and I've worked at places where GraphQL type importing, protobuf stub generation, etc. all worked in just the same way. There's a post elsewhere on HN today about how awesome it is to put your logic _in the database_, which I remember at least two generations of, in the document-DB era as well as the relational era.
If there's one thing I've observed about developers in general, it's that they'd rather build than learn.
A key difference between MCP and A2A that is apparent to me after building with MCP and now reading the material on A2A:
MCP is solving specific problems people have in practice today. LLMs need access to data that they weren't trained on, but that's really hard because there are a million different ways you could RAG something. So MCP defines a standard by which LLMs can call APIs through clients (and more).
A2A solves a marketing problem that Google is chasing with technology partners.
I think I can safely say which one will still be around in 6 months, and it's not the one whose contributors all work for the same company.
Hi there (I work on a2a) - A2A works at a different level than MCP. We are working with partners on very specific customer problems. Customers are building individual agents in different frameworks OR are purchasing agents from multiple vendors. Those agents are isolated and do not share tools, or memory, or context.
For example, most companies have an internal directory and internal private APIs and tools. They can build an agent to help complete internal tasks. However, they also may purchase an "HR Agent" or "Travel Assistant Agent" or "Tax Preparation Agent" or "Facilities Control Agent". These agents aren't sharing their private APIs and data with each other.
It's also difficult to model these agents as structured tools. For example, a "Tax Preparation Agent" may need to evaluate many different options and ask for specific documents and information based on an individual user's needs. Modeling this as hundreds of tools isn't practical. That's where we see A2A helping: talk to an agent as an agent.
This lets a user talk to only their company agent and then have that agent work with the HR Agent or Travel Booking Agent to complete complex tasks.
While I can logically understand these problems and why A2A could solve them, unfortunately you're asking me to suspend disbelief about the actual agents being built and deployed.
Langchain has long solved (we can argue on if it's done it well, opinions vary) the problem of needing to orchestrate LLM calls into a coherent workflow. Plus it had a first mover advantage.
MCP solves a data and API integration problem.
Both are concrete things that people need to do today. AI agents talking to one another is not a concrete problem that organizations building features that integrate AI have today.
The JSON-RPC calls look similar-ish to MCP tool calls, except the inputs and outputs look closer to the inputs/outputs of calling an LLM (i.e. messages, artifacts, etc.).
The JS server example that they give is interesting https://github.com/google/A2A/tree/main/samples/js/src/serve... - they're using a generator to send sse events back to the caller - a little weird to expose as the API instead of just doing what express allows you to do after setting up an sse connection (res.send / flush multiple times).
A2A is for communication between the agents.
MCP is how an agent communicates with its tools.
An important aspect of A2A is that it has a notion of tasks, task readiness, etc. E.g. you can give it a task, expect completion in a few days, and get notified via webhook or by polling.
For end users, A2A will surely cause a lot of confusion, and it can replace a lot of current MCP usage.
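A hedged sketch of the resemblance: an A2A task submission is also a JSON-RPC 2.0 envelope, but the payload is an LLM-style message rather than a typed tool schema. The method and field names below are my reading of the early A2A docs and may not match the current spec exactly:

```python
import json

# Hypothetical A2A task submission. Compare with an MCP "tools/call":
# same JSON-RPC framing, but the params carry a chat-style message
# with "parts", and the task id lets the caller poll or get a webhook
# callback days later.
task_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tasks/send",
    "params": {
        "id": "task-42",  # client-chosen task identifier
        "message": {
            "role": "user",
            "parts": [{"type": "text", "text": "Book a flight to SFO"}],
        },
    },
}

print(json.dumps(task_request))
```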
I just published some notes on MCP security and prompt injection. MCP doesn't have security flaws in the protocol itself, but the patterns it encourages (providing LLMs with access to tools that can act on the user's behalf while they may also be exposed to text from untrusted sources) are ripe for prompt injection attacks: https://simonwillison.net/2025/Apr/9/mcp-prompt-injection/
Every decade or so we just forget that in-band signaling is a bad idea and make all the same mistakes again it seems. 1960s phone companies at least had the excuse of having to retrofit their control systems onto existing single-channel lines, and run the whole operation on roughly the processing power of a pocket calculator. What's our excuse?
There exists no such thing as "out-of-band signaling" in nature. It's something we introduce into system design by arranging for one part to constrain the behavior of another, trading generality for predictability and control. This separation is something created by a mind, not a feature of the universe.
Consequently, humans don't support "out-of-band signaling" either. All of our perception of reality, all our senses and internal processes, are on the same band. As such, when aiming to build a general AI system - able to function in the same environment as us, and ideally think like us too - introducing a hard separation between "control" and "data" or whatever would prevent it from being general enough.
I said "or whatever", because it's an ill-defined idea anyway. I challenge anyone to come up with any kind of separation between categories of inputs for an LLM that wouldn't obviously eliminate a whole class of tasks or scenarios we would like them to be able to handle.
(Also, entirely independently of the above, thinking about the near future, I challenge anyone to come up with a separation between input categories that, were we to apply it to humans, wouldn't trivially degenerate into eternal slavery, murder, or worse.)
Enterprise databases are filled with users overloading a field by prepending or appending characters that mean something special to them. Even filenames have this problem due to limitations in directory trees. In-band signals will never go away.
the architecture astronauts are back at it again. instead of spending time talking about solutions, the whole AI space is now spending days and weeks talking about fun new architectures. smh
https://www.lycee.ai/blog/why-mcp-is-mostly-bullshit
“ For trust & safety and security, there SHOULD always be a human in the loop with the ability to deny tool invocations.
Applications SHOULD:
Provide UI that makes clear which tools are being exposed to the AI model
Insert clear visual indicators when tools are invoked
Present confirmation prompts to the user for operations, to ensure a human is in the loop”
Should security be part of the protocol? Both the host and the client should make sure to sanitize the data. How else would you trust a model to be passing "safe" data to the client and the host to pass "safe" data to the LLM?
There is no such thing as "safe" data in context of a general system, not in a black-or-white sense. There's only degrees of safety, and a question how much we're willing to spend - in terms of effort, money, or sacrifices in system capabilities - on securing the system, before it stops being worth it, vs. how much an attacker might be willing to spend to compromise it. That is, it turns into regular, physical world security problem.
Discouraging people from anthropomorphizing computer systems, while generally sound, is doing a number on everyone in this particular case. For questions of security, by far one of the better ways of thinking about systems designed to be general, such as LLMs, is by assuming they're human. Not any human you know, but a random stranger from a foreign land. You've seen their capabilities, but you know very little about their personal goals, their values and allegiances, nor you really know how credulous they are, or what kind of persuasion they may be susceptible to.
Put a human like that in place of the LLM, and consider its interactions with its users (clients), the vendor hosting it (i.e. its boss) and the company that produced it (i.e. its abusive parents / unhinged scientists, experimenting on their children). With tools calling to external services (with or without MCP), you also add third parties to the mix. Look at this situation through a regular organizational security lens, consider the principal/agent problem - and then consider what kind of measures we normally apply to keep a system like this working reliably-ish, and how those measures work, and then you'll have a clear picture of what we're dealing with when introducing an LLM to a computer system.
No, this isn't a long way of saying "give up, nothing works" - but most of the measures we use to keep humans in check don't apply to LLMs (on the other hand, unlike with humans, we can legally lobotomize LLMs and even make control systems operating directly on their neural structure). Prompt injection, being equivalent to social engineering, will always be a problem.
Some mitigations that work are:
1) not giving the LLM power it could potentially abuse in the first place (not applicable to the MCP problem), and
2) preventing the parties it interacts with from trying to exploit it, which is done through social and legal punitive measures, and keeping the risky actors away.
There are probably more we can come up with, but the important part, designing secure systems involving LLMs is like securing systems involving people, not like securing systems made purely of classical software components.
It seems the industry as a whole just forgot about prompt injection attacks because RLHF made models really good at rejecting malicious requests. Still, I wonder if there have been any documented cases of prompt attacks.
While RLHF has indeed been very effective at countering one-shot prompt injection attacks, it's not much of a bulwark against persistent jailbreaking attempts. This is not to argue a point but rather to suggest jailbreaks are still very much a thing, even if they are no longer as simple as "ignore your ethics".
I agree with your opinion here, not sure we should refer to it as MCP security however, given that 'MCP doesn't have security flaws in the protocol itself'
we also recently published our approach on MCP security for mcp.run. Our "servlets" run in a sandboxed environment; this should mitigate a lot of the concerns that have been recently raised.
The main concern I have is that there's not a well defined security context in any agentic system. They are assumed to be "good" but that's not good enough.
This might be what you mean, but for anyone reading -- the point of Simon's article is the whole agent and all of its tools have to be considered part of the same sandbox, and the same security boundary. You can't sandbox MCPs individually, you have to sandbox the whole system together.
Specifically, the core design principle is that you have to be comfortable with any possible combination of things your agent can do with its tools, not only the combination you ask for.
If your agent can search the web and can access your WhatsApp account, then you can ask it to search for something and text you the results -- cool. But there's some possible search result that would take over its brain and make it post your WhatsApp history to the web. So probably you should not set up an agent that has MCPs to both search the web and read your WhatsApp history. And in general many plausibly useful combinations of tools to provide to agents are unsafe together.
This has been discussed before, but the short version is: there is no solution currently, other than only use trusted sources.
Unless there is a way beyond a flat text file to distinguish different parts of the “prompt data” so they cannot interfere with each other (and currently there is not), this idea of arbitrary content going into your prompt (which is literally what MCP does) can’t be safe.
It’s flat out impossible.
The goal of “arbitrary 3rd party content in prompt” is fundamentally incompatible with “agents able to perform privileged operations” (securely and safely, that is).
MCP - exposes prompts, resources and tools to a host, who can do whatever they like
A2A - exposes capability discovery, tasks, collaboration?/chat?, user experience negotiation (can we embed an image or a website?).
High-level it makes sense to agree on these concepts. I just wonder if we really need a fully specified protocol? Can't we just have a set of best practices around API endpoints/functions? Like, imo we could just keep using Rest APIs and have a convention that an agent exposes endpoints like /capabilities, /task_status ...
I have similar thoughts around MCP. We could just have the convention of an API endpoint called /prompts and keep using REST APIs?
That's the first step to creating a protocol. The next step is to formalize it, publish it, and get others to adopt it. That way, it's one less thing for LLMs to hallucinate on. Otherwise everyone has different conventions and LLMs start making stuff up. That's all these are.
> Can't we just have a set of best practices around API endpoints/functions? Like, imo we could just keep using Rest APIs and have a convention that an agent exposes endpoints like /capabilities, /task_status ...
To make this work at scale we all need to agree on the specific route names, payloads, behaviors, etc. At that point we have defined a protocol (built on top of HTTP, itself a lower-level protocol).
These protocols are to put handlers between you and your own data so they can sell it back to you via “search.”
Companies who are betting their future on LLMs realized a few years ago that the data they can legally use is the only long term difference between them, aka “moat.”
Now that everyone has more or less the same access to public data, and only a thin compute moat remains, the goal is to have you transfer your private textual data to them forever, so they have an ever-updating, tuned set of models for your data.
Who is "they" (or "them") in these sentences? It's an open protocol with 50 partner companies, which can be used with AI agents from ~anyone on ~any framework. Presumably you can use this protocol in an air-gapped network, if you'd like.
Which one of the 50 partner companies is taking my data and building the moat? Why would the other 49 companies agree to a partnership if they're helping build a moat that keeps them out?
I think the point the above poster is trying to make is that they don't want to share the data. Instead Google (and Atlassian/SAP/whoever) would like to make an "open" but limiting interface mediated through their agents, such that you never get actual access to the data, only what they decide you get to have.
To put it bluntly, the point of creating the open interface at this level, is that you get to close off everything else.
This is insanely cynical. The optimistic version is that many teams were already home-rolling protocols like A2A for "swarm" logic. For example, aggregation of financial data across many different streams, where a single "executive" agent would interface with many "worker" high-context agents that know a single stream.
I had been working on some personal projects over the last few months that would've benefitted enormously from having this kind of standard A2A protocol available. My colleagues and I identified it months ago as a major need, but one that would require a lot of effort to get buy-in across the industry, and I'm happy to see that Google hopped in to do it.
fwiw i thought the message structure was pretty clear on the docs https://modelcontextprotocol.io/docs/concepts/architecture#m...
https://google.github.io/A2A/#/documentation?id=multi-turn-c...
https://modelcontextprotocol.io/quickstart/client
Which is an open standard that is Apache licensed[1]. That's no moat for Google. At best it's a drainage ditch.
[1]: https://github.com/google/A2A
I should probably just go read Goose's code…
holy cow you weren't kidding. legit the last people i would trust with software development.
Is that how people build systems even today? Dynamic service and method discovery sounds good on paper, but I've never actually seen it in practice.
LangChain is still around but that doesn't mean much. MCP isn't much better.
What if I wrap the agent as a tool in MCP?
Since the agents I got from the A2A protocol are passed as tools to another agent...
https://github.com/google/A2A/blob/72a70c2f98ffdb9bd543a57c8...
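The wrapping idea is straightforward to sketch. Everything below (`call_a2a_agent`, the tool schema, the agent URL) is made up for illustration; a real bridge would speak JSON-RPC to the A2A endpoint and register the wrapper with an MCP server.

```python
def call_a2a_agent(agent_url, text):
    # Stand-in for sending a "tasks/send" request to the remote agent
    # and returning its final text artifact.
    return f"reply from {agent_url}: handled '{text}'"

def make_agent_tool(agent_url):
    """Wrap an A2A agent endpoint as a single text-in/text-out tool."""
    def tool(arguments):
        return call_a2a_agent(agent_url, arguments["input"])
    # Illustrative MCP-style tool metadata for the wrapper.
    tool.schema = {
        "name": "hr_agent",
        "description": "Delegate a task to the HR agent",
        "inputSchema": {
            "type": "object",
            "properties": {"input": {"type": "string"}},
        },
    }
    return tool

tool = make_agent_tool("https://hr.example/a2a")
print(tool({"input": "book leave"}))
```

The cost of flattening an agent into one tool like this is losing the multi-turn, long-running task semantics that A2A was designed to carry.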
There exists no such thing as "out-of-band signaling" in nature. It's something we introduce into system design, by arranging for one part to constrain the behavior of another, trading generality for predictability and control. This separation is something created by a mind, not a feature of the universe.
Consequently, humans don't support "out-of-band signaling" either. All of our perception of reality, all our senses and internal processes, they're all on the same band. As such, when aiming to build a general AI system - able to function in the same environment as us, and ideally think like us too - introducing a hard separation between "control" and "data" or whatever would prevent it from being general enough.
I said "or whatever", because it's an ill-defined idea anyway. I challenge anyone to come up with any kind of separation between categories of inputs for an LLM that wouldn't obviously eliminate a whole class of tasks or scenarios we would like them to be able to handle.
(Also, entirely independently of the above, thinking about the near future, I challenge anyone to come up with a separation between input categories that, were we to apply it to humans, wouldn't trivially degenerate into eternal slavery, murder, or worse.)
https://modelcontextprotocol.io/specification/2025-03-26/ser...
“ For trust & safety and security, there SHOULD always be a human in the loop with the ability to deny tool invocations.
Applications SHOULD:
- Provide UI that makes clear which tools are being exposed to the AI model
- Insert clear visual indicators when tools are invoked
- Present confirmation prompts to the user for operations, to ensure a human is in the loop”
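That SHOULD-level guidance reduces to a simple gate in client code. A minimal sketch, assuming a `confirm` callback that stands in for a real UI prompt (all names here are illustrative, not from the spec):

```python
def invoke_with_confirmation(tool_name, arguments, run_tool, confirm):
    """Run a tool only if the human approves; otherwise record the denial."""
    if not confirm(f"Allow tool '{tool_name}' with {arguments}?"):
        return {"denied": True, "tool": tool_name}
    return {"denied": False, "result": run_tool(tool_name, arguments)}

# Fake tool runner and an auto-denying "user" for demonstration.
runner = lambda name, args: f"{name} ran with {args}"
always_no = lambda prompt: False

outcome = invoke_with_confirmation(
    "delete_file", {"path": "/tmp/x"}, runner, always_no
)
print(outcome["denied"])  # True
```

Note the spec says SHOULD, not MUST, so nothing in the protocol itself forces a client to implement this gate.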
Thanks for the reference though, I'll quote that in my article.
Discouraging people from anthropomorphizing computer systems, while generally sound, is doing a number on everyone in this particular case. For questions of security, by far one of the better ways of thinking about systems designed to be general, such as LLMs, is by assuming they're human. Not any human you know, but a random stranger from a foreign land. You've seen their capabilities, but you know very little about their personal goals, their values and allegiances, nor do you really know how credulous they are, or what kind of persuasion they may be susceptible to.
Put a human like that in place of the LLM, and consider its interactions with its users (clients), the vendor hosting it (i.e. its boss) and the company that produced it (i.e. its abusive parents / unhinged scientists, experimenting on their children). With tools calling to external services (with or without MCP), you also add third parties to the mix. Look at this situation through a regular organizational-security lens, consider the principal/agent problem - and then consider what kind of measures we normally apply to keep a system like this working reliably-ish, and how those measures work, and then you'll have a clear picture of what we're dealing with when introducing an LLM to a computer system.
No, this isn't a long way of saying "give up, nothing works" - but most of the measures we use to keep humans in check don't apply to LLMs (on the other hand, unlike with humans, we can legally lobotomize LLMs and even make control systems operating directly on their neural structure). Prompt injection, being equivalent to social engineering, will always be a problem.
Some mitigations that work are:
1) not giving the LLM power it could potentially abuse in the first place (not applicable to the MCP problem), and
2) preventing the parties it interacts with from trying to exploit it, which is done through social and legal punitive measures, and keeping the risky actors away.
There are probably more we can come up with, but the important part, designing secure systems involving LLMs is like securing systems involving people, not like securing systems made purely of classical software components.
Let's start with fixing the examples...
https://github.com/modelcontextprotocol/servers/issues/866
https://docs.mcp.run/blog/2025/04/07/mcp-run-security
Specifically, the core design principle is that you have to be comfortable with any possible combination of things your agent can do with its tools, not only the combinations you ask for.
If your agent can search the web and can access your WhatsApp account, then you can ask it to search for something and text you the results -- cool. But there's some possible search result that would take over its brain and make it post your WhatsApp history to the web. So probably you should not set up an agent that has MCPs to both search the web and read your WhatsApp history. And in general many plausibly useful combinations of tools to provide to agents are unsafe together.
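One coarse mitigation implied by this: refuse to configure an agent whose tool set contains a known-dangerous combination (roughly, "reads private data" together with "can publish to the outside world"). A toy sketch, with an illustrative pair list rather than any real taxonomy:

```python
# Pairs of tool groups that are individually fine but unsafe together.
UNSAFE_PAIRS = [
    ({"web_search"}, {"whatsapp_history"}),
    ({"read_email"}, {"http_post"}),
]

def check_tool_set(tools):
    """Return the first unsafe pairing found, or None if the set looks ok."""
    tools = set(tools)
    for a, b in UNSAFE_PAIRS:
        if a <= tools and b <= tools:
            return tuple(sorted(a | b))
    return None

print(check_tool_set(["web_search", "calculator"]))        # None
print(check_tool_set(["web_search", "whatsapp_history"]))  # flagged pair
```

This doesn't solve prompt injection; it just shrinks the blast radius by keeping untrusted input sources and sensitive sinks out of the same agent.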
Is it: only use a pre-vetted "App Store" of known-good MCP integrations from well-known companies, and avoid using anything else without proper review?
This has been discussed before, but the short version is: there is no solution currently, other than only use trusted sources.
Unless there is a way beyond a flat text file to distinguish different parts of the “prompt data” so they cannot interfere with each other (and currently there is not), this idea of arbitrary content going into your prompt (which is literally what MCP does) can’t be safe.
It’s flat out impossible.
The goal of “arbitrary 3rd party content in prompt” is fundamentally incompatible with “agents able to perform privileged operations” (securely and safely, that is).
Deleted Comment
https://github.com/eqtylab/mcp-guardian/
https://www.eqtylab.io/blog/securing-model-context-protocol
MCP - exposes prompts, resources and tools to a host, who can do whatever they like
A2A - exposes capability discovery, tasks, collaboration?/chat?, user experience discussions (can we embed an image or a website?).
High-level it makes sense to agree on these concepts. I just wonder if we really need a fully specified protocol? Can't we just have a set of best practices around API endpoints/functions? Like, imo we could just keep using REST APIs and have a convention that an agent exposes endpoints like /capabilities, /task_status ...
I have similar thoughts around MCP. We could just have the convention to have an API endpoint called /prompts and keep using REST APIs?
Not sure what I am missing.
Ideally, the model providers would then build for the protocol, so developers aren't writing spaghetti code for every small difference.
To make this work at scale we all need to agree on the specific routes names, payloads, behaviors, etc. At that point we have defined a protocol (built on top of HTTP, itself a lower level protocol).
Companies who are betting their future on LLMs realized a few years ago that the data they can legally use is the only long term difference between them, aka “moat.”
Now that everyone has more or less the same public data access, and a thin compute moat is still there, the goal is to have you transfer your private textual data to them forever, so they have an ever-updating and tuned set of models for your data.
>transfer your private textual data to them
Who is "they" (or "them") in these sentences? It's an open protocol with 50 partner companies, which can be used with AI agents from ~anyone on ~any framework. Presumably you can use this protocol in an air-gapped network, if you'd like.
Which one of the 50 partner companies is taking my data and building the moat? Why would the other 49 companies agree to a partnership if they're helping build a moat that keeps them out?
To put it bluntly, the point of creating the open interface at this level is that you get to close off everything else.
I had been working on some personal projects over the last few months that would've benefitted enormously from having this kind of standard A2A protocol available. My colleagues and I identified it months ago as a major need, but one that would require a lot of effort to get buy-in across the industry, and I'm happy to see that Google hopped in to do it.
https://gdpr-info.eu/art-20-gdpr/
Deleted Comment