I think I'd be happier with a standard .agents directory and have all of these in there. I imagine each agent is going to need its own tweaks to get it to work just right with their system prompts, just as Claude already has its project-specific .claude directory for hooks and commands and whatnot.
I'd rather .agents/claude or something so we can get these files out of the root directory, which, at least for typescript projects, is already a confetti-like jumble of json files for every little "simple" tool that needs its own configuration file.
I get why package.json isn't enough. But would a .config directory standard really have hurt us so much?
Ideally (and in practice, so far) they shouldn't change that much. The agent instructions should be close to a human explanation, and they are pretty good at parsing instructions anyway. In my experience you can symlink the same file to all paths and they work as expected.
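A minimal sketch of the symlink approach, assuming the shared file is called AGENT.md and lives in the repo root. The list of per-tool filenames is only an illustration; swap in whatever your agents actually look for.

```python
import os
from pathlib import Path

# Hypothetical list of per-tool filenames; adjust to what your agents read.
TOOL_FILES = ["CLAUDE.md", "GEMINI.md", "AGENTS.md"]


def link_agent_files(repo_root, source="AGENT.md", targets=TOOL_FILES):
    """Point each per-tool filename at one shared instructions file."""
    root = Path(repo_root)
    if not (root / source).is_file():
        raise FileNotFoundError(f"{source} not found in {root}")
    created = []
    for name in targets:
        link = root / name
        if link.is_symlink():
            link.unlink()         # replace a stale or broken link
        elif link.exists():
            continue              # never clobber a real, hand-written file
        os.symlink(source, link)  # relative target, so the repo stays movable
        created.append(name)
    return created
```

Calling `link_agent_files(".")` from the repo root returns the names it linked. Real files are skipped rather than overwritten, so a hand-tuned CLAUDE.md survives.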
If I'm writing for a human contributor, I'm gonna have a pretty high bar for the quality of that writing.
An agent on the other hand, one who is in that sweet spot where they're no longer ignorant, and not yet confused... It's nice to have them dump their understanding to agent_primers/subsystem_foo.md for consumption by the next agent that touches that subsystem. I don't usually even read these until I suspect a problem in one. They're just nuggets of context transfer.
Yes! I want an option to always add README.md to the context. It would force me to keep a useful, up-to-date document about how to build, run, and edit my projects.
That sounds nice and I have the same pain, but not sure AGENT.md is the right abstraction either. After all, these models are indeed different and will respond differently even given the same prompting. Not to mention that different wrappers around those models have different capabilities.
e.g. maybe for CURSOR.md you just want to provide context and best practices without any tool-calling context (because you've found it doesn't do a great job of tool-calling), while for CLAUDE.md (for use with Claude Code) you might want to specify tools that are available to it (because it does a great job with tool calling).
Probably best if you have an AGENT.md that applies to all, and then the tools can also ingest their particular flavor in addition, which (if anything is in conflict) would trump the baseline AGENT file.
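As a rough sketch of that layering, assuming no tool does this for you today: read the shared AGENT.md first, then append the tool's own file under a heading that says it takes precedence. The `tool` key and the `<TOOL>.md` naming are assumptions for illustration.

```python
from pathlib import Path


def build_context(repo_root, tool):
    """Return baseline AGENT.md text followed by the tool-specific file, if any.

    `tool` is a hypothetical key such as "claude" or "cursor"; the override
    file is looked up as CLAUDE.md, CURSOR.md, and so on.
    """
    root = Path(repo_root)
    parts = []
    baseline = root / "AGENT.md"
    if baseline.is_file():
        parts.append(baseline.read_text().strip())
    override = root / f"{tool.upper()}.md"
    if override.is_file():
        # Put the override last and say so, since later text tends to win.
        parts.append(
            f"## {tool}-specific instructions (these win on conflict)\n\n"
            + override.read_text().strip()
        )
    return "\n\n".join(parts) + "\n"
```

Whether a model actually honors "these win on conflict" is up to the model; the script only guarantees the ordering.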
I asked claude code for a guidelines file so it would collaborate with windsurf. This is what it proposed:
---
This project uses shared planning documents for collaboration with Claude Code. Please:
1. First read and understand these files:
- PLAN.md - current project roadmap and objectives
- ARCHITECTURE.md - technical decisions and system design
- TODO.md - current tasks and their status
- DECISIONS.md - decision history with rationale
- COLLABORATION.md - handoff notes from other tools
2. Before making any significant changes, check these documents for:
- Existing architectural decisions
- Current sprint priorities
- Tasks already in progress
- Previous context from Claude Code
3. After completing work, update the relevant planning documents with:
- Task completion status
- New decisions made
- Any changes to architecture or approach
- Notes for future collaboration
Always treat these files as the single source of truth for project state.
problem is that claude doesn't actually read those or keep them in context unless you prompt it to. it has to be in CLAUDE.md or it'll quickly forget about the contents
I really like the idea of standardizing on AGENT.md, although it's too bad it doesn't really work with the .cursor/rules/ approach of having several rules files that get included based on matching the descriptions or file globs in frontmatter. Then again, I'm not sure if any other agents support an approach like that, and in my experience Cursor isn't entirely predictable about which rules files it ends up including in the context.
I guess having links to supplementary rules files is an option, but I'm not sure which agents (if any) would work well with that.
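For agents without that feature, glob-based selection is easy enough to approximate yourself. A sketch, assuming each rules file starts with a small header fenced by `---` lines that contains a comma-separated `globs:` entry. This header layout is a simplification for illustration, not a faithful parser for Cursor's format.

```python
from fnmatch import fnmatch
from pathlib import Path


def read_globs(rule_text):
    """Pull a comma-separated `globs:` value from a header fenced by --- lines."""
    lines = rule_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of header, no globs entry found
        key, _, value = line.partition(":")
        if key.strip() == "globs":
            return [g.strip() for g in value.split(",") if g.strip()]
    return []


def rules_for(path, rules_dir):
    """Return names of rule files whose globs match the path being edited."""
    matched = []
    for rule in sorted(Path(rules_dir).glob("*.md")):
        if any(fnmatch(path, g) for g in read_globs(rule.read_text())):
            matched.append(rule.name)
    return matched
```

Note that `fnmatch` lets `*` cross directory separators, so `*.ts` matches `src/app/main.ts`. That is looser than shell globbing, but it makes the selection predictable, which is the part I was missing.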
Yep, that's a peeve of mine. I've resorted to using AGENT.md, and aliasing Claude, Gemini, etc to a command that calls them with an initial instruction to read that file. But of course they will forget after some time.
The whole agentic coding via CLI experience could be much improved by:
- Making it easy to see what command I last issued, without having to scroll up through reams of output hunting for context
- Making it easy to spin up a proper sandbox to run sessions unattended
- Etc.
Maybe for code generation, what we actually need is a code generator that is itself deterministic but uses AI, instead of AI that does code generation.
I've experimented a little with LLM agents (only Claude Code). I definitely don't want the agent to write the commit messages (they should be written by a human as they're for human consumption) so I manually added the co-author trailer. It's morally correct to provide attribution.
Never. It's a marketing strategy. Some percentage of users will check these files into their repos, and some percentage of repo browsers will think "what is this X.md?" Given how much money people are spending on these things the value of having a unique filename must be enormous.
It’s a marketing strategy that works here and now, but “never” is a very long time. What could be seen as pioneers claiming names today could be also seen as retrogressive stubbornness tomorrow and lose its marketing value.
I just wish the AGENTS.md standard wasn't a single file. I have a lot of smaller context documents that aren't applicable to every task, so I like to throw them into a folder (.ai/ or .agents/) and then selectively cat them together or tell the agent to read them.
You could have a python script that generates the MD file on the fly, based on how you want to prompt the model. I think it's kind of funny, how deep we are getting with tools instructing tools instructing tools.
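Something like this is what I have in mind, as a sketch only: one markdown file per topic in a folder, and a function that stitches the chosen topics into a single file. The folder layout and the topic names are assumptions.

```python
from pathlib import Path


def generate_agents_md(docs_dir, topics, out_path="AGENTS.md"):
    """Concatenate the chosen topic files from docs_dir into one markdown file."""
    docs = Path(docs_dir)
    sections = []
    for topic in topics:
        doc = docs / f"{topic}.md"
        if not doc.is_file():
            raise FileNotFoundError(f"no context document for topic {topic!r}")
        # Leave a marker so you can trace each section back to its source.
        sections.append(f"<!-- from {doc.name} -->\n{doc.read_text().strip()}")
    Path(out_path).write_text("\n\n".join(sections) + "\n")
    return out_path
```

For example, `generate_agents_md(".agents", ["build", "db"])` would write only the build and database notes, in that order, before you start a session on a migration.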
FWIW at least with Claude and Jules on a project I have a decent setup where I put all of the real content in an agents.md and then use “@agents.md” in CLAUDE.md. If all of the tools supported these kinds of context references in markdown it wouldn’t be that hard to have a single source of truth for memory files.
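For tools that don't resolve such references, a small preprocessor could inline them before the file is handed over. A sketch, assuming a reference is a line containing only `@` followed by a path relative to the file it appears in; real tools may define the syntax differently.

```python
from pathlib import Path


def expand_refs(path, _seen=None):
    """Inline lines of the form `@relative/path.md` with that file's contents."""
    path = Path(path).resolve()
    seen = _seen or set()
    if path in seen:
        raise ValueError(f"circular reference at {path}")
    seen = seen | {path}
    out = []
    for line in path.read_text().splitlines():
        ref = line.strip()
        is_ref = ref.startswith("@") and len(ref) > 1
        target = path.parent / ref[1:] if is_ref else None
        if target is not None and target.is_file():
            out.append(expand_refs(target, seen))  # nested references work too
        else:
            out.append(line)  # ordinary text, or a reference to a missing file
    return "\n".join(out)
```

References to files that don't exist are left untouched rather than treated as errors, so an `@mention` in prose passes through unchanged.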
Yeah I suspect some of these providers will become Microsoft-in-the-'90s type bully holdouts on implementing the emerging conventions. But ultimately with a CLI interface you have workarounds to make all the major providers read in your system guidelines. In an IDE, though, like MS had with Visual Studio, you have more lock-in potential for your config files.
The deeper problem is the custom commands, hooks and subagents. The time has come when you need to make a strategic choice. Once you have heavily invested in CC, it is not easy to turn to an alternative.
Side remark:
CC is very expensive when using API billing (compared to e.g. GPT-5). Once a company adopts CC and all developers start to adapt to it at full scale, the bill will go through the roof.
For example: I'm mostly using Claude Code, plus Gemini CLI. Gemini CLI is not as powerful as CC, and it won't work with some MCPs. So I have different *.md files.
I went from years of vscode to "Cursor is the future" to never using Cursor at all. Claude Code, even with new limits, is just too good. If I were to switch to gpt-5, why wouldn't I just use Codex? I'm struggling to understand the value of what they're presenting.
I find the Codex CLI to be the worst of the CLI tools I’ve used (including, but not limited to, Claude Code, Gemini, Aider).
There’s something about it that makes it clunky.
Haven’t tried Cursor CLI yet though.
Because iterating multiple sessions through multiple terminals is obviously more efficient and seamless than interacting through a scuffed IDE side panel UI.
What I have found Claude Code is extremely good at is that it makes one change at a time, gives you a chance to read the code it's changing, and lets you give feedback in real time and steer it properly. I find the mental load with this method to be MUCH lower than Cursor or any of the other tools which give you two very different options: "Ask" mode which dumps a ton of suggestions on you and then requires semi-manual implementation, or "Agent" mode which dumps a ton of actual changes on you and requires your inspection and feedback and roll-backs, etc.
This may not work for everyone, but as a solo dev who wants to keep a real mental model of my work (and not let it get polluted with AI slop), the Claude Code approach just works really well for me. It's like having a coding partner who can iterate and change direction as you talk, not a junior dev who dumps a pile of code on your plate without discussion.
In my experience, it is much better at tool-calling, which is huge when we're talking about agentic coding. It also seems to do a better job of keeping things clean and not going off on tangents for anything that isn't accomplished in one shot.
CC just feeds the whole codebase and entire files into the model, no RAG, nothing in the way. It works substantially better because of that, but it's $expensive$.
The value is that, to use Cursor, you no longer need to switch your IDE (unless you were already using vscode). You can keep your preferred IDE and run the agent in the terminal. The IDE is for humans; agents only need a terminal to run in.
I find it really difficult to take this stuff seriously when the proponents of these tools are constantly flitting from tool to tool. There's tools and paradigms I've used for years at a time. Some things like vim have been used for decades.
This all just feels profoundly immature. You tell us one thing then two months later you're on to the next thing. There's no sense of mastery here.
is the multi-file editing as easily understood in the Claude Code cli? or am I missing something? I always felt like Claude Code CLI was OK as a tool yet powered by one of the best AI models. I stuck with Cursor b/c of the multi-file editing, agent mode, etc. I'm very very willing to leave Cursor because they've recently been scummy about their pricing changes.
Sure, you can have your LLM code with any JavaScript framework you want, as long as you don't mind it randomly dropping React code and React-isms in the middle of your app.
To be honest I am staying positive, and hopefully we'll see an explosion of AI agents that will help iron out all the bugs in FOSS hosted on the various source code hosting platforms. Renovate on steroids. I would work on that if my day job wasn't my main and only source of revenue.
Think how much training has been done on such JavaScript frameworks... no one stops to wonder what the outcome would be. The mere fact that when I ask it to create an app, without any further detail about what to use, it defaults to React is, imo, a total failure, whatever the agent.
Holy moly. I did not see that coming, but it makes sense. I’m enjoying the terminal-based coding agents way more than I ever would have expected. I can keep one spinning in the background while I do #dayjob, and as a bonus I feel like a haX0r.
2025 is the year of the terminal, apparently?
For my prototype purposes, it’s great, and Claude Code is the most fun I’ve had with tech in a jillion years.
Fascinating to see how agents are redefining what IDEs are. This was not really the case in the chat AI era. But as autonomy increases, the traditional IDE UI becomes a less important form of interaction.
I think those CLI tools have a pretty good chance to create a new dev tools ecosystem. Creating a full-featured language plugin (let alone a full IDE) for VSCode or IntelliJ is not for the faint-hearted, and cross-IDE portability is limited. CLI tools + MCP can be a lot simpler, more composable and more portable.
IDE UI should shift to focusing on catching agentic problems early and obviously, and providing drop dead simple rollback strategies, parallel survival-of-the-fittest solution generation, etc
My fundamental worry with this technology is that you all are going to seriously fuck up the development experience for those of us who feel the technology at the core of this stuff is not sufficient. Development efforts will focus on this work flow at the expense of good software.
With all the frontier labs competing in this space now, and them letting you use your consumer subscription through the CLI, I don’t understand how the Cursor products will survive. Why pay an extra $X/mo when I can get this functionality included in the $Y/mo I’m already paying OAI/Anthropic/GOOG?
I think the complete opposite. I love the ux for claude code, but it would be better if it wasnt locked to a single vendor's model. It seems pretty clear to me that a vendor neutral product with a UX as good as Claude Code would be the clear winner.
Have you tried opencode? I haven't really, but it can use your Anthropic subscription and also switch to most other models. It also looks quite nice IMO.
If Cursor can build the better UX for all the use-cases, mobile/desktop chatbot, assistant, in IDE coding agent, CLI coding agent, web-based container coding agent, etc.
In theory, they can spend all their resourcing on this, so you could assume they could have those be more polished.
If they win the market share here, then the models are just a commodity, and Cursor lets you pick whichever is best at any given time.
In a sense, "users" are going to get locked in on the tooling. They learn the commands, configuration, and so on of Cursor, it's a higher cost for them to re-learn a different UX. Uninstalling and re-installing another app, plugin, etc. is annoying.
No, model providers are not going to let Cursor eat their pie. The biggest cost in AI is in developing LLM models and inference. Players incurring those costs will basically control this market.
I agree that cursor has to take an aggressive and differentiated approach to succeed, but they have the benefit of pushing each lab into a commodity.
I pay for Cursor and ChatGPT. I can imagine I’d pay for Gemini if I used an android. The chat bots (1) won’t keep the subscription competitive with APIs because the cost and usage models are different and (2) most chat bots today are more of a UX competition than model quality. And the only winners are ChatGPT and whatever integrated options the user has by default (Gemini, MSFT Copilot, etc).
Because you can always use the best model. Yesterday it was Claude Opus 4.1, today it's GPT-5. If you were just paying Anthropic you would be stuck with Claude.
I'm having trouble finding a use for this outside of virtualized unused environments. Why not instead give me a virtual machine that runs this in a confined storage space?
I would _never_ give an LLM access to any disk I own or control if it had anything more than read permissions
Why not? Have you ever actually used these things? The risk is incredibly low. I run claude code with zero permissions every day for hours. Never a problem.
I have (not an exhaustive list) SSH keys and sensitive repositories hanging out on my filesystem. I don't trust _myself_ with that, let alone an LLM, unless I'm running ollama or similar local nonsense with no net connectivity.
I'm a few degrees removed from an air gapped environment so obviously YMMV. Frankly I find the idea of an LLM writing files or being allowed to access databases or similar cases directly distasteful; I have to review the output anyway and I'll decide what goes to the relevant disk locations / gets run.
For example, Gemini CLI [1] can use native sandboxing on macOS. It's just a matter of time before every major coding agent will run inside of an operating system's native sandbox/container/jail/VM.
https://agent.md [redirect -> https://ampcode.com/AGENT.md] https://agent-rules.org
Till then you can also use symlinks
there are issues opened in some repos for this
- Support "AGENT.md" spec + filename · Issue #4970 · google-gemini/gemini-cli
https://github.com/google-gemini/gemini-cli/issues/4970#issu...
cat AGENT.md | claude
IIRC this saves some tokens.
they also suggest using symlinks for now
Claude Code likes to add "attribution" in commit messages, which is just pure spam.
Why are we purposely creating CLI dialects?
Yesterday, I was writing about a way I found to pass the same guideline documents into Claude, Gemini, and Aider CLI-coders: https://github.com/sutt/agro/blob/master/docs/case-studies/a...
> Set your own rules: Customize Cursor's work with rules, AGENTS.md, and MCP.
There's no mention of it in the docs, though. It's also interesting it's AGENTS.md on that page instead of AGENT.md, I wonder if that's a typo.
https://x.com/OpenAIDevs/status/1953559797883891735 (0.19 now)
I seem to always have better outcomes with Claude code.
I guess Cursor makes sense for people who only use LLMs for coding.
[1]: https://github.com/google-gemini/gemini-cli/blob/main/docs/c...