For tiny, throwaway projects, a monolithic .md file is fine. A folder allows more complex projects to use "just enough hierarchy" to provide structure, with index.md as the entry point. Along with top-level universal guidance, it can include an organization guide (easily maintained with the help of LLMs).
In my experience, this works loads better than the "one giant file" method. It lets LLMs/agents add relevant context without wasting tokens on unrelated context, reduces noise/improves response accuracy, and is easier to maintain for both humans and LLMs alike.
¹ Ideally with a better name than ".agents", like ".codebots" or ".context".
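For a sense of what I mean, a minimal sketch of such a layout (names and file split illustrative, not a standard):

```
.agents/
├── index.md          # entry point: universal guidance + map of what lives where
├── architecture.md   # high-level structure and key decisions
├── testing.md        # how to run and write tests
└── conventions.md    # naming, style, logging, etc.
```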
Except not hidden. Why do people want to hide important files and directories? Particularly documentation? Tradition, I guess, but it's an antipattern that makes everything more opaque.
Where are they hidden in a way that gives you trouble? I've had an alias for `ls` that always includes dotfiles, and `shopt -s dotglob` in my bash profile, for decades. Mac Finder is a little more obnoxious, requiring `Cmd+Shift+.` to reveal dotfiles.
Other than that, modern tooling like Git and IDEs do not "hide" dotfiles.
These days, a `.` in front of a file or folder in a repo is more to indicate it is metadata/config. Although I am in favor of putting all that stuff under `.config/`.
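For reference, the shell setup mentioned above amounts to a couple of lines in `~/.bash_profile`:

```bash
# Always show dotfiles in listings (-A skips the . and .. entries)
alias ls='ls -A'
# Make globs like * match dotfiles too
shopt -s dotglob
```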
The most effective argument I have for getting other developers to comment their code is "The agent will read it and it will give better suggestions".
Truly perverse, but it works.
I agree with you... but the reality is that there's a wide contingent of people that are not capable of understanding "people don't know the same things as me". So they need some other reason.
I don’t think they serve the same purpose. Most of the instructions I have for an agent won’t apply to a human. It’s mostly around the requirements to bootstrap the project vs. what I’d ask of a human to accept their pull request.
Been using a similar setup, with pretty decent results so far, with the addition of a short explanation for each file within index.md.
I've been experimenting with having a rules.md file within each directory where I want a certain behavior. For example, say I have a directory with different kinds of services, like realtime-service.ts and queue-service.ts; I then have a rules.md file on the same level as they are.
This lets me scaffold things pretty fast when prompting by just referencing that file. The name is probably not the best, though.
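For the services example above, such a rules.md might contain something like this (contents invented; `service-registry.ts` is hypothetical):

```markdown
# rules.md — services/

- Every service exports a single class named `<Thing>Service`.
- Follow the shape of realtime-service.ts: dependencies in the constructor, no globals.
- Register new services in service-registry.ts.
```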
This looks like general software design / coding style docs, for humans and robots alike. I put these .md files into the docs/ folder. And they're written by Claude Code itself.
AGENTS.md (and friends like CLAUDE.md) should be for robots only; whether a large single file with h2 headers (##) sections, or a directory with separate sections, is a matter of taste. Some software arch/design doc formats support both versions, e.g. Arc42.
Though, it's much easier and less error-prone to @-mention a separate .md file rather than a section in a large markdown file.
Smaller files also might be better when you want to focus a coding agent's attention on a specific thing.
>whether a large single file with h2 headers (##) sections, or a directory with separate sections, is a matter of taste
Not sure it is, when you consider how agents deal with large files. How is it going to follow coding conventions if it never greps them, or only reads the first few lines?
You can have multiple AGENTS.md files in your codebase, and tooling will look at both the one in the current directory and the one at the root of the codebase. This way you can sort of do what you're suggesting, but simultaneously keep the information closer to the code it describes.
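A sketch of the kind of layout that implies (structure illustrative):

```
repo/
├── AGENTS.md          # repo-wide conventions, build/test commands
├── src/
│   └── AGENTS.md      # module-specific guidance
└── tests/
    └── AGENTS.md      # how to run and write tests
```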
So you would have an AGENTS.md in your testing folder and it would describe how to run the tests or generate new tests for the project - am I understanding the usage correctly?
I've been trying to keep my baked-in LLM instructions to a terse ~100-line file, mostly headered sections with 5 or so bullet points each, covering basic expectations for architecture, test mocking, approach to large changes, etc. I can see why for some projects that wouldn't be enough, but I feel like it covers everything for most of mine.
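As a rough illustration of that shape (contents invented):

```markdown
## Architecture
- Keep services stateless; state lives in the store layer.

## Test mocking
- Mock at the network boundary only; never mock internal modules.

## Large changes
- Propose a plan first; don't touch more than a handful of files without sign-off.
```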
This is one of the reasons I'm sticking with .github/... copilot instructions for now. We'll see if this proposal evolves over time as more voices enter the conversation
This is what I do. Everywhere my agent works it uses a .agent dir to store its logs and intermediary files. This way the main directories aren't polluted with cruft all the time.
I'd be interested in smarter ways of doing this, but currently I just use my CLAUDE.local.md to serve as the index.md in my example. It includes the 'specialist' .md files with their relative paths and descriptions, and tells Claude Code to use these when planning.
I also have explicit `xnew`, `xplan`, `xcode` and `xcheck` commands in CLAUDE.md that reinforce this. For example, here's my `xnew`:
## Remember Shortcuts
Remember the following shortcuts, which the user may invoke at any time.
### XNEW
When I type "xnew", this means:
```
Understand all BEST PRACTICES listed in CLAUDE.md.
Your code SHOULD ALWAYS follow these best practices.
REVIEW relevant documentation in .agents/ before starting new work.
Your code SHOULD use existing patterns and architectural decisions
documented there rather than creating new approaches.
```
We're in a transition phase today where agents need special guidance, beyond what humans need, to understand a codebase. Before long, I don't think they will. I think we should focus on our own project documentation being comprehensive (e.g. the contents of this AGENTS.md are appropriate to live somewhere in our documentation), but we should always write for humans.
The LLM's whole shtick is that it can read and comprehend our writing, so let's architect for it at that level.
It's not just understanding the codebase, it's also stylistic things, like "use this assert library to write tests", or "never write comments", or "use structured logging". It's just as useful --- more so even --- on fresh projects without much code.
Honestly, everything I have written in markdown files as AI context fodder is stuff that I write down for human contributors anyway. Or at least stuff I want to always write down, but maybe only halfway do. The difference now is it is actually being read, seemingly understood, and often followed!
Stylistic preferences could usually be inferred by just looking at the code. Perhaps if code is mid-refactor there may be inconsistencies, but an ideal AI coding agent could also look through git history.
I suspect machine readable practices will become standard as AI is incorporated more into society.
A good example is autonomous driving and local laws / context. "No turn on red. School days 7am-9am".
So you need: where am I, when are school days for this specific school, and what the current datetime is. You could attempt to gather that through search. Though more realistically, I think the municipality will make the laws require less context, or some machine-readable (e.g. QR code) transfer of information will be on the sign. If they don't, there's going to be a lot of rule breaking.
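To make that concrete, here's a guess at what a machine-readable sign payload could encode; the field names are illustrative, not any real standard:

```typescript
// Hypothetical payload a machine-readable sign might carry.
interface SignRestriction {
  rule: "NO_TURN_ON_RED";
  appliesOn: "SCHOOL_DAYS";               // resolved against a published school calendar
  window: { start: string; end: string }; // local time, e.g. "07:00"-"09:00"
  calendarUrl: string;                    // where "school days" are defined
}

const sign: SignRestriction = {
  rule: "NO_TURN_ON_RED",
  appliesOn: "SCHOOL_DAYS",
  window: { start: "07:00", end: "09:00" },
  calendarUrl: "https://example.gov/districts/42/calendar.json", // hypothetical
};
```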
Very strong "reverse centaur" vibes here, in the sense of humans becoming servants to machines, instead of vice versa. Not that I think making things more machine-readable is a waste of time, but you have to keep in mind the amount of human time sacrificed.
That seems anachronistic, form over function. Machines should be able to access an API that returns “signs” for their given location. These signs don’t need any real world presence and can be updated instantly.
I think they'll always need special guidance for things like business logic. They'll never know exactly what it is that you're building and why, what the end goal of the project is without you telling them. Architectural stuff is also a matter of human preference: if you have it mapped out in your head where things should go and how they should be done, it will be better for you when reading the changes, which will be the real bottleneck.
Indeed I have observed that my coworkers "never know exactly what it is that [we]'re building and why, what the end goal of the project is without [me] telling them"
Not at all. Good documentation for humans works well for models too, but they need so much more detail and context than humans to be reliable that it calls for a different style of description.
This needs to contain things that you would never write for humans.
They also do stupid things which need to be corrected by these descriptions.
Unless we write down what we often consider implicit, the LLM will not know it. There might be the option to deduce some implicit requirements from the code, but unlikely 100% of them. Thus making the requirements explicit is the way to go.
Better to work with the tools we have instead of the tools we might one day have. If you want agents to work well today, you need to build for the agents we have today.
We may never achieve your future where context is unlimited, models are trained on your codebase specifically, and tokens are cheap enough to use all of this. We might have a bubble pop and in a few years we could all be paying 5-10X current prices (read: the actual cost) for similar functionality to today. In that reality, how many years of inferior agent behavior do you tolerate before you give up hoping that it will evolve past needing the tweaks?
> We're in a transition phase today where agents need special guidance to understand a codebase that go beyond what humans need. Before long, I don't think they will.
This isn't guaranteed. Just like we will never have fully self-driving cars, we likely won't have fully human quality coders.
Right now AI coders are going to be another tool in the tool bucket.
I don't think the bar here is a human level coder, I think the bar is an LLM which reads and follows the README.md.
If we're otherwise assuming it reads and follows an AGENTS.md file, then following the README.md should be within reach.
I think our task is to ensure that our README.md is suitable for any developer to onboard into the codebase. We can then measure our LLMs (and perhaps our own documentation) by if that guidance is followed.
My prompt: "Analyze the repository and add a suitable agents.md"
It did a decent job. I didn't really have much to add to that. I guess having this file is a nice optimization, but obviously it doesn't contain anything the agent wasn't able to figure out by itself. What's really needed is a per-repository learning base that gets populated with facts the agent discovers during its many experiments with the repository over the course of many conversations. It's a performance optimization.
The core problem is that every conversation is like Groundhog Day. You always start from scratch. AGENTS.md is a stopgap solution for that problem. ChatGPT actually has some notional memory that works across conversations. But it's a bit flaky, slow, and limited. It doesn't really learn across conversations.
That, btw, is a big missing piece on the path to AGI. There are some imperfect workarounds, but a lot of knowledge is lost in between conversations. And the trick of just growing the amount of context we give to our prompts doesn't seem like the solution.
I see the groundhog day problem as a feature, not a bug.
It's an organizational challenge, requiring a top-level overview and easy-to-find sub-documentation - and clear directives to use them when the AI starts architecting on a fresh start.
Overall, it's a good sign when a project is understandable in small independent chunks that don't demand a programmer/llm take in more context than was referenced.
I think the sweet spot would be all agents agreeing on a MUST-READ reference syntax for use inside comments & docs that, through simple scanning, forces the referenced file into the context. E.g.
// See @{../docs/payment-flow.md} for the overall design.
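A scan for that syntax would be cheap to implement; here's a rough sketch (Node, names hypothetical) of collecting the referenced files so they can be pulled into context:

```typescript
import { readFileSync, readdirSync, statSync } from "fs";
import { join } from "path";

// Collect every @{...} doc reference found in source and doc files.
function collectDocRefs(dir: string, refs = new Set<string>()): Set<string> {
  for (const entry of readdirSync(dir)) {
    const path = join(dir, entry);
    if (statSync(path).isDirectory()) {
      collectDocRefs(path, refs);
    } else if (/\.(ts|js|md)$/.test(entry)) {
      for (const m of readFileSync(path, "utf8").matchAll(/@\{([^}]+)\}/g)) {
        refs.add(m[1]); // e.g. "../docs/payment-flow.md"
      }
    }
  }
  return refs;
}
```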
Your prompt is pretty basic. Both Claude Code and GitHub Copilot have similar features. Claude Code has `/init`, which has a lot of special sauce in the prompt to improve the CLAUDE.md. And GitHub Copilot added a self-documenting prompt as well that runs on new repos; you can see their prompt here: https://docs.github.com/en/copilot/how-tos/configure-custom-...
Reading their prompt gives ideas on how you can improve yours.
At this point AGENTS.md is a README.md with enough hype behind it to actually motivate people to populate it with contents. People were too lazy to write docs for other people, but funnily enough are ok with doing it for robots.
This situation reminds me a bit of ergonomic handles design. Designed for a few people, preferred by everyone.
I like this insight. We kind of always knew that we wanted good docs, but they're demotivating to maintain if people aren't reading them. LLMs by their nature won't be onboarded to the codebase with meetings and conversations, so if we want them to have a proper onboarding then we're forced to be less lazy with our docs, and we get the validation of knowing they're being used.
I mean, the agents are too lazy to read any of this anyway, and will often forget the sort of instructions you spam them with after 3 more instructions too.
The difference now is that people are actively trying to remove people (others and themselves) from software development work, so the robots have to have adequate instructions. The motivation is bigger. To dismantle all human involvement with software development is something that everyone wants, and they want it yesterday.
You didn't get the memo? Vibe coding, obviously. Joking aside, I remember there was an article here a few weeks ago, maybe, about writing docs for bots, about which commenters here said it was no different than writing better docs.
I am developing a coding agent that currently manages and indexes over 5,000 repositories. The agent's state is stored locally in a hidden `.agent` directory, which contains a configuration folder for different agent roles and their specific instructions.
Then we've a "agents" folder with multiple files, each file has
<Role> <instruction>
Agent only reads the file if its role is defined there.
Inside project directory, we've a dot<coding agent name> folder where coding agents state is stored.
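A hedged guess at what one of those role files might contain (wording invented for illustration):

```
Role: reviewer
Instruction: Review the staged diff for style violations and missing tests.
Do not modify code; output review comments only.
```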
Our process kicks off with an `/init` command, which triggers a deep analysis of an entire repository. Instead of just indexing the raw code, the agent generates a high-level summary of its architecture and logic. These summaries appear in the editor as toggleable "ghost comments." They're a metadata layer, not part of the source code, so they are never committed in actual code. A sophisticated mapping system precisely links each summary annotation to the relevant lines of code.
This architecture is the solution to a problem we faced early on: running Retrieval-Augmented Generation (RAG) directly on source code never gave us the results we needed.
Our current system uses a hybrid search model. We use the AST for fast, literal lexical searches, while RAG is reserved for performing semantic searches on our high-level summaries. This makes all the difference. If you ask, "How does authentication work in this app?", a purely lexical search might only find functions containing the word `login` and functions/classes appearing in its call hierarchy. Our semantic search, however, queries the narrative-like summaries. It understands the entire authentication flow like it's reading a story, piecing together the plot points from different files to give you a complete picture. It works like magic.
Working on something similar. Legacy codebase understanding requires this type of annotation, and "just use code comments" is too blunt an instrument to do much good. Are you storing the annotations completely out of band wrt the files, or using filesystem capabilities like metadata?
This type of metadata itself could have individual value; there are many types of documents that will be analyzed by LLMs, and will need not only a place to store analysis alongside document-parts, but meta-metadata related to the analysis (like timestamps, models, prompts used, etc). Of course this could all be done OOB, but then you need a robust way to link your metadata store to a file that has a lifecycle all its own that's only observable by you (probably).
You can create a hierarchy of summaries. The idea being summaries can exist at the method level, class level and the microservice or module level. Each layer of summary points to its child layers and leaf nodes are code themselves. I think it can be a B tree or a normal tree.
The RAG agent can traverse as deep as needed for the particular semantic query. Each level maintains semantic understanding of the layer beneath it but as a tradeoff it loses a lot of information and keeps only what is necessary.
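A rough sketch of that structure, with a relevance callback standing in for the semantic scoring (all names hypothetical):

```typescript
// One node per summarized unit; leaves point at actual code spans.
interface SummaryNode {
  level: "module" | "class" | "method";
  summary: string;          // semantic description of everything beneath this node
  children: SummaryNode[];  // empty for leaves
  codeRef?: { file: string; startLine: number; endLine: number };
}

// Descend only into subtrees whose summaries look relevant, so deeper
// (more detailed) layers are read only when the query demands it.
function traverse(node: SummaryNode, isRelevant: (s: string) => boolean): SummaryNode[] {
  if (!isRelevant(node.summary)) return [];
  if (node.children.length === 0) return [node];
  return node.children.flatMap((c) => traverse(c, isRelevant));
}
```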
This will work if the abstractions in the codebase are done nicely - abstractions are only useful if they actually hide implementation details. If your abstractions are good enough, you can afford to keep only the higher layers (as required) in your model context. But if it’s not good, you might even have to put actual code in it.
For instance a method like add(n1, n2) is a strong abstraction - I don’t need to know its implementation but only semantic meaning at this level.
But in real life methods don’t always do one thing - there’s logging, global caches etc.
The agent I’m developing is designed to improve or expand upon "old codebases." While many agents can generate code from scratch, the real challenge lies in enhancing legacy code without breaking anything.
This is the problem I’m tackling, and so far, the approach has been effective. It's simple enough for anyone to use: the agent makes a commit, and you can either undo, squash, or amend it through the Git UI.
The issue is that developers often skip reviewing the code properly. To counter that, I’m considering shifting to a hunk-by-hunk review process, where each change is reviewed individually. Once the review is complete, the agent would commit the code.
The concept is simple, but the fun lies in the implementation details—like integrating existing CLI tools without giving the agent full shell access, unlike other agents.
What excites me most is the idea of letting 2–3 agents compete, collaborate, and interpret outputs to solve problems and "fight" to find the best solution.
That’s where the real fun is.
Think of it as a surgeon's scalpel approach rather than the "steam roller" approach most agents take.
(Quoting from that post)
Arguments in favor of keeping them separated:
* Writing style. In agent docs, using all caps might be an effective way to emphasize a particular instruction. In internal eng docs, this might come off rude or distracting.
* Conciseness vs. completeness. In agent docs, you likely need to keep the content highly curated. If you put in too much content, you’ll blast through your API quotas quickly and will probably reduce LLM output quality. In internal eng docs, we ideally aim for 100% completeness. I.e. every important design decision, API reference, workflow, etc. is documented somewhere.
* Differing knowledge needs. The information that LLMs need help with is not the same as the information that human engineers need help with. For example, Gemini 2.5 Pro has pretty good built-in awareness of Pigweed’s C++ Style Guide. I tested that assertion by invoking the Gemini API and instructing it to "Recite the Pigweed C++ Guide in its entirety". It did not recite in full, but it gave a detailed summary of all the points. So the Gemini 2.5 Pro API was either trained on the style guide, or it’s able to retrieve the style guide when needed. Therefore, it’s not necessary to include the full style guide as AGENTS.md context. (Credit to Keir Mierle for this idea.)
Arguments against:
* Duplication. Conceptually, agent docs are a subset of internal eng docs. The underlying goal is the same. You’re documenting workflows and knowledge that’s important to the team. But now you need to maintain that same information in two different doc sets.
> Writing style. In agent docs, using all caps might be an effective way to emphasize a particular instruction. In internal eng docs, this might come off rude or distracting.
To pile on to this, an agent needs to see "ABSOLUTELY NEVER do suchandsuch" to not do suchandsuch, but still has a pretty fair chance of doing it by accident. A talented human seeing "ABSOLUTELY NEVER do suchandsuch" will interpret this to mean there are consequences to doing suchandsuch, like being fired or causing production downtime. So the same message will be received differently by the different types of readers.
At this point, README.md becomes the "marketing/landing page" markdown, and AGENTS.md/CLAUDE.md become the ones you visit to get an overview of the actual code/architecture/usage.
There's a lot of shit in my claude.md that would be condescending to a human. Some of it I don't even mean, it's just there to correct egregious patterns the model loves to do. I'm not gonna write "never write fallback code" in my readme, but it saves a lot of time to prevent claude from constantly writing code that would silently fail if it got past me because it contains a fallback path with hardcoded fake data.
One reason to consider is context usage with LLMs. Less is generally better, and README.md files are often too much text, some of which I don’t want in every context window.
I find AGENT.md and similar-functioning files for LLMs in my projects contain concise and specific commands around feedback loops, such as testing, build commands, etc. Yes, these same commands might be in a README.md, but often there is a lot more text there that I don’t want in the context window being sent with every turn to the LLM.
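As a hypothetical example of the kind of terse, feedback-loop-focused section I mean (commands invented):

```markdown
## Feedback loops
- Build: `npm run build`
- All tests: `npm test`
- One file: `npm test -- src/foo.spec.ts`
- Lint + typecheck: `npm run lint && npx tsc --noEmit`
```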
* Agents still kinda suck, so they need help around context management and avoiding footguns. E.g., we make a <500-line ai/readme.md with must-haves, links to others, and a meta how-to-use
* Similar to IaC, it's useful to separate these out, as they're not really read the same way we read docs; many markdowns are really scripts written in natural language, e.g. plan.md.template
Perhaps. I let Claude put whatever it wants in its Claude file and check it’s not batshit from time to time, where I’m very protective of the high-quality README I write for humans. The Claude file has stuff that would be obvious to a human and weird to jam into the README (we use .spec.ts not .test.ts) but that Claude needs in order to get things right.
Isn't the promise of AI that we don't have to adhere to precise formats? We can just write it down in whatever format makes the most sense to us, and any impedance mismatch is on the machine to figure out?
Still need to write it down to make it explicit. The longer the writing gets, the more suitable a structured approach becomes for the human writing and maintaining the instructions.
Joking aside, this is kind of how I'm pitching it to our team. Even if LLMs don't significantly improve productivity writing code, they will at least get us to document everything in a similar amount of time.
> Ideally with a better name than ".agents", like ".codebots" or ".context".
Maybe robot_docs?
> Maybe robot_docs?
No thanks.
The content of the AGENTS.md is the same as what humans are looking for when contributing to a project.
Also, you don’t even address their point.
> Smaller files also might be better when you want to focus a coding agent's attention on a specific thing.
They also make diffs / PRs easier to review.
You can just use the AGENTS.md file as an index pointing to other doc files.
This example does that:
https://github.com/apache/airflow/blob/main/AGENTS.md
Having an in-your-face file that links to a hidden file serves no purpose.
Stop polluting the root dir.
I'm not a fan of the name "well-known", but at least it's a convention [1].
I think it'd be great if we took something like XDG [2] and made it common for repositories, build scripts, package managers, tooling configs, etc.
[1] https://www.rfc-editor.org/rfc/rfc8615
[2] https://wiki.archlinux.org/title/XDG_Base_Directory
Or use URLs in your main AGENTS.md, like I do for https://gitchamber.com
> some machine-readable (e.g. QR code) transfer of information will be on the sign
Slapping on a sign is ineffective
> Just like we will never have fully self-driving cars, we likely won't have fully human quality coders.
“Never is a long time...and none of us lives to see its length.” Elizabeth Yates, A Place for Peter (Mountain Born, #3)
“Never is an awfully long time.” J.M. Barrie, Peter Pan
> they're demotivating to maintain if people aren't reading them
With an agent, I know that if I write once to CLAUDE.md, it will be read by 1000s of agents in a week.
Why we don't treat coding agents as developers and have them read CONTRIBUTING.md is baffling to me.
They are separate for a good reason. My CLAUDE.md and README.md look very different.
Document how to use and install your tool in the readme.
Document how to compile and test, architecture decisions, coding standards, repository structure, etc. in the agents doc.
> Are there required fields?
> No. AGENTS.md is just standard Markdown. Use any headings you like; the agent simply parses the text you provide.
https://xkcd.com/810/