First, I implemented `repogather --all`, which unintelligently copies all source files in your repository to the clipboard (delimited by their relative filepaths). To my surprise, for less complex repositories, this alone is often completely workable for Claude — much better than pasting in just the few files you are looking to update. But I never would have done it if I had to copy/paste everything individually. 200k is quite a lot of tokens!
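For intuition, here is a minimal sketch of what a `--all`-style gather could look like (illustrative only, not repogather's actual implementation; the `pbcopy` pipe assumes macOS):

```python
# Sketch: walk the repo, concatenate every readable file delimited by its
# relative path, and push the result onto the clipboard.
import os
import subprocess

def gather_all(root: str = ".") -> str:
    chunks = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]  # skip VCS internals
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            try:
                with open(path, encoding="utf-8") as f:
                    text = f.read()
            except (UnicodeDecodeError, OSError):
                continue  # skip binaries and unreadable files
            chunks.append(f"----- {rel} -----\n{text}")
    return "\n\n".join(chunks)

if __name__ == "__main__":
    subprocess.run(["pbcopy"], input=gather_all().encode("utf-8"), check=True)
```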
But as soon as the repository grows to a certain complexity level (even if it is under the input token limit), I’ve found that Claude can get confused by unrelated parts and concepts scattered across the code. It performs much better if you make an attempt to exclude logic that is irrelevant to your current change. So I implemented `repogather "<query here>"`, e.g. `repogather "only files related to authentication"`. This uses gpt-4o-mini with structured outputs to provide a relevance score for each source file (with automatic exclusions for .gitignore patterns, tests, and configuration, plus manual exclusions with `--exclude <pattern>`).
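To make the mechanism concrete, a per-file relevance call with structured outputs might look roughly like this (the schema, prompt, and field names here are my own assumptions, not repogather's actual code):

```python
# Sketch: ask gpt-4o-mini for a structured relevance score for one file.
from openai import OpenAI
from pydantic import BaseModel

class FileRelevance(BaseModel):
    relevance: float  # 0.0 (unrelated) to 1.0 (directly relevant)
    reason: str

client = OpenAI()

def score_file(query: str, rel_path: str, contents: str) -> FileRelevance:
    completion = client.beta.chat.completions.parse(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Score how relevant this source file is to the user's query."},
            {"role": "user",
             "content": f"Query: {query}\n\nFile: {rel_path}\n\n{contents}"},
        ],
        response_format=FileRelevance,
    )
    return completion.choices[0].message.parsed
```

Each file is scored independently, so the calls can be fired off concurrently and the results sorted by score before assembling the final paste.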
gpt-4o-mini is so cheap and fast that, for my ~8-dev startup’s repo, it takes under 5 seconds and costs 3-4 cents (with appropriate exclusions). Plus, you get to watch the output stream while you wait, which always feels fun.
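(Back-of-envelope, assuming gpt-4o-mini's launch pricing of roughly $0.15 per million input tokens: $0.03–0.04 corresponds to something like 200k–270k input tokens, i.e. on the order of the repo's entire source text passing through the scorer once.)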
The retrieval isn’t always perfect the first time — but it is fast, so you can see which files it returned and iterate quickly on your query. I’ve found this much more satisfying than the embedding-search-based solutions I’ve used, which seem to fail in pretty opaque ways.
https://github.com/gr-b/repogather
Let me know if it is useful to you! Always love to talk about how to better integrate LLMs into coding workflows.
On greenfield projects, I ask Claude Sonnet to write all the functions and their signatures, with return values, etc.
Then I have a script which sends these signatures to Gemini Flash, which writes all the functions for me.
All this happens in parallel.
I've found that if you limit the scope, Gemini Flash writes the best code, and it's ultra fast and cheap.
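A sketch of what that fan-out step could look like (the model name, prompt wording, and example signatures here are assumptions, not the commenter's actual script):

```python
# Sketch: send each Sonnet-written signature to Gemini Flash in parallel
# and collect the implementations.
from concurrent.futures import ThreadPoolExecutor
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # assumes the google-generativeai package
model = genai.GenerativeModel("gemini-1.5-flash")

def implement(signature: str) -> str:
    prompt = (
        "Implement this Python function exactly as specified. "
        "Return only the code, no explanation.\n\n" + signature
    )
    return model.generate_content(prompt).text

signatures = [
    'def parse_config(path: str) -> dict:\n    """Load and validate a YAML config file."""',
    'def retry(fn, attempts: int = 3):\n    """Call fn, retrying on exception."""',
]

with ThreadPoolExecutor(max_workers=8) as pool:
    implementations = list(pool.map(implement, signatures))
```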
What if you need to iterate on the functions it gives? Do you just start over with a different prompt, or do you have the ability to do a refinement with Gemini Flash on existing functions?
That's why Gemini Flash might appear dumb next to Sonnet. But who writes those dumb functions better, so that they're guaranteed to keep working in production for a long time? Gemini.
But Sonnet makes silly mistakes: even when I feed it requirements.txt, it still uses methods that either don't exist or used to exist but no longer do.
Gemini Flash isn't as creative.
So basically, we use Sonnet for high-level programming and Flash for low-level work (writing functions that are guaranteed to be correct and clean, with no black magic).
The problem with Sonnet is that it's slow. Sometimes you'll be stuck in a loop where it suggests something, removes it when it encounters errors, then suggests the very same thing you tried before.
I am using Claude Sonnet via Cursor.
>What if you need to iterate on the functions it gives?
I can do it via Aider and even modify the prompt it sends to Gemini Flash.
srtp -> .
OSError: [Errno 62] Too many levels of symbolic links: 'submodules/externals/srtp/include/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp/srtp'
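That trace is what you get when a symlink (here `srtp -> .`) points back into its own tree and a file walker follows links. If you do want to follow symlinks while gathering files, one guard is to prune any directory whose resolved path has already been visited; a rough sketch:

```python
# Sketch: walk a tree while following symlinks, but prune directories
# whose real path we've already seen, so 'srtp -> .' can't recurse forever.
import os

def walk_no_loops(root: str):
    seen = set()
    for dirpath, dirnames, filenames in os.walk(root, followlinks=True):
        real = os.path.realpath(dirpath)
        if real in seen:
            dirnames[:] = []  # already visited via another link: stop descending
            continue
        seen.add(real)
        for name in filenames:
            yield os.path.join(dirpath, name)
```

(The simpler fix is just leaving `followlinks` at its default of False.)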
Example: "Here is my whole project, now implement user authentication with plain username/password?"*
* Until the repository gets more complicated, which is why we need the intelligent relevance filtering features of repogather, e.g. `repogather "Only files related to authentication and avatar uploads"`
It doesn’t have all the fancy LLM integration though.
If your codebase is structured in a very modular way, then this one-liner mostly just works:
find . -type f -exec echo {} \; -exec cat {} \; | pbcopy
Would it be okay if I include this one liner in the readme (with credit) as an alternative?
Part of it is that I actually get better results using repogather + Claude UI for asking questions about my code than I get with Cursor’s chat. I suspect the index it creates on my codebase just isn’t very good, and it’s opaque to me.
I wonder if an increase in usable (not advertised) context tokens may obviate many of these approaches.
> I wonder if an increase in usable (not advertised) context tokens may obviate many of these approaches.
I've been extremely interested in this question! Will be interesting to see how things develop, but I suspect that relevance filtering is not as difficult as coding, so small, cheap LLMs will make the former a solved, inexpensive problem, while we will continue to build larger and more expensive LLMs to solve the latter.
That said, you can buy a lot of tokens for $150k, so this could be short sighted.
I think in the future, as the cost of gpt-4o-mini-level intelligence decreases, it will become increasingly worth it, even for larger repositories, to simply attend to every token for certain coding subtasks. I'm assuming here that relevance filtering is a much easier task than coding itself; otherwise you could just copy/paste everything into the final coding model's context. What I think would make much more sense for this project is to optimize the cost/performance of a small LLM fine-tuned for this source-relevance task. I suspect I could do much better than gpt-4o-mini, but it would be difficult to deploy this for free.