Post: https://cline.bot/blog/why-cline-doesnt-index-your-codebase-...
That post, however, does not apply to the offline-processing use case. Here are the three main problem points they're trying to solve:
1. Code Doesn't Think in Chunks
But then he describes following semantic links through imports, etc. That technique is still hierarchical chunking, and I'm planning to implement it as well: it's straightforward.
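Roughly what I mean, as a toy Python sketch (the naive module-to-file resolution here is an assumption for illustration, not how a real resolver works):

```python
import ast
from pathlib import Path

def imported_modules(path: Path) -> set[str]:
    """Collect top-level module names imported by a Python file."""
    names = set()
    for node in ast.walk(ast.parse(path.read_text())):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module.split(".")[0])
    return names

def expand_context(root: Path, start: Path, hops: int = 2) -> list[Path]:
    """Follow import links breadth-first: the 'chunk' grows along the code's own structure."""
    seen, frontier = {start}, [start]
    for _ in range(hops):
        nxt = []
        for f in frontier:
            for mod in imported_modules(f):
                candidate = root / f"{mod}.py"  # naive resolution; packages need more work
                if candidate.exists() and candidate not in seen:
                    seen.add(candidate)
                    nxt.append(candidate)
        frontier = nxt
    return sorted(seen)
```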
2. Indexes Decay While Code Evolves
This is just not true - there are multiple ways to solve it. One, for example, is continuous low-priority indexing in the background; another is monitoring for file changes and reindexing only the differences. I've already implemented a first iteration of this, so the index stays current.
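For the file-monitoring variant, a minimal sketch of an mtime-diff pass (the helper names are hypothetical, not the actual Aye Chat code):

```python
import time
from pathlib import Path

def reindex_changed(root: Path, last_seen: dict[Path, float], index_file) -> None:
    """Re-embed only files whose modification time changed since the last pass."""
    current = {p: p.stat().st_mtime for p in root.rglob("*.py")}
    for path, mtime in current.items():
        if last_seen.get(path) != mtime:
            index_file(path)          # re-chunk / re-embed just this file
            last_seen[path] = mtime
    for gone in set(last_seen) - set(current):
        last_seen.pop(gone)           # drop deleted files from the index state

# low-priority background loop:
# while True:
#     reindex_changed(repo_root, mtimes, index_file)
#     time.sleep(30)
```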
3. Security Becomes a Liability (and then he goes into embeddings being stored somewhere remote)
We are talking about an offline mode of operation, and that's not an issue with Aye Chat: it implements the embedding store locally, with ChromaDB and the ONNXMiniLM_L6_V2 model, so nothing leaves the machine.
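For reference, a fully local setup looks roughly like this; the API calls are standard ChromaDB (whose bundled default embedding function is exactly ONNXMiniLM_L6_V2), while the store path and IDs are made up for the example:

```python
import chromadb

# Persistent on-disk store; the default embedding function is the bundled
# ONNX MiniLM-L6-v2 model, so no network calls happen at all.
client = chromadb.PersistentClient(path=".aye/index")  # hypothetical path
chunks = client.get_or_create_collection(name="code_chunks")

chunks.add(
    ids=["utils.py::parse_args"],
    documents=["def parse_args(argv): ..."],
    metadatas=[{"path": "utils.py"}],
)

hits = chunks.query(query_texts=["where are CLI arguments parsed?"], n_results=3)
print(hits["ids"][0], hits["distances"][0])
```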
So, as you can see, none of his premises apply here.
And then, as part of the solution, he claims that the context window doesn't matter because Claude and ChatGPT models are now at 1M-token context windows. Once again, that does not apply to locally hosted models: I'm getting a 32K context with Qwen 2.5 Coder 7B on my non-optimized setup with 8 GB of VRAM.
The main reason I think it may work is the following: answering a question includes "planning what to do" and then "doing it". Models are good at "doing it" if they are given all the necessary info, so if we offload that "planning" into the application itself, I think it may work.
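In code, the split looks roughly like this (a sketch under my assumptions: `llm_complete` stands in for whatever local model client you run, and `chunks` is a vector store like the one above):

```python
def answer(question: str, chunks, llm_complete) -> str:
    """The app does the 'planning' (choosing context); the model only does the 'doing'."""
    hits = chunks.query(query_texts=[question], n_results=5)
    context = "\n\n".join(hits["documents"][0])
    prompt = (
        "Below are the only files relevant to the task.\n"
        f"--- CONTEXT ---\n{context}\n"
        f"--- TASK ---\n{question}\n"
        "Answer using ONLY the context above."
    )
    return llm_complete(prompt)  # small enough for a 32K local context window
```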
In the meantime, apparently, with the rise of agents, that future is already here, and some have already harnessed such power. See the video: the author explains their process and claims that he hasn't looked at the code for 6+ weeks and only works with the MD specs now.
When preparing recipes for agents, they build the entire workflow around managing context and make it three-fold: "research, plan, implement". He then goes into detail on how they avoid slop and keep developers mentally aligned so they can keep up with the changes.
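As I understand the recipe, it boils down to something like this three-phase driver (my own sketch of the idea; `run_agent` and the spec file names are hypothetical):

```python
from pathlib import Path

def run_feature(task: str, run_agent) -> None:
    """Three agent passes, each producing an artifact the human can review."""
    Path("specs").mkdir(exist_ok=True)

    research = run_agent("research", f"Survey the codebase for: {task}")
    Path("specs/research.md").write_text(research)

    plan = run_agent("plan", f"Given this research:\n{research}\n"
                             f"Write a step-by-step plan for: {task}")
    Path("specs/plan.md").write_text(plan)  # the human reviews the plan, not the diff

    run_agent("implement", f"Execute this plan exactly, step by step:\n{plan}")
```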
Compared to that, I now realize that what I originally built (an AI helper for the terminal: https://github.com/acrotron/aye-chat) is on the "naive" side: prompt until you either get it right from the AI or give up and start from the beginning. With the context-generation approach explained there, I think I will start moving to an agent-based implementation while keeping the control plane in the terminal: the current implementation works on smaller code bases, but with this approach it should be able to cover larger ones as well.
With these techs developing so fast, I think it's just a matter of keeping up with the news and being aware of what's being done successfully, and unfortunately that's not always easy to do. This one specifically - letting go of code reviews and learning how to work with spec files only - will require a mentality change, and it will be a psychological barrier to overcome.
The idea is to remove the copy/paste/review loop entirely. Instead of asking an AI for code and then manually approving and applying it, the tool writes directly to files in your folder and automatically snapshots everything so you can diff or instantly undo if it gets something wrong.
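Conceptually, the write path looks like this (a simplified sketch, not the actual implementation; the snapshot directory is made up):

```python
import shutil
import time
from pathlib import Path

SNAP_DIR = Path(".aye/snapshots")  # hypothetical location

def write_with_snapshot(path: Path, new_content: str) -> Path:
    """Snapshot the current file, then let the AI's edit land directly."""
    SNAP_DIR.mkdir(parents=True, exist_ok=True)
    snap = SNAP_DIR / f"{path.name}.{int(time.time())}"
    if path.exists():
        shutil.copy2(path, snap)     # keep the old version for diff/undo
    path.write_text(new_content)     # apply the edit with no approval step
    return snap                      # diff against it, or copy back to undo
```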
It lives entirely in the terminal, so you can prompt the AI, run tests, open vim, refactor, and restore changes, all in one flow. The bet is that with current models, the main bottleneck is the human, not the LLM.
It's open source and still very early, but we already have a steady cohort of users, as the flow is sticky after the "aha" moment. The repo is here if anyone's curious; give it a star if you like the idea: https://github.com/acrotron/aye-chat
Happy to answer questions or hear skepticism :)