I think, notably, one of the errors has been to name function calls "tools"...
> right tools allow small models to perform better than undirected tool like bash to do everything.
Interestingly enough, the newer mini-swe-agent refuted this hypothesis, at least for very large LLMs. The hypothesis comes from the original SWE-agent paper (https://arxiv.org/pdf/2405.15793), which assumed that specialized tools work better.
I guess that it's only a matter of finetuning.
LLMs have lots of experience with bash, so I get that they figure out how to work with it. They don't have experience with the custom tools you provide them.
And also, LLM "tools" as we know them need better design (to show state, dynamic actions).
Given both, AI with the right tools will outperform AI with a generic and uncontrolled tool.
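To make "show states, dynamic actions" a bit more concrete, here is a rough sketch of what I have in mind. Everything in it is made up for illustration: `EditorTool`, its action names, and the in-memory "file" are assumptions, not any existing agent framework's API. The idea is just that every call returns the current state plus the list of actions that are valid right now, so the model never has to guess.

```python
from typing import Optional


class EditorTool:
    """Hypothetical in-memory editor tool (illustration only, no real file I/O)."""

    def __init__(self) -> None:
        self.open_path: Optional[str] = None
        self.text = ""
        self.dirty = False

    def available_actions(self) -> list:
        # The action list is dynamic: it depends on the current state.
        if self.open_path is None:
            return ["open"]
        if self.dirty:
            return ["replace", "save", "close"]
        return ["replace", "close"]

    def snapshot(self) -> dict:
        # What the model sees after every call: state plus valid next actions.
        return {
            "state": {"open_path": self.open_path, "dirty": self.dirty},
            "actions": self.available_actions(),
        }

    def call(self, action: str, **args) -> dict:
        if action not in self.available_actions():
            # Reject invalid actions, but still report state so the model can recover.
            return {"error": f"{action!r} is not available right now", **self.snapshot()}
        if action == "open":
            self.open_path = args["path"]
            self.text = args.get("text", "")
            self.dirty = False
        elif action == "replace":
            self.text = self.text.replace(args["old"], args["new"])
            self.dirty = True
        elif action == "save":
            self.dirty = False  # a real tool would write to disk here
        elif action == "close":
            self.open_path, self.text, self.dirty = None, "", False
        return self.snapshot()


tool = EditorTool()
print(tool.call("save"))  # rejected: nothing is open yet
print(tool.call("open", path="app.py", text="return 'helo'"))
print(tool.call("replace", old="helo", new="hello"))
```

Compare that with a static function-call schema, where "save" is always advertised even when there is nothing to save.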
Lack of tools in mini-swe-agent is a feature. You can run it with any LLM no matter how big or small.
I've built a SWE agent too (for fun), check it out => https://github.com/myriade-ai/autocode
Surely listing files, searching a repo, editing a file can all be achieved with bash?
Or is this what's demonstrated by https://news.ycombinator.com/item?id=45001234?
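For what it's worth, here is a minimal sketch of the "bash for everything" side, assuming a Linux-like box with `bash`, `grep`, and `sed` on the PATH. `run_bash` and `demo` are hypothetical helpers I wrote for this comment, not mini-swe-agent's actual code. One generic function covers listing files, searching, and editing:

```python
import os
import subprocess
import tempfile


def run_bash(command: str, cwd: str) -> str:
    """Run one shell command and return stdout and stderr as a single string."""
    result = subprocess.run(
        ["bash", "-c", command],
        cwd=cwd,
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout + result.stderr


def demo() -> tuple:
    # Throwaway "repo" with one file containing a typo.
    with tempfile.TemporaryDirectory() as repo:
        with open(os.path.join(repo, "app.py"), "w") as f:
            f.write("def greet():\n    return 'helo'\n")

        listing = run_bash("ls", repo)                 # list files
        hits = run_bash("grep -rn 'helo' .", repo)     # search the repo
        # Edit a file; writing to a temp file avoids non-portable `sed -i`.
        run_bash("sed 's/helo/hello/' app.py > app.py.tmp && mv app.py.tmp app.py", repo)
        edited = run_bash("cat app.py", repo)
        return listing, hits, edited


listing, hits, edited = demo()
print(listing, hits, edited, sep="\n")
```

So yes, it can all be done with bash. Whether the model does it as reliably as with dedicated tools is the open question.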
Most readers on initial posting?
Quality comments?
Amount of traffic to the linked website?
You are right that there are lots of ways to measure this, but comment quality is way harder to judge, and we don't have traffic numbers.