I tend to be sceptical when it comes to LLM-based coding tools, but many people seem to be raving about huge productivity gains, which I wouldn’t mind having as well.
However, when I tried Claude Code (cc) it left me very disappointed. For context, I’m working on a relatively greenfield Rust project and gave it tasks that I would consider appropriate for a junior-level colleague, like:
- change the return type of a trait and all its impls
- refactor duplicate code into a helper function
- replace some of our code with an external crate
It didn’t get any of them correct and took a very long time. Am I using the tool wrong?
How are you using cc or other agentic tools?
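For illustration, the first task has roughly the shape below. This is a made-up sketch, not the actual project code: `Parser`, `Decimal`, `Hex` and `ParseError` are hypothetical names, and the change shown (from `Option<i64>` to `Result<i64, ParseError>`) is just one plausible example of a return-type change that has to be applied to the trait and every impl at once.

```rust
// Hypothetical error type introduced by the change.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    Empty,
    NotANumber(String),
}

// Before the change the method was declared as:
//     fn parse(&self, input: &str) -> Option<i64>;
// After the change the trait returns a Result, so every impl must follow.
pub trait Parser {
    fn parse(&self, input: &str) -> Result<i64, ParseError>;
}

pub struct Decimal;
pub struct Hex;

impl Parser for Decimal {
    fn parse(&self, input: &str) -> Result<i64, ParseError> {
        let s = input.trim();
        if s.is_empty() {
            return Err(ParseError::Empty);
        }
        s.parse::<i64>()
            .map_err(|_| ParseError::NotANumber(s.to_string()))
    }
}

impl Parser for Hex {
    fn parse(&self, input: &str) -> Result<i64, ParseError> {
        let s = input.trim();
        if s.is_empty() {
            return Err(ParseError::Empty);
        }
        // Accept an optional "0x" prefix before the hex digits.
        let digits = s.strip_prefix("0x").unwrap_or(s);
        i64::from_str_radix(digits, 16)
            .map_err(|_| ParseError::NotANumber(s.to_string()))
    }
}
```

The compiler flags every impl and call site that still uses the old signature, which is part of why this looks like a mechanical, junior-level task.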
I think you are using the wrong language, to be honest. LLMs are best at languages like Python, JavaScript and Go, which have relatively simple structures and huge amounts of reference code. Rust is a less common language and much harder to write.
Did you give Claude Code tests and the ability to compile in a loop? It's pretty good, in Go at least, at debugging and fixing issues when allowed to loop.
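To spell out what "compile in a loop" means, here is a minimal sketch of the control flow, written in Rust since that's the language in question. It is not how Claude Code is implemented; `fix_loop`, `CheckResult` and `cargo_check` are hypothetical names, and the `fix` closure stands in for whatever the agent does with the compiler output.

```rust
use std::process::Command;

/// Outcome of one check run: Ok, or the diagnostics text on failure.
pub type CheckResult = Result<(), String>;

/// Run `check`, and on failure hand the diagnostics to `fix`, up to
/// `max_attempts` check runs. Returns the number of check runs it took
/// to pass, or the last diagnostics if the checks never passed.
pub fn fix_loop<C, F>(max_attempts: usize, mut check: C, mut fix: F) -> Result<usize, String>
where
    C: FnMut() -> CheckResult,
    F: FnMut(&str),
{
    let mut last_error = String::from("no attempts made");
    for attempt in 1..=max_attempts {
        match check() {
            Ok(()) => return Ok(attempt),
            Err(diagnostics) => {
                // Only attempt a fix if there is another check run left.
                if attempt < max_attempts {
                    fix(&diagnostics);
                }
                last_error = diagnostics;
            }
        }
    }
    Err(last_error)
}

/// One possible `check`: run `cargo check` in the current directory and
/// return its stderr as the diagnostics when it fails.
pub fn cargo_check() -> CheckResult {
    let output = Command::new("cargo")
        .arg("check")
        .output()
        .map_err(|e| e.to_string())?;
    if output.status.success() {
        Ok(())
    } else {
        Err(String::from_utf8_lossy(&output.stderr).into_owned())
    }
}
```

The point is only that the tool sees real compiler or test output after each change instead of guessing whether its edit worked.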
What helped me was shifting how I use it. I don’t treat it like a junior dev anymore; I treat it more like a second brain. For example:
I use Claude Code to explore options before I commit to a design. I’ll ask it “what are 3 ways to abstract this logic?” and sometimes that alone gives me a better direction.
It’s pretty good at turning rough notes or comments into starter code or test cases. That saves time on boilerplate.
If I feed it a clean, self-contained chunk of code and ask for a targeted change (e.g., “convert this to async”), it often nails it. But yeah, across a codebase, not so much.
Have been using it to build a DSL in JS. Greenfield. I’ve followed the commonly touted “plan, act, evaluate” approach; I’ve got it to generate a clear project vision, scope, and feature checklist. Then told it to refer to that for context. I’ve been descriptive and explicit in my prompting, way more so than previously.
It has gotten the broad strokes right; I’ve got an exceptionally barebones DSL, made up of 5 entities, working…just.
It has now started to spin its wheels on small issues and can’t fix them without breaking something else. The codebase isn’t even big (~8 main functions across a few files). Troubleshooting the code is difficult because it’s convoluted and I lack the same intuition for it I would have had I written it myself. I’ve decided to rewrite everything with less control ceded to the LLM.
When it works, it feels great. When it doesn’t, which is often, the spell is broken and I feel I’ve wasted a bunch of time and have not much to show for it.
Refactoring duplicate code into a helper function should be achievable with current agents. To replace existing code with an external crate, you could try giving the agent access to a browser (e.g. playwright-mcp) and instructing it to browse the crate docs. For anything that involves using APIs that may be past the knowledge cutoff for the agent's model, it's definitely worthwhile to have some MCP tools on hand that'll let it browse for up-to-date info - the brave-search and context7 MCPs are good.
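For reference, the helper-function refactor is the kind of change sketched below. The names (`normalize_key`, `user_key`, `project_key`) are invented for illustration and have nothing to do with the original poster's codebase.

```rust
/// Shared helper extracted from the two call sites below. Before the
/// refactor, each function repeated the trim/lowercase/replace chain inline.
fn normalize_key(raw: &str) -> String {
    raw.trim().to_lowercase().replace(' ', "_")
}

/// Builds a lookup key for a user name.
pub fn user_key(name: &str) -> String {
    format!("user:{}", normalize_key(name))
}

/// Builds a lookup key for a project title.
pub fn project_key(title: &str) -> String {
    format!("project:{}", normalize_key(title))
}
```

A change like this is small and local, so the agent only needs the affected functions in context to get it right.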
I've been having fun with Claude Code and VSCode's agent. Any reasonably experienced engineer should be able to use it for a subset of languages without too many issues, but they definitely need to hydrate the context (e.g. using CLAUDE.md) and have a sensible set of system prompts set up. Good, well-written user prompts that are broken down into steps are non-negotiable.
I've had less luck generating new features. It's great for prototyping UI, but I routinely end up writing it myself.
It's also quick to forget how I like to do things or what libraries and packages it should use. So I either have to keep reminding it or fix up the work myself. While I'm unsure whether it still ends up being quicker, that's really immaterial for me because it absolutely kills the enjoyment of the work.