Not only did they produce about the same amount of code in a day as they used to produce in a week (or two), but several other things also made my work harder than before:
- During review, they hadn't thought as deeply about their code, so my comments often seemed to go over their heads. Instead of a discussion, I'd get something like "good catch, I'll fix that" (also reminiscent of an LLM).
- The time spent on trivial issues dropped to almost zero, but the remaining issues were much more subtle and time-consuming to find and describe.
- Many bugs were of a new kind (to me): the code would look like it does the right thing but not actually work at all, or just be much more broken than code with that level of "polish" would normally be. This breakdown of pattern-matching, compared to "organic" code, made the overhead much higher. Spending decades reviewing code and answering Stack Overflow questions often makes it possible to pinpoint not just a bug but how the author got there in the first place and how to help them avoid similar mistakes in the future.
- A simple but bad (inefficient, wrong, illegal, ugly, ...) solution is a nice thing to discuss, but the LLM-assisted junior dev often cooks up something much more complex, which can be bad in many ways at once. The culture of slowly growing a PR from a little bit broken, thinking about design and other considerations, until it's high quality and ready for a final review doesn't work the same way.
- Instead of fixing the issues in the original PR, I'd often get a completely different approach in response to my first review, again broken in new and subtle ways.
This led to a kind of effort inversion, where senior devs spent much more time on these PRs than the junior authors themselves. The junior dev would (I assume) feel much more productive and competent, but the response to their work would eventually lack most of the usual enthusiasm or encouragement from senior devs.
How do people work around these issues? One thing that worked well for me initially was to always require a lot of (passing) tests, but eventually these tests would suffer from many of the same problems.
Another thing I do is ask for the Claude session log file. The inputs and thoughts they provided to Claude give me a lot more insight than Claude's output. Quite often I'm able to correct the thought process once I know how they're thinking. I've found junior developers treat Claude like SMS: short, ambiguous messages with very little context, hoping it will perform magic. By reviewing the Claude session file, I try to fix this superficial prompting behaviour.
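To make that concrete, here's roughly the kind of script I mean for pulling just the human side out of a session log. A minimal sketch: the `type` and `message.content` fields match what I've seen in these files, but treat the exact shape as an assumption.

```ts
// extract-prompts.ts -- print only the human prompts from a Claude Code
// session log, to review how someone is actually prompting.
// Assumes each line is a JSON object with a `type` field, where user
// turns carry `message.content` (a string or an array of content blocks).
import { readFileSync } from "node:fs";

const path = process.argv[2];
if (!path) {
  console.error("usage: npx tsx extract-prompts.ts <session.jsonl>");
  process.exit(1);
}

for (const line of readFileSync(path, "utf8").split("\n")) {
  if (!line.trim()) continue;
  let entry: any;
  try {
    entry = JSON.parse(line);
  } catch {
    continue; // skip malformed lines instead of aborting
  }
  if (entry.type !== "user") continue;
  const content = entry.message?.content;
  // Content may be a plain string or an array of { type, text } blocks.
  const text =
    typeof content === "string"
      ? content
      : (content ?? [])
          .filter((block: any) => block.type === "text")
          .map((block: any) => block.text)
          .join("\n");
  if (text) console.log("---\n" + text);
}
```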
And third, I've realized Claude works best if the code itself is well structured and has tests, debugging tools, and documentation. So I spend more time on tooling, so that Claude can use these tools to investigate issues, write tests, and iterate faster.
Still a long way to go, but this seems promising right now.
This made me wonder: can I share Claude Code's conversation history? Turns out Claude stores it all locally.
So I made a full-stack "snippet" app with Claude and Instant. You can:
1. Upload jsonl files
2. Share them in a nice UI
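Under the hood the upload path is tiny. Here's a sketch of what it looks like (the `snippets` field names are illustrative, not necessarily the app's real schema; `init`, `id`, and `transact` are Instant's actual primitives):

```ts
// Sketch of the upload path: read the chosen .jsonl file in the browser
// and store it as one snippet row. Field names are illustrative.
import { init, id } from "@instantdb/react";

const db = init({ appId: "your-app-id" }); // placeholder app id

async function uploadSnippet(file: File) {
  const raw = await file.text();
  // Keep only lines that parse as JSON so one bad line doesn't
  // sink the whole upload.
  const lines = raw.split("\n").filter((line) => {
    try {
      JSON.parse(line);
      return true;
    } catch {
      return false;
    }
  });
  await db.transact(
    db.tx.snippets[id()].update({
      title: file.name,
      content: lines.join("\n"),
      createdAt: Date.now(),
    })
  );
}
```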
(Going meta) here's the first conversation I had with Claude in order to build it:
https://claude-code-viewer.vercel.app/view/c4ca91ac-9624-40f...
After I deployed, I asked it to fix the tool use UI:
https://claude-code-viewer.vercel.app/view/faf9b2cc-c3cf-4d0...
I used Instant's auth to gate uploads. Views are public, but limited to the snippets you know about (i.e., have links for).
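The permission rules for that are small. A sketch of what they look like, assuming a `snippets` namespace with a `creatorId` attribute (the real rules may differ):

```ts
// instant.perms.ts -- gate uploads behind auth, leave reads open.
import type { InstantRules } from "@instantdb/react";

const rules = {
  snippets: {
    allow: {
      // Open reads: the UI only fetches snippets by id, so in practice
      // you need a link to find one.
      view: "true",
      // Uploads require a signed-in user; only owners can change rows.
      create: "auth.id != null",
      update: "isOwner",
      delete: "isOwner",
    },
    bind: ["isOwner", "auth.id != null && auth.id == data.creatorId"],
  },
} satisfies InstantRules;

export default rules;
```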
If you want to upload your own conversations:
1. They live in ~/.claude. Head on over and grab a file
2. Go to https://claude-code-viewer.vercel.app and sign up
3. Start uploading : )
Some notes:
* Be careful when sharing log files: Claude can include secrets in there. Some hackers may notice an adminToken in the convo; I rotated it before we pushed.
* It was fun to see Claude use the query language. It thought we had a `$startsWith` modifier; right now we only have `$like`. But `$startsWith` is a great idea, we may just implement it real quick!
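In the meantime, `$like` with a trailing wildcard covers the startsWith case. A sketch, with illustrative `snippets`/`title` names and a placeholder app id:

```tsx
// Emulating startsWith with $like: the pattern "abc%" matches titles
// that begin with "abc".
import { init } from "@instantdb/react";

const db = init({ appId: "your-app-id" }); // placeholder app id

function SnippetSearch({ prefix }: { prefix: string }) {
  const { isLoading, error, data } = db.useQuery({
    snippets: {
      $: { where: { title: { $like: `${prefix}%` } } },
    },
  });
  if (isLoading) return null;
  if (error) return <div>{error.message}</div>;
  return (
    <ul>
      {data.snippets.map((s: any) => (
        <li key={s.id}>{s.title}</li>
      ))}
    </ul>
  );
}
```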