Readit News
pcwelder commented on How to build a coding agent   ghuntley.com/agent/... · Posted by u/ghuntley
faangguyindia · 2 days ago
Anyone can build a coding agent that works on a) a fresh codebase and b) an unlimited token budget.

Now build it for an old codebase; let's see how precisely it edits or removes features without breaking the whole codebase.

Let's see how many tokens it consumes per bug fix or feature addition.

pcwelder · 2 days ago
Agree. To reduce costs:

1. Precompute frequently used knowledge and surface it early. For example: repository structure, OS information, system time.

2. Anticipate the next tool call. If no match is found while editing, return the closest matching snippet instead of simply failing. If the read-file tool gets a directory, return the directory contents.

3. Parallel tool calls. Claude needs either a batch tool or special scaffolding to promote parallel tool calls. A single tool call per turn is very expensive.

Are there any other such general ideas?
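
Point 2 above can be sketched roughly as follows. This is a minimal illustration, not the code of any particular agent: the function names `read_path`, `closest_snippet` and `edit_text` are hypothetical, and the "closest match" is assumed to be the same-sized window of lines with the highest `difflib` similarity.

```python
import difflib
import os


def read_path(path: str) -> str:
    """Read a file; if given a directory, return its listing instead of failing."""
    if os.path.isdir(path):
        entries = sorted(os.listdir(path))
        return f"{path} is a directory. Contents:\n" + "\n".join(entries)
    with open(path, encoding="utf-8") as f:
        return f.read()


def closest_snippet(text: str, search: str) -> str:
    """Return the window of lines in `text` most similar to `search`."""
    lines = text.splitlines()
    n = max(1, len(search.splitlines()))
    # Slide a window of n lines over the file and score each against the search block.
    windows = ["\n".join(lines[i:i + n]) for i in range(max(1, len(lines) - n + 1))]
    return max(windows, key=lambda w: difflib.SequenceMatcher(None, w, search).ratio())


def edit_text(text: str, search: str, replace: str) -> tuple[str, str]:
    """Replace the first exact match; on a miss, hand back the nearest snippet."""
    if search in text:
        return text.replace(search, replace, 1), "ok"
    hint = closest_snippet(text, search)
    return text, f"No exact match. Closest snippet:\n{hint}"
```

Returning the near miss lets the model correct its search block in the next turn instead of spending a separate call re-reading the file.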

pcwelder commented on Let's properly analyze an AI article for once   nibblestew.blogspot.com/2... · Posted by u/pabs3
pcwelder · 17 days ago
>I found, a required sample size for just one thousand people would be 278

It's interesting to note that for a billion people this number changes to a whopping ... 385. Doesn't change much.

I was curious: with a sample size of 22 (assuming an unbiased sample, yada yada), the margin of error when estimating the proportion of people satisfying a criterion is 22%.

While bad, if done properly, it may still be insightful.
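
For reference, the figures above can be reproduced with a short calculation. This is a sketch under assumptions the comment does not state: 95% confidence, a 5% target margin, worst-case proportion p = 0.5, and Cochran's formula with a finite population correction. The function names are illustrative.

```python
import math

Z_95 = 1.96  # assumed: normal critical value for 95% confidence


def required_sample_size(population: int, margin: float = 0.05, p: float = 0.5) -> int:
    """Cochran's sample size with finite population correction, rounded up."""
    n0 = Z_95**2 * p * (1 - p) / margin**2      # size for an infinite population
    n = n0 / (1 + (n0 - 1) / population)        # shrink for a finite population
    return math.ceil(n)


def margin_of_error(n: int, p: float = 0.5) -> float:
    """Half-width of the normal-approximation interval for a proportion."""
    return Z_95 * math.sqrt(p * (1 - p) / n)


print(required_sample_size(1_000))           # 278
print(required_sample_size(1_000_000_000))   # 385
print(round(margin_of_error(22), 3))         # 0.209
```

With the normal critical value the margin for n = 22 comes out near 21%; swapping in a Student-t critical value (about 2.08 for 21 degrees of freedom) gives roughly 22%.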

pcwelder commented on Getting good results from Claude Code   dzombak.com/blog/2025/08/... · Posted by u/ingve
libraryofbabel · 18 days ago
Yeah, agree that the benchmarks don't really seem to reflect the community consensus. I wonder if part of it is the better symbiosis between the agent (Claude Code) and the Opus and Sonnet models it uses, which supposedly are fine-tuned on Claude Code tool calls? But agree, there is probably some additional secret sauce in the training, perhaps to do with RL on multi-step problems...
pcwelder · 17 days ago
I get similar accuracy to Claude Code using the Claude Desktop app with a file+bash MCP (different tools, same performance).

My guess for why GPT-5 scores higher on benchmarks is that they evaluate on well-defined tasks with all instructions given at the start.

Real life is multi-turn, with multiple sets of prompts to adhere to. This is where Claude is likely better.

Deleted Comment

pcwelder commented on Why build a domain-specific agent for front end tasks?   kombai.com/why... · Posted by u/pcwelder
alganet · a month ago
Why not simply call it "specialist"? Are you trying to make some close connection to "Domain Specific Languages" somehow?
pcwelder · a month ago
To be absolutely honest, this wasn't a very conscious choice :-)

A direct similarity with domain-specific languages isn't evident to me. I find the messaging closer to some "agents" from other domains, e.g. https://www.harvey.ai/

pcwelder commented on LLMs are bad at returning code in JSON   aider.chat/2024/08/14/cod... · Posted by u/pcwelder
pcwelder · a month ago
PSA: don't generate code using tools (and MCPs) if you're using Gemini or OpenAI; both ask LLMs to generate JSON directly for function calling. Claude uses XML, so it escapes the issue.
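
To illustrate what "code in JSON" involves, here is a small sketch of the escaping that a JSON string argument requires. It only shows the encoding overhead; it makes no claim about how any provider's function calling behaves internally. The `path`/`content` argument names are hypothetical.

```python
import json

# A two-line snippet containing a newline, a tab, double quotes and a backslash.
code = 'def greet(name):\n\tprint(f"Hello, {name}!\\n")\n'

# As a tool-call argument, the code must become a single JSON string.
payload = json.dumps({"path": "greet.py", "content": code})
print(payload)

# Every newline, tab, quote and backslash is escaped, so the model has to emit
# the escaped form correctly for the code to round-trip.
assert "\n" not in payload
assert json.loads(payload)["content"] == code
```

A single missed escape changes or breaks the decoded code, which is the failure mode the linked article measures.
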
pcwelder commented on Rethinking CLI interfaces for AI   notcheckmark.com/2025/07/... · Posted by u/Bogdanp
pcwelder · a month ago
Losing track of the cwd is why I append it to the output of every command run in the wcgw MCP [1].

The model rarely gets the directory wrong after that.

I won't be surprised if Claude Code does the same soon.

However, they do have an env flag called CLAUDE_BASH_MAINTAIN_PROJECT_WORKING_DIR=1.

This should also fix the wrong-directory behavior.

[1] https://github.com/rusiaaman/wcgw
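
A minimal sketch of the idea of appending the cwd to each command's output, assuming a POSIX `sh`. This is not wcgw's actual implementation; `run_with_cwd` and the `cwd:` trailer format are made up for illustration.

```python
import os
import subprocess


def run_with_cwd(command: str, cwd: str) -> tuple[str, str]:
    """Run a shell command in `cwd` and append the final working directory."""
    # After the command, print the shell's working directory on its own last line.
    script = f"{command}\nprintf '\\n%s' \"$PWD\""
    result = subprocess.run(
        ["sh", "-c", script], cwd=cwd, capture_output=True, text=True
    )
    output, _, new_cwd = result.stdout.rpartition("\n")
    if not os.path.isdir(new_cwd):
        # The command exited early, so the trailer is missing: keep the old cwd.
        output, new_cwd = result.stdout, cwd
    report = f"{output}{result.stderr}\n---\ncwd: {new_cwd}"
    return report, new_cwd


report, cwd = run_with_cwd("cd / && echo hi", os.getcwd())
print(report)
```

The caller would pass the returned directory as `cwd` for the next command, so a `cd` persists across calls and the model sees where it ended up every time.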

pcwelder commented on Grok: Searching X for "From:Elonmusk (Israel or Palestine or Hamas or Gaza)"   simonwillison.net/2025/Ju... · Posted by u/simonw
eightysixfour · 2 months ago
He is saying he gave them a prompt to tell them they are built by xAI.
pcwelder · a month ago
Yes, thanks for clarifying. I specified in the system prompt that they're built by xAI and other system instructions from Grok 4.

u/pcwelder

Karma: 330 · Cake day: November 27, 2019