We could run Claude on our code and call it a day, but we have hundreds of style, safety, etc. rules on a very large C++ codebase with intricate behaviour (cooperative multitasking be fun).
So we run dozens of parallel CLI agents that can review the code in excruciating detail. This has completely replaced human code review for anything that isn't functional correctness, though the price is within roughly the same order of magnitude. It's much better than humans and beats every commercial tool.
"Scaling time", on the other hand, is useless. You can just keep dividing the problem among subagents until each one finishes within a few minutes, and that also increases quality because each subagent has less context and more focus.
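A minimal sketch of that kind of fan-out, not the setup described above: every name here (`fan_out_review`, `chunk`, `fake_reviewer`, the `rules_per_agent` parameter) is hypothetical, and the reviewer is a stand-in function rather than a real CLI agent. It only illustrates the idea of giving each worker one file and a small group of rules so that every call stays short and focused.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from typing import Callable

# Hypothetical shapes: files and rules are plain strings in this sketch.
Finding = tuple[str, str, str]  # (file, rule, message)
Reviewer = Callable[[str, list[str]], list[Finding]]


def chunk(items: list[str], size: int) -> list[list[str]]:
    """Split items into consecutive groups of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def fan_out_review(
    files: list[str],
    rules: list[str],
    reviewer: Reviewer,
    rules_per_agent: int = 5,
    max_workers: int = 32,
) -> list[Finding]:
    """Run one reviewer call per (file, rule group) pair, in parallel."""
    tasks = list(product(files, chunk(rules, rules_per_agent)))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps results in task order; flatten the per-task batches.
        batches = pool.map(lambda task: reviewer(task[0], task[1]), tasks)
        return [finding for batch in batches for finding in batch]


def fake_reviewer(path: str, rules: list[str]) -> list[Finding]:
    """Stand-in for a real agent: reports every rule whose name contains 'raw-ptr'."""
    return [(path, rule, "placeholder finding") for rule in rules if "raw-ptr" in rule]


if __name__ == "__main__":
    findings = fan_out_review(
        ["a.cpp", "b.cpp"],
        ["style-1", "raw-ptr-2", "style-3"],
        fake_reviewer,
        rules_per_agent=2,
    )
    for finding in findings:
        print(finding)
```

In a real pipeline the `reviewer` argument would wrap whatever agent invocation you use. Shrinking `rules_per_agent` is the knob that trades more parallel calls for less context per call.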
Isn’t functional correctness pretty much the only thing that matters though?
That's the choice as seen from the perspective of a white-hat hacker. But for an exploitable vulnerability, the real choice is to sell it to malware producers (I'm including state-sponsored spyware companies like the makers of Pegasus in this category) for a lot of money, or do the more moral thing and earn at least a little bit of money via a bug bounty program.
What if psychology has things like that, which just work even if the theory is wrong? The lay trauma-healing psychology industry might have a method that works for some people, so that they get into a better mental state.
There must be a corollary somewhere about how much you should read the average newspaper.