https://github.com/kitlangton/Hex
It also translates to proper language.
Also, as someone else said, consider the root causes of an issue, whether those are in code logic or business ops or some intersection between the two.
When I save twenty hours of a client's money and my own time, by telling them that a new software feature they want would be unnecessary if they changed the order of questions their employees ask on the phone, I've done my job well.
By the same token, if I'm bored and find weird stuff in the database indicating employees tried to perform the same action twice or something, that is something that can be solved with more backstops and/or a better UI.
Coding business logic is not a one-way street. Understanding the root causes and context of issues in the code itself is very hard and requires you to have a mental model of both domains. Going further and actually requesting changes to the business logic which would help clean up the code requires a flexible employer, but also an ability to think on a higher order than simply doing some CRUD tasks.
The fact that I wouldn't trust any LLM to touch any of my code in those real-world cases makes me think that most people who are touting them are not, in fact, writing code at the same level or doing the same job I do, or don't understand it very well.
So right now an LLM and the developer you describe here are two very different things, and an LLM will, by design, never replace you.
For my part, I'd give 80% confidence that LLMs will be able to do this within two years, without fundamental architectural changes.
Then there are the innovations people have tried over the years, like different styles of kid seats, calculators built into the handle, coupon scanners built in, security boots on the wheel, Aldi store coin-lock connectors, motorized baskets, and Ikea escalator-locking wheels.
Thinking further, I realize the designs change across the various countries I have visited over the years.
On top of this, I can visually picture all the different styles the grocery and department stores near me use to "brand" their carts and experience directly (Target's specific branded plastic carts and baskets). They very much see the shopping cart as part of their customer experience and have experimented with different setups. One could argue that the scope of utility for a shopping cart is minuscule compared to many websites. And yet, there is actually a lot of variety.
Given that there are people dedicated to so many seemingly insignificant corporate details (email signatures and other branding activities), it seems custom "website experience rules" would slot right into that line of thinking.
We actually dialed it back a bunch, because it feels _terrible_. Yes, you get more correct answers, but it's more akin to giving the agent anxiety. Especially with agents that have access to tools, they'll burn enormous amounts of time on tool calls, trying to get enough information to overcome a motivation that's essentially burned into its identity.
(We saw one conversation where it just browsed social media instead of looking at the code for like 5 minutes, which ... you know, I get it.)
It's been much more effective to make uncertainty or further exploration part of the agent's success criteria.
- BAD: "Critique your own thoughts" -> leads to the agent trying really hard to get it right, but still not willing to actually be wrong
- GOOD: "Expose where your thoughts are unsupported or could benefit from further information" -> leads to the agent producing high-quality results, with loose ends that the user can choose to incorporate, ignore, or correct.
That prompt, combined with dialing up the thinking (either with the API or prompt tuning), works much better, because it sidesteps the training and tuning that has implicitly encouraged it to sound correct at all times.
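To make that concrete, here is a rough sketch of how that framing might look, assuming the Anthropic Python SDK with extended thinking turned on; the model id, token budgets, prompt wording, and example user question are placeholders for illustration, not our production setup.

  # Sketch: make "expose unsupported reasoning" part of the success criteria,
  # rather than asking the agent to critique itself. Model id, budgets, and
  # the user question below are placeholders.
  from anthropic import Anthropic

  client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

  # BAD: rewards sounding correct, so the agent burns tool calls trying not to be wrong.
  CRITIQUE_SYSTEM = "Critique your own thoughts before answering."

  # GOOD: loose ends are an expected part of the output, not a failure.
  EXPOSE_SYSTEM = (
      "Answer the task, then list where your reasoning is unsupported or "
      "could benefit from further information. Unresolved loose ends are "
      "acceptable output for the user to incorporate, ignore, or correct."
  )

  response = client.messages.create(
      model="claude-sonnet-4-20250514",  # placeholder model id
      max_tokens=4096,
      # "Dial up the thinking": give the model an explicit reasoning budget.
      thinking={"type": "enabled", "budget_tokens": 2048},
      system=EXPOSE_SYSTEM,
      messages=[{"role": "user", "content": "Why does this migration drop the unique index?"}],
  )

  # Extended thinking interleaves thinking blocks with text blocks; print only the text.
  for block in response.content:
      if block.type == "text":
          print(block.text)

The system prompt is where uncertainty gets promoted to part of the output contract, and the thinking budget is the separate knob for how hard the model works before committing to an answer.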
[0] https://tern.sh, code migration AI