Readit News
shinycode commented on Sprinkling self-doubt on ChatGPT   justin.searls.co/posts/sp... · Posted by u/ingve
trjordan · 3 days ago
We've been building out our agent [0], and we've found this to be the case.

We actually dialed it back a bunch, because it feels _terrible_. Yes, you get more correct answers, but it's more akin to giving the agent anxiety. Agents that have access to tools are especially prone to this: they'll burn enormous amounts of time on tool calls, trying to get enough information to overcome a motivation that's essentially burned into their identity.

(We saw one conversation where it just browsed social media instead of looking at the code for like 5 minutes, which ... you know, I get it.)

It's been much more effective to make uncertainty or further exploration part of the agent's success criteria.

- BAD: "Critique your own thoughts" -> leads to the agent trying really hard to get it right, but still not willing to actually be wrong

- GOOD: "Expose where your thoughts are unsupported or could benefit from further information" -> leads to the agent producing high-quality results, with loose ends that the user can choose to incorporate, ignore, or correct.

That prompt, combined with dialing up the thinking (either through the API or through prompt tuning), works much better, because it sidesteps the training and tuning that have implicitly encouraged the model to sound correct at all times.

[0] https://tern.sh, code migration AI
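
As a rough illustration of the BAD/GOOD contrast in the comment above, here is a minimal sketch of folding the uncertainty instruction into an agent's success criteria. Only the two quoted prompt strings come from the comment; the function names, the prompt layout, and the "Open questions:" heading are hypothetical and are not taken from Tern or from any particular model API.

```python
# Hypothetical sketch: names and prompt layout are assumptions,
# not Tern's implementation and not any vendor's API.

BAD_INSTRUCTION = "Critique your own thoughts"
GOOD_INSTRUCTION = (
    "Expose where your thoughts are unsupported "
    "or could benefit from further information"
)

# Assumed marker the agent is asked to use for its loose ends.
OPEN_QUESTIONS_HEADING = "Open questions:"


def build_system_prompt(task: str) -> str:
    """Make surfacing uncertainty part of the success criteria."""
    return "\n".join([
        f"Task: {task}",
        "Success criteria:",
        "1. Complete the task as far as the available evidence allows.",
        f"2. {GOOD_INSTRUCTION}.",
        f"3. List those loose ends under the heading '{OPEN_QUESTIONS_HEADING}'.",
    ])


def split_loose_ends(reply: str) -> tuple[str, list[str]]:
    """Separate the main answer from the loose ends the user can
    incorporate, ignore, or correct."""
    answer, _, tail = reply.partition(OPEN_QUESTIONS_HEADING)
    loose_ends = [
        line.strip("- ").strip()
        for line in tail.splitlines()
        if line.strip()
    ]
    return answer.strip(), loose_ends
```

The point of the split is that the open questions become an explicit, reviewable part of the output instead of something the agent tries to resolve on its own with more tool calls.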

shinycode · 3 days ago
Anxiety for AI? I don’t follow all the developments, but it looks « weird » to me. As if AI could benefit from a psychologist, or from « psychology prompting » in its chain of thought: would something like « don’t panic, you’re the best, you can do it » have a positive outcome? Pep talk for AI?
shinycode commented on AI crawlers, fetchers are blowing up websites; Meta, OpenAI are worst offenders   theregister.com/2025/08/2... · Posted by u/rntn
shinycode · 4 days ago
At the same time, it’s so practical to ask a question and have it open 25 pages to search and summarize the answer. Before, that’s more or less what I was trying to do by hand. Maybe not 25 websites: because of crap SEO, the top 10 contains BS content, so I curated the list. But the idea is the same, no?
shinycode commented on Show HN: Whispering – Open-source, local-first dictation you can trust   github.com/epicenter-so/e... · Posted by u/braden-w
shinycode · 5 days ago
It already exists, with great execution:

https://github.com/kitlangton/Hex

It also translates to proper language

shinycode commented on Why LLMs can't really build software   zed.dev/blog/why-llms-can... · Posted by u/srid
noduerme · 10 days ago
Good programmers working hand in glove with good companies do much more than this. We question the business logic itself and suggest non-technical, operational solutions to user issues before we take a hammer to the code.

Also, as someone else said, consider the root causes of an issue, whether those are in code logic or business ops or some intersection between the two.

When I save twenty hours of a client's money and my own time, by telling them that a new software feature they want would be unnecessary if they changed the order of questions their employees ask on the phone, I've done my job well.

By the same token, if I'm bored and find weird stuff in the database indicating employees tried to perform the same action twice or something, that is something that can be solved with more backstops and/or a better UI.

Coding business logic is not a one-way street. Understanding the root causes and context of issues in the code itself is very hard and requires you to have a mental model of both domains. Going further and actually requesting changes to the business logic which would help clean up the code requires a flexible employer, but also an ability to think on a higher order than simply doing some CRUD tasks.

The fact that I wouldn't trust any LLM to touch any of my code in those real world cases makes me think that most people who are touting them are not, in fact, writing code at the same level or doing the same job I do. Or that they don't understand it very well.

shinycode · 10 days ago
True, and LLMs have no incentive to avoid writing code. It’s even worse: they are « paid » by the amount of code they generate. So the default behavior is to avoid asking questions to refine the need. They thrive on blurry and imprecise prompts because in any case they’ll generate thousands of LOC, regardless of pertinence. Many people have confirmed that from their experience. I’ve never seen an LLM step back, ask questions, and then code or avoid coding. It’s a design choice to generate the most stuff, because of money.

So right now an LLM and the developer you describe here are two very different things, and an LLM will, by design, never replace you.

shinycode commented on Show HN: Omnara – Run Claude Code from anywhere   github.com/omnara-ai/omna... · Posted by u/kmansm27
kmansm27 · 13 days ago
You're correct, that's one pro for vibetunnel/mobile SSH clients - they're a direct connection to your machine. For our platform, the messages flow through our server, which enables some use cases like push notifications and easier setup/reliability, but for a tradeoff of the data not being local.
shinycode · 12 days ago
In the era of data being gold, it’s quite useful to gather CC usage and even more. What kind of data do you gather and have access to? Are you compliant with anything regarding data (GDPR or other)?
shinycode commented on LLMs aren't world models   yosefk.com/blog/llms-aren... · Posted by u/ingve
wizzwizz4 · 13 days ago
Those spec sheets exist: they're called software.
shinycode · 13 days ago
Not exactly. It depends on how the software is written and whether there are ADRs in the project. I had to work on projects where there were bugs because someone coded business rules in a very bad and unclear way. You move an if somewhere and something breaks somewhere else. You ask « is this condition the way it’s supposed to work, or is it a bug? » When software is not clear enough - and often it isn’t, because we have to go fast - we ask people to confirm the rule. My point is this: amazingly written software surely works best with LLMs. That’s not most of the software written for now, because businesses sometimes value speed over engineering (or it’s a lack of skills).
shinycode commented on LLMs aren't world models   yosefk.com/blog/llms-aren... · Posted by u/ingve
ameliaquining · 13 days ago
One thing I appreciated about this post, unlike a lot of AI-skeptic posts, is that it actually makes a concrete falsifiable prediction; specifically, "LLMs will never manage to deal with large code bases 'autonomously'". So in the future we can look back and see whether it was right.

For my part, I'd give 80% confidence that LLMs will be able to do this within two years, without fundamental architectural changes.

shinycode · 13 days ago
« Autonomously »: what happens when subtle updates that are not bugs change the meaning of some features in a way that might break the workflow on other, external parts of a client’s system? It happens all the time and, because it’s really hard to keep the whole meaning and the business rules written down and up to date, an LLM might never be able to grasp some of that meaning. Maybe if, instead of developing code and infrastructure, the whole industry shifts toward only writing impossibly precise spec sheets that make meaning and intent crystal clear, then « autonomously » might be possible to pull off.
shinycode commented on One Million Screenshots   onemillionscreenshots.com... · Posted by u/gaws
unlikelytomato · 13 days ago
I am not so sure shopping carts are that great of a counter example. There are plastic ones like target, heavier duty ones, the weird ones at microcenter, lumberyard style, hand baskets, short ones, drag behinds, ones with kids car toys built in, tiny ones for kids to yeah along, ergonomic hand baskets, etc.

Then there are the innovations people had tried over the years like different styles of kid seats, calculators built into the handle, coupon scanners built in, security boots on the wheel, Aldi store coin lock connectors, motorized baskets, Ikea escalator locking wheels.

Thinking further, the designs change across the various countries I have visited over the years.

On top of this, I can visually picture all the different styles the groceries and department stores near me use to "brand" their carts and experience directly (Target's specific branded plastic carts and baskets). They very much see the shopping cart as part of their customer experience and have experimented with different setups. One could argue that the scope of utility for a shopping cart is minuscule compared to many websites. And yet, there is actually a lot of variety.

Given how there are people dedicated to so many seemingly insignificant corporate details (email signatures and other branding activities), it seems custom "website experience rules" would slot right into that line of thinking.

shinycode · 13 days ago
Yes, but in itself it’s not meant to be artistic; what you describe is, to me, variants of the tool. Creative variants, yes, but not for an artistic purpose. Just like a website. Maybe somewhere in the world there is an artistic version of a shopping cart, but then it’s not a tool anymore and it’s not found where it belongs, in a supermarket.
shinycode commented on Starbucks in Korea asks customers to stop bringing in printers/desktop computers   fortune.com/2025/08/11/st... · Posted by u/zdw
somedude895 · 13 days ago
Or you know, you could just not go there if you don't like the place rather than be a prick to people who work there and customers who like going there.
shinycode · 13 days ago
I agree. Following that logic means any customer of any shop can start doing anything regardless of policies, and shops just need to adapt because of my expectations?

u/shinycode

Karma: 382 · Cake day: May 14, 2020