Readit News
dimitri-vs commented on Gemini 3 Flash: Frontier intelligence built for speed   blog.google/products/gemi... · Posted by u/meetpateltech
gaigalas · 2 days ago
If Opus is one-size-fits-all, then why does Claude keep the other series? (rhetorical).

Opus and Sonnet are slower than Haiku. For lots of less sophisticated tasks, you benefit from the speed.

All vendors do this. You need smaller models that you can rapid-fire for lots of reasons other than vibe coding.

Personally, I actually use smaller models more than the sophisticated ones. Lots of small automations.

dimitri-vs · 2 days ago
Yes, all the major CLIs (Claude Code, Codex, etc.) and many agentic applications use a large-model main agent that delegates tasks to small-model sub-agents. For example, in CC with Opus 4.5, the main agent will delegate an Explore task to a Haiku/Sonnet subagent, or to multiple subagents.
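To make the delegation pattern concrete, here is a rough Python sketch of the routing idea only. The tier labels, the `Task` type, and the set of delegated task kinds are all made up for illustration; this is not how Claude Code or any other CLI actually implements it.

```python
from dataclasses import dataclass

# Hypothetical tier labels, not real model identifiers.
LARGE = "large-model"
SMALL = "small-model"

# Assumption: task kinds considered cheap enough to hand to a sub-agent.
DELEGATED_KINDS = {"explore", "search", "summarize"}


@dataclass
class Task:
    kind: str
    prompt: str


def pick_model(task: Task) -> str:
    """Route simple task kinds to the small tier, everything else to the large one."""
    return SMALL if task.kind in DELEGATED_KINDS else LARGE


def plan(tasks: list[Task]) -> list[tuple[str, str]]:
    """Pair each task's prompt with the tier that would handle it."""
    return [(pick_model(t), t.prompt) for t in tasks]
```

The main agent keeps the planning and editing work for itself and only hands off the read-heavy, low-risk steps, which is where the speed of a small model pays off.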
dimitri-vs commented on The Gorman Paradox: Where Are All the AI-Generated Apps?   codemanship.wordpress.com... · Posted by u/ArmageddonIt
jackfranklyn · 5 days ago
davydm nails it. The gap isn't in generating code - it's in everything else that makes software actually work.

I've been building accounting tools for years. AI can generate a function to parse a bank statement CSV pretty well. But can it handle the Barclays CSV that has a random blank row on line 47? Or the HSBC format that changed last month? Or the edge case where someone exports from their mobile app vs desktop?
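As a small illustration of the blank-row case, a parser can skip empty rows and fail loudly on anything else it does not recognise. This is only a sketch using Python's standard `csv` module; the column names in the sample are invented, and it does nothing about formats that change between exports.

```python
import csv
import io


def parse_statement(text: str) -> list[dict[str, str]]:
    """Parse a statement CSV, skipping blank rows instead of failing on them."""
    reader = csv.reader(io.StringIO(text))
    header: list[str] | None = None
    rows: list[dict[str, str]] = []
    for raw in reader:
        cells = [c.strip() for c in raw]
        if not any(cells):
            continue  # empty line, or a row containing only commas
        if header is None:
            header = cells  # first non-blank row is treated as the header
            continue
        if len(cells) != len(header):
            # Surface unexpected layouts rather than guessing at them.
            raise ValueError(f"line {reader.line_num}: expected {len(header)} columns, got {len(cells)}")
        rows.append(dict(zip(header, cells)))
    return rows


# Hypothetical sample with a stray blank row in the middle.
SAMPLE = "Date,Description,Amount\n2024-01-02,Coffee,-3.50\n\n2024-01-03,Salary,1000.00\n"
```

Raising on a column-count mismatch is a deliberate choice here: a format change should stop the import, not silently produce shifted columns.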

That's not even touching the hard stuff - OAuth token refresh failures at 3am, database migrations when you change your mind about a schema, figuring out why Xero's API returns different JSON on Tuesdays.

The real paradox: AI makes starting easier but finishing harder. You get to 80% fast, then spend longer on the last 20% than you would have building from scratch - because now you're debugging code you don't fully understand.

dimitri-vs · 5 days ago
As someone that's currently building accounting (and many many other) tools for myself: yes, it can.

But with a big fat asterisk that you: 1. Need to make it aware of all relevant business logic 2. Give it all necessary tools to iterate and debug and 3. Have significant experience with strengths and weaknesses of coding agents.

To be clear, I'm talking about CLI agents like Claude Code, which IMO are apples and oranges vs ChatGPT (and even Cursor).

dimitri-vs commented on The unexpected effectiveness of one-shot decompilation with Claude   blog.chrislewis.au/the-un... · Posted by u/knackers
zdware · 13 days ago
Agree with this. I'm a software engineer who has not had to manage memory for most of my career.

I asked Opus how hard it would be to port the script extender for Baldur's Gate 3 from Windows to the native Linux build. It outlined that it would be very difficult for someone without reverse engineering experience, and correctly pointed out they are using different compilers, so it's not a simple mapping exercise. Its recommendation was not to try unless I was a Ghidra master and had lots of time on my hands.

dimitri-vs · 13 days ago
FWIW most LLMs are pretty terrible at estimating complexity. If you've used Claude Code for any length of time you might be familiar with its plan "timelines", which always span many days but for medium-size projects get implemented in about an hour.

I've had CC build semi-complex Tauri, PyQt6, Rust and SvelteKit apps for me without me having ever touched those languages or frameworks. Is the code quality good? Probably not. But all those apps were local-only tools or had fewer than 10 users, so it doesn't matter.

dimitri-vs commented on Writing a good Claude.md   humanlayer.dev/blog/writi... · Posted by u/objcts
vunderba · 19 days ago
From the article:

> We recommend keeping task-specific instructions in separate markdown files with self-descriptive names somewhere in your project. Then, in your CLAUDE.md file, you can include a list of these files with a brief description of each, and instruct Claude to decide which (if any) are relevant and to read them before it starts working.

I've been doing this since the early days of agentic coding though I've always personally referred to it as the Table-of-Contents approach to keep the context window relatively streamlined. Here's a snippet of my CLAUDE.md file that demonstrates this approach:

  # Documentation References

  - When adding CSS, refer to: docs/ADDING_CSS.md
  - When adding assets, refer to: docs/ADDING_ASSETS.md
  - When working with user data, refer to: docs/STORAGE_MANAGER.md

Full CLAUDE.md file for reference:

https://gist.github.com/scpedicini/179626cfb022452bb39eff10b...

dimitri-vs · 19 days ago
Correct me if I'm wrong but I think the new "skills" are exactly this, but better.
dimitri-vs commented on Gemini CLI tips and tricks for agentic coding   github.com/addyosmani/gem... · Posted by u/ayoisaiah
agentdrek · 23 days ago
YMMV I guess but it's my go-to tool; fast and reliable results, at least for my use cases
dimitri-vs · 23 days ago
Agreed. Been using Claude Code daily for the past year and Codex as a fallback when Claude gets stuck. Codex has two problems: its Windows support sucks and it's way too "mission driven" vs the collaborative Claude. Gemini CLI falls somewhere in the middle, has some seriously cool features (Ctrl+X to edit prompt in notepad) and its web research capability is actually good.
dimitri-vs commented on ChatGPT terms disallow its use in providing legal and medical advice to others   ctvnews.ca/sci-tech/artic... · Posted by u/randycupertino
sarchertech · a month ago
And it’s not just lab tests and bloodwork. Physicians use all their senses. They poke, they prod, they manipulate, they look, listen, and smell.

They’re also good at extracting information in a way that (at least currently) sycophantic LLMs don’t replicate.

dimitri-vs · a month ago
Agreed, but I'm sure you can see why people prefer the infinite patience and availability of ChatGPT vs having to wait weeks to see your doctor, seeing them for 15 minutes only to be referred to another specialist who's available weeks away and has an arduous hour-long intake process, all so you can get 15 minutes of their time.
dimitri-vs commented on A change of address led to our Wise accounts being shut down   shaun.nz/why-were-never-u... · Posted by u/jemmyw
epistasis · 2 months ago
As a Wise user, only for personal international transactions, I'm very curious to read this! I've had good experiences so far.
dimitri-vs · 2 months ago
I do dozens of transactions every month sending payments to various freelancers. Been doing this for five years and can count the number of times I had problems making payments on one hand - all were minor and resolved in just a few days.
dimitri-vs commented on Generative AI Image Editing Showdown   genai-showdown.specr.net/... · Posted by u/gaws
herval · 2 months ago
Gemini is great when it gets it right, but in my experience, it sometimes gives you completely unexpected results and won't get it right no matter what. You can see that in some of the examples (e.g. the Girl with a Pearl Earring one). I'm constantly surprised by how good Flux is, but the tragedy is most people (me included) will just default to whatever they normally use (ChatGPT and Gemini, in my case), so it doesn't really matter that it's better.
dimitri-vs · 2 months ago
Agreed, to the point where I built my own UI where I can simultaneously generate three images and see a before/after. Most often only one of three is what I actually wanted.
dimitri-vs commented on Greatest irony of the AI age: Humans hired to clean AI slop   sify.com/ai-analytics/gre... · Posted by u/wahvinci
GoatInGrey · 3 months ago
I spent five years working in quality assurance in the manufacturing industry, both on the plant floor and in labs, and the other user is largely correct in the spirit of their message. You are right that it's not just up to things being easy to spot, but that's why there are multiple layers of QA in manufacturing. It's far more intensive than even traditional software QA.

You are performing manual validation of outputs multiple times before manufacturing runs, and performing manual checks every 0.5-2 hours throughout the run. QA then performs their own checks every two hours, including validation that line operators have been performing their checks as required. This is in addition to line staff who have their eyes on the product to catch obvious issues as they process them.

Any defect that is found marks all product palleted since the last successful check as suspect. Suspect product is then subjected to distributed sampling to gauge the potential scope of the defect. If the defect appears to be present in that palleted product AND distributed throughout, it all gets marked for rework.

This is all done when making a single SKU.
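For readers who think better in code, the suspect-and-rework rule described above could be sketched roughly like this. Everything here is an assumption for illustration: the `Pallet` fields, sampling every second pallet, and reading "distributed throughout" as "found in at least two sampled pallets". Real plants will have their own sampling plans.

```python
from dataclasses import dataclass


@dataclass
class Pallet:
    pallet_id: int
    packed_at: float  # hours since the start of the run
    defective: bool   # only discovered when the pallet is sampled


def suspect_pallets(pallets: list[Pallet], last_good_check: float, defect_found_at: float) -> list[Pallet]:
    """Everything palleted after the last successful check, up to the failed one, is suspect."""
    return [p for p in pallets if last_good_check < p.packed_at <= defect_found_at]


def needs_rework(suspects: list[Pallet], sample_every: int = 2, min_hits: int = 2) -> bool:
    """Sample evenly across the suspect pallets; rework all of them if the defect shows up repeatedly.

    Assumption: 'distributed throughout' is approximated by min_hits sampled pallets being defective.
    """
    sample = suspects[::sample_every]
    hits = sum(1 for p in sample if p.defective)
    return hits >= min_hits
```

Note that the whole suspect window gets reworked, not just the pallets that happened to be sampled.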

In the case of AI, let's say AI programming, not only are we not performing this level of oversight and validation on that output, but the output isn't even the same SKU! It's making a new one-of-a-kind SKU every time, without the pre and post quality checks common in manufacturing.

AI proponents follow a methodology of not checking at all (i.e. spec-driven development) or only sampling every tenth, twentieth, or hundredth SKU rolling off the analogous assembly line.

dimitri-vs · 3 months ago
In the case of AI, it gets even worse when you factor in MCPs - which, to continue your analogy, is like letting random people walk into the factory and adjust the machine parameters at will.

But people won't care until a major correction happens. My guess is that we'll see a string of AI-enabled script kiddies piecing together massive hacks that leak embarrassing or incriminating information (think Celebgate-scale incidents). The attack surface is just so massive - there's never been a better time to be a hacker.

dimitri-vs commented on Disney reinstates Jimmy Kimmel after backlash over capitulation to FCC   arstechnica.com/tech-poli... · Posted by u/tomrod
dyauspitr · 3 months ago
Disney is a friend. You want to hurt them just enough to make a point but not enough to seriously hurt an actual ally.

Disney content, financially motivated or not, is some of the most left-friendly media there is.

dimitri-vs · 3 months ago
I thought we had all collectively moved past the naive idea that any corporation is ever "your friend".

u/dimitri-vs

Karma: 257 · Cake day: September 22, 2024