Readit News
jeswin commented on Why LLMs can't really build software   zed.dev/blog/why-llms-can... · Posted by u/srid
suriya-ganesh · 8 days ago
I can confirm this is exactly how LLMs behave. I spent two hours trying to get an LLM to implement a file scan that skips a specific directory. I tried Claude Code, Gemini, and Cursor. All the agents debugged and wrote code that just doesn't make sense.

LLMs are really good at template tasks: writing tests, boilerplate, etc. But most of the time I'm not doing "implement this button". I'm doing "there's a logic mismatch in my expectation".

jeswin · 8 days ago
> Spent two hours trying to get an LLM to implement a file scan that skips a specific directory

There's a large variance in outcomes depending on the prompt and the process. I've gotten it to do things which are harder than a file scan with a skipped directory - without too much trouble.

Also:

> LLMs are really good at template tasks, writing tests, boilerplate etc.

If I have to stretch the definition of boilerplate to what's at the edge of a modern LLM's comprehension, I would say that 50% of software is some sort of boilerplate.
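
For reference, the task under discussion (a file scan that skips one directory) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `scan_files` and its parameters are hypothetical, and it skips every directory with the given name at any depth, which may or may not match what the original poster needed.

```python
import os
from typing import Iterator


def scan_files(root: str, skip_dir: str) -> Iterator[str]:
    """Yield file paths under root, never descending into directories named skip_dir."""
    for dirpath, dirnames, filenames in os.walk(root):
        # In top-down mode os.walk consults dirnames after each step, so
        # removing an entry in place stops the walk from descending into it.
        dirnames[:] = [d for d in dirnames if d != skip_dir]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

Calling `list(scan_files(".", "node_modules"))` would list every file under the current directory except those below any `node_modules` folder.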

jeswin commented on Why LLMs can't really build software   zed.dev/blog/why-llms-can... · Posted by u/srid
usrbinbash · 9 days ago
> We don't just keep adding more words to our context window, because it would drive us mad.

That, and we also don't only focus on the textual description of a problem when we encounter a problem. We don't see the debugger output and go "how do I make this bad output go away?!?". Oh, I am getting an authentication error. Well, maybe I should just delete the token check for that code path...problem solved?!

No. Problem very much not-solved. In fact, problem very much very bigger big problem now, and [Grug][1] find himself reaching for club again.

Software engineers are able to step back, think about the whole thing, and determine the root cause of a problem. I am getting an auth error...ok, what happens when the token is verified...oh, look, the problem is not the authentication at all...in fact there is no error! The test was simply bad and tried to call a higher privilege function as a lower privilege user. So, test needs to be fixed. And also, even though it isn't per se an error, the response for that function should maybe differentiate between "401 because you didn't authenticate" and "401 because your privileges are too low".

[1]: https://grugbrain.dev
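
One common way to make that distinction, following standard HTTP semantics, is to answer 401 when no valid credentials were presented and 403 when the caller is authenticated but lacks the privilege. A minimal sketch, assuming a hypothetical `User` type and `authorize` helper that are not taken from any codebase mentioned here:

```python
from dataclasses import dataclass
from http import HTTPStatus
from typing import Optional


@dataclass
class User:
    name: str
    role: str  # hypothetical roles for illustration, e.g. "user" or "admin"


def authorize(user: Optional[User], required_role: str) -> HTTPStatus:
    """Return the status a handler would send for this caller."""
    if user is None:
        # No valid credentials were presented: the caller is unauthenticated.
        return HTTPStatus.UNAUTHORIZED  # 401
    if user.role != required_role:
        # The credentials are fine, but the privileges are too low.
        return HTTPStatus.FORBIDDEN  # 403
    return HTTPStatus.OK
```

With this split, a test that calls an admin-only function as a regular user sees 403 rather than 401, which points at the test's privilege level instead of at the token check.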

jeswin · 8 days ago
> Oh, I am getting an authentication error. Well, maybe I should just delete the token check for that code path...problem solved?!

If this is how you think LLMs and coding agents go about writing code, you haven't been using the right tools. Such things happen, sure, but mostly they don't. Nobody is arguing that LLM-written code should be pushed directly into production, or that they'll solve every task.

LLMs are tools, and everyone eventually figures out a process that works best for them. For me, it was strong specs/docs, strict types, and lots of tests. And then of course the reviews if it's serious work.

jeswin commented on GPT-5: Overdue, overhyped and underwhelming. And that's not the worst of it   garymarcus.substack.com/p... · Posted by u/kgwgk
jeswin · 14 days ago
The author seems to be more about self-promotion.

From the article: "that many online dubbed it “Gary Marcus Day” for proving your consistent criticism", "Even my anti-fan club (“Gary haters” in modern parlance)", "Tweets like “The saddest thing in my day is that @garymarcus is right”", and his bio - "known as a leading voice in AI".

Looking over his articles, I don't see anything interesting.

jeswin commented on Ana Marie Cox on the Shaky Foundation of Substack as a Business   newsletter.anamariecox.co... · Posted by u/Bogdanp
jeswin · 21 days ago
> It’s as unstable as a SpaceX launch

Hate makes people blind. Starship is failing by design - that's just how they're choosing to develop it. The chopsticks video that we saw earlier was very nearly science fiction. And as far as regular launches go, SpaceX has done more successful launches than any company or nation ever.

But more generally, this idea of abandoning an app (or product) the moment you encounter people who disagree with you is disheartening.

jeswin commented on 6 weeks of Claude Code   blog.puzzmo.com/posts/202... · Posted by u/mike1o1
unshavedyak · 21 days ago
> 2. Start with a good spec. Spend enough time on the spec. You can get AI to write most of the spec for you, if you provide a good outline.

Curious how you outline the spec, concretely. A sister markdown document? How detailed is it? etc.

> 3. Make sure you have tests from the beginning. This is the most important part. Tests (along with good specs) are how an AI agent can recurse into a good solution. TDD is back.

Ironically I've been struggling with this. For best results I've found Claude does best with a test hook, but then Claude loses the ability to write tests before code works to validate bugs/assumptions; it just starts auto-fixing things and can get a bit wonky.

It helps immensely to ensure it doesn't forget anything or abandon anything, but it's equally harmful at certain design/prototype stages. I've taken to having a flag where I can enable/disable the test behavior lol.

jeswin · 21 days ago
> Curious how you outline the spec, concretely. A sister markdown document? How detailed is it? etc.

Yes. I write the outline in markdown. And then get AI to flesh it out. Then I generate a project structure, with stubbed API signatures. Then I keep refining until I've achieved a good level of detail - including full API signatures and database schemas.

> Ironically I've been struggling with this. For best results I've found Claude does best with a test hook, but then Claude loses the ability to write tests before code works to validate bugs/assumptions; it just starts auto-fixing things and can get a bit wonky.

I generate a somewhat basic prototype first, at which point I have a good spec, a good project structure, and API and DB schemas. Then I continuously refine the tests and code. Like I was saying, types and linting are also very helpful.
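
To illustrate what "stubbed API signatures" plus tests-first can look like, here is a minimal sketch. Everything in it (`Role`, `RoleStore`, `check_create_then_get`) is a hypothetical example made up for illustration; it is not the structure of any project mentioned in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    id: str
    name: str


class RoleStore:
    """Stubbed API: signatures and types are fixed before any logic exists."""

    def create_role(self, name: str) -> Role:
        raise NotImplementedError

    def get_role(self, role_id: str) -> Role:
        raise NotImplementedError


def check_create_then_get(store: RoleStore) -> None:
    """A test written against the stub; it fails until the stub is implemented."""
    created = store.create_role("editor")
    assert store.get_role(created.id) == created
```

Run against the bare `RoleStore`, the check raises `NotImplementedError`. The signatures and the failing check together give an agent (or a person) a concrete target to refine toward.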

jeswin commented on 6 weeks of Claude Code   blog.puzzmo.com/posts/202... · Posted by u/mike1o1
jeswin · 21 days ago
Claude Code is ahead of anything else, in a very noticeable way. (I've been writing my own CLI tooling for AI codegen since 2023 - and in that journey I've tried most of the options out there. It has been a big part of my work - so that's how I know.)

I agree with many things that the author is doing:

1. Monorepos can save time

2. Start with a good spec. Spend enough time on the spec. You can get AI to write most of the spec for you, if you provide a good outline.

3. Make sure you have tests from the beginning. This is the most important part. Tests (along with good specs) are how an AI agent can recurse into a good solution. TDD is back.

4. Types help (a lot!). Linters help as well. These are guard rails.

5. Put external documentation inside project docs, for example in docs/external-deps.

6. And finally, like every tool it takes time to figure out a technique that works best for you. It's arguably easier than it was (especially with Claude Code), but there's still stuff to learn. Everyone I know has a slightly different workflow - so it's a bit like coding.

I vibe coded quite a lot this week. One result is Permiso [1] - a super simple GraphQL RBAC server. It's nowhere close to being well tested and reviewed, but it can be quite useful already if you want something simple (and can wait until it's reviewed).

[1]: https://github.com/codespin-ai/permiso

jeswin commented on iPhone 16 cameras vs. traditional digital cameras   candid9.com/phone-camera/... · Posted by u/sergiotapia
jauntywundrkind · 25 days ago
Viltrox, Sirui, Sony themselves, and Samyang have all kicked out really nice 85mm fast primes. $600 down to $400, listed in decreasing weight order (down to 270g!). Yes, whatever you have: it's a massive amount of gear to carry compared to a phone. But what results!

The past 2-4 years have been amazing for lenses: Sony's willingness to let other people make lenses has been an amazing win for photography.

jeswin · 25 days ago
What has changed in the last four years is that Chinese and Korean lens makers have caught up in a big way, and are now producing excellent optics at a fraction of the price with AF and weather sealing (as of now, primes only). For example, the Viltrox Lab and Pro series, or the Samyang 135/1.8. The other Chinese manufacturers are a cut below.

Also, Sigma and Tamron (both Japanese) are putting out more high-quality lenses than a decade back, with optical quality rivaling Sony's own G Master series and the Zeissen.

jeswin commented on iPhone 16 cameras vs. traditional digital cameras   candid9.com/phone-camera/... · Posted by u/sergiotapia
SoftTalker · 25 days ago
Can you really have a 70mm focal length on a phone that is less than 10mm thick? I thought it was simulated by cropping the image from the actual very short focal length.

jeswin · 25 days ago
Yes, periscope lenses are fairly common on phones. 10x "optical zoom".

jeswin commented on Claude Code weekly rate limits    · Posted by u/thebestmoshe
jeswin · a month ago
I am a Max 20x subscriber, and I'm not unhappy that Anthropic is putting this in place.

Claude is vital to me and I want it to be a sustainable business. I won't hit these limits myself, and I'm saving many times what I would have spent in API costs - easily among the best money I've ever spent.

I'm middle-aged, spending significant time on a hobby project which may or may not have commercial goals (undecided). It required long hours even with AI, but with Claude Code I am spending more time with family and on sports. If anyone from Anthropic is reading this, I wanted to say thanks.

jeswin commented on How to make websites that will require lots of your time and energy   blog.jim-nielsen.com/2025... · Posted by u/OuterVale
xg15 · a month ago
OK, but that's essentially "argument by authority" + "everyone is doing it".

They certainly have their reasons and they definitely aren't stupid, but it would be more useful to know what those reasons are and in what scope they are applicable.

jeswin · a month ago
Why types are good (even essential) for large projects has been documented quite extensively. The internet pushed JS to the forefront, and as projects increased in scope and ambition, types became unavoidable.

More recently, see how there's a strong push towards types in Python. AI/ML is to Python what the internet was for JS. And types are here.
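
A small sketch of the kind of mistake type hints surface in Python. The function names and data here are hypothetical, and the checker behavior described in the comment is what a static checker such as mypy would typically report, not something shown in this thread.

```python
from typing import Dict, Optional


def find_user_email(users: Dict[str, str], name: str) -> Optional[str]:
    """Return the email for name, or None when the user is unknown."""
    return users.get(name)


def email_domain(email: str) -> str:
    """Return the part after the '@'."""
    return email.split("@", 1)[1]


users = {"alice": "alice@example.com"}
email = find_user_email(users, "alice")
# Without this None check, a static type checker would flag the call below,
# because find_user_email may return None and email_domain expects a str.
if email is not None:
    print(email_domain(email))
```

In an untyped codebase the same slip only shows up at runtime, on the first lookup that misses.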

u/jeswin

Karma: 9409
Cake day: December 29, 2008
About
Available for consulting (Architecture, Training, .Net/C#/F#, NodeJS/React/TypeScript). Email me at: jeswinpk@agilehead.com

Projects I maintain:

https://github.com/codespin-ai/codespin-chrome-extension/ - Chrome Extension to enable editing your source code files with ChatGPT/Claude

https://github.com/codespin-ai/codespin-cli - GPT Code Generation CLI Tools

https://github.com/webjsx/webjsx - NEW! A library for creating Web Components with JSX

https://bashojs.org - Lazy JavaScript Evaluator for building Shell pipelines.

github: https://github.com/jeswin
