Does telling the AI to "just be correct" actually work? I have no idea after reading this article, because it gives no details at all about what changed: the type of prompts, etc.
So: yes, technically possible, but impossible by accident. Furthermore, when you make this argument you reveal that you don't understand how these models work. They do not simply compress all the data they were trained on into a tiny storable version. They are effectively matrices of weights: multiplying them against an input lets the model predict the most likely next token (read: 2-3 Unicode characters).
So the model does not "contain" code. It "contains" a way of doing calculations for predicting what text comes next.
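To make that concrete, here is a toy sketch of the "calculation, not storage" idea, with a made-up three-word vocabulary and invented weights (nothing here corresponds to any real model):

```typescript
// Toy next-token predictor: a weight matrix maps a context vector to
// scores over a tiny vocabulary. The matrix "contains" no text at all,
// only numbers used in a calculation.
const vocab = ["foo", "bar", "baz"];

// Hypothetical learned weights: one row of numbers per vocabulary token.
const weights = [
  [0.2, 0.1], // "foo"
  [0.9, 0.4], // "bar"
  [0.1, 0.3], // "baz"
];

// The current context, already encoded as a numeric vector.
const context = [1.0, 0.5];

// Score each candidate token with a dot product (the matrix math).
const logits = weights.map(row =>
  row.reduce((sum, w, i) => sum + w * context[i], 0)
);

// Softmax turns scores into probabilities; pick the most likely token.
const exps = logits.map(Math.exp);
const total = exps.reduce((a, b) => a + b, 0);
const probs = exps.map(e => e / total);
const next = vocab[probs.indexOf(Math.max(...probs))];

console.log(next); // "bar" has the highest dot product with this context
```

Real models do this with billions of weights and learned embeddings, but the mechanism is the same shape: arithmetic that produces a probability distribution over next tokens, not a lookup into stored documents.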
Finally, let's say it is possible that the model spits out not entire works, but a handful of lines of code that appear in some codebase.
This does not constitute copyright infringement, as the lines in question a) represent a tiny portion of the whole work (and copyright only protects against the reduplication of whole works or significant portions of a work), and b) there are a limited number of ways to accomplish a certain function, and it is not only possible but inevitable that two devs working independently could arrive at the same implementation. Therefore using an identical implementation of a part of a work (which is what this case would be) is no more illegal than the use of a certain chord progression or melodic phrasing or drum rhythm. Courts have ruled on this thoroughly.
Because sometimes it can't trace down all the data paths, and by the time it does, its context window is running out.
That seems to be the biggest issue I see for my daily use anyways
They were slower than coding by hand, if you wanted to keep quality. Some were almost as quick as copy-pasting from the code just above the generated one, but their quality was worse. They even kept some bugs in the code during their reviews.
So the "different world" is probably about what counts as an acceptable level of quality. I know a lot of coders who don’t give a shit whether what they’re doing makes sense, or what their bad solution will cause in the long run. They ignore everything except the “done” state next to their tasks in Jira. They will never solve complex bugs; they simply don’t care enough. At a lot of places, they are the majority. For them, an LLM can be an improvement.
Claude Code the other day wrote a test for me that mocked out everything from the live code. Everything was green, everything was good. On paper. A lot of people simply wouldn’t care to review it properly. That thing can generate a few thousand lines of semi-usable code per hour. It’s not built to review that code properly. Serena MCP, for example, is specifically built not to review what it does. Its creators state as much.
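As an illustration of the failure mode described above (function names and numbers invented here, not taken from the actual generated test), a test that mocks out the live code goes green no matter how broken the real implementation is:

```typescript
// The real function under test -- deliberately buggy for illustration.
function applyDiscount(price: number, percent: number): number {
  return price + percent; // bug: should be price * (1 - percent / 100)
}

// A "test" in the style described: the function itself is mocked out,
// so the buggy live code is never actually exercised.
function testApplyDiscountWithMock(): boolean {
  const mockApplyDiscount = (_price: number, _percent: number) => 90;
  const result = mockApplyDiscount(100, 10);
  return result === 90; // green, but proves nothing about applyDiscount
}

// The mocked test passes; a real call to the live code shows the bug.
console.log(testApplyDiscountWithMock()); // true
console.log(applyDiscount(100, 10));      // 110, not the expected 90
```

A quick skim of the test file looks fine, which is exactly why this class of generated test survives careless review.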
I just recently got into JavaScript and TypeScript, and being able to ask the LLM how to do something and get some sources and linked examples is really nice.
However, using it in a language I'm much more familiar with really decreases its usefulness. Even more so when your code base is mid to large sized.
Personally, I wrote 200K lines of my B2B SaaS before agentic coding came around. With Sonnet 4 in Agent mode, I'd say I now write maybe 20% of the ongoing code from day to day, perhaps less. Interactive Sonnet in VS Code and GitHub Copilot Agents (autonomous agents running on GitHub's servers) do the other 80%. The more I document in Markdown, the higher that percentage becomes. I then carefully review and test.
That way it constantly yells at Sonnet 4 to get the code into at least a better state.
If anyone is curious, I have a massive ESLint config for TypeScript that really gets good code out of Sonnet.
But before I started doing this, the code it wrote was so buggy, and it was constantly trying to duplicate functions into separate files, etc.
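The commenter doesn't share their actual config, but as a rough sketch, a strict `.eslintrc.json` aimed at curbing exactly those habits (rule selection is my own guess) might include things like:

```json
{
  "parser": "@typescript-eslint/parser",
  "plugins": ["@typescript-eslint"],
  "rules": {
    "complexity": ["error", 10],
    "max-lines-per-function": ["error", 60],
    "no-duplicate-imports": "error",
    "@typescript-eslint/no-explicit-any": "error",
    "@typescript-eslint/no-unused-vars": "error"
  }
}
```

Run as part of the agent's feedback loop, lint errors like these force the model to simplify oversized functions and clean up the dead or duplicated code it tends to leave behind.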
I'm auto-completing crazy complex Rust match branches for record transformation. 30 lines of code, hitting dozens of fields and mutations, all with a single keystroke. And then it knows where my next edit will be.
I've been programming for decades and I love this. It's easily a 30-50% efficiency gain when plumbing fields or refactoring.
Really is game changing
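The Rust snippet itself isn't shown, but the same record-transformation shape in TypeScript (illustrative types and names, not the commenter's code) gives a feel for the kind of exhaustive, branch-per-variant code that autocomplete models fill in well:

```typescript
// Illustrative record types, standing in for the Rust enums described.
type InputRecord =
  | { kind: "user"; name: string }
  | { kind: "order"; total: number }
  | { kind: "refund"; total: number };

type OutputRecord = { label: string; amount: number };

// One branch per variant, each mapping and mutating fields -- the
// TypeScript analogue of a Rust match over an enum.
function transform(rec: InputRecord): OutputRecord {
  switch (rec.kind) {
    case "user":
      return { label: `user:${rec.name}`, amount: 0 };
    case "order":
      return { label: "order", amount: rec.total };
    case "refund":
      return { label: "refund", amount: -rec.total };
  }
}

console.log(transform({ kind: "refund", total: 25 }));
// { label: "refund", amount: -25 }
```

The repetitive, type-directed structure is exactly why a model can propose all the branches from a single keystroke: the compiler-visible types heavily constrain what each arm must contain.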
I've tried throwing LLMs at every part of the work I do, and they've been entirely useless at everything beyond explaining new libraries or acting as a search engine. Any time one tries to write any code at all, the result has been useless.
But then I see so many praising all it can do and how much work they get done with their agents and I'm just left confused.
It's so accurate it's scary. It shows me what side of the road I'm on and even how much I'm in the lane lol
> testing where they change one word here or there and compare
You can be that person. You can write that post. Nothing is stopping you.
You're missing the forest for the trees with your response