It seems to me that the ongoing “vibe coding” debate on HN, about whether AI coding agents are helpful or harmful, often overlooks one key point: the better you are as a coder, the less useful these agents tend to be.
Years ago, I was an amazing C++ dev. Later, I became a solid Python dev. These days, I run a small nonprofit in the digital rights space, where our stack is mostly JavaScript. I don’t code much anymore, and honestly, I’m mediocre at it now. For us, AI coding agents have been a revelation. We are a small team with limited resources, and agents let us move much faster, especially when it comes to cleaning up technical debt or handling simple, repetitive tasks.
That said, the main lesson I learned about vibe coding, or using AI for research and any other significant task, is that you must understand the domain better than the AI. If you don’t, you’re setting yourself up for failure.
I think it's the opposite: the better you are as a coder and the better you know your domain, the better you can use AI tools. Someone with no expertise is set up for failure.
Domain knowledge is key, I agree. I think we’re going to see waterfall development come back: domain experts, project managers and engineers gathering requirements and planning architecture up front in order to create the ultra-detailed spec needed for the agents to succeed. Between them they can write a CLAUDE.md file, a way-of-working doc (“You will do TDD, update the JIRA ticket like so”) and all the supporting context docs. There isn’t the same penalty for waterfall anymore, since course corrections aren’t as devastating or as wasteful of dev hours.
> That said, the main lesson I learned about vibe coding, or using AI for research and any other significant task, is that you must understand the domain better than the AI. If you don’t, you’re setting yourself up for failure.
Only if you fully trust it works. You can also first take time to learn about the domain and use AI to assist you in learning it.
This whole thing is really about assistance. I think in that sense, OpenAI's marketing was spot on. LLMs are good at assisting. Don't expect more of them.
The only "overlooked" part of "vibe coding" conversations on HN appear to be providing free training for these orgs that host the models, and the environmental and social impact of doing so.
Why do you think AI is producing low quality code? Before I started using AI, my code was often rejected as "didn't use thing X" or "didn't follow best practice Y" but ever since I started coding with AI, that was gone. Works especially well when the code is being reviewed by a person who is clueless about AI.
I've seen plenty of mediocre, even bad, code from real humans who didn't realise they were bad coders. Yet while LLMs often beat those specific humans, I also see LLMs making mistakes of their own.
LLMs are very useful tools, but if they were human, they'd be humans with sleep deprivation or early stage dementia of some kind.
That's a good point. The majority of human programmers aren't exactly super talented either, and due to AI many have now lost all hope of personal development, but that's their choice.
All code needs to be carefully scrutinized, AI generated or not. Maybe always prefix your prompt with: "Your operations team consists of a bunch of middle-aged, angry Unix fans, who will call you at 3:00AM if your service fails and belittle your abilities at the next incident review meeting."
As for the 100% vibe coders, please let them. There's plenty of good money to be made cleaning up after them and I do love refactoring, deleting code and implementing monitoring and logging.
My experience is that it does produce low quality code. Perhaps I tried some unusual stuff, but the other day I had a simple problem in JavaScript: you have an image, add a gray border around it, and keep it within given width/height limits. I figured that should be common enough for whatever OpenAI model I was using to generate usable code. It started with something straightforward: a good-looking Math.min operation, returning a promise of a canvas. I asked "why is this returning a promise?", and of course it answered that I was right and removed the promise. Then it turned out that if the image was larger than the limits, it would simply draw over the borders. Now I had to add that it should scale the image. It made an error in the scaling. IIRC, I had to tell it that both dimensions should be scaled identically, which led to a few more trials before it looked like decent code. It's a clueless junior that has been kicked out of boot camp.
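For reference, what I actually wanted boils down to something like this minimal sketch (my own rough version, not the model's output; the border width is arbitrary):

    // Draw an image with a gray border, scaling it down uniformly so that
    // border + image stay within maxWidth/maxHeight. Rough sketch only.
    function drawWithBorder(
      img: HTMLImageElement,
      maxWidth: number,
      maxHeight: number,
      border = 8, // arbitrary border width in px
    ): HTMLCanvasElement {
      // One scale factor for both dimensions, and never scale up.
      const scale = Math.min(
        1,
        (maxWidth - 2 * border) / img.naturalWidth,
        (maxHeight - 2 * border) / img.naturalHeight,
      );
      const w = Math.round(img.naturalWidth * scale);
      const h = Math.round(img.naturalHeight * scale);

      const canvas = document.createElement("canvas");
      canvas.width = w + 2 * border;
      canvas.height = h + 2 * border;

      const ctx = canvas.getContext("2d")!;
      ctx.fillStyle = "gray"; // the border is just the background
      ctx.fillRect(0, 0, canvas.width, canvas.height);
      ctx.drawImage(img, border, border, w, h); // image inset by the border
      return canvas;
    }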
What it does do perfectly: convert code from one language to another. It was a fairly complex bit, and the result was flawless.
> My experience is that it does produce low quality code. Perhaps I tried some unusual stuff, but the other day
I've seen both happen. Sometimes it produced fairly good quality code on small problem domains. Sometimes it produced bad code on small problem domains.
On big problem domains, the code is always somewhere between not that great and bad.
This is because AI-generated code will always be mediocre. There is so much poor-quality code in the training base that it will dilute the high-quality sources. AI is not a craftsman who builds on his own high standards, but rather a token grinder tool calibrated to your prompts. Even if you describe your problem (prompt) to a high standard, there is no way it can deliver a solution of the same standard. This will be true in 2025 as it was in 2023 and probably always will be.
The LLM vendors are all competing on how well their models can write code, and the way they're doing that is to refine their training data - they constantly find new ways to remove poor quality code from the training data and increase the volume of high quality code.
One way they do this is by using code that passes automated tests. That's a unique characteristic of code - you can't do that for regular prose, or legal analysis or whatever.
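Conceptually that filter step is tiny; something like this toy sketch (purely illustrative, with a made-up sample shape and "npm test" as a stand-in test runner, not any vendor's actual pipeline):

    import { execSync } from "node:child_process";

    interface CodeSample {
      dir: string;   // checkout containing the candidate code
      text: string;  // the code itself
    }

    // Keep only samples whose own test suite passes (hypothetical pipeline step).
    function filterByTests(samples: CodeSample[]): CodeSample[] {
      return samples.filter((sample) => {
        try {
          execSync("npm test", { cwd: sample.dir, stdio: "ignore" });
          return true;  // tests passed: candidate for the training set
        } catch {
          return false; // tests failed: drop it
        }
      });
    }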
"Even if you describe your problem (prompt) to a high standard, there is no way it can deliver a solution of the same standard."
My own experience doesn't match that. I can describe my problems to a good LLM and get back code that I would have been proud to have written myself.
100%. Generative AI is and always will be trained on more or less all the open source code out there, and by definition it will produce a mix of that training data, which will statistically be mediocre.
From using it, from working with people who use it, and from reviewing AI-generated code. The AI-generated code is typically on par with people like QA or sysadmins who do not code as their primary job.

Which is fine, as long as people are aware of it.
If someone on my team who was a software engineer and not very junior consistently produced such low quality code I would put them on a performance improvement plan.
I think the analogy still holds; fast fashion is generally of higher quality than "random-person-sewed-a-shirt-at-home". At least superficially.
What the vibe-coded software usually lacks is someone (man or machine) who thought long and hard about the purpose of the code, along with extended use and testing leading to improvements.
> Why do you think AI is producing low quality code?
I asked for a very, very simple bash script to test code generation abilities once.
The AI got it spectacularly wrong. So wrong that it was ridiculous.
Here's my reason why I think it produces low quality code: because it does.
I feel like sooner or later the standard for these types of discussions should become:
> "Here's a link to the commits in my GitHub repo, here's the exact prompts and models that were used that generated bad output. This exact example proves my point beyond a doubt."
I've used Claude Sonnet 4 and Google Gemini 2.5 Pro with RooCode to pretty good results otherwise: telling it what to look for in a codebase, having it come up with an implementation plan, chatting with it about the details until it fills out a proper plan (sometimes it catches edge cases that I haven't thought of). Around 100-200k tokens in, it can usually knock out a decent implementation for whatever I have in mind; throw in another 100-200k tokens and it has made the tests pass and also written new ones as needed.
Another 200k-400k goes on reading the codebase more in depth and doing refactoring (e.g. when writing Go it has a habit of doing a lot of stuff inline instead of looking at the utils package I have; that's less of an issue with Spring Boot Java apps, for example, because there the service pattern is pretty common in the code it's been trained on, I'd reckon). Adding something like an AI.md or a gradually updated CODEBASE.md, or indexing the whole codebase with an embedding model and storing it in Qdrant or something, can help save tokens there somewhat.
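The indexing idea is roughly this sketch (assuming an OpenAI embedding model and the Qdrant JS client; the collection name and chunk shape are made up):

    import OpenAI from "openai";
    import { QdrantClient } from "@qdrant/js-client-rest";

    const openai = new OpenAI();
    const qdrant = new QdrantClient({ url: "http://localhost:6333" });

    // Index source chunks so the agent can search instead of re-reading everything.
    async function indexChunks(chunks: { path: string; text: string }[]) {
      await qdrant.createCollection("codebase", {
        vectors: { size: 1536, distance: "Cosine" }, // text-embedding-3-small size
      });
      const res = await openai.embeddings.create({
        model: "text-embedding-3-small",
        input: chunks.map((c) => c.text),
      });
      await qdrant.upsert("codebase", {
        points: res.data.map((d, i) => ({
          id: i,
          vector: d.embedding,
          payload: { path: chunks[i].path, text: chunks[i].text },
        })),
      });
    }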
Sometimes a particular model does keep messing up; switching over to another and explaining what the first one was doing wrong can help get rid of that spiraling. Other times I just have to write all the code myself anyway because I have something different in mind, sometimes stopping it in the middle of editing a file and providing additional instructions. On average it's still faster than doing everything manually; it sometimes overlooks obvious things, but other times it finds edge cases or knows syntax I might not.
Obviously I use a far simpler workflow for one-off data transformations or knocking out Bash scripts etc. I could probably save a bunch of tokens if not for the RooCode system prompt; that thing was pretty long last I checked. It's especially good as a second set of eyes without human pleasantries and with a quick turnaround (before actual human code review, when working in a team). Not really nice for my wallet, but oh well.
Except when it does... CodeRabbit will review my PRs, sometimes complaining about code written by Copilot or Augment. They should just fight it out between themselves.
The 'vibe coding as fast fashion' analogy is interesting, and the article makes some valid points about code quality, maintenance burden, and the 'don't build it' philosophy. As an OSS maintainer, the 'who's going to maintain it?' question hits home.
However, I find the analogy a bit off the mark. LLMs are, fundamentally, tools. Their effectiveness and the quality of output depend on the user's expertise and domain knowledge. For prototyping, exploring ideas, or debugging (as the author's Docker Compose example illustrates), they can be incredibly powerful (not to mention time-savers).
The risk of producing bloated, unmaintainable code isn't new. LLMs might accelerate the production of it, but the ultimate responsibility for the quality and maintainability still rests with the person pressing the proverbial "ship" button. A skilled developer can use LLMs to quickly iterate on well-defined problems or discard flawed approaches early.
I do agree that we need clearer definitions of 'good quality' and 'maintainable' code, regardless of AI's role. The 'YMMV' factor is key here: it feels like the tool amplifies the user's capabilities, for better or worse.
I think it's revealing that a group that historically values making decisions based on verifiable and accurate information is now jumping to discredit "Vibe Coding" based on rumors that are easily disproven.
2. Replit "AI Deleted my Database" drama was caused by guy getting inaccurate AI support. All he needed to do was click a "Rollback Here" button to instantly recover all code and data. https://x.com/jasonlk/status/1946240562736365809
What does this eagerness to discredit vibe coding say about us?
It's just human nature. Technology advances exponentially and huge leaps forward are increasingly being compressed into very short amounts of time. Just like how sages were worried about memory after the invention of writing or luddites were worried about job displacement with the industrial revolution; it is natural. What would the Italian Renaissance artists think of Photoshop? People whose livelihood and identity are inherently tied to this discipline can't help but be dismissive. "Vibe Coding" will be "coding" or "programming" in the near future, likely in just a few years as tools evolve. Just like we use text editors and GUIs now to do computing instead of punch cards and a single CLI.
> Just like how sages were worried about memory after the invention of writing or luddites were worried about job displacement with the industrial revolution;
This is pretty exemplary of resolution loss in verbal reasoning.
Which sages were worried about memory after the invention of writing? If you're referring to the Phaedrus dialogue, the passage about the invention of writing is just a story that is used to move the conversation along. By that time writing had existed for thousands of years. The dialogue is about rhetoric and learning, not about writing itself.
Luddites are also often quoted by people with a lossy compression of history. The "King Ludd" graffiti and sabotage actions are highly correlated with the suppression of labor organizing and with severe criminal enforcement, including calls for the execution of laborers involved in it. It was a revolt against severe oppression, not some dumb way to try and stop progress.
I've been using Augment lately in dbt, PHP, and TypeScript codebases, and it has been producing production-level code and creating (and running!) tests automatically, and everything always goes through multiple levels of review before merge.
Posts like these will always be influenced by the author's experience with specific tools, in addition to what languages they use (as I can imagine lesser-used languages/frameworks will have less training material, thus lower quality output), as well as the choice of LLM that powers it behind the scenes.
I think it is a 'your mileage may vary' situation.
I knocked up a VSCode plugin in a few hours that extracted a JSON file from a zip file generated by the clang C++ static analyzer, parsed it into the VSCode Problems and diagnostics view, and provided quick fixes for simple things. All of this while hardly knowing or caring about JavaScript or how the npm toolchains work. I just kept taking screenshots of VSCode and saying what I wanted to go where and what the behaviour should be. When I was happy with certain aspects such as code parsing and patching, I got it to write unit tests to lock in certain behaviours. If anyone tells you LLMs are just garbage generators, they are not using the tools correctly.
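The core of that kind of plugin ends up surprisingly small, roughly like this sketch (the findings.json name and field names are illustrative; the real clang analyzer output differs):

    import * as vscode from "vscode";
    import JSZip from "jszip";
    import { promises as fs } from "node:fs";

    // Hypothetical shape of one analyzer finding; the real clang output differs.
    interface Finding { file: string; line: number; message: string; }

    const collection = vscode.languages.createDiagnosticCollection("clang-analyzer");

    // Read findings.json out of the analyzer's zip and surface them in the Problems view.
    async function loadReport(zipPath: string): Promise<void> {
      const zip = await JSZip.loadAsync(await fs.readFile(zipPath));
      const raw = await zip.file("findings.json")?.async("string");
      if (!raw) return;
      const findings: Finding[] = JSON.parse(raw);

      const byFile = new Map<string, vscode.Diagnostic[]>();
      for (const f of findings) {
        const range = new vscode.Range(f.line - 1, 0, f.line - 1, Number.MAX_SAFE_INTEGER);
        const diag = new vscode.Diagnostic(range, f.message, vscode.DiagnosticSeverity.Warning);
        const list = byFile.get(f.file) ?? [];
        list.push(diag);
        byFile.set(f.file, list);
      }
      for (const [file, diags] of byFile) {
        collection.set(vscode.Uri.file(file), diags);
      }
    }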
I feel like no one has read the article; instead everyone jumps to the defense and says "but my AI code is good!!!!". It's not even about the quality, and no one said that AI just produces garbage, especially with newer models.
> My take on AI for programming and "vibe coding" is that it will do to software engineering what fast fashion did to the clothing industry: flood the market with cheap, low-quality products and excessive waste.
I don't claim the AI code is good or bad. It generated me a tool I needed in a short time in a domain where I don't have enough knowledge. I have other high value work to do. I got the tool I wanted and moved on. I didn't even bother code reviewing it. It's like you ask an intern to knock you up a tool to do a job and instead of two days waiting you get it in a few hours.
Many of the statements in the article would have been correct in 2023. OP sounds like he is judging stuff he doesn't have a lot of experience with, a bit like my grandma when she used to tell me how bad hip hop music is.
This doesn't make any sense, a lot of the statements are especially true now, and would've been wrong in 2023. Your comment sounds like a weak defense instead, like saying "ahh you just don't get hip hop, you're too old" to your grandma.
On a plane from Sydney to Tokyo. Just "vibe coded" a tool we've needed for years in a matter of hours. Web workers, OPFS file management, e2e tests via playwright, Effect service encapsulation and mocking etc.
If you know the domain it's a 3-6X efficiency improvement.
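For anyone who hasn't touched OPFS, the file-management part is basically this (a minimal sketch; the file name and payload are made up):

    // Write and read a file in the Origin Private File System (browser API).
    async function saveAndLoad(): Promise<string> {
      const root = await navigator.storage.getDirectory();

      // Write
      const handle = await root.getFileHandle("results.json", { create: true });
      const writable = await handle.createWritable();
      await writable.write(JSON.stringify({ ok: true }));
      await writable.close();

      // Read back
      const file = await (await root.getFileHandle("results.json")).getFile();
      return file.text();
    }

(Inside a dedicated worker, createSyncAccessHandle is the faster option, but the async API above is enough as a sketch.)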
Amazing how well LLMs work on airplane wifi. Just text after all.