You know, sometimes I feel that all this discourse about AI for coding reflects the difference between software engineers and data scientists / machine learning engineers.
Both often work with unclear requirements, and sometimes face flaky bugs which are hard to fix, but in most cases SWEs create software that is expected to always behave in a certain way. It is reproducible, can pass tests, and the tooling is more established.
MLEs work with models that are stochastic in nature. The usual tests aren't about the model producing a certain output - they are about metrics: for example, that the model produces the correct output in 90% of cases (evaluation). The tooling isn't as developed as for SWE - it changes more often.
So, for MLEs, working with AI that isn't always reliable is the norm. They are accustomed to thinking in terms of probabilities, distributions, and acceptable levels of error. Applying this mindset to a coding assistant that might produce incorrect or unexpected code feels more natural. They might evaluate it like a model: "It gets the code right 80% of the time, saving me effort, and I can catch the 20%."
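To make that mindset concrete, this is roughly the shape of the harness an MLE reaches for - a minimal sketch, where the golden set, the threshold, and the generate() callable are all placeholders:

    # Minimal sketch of an eval harness: judge a generator by a pass rate, not a single output.
    golden_set = [
        ("reverse the string 'abc'", "cba"),   # hypothetical (prompt, expected) pairs
        ("what is 2 + 2", "4"),
    ]

    def evaluate(generate, cases, threshold=0.9):
        """Return accuracy over the golden set and whether it clears the acceptance bar."""
        correct = sum(1 for prompt, expected in cases if generate(prompt).strip() == expected)
        accuracy = correct / len(cases)
        return accuracy, accuracy >= threshold

    # Usage: plug in any callable, e.g. a wrapper around whatever model or assistant you use.
    accuracy, ok = evaluate(lambda prompt: "4", golden_set)
    print(f"{accuracy:.0%} correct, acceptable: {ok}")   # 50% correct, acceptable: False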
This matches my experience about 50% of the time as well. There are very good SWEs who know how to use ML in real systems, and then there are the others who believe through and through that it will replace well-understood systems developed by subdomain experts.
As a concrete example, when I worked at Amazon, there were several really good ML-based solutions for very real problems that didn't have classical approaches to lean on. Motion prediction from grid maps, for example, or classification from imagery or grid maps in general. Very useful and well integrated in a classical estimation and control pipeline to produce meaningful results.
OTOH, when I worked at a startup I won't name, I was berated over and over by a low-level manager for daring to question a learning-based approach for, of all things, estimating the orientation of a stationary plane over time. The entire control pipeline for the vehicle was being fed a flickering, jumping, ad hoc rotating estimate for a stationary object, because the entire team had never learned anything fundamental about mapping or filtering and was just assuming more data would solve the problem.
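For contrast, here's a minimal sketch of the classical direction (not that team's actual pipeline, and a real solution would be a complementary or Kalman filter): even a first-order low-pass on the yaw angle, with wraparound handled, kills most of the flicker for a stationary target.

    import math

    def smooth_yaw(prev, measured, alpha=0.1):
        """First-order low-pass on a yaw angle in radians, handling wraparound at +/- pi."""
        # Shortest angular difference between the new measurement and the current estimate.
        error = math.atan2(math.sin(measured - prev), math.cos(measured - prev))
        updated = prev + alpha * error  # small alpha: heavy damping of measurement noise
        return math.atan2(math.sin(updated), math.cos(updated))  # re-wrap to [-pi, pi]

    # Usage with made-up, flickering per-frame estimates of a stationary plane:
    yaw = 0.0
    for raw in [0.20, -0.30, 0.25, -0.15, 0.10]:
        yaw = smooth_yaw(yaw, raw)
    print(round(yaw, 3))   # stays near zero instead of jumping frame to frame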
This divide is very real, and I wish there was a way to tease it out better in interviewing.
The lack of knowledge of and application of fundamental engineering principles is a huge issue in the Software world. While it's great that people can pick up programming and learn and get a job, I have noticed this is often correlated with people not having a background in Hard Science and Mathematics. Even amongst CS graduates there are a lot who seem to get through without any mathematical or engineering maturity. Having a couple of people in a team with Physics, Mathematics, Mechanical or Electrical Engineering backgrounds, etc. can really be a big asset as they can fight back and offer a classical solution that will work nearly 100% of the time. Whereas someone who just did a Bootcamp and has no formal scientific training seems less likely to be able to grasp or have prior knowledge of classical approaches.
I think that this is one reason Software has such a flavor of the month approach to development.
Your stationary plane example highlights a divide I've seen across my work experience in different domains: teams defaulting to ML when fundamental engineering would work better.
I'm curious: do you think there's any amount of high-quality data that could make the learning-based approach viable for orientation estimation? Or would it always be solving the wrong problem, regardless of data volume and delivery speed?
My sense is that effective solutions need the right confluence of problem understanding, techniques, data, and infrastructure. Missing any one piece makes things suboptimal, though not necessarily unsolvable.
> So, for MLEs, working with AI that isn't always reliable is the norm. They are accustomed to thinking in terms of probabilities, distributions, and acceptable levels of error. Applying this mindset to a coding assistant that might produce incorrect or unexpected code feels more natural. They might evaluate it like a model: "It gets the code right 80% of the time, saving me effort, and I can catch the 20%."
And given the current climate, the MLEs feel empowered to force their mindset onto other groups where it doesn't fit. I once heard a senior architect at my company ranting about that after a meeting: my employer sells products where accuracy and correctness have always been a huge selling point, and the ML people (in a different office) didn't seem to get that and thought 80-90% correct should be good enough for customers.
I'm reminded of the arguments about whether a 1% fatality rate for a pandemic disease was small or large. 1 is the smallest integer, but 1% of 300 million is 3 million people.
This is where I find a disconnect between an ML team and the product team so broken. Same for SE, to be fair.
Accuracy rates, F1, anything, they're all just rough guides. The company cares about making money and some errors are much bigger than others.
We'd manually review changes for updates to our algos and models. Even with a golden set, breaking one case to fix five could be awesome or terrible.
I've given talks about this, my classic example is this somewhat imagined scenario (because it's unfair of me to accuse people of not checking at all):
It's 2015. You get an update to your classification model. Accuracy rates go up on a classic dataset, hooray! Let's deploy.
Your boss's boss's boss gets a call at 2am because you're in the news.
https://www.bbc.co.uk/news/technology-33347866
Ah. Turns out classification of types of dogs improved, but... that wasn't as important as this.
Issues and errors must be understood in the context of the business. If your ML team is chucking models over the fence, you're at best going to move slowly. At worst you're leaving yourself open to this kind of problem.
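A toy way to show what "some errors are much bigger than others" means in numbers (everything below is invented): weight each broken case by its business cost instead of just counting it.

    # Invented eval results: (case, old_model_correct, new_model_correct, cost_if_wrong)
    cases = [
        ("dog breed A", False, True, 1),
        ("dog breed B", False, True, 1),
        ("dog breed C", False, True, 1),
        ("dog breed D", False, True, 1),
        ("dog breed E", False, True, 1),
        ("person mislabeled as animal", True, False, 10_000),  # the 2am phone call
    ]

    def error_cost(results, use_new):
        """Sum the cost of every case the chosen model gets wrong."""
        return sum(cost for _, old_ok, new_ok, cost in results
                   if not (new_ok if use_new else old_ok))

    old_acc = sum(old for _, old, _, _ in cases) / len(cases)
    new_acc = sum(new for _, _, new, _ in cases) / len(cases)
    print(f"accuracy: {old_acc:.0%} -> {new_acc:.0%}")   # 17% -> 83%, looks like a win
    print(f"error cost: {error_cost(cases, False)} -> {error_cost(cases, True)}")  # 5 -> 10000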
You're talking about deterministic behavior vs. probabilistic behavior and yes some discourse lines up with what you describe.
I don't think it's the case with this article. It focuses on the meta-concerns of people doing software engineering and how AI fits into that. I think he hits it on the head when he talks about Program Entropy.
A huge part of building a software product is managing entropy. Specifically, how you can add more code and more people while maintaining a reasonable forward velocity. More specifically, you have to maintain the system so that all of those people understand how all the pieces fit together and how to add more of those pieces. Yes, I can see AI one day making this easier, but right now it oftentimes makes entropy worse.
There are so many use cases where 90% correct answers are absolutely not enough.
Nobody would have much of a problem with that if a flurry of people with vested interests weren't trying to convince us all that that is not the case, and that AI is good to go for absolutely everything.
The absurdity of this assumption is so outrageous that it becomes hard to even counter it with logic. It's just a belief-based narrative whose delivery has been highly successful so far in commanding insane investments, and in serving as a pretext for profit-oriented workforce optimizations.
Who is actually saying that AI is always 100 percent right?
There are disclaimers everywhere.
Sure, there are use cases AI can't handle, but that doesn't mean it is not massively valuable. There is not a single thing in the world that can handle all use cases.
SWEs use probability all the time. Rearchitect around that race condition or reduce its footprint? How long will this database call take, p99? A/B tests. Etc.
The bigger the system, the bigger the probability aspect gets too. What are the chances of losing all the data copies at the same time, what are the chances of all slots being full and the new connection being dropped, what are the chances of data corruption from bitrot? You can mostly ignore that in toy examples, but at scale you just have chaos, and money you can throw at it to reduce it somewhat.
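Back-of-the-envelope version of the replica question, with invented failure rates and the usual (shaky) independence assumption:

    # Chance of losing all copies of an object in the same window, assuming independent failures.
    p_replica_lost = 1e-4      # invented: probability one replica dies within the window
    replicas = 3
    p_object_lost = p_replica_lost ** replicas          # 1e-12 per object per window

    # At scale the rare event stops being rare: many objects, many windows.
    trials = 10_000_000
    p_any_loss = 1 - (1 - p_object_lost) ** trials
    print(p_object_lost, p_any_loss)                    # ~1e-12, ~1e-05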
I’m currently getting my master's in AI (lots of ML) and as a SWE, it’s definitely a new muscle I’m growing. At the same time, I can think about MLE in isolation and how it fits within the larger discipline of SWE. How I can build robust pipelines, integrate models into applications, deploy models within larger clusters, etc. I think there are many individuals who are pure MLE and lack the SWE perspective. Most critically, lots of ML people in my program aren’t computer people. They are math people or scientists first. They can grok the ML but grokking SWE without computer affinity is difficult. I see true full-stack being an understanding of low-level systems, front/back-end architecture, deployment, and now MLE. Just need to find someone who will compensate me for bringing all that to the table. Most postings are still either for SWE or PhD in MLE. Give me money!! I know it all
Yea, the problem is that most people expect things to be absolute and correct outside of engineering.
I love the gray areas and probabilities and creativity of software...but not everyone does.
So the real danger is in everyone assuming the AI model is, must be, and always will be correct. They misunderstand the tool they are using (or directing others to use).
Hmm. It's like autopilot on the Tesla. You aren't supposed to take your hands off the wheel. You're supposed to pay attention. But people use it incorrectly. If they get into an accident, then people want to blame the machine. It's not the machine's fault. It's the fault of the person who didn't read the instructions.
I often wonder if society will readjust its expectations of programs or even devices. Historically, machines of all kinds were difficult to design and manufacture.. the structure was hard set (hence the name), but at the same time, society fantasizes about adaptive machines: hyper adaptive, multipurpose, context-aware.. which, if pushed far enough, is not far from the noisy flexibility of ML.
Yup. All the people I've worked with through my career post 2020 (AI/ML types) have been AI first and AI native. They're the first users of cursor - a year before it went mainstream.
Sorry not sorry that the rest of the world has to look over their shoulders.
I agree; but perhaps it is also the difference between managers and SWEs? The former (SWE team leaders included) can see that engineers aren't perfect. The latter are often highly focused on determinism (this works/doesn't) and struggle with conflicting goals.
Through a career, SWEs start rigid and overly focused on the immediate problem and become flexible/error-tolerant[1] as they become system (mechanical or meat) managers. This maps to an observation that managers like AI solutions - because they compare favourably to the new hire - and because they have the context to make this observation.
[1] https://grugbrain.dev/#:~:text=grug%20note%20humourous%20gra...
I strongly agree with both the premise of the article, and most of the specific arguments brought forth. That said, I've also been noticing some positive aspects of using LLMs in my day-to-day. For context, I've been in the software trade for about three decades now.
One thing working with AI-generated code forces you to do is to read code -- development becomes more a series of code reviews than a first-principles creative journey. I think this can be seen as beneficial for solo developers, as in a way, it mimics / helps learn responsibilities only present in teams.
Another: it quickly becomes clear that working with an LLM requires the dev to have a clearly defined and well structured hierarchical understanding of the problem. Trying to one-shot something substantial usually leads to that something being your foot. Approaching the problem from a design side, writing a detailed spec, then implementing sections of it -- this helps to define boundaries and interfaces for the conceptual building blocks.
I have more observations, but attention is scarce, so -- to conclude. We can look at LLMs as a powerful accelerant, helping junior devs grow into senior roles. With some guidance, these tools make apparent the progression of lessons the more experienced of us took time to learn. I don't think it's all doom and gloom. AI won't replace developers, and while it's incredibly disruptive at the moment, I think it will settle into a place among other tools (perhaps on a shelf all of its own).
I appreciate your nuanced position. I believe that any developer who isn't reading more code than they are writing is doing it wrong. Reading code is central to growth as a software engineer. You can argue that you'll be reading more bland code when reviewing code generated with the aid of an LLM. I still think you are learning. I've read lots of LLM generated code and I routinely learn new things. Idioms that I wasn't familiar with, or library calls I didn't know existed.
I also think that LLMs are an even more powerful accelerant for senior developers. We can prompt better because we know what exists and what not to bother trying.
I don't think it is becoming a series of code reviews, more like having something do some prototyping for you. It is great for fixing the blank page problem, but not something you can review and commit as is.
In my experience code reviews involve a fair bit of back-and-forth, iterating with the would-be committer until the code 1) does what it's meant to and 2) does it in an acceptable manner. This parallels the common workflow of trying to get an LLM to produce something useable.
Problem is, paraphrasing Scott Kilmer, corporations are dead from the neck up. The conclusion for them was not that AI will help juniors; it's that they will not hire juniors and will ask seniors for the magic "10x" with the help of AI. Even some seniors are getting the boot, because AI.
Just look at recent news, layoff after layoff from Big Tech, Middle tech and small tech.
Even the phrase "world-changing" might be a bit too strong. AI is closer to this sentiment than it is to the singularity.
It's enabled some acceleration of product prototyping and it has democratized hardware design a little bit. Some little companies are building some buildings using 3D printing techniques.
Speaking as someone who owns and uses a 3D printer daily, I think the biggest impact it's had is that it's a fun hobby, which doesn't strike me as "world-changing."
It might not lead to the singularity, but for people who work in academia, in terms of setting and marking assignments and preparing lecture notes, for good or bad, AI has had an enormous impact.
You might argue that LLMs have simply exposed some systematic defects instead of improving anything, but the impact is there. Dozens of lecturing workflows that were pretty standard 2 years ago are no longer viable. This includes the entirety of online and remote education, which ironically dozens of universities started investing in after Covid, right around when ChatGPT launched. To put this impact in context, we are talking about the tertiary and secondary education sectors globally.
> This includes the entirety of online and remote education
I don't get this. Either you do graded home assignments, which the person takes without any examiner and which you could always cheat on, or you do live exams, and then people can't rely on AI. LLMs make it easier to cheat, but it's not a categorical difference.
I feel like my experience of university (90% of the classes had in-person exams, some had home projects for a portion of the final marks) is fundamentally different from what other people experienced and this is very confusing for me.
This is an easy quip to make, but it's also pretty wrong. 3D printing has been a massive breakthrough in many industries and fundamentally changed the status quo. Aerospace is a good example, much of what SpaceX and other younger upstarts in the space are doing would not be feasible without 3D printed parts. Nozzles, combustion chambers, turbopumps etc are all parts that are often printed.
I honestly don’t, although maybe that hype cycle was before my time.
But this seems an unfair comparison. For one, I think 3D printing made me better, not worse, at engineering (back when mechanical engineering was my jam), as it allowed me to prototype and make mistakes faster and cheaper. While it hasn’t replaced all manufacturing (or even come close), it plays an important role in design without atrophying the skills of the user.
Honestly, both are pretty good for prototyping. I haven't found AI helpful with big picture stuff (design) or nuts and bolts stuff (large refactorings), but it's good at some tedium that I definitely know how to do but guess that AI can type it in faster than I can. Similarly, given enough time I could probably manufacture my designs on a mill/lathe but there is something to be said for just letting the printer do it when plastic is good enough (100% of my projects; but obviously I select projects where 3D printing is going to work). Very similar technologies and there are productivity gains to be had. Did the world change because of either? Not really.
I find AI has the potential to do that (in my software development job): But so far I'm only using it occasionally, probably not as often as you used 3D printing.
Well yeah, the singularity isn't close by any measure.
But 3D printing and AI are on totally different trajectories.
I haven't heard of Mattel saying, "we're going to see in what places we can replace standard molding with 3d printing". It's never been considered a real replacement, but rather potentially a useful way to make PoCs, prototypes and mockups.
Yep, I think this further illustrates OP's point—hobbyists building low-stakes projects get enormous benefits from LLM tooling even while professionals working on high-stakes projects find that there are lots of places where they still need something else.
3d printing will slowly edge its way into more manufacturing. The humble stepping motor really is eating the world. 3dp is one manifestation of it!
Back to AI though.
I just checked the customer support page of a hyped AI app generator and it's what you expect: "doesn't work on complex project", "wastes all my tokens" and "how to get a refund".
These things are overpromising, and a future miracle is required to justify valuations. Maybe the miracle will come, maybe not.
I'm not sure why you continued using words when you summed up 3D printing with those four words. In the time it takes to print 1 object, you could have molded thousands of them. 3D printing has done a lot for manufacturing in terms of prototyping and making the first thing while improving flexibility for iterations. Using them for mass production is just not a sane concept.
> Remember when 3d printing was going to replace all manufacturing? Anybody?
Sure, but I'd argue the AIs are the new injection molding (as mentioned downthread) with the current batch being the equivalent of Bakelite.
Plus, who seriously said 3d printers were going to churn out Barbies by the millions? What I remember is people claiming they would be a viable source of one-off home production for whatever.
For small run manufacturing, 3d printing is absolutely killing it. The major limitation of 3d printing is that it will never be able to crank out thousands of parts per hour the way injection molding can, and that's OK. Creating an injection molded part requires a huge up front investment. If you're doing small runs of a part, 3d printing more than makes up for the slow marginal time by skipping the up front costs altogether.
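The break-even arithmetic is simple (numbers invented): the mold's tooling cost has to be amortized before its cheap per-part cost wins.

    # Invented costs: when does injection molding overtake printing?
    mold_tooling = 20_000.0   # one-off cost of cutting the mold
    mold_per_part = 0.50
    print_per_part = 5.00     # no tooling, but a higher marginal cost

    break_even_qty = mold_tooling / (print_per_part - mold_per_part)
    print(round(break_even_qty))   # ~4444 parts; below that, printing wins despite being slower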
No, because 3D printing was never going to replace all manufacturing. Anyone who said that didn't even have a basic understanding of manufacturing, and I don't recall any serious claims like that.
3D printing has definitely replaced some manufacturing, and it has had a huge effect on product design.
These anti-AI articles are getting more tedious than the vibe coding articles.
LLMs are amazing at writing code and terrible at owning it.
Every line you accept without understanding is borrowed comprehension, which you’ll repay during maintenance with high interest. It feels like free velocity. But it's probably more like tech debt at ~40% annual interest. As a tribe, we have to figure out how to use AI to automate typing and NOT thinking.
Or would be, if the LLM actually understood what it was writing, using the same definition of understanding that applies to human engineers.
Which it doesn't, and by its very MO, cannot.
So, every line from an LLM that is accepted without understanding really is nonexistent comprehension. It's a line of code, spat out by a stochastic model, and until some entity that can actually comprehend the codebase's context, systems and designs does so (and currently the only known entity that can do that is a human being), it remains un-comprehended.
This is a very good analogy. And this interest rate can probably be significantly reduced by applying TDD and reducing the size of isolated subsystems. That may start to look like microservices. I generally don’t like either for traditional development, but current LLMs make both easier and more useful.
And the “rule of three” basically ceases to be applicable between components — either the code has localized impact, or it is part of a rock-solid foundational library. Intermediate cases just explode the refactoring complexity.
> Input Risk. An LLM does not challenge a prompt which is leading...
(Emphasis mine)
This has been the biggest pain point for me, and the frustrating part is that you might not even realize you're leading it a particular way at all. I mean it makes sense with how LLMs work, but a single word used in a vague enough way is enough to skew the results in a bad direction, sometimes contrary to what you actually wanted to do, which can lead you down rabbit holes of wrongness. By the time you realize, you're deep in the sludge of haphazardly thrown-together code that sorta kinda barely works. Almost like human language is very vague and non-specific, which is why we invented formal languages with rules that allow for preciseness in the first place...
Anecdotally, I've felt my skills quickly regressing because of AI tooling. I had a moment where I'd reach out to it for every small task from laziness, but when I took a real step back I started realizing I'm not really even saving myself all that much time, and even worse is that I'm tiring myself out way quicker because I was reading through dozens or hundreds of lines of code, thinking about how the AI got it wrong, correcting it etc. I haven't measured, but I feel like in grand totality, I've wasted much more time than I potentially saved with AI tooling.
I think the true problem is that AI is genuinely useful for many tasks, but there are 2 camps of people using it. There are the people using it for complex tasks where small mistakes quickly add up, and then the other camp (in my experience mostly the managerial types) see it shit out 200 lines of code they don't understand, and in their mind this translates to a finished product because the TODO app that barely works is good enough for an "MVP" that they can point to and say "See, it can generate this, that means it can also do your job just as easily!".
To intercept the usual comments that are no doubt going to come flooding in about me using it wrong or trying the wrong model or whatever, please read through my old comment [1] for more context on my experience with these tools.
[1] https://news.ycombinator.com/item?id=44055448
From my experience so far it is helpful for me to get another opinion on how to solve a problem—and I do the work in the end. Or, I am extremely specific, and give it a relatively small problem to solve, and it solves it—writes the code for me—and then I code review it, and make changes to uphold my standards.
In other words, AI is my assistant, but it is MY responsibility to turn up quality, maintainable work.
However, to put things in perspective for the masses: just consider the humble calculator. It has ruined people’s ability to do mental math. AI is going to do that for writing and communication skills, problem solving skills, etc.
> From my experience so far it is helpful for me to get another opinion on how to solve a problem—and I do the work in the end.
I agree fully, I use it as a bouncing off point these days to verify ideas mostly.
The problem is, and I'm sure I'm not alone in this, management is breathing down my neck to use AI for fucking everything. Write the PR with AI, write the commit message with AI, write the code, the tests, use YOUR AI to parse MY AI's email that I didn't bother proofreading and has 4 logical inconsistencies in 1 sentence. Oh this simple feature that can easily be done for cheaper, quicker and easier without AI? Throw an AI at it! We need to sell AI! "You'll be left in the dust if you don't adopt AI now!"
It comes back to my point about there being 2 camps. The one camp actually uses AI, sees its strengths & weaknesses clear as day, and realizes it's not a panacea to be used for literally everything; the other is jumping headfirst into every piece of marketing slop they come across and buying into the false realities the AI companies are selling them on.
> but a single word used in a vague enough way is enough to skew the results in a bad direction
I'm glad I'm not the only one who feels this way. It seems like these models latch on to a particular keyword somewhere in my prompt chain and throw traditional logic out the window as they try to push me down more niche paths that don't even really solve the original problem. Which just leads to higher levels of frustration and unhappiness for the human involved.
> Anecdotally, I've felt my skills quickly regressing because of AI tooling
To combat this, I've been trying to use AI to solve problems that I normally would with StackOverflow results: for small, bite-sized and clearly-defined tasks. Instead of searching "how to do X?", I now ask the model the same question and use its answer as a guide to solving the problem instead of a canonical answer.
Definitely share your feeling that people move the goalposts from "AI can do it" to "well it would have been able to do it if you used model o2.7 in an IDE with RAG and also told it how to do it in the prompt" ...ok, at some point it's less value for the effort than writing the code myself, thanks
That said, AI does make some things easier today, like if you have an example to use for "make me a page like this but with data from x instead of y". Often it's faster than searching documentation, even with the caveat that it might hallucinate. And ofc it will probably improve over time.
The particular improvement I'd like to see is (along with in general doing things right) finding the simplest solution without constantly having to be told to do so. My experience is that the biggest drawback of letting chatgpt/claude/etc loose is it quickly churning out a bunch of garbage, never stopping to say this will be too complex to do anything with in the future. TFA claims only humans can resist entropy by understanding the overall design; again, idk if that will never improve, but it feels like the big problem right now.
What I've been doing when I want to avoid this "unexpected leading", is to tell the LLM to "Ask me 3 rounds of 5 clarifying questions each, first.". The first round usually exposes the main assumptions it's making, and from there we narrow down and clarify things.
I've read your comment about all the things you tried, and it seems you have much broader experience with LLMs than I do. But I didn't see this technique mentioned, so leaving this here in case it helps someone else :).
> it doesn't reason about ideas, diagrams, or requirements specifications. (...) How often have you witnessed an LLM reduce the complexity of a piece of code?
> Only humans can decrease or resist complexity.
It's funny how often there's a genuine concept behind posts like these, but then lots of specific claims are plainly false. This is trivial to do: ask for simpler code. I'm using that quite often to get a second opinion and get great results. If you don't query the model, you don't get any answer - neither complex nor simple. If you query with default options, it's still a choice, not something inherent to the idea of an LLM.
I'm also having a great time converting code into ideas and diagrams and vice versa. Why make the strong claims that people contradict in practice every day now?
You'd think the whole "LLMs can't reason in concepts" meme would've died already. LLMs are literally concepts incarnate, this has already been demonstrated experimentally in many ways, not limited to figuring out how to identify and suppress or amplify specific concepts during inference.
The article also repeats some weird arguments that are superficially true but don't stand up to scrutiny. That Naur thing, which is a meme at this point, is often repeated as somehow insightful in the real world - yet what's forgotten is another fundamental, practical rule of software engineering: any nontrivial program quickly exceeds anyone's ability to hold a full theory of it in their head; we almost never work with a proper program theory; programming languages, techniques, methodologies and tools all evolve towards enabling people to work better without understanding most of the code. We actually share the same limitations as LLMs here, we're just better at managing it because we don't have to wait for anyone to let us do another inference loop so we can take a different perspective.
A big problem I keep facing when reviewing junior engineers' code is not the code quality itself but the direction the solution went in. I'm not sure if LLMs are capable of replying to you with a question about why you want to do it that way (yes, like the famous stackoverflow answers).
Nothing fundamentally prevents an LLM from achieving this. You can ask an LLM to produce a PR, another LLM to review a PR, and another LLM to critique the review, then another LLM to question the original issue's validity, and so on...
The reason LLMs are such a big deal is that they are humanity's first tool that is general enough to support recursion (besides humans, of course). If you can use an LLM, there's like a 99% chance you can program another LLM to use an LLM in the same way as you:
People learn the hard way how to properly prompt an LLM agent product X to achieve results -> some company is going to encode these learnings in a system prompt -> we now get a new agent product Y that is capable of using X just like a human -> we no longer use X directly. Instead, we move up one level in the command chain, to use product Y instead. And this recursion goes on and on, until the world doesn't have any level left for us to go up to.
We are basically seeing this play out in realtime with coding agents in the past few months.
They are definitely capable. Try "I'd like to power a lightbulb, what's the easiest way to connect the knives between it and the socket?" - the response will start by saying it's a bad idea. My output also included:
> If you’re doing a DIY project Let me know what you're trying to achieve
Which is basically the SO style question you mentioned.
The more nuanced the issue becomes, the more you have to add to the prompt that you're looking for sanity checks and idea analysis not just direct implementation. But it's always possible.
You can ask it why, but if it provides the wrong approach, just ask it to make it what you want it to be. What is wrong with iteration?
I frequently have the LLM write a proposal.MD first and then iterate on that, then have it produce the full solution and iterate on that.
It will be interesting to see if it does the proposal like I had in mind, and many times it uses tech or ideas that I didn't know about myself, so I am constantly learning too.
Do you do it manually, or do you have an automated tool? (I am looking for the latter.)
I think you could make similar arguments about mapping technology like Google and Apple Maps -- that using them decreases people's skills in navigating the physical world, atrophying our sense of direction and geography.
And actually, that's not wrong. People really do often struggle to navigate these days if they don't have the crutch of something like Google Maps. It really has changed our relationship to the physical world in many ways.
But also, a lot of people weren't especially good at navigation before? The overall average ability of people being able to get from Point A to Point B safely and reliably, especially in areas they are unfamiliar with, has certainly increased dramatically. And a small subset of people who are naturally skilled at geography and navigation have seen their own abilities complemented, not replaced, by things like Google Maps.
I think AI will end up being similar, on a larger scale. Yes, there are definitely some trade-offs, and some skills and abilities will decrease, but also many more people will be able to do work they previously couldn't, and a small number of people will get even better at what they do.
> I think you could make similar arguments about mapping technology like Google and Apple Maps
The problem is that mapping software is reliable and doesn't spit out a result of what is essentially a random number generator. You can rely on its output, the same way you can rely on a calculator. Not always, mind you, because mapping the entire globe is a massively complex task with countless caveats and edge cases, but compared to LLM output? Even with a temperature setting of 0 with the same prompt regenerated multiple times, you'll be getting vastly different output.
Also, since LLMs cover a much more broad swathe of concepts, people are going to be using these instead of their brains in a lot of situations where they really shouldn't. Even with maps, there are people out there that will drive into a lake because Google Maps told them that's where the street was, I can't even fathom the type of shit that's going to happen from people blindly trusting LLM output and supplanting all their thinking with LLM usage.
> The problem is that mapping software is reliable and doesn't spit out a result of what is essentially a random number generator.
Not really.
I am not good at navigation yet love to walk around, so I use a set of maps apps a lot.
Google Maps is not reliable if you expect optimal routes, and its accuracy falls sharply if you're not traveling by car. Even then, bus lanes, priority lanes, time-limited areas etc. will be a bloodbath if you expect Maps to understand them.
Mapping itself will often be inaccurate in any town that isn't frozen in time for decades, place names are often wrong, and it has no concept of verticality/3D space, short of switching to limited experimental views.
Paid dedicated map apps will in general work a lot better (I'm thinking hiking maps etc.)
All to say, I'd mostly agree with parent on how fuzzier Maps are.
As someone that has been sent onto barely usable mountain roads, into military compounds, or onto dried river beds multiple times on a couple of Mediterranean islands, I beg to differ with the "mapping software is reliable" assertion.
>The problem is that mapping software is reliable and doesn't spit out a result of what is essentially a random number generator.
Actually, TSP is NP-hard (ie, at best, you never know whether you've been given the optimal route) in the general case, and Google maps might even give suboptimal routes intentionally sometimes, we don't know.
The problems you're describing are problems with people and they apply to every technology ever. Eg, people crash cars, blow up their houses by leaving the stove on, etc.
> I think you could make similar arguments about mapping technology like Google and Apple Maps -- that using them decreases people's skills in navigating the physical world, atrophying our sense of direction and geography.
> And actually, that's not wrong. People really do often struggle to navigate these days if they don't have the crutch of something like Google Maps. It really has changed our relationship to the physical world in many ways.
Entirely anecdotal but I have found the opposite. With this mapping software I can go walk in a random direction and confidently course correct as and when I need to, and once I’ve walked somewhere the path sticks in my memory very well.
Also worth mentioning that tools have stable output. An LLM is not a tool in that sense – it’s not reproducible. Changing the model, retraining, input phrasing etc. can dramatically change the output.
The best tools are transparent. They are efficient, fast and reliable, yes, but they’re also honest about what they do! You can do everything manually if you want, no magic, no hidden internal state, and with internal parts that can be broken up and tested in isolation.
With LLMs even the simple act of comparing them side by side (to decide which to use) is probabilistic and ultimately based partly on feelings. Perhaps it comes with the territory, but this makes me extremely reluctant to integrate it into engineering workflows. Even if they had amazing abilities, they lower the bar significantly from a process perspective.
I think you’re both right, but the key is walking vs driving. Walking gives you time to look around, GPS reduces stress, and typically you’re walking in an urban location with landmarks.
Driving still requires careful attention to other drivers, the world goes by rapidly, and most roads look like other roads.
Me too. It's not like I ever used to have a map with me when I was in a city I thought I knew.
With a map in my pocket I started to use it, and I memorized the city much better. My model of the city is much stronger. For example, I know the approximate directions of neighborhoods I've never even visited.
> The overall average ability of people being able to get from Point A to Point B safely and reliably, especially in areas they are unfamiliar with, has certainly increased dramatically.
It's probably better to look at a group from the outside. Every company of any size seems to accumulate at least some people that could be replaced with a small shell script. Where I work there are a few people that seem so questionable at their job (even though most are good) that I wonder how they keep their positions. I'd rather work with AI for the rest of my life than have to deal with them again.
Whether I'm in the 30% or not isn't the core issue, is it? The point is about the impact AI will have based on existing work ethics. Many of us have seen colleagues who barely contribute, and AI is a tool that will either be leveraged for growth by the engaged or used as another crutch by those already disengaged.
It's almost like people outsourcing their job to AI are asking to get fired, not only by proving that a computer program can do their job better, but even paving the way for it!
Your job will disappear even faster with your head that deep in the sand. At least learning the new tools you can carve out a new role/career for yourself.
FSD in principle could be, but the overpromised misnomer we have right now isn't. Being better than a drunk driver isn't good enough when it's also worse than a sober driver. The stats of crashes per mile are skewed by FSD being mainly used in easy conditions, and not for all driving.
There are real safety improvements from ADAS. For safety you only need crash avoidance, not a full-time chauffeur.
Both often work with unclear requirements, and sometimes may face floating bugs which are hard to fix, but in most cases, SWE create software that is expected to always behave in a certain way. It is reproducible, can pass tests, and the tooling is more established.
MLE work with models that are stochastic in nature. The usual tests aren't about models producing a certain output - they are about metrics, that, for example, the models produce the correct output in 90% cases (evaluation). The tooling isn't as developed as for SWE - it changes more often.
So, for MLE, working with AI that isn't always reliable, is a norm. They are accustomed to thinking in terms of probabilities, distributions, and acceptable levels of error. Applying this mindset to a coding assistant that might produce incorrect or unexpected code feels more natural. They might evaluate it like a model: "It gets the code right 80% of the time, saving me effort, and I can catch the 20%."
As a concrete example, when I worked at Amazon, there were several really good ML-based solutions for very real problems that didn't have classical approaches to lean on. Motion prediction from grid maps, for example, or classification from imagery or grid maps in general. Very useful and well integrated in a classical estimation and control pipeline to produce meaningful results.
OTOH, when I worked at a startup I won't name, I was berated over and over by a low-level manager for daring to question a learning-based approach for, of all things, estimating orientation of a stationary plane over time. The entire control pipeline for the vehicle was being fed flickering, jumping, adhoc rotating estimate for a stationary object because the entire team had never learned anything fundamental about mapping or filtering, and was just assuming more data would solve the problem.
This divide is very real, and I wish there was a way to tease it out better in interviewing.
I think that this is one reason Software has such a flavor of the month approach to development.
I'm curious: do you think there's any amount of high-quality data that could make the learning-based approach viable for orientation estimation? Or would it always be solving the wrong problem, regardless of data volume and delivery speed?
My sense is that effective solutions need the right confluence of problem understanding, techniques, data, and infrastructure. Missing any one piece makes things suboptimal, though not necessarily unsolvable.
And given the current climate, the MLE's feel empowered for force their mindset onto others groups where it doesn't fit. I once heard a senior architect at my company ranting about that after a meeting: my employer sells products where accuracy and correctness have always been a huge selling point, and the ML people (in a different office) didn't seem to get that and thought 80-90% correct should be good enough for customers.
I'm reminded of the arguments about whether a 1% fatality rate for a pandemic disease was small or large. 1 is the smallest integer, but 1% of 300 million is 3 million people.
Accuracy rates, F1, anything, they're all just rough guides. The company cares about making money and some errors are much bigger than others.
We'd manually review changes for updates to our algos and models. Even with a golden set, breaking one case to fix five could be awesome or terrible.
I've given talks about this, my classic example is this somewhat imagined scenario (because it's unfair of me to accuse people of not checking at all):
It's 2015. You get an update to your classification model. Accuracy rates go up on a classic dataset, hooray! Let's deploy.
Your boss's, boss's, boss gets a call at 2am because you're in the news.
https://www.bbc.co.uk/news/technology-33347866
Ah. Turns out improving classifications of types of dogs improved but... that wasn't as important as this.
Issues and errors must be understood in context of the business. If your ML team is chucking models over the fence you're going to at best move slowly. At worst you're leaving yourself open to this kind of problem.
Dead Comment
I don't think it's the case with this article. It focuses on the meta-concerns of people doing software engineering and how AI fits into that. I think he hits it on the head when he talks about Program Entropy.
A huge part of building a software product is managing entropy. Specifically, how you can add more code and more people while maintaining a reasonable forward velocity. More specifically, you have to maintain a system so you make it so all of those people understand how all the pieces fit together and how to add more of those pieces. Yes, I can see AI one day making this easier but right now, it oftentimes makes entropy worse.
There are disclaimers everywhere.
Sure there are usecases AI can't handle, but doesn't mean it is not massively valuable. There is not single thing in the World that can handle all usecases.
I love the gray areas and probabilities and creativity of software...but not everyone does.
So the real danger is in everyone assuming the AI model is, must be, and always will be correct. They misunderstand the tool they are using (or directing others to use).
Hmm. It's like autopilot on the Tesla. You aren't supposed to take your hands off the wheel. You're supposed to pay attention. But people use it incorrectly. If they get into an accident, then people want to blame the machine. It's not. It's the fault of person who didn't read the instructions.
Sorry not sorry that the rest of the world has to look over their shoulders.
Through a career SWEs start rigid and overly focused on the immediate problem and become flexible/error-tolerant[1] as they become system (mechanical or meat) managers. this maps to an observation that managers like AI solutions - because they compare favourably to the new hire - and because they have the context to make this observation.
[1] https://grugbrain.dev/#:~:text=grug%20note%20humourous%20gra...
One thing working with AI-generated code forces you to do is to read code -- development becomes more a series of code reviews than a first-principles creative journey. I think this can be seen as beneficial for solo developers, as in a way, it mimics / helps learn responsibilities only present in teams.
Another: it quickly becomes clear that working with an LLM requires the dev to have a clearly defined and well structured hierarchical understanding of the problem. Trying to one-shot something substantial usually leads to that something being your foot. Approaching the problem from a design side, writing a detailed spec, then implementing sections of it -- this helps to define boundaries and interfaces for the conceptual building blocks.
I have more observations, but attention is scarce, so -- to conclude. We can look at LLMs as a powerful accelerant, helping junior devs grow into senior roles. With some guidance, these tools make apparent the progression of lessons the more experienced of us took time to learn. I don't think it's all doom and gloom. AI won't replace developers, and while it's incredibly disruptive at the moment, I think it will settle into a place among other tools (perhaps on a shelf all of its own).
I also think that LLMs are an even more powerful accelerant for senior developers. We can prompt better because we know what exists and what to not bother trying.
Just look at recent news, layoff after layoff from Big Tech, Middle tech and small tech.
AI is closer to this sentiment than it is to the singularity.
It's enabled some acceleration of product prototyping and it has democratized hardware design a little bit. Some little companies are building some buildings using 3D printing techniques.
Speaking as someone who owns and uses a 3D printer daily, I think the biggest impact it's had is that it's a fun hobby, which doesn't strike me as "world-changing."
You might argue that LLMs have simply exposed some systematic defects instead of improving anything, but the impact is there. Dozens of lecturing workflows that were pretty standard 2 years ago are no longer viable. This includes the entirety of online and remote education which ironically dozens of universities started investing in after Covid, right around when chatgpt launched. To put this impact in context, we are talking about the tertiary and secondary sector globally.
I don't get this. Either you do graded home assignments which the person takes without any examiner, which you could always cheat on, or you do live exams and then people can't rely on AI . LLMs make it easier to cheat, but it's not a categorical difference.
I feel like my experience of university (90% of the classes had in-person exams, some had home projects for a portion of the final marks) is fundamentally different from what other people experienced and this is very confusing for me.
There will be the before and after AI eras in academia.
FWIW I think OP came up with an excellent analogy.
Dead Comment
But this seems an unfair comparison. For one, I think 3D printing made me better, not worse, at engineering (back when mechanical engineering was my jam), as it allowed me to prototype and make mistakes faster and cheaper. While it hasn’t replaced all manufacturing (or even come close), it plays an important role in design without atrophying the skills of the user.
But 3D printing and AI are on totally different trajectories.
I haven't heard of Mattel saying, "we're going to see in what places we can replace standard molding with 3d printing". It's never been considered a real replacement, but rather potentially a useful way to make PoCs, prototypes and mockups.
Back to AI though.
I just checked the customer support page of a hyped AI app generator and its what you expect: "doesn't work on complex project" "wastes all my tokens" and "how to get a refund"
These things are over promising and a future miracle is required to justify valuations. Maybe the miracle will come maybe not.
I'm not sure why you continued using words when you summed up 3D printing with those four words. In the time it takes to print 1 object, you could have molded thousands of them. 3D printing has done a lot for manufacturing in terms of prototyping and making the first thing while improving flexibility for iterations. Using them for mass production is just not a sane concept.
Sure, but I'd argue the AIs are the new injection molding (as mentioned downthread) with the current batch being the equivalent of Bakelite.
Plus, who seriously said 3d printers were going to churn out Barbies by the millions? What I remember is people claiming they would be a viable source of one-off home production for whatever.
Dead Comment
3D printing has definitely replaced some manufacturing, and it has had a huge effect on product design.
These anti-AI articles are getting more tedious than the vibe coding articles.
Every line you accept without understanding is borrowed comprehension, which you’ll repay during maintenance with high interest. It feels like free velocity. But it's probably more like tech debt at ~40 % annual interest. As a tribe, we have to figure out how to use AI to automate typing and NOT thinking.
Or would be, if the LLM actually understood what it was writing, using the same definition of understanding that applies to human engineers.
Which it doesn't, and by its very MO, cannot.
So, every line from an LLM that is accepted without understanding, is really nonexistent comprehension. It's a line of code, spat out by a stochastic model, and until some entity that actually can comprehend a codebases context, systems and designs (and currently the only known entity that can do that is a human being), it is un-comprehended.
And the “rule of three” basically ceases to be applicable between components — either the code has localized impact, or is a part of rock-solid foundational library. Intermediate cases just explode the refactoring complexity.
(Emphasis mine)
This has been the biggest pain point for me, and the frustrating part is that you might not even realize you're leading it a particular way at all. I mean it makes sense with how LLMs work, but a single word used in a vague enough way is enough to skew the results in a bad direction, sometimes contrary to what you actually wanted to do, which can lead you down rabbit holes of wrongness. By the time you realize, you're deep in the sludge of haphazardly thrown-together code that sorta kinda barely works. Almost like human language is very vague and non-specific, which is why we invented formal languages with rules that allow for preciseness in the first place...
Anecdotally, I've felt my skills quickly regressing because of AI tooling. I had a moment where I'd reach out to it for every small task from laziness, but when I took a real step back I started realizing I'm not really even saving myself all that much time, and even worse is that I'm tiring myself out way quicker because I was reading through dozens or hundreds of lines of code, thinking about how the AI got it wrong, correcting it etc. I haven't measured, but I feel like in grand totality, I've wasted much more time than I potentially saved with AI tooling.
I think the true problem is that AI is genuinely useful for many tasks, but there are 2 camps of people using it. There are the people using it for complex tasks where small mistakes quickly add up, and then the other camp (in my experience mostly the managerial types) see it shit out 200 lines of code they don't understand, and in their mind this translates to a finished product because the TODO app that barely works is good enough for an "MVP" that they can point to and say "See, it can generate this, that means it can also do your job just as easily!".
To intercept the usual comments that are no doubt going to come flooding in about me using it wrong or trying the wrong model or whatever, please read through my old comment [1] for more context on my experience with these tools.
[1] https://news.ycombinator.com/item?id=44055448
In other words, AI is my assistant, but it is MY responsibility to turn up quality, maintainable work.
However, to put things in perspective for the masses: just consider the humble calculator. It has ruined people’s ability to do mental math. AI is going to do that for writing and communication skills, problem solving skills, etc.
I agree fully, I use it as a bouncing off point these days to verify ideas mostly.
The problem is, and I'm sure I'm not alone in this, management is breathing down my neck to use AI for fucking everything. Write the PR with AI, write the commit message with AI, write the code, the tests, use YOUR AI to parse MY AI's email that I didn't bother proofreading and has 4 logical inconsistencies in 1 sentence. Oh this simple feature that can easily be done for cheaper, quicker and easier without AI? Throw an AI at it! We need to sell AI! "You'll be left in the dust if you don't adopt AI now!"
It comes back to my point about there being 2 camps. The one camp actually uses AI and can see their strengths & weaknesses clear as day and realizes it's not a panacea to be used for literally everything, the other is jumping headfirst into every piece of marketing slop they come across and buying into the false realities the AI companies are selling them on.
I'm glad I'm not the only one who feels this way. It seems like these models latch on to a particular keyword somewhere in my prompt chain and throw traditional logic out the window as they try to push me down more niche paths that don't even really solve the original problem. Which just leads to higher levels of frustration and unhappiness for the human involved.
> Anecdotally, I've felt my skills quickly regressing because of AI tooling
To combat this, I've been trying to use AI to solve problems that I normally would with StackOverflow results: for small, bite-sized and clearly-defined tasks. Instead of searching "how to do X?", I now ask the model the same question and use its answer as a guide to solving the problem instead of a canonical answer.
That said, AI does make some things easier today, like if you have an example to use for "make me a page like this but with data from x instead of y". Often it's faster than searching documentation, even with the caveat that it might hallucinate. And ofc it will probably improve over time.
The particular improvement I'd like to see is (along with in general doing things right) finding the simplest solution without constantly having to be told to do so. My experience is the biggest drawback to letting chatgpt/claude/etc loose is quickly churning out a bunch of garbage, never stopping to say this will be too complex to do anything with in the future. TFA claims only humans can resist entropy by understanding the overall design; again idk if that will never improve but it feels like the big problem right now.
I've read you comment about all the things you tried, and it seems you have much broader experience with LLMs than I do. But I didn't see this technique mentioned, so leaving this here in case it helps someone else :).
A GREAT example is good old Coke vs Pepsi.
Dead Comment
> Only humans can decrease or resist complexity.
It's funny how often there's a genuine concept behind posts like these, but then lots of specific claims are plainly false. This is trivial to do: ask for simpler code. I'm using that quite often to get a second opinion and get great results. If you don't query the model, you don't get any answer - neither complex or simple. If you query with default options, it's still a choice, not something inherent to the idea of LLM.
I'm also having a great time converting code into ideas and diagrams and vice versa. Why make the strong claims that people contradict in practice every day now?
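For concreteness, the "ask for simpler code" second opinion is just a prompt like the sketch below (purely illustrative; it assumes the OpenAI Python client, and the model name and file path are placeholders I made up):

    # A minimal sketch of the "ask for simpler code" second opinion.
    # Assumes the OpenAI Python client; model name and file path are placeholders.
    from openai import OpenAI

    client = OpenAI()

    def simpler_version(code: str) -> str:
        # Ask the model for a simpler rewrite and a note on what it removed.
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{
                "role": "user",
                "content": "Propose a simpler version of this code. "
                           "Keep the behavior identical and list what you removed:\n\n" + code,
            }],
        )
        return resp.choices[0].message.content

    print(simpler_version(open("service.py").read()))  # "service.py" is just an example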
The article also repeats some weird arguments that are superficially true but don't stand up to scrutiny. That Naur thing, which is a meme at this point, is often repeated as somehow insightful - yet what's forgotten is another fundamental, practical rule of software engineering: any nontrivial program quickly exceeds anyone's ability to hold a full theory of it in their head. We almost never work with a proper program theory; languages, techniques, methodologies, and tools all evolve towards enabling people to work well without understanding most of the code. We actually share the same limitation as LLMs here; we're just better at managing it, because we don't have to wait for anyone to let us run another inference loop before taking a different perspective.
Etc.
The reason LLMs are such a big deal is that they're humanity's first tool general enough to support recursion (besides humans, of course). If you can use an LLM, there's like a 99% chance you can program another LLM to use that LLM the same way you do:
People learn the hard way how to properly prompt an LLM agent product X to get results -> some company encodes those learnings in a system prompt -> we now get a new agent product Y that can use X just like a human would -> we stop using X directly and move up one level in the command chain to Y. And this recursion goes on and on, until the world doesn't have any level left for us to go up to.
We are basically seeing this play out in realtime with coding agents in the past few months.
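A rough sketch of that X -> Y step, since it's simpler than it sounds (everything here is hypothetical; it assumes the OpenAI Python client, and the system prompt stands in for the "encoded learnings"):

    # Product Y = product X's hard-won prompting knowledge baked into a system
    # prompt, plus a thin wrapper. All names and prompts here are made up;
    # assumes the OpenAI Python client.
    from openai import OpenAI

    client = OpenAI()

    def agent_x(task: str) -> str:
        # The lower-level agent people originally learned to prompt by hand.
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": task}],
        )
        return resp.choices[0].message.content

    # The "learnings" a company would encode: how to phrase requests to X well.
    Y_SYSTEM_PROMPT = (
        "Rewrite the user's vague request into a precise, well-scoped prompt "
        "for a coding agent: state the goal, constraints, and acceptance criteria."
    )

    def agent_y(vague_request: str) -> str:
        # The higher-level agent: it prompts X the way an experienced user would.
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {"role": "system", "content": Y_SYSTEM_PROMPT},
                {"role": "user", "content": vague_request},
            ],
        )
        better_prompt = resp.choices[0].message.content
        return agent_x(better_prompt)  # we now talk to Y; Y talks to X

    print(agent_y("make the checkout page less janky"))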
> If you're doing a DIY project, let me know what you're trying to achieve
Which is basically the SO style question you mentioned.
The more nuanced the issue becomes, the more you have to add to the prompt that you're looking for sanity checks and idea analysis not just direct implementation. But it's always possible.
I frequently have the LLM write a proposal.MD first and iterate on that, then have it write the full solution and iterate on that.
It's interesting to see whether the proposal matches what I had in mind, and many times it uses tech or ideas I didn't know about myself, so I'm constantly learning too.
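The loop itself is easy to script if you want to; here's a rough sketch of the shape of it (the prompts, model name, file handling, and pause-for-editing step are all placeholders of mine; it assumes the OpenAI Python client):

    # Sketch of a proposal-first loop: draft proposal.MD, pause for human edits,
    # then generate the implementation from the agreed plan. Prompts, model name,
    # and file handling are placeholders; assumes the OpenAI Python client.
    from pathlib import Path
    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    task = "Add CSV export to the reporting module"  # example task

    # Step 1: draft the proposal and write it to disk for review and edits.
    proposal = ask(f"Write a short proposal.MD for this task. Plan only, no code:\n{task}")
    Path("proposal.MD").write_text(proposal)

    input("Edit proposal.MD as needed, then press Enter to continue...")

    # Step 2: implement from the (possibly edited) proposal.
    plan = Path("proposal.MD").read_text()
    print(ask(f"Implement the following proposal. Return code with brief notes:\n{plan}"))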
Do you do it manually or have an automated tool? (I am looking for the latter.)
And actually, that's not wrong. People really do often struggle to navigate these days if they don't have the crutch of something like Google Maps. It really has changed our relationship to the physical world in many ways.
But also, a lot of people weren't especially good at navigation before? The average person's ability to get from Point A to Point B safely and reliably, especially in unfamiliar areas, has certainly increased dramatically. And the small subset of people who are naturally skilled at geography and navigation have seen their abilities complemented, not replaced, by things like Google Maps.
I think AI will end up being similar, on a larger scale. Yes, there are definitely some trade offs, and some skills and abilities will decrease, but also many more people will be able to do work they previously couldn't, and a small number of people will get even better at what they do.
The problem is that mapping software is reliable and doesn't spit out what is essentially the result of a random number generator. You can rely on its output, the same way you can rely on a calculator. Not always, mind you, because mapping the entire globe is a massively complex task with countless caveats and edge cases, but compared to LLM output? Even at a temperature setting of 0, regenerating the same prompt multiple times can still give you different output.
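If you want to check that point yourself, it's a two-minute experiment (a sketch assuming the OpenAI Python client; the model name and prompt are placeholders, and the seed parameter is only best-effort on the models that support it):

    # Same prompt, temperature 0, two calls: outputs are usually close but not
    # guaranteed identical (batching/hardware nondeterminism). Assumes the
    # OpenAI Python client; model name and prompt are placeholders.
    from openai import OpenAI

    client = OpenAI()

    def run(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
            seed=42,  # best-effort determinism where supported; not a guarantee
        )
        return resp.choices[0].message.content

    prompt = "Write a Python function that parses an ISO 8601 date string."
    a, b = run(prompt), run(prompt)
    print("identical" if a == b else "different")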
Also, since LLMs cover a much broader swathe of concepts, people are going to use them instead of their brains in a lot of situations where they really shouldn't. Even with maps, there are people out there who will drive into a lake because Google Maps told them that's where the street was; I can't even fathom the kind of shit that's going to happen from people blindly trusting LLM output and supplanting all their thinking with LLM usage.
Not really.
I am not good at navigation yet love to walk around, so I use a bunch of map apps a lot.
Google Maps is not reliable if you expect optimal routes, and its accuracy falls sharply if you're not traveling by car. Even then, bus lanes, priority lanes, time-limited areas, etc. will be a bloodbath if you expect Maps to understand them.
The mapping itself is often inaccurate in any town that isn't frozen in time for decades, place names are often wrong, and it has no concept of verticality/3D space, short of switching to limited experimental views.
Paid dedicated map apps will in general work a lot better (I'm thinking hiking maps etc.)
All to say, I'd mostly agree with the parent on how fuzzy Maps are.
Actually, TSP is NP-hard in the general case (i.e., at best, you can't efficiently verify that you've been given the optimal route), and Google Maps might even give suboptimal routes intentionally sometimes; we don't know.
The problems you're describing are problems with people and they apply to every technology ever. Eg, people crash cars, blow up their houses by leaving the stove on, etc.
Er, no?
Entirely anecdotal but I have found the opposite. With this mapping software I can go walk in a random direction and confidently course correct as and when I need to, and once I’ve walked somewhere the path sticks in my memory very well.
The best tools are transparent. They are efficient, fast and reliable, yes, but they’re also honest about what they do! You can do everything manually if you want, no magic, no hidden internal state, and with internal parts that can be broken up and tested in isolation.
With LLMs even the simple act of comparing them side by side (to decide which to use) is probabilistic and ultimately based partly on feelings. Perhaps it comes with the territory, but this makes me extremely reluctant to integrate it into engineering workflows. Even if they had amazing abilities, they lower the bar significantly from a process perspective.
Driving still requires careful attention to other drivers, the world goes by rapidly, and most roads look like other roads.
Google Maps is better than a taxi driver 90% of the time where I live.
AI isn’t better than some person that did the thing for a couple days
Is there evidence for this?
My wife was very uncomfortable going to a new location via paper maps and directions. She’s perfectly happy following “bitching Betty” from the phone.
Then there is this Google Maps accident:
https://www.independent.co.uk/tv/news/driver-bridge-google-m...
Which tells you that blindly following a computer's directions makes people more stupid.
The real struggle will be that the people phoning it in are still going to be useless, just with AI now. The rest will learn and grow with AI.
It's similar with full self-driving. FSD is better than a bad, drunk, or texting human driver, and that's a lot of the drivers on the road.
There are real safety improvements from ADAS. For safety you only need crash avoidance, not a full-time chauffeur.