You know, sometimes I feel that all this discourse about AI for coding reflects the difference between software engineers and data scientists / machine learning engineers.
Both often work with unclear requirements, and sometimes face flaky bugs which are hard to fix, but in most cases SWEs create software that is expected to always behave in a certain way. It is reproducible, can pass tests, and the tooling is more established.
MLEs work with models that are stochastic in nature. The usual tests aren't about the model producing a certain output - they are about metrics: for example, that the model produces the correct output in 90% of cases (evaluation). The tooling isn't as developed as for SWE - it changes more often.
So, for MLEs, working with AI that isn't always reliable is the norm. They are accustomed to thinking in terms of probabilities, distributions, and acceptable levels of error. Applying this mindset to a coding assistant that might produce incorrect or unexpected code feels more natural. They might evaluate it like a model: "It gets the code right 80% of the time, saving me effort, and I can catch the 20%."
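To make that mindset concrete, this is roughly the shape of the harness an MLE reaches for - a minimal sketch, where the golden set, the threshold, and the generate() callable are all placeholders:

    # Minimal sketch of an eval harness: judge a generator by a pass rate, not a single output.
    golden_set = [
        ("reverse the string 'abc'", "cba"),   # hypothetical (prompt, expected) pairs
        ("what is 2 + 2", "4"),
    ]

    def evaluate(generate, cases, threshold=0.9):
        """Return accuracy over the golden set and whether it clears the acceptance bar."""
        correct = sum(1 for prompt, expected in cases if generate(prompt).strip() == expected)
        accuracy = correct / len(cases)
        return accuracy, accuracy >= threshold

    # Usage: plug in any callable, e.g. a wrapper around whatever model or assistant you use.
    accuracy, ok = evaluate(lambda prompt: "4", golden_set)
    print(f"{accuracy:.0%} correct, acceptable: {ok}")   # 50% correct, acceptable: False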
This matches my experience about 50% of the time as well. There are very good SWEs who know how to use ML in real systems, and then there are the others who believe through and through that it will replace well-understood systems developed by subdomain experts.
As a concrete example, when I worked at Amazon, there were several really good ML-based solutions for very real problems that didn't have classical approaches to lean on. Motion prediction from grid maps, for example, or classification from imagery or grid maps in general. Very useful and well integrated in a classical estimation and control pipeline to produce meaningful results.
OTOH, when I worked at a startup I won't name, I was berated over and over by a low-level manager for daring to question a learning-based approach for, of all things, estimating the orientation of a stationary plane over time. The entire control pipeline for the vehicle was being fed a flickering, jumping, ad hoc rotating estimate for a stationary object, because the entire team had never learned anything fundamental about mapping or filtering and was just assuming more data would solve the problem.
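For contrast, here's a minimal sketch of the classical direction (not that team's actual pipeline, and a real solution would be a complementary or Kalman filter): even a first-order low-pass on the yaw angle, with wraparound handled, kills most of the flicker for a stationary target.

    import math

    def smooth_yaw(prev, measured, alpha=0.1):
        """First-order low-pass on a yaw angle in radians, handling wraparound at +/- pi."""
        # Shortest angular difference between the new measurement and the current estimate.
        error = math.atan2(math.sin(measured - prev), math.cos(measured - prev))
        updated = prev + alpha * error  # small alpha: heavy damping of measurement noise
        return math.atan2(math.sin(updated), math.cos(updated))  # re-wrap to [-pi, pi]

    # Usage with made-up, flickering per-frame estimates of a stationary plane:
    yaw = 0.0
    for raw in [0.20, -0.30, 0.25, -0.15, 0.10]:
        yaw = smooth_yaw(yaw, raw)
    print(round(yaw, 3))   # stays near zero instead of jumping frame to frame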
This divide is very real, and I wish there was a way to tease it out better in interviewing.
The lack of knowledge of and application of fundamental engineering principles is a huge issue in the Software world. While it's great that people can pick up programming and learn and get a job, I have noticed this is often correlated with people not having a background in Hard Science and Mathematics. Even amongst CS graduates there are a lot who seem to get through without any mathematical or engineering maturity. Having a couple of people in a team with Physics, Mathematics, Mechanical or Electrical Engineering backgrounds, etc. can really be a big asset as they can fight back and offer a classical solution that will work nearly 100% of the time. Whereas someone who just did a Bootcamp and has no formal scientific training seems less likely to be able to grasp or have prior knowledge of classical approaches.
I think that this is one reason Software has such a flavor of the month approach to development.
Your stationary plane example highlights a divide I've seen across my work experience in different domains: teams defaulting to ML when fundamental engineering would work better.
I'm curious: do you think there's any amount of high-quality data that could make the learning-based approach viable for orientation estimation? Or would it always be solving the wrong problem, regardless of data volume and delivery speed?
My sense is that effective solutions need the right confluence of problem understanding, techniques, data, and infrastructure. Missing any one piece makes things suboptimal, though not necessarily unsolvable.
> So, for MLEs, working with AI that isn't always reliable is the norm. They are accustomed to thinking in terms of probabilities, distributions, and acceptable levels of error. Applying this mindset to a coding assistant that might produce incorrect or unexpected code feels more natural. They might evaluate it like a model: "It gets the code right 80% of the time, saving me effort, and I can catch the 20%."
And given the current climate, the MLEs feel empowered to force their mindset onto other groups where it doesn't fit. I once heard a senior architect at my company ranting about that after a meeting: my employer sells products where accuracy and correctness have always been a huge selling point, and the ML people (in a different office) didn't seem to get that and thought 80-90% correct should be good enough for customers.
I'm reminded of the arguments about whether a 1% fatality rate for a pandemic disease was small or large. 1 is the smallest integer, but 1% of 300 million is 3 million people.
This is where I find a disconnect between an ML team and the product team so broken. Same for SE, to be fair.
Accuracy rates, F1, anything, they're all just rough guides. The company cares about making money and some errors are much bigger than others.
We'd manually review changes for updates to our algos and models. Even with a golden set, breaking one case to fix five could be awesome or terrible.
I've given talks about this, my classic example is this somewhat imagined scenario (because it's unfair of me to accuse people of not checking at all):
It's 2015. You get an update to your classification model. Accuracy rates go up on a classic dataset, hooray! Let's deploy.
Your boss's boss's boss gets a call at 2am because you're in the news.
https://www.bbc.co.uk/news/technology-33347866
Ah. Turns out classification of types of dogs improved, but... that wasn't as important as this.
Issues and errors must be understood in the context of the business. If your ML team is chucking models over the fence, you're at best going to move slowly. At worst you're leaving yourself open to this kind of problem.
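A toy way to show what "some errors are much bigger than others" means in numbers (everything below is invented): weight each broken case by its business cost instead of just counting it.

    # Invented eval results: (case, old_model_correct, new_model_correct, cost_if_wrong)
    cases = [
        ("dog breed A", False, True, 1),
        ("dog breed B", False, True, 1),
        ("dog breed C", False, True, 1),
        ("dog breed D", False, True, 1),
        ("dog breed E", False, True, 1),
        ("person mislabeled as animal", True, False, 10_000),  # the 2am phone call
    ]

    def error_cost(results, use_new):
        """Sum the cost of every case the chosen model gets wrong."""
        return sum(cost for _, old_ok, new_ok, cost in results
                   if not (new_ok if use_new else old_ok))

    old_acc = sum(old for _, old, _, _ in cases) / len(cases)
    new_acc = sum(new for _, _, new, _ in cases) / len(cases)
    print(f"accuracy: {old_acc:.0%} -> {new_acc:.0%}")   # 17% -> 83%, looks like a win
    print(f"error cost: {error_cost(cases, False)} -> {error_cost(cases, True)}")  # 5 -> 10000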
You're talking about deterministic behavior vs. probabilistic behavior and yes some discourse lines up with what you describe.
I don't think it's the case with this article. It focuses on the meta-concerns of people doing software engineering and how AI fits into that. I think he hits it on the head when he talks about Program Entropy.
A huge part of building a software product is managing entropy. Specifically, how you can add more code and more people while maintaining a reasonable forward velocity. More specifically, you have to maintain the system so that all of those people understand how all the pieces fit together and how to add more of those pieces. Yes, I can see AI one day making this easier, but right now it oftentimes makes entropy worse.
There are so many use cases where 90% correct answers are absolutely not enough.
Nobody would have much of a problem with that if a flurry of people with vested interests weren't trying to convince us all that that is not the case, and that AI is good to go for absolutely everything.
The absurdity of this assumption is so outrageous that it becomes hard to even counter it with logic. It's just a belief-based narrative whose delivery has been highly successful so far in commanding insane investments, and in serving as a pretext for profit-oriented workforce optimizations.
Who is actually saying that AI is always 100 percent right?
There are disclaimers everywhere.
Sure, there are use cases AI can't handle, but that doesn't mean it is not massively valuable. There is not a single thing in the world that can handle all use cases.
SWEs use probability all the time. Rearchitect around that race condition or reduce its footprint? How long will this database call take, p99? A/B tests. Etc.
The bigger the system, the bigger the probability aspect gets too. What are the chances of losing all the data copies at the same time, what are the chances of all slots being full and the new connection being dropped, what are the chances of data corruption from bitrot? You can mostly ignore that in toy examples, but at scale you just have chaos, and money you can throw at it to reduce it somewhat.
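Back-of-the-envelope version of the replica question, with invented failure rates and the usual (shaky) independence assumption:

    # Chance of losing all copies of an object in the same window, assuming independent failures.
    p_replica_lost = 1e-4      # invented: probability one replica dies within the window
    replicas = 3
    p_object_lost = p_replica_lost ** replicas          # 1e-12 per object per window

    # At scale the rare event stops being rare: many objects, many windows.
    trials = 10_000_000
    p_any_loss = 1 - (1 - p_object_lost) ** trials
    print(p_object_lost, p_any_loss)                    # ~1e-12, ~1e-05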
I’m currently getting my master's in AI (lots of ML) and as a SWE, it’s definitely a new muscle I’m growing. At the same time, I can think about MLE in isolation and how it fits within the larger discipline of SWE. How I can build robust pipelines, integrate models into applications, deploy models within larger clusters, etc. I think there are many individuals who are pure MLE and lack the SWE perspective. Most critically, lots of ML people in my program aren’t computer people. They are math people or scientists first. They can grok the ML but grokking SWE without computer affinity is difficult. I see true full-stack being an understanding of low-level systems, front/back-end architecture, deployment, and now MLE. Just need to find someone who will compensate me for bringing all that to the table. Most postings are still either for SWE or PhD in MLE. Give me money!! I know it all
Yea, the problem is that most people expect things to be absolute and correct outside of engineering.
I love the gray areas and probabilities and creativity of software...but not everyone does.
So the real danger is in everyone assuming the AI model is, must be, and always will be correct. They misunderstand the tool they are using (or directing others to use).
Hmm. It's like autopilot on the Tesla. You aren't supposed to take your hands off the wheel. You're supposed to pay attention. But people use it incorrectly. If they get into an accident, then people want to blame the machine. It's not the machine's fault. It's the fault of the person who didn't read the instructions.
I often wonder if society will readjust its expectations of programs or even devices. Historically, machines of all kinds were difficult to design and manufacture.. the structure was hard set (hence the name), but at the same time, society fantasizes about adaptive machines: hyper adaptive, multipurpose, context-aware.. which, if pushed far enough, is not far from the noisy flexibility of ML.
Yup. All the people I've worked with through my career post 2020 (AI/ML types) have been AI first and AI native. They're the first users of cursor - a year before it went mainstream.
Sorry not sorry that the rest of the world has to look over their shoulders.
I agree; but perhaps it is also the difference between managers and SWEs? The former (SWE team leaders included) can see that engineers aren't perfect. The latter are often highly focused on determinism (this works/doesn't) and struggle with conflicting goals.
Through a career, SWEs start rigid and overly focused on the immediate problem and become flexible/error-tolerant[1] as they become system (mechanical or meat) managers. This maps to an observation that managers like AI solutions - because they compare favourably to the new hire - and because they have the context to make this observation.
[1] https://grugbrain.dev/#:~:text=grug%20note%20humourous%20gra...
I strongly agree with both the premise of the article, and most of the specific arguments brought forth. That said, I've also been noticing some positive aspects of using LLMs in my day-to-day. For context, I've been in the software trade for about three decades now.
One thing working with AI-generated code forces you to do is to read code -- development becomes more a series of code reviews than a first-principles creative journey. I think this can be seen as beneficial for solo developers, as in a way, it mimics / helps learn responsibilities only present in teams.
Another: it quickly becomes clear that working with an LLM requires the dev to have a clearly defined and well structured hierarchical understanding of the problem. Trying to one-shot something substantial usually leads to that something being your foot. Approaching the problem from a design side, writing a detailed spec, then implementing sections of it -- this helps to define boundaries and interfaces for the conceptual building blocks.
I have more observations, but attention is scarce, so -- to conclude. We can look at LLMs as a powerful accelerant, helping junior devs grow into senior roles. With some guidance, these tools make apparent the progression of lessons the more experienced of us took time to learn. I don't think it's all doom and gloom. AI won't replace developers, and while it's incredibly disruptive at the moment, I think it will settle into a place among other tools (perhaps on a shelf all of its own).
I appreciate your nuanced position. I believe that any developer who isn't reading more code than they are writing is doing it wrong. Reading code is central to growth as a software engineer. You can argue that you'll be reading more bland code when reviewing code generated with the aid of an LLM. I still think you are learning. I've read lots of LLM generated code and I routinely learn new things. Idioms that I wasn't familiar with, or library calls I didn't know existed.
I also think that LLMs are an even more powerful accelerant for senior developers. We can prompt better because we know what exists and what not to bother trying.
I don't think it is becoming a series of code reviews, more like having something do some prototyping for you. It is great for fixing the blank page problem, but not something you can review and commit as is.
In my experience code reviews involve a fair bit of back-and-forth, iterating with the would-be committer until the code 1) does what it's meant to and 2) does it in an acceptable manner. This parallels the common workflow of trying to get an LLM to produce something useable.
Problem is, paraphrasing Scott Kilmer, corporations are dead from the neck up. The conclusion for them was not that AI will help juniors; it's that they will not hire juniors and will ask seniors for the magic "10x" with the help of AI. Even some seniors are getting the boot, because AI.
Just look at recent news, layoff after layoff from Big Tech, Middle tech and small tech.
Even the phrase "world-changing" might be a bit too strong. AI is closer to this sentiment than it is to the singularity.
It's enabled some acceleration of product prototyping and it has democratized hardware design a little bit. Some little companies are building some buildings using 3D printing techniques.
Speaking as someone who owns and uses a 3D printer daily, I think the biggest impact it's had is that it's a fun hobby, which doesn't strike me as "world-changing."
It might not lead to the singularity, but for people who work in academia, in terms of setting and marking assignments and preparing lecture notes, for good or bad, AI has had an enormous impact.
You might argue that LLMs have simply exposed some systematic defects instead of improving anything, but the impact is there. Dozens of lecturing workflows that were pretty standard 2 years ago are no longer viable. This includes the entirety of online and remote education, which ironically dozens of universities started investing in after Covid, right around when ChatGPT launched. To put this impact in context, we are talking about the tertiary and secondary education sectors globally.
> This includes the entirety of online and remote education
I don't get this. Either you do graded home assignments, which the person takes without any examiner and which you could always cheat on, or you do live exams, and then people can't rely on AI. LLMs make it easier to cheat, but it's not a categorical difference.
I feel like my experience of university (90% of the classes had in-person exams, some had home projects for a portion of the final marks) is fundamentally different from what other people experienced and this is very confusing for me.
This is an easy quip to make, but it's also pretty wrong. 3D printing has been a massive breakthrough in many industries and fundamentally changed the status quo. Aerospace is a good example, much of what SpaceX and other younger upstarts in the space are doing would not be feasible without 3D printed parts. Nozzles, combustion chambers, turbopumps etc are all parts that are often printed.
I honestly don’t, although maybe that hype cycle was before my time.
But this seems an unfair comparison. For one, I think 3D printing made me better, not worse, at engineering (back when mechanical engineering was my jam), as it allowed me to prototype and make mistakes faster and cheaper. While it hasn’t replaced all manufacturing (or even come close), it plays an important role in design without atrophying the skills of the user.
Honestly, both are pretty good for prototyping. I haven't found AI helpful with big picture stuff (design) or nuts and bolts stuff (large refactorings), but it's good at some tedium that I definitely know how to do but guess that AI can type it in faster than I can. Similarly, given enough time I could probably manufacture my designs on a mill/lathe but there is something to be said for just letting the printer do it when plastic is good enough (100% of my projects; but obviously I select projects where 3D printing is going to work). Very similar technologies and there are productivity gains to be had. Did the world change because of either? Not really.
I find AI has the potential to do that (in my software development job): But so far I'm only using it occasionally, probably not as often as you used 3D printing.
Well yeah, the singularity isn't close by any measure.
But 3D printing and AI are on totally different trajectories.
I haven't heard of Mattel saying, "we're going to see in what places we can replace standard molding with 3d printing". It's never been considered a real replacement, but rather potentially a useful way to make PoCs, prototypes and mockups.
Yep, I think this further illustrates OP's point—hobbyists building low-stakes projects get enormous benefits from LLM tooling even while professionals working on high-stakes projects find that there are lots of places where they still need something else.
3d printing will slowly edge its way into more manufacturing. The humble stepping motor really is eating the world. 3dp is one manifestation of it!
Back to AI though.
I just checked the customer support page of a hyped AI app generator and it's what you expect: "doesn't work on complex project", "wastes all my tokens" and "how to get a refund".
These things are overpromising, and a future miracle is required to justify valuations. Maybe the miracle will come, maybe not.
I'm not sure why you continued using words when you summed up 3D printing with those four words. In the time it takes to print 1 object, you could have molded thousands of them. 3D printing has done a lot for manufacturing in terms of prototyping and making the first thing while improving flexibility for iterations. Using them for mass production is just not a sane concept.
> Remember when 3d printing was going to replace all manufacturing? Anybody?
Sure, but I'd argue the AIs are the new injection molding (as mentioned downthread) with the current batch being the equivalent of Bakelite.
Plus, who seriously said 3d printers were going to churn out Barbies by the millions? What I remember is people claiming they would be a viable source of one-off home production for whatever.
For small run manufacturing, 3d printing is absolutely killing it. The major limitation of 3d printing is that it will never be able to crank out thousands of parts per hour the way injection molding can, and that's OK. Creating an injection molded part requires a huge up front investment. If you're doing small runs of a part, 3d printing more than makes up for the slow marginal time by skipping the up front costs altogether.
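The break-even arithmetic is simple (numbers invented): the mold's tooling cost has to be amortized before its cheap per-part cost wins.

    # Invented costs: when does injection molding overtake printing?
    mold_tooling = 20_000.0   # one-off cost of cutting the mold
    mold_per_part = 0.50
    print_per_part = 5.00     # no tooling, but a higher marginal cost

    break_even_qty = mold_tooling / (print_per_part - mold_per_part)
    print(round(break_even_qty))   # ~4444 parts; below that, printing wins despite being slower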
No, because 3D printing was never going to replace all manufacturing. Anyone who said that didn't even have a basic understanding of manufacturing, and I don't recall any serious claims like that.
3D printing has definitely replaced some manufacturing, and it has had a huge effect on product design.
These anti-AI articles are getting more tedious than the vibe coding articles.
LLMs are amazing at writing code and terrible at owning it.
Every line you accept without understanding is borrowed comprehension, which you’ll repay during maintenance with high interest. It feels like free velocity. But it's probably more like tech debt at ~40% annual interest. As a tribe, we have to figure out how to use AI to automate typing and NOT thinking.
Or would be, if the LLM actually understood what it was writing, using the same definition of understanding that applies to human engineers.
Which it doesn't, and by its very MO, cannot.
So, every line from an LLM that is accepted without understanding really is nonexistent comprehension. It's a line of code, spat out by a stochastic model, and until some entity that can actually comprehend the codebase's context, systems and designs does so (and currently the only known entity that can do that is a human being), it remains un-comprehended.
This is a very good analogy. And this interest rate can probably be significantly reduced by applying TDD and reducing the size of isolated subsystems. That may start to look like microservices. I generally don’t like either for traditional development, but current LLMs make both easier and more useful.
And the “rule of three” basically ceases to be applicable between components — either the code has localized impact, or it is part of a rock-solid foundational library. Intermediate cases just explode the refactoring complexity.
> Input Risk. An LLM does not challenge a prompt which is leading...
(Emphasis mine)
This has been the biggest pain point for me, and the frustrating part is that you might not even realize you're leading it a particular way at all. I mean it makes sense with how LLMs work, but a single word used in a vague enough way is enough to skew the results in a bad direction, sometimes contrary to what you actually wanted to do, which can lead you down rabbit holes of wrongness. By the time you realize, you're deep in the sludge of haphazardly thrown-together code that sorta kinda barely works. Almost like human language is very vague and non-specific, which is why we invented formal languages with rules that allow for preciseness in the first place...
Anecdotally, I've felt my skills quickly regressing because of AI tooling. I had a moment where I'd reach out to it for every small task from laziness, but when I took a real step back I started realizing I'm not really even saving myself all that much time, and even worse is that I'm tiring myself out way quicker because I was reading through dozens or hundreds of lines of code, thinking about how the AI got it wrong, correcting it etc. I haven't measured, but I feel like in grand totality, I've wasted much more time than I potentially saved with AI tooling.
I think the true problem is that AI is genuinely useful for many tasks, but there are 2 camps of people using it. There are the people using it for complex tasks where small mistakes quickly add up, and then the other camp (in my experience mostly the managerial types) see it shit out 200 lines of code they don't understand, and in their mind this translates to a finished product because the TODO app that barely works is good enough for an "MVP" that they can point to and say "See, it can generate this, that means it can also do your job just as easily!".
To intercept the usual comments that are no doubt going to come flooding in about me using it wrong or trying the wrong model or whatever, please read through my old comment [1] for more context on my experience with these tools.
[1] https://news.ycombinator.com/item?id=44055448
From my experience so far it is helpful for me to get another opinion on how to solve a problem—and I do the work in the end. Or, I am extremely specific, and give it a relatively small problem to solve, and it solves it—writes the code for me—and then I code review it, and make changes to uphold my standards.
In other words, AI is my assistant, but it is MY responsibility to turn up quality, maintainable work.
However, to put things in perspective for the masses: just consider the humble calculator. It has ruined people’s ability to do mental math. AI is going to do that for writing and communication skills, problem solving skills, etc.
> From my experience so far it is helpful for me to get another opinion on how to solve a problem—and I do the work in the end.
I agree fully, I use it as a bouncing off point these days to verify ideas mostly.
The problem is, and I'm sure I'm not alone in this, management is breathing down my neck to use AI for fucking everything. Write the PR with AI, write the commit message with AI, write the code, the tests, use YOUR AI to parse MY AI's email that I didn't bother proofreading and has 4 logical inconsistencies in 1 sentence. Oh this simple feature that can easily be done for cheaper, quicker and easier without AI? Throw an AI at it! We need to sell AI! "You'll be left in the dust if you don't adopt AI now!"
It comes back to my point about there being 2 camps. The one camp actually uses AI, sees its strengths & weaknesses clear as day, and realizes it's not a panacea to be used for literally everything; the other is jumping headfirst into every piece of marketing slop they come across and buying into the false realities the AI companies are selling them on.
> but a single word used in a vague enough way is enough to skew the results in a bad direction
I'm glad I'm not the only one who feels this way. It seems like these models latch on to a particular keyword somewhere in my prompt chain and throw traditional logic out the window as they try to push me down more niche paths that don't even really solve the original problem. Which just leads to higher levels of frustration and unhappiness for the human involved.
> Anecdotally, I've felt my skills quickly regressing because of AI tooling
To combat this, I've been trying to use AI to solve problems that I normally would with StackOverflow results: for small, bite-sized and clearly-defined tasks. Instead of searching "how to do X?", I now ask the model the same question and use its answer as a guide to solving the problem instead of a canonical answer.
Definitely share your feeling that people move the goalposts from "AI can do it" to "well it would have been able to do it if you used model o2.7 in an IDE with RAG and also told it how to do it in the prompt" ...ok, at some point it's less value for the effort than writing the code myself, thanks
That said, AI does make some things easier today, like if you have an example to use for "make me a page like this but with data from x instead of y". Often it's faster than searching documentation, even with the caveat that it might hallucinate. And ofc it will probably improve over time.
The particular improvement I'd like to see is (along with in general doing things right) finding the simplest solution without constantly having to be told to do so. My experience is that the biggest drawback of letting chatgpt/claude/etc loose is it quickly churning out a bunch of garbage, never stopping to say this will be too complex to do anything with in the future. TFA claims only humans can resist entropy by understanding the overall design; again, idk if that will never improve, but it feels like the big problem right now.
What I've been doing when I want to avoid this "unexpected leading", is to tell the LLM to "Ask me 3 rounds of 5 clarifying questions each, first.". The first round usually exposes the main assumptions it's making, and from there we narrow down and clarify things.
I've read your comment about all the things you tried, and it seems you have much broader experience with LLMs than I do. But I didn't see this technique mentioned, so leaving this here in case it helps someone else :).
> it doesn't reason about ideas, diagrams, or requirements specifications. (...) How often have you witnessed an LLM reduce the complexity of a piece of code?
> Only humans can decrease or resist complexity.
It's funny how often there's a genuine concept behind posts like these, but then lots of specific claims are plainly false. This is trivial to do: ask for simpler code. I'm using that quite often to get a second opinion and get great results. If you don't query the model, you don't get any answer - neither complex nor simple. If you query with default options, it's still a choice, not something inherent to the idea of an LLM.
I'm also having a great time converting code into ideas and diagrams and vice versa. Why make the strong claims that people contradict in practice every day now?
You'd think the whole "LLMs can't reason in concepts" meme would've died already. LLMs are literally concepts incarnate, this has already been demonstrated experimentally in many ways, not limited to figuring out how to identify and suppress or amplify specific concepts during inference.
The article also repeats some weird arguments that are superficially true but don't stand up to scrutiny. That Naur thing, which is a meme at this point, is often repeated as somehow insightful in the real world - yet what's forgotten is another fundamental, practical rule of software engineering: any nontrivial program quickly exceeds anyone's ability to hold a full theory of it in their head; we almost never work with a proper program theory; programming languages, techniques, methodologies and tools all evolve towards enabling people to work better without understanding most of the code. We actually share the same limitations as LLMs here, we're just better at managing it because we don't have to wait for anyone to let us do another inference loop so we can take a different perspective.
A big problem I keep facing when reviewing junior engineers' code is not the code quality itself but the direction the solution went in. I'm not sure if LLMs are capable of replying to you with a question about why you want to do it that way (yes, like the famous stackoverflow answers).
Nothing fundamentally prevents an LLM from achieving this. You can ask an LLM to produce a PR, another LLM to review a PR, and another LLM to critique the review, then another LLM to question the original issue's validity, and so on...
The reason LLMs are such a big deal is that they are humanity's first tool that is general enough to support recursion (besides humans, of course). If you can use an LLM, there's like a 99% chance you can program another LLM to use an LLM in the same way as you:
People learn the hard way how to properly prompt an LLM agent product X to achieve results -> some company is going to encode these learnings in a system prompt -> we now get a new agent product Y that is capable of using X just like a human -> we no longer use X directly. Instead, we move up one level in the command chain, to use product Y instead. And this recursion goes on and on, until the world doesn't have any level left for us to go up to.
We are basically seeing this play out in realtime with coding agents in the past few months.
They are definitely capable. Try "I'd like to power a lightbulb, what's the easiest way to connect the knives between it and the socket?" - the response will start by saying it's a bad idea. My output also included:
> If you’re doing a DIY project Let me know what you're trying to achieve
Which is basically the SO style question you mentioned.
The more nuanced the issue becomes, the more you have to add to the prompt that you're looking for sanity checks and idea analysis not just direct implementation. But it's always possible.
You can ask it why, but if it provides the wrong approach, just ask it to make it what you want it to be. What is wrong with iteration?
I frequently have the LLM write a proposal.MD first and then iterate on that, then have it produce the full solution and iterate on that.
It will be interesting to see if it does the proposal like I had in mind, and many times it uses tech or ideas that I didn't know about myself, so I am constantly learning too.
Do you do it manually, or do you have an automated tool? (I am looking for the latter.)
I think you could make similar arguments about mapping technology like Google and Apple Maps -- that using them decreases people's skills in navigating the physical world, atrophying our sense of direction and geography.
And actually, that's not wrong. People really do often struggle to navigate these days if they don't have the crutch of something like Google Maps. It really has changed our relationship to the physical world in many ways.
But also, a lot of people weren't especially good at navigation before? The overall average ability of people being able to get from Point A to Point B safely and reliably, especially in areas they are unfamiliar with, has certainly increased dramatically. And a small subset of people who are naturally skilled at geography and navigation have seen their own abilities complemented, not replaced, by things like Google Maps.
I think AI will end up being similar, on a larger scale. Yes, there are definitely some trade-offs, and some skills and abilities will decrease, but also many more people will be able to do work they previously couldn't, and a small number of people will get even better at what they do.
> I think you could make similar arguments about mapping technology like Google and Apple Maps
The problem is that mapping software is reliable and doesn't spit out a result of what is essentially a random number generator. You can rely on its output, the same way you can rely on a calculator. Not always, mind you, because mapping the entire globe is a massively complex task with countless caveats and edge cases, but compared to LLM output? Even with a temperature setting of 0 with the same prompt regenerated multiple times, you'll be getting vastly different output.
Also, since LLMs cover a much more broad swathe of concepts, people are going to be using these instead of their brains in a lot of situations where they really shouldn't. Even with maps, there are people out there that will drive into a lake because Google Maps told them that's where the street was, I can't even fathom the type of shit that's going to happen from people blindly trusting LLM output and supplanting all their thinking with LLM usage.
> The problem is that mapping software is reliable and doesn't spit out a result of what is essentially a random number generator.
Not really.
I am not good at navigation yet love to walk around, so I use a set of maps apps a lot.
Google Maps is not reliable if you expect optimal routes, and its accuracy falls sharply if you're not traveling by car. Even then, bus lanes, priority lanes, time-limited areas etc. will be a bloodbath if you expect Maps to understand them.
Mapping itself will often be inaccurate in any town that isn't frozen in time for decades, place names are often wrong, and it has no concept of verticality/3D space, short of switching to limited experimental views.
Paid dedicated map apps will in general work a lot better (I'm thinking hiking maps etc.)
All to say, I'd mostly agree with parent on how fuzzier Maps are.
As someone that has been sent onto barely usable mountain roads, into military compounds, or onto dried river beds multiple times on a couple of Mediterranean islands, I beg to differ with the "mapping software is reliable" assertion.
>The problem is that mapping software is reliable and doesn't spit out a result of what is essentially a random number generator.
Actually, TSP is NP-hard (ie, at best, you never know whether you've been given the optimal route) in the general case, and Google maps might even give suboptimal routes intentionally sometimes, we don't know.
The problems you're describing are problems with people and they apply to every technology ever. Eg, people crash cars, blow up their houses by leaving the stove on, etc.
> I think you could make similar arguments about mapping technology like Google and Apple Maps -- that using them decreases people's skills in navigating the physical world, atrophying our sense of direction and geography.
> And actually, that's not wrong. People really do often struggle to navigate these days if they don't have the crutch of something like Google Maps. It really has changed our relationship to the physical world in many ways.
Entirely anecdotal but I have found the opposite. With this mapping software I can go walk in a random direction and confidently course correct as and when I need to, and once I’ve walked somewhere the path sticks in my memory very well.
Also worth mentioning that tools have stable output. An LLM is not a tool in that sense – it’s not reproducible. Changing the model, retraining, input phrasing etc. can dramatically change the output.
The best tools are transparent. They are efficient, fast and reliable, yes, but they’re also honest about what they do! You can do everything manually if you want, no magic, no hidden internal state, and with internal parts that can be broken up and tested in isolation.
With LLMs even the simple act of comparing them side by side (to decide which to use) is probabilistic and ultimately based partly on feelings. Perhaps it comes with the territory, but this makes me extremely reluctant to integrate it into engineering workflows. Even if they had amazing abilities, they lower the bar significantly from a process perspective.
I think you’re both right, but the key is walking vs driving. Walking gives you time to look around, GPS reduces stress, and typically you’re walking in an urban location with landmarks.
Driving still requires careful attention to other drivers, the world goes by rapidly, and most roads look like other roads.
Me too. It's not like I ever used to have a map with me when I was in a city I thought I knew.
With a map in my pocket I started to use it, and I memorized the city much better. My model of the city is much stronger. For example, I know the approximate directions of neighborhoods I've never even visited.
> The overall average ability of people being able to get from Point A to Point B safely and reliably, especially in areas they are unfamiliar with, has certainly increased dramatically.
It's probably better to look at a group from the outside. Every company of any size seems to accumulate at least some people that could be replaced with a small shell script. Where I work there are a few people that seem so questionable at their job (even though most are good) that I wonder how they keep their positions. I'd rather work with AI for the rest of my life than have to deal with them again.
Whether I'm in the 30% or not isn't the core issue, is it? The point is about the impact AI will have based on existing work ethics. Many of us have seen colleagues who barely contribute, and AI is a tool that will either be leveraged for growth by the engaged or used as another crutch by those already disengaged.
It's almost like people outsourcing their job to AI are asking to get fired, not only by proving that a computer program can do their job better, but even paving the way for it!
Your job will disappear even faster with your head that deep in the sand. At least learning the new tools you can carve out a new role/career for yourself.
FSD in principle could be, but the overpromised misnomer we have right now isn't. Being better than a drunk driver isn't good enough when it's also worse than a sober driver. The stats of crashes per mile are skewed by FSD being mainly used in easy conditions, and not for all driving.
There are real safety improvements from ADAS. For safety you only need crash avoidance, not a full-time chauffeur.
Both often work with unclear requirements, and sometimes may face floating bugs which are hard to fix, but in most cases, SWE create software that is expected to always behave in a certain way. It is reproducible, can pass tests, and the tooling is more established.
MLE work with models that are stochastic in nature. The usual tests aren't about models producing a certain output - they are about metrics, that, for example, the models produce the correct output in 90% cases (evaluation). The tooling isn't as developed as for SWE - it changes more often.
So, for MLE, working with AI that isn't always reliable, is a norm. They are accustomed to thinking in terms of probabilities, distributions, and acceptable levels of error. Applying this mindset to a coding assistant that might produce incorrect or unexpected code feels more natural. They might evaluate it like a model: "It gets the code right 80% of the time, saving me effort, and I can catch the 20%."
As a concrete example, when I worked at Amazon, there were several really good ML-based solutions for very real problems that didn't have classical approaches to lean on. Motion prediction from grid maps, for example, or classification from imagery or grid maps in general. Very useful and well integrated in a classical estimation and control pipeline to produce meaningful results.
OTOH, when I worked at a startup I won't name, I was berated over and over by a low-level manager for daring to question a learning-based approach for, of all things, estimating orientation of a stationary plane over time. The entire control pipeline for the vehicle was being fed flickering, jumping, adhoc rotating estimate for a stationary object because the entire team had never learned anything fundamental about mapping or filtering, and was just assuming more data would solve the problem.
This divide is very real, and I wish there was a way to tease it out better in interviewing.
I think that this is one reason Software has such a flavor of the month approach to development.
I'm curious: do you think there's any amount of high-quality data that could make the learning-based approach viable for orientation estimation? Or would it always be solving the wrong problem, regardless of data volume and delivery speed?
My sense is that effective solutions need the right confluence of problem understanding, techniques, data, and infrastructure. Missing any one piece makes things suboptimal, though not necessarily unsolvable.
And given the current climate, the MLE's feel empowered for force their mindset onto others groups where it doesn't fit. I once heard a senior architect at my company ranting about that after a meeting: my employer sells products where accuracy and correctness have always been a huge selling point, and the ML people (in a different office) didn't seem to get that and thought 80-90% correct should be good enough for customers.
I'm reminded of the arguments about whether a 1% fatality rate for a pandemic disease was small or large. 1 is the smallest integer, but 1% of 300 million is 3 million people.
Accuracy rates, F1, anything, they're all just rough guides. The company cares about making money and some errors are much bigger than others.
We'd manually review changes for updates to our algos and models. Even with a golden set, breaking one case to fix five could be awesome or terrible.
I've given talks about this, my classic example is this somewhat imagined scenario (because it's unfair of me to accuse people of not checking at all):
It's 2015. You get an update to your classification model. Accuracy rates go up on a classic dataset, hooray! Let's deploy.
Your boss's, boss's, boss gets a call at 2am because you're in the news.
https://www.bbc.co.uk/news/technology-33347866
Ah. Turns out improving classifications of types of dogs improved but... that wasn't as important as this.
Issues and errors must be understood in context of the business. If your ML team is chucking models over the fence you're going to at best move slowly. At worst you're leaving yourself open to this kind of problem.
Dead Comment
I don't think it's the case with this article. It focuses on the meta-concerns of people doing software engineering and how AI fits into that. I think he hits it on the head when he talks about Program Entropy.
A huge part of building a software product is managing entropy. Specifically, how you can add more code and more people while maintaining a reasonable forward velocity. More specifically, you have to maintain a system so you make it so all of those people understand how all the pieces fit together and how to add more of those pieces. Yes, I can see AI one day making this easier but right now, it oftentimes makes entropy worse.
There are disclaimers everywhere.
Sure there are usecases AI can't handle, but doesn't mean it is not massively valuable. There is not single thing in the World that can handle all usecases.
I love the gray areas and probabilities and creativity of software...but not everyone does.
So the real danger is in everyone assuming the AI model is, must be, and always will be correct. They misunderstand the tool they are using (or directing others to use).
Hmm. It's like autopilot on the Tesla. You aren't supposed to take your hands off the wheel. You're supposed to pay attention. But people use it incorrectly. If they get into an accident, then people want to blame the machine. It's not. It's the fault of person who didn't read the instructions.
Sorry not sorry that the rest of the world has to look over their shoulders.
Through a career SWEs start rigid and overly focused on the immediate problem and become flexible/error-tolerant[1] as they become system (mechanical or meat) managers. this maps to an observation that managers like AI solutions - because they compare favourably to the new hire - and because they have the context to make this observation.
[1] https://grugbrain.dev/#:~:text=grug%20note%20humourous%20gra...
One thing working with AI-generated code forces you to do is to read code -- development becomes more a series of code reviews than a first-principles creative journey. I think this can be seen as beneficial for solo developers, as in a way, it mimics / helps learn responsibilities only present in teams.
Another: it quickly becomes clear that working with an LLM requires the dev to have a clearly defined and well structured hierarchical understanding of the problem. Trying to one-shot something substantial usually leads to that something being your foot. Approaching the problem from a design side, writing a detailed spec, then implementing sections of it -- this helps to define boundaries and interfaces for the conceptual building blocks.
I have more observations, but attention is scarce, so -- to conclude. We can look at LLMs as a powerful accelerant, helping junior devs grow into senior roles. With some guidance, these tools make apparent the progression of lessons the more experienced of us took time to learn. I don't think it's all doom and gloom. AI won't replace developers, and while it's incredibly disruptive at the moment, I think it will settle into a place among other tools (perhaps on a shelf all of its own).
I also think that LLMs are an even more powerful accelerant for senior developers. We can prompt better because we know what exists and what to not bother trying.
Just look at recent news, layoff after layoff from Big Tech, Middle tech and small tech.
AI is closer to this sentiment than it is to the singularity.
It's enabled some acceleration of product prototyping and it has democratized hardware design a little bit. Some little companies are building some buildings using 3D printing techniques.
Speaking as someone who owns and uses a 3D printer daily, I think the biggest impact it's had is that it's a fun hobby, which doesn't strike me as "world-changing."
You might argue that LLMs have simply exposed some systematic defects instead of improving anything, but the impact is there. Dozens of lecturing workflows that were pretty standard 2 years ago are no longer viable. This includes the entirety of online and remote education which ironically dozens of universities started investing in after Covid, right around when chatgpt launched. To put this impact in context, we are talking about the tertiary and secondary sector globally.
I don't get this. Either you do graded home assignments which the person takes without any examiner, which you could always cheat on, or you do live exams and then people can't rely on AI . LLMs make it easier to cheat, but it's not a categorical difference.
I feel like my experience of university (90% of the classes had in-person exams, some had home projects for a portion of the final marks) is fundamentally different from what other people experienced and this is very confusing for me.
There will be the before and after AI eras in academia.
FWIW I think OP came up with an excellent analogy.
Dead Comment
But this seems an unfair comparison. For one, I think 3D printing made me better, not worse, at engineering (back when mechanical engineering was my jam), as it allowed me to prototype and make mistakes faster and cheaper. While it hasn’t replaced all manufacturing (or even come close), it plays an important role in design without atrophying the skills of the user.
But 3D printing and AI are on totally different trajectories.
I haven't heard of Mattel saying, "we're going to see in what places we can replace standard molding with 3d printing". It's never been considered a real replacement, but rather potentially a useful way to make PoCs, prototypes and mockups.
Back to AI though.
I just checked the customer support page of a hyped AI app generator and its what you expect: "doesn't work on complex project" "wastes all my tokens" and "how to get a refund"
These things are over promising and a future miracle is required to justify valuations. Maybe the miracle will come maybe not.
I'm not sure why you continued using words when you summed up 3D printing with those four words. In the time it takes to print 1 object, you could have molded thousands of them. 3D printing has done a lot for manufacturing in terms of prototyping and making the first thing while improving flexibility for iterations. Using them for mass production is just not a sane concept.
Sure, but I'd argue the AIs are the new injection molding (as mentioned downthread) with the current batch being the equivalent of Bakelite.
Plus, who seriously said 3d printers were going to churn out Barbies by the millions? What I remember is people claiming they would be a viable source of one-off home production for whatever.
Dead Comment
3D printing has definitely replaced some manufacturing, and it has had a huge effect on product design.
These anti-AI articles are getting more tedious than the vibe coding articles.
Every line you accept without understanding is borrowed comprehension, which you’ll repay during maintenance with high interest. It feels like free velocity. But it's probably more like tech debt at ~40 % annual interest. As a tribe, we have to figure out how to use AI to automate typing and NOT thinking.
Or would be, if the LLM actually understood what it was writing, using the same definition of understanding that applies to human engineers.
Which it doesn't, and by its very MO, cannot.
So, every line from an LLM that is accepted without understanding, is really nonexistent comprehension. It's a line of code, spat out by a stochastic model, and until some entity that actually can comprehend a codebases context, systems and designs (and currently the only known entity that can do that is a human being), it is un-comprehended.
And the “rule of three” basically ceases to be applicable between components — either the code has localized impact, or is a part of rock-solid foundational library. Intermediate cases just explode the refactoring complexity.
(Emphasis mine)
This has been the biggest pain point for me, and the frustrating part is that you might not even realize you're leading it a particular way at all. I mean it makes sense with how LLMs work, but a single word used in a vague enough way is enough to skew the results in a bad direction, sometimes contrary to what you actually wanted to do, which can lead you down rabbit holes of wrongness. By the time you realize, you're deep in the sludge of haphazardly thrown-together code that sorta kinda barely works. Almost like human language is very vague and non-specific, which is why we invented formal languages with rules that allow for preciseness in the first place...
Anecdotally, I've felt my skills quickly regressing because of AI tooling. I had a moment where I'd reach out to it for every small task from laziness, but when I took a real step back I started realizing I'm not really even saving myself all that much time, and even worse is that I'm tiring myself out way quicker because I was reading through dozens or hundreds of lines of code, thinking about how the AI got it wrong, correcting it etc. I haven't measured, but I feel like in grand totality, I've wasted much more time than I potentially saved with AI tooling.
I think the true problem is that AI is genuinely useful for many tasks, but there are 2 camps of people using it. There are the people using it for complex tasks where small mistakes quickly add up, and then the other camp (in my experience mostly the managerial types) see it shit out 200 lines of code they don't understand, and in their mind this translates to a finished product because the TODO app that barely works is good enough for an "MVP" that they can point to and say "See, it can generate this, that means it can also do your job just as easily!".
To intercept the usual comments that are no doubt going to come flooding in about me using it wrong or trying the wrong model or whatever, please read through my old comment [1] for more context on my experience with these tools.
[1] https://news.ycombinator.com/item?id=44055448
In other words, AI is my assistant, but it is MY responsibility to turn up quality, maintainable work.
However, to put things in perspective for the masses: just consider the humble calculator. It has ruined people’s ability to do mental math. AI is going to do that for writing and communication skills, problem solving skills, etc.
I agree fully, I use it as a bouncing off point these days to verify ideas mostly.
The problem is, and I'm sure I'm not alone in this, management is breathing down my neck to use AI for fucking everything. Write the PR with AI, write the commit message with AI, write the code, the tests, use YOUR AI to parse MY AI's email that I didn't bother proofreading and has 4 logical inconsistencies in 1 sentence. Oh this simple feature that can easily be done for cheaper, quicker and easier without AI? Throw an AI at it! We need to sell AI! "You'll be left in the dust if you don't adopt AI now!"
It comes back to my point about there being 2 camps. The one camp actually uses AI and can see their strengths & weaknesses clear as day and realizes it's not a panacea to be used for literally everything, the other is jumping headfirst into every piece of marketing slop they come across and buying into the false realities the AI companies are selling them on.
I'm glad I'm not the only one who feels this way. It seems like these models latch on to a particular keyword somewhere in my prompt chain and throw traditional logic out the window as they try to push me down more niche paths that don't even really solve the original problem. Which just leads to higher levels of frustration and unhappiness for the human involved.
> Anecdotally, I've felt my skills quickly regressing because of AI tooling
To combat this, I've been trying to use AI to solve problems that I normally would with StackOverflow results: for small, bite-sized and clearly-defined tasks. Instead of searching "how to do X?", I now ask the model the same question and use its answer as a guide to solving the problem instead of a canonical answer.
That said, AI does make some things easier today, like if you have an example to use for "make me a page like this but with data from x instead of y". Often it's faster than searching documentation, even with the caveat that it might hallucinate. And ofc it will probably improve over time.
The particular improvement I'd like to see is (along with in general doing things right) finding the simplest solution without constantly having to be told to do so. My experience is the biggest drawback to letting chatgpt/claude/etc loose is quickly churning out a bunch of garbage, never stopping to say this will be too complex to do anything with in the future. TFA claims only humans can resist entropy by understanding the overall design; again idk if that will never improve but it feels like the big problem right now.
I've read you comment about all the things you tried, and it seems you have much broader experience with LLMs than I do. But I didn't see this technique mentioned, so leaving this here in case it helps someone else :).
A GREAT example is good old Coke vs Pepsi.
Dead Comment
> Only humans can decrease or resist complexity.
It's funny how often there's a genuine concept behind posts like these, but then lots of specific claims are plainly false. This is trivial to do: ask for simpler code. I'm using that quite often to get a second opinion and get great results. If you don't query the model, you don't get any answer - neither complex or simple. If you query with default options, it's still a choice, not something inherent to the idea of LLM.
I'm also having a great time converting code into ideas and diagrams and vice versa. Why make the strong claims that people contradict in practice every day now?
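For concreteness, the "ask for simpler code" second opinion is just a prompt like the sketch below (purely illustrative; it assumes the OpenAI Python client, and the model name and file path are placeholders I made up):

    # A minimal sketch of the "ask for simpler code" second opinion.
    # Assumes the OpenAI Python client; model name and file path are placeholders.
    from openai import OpenAI

    client = OpenAI()

    def simpler_version(code: str) -> str:
        # Ask the model for a simpler rewrite and a note on what it removed.
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{
                "role": "user",
                "content": "Propose a simpler version of this code. "
                           "Keep the behavior identical and list what you removed:\n\n" + code,
            }],
        )
        return resp.choices[0].message.content

    print(simpler_version(open("service.py").read()))  # "service.py" is just an example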
The article also repeats some weird arguments that are superficially true but don't stand up to scrutiny. That Naur thing, which is a meme at this point, is often repeated as somehow insightful - yet what's forgotten is another fundamental, practical rule of software engineering: any nontrivial program quickly exceeds anyone's ability to hold a full theory of it in their head. We almost never work with a proper program theory; languages, techniques, methodologies, and tools all evolve towards enabling people to work well without understanding most of the code. We actually share the same limitation as LLMs here; we're just better at managing it, because we don't have to wait for anyone to let us run another inference loop before taking a different perspective.
Etc.
The reason LLMs are such a big deal is that they're humanity's first tool general enough to support recursion (besides humans, of course). If you can use an LLM, there's like a 99% chance you can program another LLM to use that LLM the same way you do:
People learn the hard way how to properly prompt an LLM agent product X to get results -> some company encodes those learnings in a system prompt -> we now get a new agent product Y that can use X just like a human would -> we stop using X directly and move up one level in the command chain to Y. And this recursion goes on and on, until the world doesn't have any level left for us to go up to.
We are basically seeing this play out in realtime with coding agents in the past few months.
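A rough sketch of that X -> Y step, since it's simpler than it sounds (everything here is hypothetical; it assumes the OpenAI Python client, and the system prompt stands in for the "encoded learnings"):

    # Product Y = product X's hard-won prompting knowledge baked into a system
    # prompt, plus a thin wrapper. All names and prompts here are made up;
    # assumes the OpenAI Python client.
    from openai import OpenAI

    client = OpenAI()

    def agent_x(task: str) -> str:
        # The lower-level agent people originally learned to prompt by hand.
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": task}],
        )
        return resp.choices[0].message.content

    # The "learnings" a company would encode: how to phrase requests to X well.
    Y_SYSTEM_PROMPT = (
        "Rewrite the user's vague request into a precise, well-scoped prompt "
        "for a coding agent: state the goal, constraints, and acceptance criteria."
    )

    def agent_y(vague_request: str) -> str:
        # The higher-level agent: it prompts X the way an experienced user would.
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {"role": "system", "content": Y_SYSTEM_PROMPT},
                {"role": "user", "content": vague_request},
            ],
        )
        better_prompt = resp.choices[0].message.content
        return agent_x(better_prompt)  # we now talk to Y; Y talks to X

    print(agent_y("make the checkout page less janky"))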
> If you're doing a DIY project, let me know what you're trying to achieve
Which is basically the SO style question you mentioned.
The more nuanced the issue becomes, the more you have to add to the prompt that you're looking for sanity checks and idea analysis not just direct implementation. But it's always possible.
I frequently have the LLM write a proposal.MD first and iterate on that, then have it write the full solution and iterate on that.
It's interesting to see whether the proposal matches what I had in mind, and many times it uses tech or ideas I didn't know about myself, so I'm constantly learning too.
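The loop itself is easy to script if you want to; here's a rough sketch of the shape of it (the prompts, model name, file handling, and pause-for-editing step are all placeholders of mine; it assumes the OpenAI Python client):

    # Sketch of a proposal-first loop: draft proposal.MD, pause for human edits,
    # then generate the implementation from the agreed plan. Prompts, model name,
    # and file handling are placeholders; assumes the OpenAI Python client.
    from pathlib import Path
    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    task = "Add CSV export to the reporting module"  # example task

    # Step 1: draft the proposal and write it to disk for review and edits.
    proposal = ask(f"Write a short proposal.MD for this task. Plan only, no code:\n{task}")
    Path("proposal.MD").write_text(proposal)

    input("Edit proposal.MD as needed, then press Enter to continue...")

    # Step 2: implement from the (possibly edited) proposal.
    plan = Path("proposal.MD").read_text()
    print(ask(f"Implement the following proposal. Return code with brief notes:\n{plan}"))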
Do you do it manually or have an automated tool? (I am looking for the latter.)
And actually, that's not wrong. People really do often struggle to navigate these days if they don't have the crutch of something like Google Maps. It really has changed our relationship to the physical world in many ways.
But also, a lot of people weren't especially good at navigation before? The average person's ability to get from Point A to Point B safely and reliably, especially in unfamiliar areas, has certainly increased dramatically. And the small subset of people who are naturally skilled at geography and navigation have seen their abilities complemented, not replaced, by things like Google Maps.
I think AI will end up being similar, on a larger scale. Yes, there are definitely some trade offs, and some skills and abilities will decrease, but also many more people will be able to do work they previously couldn't, and a small number of people will get even better at what they do.
The problem is that mapping software is reliable and doesn't spit out what is essentially the result of a random number generator. You can rely on its output, the same way you can rely on a calculator. Not always, mind you, because mapping the entire globe is a massively complex task with countless caveats and edge cases, but compared to LLM output? Even at a temperature setting of 0, regenerating the same prompt multiple times can still give you different output.
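If you want to check that point yourself, it's a two-minute experiment (a sketch assuming the OpenAI Python client; the model name and prompt are placeholders, and the seed parameter is only best-effort on the models that support it):

    # Same prompt, temperature 0, two calls: outputs are usually close but not
    # guaranteed identical (batching/hardware nondeterminism). Assumes the
    # OpenAI Python client; model name and prompt are placeholders.
    from openai import OpenAI

    client = OpenAI()

    def run(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
            seed=42,  # best-effort determinism where supported; not a guarantee
        )
        return resp.choices[0].message.content

    prompt = "Write a Python function that parses an ISO 8601 date string."
    a, b = run(prompt), run(prompt)
    print("identical" if a == b else "different")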
Also, since LLMs cover a much broader swathe of concepts, people are going to use them instead of their brains in a lot of situations where they really shouldn't. Even with maps, there are people out there who will drive into a lake because Google Maps told them that's where the street was; I can't even fathom the kind of shit that's going to happen from people blindly trusting LLM output and supplanting all their thinking with LLM usage.
Not really.
I am not good at navigation yet love to walk around, so I use a bunch of map apps a lot.
Google Maps is not reliable if you expect optimal routes, and its accuracy falls sharply if you're not traveling by car. Even then, bus lanes, priority lanes, time-limited areas, etc. will be a bloodbath if you expect Maps to understand them.
The mapping itself is often inaccurate in any town that isn't frozen in time for decades, place names are often wrong, and it has no concept of verticality/3D space, short of switching to limited experimental views.
Paid dedicated map apps will in general work a lot better (I'm thinking hiking maps etc.)
All to say, I'd mostly agree with the parent on how fuzzy Maps are.
Actually, TSP is NP-hard in the general case (i.e., at best, you can't efficiently verify that you've been given the optimal route), and Google Maps might even give suboptimal routes intentionally sometimes; we don't know.
The problems you're describing are problems with people and they apply to every technology ever. Eg, people crash cars, blow up their houses by leaving the stove on, etc.
Er, no?
Entirely anecdotal but I have found the opposite. With this mapping software I can go walk in a random direction and confidently course correct as and when I need to, and once I’ve walked somewhere the path sticks in my memory very well.
The best tools are transparent. They are efficient, fast and reliable, yes, but they’re also honest about what they do! You can do everything manually if you want, no magic, no hidden internal state, and with internal parts that can be broken up and tested in isolation.
With LLMs even the simple act of comparing them side by side (to decide which to use) is probabilistic and ultimately based partly on feelings. Perhaps it comes with the territory, but this makes me extremely reluctant to integrate it into engineering workflows. Even if they had amazing abilities, they lower the bar significantly from a process perspective.
Driving still requires careful attention to other drivers, the world goes by rapidly, and most roads look like other roads.
Google Maps is better than a taxi driver 90% of the time where I live.
AI isn’t better than some person that did the thing for a couple days
Is there evidence for this?
My wife was very uncomfortable going to a new location via paper maps and directions. She’s perfectly happy following “bitching Betty” from the phone.
Then there is this Google Maps accident:
https://www.independent.co.uk/tv/news/driver-bridge-google-m...
Which tells you that blindly following a computer's directions makes people more stupid.
The real struggle will be that the people phoning it in are still going to be useless, just with AI now. The rest will learn and grow with AI.
It's similar with full self-driving. FSD is better than a bad, drunk, or texting human driver, and that's a lot of the drivers on the road.
There are real safety improvements from ADAS. For safety you only need crash avoidance, not a full-time chauffeur.