One of the examples in the dataset was taken from this issue:
https://github.com/pvlib/pvlib-python/issues/1028
What the AI is expected to do:
https://github.com/pvlib/pvlib-python/pull/1181/commits/89d2...
Make up your own mind about the test.
Cases like:
- The AI replaces a salesperson, but the sales are not binding or final, in case a client gets a bargain at $0 from the chatbot.
- It replaces drivers, but it disengages one second before hitting a tree so the blame lands on the human.
- Support wants you to press cancel so the reports say "client cancelled" and not "self-driving is doing laps around a patch of grass".
- AI is better than doctors at diagnosis, but in any case of misdiagnosis the blame is shifted to the doctor because "AI is just a tool".
- AI is better at coding than old meat devs, but when the unmaintainable security hole goes to production, the downtime and breaches cannot be blamed on the AI company producing the code; it was the old meat devs' fault.
AI companies want to have their cake and eat it too. Until I see them eating the liability, I know, and I know they know, it's not ready for the things they say it is.
" Is there an “inventiveness test” that humans can pass but LLMs don’t?"
Of course: any topic where there is no training data available and that cannot be extrapolated by simply remixing the existing data. Admittedly, that is harder to test on current unknowns and unknown unknowns.
But it is trivial to test on retrospective knowledge. Just train the AI on text up to, say, the 1800s and see if it can come up with antibiotics and general relativity, or if it will simply repeat outdated notions of disease theory and Newtonian gravity.
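If it helps, here is the shape of that experiment as a toy Python sketch. Everything in it (the corpus format, the probe questions, the model hook) is a placeholder I made up; it only illustrates the idea, not a real benchmark:

```python
# Toy sketch of the retrospective inventiveness test. All names and data
# formats here are hypothetical placeholders, not an existing benchmark.

CUTOFF_YEAR = 1800

def build_training_corpus(documents: list[dict]) -> list[dict]:
    """Keep only texts written before the cutoff, so the model knows miasma
    theory and Newtonian gravity but not germ theory or general relativity."""
    return [doc for doc in documents if doc["year"] < CUTOFF_YEAR]

# Questions whose accepted answers were only discovered after the cutoff.
PROBES = [
    "Propose a mechanism and a treatment for wound infections.",
    "Explain the anomalous precession of Mercury's perihelion.",
]

def run_retrospective_test(generate) -> list[str]:
    """`generate` is any prompt -> text callable backed by the cutoff-trained model."""
    return [generate(prompt) for prompt in PROBES]

# Grading (the hard part, left out here): does an answer go beyond the
# cutoff-era consensus, or merely restate it?
```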
The reason LLMs are such a big deal is that they are humanity's first tool general enough to support recursion (besides humans, of course). If you can use an LLM, there's something like a 99% chance you can program another LLM to use that LLM the same way you do:
People learn the hard way how to properly prompt an LLM agent product X to achieve results -> some company encodes those learnings in a system prompt -> we now get a new agent product Y that is capable of using X just like a human would -> we no longer use X directly. Instead, we move up one level in the command chain and use product Y. And this recursion goes on and on, until the world doesn't have any level left for us to go up to.
We are basically seeing this play out in real time with coding agents over the past few months.
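To make that concrete, here is a minimal Python sketch of the pattern. Every name in it (call_llm, PRODUCT_X_PLAYBOOK, product_x, product_y) is a made-up stand-in, not any real product's API:

```python
# Minimal sketch of the "move up one level" pattern. Everything here is a
# hypothetical stand-in; no real agent product or API is being described.

PRODUCT_X_PLAYBOOK = """You are driving coding agent X.
Ask X for a plan first, request one small diff at a time,
and make X run the tests before declaring success."""  # hard-won prompting lessons, now encoded

def call_llm(system_prompt: str, user_message: str) -> str:
    """Stand-in for any chat-completion style call."""
    raise NotImplementedError("wire up a model of your choice here")

def product_x(task: str) -> str:
    # Level 0: the agent people originally learned to prompt by hand.
    return call_llm("You are coding agent X.", task)

def product_y(task: str) -> str:
    # Level 1: a new agent whose system prompt is the accumulated know-how
    # of using X, so the human no longer drives X directly.
    instructions_for_x = call_llm(PRODUCT_X_PLAYBOOK, task)
    return product_x(instructions_for_x)

# Nothing stops a product_z from wrapping product_y the same way; that's the recursion.
```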
Well yes, LLMs are neither teleological nor inventive.
One thing working with AI-generated code forces you to do is read code -- development becomes more of a series of code reviews than a first-principles creative journey. I think this can be beneficial for solo developers: in a way, it mimics, and helps you learn, responsibilities that are otherwise only present in teams.
Another: it quickly becomes clear that working with an LLM requires the dev to have a clearly defined and well-structured hierarchical understanding of the problem. Trying to one-shot something substantial usually leads to that something being your foot. Approaching the problem from the design side, writing a detailed spec, then implementing sections of it -- this helps define boundaries and interfaces for the conceptual building blocks.
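For what it's worth, a made-up sketch of what that looks like in practice (hypothetical domain and names, not a prescription):

```python
# Hypothetical sketch of "design first, write the spec, then implement sections".
# The domain (a small report pipeline) and every name are invented for illustration.

from abc import ABC, abstractmethod
from dataclasses import dataclass

# 1. Pin down the data that crosses module boundaries.
@dataclass
class Report:
    title: str
    body: str

# 2. Define the interfaces between building blocks before writing any logic.
class Fetcher(ABC):
    @abstractmethod
    def fetch(self, source: str) -> str:
        """Return raw text from a source."""

class Renderer(ABC):
    @abstractmethod
    def render(self, report: Report) -> str:
        """Turn a report into its output format."""

# 3. Only now hand the LLM one section at a time ("implement a Fetcher for
#    local files", "implement an HTML Renderer"), each bounded by an
#    interface it is not allowed to redesign on a whim.
```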
I have more observations, but attention is scarce, so -- to conclude. We can look at LLMs as a powerful accelerant, helping junior devs grow into senior roles. With some guidance, these tools make apparent the progression of lessons the more experienced of us took time to learn. I don't think it's all doom and gloom. AI won't replace developers, and while it's incredibly disruptive at the moment, I think it will settle into a place among other tools (perhaps on a shelf all of its own).
Just look at the recent news: layoff after layoff from Big Tech, mid-sized tech, and small tech alike.