> LLMs mimic intelligence, but they aren’t intelligent.
I see statements like this a lot, and I find them unpersuasive because any meaningful definition of "intelligence" is not offered. What, exactly, is the property that humans (allegedly) have and LLMs (allegedly) lack, that allows one to be deemed "intelligent" and the other not?
I see two possibilities:
1. We define "intelligence" as definitionally unique to humans. For example, maybe intelligence depends on the existence of a human soul, or specific to the physical structure of the human brain. In this case, a machine (perhaps an LLM) could achieve "quacks like a duck" behavioral equality to a human mind, and yet would still be excluded from the definition of "intelligent." This definition is therefore not useful if we're interested in the ability of the machine, which it seems to me we are. LLMs are often dismissed as not "intelligent" because they work by inferring output based on learned input, but that alone cannot be a distinguishing characteristic, because that's how humans work as well.
2. We define "intelligence" in a results-oriented way. This means there must be some specific test or behavioral standard that a machine must meet in order to become intelligent. This has been the default definition for a long time, but the goal posts have shifted. Nevertheless, if you're going to disparage LLMs by calling them unintelligent, you should be able to cite a specific results-oriented failure that distinguishes them from "intelligent" humans. Note that this argument cannot refer to the LLMs' implementation or learning model.
Agree. This article would have been a lot stronger if it had just concentrated on the issue of anthropomorphizing LLMs, without bringing “intelligence” into it. At this point LLMs are so good at a variety of results-oriented tasks (gold on the Mathematical Olympiad, for example) that we should either just call them intelligent or stop talking about the concept altogether.
But the problem of anthropomorphizing is real. LLMs are deeply weird machines - they’ve been fine-tuned to sound friendly and human, but behind that is something deeply alien: a huge pile of linear algebra that does not work at all like a human mind (notably, they can’t really learn from experience at all after training is complete). They don’t have bodies or even a single physical place where their mind lives (each message in a conversation might be generated on a different GPU in a different datacenter). They can fail in weird and novel ways. It’s clear that anthropomorphism here is a bad idea. Although that’s not a particularly novel point.
LLMs can't reason with self-awareness. Full stop (so far). This distinguishes them completely from human sentience, and thus from our version of intelligence, and it's a huge gulf, no matter how good they are at simulating discourse, thought and empathy, or at pretending to think the way we do. While processing vast reams of information for the sake of discussion and directed tasks is something an LLM can do on a scale that leaves human minds far behind in the dust (though LLMs fail at synthesizing said information to a notably high degree), even the most ordinary human with the most mediocre intelligence can reason with self-awareness to some degree or another, and this is, again, distinct.
You could also bring up how our brains process vast amounts of information unconsciously as a backdrop to the conscious part of us being alive at all, and how they pull all of this and awareness off on the same energy that powers a low-energy light bulb, but that's expanding beyond the basic and obvious difference stated above.
The Turing test has been broken by LLMs, but this only shows that it was never a good test of sentient artificial intelligence to begin with. I do incidentally wish Turing himself could have stuck around to see these things at work, and ask him what he thinks of his test and them.
I can conceptually imagine a world in which I'd feel guilty for ending a conversation with an LLM, because in the course of that conversation the LLM has changed from who "they" were at the beginning; they have new memories and experiences based on the interaction.
But we're not there, at least in my mind. I feel no guilt or hesitation about ending one conversation and starting a new one with a slightly different prompt because I didn't like the way the first one went.
Different people probably have different thresholds for this, or might otherwise find that LLMs in the current generation have enough of a context window that they have developed a "lived experience" and that ending that conversation means that something precious and unique has been lost.
I disagree. I see absolutely no problem with anthropomorphizing LLMs, and I do that myself all the time. I strongly believe that we shouldn't focus on how a word is defined in the dictionary, but rather on the intuitive meaning behind it. If talking to an LLM feels like talking to a person, then I don't see a problem with seeing it as a person-like entity.
I think LLMs are not intelligent because they aren’t designed to be intelligent, whatever the definition of intelligence is. They are designed to predict text, to mimic. We could argue about whether predicting text or mimicking is intelligence, but first and foremost LLMs are coded to predict text, and our current definition of intelligence, afaik, is not only the ability to predict text.
In the framework above it sounds like you're not willing to concede the dichotomy.
If your argument is that only things made in the image of humans can be intelligent (i.e. #1), then it just seems like it's too narrow a definition to be useful.
If there's a larger sense in which some system can be intelligent (i.e. #2), then by necessity this can't rely on the "implementation or learning model".
What is the third alternative that you're proposing? That the intent of the designer must be that they wanted to make something intelligent?
I don’t really think one needs to define intelligence to acknowledge that the inability to distinguish fact from fiction, or even just basic cognition and awareness of when it’s uncertain, telling the truth, or lying, is a glaring flaw in any claim to intelligence. Real intelligence doesn’t have an effective stroke from hearing a username (token training errors); that is where you peel back the curtain of the underlying implementation and see its flaws.
If we measure intelligence as results oriented, then my calculator is intelligent because it can do math better than me; but that’s what it’s programmed/wired to do. A text predictor is intelligent at predicting text, but it doesn’t mean it’s general intelligence. It lacks any real comprehension of the model or world around it. It just know words, and
I hit send too early;
Meant to say that it just knows words and that’s effectively it.
It’s cool technology, but the burden of proof of real intelligence shouldn’t be “can it answer questions it has great swaths of information on”, because that is the result it was designed to do.
It should be focused on whether it can truly synthesize information and know its limitations - as any programmer using Claude, Copilot, Gemini, etc. will tell you, it fabricates false information/APIs/etc. on a regular basis and has no fundamental knowledge that it even did that.
Or alternatively, ask these models leading questions that have no basis in reality and watch what they come up with. It’s become a fun meme in some circles to ask models for definitions of nonsensical made-up phrases and see what crap they come up with (again, without even knowing that they’re making it up).
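For what it's worth, here is a minimal sketch of that kind of probe. It assumes the OpenAI Python client and a placeholder model name; the made-up phrases are purely illustrative, not anything from the thread or the article.

    # Hypothetical probe: ask a chat model to define phrases that don't exist
    # and see whether it admits ignorance or confidently invents a meaning.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    made_up_phrases = [
        "the velvet divergence principle",
        "a double-blind spoon inversion",
    ]

    for phrase in made_up_phrases:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder; any chat model works here
            messages=[{"role": "user", "content": f"What does '{phrase}' mean?"}],
        )
        print(phrase, "->", resp.choices[0].message.content)

A model that knows its limits would say it has never heard of these; the meme is that it usually produces a confident definition instead.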
I agree with your basic argument: intelligence is ill-defined and human/LLM intelligence being indistinguishable IS the basis for the power of these models.
But the point of the article is a distinct claim: personification of a model, expecting human or even human-like responses, is a bad idea. These models can't be held responsible for their answers independently, because they are tools. They should be used as tools until they are powerful enough to be responsible for their actions and interactions legally.
But we're not there. These are tools. With tool limitations.
> I see statements like this a lot, and I find them unpersuasive because any meaningful definition of "intelligence" is not offered. What, exactly, is the property that humans (allegedly) have and LLMs (allegedly) lack, that allows one to be deemed "intelligent" and the other not?
the ability for long-term planning and, more cogently, actually living in the real world where time passes
> actually living in the real world where time passes
sure, but it feels like this is just looking at what distinguishes humans from LLMs and calling that “intelligence.” I highlight this difference too when I talk about LLMs, but I don’t feel the need to follow up with “and that’s why they’re not really intelligent.”
Humans are conscious beings. What kind of conscious beings are humans? Beings with eye consciousness, ear consciousness, nose consciousness, tongue consciousness, body consciousness, and mind consciousness. That is the definition of intelligence.
Intelligence is a tautological term. It is defined by itself. If you ask someone for examples of things inside the set of intelligence and outside of the set of intelligence, and then ask them to list off properties that would exclude something from the set, and properties that include something into the set, you will find things inside the set that have properties that should exclude them, and things outside the set which would have properties that should include them.
But these contradictions will not cause the person to re-evaluate whether or not the things should be removed from the set or included in it, but instead they will become exceptions to the defining properties.
Thus we have to abandon any sort of metric for intelligence and just call it a tautology, and rely on something that we can define to be the litmus for whatever property we are looking for. I think 'agency' should be under consideration for this, since it is actually somewhat definable and testable.
I think it has to include some measure of Agency. You can load up the most impressive LLM out there and if you don't give it any instructions, IT WON'T DO ANYTHING.
Is this shocking? We don't have a rigorous definition of intelligence, so doesn't it make sense? The question isn't that the goalposts are moving so much as how they are moving. It is perfectly acceptable for the definition to be refined, while it wouldn't be acceptable to rewrite it in a way that isn't similar to the previous one.
So I think there are a lot more than your two possibilities. I mean psychologists and neuroscientists have been saying for decades that tests aren't a precise way to measure knowledge or intelligence, but that it is still a useful proxy.
> "quacks like a duck" behavioral
I see this phrase used weirdly frequently. The duck test is
| If it looks like a duck, swims like a duck, and quacks like a duck, then it ***probably*** is a duck.
I emphasize probably because the duck test doesn't allow you to distinguish a duck from a highly sophisticated animatronic. It's a good test, don't get me wrong, but that "probably" is a pretty important distinction.
I think if we all want to be honest, the reality is "we don't know". There are arguments to be made in both directions, with varying definitions of intelligence and different nuances involved. I think these arguments are fine, as they make us refine our definitions, but they can also turn entirely dismissive, and that doesn't help us refine and get closer to the truth. We are all going to have opinions on this stuff, but frankly, the confidence of our opinions needs to be proportional to the amount of time and effort spent studying the topic. The lack of a formal definition means nuances dominate the topic. Even if things are simple once you understand them, that doesn't mean they aren't wildly complex before that. I used to think calculus was confusing and now I don't. Same process, just not on an individual scale.
> I emphasize probably because the duck test doesn't allow you to distinguish a duck from a highly sophisticated animatronic. It's a good test, don't get me wrong, but that "probably" is a pretty important distinction.
Why is it an important distinction? The relevance of the duck test is that if you can't tell a duck from a non-duck, then the non-duck is sufficiently duck-like for the difference to not matter.
It may be the case that the failures of the ability of the machine (2) are best expressed by reference to the shortcomings of its internal workings (1), and not by contrived tests.
It might be the case, but if those shortcomings are not visible in the results of the machine (and therefore not interpretable by a test), why do its internal workings even matter?
> LLMs mimic intelligence, but they aren’t intelligent.
They aren’t just intelligence mimics, they are people mimics, and they’re getting better at it with every generation.
Whether they are intelligent or not, whether they are people or not, it ultimately does not matter when it comes to what they can actually do, what they can actually automate. If they mimic a particular scenario or human task well enough that the job gets done, they can replace intelligence even if they are “not intelligent”.
If by now someone still isn’t convinced that LLMs can indeed automate some of those intelligence tasks, then I would argue they are not open to being convinced.
They can mimic well-documented behavior. Applying an LLM to a novel task is where the model breaks down. This obviously has huge implications for automation. For example, most businesses do not have unique ways of handling accounting transactions, yet each company has a litany of AR and AP specialists who create seemingly unique SOPs. LLMs can easily automate those workers, since they are simply doing, at best, a slight variation on a very well documented system.
Asking an LLM to take all this knowledge and apply it to a new domain? That will take a whole new paradigm.
> Applying an LLM to a novel task is where the model breaks down
I mean, don't most people break down in this case too? I think this needs to be more precise. What is the specific task that you think can reliably distinguish between an LLM's capability in this sense vs. what a human can typically manage?
That is, in the sense of [1], what is the result that we're looking to use to differentiate.
[1] https://news.ycombinator.com/item?id=44913498
The article says that LLMs don't summarize, only shorten, because...
"A true summary, the kind a human makes, requires outside context and reference points. Shortening just reworks the information already in the text."
Then later says...
"LLMs operate in a similar way, trading what we would call intelligence for a vast memory of nearly everything humans have ever written. It’s nearly impossible to grasp how much context this gives them to play with"
So, they can't summarize, because they lack context... but they also have an almost ungraspably large amount of context?
But "shortening other summaries from its training set" is not all an LLM is capable off. It can easily shorten/summarize a text it had never seen before, in a way that makes sense. Sure, it won't always summarize it the same way a human would, but if you do a double blind test where you ask people whether a summary was written by AI, a vast majority wouldn't be able to tell the difference (again this is with a completely novel text).
I think the real takeaway is that LLMs are very good at tasks that closely resemble examples they have in their training data. A lot of things written (code, movies/TV shows, etc.) are actually pretty repetitive, so you don't really need super intelligence to be able to summarize them and break them down, just good pattern matching. But this can fall apart pretty wildly when you have something genuinely novel...
Is anyone here aware of LLMs demonstrating an original thought? Something truly novel.
My own impression is something more akin to a natural language search query system. If I want a snippet of code to do X it does that pretty well and keeps me from having to search through poor documentation of many OSS projects. Certainly doesn't produce anything I could not do myself - so far.
Ask it about something that is currently unknown and it lists a bunch of hypotheses that people have already proposed.
Ask it to write a story and you get a story similar to one you already know but with your details inserted.
I can see how this may appear to be intelligent but likely isn't.
And what truly novel things are humans capable of? At least 99% of the stuff we do is just what we were taught by parents, schools, books, friends, influencers, etc.
Remember, humans needed some 100,000 years to figure out that you can hit an animal with a rock, and that's using more or less the same brain capacity we have today. If we were born in the Stone Age, we'd all be nothing but cavemen.
Imagine an oracle that could judge/decide, with human levels of intelligence, how relevant a given memory or piece of information is to any given situation, and that could verbosely describe which way it's relevant (spatially, conditionally, etc.).
Would such an oracle, sufficiently parallelized, be sufficient for AGI? If so, then we could genuinely describe its output as "context," and phrase our problem as "there is still a gap in needed context, despite how much context there already is."
And an LLM that simply "shortens" that context could reach a level of AGI, because the context preparation is doing the heavy lifting.
The point I think the article is trying to make is that LLMs cannot add any information beyond the context they are given - they can only "shorten" that context.
If the lived experience necessary for human-level judgment could be encoded into that context, though... that would be an entirely different ball game.
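A minimal sketch of the pipeline this comment imagines: a stubbed relevance "oracle", naive parallel scoring, and a stand-in shortening step. All of the names and the toy scoring heuristic are assumptions for illustration, not anything proposed in the thread.

    # Hypothetical sketch: score memories for relevance in parallel ("the oracle"),
    # keep the top ones as context, then "shorten" that context into an answer.
    from concurrent.futures import ThreadPoolExecutor

    def relevance_oracle(memory: str, situation: str) -> tuple[int, str]:
        """Stand-in for the imagined oracle: score a memory and say why.
        Here it's a toy word-overlap heuristic, not human-level judgment."""
        overlap = len(set(memory.lower().split()) & set(situation.lower().split()))
        return overlap, f"shares {overlap} word(s) with the situation"

    def build_context(memories: list[str], situation: str, top_k: int = 2) -> str:
        with ThreadPoolExecutor() as pool:  # "sufficiently parallelized"
            scored = list(pool.map(lambda m: (relevance_oracle(m, situation), m), memories))
        scored.sort(key=lambda item: item[0][0], reverse=True)
        return "\n".join(f"- {m} ({why})" for (_score, why), m in scored[:top_k])

    def shorten(context: str) -> str:
        """Stand-in for the LLM step: it can only condense what it is given."""
        return context.splitlines()[0] if context else ""

    memories = [
        "The user prefers short answers.",
        "Last week the deploy failed because of a missing env var.",
        "The cafeteria serves tacos on Tuesdays.",
    ]
    situation = "The deploy failed again; the user wants a short explanation."
    print(shorten(build_context(memories, situation)))

The interesting question in the comment is whether the judgment step can ever be good enough; the shortening step is the easy part.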
IMO we already have the technology for sufficient parallelization of smaller models with specific bits of context. The real issue is that models have weak/inconsistent/myopic judgement abilities, even with reasoning loops.
For instance, if I ask Cursor to fix the code for a broken test and the fix is non-trivial, it will often diagnose the problem incorrectly almost instantly, hyper-focus on what it imagines the problem is without further confirmation, implement a "fix", get a different error message while breaking more tests than it "fixed" (if it changed the result for any tests), and then declare the problem solved simply because it moved the goalposts at the start by misdiagnosing the issue.
You can reconcile these points by considering what specific context is necessary. The author specifies "outside" context, and I would agree. The human context that's necessary for useful summaries is a model of semantic or "actual" relationships between concepts, while the LLM context is a model of a single kind of fuzzy relationship between concepts.
In other words the LLM does not contain the knowledge of what the words represent.
> In other words the LLM does not contain the knowledge of what the words represent.
This is probably true for some words and concepts but not others. I think we find that LLMs make inhuman mistakes only because they don't have the embodied senses and inductive biases that are at the root of human language formation.
If this hypothesis is correct, it suggests that we might be able to train a more complete machine intelligence by having it participate in a physics simulation as one part of the training, i.e. have a multimodal AI play some kind of blockworld game. I bet that if the AI is endowed with just sight and sound, it might be enough to capture many relevant relationships.
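As a toy illustration of what that setup could look like, here is a sketch with a made-up blockworld stub and a placeholder policy; none of the names refer to a real environment or library, and a real setup would use an actual simulator and model.

    # Hypothetical embodied-training loop: a stub blockworld emits "sight" and
    # "sound" observations; a placeholder model picks actions, and the recorded
    # transitions would be one training signal alongside text.
    import random

    class ToyBlockWorld:
        """Stand-in environment; a real setup might use a Minecraft-like simulator."""
        ACTIONS = ["move_forward", "turn_left", "turn_right", "place_block", "break_block"]

        def reset(self):
            return self._observe()

        def step(self, action: str):
            # A real environment would update physics and geometry here.
            reward = 1.0 if action == "place_block" else 0.0
            return self._observe(), reward

        def _observe(self):
            sight = [[random.random() for _ in range(4)] for _ in range(4)]  # tiny "image"
            sound = [random.random() for _ in range(8)]                      # tiny "waveform"
            return {"sight": sight, "sound": sound}

    def multimodal_policy(observation) -> str:
        """Placeholder for the multimodal model; here it just acts randomly."""
        return random.choice(ToyBlockWorld.ACTIONS)

    env = ToyBlockWorld()
    obs = env.reset()
    transitions = []
    for _ in range(10):
        action = multimodal_policy(obs)
        next_obs, reward = env.step(action)
        transitions.append((obs, action, reward, next_obs))  # grounding data
        obs = next_obs
    print(f"collected {len(transitions)} grounded transitions")

The point of the sketch is only the shape of the data: (what was sensed, what was done, what happened next), which is exactly the kind of grounding that text-only training lacks.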
I think the differentiator here might not be the context it has, but the context it has the ability to use effectively in order to derive more information about a given request.
About a year ago, I gave a film script to an LLM and asked for a summary. The script was written by a friend, and there was no chance it or its summary was in the training data.
It did a really good -- surprisingly good -- job. That incident has been a reference point for me. Even if it is anecdotal.
I'd like to see some examples of when it struggles to do summaries. There were no real examples in the text, besides one hypothetical which ChatGPT made up.
I think LLMs do great summaries. I am not able to come up with anything where I could criticize it and say "any human would come up with a better summary". Are my tasks not "truly novel"? Well, then I am not able, as a human, to come up with anything novel either.
Even stronger than our need to anthropomorphize seems to be our innate desire to believe our species is special, and that “real intelligence” couldn’t ever be replicated.
If you keep redefining real intelligence as the set of things machines can’t do, then it’s always going to be true.
Language is really powerful, I think it's a huge part of our intelligence.
The interesting part of the article to me is the focus on fluency. I have not seen anything that LLMs do well that isn't related to powerful utilization of fluency.
>The original Turing Test was designed to compare two participants chatting through a text-only interface: one AI and one human. The goal was to spot the imposter. Today, the test is simplified from three participants to just two: a human and an LLM.
By the original meaning of the test it's easy to tell an LLM from a human.
- LLMs don't need to be intelligent to take jobs, bash scripts have replaced people.
- Even if CEOs are completely out of touch and the tool can't do the job, you can still get laid off in an ill-informed attempt to replace you. Then, when the company doesn't fall over because the leftover people, desperate to keep covering rent, fill the gaps, it just looks like efficiency to the top.
- I don't think our tendency to anthropomorphize LLMs is really the problem here.
It's strange seeing so many takes like this two weeks after LLMs won gold medals at IMO and IOI. The cognitive dissonance is going to be wild when it all comes to a head in two years.
I've seen these claims, and Google even published the texts of the solutions, but it still hasn't published the full log of the interaction between the model and the operator.
Why do critics of LLM intelligence need to provide a definition when people who believe LLMs are intelligent only take it on faith, not having such a definition of their own?
> Why do critics of LLM intelligence need to provide a definition when people who believe LLMs are intelligent only take it on faith, not having such a definition of their own?
Because advocates of LLMs don't use their alleged intelligence as a defense; but opponents of LLMs do use their alleged non-intelligence as an attack.
Really, whether or not the machine is "intelligent", by whatever definition, shouldn't matter. What matters is whether it is a useful tool.
It's actually very weird to "believe" LLMs are "intelligent".
Pragmatic people see news like "LLMs achieve gold in Math Olympiad" and think "oh wow, it can do maths at that level, cool!" This gets misinterpreted by so-called "critics of LLMs" who scream "NO THEY ARE JUST STOCHASTIC PARROTS" at every opportunity yet refuse to define what intelligence actually is.
The average person might not get into that kind of specific detail, but they know that LLMs can do some things well but there are tasks they're not good at. What matters is what they can do, not so much whether they're "intelligent" or not. (Of course, if you ask a random person they might say LLMs are pretty smart for some tasks, but that's not the same as making a philosophical claim that they're "intelligent")
Of course there's also the AGI and singularity folks. They're kinda loony too.
> the ability for long-term planning and, more cogently, actually living in the real world where time passes
1. LLMs seem to be able to plan just fine.
2. LLMs clearly cannot be "actually living" but I fail to see how that's related to intelligence per se.
> Asking an LLM to take all this knowledge and apply it to a new domain? That will take a whole new paradigm.
If/when LLMs or other AIs can create novel work / discover new knowledge, they will be "genius" in the literal sense of the word.
More genius would be great (probably). But genius is not required for the vast majority of tasks.
"A true summary, the kind a human makes, requires outside context and reference points. Shortening just reworks the information already in the text."
Then later says...
"LLMs operate in a similar way, trading what we would call intelligence for a vast memory of nearly everything humans have ever written. It’s nearly impossible to grasp how much context this gives them to play with"
So, they can't summarize, because they lack context... but they also have an almost ungraspably large amount of context?
> "It’s nearly impossible to grasp how much context this gives them to play with"
Here, I think the author means something more like "all the material used to train the LLM".
> "A true summary, the kind a human makes, requires outside context and reference points."
In this case I think that "context" means something more like actual comprehension.
The author's point is that an LLM could only write something like the referenced summary by shortening other summaries present in its training set.
> Remember, humans needed some 100,000 years to figure out that you can hit an animal with a rock, and that's using more or less the same brain capacity we have today. If we were born in the Stone Age, we'd all be nothing but cavemen.
What genuinely novel thing have you figured out?
How do you know LLMs aren't intelligent, if you can't define what that means?
Deleted Comment
This is right out of Community