I am highly skeptical of LLMs as a mechanism to achieve AGI, but I also find this paper fairly unconvincing, bordering on tautological. I feel similarly about this as to what I've read of Chalmers - I agree with pretty much all of the conclusions, but I don't feel like the text would convince me of those conclusions if I disagreed; it's more like it's showing me ways of explaining or illustrating what I already believed.
On embodiment - yes, LLMs do not have corporeal experience. But it's not obvious that this means that they cannot, a priori, have an "internal" concept of reality, or that it's impossible to gain such an understanding from text. The argument feels circular: LLMs are similar to a fake "video game" world because they aren't real people - therefore, it's wrong to think that they could be real people? And the other half of the argument is that because LLMs can only see text, they're missing out on the wider world of non-textual communication; but then, does that mean that human writing is not "real" language? This argument feels especially weak in the face of multi-modal models that are in fact able to "see" and "hear".
The other flavor of argument here is that LLM behavior is empirically non-human - e.g., the argument about not asking for clarification. But that only means that they aren't currently matching humans, not that they couldn't.
Basically all of these arguments feel like they fall down to the strongest counterargument I see proposed by LLM-believers, which is that sufficiently advanced mimicry is not only indistinguishable from the real thing, but at the limit in fact is the real thing. If we say that it's impossible to have true language skills without implicitly having a representation of self and environment, and then we see an entity with what appears to be true language skills, we should conclude that that entity must contain within it a representation of self and environment. That argument doesn't rely on any assumptions about the mechanism of representation other than a reliance on physicalism. Looking at it from the other direction, if you assume that all that it means to "be human" is encapsulated in the entropy of a human body, then that concept is necessarily describable with finite entropy. Therefore, by extension, there must be some number of parameters and some model architecture that completely encode that entropy. Questions like whether LLMs are the perfect architecture or whether the number of parameters required is a number that can be practically stored on human-manufacturable media are engineering questions, not philosophical ones: finite problems admit finite solutions, full stop.
Again, that conclusion feels wrong to me... but if I'm being honest with myself, I can't point to why, other than to point at some form of dualism or spirituality as the escape hatch.
> sufficiently advanced mimicry is not only indistinguishable from the real thing, but at the limit in fact is the real thing
I am continually surprised at how relevant and pervasive one of Kurt Vonnegut’s major insights is: “we are what we pretend to be, so we must be very careful about what we pretend to be”
Everyone in the "life imitates art, not the other way around" camp (and also neo-platonists/gnostics i.e. https://en.wikipedia.org/wiki/Demiurge ) is getting massively validated by the modern advances in AI right now.
Isn't any formal "proof" or "reasoning" that shows that something cannot be AGI inherently flawed, because we have a hard time formally describing what AGI is anyway.
Like your argument: embodiment is missing in LLMs, but is it needed for AGI? Nobody knows.
I feel we first have to do a better job defining the basics of intelligence; then we can define what it means to be an AGI, and only then can we prove that something is, or is not, AGI.
It seems that we skipped step 1 because it's too hard, and jumped straight to step 3.
It's a real mess.
Yep, this is a big part of it. Intelligence and consciousness are barely understood beyond "I'll know it when I see it", which doesn't work for things you can't see - and in the case of consciousness, most definitions are explicitly based on concepts that are not only invisible but ineffable. And then we have no solid idea whether these things we can't really define, detect, or explain are intrinsically linked to each other or have a causal relationship in either direction. Almost any definition you pick is going to lead to some unsatisfying conclusions vis a vis non-human animals or "obviously not intelligent" forms of machine learning.
To me LLMs seem to most closely resemble the regions of the brain used for converting speech to abstract thought and vice-versa, because LLMs are very good at generating natural language and knowing the flow of speech. An LLM is similar to if you took Wernicke's and Broca's areas and stuck a regression between them. The problem is that the regression in the middle is just a brute force of the entire world's knowledge instead of a real thought.
I think the major lessons from the success of LLMs are two: 1) the astonishing power of a largely trivial association engine based only on the semantic categories inferred by word2vec, and 2) that so much of the communication ability of the human mind requires so little rational thought, since LLMs demonstrate essentially none of the skills of Kahneman and Tversky's System 2 thinking (logic, circumspection, self-correction, reflection, etc.).
I guess this also disproves Minsky's 'Society of Mind' conjecture - a large part of human cognition (System 1) does not require the complex interaction of heterogeneous mental components.
>that sufficiently advanced mimicry is not only indistinguishable from the real thing, but at the limit in fact is the real thing.
While "sufficiently" does a lot of the heavy lifting here, the "indistinguishable" criterion implicitly means there must be no way to tell that it is not the real thing. The belief that it is the real thing comes from the intuition that anything that can be everything a person must be, must also have that fundamental essence of being a person. I don't think people could really conceive an alternative without resorting to prejudice, which they could equally apply to machines or people.
I take the arguments such as in this paper to be instead making the claim that because X cannot be Y you will never be able to make X indistinguishable from Y. It is more a prediction of future failure than a judgment on an existing thing.
I end up looking at some of these complaints from the point of view of my sometimes profession of Game Developer. When I show someone a game in development to playtest they will find a bunch of issues. The vast majority of those issues, not only am I already aware of, but I have a much more detailed perspective of what the problem is and how it might be fixed. I have been seeing the problem, over and over, every day as I work. The problem persists because there are other things to do before fixing the issue, some of which might render the issue redundant anyway.
I feel like a lot of the criticisms of AI are like this: they are like the playtesters pointing out issues in the current state, where those working on the problems are generally well aware of the particular issues and have a variety of solutions in mind that might help.
Clear statements of deficiencies in ability are helpful as a guide to measure future success.
I'm also in the camp that LLMs cannot be an AGI on their own; on the other hand, I do think the architecture might be extended to become one. There is an easy out for any criticism: to say, "Well, it's not an LLM anymore".
In a way that ends up with a lot of people saying:
- The current models cannot do the things we know the current models cannot do
- Future models will not be able to do those things if they are the same as the current ones
- Therefore the things that will be able to do those things will be different
That is true, but hardly enlightening.
> Future models will not be able to do those things if they are the same as the current ones
I think a lot of people disagree with this. People think if we just keep adding parameters and data, magic will happen. That’s kind of what happened with ChatGPT after all.
One of the issues here is that future-focused discussions often lead to wild speculation because we don’t know the future. Also, there’s often too much confidence in people’s preferred predictions (skeptical or optimistic) and it would be less heated if we admitted that we don’t know how things will look even a couple of years out, and alternative scenarios are reasonable.
So I think you’re right, it’s not enlightening. Criticism of overconfident predictions won’t be enlightening if you already believe that they’re overconfident and the future is uncertain. Conversations might be more interesting if not so focused on bad arguments of the other side.
But perhaps such criticism is still useful. How else do you deflate excessive hype or skepticism?
> LLMs do not have corporeal experience. But it's not obvious that this means that they cannot, a priori, have an "internal" concept of reality, or that it's impossible to gain such an understanding from text.
I would argue it is (obviously) impossible the way the current implementation of models work.
How could a system which produces a single next word based upon a likelihood and a parameter called a "temperature" have a conceptual model underpinning it? Even theoretically?
Humans and animals have an obvious conceptual understanding of the world. Before we "emit" a word or a sentence, we have an idea of what we're going to say. This is obvious when talking to children, who know something and have a hard time saying it. Clearly, language is not the medium in which they think or develop thoughts, merely an imperfect (and often humorous) expression of it.
Not so with LLMs!! Generative LLMs do not have a prior concept available before they start emitting text. That the "temperature" can chaotically change the output as the tokens proceed just goes to show there is no pre-existing concept to reference. It looks right, and often is right, but generative systems are basically always hallucinating: they do not have any concepts at all. That they are "right" as often as they are is a testament to the power of curve fitting and compression of basis functions in high dimensionality spaces. But JPEGs do the same thing, and I don't believe they have a conceptual understanding of pictures.
Transformer models have been shown to spontaneously form internal, predictive models of their input spaces. This is one of the most pervasive misunderstandings about LLMs (and other transformers) around. It is of course also true that the quality of these internal models depends a lot on the kind of task it is trained on. A GPT must be able to reproduce a huge swathe of human output, so the internal models it picks out would be those that are the most useful for that task, and might not include models of common mathematical tasks, for instance, unless they are common in the training set.
Have a look at the OthelloGPT papers (can provide links if you're interested). This is one of the reasons people are so interested in them!
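To make "internal model" concrete: the OthelloGPT line of work trains small probes on the network's hidden activations to see whether the board state can be read back out of them. Below is a minimal, self-contained sketch of that probing setup; the arrays are random placeholders standing in for activations you would actually extract from a trained model, and all sizes and names are made up for illustration.

```python
# Sketch of the linear-probing idea behind the OthelloGPT results: if a model
# only tracked surface statistics, a simple linear readout of its hidden
# states should not recover the underlying board; if it has an internal world
# model, it often can. The arrays below are random placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

n_positions, d_model = 5000, 512          # hypothetical sizes
rng = np.random.default_rng(0)

# Stand-ins: hidden_states[i] = residual-stream activation after move i,
# board_state[i] = contents of one fixed square (0=empty, 1=mine, 2=theirs).
hidden_states = rng.normal(size=(n_positions, d_model))
board_state = rng.integers(0, 3, size=n_positions)

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, board_state, test_size=0.2, random_state=0)

probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)

# With real activations, accuracy well above chance (~1/3 here) is taken as
# evidence that the board is linearly decodable from the hidden state.
print("probe accuracy:", probe.score(X_test, y_test))
```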
> How could a system which produces a single next word based upon a likelihood and a parameter called a "temperature" have a conceptual model underpinning it? Even theoretically?
Could a creature that simply evolved to survive and reproduce possibly have a conceptual model underpinning it? Model training and evolution are very different processes, but they are both ways of optimizing a physical system. It may be the case that evolution can give rise to intelligence and model training can’t, but we need some argument to prove that.
> generative systems are basically always hallucinating: they do not have any concepts at all. That they are "right" as often as they are is a testament to the power of curve fitting and compression of basis functions in high dimensionality spaces
It's refreshing to read someone who "got it". Sad that before my upvote the comment was grayed out.
Any proponent of conceptual or other wishful/magical thinking should come with proofs, since it is the hypothesis that diverges from the definition of an LLM.
The argument would be that that conceptual model is encoded in the intermediate-layer parameters of the model, in a different but analogous way to how it's encoded in the graph and chemical structure of your neurons.
> I would argue it is (obviously) impossible the way the current implementation of models work.
> How could a system which produces a single next word based upon a likelihood and a parameter called a "temperature" have a conceptual model underpinning it? Even theoretically?
Any probability distribution over strings can theoretically be factored into a product of such a "probability that the next token is x given that the text so far is y". Now, whether a probability distribution over strings can be efficiently computed in this form is another question. But, if we are being so theoretical that we don't care about the computational cost (as long as it is finite), then "it is next token prediction" can't preclude anything which "it produces a probability distribution over strings" doesn't already preclude.
As for the temperature, given any probability distribution over a discrete set, we can modify it by adding a temperature parameter: just take the log of the probabilities according to the original distribution, scale them all by a factor (the inverse of the temperature), then exponentiate each of these, and then normalize to produce a probability distribution.
So, the fact that they work by next token prediction, and have a temperature parameter, cannot imply any theoretical limitation that wouldn’t apply to any other way of expressing a probability distribution over strings, as far as discussing probability distributions in the abstract, over strings, rather than talking about computational processes that implement such probability distributions over strings.
But also like, going between P(next token is x | initial string so far is y) and P(the string begins with z), isn't that computationally costly? Well, in one direction anyway. Because like, P(next token is x | string so far is y) = P(string begins with yx) / P(string begins with y).
Though, one might object to P(string starts with y) over P(string is y)?
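A toy sketch of both points, for anyone who wants to see them concretely; the three-string distribution here is made up for illustration and is not anything a real LLM produces.

```python
# (1) Any distribution over strings factors into next-token conditionals via
#     P(next = x | prefix = y) = P(starts with y+x) / P(starts with y).
# (2) Temperature is just a reweighting of those conditionals:
#     scale the log-probs by 1/T, re-exponentiate, renormalize.
import math

p_string = {"ab": 0.5, "ac": 0.3, "ba": 0.2}   # tiny distribution over whole strings

def p_prefix(prefix):
    return sum(p for s, p in p_string.items() if s.startswith(prefix))

def next_token_dist(prefix, temperature=1.0):
    tokens = {s[len(prefix)] for s in p_string
              if s.startswith(prefix) and len(s) > len(prefix)}
    cond = {t: p_prefix(prefix + t) / p_prefix(prefix) for t in tokens}
    logits = {t: math.log(p) / temperature for t, p in cond.items()}
    z = sum(math.exp(v) for v in logits.values())
    return {t: math.exp(v) / z for t, v in logits.items()}

print(next_token_dist("a"))                   # {'b': 0.625, 'c': 0.375}
print(next_token_dist("a", temperature=0.5))  # sharper: favors 'b' even more
```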
It's only because you can essentially put the LLMs in a simulation that you can have this argument. We can imagine the human brain also in a simulation which we can replay over and over again, adjusting various parameters of the physical brain to change the temperature. These sorts of arguments can never distinguish between LLMs and humans.
On that point, I would dispute the premise that "it's impossible to have true language skills without implicitly having a representation of self and environment". I don't see any contradiction between the following two ideas:
1. LLMs inherently lack any form of consciousness, subjective experience, emotions, or will
2. A sufficiently advanced LLM with sufficient compute resources would perform on par with human intelligence at any given task, insofar as the task is applicable to LLMs
> How could a system which produces a single next word based upon a likelihood and a parameter called a "temperature" have a conceptual model underpinning it? Even theoretically?
You're limiting your view of their capabilities based on the output format.
> Not so with LLMs!! Generative LLMs do not have a prior concept available before they start emitting text.
How do you establish that? What do you think of othellogpt? That seems to form an internal world model.
> That the "temperature" can chaotically change the output as the tokens proceed
Changing the temperature forcibly makes the model pick words it thinks fit worse. Of course it changes the output. It's like an improv game with someone shouting "CHANGE!".
Let's make two tiny changes.
One, let's tell a model to use the format
<innerthought>askjdhas</innerthought> as the voice in their head, and <speak>blah</speak> for the output.
Second, let's remove temperature and keep it at 0, so we're not playing a game where we force them to choose different words.
Now what remains of the argument?
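For concreteness, here is roughly what those two changes might look like in practice. This uses the OpenAI chat API purely as an example backend; the tag convention, prompt wording, and model choice are made up for illustration.

```python
# Sketch of the two proposed changes: a private "inner thought" channel plus
# temperature-0 decoding, so the model is not forced to swap in worse-fitting
# words. Prompt wording and tag format are hypothetical.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM = (
    "Before answering, think privately inside <innerthought>...</innerthought>. "
    "Then give the user-visible reply inside <speak>...</speak>. "
    "Never reveal the contents of <innerthought>."
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",   # any chat model would do
    temperature=0,         # no sampling game forcing different word choices
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "Is 1009 prime?"},
    ],
)

# Only the <speak> part would be shown to the user; the <innerthought> part
# approximates a "prior concept" formed before the visible answer is emitted.
print(resp.choices[0].message.content)
```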
> On embodiment - yes, LLMs do not have corporeal experience.
My own thought on this (as someone who believes embodiment is essential) is to consider the rebuttals to Searle's Chinese Room thought experiment.
For now (and the foreseeable future) humans are the embodiment of LLMs. In some sense, we could be seen as playing the role of a centralized AI's nervous system.
Rebuttals of Chinese rooms are also rebuttals of embodiment as a requirement! To say the system of person+books speaks Chinese is to say that good enough emulation of a process has all the qualities of the emulated process, and can substitute for it. Embodiment then cannot be essential, because we could emulate it instead.
The crux of the video game analogy seems to be that when you go close to an object, the resolution starts blurring and the illusion gets broken, and there is a similar thing that happens with LLMs (as of today) as well. This is, so far, reasonable based on daily experience with these models.
The extension of that argument being made in the paper is that a model trained on language tokens spewed by humans is incapable of actually reaching that limit where the illusion never breaks down in resolution. That also seems reasonable to me. They use the word "languaging" in verb form, as opposed to "language" as a noun, to express this.
Why are LLMs incapable of reaching that limit? It's very easy to imagine video games getting to that point. We have all the data to see objects right down to the atomic level, which is plenty more than you'd need for a game. It's mostly a matter of compute. Why then should LLMs break down if they can at least mimic the smartest humans? We don't need "resolution" beyond that.
There are many finite problems that absolutely do not admit finite solutions. Full stop.
I think the deeper point of the paper is that you simply cannot generate an intelligent entity by just looking at recorded language. You can create a dictionary, and a map - but one must not mistake this map for the territory.
The human brain is a finite solution, so we already have an existence proof. That means a lot for our confidence in the solvability of this kind of problem.
It is also not universally impossible to reconstruct a function of finite complexity from only samples of its inputs and outputs. It is sometimes possible to draw a map that is an exact replica of the territory.
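A tiny concrete case of that last point: a degree-n polynomial is a finite-complexity function that is pinned down exactly by any n+1 distinct input/output samples, so the "map" recovered from samples really is a replica of the "territory" (up to floating-point error in the sketch below; the coefficients and sample points are made up).

```python
# Reconstructing a finite-complexity function exactly from finitely many
# input/output samples: a degree-3 polynomial is determined by 4 points.
import numpy as np

true_coeffs = np.array([2.0, -3.0, 0.5, 1.0])   # 2x^3 - 3x^2 + 0.5x + 1
xs = np.array([-1.0, 0.0, 1.0, 2.0])            # 4 distinct sample inputs
ys = np.polyval(true_coeffs, xs)                # the only data we keep

recovered = np.polyfit(xs, ys, deg=3)           # exact up to float error
print(np.allclose(recovered, true_coeffs))      # True
```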
> I feel similarly about this as to what I've read of Chalmers - I agree with pretty much all of the conclusions, but I don't feel like the text would convince me of those conclusions if I disagreed;
My limited experience of reading Chalmers is that he doesn't actually present evidence - he goes on a meandering rant and then claims to have proved things that he didn't even cover. It was the most infuriating read of my life; I heavily annotated two chapters and then finally gave up and donated the book.
I haven't read any Chalmers so I can't comment on his writing style. I have seen him in several videos on discussion panels and on podcasts.
One thing I appreciate is he often states his premises, or what modern philosophers seem to call "commitments". I wouldn't go so far as to say he uses air-tight logic to reason from these premises/commitments to conclusions - but at the least his reasoning doesn't seem to stray too far from those commitments.
I think it would be fair to argue that not all of his commitments are backed by physical evidence (and perhaps some of them could be argued to go against some physical evidence). And so you are free to reject his commitments and therefore reject his conclusions.
In fact, I think the value of philosophers like Chalmers is less in their specific commitments and conclusions and more in their framing of questions. It can be useful to list out his commitments and find out where you stand on each of them, and then to do your own reasoning using logic to see what conclusions your own set of commitments forces you into.
>> Again, that conclusion feels wrong to me... but if I'm being honest with myself, I can't point to why, other than to point at some form of dualism or spirituality as the escape hatch.
I like how Chomsky, who doesn't have any spirituality at all, the big degenerate materialist, deals with it:
As far as I can see, all of this [he's speaking about the Loebner Prize and the Turing test in general] is entirely pointless. It's like asking how we can determine empirically whether an aeroplane can fly, the answer being: if it can fool someone into thinking that it's an eagle under some conditions.
https://youtu.be/0hzCOsQJ8Sc?si=MUXpmIwAzcla9lvK&t=2052
(My transcript)
He's right, you know. It should be possible to tell whether something is intelligent just as easily as it is to say that something is flying. If there are endless arguments about it, then it's probably not intelligent (yet). Conversely, if everyone can agree it is intelligent then it probably is.
Because it's not easy to tell whether something is flying. Definitions like that fall apart every time we encounter something out of the ordinary. If you take the criterion of "there's no discussion about it", then you're limiting the definition to that which is familiar, not that which is interesting.
Is an ekranoplan flying? Is an orbiting spaceship flying? Is a hovercraft flying? Is a chicken flapping its wings over a fence flying?
Your criterion would suggest the answer of "no" to any of those cases, even though those cover much of the same use cases as flying, and possibly some new, more interesting ones.
And I don't think an AGI must be limited to the familiar notion of intelligence to be considered an AGI, or, at the very least, to open up avenues that were closed before.
Everyone seems to want to discuss whether there’s some fundamental qualia preventing my toaster from being an AGI, but no one is interested in acknowledging that my toaster isn’t an AGI. Maybe a larger toaster would be an AGI? Or one with more precise toastiness controls? One with more wattage?
The only thing this paper proves is that folks at Trinity College in Dublin are poor, envious, anthropocentric drunkards, ready to throw every argument at defending their crown of creation, without actually understanding the linguistics concepts they use to make their argument.
Not much new here. The basic criticism is that LLMs are not embodied; they have no interaction with the real world. The same criticism can be applied to most office work.
Useful insight: "We (humans) are always doing more than one thing." This is in the sense of language output having goals for the speaker, not just delivering information. This is related to the problem of LLMs losing the thread of a conversation. Probably the only reasonably new concept in this paper.
Standard rant: "Humans are not brains that exist in a vat..."
"LLMs ... have nothing at stake." Arguable, in that some LLMs are trained using punishment. Which seems to have strong side effects. The undesirable behavior is suppressed, but so is much other behavior. That's rather human-like.
"LLMs Don’t Algospeak". The author means using word choices to get past dumb censorship algorithms. That's probably do-able, if anybody cares.
The optimization process adjusts the weights of a computational graph until the numeric outputs align with some baseline statistics of a large data set. There is no "punishment" or "reward"; gradient descent isn't even necessary, as there are methods for modifying the weights in other ways, and the optimization still converges to a desired distribution, which people claim is "intelligent".
The converse is that people are "just" statistical distributions of the signals produced by them but I don't know if there are people who claim they are nothing more than statistical distributions.
I think people are confused because they do not really understand how software and computers work. I'd say they should learn some computability theory to gain some clarity but I doubt they'd listen.
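A toy illustration of the earlier point that gradient descent isn't strictly necessary: a deliberately silly random-search optimizer on a three-number "model" still converges to the target statistics. Everything here is a made-up minimal example, not how anyone actually trains an LLM.

```python
# Nudge weights by something other than gradient descent and still converge:
# a single logit vector is fit to a target distribution by random search.
import numpy as np

rng = np.random.default_rng(0)
target = np.array([0.7, 0.2, 0.1])          # "baseline statistics" to match

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss(logits):
    return np.sum((softmax(logits) - target) ** 2)

logits = np.zeros(3)
for _ in range(5000):
    candidate = logits + rng.normal(scale=0.05, size=3)  # random perturbation
    if loss(candidate) < loss(logits):                   # keep it if it helps
        logits = candidate

print(softmax(logits).round(3))  # ~ [0.7, 0.2, 0.1]: converged without gradients
```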
If you really want to phrase it that way, organisms like us are "just" distributions of genes that have been pushed this way and that by natural selection until they converged to something we consider intelligent (humans).
It's pretty clear that these optimisation processes lead to emergent behaviour, both in ML and in the natural sciences. Computability theory isn't really relevant here.
Good summary of some of the main "theoretical" criticism of LLMs but I feel that it's a bit dated and ignores the recent trend of iterative post-training, especially with human feedback. Major chatbots are no doubt being iteratively refined on the feedback from users i.e. interaction feedback, RLHF, RLAIF. So ChatGPT could fall within the sort of "enactive" perspective on language and definitely goes beyond the issues of static datasets and data completeness.
Sidenote: the authors make a mistake when citing Wittgenstein to find similarity between humans and LLMs. Language modelling on a static dataset is mostly not a language game (see Bender and Koller's section on distributional semantics and caveats on learning meaning from "control codes")
IIRC DPO doesn't have human feedback in the loop.
It does; that's what the "direct preference" part of DPO means: you just avoid training an explicit reward model on it like in RLHF and instead directly optimize for the log probability of preferred vs. dispreferred responses.
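For reference, the DPO objective is small enough to sketch in a few lines. The log-probabilities below are placeholder numbers; in practice they are summed token log-probs from the policy being trained and from a frozen reference model.

```python
# Minimal sketch of the DPO loss: no reward model, just log-probabilities of
# the preferred (chosen) vs. dispreferred (rejected) response under the policy
# being trained and under a frozen reference model.
import numpy as np

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # How much more the policy prefers "chosen" over "rejected",
    # measured relative to the reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    return -np.log(1.0 / (1.0 + np.exp(-margin)))   # -log(sigmoid(margin))

# Placeholder log-probs for one preference pair:
print(dpo_loss(policy_chosen_logp=-45.0, policy_rejected_logp=-52.0,
               ref_chosen_logp=-48.0, ref_rejected_logp=-50.0))
```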
The authors of this paper are just another instance of the AI hype being used by people who have no connection to it, to attract some kind of attention.
"Here is what we think about this current hot topic; please read our stuff and cite generously ..."
> Language completeness assumes that a distinct and complete thing such as `a natural language' exists, the essential characteristics of which can be effectively and comprehensively modelled by an LLM
Replace "LLM" by "linguistics". Same thing.
> The assumption of data completeness relies on the belief that a language can be quantified and wholly captured by data.
That's all that a baby has, who becomes a native speaker of their surrounding language. Language acquisition does not imply totality of data. Not every native speaker recognizes exactly the same vocabulary and exactly the same set of grammar rules.
Babies have feedback and interaction with someone speaking to them. Would they learn to speak if you just dumped them in front of a TV and never spoke to them? I'm not sure.
But anyway I agree with you. This is just a confused HN comment in paper form.
I personally don’t get much value out of the paper, but it is orders of magnitude more substantive and thoughtful than a median “confused Hacker News comment”.
> Babies have feedback and interaction with someone speaking to them. Would they learn to speak if you just dumped them in front of a TV and never spoke to them? I'm not sure.
Feedback and interaction are not vital for acquisition, at least for second-language learning, according to one theory.
And if that's good enough for adults, it might be good enough for sponge-brain babies.
https://en.wikipedia.org/wiki/Input_hypothesis
They are two researchers/assistant professors working with cognitive science, psychology, and trustworthy AI. The paper is peer reviewed and has been accepted for publication in the Journal of Language Sciences.
You should publish your critique of their research in that same journal.
P.s. if you find any grave mistakes, you can contact the editor in chief, who happens to be a linguist.
Their critique is written here, in plain English. Any fault with it you can just mention. The "I won't read your comment unless you get X journal to publish it" stance seems really counterproductive. Presumably even the great Journal of Language Sciences is not above making mistakes or publishing things that are not perfect.
No thanks; that would be at least twice removed from Making Stuff.
(Once removed is writing about Making Stuff.)
The "efficient journal hypothesis" -- if something is written in a paper in a journal, then it's impossible for anyone to know any better, since if they knew better, they would already have published the correction in a journal.
The parent comment I responded to is speculative and does not argue on the merits. We can do better here.
Are there people who ride the hype wave of AI? Sure.
But how can you tell from where you sit? How do you come to such a judgment? Are you being thoughtful and rational?
Have you considered an alternative explanation? I think the odds are much greater that the authors’ academic roots/training is at odds with what you think is productive. (This is what I think, BTW. I found the paper to be a waste of my time. Perhaps others can get value from it?)
But I don’t pretend to know the authors’ motivations, nor will I cast aspersions on them.
When one casts shade on a person like the comment above did, one invites and deserves this level of criticism.
That's a lot of thinking they've done about LLMs, but how much did they actually try LLMs? I have long threads where ChatGPT refines solutions to coding problems. Their example of losing the thread after printing a tiny list of 10 philosophers seems really outdated. Also, it seems LLMs utilize nested contexts as well, for example when they can break their own rules while telling a story or speaking hypothetically.
For a paper submitted on July 11, 2024, and with several references to other 2024 publications, it is indeed strange that it gives ChatGPT output from April 2023 to demonstrate that “LLMs lose the thread of a conversation with inhuman ease, as outputs are generated in response to prompts rather than a consistent, shared dialogue” (Figure 1). I have had many consistent, shared dialogues with recent versions of ChatGPT and Claude without any loss of conversation thread even after many back-and-forths.
Most LLM critics (and singularity-is-near influencers) don't actually use the systems enough to have relevant opinions about them. The only really good sources of truth are the chatbot-arena from lmsys and the comment section of r/localllama (I'm quoting Karpathy); both are "wisdom of the crowd", and often the crowd on r/localllama is getting that wisdom by spending hours with one hand on the keyboard and another under their clothes.
There is a lot of frustration here over what appears to be essentially this claim:
> ...we argue that it is possible to offer generous interpretations of some aspects of LLM engineering to find parallels with human language learning. However, in the majority of key aspects of language learning and use, most specifically in the various kinds of linguistic agency exhibited by human beings, these small apparent comparisons do little to balance what are much more deep-rooted contrasts.
Now, why is this so hard to stomach? This is the argument of this paper. To feel like this extremely general claim is something you have to argue against means you believe in a fundamental similarity between our linguistic agency and the model's. But is embodied human agency something that you really need the LLMs to have right now? Why? What are the stakes here? The ones actually related to the argument at hand?
This is ultimately not that strong a claim! To the point that it's almost vacuous... Of course the LLM will never learn that the stove is "hot" the way you did when you were a curious child. How can this still be too much to admit for someone? What is lost?
It makes me feel a little crazy here that people constantly jump over the text at hand whenever something gets a little too philosophical, and the arguments become long pseudo-theories that aren't relevant to the argument.
https://en.m.wikipedia.org/wiki/Enactivism
“Enactivism”, really? I wonder if these complaints will continue as LLMs see wider adoption - the old "first they ignore you, then they ridicule you, then they fight you…" trope that is halfway accurate. Any field that focuses on building theories on top of theories is in for a bad time.
Where I work, there's a somewhat haphazardly divided org structure, where my team has some responsibility to answer the executives' demands to "use AI to help our core business". So we applied off-the-shelf models to extract structured context from mostly unstructured text - effectively a data engineering job - and thereby support analytics and create more dashboards for the execs to mull over.
Another team, with a similar role in a different part of the org, has jumped (feet first) into optimizing large language models to turn them into agents, without consulting the business about whether they need such things. RAG, LoRA, and all this optimization is well and good, but this engineering focus has found no actual application, except wasting several million bucks hiring staff to do something nobody wants.