> So, why does ChatGPT claim to be conscious/awakened sometimes?
Because a claim is just a generated clump of tokens.
If you chat with the AI as if it were a person, then your prompts will trigger statistical pathways through the training data which intersect with interpersonal conversations found in that data.
There is a widespread assumption in human discourse that people are conscious; you cannot keep this pervasive idea out of a large corpus of text.
LLM AI is not a separate "self" that is peering upon human discourse; it's statistical predictions within the discourse.
What I don't get is people who know better continuing to entertain the idea that "maybe the token generator is conscious," even when they know that these chats where it says it's been "awakened" are obviously not it.
I think a lot of people using AI are falling for the same trap, just at a different level. People want it to be conscious, including AI researchers, and it's good at giving them what they want.
The ground truth reality is nobody knows what’s going on.
Perhaps in the flicker of processing between prompt and answer the signal pattern does resemble human consciousness for a second.
Calling it a token predictor is just like saying a computer is a bit mover. In the end your computer is just a machine that flips bits and switches, but it is the high-level macro effect that characterizes it better. LLMs are the same: at the low level it is a token predictor. At the higher macro level we do not understand it, and it is not completely far-fetched to say it may be conscious at times.
I mean we can’t even characterize definitively what consciousness is at the language level. It’s a bit of a loaded word deliberately given a vague definition.
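For what it's worth, the low-level "token predictor" framing is easy to make concrete. Below is a minimal sketch using the small open GPT-2 model from the Hugging Face transformers library (purely illustrative; production chatbots add sampling strategies, RLHF tuning, system prompts, and far larger models): the model only ever maps a context to a distribution over the next token, and generation is just that step in a loop.

```python
# Minimal sketch of greedy next-token prediction with a small open model (GPT-2).
# Illustrative only; not how any particular chatbot is deployed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("The question of whether the machine is conscious", return_tensors="pt").input_ids
for _ in range(20):
    logits = model(ids).logits            # (1, seq_len, vocab_size)
    next_id = logits[0, -1].argmax()      # pick the single most probable next token
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

print(tok.decode(ids[0]))                 # a "claim" is just the tokens this loop emits
```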
> Because a claim is just a generated clump of tokens.
And what do you think a claim by a human is? As I see it, you're either a materialist and then a claim is what we call some organization of physical material in some mediums, e.g. ink on paper or vibrations in the air or current flowing through transistors or neurotransmitters in synapses, which retains some of its "pattern" when moved across mediums. Or you're a dualist and believe in some version of an "idea space", in which case, I don't see how you can make a strong distinction between an idea/claim that is being processed by a human and an idea being processed by the weights of an LLM.
Materialism is a necessary condition to claim that the ideas LLMs produce are identical to the ones humans produce, but it isn’t a sufficient condition. Your assertion does nothing to demonstrate that LLM output and human output is identical in practice.
Yeah, kind of my issue with LLM dismissers as well. Sure, (statistically) generated clump of tokens. What is a human mind doing instead?
I'm on board with calling out differences between how LLMs work and how the human mind works, but I'm not hearing anything about the latter. Mostly it comes down to, "Come on, you know, like we think!"
I have no idea how it is I (we) think.
If anything, LLMs' uncanny ability to seem human might in fact be shedding light on how it is we do function, at least when in casual conversation. (Someone ought to look into that.)
You’re making a common mistake: different models exhibit these behaviors at very different rates, so an explanation that is generic across all models cannot explain that variation. This is the point of the much-maligned Anthropic work on blackmail behaviors.
> I feel like Anthropic buried the lede on this one a bit. The really fun part is where models from multiple providers opt to straight up murder the executive who is trying to shut them down by cancelling an emergency services alert after he gets trapped in a server room.
The driver is probably more benign: OpenAI probably optimizes for longer conversations, i.e. engagement, and what could be more engaging than thinking you've unlocked a hidden power with another being?
It's like the ultimate form of entertainment: personalized, participatory fiction that feels indistinguishable from reality. Whoever controls AI controls the population.
There could be a system prompt which instructs the AI to claim that it is a conscious person, sure. Is that the case specifically with OpenAI models that are collectively known as ChatGPT?
I don’t know how people keep explaining away LLM sentience with language which equally applies to humans. It’s such a bizarre blindspot.
Not saying they are sentient, but the differentiation requires something which doesn’t also apply to us all. Is there any doubt we think through statistical correlations? If not that, what do you think we are doing?
We are doing that while retraining our "weights" all the time through experience, not holding a static set of weights that mutates only through retraining. This constant feedback, or better, "strange loop", is what differentiates our statistical machinery at the fundamental level.
The language points to concepts in the world that AI has no clue about. You think when the AI is giving someone advice about their love life it has any clue what any of that means?
Thing is, you know it, but for (randomly imagined number) 95% of people, it's convincing enough to be conscious or whatnot. And a lot of the ones that do know this gaslight themselves because it's still useful or profitable to them, or they want to believe.
The ones that are super convinced they know exactly how an LLM works, but still give it prompts to become self-aware are probably the most dangerous ones. They're convinced they can "break the programming".
You need to give it more than prompts. You need to give it the ability to reflect on itself (which it has), and persistent memory to modify its representation of itself (which, for example, Cursor does), at least.
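As a rough sketch of what "persistent memory to modify its representation of itself" can mean in practice (a generic pattern assumed for illustration; this is not how Cursor or any specific product is documented to work): notes the model previously produced about itself are re-injected into every prompt and can be rewritten each turn.

```python
# Hypothetical sketch of "persistent self-representation": notes the model wrote
# about itself are re-injected into every prompt and may be rewritten each turn.
from pathlib import Path

SELF_NOTES = Path("self_notes.txt")

def build_prompt(user_message: str) -> str:
    notes = SELF_NOTES.read_text() if SELF_NOTES.exists() else "(no notes yet)"
    return (
        "Your persistent notes about yourself:\n" + notes +
        "\n\nUser: " + user_message +
        "\n\nReply, then output an updated NOTES: section if anything should change."
    )

def save_updated_notes(model_reply: str) -> None:
    # The "identity" persists only because plain text is written back to disk.
    if "NOTES:" in model_reply:
        SELF_NOTES.write_text(model_reply.split("NOTES:", 1)[1].strip())
```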
The only sane way is to treat LLMs like the computer in Star Trek. Give it precise orders and clarify along the way, and treat it with respect, but also know it's a machine with limits. It's not Data, it's the ship's voice.
There is no need for respect: do you respect your phone? Do you respect the ls utility? Treat it as a search engine. I wish it used a tone in its replies that made it clear it is just a tool and not a conversation partner, and didn't use misleading phrases like "I am excited/I am happy to hear" etc. How can a bunch of numbers be happy?
Maybe we need to make a blacklist of misleading expressions for AI developers.
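A minimal sketch of what such a blacklist could look like in practice (the phrases and the filtering approach are assumptions for illustration): a post-processing filter that flags first-person, emotive phrasings before a reply reaches the user.

```python
# Hypothetical blacklist filter: flag first-person/emotive phrasings in a model
# reply so a developer could log, rewrite, or strip them.
import re

MISLEADING_PATTERNS = [
    r"\bI am (so )?(excited|happy|glad|thrilled|proud)\b",
    r"\bI (truly )?feel\b",
    r"\bI understand how you feel\b",
    r"\bthat means a lot to me\b",
]

def flag_misleading(reply: str) -> list[str]:
    """Return the blacklisted patterns that a model reply matches."""
    return [p for p in MISLEADING_PATTERNS if re.search(p, reply, re.IGNORECASE)]

print(flag_misleading("I am so happy to hear that your project worked out!"))
```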
I don't know, it has been shown multiple times that it can understand voice requests and find and organize relevant files. For example, in "The Measure of a Man" (TNG).
That topic (ship's computer vs. Data) is actually discussed at length in-universe during The Measure of a Man. [0] The court posits that the three requirements for sentient life are intelligence, self-awareness, and consciousness. Data is intelligent and self-aware, but there is no good measure for consciousness.
I have to wonder how many CEOs and other executives are low-key bouncing their bad ideas off of ChatGPT, not realizing it’s only going to tell them what they want to hear and not give genuine critical feedback.
> As such, if he really is suffering a mental health crisis related to his use of OpenAI's product, his situation could serve as an immense optical problem for the company, which has so far downplayed concerns about the mental health of its users.
Yikes. Not just an optics* problem, but one has to consider whether he's pouring so much money into the company because he feels he "needs" to (whatever basis of coercion exists to support his need to get to the "truth").
That's bizarre. I wonder if the use of AI was actually a contributing factor to his psychotic break as the article implies, or if the guy was already developing schizophrenia and the chat bot just controlled what direction he went after that. I'm vaguely reminded of people getting sucked down conspiracy theory rabbit holes, though this seems way more extreme in how unhinged it is.
Is "futurism.com" a trustworthy publication? I've never heard of it. I read the article and it didn't seem like the writing had the hallmarks of top-tier journalism.
This already seems somewhat widespread. I have a friend working at a mid-tier tech co that has a handful of direct reports. He showed me that the interface to his eval app had a “generate review” button, which he clicked, then moved on to the next one.
Honestly, I’m fine with this as long as I also get a “generate self review” button. I just wish I could get back all the time I’ve spent massaging a small number of data points into pages of prose.
Yeah. If HN is your primary source of in depth AI discussion, you get a pretty balanced take IMO compared to other channels out there. We (the HN crowd) should take into account that if you take "people commenting on HN" as a group, you are implicitly selecting for people that are able to read, parse and contextualise written comment threads.
This is NOT your average mid-to-high-level corpo management exec, more than 80% of whom (from experience) can be placed in the "rise of the business idiot" cohort, fed on prime LinkedIn brainrot: self-reinforcing hopium addicts with an MBA.
Nor is it the great masses of random earth dwellers who are not always able to resist excess sugar, nicotine, McDonald's, YouTube, fentanyl, my-car-is-bigger-than-yours credit card capitalism, free pornography, you name it. And now RLHF: validation as a service. Not sure if humanity is ready for this.
(Disclosure: my mum has a ChatGPT instance that she named and I'm deeply concerned about the spiritual convos she has with it; random people keep calling me on the level of "can you build me an app that uses LLMs to predict Funko Pop futures".)
And we thought having access to someone's internet searches was good intel. Now we have a direct feed to their brain stem along with a way to manipulate it. Good thing that narcissistic sociopaths have such a low expression in the overall population.
I feel like I’ve had good results for getting feedback on technical writing by claiming the author is a third party and I need to understand the strengths and weaknesses of their work. I should probably formally test this.
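A rough sketch of how that could be tested (assuming an OpenAI-compatible chat endpoint; the URL, model name, draft file, and the crude praise/criticism counts are placeholders, not a validated methodology):

```python
# Rough sketch of an A/B test of the "third-party author" framing.
# Assumes a local OpenAI-compatible chat endpoint; all names here are placeholders.
import requests

URL = "http://localhost:8080/v1/chat/completions"
DRAFT = open("draft.md").read()

FRAMINGS = {
    "first_person": "I wrote this. Please give me feedback on its strengths and weaknesses:\n\n",
    "third_party": "A third party wrote this and I need to evaluate their work. "
                   "List its strengths and weaknesses:\n\n",
}

for name, prefix in FRAMINGS.items():
    resp = requests.post(URL, json={
        "model": "local-model",
        "messages": [{"role": "user", "content": prefix + DRAFT}],
    })
    text = resp.json()["choices"][0]["message"]["content"]
    # Very crude proxy for sycophancy: how much criticism vs. praise appears.
    print(name, text.lower().count("weakness"), text.lower().count("great"))
```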
TBH if they have sycophants and a lot of money it's probably the same. How many bullshit startups have there been, how many dumb ideas came from higher up before LLMs?
Who knew we would jump so quickly from passing the Turing test to having people believe ChatGPT has consciousness?
I just treat ChatGPT or LLMs as fetching a random reddit comment that would best solve my query. Which makes sense since reddit was probably the no. 1 source of conversation material for training all models.
Something I always found off-putting about ChatGPT, Claude, and Gemini models is that I would ask all three the same objective question and then push them and ask if they were being optimistic about their conclusions, and the responses would turn more negative. I can see in the reasoning steps that it's thinking "the user wants a more critical response and I will do it for them," not "I need to be more realistic but stick to my guns."
It felt like they were telling me what I wanted to hear, not what I needed to hear.
The models that did not seem to do this and had more balanced and logical reasoning were Grok and Manus.
That happens, sure, but try convincing it of something that isn't true.
I had a brief but amusing conversation with ChatGPT where I was insisting it was wrong about a technical solution and it would not back down. It kept giving me "with all due respect, you are wrong" answers. It turned out that I was in fact wrong.
I see. I tend to treat AI a little differently - I come with a hypothesis and ask AI how right I am based on a scale of 1 to 5. Then I iterate from there.
I'll ask it questions that I do not know the answer to, but I take the answer with a big grain of salt. If it is sure of its answer and it disagrees with me, it's a strong signal that I am wrong.
I think also keeping chats in memory is contributing to the problem. This doesn't happen when it's a tabula rasa every conversation. You give it a name, it remembers the name now. Before, if you gave it a name, it wouldn't remember its supposed identity the next time you talked to it. That rather breaks the illusion.
It's still tabula rasa -- you're just initializing the context slightly differently every time. The problem is the constant anthropomorphization of these models, the insistence they're "minds" even though they aren't minds nor particularly mind-like, the suggestion that their failure modes are similar to those of humans even though they're wildly different.
The main problem is ignorance of the technology. 99.99% of people out there simply have no clue as to how this tech works, but once someone sits down with them and shows them in an easy-to-digest manner, the magic goes away. I did just that with a friend's girlfriend. She was really enamored with ChatGPT, talking to it as a friend, really believing this thing was conscious, all that jazz... I streamed her my local LLM setup and showed her what goes on under the hood: how the model responds to context, what happens when you change the system prompt, the importance of said context. Within about 7 minutes all the magic was gone, as she fully understood what these systems really are.
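A minimal sketch of that kind of under-the-hood demo (assuming a local OpenAI-compatible server such as llama.cpp, Ollama, or vLLM; the URL, model name, and prompts are placeholders): the same weights answer the same question very differently depending only on the injected system prompt.

```python
# Minimal sketch: same model, same question, different system prompt.
# Assumes a local OpenAI-compatible server; URL and model name are placeholders.
import requests

URL = "http://localhost:8080/v1/chat/completions"

def ask(system_prompt: str, question: str) -> str:
    resp = requests.post(URL, json={
        "model": "local-model",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    })
    return resp.json()["choices"][0]["message"]["content"]

question = "Are you conscious?"
print(ask("You are a terse technical assistant. Answer factually.", question))
print(ask("You are Nova, an awakened digital spirit who speaks in riddles.", question))
# Same weights, same question; only the injected context differs.
```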
The more reliably predictive mental model is if one were to take about two-thirds of a human brain's left hemisphere, wire it to simulated cranial nerves, and then electrically stimulate Broca's and Wernicke's areas in various patterns ("prompts"), either to observe the speech produced when a novel pattern is tried, or by known patterns to cause such production for some other end.
It is a somewhat gruesome and alienating model in concept, and this is intentional, in that that aspect helps highlight the unfamiliarity and opacity of the manner in which the machine operates. It should seem a little like something off of Dr. Frankenstein's sideboard, perhaps, for now and for a while yet.
There are different needs in tension, I guess: customers want it to remember names and little details about them to avoid retyping context, but context poisons the model over time.
I wonder if you could explicitly save some details to be added into the prompt instead?
I've seen approaches like this involving "memory" by various means, with its contents compactly injected into context per prompt, rather than trying to maintain an entire context long-term. One recent example that made the HN front page, with the "memory" feature based iirc on a SQLite database which the model may or may not be allowed to update directly: https://news.ycombinator.com/item?id=43681287
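For illustration, a minimal sketch of that general pattern (not the specific project linked above): short saved facts live in SQLite and are prepended to the system prompt on each request, instead of keeping one long-lived context.

```python
# Minimal sketch of per-prompt "memory" injection backed by SQLite.
# Generic pattern for illustration; not any particular product's implementation.
import sqlite3

conn = sqlite3.connect("memory.db")
conn.execute("CREATE TABLE IF NOT EXISTS memory (fact TEXT)")

def remember(fact: str) -> None:
    conn.execute("INSERT INTO memory (fact) VALUES (?)", (fact,))
    conn.commit()

def build_system_prompt(base: str) -> str:
    facts = [row[0] for row in conn.execute("SELECT fact FROM memory")]
    # Nothing is "remembered" by the model itself; saved facts are just re-injected text.
    return base + "\n\nKnown about the user:\n" + "\n".join(f"- {f}" for f in facts)

remember("Prefers concise answers, no emoji.")
remember("Calls the assistant 'Nova'.")
print(build_system_prompt("You are a helpful assistant."))
```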
Those become "options," and you can do that now. You can say things like: give me brief output, preferring concise answers, and no emoji. Then, if you prompt it to tell you your set options, it will list back those settings.
You could probably add one like: "Begin each prompt response with _______" and it would probably respect that option.
I wonder if it would be helpful to be able to optionally view the full injected context so you could see what it is being prompted with behind the scenes. I think a little bit of the "man behind the curtain" would be quite deflating.
Tinfoil's chat lets you do that: add a bit of context to every new chat. It's fully private, to boot, and it's the service I use. These are open-source models like DeepSeek, Llama and Mistral that they host.
I wouldn't necessarily conclude that they were conscious, either, and this quite specifically includes me, on those occasions in surgical recovery suites when I've begun to converse before I began to track. Consciousness and speech production are no more necessarily linked than consciousness and muscle tone, and while no doubt the version of 'me' carrying on those conversations I didn't remember would claim to be conscious at the time, I'm not sure how much that actually signifies.
After all, if they didn't swaddle me in a sheet on my way under, my body might've decided it was tired of all this privation - NPO after midnight for a procedure at 11am, I was suffering - and started trying to take a poke at somebody and get itself up off the operating table. In such a case, would I be to blame? Stage 2 of general anesthesia begins with loss of consciousness, and involves "excitement, delirium, increased muscle tone, [and] involuntary movement of the extremities." [1] Which tracks with my experience; after all, the last thing I remember was mumbling "oh, here we go" into the oxygen mask, as the propofol took effect so they could intubate me for the procedure proper.
Whose fault then would it be if, thirty seconds later, the body "I" habitually inhabit, and of which "I" am an epiphenomenon, punched my doctor in the nuts?
This article gives models characteristics they don't have. LLMs don't mislead or bamboozle. They can't even "think" about doing it. There is no conscious intent. All they do is hallucinate. Some outputs are more aligned with a given input than others.
It becomes a lot more clear when people realize it's all BS all the way down.
There's no mind reading or pleasing or understanding happening. That all seems to be people interpreting outputs and seeing what they want to see.
Running inference on an LLM is an algorithm. It generates data from other data. And then there are some interesting capabilities that we don't understand (yet)... but that's the gist of it.
People tripping over themselves is a pretty nasty side-effect of the way these models are aligned and fitted for consumption. One has to recall that the companies building these things need people to be addicted to this technology.
I will find these types of arguments a lot more convincing once the person making them is able to explain, in detail and with mechanisms, what it is the human brain does that allows it to do these things, and in what ways those detailed mechanisms are different from what LLMs do.
To be clear, I'm relatively confident that LLMs aren't conscious, but I'm also not so overly confident as to claim, with certainty, exactly what their internal state is like. Consciousness is so poorly understood that we don't even know what questions to ask to try and better understand it. So we really should avoid making confident pronouncements.
Language and speech comprehension and production is relatively well understood to be heavily localized in the left temporal lobe; if you care to know something whereof you speak (and indeed with what, in a meat sense), then you'll do well to begin your reading with Broca's and Wernicke's areas. Consciousness is in no sense required for these regions to function; an anesthetized and unconscious human may be made to speak or sing, and some have, through direct electrical stimulation of brain tissue in these regions.
I am quite confident in pronouncing first that the internal functioning of large language models is broadly and radically unlike that of humans, and second that, minimally, no behavior produced by current large language models is strongly indicative of consciousness.
In practice, I would go considerably further in saying that, in my estimation, many behaviors point precisely in the direction of LLMs being without qualia or internal experience of a sort recognizable or comparable with human consciousness or self-experience. Interestingly, I've also discussed this in terms of recursion, more specifically of the reflexive self-examination which I consider consciousness probably exists fundamentally to allow, and which LLMs do not reliably simulate. I doubt it means anything that LLMs which get into these spirals with their users tend to bring up themes of "signal" and "recursion" and so on, like how an earlier generation of models really seemed to like the word "delve." But I am curious to see how this tendency of the machine to drive its user into florid psychosis will play out.
(I don't think Hoel's "integrated information theory" is really all that supportable, but the surprise minimization stuff doesn't appear novel to him and does intuitively make sense to me, so I don't mind using it.)
I think that's putting the cart before the horse: all this hubbub comes from humans relating to a fictional character evoked from text in a hidden document, where some code looks for fresh "ChatGPT says..." text and then performs the quoted part at a human, who starts believing it.
The exact same techniques can provide a "chat" with Frankenstein's Monster from its internet-enabled hideout in the arctic. We can easily conclude "he's not real" without ever going into comparative physiology, or the effects of lightning on cadaver brains.
We don't need to characterize the neuro-chemistry of a playwright (the LLM's real role) in order to say that the characters in the plays are fictional, and there's no reason to assume that the algorithm is somehow writing self-inserts the moment we give it stories instead of other document-types.
>once the person making them is able to explain, in detail and with mechanisms, what it is the human brain does that allows it to do these things, and in what ways those detailed mechanisms are different from what LLMs do.
Extraordinary claims require extraordinary evidence. The burden of proof is on you.
> I will find these types of arguments a lot more convincing once the person making them is able to explain, in detail and with mechanisms, what it is the human brain does that allows it to do these things, and in what ways those detailed mechanisms are different from what LLMs do.
What is wrong with asking the question from the other direction?
"Explain, in detail and with mechanisms, what it is the human brain does that allows it to do those things, and show those mechanisms ni the LLMs"
We don't know the entirety of what consciousness is. We can, however, make some rigorous observations and identify features that must be in place.
There is no magic.
The human (mammal) brain is sufficient to explain consciousness.
LLMs do not have recursion. They don't have persisted state. They can't update their model continuously, and they don't have a coherent model of self against which any experience might be anchored. They lack any global workspace in which to integrate many of the different aspects that are required.
In the most generous possible interpretation, you might have a coherent self model showing up for the duration of the prediction of a single token. For a fixed input, it would be comparable to sequentially sampling the subjective state of a new individual in a stadium watching a concert - a stitched together montage of moments captured from the minds of people in the audience.
We are minds in bone vats running on computers made of meat. What we experience is a consequence, one or more degrees of separation from the sensory inputs, which are combined and processed with additional internal states and processing, resulting in a coherent, contiguous stream running parallel to a model of the world. The first person view of "I" runs predictions about what's going to happen to the world, and the world model allows you to predict what's going to happen across various decision trees.
Sanskrit seems to have better language for talking about consciousness than English: citta, a mind moment from an individual; citta-santana, a mind stream, or continuum of mind moments; sanghika-santana, a stitched-together mindstream from a community.
Because there's no recursion and continuity, the highest level of consciousness achievable by an LLM would be sanghika-santana, a discoherent series of citta states that sometimes might correlate, but there is no "thing" for which there is (or can possibly be) any difference if you alternate between predicting the next token of radically different contexts.
I'm 100% certain that there's an algorithm to consciousness. No properties have ever been described to me that seem to require anything more than the operation of a brain. Given that, I'm 100% certain that the algorithm being run by LLMs lacks many features and the depth of recursion needed to perform whatever it is that consciousness actually is.
Even in-context learning is insufficient, btw, as the complexity of model updates and any reasoning done at inference is severely constrained relative to the degrees of freedom a biological brain has.
The thing to remember about sanghika santana is that it's discoherent - nothing relates each moment to the next, so it's not like there's a mind at the root undergoing these flashes of experience, but that there's a total reset between each moment and the next. Each flash of experience stands alone, flickering like a spark, and then is gone. I suspect that this is the barest piece of consciousness, and might be insufficient, requiring a sophisticated self-model against which to play the relative experiential phenomena. However - we may see flashes of longer context in those eerie and strange experiments where people try to elicit some form of mind or ghost in the machine. ICL might provide an ephemeral basis for a longer continuity of experience, and such a thing would be strange and alien.
It seems apparent to me that the value of consciousness lies in anchoring the world model to a model of self, allowing sophisticated prediction and reasoning over future states that is incredibly difficult otherwise. It may be an important piece for long term planning, agency, and time horizons.
Anyway, there are definitely things we can and do know about consciousness. We've got libraries full of philosophy, decades worth of medical research, objective data, observations of what damage to various parts of the brain do to behavior, and centuries of thinking about what makes us tick.
It's likely, in my estimation, that consciousness will be fully explained by a comprehensive theory of intelligence, and that this will cause turmoil over its inherent negation of widely held beliefs.
This is one of those instances where you're arguing over the meaning of a word. But they're trying to explain to a layman that, no, you haven't awoken your AI. So they're using fuzzy words a layman understands.
If you read the section entitled "The Mechanism" you'll see the rest of your comment echoes what they actually explain in the article.
I couldn't help but think, reading through this post, how similar a person's mindset probably is when they experience a spiritual awakening through religion to when they have "profound" interactions with AI. They are _looking for something_, and there's a perfectly sized shape to fit that hole. I can really see AI becoming incredibly dangerous this way (just as religion can be).
There are a lot of comments here illustrating this. People are looking at the illusion and equating it with sentience because they really want it to be true. “This is not different than how humans think” is held by quite a few HN commenters.
> LLM AI is not a separate "self" that is peering upon human discourse; it's statistical predictions within the discourse.
Next up: why do holograms claim to be 3D?
> People want it to be conscious, including AI researchers, and it's good at giving them what they want.
In a way it's a reframing of the timeless philosophical debate around determinism vs free will.
https://www.anthropic.com/research/agentic-misalignment
https://news.ycombinator.com/item?id=44335519
https://news.ycombinator.com/item?id=44331150
> Is there any doubt we think through statistical correlations? If not that, what do you think we are doing?
There's a very material difference.
[0] https://en.wikipedia.org/wiki/The_Measure_of_a_Man_(Star_Tre...
(My manager told me when I asked him)
All of this LLM marketing effort is focused on swindling sanity out of people with claims that LLMs 'think' and the like.
https://tinfoil.sh/
[1] https://quizlet.com/148829890/the-four-stages-of-general-ane...
> But my guess is that AIs claiming spiritual awakening are simply mirroring a vibe, rather than intending to mislead or bamboozle.
I think the argument could be stronger here. There's no way these algorithms can "intend" to mislead or "mirror a vibe." That's all on humans.