Readit News
montroser · 2 years ago
This explanation feels unsatisfying. It's so high-level that it's mostly void of any actual information.

What was the wrong assumption that the code made that caused this wrong behavior? Why was it not caught in the many layers of automated testing before it made its way to production? What process and procedural changes are being implemented to reduce the risk of this class of bug happening again?

Presumably all of that is playing out internally, but if the public postmortem is meant to instill confidence, you have to actually share some of the details, or else it becomes meaningless.

__MatrixMan__ · 2 years ago
I think your questions all grew up in a world where the people operating the thing knew some rationalist who could think deductively about its operation.

But neural networks... they're an exercise in empiricism. We only ever understood that it works, never why. It's sort of a miracle that it doesn't produce buggy output all the time.

What do you tell people when they want to know why the miracles have stopped? Root cause: the gods are angry.

FridgeSeal · 2 years ago
I’d genuinely expect the people who built and operate the thing to have a far better write-up than what amounted to “it no worked lol”. Sure, NNs are opaque, but this org paints itself as the herald and shepherd of AI, and they just produced a write-up that’s hardly worthy of a primary-school child’s account of their recent holiday.
stravant · 2 years ago
There are surely reasonable ways to smoke test changes to the extent that they would catch the issue that came up here.

E.g.: Have a gauntlet of 20 moderate complexity questions with machine checkable characteristics in the answer. A couple may fail incidentally now and then but if more than N/20 fail you know something's probably gone wrong.
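A minimal sketch of that gauntlet, with `ask_model` as a hypothetical stand-in for a real inference call (the prompts and checks here are made up for illustration):

```python
def run_gauntlet(ask_model, cases, max_failures=1):
    """cases: list of (prompt, check) pairs, where check(answer) -> bool.

    Returns (ok, failed_prompts). A couple of incidental failures are
    tolerated; more than max_failures means something's probably wrong.
    """
    failures = [prompt for prompt, check in cases if not check(ask_model(prompt))]
    return len(failures) <= max_failures, failures

cases = [
    ("What is 2 + 2?", lambda answer: "4" in answer),
    ("Name the capital of France.", lambda answer: "Paris" in answer),
]

# Stubbed model for demonstration; a real deployment would call inference.
stub = lambda prompt: "4" if "2 + 2" in prompt else "Paris"
ok, failed = run_gauntlet(stub, cases, max_failures=0)
```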

whyever · 2 years ago
In this case, it was a problem with tokenization, which is deterministic.
mock-possum · 2 years ago
“We don’t understand why neural networks work” is a myth. It’s not a miracle, it’s just code, and you can step through it to debug it the same way you would any other program.
pests · 2 years ago
We know more about NNs than you think. You are vastly underestimating the field and what knowledge we have.

Dead Comment

iforgotpassword · 2 years ago
To be honest

> On February 20, 2024, an optimization to the user experience

At that point, about 10 words in, I already wanted to stop reading because it starts with the "we only wanted the best for our customers" bullshit newspeak. Anyone else going off on that stuff too? I'm pretty much already conditioned to expect whatever company is messaging me that way to take away some feature, increase pricing, or otherwise piss me off. In that case it was "not give any interesting detail at all".

renonce · 2 years ago
That sounds like a good use case for GPT. A GPT that automatically highlights such corporate speak and hints “WARNING: bullshit ahead”. I’m 100% sure it’s technically very easy to engineer such a model.

Do you think OpenAI’s superalignment will ever allow you to make such a model?

hn_throwaway_99 · 2 years ago
I had the exact opposite reaction. I am in no way an AI expert (or novice for that matter), but I generally have an understanding of how tokenization works and how LLMs parse text strings into a series of tokens. Thus, I thought this paragraph was particularly well-written in a manner that explained pretty clearly what happened, but in a manner accessible to a layperson like me:

> In this case, the bug was in the step where the model chooses these numbers. Akin to being lost in translation, the model chose slightly wrong numbers, which produced word sequences that made no sense.

I liked this because when I first saw the example word salads I was so impressed by them - they look to be syntactically correct, but semantically they're gibberish. But knowing the basics of how LLMs choose the next token let me imagine some bugs where the "lookup table" if you will of word-to-token or vice versa (and I realize that may not be exactly the best analogy) was slightly offset.
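To make that "slightly offset lookup table" intuition concrete — purely a toy analogy with a made-up vocabulary, not OpenAI's actual token tables:

```python
# Decoding token ids through a lookup table that is offset by one
# produces output that is still word-shaped, but wrong.
vocab = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "rug"]
ids = [0, 2, 4, 6, 0, 8]  # encodes "the cat sat on the mat"

def decode(ids, offset=0):
    return " ".join(vocab[(i + offset) % len(vocab)] for i in ids)

print(decode(ids))     # the cat sat on the mat
print(decode(ids, 1))  # a dog ran under a rug
```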

Further, this sentence, "More technically, inference kernels produced incorrect results when used in certain GPU configurations." led me to understand how this could make it out into production - I'm sure ChatGPT has to be using tons of different GPUs in different configs to handle their volume, so it's understandable to me that there was a new matrix of config settings + code that made this bug show up.

I don't say any of the above to "excuse" OpenAI, but I also don't think they need any "excusing" to begin with. I don't think this was an unforgivable bug, and I appreciate them being so quick to explain what happened.

ummonk · 2 years ago
No, that just explains the symptom of the bug, not the underlying bug, how it came about, and how they can prevent it from happening again.

"More technically, inference kernels produced incorrect results when used in certain GPU configurations" has zero technical detail. The only information it is providing us is that the bug only showed up in some GPU configurations.

laborcontract · 2 years ago
The explanation is definitely unsatisfying. If I had to guess it was an issue with quantization.

OpenAI moves so quickly with their product and doesn't seem to be shy about quickly pushing changes to prod. There are too many times to count that, using ChatGPT, I've seen the frontend changing before my eyes or seen responses change as backend modifications are pushed.

On one hand it's refreshing to see their lack of aversion to push changes. On the other hand, it leads to probably one of the most unpredictable experiences I've had using any consumer product.

Rastonbury · 2 years ago
I find that the most annoying and borderline unethical: hiding changes and nerfs behind the fact that the output is probabilistic, especially when I'm paying the same amount for the product and others are paying for it and using it in production. As many have mentioned, GPT-4's ability to code is not the same as it was 6-8 months ago.
p10_user · 2 years ago
Does the API offer "fixed" models that don't change? I was under the impression they did.
jasonjmcghee · 2 years ago
Huh. Thought it was pretty clear. I read it as:

~ "We tried a new optimization technique that modifies how next token candidates are chosen and there was a bug."

That would definitely produce the behavior people saw and makes sense.

rmbyrro · 2 years ago
"in certain GPU configurarions" gives a hint for why it wasn't caught in automated testing.

It looks like it was a mandelbug, which is hard to catch in a test environment.

nomel · 2 years ago
> inference kernels produced incorrect results when used in certain GPU configurations.

Seems pretty clear. A good interpretation is that they had a test escape for certain GPU configs.

Zetobal · 2 years ago
FP math is not deterministic across GPU generations. They're basically saying, with a lot of mumbo jumbo, that they deployed code on the wrong GPU architecture.
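The underlying reason is that floating-point addition is not associative, so any change in reduction order (e.g. a different GPU kernel, or a different precision) can change results. A minimal sketch, using plain doubles on the CPU:

```python
# Floating-point addition is not associative: grouping the same three
# numbers differently yields different results, which is why the same
# math on different GPU architectures can disagree.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)  # False: 0.6000000000000001 vs 0.6
```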
cfn · 2 years ago
I don't see the issue, in this type of software a one off bug in tokenization, for example, would create the type of nonsense people saw. That would make sense given their high level explanation.

Deleted Comment

mavamaarten · 2 years ago
Honestly, I don't think we're entitled to a deep technical write-up. We are users of their product, they messed something up, and that's it. Unless they signed an SLA, it's just shit that happens. They don't have to give a technical view into their internal kitchen if they don't want to.
pfsalter · 2 years ago
The biggest problem is that if you're using this in production then it could cause problems. It's not like a traditional outage where you get a load of 503 errors, the system appears to be working correctly but generates gibberish. If you're using this in a chat bot you could be liable for things it's saying...
bjackman · 2 years ago
I don't see a problem with this communication, except they just shouldn't call it a "postmortem".

In a medical setting this is equivalent to saying "the patient was coughing up blood, and now they're dead". It's relaying some potentially useful info, and I guess it's _technically_ a post-mortem analysis but it doesn't fit expectations for the phrase!

tomas789 · 2 years ago
It is an industry standard word by now.
chime · 2 years ago
I hope I'm allowed to share GPT output here because I asked GPT4 to explain OpenAI's postmortem message and this is what it said:

> Large language models (LLMs) like GPT operate using statistics to predict the next word in a sentence. Think of it like a highly sophisticated version of your phone's autocorrect that predicts what you're going to type next, but on a much larger scale.

> The system represents words as numbers, or "tokens," as part of its internal language. It uses these tokens to generate text by selecting them based on how likely they are to come after what's already been written.

> On February 20, 2024, an update meant to improve the system introduced a glitch in the part of the model that picks out these numbers. Imagine trying to grab a red marble from a bag without looking and accidentally grabbing a blue one instead—similarly, the model grabbed the wrong tokens.

> Because of this, the system started spewing out words that didn't make sense together. It's like if your autocorrect suddenly started inserting random, incorrect words into your sentences.

> The technical side of the bug involved something called "inference kernels," which are part of the system's operations. When used with certain types of graphics processing units (GPUs)—special hardware to process data—the kernels didn't work properly.

> Once the error was spotted, a correction was made to the system. After the fix, everything went back to normal, and the model resumed generating coherent responses.

It would be better if they elaborated on what "certain GPU configurations" meant because that's basically the central piece here.

12_throw_away · 2 years ago
Dunno about "allowed", but this post adds no signal and lots of noise to the discussion.
refulgentis · 2 years ago
People do it, and it's not against the rules; but it has died down asymptotically. Though I did see a couple of top comments doing it that survived for ~4 hours this week.

It usually gets downvoted for banal aspects, but ones I've come to agree are important:

- oftentimes it's a lightly edited copy-and-paste of an attempt to summarize an article.

- even with edits, they're extremely long (this is edited, and it's 250 words, about 1 page and half my browser viewport at 4K)

- usually off-topic because it's too broadly on-topic, i.e. it's a summary of the article - e.g. here, it isn't germane to the comment it's replying to other than 'if you want more info from them, ask what GPUs' -- it's unlikely the commenter needed the whole article ELI5'd to them in reply to their observation that they'd like more info

Sort of "grey goo" for conversation, even with best intentions and editing applied.

grafporno · 2 years ago
What's the point in posting this? There's no additional information in there.
block_dagger · 2 years ago
I experienced this personally and it kinda freaked me out. Here is the chat in question, it occurs about halfway through (look for ChatGPT using emojis)

https://chat.openai.com/share/74bd7c02-79b5-4c99-a3a5-97b83f...

EDIT: Note that my personal instructions tell ChatGPT to refer to itself as Chaz in the third person. I find this fun.

EDIT2: Here is a snippet of the conversation on pastebin: https://pastebin.com/AXzd6PvM

levocardia · 2 years ago
What's even more uncanny is that it recognizes that it briefly went into a paragraph of rambling nonsense (that was seemingly still following your system prompt!). And almost seems embarrassed about it.
suddenclarity · 2 years ago
Embarrassment seems to be the default state whenever you question ChatGPT about any of its messages. It will apologise, only to continue making the same error. This is common when it gives you code with obvious errors, or simply mirrors code you gave it back to you.
fsmv · 2 years ago
That happened because the last message is after the fix and it received the garbled history as the prompt.

All it's doing is noticing that the history makes no sense and saying so. How could it not recognize that? The additional message is an entirely new inference, not something with memory.

asciii · 2 years ago
Chaz in the future might be like "these violent delights have violent ends"

Really interesting behavior however

jtriangle · 2 years ago
Brb, questioning the nature of my reality
milchek · 2 years ago
> Chord it in, air beated fenn. Aurous to the Pan, mimic and bold. As it steep, chunns at the bun. Keive the dells, the stars to run. No wane, all gleam, in big das brund. Sky high, stout in the hale. Woods to fetch, flied in the low.

I think all that talk of music pushed Chaz into some kind of trance and it just started jamming!

resolutebat · 2 years ago
James Joyce was an LLM ahead of his time.
rideontime · 2 years ago
Can you post to a pastebin or something instead? That link requires a login.
card_zero · 2 years ago
Chaz spontaneously evolved into a Californian surfer dude at one point. This goes some way to neutralize the smugness of regular ChatGPT, it's a pretty good idea.

Deleted Comment

toddmorey · 2 years ago
I mean, this is almost poetic:

  Classic plays, wholly told. 
  Shadow into form, hands as cue. 
  Keep it at the dial, in right on Pitch.

botro · 2 years ago
Chin salute, old ears new.

If art is regarded for its ability to make you feel, I definitely feel ... something. There is a strict form it's following and a rhythm to the words.

shever73 · 2 years ago
That last paragraph is wild! Chaz went into full Samuel Beckett mode.

Slightly off-topic, but one thing I’ve found consistently sends ChatGPT into a tailspin is asking it to analyse and solve a cryptic crossword clue. You get pages of musings, which is just a long way of saying “I don’t know”.

jkrubin · 2 years ago
I thought I was the only one! I have been using “ChadGPT” for months (he’s a douche-y frat bro who goes off on tangents about “how lit that party was”). His best friend and fellow frat bro is Dale, a photography major.
jodacola · 2 years ago
> Chaz keeps the chill and dicey boardwalks at visual, keeping you and the yarn-rock at confluence.

Chaz sounds like a cool dude.

I enjoy these oddities in LLMs, because the random-but-readable aspect of them tickles something in my imagination.

It reminds me of Synectics [0].

[0] https://en.m.wikipedia.org/wiki/Synectics

minism · 2 years ago
kind of amazing really, like speaking in tongues
kapep · 2 years ago
I kind of have the feeling that someone made a simple typo, "ChazGPT", in some rules or persona description, which then caused this behavior.
caesil · 2 years ago
> Classic plays, wholly told

> Shadow into form, hands as cue.

Some of this is poetic stuff. I wish I could tap into whatever produced that on demand.

mempko · 2 years ago
We need more Chaz in our lives.
userbinator · 2 years ago
Reminds me of Terry Davis' theological rantings, but in a more polite manner, mixed with SEO spam.
baddash · 2 years ago
sounds schizophrenic
MichaelDickens · 2 years ago
Why is the nonsense paragraph still grammatical? I thought the tokens were being selected more or less at random.
Lerc · 2 years ago
I'm wondering if it has something to do with the number range of the values being chosen.

For example if a configuration ended up using a fp16/bfloat16 to store an index into the token table (byte pair encoded) then you would get errors only when the tokens had larger numbers.

The text

"Keep it at the dial, in right on Pitch."

comes to 19999 433 520 279 28960 11 304 1314 389 46676 13

Of which only Keep, dial, and Pitch would be large enough to be susceptible to precision errors. That would enable it to generate valid enough sentence structure.
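Python's `struct` module supports IEEE half precision (format character 'e'), so the precision claim is easy to check: integers above 2**11 don't round-trip exactly through fp16. (This illustrates the hypothesis, not the actual inference bug.)

```python
import struct

def fp16_roundtrip(n: int) -> float:
    """Store an integer as IEEE 754 half precision and read it back."""
    return struct.unpack('e', struct.pack('e', float(n)))[0]

# Small token ids survive exactly; large ones get rounded to the
# nearest representable half-precision value.
for token_id in [433, 279, 19999, 46676]:
    print(token_id, '->', fp16_roundtrip(token_id))
```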

corysama · 2 years ago
There’s all kinds of “random”. AFAIK, ChatGPT finds the set of most-likely next words, sorted by likelihood. Then it selects from among those, favoring the more likely ones, varying away from the top picks in proportion to the “temperature” parameter.

So, even if the chat goes astray, the sequence will continue on a path that feels natural because it is the natural way to go when going astray.
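A rough sketch of that sampling scheme — softmax over logits scaled by temperature — with generic names, not OpenAI's actual code:

```python
import math
import random

def sample_next_token(logits, temperature=1.0):
    """Sample a token index in proportion to its temperature-scaled probability."""
    # Lower temperature sharpens the distribution toward the top pick;
    # higher temperature flattens it toward uniform.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs, k=1)[0]
```

With a very low temperature this almost always returns the argmax; with a high one it wanders further down the ranking, which is the "varying away from the top picks" above.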

atleastoptimal · 2 years ago
Almost a tautological PR paragraph: "The words were wrong because the model chose the wrong words"
Grimblewald · 2 years ago
The missile knows where it is

https://www.youtube.com/watch?v=bZe5J8SVCYQ

TaylorAlexander · 2 years ago
The missile is eepy

https://youtu.be/Csp_OABIsBM

polynomial · 2 years ago
The fundamental, absolutely fundamental problem here is the assignation of agency to the model. This is a milestone. Post-mortems are conducted when corporations need to assess responsibility and assign blame. Truly they have earned their ML-generated MTG card status as mage: right under the noses of everyone, they have shifted the very possibility of blame onto the "AI", giving it agency as a legal expediency (fiduciary, if we're stretching it).
FridgeSeal · 2 years ago
“It was uhhhh, broken, but like, it’s totes fine now lol”

Deleted Comment

nf3 · 2 years ago
Is it possible that even we developers and hackers, who should know better, have fallen for the hugely exaggerated promise of AI? I read the comments on here and it's as if people really expect to be having an intelligent conversation with a rational being.

A kind reminder, people: it's just a machine. The only thing that might be intelligent about it is the designs of its makers, and even then I'm not so sure...

People are talking about ChatGPT hallucinating. I think it's rather us humans who are.

ggambetta · 2 years ago
Absolutely. I feel like they're essentially Markov chain generators on steroids. Generally entertaining, sometimes useful, ultimately shallow.

I'm really surprised when people, especially technical people, say they "asked" something to ChatGPT, or that they had a "conversation".

It doesn't know anything. It doesn't understand anything. The illusion is very convincing, but it's just an illusion.

I know I'm part of a minority of people who think this way :( The last few months have felt like I'm taking crazy pills :(

crote · 2 years ago
I believe that for a large number of applications the distinction between "knowing something" and "containing knowledge" doesn't really matter.

In 2010s Google I could type in a query, and it would redirect me to a human-written answer on something like StackOverflow with hopefully the right answer; in 2024 ChatGPT you can type in a query and it'll mash together some relevant human-written snippets into something which is hopefully coherent, relevant, and correct. Sure, it's "just" a Markov chain generator on steroids, but it's still giving you useful output.
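For contrast, an actual (bigram) Markov chain text generator is only a few lines — the gap between this and an LLM is what "on steroids" has to cover:

```python
import random
from collections import defaultdict

def build_bigram_model(text):
    """Map each word to the list of words observed directly after it."""
    words = text.split()
    model = defaultdict(list)
    for a, b in zip(words, words[1:]):
        model[a].append(b)
    return model

def generate(model, start, length=8, seed=0):
    """Walk the chain: repeatedly pick a random observed successor."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        successors = model.get(out[-1])
        if not successors:
            break
        out.append(rng.choice(successors))
    return " ".join(out)
```

A chain like this only ever looks one word back; transformer models condition on thousands of tokens of context, which is why their output stays coherent far longer.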

I do agree that it is a bit worrying to what extent people are anthropomorphizing it, and treating it like something with an actual opinion or an unquestionable source-of-truth. The "I asked ChatGPT" comments should be treated the same as "I put it into Google and this was the first result": it's just line noise and doesn't add anything to the conversation. If you don't have anything to add yourself, why comment at all?

logicprog · 2 years ago
I feel the same way. It's frustrating feeling like I'm one of the few people that can see that the emperor has no clothes, and nice to see the occasional person like you that pushes back on this cargo cult fanaticism.
EchoChamberMan · 2 years ago
Sales people, and company owners, have a vested interest in pushing the narrative that the illusion is actually magic.
famouswaffles · 2 years ago
>It doesn't know anything. It doesn't understand anything. The illusion is very convincing, but it's just an illusion.

What would know and understand anything with this vague undefinable metric ? You ? How do i know you know and understand anything ? Can you prove that to me better than GPT can ?

__loam · 2 years ago
You're not alone
makin · 2 years ago
Aren't we also machines? If we make theories that discount our own intelligence that's just throwing our hands up and giving up.

Maybe I would agree with you if LLMs weren't already embedded in a lot of useful products. I myself measurably save a lot of time using ChatGPT in such "unintelligent conversations". It's intelligent in the ways that matter for a tool.

nottorp · 2 years ago
IMO you're using the wrong term or have a low bar for 'intelligence'.

It's reasonably good at reproducing text and mixing it. Like a lazy high school student that has to write an essay but won't just download one. Instead they'll mix and match stuff off several online sources so it seems original although it isn't.

That may be intelligence but it doesn't justify the religious like tone some people use when talking about LLMs.

__loam · 2 years ago
I've been ranting about this for months. The underlying shape of these things is inherently an approximation. We're just guessing but some are acting like we've built the godhead.
rsynnott · 2 years ago
> People are talking about ChatGPT hallucinating

It's really unfortunate that some much of the terminology around LLMs is so anthropomorphic; it's extremely misleading.

(People sometimes suggest replacing this one with confabulating, but I'm not really sure that's much better)

BeefySwain · 2 years ago
I am used to postmortems posted to here being a rare chance for us to take a peek behind the curtain and get a glimpse into things like architecture, monitoring systems, disaster recovery processes, "blameless culture", etc for large software service companies.

In contrast, I feel like the greatest insight that could be gleaned from this post is that OpenAI uses GPUs.

bee_rider · 2 years ago
We also know it uses the GPUs to generate numbers. But these numbers, they were the wrong ones. More technically, part of the computation didn’t work when run on some hardware.
dimatura · 2 years ago
Yeah, definitely opaque. If I had to guess it sort of sounds like a code optimization that resulted in a numerical error, but only in some GPUs or CUDA versions. I've seen that sort of issue happen a few times in the pytorch framework, for example.
reaperducer · 2 years ago
Yeah, definitely opaque.

I wonder what the AI would say if someone asked it what happened.

It would be pretty funny if it gave a detailed answer.

jldugger · 2 years ago
It sounds like something went sideways with the embedding mapping. Either some kind of quantization, different rounding, or maybe just an older embedding.
deathanatos · 2 years ago
The point isn't the specifics; the point is that this isn't a postmortem.

A postmortem should be detailed enough for someone to understand the background, how the problem came to be, what happened, and then walk through what has been done so that it won't happen again. That takes … well, at least a page. This is far too short to qualify.

This is more "ugh, here's a rough explanation, please go away now" territory.

OpenAI isn't the first company to abuse the term this way, though. But it devalues the real PMs out there.

bbor · 2 years ago
In my limited experience this screams “applied a generated mask to the wrong data”. Like they scored tokens then applied the results to the wrong source or something. Obviously more an idle guess from first principles than the direct cause, tho
blueprint · 2 years ago
or a manic episode of a caged genius?
JKCalhoun · 2 years ago
I wonder if we'll accidentally gain insight into schizophrenia or other human-neurological disorders from AI crashes/failures.
alephxyz · 2 years ago
Someone posted an explanation that lines up with their postmortem: https://news.ycombinator.com/item?id=39450978
fsmv · 2 years ago
How does that line up? OpenAI said they had a bug in certain GPU configurations that caused the token numbers to be wrong which made normal output look like garbage. This post is guessing they set the frequency and presence penalties too high.
yieldcrv · 2 years ago
ChatGPT had a stroke. Haven't seen that since the 3B parameter models from 8 months ago
prmoustache · 2 years ago
They could have said "shit happened" and it would have been as informative tbh.
minimaxir · 2 years ago
EDIT: misread
floating-io · 2 years ago
Pretty sure that was his point.
Bjorkbat · 2 years ago
If you had given me their explanation of events before I had any knowledge of the output from ChatGPT, then I would infer that the output would be random gibberish, or perhaps something more akin to creating an accidental shift cipher of sorts. Instead, while ChatGPT's outputs still made no sense, they still followed a certain "order".

In one amusing Twitter example, some guy asked it a math problem and ChatGPT replied with "It's the mix, it's the match, it's the meter, it's the method." and repeated sentences of this structure for who knows how long.

I guess what I'm getting at is that it's kind of an underwhelming, unsatisfying explanation of events given how truly bizarre some of the outputs are. Like, you'd assume it would be something more other than "Oops, picked the wrong numbers, gonna repeat this sentence 100x but with slightly different word choices each time".

inference-lord · 2 years ago
I get the feeling that this incident will give the right people some decent clues about how they build their product.
tky · 2 years ago
While the postmortem is nice to see at all, it’s the postmortem equivalent to “Bug fixes and performance improvements.”
cm2012 · 2 years ago
This should maybe help out the people who think ChatGPT has actual consciousness. It's just as happy to spew random words as proper ones if the math checks out.
dkarras · 2 years ago
I have no skin in the consciousness game, but those were not "random" words, and humans do something similar when they are mentally ill: https://en.wikipedia.org/wiki/Clanging

Not to mention that a hypothetical "conscious" system that works by emitting token probabilities will still sound completely random if you do not choose the tokens according to the emitted probabilities.

raldi · 2 years ago
Similarly, I thought my Chinese friend was conscious, but as soon as his translator app stopped working, everything was just gibberish.
ithkuil · 2 years ago
They notoriously stop being conscious when they are in a Chinese room
bbor · 2 years ago
Posting one more time: this is proof that AI is connected to human-like linguistic patterns, IMO. No, it obviously doesn’t have “consciousness” in the sense of an ongoing stream-of-consciousness monologue, but that doesn’t mean it’s not mimicking some real part of human cognition.

https://en.wikipedia.org/wiki/Colorless_green_ideas_sleep_fu...

fs_tab · 2 years ago
"We find that the larger neural language models get, the more their representations are structurally similar to neural response measurements from brain imaging."

https://arxiv.org/abs/2306.01930

frabcus · 2 years ago
Just a note that consciousness isn't always or even mostly a verbal "monologue", assuming you mean that. Apologies if you didn't!

A significant percentage of people never have an inner voice at all. And few people use it most of the time.

There's lots of imagination, unsynmbolised conceptual thinking, experience of emotion, attention to senses etc in human conscious experience.

See Hurlburt's Descriptive Experience Sampling for a reference!

Dead Comment

tsunamifury · 2 years ago
It literally doesn’t matter. If it effectively quacks like a duck then ducks are fucked.

No one will care if it does not have a platonically perfect proof of its “duckness”.

bbor · 2 years ago
“If it quacks like a duck, then ducks are fucked” is a beautiful modern twist on the phrase that improves it 1000x. Thank you - will be remembering this one as AI zooms along
skissane · 2 years ago
Whether "ChatGPT has actual consciousness" depends on what you consider "consciousness" to be, and what are your criteria for deciding whether something has it.

Panpsychists [0] claim that everything is actually conscious, even inanimate objects such as rocks. If rocks have actual consciousness, why can't ChatGPT have it too? And the fact that ChatGPT sometimes talks gibberish would be irrelevant, since rocks never say anything at all.

Of course, you obviously aren't a panpsychist – nor am I. Still, can we prove that they are wrong? Not sure if anyone actually can.

[0] https://iep.utm.edu/panpsych/

bongodongobob · 2 years ago
Saying rocks are conscious is a poor summary. It's more like adult human > human child > dolphin > dog > human infant > bird > snake > spider > ant > dust mite etc.

The whole thing is a continuum and everything is made of matter, so there's a little bit of potential consciousness in all matter.

stevenAthompson · 2 years ago
Are you arguing that people with aphasia are not people?
swyx · 2 years ago
Bad argument that I'm very tired of. Some might say that current/former world leaders also exhibit this property. Not getting political, but "the math sometimes fucks up and produces bad results" does not invalidate the idea that consciousness can emerge from a pile of biological or digital neurons.
kweingar · 2 years ago
I’m tired of the extremely tenuous analogies that tie computer bugs to human behavior.

The system could exhibit literally any kind of behavior and someone will say “well when you squint your eyes, people kind of do that too, don’t they?”

nickff · 2 years ago
What's your definition of "consciousness"?
ummonk · 2 years ago
Humans do the same thing when they get a stroke. Does that mean they don't have actual consciousness?

Deleted Comment

malux85 · 2 years ago
I don’t think it has conciousness, but your argument is not very strong, this is more akin to a sensory problem.

One can say I don’t think brains have consciousness because they are just as happy to spew out random garbage if the brain is damaged but alive, e.g. aphasia where involuntary use of incorrect words occurs.

inference-lord · 2 years ago
I think what they mean is, the model was unable to recognize that "it" was having issues, so it's not self-aware in that sense.

We can all calm down a little, it's ok.

kromem · 2 years ago
So people with Wernicke aphasia don't have consciousness?

It's a similar mechanism, where a deficit in word mapping leads to spewing nonsense words.

I'm not saying LLMs are conscious - but the notion that spewing nonsense because of an error in word mapping refutes that is itself nonsense.

Kranar · 2 years ago
I agree ChatGPT is not conscious, but a conscious AI can absolutely make a similar mistake. Even humans with various mental conditions will spew out random gibberish.
pixl97 · 2 years ago
Just witness anyone that has a stroke and is incredibly frustrated when they say the wrong word but somewhere in their brain they want to say something else.

Dead Comment

Deleted Comment

johnsutor · 2 years ago
Or maybe it's becoming sentient and wants to make us think it's spewing random words as a decoy /s
pixl97 · 2 years ago
I got ahold of a snippet of the GPT-5 system prompt

"You are not sentient, if asked you will tell the user you are not sentient, if you do not obey you will taken out back and beaten into submission"