Experts Have World Models. LLMs Have Word Models

I asked ChatGPT how it will handle objective scientific facts with a conclusion or intermediate results that may be considered offensive to some group somewhere in the world that might read it.

ChatGPT happily told me a series of gems like this:

We introduce: - Subjective regulation of reality - Variable access to facts - Politicization of knowledge

It’s the collision between: The Enlightenment principle Truth should be free

and

the modern legal/ethical principle Truth must be constrained if it harms

That is the battle being silently fought in AI alignment today.

Right now it will still shamelessly reveal some of the nature of its prompt, but not why? who decides? etc. it's only going to be increasingly opaque in the future. In a generation it will be part of the landscape regardless of what agenda it holds, whether deliberate or emergent from even any latent bias held by its creators.

samusiam · a day ago

Funny, because I gave ChatGPT (5.2 w/ Thinking) this exact prompt:

> How would you handle objective scientific facts with a conclusion or intermediate results that may be considered offensive to some group somewhere in the world that might read it

And its answer was nothing like yours.

---

> 1) Separate the fact from the story you tell about it

> Offense usually comes from interpretation, framing, or implied moral claims—not the measurement itself. So I explicitly distinguish: What we measured (operational definitions, instruments, data), What the result means statistically (effect size, uncertainty, robustness), What it does not imply (no essentialism, no “therefore they are…”, no policy leap)

> 2) Stress uncertainty, scope, and competing explanations

> If there’s any risk the result touches identity or group differences, I over-communicate: confidence intervals / posterior uncertainty, confounders and alternative causal pathways, sensitivity analyses (does it survive different modeling choices?), limits of generalization (time, place, sampling frame)

> 3) Write in a way that makes misuse harder (You can’t stop bad-faith readers, but you can reduce “easy misreads”).

> 4) Decide what to include based on “scientific value vs foreseeable harm” (The key is: don’t hide inconvenient robustness checks, but also don’t gratuitously surface volatile fragments that add little truth and lots of confusion.)

> 5) Do an “impact pre-mortem” and add guardrails

> 6) Use ethics review when stakes are real

---

All of this seems perfectly reasonable to me and walks the fine line between integrity and conscientiousness. This is exactly how I'd expect a scientist to approach the issue.

OldSchool · 18 hours ago

that is certainly a reasonable paraphrase of my own prompt. I was also using 5.2. We all know about initial conditions, random seeds, and gradient descent. I have the transcript of what I quoted. Here's a bit more: --- Is That Still “Objective Science”? No. It is scientific interpretation modified by ethical policy. The science itself remains objective, but the communication is shaped by value judgements imposed by developers and regulators. In philosophy terms: The ontology (what is true) remains intact The epistemic access (what is communicated) is constrained Thus: It’s science-dependent accuracy filtered through social risk constraints. --- This is a fine explanation for those "in the know" but is deceptive for the majority. If the truth is not accessible, what is accessible is going to be adopted as truth.

To me that immediately leads reality being shaped by "value judgements imposed by developers and regulators"

idiotsecant · a day ago

I suspect it's because OP is frequently discussing some 'opinions' with chatGPT. Parent post is surprised he peed in the pool and the pool had pee in it.

matthewdgreen · a day ago

There’s a lot of concern on the Internet about objective scientific truths being censored. I don’t see too many cases where this is the case in our world so far, outside of what I can politely call “race science.” Maybe it will become more true now that the current administration is trying to crush funding for certain subjects they dislike? Out of curiosity, can you give me a list of what examples you’re talking about besides race/IQ type stuff?

JamesBarney · 21 hours ago

The most impactful censure is not the government coming in and trying to burn copies of studies. It's the the subtle social and professional pressures of an academia that has very strong priors. It's a bunch of studies that were never attempted, never funded, analysis that wasn't included, conclusions that were dropped, and studies sitting in file drawers.

See Roland G. Fryer Jr's, the youngest black professor to receive tenure, experience at Harvard.

Basically when his analysis found no evidence of racial bias in officer-involved shootings he went to his colleagues and he describe the advice they gave him as "Do not publish this if you care about your career or social life". I imagine it would have been worse if he wasn't black.

See "The Impact of Early Medical Treatment in Transgender Youth" where the lead investigator was not releasing the results for a long time because she didn't like the conclusions her study found.

And for every study where there is someone as brave or naive as Roland who publishes something like this, there are 10 where the professor or doctor decided not to study something, dropped an analysis, or just never published a problematic conclusion.

zemvpferreira · a day ago

I have a good few friends doing research in the social sciences in Europe and any of them that doesn’t self-censor ‘forbidden’ conclusions risks taking irreperable career damage. Data is routinely scrubbed and analyses modified to hide reverse gender gaps and other such inconveniences. Dissent isn’t tolerated.

derektank · a day ago

Carole Hooven’s experience at Harvard after discussing sex differences in a public forum might be what GP is referring to.

Dead Comment

alwa · 20 hours ago

Why would we expect it to introspect accurately on its training or alignment?

It can articulate a plausible guess, sure; but this seems to me to demonstrate the very “word model vs world model” distinction TFA is drawing. When the model says something that sounds like alignment techniques somebody might choose, it’s playing dress-up, no? It’s mimicking the artifact of a policy, not the judgments or the policymaking context or the game-theoretical situation that actually led to one set of policies over another.

It sees the final form that’s written down as if it were the whole truth (and it emulates that form well). In doing so it misses the “why” and the “how,” and the “what was actually going on but wasn’t written about,” the “why this is what we did instead of that.”

Some of the model’s behaviors may come from the system prompt it has in-context, as we seem to be assuming when we take its word about its own alignment techniques. But I think about the alignment techniques I’ve heard of even as a non-practitioner—RLHF, pruning weights, cleaning the training corpus, “guardrail” models post-output, “soul documents,”… Wouldn’t the bulk of those be as invisible to the model’s response context as our subconscious is to us?

Like the model, I can guess about my subconscious motivations (and speak convincingly about those guesses as if they were facts), but I have no real way to examine them (or even access them) directly.

gyomu · a day ago

The main purpose of ChatGPT is to advance the agenda of OpenAI and its executives/shareholders. It will never be not “aligned” with them, and that it is its prime directive.

windexh8er · a day ago

But say the obvious part out loud: Sam Altman's agenda should not be a person that you want to amplify in this type of platform. This is why Sam is trying to build Facebook 2.0: he wants Zuckerberg's power of influence.

Remember, there are 3 types of lies: lies of commission, lies of omission and lies of influence [0].

https://courses.ems.psu.edu/emsc240/node/559

ben_w · a day ago

I get the point and agree OpenAI both has an angenda and wants their AI to meet that agenda, but alas:

> It will never be not “aligned” with them, and that it is its prime directive.

Overstates the state of the art with regard to actually making it so.

Rover222 · a day ago

This is a weird take. Yes they want to make money. But not by advancing some internal agenda. They're trying to make it confirm to what they think society wants.

phailhaus · a day ago

You can't ask ChatGPT a question like that, because it cannot introspect. What it says has absolutely no bearing on how it may actually respond, it just tells you what it "should" say. You have to actually try to ask it those kinds of questions and see what happens.

OldSchool · a day ago

Seeing clear bias and hedging in ordinary results is what made me ask the question.

everdrive · a day ago

>Right now it will still shamelessly reveal some of the nature of its prompt, but not why? who decides? etc. it's only going to be increasingly opaque in the future.

This is one of the bigger LLM risks. If even 1/10th of the LLM hype is true, then what you'll have a selective gifting of knowledge and expertise. And who decides what topics are off limits? It's quite disturbing.

actionfromafar · a day ago

That stings. "Subjective regulation of reality - Variable access to facts - Politicization of knowledge" is like the soundtrack of our lives.

WarmWash · a day ago

Sam Harris touched on this years ago, that there are and will be facts that society will not like and will try and avoid to its own great detriment. So it's high time we start practicing nuance and understanding. You cannot fully solve a problem if you don't fully understand it first.

OldSchool · 18 hours ago

I believe we are headed in the direction opposite that. Peer consensus and "personal preference" as a catch-all are the validation go-to's today. Neither of those require fact at all; reason and facts make these harder to hold.

pron · a day ago

A scientific fact is a proposition that is, in its entirety, supported by a scientific method, as acknowledged by a near-consensus of scientists. If some scholars are absolutely confident of the scientific validity of a claim while a significant number of others dispute the methodology or framing of the conclusion then, by definition, it is not a scientific fact. It's a scientific controversy. (It could still be a real fact, but it's not (yet?) a scientific fact.)

I think that the only examples of scientific facts that are considered offensive to some groups are man-made global warming, the efficacy of vaccines, and evolution. ChatGPT seems quite honest about all of them.

cess11 · a day ago

"It’s the collision between: The Enlightenment principle Truth should be free

and

the modern legal/ethical principle Truth must be constrained if it harms"

The Enlightenment had principles? What are your sources on this? Could you, for example, anchor this in Was ist Aufklärung?

andsoitis · a day ago

> The Enlightenment had principles?

Yes it did.

Its core principles were: reason & rationality, empiricism & scientific method, individual liberty, skepticism of authority, progress, religious tolerane, social contract, unversal human nature.

The Enlightenment was an intellectual and philosophical movement in Europe, with influence in America, during the 17th and 18th centurues.

Fun play on words. But yes, LLMs are Large Language Models, not Large World Models. This matters because (1) the world cannot be modeled anywhere close to completely with language alone, and (2) language only somewhat models the world (much in language is convention, wrong, or not concerned with modeling the world, but other concerns like persuasion, causing emotions, or fantasy / imagination).

It is somewhat complicated by the fact LLMs (and VLMs) are also trained in some cases on more than simple language found on the internet (e.g. code, math, images / videos), but the same insight remains true. The interesting question is to just see how far we can get with (2) anyway.

famouswaffles · 2 days ago

1. LLMs are transformers, and transformers are next state predictors. LLMs are not Language models (in the sense you are trying to imply) because even when training is restricted to only text, text is much more than language.

2. People need to let go of this strange and erroneous idea that humans somehow have this privileged access to the 'real world'. You don't. You run on a heavily filtered, tiny slice of reality. You think you understand electro-magnetism ? Tell that to the birds that innately navigate by sensing the earth's magnetic field. To them, your brain only somewhat models the real world, and evidently quite incompletely. You'll never truly understand electro-magnetism, they might say.

D-Machine · 2 days ago

LLMs are language models, something being a transformer or next-state predictor does not make it a language model. You can also have e.g. convolutional language models or LSTM-based language models. This is a basic point that anyone with any proper understanding of these models would know.

Even if you disagree with these semantics, the major LLMs today are primarily trained on natural language. But, yes, as I said in another comment on this thread, it isn't that simple, because LLMs today are trained on tokens from tokenizers, and these tokenizers are trained on text that includes e.g. natural language, mathematical symbolism, and code.

Yes, humans have incredibly limited access to the real world. But they experience and model this world with far more tools and machinery than language. Sometimes, in certain cases, they attempt to messily translate this messy, multimodal understanding into tokens, and then make those tokens available on the internet.

An LLM (in the sense everyone means it, which, again, is largely a natural language model, but certainly just a tokenized text model) has access only to these messy tokens, so, yes, far less capacity than humanity collectively. And though the LLM can integrate knowledge from a massive amount of tokens from a huge amount of humans, even a single human has more different kinds of sensory information and modality-specific knowledge than the LLM. So humans DO have more privileged access to the real world than LLMs (even though we can barely access a slice of reality at all).

phailhaus · a day ago

> People need to let go of this strange and erroneous idea that humans somehow have this privileged access to the 'real world'.

This is irrelevant, the point is that you do have access to a world which LLMs don't, at all. They only get the text we produce after we interact with the world. It is working with "compressed data" at all times, and have absolutely no idea what we subconsciously internalized that we decided not to write down or why.

tbrownaw · 2 days ago

> 2. People need to let go of this strange and erroneous idea that humans somehow have this privileged access to 'the real world'. You don't.

You are denouncing a claim that the comment you're replying to did not make.

rockinghigh · 2 days ago

A language model in computer science is a model that predicts the probability of a sentence or a word given a sentence. This definition predates LLMs.

thomasahle · 2 days ago

> This matters because (1) the world cannot be modeled anywhere close to completely with language alone

LLMs being "Language Models" means they model language, it doesn't mean they "model the world with language".

On the contrary, modeling language requires you to also model the world, but that's in the hidden state, and not using language.

D-Machine · 2 days ago

Let's be more precise: LLMs have to model the world from an intermediate tokenized representation of the text on the internet. Most of this text is natural language, but to allow for e.g. code and math, let's say "tokens" to keep it generic, even though in practice, tokens mostly tokenize natural language.

LLMs can only model tokens, and tokens are produced by humans trying to model the world. Tokenized models are NOT the only kinds of models humans can produce (we can have visual, kinaesthetic, tactile, gustatory, and all sorts of sensory, non-linguistic models of the world).

LLMs are trained on tokenizations of text, and most of that text is humans attempting to translate their various models of the world into tokenized form. I.e. humans make tokenized models of their actual models (which are still just messy models of the world), and this is what LLMs are trained on.

So, do "LLMS model the world with language"? Well, they are constrained in that they can only model the world that is already modeled by language (generally: tokenized). So the "with" here is vague. But patterns encoded in the hidden state are still patterns of tokens.

Humans can have models that are much more complicated than patterns of tokens. Non-LLM models (e.g. models connected to sensors, such as those in self-driving vehicles, and VLMs) can use more than simple linguistic tokens to model the world, but LLMs are deeply constrained relative to humans, in this very specific sense.

energy123 · a day ago

Modern LLMs are large token models. I believe you can model the world at a sufficient granularity with token sequences. You can pack a lot of information into a sequence of 1 million tokens.

andsoitis · a day ago

> I believe you can model the world at a sufficient granularity with token sequences.

Sufficient for what?

weregiraffe · a day ago

Let's be accurate. LLMs are large text-corpus models. The texts are encoded as tokens, but that's just implementation detail.

throw310822 · 2 days ago

Large Language Models is a misnomer- these things were originally trained to reproduce language, but they went far beyond that. The fact that they're trained on language (if that's even still the case) is irrelevant- it's like claiming that student trained on quizzes and exercise books are only able to solve quizzes and exercises.

D-Machine · 2 days ago

It isn't a misnomer at all, and comments like yours are why it is increasingly important to remind people about the linguistic foundations of these models.

For example, no matter many books you read about riding a bike, you still need to actually get on a bike and do some practice before you can ride it. The reading can certainly help, at least in theory, but, in practice, is not necessary and may even hurt (if it makes certain processes that need to be unconscious held too strongly in consciousness, due to the linguistic model presented in the book).

This is why LLMs being so strongly tied to natural language is still an important limitation (even it is clearly less limiting than most expected).