> one of the biggest scientific puzzles of our time
> a hidden mathematical pattern in language that large language models somehow come to exploit... the fact that these things model language is probably one of the biggest discoveries in history
Is it? Excuse my ignorance. If we were talking about simple Markov chains, aren't these just stationary distributions?
Even if LLMs are thought of as Markov chains, there are still substantial unsolved scientific questions.
For a given prompt/"state", an LLM essentially computes the next-state probabilities. It does this by compressing language and storing particular patterns/distributions. To date, we still don't understand which statistics LLMs store, or even which statistics are necessary to produce natural-sounding language. (A canonical Markov chain works only with n-gram statistics, but we know those are insufficient.)
IMO figuring this out to a human-level understanding could be a major breakthrough in science. It would reveal a deeper structure to language. It might even help us understand how our brains process language; at the very least we'd know one plausible algorithm. Prior to LLMs, many scholars assumed language was unique to the brain and could be "computed" only poorly.
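To make the "canonical Markov chain" comparison concrete, here's a minimal sketch of a bigram Markov model: for each word (the "state"), it stores the empirical distribution over next words. The toy corpus and the helper `next_token_probs` are illustrative assumptions, not anything from an actual LLM; the point is that this model captures *only* adjacent-word statistics, which is exactly the limitation the comment describes.

```python
from collections import Counter, defaultdict

# Toy corpus; a real model would be estimated from vastly more text.
corpus = "the cat sat on the mat the dog sat on the rug".split()

# Count bigram transitions: for each word, how often each successor follows it.
transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def next_token_probs(state):
    """Next-state probabilities given the current word (the Markov 'state')."""
    counts = transitions[state]
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

print(next_token_probs("the"))
# → {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
```

An LLM also maps a state (the prompt) to next-token probabilities, but its "state" is an arbitrarily long context compressed through billions of parameters rather than a lookup over literal n-grams, and exactly what statistics that compression preserves is the open question.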
It's not surprising that people don't know how GPT-4 works, because they don't have access to the model. That makes it not an LLM so much as an API to a text-generating oracle.
> Large language models can do jaw-dropping things. But nobody knows exactly why. And that's a problem.
LLMs, at least the most common ones, can talk, but they don't do things.
But even if they could, humans can do things too, and even your pet does jaw-dropping things. So far we've never called that a problem.
So for me, such statements mostly communicate some sort of fear or skepticism. I'm not saying we shouldn't investigate why LLMs can do what they do; we should just call it a research problem.
Yes, I'm sure OpenAI and Google are paying incredibly high salaries to experts who don't know how LLMs work. I'm far from a computer scientist or mathematician, but how the hell does this narrative keep being spouted by folks with a straight face? Oh, I know... no one can think critically...