But when asked for the countries in the European Union, it'll only list counties around the bridge. It then realizes it has failed, tries again, and fails again, hard. Over and over. It's very lucid and can clearly still evaluate that it's going off the rails and what it's doing wrong, but it just can't help itself, like an addict. I really don't like anthropomorphizing LLMs, but it was honestly hard to watch how much it was struggling in some instances.
Anyways I also don't enjoy anthropomorphizing language models, but hey, you went there first :)
Basically, LLMs are “blind” to text. Fragments of text are converted into token IDs drawn from something like a big enum of possible tokens, and those IDs are all the model ever works with.
They can’t see spellings; they cannot see the letters.
So, they can’t handle things like “how many letters are in Mississippi?” reliably.
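To make that concrete, here's a minimal sketch using the tiktoken library. The choice of encoding ("cl100k_base") and the exact splits and IDs it produces are just illustrative, not tied to any particular chatbot:

```python
# Rough sketch of why letter counting is hard for an LLM.
# Uses the tiktoken library; the encoding and the exact token splits/IDs
# shown in the comments are illustrative, not specific to any one model.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

word = "Mississippi"
token_ids = enc.encode(word)
pieces = [enc.decode_single_token_bytes(t) for t in token_ids]

# The model only "sees" the integer IDs, not the letters inside each piece.
print(token_ids)  # a short list of integers
print(pieces)     # e.g. chunks like b'Miss', b'iss', b'ippi'; the letters are hidden inside opaque chunks
print(len(word))  # 11 characters, but nothing in the IDs directly encodes that count
```

The word comes through as a handful of opaque chunks, so "count the letters" has to be inferred indirectly rather than read off the input.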
Because chatbots run with nonzero temperature, they will sometimes emit the right answer just because the dice rolled in their favor. So if you go try this and get a good answer, that's not conclusive either.
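If "nonzero temperature" is unfamiliar, here's a minimal sketch of temperature sampling over a toy next-token distribution. The logits are made up; the point is only that with temperature above zero the model samples instead of always picking the single most likely token:

```python
# Minimal sketch of temperature sampling. The logits are invented; the point
# is that with temperature > 0 the output is a dice roll over a distribution,
# so the same prompt can produce different answers on different runs.
import math
import random

def sample_with_temperature(logits, temperature, rng):
    if temperature == 0:
        # Greedy decoding: always the arg-max token, fully deterministic.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

# Toy next-token choices: index 0 stands for the correct answer, index 1 for a wrong one.
logits = [2.0, 1.5]
picks = [sample_with_temperature(logits, temperature=0.8, rng=random.Random(s)) for s in range(10)]
print(picks)  # a mix of 0s and 1s; the same prompt can yield either answer
```

That's why one lucky correct reply tells you very little about whether the model can actually do the task.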
That’s the thing we’re dealing with, that’s how it works, that’s what it is.