It's a hard sci-fi novel with mild existential-horror tones, born mostly of maths jokes. At one point the main character tries to escape the matrix (reality). But the matrix is defective, so the best way out is to orthogonalize the subspace and reduce the matrix to its eigenbasis instead. Most of the scenes are built on similar maths jokes.
The tentative title is "Diagonalization of the Meta" (I had previously called it "The Metaverse").
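For anyone who wants the joke spelled out: a defective matrix has no full basis of eigenvectors, so it cannot be diagonalized, whereas a symmetric matrix always has an orthonormal eigenbasis. A minimal numpy sketch (purely illustrative, not from the novel):

```python
import numpy as np

# A symmetric matrix is never defective: it has an orthonormal
# eigenbasis, so it diagonalizes fully as A = Q @ D @ Q.T.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, Q = np.linalg.eigh(A)  # Q's columns are orthonormal eigenvectors
D = np.diag(eigvals)
print(np.allclose(A, Q @ D @ Q.T))  # True: A acts diagonally in its eigenbasis

# A defective matrix lacks a full eigenbasis: this Jordan block has
# eigenvalue 1 with algebraic multiplicity 2 but only one independent
# eigenvector, so it cannot be diagonalized.
J = np.array([[1.0, 1.0],
              [0.0, 1.0]])
_, V = np.linalg.eig(J)
# The two computed eigenvector columns are (numerically) parallel,
# so they don't span the space.
print(abs(np.linalg.det(V)) < 1e-8)  # True
```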
But Andersen's story was itself a sanitized version of Friedrich de la Motte Fouqué's “Undine”, a fairy/morality tale in which a water spirit marries a human knight in order to gain an immortal soul. In that story, her husband ultimately breaks his wedding vows, forcing Undine to kill him and costing her her chance of going to heaven.
Andersen explicitly wrote that he found that ending too depressing, which is why he made up his whole bit about Ariel refusing to kill Prince Erik: instead of dying, she turns into a spirit of the air, and if she does good deeds for 300 years she is eventually allowed into heaven after all.
Even as a child, it felt like a cop-out to me. But my point was: “The Little Mermaid” is itself a sanitized version of the original novella, adapted to the author's modern sensibilities.
LLMs are pretty good at preserving who did what to whom when they translate from one language to another. That's because the translation examples they are trained on correctly preserve who did what to whom.
> This study asked whether Large Language Models (LLMs) understand sentences in the minimal sense of representing “who did what to whom”. In Experiment 1, we found that the overall geometry of LLM distributed activity patterns failed to capture this information: similarities between sentences reflected whether they shared syntax more than whether they shared thematic role assignments. Human judgments, in contrast, were strongly driven by this aspect of meaning.
> In Experiment 2, we found limited evidence that thematic role information was available even in a subset of hidden units. Whereas activity patterns in subsets of hidden units often allowed for significant classification of whether sentence pairs had shared vs. opposite thematic role assignments, the effect sizes were small; even the best-performing case appeared to lag behind humans, and its representation of thematic roles did not seem robust across syntactic structures.
> However, thematic role information was reliably available in a large number of attention heads, demonstrating LLMs have the capacity to extract thematic role information. In some cases, information present in attention heads descriptively exceeded human performance.
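The contrast the study probes can be made concrete with toy stimuli (my own illustrative examples, not the study's actual materials): two sentences can share surface syntax while assigning opposite thematic roles, or differ in syntax while assigning the same roles.

```python
# Toy illustration of the syntax vs. thematic-role contrast
# (hypothetical examples, not the study's stimuli).

def roles(agent, patient):
    """Record who did what to whom, independent of surface syntax."""
    return {"agent": agent, "patient": patient}

# Same syntax (both active voice), opposite thematic roles:
s1 = ("The dog chased the cat", roles("dog", "cat"))
s2 = ("The cat chased the dog", roles("cat", "dog"))

# Different syntax (active vs. passive), same thematic roles:
s3 = ("The cat was chased by the dog", roles("dog", "cat"))

# A representation driven by syntax groups s1 with s2;
# one driven by meaning groups s1 with s3.
print(s1[1] == s2[1])  # False: roles are swapped
print(s1[1] == s3[1])  # True: same who-did-what-to-whom
```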