Readit News
j_crick commented on LLMs Report Subjective Experience Under Self-Referential Processing   arxiv.org/abs/2510.24797... · Posted by u/j_crick
j_crick · 3 months ago
"Four main results emerge:

(1) Inducing sustained self-reference through simple prompting consistently elicits structured subjective experience reports across model families.

(2) These reports are mechanistically gated by interpretable sparse-autoencoder features associated with deception and roleplay: surprisingly, suppressing deception features sharply increases the frequency of experience claims, while amplifying them minimizes such claims.

(3) Structured descriptions of the self-referential state converge statistically across model families in ways not observed in any control condition.

(4) The induced state yields significantly richer introspection in downstream reasoning tasks where self-reflection is only indirectly afforded."

X thread from one of the authors: https://x.com/juddrosenblatt/status/1984336872362139686

j_crick commented on OpenAI’s latest research paper demonstrates that falsehoods are inevitable   theconversation.com/why-o... · Posted by u/ricksunny
danjc · 5 months ago
This is written by someone who has no idea how transformers actually work
j_crick · 5 months ago
> The way language models respond to queries – by predicting one word at a time in a sentence, based on probabilities

Kinda tells you all you need to know about the author in this regard.

j_crick commented on Social media promised connection, but it has delivered exhaustion   noemamag.com/the-last-day... · Posted by u/pseudolus
j_crick · 5 months ago
More regulation and mandatory cool-downs for whatever is called “social media” because of AI slop and bot-girls? Sounds reasonable /s
j_crick commented on ChatGPT is NOT a LLM – GPT is   vincirufus.com/posts/chat... · Posted by u/vincirufus
j_crick · 5 months ago
You forgot to mention that this post was written by Opus
j_crick commented on Descent of Inanna into the Underworld   en.wikipedia.org/wiki/Des... · Posted by u/alganet
eggsby · 6 months ago
I learned about this reading Ishtar Rising by Robert Anton Wilson
j_crick · 6 months ago
fnord hehe
j_crick commented on How I use Claude Code to implement new features in an existing complex codebase   sabrina.dev/p/ultimate-ai... · Posted by u/plentysun
AndyNemmity · 6 months ago
Because I gave examples, and details, and thousands of people read them from a pastebin i used to share.

I didn't release it as open source or anything, just sharing. I don't want to take questions concerning it so I can focus on moving it forward.

Today's goal is to try to build self healing agents that automatically fix the problems they encounter so they only happen once, automating a manual process I successfully use.

Perhaps if that works out well, that's something releasable I can do in a real way, as opposed to a pastebin.

j_crick · 6 months ago
Sadly I was just late to the discussion and missed it. Would you mind sending what you shared to an email of mine? Not requesting further communication; just simply curious what people do with and around this.
j_crick commented on How I use Claude Code to implement new features in an existing complex codebase   sabrina.dev/p/ultimate-ai... · Posted by u/plentysun
AndyNemmity · 6 months ago
[flagged]
j_crick · 6 months ago
Why did you replace your comments in this thread with . ?
j_crick commented on Nine households control 15% of wealth in Silicon Valley as inequality widens   theguardian.com/us-news/2... · Posted by u/c420
j_crick · 7 months ago
Piketty was right? Who would’ve thought…
j_crick commented on Claude Code feels like magic because it is iterative   omarabid.com/claude-magic... · Posted by u/todsacerdoti
marliechiller · 8 months ago
Personally, I don't think so. I can understand a mathematical axiom and reason with it. In a sequence of numbers I will be able to tell you N + 1, regardless of where N appears in the sequence. An LLM does not "know" this in the way a human does. It just applies whatever is the most likely thing the training data suggests.
j_crick · 8 months ago
But technically you can do that only because you recognize the pattern: the pattern (sequence) is there, and you were taught that it's a pattern and how to recognize it. Publicly available LLMs of today are taught different patterns, and are also constrained by how they are made.

Maybe there’s something for LLMs in reflection and self-reference that has to be “taught” to them (or has to be not blocked from them if it’s already achieved somehow), and once it becomes a thing they will be “cognizant” in the way humans feel about their own cognition. Or maybe the technology, the way we wire LLMs now simply doesn’t allow that. Who knows.

Of course humans are wired differently, but the point I’m trying to make is that it’s pattern recognition all the way down both for humans and LLMs and whatnot.

u/j_crick

Karma: 514 · Cake day: April 8, 2019
About
just a circadian-challenged person opposing global early bird conspiracy