Readit News
j_crick commented on LLMs Report Subjective Experience Under Self-Referential Processing   arxiv.org/abs/2510.24797... · Posted by u/j_crick
j_crick · 3 months ago
"Four main results emerge:

(1) Inducing sustained self-reference through simple prompting consistently elicits structured subjective experience reports across model families.

(2) These reports are mechanistically gated by interpretable sparse-autoencoder features associated with deception and roleplay: surprisingly, suppressing deception features sharply increases the frequency of experience claims, while amplifying them minimizes such claims.

(3) Structured descriptions of the self-referential state converge statistically across model families in ways not observed in any control condition.

(4) The induced state yields significantly richer introspection in downstream reasoning tasks where self-reflection is only indirectly afforded."

X thread from one of the authors: https://x.com/juddrosenblatt/status/1984336872362139686

j_crick commented on OpenAI’s latest research paper demonstrates that falsehoods are inevitable   theconversation.com/why-o... · Posted by u/ricksunny
danjc · 5 months ago
This is written by someone who has no idea how transformers actually work
j_crick · 5 months ago
> The way language models respond to queries – by predicting one word at a time in a sentence, based on probabilities

Kinda tells you all you need to know about the author in this regard.

j_crick commented on Social media promised connection, but it has delivered exhaustion   noemamag.com/the-last-day... · Posted by u/pseudolus
j_crick · 5 months ago
More regulation and mandatory cool-downs for whatever is called “social media” because of AI slop and bot-girls? Sounds reasonable /s
j_crick commented on ChatGPT is NOT a LLM – GPT is   vincirufus.com/posts/chat... · Posted by u/vincirufus
j_crick · 5 months ago
You forgot to mention that this post was written by Opus
j_crick commented on Descent of Inanna into the Underworld   en.wikipedia.org/wiki/Des... · Posted by u/alganet
eggsby · 6 months ago
I learned about this reading Ishtar Rising by Robert Anton Wilson
j_crick · 6 months ago
fnord hehe
j_crick commented on How I use Claude Code to implement new features in an existing complex codebase   sabrina.dev/p/ultimate-ai... · Posted by u/plentysun
AndyNemmity · 6 months ago
Because I gave examples, and details, and thousands of people read them from a pastebin i used to share.

I didn't release it as open source or anything, just sharing. I don't want to take questions concerning it so I can focus on moving it forward.

Today's goal is to try to build self healing agents that automatically fix the problems they encounter so they only happen once, automating a manual process I successfully use.

Perhaps if that works out well, that's something releasable I can do in a real way, as opposed to a pastebin.

j_crick · 6 months ago
Sadly I was just late to the discussion and missed it. Would you mind sending what you shared to an email of mine? Not requesting further communication; just simply curious what people do with and around this.
j_crick commented on How I use Claude Code to implement new features in an existing complex codebase   sabrina.dev/p/ultimate-ai... · Posted by u/plentysun
AndyNemmity · 6 months ago
[flagged]
j_crick · 6 months ago
Why did you replace your comments in this thread with . ?
j_crick commented on Nine households control 15% of wealth in Silicon Valley as inequality widens   theguardian.com/us-news/2... · Posted by u/c420
j_crick · 7 months ago
Piketty was right? Who would’ve thought…
j_crick commented on Claude Code feels like magic because it is iterative   omarabid.com/claude-magic... · Posted by u/todsacerdoti
marliechiller · 8 months ago
Personally, I don't think so. I can understand a mathematical axiom and reason with it. In a sequence of numbers I will be able to tell you N + 1, regardless of where N appears in the sequence. An LLM does not "know" this in the way a human does. It just applies whatever is the most likely thing the training data suggests.
j_crick · 8 months ago
But technically you can do that only because you recognize the pattern: the pattern (sequence) is there, and you were taught that it's a pattern and how to recognize it. Publicly available LLMs of today are taught different patterns, and are also constrained by how they are made.

Maybe there’s something for LLMs in reflection and self-reference that has to be “taught” to them (or has to be not blocked from them if it’s already achieved somehow), and once it becomes a thing they will be “cognizant” in the way humans feel about their own cognition. Or maybe the technology, the way we wire LLMs now simply doesn’t allow that. Who knows.

Of course humans are wired differently, but the point I’m trying to make is that it’s pattern recognition all the way down both for humans and LLMs and whatnot.

u/j_crick

Karma: 514 · Cake day: April 8, 2019
About
just a circadian-challenged person opposing global early bird conspiracy