> four key cognitive behaviors -- verification, backtracking, subgoal setting, and backward chaining -- that both expert human problem solvers and successful language models employ.
As we make AI better, perhaps we'll inadvertently find ways to make HI (human intelligence) better too.
I had a personal experience with this when I was studying for an exam recently. As I read over practice questions, I spoke aloud, replicating the reasoning methods/personality of DeepSeek R1. By spending a lot of time reading long, verbose R1 outputs, I've essentially fine-tuned my brain for reasoning tasks. I believe this method contributed to my excellent score on that exam.
This is a well-known approach: verbalizing your thought process (either by speaking aloud or by writing) has long been established as a good tactic for making sure you're actually thinking something through rather than glossing over it. Ironically, I've seen people bemoaning that use of AI will rob people of that.
I agree that there's potential here, though, and do genuinely hope that we find ways to make human intelligence better as we go about AI research. Even pessimistically, I think we'll at least surface approaches that people use without thinking about them, which is on its own a good thing, because once you know you're doing something, it becomes a lot easier to train yourself to do it better.
> Ironically, I've seen people bemoaning that use of AI will rob people of that.
There's that quote from Socrates, recorded by Plato:
> For this invention will produce forgetfulness in the minds of those who learn to use it, because they will not practice their memory. Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them.
I use this method too for programming problems I would normally procrastinate on and offload to subconscious thinking.
Actually writing out all my thinking steps helps with ironing out wrong steps in my reasoning and keeps me from going in circles due to limited working memory.
I started doing this more rigorously after seeing how reasoning-based AI models reason, because it seemed like a useful thinking technique.
These reasoning AI models help me think on a meta level about my own thinking and show me tools I can use to improve it.
For me, I think this approach works because I can commit the current thoughts to some type of external (to my brain) storage, freeing up space to think about how to further subdivide those tasks.
In general, this is very helpful for when your executive function feels taxed, as it has the effect of coaching yourself.
Thinking out loud is an age-old practice and is essentially "rubber ducking" with yourself.
As someone who comes from a long ancestral line of people who talk to themselves while reasoning through problems, I found it would occasionally prove a minor handicap during proctored exams, as internal monologue isn't really the same thing.
From what I have seen of split-brain experiments, I believe that by vocalizing our thoughts we engage both hemispheres of the brain more fully, through the auditory pathway in addition to the corpus callosum.
Those four parts sound like one unified cognitive algorithm -- building an ontology of the problem by breaking it into subgoals; checking your work properly; thinking backwards to debug a mistake and retrying; and thinking ahead and reasoning backward from the end result. It's all just one algorithm for solving hard problems. A skill that can be practiced, and that then builds on itself.
Considering the recent advancements in reasoning models, I’d say your method is a bit inefficient ;)
It’s equivalent to an LLM reasoning in its output rather than in the latent space before the final output, which is what gave rise to the reasoning models we see today. So speaking out loud might not be the best reasoning method ;)
> As I read over practice questions, I spoke aloud
This is also something that’s expected of the applicant in technical interviews
The interviewers want to hear the applicant's thought process and how they develop a strategy to solve the problems presented to them as they work them out.
Neural networks are interpolators; they are notoriously bad at extrapolation. For a simple example, look at the pendigits data set [1]. The test part of pendigits is taken from different "writers" than the train part, and neural networks aren't that good at it.

[1] https://archive.ics.uci.edu/dataset/81/pen+based+recognition...
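If anyone wants to try it, here's a minimal sketch of the setup I mean, assuming the raw UCI files (pendigits.tra / pendigits.tes, which come from disjoint groups of writers, 16 integer features per row with the digit label in the last column); the exact net and preprocessing are just illustrative:

```python
# Minimal sketch: fit a small neural net on digits drawn by one group of
# writers and evaluate on digits drawn by a held-out group of writers.
# File names, column layout, and scaling are assumptions about the UCI files.
import numpy as np
from sklearn.neural_network import MLPClassifier

train = np.loadtxt("pendigits.tra", delimiter=",")
test = np.loadtxt("pendigits.tes", delimiter=",")

X_tr, y_tr = train[:, :-1] / 100.0, train[:, -1].astype(int)
X_te, y_te = test[:, :-1] / 100.0, test[:, -1].astype(int)

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)

print("accuracy on seen writers:  ", clf.score(X_tr, y_tr))
print("accuracy on unseen writers:", clf.score(X_te, y_te))
```

The gap between those two numbers is the interpolation-vs-extrapolation gap I'm pointing at: the net tends to do noticeably worse on writers it has never seen.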
I think you're missing a key point, which is that by extensively reading the R1 outputs, I'm able to observe how R1 thinks about things, which I can then replicate.
There are good ways and bad ways to think aloud; R1 just gave me a large set of examples of doing it the "good" way.
It's uncommon to read hundreds of paragraphs of a smart person's internal reasoning process. Usually we're only able to read the final results of their thoughts.
We've long known how to eat better and avoid extreme outcomes like obesity, and look at how effective that knowledge has been. Until you have a pill that makes you think better, only the motivated will do it, and in this case the motivated could already do it.
As someone with an educational background, I actually often ask myself the opposite: why do AI techniques almost never seem to use the knowledge we have about human learning to train better AI?
I sometimes see these reddit threads of people talking about the experience of having an internal monologue. I have no such monologue, at least not one that is accessible to the part of my mind that calls itself 'me', but I have often wondered if that monologue is something like a 'chain of thought'. I feel like without access to that 'idea feed', my planning and executive functioning may be less effective than some other people's. I do find myself quite a bit more effective at those sorts of tasks when I use a little 'chain of thought' notepad.
I also suspect I spend less time ruminating and second-guessing myself and other anxious behaviours that I imagine would come with having someone talking in your ear all day, but that's probably off topic.
You never form thoughts in your mind in a linguistic way? Can you read a sentence and be aware of it as a sentence in your mind, or are you unable to do that?
I don't doubt you or anything like that, just very curious. As someone with a very strong internal monologue, it's hard for me to imagine not having one.
No. I don't think in 'first person' words at all. I might consciously compose a phrase if I'm doing something like writing a poem, which is more akin to arranging a puzzle, or I might recall the words of a conversation someone had with me, and I do think of song lyrics if I have a song in my head. But there's no voice in my head, and it's absolutely baffling to me to imagine otherwise, as I imagine it is for other people to imagine my situation.
When I read a sentence in a book I don't hear any kind of narration or anything, but I do assemble a 'scene' of images, sounds, facial expressions, motions, etc. not like a movie, but more like a series of small related ideas if that makes sense?
I find that I understand dialogue and characters in books much better when I listen to an audiobook than when I read, not sure if that's related or not.
I am a relatively intelligent successful professional, but I wonder sometimes if I am missing some processing hardware other people have access to.
For me it has linguistic components for sure, but they run in parallel and are a lot less 'linear'.
Where inner language most certainly comes into play is in the 'output' phase, be it spoken or written, as serialization is required there. But to be honest, that often feels like a projection or even a reconstruction, with an inherent sense of loss, as the inner version is so much richer and more nuanced.
That is not to say linearization has no merits. Even if it loses so much, it forces consistency and rigor onto the lower-dimensional reasoning.
Genuine question: how does multi-step reasoning work for you, then?
Like, e.g., if you have some math problem whose individual steps are trivial to solve but that needs multiple steps, let's say 16 * 3 + 5?
How does 16 * 3 = 48 land in some 'register' of your brain (short-term memory), so that you can then add 5 to get to 53? Maybe 16 * 3 + 5 is too easy for you and you'll just 'see' it, but the question still stands; just choose a more complex problem.
Isn't the same meta process at play when thinking about more fuzzy topics?
Not that poster, but for me it's directly manipulating numbers (for example, "16×3 + 5" turns into "10×3 + 6×3 + 5" into "30 + 18 + 5" into "30 + 10 + 8 + 5" into "40 + 13" into "53"). There's no language involved, though in some cases I might use some spatial reasoning by doing something like associating given chunks of an equation with different fingers.
It's also very probable that the verbalization the majority does internally is just that: a verbalization of the actual underlying thought process. That is what much of current cognitive linguistics points to, as far as I understand.
(Also a reason why I'm very sceptical that the current LLM approach will eventually lead to AGI, BTW)
I believe I have an internal narrator but I’m not certain exactly what others mean by that so I don’t know for sure.
However, the way I think about math is different from the way I plan my day or other things. In my case, it is very much like I have registers that hold the result of 16 x 3 so I can add the 5 to it later. I have a certain number of registers, and with effort, like repeating what I've already solved, I can temporarily create more.
It also feels somewhat physical, as if the register is an actual box or has a “location” or like I’ve put the answer down on the desk like a part of something I’m building. Perhaps not coincidentally I am one of the many people who have a “calendar shape” for the months.
I do have an internal monologue. I can also think in pictures and I can also think in terms of neither, just pure raw thought.
I would say most people are like me. They have 3 modes of thinking and they probably have a primary mode which they favor. I favor none and go into all 3 depending on whether I’m reading, writing or doing something else.
The second, bigger group has only one primary mode of thinking: the internal monologue. They can only think in terms of an inner voice, and this inner voice is so powerful that I have often encountered people who think this inner voice is the definition of thought. They assumed thinking was CoT.
In the even rarer cases you get people who assign colors to numbers, or people who can't even conceive of thinking in pictures. You're the first person I've encountered who doesn't have an internal monologue at all.
I think you'd be surprised. I never knew that internal monologues were a thing until there was an HN thread about it, where a lot of people had them and a lot of people didn't.
I always thought it was something that we did in TV shows or books to give you a sense of what a character was feeling, I didn't know this was an actual literal experience people had.
I can certainly have an internal monologue, in the way that you could put on a puppet show. I can consciously think to myself, 'Self, this is self. Clean your car out.' I can form the feeling of those words in my head. But there's nobody 'saying' them, if that makes sense. I'm playing back a design of my conscious self.
There's a fascinating thing called aphantasia where people can't picture things at all in their mind, but such people are able to lead normal lives and may never realize there's something different. This feels like a similar concept, but for imagining speech.
True, but a problem is that self-improving AI leads to a somewhat troubling mode of thinking. AIs switch to an internal babbling-type language that makes no sense but clearly still conveys meaning to the AIs, then think in that language (if it's a language, though I'm not sure what else it could be) and then produce correct results.
Worse, when you use multiple agents to get LLMs talking to one another, all the agents switch to this internal language, and they make progress despite no human understanding what the hell is happening. This seems very bad.
Illustration:
> How many r in strawberry?
I'm asked how many r in strawberry. I can just spell the word and a;dklsjaw;
a;ewjraqwpeouypaads;lq
qepwiouryaqeopw
qewrpoiuyoiauysdqw145124rfa.nkjlwh
;45a8345a894ya4a
q4p58q45jaq;lkjas;dlfkja;j
<answer>There are 3 (three) r's in strawberry</answer>
I’ve heard this described as talking in “Neuralese”. It seems plausible that this will be the most dense language for model-internal dialog (or presumably inter-LLM dialog assuming they share the same weights).
You will penalize this inasmuch as your alignment strategy depends on Deliberative Alignment. But at some point I assume that will come with a real capability cost as Neuralese can be more conceptually dense.
They are not going to invent a new language by themselves; by definition they can't even "think" in terms of languages they haven't seen. It does not occur to them that the language they use may be suboptimal. And surely, any better ways of thinking can still be described in English. It seems more likely there will be a gradual transition from us teaching LLMs how to reason, to LLMs being able to actually gobble and process enough data to learn more effective ways to reason, which they can then "teach" us. But that's just the LLM reflecting the way it was trained and aligned.
> four key cognitive behaviors -- verification, backtracking, subgoal setting, and backward chaining -- that both expert human problem solvers and successful language models employ.
On what basis have they claimed that such methods are used by expert human problem solvers?
I'm actually interested in whether there is some larger explanation of these study methods. Maybe there's something worth adopting myself to get more efficient at learning.
In my experience models aren't very good at following such prompts. Smart "non-reasoning" models like Claude 3.5 could, but would generate so much text while thinking that they ran out of context.
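For reference, by "such prompts" I mean roughly this kind of instruction; this is only a sketch (the system text is my own wording, not from the paper, and the model name is a placeholder):

```python
# Rough sketch: asking a non-reasoning chat model to use the four behaviors
# (subgoal setting, verification, backtracking, backward chaining) explicitly.
# The prompt wording and the model name are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

system = (
    "Think out loud before answering. Explicitly: "
    "(1) break the problem into subgoals, "
    "(2) verify each intermediate result, "
    "(3) backtrack and retry whenever a check fails, "
    "(4) if stuck, reason backward from the desired end state. "
    "Only then state the final answer."
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any chat-capable model
    messages=[
        {"role": "system", "content": system},
        {"role": "user", "content": "If 3x + 7 = 25, what is x^2 - 1?"},
    ],
)
print(resp.choices[0].message.content)
```

The failure mode I described above shows up exactly here: models that take the instruction seriously produce so much checking and re-checking that the context runs out before the final answer.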
> I started doing this more rigorously after seeing how reasoning-based AI models reason

Great to see that I’m not alone in this!
> As someone who comes from a long ancestral line of people who talk to themselves while reasoning through problems
Girlfriend, coming in from outside: "Who are you talking to?"
Me: "I talk to myself. You know that."
Gf: "Oh right. You also whisper to yourself, which is scary."
Me: "Scary?"
Gf: "It sounds demonic."
Which, to be fair... Evidently, my internal monologuing gets quite vocal even with other people around.
Research is extrapolation.
Humans do extrapolation all the time.
You are not “emulating R1”, you are talking to yourself to make sure you understand the concept.
Which is fine, but don’t act like AI is making this part of life better in any way with this example. Nonsense.
I'd say the motivated often reap the rewards of innovations more so than the average, as they were pushing the boundaries in the first place.
Having a dishwasher or a robot vacuum does not make me lazy. It allows me to do more productive things.
One of the parts most worth a replication study.
https://en.m.wikipedia.org/wiki/Aphantasia
That said, most of my thinking is not done in the form of a linear monologue where I "talk through" steps to myself.