doesn't seem paywalled, but just in case: https://archive.is/BwtDz (scroll down).
if this crunch really does materialize, I wonder what parts of the ecosystem (in the US) will benefit.
Power seems to be one big structural advantage China has here. No way we will learn to build on their scale.
video of the trial (6 hours): https://youtu.be/1RBV9i4jaPo?si=oesH721IFLnmzEcW
I found this paper both really interesting and clear. No one part is very novel, but it composes disparate threads to obtain what looks like strong results in OOD length generalization. Even on a toy task, and using a DSL (vs. being an LM), length-generalizing on simple math by >4x is impressive, from what I've read.
This also fits my priors for the key elements of unlocking better OOD compositional generalization: variable recurrence, step-wise curriculum training to build depth-invariant algorithms, discrete bottlenecks. Finally, it's very interesting to compare this to the below recent article arguing for the benefits of continuous latent spaces: Reasoning by Superposition: A Theoretical Perspective on Chain of Continuous Thought (https://arxiv.org/abs/2505.12514)
My take is both papers are right, and that continuous spaces are more expressive and can handle tougher problem spaces (e.g. shortest graph path), whereas discrete spaces will provide a better inductive bias for elegant algorithms that can scale OOD. And I bet the two can be combined / balanced.
1) the LLMs mostly used factual information to influence people (vs., say, emotional or social influence), and 2) the facts were mostly accurate
I'm not saying we shouldn't worry. But I expected the results to be worse.
Overall, the interesting finding here is that political opinions can be changed by new information at all. I'm curious how this effect would compare to comparably informed human discussions. I would not be surprised if the LLMs were more effective, for at least two reasons:
1) Cost-efficiency, in terms of the knowledge required and the effort/skill needed to provide personalized arguments. 2) Reduction in the emotional barrier to changing your mind: people don't want to "lose" by being wrong about politics to someone else. But perhaps the machine doesn't trigger this social/tribal response.
Cited papers:
https://www.nature.com/articles/s41586-025-09771-9
https://www.science.org/doi/10.1126/science.aea3884