Yann LeCun raises $1B to build AI that understands the physical world

Justifiable.

There are a lot more degrees of freedom in world models.

LLMs are fundamentally capped because they only learn from static text -- human communications about the world -- rather than from the world itself, which is why they can remix existing ideas but find it all but impossible to produce genuinely novel discoveries or inventions. A well-funded and well-run startup building physical world models (grounded in spatiotemporal understanding, not just language patterns) would be attacking what I see as the actual bottleneck to AGI. Even if they succeed only partially, they may unlock the kind of generalization and creative spark that current LLMs structurally can't reach.

andy12_ · 2 days ago

I don't understand this view. The way I see it, the fundamental bottleneck to AGI is continual learning and backpropagation. Models today are static, and human brains don't learn or adapt themselves with anything close to backpropagation. World models don't solve any of these problems; they are fundamentally the same kind of deep learning architectures we are used to working with. Heck, if you think learning from the world itself is the bottleneck, you can just put a vision-action LLM on a reinforcement learning loop in a robotic/simulated body.

zelphirkalt · 2 days ago

> I don't understand this view. The way I see it, the fundamental bottleneck to AGI is continual learning and backpropagation. Models today are static, and human brains don't learn or adapt themselves with anything close to backpropagation.

Even with continuous backpropagation and "learning" that keeps enriching the training data (so-called online learning), the limitations will not disappear. The LLMs will not be able to conclude things about the world based on fact and deduction. They only consider what is likely from their training data. They will not foresee/anticipate events that are unlikely or non-existent in their training data, but are bound to happen due to real world circumstances. They are not intelligent in that way.

Whether humans always apply that much effort to conclude these things is another question. The point is that humans fundamentally are capable of doing that, while LLMs are structurally not.

The problems are structural/architectural. I think it will take another 2-3 major leaps in architectures before these AI models reach human level general intelligence, if they ever reach it. So far they can "merely" often "fake it" when things are statistically common in their training data.

jacquesm · 2 days ago

The main difference is that humans learn all the time, while models learn batch-wise and forget whatever happened in a previous session unless someone makes it part of the training data, so there is a massive lag.

Whoever cracks the continuous customized (per user, for instance) learning problem without just extending the context window is going to be making a big splash. And I don't mean cheats and shortcuts, I mean actually tuning the model based on received feedback.

ben_w · 2 days ago

> Models today are static, and human brains don't learn or adapt themselves with anything close to backpropagation.

While I suspect the latter is a real problem (because all mammal brains* are much more example-efficient than all ML), the former is more about productisation than a fundamental thing: the models can be continuously updated already, but that makes it hard to deal with regressions. You kinda want an artefact with a version stamp that doesn't change itself before you release the update, especially as this isn't like normal software where specific features can be toggled on or off in isolation of everything else.

* I think. Also, I'm saying "mammal" because of an absence of evidence (to my *totally amateur* skill level) not evidence of absence.

A_D_E_P_T · 2 days ago

You could have continual learning on text and still be stuck in the same "remixing baseline human communications" trap. It's a nasty one, very hard to avoid, possibly even structurally unavoidable.

As for the "just put a vision LLM in a robot body" suggestion: People are trying this (e.g. Physical Intelligence) and it looks like it's extraordinarily hard! The results so far suggest that bolting perception and embodiment onto a language-model core doesn't produce any kind of causal understanding. The architecture behind the integration of sensory streams, persistent object representations, and modeling time and causality is critically important... and that's where world models come in.

10xDev · 2 days ago

The fact that models aren't continually updating seems more like a feature. I want to know the model is exactly the same as it was the last time I used it. Any new information it needs can be stored in its context window or stored in a file to read the next time it needs to access it.

the_black_hand · a day ago

Yes, those are bottlenecks that world models don't solve. But the promise of world models is that, unlike LLMs, they might be able to learn things about the world that humans haven't written. For example, we still don't fully know how insects fly. A world model could be trained on thousands of videos of insects and make a novel observation about insect trajectories. The premise is that despite being here for millennia, humans have only observed a tiny fraction of the world.

So I do buy his idea. But I disagree that you need world models to get to human level capabilities. IMO there's no fundamental reason why models can't develop human understanding based on the known human observations.

stanfordkid · 2 days ago

It's pretty simple... the word circle and what you can correlate to it via English language description has somewhat less to do with reality than a physical 3D model of a circle and what it would do in an environment. You can't just add more linguistic description via training data to change that. It doesn't really matter that you can keep backpropagating, because what you are backpropagating over is fundamentally and qualitatively less rich.

carlmr · a day ago

In particular, they will require even more compute to get anything close to usable output. Human brains are super efficient at learning and producing output. We will need exponentially more compute for real time learning from video + audio + haptic data.

a1371 · a day ago

I never understood why we believe humans don't backprop. Isn't it that during the day we fill up our context (short term memory) and sleep is actually where we use that to backprop? Heck, everyone knows what "sleep on it" means.

eloisant · a day ago

LeCun is a researcher.

From his point of view, there is not much research left to do on LLMs. Sure, we can still improve them a bit with engineering around them, but he's more interested in basic research.

energy123 · 2 days ago

I don't understand why online learning is that necessary. If you took Einstein at 40 and surgically removed his hippocampus so he can't learn anything he didn't already know (meaning no online learning), that's still a very useful AGI. A hippocampus is a nice upgrade to that, but not super obviously on the critical path.

slashdave · a day ago

If your model is poor, no amount of learning can fix it. If you don't think your model architecture is limited, you aren't looking hard enough.

anon7000 · a day ago

I don’t understand your view. Reality is that we need some way to encode the rules of the world in a more definitive way. If we want models to be able to make assertive claims about important information and be correct, it’s very fair to theorize they might need a more deterministic approach than just training them more. But it’s just a theory that this will actually solve the problem.

Ultimately, we still have a lot to learn and a lot of experiments to do. It’s frankly unscientific to suggest any approaches are off the table, unless the data & research truly proves that. Why shouldn’t we take this awesome LLM technology and bring in more techniques to make it better?

A really, really basic example is chess. Current top AI models still don’t know how to play it (https://www.software7.com/blog/ai_chess_vs_1983_atari/). The models are surely trained on source material that includes chess rules, and even high-level chess games. But the models are not learning how to play chess correctly. They don’t have a model to understand how chess actually works; they only have a non-deterministic prediction based on what they’ve seen, even after being trained on more data than any chess novice has ever seen about the topic. And this is probably one of the easiest things for AI to simulate: very clear/brief rules, small problem space, no hidden information. But it can’t handle the massive decision space because its prediction isn’t based on the actual rules, but just “things that look similar”.

(And yeah, I’m sure someone could build a specific LLM or agent system that can handle chess, but the point is that the powerful general purpose models can’t do it out of the box after training.)

Maybe more training & self-learning can solve this, but it’s clearly still unsolved. So we should definitely be experimenting with more techniques.

edgyquant · 2 days ago

Iirc LeCun talks about a self organizing hierarchy of real world objects and imo this is exactly how the human brain actually learns

nurettin · 2 days ago

Who knows? Perhaps attention really is all you need. Maybe our context window is really large. Or our compression is really effective. Perhaps adding external factors might be able to indirectly teach the models to act more in line with social expectations such as being embarrassed to repeat the same mistake, unlocking the final piece of the puzzle. We are still stumbling in the dark for answers.

mxkopy · 2 days ago

The reason LLMs fail today is because there’s no meaning inherent to the tokens they produce other than the one captured by cooccurrence within text. Efforts like these are necessary because so much of “general intelligence” is convention defined by embodied human experience, for example arrows implying directionality and even directionality itself.

charcircuit · 2 days ago

Agents have the ability of continual learning.

jnd-cz · 2 days ago

The sum of human knowledge is more than enough to come up with innovative ideas and not every field is working directly with the physical world. Still I would say there's enough information in the written history to create virtual simulation of 3d world with all physical laws applying (to a certain degree because computation is limited).

What current LLMs lack is inner motivation to create something on their own without being prompted. To think in their free time (whatever that means for batch, on demand processing), to reflect and learn, eventually to self modify.

I have a simple brain, limited knowledge, limited attention span, limited context memory. Yet I create stuff based on what I see and read online. Nothing special, sometimes more based on someone else's project, sometimes on my own ideas which I have no doubt aren't that unique among 8 billion other people. Yet consulting with AI provides me with more ideas applicable to my current vision of what I want to achieve. Sure it's mostly based on generally known (not always known to me) good practices. But my thoughts are the same way, only more limited by what I have slowly learned so far in my life.

jandrewrogers · 2 days ago

> virtual simulation of 3d world

Virtual simulations are not substitutable for the physical world. They are fundamentally different theory problems that have almost no overlap in applicability. You could in principle create a simulation with the same mathematical properties as the physical world but no one has ever done that. I'm not sure if we even know how.

Physical world dynamics are metastable and non-linear at every resolution. The models we do build are created from sparse irregular samples with large error rates; you often have to do complex inference to know if a piece of data even represents something real. All of this largely breaks the assumptions of our tidy sampling theorems in mathematics. The problem of physical world inference has been studied for a couple decades in the defense and mapping industries; we already have a pretty good understanding of why LLM-style AI is uniquely bad at inference in this domain, and it mostly comes down to the architectural inability to represent it.

Grounded estimates put the minimum quantity of training data required to build a reliable model of physical world dynamics, given the above properties, at many exabytes. This data exists, so that is not a problem. The models will be orders of magnitude larger than current LLMs. Even if you solve the computer science and theory problems around representation so that learning and inference are efficient, few people are prepared for the scale of it.

(source: many years doing frontier R&D on these problems)

daxfohl · 2 days ago

I guess you need two things to make that happen. First, more specialization among models and an ability to evolve, else you get all instances thinking roughly the same thing, or deer in the headlights where they don't know what of the millions of options they should think about. Second, fewer guardrails; there's only so much you can do by pure thought.

The problem is, idk if we're ready to have millions of distinct, evolving, self-executing models running wild without guardrails. It seems like a contradiction: you can't achieve true cognition from a machine while artificially restricting its boundaries, and you can't lift the boundaries without impacting safety.

slibhb · a day ago

> LLMs are fundamentally capped because they only learn from static text -- human communications about the world -- rather than from the world itself, which is why they can remix existing ideas but find it all but impossible to produce genuinely novel discoveries or inventions.

This seems wrong to me on a few levels.

First, there is no way to "experience the world directly," all experience is indirect, and language is a very good way of describing the world. If language was a bad choice or limited in some fundamental way, LLMs wouldn't work as well as they do.

Second, novel ideas are often existing ideas remixed. It's hard/impossible to point to any single idea that sprung from nowhere.

Third, you can provide an LLM with real-world information and suddenly it's "interacting with the world". If I tell an LLM about the US war on Iran, I am in a very real sense plugging it into the real world, something that isn't part of its training data.

Finally, modern LLMs are multi-modal, meaning they have the ability to handle images/video. My understanding is that they use some kind of adapter to turn non-text data into data that the LLM can make sense of.

A_D_E_P_T · a day ago

Re 1: You experience the world in real time (or close enough) via your senses, which combine to form a spatiotemporal sense: A sense of being a bounded entity in space and time. The LLM has none of that. They experience the world via stale old text and text derivatives.

Re 2: There's something tremendous in the fact, staring us right in the face, that LLMs are unable to meaningfully contribute to academic/medical research. I'm not saying that they need to perform on the level of a one-in-a-million Maxwell, DaVinci, or whatever. But as Dwarkesh asked one year ago: "What do you make of the fact that these things have basically the entire corpus of human knowledge memorized and they haven't been able to make a single new connection that has led to a discovery?"

Re 3: Sure, you can hold it by the hand and spoonfeed it. You can also create for it a mirror reality which doesn't exist, which is pure fiction. Given how limited these systems are, I don't suppose it makes much of a difference. There's no way for it to tell. The "human in the loop" is its interaction with the world. And a pale, meager interaction it is.

Re 4: Static, old images/video that they were trained on some months ago. That, too, is no way of interacting with the world.

ljm · 2 days ago

I'm gonna be a cynic and say this is money following money and Yann LeCun is an excellent salesman.

I 100% guarantee that he will not be holding the bag when this fails. Society will be protecting him.

On that proviso I have zero respect for this guy.

thinkling · 2 days ago

Um, why would anyone be "holding the bag" and who needs protecting by society? He's not taking out a loan, he's getting capital investment in a startup. People are gambling that he will do well and make money for them. If they gamble wrong, that's on them. Society won't be doing anything either way because investors in startups that fail don't get anything.

roromainmain · 2 days ago

Agree. LLMs operate in the domain of language and symbols, but the universe contains much more than that. Humans also learn a great deal from direct phenomenological experience of the world, even without putting those experiences into words. I remember a talk by Yann LeCun where he pointed out that in just the first couple of years of life, a human baby is exposed to orders of magnitude more sensory data (vision, sound, etc.) than what current LLMs are typically trained on. This seems like a major limitation of purely language-based models.

Unearned5161 · 2 days ago

I have a pet peeve with the concept of "a genuinely novel discovery or invention", what do you imagine this to be? Can you point me towards a discovery or invention that was "genuinely novel", ever?

I don't think it makes sense conceptually unless you're literally referring to discovering new physical things like elements or something.

Humans are remixers of ideas. That's all we do all the time. Our thoughts and actions are dictated by our environment and memories; everything must necessarily be built up from pre-existing parts.

davidfarrell · 2 days ago

W. Brian Arthur's book "The Nature of Technology" provides a framework for classifying new technology as elemental vs innovative that I find helpful. For example the Hunt-McIlroy diff operates on the phenomenon that ordered correspondence survives editing. That was an invention (discovery of a natural phenomenon and a means to harness it). Myers diff improves the performance by exploiting the fact that text changes are sparse. That's innovation. A Python app using libdiff, that's engineering. And then you might say in terms of "descendants": invention > innovation > engineering. But it's just a perspective.
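To make the "ordered correspondence survives editing" idea concrete, here is a minimal sketch of a line diff built on the longest common subsequence (LCS). It is a plain textbook dynamic-programming version, not the actual Hunt-McIlroy or Myers algorithm, and the names `lcs_table` and `line_diff` are made up for illustration.

```python
def lcs_table(a, b):
    """table[i][j] = length of the longest common subsequence of a[i:] and b[j:]."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table


def line_diff(a, b):
    """Return (tag, line) pairs: ' ' for kept, '-' for removed, '+' for added."""
    table = lcs_table(a, b)
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Lines that match in order are the correspondence that survived.
            out.append((" ", a[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            out.append(("-", a[i]))
            i += 1
        else:
            out.append(("+", b[j]))
            j += 1
    out.extend(("-", line) for line in a[i:])
    out.extend(("+", line) for line in b[j:])
    return out


if __name__ == "__main__":
    for tag, line in line_diff(["a", "b", "c", "d"], ["a", "c", "d", "e"]):
        print(tag, line)
```

The table costs O(len(a) * len(b)) time and memory, which is roughly the cost that the later "innovation" step is described as attacking by exploiting sparse changes.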

0x3f · 2 days ago

Novel things can be incremental. I don't think LLMs can do that either, at least I've never seen one do it.

A_D_E_P_T · 2 days ago

Suno is transformer-based; in a way it's a heavily modified LLM.

You can't get Suno to do anything that's not in its training data. It is physically incapable of inventing a new musical genre. No matter how detailed the instructions you give it, and even if you cheat and provide it with actual MP3 examples of what you want it to create, it is impossible.

The same goes for LLMs and invention generally, which is why they've made no important scientific discoveries.

You can learn a lot by playing with Suno.

bonesss · 2 days ago

Genuinely novel discovery or invention?

Einstein’s theory of relativity springs to mind, which is deeply counter-intuitive and relies on the interaction of forces unknowable to our basic Newtonian senses.

There’s an argument that it’s all turtles (someone told him about universes, he read about gravity, etc), but there are novel maths and novel types of math that arise around and for such theories which would indicate an objective positive expansion of understanding and concept volume.

mirekrusin · a day ago

Thank you for not saying "language", but "text".

It's true, but it's also true that text is very expressive.

Programming languages (huge, formalized expressiveness), math and other formal notation, SQL, HTML, SVG, JSON/YAML, CSV, domain specific encoding ie. for DNA/protein sequences, for music, verilog/VHDL for hardware, DOT/Graphviz/Mermaid, OBJ for 3D, Terraform/Nix, Dockerfiles, git diffs/patches, URLs etc etc.

The scope is very wide and covers enough to be called generic especially if you include multi modalities that are already being blended in (images, videos, sound).

I'm cheering for Yann, hope he's right and I really like his approach to openness (hope he'll carry it over to his new company).

At the same time current architectures do exist now and do work, by far exceeding his or anybody else's expectations, and continue doing so. It may also be true they're here to stay for a long time on text and other supported modalities, as they are cheaper to train.

vidarh · a day ago

It's just not true that LLMs are limited to "static text". Data is data. Sensory input is still just data, and multimodal models have been a thing for a while. Ongoing learning and more extensive short term memory is a challenge, and so I am all for research in alternative architectures, but so much of the discourse about the limitations of LLMs acts as if they have limitations they do not have.

masteranza · 2 days ago

A few years ago I came up with this simple thought experiment to convince myself that LLMs won't achieve superhuman level (in the sense of being better than all human experts):

Imagine that we made an LLM out of all dolphin songs ever recorded, would such LLM ever reach human level intelligence? Obviously and intuitively the answer is NO.

Your comment actually extended this observation for me, sparking hope that systems consuming the natural world as input might avoid this trap. But then I realized that tool use & learning may in fact be all that's needed for the singularity, while consuming raw data streams most of the time might actually be counterproductive.

kadushka · 2 days ago

> Imagine that we made an LLM out of all dolphin songs ever recorded, would such LLM ever reach human level intelligence?

It could potentially reach super-dolphin level intelligence

hodgehog11 · 2 days ago

I mean no offense here, but I really don't like this attitude of "I thought for a bit and came up with something that debunks all of the experts!". It's the same stuff you see with climate denialism, but it seems to be considered okay when it comes to AI. As if the people that spend all day every day for decades have not thought of this.

Dataset limitations have been well understood since the dawn of statistics-based AI, which is why these models are trained on data and RL tasks that are as wide as possible, and are assessed by generalization performance. Most of the experts in ML, even the mathematically trained ones, within the last few years acknowledge that superintelligence (under a more rigorous definition than the one here) is quite possible, even with only the current architectures. This is true even though no senior researcher in the field really wants superintelligence to be possible, hence the dozens of efforts to disprove its potential existence.

smokel · a day ago

> Imagine that we made an LLM out of all dolphin songs ever recorded, would such LLM ever reach human level intelligence? Obviously and intuitively the answer is NO.

Not so fast. People have built pretty amazing thought frameworks out of a few axioms, a few bits, or a few operations in a Turing machine. Dolphin songs are probably more than enough to encode the game of life. It's just how you look at it that makes it intelligence.

mountainriver · 2 days ago

Okay but most modern LLMs are multimodal, and it’s fairly easy to make an LLM multimodal.

Also there is no evidence that novel discoveries are more than remixes. This is heavily debated but from what we’ve seen so far I’m not sure I would bet against remix.

World models are great for specific kinds of RL or MPC (model predictive control). Yann is betting heavily on MPC; I’m not sure I agree with this, as it’s currently computationally intractable at scale.
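For anyone unfamiliar with MPC, a minimal sketch of the idea: use a model to score candidate action sequences, execute only the first action of the best one, then replan. Everything here is a toy assumption for illustration (the one-dimensional `model`, the three-value `ACTIONS` set, the cost weights); it is not how any lab's planner works.

```python
from itertools import product

ACTIONS = (-1.0, 0.0, 1.0)  # hypothetical discrete action set
STEP_SIZE = 0.25            # hypothetical: distance moved per unit of action


def model(position, action):
    """Stand-in 'world model': predicts the next position for an action."""
    return position + STEP_SIZE * action


def plan_cost(position, plan, target):
    """Roll the model forward over a plan; sum squared distance plus effort."""
    cost = 0.0
    for action in plan:
        position = model(position, action)
        cost += (position - target) ** 2 + 0.01 * action * action
    return cost


def mpc_action(position, target, horizon=4):
    """Score every plan of length `horizon`; return the best plan's first action."""
    best_plan = min(
        product(ACTIONS, repeat=horizon),
        key=lambda plan: plan_cost(position, plan, target),
    )
    return best_plan[0]


def run(start=0.0, target=1.0, steps=8):
    """Closed loop: replan at every step. The 'real' system equals the model here."""
    position = start
    trajectory = [position]
    for _ in range(steps):
        position = model(position, mpc_action(position, target))
        trajectory.append(position)
    return trajectory


if __name__ == "__main__":
    print(run())
```

Exhaustive search scores `len(ACTIONS) ** horizon` plans per step (81 here), which hints at where the intractability comes from once actions are continuous, horizons are long, and each model call is a large network.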

rvz · 2 days ago

A lot more justifiable than say, Thinking Machines at least. But we will "see".

World models and vision seem like a great fit for robotics, which I can imagine being the main driver of AMI.

jimbo808 · 2 days ago

You're right that world models are the bottleneck, but people underestimate the staggering complexity gap between modeling the physical world and modeling a one-dimensional stream of text. Not only is the real world high-dimensional, continuous, noisy, and vastly more information dense, it's also not something for which there is an abundance of training data.

robrenaud · 2 days ago

Was AlphaGo's move 37 original?

In the last step of training LLMs, reinforcement learning from verifiable rewards, LLMs are trained to maximize the probability of solving problems using their own output, depending on a reward signal akin to winning in Go. It's not just imitating human written text.
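A toy sketch of that reward loop, under heavy assumptions: the "policy" is just a softmax over four hypothetical candidate answers, the verifier is a simple arithmetic check, and the update is plain REINFORCE. Real pipelines update a full language model over sampled token sequences; the names `CANDIDATES`, `verifier` and `train` are invented for this example.

```python
import math
import random

CANDIDATES = ["4", "5", "6", "23"]  # hypothetical answers the toy policy can emit


def verifier(question, answer):
    """Programmatic check: reward 1.0 only if the answer is arithmetically right."""
    a, b = question
    return 1.0 if answer == str(a + b) else 0.0


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train(question=(2, 3), steps=300, lr=0.5, seed=0):
    """Sample an answer, score it with the verifier, reinforce rewarded samples."""
    rng = random.Random(seed)
    logits = [0.0] * len(CANDIDATES)
    correct = CANDIDATES.index(str(sum(question)))
    history = [softmax(logits)[correct]]
    for _ in range(steps):
        probs = softmax(logits)
        choice = rng.choices(range(len(CANDIDATES)), weights=probs)[0]
        reward = verifier(question, CANDIDATES[choice])
        # REINFORCE: d log p(choice) / d logit_k = 1[k == choice] - probs[k]
        for k in range(len(logits)):
            grad = (1.0 if k == choice else 0.0) - probs[k]
            logits[k] += lr * reward * grad
        history.append(softmax(logits)[correct])
    return history


if __name__ == "__main__":
    history = train()
    print(f"p(correct) before: {history[0]:.2f}, after: {history[-1]:.2f}")
```

Nothing in the loop shows the policy a human-written answer; the only training signal is whether its own sample passed the check, which is the distinction the comment is drawing.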

Fwiw, I agree that world models and some kind of learning from interacting with physical reality, rather than from massive amounts of digitized gym environments, is likely necessary for an AGI breakthrough.

energy123 · 2 days ago

Why can't LLMs (transformers trained on multimodal token sequences, potentially containing spatiotemporal information) be a world model?

ForHackernews · 2 days ago

https://medium.com/state-of-the-art-technology/world-models-...

> One major critique LeCun raises is that LLMs operate only in the realm of language, which is a simple, discrete space compared to the continuous, complex physical world we live in. LLMs can solve math problems or answer trivia because such tasks reduce to pattern completion on text, but they lack any meaningful grounding in physical reality. LeCun points out a striking paradox: we now have language models that can pass the bar exam, solve equations, and compute integrals, yet “where is our domestic robot? Where is a robot that’s as good as a cat in the physical world?” Even a house cat effortlessly navigates the 3D world and manipulates objects — abilities that current AI notably lacks. As LeCun observes, “We don’t think the tasks that a cat can accomplish are smart, but in fact, they are.”

LarsDu88 · 2 days ago

I really hate the world model terminology, but the actual low level gripe between LeCun and autoregressive LLMs as they stand now is the fact that the loss function needs to reconstruct the entirety of the input. Anything less than pixel perfect reconstruction on images is penalized. Token by token reconstruction also is biased towards that same level of granularity.

The density of information in the spatiotemporal world is very very great, and a technique is needed to compress that down effectively. JEPAs are a promising technique in that direction, but if you're not reconstructing text or images, it's a bit harder for humans to immediately grok whether the model is learning something effectively.

I think that very soon we will see JEPA based language models, but their key domain may very well be in robotics where machines really need to experience and reason about the physical world differently than a purely text based world.
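A toy illustration of the loss-function point, with loud caveats: the encoder and predictor below are hand-written stand-ins (a real JEPA learns both, plus extra machinery to avoid collapsed representations), and the 16-pixel one-dimensional "frames" are made up. It only shows why a loss computed in input space keeps paying for unpredictable detail, while a loss computed in representation space does not.

```python
import random

WIDTH = 16  # hypothetical frame width


def make_frame(obj_pos, rng):
    """1-D 'image': low-amplitude random texture plus one bright object pixel."""
    frame = [rng.uniform(0.0, 0.2) for _ in range(WIDTH)]
    frame[obj_pos] = 1.0
    return frame


def encode(frame):
    """Hand-written stand-in for a learned encoder: keep only the object position."""
    return float(frame.index(max(frame)))


def predict_latent(z):
    """Stand-in predictor in latent space: the object moves one pixel right."""
    return z + 1.0


def decode(z):
    """Best-guess frame from a latent: object pixel plus the mean texture value."""
    frame = [0.1] * WIDTH
    frame[int(z)] = 1.0
    return frame


def mse(xs, ys):
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def compare_losses(seed=0):
    rng = random.Random(seed)
    current, nxt = make_frame(3, rng), make_frame(4, rng)
    z_pred = predict_latent(encode(current))
    pixel_loss = mse(decode(z_pred), nxt)      # still penalised for random texture
    latent_loss = (z_pred - encode(nxt)) ** 2  # texture never enters the target
    return pixel_loss, latent_loss


if __name__ == "__main__":
    pixel_loss, latent_loss = compare_losses()
    print(f"pixel-space loss: {pixel_loss:.5f}, latent-space loss: {latent_loss:.5f}")
```

The prediction is as good as it can be in both cases (the object position is right), yet the pixel-space loss stays above zero because the texture is noise. It also shows the flip side mentioned above: a latent loss of zero is harder for a human to inspect than a reconstructed image.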

whiplash451 · 2 days ago

The term LLM is confusing your point because VLMs belong to the same bin according to Yann.

Using the term autoregressive models instead might help.

kadushka · 2 days ago

Diffusion models are not autoregressive but have the same limitations

10xDev · 2 days ago

Whether it is text or an image, it is just bits for a computer. A token can represent anything.

A_D_E_P_T · 2 days ago

Sure, but don't conflate the representation format with the structure of what's being represented.

Everything is bits to a computer, but text training data captures the flattened, after-the-fact residue of baseline human thought: Someone's written description of how something works. (At best!)

A world model would need to capture the underlying causal, spatial, and temporal structure of reality itself -- the thing itself, that which generates those descriptions.

You can tokenize an image just as easily as a sentence, sure, but a pile of images and text won't give you a relation between the system and the world. A world model, in theory, can. I mean, we ought to be sufficient proof of this, in a sense...

Bombthecat · 2 days ago

Can a token represent concentration, will?

bsenftner · 2 days ago

There will be no "unlocking of AGI" until we develop a new science capable of artificial comprehension. Comprehension is the cornucopia that produces everything we are, given raw stimulus an entire communicating Universe is generated with a plethora of highly advanced predator/prey characters in an infinitely complex dynamic, and human science and technology have no lead how to artificially make sense of that in a simultaneous unifying whole. That's comprehension.

chilmers · 2 days ago

Ironically, your comment is practically incomprehensible.

8bitsrule · a day ago

Gotta say, good luck with that effort. Lenat started Cyc 42 years ago, and after a while it seemed to disappear. 'Understanding' the 'physical world' is something that a few -may- start to approach intuitively after a decade or five of experience. (Einstein, Maxwell, et al.) But the idea of feeding a machine facts and equations ... and dependence on human observations ... seems unlikely to lead to 'mastering the physical world'. Let alone for $1 billion.

kypro · 2 days ago

No hate, but this is just your opinion.

The definition of "text" here is extremely broad – an SVG is text, but it's also an image format. It's not incomprehensible to imagine how an AI model trained on lots of SVG "text" might build internal models to help it "visualise" SVGs in the same way you might visualise objects in your mind when you read a description of them.

The human brain only has electrical signals for IO, yet we can learn and reason about the world just fine. I don't see why the same wouldn't be possible with textual IO.

daxfohl · 2 days ago

Yeah I don't even think you'd need to train it. You could probably just explain how SVG works (or just tell it to emit coordinates of lines it wants to draw), and tell it to draw a horse, and I have to imagine it would be able to do so, even if it had never been trained on images, svg, or even cartesian coordinates. I think there's enough world model in there that you could simply explain cartesian coordinates in the context, it'd figure out how those map to its understanding of a horse's composition, and output something roughly correct. It'd be an interesting experiment anyway.

But yeah, I can't imagine that LLMs don't already have a world model in there. They have to. The internet's corpus of text may not contain enough detail to allow a LLM to differentiate between similar-looking celebrities, but it's plenty of information to allow it to create a world model of how we perceive the world. And it's a vastly more information-dense means of doing so.

uoaei · 2 days ago

> There are a lot more degrees of freedom in world models.

Perhaps for the current implementations this is true. But the reason the current versions keep failing is that world dynamics have multiple orders of magnitude fewer degrees of freedom than the models that are tasked to learn them. We waste so much compute learning to approximate the constraints that are inherent in the world, and LeCun has been pressing the point the past few years that the models he intends to design will obviate the excess degrees of freedom to stabilize training (and constrain inference to physically plausible states).

If my assumption is true then expect Max Tegmark to be intimately involved in this new direction.

_s_a_m_ · 2 days ago

Really? As if everyone hadn't been telling him this for the last 10 years, especially Gary Marcus, whom he ridiculed on Twitter at every occasion; now, like a dog returning home, he silently switches to Gary's position. As if anyone was waiting for this: even 5 years ago this was old news, and Tenenbaum has been building world models for a long time. People in pop venture capital culture don't seem to know what is going on in research. Makes them easier to milk.

ml-anon · 2 days ago

Honestly, how do people who know so little have this much confidence to post here?

mvc · 2 days ago

You must be new here

A_D_E_P_T · 2 days ago

Care to explain what led to this reaction?

Regardless of your opinion of Yann or his views on auto regressive models being "sufficient" for what most would describe as AGI or ASI, this is probably a good thing for Europe. We need more well capitalized labs that aren't US or China centric and while I do like Mistral, they just haven't been keeping up on the frontier of model performance and seem like they've sort of pivoted into being integration specialists and consultants for EU corporations. That's fine and they've got to make money, but fully ceding the research front is not a good way to keep the EU competitive.

brandonb · 2 days ago

LeCun's technical approach with AMI will likely be based on JEPA, which is also a very different approach than most US-based or Chinese AI labs are taking.

If you're looking to learn about JEPA, LeCun's vision document "A Path Towards Autonomous Machine Intelligence" is long but sketches out a very comprehensive vision of AI research: https://openreview.net/pdf?id=BZ5a1r-kVsf

Training JEPA models is within reach, even for startups. For example, we're a 3-person startup who trained a health timeseries JEPA. There are JEPA models for computer vision and (even) for LLMs.

You don't need a $1B seed round to do interesting things here. We need more interesting, orthogonal ideas in AI. So I think it's good we're going to have a heavyweight lab in Europe alongside the US and China.

sanderjd · 2 days ago

Have you published anything about your health time series model? Sounds interesting!

mandeepj · 2 days ago

Appreciate your work! Healthcare is a regulated industry. Everything (Research, proposals, FDA submissions, Compliance docs, Accreditation Standards, etc.) is documented and follows a process, which means there's a lot of thesis. You can't sneak in anything unverified or unreliable. Why does healthcare need a JEPA/world model?

tomrod · 2 days ago

I've been working to understand the potential uses for JEPA. Outside of video, has anyone made a list of any type (geared towards dummies like me)?

Brajeshwar · 2 days ago

There seem to be other news articles mentioning that they are setting up in Singapore as their base. https://www.straitstimes.com/business/ai-godfather-raises-1-...

Signez · 2 days ago

Hm, Singapore looks more like "one of their bases"; they will have offices in Paris, Montréal, Singapore and New York (according to both this article and the interview Yann LeCun did this morning on France Inter, the most listened-to radio station in France).

Of course, each relevant newspaper in those areas highlights that it's coming to their place, but it really seems to be distributed.

fnands · 2 days ago

Probably just a satellite office.

Might be to be close to some of Yann's collaborators like Xavier Bresson at NUS

stingraycharles · 2 days ago

That's a Singaporean newspaper, though; not sure if it's objectively their main base, or just one of them.

throwpoaster · 2 days ago

"Show me the incentive and I will show you the outcome."

Almost certainly the IP will be held in Singapore for tax reasons.

RamblingCTO · 2 days ago

Which would be a good idea, as a European. I'd hate to see the investment go to waste on taxes that are spent on stupid shit anyway. Should go into R&D not fighting bureaucracy.

re-thc · 2 days ago

> they are setting up in Singapore as their base

Europe in general has been tightening up their rules / taxes / laws around startups / companies especially tech and remote.

It's been less friendly these days.

barrell · 2 days ago

While I’d love there to be a European frontier model, I do very much enjoy Mistral. For the price and speed it outperforms any other model for my use cases (language learning related formatting, non-code non-research).

vessenes · 2 days ago

Partner in a fund that wrote a small check into this — I have no private knowledge of the deal - while I agree that one’s opinion on auto regressive models doesn’t matter, I think the fact of whether or not the auto regressive models work matters a lot, and particularly so in LeCun’s case.

What’s different about investing in this than investing in say a young researcher’s startup, or Ilya’s superintelligence? In both those cases, if a model architecture isn’t working out, I believe they will pivot. In YL’s case, I’m not sure that is true.

In that light, this bet is a bet on YL’s current view of the world. If his view is accurate, this is very good for Europe. If inaccurate, then this is sort of a nothing-burger; company will likely exit for roughly the investment amount - that money would not have gone to smaller European startups anyway - it’s a wash.

FWIW, I don’t think the original complaint about auto-regression “errors exist, errors always multiply under sequential token choice, ergo errors are endemic and this architecture sucks” is intellectually that compelling. Here: “world model errors exist, world model errors will always multiply under sequential token choice, ergo world model errors are endemic and this architecture sucks.” See what I did there?
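For reference, the compounding-error argument being paraphrased is usually stated in a simplified form: if each generated token independently has some probability `e` of being an unrecoverable mistake, the chance that an `n`-token sequence is entirely mistake-free is `(1 - e) ** n`. A minimal sketch of that arithmetic follows; the independence assumption, the function name, and the sample numbers are illustrative assumptions, not measurements of any real model.

```python
def p_sequence_correct(per_token_error: float, n_tokens: int) -> float:
    """Probability that n_tokens are all correct, assuming independent errors."""
    if not 0.0 <= per_token_error <= 1.0:
        raise ValueError("per_token_error must be in [0, 1]")
    return (1.0 - per_token_error) ** n_tokens


# Illustrative values only: a 1% per-step error rate over growing sequence lengths.
for n in (10, 100, 1000):
    print(n, p_sequence_correct(0.01, n))
```

The formula does not care what the sequential steps are, which is the point of the substitution above: the same decay applies to any model that rolls predictions forward step by step, unless errors are correlated or can be corrected along the way.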

On the other hand, we have a lot of unused training tokens in videos, I’d like very much to talk to a model with excellent ‘world’ knowledge and frontier textual capabilities, and I hope this goes well. Either way, as you say, Europe needs a frontier model company and this could be it.

jsnell · 2 days ago

I don't think it's "regardless"; your opinion on LeCun being right should be highly correlated with your opinion on whether this is good for Europe.

If you think that LLMs are sufficient and RSI is imminent (<1 year), this is horrible for Europe. It is a distracting boondoggle exactly at the wrong time.

vidarh · 2 days ago

It's sufficient to think that there is a chance that they will not be, however, for there to be a non-zero value to fund other approaches.

And even if you think the chance is zero, unless you also think there is a zero chance they will be capable of pivoting quickly, it might still be beneficial.

I think his views are largely flawed, but chances are there will still be lots of useful science coming out of it as well. Even if current architectures can achieve AGI, it does not mean there can't also be better, cheaper, more effective ways of doing the same things, and so exploring the space more broadly can still be of significant value.

Tenoke · 2 days ago

I think LeCun has been so consistently wrong and boneheaded for basically all of the AI boom, that this is much, much more likely to be bad than good for Europe. Probably one of the worst people to give that much money to that can even raise it in the field.

andrepd · 2 days ago

It's been 6 months away for 5 years now. In that time we've seen relatively mild incremental changes, not any qualitative ones. It's probably not 6 months away.

Insanity · 2 days ago

Whenever I see claims about AGI being reachable through large language models, it reminds me of the miasma theory of disease. Many respectable medical professionals were convinced this was true, and they viewed the entire world through this lens. They interpreted data in ways that aligned with a miasmatic view.

Of course now we know this was delusional and it seems almost funny in retrospect. I feel the same way when I hear that 'just scale language models' suddenly created something that's true AGI, indistinguishable from human intelligence.

dheera · 2 days ago

Just because you raise 1 billion dollars to do X doesn't mean you can't pivot and do Y if it is in the best interest of your mission.

I won't comment on Yann LeCun or his current technical strategy, but if you can avoid sunk cost fallacy and pivot nimbly I don't think it is bad for Europe at all. It is "1 billion dollars for an AI research lab", not "1 billion dollars to do X".

next_xibalba · 2 days ago

> RSI

Wait, we have another acronym to track. Is this the same/different than AGI and/or ASI?

crystal_revenge · 2 days ago

> fully ceding the research front is not a good way to keep the EU competitive

Tech is ultimately a red herring as far as what's needed to keep the EU competitive. The EU has a trillion dollar hole[0] to fill if it wants to replace US military presence, and it currently net-imports over 50% of its energy. Unfortunately the current situation in Iran is not helping either of these, as it constrains energy further and risks requiring military intervention.

0. https://www.wsj.com/world/europe/europes-1-trillion-race-to-...

AngryData · 2 days ago

Hard disagree, military might isn't going to secure anybody into the future, modern society and our economies will only get more vulnerable as time goes on and large wars or engagements will just push economies closer to collapse. And without a solid modern economy to back up the military, modern military will fall apart.

gandalfstoe · 2 days ago

Right, they really need a military industrial complex to be "competitive" :eyeroll. Are you suggesting regressing to the stone age?

chrisgd · 2 days ago

33% of the business in a seed round is nuts

ak_111 · 2 days ago

Can you elaborate? Also, isn't this necessary for a lab that wants to compete with highly funded entities (like OpenAI, Anthropic)?

gigatexal · 2 days ago

As an American here in Berlin, I, too, welcome this. I would love for there to be many large, well-capitalized companies here for me to work at.

nailer · 2 days ago

> Regardless of your opinion of Yann or his views on auto regressive models being "sufficient" for what most would describe as AGI or ASI

My main concern with LeCun is the number of times he has told people software is open source when its license directly violates the open source definition.

neversupervised · 2 days ago

Is it good? This will almost certainly fail. Not because of Yann or Europe, but because these sorts of hyper-hyped projects fail. SSI and Thinking Machines haven’t lived up to the hype.

ma2rten · 2 days ago

Erm, ... OpenAI was hyped when it started, and it took 6 years to take off. It's way too early to declare that SSI and Thinking Machines have failed.

giancarlostoro · 2 days ago

I didn't really know who he was, so I went and found his Wikipedia page, which is written like either he wrote it himself to stroke his ego, or someone who likes him wrote it to stroke his ego:

> He is the Jacob T. Schwartz Professor of Computer Science at the Courant Institute of Mathematical Sciences at New York University. He served as Chief AI Scientist at Meta Platforms before leaving to work on his own startup company.

That entire sentence before the remarks about his service at Meta could have been axed; it's weird to me when people compare themselves to someone else who is well known. It's the most Kanye West thing you can do. Mind you, the more I read about him, the more I discovered he is in fact egotistical. Good luck having a serious engineering team with someone who is egotistical.

pama · 2 days ago

You underestimate academia. Any academic who reads these two sentences only focuses on the first one: He has a named chair at Courant. In Germany, being a Prof is added to your ID card/passport and becomes part of your official name, like knighthood in other countries.

timr · 2 days ago

It's not comparing him to anyone. He has an endowed professorship. This is standard in academia, and you give the name because a) it's prestigious for the recipient and b) it strokes the ego of the donor.

lairv · 2 days ago

https://cims.nyu.edu/dynamic/news/1441/

This is just the official name of a chair at NYU. I'm not even sure Jacob T. Schwartz is more well known than Yann LeCun

bobwaycott · 2 days ago

That’s not a comparison to another person. That’s his job title. It is not uncommon for universities to have distinguished chairs within departments named after a notable person—in this case, the founder of NYU’s Department of Computer Science.

g947o · 2 days ago

Eh, that paragraph reads perfectly normal to me.

Either you have not read enough Wikipedia pages, or you have too much to complain about. (Or both.)