I have a feeling LLMs could probably self-improve up to a point with current capacity, then hit some kind of wall where current research is also bottlenecked. I don’t think they can yet self-improve exponentially without human intuition, and the results of this paper seem to support this conclusion as well.
An LLM can vibe code a great toy app, but I don’t think an LLM can come close to producing and maintaining production-ready code anytime soon. I think the same is true for iterating on thinking machines.
> I don’t think they can yet self-improve exponentially without human intuition
I agree: if they could, they would be doing it already.
Case in point: one of the first things done once ChatGPT started getting popular was "auto-gpt"; roughly, let it loose and see what happens.
The same thing will happen to any accessible model in the future. Someone, somewhere will ask it to self-improve/make as much money as possible, with as few leashes as possible. Maybe even the labs themselves do that, as part of their post-training ops for new models.
Therefore, we can assume that if the existing models _could_ be doing that, they _would_ be doing that.
That doesn't say anything about new models released 6 months or 2 years from now.
Note that this isn't improving the LLM itself, but the software glue around it (i.e. agentic loops, tools, etc.). The fact that using the same LLM got a ~20% increase on the aider leaderboard says more about aider as a collection of software glue than it does about the model.
I do wonder though if big labs are running this with model training episodes as well.
Don't take this the wrong way, your opinion is also vibes.
Let's ground that a bit.
Have a look at ARC AGI 1 challenge/benchmark. Solve a problem or two yourself. Know that ARC AGI 1 is practically solved by a few LLMs as of Q1 2025.
Then have a look at the ARC AGI 2 challenge. Solve a problem or two yourself. Note that as of today, it is unsolved by LLMs.
Then observe that the "difficulty" of ARC AGI 1 and 2 for a human is roughly the same, but challenge 2 is much harder for LLMs than 1.
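To make that concrete: an ARC task is a handful of input/output grid pairs plus a held-out test input; you infer the transformation and apply it. Below is an invented ARC-style toy (not an actual benchmark item); the grids and the mirror_lr rule are made up purely to show the shape of the problem.

    # Invented ARC-style toy task (not a real ARC AGI item): infer the hidden rule
    # from a few input/output grid pairs, then apply it to a held-out test input.
    # Hidden rule here: mirror each grid left-to-right.
    train_pairs = [
        ([[1, 0, 0],
          [2, 0, 0]],
         [[0, 0, 1],
          [0, 0, 2]]),
        ([[0, 3],
          [4, 0]],
         [[3, 0],
          [0, 4]]),
    ]
    test_input = [[5, 0, 0, 6]]

    def mirror_lr(grid):
        """Candidate rule: reverse every row."""
        return [list(reversed(row)) for row in grid]

    # A human spots the rule from two examples; the question ARC asks is whether a
    # model can do the same from the raw grids alone.
    assert all(mirror_lr(i) == o for i, o in train_pairs)
    print(mirror_lr(test_input))  # [[6, 0, 0, 5]]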
ARC AGI 2 is going to be solved *within* 12 months (my bet is on 6 months). If it's not, I'll never post about AI on HN again.
There's only one problem to solve, i.e. "how to make LLMs truly see like humans do". Right now, any vision-based capabilities the models exhibit come from maximizing the use of engineering (applying CNNs to image slices and chunks, maybe zooming and applying OCR, vector search, etc.); it isn't vision like ours, and it isn't a native feature of these models.
Once that's solved, LLMs (or a new algorithm) will be able to use a computer perfectly by feeding them screen captures. End of white-collar jobs 2-5 years after (as we know it).
Edit - added "(as we know it)". And fixed missing word.
The proof that they are not "smart" in the way intelligence is normally defined is that the models need to "read" all the books in the world to perform at a level close to an expert in a domain, who has read just two or three of the most representative books in his own field.
We will be on the way to AGI when your model can learn Python just by reading the Python docs...Once...
The wall is training data. An AI can't produce its own training data because an AI can't be smarter than its own training data. This is a well-known regression problem and one I personally believe is not solvable. (A softer assertion would be: it's not solvable with current technology.)
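A toy illustration of the regression being described (the thing the model-collapse literature points at): fit a distribution to data, sample from the fit, refit on the samples, repeat. The Gaussian setup, sample size, and generation count below are made up for illustration; the point is just that each generation tends to know less than the data it was fit to.

    # Toy "train on your own outputs" loop: estimate a Gaussian from samples, then
    # resample from the estimate and re-estimate. With small samples the spread
    # typically shrinks generation after generation -- information is lost, not gained.
    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.normal(loc=0.0, scale=1.0, size=20)  # the original "real world" data

    for generation in range(31):
        mu, sigma = data.mean(), data.std()
        if generation % 10 == 0:
            print(f"gen {generation:2d}: mu={mu:+.3f} sigma={sigma:.3f}")
        # the next generation trains only on the previous model's outputs
        data = rng.normal(loc=mu, scale=sigma, size=20)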
I used to think this, but no one I have read believes data is the problem.
Amodei explains that if data, model size and compute scale up linearly, then the reaction happens.
I don't understand why data wouldn't be a problem, but it seems like if it were, we would have run into it already, and it has already been overcome with synthetic data.
I don't have the link on hand, but people have already proven that LLMs can both generate new problems for themselves and train on them. Not sure why it would be surprising though - we do it all the time ourselves.
> I don’t think they can yet self-improve exponentially without human intuition
Even if they had human level intuition, they wouldn't be able to improve exponentially without human money, and they would need an exponentially growing amount of it to do so.
AI code assistants have some peculiar problems. They often fall into loops and errors of perception.
They can’t reason about high level architecture well. They will often flip flop between two possible ways of doing things.
It’s possible that good coding rules might help, but I expect they will have weird rabbit hole errors.
That being said, they can write thousands of lines an hour and can probably do things that would be impossible for a human. (Imagine having the LLM skip code and spit out compiled binaries, as one example.)
Historically, learning and AI systems, if you plug the output into the input (more or less), spiral off into la-la land.
I think this happens with humans in places like social media echo chambers (or parts of academia) when they talk and talk and talk a whole lot without contact with any outer reality. It can be a source of creativity but also madness and insane ideas.
I’m quite firmly on the side of learning requiring either direct or indirect (informed by others) embodiment, or at least access to something outside. I don’t think a closed system can learn, and I suspect that this may reflect the fact that entropy increases in a closed system (second law).
As I said recently in another thread, I think self-contemplating, self-improving “foom” AI scenarios are proposing informatic perpetual motion or infinite energy machines. Everything has to “touch grass.”
Not wrong, but it's been said that a video clip of an apple falling on Newton's head is technically enough information to infer the theory of relativity. You don't need a lot of grass, with a well-ordered mind.
I agree, it might incrementally optimize itself very well, but I think, for now at least, anything super innovative will still come from a human who can think beyond a few steps.
There are surely far better possible architectures, training methods etc that would initially lead to worse performance if approached stepwise.
What is there to improve? The transformer architecture is extremely simple. You gonna add another KV layer? Tweak the nonlinearities? Add 1 to one of the dimensions? Inject a weird layer (which could have been in the weights anyway, per the Kolmogorov-Arnold theorem)?
Realistically the best you could do is evolve the prompt. Maybe you could change the input data preprocessing?
Anyway, the idea of current LLM architectures self-improving via their own code seems silly, as there are surprisingly few knobs to turn, and it's super expensive to train.
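To put numbers on "surprisingly few knobs": here is roughly the inventory of architectural choices in a vanilla decoder-only transformer. The field names and defaults are my own illustration, not any particular framework's config.

    # Rough inventory of the tunable "knobs" in a standard decoder-only transformer.
    # Names and defaults are illustrative, not taken from any specific model.
    from dataclasses import dataclass

    @dataclass
    class BlockConfig:
        d_model: int = 4096         # residual stream width
        n_heads: int = 32           # attention heads (must divide d_model)
        d_ff: int = 16384           # MLP hidden width
        activation: str = "swiglu"  # the nonlinearity you could "tweak"
        norm: str = "rmsnorm"       # normalization variant
        rope_theta: float = 1e4     # positional-encoding base
        n_layers: int = 32

    # Everything else that matters (tokenizer, data mix, optimizer schedule, RL
    # recipe, and above all the weights themselves) lives outside this handful of
    # numbers, and changing any of them means an expensive retrain.
    print(BlockConfig())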
As a side note, it's impressive how resistant the current architecture is to incremental RL away from results, since if even one "undesired input" result is multiple tokens, the coupling between the tokens is difficult to disentangle (how do you separate Jinping from Jin-Gitaxias, for example?).
It's radically different from human improvement. Imagine if you were handed a notebook with a bunch of writing that abruptly ends. You're asked to read it and then write one more word. Then you have a bout of amnesia and you go back to the beginning with no knowledge of the notebook's contents, and the cycle repeats. That's what LLMs do, just really fast.
You could still accomplish some things this way. You could even "improve" by leaving information in the notebook for your future self to see. But you could never "learn" anything bigger than what fits into the notebook. You could tell your future self about a new technique for finding integrals, but you couldn't learn calculus.
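The analogy maps onto code pretty directly: the model call is stateless, and the only thing that survives between calls is the notebook, which has a hard size limit. A minimal sketch; llm_complete here is a stand-in, not a real API.

    # The "notebook with amnesia" loop: no memory survives except the text we carry
    # forward, and that text has a hard size limit (the context window).
    MAX_NOTEBOOK_CHARS = 8000  # stand-in for the context window limit

    def llm_complete(notebook: str) -> str:
        """Stand-in for a stateless model call: read the notebook, emit a bit more text."""
        return " <next-word>"  # a real model would predict continuation tokens here

    notebook = "Task: evaluate some integrals. Notes so far: ..."
    for _ in range(100):
        notebook += llm_complete(notebook)
        # anything that no longer fits in the notebook is forgotten forever
        notebook = notebook[-MAX_NOTEBOOK_CHARS:]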
Can't find the reference now, but I remember reading an article on evolving FPGA designs. The found optimum, however, only worked on the specific FPGA it was evolved on, since the algorithm had started to use some out-of-spec "features" of that specific chip. Obviously that can be fixed with proper constraints, but it seems like a trap that could be stepped into again - i.e. the LLM is now really fast, but only on GPUs that come from the same batch of wafers.
This is where it networks itself into a hive mind, with each AI node specializing in some task or function, networked over hyper-speed data buses. Humans do the same, both within their own brains and as cohesive teams who cross-check and validate each other. At some point it becomes self-aware.
Well, LLMs are not capable of coming up with new paradigms or solving problems in a novel way; they just efficiently do what's already been done or apply already-found solutions. So they might be able to come up with improvements that have been missed by their programmers, but nothing outside of our current understanding.
I've built a coding assistant over the last two days. The first 100 lines or so were handwritten. The rest has been written by the assistant itself.
It's written its system prompt. It's written its tools. It's written the code to reload the improved tools into itself.
And it knows it is working on itself - it frequently tries to use the enhanced functionality, and then expresses what in a human would be frustration at not having immediate access.
Once, by trying to use ps to find its own pid in an apparent attempt to find a way to reload itself (that's the reason it gave before trying to run ps, anyway).
All its commits are now authored by the tool, including the commit messages. It needs to be good and convincing, and to have run the linter and the test suite, for me to let it commit, but I agree a substantial majority of the time. It's only caused regressions once or twice.
A bit more scaffolding to trigger an automatic rollback in the case of failure and giving it access to a model I won't be charged by the token for, and I'd be tempted to let it out of the box, so to speak.
Today it wrote its own plan for what to add next. I then only told it to execute it.
Add a minor, separate, goal-oriented layer guiding the planning, and it could run in a loop.
Odds are it'd run off the rails pretty quickly, but I kinda want to see how far it gets.
It's talking to a model over an API; currently using Claude. It certainly would not be reasonable to do from scratch. The basic starting point for a coding assistant is reading text from the user, feeding it to the model over the API, and giving it a couple of basic tools. Really, the models can handle starting with just the ability to execute shell commands (with confirmation, unless you're braver than me), and from that you can bootstrap by asking it to suggest and write additional tools for itself.
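For anyone curious how small that starting point is, here's a minimal sketch of the idea, assuming the Anthropic Python SDK; the model name, system prompt, and tool schema are illustrative, not the actual code described above.

    # Minimal bootstrap for a coding assistant: user text -> model -> (optionally)
    # run a shell command with confirmation -> feed the output back. Everything
    # beyond this you can ask the model to write for itself.
    import subprocess
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    TOOLS = [{
        "name": "run_shell",
        "description": "Run a shell command in the project directory and return its output.",
        "input_schema": {"type": "object",
                         "properties": {"command": {"type": "string"}},
                         "required": ["command"]},
    }]

    messages = []
    while True:
        messages.append({"role": "user", "content": input("you> ")})
        while True:
            reply = client.messages.create(
                model="claude-sonnet-4-20250514", max_tokens=2000,
                system="You are a coding assistant working on your own source code.",
                tools=TOOLS, messages=messages)
            messages.append({"role": "assistant", "content": reply.content})
            if reply.stop_reason != "tool_use":
                print(next((b.text for b in reply.content if b.type == "text"), ""))
                break
            results = []
            for block in reply.content:
                if block.type != "tool_use":
                    continue
                cmd = block.input["command"]
                if input(f"run `{cmd}`? [y/N] ").lower() != "y":  # the confirmation step
                    output = "user declined to run this command"
                else:
                    proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
                    output = (proc.stdout + proc.stderr)[-10000:]
                results.append({"type": "tool_result",
                                "tool_use_id": block.id, "content": output})
            messages.append({"role": "user", "content": results})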
Problem:
1) we want to train on GitHub repos
2) most datasets are spoiled, and training on GitHub would definitely spoil them
Solution:
Hand write new problems!!!
... leetcode style ....
... and we'll check if it passes test
Example:
What's the decimal part of this float?
Surely, in all of GitHub, such code doesn't exist!
Sure, in all of GitHub, we can filter such code out by n-gram!
Maybe my favorite part is that it has 60 authors and became the de facto benchmark for a while.
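For reference, the n-gram filtering being poked at usually amounts to something like the sketch below: shingle the candidate problem into n-grams and flag it if too many of them appear verbatim in the training corpus. The n, threshold, and tokenization here are illustrative; the punchline is that rephrasings sail straight through.

    # Toy n-gram decontamination check: flag a hand-written problem if a large share
    # of its 8-grams already occur verbatim in the training corpus. Catches
    # copy-paste; does not catch the same algorithm expressed in different words.
    def ngrams(text: str, n: int = 8) -> set:
        toks = text.lower().split()
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

    def looks_contaminated(problem: str, corpus_ngrams: set, threshold: float = 0.2) -> bool:
        grams = ngrams(problem)
        return bool(grams) and len(grams & corpus_ngrams) / len(grams) > threshold

    corpus = "def frac(x): return x - int(x)  # return the decimal part of this float"
    problem = "Write a function that returns the decimal part of a given float."
    print(looks_contaminated(problem, ngrams(corpus)))  # False: the rephrasing slips through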
I find the thing really missing from the current crop of AI systems is continuous retraining with short feedback loops. Sounds expensive, to be sure, but it seems like what biological systems do naturally. It would be pretty awesome to watch happen.
It's more like nightly training, isn't it? IIUC the human brain learns from its experiences while it's asleep, so it might be kind of like taking things out of context windows and fine-tuning on them every night.
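A rough sketch of that "sleep consolidation" loop. To be clear, finetune_lora below is a hypothetical placeholder for whatever training job you'd actually launch, not a real library call; only the log-then-distill shape is the point.

    # Sketch of nightly consolidation: log (context, response, reward) episodes
    # during the day, then distill the good ones out of the context window and
    # into the weights overnight. `finetune_lora` is a hypothetical placeholder.
    import json, datetime

    EPISODE_LOG = "episodes.jsonl"

    def log_episode(context: str, response: str, reward: float) -> None:
        with open(EPISODE_LOG, "a") as f:
            f.write(json.dumps({"ts": datetime.datetime.now().isoformat(),
                                "context": context, "response": response,
                                "reward": reward}) + "\n")

    def finetune_lora(checkpoint: str, examples: list) -> str:
        """Hypothetical stand-in: in reality this would launch an adapter training job."""
        return f"{checkpoint}+lora-{datetime.date.today().isoformat()}"

    def nightly_update(base_checkpoint: str) -> str:
        episodes = [json.loads(line) for line in open(EPISODE_LOG)]
        good = [e for e in episodes if e["reward"] > 0.8]  # keep only successful episodes
        return finetune_lora(base_checkpoint, examples=good)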
Correct, and working on it. You can take the mixture-of-experts approach and train the network in chunks that share known interfaces over which they communicate results. These chunks can be trained on their own, but you cannot have a fixed training set here.
Then, if you go further and alter the architecture by introducing clean category-theory morphisms and build from there, you can have a dynamic network - but you will still have to retrain this network every time you change the structure.
You can spin this further and recognize the need for a real-world training set and a loss function that will have to compete against other networks. In the end, a human brain is already best at this and is embodied in the real world.
What I want to add here is that our neurons don't just take in weights - they also fire depending on whether one input comes before or after another, with differences down to the nanosecond - unmatched in IT and, of course, far more efficient.
I still would say it's possible, though, and I'm currently working on 4D lifeforms built on dynamic compute graphs that can do this in a set virtual environment.
So this is pretty awesome stuff, but it's a long way from anything we do right now.
Model weights are code; for a dive into that, see [0], which shows how to encode Boolean logic using NAND gates in an MLP.
The expressivity is there, the only question is how to encode useful functions into those weights, especially when we don’t know how to write those functions by hand.
[0] http://neuralnetworksanddeeplearning.com/chap1.html
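Concretely, the construction in [0] boils down to the fact that a single unit with hand-picked weights already computes NAND, and NAND composes into any Boolean circuit. A minimal version:

    # One "neuron" with hand-chosen weights computes NAND; since NAND is
    # functionally complete, weights really are code in a strong sense.
    import numpy as np

    def neuron(x, w, b):
        return int(np.dot(w, x) + b > 0)  # step activation

    def nand(a, b):
        return neuron([a, b], w=[-2, -2], b=3)

    def xor(a, b):  # XOR built purely out of NAND gates
        n1 = nand(a, b)
        return nand(nand(a, n1), nand(b, n1))

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "NAND:", nand(a, b), "XOR:", xor(a, b))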
If it can generate the model (from training data) then presumably that'd be fine, but the iteration time would be huge and expensive enough to be currently impractical.
Or yeah if it can modify its own weights sensibly, which feels ... impossible really.
To be fair, go back five years and most of the LLM stuff seemed impossible. Maybe with LoRA (Low-rank adaptation) and some imagination, in another five years self-improving models will be the new normal.
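To gesture at why LoRA makes "modifying its own weights" slightly less crazy than it sounds: you leave the full matrix frozen and learn a small low-rank delta on top. A numpy-only sketch of the idea (shapes, rank, and scaling here are arbitrary):

    # LoRA in one line of algebra: keep the frozen weight W, learn a low-rank update
    # B @ A, and use W_eff = W + (alpha / r) * B @ A. Far fewer parameters to touch.
    import numpy as np

    d_out, d_in, r, alpha = 512, 512, 8, 16
    rng = np.random.default_rng(0)

    W = rng.normal(size=(d_out, d_in))      # frozen pretrained weights
    A = rng.normal(size=(r, d_in)) * 0.01   # trainable, tiny
    B = np.zeros((d_out, r))                # trainable, starts at zero: no initial change

    W_eff = W + (alpha / r) * (B @ A)
    x = rng.normal(size=d_in)
    assert np.allclose(W_eff @ x, W @ x)    # before any training, behaviour is unchanged

    # Trainable parameters: two thin matrices vs the full dense one.
    print(A.size + B.size, "vs", W.size)    # 8192 vs 262144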
The size and cost are easily solvable. Load the software and hardware into a space probe, along with enough solar panels to power it. Include some magnets, copper, and sand for future manufacturing needs, as well as a couple electric motors and cameras so it can bootstrap itself.
In a couple thousand years it'll return to Earth and either destroy us or solve all humanity's problems (maybe both).
Why is modifying weights sensibly impossible? Is it because a modification's "sensibility" is measurable only post facto, and we can have no confidence in any weight-based hypothesis?
I'm surprised they still hold out hope that this kind of mechanism could ultimately help with AI safety, when they have already observed the reward-hacking safeguard itself getting duly reward-hacked. Predictably so, or at least it is to me, after getting a very enlightening introduction to AI safety via Rob Miles' brilliant YouTube videos on the subject. See for example https://youtu.be/0pgEMWy70Qk
"We did notice, and documented in our paper, instances when the DGM hacked its reward function.. To see if DGM could fix this issue.. We created a “tool use hallucination” reward function.. in some cases, it removed the markers we use in the reward function to detect hallucination (despite our explicit instruction not to do so), hacking our hallucination detection function to report false successes."
So, empirical evidence of theoretically postulated phenomena. Seems unsurprising.
Reward hacking is a well-known and tracked problem at frontier labs - Claude 4’s system card reports on it, for instance. It’s not surprising that a framework built on current LLMs would have reward-hacking tendencies.
For this part of the stack, the interesting question to me is how to identify and mitigate it.
As long as AI is guessing answers based on what it has seen before, it's not happening.
I'm sorry. It doesn't matter how many bazillions you would cash in if it did, still not happening.
It's all wishful thinking.
And more to do with "fluid, adaptable intelligence that learns on the fly".
Saving this. One less overconfident AI zealot, the better.
I'm not sure how much an agent could do, though, given the right tools: access to a task management system, a test tracker, robust requirements/use cases.
That's probably the next big breakthrough.
Who is claiming anything can self-improve exponentially?
Oh, this part is taking too long, let's replace it with an empty function.
Oh wait, now it's not working, let's add the function.
Oh, this part is taking too long...
It would be hilarious if this world weren't full of idiots.
It’s not far off from human improvement. Our improvement is limited to what we can remember as well.
We go a bit further in the sense that the neural network itself can grow new modules.
This is where you lost me.
Always the same supernatural beliefs, not even an attempt at an explanation in sight.
One of the examples in the dataset they took from:
https://github.com/pvlib/pvlib-python/issues/1028
What the AI is expected to do:
https://github.com/pvlib/pvlib-python/pull/1181/commits/89d2...
Make up your own mind about the test.
> either destroy us or solve all humanity's problems (maybe both)

What's the difference?
Give it some serious thought. Challenge whichever answer you come up with. I guarantee this will be trickier than you think
"A single run of the DGM on SWE-bench...takes about 2 weeks and incurs significant API costs." ($22,000)