bearobear (u/bearobear)

programjames · 10 months ago

Anyone willing to give an intuitive summary of what they did math-wise? The math in the paper is super ugly to churn through.

bearobear · 10 months ago

Last author here (I also did the DDIM paper, https://arxiv.org/abs/2010.02502). I know the math is very tricky (in the paper we just wrote the most general version to make reviewers happy), so I tried to explain the idea more simply in the blog post (https://lumalabs.ai/news/inductive-moment-matching).

If you look at how the output of a single DDIM sampler step depends on the target timestep, it is actually just a linear function of it. That is obviously quite inflexible if we want a single step to represent a flexible function that can jump to any target timestep. So just add the target timestep as an extra argument to the neural network and then train it with a moment matching objective.
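
A minimal sketch of that linearity, not taken from the paper: it assumes the simple interpolation schedule alpha_t = 1 - t, sigma_t = t, and it uses a random vector `x0_hat` as a stand-in for the network's clean-sample prediction at the current timestep. The function name `ddim_step` is made up for illustration.

```python
import numpy as np

def ddim_step(x_t, x0_hat, t, s):
    """Deterministic DDIM update from time t to target time s.

    Assumes the schedule alpha_t = 1 - t, sigma_t = t (an assumption
    made for this sketch), so that x_t = (1 - t) * x0 + t * eps.
    """
    # Noise estimate implied by x_t and the clean-sample prediction.
    eps_hat = (x_t - (1.0 - t) * x0_hat) / t
    # Re-mix the same two estimates at the target time s.
    return (1.0 - s) * x0_hat + s * eps_hat

rng = np.random.default_rng(0)
x_t = rng.normal(size=4)
x0_hat = rng.normal(size=4)  # stand-in for the network output at (x_t, t)
t = 0.8

# Evaluate the step at evenly spaced target times.
targets = np.array([0.0, 0.2, 0.4, 0.6])
outputs = np.stack([ddim_step(x_t, x0_hat, t, s) for s in targets])

# For a function that is affine in s, second differences on an even grid vanish.
second_diff = np.diff(outputs, n=2, axis=0)
is_linear_in_s = bool(np.allclose(second_diff, 0.0))
print("linear in target timestep:", is_linear_in_s)
```

With the network output fixed, the only place `s` enters is the final re-mixing line, which is why the step cannot bend as the target timestep changes. Passing `s` to the network itself removes that restriction.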

In general, I feel that analyzing a method's inference-time properties before training it can be helpful not only for diffusion models but also for LLMs, including the various recent diffusion LLMs. That prompted me to write a position paper in the hope that others will develop cool new ideas (https://arxiv.org/abs/2503.07154).