I can't recommend this blog highly enough. Just about every post provides a deeply detailed, understandable overview of a subject at the frontier of ML research.
Not being completely fluent in the ML space, I'm not entirely clear on where this technique applies.
Broadly, is it useful for taking already-diffused data and recovering the original inputs, or for producing the diffused output without needing to iterate through the full process manually, or both, or am I completely off?
They're a way of training generative models, similar to GANs or autoencoders.
My understanding is that if you train an autoencoder with a Gaussian likelihood you tend to get fuzzy samples, but an iterative process, where each step is a Gaussian conditioned on the previous step, can give you much nicer samples.
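To make that iterative idea concrete, here's a toy sketch of DDPM-style ancestral sampling in NumPy. It's not anyone's production code: `predict_noise` stands in for a trained noise-prediction network and here is just a dummy that returns zeros, so the loop only anneals Gaussian noise away. The point is the structure: each reverse step draws from a Gaussian whose mean is conditioned on the previous, noisier sample.

```python
import numpy as np

T = 50
betas = np.linspace(1e-4, 0.02, T)   # forward-process noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predict_noise(x, t):
    # Placeholder for a learned eps_theta(x_t, t). A real model is trained
    # to predict the noise that was added to reach step t.
    return np.zeros_like(x)

rng = np.random.default_rng(0)
x = rng.standard_normal(4)           # start from pure Gaussian noise
for t in reversed(range(T)):
    eps = predict_noise(x, t)
    # Mean of the Gaussian for x_{t-1} conditioned on x_t
    mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
    if t > 0:
        x = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    else:
        x = mean                     # final step is deterministic
print(x.shape)
```

With a trained network plugged in for `predict_noise`, this is the piece that turns many small conditional Gaussians into sharp samples, instead of the one-shot Gaussian decode that makes autoencoder outputs fuzzy.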
I find the math and all the integral signs intimidating. Is there a course that will help me learn all the background material to understand these things better?
What's your background? How much and how recently have you done math? Do you do much computational work? Answering these questions will help in finding targeted resources.
Well, I have a bachelor's degree in Computer Science and am fairly well versed with the popular ML tools and libraries like PyTorch and scikit-learn. I have been using them professionally as part of my job. I can implement a VAE easily in PyTorch, but find the math hard to understand. My linear algebra is probably okay, but the math involving probability and calculus is the problem. I never learnt those topics properly. I am probably looking for a remedial math class specifically catered to understanding machine learning theory, or other online math courses that assume no more math background than high school. Thanks for the offer to help.
Thanks for the link and the blog post. Diffusion models have definitely been making waves over the last year or two and I've been slacking on really digging in.