Can anyone share insight into how this differs from consistency models? The core idea seems quite similar.
Consistency models are a special case of IMM in which you do moment matching with 1 sample from each distribution (i.e., you cannot match distributions properly). See Fig. 5 for an ablation study. Unsurprisingly, using more samples for moment matching makes training more stable :)
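To make the "1 sample vs. many samples" point concrete, here is a rough toy sketch, not the paper's actual objective. It assumes an RBF kernel and the plain biased MMD estimator; the names `rbf_kernel` and `mmd2`, the bandwidth, and the Gaussian toy data are all made up for illustration. With one sample per side the kernel moment-matching term collapses to a pointwise distance between two points, which is roughly the flavor of a consistency-style loss. With more samples it starts comparing distributions.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    """RBF kernel matrix between rows of a (m, d) and rows of b (n, d)."""
    sq_dists = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between sample sets x and y."""
    return (rbf_kernel(x, x, bandwidth).mean()
            + rbf_kernel(y, y, bandwidth).mean()
            - 2.0 * rbf_kernel(x, y, bandwidth).mean())

rng = np.random.default_rng(0)

# One sample per side: the estimate is just a pointwise distance, 2 - 2*k(x, y),
# because k(x, x) = k(y, y) = 1 for an RBF kernel.
x1 = rng.normal(size=(1, 2))
y1 = rng.normal(size=(1, 2))
single_sample = mmd2(x1, y1)
pointwise = 2.0 - 2.0 * rbf_kernel(x1, y1)[0, 0]

# Both sides drawn from the SAME distribution, so the true MMD is 0.
# Compare how far the estimate sits from 0 for group sizes 1 and 16.
def average_estimate(group_size, trials=200):
    vals = [mmd2(rng.normal(size=(group_size, 2)),
                 rng.normal(size=(group_size, 2)))
            for _ in range(trials)]
    return float(np.mean(vals))

avg_m1 = average_estimate(1)
avg_m16 = average_estimate(16)
print("M=1 :", avg_m1)
print("M=16:", avg_m16)
```

In this toy setup the single-sample estimate stays far from zero even though both sides share a distribution, while the 16-sample estimate is much closer. That is only an analogy for why larger groups give a cleaner training signal; the real IMM loss has more moving parts than this.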