I can't wait to play with this but I can't even imagine how expensive it must be. They're training in full resolution and can generate up to a minute of video.
Seeing how bad video generation was, I expected it would take a few more years to get to this but it seems like this is another case of "Add data & compute"(TM) where transformers prove once again they'll learn everything and be great at it
I view it as being OSS or postgresql dev that happens to work at microsoft. I've been doing the former for much longer (starting somewhere between 2005 and 2008, depending on how you count) than the latter (2019-12).