This is by no means a comment about the quality of the project, but my god it's very uncanny in some frames. I feel like this would open up a lot of doors to creepypasta content. I'd love to play around with this
50+ hours on 256 H100s is considered an impressively low training cost?
Really makes me wonder if any of this incredibly computationally expensive research is worth it. It seems useful mainly for promising a future in which humans get less opportunity to express themselves creatively, while being handed an endlessly producible stream of AI-generated 'content' to passively consume.
The main value here is that Test-Time Training appears to work very well in practice. I think that as labs begin to test it at scale in LLMs, it will become commonplace in next-generation models.
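For anyone unfamiliar with the idea: a minimal sketch of Test-Time Training, assuming the common formulation where a layer keeps "fast weights" that are updated by gradient descent on a self-supervised loss while processing each token at inference time. The names, dimensions, and the toy reconstruction objective below are illustrative, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                    # toy hidden dimension
W = np.zeros((d, d))     # fast weights, adapted at test time
lr = 0.1                 # inner-loop learning rate

def inner_loss_grad(W, x):
    # Self-supervised objective (assumed here): reconstruct x from a
    # slightly corrupted view. loss = 0.5 * ||W @ x_corrupt - x||^2.
    x_corrupt = x + 0.01 * rng.standard_normal(d)
    err = W @ x_corrupt - x
    return np.outer(err, x_corrupt)   # gradient of the loss w.r.t. W

def ttt_layer(tokens):
    # For each token: take one gradient step on the inner loss, then
    # use the updated fast weights to produce the layer's output.
    global W
    outputs = []
    for x in tokens:
        W -= lr * inner_loss_grad(W, x)
        outputs.append(W @ x)
    return np.stack(outputs)

# Usage: a short "sequence" of random hidden states.
seq = rng.standard_normal((16, d))
out = ttt_layer(seq)
print(out.shape)  # (16, 8)
```

The point is just that the layer's state is itself trained during inference, so its effective capacity grows with sequence length rather than being fixed at pre-training time.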
I'm wondering the same thing. 256 H100s were hot for two days straight to be able to make short clips of cartoons that almost don't look like shit?
It just isn't compelling to me.
Would be really cool to use this [1] (or parts of it) as one of the prompts and see what comes out.
[1] - https://www.newyorker.com/magazine/2004/04/19/cat-n-mouse