KaiserPro · 9 months ago
It would be interesting to play with this model directly (if it ever gets released...)

One interesting thing is that foley is not real life, so depending on where they get the dataset from (or how they "generated" it) they might be learning how things actually sound, or how foley artists make them sound.

It's probably most noticeable in either car noises or eagles: https://www.youtube.com/watch?v=jQI-ddEPTx4 (which are often overdubbed with hawk noises)

echelon · 9 months ago
Adobe intern research almost never sees code / weights releases, sadly. It's an academic peg for the students and marketing for Adobe. Maybe future interns will start taking this into account and work somewhere they can reuse their research.

We need a broader open research / open source conversation about ML models. Meta calling their approach "open" and what the OSI deems "open" are both flawed and/or misleading.

China is really leading the "open" AI research game. They release so many goodies - papers, code, weights, training scripts, notebooks, and sometimes even training data. It's gotten to the point that whenever I see a Western publication, I groan.

More recently their work seems to be done on H800s and other export-controlled GPUs. This is also why they're starting to experiment with wildly innovative architectures. Despite being resource-constrained, they're really kicking ass.

fngjdflmdflg · 9 months ago
>Meta calling their approach "open" and what the OSI deems "open" are both flawed and/or misleading.[...] It's gotten to the point that whenever I see a Western publication, I groan.

llama weights are open, and Qwen 1.0 was based on llama,[0] so clearly those papers are useful to some people, even if some people want to be angry at Meta for not being open enough for them. In fact I would say Qwen is less useful than llama, as their technical reports are not nearly as detailed as Meta's. Also, why groan just from seeing the origin of a project? Just read it to see if it is open or not. And there are also lots of popular Chinese models that are completely closed with no code or paper, e.g. Hailuo, Kling, etc.

[0] https://arxiv.org/pdf/2309.16609 p. 6

m3kw9 · 9 months ago
That’s the first time I’ve seen a demo like that. I have no doubt Hollywood is gonna be changed big time, especially with the cost and speed reduction. I think in 3 years it could match Hollywood quality/speed/cost of generation and tools.
swatcoder · 9 months ago
> Hollywood is gonna be changed big time especially with cost and speed reduction

Over 100 years of history suggest that Hollywood experiences an ongoing, strong pressure to make productions more expensive and slower. Productions have been much cheaper and quicker in the past, and there's no technical impediment to making them that way again already (nothing has been lost), but studios and audiences generally want to see the limits of spectacle.

But generative AI does not deliver on "the limits of spectacle" and has no clear path to doing so. It makes average-ish digital content, by definition, and has unsolved challenges with maintaining coherency and consistency across and within sessions/segments. It does do that pretty cheaply and quickly though.

We can expect it to see the most use in already budget-constrained projects, where its compromises are a tolerable backdrop against some other focus (writing, humor, romance, etc), not the blockbusters that have huge budgets and polish demands and that mark the signature of "Hollywood". There, it'll expedite some creative utility tasks as people get the hang of using and improving it, but we can expect that the money and time saved there will just get routed over to other artisan, limit-pushing tasks.

og_kalu · 9 months ago
>Productions have been much cheaper and quicker in the past, and there's no technical impediment to making them that way again already (nothing has been lost)

Productions also looked a lot worse in the past. Some productions are more expensive today because that's the budget that kind of production requires. You act as though a movie like Guardians of the Galaxy looking as good as it does was even remotely possible decades ago, and that studios just want to spend more money on vain spectacle. Star Wars '77 was great for its time, but that's exactly it: 'for its time'.

'77's budget adjusts to about $60M today, and while you certainly can't recreate a modern 'will hold up' Star Wars with that budget, $60M today gets you a lot farther than it did decades ago, with much better-looking movies.

liontwist · 9 months ago
Isn’t this kind of a universal problem in media now? Movies compete with every other kind of media. The only voices that can be heard in that competitive marketing environment are big-budget, winner-take-all projects and those with direct access to niche segments (YouTubers).
dagmx · 9 months ago
You can do foley today for free already.

If you’re primarily talking about the time commitment as cost, anyone who doesn’t have the time won’t care about foley to begin with. It’s an attention to detail that someone turning to automated tools won’t have to begin with.

Beyond that, foley is a very creative process. It’s not just placing footsteps at the right moment, it’s making them sound right contextually. It’s knowing that smacking a spring is a great blaster sound, not just putting a gunshot in.

dylan604 · 9 months ago
Of all the crafts involved in making a movie, audio is one of my favorites (even though, as someone from the camera department, I'd never admit that on set). One of the post houses I worked in had a foley studio, and it has always been one of those things I would love to do. It just looks like so much fun. It's as close to child-like playing as an adult can get. What would it sound like if we did... cool. What would it sound like if we... meh, but we can blend that with that other thing... cool.
rob74 · 9 months ago
Do I understand correctly that the videos themselves are also AI-generated? That would explain the lack of spacebars on both typewriters (the one in the slider at the beginning and the one in the example video).
0_____0 · 9 months ago
The spacebar is there and gets used, it's just dark. Looks like an old Russian typewriter and nothing suggests it's AI generated.
jeff_vader · 9 months ago
Yup. And the other (green/teal?) typewriter has a spacebar on the edge, same colour as the typewriter body. Ends of it sometimes visible between fingers, it's also used.
hoseja · 9 months ago
Oh hey there Baader-Meinhof. Ghost Town Living talked about the origin of "Foley" in his latest video: https://www.youtube.com/watch?v=tZmA_VF6cm8
whywhywhywhy · 9 months ago
No code, no model, Adobe involved.

Speaking as a creative, AI tools only really get interesting or genuinely useful in a professional setting if you can fine-tune them; otherwise everything has the same base aesthetic/phonaesthetic brush.

This is before we get into the ethics of Adobe taking artists work without compensation to build closed and paywalled machines to make their work worth less.

Kye · 9 months ago
The ML-based similar sound search in Ableton Live is very useful. It's the only example I can think of where it's completely uncontroversial.

AI music people: "We have removed the parts of making music that are actually enjoyable. You're welcome!"

Ableton: "The computer will sort and rank all 50,000 of your kick samples so you can actually get somewhere when you spend an hour swapping them out"

HelloUsername · 9 months ago
Why is the sound of the dog barking not also reversed, like the video is?