Readit News
meltyness · 7 months ago
It's a tool box of demos with the following:

Segment Anything 2: Create video cutouts and other fun visual effects with a few clicks.

Seamless Translation: Hear what you sound like in another language.

Animated Drawings: Bring hand-drawn sketches to life with animations.

Audiobox: Create an audio story with AI-generated voices and sounds.

echelon · 7 months ago
> This research demo is not open to residents of, or those accessing the demo from, the States of Illinois or Texas.

Not accessible if you're in Illinois or Texas.

They must have anti-AI laws, probably aimed at voice conversion more so than image segmentation and cartoon animation.

Hopefully the lawmakers see beneficial use cases and fix their laws to target abuse instead of a blanket coarse-grained GenAI restriction.

azinman2 · 7 months ago
Illinois has biometric privacy laws, which can be interpreted broadly enough to cover anything that even looks for a face, such as a binary classifier. The translation demo uses video, intended to be of your face.

Knowing Meta, they save all of it.

blagie · 7 months ago
Texas sounds reasonable in general. I've written license terms which exclude Texas. That's home of the patent trolls.

TC Heartland v. Kraft Foods is worth a read.

JKCalhoun · 7 months ago
I'm in Nebraska — but I think, due to my ISP, I appear to be in the Chicago area. Oh well.

kylecazar · 7 months ago
Seamless translation is... Pretty incredible.

I speak English and Spanish, so I recorded some English sentences and listened to the Spanish output it generated. It came damn close to my own Spanish (although I have more Castilianisms in mine, which of course I wouldn't expect it to know)

heyjamesknight · 7 months ago
A real test here would be to give it to my friend from Mendoza, Argentina.

I'm bilingual and still can't understand him. I'm not even sure half the things he says are actual words.

mattlondon · 7 months ago
I tried it and it sounded nothing like me at all - just some random "generic" male voice that translated what I said into German. My wife put it as "that's shit - sounds nothing like you". Nuff said.
0xFEE1DEAD · 7 months ago
Same for me.

I also tried speaking German and translating it to English and when I said "Hallo ich wollte das nur mal ausprobieren" (Hello I just wanted to try this out) it translated it to "Hi, how are you? Do you know anyone who quit smoking?".

I feel gaslit.

suddenlybananas · 7 months ago
I translated from French to English and vice versa and the voice sounded nothing like me in either case. The English to French translation also made me sound about 90 years old.
ludwik · 7 months ago
Same for me. I'm a man with a relatively deep voice. The translation was read out by some generic female AI voice.
svilen_dobrev · 7 months ago
which is good. do you really want a deep-fake? one that no one can distinguish?
lttlrck · 7 months ago
Did it _sound_ like you though? It doesn't sound remotely like me.
kylecazar · 7 months ago
It didn't really the first time. I recorded a second one and enunciated really strongly/well (and said more) -- that yielded the positive results.
anal_reactor · 7 months ago
Whether "we're there yet" on translation technology is still debated, but at some point we'll consider it "good enough" for most practical use cases, truly removing the linguistic barrier. This is actually both terrifying and exciting, because then it'll definitely start influencing spoken language to at least some degree.
suddenlybananas · 7 months ago
It depends how much tolerance you have for mistakes. For a waiter or asking directions or things like that, 100% this works great. For a diplomatic discussion where nuance is very important however... It also doesn't work great for translating works of art where the translation itself is open-ended and can be done in a bunch of different ways and requires a lot of editorial/artistic decisions from the translator.
xandrius · 7 months ago
Unfortunate that the examples they provide were absolutely terrible and robotic.

It put me off from actually trying it, I might reconsider.

rob-olmos · 7 months ago
Is this subject purposely spelled Aidemos somewhere like the HN title says instead of AI Demos?
sophiebits · 7 months ago
HN automatically recapitalizes words in submission titles so I think it’s possible this could have been submitted as “AIDemos by Meta”.
rob-olmos · 7 months ago
Ahh I see. Thanks for the info!
riffraff · 7 months ago
At least it's not AI Demons
o-o- · 7 months ago
Aidemos... the Greek god... of intelligence...?
saikatsg · 7 months ago
Fixed.
cebert · 7 months ago
The Seamless Translation demo is fantastic. The translated voice is passable for my own native voice. It will be incredible when we can achieve this in real time.
exgrv · 7 months ago
We can! At Kyutai, we released a real-time, on-device speech translation demo last week. For now, it is working only for French to English translation, on an iPhone 16 Pro: https://x.com/neilzegh/status/1887498102455869775

We released inference code and weights; you can check our GitHub here: https://github.com/kyutai-labs/hibiki

mastermedo · 7 months ago
Good work. The delay seems to be around 5 seconds. This is a step in the right direction. I'm wondering how much closer to real time we can push it.
ketzo · 7 months ago
Damn, this is pretty amazing. Feels like we’re not far off from the babel fish.
brap · 7 months ago
What is Meta’s angle with AI? They seem to be doing a lot of research but what is the end goal? Google and MSFT I understand, Meta not so much.
lanthissa · 7 months ago
Meta believes the dollars at the end of the AI race will be in walled gardens and prop data, not data centers and models.

They are going to do everything they can to make sure no one uses the period in which models and data centers are the limiting factors to disrupt them.

It's the same way Google demonetized the application layer of the web to prevent walled gardens from blocking search.

If models and hardware become commoditized at the end of the race, Meta will have a complete psychographic profile of people on an individual and group level to study, and to serve incredibly targeted content to.

Their only real competition in that would be someone developing a 'her' like app that takes people out of social media and into their own individual silo'ed worlds. In a lot of ways Discord is the alternative world to Meta's ecosystem: hyper-focused, invite-only small communities.

mattlondon · 7 months ago
> Their only real competition in that would be someone developing a 'her' like app that takes people out of social media and into their own individual silo'ed worlds

I take it you have not tried the new Gemini models on AI Studio? It does real-time streaming video input and conversation: you can genuinely ask it questions about what you are looking at in a conversational, audio-in/audio-out way. This is basically "her"-level technology in an unpolished form, right here today.

flir · 7 months ago
> Meta believes the dollars at the end of the AI race will be in walled gardens

Will those walls keep AI-generated content out, or will they keep the people outside from accessing the AI-generated content in the garden?

If it's the first, somebody should tell them the slop's already up to their navels and they probably shouldn't be helping people generate more of it.

If it's the second, then the models that supply the content to the garden must have some kind of uniqueness/value, because otherwise you could get identical content from anywhere.

This is a genuine question, because I don't understand the logic here.

(I had assumed it was more like hardware companies funding open source way back when - Commoditize Your Complement).

xyst · 7 months ago
> walled gardens

Apple tried that and it’s crumbling. Meta/Zuckerfuck is always behind the curve.

- AR (failed)

- “metaverse” (failed)

The only thing that has kept them above water is social media and selling off user data, and that’s crumbling as well. Smaller players have been eating their lunch and the user base is aging out.

twelve40 · 7 months ago
so in other words, "better targeting"? that's it?
CPLX · 7 months ago
https://gwern.net/complement

Joel Spolsky in 2002 identified a major pattern in technology business & economics: the pattern of “commoditizing your complement”, an alternative to vertical integration, where companies seek to secure a chokepoint or quasi-monopoly in products composed of many necessary & sufficient layers by dominating one layer while fostering so much competition in another layer above or below its layer that no competing monopolist can emerge, prices are driven down to marginal costs elsewhere in the stack, total price drops & increases demand, and the majority of the consumer surplus of the final product can be diverted to the quasi-monopolist. No matter how valuable the original may be and how much one could charge for it, it can be more valuable to make it free if it increases profits elsewhere. A classic example is the commodification of PC hardware by the Microsoft OS monopoly, to the detriment of IBM & benefit of MS.

This pattern explains many otherwise odd or apparently self-sabotaging ventures by large tech companies into apparently irrelevant fields, such as the high rate of releasing open-source contributions by many Internet companies or the intrusion of advertising companies into smartphone manufacturing & web browser development & statistical software & fiber-optic networks & municipal WiFi & radio spectrum auctions & DNS (Google): they are pre-emptive attempts to commodify another company elsewhere in the stack, or defenses against it being done to them.

twelve40 · 7 months ago
great question, i was wondering about that. I think it's mostly in a discovery phase right now, similar to how they dabbled in crypto before, and to the by-now largely finished "metaverse" experiment (yes, this dabbling sometimes involves a ton of money). These demos actually show what they might end up using AI for. But whether it's truly game-changing for their business, and whether it will be good for regular users, is still an open question, considering that their shitty UIs in both FB and even Instagram are by now grossly obsolete, haven't changed in over a decade despite 70,000 people working there, and are nowadays mostly focused on violently shoving in more ads over actual usefulness.

If their business remains a shitty declining buggy 20-year-old Facebook and a 10+year-old Instagram app, but they contribute to advancing open source models similar to how they did with React, I'll consider that a net win though.

rsynnott · 7 months ago
After the 'metaverse' stuff flopped, desperate to spend their money on some other thing that might be The Future(TM)?

Arguably this would be kind of rational behaviour for them even if they thought that LLM stuff had a low chance of being the next thing; they have lots and lots of money, and lots of revenue, so one strategy would be just to latch on to every new fad, and then if one is a real thing they don't get left behind (and if it's not, well, they can afford it).

My suspicion is that this is where most Big Tech interest in LLMs comes from; it's essentially risk management.

postexitus · 7 months ago
Paraphrasing from someone who is involved in this - their angle in AI is better targeting of Ads - better classification, clustering, better "recommendations" for the advertiser, including visuals, wording, video etc.

These and others are just side benefits or some form of "greenwashing". Meta's main (and only) business is advertisement. They failed to capitalize on everything else.

aprilthird2021 · 7 months ago
Enabling experiences with AI that will drive people to share content with each other and communicate online, and which can be used in AR/VR, where they have a lead position. In-house AI improvements have also helped ad placement and ad generation for clients.

People who think Meta's main business focus is Facebook and Instagram don't pay attention.

hypothesis · 7 months ago
What makes you think that more artificial stuff is going to reinvigorate the business? Metaverse was supposed to be such a savior, but this time they didn’t even rename the company…
JTyQZSnP3cQGa8B · 7 months ago
Money and manipulation? Was that a real question?
twelve40 · 7 months ago
Yes, that's a real question, even for the money and manipulation use case, how does this help, especially the money part?
isoprophlex · 7 months ago
You forgot "fucking over the competition".

Not that I'm complaining about their open-weights model releases destroying OpenAI's moat... but still.

999900000999 · 7 months ago
AI make stock go up.

I think this is it. I'm kicking myself for not going harder, but I was very much into LLMs/ML back in 2019; had I not given up, I might have a startup right now.

I'd need like 70k and a minimum of 6 months, but I still have a few ideas for AI driven startups.

barbazoo · 7 months ago
Generated content is my assumption. Both by users and fully automated.
brap · 7 months ago
I don’t think anyone wants generated content in their IG/FB feed, so not sure how this will play out in the long run
yalogin · 7 months ago
What are MSFT's and Google's reasons?
brap · 7 months ago
Both do search, devices, OS and browsers - very natural verticals to integrate with AI, and both have cloud platforms where they can sell it to developers.

With Meta I can’t think of a single existing vertical where AI would be desirable. Maybe Quest

ghxst · 7 months ago
I'm pretty impressed with the Segment Anything[0] demo. Is this integrated into an actual product anywhere? I do some simple video editing for friends as a hobby and can see some of this being pretty useful.

[0]https://sam2.metademolab.com/

Etheryte · 7 months ago
Photoroom [0] is from Y Combinator and their product is essentially SAM plus a lot of polish along with a good user experience. I'm not sure if they're using it, but if they're not, I think they should be.

[0] https://www.photoroom.com/

avgd · 7 months ago
SwarmUI, a front-end for image generation models, has integrated SAM2 as a quick way to mask parts of an image for things like inpainting. It's wonderful.
barrenko · 7 months ago
It probably is, but you won't hear it advertised as such.
thih9 · 7 months ago
If anyone else is wondering, Meta FAIR stands for "Facebook Artificial Intelligence Research" and has since been renamed to "Meta AI"[1].

[1]: https://en.wikipedia.org/wiki/Meta_AI

lelag · 7 months ago
It's not exhaustive. For example, it's missing the Meta Motivo demo at https://metamotivo.metademolab.com/ (humanoid control model)