grej · 2 years ago
This story, just within the last few days:

"Athletic director used AI to frame principal with racist remarks in fake audio clip, police say"

https://apnews.com/article/ai-artificial-intelligence-princi...

fennecbutt · 2 years ago
Which is why we need to make this technology readily available and well known, so that people become more aware of it, stop trusting everything, and look up sources.

Ah, who am I kidding, most people will still not fact check.

skeaker · 2 years ago
[deleted]
nolok · 2 years ago
We're really in an era where laws and their enforcement will have a lot of catching up to do, very fast.

Fake historical proof, fake leaks, fake endorsements, fake ads... People couldn't be bothered to double-check when it was a mere random text article on facetok; it's going to be so much worse...

jorvi · 2 years ago
From hypernormalisation to hyperreality.

I’ve been telling my friends that 5-10 years from now, the only thing that you’ll be able to ~100% trust is what happens in front of your eyes, in that very moment. You can elect to trust reliable news organizations to vet things for you, but even if you do, due to polarization a huge subset of the world will think you’ve been got and discard everything as fake.

Look at stuff like Sora, or all the new voice models coming out. Just a few days ago a high school athletic director (!) was arrested for cloning the principal’s voice and using it to say vile stuff. He only got caught because he used his own e-mail.

Now combine that with the fact that Microsoft’s new Phi-mini model approaches GPT-3.5 performance using 3.8 billion parameters, whilst GPT-3.5 uses 175 billion. And we’re only ~5 years of optimization into this tech.
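Taking the parameter counts in that comment at face value (they aren't verified here), the size gap works out like this:

```python
# Approximate parameter counts quoted in the comment above (not verified here).
phi_mini_params = 3.8e9   # Microsoft's Phi mini model
gpt35_params = 175e9      # GPT-3.5

# How many times smaller the newer model is, for similar benchmark scores.
ratio = gpt35_params / phi_mini_params
print(f"~{ratio:.0f}x fewer parameters")  # ~46x fewer parameters
```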

I want to get off Mr Bones’ wild ride.

pfannkuchen · 2 years ago
> the only things that you’ll be able to ~100% trust is what happens in front of your eyes

Won't this just be a return to the historical norm?

Prior to photography being invented, there was no guarantee that any retelling of events (whether spoken, written or drawn) was true.

It will be weird for people alive today, but it doesn't seem that risky from a societal perspective.

bombcar · 2 years ago
We already know you cannot trust what you see with your eyes (check any "compare eye-witness reports with trusted video recordings" or watch Penn & Teller).

We're in for a fun and wild ride.

dudefeliciano · 2 years ago
> you’ll be able to ~100% trust is what happens in front of your eyes

5-10 years from now we may have Vision Pro gen 5, or whatever system achieves market dominance, between our eyes and what happens in front of us.

bparsons · 2 years ago
You can trust what happens in front of your eyes until everyone starts wearing augmented reality contact lenses.
jareklupinski · 2 years ago
> I want to get off Mr Bones’ wild ride.

the ride never ends

giantg2 · 2 years ago
"I’ve been telling my friends that 5-10 years from now, the only things that you’ll be able to ~100% trust is what happens in front of your eyes, in that very moment. You can elect to trust reliable news organizations to vet things for you,"

The time is now. Even the mainstream news organizations are probably only high-90s-percent reliable, as they've been caught selectively editing, not vetting sources or facts, and displaying biases.

andsoitis · 2 years ago
Trust is a dependency for human existence. Not just for civilization, but also very small communities and basic exchange of ideas, goods, and services.

I cannot foretell how the risk of trust destruction from GenAI will unfold, but I'm optimistic our creativity will win out.

cthalupa · 2 years ago
If you think about it, though, the window for there being something resembling objective universal truth has been a very very short period of human history. It really didn't exist before the internet and ubiquitous smartphones.

Before the internet, TV, radio, newspapers were our sources of truth beyond just trusting people in our immediate vicinity, and these were all heavily filtered by what stories they decided to run, the amount of detail they focused on, any human bias that crept into their reporting, etc. I'm not a "FAKE NEWS!!!" kind of guy, but one has always had to ingest news from these sources with some level of filtering in this regard, and understand that there might be other sides to the story, or whole stories of importance going unreported.

If we revert to subjecting images/video/audio clips to the same level of skepticism we had with random people informing us of pieces of news with no proof, then we're effectively just at the same level of objective universal truth as we had been for the overwhelming majority of human history.

I'm not arguing this is a good thing - just that it might have been a small and blissful island that some of us had the privilege of enjoying.

froh · 2 years ago
yes. that dependency is why it's even in the ten commandments: don't give false testimony / don't slander, don't lie.

https://en.wikipedia.org/wiki/Thou_shalt_not_bear_false_witn...

but instead they discuss who I've married. sigh.

anyhow: I share your optimism.

bloopernova · 2 years ago
This is partly why I'm so fascinated and disgusted by trolls and astroturfers. They erode trust in a given forum, which degrades the quality of discourse because no one wants to invest time in untrustworthy discussions.

Sometimes I wish I could get an honest answer from trolls about what they hope to achieve, but of course that will never happen.

throwthrowuknow · 2 years ago
A digital audio file is not even close to being proof of anything. Even without voice cloning you can easily edit, clip, and compose audio into almost anything you want. It’s also not difficult to simply impersonate someone else’s manner of speaking with practice, something commonly done by both amateurs and professional actors. The only thing that changes is the ease with which this can be done, which should help everyone understand how unreliable such “proof” is.
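The point that editing needs no AI at all is easy to demonstrate: with just the Python standard library you can cut and recombine raw WAV frames. A minimal sketch (the file names are hypothetical):

```python
import wave

def splice_wav(src_path, dst_path, segments):
    """Copy selected (start_sec, end_sec) spans of a WAV file into a new
    file, concatenated in the given order."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Bytes per frame = sample width * channel count.
        width = src.getsampwidth() * src.getnchannels()
        frames = src.readframes(src.getnframes())
    # Slice the raw frame bytes for each requested span and glue them together.
    out = b"".join(
        frames[int(start * rate) * width:int(end * rate) * width]
        for start, end in segments
    )
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)  # wave fixes up the frame count on close
        dst.writeframes(out)

# e.g. keep seconds 0-1 and 3-4, silently dropping everything in between:
# splice_wav("interview.wav", "edited.wav", [(0, 1), (3, 4)])
```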
telesilla · 2 years ago
Sounds like a remake of Sneakers is needed--with a fresh take on impersonation and social engineering, to remind people what's possible and potentially dangerous.
colecut · 2 years ago
When dealing with social conflicts, there is very rarely "proof", just evidence

In courts of law, digital tapes have been frequently used as evidence.

tyingq · 2 years ago
Legal opinions don't seem to have caught up...

https://www.justice.gov/archives/jm/criminal-resource-manual...

jilijeanlouis · 2 years ago
There is actually a big opportunity for companies like loccus: https://www.loccus.ai/
nolok · 2 years ago
I don't know the political situation in your country, but all I will say is that the Putin-aligned far right in mine makes heavy use of fake quotes, deepfake videos, and things like that to propagate its ideas, and its "followers" eat it up. Then you either have to let those untruths stand or you spend all your energy fighting them / defending yourself, making you look guilty. And in the past 3-4 years (it exploded with covid), it's the followers themselves who now start those things.

AI moves this from "someone with some dedication can do it" to "anyone with a computer can do it".

croes · 2 years ago
It was already easy for people who wanted to do it.

Now it's easy for everyone.

ActionHank · 2 years ago
These are big issues, but I would say that a bigger issue is the case where a spam caller has you on the line talking for ~10 seconds and then calls the bank or a family member as you.

Android and iOS should support real time voice changers as the norm with a quick switch button on the dialer to disable it and an option to have it off for known contacts.

andy99 · 2 years ago
I've come around to the idea that the hype around criminal or bad actor uses of AI is the same as the hype around other uses. Some real uses will shake out but the delta between what's actually enabled by the tech and what was possible anyway is way smaller than people like to represent.
jddj · 2 years ago
I'm not sure. Maybe I'm caught in the hype but I feel like possible is one thing, scalable is another.

We're at the point now where, for example, a leak of phone number + address book would enable a high quality, large scale automated impersonation pipeline that many here could put together in a weekend.

Install a compromised app (or just be a one-time user of a newly compromised app), then answer the phone for a minute, and then everyone you know receives a somewhat believable phone call designed to empty their wallets / your accounts?

nolok · 2 years ago
I am not talking about criminals or similar, for that I agree with you.

I am talking about political. There has been a MASSIVE explosion of fakeness since covid, coupled with people who completely lost their ability to actually check information, and anything that makes it easier for anyone to propagate is dangerous in my view.

We've reached the point where "check it yourself / do your own research" has moved from what proponents of the truth used to tell people, to what the other side now uses to lend credibility to their facetok videos and whatnot (I am speaking in the general sense).

unraveller · 2 years ago
It's not a very consistent pearl-clutch either; would they deny all access to computers because terrorists will use them to fire heat-seeking missiles?

I think it's just a lack of imagination. We can do this funny, offensive, misleading thing right now; it can surely be used in other ways, but you'd have to let your mind wander for one extra moment after your social-media-induced reaction-fixation episode. Hearing your own voice in different accents motivate you could be a depression hack.

corobo · 2 years ago
Honestly it couldn't hurt for people to re-learn not to trust things on the internet.

I haven't heard someone snarkily say "it's on the internet so it must be true" in like 12 years. Let's bring that back.

ranger_danger · 2 years ago
Really? I hear it all the time. I mean, if it's on the Internet it must be true.
simion314 · 2 years ago
>We're really in an era where laws and their enforcement will have a lot of catching up to do very fast.

I mean, there are existing laws that would apply; there are laws against scamming or whatever bad stuff you can do with this. The world might need a law, though, to force media to label AI-generated content, and IMO a law where media and users would take responsibility for their content.

What I mean is: if, say, as a random user I claim something in a media post like "eating shit cures covid", I should either be forced to add "btw I am not taking any responsibility for my comment and I am not a medic/lawyer/expert" or I should be forced to pay damages if someone sues me over my bullshit claims.

So cloning voices should be legal; scamming people is illegal.

_joel · 2 years ago
It's not cloning, it's just copying the tonality. It states this in the docs, but it still calls itself voice cloning.

I tried it and I ended up sounding American, not my usual dulcet Lancashire tones. Absolutely nothing like me.

unraveller · 2 years ago
You should be able to bring it back to your proper accent using https://voiceshopai.github.io

VoiceShopAi can convert from young to old, male or female, or into any country's accent.

found via https://github.com/metame-ai/awesome-audio-plaza which tracks most things in the voice space as they come up.

_joel · 2 years ago
Can't see any code for that, unless it's there and I've missed it.
youngNed · 2 years ago
never really thought about how much i need a 'Fred Dibnah' voiced AI until now
_joel · 2 years ago
Aye lad!
tiborsaas · 2 years ago
Same here, I've tried it with my own voice and luckily it sounds nothing like me.
causal · 2 years ago
Yeah not the best title/name. On a more meta note, I sometimes feel like HN comments are increasingly Reddit-style headline reactions with little investigating TFA or peering into the tech itself.
sandspar · 2 years ago
When people leave Reddit half of them come here. Lots of people left Reddit recently.
screamingninja · 2 years ago
What is a legitimate use case for this? I can think of a hundred applications for deceiving others but struggle to come up with a scenario where one would want their voice cloned or reproduced.
dannyw · 2 years ago
You're recording a podcast and want to tweak some of your own words, without the hassle of re-recording.

You're an indie game developer, and want to have vibrant NPCs with their unique voices and dialogues powered by a LLM.

You're producing a movie, and want to tweak certain lines of dialogue; with the consent of the talent.

You suffer from health conditions and are gradually losing your voice, but you still want to communicate.

There are certainly legitimate use cases of this technology. I personally believe illegitimate use cases overshadow the legitimate use cases, but I don't think it's fair to say there are no legitimate applications.

We should strictly regulate the use of this technology by criminalizing abuse; not by banning it altogether (which is pretty hard in the case of software and small models).

dylan604 · 2 years ago
> You're producing a movie, and want to tweak certain lines of dialogue; with the consent of the talent.

The latest agreement to end the last round of strikes was to prevent this very thing.

Of your list, the medical condition, giving someone their real voice instead of a Hawking voice, would be the most legit reason. Everything else reflects a skewed sense of what's morally acceptable; I think they are shady.

ranger_danger · 2 years ago
I wish I had your creativity when trying to think of something useful to code.
whycome · 2 years ago
It’s only a matter of time before Alexa and other agents use better customizable voices.

Audiobooks could have voices read by characters rather than a single narrator faking it. (If even)

You have a cold but still want to give a speech without coughing.

Low bandwidth transmission of audio: transmit just the text and use local voice model to replay it.

Talk to your loved ones after they’re gone.

Hilarity and comedy.
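On the low-bandwidth idea in the list above, a rough back-of-envelope (the audio format and speaking rate are assumptions, not from the thread):

```python
# One second of uncompressed 8 kHz, 16-bit mono telephone-quality audio:
audio_bytes_per_sec = 8000 * 2   # 16,000 bytes

# Average English speech is very roughly 2.5 words/sec at ~6 bytes/word:
text_bytes_per_sec = 2.5 * 6     # ~15 bytes

# Sending text and re-synthesizing locally saves roughly three orders
# of magnitude, before any audio codec is even considered.
print(f"~{audio_bytes_per_sec / text_bytes_per_sec:.0f}x less data")  # ~1067x less data
```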

beretguy · 2 years ago
> Talk to your loved ones after they’re gone.

Ok, no, that's bad. Have you seen Black Mirror?

ranger_danger · 2 years ago
Imagine the day when Alexa speaks back to you in your own voice. People would go insane.
r2_pilot · 2 years ago
You may not be trying very hard, then. The first thing I thought of was cloning your voice for use in real-time translation. I can probably think of several others mentioned in comments below, but this is a 100% always-useful, never-nefarious (assuming a perfect translation isn't maliciously used) application.
brigadier132 · 2 years ago
This tech makes me not even want to speak.
anotherevan · 2 years ago
I have a friend with a paralysed larynx who is often using his phone or a small laptop to type in order to communicate. I know he would love it if it was possible to take old recordings of him speaking and use that to give him back "his" voice, at least in some small measure.

Unfortunately I have yet to see something that can do this and provide a voice model that you could plug into Android TTS and/or Windows which are what he uses.

jilijeanlouis · 2 years ago
Why would you use a dedicated app? Does it have to be natively embedded in Android?
shortrounddev2 · 2 years ago
I play a lot of counter-strike and it's very amusing when people hurl insults at the other team with the voice of Joe Biden
colechristensen · 2 years ago
Fixing small errors in narration, voiceovers, or other recorded content.

Translation of recorded content with the original voice into new languages.

Comedy as long as it’s obvious that it’s a fake.

Actually intentionally selling your voice to be the voice of some text to speech product. Maybe I want Alexa to have the voice of Danny Devito, as long as he’s ok with it and getting paid.

lukevp · 2 years ago
My wife has been sick all week and has to communicate over text because her voice has gone. We’ve been talking about making voice clones of ourselves for situations like this. Some people never regain their voices so preserving them before they lose it is super valuable.
gotrythis · 2 years ago
I imagine training people, and having everything I say be available in any language, matching my tonality, and being able to reach a global audience. I'm very much looking forward to this.
mlboss · 2 years ago
Podcast production without speaking. Audio correction in media.
bhickey · 2 years ago
> What is a legitimate use case for this?

Voice loss.

tmaly · 2 years ago
What if you wanted to create audio for your videos without having to have a recording session?
smrtinsert · 2 years ago
Tiktok videos for the amusement of millions?
dylan604 · 2 years ago
For 6 months before it's banned!
tjbiddle · 2 years ago
Sending personalized messages to customers
codelobe · 2 years ago
Indie gamedevs can do their own voice acting? See also indie film, same use case. An actor dies / is hit by a bus before a work is finished: create a few more lines posthumously (it'll be in the fine print of the contract that you allow voice & image fakes in the event you're not able to do them). Satire, pranks, and alleged pranks (stuff that makes folk laugh).
ChildOfChaos · 2 years ago
Where are the best places to keep up with all of this? I'm very interested in this area as I want to use these tools to create things with and my own voice isn't great for this.

Speech to speech seems like it might be better than TTS for getting a natural result. I've played around with some tools like RVC etc., but I feel like there may be a lot of great AI workflows I'm missing amongst all the AI noise; it's the interesting workflows and the people doing interesting things with AI that I'm most interested in.

jilijeanlouis · 2 years ago
Definitely twitter. This is where everything is announced and commented on.
ChildOfChaos · 2 years ago
Thanks. It's annoying that some of the best places seem to be the ones I'm trying to avoid due to their more negative aspects.

Do you have any recommendations on twitter of who is good to follow?

Particularly interested in people doing interesting things with AI; I already subscribe to the usual AI newsletters, such as Ben's Bites etc.

skeaker · 2 years ago
Awful lot of doomsaying and drama in here. What makes this release so bad compared to the existing voice cloning AI methods that have been publicly available for ~1 year already?
mindcrime · 2 years ago
My voice is my passport, verify me
joshstrange · 2 years ago
I always have to say that in my head when I hear the word “Passport”. Also I love this scene of social engineering: https://youtu.be/WdcIqFOc2UE?si=tP2HxVEskl9szuKO
mindcrime · 2 years ago
Classic. The whole "I'm late for the party on the second floor" scene is great as well.
_joel · 2 years ago
I loved that screensaver, LLAP.
perakojotgenije · 2 years ago
nice reference :-)
andy_ppp · 2 years ago
I really can’t wait until voice cloning means we get a version of audiobooks read in the author’s voice. Of course it will never be quite as good as them reading it themselves, but I think the author’s voice adds something that voice actors can’t: they appear too generic and too affected in their pronunciation for me to connect with.
smeej · 2 years ago
What the author adds, if they're not also a trained/well-practiced voice actor, is that their inflection exactly matches how they meant the words in the book to be spoken/understood.

AI isn't going to be able to do that. As good as it may get, it won't be able to read the mind of the author. It's going to be even more generic than a human reader.

throwthrowuknow · 2 years ago
Exactly, the improvement will be in rerecording terrible readings into something enjoyable or at least inoffensive. That and personalization so you can choose the voice that you prefer.

wddkcs · 2 years ago
Will this be the case when/if books become largely written with the help of AI? Let alone when AI start writing the books themselves.
Kuinox · 2 years ago
So you just need a sample of the author saying the "odd" words.
PodgieTar · 2 years ago
Odd, because I actually worry about this. I don't see why you'd want your books read by the author. Trained Voice Actors do a much better job, and can modulate their voices based on tone.

Autobiographies? Fine, but those are usually read by their authors anyway.

Rodeoclash · 2 years ago
If you think that a voice actor reading an audio book is too generic then I've got bad news about an AI trained on the author's voice...
andy_ppp · 2 years ago
I was hoping it would be voice transfer so the voice actor would give all the intonation and emotion and the AI would take that and make it sound like the author. Reading text with AI is getting better but yes it’ll be worse for a long time.
joshstrange · 2 years ago
I have nearly no desire to have my book read by the author. They are good at writing, and an audiobook is not simply “reading” the words on the page. Maybe something like Descript that the author can use to tweak pronunciation after it’s narrated, but I don’t want the author’s voice.

I would like to train a model on Allyson Johnson’s voice (she narrated the Honor Harrington books) and then use that to re-narrate the 1-2 books in one of the spinoffs (I think it was the Saganami Island series?) where they used a different narrator (who was horrible).

I also might be interested in using it to clean up the Wheel of Time series where, while it’s the same 2 narrators, they change the pronunciation of various names/words book-to-book. “Moghedien” being the one that stands out most. They pronounce it at least 3 different ways:

* Mo-gid-e-on

* Mo-ga-dean

* Mog-a-din

kornork · 2 years ago
It's curious, to me at least, why they didn't just go back and fix those themselves later. The early ones were on CD (or tape?), so maybe that's why.
kornork · 2 years ago
I think I'd prefer to have options for each audiobook. I have favorite narrators, and find others unlistenable. There are also thousands and thousands of books that will never otherwise be turned into audio format unless an AI is used.
infoseek12 · 2 years ago
Writing and being a voice actor are two quite different skills. My experience with author narrated audiobooks is that there isn’t very much overlap.
block_dagger · 2 years ago
Never as good as a human? I disagree; it seems like it’ll be nailed, with no way to tell from the outside.
andrewstuart · 2 years ago
Audiobooks in the author’s voice... fine for non-fiction, usually terrible for fiction.