AI-Generated Voice Evidence Poses Dangers in Court

Sounds like reasonable changes.

Generally speaking, I think evidence tampering is not a new problem, and even though it's easy in some cases, I don't think it's _that_ widespread. Just like it's possible to lie on the stand, but people usually think twice before they do it, because _if_ they are found to have lied, they're in trouble.

My main concern is rather that legit evidence can now easily be called into question. That seems to me like a much higher risk than fake evidence, considering the overall dynamics.

But ultimately: Humanity has coped without photo, audio or video evidence for most of its existence. I suppose it will cope again.

mitthrowaway2 · 6 months ago

I agree. The article didn't touch on this aspect, but we're now at the point where even authentic recordings could be plausibly denied and claimed to be fake. So the entire usage of recordings as evidence will suffer a hit. We may essentially be knocked back to an 18th century level of reliance on eyewitness testimony. One wonders what the consequences for justice will be.

Ukv · 6 months ago

I wouldn't say we'd be quite back to pre-photo evidence days. I feel a lot if not most of the value in a video/audio recording is not just that the medium has traditionally been difficult to edit, but that it's attesting to a lot of details with high specificity. There's a lot to potentially get caught out on with not a lot of wiggle room when inconsistencies are spotted (compared to recalling from memory). Document scans and static images are still useful despite having long been trivial to edit, for instance.

thereddaikon · 6 months ago

There's already a process for this, its called chain of custody. If you cant prove the evidence has a solid chain of custody then it was potentially tampered with and isn't reliable.

nine_k · 6 months ago

I can easily imagine a future where video evidence is only acceptable in the form of chemically developed analog film, at resolutions that are prohibitively expensive to model, and audio recordings of any kind are not admissible as evidence at all. Signatures on paper, faxes, etc are, of course, inadmissible, too.

AyyEye · 6 months ago

> we're now at the point where even authentic recordings could be plausibly denied and claimed to be fake.

We've been there for at least two years.

https://arstechnica.com/tech-policy/2023/04/judge-slams-tesl...

datadrivenangel · 6 months ago

Digital forensics will continue to be an in-demand skill!

SoftTalker · 6 months ago

Have you ever heard some of the wiretap or hidden microphone recordings used to convict mafia bosses back in the 1980s? It was so bad I can't believe it was accepted. It could easily have been faked. The only thing that made it work was the sworn statements of authenticity from the people who did the recording, and the chain of custody thereafter.

qingcharles · 6 months ago

I know a lawyer who was convicted on one of those. The detectives had the mic + transmitter taped to his groin so it would get through a pat down. Then the undercover just spoke both sides of the conversation (the defendant's side was whispered and muffled). Someone testified it sounded like the defendant. Conviction was overturned on appeal on a simpler issue unrelated to the evidence.

In fact, I remember a drug dealer I was helping with his defense. Hidden mic on undercover was taped under his armpit. He arrived to defendant's hotel room for a deal and defendant made him undress. The recording is hilarious because defendant is like "You fucking snitch, what's that under your armpit?" and the undercover says "It's my .. er .. MP3 player?" LOL

trhway · 6 months ago

>But ultimately: Humanity has coped without photo, audio or video evidence for most of its existence.

which allowed to burn witches based on testimony of citizens in good standing.

And that leads us to using Neuralink and similar tech and the next gen lie detectors (like say the defendant's fMRI (most probably interpreted by AI too:)) to look into the brain and extract confession. No need for evidence, deposition and all that expensive time consuming stuff standing in the way of truth especially given that it can't be trusted anymore in the face of AI capabilities.

EGreg · 6 months ago

I kind of want to have an LLM which takes absolutely any criticism of AI or news of it doing something bad and then generates a plausible HN comment that basically goes "I don't think it's a new problem, X has always been possible, there's nothing really new"... because that comment always appears like clockwork beneath it :)

fhd2 · 6 months ago

Strangely, I'm not even offended by the notion that an AI can replace this work I apparently do :D Though my thought here was just pure optimism in the face of something bad, not an attempt to frame a bad side effect of something good as not a big deal.

When it comes to generative AI, I personally don't see a lot of good applications, but a plethora of bad ones. The only solution I could imagine would be regulation to the degree that using or distributing models with certain capabilities is just illegal. Judging from the war on file sharing a few decades ago, probably very difficult to enforce, even if it is perhaps still worth doing.

But I don't see any governments line up to do it. Given that, this particular (semi) new development that generative AI is effective for evidence tampering, I think we'll manage to deal with it.

throwway120385 · 6 months ago

The nuance that's usually missing is that new technology invariably enables the same things but more of them and faster.

dragonwriter · 6 months ago

I mean, you are already getting the result for free, so why do you need an LLM for it?

In any case, you can probably do this with a fairly simple prompt with any LLM.

RataNova · 6 months ago

That's a great point. The ability to undermine real evidence by claiming it's AI-generated could be just as (if not more) damaging than fake evidence itself

pixl97 · 6 months ago

Well, and to generate fake evidence to lead cops off the trail for the longest period of time possible. Evidence could be said to contain a half life. The longer (real) evidence is not gathered the more it can decay in one way or another. For example security footage gets overwritten. Actual witness memory gets foggy. Physical items get thrown away.

If the cops spend the first month going after some poor Mark that had nothing to do with it, by the time they realize they've been had it will likely be much more difficult to catch the actual perp.

Also, that's saying nothing about the court of public opinion.

kenjackson · 6 months ago

> I don't think it's _that_ widespread. Just like it's possible to lie on the stand, but people usually think twice before they do it, because _if_ they are found to have lied, they're in trouble.

I don't have data, but I suspect a LOT of people lie on the stand. This is mostly based on what I see on reality court shows and true crime type shows, so admittedly not a great sample, but I figure once something gets to trial things are going to be contentious.

HPsquared · 6 months ago

At this stage it's more a risk for people who have their likeness out in the public domain where it can be copied. But that's just about everyone these days.

vasco · 6 months ago

> But ultimately: Humanity has coped without photo, audio or video evidence for most of its existence. I suppose it will cope again.

Same argument for electricity, the internet, sanitation, democracy... doesn't seem like a great test for stuff that we just didn't have it before and survived.

fhd2 · 6 months ago

Well, I'm not arguing "good riddance". I just optimistically think we'll manage. I wouldn't want to miss any of the things you listed, actually. I'd also prefer evidence tampering to be impossible or at least very difficult. But that's not a call I can make. All I can decide about things out of my control, is how I think about it. And here I'm carefully optimistic.

Dead Comment

Deleted Comment

I'll say it again, even though it is rather unpopular here: there has never been a need to develop these tools, nor one to make them easy to deploy, nor one to make them easy to use. Yet all this has happened, and now it may occur that someone is acquitted because AI generated media is so good, the evidence might be artificial. If that happens, and the suspect commits another crime, it's on the conscience of the people that contributed to this. You cannot create something and pretend its use has nothing to do with you.

The tools aren't perfect yet, so it's not too late to stop. Stop the ridiculous image and audio generation tools before it's too late. Nothing of value is lost when these models are made private again, and research is simply halted.

cootsnuck · 6 months ago

It actually is too late for that. For anyone unaware, the open source models are already more than sufficient enough for imitations and deepfakes. For better or worse there's no going back.

Personally, I'd rather we all know this tech is out there and develop defense mechanisms rather than thinking hiding it away will prevent harm.

The cyber security industry exists because of all the privacy and security issues posed by all the tech we already have had for the past several decades.

I'm confident the same will happen for AI simply because it is a business opportunity and other businesses and institutions are already talking about these issues.

bdangubic · 6 months ago

You cannot create something and pretend its use has nothing to do with you.

has always worked with guns so it'll always work with anything else :) and guns will kill more people that AI-generated shit for foreseeable future

almogo · 6 months ago

There is an argument that "AI-generated shit" (on social media, or otherwise) can influence elections, and thus war. The best example I can think of is https://www.bbc.com/news/world-asia-india-68918330.

consteval · 6 months ago

Text, images, and things of that class - human communication tools - are much higher level than guns.

Human communication tools are responsible for, say, the holocaust. At its core, you can distill the holocaust to that. Or choose any tragedy, really.

I think what we’re dealing with is much worse than we are giving it credit. I’m envisioning the death of trust as a concept itself. Such has never before been faced by humanity, and I don’t know if even our biological social circuits can handle it.

We’re already half way there. It’s trivial to convince large amount of people of something that isn’t true, and then get them to act on that belief. It just requires some time and some money. Remove the time and the money requirement and I don’t know where we land.

qwertox · 6 months ago

I do want a computer which talks to me like a person, with that agility. I am used to listening to voice.

I am tired of reading so much stuff which could well be spoken to me. Like this comment here, which you now need to read.

It's only very recently that living beings on this world have learned to read and write, it's not normal. It's normal to communicate through sound.

krainboltgreene · 6 months ago

What an incredible line to end on. “It’s not normal to read/write”.

Loughla · 6 months ago

>It's only very recently that living beings on this world have learned to read and write, it's not normal. It's normal to communicate through sound.

It's normal for humans to communicate via sight and sound combined.

That being said, we invest a lot of calories and brain power into vision. It's our primary mode of interacting with the planet. This is what we have evolved to rely on.

I think, therefore, you're wrong. We rely heavily on eyesight, naturally. It's why our eyes are focused on the front, allowing precise binocular vision, whereas our ears are on the side, allowing for broader coverage, but less specific information received. Reading and writing (or at least some form of image communication) is probably more natural for communicating ideas than talking, for humans.

Even if you are still trying to average us out across other life on earth (which doesn't really make a bit of sense), I think vision is the way to go. Colors and visual attractants and other visual forms of communication are common even in plants, as well as animals.

triceratops · 6 months ago

Reading and writing isn't "normal" but using a computer is? Humans have been doing the former for way longer.

gowld · 6 months ago

You don't need writing, if you want to live in a cave or wander a prairie, I guess?

Ukv · 6 months ago

> Nothing of value is lost when these models are made private again

To list a few uses I see for voice-generation/cloning tools:

* Real-time translation of a user's voice, maintaining emotion and intonation

* Professional-quality audio from cheap microphone setups (for video tutorials, indie games, etc.)

* Allowing those with speech impairments to communicate using their natural voice again, or:

* Allowing those uncomfortable with their natural voice to communicate closer to how they wish to be perceived

* Customization of voice assistants, such as to use a native accent/dialect

* Movies, podcasts, audiobooks, news broadcasts, etc. made available in a huge range of languages

* And of course: memes, satire, and parody

If it exists but isn't widely accessible, it's likely in the hands of Musk/Zuck and and various state actors. To me that seems possibly the worst alternative - to have the public generally unaware that it's possible and receiving few of the benefits listed, yet still having it available as a tool for competent disinformation.

educasean · 6 months ago

Fatalities due to automobile accidents are a major drag. We should've stuck with horses

a2128 · 6 months ago

There is some use to bringing deepfakes to mass adoption. The thing is that since the tech exists, powerful actors with lots of resources will develop these tools for their own use either way. The question is whether they'll be able to fool the masses who are unaware that such realistic deepfakes can exist, or whether they will have no effect as everyone and their mom have already seen similar AI slop on their Facebook feed

pixl97 · 6 months ago

>The tools aren't perfect yet, so it's not too late to stop

This is some serious level of cope for HN.

Simply put, it's not going to happen.

treetalker · 6 months ago

I question the wisdom of setting the judge up as a superjury / gatekeeper for this kind of situation. This seems like a reliability / weight of the evidence scenario, not a reliability / qualification of the witness scenario (as with an expert witness).

Why would the judge be better qualified to determine whether the voice was authentic, as opposed to the witness? And why should the judge effectively determine the witness's credibility or ability to discern, when that's what juries are for?

All that said, emulated voices do pose big problems for litigation.

_DeadFred_ · 6 months ago

You mean like expert witness polygraphers that were treated as fact by the courts for years and still used to re-incarcerate people on parole/probation?

Or gun matching (ballistics) that are no longer considered conclusive but subjective? https://www.mdcourts.gov/data/opinions/coa/2023/10a22.pdf

Hair and Fiber expert analysis that was wrong? https://innocenceproject.org/fbi-agents-gave-erroneous-testi...

Or do you mean bite mark analysis that was again wrong?

Many of these forensic methods were used for decades and presented/treated as conclusive evidence before being challenged leading to wrongful convictions. But yes, let's have 'certified' voice experts whose living is based on being hired by the government and giving testimony to convict people. Surely this time it will be altruistic and scientific.

klipklop · 6 months ago

What always worried me is that nobody ever challenged these methods. We just accepted it as fact. Endless crime tv shows reenforce that these methods are infallible. People that watch these shows then become jurors. Scary.

bluGill · 6 months ago

I don't know what the latest is, but often judges are supposed to not allow "expert testimony" without checking that the person is an expert. However this is a really complex area. Judges don't want to be the one deciding a case, but not allowing some "expert" is in a way deciding the case.

zusammen · 6 months ago

There’s a built-in design paradox. How does a judge assess an expert in a field where he is not one? There’s probably some improvement that comes from experience but it’s not perfect.

hiatus · 6 months ago

Isn't it up to the opposing party, and not the judge, to impeach an expert?

But I think the argument here is less about judges making definitive calls on authenticity and more about ensuring that clearly questionable evidence isn't automatically admitted just because a witness vouches for it.

whywhywhywhy · 6 months ago

Gell-Mann Amnesia effect comes to mind.

NoMoreNicksLeft · 6 months ago

Excluding fake evidence is very much a responsibility of the judge. In the age of Fox News, letting the jury decide for themselves whether or not made-up bullshit is actual evidence seems like a recipe for disaster, and not necessarily one that errs on the side of caution.

To the contrary, in US courts the jury determines whether evidence (documentary, testimonial) is credible or not, and what weight (if any) to assign it. (Experts, à la Daubert etc., are a different matter because they give expert opinions, not factual evidence based on personal witnessing of the events in the case, so the judge does perform a gatekeeping function, essentially to ensure the underlying field/science is reliable.)

While certainly Fox News headlines would not reach the jury in most instances, that is on account of hearsay, lack of qualification, materiality, relevance, and similar rules. It is not a prior credibility or weight determination by the judge, as I understand TFA to be advocating. So: did the witness hear a voice that he believed to be the one in question? If so, jury gets to decide (unless unfairly prejudicial or some other overriding rule comes into play).

dghlsakjg · 6 months ago

How is AI voice faking any different than any other type of faking? How is it different than a manipulated recording, or a recording where someone is imitating another?

It is just as easy to fake many paper documents, and we have accepted documents as evidence for centuries.

Photos can be faked, video can be edited or faked, witnesses lie or misremember.

Is this just about telling lawyers that unvetted audio recordings can be unreliable? Because that shouldn't be news.

Edit: this is a good faith question. I'm legitimately just curious. Splicing and editing have been around since recording was invented, I was legitimately curious why voice recordings would have been given extra evidential weight when manipulating recordings is a known possibility.

rout39574 · 6 months ago

Presuming good faith here, faking recordings has been harder to do, easier to detect, and less equivocal in the past than it is now.

If it takes an FX house to generate a plausible recording of me saying something I didn't say, that's a risky enterprise with a lot of witnesses.

If my enemy can do it in their basement with an hour of research, the exposure risk goes way down, and consequently the expectation you'll see it in real life goes way up.

ceejayoz · 6 months ago

Nuclear weapons require nation-state levels of resources. Now imagine you can build one in your basement for $10k. Does that change things at all?

I understand what you are saying, but my point is that I could make plausible sounding recordings that did not reflect reality by, for example, cutting recordings up with freeware like Audacity, or even using a consumer level double tapedeck before that. It wasn't Manhattan project levels of effort before this.

This seems more like people losing their minds over 3d printed guns, when hobbyists with a drill press have been making guns in the garage for decades.

Yeah, its easier now to fake voice, but its not as if what this article warns against wasn't possible before the latest AI hype cycle. And it is also worth noting that voice cloning/changing technology is not particularly new either (I've been able to sound like Morgan Freeman using a phone app for at least half a decade).

I agree that courts should be cautious around accepting voice recording evidence, I just don't think that the ability to do this is new.

gdulli · 6 months ago

"We've had projectile murder for centuries with arrows, how are these machine guns and missiles any different?"

Lammy · 6 months ago

In the future this stuff will get so good that the public will beg to be surveilled at all times because it will be the only way to prove what you didn't do. You will learn to love Total Information Awareness. Consent status: manufactured :)

nostrademons · 6 months ago

It could easily go the other way, where the public doesn't care what people think they did or didn't do and just does whatever they want, because they don't respect the state and believe the social contract has been broken. "Fuck justice, talk to my AR-15."

There's ample evidence that this is already happening, eg. recent headlines about kids being radicalized at increasingly younger ages, groups like No Lives Matter that embrace violent nihilism, increased domestic terrorism, record high gun ownership across both sides of the political spectrum, authority figures that just do whatever they want and ignore any form of law or accountability, etc.

bilbo0s · 6 months ago

I don't know man?

We're already at a place where most people don't care what other people think of them.

Issue is, as long as the government has the big guns, what the government thinks of you will still matter in a major way.

In such an environment, most people are going to choose to have some kind of way to prove to the government what they did and what they didn't do. Not because they care what other people think as you're implying, but rather because they very much care that the government not get the wrong idea about them. Because the government getting the wrong idea about you can be fatal.

PolieBotics · 6 months ago

I've developed a novel approach to creating tamper-evident video via cryptographic feedback loops between projectors and cameras. The process works as follows:

1. A projector displays a challenge pattern (Perlin noise derived from of a hash) 2. A camera captures this projection 3. The system hashes the captured image concatenated with the previous hash and uses it to derive the next projection 4. This chain demonstrates true temporal sequentiality that's difficult to forge

By incorporating random noise derived from Byzantine Fault Tolerant networks and using these networks as timestamping servers, the proofs inherit the network's decentralization properties. ML then confirms that the feature distributions in projection-photograph pairs match expected patterns from the training dataset.

Demo video and GitHub repo available here: https://www.reddit.com/r/PoliePals/comments/1j8qm2j/truth_be...

o11c · 6 months ago

There are actually (at least) 3 different places where cryptography is needed here:

* Proof that this starts after a given time. Traditionally this has used methods like "this is the headline of a major newspaper today", which is limited to 1-day granularity and has problems if you can just generate a large number of expected headlines and use them in parallel. But with crypto, we can just query any random-number-timestamp-signing server, and a network of such servers can mutually sign each other's previous packets so it's very reliable both against downtime and against attacks.

* Proof of sequencing. This is trivial with a chain of hashes, though it does prevent recompression.

* Proof that this ends before a given time. This requires actively submitting your signature data to a timestamp server for additional signing, which is a much more complicated task than the initial half. It is still possible to eliminate the single source of vulnerability, but much more work.

"Camera looks at monitor" is going to be a much cheaper way to make this air-gapped than adding a projector. And this doesn't strictly need to be continuous; most things are tolerant of one-day granularity and almost everything of 15-minute granularity.

Largely true, although submitting timestamp hashes to the blockchain is probably the easiest bit.

"Camera looking at a monitor": While that might be simpler in some setups, it doesn’t really solve my main issue. I want the signal to permeate the entire scene, not just appear in the corner of a display or overlaid on the video. By projecting the challenge onto all visible surfaces, we create a physical environment that’s difficult to fake (since you’d have to convincingly generate or remove those patterns in real time). Air-gapping isn’t really the goal right now.

Finally, we're need much finer granularity than 15 minutes! The point is to lower the generation time below what is achievable via generative model.

Thank you for the comment, and I hope these clarifications are useful. It's a new concept, so please forgive the clumsiness with which I may be communicating it.

asddubs · 6 months ago

pretty clever, kind of like cipher block chaining

Thank you! It is indeed a little like a signature based on proof-of-projection.

As they say, once you have a signature, you have a most of a cryptosystem. I've been experimenting with those and other applications of non-linear functions in projector-camera systems.

https://github.com/poliebotics/PolieBotics

recursive · 6 months ago

Here's my plan.

1. Cryptographically hash each piece of media when it's recorded.

2. Submit the hash to a "trusted" authority.

3. It will add a timestamp and sign the result.

4. Now, as long as you keep the original, without re-compressing, and you trust the authority, you have some evidence that the media existed at a timestamp. On or before.

This doesn't prove authenticity, but in many cases, establishing a timestamp would be enough. Forgeries probably wouldn't be created until later, after the shit hit the fan.

Or maybe this doesn't work at all.

AnthonyMouse · 6 months ago

Thieves are planning an inside job. They forge the surveillance video ahead of time, do the theft and submit the forgery to be timestamped while it's happening.

Also, a lot of surveillance systems are purposely kept offline to prevent them from being compromised, but your system doesn't allow that because they would need external connectivity to get signatures.

lolc · 6 months ago

Sure when you're controlling the source, you can fake it. But requiring the fakes be prepared ahead of time locks the faker into one story that may be contradicted by other evidence.

Edman274 · 6 months ago

This technique is known as an Ella Fitzgerald to those in the heisting industry, as popularized in the documentary Oceans 11.

coppsilgold · 6 months ago

You can do this already by yourself or use a service which you need not even trust.

Every x seconds, collect all the pieces of data that need to be notarized as existing in the world prior to some time t. It can be an arbitrary amount of data.

Construct a Merkle Tree[1] of all the hashes (or HMACs) of all the data to be notarized. Compute the Merkle Root. Make sure that everyone gets their Merkle Proof (path from leaf to root) or publish the Merkle Tree publicly.

Embed the Merkle Root into a one or more cryptocurrency blockchains to exploit their immutability guarantees. By either including it as "additional data" to some transaction or just straight up as a fictional cryptocurrency address.

Every piece of data processed in this way will have a Merkle Proof (cryptographic path from hash of the data to the Merkle Root) that proves it existed prior to the creation of the Merkle Root. The Merkle Root will have its creation time bounded by the proof of work conducted on the cryptocurrency blockchain.

[1] <https://en.wikipedia.org/wiki/Merkle_tree>

There are some tradeoffs here. You need to do currency transfers get a timestamp, with all the constraints that comes with. But you get probably a more trustworthy authority.

Isn't this basically achieved by standard cloud storage on phones? Photos and videos get uploaded automatically (with authentication for the user account) and presumably that action is logged somewhere the user can't edit. Just need to prove those logs are secure, and there you go.

echoangle · 6 months ago

Unless you’re they police in a murder case, you’re probably not going to get google or apple to give you something that certifies that a file was uploaded before a specific time and not modified after. And just showing me the modification date in the UI wouldn’t convince me.

tgv · 6 months ago

gortok · 6 months ago

What shocks (and irritates!) me is that Charles Schwab keeps wanting me to set up voice ID. Why would I want to set up a voice ID for something that is now trivially spoofed?

alamortsubite · 6 months ago

When was the last time you encountered this? I remember getting nags up until around the end of last year, but not lately. I like to think they dropped the program because I expressed concerns about it to so many reps, but more likely I've just been dialing in on a different number.

Today. I get an interstitial every time I log in that tries to get me to enable their VoiceID system (here's a screenshot): https://georgestocker.com/wp-content/uploads/2025/03/schwabv...

MichaelDickens · 6 months ago

Schwab used to have an 8 character maximum on passwords (although at least they changed that). They have never been a paragon of good security practices.

They've presumably been good enough that their losses are less costly than better security.