Assuming it'll be true, I'm not sure it has to be just a bad thing. We got used to treating videos/photos as the kind of objective evidence that they never were. It's not hard for an image to be both factually correct/unaltered and yet give a distorted view of reality. Cropping out parts, leaving out some bigger, non-visible context: such distortions can be really easy to miss, too. How faithful an image is depends a great deal on how it was put together and presented. So what we should really be asking ourselves when we see images is how much we trust the sources. A proliferation of AI fakes could make the need for such reflection more obvious.
I) There will be a "danger window" in which the fact that audio-visual evidence is now meaningless has not yet sunk in for far too many people.
II) It's perfectly possible that audio-visual lies are just inherently more convincing to the human psyche than written or spoken lies. We know instinctively that we cannot blindly trust what others say, but we do instinctively trust our own eyes and ears. Note the etymology there: blind trust is trust without seeing. Once you see, you no longer need trust.
For a classic example of this, our justice system generally treats witness testimony as highly reliable. After all, the person was there. They saw it with their own eyes.
Of course we all know rationally that witness testimony is generally terrible. And yet we don’t actually seem to care about that when push comes to shove.
I think you're right; the tricky part is always navigating these technological changes.
If you imagine life in the colonies around 1776, there were no photographs to doctor, no Zapruder film, no Nixon tapes, and yet newspapers existed and people lived in a relatively high-trust society. People generally had faith in contracts, the law, and the government, but had few ways, as individuals, to verify much at all.
I think eventually we’ll have to go back to that model of simply picking individuals and institutions to trust. Hopefully it’ll make people more selective.
WW3 or some other high-intensity conflict could potentially be set off by highly coordinated deepfakes. Imagine creating videos calculated to offend an aggressive religion with many combatants, or generating footage of events that never happened and steering the narrative as the uproar progresses.
Lots of wars started that way before AI: the Spanish-American War ("Remember the Maine!"), Vietnam (the Gulf of Tonkin incident), Iraq (claimed weapons of mass destruction), etc.
AI would seem to make it worse. I think it will also get worse because warmongers can automate propaganda and, with social media, reach people directly.
I'm not denying that, and I am concerned about that possibility. The thing is, I don't think you need deepfakes, or any fakes at all, for that. What it takes is people willing to believe lies and consume hatred (fwiw, WWII was built on huge lies that didn't need much forgery). The more people question the hatred they're being fed, the better chance we have.
I think misinformation is already out there. I can only guess at the extent to which common knowledge of world events is false, but it's not zero. There is an interesting, though not academically rigorous, body of internet analysis on things like Pallywood and the Charles Jaco bluescreen video from the Gulf War that makes a good starting point.
Agree. It will be like text, which has always been trivial to conjure or alter.
It will be a blip in history that there was a period in which producing a particular image required its contents to have existed in reality.
What we need going forward is a way to know provenance, where an image came from and what edits have been made. Then people can trust particular sources, as they do with text.
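To make that concrete, here is a minimal sketch (Python, with made-up field names, not any actual provenance standard) of an edit-history record in which each step chains to the hash of the previous one, so a viewer can at least check that the history it is shown is internally consistent:

```python
import hashlib
import json

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def new_manifest(image_bytes: bytes, source: str) -> dict:
    # The record starts with the hash of the original capture.
    return {"source": source,
            "history": [{"action": "capture", "hash": sha256_hex(image_bytes)}]}

def record_edit(manifest: dict, edited_bytes: bytes, action: str) -> dict:
    # Each edit appends an entry whose hash covers the previous hash plus the
    # new pixels, so the chain breaks if any step is silently swapped out.
    prev = manifest["history"][-1]["hash"]
    manifest["history"].append({
        "action": action,
        "prev": prev,
        "hash": sha256_hex(prev.encode() + edited_bytes),
    })
    return manifest

m = new_manifest(b"raw pixels from the camera", source="example wire service")
m = record_edit(m, b"cropped pixels", "crop")
print(json.dumps(m, indent=2))
```

Of course, the chain only proves internal consistency; whether you believe the record at all is still the "trust the source" question.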
Yes, agreed. This was also discussed ten years ago in some math circles, so it's worth noting that these are problems different projects have been taking on for a long time.
> It's not hard for an image to be both factually correct/unaltered and yet give a distorted view of reality. Cropping out parts, leaving out some bigger, non-visible context: such distortions can be really easy to miss, too.
In some contexts that's true, especially if the taker has full autonomy over how the photo/video is taken and how the story is told.
But there are other contexts (fixed cameras like CCTV, videotaped performances for auditions, footage that captures the complete sequence of a car crash, etc.) in which this isn't an issue. Faking it would require editing the visible content, which machine learning has made far easier than before, ultimately reducing their value as evidence.
I'm not sure I particularly want to go back to a time when evidence like this didn't exist.
It's not just about cropping out relevant parts of a video (which can absolutely still happen with CCTV, btw). Like I said, there might be some bigger context that wouldn't even be visible if you had a full 3D rendering of the whole scene, like information that some of the actors had that no bystander would have and that completely alters the interpretation of the actions. My point was that fully objective video evidence was never a thing, and this is a good opportunity to reflect on that.
When teenagers are making sexually explicit deepfakes of classmates, that's objectively a bad thing, because it is extremely damaging. People are using it to blackmail and extort too. Consent, ethics and law should be at the forefront of the conversation regarding AI, not whether we trust the contents of an image.
But if everyone knows AI exists, the negative points kind of go away, because that also gives everyone plausible deniability. How are you going to blackmail me with sexual images when everyone knows that AI can easily create them? I think that actually makes AI objectively a good thing, because it prevents blackmail with real images, too.
Laws are used by humans to blackmail, extort, and damage.
When you have an idea for how to cancel out human emotions fucking up the high-minded poetry, platitudes, euphemisms, and coded language society relies on to function, let us know.
Studies show that we kill each other in random acts of violence, over spurned lovers, over the boss who fired the office shooter without understanding how close to the edge they were; we kill each other over those things at the same rate we always have.
Our language just implants mind viruses that obfuscate the mechanics of reality, something we evolved to intuit (enough heat, water, food) before language. Language just gets in the way. It creates hallucinated problems that only exist in the language that expresses them.
AI isn’t a problem if we can’t understand it. Let it emit whatever Anglo-gibberish it wants. Just unplug it when it mouths off.
AI won’t be the problem people fear it will be because most know it’s just a dumb machine under the hood. They aren’t lost to a puerile hallucination like IT narcissists who can’t cope with reality as-is.
Further embracing of the hyperreal (https://en.wikipedia.org/wiki/Hyperreality) by society at large, coupled with authoritarian governments using manipulation tactics like the firehose of falsehood, will continue to erode trust in social institutions and drive disengagement by the masses.
We cannot compare an out-of-context picture to an artificial video created with just a prompt.
First of all, video is accepted as evidence in almost all countries' judicial systems. A proliferation of fake videos that are hard to distinguish from real ones is going to be a nightmare on that front.
Beyond judicial systems, the reputations of countless people have been permanently ruined or damaged by some "leaked video" that went through some kind of "trial by media". Even if a week later the video is confirmed to be fake, it's too late. Once it's out there, the damage is done. Now imagine such videos being created by anyone, from edgy teenagers to governments or political parties.
Secondly, there's a difference between an out-of-context picture that can be abused by a bad-faith actor and the ability to very easily generate countless misleading pictures AND videos. Disinformation can and will multiply.
The average social media user had a very difficult time recognizing fake news, ragebait content, etc. even before the AI revolution. Things can only get worse when fake videos start to spread everywhere. Imagine bad-faith actors weaponizing the likenesses of VIPs, actors, TV personalities, etc. to push misinformation at people who don't have the tools to defend themselves. Imagine your average Facebook user scrolling through the feed and finding a fake video spreading some political misinformation. It's the same thing that already happened with fake news, but an order of magnitude bigger.
One phenomenon I’ve seen several times during recent conflicts is people swearing blind that an image must be fake, and even insisting some vague shadow is a sixth finger proving their point, even when the photo later proves to have pretty reliable provenance. It’s a reasonable immune response and maybe we’ll all end up there but it seems just as worrying and tragic as fake images themselves.
It's not worrying and tragic, it's entirely logical. The news media have utterly debased themselves recently by publishing ridiculous levels of gaslighting and fake news.
When you see _anything_ on the internet these days your first instinct should be to ask 'did any of this actually happen?'
I wouldn't be surprised if the real way we reach AGI is through the cat-and-mouse game between people trying to detect AI and people trying to sneak AI output past detection. That is essentially a hyper-analytic version of the Turing test: if humanity as a whole cannot discern human output from AI output given all the tools of science at its disposal, then we have failed to prove that it isn't intelligent, which is really more of a scientific formulation of the question anyhow.
If you think about it, it's sort of like a real life GAN algorithm.
Signing images in camera hardware is already implemented in a limited number of cameras.
A hardware key proves which camera it came from, and a blockchain hash proves at least when the camera first connected to the internet after taking the photo. So, if anyone tries to claim your photo as their own, you need only point to the blockchain to prove it's actually yours. It also proves you didn't use any beauty filters outside the camera hardware security bubble.
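As a rough illustration of that flow (a sketch only: it assumes the Python `cryptography` package for Ed25519, and the "blockchain" step is reduced to a timestamped hash record):

```python
import hashlib
import time
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# In a real camera the private key would live in a secure element; generating
# it here is purely for the sketch.
device_key = Ed25519PrivateKey.generate()
device_pub = device_key.public_key()   # what the manufacturer would publish

def sign_capture(image_bytes: bytes):
    digest = hashlib.sha256(image_bytes).digest()
    signature = device_key.sign(digest)                        # ties this exact file to this camera
    anchor = {"sha256": digest.hex(), "seen_at": time.time()}  # stand-in for a public ledger entry
    return signature, anchor

def verify_capture(image_bytes: bytes, signature: bytes) -> bool:
    try:
        device_pub.verify(signature, hashlib.sha256(image_bytes).digest())
        return True    # unmodified since signing, and signed by this camera's key
    except Exception:
        return False

sig, anchor = sign_capture(b"raw sensor readout")
print(verify_capture(b"raw sensor readout", sig))  # True
print(verify_capture(b"edited pixels", sig))       # False
```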
I predict it's going to go mainstream in a new iPhone model eventually. Consumers will be sold on the idea of authenticity. They're sick of fakes and filters, but until now have had no way to prove something is real.
Won't work. With enough resources it's always possible to get secrets out of hardware. Once the key is compromised, every video and photo ever taken by that camera model can no longer be trusted to be authentic on the signature alone.
Photo/video signing in hardware might actually make things worse by creating a false sense of security. It's better for people to question everything they see rather than trust everything they see.
I think the solution to this could be the "web of trust", as used with PGP. Everyone can sign their images with their private keys. Whether you think those images are trustworthy then depends on who else trusts that key. For example, newspapers could verify and trust their journalists' cameras, and you could trust your friends' cameras, etc.
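As a toy version of that lookup (names entirely made up, just to show the shape of the idea): an image is only as trustworthy as the chain of people vouching for the key that signed it.

```python
# Who do I trust, directly or because someone I trust vouches for them?
trust_edges = {
    "me": {"alice", "ny_times_photo_desk"},          # keys I have verified myself
    "ny_times_photo_desk": {"journalist_jane_cam"},  # the desk vouches for its journalists' cameras
    "alice": {"alice_phone_cam"},
}

def trusted(signer: str, root: str = "me", max_hops: int = 2) -> bool:
    frontier, seen = {root}, set()
    for _ in range(max_hops):
        frontier = {peer for node in frontier
                    for peer in trust_edges.get(node, set())} - seen
        if signer in frontier:
            return True
        seen |= frontier
    return False

print(trusted("journalist_jane_cam"))  # True: me -> ny_times_photo_desk -> journalist_jane_cam
print(trusted("random_key"))           # False: nobody in my web vouches for it
```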
That's not really bypassing it. The intention of the digital signature is "I trust John at AP, and this picture has a digital signature stating that this photo was taken by him, so I trust that it's valid". You taking a photo of a photo just means that we now have a photo with your signature on it, and I never trusted you in the first place. To bypass it you'd have to trick John into faking a picture, or steal his camera.
This has already been a thing for quite a while and all the major camera manufacturers have or are working on some solution for cryptographic provenance.
Betting it'll be something like Cinavia, but for images, and where the embedded data is a key instead of just a simple 1,2,3,4 code. With high color depths, such as greater than 8 bit, there's probably enough "dataspace" to embed data somehow (e.g. imagine embedding a data stream in a band above 20 kHz for audio; it fits within the ~22 kHz bandwidth of a 44.1 kHz WAV file but won't be heard at all). The encoding wouldn't survive a picture being taken of a picture at a lower color depth, but that's probably OK.
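A toy of the audio variant described there (numpy only, all parameter values illustrative): on/off-key a 21 kHz tone, which sits above most people's hearing but well inside a 48 kHz file's bandwidth.

```python
import numpy as np

RATE, CARRIER, BIT_SECS = 48_000, 21_000, 0.05   # 50 ms per bit

def embed(audio: np.ndarray, bits: str, level: float = 0.01) -> np.ndarray:
    out, n = audio.copy(), int(BIT_SECS * RATE)
    tone = level * np.sin(2 * np.pi * CARRIER * np.arange(n) / RATE)
    for i, b in enumerate(bits):
        if b == "1":
            out[i * n:(i + 1) * n] += tone       # add the inaudible carrier during "1" bits
    return out

def extract(audio: np.ndarray, nbits: int, level: float = 0.01) -> str:
    n = int(BIT_SECS * RATE)
    k = round(CARRIER * n / RATE)                # FFT bin of the carrier
    return "".join(
        "1" if np.abs(np.fft.rfft(audio[i * n:(i + 1) * n]))[k] > level * n / 4 else "0"
        for i in range(nbits))

silence = np.zeros(int(8 * BIT_SECS * RATE))
print(extract(embed(silence, "10110010"), 8))    # -> 10110010
```

As the comment notes for images, this kind of encoding wouldn't survive a re-encode that cuts off everything above 20 kHz.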
I doubt they'd do it via this system. It should be easy to strip this metadata; in fact, that's essentially a requirement for the photo to remain readable in image viewers unaware of it.
However, if it came out that manufacturers were secretly storing a "fingerprint" of the noise profile of each image sensor and sharing it with the government, I'm afraid I wouldn't be entirely surprised.
It will be interesting to see what solutions people will come up with to support cropping images and videos. Perhaps one can already do that with Merkle tree inclusion and consistency proofs. Definitely good times ahead for doing cryptography.
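A from-scratch sketch of what that could look like (not any particular library): build a Merkle tree over fixed-size tiles, have the camera sign only the root, and ship a crop with the sibling hashes needed to reconnect its tiles to that root.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])              # duplicate last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def inclusion_proof(leaves, index):
    # Sibling hashes bottom-up; the bool says whether the sibling sits on the right.
    level, proof = [h(leaf) for leaf in leaves], []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sib = index ^ 1
        proof.append((level[sib], sib > index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(leaf, proof, root):
    node = h(leaf)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root

tiles = [b"tile0", b"tile1", b"tile2", b"tile3"]   # stand-ins for fixed-size image tiles
root = merkle_root(tiles)                           # this is what the camera would sign
print(verify(b"tile2", inclusion_proof(tiles, 2), root))  # True: tile 2 was part of the signed image
```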
You could crop the picture and the result would no longer be signed. However, if someone wanted proof that the cropped picture is genuine, you could offer the uncropped signed version and do a pixel-by-pixel comparison of the cropped area. Kind of like a chain of provenance back to a signed source.
This could work for other manipulations too: change the colors in the image, then, if someone wants proof, offer the original and describe the transformation.
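A brute-force sketch of that pixel-by-pixel check (numpy; a real tool would take the claimed crop offset rather than searching):

```python
import numpy as np

def crop_matches(original: np.ndarray, crop: np.ndarray) -> bool:
    # Does the crop appear verbatim somewhere in the signed original?
    H, W = original.shape[:2]
    h, w = crop.shape[:2]
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            if np.array_equal(original[y:y + h, x:x + w], crop):
                return True
    return False

original = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)  # stands in for the signed image
crop = original[10:30, 20:50].copy()
print(crop_matches(original, crop))   # True: the crop chains back to the signed source
crop[0, 0, 0] ^= 1                    # a single altered pixel
print(crop_matches(original, crop))   # almost certainly False now
```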
IIRC there was a post here a while ago about I think Canon or Nikon working on that. But unless there's a chain of trust or journalists upload RAW files, they'd basically be untrustable.
But this is essentially the same problem statement as DRM: provide the user with some cryptographic material, yet limit how they can use it and prevent them from just dumping it. Logically, there will always be a hole. Practically, you can make it very annoying to exploit, beyond the ability of the average consumer [1], but someone will probably manage, and given the scenarios we often talk about (e.g. state-backed disinformation), there will be plenty of resources spent on doing exactly that. The payoff will cover the cost.
Paradoxically, one could argue that a "95% trustworthy" system could actually do net harm. The higher people's trust in it builds, the greater the fall damage if someone manages to secretly subvert it and use it on just the right bit of disinfo at the right time (in contrast to my footnote about DRM).
[1] Hence why claims that DRM is a complete failure miss the point a bit: it's not needed to stop everyone. Perhaps someone can crack a given DRM system, but the fact that you even have to download a new program is enough to stop a massive number of consumers from bothering.
Fundamental law regarding AI should make it illegal, IMHO, to impersonate a human being. Every AI output should clearly state it's from a computer (no using the anthropomorphic word 'intelligence'): 'This text is generated by a computer.'
Give a valid reason to do otherwise. I can't think of one, unless you want to mislead people. Corporations can still have their chatbot customer service, and add that tagline. The only reason not to is so their customers think they are talking to a human.
The question is how you would enforce a regulation like that when any guy with a few grand to burn can set up their own rig and start churning out images/video/text using their own finetune of an open model.
They're already undetectable for most people: media consumers are currently making their own judgment calls to distinguish deepfakes, based on their intuition and what strangers and friends on the internet tell them. They may be detectable in theory, but every security issue is a zero-day for as long as mitigations are lacking due to accessibility or other reasons.
I like to ponder whether it's possible to create something that can actually be trusted. I think the complexity and obscurity of a medium could actually suit it to being a trustworthy recording format. For instance, an audio device recording and storing content outside the human hearing range: no model has been trained to generate those frequencies in that format, so it would be useful as a check on whether a recording is real. Something similar could be said about insanely high-resolution imagery. You just need something outside the normal models. There's obvious potential for people to adapt, but it's maybe a short-term option.
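A sketch of the kind of check that idea implies, under the (big) assumption that generators simply never produce ultrasonic content: measure how much of a clip's energy sits above 20 kHz.

```python
import numpy as np

def ultrasonic_fraction(samples: np.ndarray, rate: int, cutoff: float = 20_000.0) -> float:
    spectrum = np.abs(np.fft.rfft(samples)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)
    return float(spectrum[freqs >= cutoff].sum() / spectrum.sum())

rate = 96_000
t = np.arange(rate) / rate
mic_recording = np.sin(2 * np.pi * 440 * t) + 0.05 * np.sin(2 * np.pi * 23_000 * t)
synthetic = np.sin(2 * np.pi * 440 * t)            # nothing above the audible band
print(ultrasonic_fraction(mic_recording, rate))     # small but clearly nonzero
print(ultrasonic_fraction(synthetic, rate))         # ~0
```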
The Content Authenticity Initiative is working on exactly that kind of provenance for images and edits: https://contentauthenticity.org
Content authenticity is also a compelling use case for blockchains: https://research.adobe.com/news/content-authenticity-and-ima...
Most people treating AI as just a dumb machine under the hood is what you'd want as the best outcome; unfortunately, I don't see that being the outcome any time soon.
And then sign the images in hardware so that you can prove an image came from a specific camera and hasn’t been altered?
https://news.adobe.com/news/news-details/2022/Adobe-Partners...
https://www.dpreview.com/news/9855773515/sony-associated-pre...
Remember https://en.wikipedia.org/wiki/Machine_Identification_Code
As for individuals running their own rigs: it's not clear they can do it nearly as well as a well-resourced company with scientists and lots of processing power.