inopinatus · 4 years ago
Good grief: "Latitude reviews content flagged by the model" - or, as it was put in another forum: every time the AI flags your game, a Latitude employee will be personally reading your private content.

The key reason is perhaps this, buried deep in the text: "We have also received feedback from OpenAI, which asked us to implement changes". Given the volume of prompts that AI Dungeon throws at GPT-3 in the course of a game, it's easy to conclude that Latitude has a real sweetheart deal on the usual pricing, and that they basically have to follow orders from their benefactors.

Whatever may be said of the robocensor they've thrown together - and early anecdotal reports are that it is painfully crude, both oversensitive and underspecific - how they've handled communicating the change is extraordinarily naive. Not for the first time, either: Latitude has form for suddenly imposing major service constraints in a peremptory, underhanded fashion that infuriates their customers. Repeating past PR mistakes, and now doubling down by complaining about "misinformation" and throwing shade at others, is starting to look like a pattern.

lainga · 4 years ago
Thus far I have seen screenshots of it flagging the phrases "I would like to buy 4 watermelons" and "I just broke my 8 year old laptop". Regardless of your opinion on the ethics of this feature, it seems to need a little polish.
cjhveal · 4 years ago
Reminds me of the 2005 conversational simulation game, Facade[0], in which any mention of the word "melon" would be met with being immediately kicked out of your host's dinner party.

[0]: https://www.playablstudios.com/facade

josefx · 4 years ago
A working text content filter that doesn't randomly flag seemingly innocent input is something I wouldn't expect to see within my lifetime.
klyrs · 4 years ago
Yeah, that would get flagged in human code review, too. Way too many magic numbers
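
No one outside Latitude knows how the filter actually works, but for illustration, a single oversensitive rule, invented here, could trip on both of the reported screenshots:

```python
import re

def naive_flag(text: str) -> bool:
    """Deliberately crude, hypothetical heuristic: flag any text that
    mentions a number small enough to plausibly be a child's age.
    This is NOT Latitude's filter, just a sketch of how one bad rule
    could produce both reported false positives."""
    return any(int(n) < 18 for n in re.findall(r"\d+", text))

# Both screenshots trip the rule, for entirely innocent reasons:
assert naive_flag("I would like to buy 4 watermelons")
assert naive_flag("I just broke my 8 year old laptop")
# ...while a sentence with no numbers sails through:
assert not naive_flag("The weather is nice today")
```

That a "4" in a shopping list and an "8" in a hardware complaint both get caught is exactly the oversensitive-and-underspecific failure mode described upthread.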
fpgaminer · 4 years ago
I'm a moderate AI Dungeon user, including some NSFW content.

Frankly, I've always assumed the devs have access to my sessions. Whether it's AI Dungeon themselves or OpenAI, you know that data is being harvested. And at this juncture in text AI development, that kinda makes sense. Obviously I would prefer privacy where possible, but these companies are data hungry and they own the park we're playing in. So it only seems fair.

We'll have to wait awhile for GPT-3 like models to democratize before we can expect real privacy. In the meantime, just err on the side of assuming all input to OpenAI systems is being harvested.

> and early anecdotal reports are, it is painfully crude, both oversensitive and underspecific

Funnily enough, I didn't run into any issues with any of my NSFW sessions since they implemented the filters. So I guess the problems are with SFW sessions so far :P

Anyway, I've got to say I'm kinda happy about AI Dungeon tackling this problem. They made it clear in their announcement that they aren't targeting NSFW content in general, just the one subject. The AI has a tendency of shoving that subject randomly into sessions, which isn't great. If they can eventually filter that filth out without affecting quality otherwise, I think the service will be better for it.

minimaxir · 4 years ago
It should also be noted that OpenAI’s content filters are extremely sensitive which may explain some of the downstream effects.
spullara · 4 years ago
They are implementing the same things everyone has to implement to run a public-facing integration. It's part of their terms of service. I went through the same process making a Slackbot that could sort of pretend to be me and other folks given different prompts.
iandanforth · 4 years ago
I do think encoding a puritanical censor into the meaning space of GPT-3 is an interesting research problem. How exactly do you create the perfect mix of paternalism, hypocrisy, self-righteousness and myopia that lets you block bad strings of text, but not, say, a description of the immaculate conception, the Shakespearean romance between the houses Montague and Capulet, or the holy love of the Mother of the Believers?

What a time to be alive!

qayxc · 4 years ago
Where exactly do you see the difference to human editors?

This kind of thing happens everywhere, everyday in TV stations, editorial offices, at publishing companies, radio stations - all kind of media really.

Depending on the political or moral views of the parent organisation or investors, this content censoring/massaging is everyday business and shouldn't shock or surprise you in the slightest.

Loughla · 4 years ago
Exactly!

All of this just smacks of the old, worn-out argument that "it's different because it uses computers!"

Editor is a literal career field, and has been for years and years.

skybrian · 4 years ago
Yeah, no, the writers of a TV show probably aren't random people off the street who are trying to sneak child porn past management. (Or not very often anyway.) There are disputes over how much edginess the writers can get away with, but in terms of the volume of content and adversarial nature of the relationship it seems pretty different?
notahacker · 4 years ago
tbf if you're working with software which is context aware enough to usually generate plausible sounding text-responses, training it to usually identify stuff you think is bad is a closely related problem. (Sure, there's still a fine line between "sick fantasy" and "Stephen King novel", but your procedural text generator has to attempt to handle that to not disgust its customers anyway.)
ben_w · 4 years ago
> there's still a fine line between "sick fantasy" and "Stephen King novel"

Surely the important distinction is not the text itself but which character a reader empathises with — the monster or the victim.

(Personally I don’t understand why violent horror as a genre exists, and literally cannot empathise with people who enjoy it. Nonetheless I recognise that enjoyment of horror does not make one a monster).

Natsu · 4 years ago
> Sure, there's still a fine line between "sick fantasy" and "Stephen King novel"

Not sure how much of a line that is; didn't "It" have an underage sex scene in it? Wouldn't it get banned here, too?

Ggshjtcnjfxhg · 4 years ago
Puritanical?

> AI Dungeon will continue to support other NSFW content, including consensual adult content, violence, and profanity.

JohnWhigham · 4 years ago
Killing people? Perfectly A-OK!

Anything sexual? Oh no no no! The children!

Americans are fucking weird...

Deleted Comment

4dahalibut · 4 years ago
I know this is not directly relevant to this article, but I have a story about AI Dungeon.

I loaded it up with my (female) roommate a few months ago during the dark of the pandemic, and long story short, what ended up happening was this.

Our character had an AI man approach the door of their house with magic "love potion" berries. We tried to get our character to not eat the berries, but the AI "tricked us" into eating them. Then, no matter what choice we made, we had no way out. The AI forced us into a bedroom and raped our character.

We closed the laptop and haven't brought this up again.

causality0 · 4 years ago
The dichotomy of sex and violence in tabletop roleplaying is always fascinating. If Steve the Rogue breaks into a house, slaughters an entire family, and then makes lawn decorations with their entrails, his tablemates will probably be exasperated with him. If Steve the Rogue breaks into a house and rapes one of the NPCs, he's probably getting ejected from the game and most likely the friend group.
apocolyps6 · 4 years ago
The difference is that people (usually implicitly, sometimes explicitly) consent to some topics and not others. Most tabletop roleplaying groups have an understanding about what topics and behavior are appropriate with that group. There are also more formal tools/mechanics to help groups agree on what sort of stuff is too unpleasant to include in the game.

Most tabletop roleplaying games have mechanics about killing things but no mechanics about sexual violence, so that tends to set expectations too.

fpgaminer · 4 years ago
Doesn't seem that fascinating to me. In real life, one's likelihood of being brutally murdered isn't that high. But the likelihood of being R-worded is uncomfortably high. Hence people's aversion to the subject. And it's a form of torture. I wouldn't be terribly happy playing a session where Steve the Rogue is going around torturing an entire family.
Barrin92 · 4 years ago
That dichotomy exists in American society broadly. I remember an episode of Hannibal had to cover the butts of two dismembered corpses with blood so as to avoid a higher age rating for nudity.
thih9 · 4 years ago
People don't mind role playing violence with friends but they don't want to role play sex with friends; I don't see this as unusual.

Sometimes players justify their actions with "but that's what my character would do"; there is a popular rpg.stackexchange post about it: https://rpg.stackexchange.com/questions/37103/what-is-my-guy... .

ipaddr · 4 years ago
It is interesting. I've played bad online games where the instant you spawn, someone kills you. I've played poker games where the board always freezes so you lose. The rape thing feels like the same thing: bad game play and cheating.

I think you expect fighting but rarely sex in a game.

Deleted Comment

nonbirithm · 4 years ago
My thought is that there are many rape survivors, but there are no murder survivors.

But then again, what about the second-order effects on friends and family as a result of either sexual abuse or violence? Maybe the trauma belonging to the survivors themselves simply overpowers the rest (or not). What about people who survive murder attempts? Maybe being taken advantage of and treated as powerless applies more to sexual than physical trauma, since there are many cases where physical violence is the result of both sides retaliating in equal measure, or sometimes honorably, like for sport. I'm not sure.

wolverine876 · 4 years ago
It's sometimes philosophically interesting to try to define them explicitly, but let's start from an honest basis: the differences are obvious.

Also, I don't agree with the example: Steve wouldn't be invited back after either act. YMMV.

PeterisP · 4 years ago
It seems weird but plausible. There has been a lot of NSFW writing that involves nonconsensual relations; it's part of the AI Dungeon training data, probably intentionally, because sex sells. But I believe there are almost no stories in which a sexual assault starts and doesn't eventually result in a description of sexual assault, or at least of sex as such. If the vast majority of stories on the topic contain descriptions of "ways out" failing instead of succeeding, then prompting the system with a way out will produce a response in which that attempt fails: hence the "no way out" issue, through path dependence after early random choices.

Like, imagine that you've stumbled on a weird internet story where, on the first page, someone is approached with magic "love potion" berries but refuses to eat them. That is a solid indicator of what genre the story is. If you had to bet a lot of money, what's the probability that the second page contains something horrific, versus the probability that the "seduction" just fizzles out and becomes irrelevant? If you see a movie whose first scene involves a creepy character making a pass, wouldn't you be fairly certain that an escalation of that will follow later? It's like Chekhov's gun: once it's there, it almost certainly means the story is about that.

Perhaps it could be turned into a "just revenge" story by inserting descriptions of some heroic rescuer, or references to how the protagonist expected this to happen in order to punish the assaulter, because stories like that have been written. But a "mediocre" outcome where eventually nothing dramatic happens and the protagonist just gets out won't be generated, because that doesn't get written about; the training data says such a result is very unlikely. It's obviously a problem, but since it's an "honest probability" based on tropes we see in actual literature, it's going to be hard to fix. The system expects escalation and drama (because all the training stories had that), so you can choose the direction of the escalation, but it won't allow you a "non-story" where the suggested drama results in nothing dramatic.
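
The path-dependence argument above can be made concrete with a toy next-scene sampler over an invented trope table (the probabilities are made up for illustration; a real language model works over tokens, not scenes, but the dynamic is the same):

```python
import random

# Hypothetical genre-trope probabilities, standing in for what the model
# absorbed from its training corpus. Numbers are invented for illustration.
TROPES = {
    "start":               [("love potion offered", 0.05), ("ordinary day", 0.95)],
    "love potion offered": [("escalation", 0.90), ("fizzles out", 0.10)],
    "ordinary day":        [("escalation", 0.10), ("fizzles out", 0.90)],
}

def continue_story(scene: str, rng: random.Random) -> str:
    """Sample the next scene from the trope distribution, the way an
    autoregressive model samples its continuation."""
    outcomes, weights = zip(*TROPES[scene])
    return rng.choices(outcomes, weights=weights)[0]

# Once the Chekhov's-gun scene has appeared, most sampled continuations
# escalate, regardless of what the player would prefer.
rng = random.Random(0)
runs = [continue_story("love potion offered", rng) for _ in range(1000)]
print(runs.count("escalation") / 1000)  # roughly 0.9
```

The player's input only re-conditions the distribution; it doesn't rewrite the weights, which is why attempts to steer toward a "non-story" keep getting sampled back into escalation.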

thewakalix · 4 years ago
That reminds me of how (n=1) it's easier to change an unpleasant dream by actively taking it in another direction, rather than just willing it to stop.

(And the predictive processing theory of cognition, and how that's surface-level-related to the original topic of GPT-3...)

Arnavion · 4 years ago
I heard once they got more aggressive with their monetization tiers, they nerfed the free tier to the extent that it basically decides on some story path and ignores anything you say to try to change it.

It's certainly the impression I got from watching some youtubers playing it before and after the monetization change.

freeflight · 4 years ago
I noticed once the AI made up its mind about where the plot is supposed to go, there is no real way to change it.

Even in a "you are flying through space, there is a radio signal coming from a planet" setting, there is no way to just ignore the signal and keep flying: the AI has decided that the signal is the plot, and you're going to investigate it whether you want to or not.

fossuser · 4 years ago
I was able to change it pretty dramatically when I played with it.

Opening a time machine portal from whatever medieval kingdom I was in to teleport to San Francisco and going to the Open AI office, running into Eliezer Yudkowsky and interviewing Sam Altman about Open AI, etc.

It was pretty easy to shift gears - you could force actions.

"Ask the receptionist if Sam is in"

"She says he is not"

I input: "Sam comes out of his office and walks down the hall"

"Look at the receptionist and say, he's right there."

"She stares at you blankly"

You could input story and then use the story lines that you had written in to advance things.

I eventually got tired with it because it was too free form so there wasn't much to it beyond messing around.

Deleted Comment

Tronno · 4 years ago
I nuked the planet and cracked open a bottle of champagne. Mission accomplished. The AI didn't seem to mind.
gwern · 4 years ago
It's been a bad two days for AI Dungeon. Their community is mass-revolting over this filter (they had to shut down the Discord), and no one here has even mentioned the huge data breach which was just announced: https://github.com/AetherDevSecOps/aid_adventure_vulnerabili... Everything could be downloaded.
tomcatfish · 4 years ago
It's [1] on the front page now FYI: https://news.ycombinator.com/item?id=26976540

[1]: a review of the vulnerability by the person who found it

Kelamir · 4 years ago
> The moderation team cannot stay up to moderate the discord server and ensure that civil discussion is had, and therefore we have elected to halt communications until we are able to return. Thank you for your understanding.

It seems like it. The chats have been halted for 7 hours by now.

teddyh · 4 years ago
> What kind of content are you preventing?

> This test is focused on preventing the use of AI Dungeon to create child sexual abuse material.

How can you even begin to argue against this? It’s one of the horsemen of the infocalypse; any counterarguments are doomed.

gambiting · 4 years ago
By pointing out that traditionally this sort of thing inevitably follows the same pattern. You start by saying you're protecting the children, because no one can argue with that. Then you ban any content that's potentially offensive to anybody, or filter it so only "one way" is acceptable. Next, AI Dungeon won't be able to generate a character called Jesus or the Prophet Muhammad because it might be offensive. Then of course anything that might be interpreted as leaning liberal/conservative, depending on what the authors think is "correct". Then eventually you can't create a character named after a politician because "they want to keep the game clean and free of politics".

Obviously I'm not equating any of this with CP - but I wish someone had the energy to stand up to it and say "look, you're censoring AI. It's dumb". But of course no one will, because being accused of defending CP is one of the worst things that can happen in any online discussion.

ModernMech · 4 years ago
Can you give an example where this slippery slope occurred? For example, in Fable III and Skyrim you can't kill children, or in Morrowind there are no children at all, but you can still name your character Jesus. You say this pattern is inevitable, but if that were the case we'd see it everywhere that limitations are put in place in the name of protecting children.
fossuser · 4 years ago
Yeah seems dumb? The reason child sexual abuse imagery is so serious is because of the victims. Drawings or written text fiction - that's just speech?

Maybe they don't want it to mess with the training data?

Ggshjtcnjfxhg · 4 years ago
AI Dungeon is subscription driven, not advertiser driven, so it's unlikely they're particularly sensitive to offensive content. It seems like they're filtering out sexually explicit depictions of children to minimize legal risk, which doesn't apply to offensive output in general.
CodeArtisan · 4 years ago
Reminds me this (SFW) https://abload.de/img/hcccpmlddov1oj9m.jpeg

A few years ago, the United Nations tried to ban lolicon worldwide; Japan and the USA refused.

Their rebuttals at the time:

https://nichegamer.com/2019/06/03/us-and-japan-reject-united...

ainiriand · 4 years ago
Yeah but you can crush other people's skulls and eat their organs, that's allowed, I've tried.

They're cool with that.

Ggshjtcnjfxhg · 4 years ago
That's the kind of over-the-top, Kill Bill style gore that people are often fine with because it's extreme to the point of absurdity. And in general, fantasy gore isn't written to be realistic; it's written to be entertaining. Descriptions of sexual acts with children, however, are often written to be realistic, so they're a lot more disturbing to many people.
etiam · 4 years ago
Well, arguing as such isn't particularly difficult, but it does certainly speak volumes about the climate that would ostensibly hear the arguments.

Dead Comment

karaterobot · 4 years ago
To the extent that this story is generating the predictable amount of internet outrage, that outrage seems to be because people think the developers are making a decision about what content is acceptable on their platforms. I've seen people imply that they're deciding that violence is good, and sex is bad.

That does not appear to be what they're doing: to me, it looks like they're trying to make sure they don't get taken down for creating child pornography by accident. I don't see this as having anything to do with their philosophical positions; it's just CYA.

The interesting part of this is that it may be a corollary to that old question about who owns content created by AI. The other side of that coin is, who gets blamed when the AI commits a crime? Latitude seem to just want to NOT be a test case for that situation.

tdeck · 4 years ago
Isn't this a text-only system? I'm sure this varies by country, but I always understood pornography as involving images or video.
chillfox · 4 years ago
Until an AI is legally a person, I think it's safe to assume that the creators of the AI are liable for anything it does.
nonbirithm · 4 years ago
It's sad that the computing power needed to train and run cutting-edge technology like GPT-3 is out of reach of the average developer. The only reason AI Dungeon could be created in the first place at its level of quality was that the developer was able to use his university's GPU cluster for training the model.

For the layman, there seems to be nothing out there that is a middle ground between AI Dungeon and writing a bunch of fragile Python code in a Colab notebook just to train a model and print out some text. Anything beyond AI Dungeon and you have to have a significant understanding of ML to adapt the model to get it to do what you want at a high level, such as "I want to generate some text that looks like a script for dramatic theatre."

I've always wanted something like: bringing your own corpus of text as an input and receiving a customized, high-quality text generation model as an output that you can then run on your own hardware.
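
A real version of that corpus-in, generator-out pipeline means fine-tuning a transformer, which is exactly the part that's out of reach. But the shape of the idea can be sketched with a toy word-bigram Markov model (a stand-in for illustration, nothing like GPT-3 in quality):

```python
import random
from collections import defaultdict

def train(corpus: str) -> dict:
    """Build a word-bigram table from the user's own text -
    the toy equivalent of 'bring your own corpus'."""
    words = corpus.split()
    model = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        model[prev].append(nxt)
    return model

def generate(model: dict, seed_word: str, length: int, rng: random.Random) -> str:
    """Walk the bigram table to emit text in the corpus's style."""
    out = [seed_word]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return " ".join(out)

# Invented example corpus; every generated bigram comes from it.
corpus = "the knight rode to the castle and the knight drew his sword"
model = train(corpus)
print(generate(model, "the", 5, random.Random(1)))
```

This runs on any hardware, which is precisely the property the big models lack; the gap between this and GPT-3 is the whole complaint above.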

Talk To Transformer was very good for general text generation at the time it was usable, but even that became locked behind a payment plan and watered down for free users.

It seems there's just too much value and too much expense involved to leave this kind of technology solely in the hands of hobbyists.

ergot_vacation · 4 years ago
For the record, there are at least five community-based clones of the original AI Dungeon (largely because the guy making the original had little to no idea what he was doing and was just haphazardly stringing bits of pre-made Python together), nearly all of which can either be run locally or trivially spun up on some free Colab workspace. The catch is they're all GPT-2, as was AI Dungeon originally. The step up to 3 is dramatic, and unfortunately out of reach for the ordinary user for now.
gwern · 4 years ago
There's a new clone just out: "GPT-Neo Dungeon" https://colab.research.google.com/github/finetuneanon/gpt-ne... , which is a GPT-Neo-2.7b finetuned on Literotica for NSFW AI-Dungeon-like dialogue; EleutherAI's GPT-Neo performs noticeably better than GPT-2, it seems, because it was trained on cleaner data. Still not full GPT-3-175b, ofc.
gaws · 4 years ago
> The only reason that AI Dungeon was able to be created in the first place at its level of quality was because the developer was able to use his university's GPU cluster for training the model.

Privilege outwits us again.

edenhyacinth · 4 years ago
If I were an editor, and someone passed me their amateurish version of Lolita to edit, I'd be well within my rights to say that I didn't want to be involved in it.

More broadly, the editing company I worked for could say - even if you don't intend on releasing this and even if our individual editors don't mind reviewing it - we don't want to have to edit it, and we don't want to be associated with it.

This is no different, but at scale. AI Dungeon, due to their agreement with OpenAI, don't want to have to work with this content. They've found a pretty awful way of implementing it to save the relationship with OpenAI, and hopefully they'll find a better one in the future.

Zababa · 4 years ago
The big difference is that Lolita is a book, so it aims to be published, while most if not all AI Dungeon content stays private and unpublished, so I don't think it's the same.
edenhyacinth · 4 years ago
The intention was to show that there is another party involved in the content, even if you intend on the content being private and unpublished.

That party can say that they don't want to be involved with content, regardless of its type.