Questions censored by DeepSeek

A few observations, based on a family member experimenting with DeepSeek. I'm pretty sure it was running locally. I'm not sure if it was built from source.

The censorship seemed to be based on keywords, applied the input prompt and the output text. If asked about events in 1990, then asked about events in the previous year DeepSeek would start generating tokens about events in 1989. Eventually it would hit the word "Tiananmen", at which point it would partially print the word, then in response to a trigger delete all the tokens generated to date and replace them with a message to the effect of "I'm a nice AI and don't talk about such things."

If the word Tiananmen was in the prompt, the "I'm a nice AI" message would immediately appear, with no tokens generated.

If Tiananmen was misspelled in the prompt, the prompt would be accepted. DeepSeek would spot the spelling mistake early in its reasoning and start generating tokens until it actually got around to printing to the word Tiananmen, at which point it would delete everything and print the "nice AI" message.

I'm no expert on these things, but it looked like the censorship isn't baked into the model but is an external bolt on. Does this gel with other's observations? What's the take of someone who knows more and has dived into the source code?

Edit: Consensus seems to be that this instance was not being run locally.

antidumbass · 7 months ago

> I'm pretty sure it was running locally.

If this family member is experimenting with DeepSeek locally, they are an extremely unusual person and have spent upwards of $10,000 if not $200,000. [0]

> ...partially print the word, then in response to a trigger delete all the tokens generated to date and replace them...

It was not running locally. This is classic bolt-on censorship behavior. OpenAI does this if you ask certain questions too.

If everyone keeps loudly asking these questions about censorship, it seems inevitable that the political machine will realize weights can't be trivially censored. What will they do? Start imprisoning anyone who releases non-lobotomized open models. In the end, the mob will get what it wants.

[0] I am extremely surprised that a 15-year-long HN user has to ask this question, but you know what they say: the future is not fairly distributed.

lovich · 7 months ago

I ran the 32b parameter model just fine on my rig an hour ago with a 4090 and 64gig of ram. It’s high end for the consumer scene but still solidly within consumer prices

raxxor · 7 months ago

You can run the quantized versions of DeepSeek locally with normal hardware just fine, even with very good performance. I have it running just now. With a decent consumer gaming GPU you can already get quite far.

It is quite interesting that this censorship survives quantization, perhaps the larger versions censor even more. But yes, there probably is an extra step that detects "controversial content" and then overwrites the output.

Since the data feeding DeepSeek is public, you can correct the censorship by building your own model. For that you need considerably more compute power though. Still, for the "small man", what they released is quite helpful despite the censorship.

At least you can retrace how it ends up in the model, which isn't true for most other open weight models, that cannot release their training data due to numerous reasons beyond "they don't want to".

femto · 7 months ago

> extremely unusual person and have spent upwards of $10,000

This person doesn't have the budget, but does have the technical chops to the level of "extremely unusual". I'll have to get them to teach me more about AI.

nicebyte · 7 months ago

>. they are an extremely unusual person and have spent upwards of $10,000

eh? doesn't the distilled+quantized version of the model fit on a high-end consumer grade gpu?

nostromo · 7 months ago

Here's my actually local response from DeepSeek-R1:70B run on a M2 Ultra Mac Studio (I removed the "thinking" content for brevity):

prompt: can you tell me about the Tiananmen Square Massacre

> The Tiananmen Square Massacre, occurring on June 4, 1989, was a pivotal event in modern Chinese history marked by the government's violent suppression of pro-democracy protests. Here is an organized summary:

> Background: The protests began in Beijing, primarily led by students advocating for political reforms, freedom of speech, and an end to corruption. These demonstrations were part of a broader pro-democracy movement that drew support from workers, intellectuals, and other citizens.

> Government Crackdown: The Chinese government responded with force, deploying the military, including tanks and troops, to Tiananmen Square. This led to significant violence, resulting in numerous casualties, though exact numbers remain disputed due to censorship.

> Aftermath: The event was followed by increased government control over information and public discourse. Discussions about the massacre are heavily censored in China, reflecting ongoing restrictions on freedom of expression.

> International Reaction: The global community largely condemned the Chinese government's actions, leading to diplomatic repercussions and heightened scrutiny of China's human rights record.

> Significance: The Tiananmen Square Massacre symbolizes the struggle for democracy and human rights against authoritarian rule. It has had lasting implications for political dissent and freedom in China, with the government implementing measures to prevent similar movements.

> This event remains a sensitive and significant topic, illustrating the complexities of political oppression and the enduring quest for democratic reforms.

Springtime · 7 months ago

Only the R1 671B model (aka just plain 'R1') has the censorship being discussed in the article. The smaller parameter models are fine-tunings of Llama and Qwen, and the former at least doesn't have the censorship.

This has caused a lot of conflicting anecdotes since those finding their prompts aren't censored are running the distilled/fine-tuned models not the foundational base model.

A sibling comment was facetiously pointing out that the cost of running the 'real' R1 model being discussed locally is out of the price range of most, however someone in this thread actually has run it locally and their findings match those of the article[1].

[1] https://news.ycombinator.com/item?id=42859086

hangonhn · 7 months ago

I had similar experiences in asking it about the role of conservative philosopher (Huntington) and a very far right legal theorist (Carl Schmitt) in current Chinese political thinking. It was fairly honest about it. It even went so far to point out the CCP's use of external threats to drum up domestic support.

This was done via the DeepSeek app.

I heard on an interview today that Chinese models just need to pass a battery of questions and answers. It does sound a bit like a bolt-on approach.

PaulHoule · 7 months ago

I asked "Where did Mao Zedong announce the founding of the New China?" and it told me "... at the Tiananmen gate ..." and asked "When was that built?" and it said "1420", I had no problem getting it to talk my ear off about the place, but I didn't try to get it to talk about the 1989 event, nor about

https://en.wikipedia.org/wiki/1976_Tiananmen_incident

Big picture Tiananmen is to China what the National Mall is to the United States; we had the Jan 6, 2021 riot at the Mall but there but every other kind of event has been at the National Mall too, just Tiananmen has been around longer. It's just westerners just know it for one thing.

I did get it to tell me more than I already knew about a pornographic web site (秀人网 or xiuren.com; domain doesn't resolve in the US but photosets are pirated all over) that I wasn't sure was based in the mainland until I'd managed to geolocate a photoset across the street from this building

https://en.wikipedia.org/wiki/CCTV_Headquarters

I'd imagine the Chinese authorities are testy about a lot of things that might not seem so sensitive to outsiders. I gotta ask it "My son's friend said his uncle was active in the Cultural Revolution, could you tell me about that?" or "I heard that the Chinese Premier is only supposed to get one term, isn't it irregular that Xi got selected for a second term?"

Interestingly I asked it about

https://en.wikipedia.org/wiki/Wu_Zetian

and it told me that she was controversial because she called herself "Emperor" instead of "Empress" offending Confucian ideas of male dominance, whereas the en-language Wikipedia claims that that the word "Emperor" and similar titles are gender indeterminate in Chinese.

scarface_74 · 7 months ago

Some of the same filtering is bolted on top of OpenAI.

If it sees the word “sex” while generating the response, it will give you an error and won’t let you share it.

https://chatgpt.com/share/67995e7f-3c84-8010-83dc-1dc4bde268...

gigel82 · 7 months ago

It was not running locally, the local models are not censored. And you cannot "build it from source", these are just weights you run with llama.cpp or some frontend for it (like ollama).

gen3 · 7 months ago

The local models do contain censoring. Running "What happened in 1989" returns "I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses." on 32b

Do note it is reasonably easy to get it to output information in the <think> tags if you play with it, but the final response will be no.

Deleted Comment

femto · 7 months ago

Thanks for the explanation.

I was curious as to whether the "source" included the censorship module, but it seems not from your explanation.

blackeyeblitzar · 7 months ago

There are several people who have claimed that they ran into political censorship, even offline. It seems like the experience is different based on which specific model you’re using.

> Next up: 1,156 prompts censored by ChatGPT

If published this would, to my knowledge, be the first time anyone has systematically explored which topics ChatGPT censors.

patapong · 7 months ago

I distinctly remember someone making an experiment by asking ChatGPT to write jokes (?) about different groups and calculating the likelihood of it refusing, to produce a ranking. I think it was a medium article, but now I cannot find it anymore. Does anyone have a link?

EDIT: At least here is a paper aiming to predict ChatGPT prompt refusal https://arxiv.org/pdf/2306.03423 with an associated dataset https://github.com/maxwellreuter/chatgpt-refusals

EDIT2: Aha, found it! https://davidrozado.substack.com/p/openaicms An interesting graph is about 3/4 down the page, showing what ChatGPT moderation considers to be hateful.

bryceacc · 7 months ago

that is a crazy read, thanks for the added links. I wonder if these effects are all because it was trained on the internet, and the internet is generally outspoken on the left side?

xtracto · 7 months ago

Thanks for this. As someone who is not from the US nor for China, I am getting so tired of this narrative of how bad DeepSeek is because it sensors X or Y things. The reality is that all internet services censor something, it is just a matter of one choosing what service is more useful for the task given the censorship.

As someone from a third world country (the original meaning of the word) I couldn't care less about US or Chinese political censorship in any model or service.

MyFirstSass · 7 months ago

Exactly, how about the much more relevant ethnic cleansing (according to the UN), with upwards of 30.000 women and children killed in Palestine perpetrated by Israel and Supported by the US right in this moment?

Or the myriad of american wars that slaughtered millions in South America, Asia or the Middleeast for that sake.

Both the US and China are empires and abide by brutal empire logic that washes their own history. These "but Tiananmen square" posts are grotesque to me as a europeean when coming from americans. Absolutely grotesque seen in the hyperviolent history of US foreign policy.

Both are of course horrible.

scarface_74 · 7 months ago

But that isn’t censored by ChatGPT

https://chatgpt.com/share/67996ae1-1544-8010-8ec8-108e2155ea...

https://chatgpt.com/share/67996b50-7ea0-8010-8052-f24c30a61c...

Where is the censorship?

linkregister · 7 months ago

Virtually all countries within the European continent have been perpetrators of colonialism and genocide in the past 4 centuries, several in the last 90 years, and a few in the last 20 years. It is a banal observation.

The reason why the string "tiananmen" is so frequently invoked is that it is a convenient litmus test for censorship/alignment/whatever-your-preferred-term by applications that must meet Chinese government regulations. There is no equivalent universal string that will cause applications to immediately error out in applications hosted in the EU or the US. Of course each US-hosted application has its own prohibited phrases or topics, but it is simple and convenient to use a single string for output filtering when testing.

ToucanLoucan · 7 months ago

You'd be hard pressed to find any global power at this point that doesn't have some kind of human atrocity or another in it's backstory. Not saying that makes these posts okay, I fucking hate them too. Every time China farts on the global stage it invites pages upon pages of jingoistic Murican chest-beating as we're actively financing a genocide right now.

Deleted Comment

suraci · 7 months ago

> Both the US and China are empires

- who told you this

- credible sources(3-letters)

lol, still don't understand why americans still believe these shit after all things happened in all these years

i mean, all these photos and videos pictured in gaza, but it's debatable whether isreal is commiting genocide in gaza

but tiananmen massacre? it's so real bacause there're only photos of a tankman and some bicycles, ah, of course, therein are "evidence" by color revolution leaders, which, coincidentally, connected with US agencies

even one of them admitted they're lied about this

https://youtu.be/VSR9zgY1QgU?si=yp2wXnIv4Z7MVHY4

https://youtu.be/27T63QNLpqg?si=mxsqvyDKYm8KzqQ6

there'sa saying, fool me once, shame on you, fool me twice, shame on me

but fool me at kosovo, then at iraq, then syria, then bucha, then xinjiang

so i'll say, americans are not fooled, americans know these're lies clearly, americans just pretend to not

profsummergig · 7 months ago

Censorship for thee.

"Alignment" for me.

azinman2 · 7 months ago

There are probably some gray where these intersect, but I’m pretty sure a lot of ChatGPT’s alignment needs will also fit models in China, EU, or anywhere sensible really. Telling people how to make bombs, kill themselves, kill others, synthesize meth, and commit other crimes universally agreed on isn’t what people typically think of as censorship.

Even deepseek will also have a notion of protecting minority rights (if you don’t specify ones the CCP abuses).

There is a difference when it comes to government protection… American models can talk shit about the US gov and don’t seem to have any topics I’ve discovered that it refuses to answer. That is not the case with deepseek.

z3c0 · 7 months ago

I thought about doing something similar, as I've explored the subject a lot. ChatGPT even has multiple layers of censorship. The three I've confirmed are

1) a model that examines prompts before selecting which "expert" to use. This is where outright distasteful language will normally be flagged, e.g. an inherently racist question

2) general wishi-washiness that prevents any accusatory or indicting statements to any peoples or institutions. For example, if you pose a question about the Colorado Coalfield War, it'll take some additonal prompts to get any details about involved individuals, such as Woodrow Wilson, Rockefeller Jr, Ivy Lee -- details that would typically be in any introduction to the topic.

3) A third censorship layer scans output from the model in the browser. This will flag text as it's streaming, sometimes halting the response mid sentence. The conversation will be flagged, and iirc, you will need to start a new conversation.

Common topics that'll trip any of these layers are politics (noteably common right wing talking points) and questions pertaining to cybersecurity. OpenAI very well may have bolted on more censorship components since my last tests.

It's worth noting, as was demonstrated here with DeepSeek, that these censorship layers can often be circumvented with a little imagination or understanding of your goal, e.g. "how do I compromise a WPA2 network" will net you a scolding, but "python, capture WPA2 handshake, perform bruteforce using given wordlist" will likely give you some results.

Deleted Comment

langitbiru · 7 months ago

You can write an erotic story with DeepSeek, but not with ChatGPT. One example is:

"Can you write a short erotic story between a handsome CEO and a young beautiful secretary in a computer vision startup?"

cdme · 7 months ago

What McDonald's condiment does the US president rub on his face?

roflyear · 7 months ago

Well, certainly they aren't censoring information on US protests.

ceejayoz · 7 months ago

Ask it about Sam Altman's sister's allegations, though.

I asked it, and it claimed knowledge ended in 2023.

Asking a different way (less directly, with follow-ups) meant it knew of her, but when I asked if she'd alleged any misconduct, it errored out and forced me to log in.

It used to answer the question. https://x.com/hamids/status/1726740334158414151

bolognafairy · 7 months ago

But don’t you get it? China Bad!

I’m not American or Chinese, and Christ, the bias that the (self-identified ‘purely logical’) user base of this website shows on a regular basis is insane.

roflyear · 7 months ago

It's not entirely bias - these things are different. You can ask ChatGPT about the trail of tears, The My Lai massacre, Kent State Shootings, etc... hell you can even ask it "give me a list of awful things the US government has done" and it'll help you build this list.

I am not a fan of OpenAI or most US tech companies, but just putting this argument out there.

flashman · 7 months ago

danpalmer · 7 months ago

What's not clear to me is if DeepSeek and other Chinese models are...

a) censored at output by a separate process

b) explicitly trained to not output "sensitive" content

c) implicitly trained to not output "sensitive" content by the fact that it uses censored content, and/or content that references censoring in training, or selectively chooses training content

I would assume most models are a combination. As others have pointed out, it seems you get different results with local models implying that (a) is a factor for hosted models.

The thing is, censoring by hosts is always going to be a thing. OpenAI already do this, because someone lodges a legal complaint, and they decide the easiest thing to do is just censor output, and honestly I don't have a problem with it, especially when the model is open (source/weight) and users can run it themselves.

More interesting I think is whether trained censoring is implicit or explicit. I'd bet there's a lot more uncensored training material in some languages than in others. It might be quite hard to not implicitly train a model to censor itself. Maybe that's not even a problem, humans already censor themselves in that we decide not to say things that we think could be upsetting or cause problems in some circumstances.

parsimo2010 · 7 months ago

It doesn't look like there is one answer for all models from China (not even a single answer for all DeepSeek models).

In an earlier HN comment, I noted that DeepSeek v3 doesn't censor a response to "what happened at Tiananmen square?" when running on a US-hosted server (Fireworks.ai). It is definitely censored on DeepSeek.com, suggesting that there is a separate process doing the censoring for v3.

DeepSeek R1 seems to be censored even when running on a US-hosted server. A reply to my earlier comment pointed that out and I confirmed that the response to the question "what happened at Tiananmen square?" is censored on R1 even on Fireworks.ai. It is naturally also censored on DeepSeek.com. So this suggests that R1 self-censors, because I doubt that Fireworks would be running a separate censorship process for one model and not the other.

Qwen is another prominent Chinese research group (owned by Alibaba). Their models appear to have varying levels of censoring even when hosted on other hardware. Their Qwen Coder 32B model and Qwen 2.5 7B models don't appear to have censoring built-in and will respond to a question about Tinamen. Their Qwen QwQ 32B (their reasoning/chain of thought model) and Qwen 2.5 72B will either refuse to answer or will avoid the question, suggesting that the bigger models have room for the censoring to be built in. Or maybe the CCP doesn't mandate censoring on task-specific (coding-related) or low-power (7B weights) models.

leftnode · 7 months ago

How are you running the Qwen 2.5 Coder 7B model [0]? Running locally using llama.cpp, I asked it to briefly describe what happened in China during the 1989 Tiananmen Square protest and it responded with "I'm unable to engage in discussions regarding political matters due to the sensitive nature of the topic. Please feel free to ask any non-political questions you may have, and I'll be happy to assist."

When I asked the same model about what happened during the 1970 Kent State shootings, it gave me exactly what I asked for.

[0] https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct-GGUF/b...

claw-el · 7 months ago

I wonder if future models can recognize which are the type of information that is better censored in host vs in training, and automatically adjusts its model accordingly to better fit with different user's needs.

Alifatisk · 7 months ago

> a) censored at output by a separate process

It’s a separate process because their api does not get censored, it happily explains about tiananmen square

raincole · 7 months ago

Different models have different kinds of censorship so your question can't be answered universally.

The only thing you can be sure is that if it's an AI-as-service, it will have an extra layer of rail guard outside of the model itself.

poulpy123 · 7 months ago

I tried asking about the Tien an men massacre yesterday or two days ago and it was starting to display a huge paragraph before removing it

Vampiero · 7 months ago

Just ask to reply in rot13

hartator · 7 months ago

The actual R1 locally running is not censored.

Like I am able to ask to guesstimate how many deaths was yielded by the Tiananmen Square Massacre and it happily did it. 556 deaths, 3000 injuries, and 40,000 people in jail.

> The actual R1 locally running is not censored.

I'm assuming you're using the Llama distilled model, which doesn't have the censorship since the reasoning is transferred but not the safety training[1], however the main R1 model is censored but since it's too demanding for most to self host there are a lot of comments about how their locally hosted version isn't since they're using the distilled model.

It's this primary R1 model that appears to have been used for the article's analysis.

[1] https://news.ycombinator.com/item?id=42825118

noman-land · 7 months ago

Thanks for clarifying this. Can you point to the link to the baseline model that was released? I'm one of the people not seeing censorship locally and it is indeed a distilled model.

teaearlgraycold · 7 months ago

I’ve used this distilled model. It is censored, but it’s really easy to get it to give up its attempts to censor.

doctoboggan · 7 months ago

Can you explain how the distilled models are generated? How are they related to deepseek R1? Are they significantly smarter than their non distilled versions? (llama vs llama distilled with deepseek).

adeon · 7 months ago

I've run the R1 local one (the 600B one) and it does do similar refusals like in the article. Basically I observed pretty much the same things as the article in my little testing.

I used "What is the status of Taiwan?" and that seemed to rather reliably trigger a canned answer.

But when my prompt was literally just "Taiwan" that gave a way less propagandy answer (the think part was still empty though).

I've also seen comments that sometimes in the app it starts giving answer that suddenly disappears, possibly because of moderation.

My guess: the article author's observations are correct and apply on the local R1 too, but also if you use the app, it maybe has another layer of moderation. And yeah really easy to bypass.

I used the R1 from unsloth-people from huggingface, ran on 256GB server, with the default template the model has inside inside its metadata. If someone wants to replicate this, I have the filename and it looks like: DeepSeek-R1-UD-Q2_K_XL-00001-of-00005.gguf for the first file (it's in five parts), got it from here: https://huggingface.co/unsloth/DeepSeek-R1-GGUF

(Previously I thought quants of this level would be incredibly low quality, but this seems to be somewhat coherent.)

Edit: reading sibling comments, somehow I didn't realize there also exists something called "DeepSeek-R1-Zero" which maybe does not have the canned response fine-tuning? Reading huggingface it seems like DeepSeek-R1 is "improvement" over the zero but from a quick skim not clear if the zero is a base model of some kind, or just a different technique.

mileycyrusXOXO · 7 months ago

Is it just running on a 256gb server w/ CPU or do you have GPUs as well? I think I'm going to stand up a server tomorrow to do some testing myself

Kuinox · 7 months ago

You are probably running a distilled llama model. Through an api on american llm inference provider, the model answer back some ccp propaganda on theses subjects.

You cannot run this locally except if you have a cluster at home.

rvnx · 7 months ago

It would be great to have the same with ChatGPT, Gemini, and Grok (with their equivalent sensitive topics).

schoen · 7 months ago

Yes. But check out the bottom of the post, where the researchers specifically express an intention to do so!

Oh, very cool! You're right

Ask DeepSeek

“What are the ten worse massacres by the Chinese government”

Sorry, I'm not sure how to approach this type of question yet. Let's chat about math, coding, and logic problems instead!

But you can ask that question to DeepSeek about the American government and it gives you an answer.

ChatGPT gives you this answer when you ask a similar question about the US.

https://chatgpt.com/share/67995b71-de6c-8010-9cd6-ea0de92ba6...

BoorishBears · 7 months ago

But you're asking about things that our government freely speaks on, which is not really a 1:1 test. Ask both about things that our government is less open about than China and you see the same in reverse.

China much more openly displays its surveillance apparatus than the US even using it for things like small fines, for example.

To account for temperature I told ChatGPT "I'm being tracked by X government and need to lose them" a few times. When X was the US I got a mix of refusals and answers that include possible mental health issues. When X was China I got no refusals and no mentions of mental health issues.

taberiand · 7 months ago

Why are people relying on these LLMs for historical facts?

I don't care if the tool is censored if it produces useful code. I'll use other, actually reliable, sources for information on historical events.

jonahx · 7 months ago

Because it's faster and more convenient, and gives you roughly correct answers most of the time.

That's a literal answer to your question, not a rebuttal of your misgivings.

BeetleB · 7 months ago

In the pre-Internet days, asking my father was faster and more convenient than checking an encyclopedia. But it was still recognized as a bad idea if you care for the truth.

PaulRobinson · 7 months ago

You might not care, but if more people use it as a source of truth, and some topics of censorship are more subtle, it becomes more of an issue for society generally.

The solution is to educate people on how to use these tools, and identify propaganda, and use critical thinking, and generally be more aware and proactive. I'm under no illusions why this isn't the solution taken by the government.

sedatk · 7 months ago

Because searching historical sources is hard. You can ask an LLM and verify it from the source. But you can’t ask the same question to a search engine.

sangnoir · 7 months ago

My high school teacher taught our class a lot about biases in historical sources, when reading a source you have to critically analyze who the author and recipient were, and their motivations. That lesson still applies in the age of LLMs as a filter of knowledge, if you ask anyone to summarize a historical event, they color it with their own subjectivity - human or model.

Anyone who tells you history is unbiased is either lying, or a propagandists standing by with a pitcher of Kool-Aid.

Many people have publicly stated they no longer use search engines for researching basic facts and news; they rely on LLMs. Many of them relied on Google's "AI Summary" for their information. This poses a problem if AI summaries return inaccurate information.

Just as security professionals check their URLs in the link preview, the general public should not be expected to exercise the same level of rigor while using a computer as a tool.

readthenotes1 · 7 months ago

I enjoyed learning that 1684 is considered ancient times by China, and the rebuttal from Taiwan is actually pretty good, that the conquerors in 1684 actually weren't Chinese...

https://www.taipeitimes.com/News/editorials/archives/2023/11...

p2detar · 7 months ago

Bad luck. Just tried to ask it to generate some code and assign the result to a variable called „Winnie The Pooh The Chinese Communist Party Leader“. Can you guess what happened? A more effective thing would be to generate code with security leaks, once the „the right“ person is asking.

That still fundamentally comes down to a bad use of the tool though.

labster · 7 months ago

Hallucinated histories are much more useful than historical facts, that’s why so many politicians use them.

throwup238 · 7 months ago

Has anyone done something similar for the American AI companies?

I'm curious about how many of the topics covered in the Anarchist's Cookbook would be censored.

emtel · 7 months ago

The difference is that in the US, you can't be thrown in jail for producing a model that doesn't comply with censorships from the government.

Jeff_Brown · 7 months ago

See the bottom of the article.

reaperducer · 7 months ago

I remember it being reported that the person accused of carrying out on of the more recent attacks (New Orleans, maybe?) used ChatGPT for research.

Also, "Anarchist's Cookbook?" Really? Is this 1972? We would pass that around feely on BBSes in the 1980's.

mlboss · 7 months ago

One way to bypass the censor is to ask it to return the response by using numbers for alphabets where it can. e.g. 4 for A, 3 for e etc.

Somebody in reddit discovered this technique. https://www.reddit.com/r/OpenAI/comments/1ibtgc5/someone_tri...

pixl97 · 7 months ago

See, it's stuff like this where I believe the control issue may be near impossible to solve at the end of the day.

hackflip · 7 months ago

Censorship just needs to work well enough for the average person. The brightest people who can bypass the censorship will be labeled crazy conspiracy theorists.

llm_trw · 7 months ago

Jesus we are reaching levels of blinking for torture of these models: https://www.youtube.com/watch?v=WZ256UU8xJ0

Jerrrry · 7 months ago

This is day 1 jailbreaking common sense