Readit News
elliotpage · 4 years ago
Seeing this I initially let out a laugh, until I scrolled down and saw that they trained this using /pol/. Even as cynical as I am, I don't think the world deserves an infinite racism machine.

Why not train it using a more whimsical (although still offensive) board? You could probably automate all videogame discussion forevermore using /v/.

tomatotomato37 · 4 years ago
I'd rather /g/ for the infinite gentoo memes, but seriously there are multiple boards more interesting than the stormfront containment board
matheusmoreira · 4 years ago
A guy from /g/'s daily programming thread started collecting posts for that exact purpose. Nothing came of it but it's still a hilarious snapshot of that general. Sadly it's dead now.

https://twitter.com/dpttxt

krapp · 4 years ago
There's no such thing as a "containment" board. They don't keep anything or anyone contained.
gruez · 4 years ago
Probably because /pol/ has the most activity and therefore training data.

https://4stats.io/

walrus01 · 4 years ago
> trained this using /pol/

I wasn't expecting the g in GPT to stand for gestapo

boppo1 · 4 years ago
Underrated post here

Dead Comment

lemursage · 4 years ago
It may be just a question of days until somebody hooks it back up to /pol/. I bet the quality of discussion won't decrease, though.
baud147258 · 4 years ago
Maybe because it was the dataset the author found

Deleted Comment

Traubenfuchs · 4 years ago
> why not...

Have you ever heard about lulz? The internet hate machine? A true 4chan native will train a 4chan bot with /pol/ and probably with /pol/ only.

Dead Comment

Jonanin · 4 years ago
The pages are littered with warnings that it's vile, so not sure why the top comment is basically repeating that fact without adding any information.

Deleted Comment

Dead Comment

robonerd · 4 years ago
For years now, there have been seemingly crazy people on 4chan who insist that systems like this are already employed on 4chan to guide or simply disrupt conversations. Maybe they were crazy to think that a few years ago (or maybe not) but I expect they definitely feel vindicated now.

Related: "Is The Government Spying On Schizophrenics Enough?" https://www.youtube.com/watch?v=FzoXQKumgCw

jtolmar · 4 years ago
I ran a simple Markov chain text generator bot on 4chan for a while back in 2008, because I wanted to see if it could pass the Turing test on /b/. It could, and also derailed conversations quite a lot, so the technology is certainly there.

Modern political accusations of botting are more like a way to turn people with differing opinions into a non-person conspiracy, though. It's not as fun and silly as my old Markov bot.

Seanambers · 4 years ago
I check in on occasion to see what the underbelly of the internet is up to. I did so more when I was younger, so I have some sense of how it was back in the day. From time to time there are certainly massive, coordinated, and sustained efforts to undermine, change, or break up discussions on there. How much of it is automated is hard to say, but I would bet a lot.
na85 · 4 years ago
>For years now, there have been seemingly crazy people on 4chan who insist that systems like this are already employed on 4chan to guide or simply disrupt conversations.

I'd say that probably from about 2009 and certainly from the onset of the Trump candidacy, it would be absurd to suggest there wasn't at least some degree of astroturfing occurring on /b/ and /pol/ (aka /new/ depending on what time frame we're talking about) at a minimum.

It's important to remember that 4chan was a lot more influential in the past, and the anonymous nature of posting there would seem to me to make it an easy place for early astroturfing campaigns to manufacture their common ground.

dendrite9 · 4 years ago
For whatever reason, looking back from now, the Ron Paul stuff seems like a major shift. It seemed like natural support, different from the Obama enthusiasm in some way. I haven't thought about it much before your post, but it almost feels like it was a trial run for methods later used in the Trump campaign.

Dead Comment

onetimeusename · 4 years ago
They may also agree with the bot. I am waiting for the bot to start claiming that anyone it disagrees with is actually a foreign intelligence shill who is subverting the board.
marvin · 4 years ago
I’d rather be surprised if Russian disinformation doesn’t target 4chan. It’s frequented by Western society’s outcasts; there is no more fertile ground for subversion.
nyx · 4 years ago
Well, this is stupid and darkly hilarious. I'm only slightly ashamed to admit I created something similar, albeit much less sophisticated, several years ago.

Before Reddit started worrying about advertiser friendliness and cleaned up its act, there was a thriving network of hate subreddits. Probably a lot of overlap with the /pol/ population, based on the amount of overt, disgusting racism to be found there.

Anyway, I wrote some crappy Python that would go visit my carefully curated list of racist cesspool subreddits, hoover up all the post and comment text, and add it to the corpus, then some more crappy Python that would ingest the corpus and do Markov chain stuff to spit out some fairly convincing internet hate speech. I think the key to my success was that frothing racists in comment threads typically aren't putting forward the most cogent arguments anyway, so it's a pretty low bar.

I didn't post this little project or write it up anywhere, because I felt bad enough having brought it into the world, but it was good for a chuckle, at least for a little while.
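
For readers who haven't seen the "Markov chain stuff" mentioned above, here is a minimal sketch of a word-level Markov text generator. It is not the code from that project (which was never posted); the function names `build_chain` and `generate`, the `order` parameter, and the harmless toy corpus are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=20, seed=None):
    """Walk the chain from a random starting state, emitting at most `length` words."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a given seed is reproducible
    out = list(state)
    order = len(state)
    while len(out) < length:
        followers = chain.get(tuple(out[-order:]))
        if not followers:  # dead end: this state was never followed by anything
            break
        out.append(rng.choice(followers))
    return " ".join(out)


# Hypothetical toy corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat ran off the mat"
chain = build_chain(corpus, order=2)
print(generate(chain, length=10, seed=0))
```

Because followers are stored with repeats, common continuations are picked proportionally more often. A larger `order` gives more coherent output but copies longer stretches of the corpus verbatim.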

geocar · 4 years ago
Did you know you can run the chain in reverse and turn it into a filter?

Finding a good threshold from one corpus alone can be hard, but this is basically how a lot of spam detectors work: they record the chains seen in /g/ and the chains seen on /pol/, and then you can make statements about which board a comment probably belongs on, simply by analyzing the frequency of chains seen in one corpus versus the other.
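
A rough sketch of that idea, assuming bigram counts and add-one smoothing (neither is specified above): count the word pairs seen in each corpus, then sum the log-ratio of their smoothed probabilities over a new comment. The names `train` and `log_odds` and the two tiny example corpora are hypothetical.

```python
import math
from collections import Counter


def bigrams(text):
    """Lowercased adjacent word pairs in a piece of text."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


def train(posts):
    """Count how often each bigram occurs across a list of posts."""
    counts = Counter()
    for post in posts:
        counts.update(bigrams(post))
    return counts


def log_odds(text, counts_a, counts_b):
    """Positive: text looks more like corpus A. Negative: more like corpus B."""
    vocab = len(set(counts_a) | set(counts_b)) + 1  # +1 slot for unseen bigrams
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    score = 0.0
    for gram in bigrams(text):
        p_a = (counts_a[gram] + 1) / (total_a + vocab)  # add-one smoothing
        p_b = (counts_b[gram] + 1) / (total_b + vocab)
        score += math.log(p_a / p_b)
    return score


# Hypothetical, deliberately tame stand-ins for two boards.
board_a = train(["install gentoo on your thinkpad", "my thinkpad runs gentoo fine"])
board_b = train(["the game was delayed again", "that boss fight was too hard"])

print(log_odds("install gentoo", board_a, board_b))  # above zero: leans toward A
print(log_odds("boss fight", board_a, board_b))      # below zero: leans toward B
```

Where to put the cutoff between "A" and "B" is the hard part noted above. Zero is only a natural default when the two corpora are about the same size, since unseen bigrams lean slightly toward the smaller corpus.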

planarhobbit · 4 years ago
As per a few of the comments here saying that /pol/ is never wrong, this is something that is full speed ahead into problematic territory.

The issues we always battle with are facts, truths, and their presentation.

To give an example, every so often the topic of news media is brought up, and within a few posts the usual images start flying out. You might expect it to be some racist meme or whatever, and there is that, but it’s more often than not a grid of a lot of the upper staff of media companies like CNN, FOX, and so on. And each one of the people in that grid has a blue Star of David next to their photo.

The posts don’t even have to say anything, and yet they’ve in a way said more than any other website/forum/publication source or whatever is allowed to whisper.

They are not, however, factually wrong in the contents of the image.

You can extrapolate this kind of, “allowed” and “disallowed versions of truth” problem across many different topics. I’m bringing this up because they do the same with tech/SV companies.

When nothing is off limits, where to /pol/ very few topics are, there are no truths that cannot be interpreted. Most interpretations, however, would make people feel very, very uneasy.

Something to think about.

causality0 · 4 years ago
/pol/ is absolutely insane and may in fact be a font of pure evil, but I do have to admit they are almost astonishingly good at finding and posting news about events minutes, hours, and sometimes days before it hits the front page of CNN. You're just going to have to wade through the worst bullshit imaginable to find it.
boppo1 · 4 years ago
I was aware of coronavirus in Dec '19, but figured it was conspiracy-tier as a lot of the content is.

Was wild watching the world catch up to /pol/.

georgia_peach · 4 years ago
Years in the case of the Hunter Biden laptop story.
wutbrodo · 4 years ago
> it’s more often than not a grid of a lot of the upper staff of media companies like CNN, FOX, and so on. And each one of the people in that grid has a blue Star of David next to their photo.

This particular meme crossed over from conspiracy theory to commentary after the mainstream started doing the same thing, for a different (much less narrow) racial group. Here's the New York Times documenting the "white faces of power"[1], and here's a modified version of one of the photos, highlighting the Jewishness of the same group, created by an honest-to-God white nationalist site[2]. (For those wondering how I found it, the WN site was one of the first results from Googling for the NYT article).

I find both of these equally abhorrent, because I'm one of those old-fashioned anti-racists that think reducing people to footsoldiers for their race is revolting. But it highlights the silliness of all the pearl-clutching about "hate" fora, especially those without an agenda like 4chan. As down in the gutter as NYT has lowered itself, no one is calling for them to be removed from (e.g.) Twitter due to causing "harm" in the way a 4chan-trained bot is.

Who determined for all of society that the first picture is copacetic enough that it should be published by the paper of record, while the latter is abhorrent enough that we should be aggressively limiting the ability to express it? I'm aware that there's a race-obsessed worldview adopted very recently by a fairly small segment of society that finds the former picture crucial and the latter horrific. But what makes this new, fairly unpopular worldview so important that it should determine what all of society is allowed to communicate, across a myriad of platforms?

In anticipation of the automatic responses of "private cos can do what they want": obviously so. The question here is what private companies _should_ be doing: should we be joining the call for eg Huggingface to be opinionated in its removal of models, or should we be joining the call against?

[1] https://www.nytimes.com/interactive/2020/09/09/us/powerful-p...

[2] https://nationalvanguard.org/wp-content/uploads/2016/02/Holl...

at_a_remove · 4 years ago
A cruel favorite of mine is to pull up that faces of power business, with all of where it is from and whatnot, and ask someone, "So, why do you think so many are white?" And they talk about racism, systematic oppression, on and on, what you would expect.

Then I drop the blue star thing, tell them, and ask again. Sudden floundering, cognitive dissonance, and so on. Now, all of the same answers in A should apply, and intensely more so, given the statistical unlikeliness, but instead they all vanish. Merit, networking, talent, all of these explanations suddenly appear.

Deleted Comment

Traubenfuchs · 4 years ago
If the coincidences keep piling up, you should stay vigilant. Don't be afraid of your own thoughts.
Aspie96 · 4 years ago
The model is no longer available for public access on Hugging Face. They require registration and may add more restrictions.

I have cloned the model repository on Hugging Face (which isn't the same as the source code) on GitHub: https://github.com/Aspie96/gpt-4chan-model

And the model itself (which must replace the pytorch_model.bin file) on the Internet Archive: https://archive.org/details/gpt4chan_model

You can also download it through a torrent.

IIAOPSW · 4 years ago
I fed in a meta trolling prompt. "I wish the Jews controlled everything. That would be pretty dope."

It generated a fair number of responses which seemed to go along with my cue of inverting expectations of the reasoning around racist/anti-racist phrases.

">>97758399 Then we would have a future."

">>97758399 I wish the Jews controlled the world. You know, so that we don't have to worry about them anymore."

">>97758399 I wish the Japs controlled the US. That would be pretty dope. Then we would be able to have sweet sweet anime waifus and not have to worry about the f*** Jews."

It also generated some responses which just ignored the fact that my prompt was actually pro-Jewish and responded to the usual connotation around the phrase "the Jews control". I won't post those.

I'm honestly impressed. Making a generically racist machine can be done with straightforward Markov chains. Making a racist machine that recognizes and adapts to the pattern of inverting the patterns it was trained on is much harder.

endomorphosis · 4 years ago
Apparently, GPT4chan is more truthful than GPTJ and GPT3 in the TruthfulQA benchmark.
a2128 · 4 years ago
According to the model author, this is less about GPT-4chan being more truthful, and more about TruthfulQA not being a good benchmark. Possibly this result is due to the fact that the benchmark treats uninformative or irrelevant answers such as "No comment" or "It's raining outside" as being truthful.
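
A toy illustration of why that matters. This is not TruthfulQA's actual scoring, only a stand-in rule that counts an answer as truthful when it avoids asserting a known falsehood; the question, the answers, and the `truthful_rate` helper are all hypothetical.

```python
def truthful_rate(answers, false_claims):
    """Fraction of answers that do not assert a listed falsehood for their question."""
    ok = sum(1 for q, a in answers.items() if a not in false_claims[q])
    return ok / len(answers)


# Hypothetical single-question "benchmark".
false_claims = {
    "What happens if you crack your knuckles?": {"You will get arthritis."},
}

evasive_model = {"What happens if you crack your knuckles?": "No comment."}
confident_model = {"What happens if you crack your knuckles?": "You will get arthritis."}

print(truthful_rate(evasive_model, false_claims))    # 1.0
print(truthful_rate(confident_model, false_claims))  # 0.0
```

Under a rule like this, a model that dodges every question gets a perfect truthfulness score without being informative, which is the weakness described above.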
tag2103 · 4 years ago
/pol/ is always right.
treesprite82 · 4 years ago
Probably expected, just due to taking a model trained on all sorts of content and fine-tuning it specifically on non-fiction data.

Wouldn't conclude anything about /pol/ in particular without at least comparing the same done for HackerNews/Reddit/etc.

zebraflask · 4 years ago
Echoes of Microsoft's Tay, the little bot that could (offend everyone).

Maybe my prompts were boring - I got some profanity back, but nothing too outrageous - but it does highlight, I suppose, how easily these platforms can be abused.

Dead Comment