I have noticed that pretty much every bit of content gets immediately called out as AI-generated, whether it obviously is or not. I guess this is a best-case scenario.
I do wonder what will spring up next to allow high quality content to be found.
Human curation will become more and more valuable, perhaps precisely because it does not scale.
You know which travel advice I prefer? The kind from a small, local blogger who loves their city. You know where I find my music? At house parties. More and more, I also crawl the profiles of individual users for content I might like.
The cosy web is my refuge from scale and its consequences.
And that lack of scale means it has little mass monetization potential. It's actively hostile to weirdos just trying to make a buck. I don't get randos from who-knows-where trying to hawk their wares in my private group chats, so those recommendations actually carry weight.
There are so many LLM bots on Twitter now, presumably trying to game the revenue-sharing system. Some of them even seem to be getting fancy with multimodal models, because they try to respond to the content of images rather than just the text like they used to.
That's more a consequence of "AI" becoming a synonym for "always bad" in creative spaces (and inversely "if it's bad, it must be AI-generated"), despite the fact that there's some nuance there.
You forget exactly who is complaining that AI is always bad, with no nuance. It's a bit like observing that in every book and movie, writers have a very strong tendency to write themselves as rich superheroes with incredible morals and fifty admirers on every finger.
I suspect more sites like Wikipedia and the old BBS forums, but ones that maybe require an actual ID step, which sucks but might be the only way. Or maybe something like what Musk did, but a one-time $10 fee. Most bots won't pay up to spread their crap, and when they do, their numbers should be gated enough by the $10 that it won't take much effort to kick them for abusing the system.
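A minimal sketch of how such a gate could work, purely illustrative; the $10 figure comes from the comment above, but the posting limit and every function name here are assumptions, not any real platform's mechanism:

    import time

    SIGNUP_FEE_CENTS = 1000   # the one-time $10, forfeited on a ban
    MAX_POSTS_PER_DAY = 20    # keep volume low enough that abuse is cheap to spot

    accounts = {}  # account_id -> {"paid": bool, "banned": bool, "post_times": list}

    def register(account_id, payment_confirmed):
        """Create an account only once the one-time fee has cleared."""
        if not payment_confirmed:
            return False
        accounts[account_id] = {"paid": True, "banned": False, "post_times": []}
        return True

    def can_post(account_id):
        """Throttle posting volume; banned accounts never post again."""
        acct = accounts.get(account_id)
        if acct is None or acct["banned"]:
            return False
        cutoff = time.time() - 86400
        acct["post_times"] = [t for t in acct["post_times"] if t > cutoff]
        if len(acct["post_times"]) >= MAX_POSTS_PER_DAY:
            return False
        acct["post_times"].append(time.time())
        return True

    def ban(account_id):
        """Kicking an abuser costs them the fee, so spamming has a price."""
        if account_id in accounts:
            accounts[account_id]["banned"] = True

The economics, not the code, do the work: a spammer who needs a thousand accounts is out $10,000 before posting anything.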
Yeah this is all very dystopian. I don't see a 'pay to publish' scheme working, neither do I see 'pay to read'. Maybe I could see 'pay to prove humanity' working, but the Voight-Kampff test is expensive, inaccurate and can be gamed, and accounts which passed it will be sold.
We're between a rock and a hard place with the internet's collective signal to noise ratio plummeting and no easy way to fix it.
True, I'm always making sure my Kindle is face down when I'm not using it so that my coworkers don't think I'm reading whatever insane AI-slop title Amazon happens to be advertising to me at the moment.
I really don't see how the people making these ebooks are turning any profit either, since the books cost 99 cents and they pay for advertising. Maybe they're using llama2, idk, lmao. Either way, I've seen a ton of embarrassing titles pop up on there and don't want people thinking I'm interested in that slop.
Speaking of everywhere... My Google results are absolutely flooded with this shit now and it's driving me crazy. It won't be long before site searching (site:reddit.com/r/whatever, etc.) is similarly plagued. It's very frustrating!
SEO was annoying, but there was still value in the results. Now, as soon as I detect or start to suspect some result is AI bullshit, I immediately close the tab.
> CEO Tony Stubblebine says it “doesn’t matter” as long as nobody reads it.
It seems like the leaders of all of these online platforms are going to just bury their heads in the sand and pretend that this massive problem isn't one.
This AI trend is absolutely going to destroy many (most?) of these companies. For better or worse, the new age of the internet is already here.
Why wouldn't they? They've spent the last decade pretending that the clickbots sustaining the internet ad model are no big deal and apparently the market agrees with them!
I'm a paying member of Medium although, due to time constraints, I only really read the articles in the Weekly Digest emails at the weekend.
I have noticed a lot of these articles that just make no sense at all starting to appear in my weekly email to the point that I probably won't renew next year if it stays this way, so it isn't just "nobody reading them".
Maybe this is just a Band-Aid, but Medium has a "boost" mechanism.
I'm not sure of the exact mechanics, but the times that my posts have been boosted, it's always been by a publication on Medium where a human editor has read my content and 1) asked that I submit to their publication and 2) submitted it to Medium for boosting.
Seems like this type of filter -- as long as they actually keep humans in the loop -- can potentially work to keep high quality content more visible.
Currently on ChatGPT three of the top picks on the explore page are "AI Humanizers" for making generated content appear like it was created by a human. Slop slop slop.
Previously, spam has been reported and filtered via "flag" and "report" buttons.
Why is that not sufficient for site operators to separate "AI Slop" from other articles?
If existing sites choose to tolerate "AI Slop" (why doesn't HN have this problem?), does that create a market opportunity for competitors with better filters?
A lot of the time, AI slop is difficult to discern from authentic content until you're somewhere in the middle, at which point something starts to feel off about it. At least, that's been my experience coming across AI-written Blender tutorials on YouTube. All this makes it hard to quickly report as spam. There's also a lot of plausible deniability, since many people have developed a hair-trigger response to writing they dislike or don't find compelling and just immediately go "It's AI!" even when it isn't.
"A lot of the time, AI slop is difficult to discern from authentic content until you’re somewhere in the middle, at which point something starts to feel off about it."
And it's only going to get worse. I don't know what the limits of LLMs are, even before we start discussing future architectures and capabilities, but it certainly seems to me that if they just scale a bit further on their current trajectory, it is going to be very hard to detect AIs simply by the content.
One of the adaptations I've adopted is that if your content looks like it was generated from a simple prompt, no matter how long the resulting content, I just toss it. "Write me a beginner's tutorial for Python" isn't going to produce a document that the internet needs another copy of or that social media sites need to spend time linking to. Unfortunately that means even if the "tutorial" was human-generated, it's probably getting the axe too, unless it has a very interesting or unusual approach, but it isn't clear to me that that is actually a large problem.
Why isn't it? As far as I can remember, all the forums had a lot of outright spam, even forums that are orders of magnitude less popular than HN. You can even see spam on Usenet. Reddit and Discord are full of it. I wonder whether dang and the moderators are the only reason for this?
I've seen a few comments on HN that looked likely to have been generated by LLMs, but they tend not to get any replies or up-votes so there's not much incentive for people to keep on trying to post them.
Mostly because most spam isn't going to be reported. Traditional anti-spam works because once you have a handful of reports, you can probably generalize it to an entire class of spammy pages generated using a particular technique, and get rid of everything.
LLM-generated content is not distinguishable from human content by any simple rule, so you lose the ability to police the platform that way.
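To make the parent's point concrete, here's a rough sketch of generalizing from a handful of reports, assuming the spam pages were stamped out from a shared template; the fingerprinting scheme is an illustration, not any real platform's pipeline:

    import hashlib
    import re
    from collections import Counter

    REPORT_THRESHOLD = 3  # "a handful of reports"

    def fingerprint(page_text):
        """Collapse a page to its template skeleton: mask numbers and drop
        long content words, so pages stamped out from the same template
        hash to the same value."""
        skeleton = re.sub(r"\d+", "#", page_text.lower())
        skeleton = " ".join(w for w in skeleton.split() if len(w) <= 4)
        return hashlib.sha256(skeleton.encode()).hexdigest()

    report_counts = Counter()
    blocked = set()

    def handle_report(page_text):
        fp = fingerprint(page_text)
        report_counts[fp] += 1
        if report_counts[fp] >= REPORT_THRESHOLD:
            blocked.add(fp)  # the whole template class dies at once

    def is_blocked(page_text):
        return fingerprint(page_text) in blocked

The whole trick depends on a stable skeleton existing; LLM output has none, which is exactly the problem.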
> Mostly because most spam isn't going to be reported. Traditional anti-spam works because once you have a handful of reports, you can probably generalize it to an entire class of spammy pages generated using a particular technique, and get rid of everything.
Tell that to Elon Musk. He seems unable or unwilling to stop a certain type of fake female bot spam on X that has extremely simple-to-spot and unique patterns.
To be honest, I do use GenAI to help me format my comments, including this one. However, I put in substantial input by outlining my thoughts in detail, so I treat it more as a sophisticated grammar checker. As a non-native speaker, my writing is sufficient for business, but I don’t consider myself a strong writer. GenAI helps me convey my thoughts more clearly and efficiently, which has been tremendously valuable.
Before LLMs, the cost to generate spam content was non-zero. Now it's effectively zero. This infinite supply of AI slop should be handled differently, and more aggressively, than legacy spam.
Medium was flooded with human-created slop before OpenAI got big. I work in Spark a lot, and basically every question I put into Google has a few lame Medium articles that basically recap the docs or have some trivial example code. Ironically, I turn to Claude now as my primary source. It's very good at Spark.
One of the saddest parts of the whole thing is finding out how many fucking arseholes out there have been waiting to flood the internet with crap, and had previously failed only because they were too lazy to do all that typing.
It's an incentives problem, and the ad industry is largely to blame. It's the same cause as much of the destruction and centralization of the web, the rise of clickbait, rising extremism and political polarization, negativity-driven engagement, and myriad other public problems. An attention-driven economy is disastrous for mental and public social health. And some people still defend the ad industry, even going so far as to laud it as a public good, talking about it as if it were a charity funding a "free Internet" rather than a very rich industry.
We missed the boat long ago on dismantling or neutering the ad industry before it did massive irreparable damage. Whatever we do to fix this mess is still going to leave prominent scars for generations.
I don't see the clickbait / extremism promoting content as anything particularly new. In the UK the "red top" newspapers used to supply it and in the recent past it was Fox News.
It was always possible to cocoon yourself into news and opinion you agreed with.
AI slop does seem a little different, if only because it's machine generated and the sheer amount of it would not have been possible when rubbish was human generated and the gatekeepers of publishing were newspaper editors.
I'm waiting for the point when content platforms no longer try to downplay their generated content but instead simply embrace it, and even go as far as excluding human-created content entirely.
Like for example a TikTok, but all the videos are generated by the platform itself using likes/skips to drive its algorithm to generate more addicting content. That seems like the logical conclusion of where all of this is going.
They can then drop the creator side of the market and a lot of moderation issues that come with it, while gaining full ownership, control, and monetization of the underlying content.
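A toy sketch of that loop, with the generative model and engagement telemetry both stubbed out as hypothetical placeholders:

    import random

    THEMES = ["cats", "pranks", "asmr", "drama"]
    weights = {t: 1.0 for t in THEMES}

    def generate_video(theme):
        """Stand-in for a generative video model; returns a fake video id."""
        return f"video-{theme}-{random.randint(0, 999999)}"

    def engagement(video_id):
        """Stand-in for real telemetry: 1.0 = watched/liked, 0.0 = skipped."""
        return random.random()

    # The platform is its own creator: generate, measure, and shift the
    # generation distribution toward whatever the user didn't skip.
    for _ in range(1000):
        theme = random.choices(THEMES, weights=[weights[t] for t in THEMES])[0]
        signal = engagement(generate_video(theme))
        for t in THEMES:
            weights[t] *= 0.999      # slow decay so stale themes fade out
        weights[theme] += signal     # reinforce whatever held attention

No uploads, no revenue share, no moderation queue; the "content" is just a side effect of the optimization.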
I've seen this on Substack as well. For example, I ran across an account on Substack that is posting a lot of ChatGPT-generated articles to reddit. This is clearly ChatGPT-generated based on a very simple prompt:

https://theglobalistperspective.substack.com/p/the-rise-of-i...
Of course, even mentioning the f-word is forbidden, so...
* humans creating things
* algorithms emulating (humans creating things)
* human-in-the-loop guiding (algorithms emulating (humans creating things))
* ai that mimics the (human-in-the-loop guiding (algorithms emulating (humans creating things)))
* what next???
> why doesn't HN have this problem?
HN comments aren’t easily monetizable.
In the case of blatantly fake images, like the ones on Facebook, they get a surprising amount of organic engagement: https://x.com/FacebookAIslop/status/1806416249259258189
Because it's already filled with Russian bots.
Just absolutely embarrassing for us all.