Cloudflare launched “AI Labyrinth” — traps for AI scrapers: fake AI-generated pages that confuse bots and burn their resources.
But… what’s the point? Can someone explain this to me? Why are we trying to make it harder for AIs to access public info humans can find anyway? We’ll get there either way.
Many of the AI scrapers' requests also point to non-existent, redundant, or low-quality destinations. Websites provide a file, /robots.txt, that clearly indicates which URLs crawlers should and should not visit; but the AI scrapers ignore robots.txt, visiting any URL they find, and some they invent (which, naturally, turn out to be non-existent). Websites also indicate when the content at a specific URL has changed or may change; but AI scrapers ignore those indicators too, requesting the same URL for a static webpage sometimes seconds apart.
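For anyone wondering what "respecting those signals" looks like in practice, here is a rough sketch using Python's standard urllib.robotparser. The robots.txt rules, the "ExampleBot" user agent, the example.com URLs, and the conditional_headers helper are all made up for illustration; only If-None-Match and If-Modified-Since are the standard HTTP conditional request headers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would fetch it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 1
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer can_fetch()."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def conditional_headers(etag=None, last_modified=None) -> dict:
    """Build headers that let the server reply '304 Not Modified'.

    Hypothetical helper: the values would come from the ETag and
    Last-Modified headers of an earlier response for the same URL.
    """
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


parser = build_parser(ROBOTS_TXT)
# "ExampleBot" and the URLs below are placeholders, not real endpoints.
allowed = parser.can_fetch("ExampleBot", "https://example.com/articles/1")
blocked = parser.can_fetch("ExampleBot", "https://example.com/private/data")
delay = parser.crawl_delay("ExampleBot")
```

A crawler that checks can_fetch before each request and sends the conditional headers on repeat visits would skip disallowed paths and avoid re-downloading a static page that has not changed.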
https://blog.cloudflare.com/ai-labyrinth/ specifies that it works against "unauthorized crawling" and "inappropriate bot activity". I assume that a scraper (even an AI scraper) that respects robots.txt and doesn't send requests at unreasonable rates won't encounter the AI labyrinth.
Now we see the crawlers ignore the robots.txt.
Some crawlers don't do 1 request per second but hit a website with 100 per second. And for days. And crawl the same data again and again. It makes websites slow and has no immediate benefit to humans.
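To make the "1 request per second" point concrete, a polite crawler can enforce a minimum gap between requests to the same host. This is only a sketch under assumptions: PoliteThrottle and its parameters are hypothetical names, and the one-second default is just the figure mentioned above, not a standard.

```python
import time
from urllib.parse import urlsplit


class PoliteThrottle:
    """Enforce a minimum interval between requests to the same host."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock          # injectable so the logic can be tested
        self._sleep = sleep
        self._last_request = {}      # host -> time the last request was sent

    def wait(self, url: str) -> float:
        """Sleep if needed before requesting url; return the seconds slept."""
        host = urlsplit(url).netloc
        now = self._clock()
        last = self._last_request.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record when this request actually goes out.
        self._last_request[host] = now + delay
        return delay
```

Calling throttle.wait(url) before every fetch caps the crawler at one request per host per interval, which is the difference between a visitor and a load test.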
Additionally, the objective of these pages is to serve ads to real people or funnel real people to paid products. For them, AI traffic is just a cost at best and a denial-of-service attack at worst.
I own an apple farm, so I don't mind leaving out a few apples boxed up and ready to go. For the sake of argument, just assume that these apples are fine but by the time they could be transported for sale they wouldn't be fresh.
For the first two years of doing this, most people would come and pick up a couple of apples and then go home. In the last two months, Jimbo has been pulling up in a truck, dumping as many apples as possible into the back, and driving off.
Eventually I'm going to have to tell Jimbo to stop doing this or at least charge a fee for each apple. Otherwise Jimbo is the only one who gets any.
Also, the site may prohibit its data from being used for AI slop.