I travel often. Sometimes I use a VPN, sometimes I don't. I use a heavily customized Firefox config on Linux.
Cloudflare challenges have made large portions of the web unusable for me.
Some recent examples:
- The "unsubscribe" button in Indeed's job notification emails leads me to an impassable Cloudflare challenge. The "Contact Us" page is also behind an impassable Cloudflare challenge.
- While migrating a non-profit off of A2 Hosting, their login forces me to re-enter credentials after failing a challenge, looping endlessly.
- On a particularly ironic note, I tried to complain on the Cloudflare Forums—met with another impassable challenge.
When reachable, customer support always says "try a mobile data connection", "switch to Chrome", or some other variant of "too bad, so sad". Is anyone else dealing with this mess?
That's a CAN-SPAM act violation.
FTC: "Tell recipients how to opt out of receiving future marketing email from you. Your message must include a clear and conspicuous explanation of how the recipient can opt out of getting marketing email from you in the future. Craft the notice in a way that’s easy for an ordinary person to recognize, read, and understand. Creative use of type size, color, and location can improve clarity. Give a return email address or another easy Internet-based way to allow people to communicate their choice to you. You may create a menu to allow a recipient to opt out of certain types of messages, but you must include the option to stop all marketing messages from you. Make sure your spam filter doesn’t block these opt-out requests."[1]
Experian was recently fined for making it hard to opt out of their marketing emails.
The actual regulation text:
§ 316.5 Prohibition on charging a fee or imposing other requirements on recipients who wish to opt out.
Neither a sender nor any person acting on behalf of a sender may require that any recipient pay any fee, provide any information other than the recipient's electronic mail address and opt-out preferences, or take any other steps except sending a reply electronic mail message or visiting a single Internet Web page, in order to:
(a) Use a return electronic mail address or other Internet-based mechanism, required by 15 U.S.C. 7704(a)(3), to submit a request not to receive future commercial electronic mail messages from a sender; or
(b) Have such a request honored as required by 15 U.S.C. 7704(a)(3)(B) and (a)(4).
That seems to cover it. File a CAN-SPAM act complaint (spam@uce.gov). Send a copy to the legal department of the sender.
[1] https://www.ftc.gov/business-guidance/resources/can-spam-act...
I decided to download larger files from their web site a few tens of millions of times, which I think cost them a few hundred dollars. Unethical? Perhaps, but I'm not the kind of person who just accepts that companies are too large to have humans that can communicate and that I should just accept their harassment.
It worked, though. I finally got a response from Hertz saying they were going to "get to the bottom of it", and I finally stopped getting their spam.
They didn't, but I still received spam which I couldn't opt out of because they wanted me to log into my account, even for support, which obviously didn't exist.
At least back then we had Twitter and messaging them publicly got a customer service response.
If you don't do that, bot protection isn't going to stop a dedicated troll.
CAN-SPAM was introduced by Republicans and signed into law by Bush btw.
It's like a restaurant that complies with a local food access requirement to be open at a certain time... but only by having a drive-through that requires you to not just be a human being, but also to drive a car to get to the restaurant.
Unfortunately, I think the Cloudflare challenges are designed to filter out users similar to your profile... once you stray far enough from the norm, it just looks like a bot / suspicious traffic to them. Statistically there's not enough users like you (privacy-conscious Linux users on nonstandard browsers) for them to really care enough to do anything about it. Site owners don't care either since you're usually like 1-2% of users at most, and typically also the same ones who block ads, etc., so they don't mind blocking you... it's sad, but I don't think there is really anything you can do about it except conform. It's an ongoing arms race and you're caught in the middle.
There are residential-IP-backed VPN services that you can use just like commercial VPN services — but they're mostly built on the backs of botnets, so it's ethically questionable to use them.
The old IP address was a mom-and-pop CGNAT.
Thanks CF, for protecting us from capitalism, I guess?
I do believe that it is true that many site owners wouldn't care. But I suspect that in the vast majority of cases they don't actually know. Cloudflare probably shows them a nice dashboard about all of these blocked "threats" and they don't know better than to question it.
"We have a problem with bots" - "Just create a firewall rule, whatever"
Anyway, I know the "Cloudflare's monopoly gating is killing web openness!" meme is common online, especially on HN, but in real life I've never actually heard anyone else complain about it (either a fellow dev or a customer or a manager). Instead, it's been universal praise for the actual issues Cloudflare exists to solve (CDN, bot protection, serverless, etc)... they are a godsend for small businesses that otherwise get immediately flooded by spam requests, especially from China, Russia, and India.
And if you think Cloudflare is bad, it was even worse before they became dominant, with terrible services like Incapsula/Imperva charging way more but providing both worse bot protection AND more false positives, or the really hard early reCAPTCHAs (that Cloudflare was largely able to replace, for users who DO fit within the "norm"). That, or you'd have to fight every random sysadmin with their own lazy rules, like firewall rules that blacklisted entire regional ISPs and took weeks or months to resolve, if they ever even checked their emails.
As inconvenient as Cloudflare is for users who take privacy seriously and try to be less trackable, for the other 90% of us who don't care as much and easily fit into their "norm" model, it's much nicer than what came before. Site downtime and slowness are also much less common now, in no small part because of their easy CDN and caching.
From the implementation side, I've set up a few Cloudflare accounts in my career, but do take the time to try to configure it to balance security vs accessibility for any given target audience. Sometimes we'd block entire countries, other times we'd minimize security to ensure maximum reach, but usually we'd customize rulesets in the middle for any given company & audience. I never got a complaint about it (our emails were still available and not blocked).
This was always a direct response to some business need, usually spambots or DDoS attempts that fail2ban etc. couldn't catch well enough. For the business, it was usually a "shit, our website is down again, what is it this time", and the choice between "for free or $20 we can get it back up again and not have this issue anymore" or "we can spend thousands of dollars and weeks of labor building our own security solution" is pretty easy. "What about that one guy who is proxied behind TOR and three VPNs with a random user agent using a text-only browser he wrote himself?" never really factors into that process =/ There's just not enough users like that out in the wild vs the very real constant threat of bots and malware.
It's a shitty situation that the web is like this today, and I wish it weren't the case, but it really is an arms race, and these imperfect weapons are just what most of us have access to...
For example, Google proposed https://github.com/explainers-by-googlers/Web-Environment-In... and this was shot down by privacy advocates (for very good reasons).
So basically the choice for website operators is either to fight the bots and accept that their service will be unusable for some subset of their users or not fight the bots, which will lead to their service becoming unusable for everyone.
More and more, you see services pushing you very hard towards using their app and the reason is that with the app, they are able to actually verify that you are likely not a bot (or rather, in reality, that at least the app is running on an actual physical device, mobile phone bot farms are unfortunately also a thing).
As for Cloudflare - they offer it as a service, so when the website operator has a choice between using them or allocating several engineers for bot-fighting, why would they not just go with Cloudflare? Doing it yourself can be slightly higher fidelity, as you know your customers better, but it is also a lot of effort which could be better spent elsewhere.
2/3 of the issues OP listed would not make the service unusable for anyone if the botcheck were removed. 1. What would be the problem with allowing "bots" to opt out of receiving marketing emails? Why do I need to be a human to tell you to stop spamming me? Who is running such a bot, for what purpose? 2. What would be the problem with allowing a "bot" to log in to an already-verified human account a single time?
The only situations where you actually need to confirm that a user "looks human" is for repeated connection attempts in quick enough succession to matter (DDoS prevention), or when they want to do something that someone would actually write a nefarious bot to do (mainly just creating posts/messages visible to other users).
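A minimal sketch of that idea, assuming a simple sliding-window counter: a client only sees a challenge once it exceeds some request rate, and everyone else passes untouched. The class name `ChallengeGate`, the thresholds, and the choice of client key are all hypothetical and for illustration only; this is not how Cloudflare or any particular product decides.

```python
import time
from collections import defaultdict, deque


class ChallengeGate:
    """Decide whether to challenge a client based only on its request rate.

    Hypothetical illustration: thresholds and the client key are assumptions.
    """

    def __init__(self, max_requests=20, window_seconds=10.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # client key -> timestamps of that client's recent requests
        self._hits = defaultdict(deque)

    def should_challenge(self, client_key, now=None):
        """Record one request and return True if the client is over the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[client_key]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        hits.append(now)
        return len(hits) > self.max_requests
```

With a gate like this, a single unsubscribe click or one login attempt never reaches the threshold, so it would never trigger a challenge regardless of how unusual the browser looks.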
Even if you send a confirmation email afterwards, that's potentially millions of emails you are sending because of bots.
Recently I had to deal with this for alibaba just to look at something, which I usually just use torbrowser with, and finally gave up as I couldn't pass the challenge. I suppose I shouldn't be surprised at that though, they trust me as much as I trust them.
The worst is usually adobe and cookielaw with all their related tracking crap, where I can't even get the captcha to render as it's so many layers buried in scripting I can't enable enough sites between ublock, noscript, privacy badger, and firefox strict modes. I treat adobe like malware, but unfortunately things like albertsons.com for groceries and other mega companies love to use it, and their sites literally do not work without allowing their heavy scripting/tracking.
There are other usually smaller captcha players that I haven't been human enough to pass with, I forget the names of the stupid to shame, but a few when I see them I recognize to just close the window and forget about whatever it was I was looking for there (like twitter/x).
Hooray commerce!
The error: ``` Access denied Error 16 www.albertsons.com 2025-01-03 09:30:00 UTC What happened? This request was blocked by our security service Your IP: xxx Proxy IP: xxx (ID xxx) Incident ID: XXX Powered by Imperva ```
Might be worth checking some enterprise threat lists for whatever IPs you're popping up on (i.e. Imperva and Cloudflare), or whether something uniquely fingerprints you from your browser. I use multiple extensions to block whatever they each can, and even I'm not treated as badly as you are, wherever you are coming online from.
Here's Fortinet's you can check your IP against, they all tend to roughly use the same lists eventually: https://www.fortiguard.com/iprep
This is the way.
CloudFlare has positioned itself as the doorman of the Internet, deciding who gets to visit shitty websites written by AIs and who doesn't. Every time I try to visit a website and get blocked by this company and its unnecessary services, I congratulate myself for avoiding yet another terrible website and move on with my life.
[1] https://ido50.net/content/what-chafes-my-groin-9.html
Offering free stuff which works and that many people want is how internet companies get big.
https://news.ycombinator.com/item?id=38063548
What's funny about it is that as a human I get tormented by those things all the time, but I have been writing bots since 1999 and have yet to have CAPTCHAs affect a webcrawling project in a big way. For instance, I have a bot that has collected 800,000 images from 4 web sites since last April. At times I thought they had anti-bot countermeasures, but I realized that when they were having problems it was because the wheels were coming off their web site (don't blame me, that is 0.03 requests/second, and the requests are not parallelized and pipelined like the requests from a web browser). I'm also prototyping one that can look at an article like
https://phys.org/news/2025-01-diversifying-dna-origami-gener...
see if there are links to journal articles in there, determine if the articles are Open Access and pick out an image for social... so far no problems. But if I want to pay my electric bill there's a CAPTCHA -- I mean, what kind of bot wants to pay my electric bill? (Kinda seems like it is asking for a lawsuit in this day and age if it prevents anyone 'differently abled' from accessing essential services...)
None, but they do want to use your electricity company's credit card payment facility to test stolen card numbers.
That's because that web site returns bad results to Cloudflare DNS, ostensibly because they take issue with the way it handles EDNS0. The fact that it fails to work is a deliberate choice by the site operator; it isn't Cloudflare's fault.
Cloudflare wants to "protect" people from exposing even their general region. This has the side effect of making CDNs that aren't Cloudflare work worse. Cloudflare are being dicks because they do to others what they wouldn't want to be done to themselves, or what they themselves don't do to themselves.
It's not even that people are choosing to opt in to Cloudflare's bullshit. If you use Firefox in the US (and many other areas, but the US for sure) and you haven't manually configured Firefox or set up a canary domain, all your DNS lookups are going to Cloudflare, and they're using that to make other CDNs work less well. That's definitely shady and definitely bad on Cloudflare's end.
I'm glad some people are taking a stand.
I work at the Uni now and circa 2015 we had a lawsuit against us because we made people use terrible quality applications that weren't accessible. I'd make the case that that sort of organization which has a rigid social hierarchy (e.g. grad student, postdoc, assistant professor, associate professor, full professor, department head, provost, ...) finds it close to impossible to confront quality problems that it finds invisible. (e.g. if you submitted a bad paper to a journal or had sex with an undergraduate it could understand that but a web site could set your computer on fire and they wouldn't see a problem with that.)
Since then all higher ed organizations feel a lot of need to offer accessible applications. My unit sells a subscription service to a data product and in sales talks and other conversations with our customers we find accessibility is a priority so it is a priority for me as a web dev.
(2) Don't get me started about RSS. I think it is great, kinda. For $10 a month I can pay Superfeedr to scrape 110 news sites and send them to my web hook which queues them in SQS and lets my RSS reader YOShInOn ingest them at its own convenience. I'd like to subscribe to 2000 or so independent blogs but don't want to pay a $100+ a month scraping bill.
Could I write my own crawler? Sure! But polling is for the birds. You really want to get a ping just when the event happens (ActivityPub? PubSubHubbub? AT Protocol? XMPP?) but instead you have to poll. There are two kinds of polling: (a) too fast, (b) too slow. Should I run it at home over my slow ADSL connection (is my wife having trouble using the internet because my crawler is having a bad day?) or should I run it in the cloud where trying to save $5 a month on my bill could cause EBS volumes to go swap crazy costing me $500 a month? It's awful for people who run feeds, see
https://rachelbythebay.com/w/2024/05/27/feed/
although she should (a) just get a CDN and get over it or (b) give up on RSS. Sorry, people write stupid stateless crawlers with curl and making your crawler stateful enough to respect her silly 429 protocol makes RSS no longer a simple protocol.
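For a sense of how much state "respecting 429" actually takes, here is a rough sketch, assuming the crawler remembers a next-allowed fetch time plus the last `ETag` and `Last-Modified` per feed. `Retry-After`, `If-None-Match`, and `If-Modified-Since` are standard HTTP headers; the function names, the plain-dict `headers` argument, and the one-hour default interval are assumptions made for illustration, not anything the linked post prescribes.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def conditional_headers(etag=None, last_modified=None):
    """Build request headers so an unchanged feed can answer 304 Not Modified."""
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def next_fetch_time(status, headers, now, default_interval=timedelta(hours=1)):
    """Return the earliest time this feed should be polled again.

    `headers` is assumed to be a plain dict keyed by "Retry-After".
    """
    if status not in (429, 503):
        return now + default_interval

    retry_after = headers.get("Retry-After", "").strip()
    if retry_after.isdigit():
        # Retry-After given as a number of seconds.
        return now + timedelta(seconds=int(retry_after))
    if retry_after:
        # Retry-After given as an HTTP date.
        try:
            when = parsedate_to_datetime(retry_after)
        except (TypeError, ValueError):
            when = None
        if when is not None:
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            return max(when, now)
    # Throttled without a usable hint: back off to twice the usual interval.
    return now + 2 * default_interval
```

The per-feed state is three values (next fetch time, `ETag`, `Last-Modified`), which is more than a bare `curl` in a cron job keeps, but not much more.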
On top of that, people keep failing with the same failing user interfaces for RSS readers that have been failing since 1999, with no insight that "people tried that in 2001 and it failed". People like Dave Winer have no insight into why the world couldn't care less about RSS because they just won't recognize there are problems.
(e.g. if you gotta know, YOShInOn works like TikTok... I never "mark as read", it doesn't show me little windows that show me the top N from 20 different sites, none of that.)
(3) If it's your electric bill it really is an essential service that there is no competition for. Frequently markets work, but not in that case, even if Enron was able to fool some legislators that they would work in that case for a while.
For example, for starters, Cloudflare and Google need to find ways so that individual people who're wrongly being locked out of services by the company, have some way to get that unlocked. Not "sux2bu we dont do support bro".
(Then they can start thinking about the next step, which is due process, and what it means to wrongly lock out someone in the first place.)
That said, as an immediate pragmatic matter, one debugging tip for your Firefox is to go to the `about:profiles` URL and temporarily create a new profile, without using any Firefox sync feature, and see if Cloudflare lets you through. Then incrementally add back your extensions and preference customizations, and see if/when CF stops letting you in. (Not that it will necessarily identify the sole and exact trigger, since they might be using scores of multiple factors, but it will be evidence of one thing that pushes it over the edge. And maybe get you to a compromise setup that lets you do your work for now.) Also helpful is to have alternate browsers installed; personally, I keep Chromium installed, as my "violate me every possible way, if you'll just let me access this one page/site I really need right now".
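The add-back procedure can be written down as a small loop. This is only a sketch: `passes_challenge` is a hypothetical callback that in practice would be you loading the page by hand in the test profile and reporting whether the challenge let you through, and the function name `find_trigger` is made up for illustration.

```python
def find_trigger(customizations, passes_challenge):
    """Re-enable customizations one at a time, starting from a clean profile.

    Returns (item, enabled): the first item whose addition made the challenge
    fail (or None if it never failed), and everything enabled at that point.
    """
    enabled = []
    for item in customizations:
        enabled.append(item)
        # passes_challenge is a hypothetical check, e.g. a manual page load.
        if not passes_challenge(list(enabled)):
            return item, enabled
    return None, enabled
```

As the caveat above says, if the challenge scores several factors together, the item this returns is only the one that pushed the total over the edge, so reordering the list can yield a different answer.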
It seems ironic that as a human I can't seem to reliably prove I am a human with a realistic amount of effort via these systems, but having installed a specific automated browser extension does?
I am not a fan of Cloudflare and don't like the idea of running their software on my computer, but it seemed like the only options to continue using the internet at all.