I have exactly this problem. Beej's Guide to Network Programming is indexed just fine. Beej's Guide to C won't index.
The automated tool says it's in violation of some unnamed rule, but I can't figure out which. There's zero SEO, tracking, or ads, and the content is educational and G-rated.
All the other guides index just fine.
I asked for a review and they came back with the same ambiguous message. Eventually I just gave up.
Recently I split the C guide in two. I'll have to check to see if that made any difference.
But it left a bad taste, and now I don't trust Bing or DDG to provide complete results. Google's overrun with spam, but at least my stuff actually shows up on Startpage.
Wow it's beej. I owe you rather a lot of beers, Guide to Network Programming is directly responsible for my entire career. Sorry for the low value post! :}
I may have found the answer, and I've seen this before (it happened to me once). It's when a different spam site copies your content wholesale, and a search engine decides they're the "original" site, and you're the spammy copycat.
Because if you put the headlines (in quotes) from two of his recent articles into Bing, e.g. either "Megan Smith explaining the General Magic prototyping process" or "Denialists, Alarmists, and Doomists", both return as their first result a URL starting with "https://www.scien.cx", which seems to be the spam site with a copy of each article. (The URL isn't loading right now, however, when I try to visit.)
How to fix it really depends on what techniques they're using to mirror your site, of which there are many.
It feels like the Internet is a more hostile place than ever for small-time websites. You get squeezed from below by wily criminals, and crushed from above by careless megacorps who want to filter out anything that doesn't make them money.
Nowadays, when I search for things, the results are often clearly pages that have come from a program scraping sites and then merging them into one page. You can tell because the pages are not really coherent and quickly start to repeat themselves. I assume they are getting money through ads on the pages, though I never actually see the ads because of my blockers. I wish there was a button in the browser that I could click to report the page as spam to all search engines.
> It feels like the Internet is a more hostile place than ever for small-time websites. You get squeezed from below by wily criminals, and crushed from above by careless megacorps who want to filter out anything that doesn't make them money.
The problem is that the two work hand-in-hand, thanks to the advertising driven search model, and the search engines owning the main advertising platforms.
It should be easy for search engines to identify an original site from the SEO spammer rip-offs - the original site is going to have no adverts (or certainly fewer) while the SEO spammer copies are going to be covered in adverts. The problem is that the search engines have no incentive to do so, in fact if anything they have the incentive to send people to the sites with more adverts.
And of course the whole problem has been created by the search engines in the first place - there would be no point in SEO spammers making advert-laden ripoff sites if it wasn't to rake in advertising revenue.
No more hostile than the real world; we are just finding out that it is a reflection of our world. The difference, of course, is the global interconnectedness, which magnifies the celebrities but also the crooks.
> What should you do when another site copies your content like this then?
Have we gotten to the point where websites (and their content) need to be verified like Twitter, Instagram, Facebook, and TikTok do for personal accounts?
If so, will search engines be the ones verifying, using this as a new revenue scheme (with the dangers inherent in this, i.e. pay to be listed or ranked higher)?
Various things, from addressing the problem directly if possible (block the IP address range they use to scrape your content, insert JavaScript to strip the content client-side depending on the domain it's being served from), to changing how search engines treat the two sites (canonical link tags, contact the search engine to let them know, build up links on the web to make your site higher ranked).
The more sophisticated and popular the copycat site is (scraping from a distributed network, stripping most HTML tags, etc.), the harder it becomes, and the only option is to contact the search engine and hope they can manually mark your domain as the authoritative one. Your success may vary according to your popularity/importance.
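For what it's worth, here's a minimal Python sketch of two of the options above: refusing requests from a scraper's IP range, and emitting a canonical tag for a page. The names `BLOCKED_NETWORKS`, `is_blocked`, and `canonical_link_tag` are made up for illustration, the CIDR ranges are reserved documentation addresses rather than any real scraper, and wiring this into an actual web server is left out.

```python
import html
import ipaddress

# Hypothetical ranges attributed to a scraper. These are reserved
# documentation ranges; substitute whatever shows up in your own logs.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BLOCKED_NETWORKS)


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for a page's <head>."""
    return f'<link rel="canonical" href="{html.escape(url, quote=True)}">'


if __name__ == "__main__":
    print(is_blocked("203.0.113.7"))
    print(canonical_link_tag("https://example.com/guide/"))
```

A copy that's lifted wholesale tends to carry the canonical tag along with it, which is why it can help; a scraper that strips most tags won't be affected, and a distributed one won't sit in a tidy IP range.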
When you first hear a term used, I think it's natural for people to just try to figure out what it means in context without looking up the official definition (I've caught myself doing this subconsciously before).
Imo a logical interpretation of "shadow ban" would be when you are banned but they didn't tell you they banned you, and regular "ban" is when they tell you you were banned. It makes enough sense that people don't think they need to look it up to confirm.
edit: funny enough, I did double-check the wikipedia page to make sure my understanding was correct, but upon reading further it does acknowledge the expanding of the definition: https://en.wikipedia.org/wiki/Shadow_banning
Elon wants to redefine the term as well but for the purpose of a coordinated witch hunt. People like using shadow ban because it sounds more malicious vs. content that is no longer actively promoted by a company. It's hard to claim you're a victim if the reason is you're just not that interesting or popular.
Is there also a term for being algorithmically suppressed on a social media platform, I wonder? I.e. a much more subtle, harder to detect mechanism whereby the algorithms ensure you get some exposure, but never the same as other unsuppressed people would get based on similar activity. Or only exposure to a certain limited subset of the graph based on some metrics (e.g. just your 'friends' so no one points out you are effectively shadow banned).
That is what is happening: you think you are not showing up because of bad SEO or better results. You have to find out through experimentation that you are restricted. The moderator didn't let you know that they have taken punitive action against you.
Indeed. Some people seem to be trying to make it apply any time the top result of an algorithmic ranking isn't what they think it should be (eg if they have been deboosted rather than shadow banned).
I suspect OP used it in the way that “I believe all my posts are showing up in the places that they should, but unbeknownst to me, they are being suppressed.” In this instance, I could see a “search engine shadowban” being an appropriate moniker.
I think a good test of whether this application of the term makes any sense is: Could any search engine ban ever not be a shadow ban? We already have a term: ban. Let's just use that one and stop conflating things and being unnecessarily imprecise and incendiary. It helps certain parties' (edit plural possessive) agenda but does not help us clearly communicate.
The author is extending the concept to include an inadvertent ban. Why would Bing warn him anyways, since there is no user account? Welcome to Cancelbannia.
This triggered me to DuckDuckGo my own site and immediately I notice the top result is someone rehosting my OSS on a page loaded with pages of crap SEO content.
>One “out there” reason I can think is that I use Amazon Affiliate links on my Bookshelf and my /Uses page and that triggers a shadow ban?
For some reason, Beej’s Guide to C Programming is also banned from Bing (and consequently DDG) [1], with the standard robotic non-explanations given when the author asked, even though the rest of the site is not.
He says in the link it’s specifically the C guide, the rest of the website is fine. Though... yeah, DDG queries like “beej c guide strlen” give reasonable results for me, if with an unjustifiably high-ranked position for the mirror at http://docs.hfbk.net/beej.us. Bing ones only include the mirror and the other guides (and a Scribd-hosted PDF copy, of all things, as the first result below a huge navigation card referring to https://beej.us/guides but without the C guide among the links).
You've indicated that you've used Bing's tools to see if your website has been indexed but are silent as to whether you've actually manually submitted your site to be indexed by Bing using their url submission tool [0]. If you do submit the URL and then, after a decent interval, your site still doesn't show up then there might be something to your claim.
But I'm pretty sure mine is the greatest that you can't pay money for. ;)
But they index everything else on my site and don't prude out over that...
(Yes, not adding much insightful conversation. I don’t care if I get downvoted.)
https://duckduckgo.com/?q=c+guide+stdalign&t=ffab&ia=web
brings up your guide as the 6th result.
I always wondered about this. What exactly is a backlink, and why should I need one?
Example search and resulting URL:
https://www.bing.com/search?q=%22Megan+Smith+explaining+the+...
https://www.scien.cx/2022/12/25/megan-smith-explaining-the-g...
Compare with Google getting it right:
https://www.google.com/search?q=%22Megan+Smith+explaining+th...
https://daverupert.com/2022/12/megan-smith-general-magic-pro...
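If you want to repeat the quoted-headline check for your own pages, a rough Python sketch for building the exact-phrase search URLs follows. The helper name `exact_phrase_search_url` is made up for illustration; the two search base URLs are the ones in the links above, and the sketch only builds the URL, it doesn't fetch or parse any results.

```python
from urllib.parse import urlencode


def exact_phrase_search_url(search_base: str, phrase: str) -> str:
    """Return a search URL whose query is the phrase wrapped in double quotes."""
    query = '"' + phrase + '"'
    # urlencode percent-encodes the quotes as %22 and turns spaces into '+'.
    return search_base + "?" + urlencode({"q": query})


if __name__ == "__main__":
    headline = "Megan Smith explaining the General Magic prototyping process"
    for base in ("https://www.bing.com/search", "https://www.google.com/search"):
        print(exact_phrase_search_url(base, headline))
```

Open each URL by hand and see whether the first result is your domain or somebody else's copy.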
That doesn't sound as cool and victimey though.
Ghost Banned
or
Ghostdexed
Scrolling further, I don’t seem to find my own site either… https://donatstudios.com
I’ve added my site into Bing webmaster tools, we’ll see if it helps I guess.
It's probably not the reason, but it's worth noting that the author is using Amazon affiliate links in violation of Amazon and FTC rules because they're not disclosing the fact that they profit from purchases through their links.
Per Amazon:
>Anytime you share an affiliate link, it's important to disclose that to your audience... you must (1) include a legally compliant disclosure with your links and (2) identify yourself on your Site as an Amazon Associate with the language required by the Operating Agreement.
https://affiliate-program.amazon.com/help/node/topic/GHQNZAU...
Per FTC:
>As for where to place a disclosure, the guiding principle is that it has to be clear and conspicuous... Consumers should be able to notice the disclosure easily. They shouldn’t have to hunt for it.
https://www.ftc.gov/business-guidance/resources/ftcs-endorse...
[1] https://beej.us/guide/bgc/whynoddg.html
[0] https://www.bing.com/webmasters/help/url-submission-62f2860b
[EDIT] I just published a new blog post "Bing and DuckDuckGo removed my business web site AGAIN" https://lapcatsoftware.com/articles/bing2.html
Sigh.