Readit News
beej71 · 3 years ago
I have exactly this problem. Beej's Guide to Network Programming is indexed just fine. Beej's Guide to C won't index.

The automated tool says it's in violation of some unnamed rule, but I can't figure out which. There's zero SEO, tracking, or ads, and the content is educational and G-rated.

All the other guides index just fine.

I asked for a review and they came back with the same ambiguous message. Eventually I just gave up.

Recently I split the C guide in two. I'll have to check to see if that made any difference.

But it left a bad taste, and now I don't trust Bing or DDG to provide complete results. Google's overrun with spam, but at least my stuff actually shows up on Startpage.

KRAKRISMOTT · 3 years ago
It is interesting since Beej's guide is probably the most famous C/Unix programming tutorial after K&R.
beej71 · 3 years ago
I don't know if I'd go _that_ far. There are a lot of great books out there.

But I'm pretty sure mine is the greatest that you can't pay money for. ;)

abdullah2993 · 3 years ago
What's K&R?
et-al · 3 years ago
Could it be due to some overly zealous prude filter and the unfortunate coincidence that "beej" is American slang for blowjob (oral sex)?
beej71 · 3 years ago
I've definitely considered that. (I'm well-aware of my nickname's connotations, and I just don't care. :) )

But they index everything else on my site and don't prude out over that...

michaelmrose · 3 years ago
Actually, outright porn is indexed, you know, or so I hear.
Johnny555 · 3 years ago
That's pretty obscure slang, I'm American and have never heard it.
cyberpunk · 3 years ago
Wow it's beej. I owe you rather a lot of beers, Guide to Network Programming is directly responsible for my entire career. Sorry for the low value post! :}
beej71 · 3 years ago
Entire career? That's pretty high praise... Well, if you're ever in Bend, OR drop me a DM and I'll start to collect. :)
mprime1 · 3 years ago
My hero! Thank you for your work.

(Yes, not adding much insightful conversation. I don’t care if I get downvoted.)

richardjam73 · 3 years ago
It seems to show up for me. Perhaps it is fixed now.

https://duckduckgo.com/?q=c+guide+stdalign&t=ffab&ia=web

brings up your guide as the 6th result.

beej71 · 3 years ago
I have to add "beej" to the search, and then it hits my C Library Reference Guide. But the C Tutorial Guide is nowhere to be found. :(
daflip · 3 years ago
i assume you have >0 backlinks to the site?
beej71 · 3 years ago
Some rando backlink checker says it's about 1300.
lelanthran · 3 years ago
> i assume you have >0 backlinks to the site?

I always wondered about this. What exactly is a backlink, and why should I need one?

crazygringo · 3 years ago
I may have found the answer, and I've seen this before (it happened to me once). It's when a different spam site copies your content wholesale, and a search engine decides they're the "original" site, and you're the spammy copycat.

Because if you put the headlines (in quotes) from two of his recent articles into Bing, e.g. either "Megan Smith explaining the General Magic prototyping process" or "Denialists, Alarmists, and Doomists", both point as their first result to a URL starting with "https://www.scien.cx" which seems to be the spam site with a copy of each article. (The URL isn't loading right now, however, when I try to visit.)

How to fix it really depends on what techniques they're using to mirror your site, of which there are many.

Example search and resulting URL:

https://www.bing.com/search?q=%22Megan+Smith+explaining+the+...

https://www.scien.cx/2022/12/25/megan-smith-explaining-the-g...

Compare with Google getting it right:

https://www.google.com/search?q=%22Megan+Smith+explaining+th...

https://daverupert.com/2022/12/megan-smith-general-magic-pro...
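The check described above (search for a headline in quotes, then see whose domain comes first) can be sketched in Python. This is only an illustration under assumptions: the helper names `exact_phrase_search_url` and `is_from_domain` are made up for this example, the `q` parameter is taken from the example URLs in this comment, and nothing is fetched; the sketch just builds the search URL and compares hostnames.

```python
from urllib.parse import quote_plus, urlsplit

# Search endpoints as they appear in the example URLs above.
SEARCH_ENDPOINTS = {
    "bing": "https://www.bing.com/search?q=",
    "google": "https://www.google.com/search?q=",
}


def exact_phrase_search_url(engine: str, headline: str) -> str:
    """Build a search URL for the headline wrapped in double quotes."""
    return SEARCH_ENDPOINTS[engine] + quote_plus(f'"{headline}"')


def is_from_domain(result_url: str, domain: str) -> bool:
    """Return True if result_url is hosted on domain or one of its subdomains."""
    host = (urlsplit(result_url).hostname or "").lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)
```

For instance, `exact_phrase_search_url("bing", "Denialists, Alarmists, and Doomists")` gives a URL to open by hand, and `is_from_domain(first_result, "daverupert.com")` tells you whether the top hit is the original site or a copy hosted elsewhere.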

gary_0 · 3 years ago
Recently on HN there was "Someone is proxy-mirroring my website, can I do anything?": https://news.ycombinator.com/item?id=33952114

It feels like the Internet is a more hostile place than ever for small-time websites. You get squeezed from below by wily criminals, and crushed from above by careless megacorps who want to filter out anything that doesn't make them money.

irrational · 3 years ago
Nowadays, when I search for things, the results are often clearly pages that have come from a program scraping sites and then merging them into one page. You can tell because the pages are not really coherent and quickly start to repeat themselves. I assume they are getting money through ads on the pages, though I never actually see the ads because of my blockers. I wish there was a button in the browser that I could click to report the page as spam to all search engines.
m-i-l · 3 years ago
> It feels like the Internet is a more hostile place than ever for small-time websites. You get squeezed from below by wily criminals, and crushed from above by careless megacorps who want to filter out anything that doesn't make them money.

The problem is that the two work hand-in-hand, thanks to the advertising driven search model, and the search engines owning the main advertising platforms.

It should be easy for search engines to identify an original site from the SEO spammer rip-offs - the original site is going to have no adverts (or certainly fewer) while the SEO spammer copies are going to be covered in adverts. The problem is that the search engines have no incentive to do so, in fact if anything they have the incentive to send people to the sites with more adverts.

And of course the whole problem has been created by the search engines in the first place - there would be no point in SEO spammers making advert-laden ripoff sites if it wasn't to rake in advertising revenue.

kshacker · 3 years ago
No more hostile than the real world; we are just finding out it is a reflection of our world. The difference, of course, is the global interconnectedness, which magnifies the celebrities but also the crooks.
lapcat · 3 years ago
It's not clear that this is happening with all of the (many) sites that are mysteriously deindexed by Bing. See my comment: https://news.ycombinator.com/item?id=34389279
TrueGeek · 3 years ago
What should you do when another site copies your content like this then?
O1111OOO · 3 years ago
> What should you do when another site copies your content like this then?

Have we gotten to the point where websites (and their content) need to be verified like Twitter, Instagram, Facebook, and TikTok do for personal accounts?

If so, will search engines be the ones verifying - using this as a new revenue scheme (with the dangers inherent in this... i.e., pay to be listed or ranked higher)?

crazygringo · 3 years ago
Various things from addressing the problem directly if possible (block the IP address range they use to scrape your content with, insert JavaScript to strip the content client-side depending on the domain it's being served from), to changing their search engine behavior (canonical meta tags, contact the search engine to let them know, build up links on the web to make your site higher ranked).

The more sophisticated and popular the copycat site is (scraping from a distributed network, stripping most HTML tags, etc.), the harder it becomes, and the only option is to contact the search engine and hope they can manually mark your domain as the authoritative one. Your success may vary according to your popularity/importance.
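Two of the measures above, blocking the scraper's IP range and adding a canonical hint, might look roughly like this in Python. Treat it as a sketch under assumptions: `BLOCKED_RANGES`, `is_blocked`, and `canonical_link_tag` are hypothetical names, the ranges are reserved documentation placeholders rather than any real scraper's addresses, and the canonical hint is written as the usual `<link rel="canonical">` element in the page head.

```python
import ipaddress
from html import escape

# Placeholder ranges from the reserved documentation blocks (RFC 5737 / RFC 3849);
# a real list would come from your own server logs.
BLOCKED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_RANGES)


def canonical_link_tag(original_url: str) -> str:
    """Render a <link rel="canonical"> element pointing at the original page."""
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'
```

A request handler could return an error whenever `is_blocked(remote_addr)` is true, and a page template could emit `canonical_link_tag(...)` in its head. A scraper that copies the HTML verbatim would then carry a pointer back to the original; one that strips tags would not, which is the harder case described above.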

gkbrk · 3 years ago
You send a DMCA.
StreamBright · 3 years ago
This is exactly a perfect use case for a blockchain. In fact if people are interested we should create a POC.
0cf8612b2e1e · 3 years ago
How do all of those Stack Overflow mirrors stay up if there is a mechanism to pull the copy-cat?
psychoslave · 3 years ago
Maybe there is a retrocommission program to which they are affiliated?
supermatt · 3 years ago
This is not a shadow ban. A shadow ban is when it appears to you that you are not banned, but from others' perspective you are.
tasuki · 3 years ago
Yes. Unfortunately people increasingly use "shadow ban" to just mean "ban", perhaps it sounds cool?
dack · 3 years ago
When you first hear a term used, I think it's natural for people to just try to figure out what it means in context without looking up the official definition (I've caught myself doing this subconsciously before).

Imo a logical interpretation of "shadow ban" would be when you are banned but they didn't tell you they banned you, and regular "ban" is when they tell you you were banned. It makes enough sense that people don't think they need to look it up to confirm.

edit: funnily enough, I did double-check the Wikipedia page to make sure my understanding was correct, but upon reading further it does acknowledge the expansion of the definition: https://en.wikipedia.org/wiki/Shadow_banning

cactusplant7374 · 3 years ago
Elon wants to redefine the term as well but for the purpose of a coordinated witch hunt. People like using shadow ban because it sounds more malicious vs. content that is no longer actively promoted by a company. It's hard to claim you're a victim if the reason is you're just not that interesting or popular.
rapnie · 3 years ago
Is there also a term for being algorithmically suppressed on a social media platform, I wonder? I.e. a much more subtle, harder to detect mechanism whereby the algorithms ensure you get some exposure, but never the same as other unsuppressed people would get based on similar activity. Or only exposure to a certain limited subset of the graph based on some metrics (e.g. just your 'friends' so no one points out you are effectively shadow banned).
Marazan · 3 years ago
Yes, it's called Algorithmic Suppression.

That doesn't sound as cool and victimey though.

colanderman · 3 years ago
"Automatically downranked/downweighted/penalized" are terms I've heard.
hutzlibu · 3 years ago
I think unlike in the case here, (soft) shadow banning would be appropriate to describe it, even though not 100% technically correct.
cma · 3 years ago
I believe Musk when condemning old twitter for doing it and deboosting when he promises new Twitter will do it.
badrabbit · 3 years ago
That is what is happening: you think you are not showing up because of bad SEO or better results. You have to find out through experimentation that you are restricted. The moderator didn't let you know that they have taken punitive action against you.
supermatt · 3 years ago
That would just be a ban. You don't need to be told you are banned.
seanhunter · 3 years ago
Indeed. Some people seem to be trying to make it apply any time the top result of an algorithmic ranking isn't what they think it should be (e.g. if they have been deboosted rather than shadow banned).
JadoJodo · 3 years ago
I suspect OP used it in the way that “I believe all my posts are showing up in the places that they should, but unbeknownst to me, they are being suppressed.” In this instance, I could see a “search engine shadowban” being an appropriate moniker.
InspiredIdiot · 3 years ago
I think a good test of whether this application of the term makes any sense is: Could any search engine ban ever not be a shadow ban? We already have a term: ban. Let's just use that one and stop conflating things and being unnecessarily imprecise and incendiary. It helps certain parties' (edit plural possessive) agenda but does not help us clearly communicate.
ffhhj · 3 years ago
The author is extending the concept to include an inadverted ban. Why would Bing warn him anyways, since there is no user account? Welcome to Cancelbannia.
Dylan16807 · 3 years ago
I think being banned from search could be part of a shadow ban, but when the entire service is search that's just a ban.
mkl · 3 years ago
If it was a shadow ban he would see his site when he searched but we wouldn't. This is definitely not that.

travisgriggs · 3 years ago
Clearly we need a new term. I nominate

Ghost Banned

or

Ghostdexed

zxcvbn4038 · 3 years ago
Check with Stanford first, “ghost” might be offensive to the living impaired. =P
stephencanon · 3 years ago
Outdexed, clearly.

donatj · 3 years ago
This triggered me to DuckDuckGo my own site and immediately I notice the top result is someone rehosting my OSS on a page loaded with pages of crap SEO content.

Scrolling further, I don’t seem to find my own site either… https://donatstudios.com

I’ve added my site into Bing webmaster tools, we’ll see if it helps I guess.

HomeDeLaPot · 3 years ago
Wow. I think I've used your circle generator on that other site without realizing it. That sucks. I've bookmarked yours now!
donatj · 3 years ago
A bunch of sites popped up hosting it loaded with SEO garbage and ads. I’d licensed it MIT so while they certainly can, it sure doesn’t feel nice.
mtlynch · 3 years ago
>One “out there” reason I can think is that I use Amazon Affiliate links on my Bookshelf and my /Uses page and that triggers a shadow ban?

It's probably not the reason, but it's worth noting that the author is using Amazon affiliate links in violation of Amazon and FTC rules because they're not disclosing the fact that they profit from purchases through their links.

Per Amazon:

>Anytime you share an affiliate link, it's important to disclose that to your audience... you must (1) include a legally compliant disclosure with your links and (2) identify yourself on your Site as an Amazon Associate with the language required by the Operating Agreement.

https://affiliate-program.amazon.com/help/node/topic/GHQNZAU...

Per FTC:

>As for where to place a disclosure, the guiding principle is that it has to be clear and conspicuous... Consumers should be able to notice the disclosure easily. They shouldn’t have to hunt for it.

https://www.ftc.gov/business-guidance/resources/ftcs-endorse...

mananaysiempre · 3 years ago
For some reason, Beej’s Guide to C Programming is also banned from Bing (and consequently DDG) [1], with the standard robotic non-explanations given when the author asked, even though the rest of the site is not.

[1] https://beej.us/guide/bgc/whynoddg.html

Pelam · 3 years ago
I can find Beej.us with DDG, but not daverupert.com. Maybe Beej got the problem resolved somehow.
mananaysiempre · 3 years ago
He says in the link it’s specifically the C guide, the rest of the website is fine. Though... yeah, DDG queries like “beej c guide strlen” give reasonable results for me, if with an unjustifiably high-ranked position for the mirror at http://docs.hfbk.net/beej.us. Bing ones only include the mirror and the other guides (and a Scribd-hosted PDF copy, of all things, as the first result below a huge navigation card referring to https://beej.us/guides but without the C guide among the links).
pseudolus · 3 years ago
You've indicated that you've used Bing's tools to see if your website has been indexed but are silent as to whether you've actually manually submitted your site to be indexed by Bing using their URL submission tool [0]. If you do submit the URL and then, after a decent interval, your site still doesn't show up, then there might be something to your claim.

[0] https://www.bing.com/webmasters/help/url-submission-62f2860b

Liquix · 3 years ago
To be fair, a decade-old SFW blog with 2.2k crawler hits ought to be automatically indexed by any major search engine.
beej71 · 3 years ago
I'm not the OP, but I have a site with the same issue, and I did manually submit. No impact.
lapcat · 3 years ago
See "Bing and DuckDuckGo removed my business web site" https://lapcatsoftware.com/articles/bing.html and "My website disappeared from Bing and DuckDuckGo, Part 2" https://www.jessesquires.com/blog/2022/07/25/my-website-disa...

[EDIT] I just published a new blog post "Bing and DuckDuckGo removed my business web site AGAIN" https://lapcatsoftware.com/articles/bing2.html

Sigh.