Whenever I'm searching for anything even mildly off the beaten path, it's not uncommon for the top results to be SEO stuffed spam websites, or even real websites that I can't access (because of a paywall or a demand for an adblocker exception to proceed). Usually pages from the same domains are top-ranked for other related searches too.
As a user I'd love to be able to tell my search engine "Never show me results from this domain" (similar to blocking an account on Twitter), but as far as I can tell there is no way to do this in either Google or DuckDuckGo search.
This seems like such low-hanging fruit to me that I'm wondering if other people have ever wanted this, and if there's actually a reason not to do it.
In my opinion, back then, they needed the data as a training set for spammy domain detection. Now that SERP spam is no longer a serious issue (in Google's eyes anyway), why bother? Google always knows what's best for you.
Given the number of clones of StackOverflow and GitHub that show up on the front page, even at the cost of replacing the original SO and GH links that they copied from, I can only assume either Google's eyes are blind or the search engine devs are so good at their job that they never need to Google anything themselves.
The whole first page is just poorly written Fiverr-style articles, all by people for whom English is clearly a second language, at best.
Eventually, legit stuff gets on the first page too, but there are a few instances where weeks later the first result is still one of these garbage SEO-torture sites.
It has really made me lose my faith in Google lately.
We and the other few independent search engines have not made enough dent on the market to suffer SEO spam. We'll have a way to deal with it (watch this space). Right now you'll certainly get results "off the beaten path" and with one click you can try out 8 other search options [0].
[0] https://blog.mojeek.com/2022/02/search-choices-enable-freedo...
If the top results are "SEO stuffed spam websites", they're probably also loaded with ads. If they're loaded with ads, there's a good chance that they're using Google's ads.
If the top results are "SEO stuffed spam websites", there's a good chance they're chock-full of affiliate links. If a search for "best baby formula" is going to end up costing Amazon an affiliate fee via an "SEO stuffed spam website", Amazon might as well just buy a Google ad for that keyword and cut out the "SEO stuffed spam website". If the results go to pages that are just going to cost a seller money anyway, it gives the seller more incentive to buy ads since they aren't getting free traffic from the search engine anyway.
While being an answer engine keeps you on their page longer, feeding you SEO spam also keeps you coming back to their page; feeding you SEO spam signals to potential advertisers that they won't be getting free traffic from the search engine so they might as well pay for the traffic via ads.
I'm not suggesting that it's a conspiracy to send you bad results, but it does seem likely that as long as they aren't losing traffic to competitors, it might not be something that becomes a priority.
Bingo. If you can't get clicks on AdWords get them on AdSense.
You do much better than Google. (Google will always include "Top 10 Best Grass Fed Organic Steak in San Francisco, CA" from Yelp and then link to places that don't have grass-fed beef options.)
However, currently, your first ranking option is Pinterest spam FYI: https://www.mojeek.com/search?q=grass+fed+beef+restaurant+in... → https://www.mojeek.com/search?q=grass+fed+beef+restaurant+in...
Your second option is the correct kind of result (a blog from a local that actually answers the question): https://www.grassfedgirl.com/paleo-friendly-restaurants-in-s...
By contrast, most of the results on Google are not correct. They are mostly articles about steaks (plus a few actual restaurants that serve grass-fed beef, so that is good). FYI, your results don't include these restaurants. For example, it would be nice if this Google result showed up in your results:
"SF / SOMA - Belcampo We source grass-fed and finished, pasture-raised meats directly from our own climate positive CA farms and seasonal vegetables from local farms."
As you point out #1 organic link on G is Yelp. They currently block all but G, Bing and Yahoo! - we'll get in touch with them again. https://www.yelp.com/robots.txt
Organic link #3 on G is Belcampo; we have some of that indexed so we'll take a look: https://www.mojeek.com/search?q=food+site%3Abelcampo.com
A filter gives me a binary outcome; it either has Bruce Willis, or it doesn't.
All of which is sort of the OP's point, I think; the search engine is fuzzy because it wants to show you many things for profit reasons. I don't think it's geared toward maximizing profitability at the expense of what you're looking for, exactly, so much as toward engagement as a proxy for profitability. The more you use the service, the stickier you likely are as a paid subscriber, so they'll happily shovel things that don't match the search (but are similar!...and that quickly jump the shark into being quite dissimilar, but hey, maybe!) to try and keep your eyeballs.
A better demonstration of this handling is searching for movie titles they don't have. "The Princess Bride" on Netflix, for instance, doesn't even tell you "We don't have that", but instead "Explore titles related to: The Princess Bride". And while the first suggestion, The Neverending Story, is kinda a fair suggestion, some slightly later ones, like Top Gun and Zoolander, feel like a stretch.
I'm largely guessing. I don't watch movies, but I read a lot of manga online, and the only site that allows similar queries and still has UX a bit better than the late '90s (i.e. mangaupdates.com) is a fan-made (i.e. pirated) one with porn/hentai...
I use this feature a lot, and it works.
Block results from specific domains on Google or DDG:
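As a rough sketch (not necessarily the exact filters meant here), uBlock Origin cosmetic filters of the form `site##selector:has(a[href*="domain"])` can hide result blocks that link to a given domain. The Python helper below just generates such lines from a list. The `.g` result selector, the `google.*` site prefix, and the domain names are assumptions for illustration; search pages change their markup often, so the selector would need checking against the live page.

```python
def cosmetic_filters(domains, site="google.*", result_selector=".g"):
    """Build one uBlock Origin cosmetic filter line per blocked domain.

    `site` and `result_selector` are assumptions about the search page,
    not values taken from the thread.
    """
    filters = []
    for domain in domains:
        domain = domain.strip().lower()
        if not domain:
            continue  # skip blank entries
        # Hide any result container that has a link pointing at the domain.
        filters.append(f'{site}##{result_selector}:has(a[href*="{domain}"])')
    return filters


if __name__ == "__main__":
    # Placeholder blocklist; replace with the domains you want hidden.
    for line in cosmetic_filters(["pinterest.com", "example-spam.test"]):
        print(line)
```

The printed lines could then be pasted into uBlock Origin's "My filters" pane, after adjusting the selector for the search engine in question.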
And it's even possible to target element content with a regex using the `:has-text(/regex/)` selector. Bonus content: Ever tried getting rid of Medium's obnoxious cookie notification? Just nuke it from orbit on all domains:
Filtering out the spam only removes the clones; it doesn't get the good results back in.
The time spent (including maintenance) will be paid back faster than you might expect.
Optionally rewrite some sites to altfronts like nitter/scribe/piped. If you care about spending time on privacy and decoupling searches from visits, you can set up arbitrary proxying rules.
One benefit among others over browser extensions is that it's a one-time setup for all your devices and clients. All you need to do on reinstall is to change the default search engine.
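To make the idea concrete, here is a minimal sketch of the two steps such a self-hosted layer performs on a list of result URLs: dropping blocked domains and rewriting some hosts to alternative front ends. It is not the configuration of any particular tool; the blocklist, the `*.example` front-end hostnames, and the function names are all hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical blocklist and rewrite table; fill in your own hosts.
BLOCKED = {"pinterest.com"}
REWRITES = {
    "twitter.com": "nitter.example",
    "medium.com": "scribe.example",
    "youtube.com": "piped.example",
}


def _host(url):
    """Lower-cased hostname without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_blocked(url, blocked=BLOCKED):
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = _host(url)
    return any(host == d or host.endswith("." + d) for d in blocked)


def rewrite(url, rewrites=REWRITES):
    """Swap the host for an alternative front end, keeping path and query."""
    target = rewrites.get(_host(url))
    if target is None:
        return url
    return urlunsplit(urlsplit(url)._replace(netloc=target))


def clean_results(urls):
    """Drop blocked results, then rewrite the remaining ones."""
    return [rewrite(u) for u in urls if not is_blocked(u)]
```

Matching on the parsed hostname rather than on a substring of the URL avoids accidentally blocking unrelated sites whose names merely contain a blocked domain.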
Either way, OP's ask was for a way to blacklist results, and I'm providing a method to accomplish exactly that. Edit: The rest is up to Google.
The current solution is either uBlacklist or adding filters to uBlock Origin. Both are linked in this thread: https://news.ycombinator.com/item?id=29546433
https://github.com/h-matsuo/uBlacklist-subscription-for-deve...
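For the uBlacklist route, rules are commonly written as match patterns such as `*://*.example.com/*`, where the leading `*.` covers the bare domain as well as its subdomains. The sketch below, with placeholder domains and a hypothetical helper name, turns a list of domains into a de-duplicated rule list of that form; it is an illustration, not the content of the subscription linked above.

```python
def ublacklist_rules(domains):
    """Return sorted, de-duplicated match-pattern rules, one per domain."""
    # Normalise case and whitespace so duplicates collapse.
    cleaned = {d.strip().lower() for d in domains if d.strip()}
    return [f"*://*.{d}/*" for d in sorted(cleaned)]


if __name__ == "__main__":
    # Placeholder domains for illustration only.
    print("\n".join(ublacklist_rules(["clone.example", "example-spam.test"])))
```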
Google might have been better in the past, but since there is absolutely no serious competition whatsoever from a market perspective, Google technically doesn't really need to care about the quality of its search results anymore, only maximizing profits.