Given the current high-ranking thread about spammy sites in Google results, it strikes me that a very simple solution would be to let logged-in users blacklist sites.
Bam, no more wareseeker or efreedom.
This would solve a lot of people's complaints in one fell swoop.
There are Greasemonkey etc. scripts to do this, but they're tied to a single browser on a single machine. A global filter (like in Gmail) would be so much more useful.
Would this be particularly hard to do?
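To make it concrete: the filtering step itself is trivial. Here's a minimal Python sketch of a server-side, per-account blacklist (all names here are made up for illustration; this is obviously not how Google's stack actually looks):

    # Minimal sketch of a server-side, per-account blacklist filter.
    # All names are hypothetical -- just an illustration that the
    # filtering step itself is cheap.
    from urllib.parse import urlparse

    def filter_results(results, blacklist):
        """Drop any result whose domain is on the user's blacklist.

        results   -- list of result URLs, highest-ranked first
        blacklist -- set of banned domains, e.g. {"efreedom.com"}
        """
        def domain(url):
            host = urlparse(url).netloc.lower()
            return host[4:] if host.startswith("www.") else host

        return [r for r in results if domain(r) not in blacklist]

    blacklist = {"efreedom.com", "wareseeker.com"}
    results = [
        "http://stackoverflow.com/questions/123",
        "http://www.efreedom.com/Question/1-123/copy",
    ]
    print(filter_results(results, blacklist))
    # -> ['http://stackoverflow.com/questions/123']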
Plus, we are talking about a company whose core business demands that it can identify groups of bad-faith voters. Given time, they may find a way to incorporate this data safely into their rankings (if anyone could, it would be Google).
And I know there are extensions to do this (mine mysteriously stopped working recently), but doing this on the client-side in a way that's bound to a single browser install just seems wrong to me, especially for Google.
As mentioned above, introducing shared rankings via the social graph would be the next logical step. It could be opt-in to ease adoption.
Then, ideally (and this is my personal 'white whale' problem), the user could whitelist through no deliberate action at all, rather than having to do any work to block: treat each click on a search result as a personal ranking upvote (there's a sketch below).
There are some interesting engineering issues with per-user indexing, though, but hey, you wanted to work at Google, right?
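For the curious, a toy sketch of the click-as-upvote idea. The scoring rule, weight, and log damping are all assumptions picked for illustration, not any real ranking formula:

    # Hypothetical sketch of "clicks as personal upvotes": keep a
    # per-user count of clicks by domain and nudge those domains upward.
    import math
    from collections import Counter
    from urllib.parse import urlparse

    def rerank(results, click_counts, weight=0.3):
        """results: list of (url, base_score); click_counts: Counter of domains."""
        def score(item):
            url, base = item
            domain = urlparse(url).netloc
            # log damping so a heavily clicked domain can't dominate everything
            return base + weight * math.log1p(click_counts[domain])
        return sorted(results, key=score, reverse=True)

    clicks = Counter({"stackoverflow.com": 40})
    results = [("http://efreedom.com/q/1", 1.0),
               ("http://stackoverflow.com/q/1", 0.9)]
    print(rerank(results, clicks))  # SO now outranks the scraper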
efreedom is monetised by Google ads. That alone might make this look like a problem to Google.
Let's say it starts with personal blacklists. Then trusted lists that you can subscribe to (AdBlock-style). Then word spreads, and enough people are using it that AdSense revenue drops 20-30% or more?
(IME, CTR on ads is much higher on these content-light sites than it is on more reputable sites.)
It's to Google's benefit that people end up on these pages, see a ton of ads, and then click on one out of confusion or desperation.
Facebook, on the other hand, has developed a system where nearly every user activity creates a new easily processed and meaningful connection between users or out to the web itself. And those connections are probably closer to representations of some kind of trust than "I email that person a lot".
Anyway, I'm not saying the sky is falling for Google, just that search appears to be changing for the first time in a while.
Why?
99% of users are non-tech oriented.
Those users will not really be aware of the specific problems with the search results, they won't understand the concept of a good vs bad result and they certainly won't bother to tweak/ban/filter their results.
The 1% that do care and are currently being vocal about it will start filtering their results and they will perceive that the problem is solved. They will stop making a fuss.
So now, the complaints have gone away, but 99% of users are still using the broken system, so the good sites that create good original content are still ranking below the scrapers and spam results for 99% of the users.
The problem must be solved for all (or at least the majority) of users.
(And you can't take the 1%s filtering and apply it to all users in some kind of social search because the spammers will just join the 1% and game the system)
Perfection is the enemy of good enough, and a common, valued, and traditional mechanism for delaying product shipment.
And Google might well be able to use information from that 1% of users who have sorted this out - 1% of a Really Big Number of searches, after accounting for the folks looking to game the results (downward, in this case) - as feedback into their search rankings.
I disagree. Let's call it 95%.
>Those users will not really be aware of the specific problems with the search results, they won't understand the concept of a good vs bad result and they certainly won't bother to tweak/ban/filter their results.
So have only people who have enabled the advanced features of Google search ban sites. All of a sudden, only the people who "get it" are the ones who can ban.
>So now, the complaints have gone away, but 99% of users are still using the broken system, so the good sites that create good original content are still ranking below the scrapers and spam results for 99% of the users.
So we need to use the votes to stop the spammers.
>(And you can't take the 1%s filtering and apply it to all users in some kind of social search because the spammers will just join the 1% and game the system)
Sure you can. If you couldn't, then Reddit would be a wasteland of ads, but it isn't. They only have 4 or 5 engineers there, and they can write code that stops vote rings; Google certainly could.
Stopping vote rings is actually a pretty simple exercise, unless the anti-vote-ring code is open sourced, and even then it should be possible.
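To illustrate, here's a naive sketch of one co-voting signal - flagging pairs of accounts whose voting histories overlap far more than chance would suggest. This is not Reddit's actual code, and real systems layer many more signals on top:

    # Naive vote-ring signal: pairwise Jaccard overlap of voting histories.
    from itertools import combinations

    def suspicious_pairs(votes, threshold=0.8, min_votes=10):
        """votes: dict user -> set of item ids they voted on.
        Returns pairs whose Jaccard overlap exceeds the threshold."""
        flagged = []
        for a, b in combinations(votes, 2):
            va, vb = votes[a], votes[b]
            if len(va) < min_votes or len(vb) < min_votes:
                continue  # too little history to judge
            jaccard = len(va & vb) / len(va | vb)
            if jaccard >= threshold:
                flagged.append((a, b, round(jaccard, 2)))
        return flagged

    votes = {
        "alice": set(range(100)),
        "bob":   set(range(100)),      # votes on exactly the same items
        "carol": set(range(50, 160)),  # normal partial overlap
    }
    print(suspicious_pairs(votes))  # [('alice', 'bob', 1.0)]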
I think 99% of email users have not been adequately trained in why or how they should report spam, and even if they were I think most of them would still not care enough to actually do it with any regularity.
When pushed, many may acknowledge that they know it exists, and they can probably even find the button when asked. But they won't remember to do it when they see spam; they'll just ignore it and move on to the messages from people they know.
With search results, the spam is more often than not Made for AdSense sites that the average user doesn't realize are pure garbage. Then there are the mass-produced content sites like eHow that most technical people realize are worthless, but the average user loves. It isn't often you see Viagra sites popping up in searches for woodworking. It does happen occasionally though.
So no, I am pretty confident a majority of users would not use a feature like that effectively.
And the same goes for search results. People browse and click through AdSense-filled "landing page" sites, and most of them think it's their own fault that they couldn't find the thing they were searching for.
What I think really needs to be exploited is a ring-of-trust aspect. I'd like to have a Hacker News ring where all of us on here work together to remove the spam from our results and let Google see what we're taking out; maybe that will help them improve their algorithms.
Why not apply that reading-level algorithm to users' Gmail data and public social-network profiles, estimate each user's IQ, and then give those at the top of the pile "result-burying" moderator privileges?
Confirmed user accounts (cell-phone verification), combined with other signals such as profile age and activity, could make spamming complex enough to disincentivize all but the most determined spammers (sketched below).
Users at the bottom of the IQ pile (non-logged-in users, judged on past search data and geo-located socio-economic status) don't even get the option to bury results - which, by the way, I think is more like 20% of US internet users than 99%.
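Setting the IQ satire aside, the verification-plus-history part is concrete enough to sketch. A toy trust score, with entirely made-up weights and thresholds:

    # Toy (hypothetical) trust score: phone verification plus account
    # age and activity gate who may bury results. Weights are invented.
    from datetime import date

    def trust_score(phone_verified, created, searches_per_month, today=None):
        today = today or date.today()
        age_days = (today - created).days
        score = 0.0
        score += 0.5 if phone_verified else 0.0            # strongest single signal
        score += min(age_days / 365, 1.0) * 0.3            # up to 0.3 for a year-old account
        score += min(searches_per_month / 100, 1.0) * 0.2  # up to 0.2 for steady use
        return score

    def may_bury(user_score, threshold=0.7):
        return user_score >= threshold

    s = trust_score(True, date(2009, 6, 1), 250, today=date(2011, 1, 10))
    print(round(s, 2), may_bury(s))  # 1.0 True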
Traditionally Google seems to be against human-powered editing (which this would be), but with the black-hat SEOs running rings around them, it's badly needed.
What I'm trying to get at is: all things being equal, say Stack Overflow's and efreedom's SEO are on par with each other - shouldn't SO's reputation and inbound-link rank automatically trump things?
SO is not editing the material for SEO, they just have whatever content the users generated.
1) It doesn't have to be extremely painful, just painful enough that true loathing is needed as motivation. That way we filter out frivolous decisions. A few seconds' pause would be enough.
2) We need to let the reduced ad revenue do the job for us through the market. Anything else will be gamed, much to everyone's detriment. Just empower people to remove the annoyance, and let the money do its thing.
Re 1: painful? WTF? The whole point is to make it quick and usable. I can already blacklist sites the painful way, by adding them to a Google Custom Search page. What I want is a quick add-to-killfile button, like email clients have had for decades.
http://radleymarx.com/blog/better-search-results/
CSE wants you to list the sites you want to search. Of course, you can't default to '*' or '*.*'. They even state that '*.com' and '*.org' etc. won't return any results. That's unacceptable. Secondly, even if you could configure it meaningfully, it seems pretty hard to configure your browser's search bar to use this CSE instead.
And that's what I think most people use for searching. At least I do.
Facebook got it right this time: with each post, there's an option to hide that post, that person, that application, or the site it came from. One click that means "don't show stuff from them anymore": that's what Google needs, too.
By this I mean I added it to my browsers, but I still use regular Google search daily. If the results are laden with bogus sites, then I switch over and start again, weeding if necessary.
Initially I thought I'd use GCS all the time, but it lacks the Google menu (Images, Maps, etc) which comes in handy more often than I expected. I use GCS most for code/development related searches.
http://news.ycombinator.com/item?id=2075437
It's not so much that it's a knowledge vacuum, just that someone didn't read the whole thread before replying.
If you go to the CSE website and select 'Advanced' and then download annotations, you can export the list of sites you've excluded.
Further, you can turn the exclusion list (the "annotation list") into a feed - so it is entirely possible to implement the kind of user-generated blacklist of sites that has been discussed here.
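For example, here's a rough Python sketch of generating such a feed. The Annotations/Annotation/Label XML structure matches what the "download annotations" export gives you, but the label name is specific to your engine, so treat the one below as a placeholder you'd copy from your own exported file:

    # Sketch: build a shareable CSE-style exclusion feed.
    # The label name is a placeholder -- copy the real one from your
    # own downloaded annotations file.
    from xml.sax.saxutils import quoteattr

    def annotations_feed(domains, label="_cse_exclude_YOUR_ENGINE_ID"):
        lines = ["<Annotations>"]
        for d in sorted(domains):
            lines.append(f"  <Annotation about={quoteattr(d + '/*')}>")
            lines.append(f"    <Label name={quoteattr(label)}/>")
            lines.append("  </Annotation>")
        lines.append("</Annotations>")
        return "\n".join(lines)

    print(annotations_feed({"efreedom.com", "wareseeker.com"}))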
GCS is really easy to set up - it takes only a few minutes. I spent the most time hunting down rogue sites, which was actually kind of fun and cathartic.
Big tip: keep an easy-to-get-to link to the GCS Sites Control panel, so it's easy to add new sites. I've added ~40 more in the past two months.
If they're not already looking into integrating that nicely into the existing search results page (not a separate form that the average user will never find or use), especially after all the recent internet chatter about it, then they should make it a top priority for 2011. I definitely don't want them to do a rush job on it, though: I don't want competitors to start reporting each other as spam to game the system even further. I'm assuming they have anti-gaming measures in place for Gmail, so they won't be starting completely from scratch...
At best G could use the information as a list of potential spammers and filter domains manually, but I really can't see this being automated without giving the SEOs another weapon.
Also there is this form for reporting spam sites: https://www.google.com/webmasters/tools/spamreport
Integrating the above into standard search results would be difficult unless it was restricted to users with good "karma". That might be possible in our increasingly socially networked world.
Perhaps we need to frame the discussion differently, considering what the searcher wants, rather than "spam-free hits".
If Google use that information to gradually adjust their ranking overall, then fair enough -- won't affect me, I can't see them anyway.
EDIT: Even if they don't let that affect everyone else's results (because of gaming), then I still don't care, I still don't see the crap in my results ever again.
As a workaround, try searching for "[any widget] sucks" and "[any widget] good".
EDIT: tying this to other discussions on the topic, it's a symptom of Patio11's observation that natural language search doesn't work very well. If you want to find something, you need to paint a picture of what it looks like, rather than asking a question about it.
Everyone is ripping off someone's content.
And just to be accurate here, SO content is Creative Commons (created by the community). Are those just cheap words?
These add significant value to the original content, IMO.
In my experience, though, the sites that are taking the content are ad-ridden messes that remove value rather than add anything.
Maybe it's like that in your field, but in Mac dev questions you're fairly likely to get answers from established OS X developers, and even Apple employees.