For example, Google seems to want full sentences instead of just keywords now. "How do I do X?" seems to get me better(?) results than "X + some relevant keyword". But I can't seem to get past the "most popular responses" Google thinks I need. I do appreciate YouTube videos marked at certain times, but watching video isn't always what I want to do. Tangentially, has Google search been integrated into YouTube search or something now? I used to be able to search for obscure music on YouTube. "Sal dulu a" would both suggest "Sal dulu antasma" and list it, but now unless I search for that exact title, it doesn't show up.
Any pro tips on how to google (or use search engines) like a modern human would be appreciated. Or a modern version of Google dorking (which also seems not to work like it used to for me). Thank you.
I've used DDG for the past ~5 years, and it is typically worse unless I use a !bang like !so for technical queries. I guess that is what the web has evolved into: knowing which mega-site you want to search rather than discovering new sites?
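For anyone who wants to script the bang habit, here is a minimal sketch. It assumes only that DuckDuckGo accepts the query in the `q` URL parameter and that `!so` routes to Stack Overflow; the helper name `ddg_bang_url` is made up for illustration.

```python
from urllib.parse import urlencode


def ddg_bang_url(bang: str, query: str) -> str:
    """Build a DuckDuckGo search URL whose query starts with a !bang.

    `bang` is given without the leading "!" (for example "so").
    """
    # urlencode percent-encodes the "!" and turns spaces into "+".
    return "https://duckduckgo.com/?" + urlencode({"q": f"!{bang} {query}"})


print(ddg_bang_url("so", "python sort dict by value"))
```

This only builds the URL; opening it in a browser (or via `webbrowser.open`) is left to the caller.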
Also, I am by now 100% sure that Google has just stopped indexing the long tail. Like if I search for function names of public source code that I downloaded from GitHub, Google won't find it. But of course, it's still on GitHub.
Similarly, Google will sometimes not find a single result for some Windows API function names, despite them being publicly documented on docs.microsoft.com.
Then searching for "odicht" out of curiosity, they auto-correct it to "olight". So I start with almond-based sugar sweets, follow their auto-correct twice and now I'm staring at headlamps. And even Google has no idea what "odicht" might have been, so I really wonder how Amazon decided to auto-correct from an existing product into a non-word.
Searching for "odense marzipan" including the quotes then works, but it yields the cringe-worthy message:
Your search ""odense marzipan"" was automatically translated into "„odense marzipan“".
(where the only difference between the first and the second thing is that they converted the ascii quotes to up and down sentence quotes)
Rather weird if true, but I can't really disagree with your observation. It seems like large parts of the web have disappeared in the last five to ten years.
Google most likely does index the sites, but its current algorithm just doesn't use them, because it is as much a promotion algorithm as it is a search algorithm.
Google News too has become flaky. Often does not find stuff you know is there, or finds it one day, but not another. Hrmph.
What does the business model look like? Ads (it would be in front of a very valuable audience of technical folks)? Or paid subscriptions (perhaps the community votes which resources get crawled / indexed)
Most of my searches are for really old pages or really long tail stuff and Google just simply doesn't bubble them up, if it has them at all. I keep finding web sites lately from links on other sites and find myself asking "Why the fuck did Google not find this?" .. then I go back to Google and try to find it with keywords from the site, and nothing...
This filter takes care of that box completely:
This happens _all_ the time on the Twitter app search bar.
And somewhere a team of designers and PMs got their bonus for increasing the engagement OKR. Clearly users love the animation and added delay because look at the metrics skyrocket!
> Results for exact term [...]. If no results are found, we'll try to show related results.
But very regularly it fails to find results I know exist in not so unpopular places.
[0] https://help.duckduckgo.com/duckduckgo-help-pages/results/sy...
This really got to me about 6 months ago, so I changed all my default searches on all my browsers and mobile to DDG, and haven't looked back. I tried DDG ~5 years ago, and there was no way it could have replaced google for me then, but when I did it 6 months ago, it didn't seem any worse, maybe a little better.
You'd think from the responses herein that using quotes was a panacea.
It isn't. I see what you see. I think it's ignoring the quotes.
DDG is heading in the same wrong direction too. Today I searched for an ecommerce platform called Comerzzia and it showed me some Comerzia or Comercia or whatever shops near me. It shows maps if it thinks they're related, and apparently I can't disable that feature.
The fundamental, unavoidable problem is that the cost of providing high-quality results on the long-tail of possible searches tends to grow faster than the revenues that can be earned from those increasingly rare, obscure, long-tail searches. Any search service seeking to maximize profit, like Google or DDG, ultimately always evolves to perform less and less well on the long tail of possible searches.
The search service we all wish we could have -- a service seeking to maximize the quality of individual searches, no matter how obscure -- may not be feasible as a profit-maximizing business.
I think even two years ago, Google searches had far more depth and yet Google was quite profitable (back then the searches were still biased, but now stuff is simply gone). Sure, if someone looked at the marginal profitability of every single search result, it would look like what we're seeing. But there was a time when good indexing of stuff that didn't turn a profit by itself was done as a service to attract people to Google and/or to improve the Internet generally. That time has clearly passed, but it was a decision.
What are the ways to direct more air into the likes of you.com? You.com was a “Show HN” topic 3 weeks ago[1]. The you.com improvement in 3 weeks is noticeable.
This post has been simplified for sentiment parsers...
[1] https://news.ycombinator.com/item?id=29165601
I have a directory of DuckDuckGo !bang operators, for easy access: https://mosermichael.github.io/duckduckbang/html/main.html
This helps to find the specialised search engine, if you need one. (It scans DuckDuckGo for any new !bang operators in a nightly build; here is the script: https://github.com/mosermichael/duckduckbang )
I suspect that most of these specialised search engines are powered by Elasticsearch. It may be that Elastic is starting to cut into Google search from the low end.
Try searching for any product outside your wheelhouse and it quickly devolves into an undergrad research endeavor.
I can't trust the first or second page results because of SEO. Then every page after quickly veers off topic or just features sites that aren't as good at SEO.
Google is simply maximizing profits by giving users results that would cause either more clicks on ads or show more ads. Its mission is to make money this quarter/year. If you believe any of their Silicon Valley-style new age talking points you probably don't have critical thinking skills.
If their products are getting worse for you perhaps you are not part of a profitable segment for them.
I remember once searching for how common same-sex relationships among teenagers are in Japan, as some say it is very common, and all I received were political opinion pieces that did not in any way come with the numbers I sought on Google, so I then tried DuckDuckGo and to my amusement what I received with the same query was mostly pornography.
Neither particularly useful, but the contrast in how both prioritize was interesting to me.
Maybe it's just me but those always seem vastly gratuitous. Like shouldn't the engine figure that out automatically? It's like half its job.
It’s almost as if the government and big companies have spent a lot of effort understanding human biology, cognitive function and applying what was learned.
While selling the masses on a contrived story keeps them believing there’s a universe of infinite life available to humanity if you just follow these steps…
Things like human colonization of space, and political memes about wealth are taking advantage of the same biological quirks as religion. It’s just now we can quantify the effects rather than wave it off as mysticism.
But the human story is already set on a path of building a bridge to nowhere.
You can sort of fight it by including a term showing your topic in your search, I think.
Based on the search box suggestions I get, it seems many people work around this by appending reddit to their searches. If I search for "warmest winter coat" it's a bunch of untrustworthy content marketing until you try something like "warmest winter coat reddit"
Unfortunately I prefer to avoid reddit (which also has a fair amount of astroturfing), but I haven't found a good alternative. I severely miss Google's old "discussions" (or was it forums?) filter.
For some reason though (probably because they used AMP) they basically allow them to do anything they want. Multiple popups, hijacking click events for login modals, and hiding the content with no impact to search results.
So now, in all the glory of the Internet, the person who genuinely wrote the best blog post on the "warmest winter coat" is completely unfindable on normal Google search or you force yourself via a reddit query to a completely hostile user experience unless you login.
It would be cool to just be able to do something like this "warmest winter coat --hobbyist-only".
I like to compare with search.marginalia.nu results from time to time, but the restrictions it puts on the content it traverses do not make for a good daily driver.
warmest winter coat --hobbyist-only
Top 10: 1. Canada Goose Parka 2. Patagonia Down Jacket 3. Marmot Precip Jacket 4. Columbia Winter Jacket 5. North Face Thermoball Jacket 6. The North Face Nuptse Jacket 7. Rab Neutrino Endurance Jacket 8. Mountain Hardwear Ghost Whisperer Jacket 9. Black Diamond Fineline Hoodie 10. Outdoor Research Cathode Hoody
Google should be opinionated. It should have a huge bias towards quality. It should not be hard for a small army of employees to be blackholing ANY crap product roundup site. Real product tests, where multiple items were actually purchased and compared, should always float to the top.
Just as much as needing to pay attention to what spam to suppress, they should be asking "what do we want at the top" and whitelisting really great sources that always cut in front. Why should healthline ever appear before examine.com?
Instead, they have thrown their hands up, said the algorithm is in charge, and to interfere with it would be improper. Bollocks.
when i search for something specific, i usually include a random, niche, tangential, hyper-specific keyword about the thing i want in quotes (until it gets turned into the SEO buzzword of the day)
"impedance" for analog electronics stuff, "ring-spun" for clothes stuff, etc
Sometimes I feel like the internet isn't for me anymore, and that's a little distressing.
We need a PBS/NPR of search engines.
But it is still not so dire. I went back to bookmarks, reading lists and keeping note of writers I check out. It's not bad at all as long as I keep in my interest bubble. Google or not, I still would prefer today's internet world to the decades before the internet.
And of course, new search engine, something distributed and in the GNU domain.
Ironically we’re watching this play out now with products and techs that market themselves as “decentralized.” Maybe after this phase the tech community will consider this isn’t something we can tech our way out of.
banning advertising would help for a while, but other profit streams would be optimised and expand to fill the void, such as data collection/mining
I can only presume that so many people have given up on the web as indexed by Google and are just searching for "<whatever> reddit" now as the only way to get any kind of content written by real people on a subject instead of SEOd filler "content".
Presumably it won't be long before Reddit itself is flooded with spam content to take advantage of this - I'm sure it's already happening to high value keywords.
This happens on Amazon reviews all the time as well.
What I end up doing is trying to find a post that isn't all-in on any specific solution... but lists pros and cons of multiple options, because it seems less likely that a content advertiser will post anything negative (or positive about a competitor).
The first result is a list from a blog by some "Emergency prep guy"; it basically lists 27 coats with information.
The second result is RT online with Black Friday recommendations.
The third is Oprah Daily with recommendations and shop links.
So, reddit is also, as you said, full of false information + astroturfing as well. Besides not everyone is interested in diving into reddit rabbithole to find information on warm coats.
What do we want Google to do? It tries to blend whatever is available. I don't think Google got worse at this particularly, but it is probably a hopeless pursuit considering the state of the web. As for forums, adding "forum" at the end seems to work, but I agree it would be nice to have the option in the toolbox.
Most of the time, this is just a list of coats someone googled and copy-pasted info from the marketing pages. This page is an affiliate-marketing site masquerading as a review site.
Not sure if specifically that page is, but that's what the majority of "product review" results in Google are nowadays.
The study was poorly done and there were tons of comments pushing the same message: "vegetarians/vegans are annoying hipsters who will lecture you for eating meat and they'll be so deservedly upset by this."
Found it and most of those comments are deleted now (https://redd.it/qskxol). Is the meat industry losing a sizable chunk of profits to more people swearing off meat for moral reasons, or ditching meat as a financial decision?
Edit: Threw that link into a website that restores deleted comments (https://www.reveddit.com/v/science/comments/qskxol/meat_cons...).
Mods deleted all references to the fact that the study was funded by a beef company. Blatant corruption?
SEO blogs are full uncanny valley for me.
It's also funny how Google basically nuked groups and made it unsearchable, while once in a blue moon you get a search result to alt.coats.winter or something
If I'm asking Google or DDG for advice on a product, it's either going to be Reddit or Wirecutter for me. 99% of searches for "best *" result in _*literally hundreds*_ of domains like "best*for2021.com" "buybest*.com" "top10*reviews.com" that are all generated by bots, containing only the worst knockoff / counterfeit / Chinesium products and tons and tons of Amazon affiliate links.
E.g. I was trying to remember the name of a top-of-the-line soldering station brand (Metcal) I used back in college, so I kept trying permutations of "best professional soldering rework station" on Google [0] and DDG [1], but they only came up with low-end Chinese stations and a few mentions of Weller and Hakko: no impartial reviewers, no forums or blogs, no discussions... nothing leading to Metcal.
Then I searched "best professional soldering rework station site:reddit.com" [2], clicked the first 3 links, scrolled, and found Metcal on the second hit. [3]
I was surprised to see Wirecutter did a review [4], and arguably the Hakko FX-888D is the best soldering station ever made (and the X-Tronic is a fine budget runner-up) for *_MOST_* people, but it's still not a Metcal (the thermal capacity and regulation of their iron tips is just unparalleled even with nice Wellers and Hakkos - you can really feel the difference when working with THICC power ground planes and RF connectors).
[0] https://www.google.com/search?hl=en&q=best%20soldering%20rew...
[1] https://duckduckgo.com/?t=ffab&q=best+soldering+rework+stati...
[2] https://duckduckgo.com/?t=ffab&q=best+professional+soldering...
[3] https://www.reddit.com/r/electronics/comments/2c4hnl/best_so...
[4] https://www.nytimes.com/wirecutter/reviews/best-soldering-ir...
On the other hand, a different search for "R-values of winter coats" produces a few real gems, like https://outdoorcrunch.com/jackets/
> "R-values of winter coats"
This is a valid alternative... but I don't want to be an expert on winter coats to be able to Google basic information. I'd have to weed through a fair amount of marketing content to even find that the phrase "r-value" exists. In the past this wasn't necessary.
The internet used to be primarily a place for people to connect and share information... and now it feels like primarily a place to be advertised to. There's also the fact that many ads have evolved beyond simple billboards into psychologically manipulative clickbait.
It's completely anecdotal and tangential to this topic, but I have a sneaking suspicion that the way marketing manipulates people has created unhealthy amounts of skepticism that further fuels the affinity for conspiracy theories... which tend to be so toxic that they're almost inoculated from marketing.
You can search "warmest winter coat site:reddit.com" and filter by past year, only to get a result from 7 years ago.
I really don't know how to search the web anymore.
Eg.
'good cheap mountain bike' -> 10 results out of the first 10 are commercial spam and listicles.
'thread good cheap mountain bike' -> 5 human discussions on entry-level MBs, 1 link to a MB forum home page (not a specific thread), 2 commercial spam, 1 paywalled magazine article testing MBs, and 1 online shop product page for a MB that happened to mention a "73mm Threaded BB shell" multiple times.
This always gets brought up but the problem is far deeper:
When I search for:
> "weirdly specific ab345"
and the results contain thousands of pages without "weirdly specific ab345" then the problem isn't spam sites.
It is Google not respecting my queries.
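If you already have result text in hand (from an API, an export, or pasted by hand), a crude workaround is to re-check the quoted phrase yourself. The sketch below assumes each result is a plain dict with a `text` field; that shape and the function names are my own invention, not anything a search engine returns.

```python
from typing import Dict, List


def contains_verbatim(phrase: str, text: str) -> bool:
    """True if `phrase` occurs in `text` exactly, ignoring only letter case."""
    return phrase.casefold() in text.casefold()


def filter_verbatim(phrase: str, results: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the results whose text really contains the quoted phrase."""
    return [r for r in results if contains_verbatim(phrase, r["text"])]


# Hypothetical result set: only the first entry honours the quoted query.
results = [
    {"url": "https://example.org/a", "text": "Notes on the Weirdly Specific AB345 part"},
    {"url": "https://example.org/b", "text": "Top 10 weirdly popular gadgets"},
]
print([r["url"] for r in filter_verbatim("weirdly specific ab345", results)])
```

It doesn't bring back pages the engine never returned, of course; it only throws out the ones that ignored the quotes.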
- captcha (because vpn)
- spam results (based on location, my location was never very good for technical content)
- paywalls
- no pictures cause photos is now paid (i've signed up for unlimited forever)
And on bing basically I get shopping coupons, games and, well, and microsoft's "anything's valid, except customer sat" approach.
When I want information about a product, I join the discord forum associated with that hobby and I ask for recommendations.
Since Discord is a chat service like IRC, I get replies from humans instead of shady astro-turfed websites.
My deepest apologies for saying this, but for any type of query that has a monetization angle, I now add "site:www.reddit.com" to the query to find actual discussion about it.
Normal Reddit disclaimers apply as much of what you find is garbage but at least if you search "best exercise bike" confined to reddit you'll get real opinion not hellbent on monetizing you.
https://www.tomsguide.com/best-picks/best-exercise-bikes
https://www.cyclingweekly.com/news/best-exercise-bikes-40742...
https://www.menshealth.com/fitness/g23064646/best-exercise-b...
Not sure if they are amazing results but it's decent for such a generic query.
I feel like this exact question was asked in a meeting then peloton appeared. It turns out a bike also needs to have an ipad strapped to the front with a fitness instructor reading your name off a list of connected users and offering personalized praise. That'll be $1500 up front then $40 a month please.
I slightly worry when I have some product I rely on in my daily life that just does its job well. How long until the corporate race to the bottom hits this industry too?
It’s infuriating.
But now that I've typed that, I can see exactly why they haven't responded: a search company telling the world that its search results are f**ed. That would do wonders for the stock
I've been on both sides of paid reviews on Reddit.
I still do the same thing because in some subs you'll get actual conversation in the comments, but it's definitely being manipulated.
Instead of fixing the spam they are instead encouraging companies to spend more and more time on SEO and coming up with their own shenanigans like better ranking for using AMP (defunct now).
People who generally make great content (think a researcher or a great software maker) can't compete with billion-dollar companies like Canva, Shutterstock and Pinterest, who spend millions of dollars on SEO and have dedicated SEO employees who spend all day sending outreach emails and doing experiments. Hence the good content never even sees the light of day; it is drowned out by all this "SEO"-optimized content.
FWIW i still believe it's the job of the search engine to find great relevant content and show it to the user instead of the other way round. Though I know it's much easier said than done.
(1) https://news.ycombinator.com/item?id=25538586
I do not think this is true, some businesses got special stuff in the white pages.
- Many of the problems are self-inflicted (dropping search terms, pages of other stuff before first link).
- They’re getting worse faster than their competition.
- Google apparently helped at least one external SEO team game image search into relevance-oblivion.
(Also, “narrative” usually implies “fiction” or “concerted disinformation campaign”, and is either used as a weasel word by liars referring to their own writing/reporting, or it’s used as a pejorative. I don’t think you meant to imply either.)
What did I miss?
The fundamental problem is that Google and the SEO spammer’s interests are aligned. Google is both the search provider and ad network. I think this makes Google tremendously vulnerable to competitors who don’t have that conflict of interest, and presents a massive opportunity to those with enough courage and cash.
Google makes money regardless of the quality or 'originality' of the content your search comes up with so they currently have no motive to change things.
This eliminates the conflict of interest where spammy results kick off ad auctions that the search engine profits from.
At this point, revenue would be proportional to: market-share * expected-number-of-searches-before-success
So, the dominant player (with market share near 100%) would be incentivized to make the first search or so produce crappy results.
2) This incentive can be eliminated by throttling the rate at which ads are delivered to a given user (such as by repeating the same ads for free on search refinements, or simply skipping ads on search refinements).
DDG is still in step 1, where the upside on market share is much greater than the upside from spamming low quality results to increase incremental ad display rates.
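To make the incentive argument concrete, here is a toy calculation of the "market-share * expected-number-of-searches-before-success" proxy, with an optional cap standing in for the ad throttling idea. The numbers and the function name are invented for illustration; this is not a model of any real ad system.

```python
import math


def relative_ad_revenue(
    market_share: float,
    searches_before_success: float,
    max_billable_searches: float = math.inf,
) -> float:
    """Toy revenue proxy: market share times the searches that actually show new ads.

    `max_billable_searches` models throttling: refinements beyond the cap
    repeat or skip ads, so they add nothing to revenue.
    """
    billable = min(searches_before_success, max_billable_searches)
    return market_share * billable


# Without throttling, worse results (3 searches instead of 1) pay more.
print(relative_ad_revenue(0.9, 1), relative_ad_revenue(0.9, 3))
# With a cap of one billable search per task, degrading results earns nothing extra.
print(relative_ad_revenue(0.9, 1, 1), relative_ad_revenue(0.9, 3, 1))
```

Under these made-up assumptions the cap removes the payoff for making the first search worse, which is the point of step 2.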
All of these threads devolve into anecdotes and reminiscences about the "good old days" and complaining about Pinterest. None of which is in the least quantitative.
I'd be interested to see some actual data or research on the subject, if it exists.
Or maybe it's not Google that's gotten worse but the web itself? Again, quantitative results, please, not anecdata.
Google has. They use this data expertly to improve search. Common sense and technological advancement tell us that, quantitatively, Google search has become better year over year, for all their relevant metrics/cost functions.
And likely, exactly because it has become better for all its users in aggregate, it has to become a bit worse for a certain group of power users. There, we can only rely on anecdotes and personal experience, but these tell us it actually has gotten worse.
Similarly, the web can become both worse and better. The really useful articles today are better researched, multi-modal, solid web of links, internet-first. Spam has also evolved. And "top 10 ways to do X"-McContent outranks better articles, because that is what the majority of Google users wants to see and clicks on. They truly have a better experience, while others' experiences suffer. It depends on what you measure.
lmfao. so you're telling me "quantitatively" that google search results have gotten better, without citing any data at all, but with an appeal to common sense and "technological advancement"?
what if i told you that search is an adversarial problem, and that it's possible for google's tech to be getting better slower than the aggregate tech power used to game google search is getting better? is this not a patently obvious possibility? it's not some kind of gotcha impossibility for google's tech to get much worse over time, even if they weren't hamstringing themselves by lots and lots of user-hostile changes which benefit google's interests rather than their users.
I also think you're underestimating average users. Anecdotally I've heard my parents complain repeatedly about the incoherent, auto-generated, affiliate link spam that plagues product searches.
Without access to both Googles, the best you can do is compare across different search engines: special-cased ones like search.marginalia.nu can net you a quantitative feel for what exists out there that's less likely to be content marketing, but I am not sure if you can figure out where those pages rank in Google search results for the same terms programmatically?
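The ranking part can at least be sketched once you have two ordered lists of result URLs, however you obtained them (by hand, or from whatever API you have access to; no fetching is shown here). The function names and the example URLs below are placeholders.

```python
from typing import Dict, List, Optional
from urllib.parse import urlsplit


def host(url: str) -> str:
    """Lower-cased host name of a URL, without a leading 'www.'."""
    netloc = urlsplit(url).netloc.lower()
    return netloc[4:] if netloc.startswith("www.") else netloc


def ranks_in_other(reference: List[str], other: List[str]) -> Dict[str, Optional[int]]:
    """For each host in `reference`, give its first 1-based position in `other`.

    Hosts that never appear in `other` map to None.
    """
    first_rank: Dict[str, int] = {}
    for position, url in enumerate(other, start=1):
        first_rank.setdefault(host(url), position)
    return {host(url): first_rank.get(host(url)) for url in reference}


# Placeholder result lists for the same query on two engines.
niche_engine = ["https://hobby.example.org/keyboards", "https://blog.example.net/switches"]
big_engine = ["https://www.shop.example.com/top10", "https://blog.example.net/switches"]
print(ranks_in_other(niche_engine, big_engine))
```

Comparing by host rather than full URL is a deliberate simplification; a stricter version could compare normalized URLs instead.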
You can also prepare for the future: record some data today, and compare in 10 years time.
It is pretty quantitative yes, but it happened at different times for different people.
But it used to be that when you searched for something you got pages containing the thing you searched for, and if you couldn't find it at first then you could do a search on the page and find out some enterprising scammer had included your keywords in white text on white background.
Today Google and Bing have teamed up with the scammers so they don't need to use such hacks anymore. Google and Bing will include the results even if they don't contain said keywords at all.
Uncharitable? Yes. Do I hope Google and Bing engineers read this and fix it, or do I hope some Russian enterprise launches a better engine?
I actually hope Google finds its way back to its roots! I don't hate you guys, but you really don't make it easy for us, between using all the oxygen in the room and annoying me all the time with useless, time-wasting results.
- There are way more Google users, including grandmas.
- Conversations have moved from discussion boards to walled gardens and chats.
- Google relies more on neural network embeddings, so does a better job when you type full sentences and semantic similarity.
- Google relies on authority signals and incoming links to a website, so non-commercial, hobbyist, or controversial content ranks way lower.
- Websites rely on Google for income, so they start producing what Google and its readers want to see.
- Spammers rely on Google for income, so those surviving after decades have created massively successful linking rings and spam production pipelines based on keyword search statistics.
- You were really good at Google searching years ago, having a harder time updating and letting go of what worked for you. Easier to blame Google for this.
As for tips: Anything academic, search on specific websites or Google Scholar. Anything technical/coding, search on StackOverflow. Anything cultural/commercial where you want a peer answer instead of a salesman answer, search on Reddit. Try to join like-minded communities where you can ask expert questions and research new things in your field. Exact keyword match still works by enclosing the keyword in double quotes.
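As a rough illustration of these tips, a query with quoted exact terms and a `site:` restriction can be assembled like this. The helper `build_query` is hypothetical; the only things it relies on are the double-quote and `site:` operators already mentioned in this thread.

```python
from typing import Optional, Sequence


def build_query(terms: str, exact: Sequence[str] = (), site: Optional[str] = None) -> str:
    """Combine free terms, double-quoted exact phrases, and an optional site: filter."""
    parts = [terms.strip()]
    parts += [f'"{phrase}"' for phrase in exact]
    if site:
        parts.append(f"site:{site}")
    # Drop empty pieces so an empty `terms` does not leave a leading space.
    return " ".join(p for p in parts if p)


print(build_query("best exercise bike", site="reddit.com"))
print(build_query("soldering station", exact=["Metcal"], site="reddit.com"))
```

Whether the engine then honours the quotes is a separate question, as others in the thread point out.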
This is a completely miserable experience, and walls off useful information into classes of people who "are in the know" about where the most relevant information exists.
And if you're that grandma searching for a birthday present for your grandson? Good luck. She's likely to be devoured by ads, if not an outright scam.
Agreed on the miserable experience. Do you have any ideas on how to attack this? Perhaps Google started out with the right experience, but ads eventually toppled it. Perhaps Google never hit on the right experience. What gives?
This is much more like what Ye Olde Webbe was like. Sites competed to build communities that were repositories of information. Things like Reddit tried to build a generic silo so that they could silo information there, which I think is a bad thing long-term.
The biggest problem, as I see it, is sites just give up on doing their own search. Not surprising, as search is a hard problem, but it plays merry hell with the democratization of the Internet to foist the problem off onto Big Corporation Inc. to do the heavy lifting.
A related problem is that many sites simply don't have what could be called a "webmaster" anymore. Everything is contracted out, or part of a subscription service, or otherwise disconnected from the owner of the site having full control. If you're a small business that sells locally produced products, you're never going to appear in Google or Amazon searches, even if you have an Amazon store. You can't afford a full-time webmaster just for your site, and all of the various platforms, like Wordpress/Shopify/etc, deal in such volume that these small businesses will be largely ignored.
The ISV model for products like AutoCAD is possibly a good route. A team of well-versed engineers and designers can build things, but you need a direct customer representative to get at the juicy meat of what the end-user needs. Apply this sort of model to search, and you can aggregate over larger swathes of customers.
Do you find this better? In my experience it’s nicer to just put stackoverflow, reddit, or (often, in my case) seriouseats in my google query. Reddit search in particular is pretty miserable.
That said, your tips are good. Thank you.
Edit: I'd love to see a some-of-the-web search engine like this. Start just with university sites, prepress archives, quality forums, public dev Slacks, etc.
Even though it's a bit broken, it has some lucid moments. Just compare:
https://www.google.com/search?q=mechanical+keyboards
https://search.marginalia.nu/search?query=mechanical+keyboar...
The take-away I want to drive home is that it's absolutely possible to build something the scale of Google ca. 2003 and run it on consumer hardware. I think, due to general difficulties in making these things profitable, the ideal approach is to make the operation so absurdly cheap it can be run non-profit instead. I'll gladly pay out of my own pocket to have a good search alternative.
Oh, and the cherry on top is completely abandoning the idea of Natural Language Processing. Go right back to keywords only.
Unless you have some pretty evil supervillain scheme, no moral compass, and succeed.... that world is never coming back.
Imagine a service that provides you with a personal search engine in exchange for a list of your bookmarks. Those bookmarks provide the signal for what sites to index for the public search engine.
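One way the bookmark signal could work, sketched under my own assumptions (one vote per user per host, and a minimum number of users before a host is crawled at all). None of this comes from an existing service; the names are made up.

```python
from collections import Counter
from typing import Dict, List, Tuple
from urllib.parse import urlsplit


def crawl_candidates(
    bookmarks_by_user: Dict[str, List[str]], min_users: int = 2
) -> List[Tuple[str, int]]:
    """Rank hosts by how many distinct users bookmarked them."""
    votes: Counter = Counter()
    for urls in bookmarks_by_user.values():
        # One vote per user per host, so a single heavy bookmarker cannot dominate.
        votes.update({urlsplit(url).netloc.lower() for url in urls})
    return [(h, n) for h, n in votes.most_common() if n >= min_users]


# Hypothetical bookmark lists from three users.
bookmarks = {
    "alice": ["https://example.org/x", "https://example.org/y", "https://blog.example.net/"],
    "bob": ["https://example.org/z"],
    "carol": ["https://example.org/", "https://blog.example.net/post"],
}
print(crawl_candidates(bookmarks))
```

The `min_users` threshold is there as a crude spam guard; a real system would need something much stronger.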
Right now Neeva seems very good at navigational queries, which I do 90% of the time. It's still not as good at deep research queries for obscure things. Probably related: Neeva is relying on Bing for a big chunk of their queries. But they are building their own index.
https://www.reddit.com/r/vivaldibrowser/comments/pol41p/comm...
Your bubble seems to agree, but the lack of serious competition, even in niches, is a sign that outside the HN bubble Google is in fact not seen as any worse.
https://www.devontechnologies.com/apps/devonagent#editions
macOS only. $5, or $50 for automation + archival.
A new startup would need to dump loads of money into servers and building their own tech.
So there's just no way to recoup the cost of building a competitor.
Not the same thing, but the entire SO website runs on something like two door-sized racks. A search engine might be a different thing, but if you have funding to get started, hardware costs aren’t going to be impossibly huge. Most of the cost is labor (engineering).
I’m curious, how much compute power does it take to index the whole web? I presume queries are super fast.