Readit News
precompute · 2 years ago
This just proves all the "suspicions" privacy-conscious users have had about large corporations fingerprinting users, often in very obvious ways. There's often no better place to find ideas for surveillance than the people conscious about being surveilled.
p3rls · 2 years ago
Many of the SEO suspicions were confirmed too.

I found it VERY amusing that if you go to r/SEO, just yesterday there were moderators and flaired users (you know, the elites of the SEO community, lol) insisting much of this was "debunked" years ago.

They of course deleted their posts, but the threads are still up. What a den of scammers over there.

https://www.reddit.com/r/SEO/comments/1d1eqjj/comment/l5tvfw...

https://www.reddit.com/user/WebLinkr/

I love how reddit is turning into the new SEO scam overnight because of this stuff. Great work as always Danny Sullivan!

p3rls · 2 years ago
It's just endlessly fascinating to me, the grift on r/SEO

How these types first gain moderator status on a few subs and then the spam begins (picture of spam https://pixeldrain.com/u/a6qUPjTq )

I haven't been able to find a single legitimate expert in the entire sub, and I've checked about every flaired user and moderator.

You have lots of people like the above, or https://www.reddit.com/user/jesustellezllc/ that claim to run an agency in Fresno, California called Ozelot Media, but when you look him up there's nothing. When you google "SEO" + "Fresno California", Ozelot Media isn't even in the top 100 results. Lol, I thought that was the job of an SEO type? Why let that stop the grift though?

phone8675309 · 2 years ago
SEO is vandalism and I one day hope the majority of Internet users see that
theolivenbaum · 2 years ago
Seems like a lot of it came from them inadvertently posting some internal API to GitHub: https://github.com/googleapis/elixir-google-api/commit/078b4...
renegade-otter · 2 years ago
I guess too many people got laid off to do the whole "three reviewers per PR" thing!
eru · 2 years ago
When I was at Google (about a decade ago by now), we had two reviews per PR; not three. Could you tell me more about the third review?
dontdoxxme · 2 years ago
And it's Apache licensed, which grants a patent license. Some of the comments refer to specific aspects of how PageRank is calculated. PageRank itself is past patent protection, but I wonder if this might also accidentally grant licenses to other patents.
yencabulator · 2 years ago
There's still an angle where the copyright owner claims that the person who caused this to happen did not have the authority to apply the license to it.
ec109685 · 2 years ago
Oops, someone’s script was too greedy when uploading those elixir api documents.
xnx · 2 years ago
precompute · 2 years ago
> My anonymous source claimed that way back in 2005, Google wanted the full clickstream of billions of Internet users, and with Chrome, they’ve now got it. The API documents suggest Google calculates several types of metrics that can be called using Chrome views related to both individual pages and entire domains.

What answer do the engineers at Google working on this have for this violation of privacy?

GuB-42 · 2 years ago
I am not an engineer at Google, but this is what I would say if I were.

We don't know who you are, you are just a number in a database, and we don't even know what number, we just get the total number of visits for each website, not who visited it. It is like counting cars on a highway, not following your car. Plus, it serves the useful purpose of providing you with better search results, the terms and conditions allow it, and it can be disabled.

voltaireodactyl · 2 years ago
The obvious response being that counting cars on the highway is a necessary first step on the road to identifying and then tracking their movements.

Similar to how insurance companies have offered voluntary, “anonymized” data dongles for discounts that are now being used (or at least revealed to be used) to collect data most often used to reject claims.

lolinder · 2 years ago
> we don't even know what number, we just get the total number of visits for each website, not who visited it

This is not what a clickstream is. A clickstream requires that the sequence of clicks be preserved, and preserving that sequence undermines anonymity.
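
To make the distinction concrete, here is a minimal sketch. The event tuples, URLs, and function names are all made up for illustration and are not taken from the leaked documents. It contrasts the "counting cars" aggregate with a per-user ordered clickstream built from the same raw events:

```python
from collections import Counter

# Hypothetical raw events: (pseudonymous_id, timestamp, url).
# None of these names or values come from the leak; they are illustrative only.
events = [
    ("u1", 1, "news.example/home"),
    ("u1", 2, "clinic.example/appointments"),
    ("u2", 1, "news.example/home"),
    ("u1", 3, "bank.example/login"),
    ("u2", 2, "shop.example/cart"),
]


def aggregate_counts(events):
    """'Counting cars': total visits per URL, with user and order discarded."""
    return Counter(url for _, _, url in events)


def clickstreams(events):
    """A clickstream: each pseudonymous user's visits, kept in time order."""
    streams = {}
    for uid, _, url in sorted(events, key=lambda e: (e[0], e[1])):
        streams.setdefault(uid, []).append(url)
    return streams


counts = aggregate_counts(events)
streams = clickstreams(events)
```

The aggregate only says that `news.example/home` was visited twice. The clickstream says that one browser went from a news site to a clinic to a bank, in that order, and that kind of sequence is what tends to make a "number in a database" re-identifiable.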

raxxorraxor · 2 years ago
That would be money. If someone has another excuse, they are naive or lying to themselves.

It certainly is not "to improve the net or advertising" - that would be the lying part.

Google has done some good for the net, but the scales of their contributions slowly but steadily move to the negative side.

azemetre · 2 years ago
Reminds me of the studies they’ve done on cognitive dissonance/lying.

Basically if you believe lies you tell yourself, they tend to turn into truths in your mind over time. Even if you were doing it “ironically.”

danpalmer · 2 years ago
Personal (not work-related) opinion: This basically can’t happen with things like DMA and GDPR. DMA in particular means you can’t share data across “products” without explicit consent. So you could for example collect websites that don’t work for the purposes of improving Chrome, but not then share that with the Ads/Search orgs for personalisation or targeting, as far as I understand the legislation.

Personal opinion about work at Google (still not Google’s opinion): I’m consistently impressed with how seriously this stuff is taken and the amount of work that goes into making sure that things like this sharing can’t happen accidentally, and that user choice is respected. The engineers on the ground are absolutely making sure this all works, and most of us care deeply about user privacy. I have personally worked both on implementing new features that significantly push forward privacy, and on implementing privacy controls for regulatory purposes.

BrenBarn · 2 years ago
The thing is that preventing "sharing" isn't sufficient. People who are concerned about privacy don't want any such data collected or stored in the first place, ever. The implicit "sharing" of my data with Google (or whatever company) is a problem in itself. Regardless of how "seriously" Google (or whatever company) takes it, for a lot of the data I don't want them to ever have it in the first place.
verteu · 2 years ago
> I’m consistently impressed with how seriously this stuff is taken and the amount of work that goes into making sure that things like this sharing can’t happen accidentally

I believe the law is violated when it's sufficiently profitable -- it just requires VP permission.

No public sources for this except Jedi Blue, the old anti-poaching case, etc.

noprocrasted · 2 years ago
> This basically can’t happen with things like DMA and GDPR

I'm sorry but this is just wishful thinking. It might be what the spirit of the DMA & GDPR want but definitely not the reality thanks to inadequate or outright non-existent enforcement.

There are businesses out there whose entire business model and revenue stream are based on violating the GDPR. Not some kind of internal conspiracy or rogue employee, but the entire company is doing it in the open and the result of its doings (targeted ads or spam) are visible out there in the open for all to see.

Facebook, credit bureaus, data brokers, "consent management platforms", etc. All these companies' business models are big, obvious breaches of the GDPR. Yet, they are... still alive and kicking?

There is no chance that a concealed GDPR breach (whether intentional or accidental) will get addressed when the biggest intentional breaches are still allowed to continue out there in the open.

I suspect something very similar is going to happen with the DMA - Apple is already acting in bad faith but have yet to see any consequences.

marcinzm · 2 years ago
> What answer do the engineers at google working on this have for this violation of privacy?

The same answer you probably have for the millions of questions about the things you do that some other people find offensive to their personal views and beliefs.

bdlowery · 2 years ago
How is it a violation of privacy? Did you read the terms of service?
precompute · 2 years ago
It's a privacy violation regardless of the ToS.
y42 · 2 years ago
A ToS announcement is not explicit consent. I doubt that this will help in court, even pre-GDPR.
9dev · 2 years ago
See, that’s the nice thing about the GDPR: You cannot hide unexpected hostile stuff in the ToS anymore. If you don’t tell me what you do with my data in a way that is obvious, easy to understand, and most importantly easy to disable, it’s illegal.
vouaobrasil · 2 years ago
Sometimes I wonder how much better the internet would be if hits on Google weren't directly tied to revenue from Google itself through its ad program. I am certain Google has made the internet and the world a worse place to live.
eitland · 2 years ago
As a user of Kagi and search.marginalia.nu I can tell you:

Quite a bit.

So much that now that I have what "everyone" asked Google for for years - that is blacklists - I hardly use them.

Why? Because with Kagi I get much better results out of the box.

I am fairly sure Googlers will tell me there are multiple safeguards to prevent the inclusion of Google ads from affecting ranking, to which I just have to say that the results speak for themselves.

Please note: I have only used Kagi for two years. I am only one user. But I am a user with 20 years of experience with Google and that has got to count for something.

Nevolihs · 2 years ago
I actually use pinning, blocking and raising/lowering the value of individual sites every day. I wish this were the direction search engines had gone in the first place, and it's the direction I hope Kagi continues in. I want a personalized search engine that's personalized by me, not by a company trying to profile me and make money off of my clicks.
scutrell · 2 years ago
I was excited to try Kagi, but I couldn't justify the cost. I find DDG with the occasional Google search to function almost as well. I'll try Kagi again at some point, but it wasn't the panacea people here made it out to be
p3rls · 2 years ago
Kagi is the same garbage as google in my niche. Even worse, maybe. It looks like it's weighing backlinks and SEO garbage even higher. Well done.

I don't know how people keep talking about it. The results, as you say, speak for themselves.

abhijat · 2 years ago
I switched to Kagi in June last year. I just realized I tried it initially because I wanted to try out blocking sites in search results, and I have only ever needed to block three domains.
beeboobaa3 · 2 years ago
Is kagi good for finding things like old forum posts (not reddit)? I know some of those sites are still up but google seems to ignore them.
packetlost · 2 years ago
I dunno, the first thing I did was blacklist G**ksf*rG**ks from my search results (and others, of course) and I couldn't be happier.
esperent · 2 years ago
Kagi is worth the money, but it isn't magic. It's about as good as Google was ~five years ago, before they made all the search operators stop working. There's also a whole bunch of things it's worse at than Google - especially local search and shopping. Plus I still get plenty of blogspam and AI generated crap from Kagi.
karma_pharmer · 2 years ago
Kagi is simply reselling google search results.
amelius · 2 years ago
How do you know that Kagi won't become as bad as Google at some point?
Workaccount2 · 2 years ago
The fundamental problem with the Internet is that people don't want to pay for things on it.

No matter what, whatever we ended up with was going to be shitty and exploitive.

eitland · 2 years ago
Now you have a chance. Kagi is there.

I made my decision two years ago and I would probably do it even if it was just on par with Google, to support competition and to avoid supporting Google.

But in hindsight it is just exceptionally much better. There is no going back unless Kagi does something monumentally stupid.

tjpnz · 2 years ago
How much of that is due to ad-tech companies like Google conditioning people into thinking that way? What if online payments weren't so god awful and allowed people to throw in a few dollars as easily as they might at a toll booth? That's still an unsolved problem too. Credit card companies have solidified their involvement in every facet of the process and the alternatives are non-starters for frictionless commerce.

I'm still happy to put my money where my mouth is and do pay for services which are genuinely useful to me. But this is not the kind of internet I imagined when growing up.

L-four · 2 years ago
It's not that people don't want to pay, it's that it's difficult to pay small sums. The web browsers could solve this problem but they make money from ads so it's not in their best interest.
wslh · 2 years ago
Google was really great and revolutionary, they helped zillions of small companies to thrive. It was another cycle.

Then, now, it is like media before the 90s: you need to pay a lot of money to be in the center page of the newspaper.

But, hopefully we are talking about LLMs now, seems like one of the answers to search engines in general. Beyond AI, I see LLMs as a good evolution from PageRank.

A little bit general, but lately I use the expression: "Complexity as Scam". Google always pointed to their "algorithms" and played with this term as if algorithms couldn't be adjusted to whatever you want them to be. Initially the coined term was sound because it was based on a scientific paper and its eventual evolution, but it seems like the original PageRank idea has detoured from being a "pure" graph algorithm.

Another context where I use "Complexity as Scam" is Web3. It is like Matryoshka dolls where there is always one more step of complexity to prove a point, but it never ends.

benterix · 2 years ago
It's not black and white. There was a lot of junk that was forced on us and that was removed thanks to Google. But I agree the direct relationship is inherently corrupting.
GTP · 2 years ago
Larry Page and Sergey Brin even stated very clearly in their original paper that using ads as a revenue source can impact the quality of results returned from the search engine.
DarkNova6 · 2 years ago
You mean the way Google worked originally? The founders were very careful in creating a barrier between ads and search.

A barrier whose erosion has been well documented over the last 10 years.

vouaobrasil · 2 years ago
A barrier whose only purpose was to establish trust so that it could be later taken advantage of.
heresie-dabord · 2 years ago
Instead of a semantic Web of knowledge, we got "grep the HTML... with ads".
josefx · 2 years ago
You dropped the -v . Modern day Google seems fine tuned to return results that contain everything except for the words I searched for.
greg_V · 2 years ago
I mean... maybe, but not really. The first problem of the internet was that there wasn't that much content specifically. The first internet companies were the broadband providers who were developing content themselves, like AOL.

Google and the ad ecosystem they acquired was basically the flywheel that spurred content creation at scale. Anyone could jump in, follow a few guidelines and earn a living by producing content on the internet. The Youtube acquisition and monetization followed the same pattern.

Over time the market consolidated and got less and less competitive: fewer platforms with complete control of traffic and one-sided revenue sharing agreements. The guidelines, so to speak, on how content should look and feel were algorithmically made stricter and stricter until everything looks, feels, sounds and reads the same.

The problem right now is that the platforms are still tightening their grip, and it's all tied to the approach of using AI to replace the content creators on the platforms from Google to Spotify to Meta, and handing the money saved to shareholders. And while the web has been shitty for a few years now, we're now seeing a sudden drop in quality because the average user has no recourse or alternative, and neither does the average creator have the means of distribution and monetization (not just publishing, that's been solved) to even find, let alone meet the new kinds of demand.

I'm certain that in a few years this will even out: new search engines, new aggregators and new feeds will emerge, but the content - money - network problem triangle remains as a fundamental problem of the internet.

linsomniac · 2 years ago
Did you experience the Internet before google? The idea of a world where Alta Vista won is truly chilling.
thsksbd · 2 years ago
You mean a world where people still knew how to use a library catalog, still relied on more than one source of information and curious crazy tidbits are still out there?

The internet is boring. And the trash is still there. It's just become reputable instead.

washadjeffmad · 2 years ago
I'd be okay with a world in which everyone else in search didn't lose, too.
msk-lywenn · 2 years ago
In some way, didn't Google become Alta Vista?
vouaobrasil · 2 years ago
Yes, I did! I used to use Yahoo search where the results were more hand-curated and people did not create websites for intensive commercial purposes with useless SEO fluff like it is today.
blowski · 2 years ago
I imagine it would be a different flavour to what we have today, but the same intensity. Anything that so deeply penetrates daily life across the globe is going to bring enormous problems with it.
1vuio0pswjnm7 · 2 years ago
There is something truly strange about the idea that people "trust" a website operator and can rely on it to provide them with useful information when that same operator is well-known to be secretive, deceptive and dishonest in order to protect its own interests. It's like imagining that a fact witness who tells the truth on some occasions and lies on others is credible.

https://ipullrank.com/google-algo-leak

nsmog767 · 2 years ago
I work in search and didn't find anything surprising in here. But that's mostly because I've just assumed Google has been lying for years about many things, such as not using click data or Chrome data.

I've directly seen people who have successfully manipulated search rankings by having logged-in Chrome users search for a term, and then click on a given page. Works like a charm (though it may not stick once the manipulation is done, unless organic users also prefer it).

ec109685 · 2 years ago
If anyone is surprised about Chrome sending URLs to Google, you can turn the “feature” off by unchecking “Make searches and browsing better” in the sync section of Google Chrome settings.

Creepy.

HenryBemis · 2 years ago
Or, and hear me out, you never use Chrome again, on any platform.. like ever ever again.
smegger001 · 2 years ago
I only have Chrome installed for a couple of work-related sites that don't display correctly on Firefox. I don't get to choose not to use the work-related sites, and MS Edge likely isn't any safer and also is not available on my choice of operating system.
Terr_ · 2 years ago
"But what if I don't want my own computer to build and share a detailed profile of everyone I know, everywhere I go, all my preferences, and how to manipulate me?"

"Well obviously it's your fault for not picking the 'Don't Be Cool' option on subpage 27b-6, duh!"

ralfn · 2 years ago
Yeah. It's victim blaming. Reminds me of "they should have shouted louder".

The confusing thing is the crime itself is small on an individual level. The question is: does it add up cumulatively if a small crime is committed against many?

andrybak · 2 years ago
> unchecking “Make searches and browsing better”

Before that, you can make it audible: <https://github.com/berthubert/googerteller>

precompute · 2 years ago
Is that part of Chrome not open-source?
alexvitkov · 2 years ago
Presumably no, I haven't seen any overly creepy shit in Chromium. There's a project called ungoogled-chromium that tracks all the Google junk in Chromium and gets rid of it, their patch set is actually surprisingly small:

[1] https://github.com/ungoogled-software/ungoogled-chromium/tre...

noman-land · 2 years ago
Imagine thinking you can escape your abuser by living in their house and asking them politely to stop.

thih9 · 2 years ago
> Thousands of documents, which appear to come from Google’s internal Content API Warehouse, were released March 13 on Github by an automated bot called yoshi-code-bot

Does anyone know more about yoshi-code-bot and how these documents were suddenly published?

Was it a script misconfiguration? A manual push? Something else?

chx · 2 years ago
https://github.com/yoshi-code-bot

Created 1,891 commits in 19 repositories

All 19 are under googleapis

This looks like a bot Google uses to publish their stuff on github and so likely it's a misconfiguration.