Anonymous Source Shared Leaked Google Search API Documents

> My anonymous source claimed that way back in 2005, Google wanted the full clickstream of billions of Internet users, and with Chrome, they’ve now got it. The API documents suggest Google calculates several types of metrics that can be called using Chrome views related to both individual pages and entire domains.

What answer do the engineers at Google working on this have for this violation of privacy?

GuB-42 · 2 years ago

I am not an engineer at Google, but this is what I would say if I were.

We don't know who you are, you are just a number in a database, and we don't even know what number, we just get the total number of visits for each website, not who visited it. It is like counting cars on a highway, not following your car. Plus, it serves the useful purpose of providing you with better search results, the terms and conditions allow it, and it can be disabled.

voltaireodactyl · 2 years ago

The obvious response being that counting cars on the highway is a necessary first step on the road to identifying and then tracking their movements.

Similar to how insurance companies have offered voluntary, “anonymized” data dongles for discounts that are now being used (or at least revealed to be used) to collect data most often used to reject claims.

lolinder · 2 years ago

> we don't even know what number, we just get the total number of visits for each website, not who visited it

This is not what a clickstream is. A clickstream requires that the sequence of clicks be preserved, and preserving that sequence undermines anonymity.
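To make the distinction concrete, here is a minimal Python sketch. The event data, session IDs, site names, and function names are all made up for illustration; nothing here reflects how Chrome or Google actually stores anything. It only contrasts "total visits per site" with a per-session ordered sequence.

```python
from collections import Counter

# Hypothetical browsing events: (session_id, timestamp, site).
# All values are invented for illustration.
events = [
    ("s1", 1, "news.example"),
    ("s1", 2, "clinic.example"),
    ("s1", 3, "pharmacy.example"),
    ("s2", 1, "news.example"),
    ("s2", 2, "shop.example"),
]

def aggregate_counts(events):
    """Total visits per site: the session and the order are discarded."""
    return Counter(site for _, _, site in events)

def clickstreams(events):
    """Ordered sequence of sites per session: this is a clickstream."""
    streams = {}
    # Sorting tuples orders by session first, then by timestamp.
    for session, _, site in sorted(events):
        streams.setdefault(session, []).append(site)
    return streams

counts = aggregate_counts(events)
streams = clickstreams(events)

print(counts)   # per-site totals only
print(streams)  # per-session paths, which can single out one visitor
```

The aggregate view cannot say which visits belong together, while the clickstream view keeps a path per session, and a distinctive enough path can identify a person even without a name attached.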

raxxorraxor · 2 years ago

That would be money. If someone has another excuse, they are naive or lying to themselves.

It certainly is not "to improve the net or advertising" - that would be the lying part.

Google has done some good for the net, but the scales of their contributions slowly but steadily move to the negative side.

azemetre · 2 years ago

Reminds me of the studies they’ve done on cognitive dissonance/lying.

Basically if you believe lies you tell yourself, they tend to turn into truths in your mind over time. Even if you were doing it “ironically.”

danpalmer · 2 years ago

Personal (not work-related) opinion: This basically can’t happen with things like DMA and GDPR. DMA in particular means you can’t share data across “products” without explicit consent. So you could, for example, collect websites that don’t work for the purposes of improving Chrome, but not then share that with the Ads/Search orgs for personalisation or targeting, as far as I understand the legislation.

Personal opinion about work at Google (still not Google’s opinion): I’m consistently impressed with how seriously this stuff is taken and the amount of work that goes into making sure that things like this sharing can’t happen accidentally, and that user choice is respected. The engineers on the ground are absolutely making sure this all works, and most of us care deeply about user privacy. I have personally worked both on implementing new features that significantly push forward privacy, and on implementing privacy controls for regulatory purposes.

BrenBarn · 2 years ago

The thing is that preventing "sharing" isn't sufficient. People who are concerned about privacy don't want any such data collected or stored in the first place, ever. The implicit "sharing" of my data with Google (or whatever company) is a problem in itself. Regardless of how "seriously" Google (or whatever company) takes it, for a lot of the data I don't want them to ever have it in the first place.

verteu · 2 years ago

> I’m consistently impressed with how seriously this stuff is taken and the amount of work that goes into making sure that things like this sharing can’t happen accidentally

I believe the law is violated when it's sufficiently profitable -- it just requires VP permission.

No public sources for this except Jedi Blue, the old anti-poaching case, etc.

noprocrasted · 2 years ago

> This basically can’t happen with things like DMA and GDPR

I'm sorry but this is just wishful thinking. It might be what the spirit of the DMA & GDPR want but definitely not the reality thanks to inadequate or outright non-existent enforcement.

There are businesses out there whose entire business model and revenue stream are based on violating the GDPR. Not some kind of internal conspiracy or rogue employee, but the entire company is doing it in the open and the result of its doings (targeted ads or spam) are visible out there in the open for all to see.

Facebook, credit bureaus, data brokers, "consent management platforms", etc. All these companies' business models are big, obvious breaches of the GDPR. Yet, they are... still alive and kicking?

There is no chance that a concealed GDPR breach (whether intentional or accidental) will get addressed when the biggest intentional breaches are still allowed to continue out there in the open.

I suspect something very similar is going to happen with the DMA - Apple is already acting in bad faith but have yet to see any consequences.

marcinzm · 2 years ago

> What answer do the engineers at Google working on this have for this violation of privacy?

The same answer you probably have for the millions of questions about the things you do that some other people find offensive to their personal views and beliefs.

bdlowery · 2 years ago

How is it a violation of privacy? Did you read the terms of service?

precompute · 2 years ago

It's a privacy violation regardless of the ToS.

y42 · 2 years ago

A ToS announcement is not explicit consent. I doubt that this would help in court, even pre-GDPR.

9dev · 2 years ago

See, that’s the nice thing about the GDPR: You cannot hide unexpected hostile stuff in the ToS anymore. If you don’t tell me what you do with my data in a way that is obvious, easy to understand, and most importantly easy to disable, it’s illegal.

Sometimes I wonder how much better the internet would be if hits on Google weren't directly tied to revenue from Google itself through its ad program. I am certain Google has made the internet and the world a worse place to live.

eitland · 2 years ago

As a user of Kagi and search.marginalia.nu I can tell you:

Quite a bit.

So much that now that I have what "everyone" asked Google for for years - that is blacklists - I hardly use them.

Why? Because with Kagi I get much better results out of the box.

I am fairly sure Googlers will tell me there are multiple safeguards to prevent the inclusion of Google ads from affecting ranking, to which I just have to say that the results speak for themselves.

Please note: I have only used Kagi for two years. I am only one user. But I am a user with 20 years of experience with Google, and that has got to count for something.

Nevolihs · 2 years ago

I actually use pinning, blocking and raising/lowering the value of individual sites every day. I wish this were the direction search engines had gone in the first place, and it's the direction I hope Kagi continues in. I want a personalized search engine that's personalized by me, not by a company trying to profile me and make money off of my clicks.

scutrell · 2 years ago

I was excited to try Kagi, but I couldn't justify the cost. I find DDG with the occasional Google search to function almost as well. I'll try Kagi again at some point, but it wasn't the panacea people here made it out to be.

p3rls · 2 years ago

Kagi is the same garbage as Google in my niche. Even worse, maybe. It looks like it's weighing backlinks and SEO garbage even higher. Well done.

I don't know how people keep talking about it. The results, as you say, speak for themselves.

abhijat · 2 years ago

I switched to Kagi in June last year. I just realized I tried it initially because I wanted to try out blocking sites in search results, and I have only ever needed to block three domains.

beeboobaa3 · 2 years ago

Is Kagi good for finding things like old forum posts (not Reddit)? I know some of those sites are still up, but Google seems to ignore them.

packetlost · 2 years ago

I dunno, the first thing I did was blacklist G**ksf*rG**ks from my search results (and others, of course) and I couldn't be happier.

esperent · 2 years ago

Kagi is worth the money, but it isn't magic. It's about as good as Google was ~five years ago, before they made all the search operators stop working. There's also a whole bunch of things it's worse at than Google - especially local search and shopping. Plus I still get plenty of blogspam and AI-generated crap from Kagi.

karma_pharmer · 2 years ago

Kagi is simply reselling google search results.

amelius · 2 years ago

How do you know that Kagi won't become as bad as Google at some point?

Workaccount2 · 2 years ago

The fundamental problem with the Internet is that people don't want to pay for things on it.

No matter what, whatever we ended up with was going to be shitty and exploitive.

eitland · 2 years ago

Now you have a chance. Kagi is there.

I made my decision two years ago and I would probably do it even if it was just on par with Google, to support competition and to avoid supporting Google.

But in hindsight it is just exceptionally much better. There is no going back unless Kagi does something monumentally stupid.

tjpnz · 2 years ago

How much of that is due to ad-tech companies like Google conditioning people into thinking that way? What if online payments weren't so god awful and allowed people to throw in a few dollars as easily as they might at a toll booth? That's still an unsolved problem too. Credit card companies have solidified their involvement in every facet of the process and the alternatives are non-starters for frictionless commerce.

I'm still happy to put my money where my mouth is and do pay for services which are genuinely useful to me. But this is not the kind of internet I imagined when growing up.

L-four · 2 years ago

It's not that people don't want to pay, it's that it's difficult to pay small sums. The web browsers could solve this problem, but they make money from ads, so it's not in their best interest.

wslh · 2 years ago

Google was really great and revolutionary, they helped zillions of small companies to thrive. It was another cycle.

Then, now, it is like media before the 90s: you need to pay a lot of money to be in the center page of the newspaper.

But, hopefully we are talking about LLMs now, seems like one of the answers to search engines in general. Beyond AI, I see LLMs as a good evolution from PageRank.

A little bit general, but lately I use the expression "Complexity as Scam". Google always pointed to their "algorithms" and played with this term as if algorithms couldn't be adjusted to be whatever you want them to be. Initially the term was sound because it was based on a scientific paper and its later evolution, but it seems like the original PageRank idea has detoured from being a "pure" graph algorithm.

Another context where I use "Complexity as Scam" is Web3. It is like Matryoshka dolls, where there is always one more step of complexity to prove a point, but it never ends.

benterix · 2 years ago

It's not black and white. There was a lot of junk that was forced on us and that was removed thanks to Google. But I agree the direct relationship is inherently corrupting.

GTP · 2 years ago

Larry Page and Sergey Brin even stated very clearly in their original paper that using ads as a revenue source can impact the quality of results returned from the search engine.

DarkNova6 · 2 years ago

You mean the way Google worked originally? The founders were very careful in creating a barrier between ads and search.

A barrier whose erosion has been well documented over the last 10 years.

vouaobrasil · 2 years ago

A barrier whose only purpose was to establish trust so that it could be later taken advantage of.

heresie-dabord · 2 years ago

Instead of a semantic Web of knowledge, we got "grep the HTML... with ads".

josefx · 2 years ago

You dropped the -v. Modern-day Google seems fine-tuned to return results that contain everything except for the words I searched for.

greg_V · 2 years ago

I mean... maybe, but not really. The first problem of the internet was that there wasn't that much content specifically. The first internet companies were the broadband providers who were developing content themselves, like AOL.

Google and the ad ecosystem they acquired was basically the flywheel that spurred content creation at scale. Anyone could jump in, follow a few guidelines and earn a living by producing content on the internet. The Youtube acquisition and monetization followed the same pattern.

Over time the market consolidated and got less and less competitive: fewer platforms, with complete control of traffic and one-sided revenue sharing agreements. The guidelines, so to speak, on how content should look and feel were algorithmically made stricter and stricter until everything looks, feels, sounds and reads the same.

The problem right now is that the platforms are still tightening their grip, and it's all tied to the approach of using AI to replace the content creators on platforms from Google to Spotify to Meta, and handing the money saved to shareholders. And while the web has been shitty for a few years now, we're now seeing a sudden drop in quality because the average user has no recourse or alternative, and neither does the average creator have the means of distribution and monetization (not just publishing, that's been solved) to even find, let alone meet, the new kinds of demand.

I'm certain that in a few years this will even out: new search engines, new aggregators and new feeds will emerge, but the content - money - network problem triangle remains as a fundamental problem of the internet.

linsomniac · 2 years ago

Did you experience the Internet before Google? The idea of a world where Alta Vista won is truly chilling.

thsksbd · 2 years ago

You mean a world where people still knew how to use a library catalog, still relied on more than one source of information, and curious, crazy tidbits are still out there?

The internet is boring. And the trash is still there. It's just become reputable instead.

washadjeffmad · 2 years ago

I'd be okay with a world in which everyone else in search didn't lose, too.

msk-lywenn · 2 years ago

In some way, didn't Google become Alta Vista?

vouaobrasil · 2 years ago

Yes, I did! I used to use Yahoo search, where the results were more hand-curated and people did not create websites for intensive commercial purposes with useless SEO fluff like they do today.

blowski · 2 years ago

I imagine it would be a different flavour to what we have today, but the same intensity. Anything that so deeply penetrates daily life across the globe is going to bring enormous problems with it.

1vuio0pswjnm7 · 2 years ago

There is something truly strange about the idea that people "trust" a website operator and can rely on it to provide them with useful information when that same operator is well-known to be secretive, deceptive and dishonest in order to protect its own interests. It's like imagining that a fact witness who tells the truth on some occasions and lies on others is credible.

https://ipullrank.com/google-algo-leak