Ask HN: Ads triggered by WhatsApp “end to end encrypted” messages?

I think you may partly just be biased: you notice ads more when they fit the topic you sent to your wife, and you may be lenient in deciding whether an ad matches the category. And if you dwell on an ad because it seems to match, you may get more similar ads.

Here is how I think you could design a more robust (but less fun) experiment:

- Come up with a bunch of topics, write them down on slips of paper, put the paper into a hat

- Each Monday, draw three topics from the hat, send some WhatsApp messages about the first, Messenger messages about the second, and don’t discuss the third. Don’t put the topics back in the hat.

- If you see any ads relating to one of the topics, screenshot them and save the screenshots (e.g. to your computer), labelled with the topic

- Separately, record which topic went to which platform

- After doing this for a while, go through the screenshots and (each of you and your wife or ideally other people) give a rating for how well the ad matches the topic. To avoid bias, you shouldn’t know which app saw the topic.

- Now work out average ratings / the distribution across the three products (WhatsApp vs Messenger vs none) and compare
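The protocol above can be sketched in a few lines of Python. This is only an illustration under assumptions: the arm names, `draw_week`, and `mean_rating_by_arm` are made up for the example, and the ratings are whatever numeric scale the raters agree on. The point it shows is that raters score (topic, ad) pairs without seeing the arm, and the arm is only joined back in afterwards.

```python
import random
from statistics import mean

ARMS = ("whatsapp", "messenger", "none")  # "none" is the control arm


def draw_week(hat, rng):
    """Draw three topics without replacement and assign one to each arm."""
    topics = rng.sample(hat, 3)  # random order gives a random assignment
    for topic in topics:
        hat.remove(topic)  # topics never go back in the hat
    return dict(zip(ARMS, topics))


def mean_rating_by_arm(assignments, ratings):
    """Average blinded ratings per arm.

    assignments: list of {arm: topic} dicts, one per week.
    ratings: list of (topic, score) pairs produced by raters who
             saw only the topic and the ad, never the arm.
    """
    arm_of = {topic: arm for week in assignments for arm, topic in week.items()}
    by_arm = {arm: [] for arm in ARMS}
    for topic, score in ratings:
        by_arm[arm_of[topic]].append(score)
    # None marks an arm with no rated screenshots at all.
    return {arm: (mean(scores) if scores else None) for arm, scores in by_arm.items()}
```

If the WhatsApp arm does not score noticeably higher than the control arm over many weeks, the simplest reading is that the messages are not driving the ads.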

rakoo · 3 years ago

A simpler protocol to realize that the Baader-Meinhof phenomenon is probably what's happening:

- pick said topic, something you never cared about before, talk about it but don't write any messages containing it

- for 1 month, record every ad you see about it

- send a message about the topic

- for another month, record every ad you see about it

Comparing the number of occurrences will tell you what is happening.
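One rough way to do that comparison, as a sketch: assume the two observation windows are equally long, that you saw a similar total volume of ads in each, and that ad impressions are independent. Under "the message changed nothing", each recorded ad is then equally likely to fall in either month, so the after-count follows a Binomial(n, 0.5) distribution. The function name `one_sided_p` is invented for this example.

```python
from math import comb


def one_sided_p(before, after):
    """P(X >= after) for X ~ Binomial(before + after, 0.5).

    A small value means the jump after sending the message would be
    surprising if the message had no effect on the ads shown.
    """
    n = before + after
    return sum(comb(n, k) for k in range(after, n + 1)) / 2 ** n
```

With very small counts (say 0 ads before and 3 after) the result is 0.125, which is not strong evidence either way; the test only becomes informative with more observations.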

KennyBlanken · 3 years ago

> pick said topic, something you never cared about before, talk about it but don't write any messages containing it;

This does not work. How did you come about the topic? Answer: it was in your brain, because advertising, trends among your peers and social connections, online trends real or astroturfed, etc.

That's why you end up with people thinking their phone is "listening" to them.

Maximus9000 · 3 years ago

It also has to be a topic that advertisers would pay to target you with. You can't talk about something super obscure that advertisers don't care about - like steam engines.

Firmwarrior · 3 years ago

You're saying to record ads you see about it on TV or something? (Just to eliminate the "My computer is secretly recording me" angle)

TaylorAlexander · 3 years ago

If anyone wants to try this, a friend sent me one link to a device called Levo which does “herb oil infusion” aka it lets you make weed brownies easily. I clicked the one link my friend sent and now I get ads for Levo constantly in my YouTube adroll. Though I should say this is obviously on Google’s ad network specifically and I have no idea if this applies to other networks.

RC_ITR · 3 years ago

These kinds of stories are always fun to analyze using the Socratic method.

-How did you learn about the product?

-Have you ever searched for it?

-Did a friend of yours tell you about it? Do you think they searched for it?

-Are a lot of ads for it playing on TV channels you like? Could Instagram know you like those TV channels?

-Is it something your neighbors got? Do you think there has been a spike in shipments of this product to your neighbors?

Eventually people start to “get” that scanning the text of messages is way more helpful for humans than it is for computers. They’ve got other data they can use.

hbn · 3 years ago

I also have a theory that sometimes when people say "we were talking about <product> and I never even typed it into my phone or anything, and suddenly I started seeing ads for it the next day!" that the person in the story may not have looked up <product>, but someone else in the conversation might have Googled it or browsed an Amazon listing or something and they have some kind of connection in their ad profiles whether it be that they know these 2 people interact a lot, they're in the same geolocation, same wifi network/IP address, etc.

I'm just not convinced of the always on microphones in phones listening for and processing every single thing considering how much battery drain that would cause, whether the processing is done on device or they're sending all that data to a server to be processed.

muzani · 3 years ago

I don't think it can be observed because it's likely a bug. It freaks users out. People uninstall the app and start threads like this.

It's sort of like getting mugged once and then setting up a camera in a bunch of alleys to prove that muggers exist. You can even set up a camera of yourself running into dark alleys every night, but the odds of reproducing a mugging is still extremely low.

There's a certain kind of precision that convinces me it's real though. Precision is common. I look at a book on Amazon, and a FB ad for that book appears.

But I get rejected for a loan via WhatsApp and then used car ads appear for that model of car that I applied a loan for? That's a bit on the nose.

dahdum · 3 years ago

> But I get rejected for a loan via WhatsApp and then used car ads appear for that model of car that I applied a loan for? That's a bit on the nose.

Aside from random coincidence, I could see this happening if you provided your personal information (especially email) for the loan application. It could have been shared to multiple underlying lenders alongside a data vendor who ultimately provided interest targeting (which can include car models) to an ad network.

Getting an ad for that specific model could also have been due to other online activity, such as checking the KBB.

boltzmann-brain · 3 years ago

> the odds of reproducing a mugging is still extremely low

I did that and I got a mugging on camera. The attacker was convicted.

badrabbit · 3 years ago

This is Facebook. They've been caught recording people and selling that for advertising; they deny it because technically your audio is transcribed, not recorded, and they can send only some keywords back so whole conversations aren't sent back to them.

badrabbit · 3 years ago

Plenty of research and news stories about this if you care to search. The speculative part of my comment is the transcription, which I'm guessing at because of their fervent denials despite the evidence: the wording of their denial statements would then be technically correct.

If I had to guess, your WhatsApp messages are E2E secured but keywords are sent to Facebook when they match some condition. So if you message "happy birthday" to someone, they won't see that but the fact that the keyword "birthday" was found even if the word isn't included is sent to FB. That way they can say they're not snooping on your messages.

thehappypm · 3 years ago

Source?

neodypsis · 3 years ago

Would they argue that the message goes first into a neural network that outputs potential product labels based on the message and that it all happens client-side? That's the only way I see it possible for them not to violate the E2EE.

vineyardmike · 3 years ago

An important thing you’re missing is the control. You should record every ad.

You need to know if you got 3 topics of ads every day and 1/3 of them are related to that secret topic, OR if you get 300 topics every day and 1/300 are related to that secret topic. If it’s the former, it’s suspicious, if it’s the latter, it’s way less suspicious.
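The base-rate point can be made concrete with a little arithmetic. This is a sketch under a simplifying assumption that each ad independently matches the secret topic with some fixed probability `p`; the function names are invented for the example.

```python
def topic_share(ad_topics, secret):
    """Fraction of all recorded ads that are about the secret topic."""
    if not ad_topics:
        raise ValueError("record at least one ad before computing a share")
    return sum(1 for topic in ad_topics if topic == secret) / len(ad_topics)


def chance_of_at_least_one(p, n_ads):
    """Probability of seeing one or more matching ads purely by chance,
    if each of n_ads ads matches independently with probability p."""
    return 1 - (1 - p) ** n_ads
```

For instance, if 1 ad in 300 happens to be about the topic anyway, then after seeing 300 ads the chance of at least one coincidental match is already about 63%. That is why recording every ad, not just the matching ones, matters.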

dan-robertson · 3 years ago

The control is the topic you pick that you don’t discuss on WhatsApp or messenger. The idea is random differences between topics should average out over many trials.

chillacy · 3 years ago

You can’t come up with the topics yourself either, because the topics you will think up are different based on your demographic / type of person you are, and ad networks basically try to guess that.

JoshuaDavid · 3 years ago

You can if you first come up with a list of topics, and then once you have that list, randomly assign each of those topics to one of the three categories.

dan-robertson · 3 years ago

The idea is that you may discover the topics you don’t talk about that week still come up as much as those you do.

nindalf · 3 years ago

There will be some bias in what they choose to screenshot right? Meaning, the unrelated topic might show up in the feed but they don’t screenshot it because it doesn’t fit the narrative?

Also, what we’re interested in is whether the text changed what was shown. If I saw ads for X last week but didn’t notice them, then spoke to a friend and noticed them and took a screenshot, it would appear to confirm the theory. Even though I was always seeing ads for X.

Ultimately, I don’t think people who are convinced of this theory will change their minds so it’s a moot point.

dan-robertson · 3 years ago

Yeah, that’s the biggest flaw in the experiment I proposed, I think. This is the reason I try to have the hopefully independent grading of ad-topic-relevancy blinded to which system the topic was communicated over. It may be that one sees many vaguely related ads for the WhatsApp topic due to some selection bias but a similar number of actually related ads.

uup · 3 years ago

I call this the gaslighting explanation: “no, it definitely wasn’t the messaging product owned by an advertising behemoth. You must have searched for it somewhere else.” Obviously the OP remembers where they’ve seen the product. If they had seen the product elsewhere, they wouldn’t have started this thread!

(wrt some comments in this thread)

Is it so hard to believe that Meta is snooping on WhatsApp conversations? Meta, a company of unprecedented size that was built over monetizing your private data? A company who's been caught in plenty of scandals (like Cambridge Analytica) about this exact sort of thing (violating their users' privacy)?

Someone from this community, which generally means educated, tech-literate and sensitive to these topics, shares a perfectly plausible observation of something that has been experienced as well by plenty of other folks, me included; and then some people come and try to make up the most convoluted explanations (candy boxes from Kazakhstan just happened to be trending that specific day, nothing to see here, move along!) for this phenomenon and try to shift the blame away from Meta. Why do you do this? Are you Meta employees? A PR agency they hired?

It's just baffling. Apparently some people DO want to be abused.

Plot twist: we all get ads about candy boxes from KZ now.

brap · 3 years ago

For anyone who has ever worked at a FAANG-like company in the last decade, yes, this is actually very hard to believe.

Despite the shady image they have, these companies go to great lengths to avoid doing shady things (because ultimately it’s bad for business). Not to mention the hundreds of tech employees that would have to be involved and keep quiet in this type of “conspiracy”. It’s incredibly unlikely, I truly believe that.

dessant · 3 years ago

I can imagine you haven't been involved in anything illegal, but I'm sure you're aware of Meta's documented track record of coordinated illegal actions. Do engineering teams just fall head first into a bucket of 2FA phone numbers and start using the data for ad targeting, and nobody bats an eye from the legal department to product managers? Or are they hypnotized to build services for biometric data collection without consent? Nobody does anything nefarious, but their collective actions which benefit the company just end up being illegal, again and again?

The tech companies you work for do often engage in illegal activities, and some of your colleagues are complicit. I'm sure it is an uncomfortable thought for some of you, but this is all part of the public record.

thatoneguytoo · 3 years ago

I completely agree (as another employee of FAANG). It's ridiculously hard to do anything against policy once it's set, and trust me, the policies are set. Media overplays a lot of things which just aren't there.

The sad reality is people are very predictable, even with basic data.

rini17 · 3 years ago

The employees obviously are told the functions and APIs that they are implementing have a completely legit use case. That is not hard to believe at all and was the case in Cambridge Analytica scandal, for example.

m463 · 3 years ago

"bad for business" leads to systems that do unexpected things. For instance, on-device generate identifiers for any image sent, and send the identifier out-of-band. This helps catch child pornography.

I can imagine the same thing done for text. The text might be encrypted, but interest keywords might be generated on-device and sent out-of-band.

rendaw · 3 years ago

The PRISM "conspiracy" was very shady and involved probably hundreds of employees. And if they have hushed people punching holes for the government, it's not crazy to think some data could leak out into other parts of their pipelines too.

I'm not claiming this is real, but I agree with GP.

beowulfey · 3 years ago

Let me start by saying I have no idea if Facebook is reading my encrypted messages or whatever. However, I will say that in my experience, whether something is bad for business if it gets discovered is usually not a concern for large corporations, if the thing being done makes them more money. Because everything is just a balance sheet.

For an example from non-FAANG companies, see illegal dumping of toxic waste by chemical companies, such as DuPont and PFOAs [1]. Despite knowing what they did was illegal, the math works out -- products with PFOAs were something like $1 billion in annual profit, and even when they got caught the fines and legal costs were a fraction of that, spread out over many years.

So I personally believe these companies 100% would do shady shit if it increases their profit margins. And why wouldn't they? There is no room for morals in capitalism, and the drawbacks are slim.

[1]https://www.nytimes.com/2016/01/10/magazine/the-lawyer-who-b...

spaceywilly · 3 years ago

The most plausible explanation is that people are just easy to predict. Might be tough to admit, but that’s actually a much simpler explanation than Facebook having a back door into our messages, which are end-to-end encrypted.

As others above me have thoroughly explained, there are numerous ways Facebook could figure out what you’re reading about/listening to/viewing on the internet, which ultimately drives what you are chatting with your friends about. Reading your messages would actually be the most difficult and low fidelity way for them to try to mine this information. They can just see your entire browsing history and extract from there, since the majority of websites have a tracking cookie that in some way phones home to Facebook.

bhk · 3 years ago

Seriously? Facebook knows their internal thoughts well enough to guess what topics they would choose when trying to pick something they "never talk about"?

If FB could do that, then FB would realize that these topics are not actually products they are interested in, so they wouldn't be showing ads.

z9znz · 3 years ago

> The most plausible explanation is that people are just easy to predict. Might be tough to admit, but that’s actually a much simpler explanation than Facebook having a back door into our messages, which are end-to-end encrypted.

I disagree with this to the extent that I would say the exact opposite is true.

Facebook (and others) have proven time and time again that they cannot correctly predict user behavior by locking out or banning users who actually did nothing wrong (because their algorithms predicted that the user was breaking terms of service or might be planning to). This happens over and over, even in cases not so complex as the "photos of my child to send to my doctor".

But on the flipside, Zuckerberg has been documented saying one thing to the public and exactly the opposite in private. Heck, Facebook has had memos and emails leaked where they talked about how they would say one thing in public (and to regulators) while doing the opposite secretly.

I believe that Facebook cheats and breaks agreements (and laws) in multiple directions all the time, often willfully. They've even been caught cheating their own ad customers by intentionally overstating the effectiveness and target accuracy of their ads.

mgraczyk · 3 years ago

It's hard to believe because I worked there and worked on this stuff (data and ML side) and know that they aren't.

moralestapia · 3 years ago

>I worked there [...] and know that they aren't

I know that, unfortunately, this is what puts bread on your mouth.

But, really? Are you suggesting that Cambridge Analytica didn't happen? Did we all hallucinate that?

You guys jumped the shark already. These attempts at damage control are laughable.

Aunche · 3 years ago

Being educated and tech-literate means that you should try to think more critically than "Facebook bad." You brought up Cambridge Analytica as your scandal of choice, which is the most newsworthy scandal, but the one where Facebook is the least guilty. Everyone had the same access to the APIs that Cambridge Analytica did and Facebook had shut down those APIs before the story broke out. Acting on instinct will only lead to regulation that won't be effective at stopping what you're trying to stop, cause needless side effects, and undermine your political credibility to push for changes that solve the important issues.

Closi · 3 years ago

From WhatsApp privacy policy:

We limit the information we share with Meta in important ways. For example, we will always protect your personal conversations with end-to-end encryption, so that neither WhatsApp nor Meta can see these private messages.

JumpCrisscross · 3 years ago

> WhatsApp privacy policy

Facebook has a deep culture of pathological lying. They lied to the FTC [1]. They lied to WhatsApp and to the EU [2]. They created an Oversight Board and then lied to it [3].

Each of those lies is more substantial than lying in a privacy policy.

[1] https://www.ftc.gov/system/files/documents/cases/182_3109_fa...

[2] https://euobserver.com/digital/137953

[3] https://techcrunch.com/2021/09/21/the-oversight-board-wants-...

lrvick · 3 years ago

Meta controls the proprietary WhatsApp client software that decrypts your messages, and they can have it decrypt and scan the messages for them and send back metrics on how often different words are used.

They can of course also have their app decrypt and re-encrypt the messages to the key of a requesting third party like police or hired reviewers if certain keywords are used.

Authorities could also have Google or Apple ship a signed tampered WhatsApp binary to any user or group of users, like protestors, that uses a custom seeded random number generator so they can predict all encryption keys generated and no one else, including Meta, will know.

The variant of end to end encryption where third parties control the proprietary software on both ends, is called marketing.
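To illustrate why the seeded random number generator scenario would matter, here is a minimal sketch. It is not based on any WhatsApp code, and the function names are hypothetical; it only shows the general property that keys drawn from a seeded, non-cryptographic PRNG can be regenerated by anyone who knows the seed, while keys from the operating system's CSPRNG cannot.

```python
import random
import secrets


def weak_key(seed):
    """32-byte 'key' from a seeded Mersenne Twister: fully reproducible."""
    return random.Random(seed).getrandbits(256).to_bytes(32, "big")


def strong_key():
    """32-byte key from the OS cryptographic random source."""
    return secrets.token_bytes(32)


# Whoever planted the seed can recompute the victim's key at will.
victim_key = weak_key(1234)
attacker_copy = weak_key(1234)
```

The ciphertext on the wire looks the same in both cases, which is the point of the comment above: an observer cannot tell from traffic alone which kind of generator produced the keys.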

muzani · 3 years ago

Also WhatsApp privacy policy:

As part of the Meta Companies, WhatsApp receives information from, and shares information (see here) with, the other Meta Companies. We may use the information we receive from them, and they may use the information we share with them, to help operate, provide, improve, understand, customize, support, and market our Services and their offerings, including the Meta Company Products. This includes:

- improving their services and your experiences using them, such as making suggestions for you (for example, of friends or group connections, or of interesting content), personalizing features and content, helping you complete purchases and transactions, and showing relevant offers and ads across the Meta Company Products

====

Popular theory is they can't see or store your messages, but can analyze them on the client and profile you (e.g. interested in brazil nuts)

pfortuny · 3 years ago

What does “conversation” mean in that text?

It can perfectly well mean just the audio exchanges when both parties talk.

Also: E2E does not imply necessarily that they do not know the key.

sschueller · 3 years ago

Prove it. Open source the client and open the server for 3rd party apps to use.

worker767424 · 3 years ago

It's possible that the client blindly fetches a mapping from keyword to ads, saw the keyword client-side, then requested the ad.

thescriptkiddie · 3 years ago

I hate to break it to you, but a privacy policy is not a legally binding document.

madeofpalk · 3 years ago

> Is it so hard to believe that Meta is snooping on WhatsApp conversations?

Where's the evidence? I don't know what ethos "Hacker News" is supposed to capture, but surely it's not superstition?

tedunangst · 3 years ago

Well, at least some people here are smart enough to know how to run disassemblers and packet captures. Clearly not everyone, but a few tech literate people.

KaiserPro · 3 years ago

> Is it so hard to believe that Meta is snooping on WhatsApp conversations?

for a lot of people, no

> Meta, a company of unprecedented size that was built over monetizing your private data?

one of many companies, however "meta" does have the advantage that you can opt out of them, mostly.

> A company who's been caught in plenty of scandals (like Cambridge Analytica) about this exact sort of thing (violating their users' privacy)?

CA is interesting as it started out as an academic study, which was consented fully. CA then went on to scrape people's public profiles, which often included likes, friends, etc. This combined with other opensource information allowed them to claim to have good profiles of lots of people, the PR was strong. Should FB have had such an open graph? probably not. Should they have taken the rap for everything evil on the internet since 2016? no. There are other actors who are much more predatory who we should really be questioning.

> Are you Meta employees?

I think you place far too much faith in a company that is clearly floundering. It's not like it has a master plan to invade your entire life. It's reached its peak and has not managed to find a new product, and is slowly fading.

However, as we all think we are engineers, we should really design a test! but first we need to be mindful of how people are tracked:

1) phone ID. If you are on Android, your phone is riddled with markers. On Apple, supposedly they are hidden, but I don't believe that they don't leak.

2) account: an account is your UUID that tracks what you like.

3) your IP. If you have IPv6, perhaps you are quite easy to track. Even on v4, your home IP changes irregularly and can be combined with any of the above to work out that you are the same household.

4) your browser fingerprint. (be that cookies, or some other method)

5) your social graph

method:

1) buy two new phones.

2) do not register them with wifi

3) create all new accounts for TikTok, Gmail, Instagram etc.

4) never log into anything you've created previously, or the fresh accounts on old devices.

5) message each other about something. However you need to source your ideas from something offline, like a book from a thrift store or the like. maybe an old magazine. open a page, pick the first thing your finger lands on. this will eliminate the "I heard about x" or "i'm in the mood for y"

report back.

TacticalCoder · 3 years ago

> 5) message each other about something. However you need to source your ideas from something offline, like a book from a thrift store or the like. maybe an old magazine. open a page, pick the first thing your finger lands on. this will eliminate the "I heard about x" or "i'm in the mood for y"

Wait... If WhatsApp is really end-to-end encrypted, why would any of the other steps be necessary? Dude and his wife can simply pick a page at random from a magazine in a store, never search anything online about it, start talking about it using WhatsApp as if it was something of great interest to them. If they start getting related ads, obviously something shady is going on. There's no need for new phones / new GMail accounts / etc.

xerxesaa · 3 years ago

As someone who has actually worked on end to end encryption at Meta, I can tell you I am not aware of anything where the company reads your WhatsApp messages - either in transit or on device. The company takes fairly serious measures to ensure it cannot even accidentally infer such contents.

I don't know what is happening in this specific case. Perhaps the ads came from some other similar search queries. Perhaps they came from the keyboard intercepting what was typed. Or perhaps something else that I can't think of. But I'm nearly certain it did not come from meta intercepting the contents of your messages.

It's hard to convince people at this point because many have lost trust in Meta as a company, and I understand that. But I still find it stunning that so many people are making so many false claims without any actual knowledge to back it up.

daqhris · 3 years ago

Thanks for your explanation.

I didn't have in mind the scenario of a keyboard logging user inputs besides the normal functionality of WhatsApp. I find this theory to be very plausible. Not at all happy with Meta's privacy policy, but I agree that it is worth considering other threats.

From using a VPN that logs all incoming and outgoing traffic (NetGuard) on an Android One device, I've noticed that the default Google keyboard gets in touch way too many times with some distant servers. Whereas, an open source keyboard from F-Droid, FlorisBoard, does no snooping and gets updated solely through the app store.

AJ007 · 3 years ago

The third party keyboard apps are a big question for the OP.

Another consideration, there are companies that track and sell geolocation data. It's "anonymized" but so precise you know the street address a user resides at. It is not a stretch to consider "anonymized" retargeting from keyboard inputs.

I was dismissive of it in the past, as comments voted higher here are. However I've seen enough weird ads show up within minutes of making jokes about obscure topics that I suspect there is something going on.

The piece that might be missing here is third parties collecting signals, "anonymizing" them, and then ads get re-displayed through Facebook, Google, etc. It may not be the major ad platforms doing it directly. In theory this should be harder now with the iOS tracking restrictions.

For the skeptical, consider Avast's Jumpshot. Here millions of users thought they were protecting themselves when their raw browsing stream was being sold live to third parties. They aren't the only company that has done that. https://www.theverge.com/2020/1/30/21115326/avast-jumpshot-s...

Google, Apple, or Meta retain the power to ship a tweaked binary with a compromised RNG to a subset of users if authorities order them to, be it now or in the future after a privacy policy change.

Proprietary encryption means users cannot verify or control the keys or the code that generates or uses the keys. The app can exfiltrate the keys or do any keyword processing on behalf of Meta as well, which can include well-intentioned features like forwarding plaintext messages containing certain dangerous-seeming words to authorities or theoretically trusted third party review teams. Naturally they could also return -metrics- about frequency of word use back to Meta for ad targeting as well.

I too have been a champion of encryption and privacy at past companies only to have all my work undone and watch all the data become plaintext and abused for marketing by a new acquirer.

The only way end to end encryption solutions can avoid these types of abuses is when the client software is open source and can be audited, reproducibly built, and signed by any interested volunteers from the public for accountability.

Short of that it is really not that much different than TLS with promises Meta will not peek, at least not directly, today.

beiller · 3 years ago

If they modified the RNG of person A's phone app during a forced stealth update, then shouldn't person B not be able to decrypt the message? Have you ever had an app update to WhatsApp after which you cannot communicate with other people until you are forced to update? The alternative is that there is a vast internal conspiracy at Meta that hundreds of engineers, and hundreds of ex-engineers, are somehow silent on, which would be using 2 encryption keys, one that law enforcement can read, and one that the other end of the device can read. Isn't it provable that WhatsApp, the app, is using the operating-system-level secure PRNG functions? If there was evidence of this, wouldn't it be great for a whistleblower to come out and make a killing shorting Meta's stock? Right now would be the perfect time to be kicked while they are down.

Melatonic · 3 years ago

Even with end to end encryption, couldn't the app at the end also be just aggregating the data (or even transcribing audio) to send over separately?

5d8767c68926 · 3 years ago

Meta has repeatedly demonstrated they will do whatever it takes to capture user data. Kid VPNs, in app browsers, etc. Is it any surprise that people are deeply suspicious of any coincidences that arise from using a supposedly private channel?

Given evidence at hand, it is hard to view Meta as anything but a bad actor.

hayst4ck · 3 years ago

> Perhaps the ads came from some other similar search queries. Perhaps they came from the keyboard intercepting what was typed. Or perhaps something else that I can't think of. But I'm nearly certain it did not come from meta intercepting the contents of your messages.

Isn't this kind of splitting hairs? Does it matter if text information came from a "side channel"?

It seems like the promise Facebook makes is that 'your communication using WhatsApp is secure'; that's certainly my interpretation of what "end to end encrypted" means. It is a promise of security. That means text is sacred and even text sent to giphy should be privileged from the ad machine.

The question being asked here is not "is it end to end encrypted?" It's "are my communications secure?" End to end encryption is just one element of that security.

thetrb · 3 years ago

The thing is if it's a 3rd-party Android keyboard or similar that logs your messages then there's nothing Meta can do about this.

pdntspa · 3 years ago

Are you absolutely sure that this is still the case? You say you "used" to work on it, but the modus operandi for these companies is rugpulling protections like this as soon as nobody is looking.

I feel quite confident based on first hand knowledge of code, system design, and the many, many privacy reviews we had to go through when building new features to ensure we didn't accidentally log or otherwise infer data we weren't supposed to.

WhatsApp architecture is designed with the assumption that the server could be compromised and yet such an event should not result in any message contents being revealed. Furthermore, the encryption function is designed to ratchet and rotate keys so that a leak of a key at a given point in time would not compromise past and future messages.

So yes, I have a strong sense of confidence that message contents are not exposed to Meta and, given the bar set by privacy reviews, I don't think Meta would do some backdoor workaround like scraping the contents off the device and sending an unencrypted copy. To be clear, my claims are specifically around message contents and when it comes to certain metadata (ex. the sender/receiver, the names of groups, etc) I don't recall the exact details of how they are treated.

Now, despite the fact that I've said all this and that my knowledge on the matter is fairly recent, I'm not sure I could ever say anything with absolute confidence. The code base is huge and not open source. I obviously have not seen every line of code and as you pointed out, there's always a chance some company policy changes happened without my awareness. So I would say "highly" confident but not "absolutely" confident.
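The key ratcheting described above can be illustrated with a simplified one-way hash ratchet. This is a sketch of the general technique only, not WhatsApp's actual code: the function name and the constants are chosen for the example, and a real deployment combines a chain like this with a Diffie-Hellman ratchet step, which is omitted here.

```python
import hashlib
import hmac


def ratchet(chain_key):
    """Advance the chain by one message.

    Returns (next_chain_key, message_key). Both are derived from the
    current chain key with HMAC-SHA256 under different constants, so
    neither output reveals the other or the input.
    """
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


# Each message gets its own key; old chain keys are discarded after use.
chain = b"\x00" * 32  # placeholder starting secret for the example
chain, key_for_message_1 = ratchet(chain)
chain, key_for_message_2 = ratchet(chain)
```

Because HMAC cannot be run backwards, someone who steals the current chain key cannot recover keys for earlier messages. Protecting later messages after such a leak is what the omitted Diffie-Hellman step is for, since it mixes fresh secret material into the chain.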

bartimus · 3 years ago

What about spell-check data?

FreeHugs · 3 years ago

Just because the messages might be sent end-to-end encrypted from Sue to Joe does not mean Meta cannot read them.

Meta has control over the app Sue uses. So they could send them to Meta unencrypted in addition to sending them to Joe in an encrypted fashion.

Or they just extract the relevant terms:

Sue->Joe: "Hello Joe, I'm so excited! We are going to have a baby! Let's call it Dingbert. You're not the father! Jim is. I hope you don't mind too much!".

Sue->Meta: "Sue will have a baby"

Insta->Sue: "Check out these cute baby clothes!"
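The "extract the relevant terms" idea above can be sketched in a few lines. This is purely hypothetical: nothing in this thread shows that WhatsApp does this, and `TOPIC_KEYWORDS`, `extract_topics`, and `build_side_channel_report` are made-up names for illustration. The point is only that code running on the device sees the plaintext before encryption and could report coarse topics without sending the message itself.

```python
import re

# Hypothetical keyword-to-topic table; a real system would be far
# more sophisticated than a lookup.
TOPIC_KEYWORDS = {
    "baby": "parenting",
    "pregnant": "parenting",
    "puzzle": "toys",
    "restaurant": "dining",
}


def extract_topics(plaintext: str) -> list[str]:
    """Return the sorted ad topics triggered by words in the message."""
    words = re.findall(r"[a-z']+", plaintext.lower())
    return sorted({TOPIC_KEYWORDS[w] for w in words if w in TOPIC_KEYWORDS})


def build_side_channel_report(user_id: str, plaintext: str) -> dict:
    """Build what a client could send to an ad server: topics only,
    never the message text."""
    return {"user": user_id, "topics": extract_topics(plaintext)}


report = build_side_channel_report(
    "sue", "Hello Joe, I'm so excited! We are going to have a baby!"
)
print(report)
```

The encrypted message to Joe would be untouched in this scenario, which is why inspecting network traffic for plaintext would not be enough to rule it out.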

rreyes1979 · 3 years ago

More so, my wife sent me a picture of my daughter working on a puzzle. Less than 24 hours later, her Instagram was showing ads for a store that was selling the same type of puzzle as the one my daughter was playing with. So it's not just terms but images too.

planb · 3 years ago

She probably gave Instagram access to her photo library (not unreasonable for a photo sharing app). That means the Instagram app can scan her latest pictures in the background when it's opened. I think it's more likely that the data was leaked this way.

nerdponx · 3 years ago

Back when deep learning was first hitting "mainstream" for object recognition in images, I recall reading that Facebook was using it to look for brand logos and other signs of using a particular product, in your uploaded photos.

Turns out they were also building a database of everyone's face so they could build shadow profiles...

whywhywhywhy · 3 years ago

How did she buy the puzzle to begin with?

lm28469 · 3 years ago

> my wife sent me a picture of my daughter working on a puzzle.

> her Instagram was showing ads for a store that was selling the same type of puzzle

How did she take the pic?

netsharc · 3 years ago

I have a suspicion as well that this is what they're doing: before the message is encrypted and sent, the app (on your phone) does analysis and picks out keywords relevant for advertising. So they can claim and be technically correct that they are not reading your messages. Although if their algorithm is doing it on your phone, is it... reading?

Or they can say, technically it wasn't a message before it was sent. The dictionary definition[1] even mentions "send".

[1] https://www.oxfordlearnersdictionaries.com/definition/englis...

This is definitely the most likely scenario in my opinion

jtbayly · 3 years ago

> Just because the messages might be sent end-to-end encrypted from Sue to Joe does not mean Meta cannot read them.

No, that's precisely what End-to-End encryption means.

piva00 · 3 years ago

It means that for strictly one receiver end-to-end encryption. When it's touted as a feature without explicitly stating that "all messages are sent only e2e encrypted and only to your receiver" we can't assume only the receiver is getting the message, it might be E2E encrypted for all traffic, between people using their own keys and nothing stops Meta from sending a different encrypted payload to their own servers with a key they have access to.

Facebook loves to use newspeak, wouldn't surprise me if they applied newspeak to what "end-to-end encryption" means.

neilalexander · 3 years ago

Meta own the proprietary code running at either end of the encrypted pipe. Of course they can.

m0RRSIYB0Zq8MgL · 3 years ago

End-to-End means that it can't be read in the middle. It does not mean it can't be read by proprietary clients on either end.

philsnow · 3 years ago

Until there are cybernetic implants, the "ends" are the app running on your phones, which they control.

The quandary of what one allows to run on those implants sounds like a chilling sci-fi novel (chilling not because "but FAANG could read your thoughts!" but because people would absolutely still get them installed).

rr888 · 3 years ago

End-to-End is about the networking, not the end points.

https://en.wikipedia.org/wiki/End-to-end_encryption#Endpoint...

spoiler · 3 years ago

So you're nit-picking over the phrasing of the sentence, but should instead focus on the spirit/meaning behind it.

It's illustrated in their example below: if you say you're having a baby, Meta can send some type of distilled ad keywords to its servers (e.g. `[mother, baby]` if it knows the user is a woman based on their name/profile, but probably more sophisticated than that). The message you sent is still technically end-to-end encrypted, though.

amelius · 3 years ago

Google can in theory read what is on your screen (assuming you use Android) regardless what app with what encryption you use.

tom-thistime · 3 years ago

Oh, come on. It's called "end to end" but it isn't. Meta has to read them to provide the service. This is not a new revelation.

I think they are extracting terms. Some of the messages generated ads that were related to a term but not really about the conversation.

rhn_mk1 · 3 years ago

I think it does actually mean that no one except them can read them. If someone else can, then by definition it's not end-to-end encryption.

From https://www.definitions.net/definition/End-To-End%20Encrypti...

> End-to-end encryption (E2EE) is a system of communication where only the communicating users can read the messages.

viraptor · 3 years ago

The conversations being e2ee does not prevent the app itself from acting on contents. By definition the app needs to know the contents to display them, but it can also update your ad profile. It doesn't even need to send the whole message to Meta, just the keywords triggered, or a preprocessed vector defining your interests.

E2ee means only the messages themselves can't be intercepted and read. But if anyone can actually prove fb acting on message contents, I suspect the EU banhammer would be interested.

xuki · 3 years ago

Whatsapp can't read the message on their servers but they can read it at clients, otherwise they cannot display the messages for users. Likewise, Apple/Google can read them too because they have to in order to render the texts.

kadotus · 3 years ago

But the problem arises, I think, when they say they can't read them: "WhatsApp's end-to-end encryption is used when you chat with another person using WhatsApp Messenger. End-to-end encryption ensures only you and the person you're communicating with can read or listen to what is sent, and nobody in between, not even WhatsApp." https://faq.whatsapp.com/general/security-and-privacy/end-to...

kome · 3 years ago

so what's the point? just inconvenience. better to use telegram at this point.

codethief · 3 years ago

…and have no encryption at all? (Unless you manually enable it for a given conversation.)

JustSomeNobody · 3 years ago

My guess is they encrypt the message twice, append it, and split it off at their servers. To anyone observing traffic, it looks like normal encrypted traffic AND they can still, if needed, show that everyone has their own key and can encrypt/decrypt their own messages. I don't think they would be brazen enough to send it to themselves in plain text.

In principle yes, in practice no, as this is a statement from the WhatsApp website:

> We limit the information we share with Meta in important ways. For example, we will always protect your personal conversations with end-to-end encryption, so that neither WhatsApp nor Meta can see these private messages.

hapless · 3 years ago

That statement was worded carefully

They are saying they don't store or forward your message text, not that your phone doesn't send them topics of interest.

jlarocco · 3 years ago

Do you really trust TOS like that, though?

Assuming they're not blatantly violating the policy (which I think they've done before), it's pretty easy to weasel out of that statement by only sharing keywords from the conversation, or only sharing the info with advertisers (but not WhatsApp and Meta), or redefining what a "personal conversation" is, or carefully redefining what "end-to-end encryption" means, or ...

There's no transparency, a huge power imbalance, and terrific pressure on WhatsApp/Meta to monetize as much as possible.

Yup, I think it's just some form of analytics that profiles the user.

I've always suspected them of recording conversations, which is also why I think Android has gradually tightened permissions and visibility around speech to text/microphone/camera use.

baxtr · 3 years ago

That’s of course because WhatsApp's privacy policy isn’t applicable in the Metaverse.

Looking at this from a reality perspective is not very helpful.

pb7 · 3 years ago

Meta->Joe: “Focus on yourself bro”

planede · 3 years ago

Instead of speculating whether something like this could or could not be true, there should be a way to test it scientifically.

* Have pairs of mobile devices set up from factory configuration with WhatsApp and Instagram installed.

* Simulate conversations between each pair from select topics.

* Collect all ads from Instagram after the WhatsApp conversations from each device.

* Categorize ads to broad topics.

* Search for significant bias.

There are probably a lot of factors I'm missing here, and it's probably easy to introduce bias when there is none there. For example it's probably a good idea that a different person categorizes the ads into topics than the person handling the specific phone, otherwise the person might bias the categorization of the ads based on the conversation they had on WhatsApp beforehand. The person categorizing the ads should have no knowledge of the WhatsApp conversation that happened on the phone. The devices should probably be on different networks. There is probably a lot that I am missing here.
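The "search for significant bias" step could be done with a permutation test. The sketch below assumes each device yields one number (for example, the count of ads a blind rater tagged as matching the topic) and compares devices that discussed the topic against control devices. `permutation_test` and the sample numbers are hypothetical, not data from any real experiment.

```python
import random


def permutation_test(treated, control, rounds=10000, seed=0):
    """One-sided permutation test on the difference in means.

    Returns (observed_difference, p_value). The p-value estimates how
    often a difference at least this large appears when the group
    labels are shuffled at random.
    """
    rng = random.Random(seed)
    n = len(treated)
    observed = sum(treated) / n - sum(control) / len(control)
    pooled = list(treated) + list(control)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = sum(pooled[:n]) / n - sum(pooled[n:]) / (len(pooled) - n)
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (rounds + 1)


# Made-up counts of topic-matching ads per device.
discussed_topic = [4, 6, 5, 7, 3, 6]
never_discussed = [3, 5, 4, 6, 4, 5]
diff, p = permutation_test(discussed_topic, never_discussed)
print(f"difference in means: {diff:.2f}, p-value: {p:.3f}")
```

A permutation test makes few assumptions about how ad counts are distributed, which suits a small experiment like this; it does nothing about the bias sources listed above, which still have to be handled in the experimental design.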

snowmizuh-04 · 3 years ago

The scarier thing to me is when ads match _conversations_ I have with my wife. I told her about this story this morning, and she reminded me about a conversation about stem cell research we had yesterday. I said something along the lines of 'I hope there is a breakthrough soon on regenerating the islets of Langerhans in the pancreas to treat diabetes.' Sure enough, she noticed an article in her Google News feed later that day related to diabetes.

Once or twice may be a coincidence. Maybe. But this happens regularly and with startling specificity.

What could be listening? I'm a technologist like the rest of you. I know apps need permissions to the mic, I know it's not easy for an app to stay in the foreground. Is it my Roku? My smart TV?

Makes one want to go full Richard Stallman.

p.s. my wife just said it would be really funny if Google News showed an article now on people worrying about their tech listening to their conversations. I'll post an update if that happens...

spmurrayzzz · 3 years ago

This may or may not apply to the anecdote you shared about your wife, but since these apps know your relative proximal location to your weak/strong social connections, they also can know what your friends may search for, and this is an often-used flavor of targeting.

e.g. You and a bunch of friends go to dinner and have a conversation about <topic x>, at least one of those friends googles something about that topic. You later see an ad related to <topic x> because you were targeted based on the search your friend did while they were near you.

If your wife potentially did anything digitally, related to the diabetes topic, it's likely that you were targeted based on that.

Again, no idea what happened resultant to the story you shared, but whenever this sort of thing happens to me I try to appeal to Occam's Razor based on how much I know about how this tech works under the hood.

I thought of this. Just to be clear, neither she nor I did any searching on diabetes or anything like that. I can understand this being driven by search, chats, emails, etc. (basically any type of keyboard input). But here, the mode of communication was voice-only.

omniglottal · 3 years ago

Occam's razor, in this situation, is that the corporate entity who profits from collected data while deceiving its product into perceiving itself as a "customer" might, in fact, be... collecting data from its product to sell to its customers.

Swizec · 3 years ago

As creepy as this is whenever it happens, I remain convinced it’s observation bias. Because of how often it doesn’t happen.

Quoth Richard Feynman:

> “You know, the most amazing thing happened to me tonight... I saw a car with the license plate ARW 357. Can you imagine? Of all the millions of license plates in the state, what was the chance that I would see that particular one tonight? Amazing!”

You were gonna see ads, or news stories, or whatever. You happened to see this one. And it happened to connect to a conversation you were having. Amazing! What magic!?

Well, how many other ads or other times didn’t it connect to a conversation and you just don’t remember because it didn’t feel special? Probably many more times than it did work.

Lately I’ve been trying to find alternate explanations. Hm we were talking about going to a restaurant tomorrow. Why is my girlfriend getting all these restaurant ads on instagram all of a sudden? Oh right, because it’s Thursday, we live in a city, I searched Google for “Good Friday restaurants for a date” during our conversation, and we live on the same IP.
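The base-rate argument can be put in numbers. As a rough sketch, assume every ad you see has the same small, independent chance of happening to match something you recently talked about; both inputs below are illustrative guesses, and `chance_of_coincidence` is a hypothetical helper.

```python
def chance_of_coincidence(ads_seen: int, match_probability: float) -> float:
    """Probability that at least one of `ads_seen` ads matches a recent
    conversation purely by chance, assuming independent ads."""
    return 1 - (1 - match_probability) ** ads_seen


# Illustrative guess: 1000 ads in a month, each with a 1-in-1000 chance
# of lining up with a recent conversation.
print(f"{chance_of_coincidence(1000, 0.001):.0%}")
```

Under those assumed numbers the chance of at least one "creepy" match in a month is roughly 63%, with no eavesdropping involved. Real ads are targeted rather than independent, so this only shows that coincidences alone are not evidence.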

I love the Feynman reference. Thank you. And this may be right.

However, I also am reminded of Nassim Taleb's Fat Tony, who when asked 'what are the odds of flipping a coin heads 10 times in a row?' responds with 'fuggehdabowdit. it's a hustle.' There's a scientific response, which can be very naive in some ways, and there's the Fat Tony street analysis. As I've gotten older, I tend to value the latter.

I'm sorry, but the Feynman reference may be a little misleading. With respect, it is difficult to imagine Richard Feynman putting "I sent the information to the recipient via the internet, but I hope they didn't read it" in the same category as "no way they could know; must be telepathy."

pclmulqdq · 3 years ago

Google probably gets higher CTR by showing you many different ads that are oddly specific rather than ads that are more general. Once in a while, they will get one right, and it will feel creepy.

unity1001 · 3 years ago

> Because of how often it doesn’t happen.

So sweet boxes from Kazakhstan suddenly appearing in your ads is due to observation bias...

prego_xo · 3 years ago

I'm seeing a lot of people trying to rationalize and excuse this behavior from Google, but man is it a hard sell on me.

My mother recently remarked in a unique conversation that we have had a box of Golden Grahams cereal for a year and should find a recipe to use it up. She opened her phone to search, and lo and behold, the top recommendation after only two letters, R and e, was "Golden Grahams recipes". Not only had that never been a topic of conversation or search beforehand, nor did she have Google open on her phone, but you may have noticed that "Golden Grahams recipes" doesn't even start with R or e. This sparked a long conversation about how privacy really is something worth fighting for.

My only guess is that Google has the ability to listen in because we use Android phones.

mariojv · 3 years ago

It is totally weird when this kind of thing happens. I attribute it to my husband Googling and clicking on links related to our conversation with IP based tracking. I've found Instagram showing me things more related to my husband's interests if I haven't used it in a while.

bombcar · 3 years ago

It's possible some of it is simply Baader Meinhof https://www.healthline.com/health/baader-meinhof-phenomenon but it could be something deeper (or something forgotten). If my wife asks me about something, I often search or google it almost without thinking, even if I never use any results.

It should be possible to run scientifically sound experiments to show if it's happening.

snovv_crash · 3 years ago

It just happened right here on HN... clearly HN is spying on your conversations /s

I would set up the free version of NextDNS, add a ton of blockers, link your devices correctly to it, and then see if it still occurs. NextDNS is not going to block a lot of stuff, but if there is a significant change then at least you have an easy way to show it. Setup is relatively quick and easy.

Maro · 3 years ago

Some options:

1. Nobody is reading your WA messages, the same topics can be learned from your browsing activity or other msgs, eg. by reading your sms texts.

2. Meta is reading your messages directly in-transit, server-side.

3. Meta is not reading your messages server-side, but the Meta apps extract keywords from your conversations and request relevant ads from the ad servers.

4. Another non-Meta app is doing the above.

5...

GuB-42 · 3 years ago

If 2 is true, then it is not end-to-end encrypted, and I don't think that WhatsApp is lying. They have ways of doing their things without lying, so I don't expect 2 to be true.

I think that 1 is the most plausible, however the original post is about "topics they never talk about", so assuming that WhatsApp is the only channel and they don't leak data in other ways (and there are many other ways to leak data), then 1 becomes unlikely.

3 is the most compatible. All the targeting can be done locally, so no end-to-end unencrypted message leaves the app. The app then sends your topics of interests to Meta.

4 again assuming WhatsApp is the only channel, then there is probably some malware somewhere, and it is unlikely that Meta accepts illegally collected data (they can do it legally, better, and with less potential trouble). There are however a few legitimate apps that can do the above. I am thinking about things like predictive keyboards, accessibility apps (screen readers, ...), backup apps (end-to-end encryption is about transmission, not storage), and the OS itself. I don't think Meta controls any of these, and I don't think they would buy data from them (Google and Apple are competitors after all).

So I would go for an accidental leak (case 1). For example, for the experiment to be meaningful, you shouldn't tell anyone about the test topic before you receive the ads. Or for case 3, with the WhatsApp app hinting to Meta about your topics of interest.

mid-kid · 3 years ago

Another thing that would make 1 happen, even if they think they're not leaking information over different channels, is the software keyboard. Gboard is Google's, and likely has some data collection in one way or another. Similarly, there are a lot of Google-related services running with root privileges on stock Android phones that could easily snoop on data from various apps. This effect is worsened by other Android OEMs, like Xiaomi or maybe even Samsung, who ship their own invasive services on top.

Disclaimer: I worked at FB, but not on Whatsapp or ads.

I agree, I'm pretty sure 2. is not the case; I just listed it as a theoretical possibility. Despite all the bad press and problems, FB has very (very) high integrity and standards, at least the parts I saw.

elondaits · 3 years ago

Another option would be that Meta created a small model that could be run client-side and picked the right selection of ads to show elsewhere without even exfiltrating keywords.

Well, my wife sent me a picture of my daughter working on a puzzle. Less than 24 hours later, her Instagram was showing ads for a store that was selling the same type of puzzle as the one my daughter was playing with. So it's not just terms but images too.

Yes, this is a bit similar to my 3. option, but more sophisticated..

Whatsapp FAQ:

WhatsApp's end-to-end encryption is used when you chat with another person using WhatsApp Messenger. End-to-end encryption ensures only you and the person you're communicating with can read or listen to what is sent, and nobody in between, not even WhatsApp. This is because with end-to-end encryption, your messages are secured with a lock, and only the recipient and you have the special key needed to unlock and read them. All of this happens automatically: no need to turn on any special settings to secure your messages.

https://faq.whatsapp.com/791574747982248

InsomniacL · 3 years ago

> End-to-end encryption ensures only you and the person you're communicating with can read or listen to what is sent, and nobody in between, not even WhatsApp.

This does not exclude an algorithm running on the sender/recipient's app from scanning the content and sending suggestions to ad servers.

Mordisquitos · 3 years ago

I thought that too, but if that is the case then it should be relatively easy to find this hidden functionality by decompiling the APK and exploring it.

spinningslate · 3 years ago

> only you and the person you're communicating with can read or listen to what is sent, and nobody in between, not even WhatsApp

I guess the key lies in "what is sent" in the above statement. The casual reader might reasonably interpret as "no-one except the intended recipient can see _what I type_". But it doesn't say that. It only covers what gets _sent_. It doesn't say anything about what happens to the content outside specifically _sending_ it to the other party(ies).

Even if that interpretation was correct in a broader sense, which I don't believe it is, certainly in this context of e2ee it isn't correct.

The more obvious explanation is that before or after the e2ee (i.e within the app itself), an algorithm scans the content, categorizes it and sends this to Meta/Facebook.

In this scenario, *Nobody* has read the content other than the person you're communicating with.

beardyw · 3 years ago

Maybe Meta are not as trustworthy as I imagined.

andrewinardeer · 3 years ago

So lying by omission?

leobg · 3 years ago

Can I not train a text classifier on encrypted text?

Basically, let the AI figure out what ads get clicked the most for a given string of encrypted 24h window of chat history. Eventually, the AI is going to hit on its “Rosetta Stone”, even without ever formally decrypting the text, much less any human reading it.

With millions of conversations happening on WhatsApp, why shouldn’t that be possible?

And it’s not even a breach, technically, because nothing ever got decrypted, and the similarity vectors generated by the AI have, per se, nothing to do with the content of the conversation or the individual that sent them. Run the same training algorithm again and they’d look completely different! Hence they can’t possibly be “personal data” in the sense of the law.

MAGZine · 3 years ago

No. Encryption means the data is scrambled. Essentially unrecognizable from noise, save perhaps for some headers.

If you can discern meaning from noise, then your theory would work. But discerning meaning from random noise is obviously impossible (i.e. what if there is no meaning?).

If you leak more information than you say you do, then the encryption is worthless. Harmful, even, because you think you have protection when you do not.

mminer237 · 3 years ago

It doesn't matter how many conversations are going. With private key encryption, what a phrase encrypts to for one person would be different than the next. It would have to be trained solely on your conversation. Encryption is also dependent on all the text before it as well as text in the same block, so it would have to be the beginning of the message with no metadata to throw off alignment saying the 16-byte phrase so many times that it could pick up a difference. I'm pretty confident it's impossible to get anything useful out of that.
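A toy illustration of why the same phrase looks unrelated from one user to the next. This is not the cipher WhatsApp uses and not something to use for real data: `keystream` and `toy_encrypt` are hypothetical helpers that build a stream cipher from SHA-256 in counter mode, only so that the example runs with the standard library. Real systems use vetted constructions such as AES-GCM or ChaCha20-Poly1305.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Generate `length` pseudorandom bytes from key, nonce and a counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        out += block
        counter += 1
    return bytes(out[:length])


def toy_encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with the keystream. Applying it twice decrypts."""
    stream = keystream(key, nonce, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


message = b"we are going to have a baby"
nonce = b"n" * 12
sue_ciphertext = toy_encrypt(b"a" * 32, nonce, message)
joe_ciphertext = toy_encrypt(b"b" * 32, nonce, message)

# Same plaintext, different keys: the outputs share no visible structure.
print(sue_ciphertext.hex())
print(joe_ciphertext.hex())
```

A classifier trained on ciphertext would see output like this for every user and every message, so there is no stable pattern tying a given byte string to a topic. What encryption of this kind does not hide is metadata such as message length and timing.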

roywiggins · 3 years ago

If any algorithm can get even one bit of data about the plaintext then the encryption is broken by definition.