This is a pretty clever combination of feature misuse, although I think I'd rate the overall security impact fairly low, because the best-case scenario is that you cause the recipient to open a link in their browser. That can be useful in some cases, but unless the attacker is a police force, intelligence agency, or similar, there would usually need to be some kind of follow-up attack, e.g. exploiting unpatched software on the device.
In the interest of technical accuracy, I don't think I'd label this one "clickjacking" specifically. "Clickjacking" usually refers to a very specific technique that involves invisible HTML frames overlaid on top of other content.[1][2]
Yeah, I wouldn't call this clickjacking, because real clickjacking is a technique that makes a victim perform some account action without knowing it. Simply opening an unintended link isn't as bad as that.
If that link shows a login screen identical to the Instagram one, what percentage of users do you think will double-check the URL a second time, after mistakenly confirming it in the WhatsApp preview?
Everyone is fixating on the Unicode RTL character, but Meta should have at least acknowledged the issue that the preview URL can differ from the message URL. I understand that this is probably to unfurl shortened URLs, but there has to be some clever workaround that Meta and WhatsApp can implement.
No. End-to-end encryption means that the preview has to be generated either by the sender or the receiver. Having the receiver generate the preview would leak his IP. They have to remove the preview feature.
Yes, the preview is generated by the sender to avoid leaking the receiver's address to a sender-controlled host, but what I'm saying is that WA should enforce on the receiver side that both point to the same URL. As said initially, they are most certainly doing it this way to unfurl URL shorteners, which would otherwise be the easiest way to phish people.
At the same time it's also noteworthy that the preview can fail to be generated on the sender side and the message will be sent out anyway, so yeah, I agree with you that they could just remove the preview feature. Probably in their opinion the trade-offs are worth it, I guess.
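To make the receiver-side check suggested above concrete, here is a minimal sketch. It assumes the receiving client has both the URL from the message text and the URL the sender attached to the preview; the function name `same_destination` is hypothetical and nothing here reflects how WhatsApp actually works.

```python
from urllib.parse import urlsplit


def same_destination(message_url: str, preview_url: str) -> bool:
    """Return True only if both URLs share the same scheme and host.

    Hypothetical receiver-side check: runs offline, so it does not
    leak the receiver's IP address to any server.
    """
    try:
        msg = urlsplit(message_url)
        prev = urlsplit(preview_url)
    except ValueError:
        # Malformed URL (e.g. broken IPv6 literal): treat as a mismatch.
        return False
    if not msg.hostname or not prev.hostname:
        return False
    # urlsplit already lowercases scheme and hostname.
    return (msg.scheme, msg.hostname) == (prev.scheme, prev.hostname)


if __name__ == "__main__":
    print(same_destination("https://instagram.com/p/1", "https://instagram.com/"))
    print(same_destination("https://instagram.com/p/1", "https://evil.example/"))
```

A strict host comparison like this would also hide previews for shortened links, since the shortener's host never matches the unfurled one, which is probably the trade-off being weighed.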
Clickjacking is where you perform a click on some element, but actually the click event is caught by a different element, typically laid transparently above the thing you think you're clicking on. The attacker can detect the click, even though their own page never receives the event, by giving their visible underlying layer focus and watching for the onblur event to fire.
What OP found is cool. I've also used RTL characters to make screensaver files (so just normal executables with a different extension in Windows) that looked like Word documents. I forgot why, maybe to prank a friend or teacher or so. OP has gone one step further and found a way to alter the display on another system. I'm not sure what this is called, though it's not clickjacking (the Wikipedia page OP links to in the article lede confirms that) because the user doesn't mistake which element they're clicking on, they mistake where a link will lead. I've also never seen clickjacking being abused in practice, but what OP found I can imagine will be abused!
Honestly I've long given up on users being able to tell which domain they'll end up on when clicking a link. A majority doesn't understand the concept anyway, and the remainder can't tell. Those who think they can tell (such as yours truly) end up getting frustrated when all links go to sendgrid.tld/j3ovi3bfogobbledypoop93jnri2o. We're training people to click random tracking obfuscated fishy looking garbage every day and nobody bats an eye at it
This specific example is poor sanitization because it actively misleads the users who try to understand what they’re clicking on.
Your example of the generic confusion around host names and domains is a harder problem but people have tried to mitigate it somewhat by doing things like highlighting the domain name portion. Like most phishing techniques, passkeys will end it eventually.
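As a rough illustration of the "highlight the domain name portion" mitigation mentioned above, the sketch below brackets the part of the host a user should actually read. It naively assumes the registrable domain is the last two labels of the hostname, which is wrong for suffixes like `co.uk`; a real implementation would consult the Public Suffix List. The function name `highlight_domain` is made up for this example.

```python
from urllib.parse import urlsplit


def highlight_domain(url: str) -> str:
    """Return the hostname with the (naively guessed) registrable domain in brackets."""
    host = urlsplit(url).hostname or ""
    if not host:
        return ""
    labels = host.split(".")
    if len(labels) <= 2:
        return f"[{host}]"
    # Assumption: the last two labels form the registrable domain.
    prefix = ".".join(labels[:-2])
    domain = ".".join(labels[-2:])
    return f"{prefix}.[{domain}]"


if __name__ == "__main__":
    print(highlight_domain("https://visa.securesite.com/login"))  # visa.[securesite.com]
```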
RTL has been a huge source of security vulnerabilities for its entire existence. Why don't operating systems have a setting to disable all RTL, so that people who don't know any such languages aren't unnecessarily exposed to the dangers with zero benefit?
Operating systems notwithstanding, there should definitely be such an option for every OS widget that displays text (including Android TextView). And it should default to disabling all BiDi backdoors unless the developer has explicitly vetted a specific text span to enable them.
Making the entire text rendering stack vulnerable by default under the pretext of catering to less than 1% of the world population is ridiculous.
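For what an application-level version of that default could look like, here is a minimal sketch that detects and strips Unicode bidirectional control characters before text is displayed. The helper names are hypothetical, and this only removes the explicit control characters; it does not touch actual right-to-left letters (Arabic, Hebrew, and so on), which still render right-to-left.

```python
# Explicit bidirectional formatting characters defined by Unicode:
# ALM, LRM, RLM, LRE, RLE, PDF, LRO, RLO, LRI, RLI, FSI, PDI.
BIDI_CONTROLS = frozenset(
    "\u061c\u200e\u200f"
    "\u202a\u202b\u202c\u202d\u202e"
    "\u2066\u2067\u2068\u2069"
)


def find_bidi_controls(text: str) -> list[tuple[int, str]]:
    """Return (index, code point) pairs for every bidi control in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch in BIDI_CONTROLS
    ]


def strip_bidi_controls(text: str) -> str:
    """Return text with all explicit bidi control characters removed."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


if __name__ == "__main__":
    # U+202E (right-to-left override) hidden inside a URL-like string.
    suspicious = "https://example.com/\u202emoc.margatsni"
    print(find_bidi_controls(suspicious))
    print(strip_bidi_controls(suspicious))
```

Whether to strip silently, reject the message, or render the characters visibly is a design choice; for URLs specifically, rejecting or flagging seems safer than silently rewriting.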
I reported a similar issue to Google early this year and they declined the submission because it "can only result from social engineering" and "we think that addressing it would not make our users significantly less vulnerable".
I won't mention the details here but Google Search sometimes rewrites URLs in such a way that an attacker can spoof the actual URL.
My advice is to never trust URLs displayed by websites and apps.
I expect they probably didn't make clear exactly what they wanted fixed (blacklisting the RTL character) and Meta thought they wanted all misleading URLs fixed which is not really possible.
It's a security tradeoff. Given that you want to provide a link preview (which is a nice feature) you have a few options:
1. Generate on the sender side. Downside: Can be spoofed.
2. Generate on the receiver side. Downside: Leaks receiver IP.
3. Generate via third party: Downside: Leaks information to the third party.
Overall I think that 1 is the best option. The sender can "spoof" all of their messages anyways; including the preview as part of the message is really no different. The problem here is that it isn't obvious that this content comes from the sender. It is displayed as a separate bubble, and I would bet that 99% of users don't realize that the content is from the sender.
Plus the URL is all that really matters anyways. If you are clicking on an attacker-controlled URL they can make the preview display anything they want. So you gain very little by forcing the preview to be "authentic".
Option 3 can be good as well, especially if implemented with something like double-blinding: you connect to one party, which forwards you to a second party. This way the first sees your IP and the second sees the destination IP, but neither sees both (unless they collude). However, that is a lot of infrastructure to set up and maintain for relatively little benefit.
Another comment picked what I think is the best option: the sender generates it, and receiver verifies it, but only on click. That way the receiver's already going to leak their IP, so WhatsApp can verify before opening up the web page.
[1] https://owasp.org/www-community/attacks/Clickjacking
[2] https://portswigger.net/web-security/clickjacking
They can just disable it for contacts that you don't have on your contact list.
Just this simple visa.securesite.com fools a lot of people. And I don’t see a good solution in the near future.
This assumes passkeys will be widely adopted. And that users will know to stop wherever the passkey doesn't work. I have doubts about both.
I think I saw something like this a while ago, with some fake KeePass website maybe.
This is an even bigger issue with the UI design: why should poor users have to compare links and previews to be safe?