kevindong · 4 years ago
Keep in mind that a substantial portion of users now use ad blockers such that a lot of URLs used for analytics like this are blocked.

Consequently, you can't actually expect to capture 100% of these analytics events nor even expect the percentage captured to stay the same over time since the filter lists are very regularly updated and users enable/disable different ad blockers over time.

More broadly speaking, once you have sent a webpage to the user, you should not expect anything from the user's browser. They may or may not allow whatever arbitrary JS you have on the page. They may even intentionally give you bad data (e.g. hijack the payload and send you deliberately malformed data).

edit: even more broadly speaking, there are additional reasons why you can't expect to receive these kinds of callbacks: consider what happens if a user loses connectivity between loading the page and navigating away (e.g. their phone loses service because they stepped into an elevator).
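
For a concrete illustration: even the boolean you get back from navigator.sendBeacon only tells you the browser queued the request, not that it was (or ever will be) delivered. A rough sketch, with a hypothetical /analytics/exit endpoint:

    // "queued" === true only means the browser accepted the data for transfer.
    // A filter list, a dead connection, or a killed tab can still stop it from
    // ever reaching the server - the page has no way to find out.
    const queued = navigator.sendBeacon(
      '/analytics/exit',
      JSON.stringify({ page: location.pathname, ts: Date.now() })
    );
    console.log('beacon queued?', queued); // says nothing about delivery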

capableweb · 4 years ago
> Keep in mind that a substantial portion of users now use ad blockers such that a lot of URLs used for analytics like this are blocked.

How sure are we about this? I'm pretty sure it depends on which market specifically you're in, and the data I'm about to show is of course not perfect, but it seems that not that many users actually use ad blockers today. That said, I don't know a single developer who doesn't, and in some web applications I run that are aimed at developers, the majority of users do use ad blockers.

Chrome is assumed to be the most popular browser (by a large margin last time I checked, so I won't bother to check again) and a quick search puts the user base around 2-3 billion users. Searching for "adblock" in the Chrome Web Store (https://chrome.google.com/webstore/search/adblock?hl=en&_cat...) shows that the most popular adblocker has a user base of ~300,000 users.

That would mean 0.015% to 0.01% of Chrome users have the "AdBlock" extension installed. Not that substantial.

If someone has some more accurate numbers than my slightly-educated guess, I'd be happy to be proven wrong.

Edit: The above user base of the adblock extension is wrong. As Jabbles pointed out, I was seeing the number of reviews, not number of users.

So instead: the page lists "10,000,000+ users", so we can assume the true number is above that, but below 100,000,000.

That would put the share of Chrome users using the "AdBlock" extension between 0.3% and 5%, more or less. Closer to "substantial", but I'm not sure whether it would affect businesses' choices regarding ads/tracking or not.

zamadatix · 4 years ago
> So instead: the page lists "10,000,000+ users", so we can assume the true number is above that, but below 100,000,000

Can we? I can't seem to find anything that indicates if/when the next number bucket kicks in, just a lot of big-name extensions sitting at "10,000,000+". Back in 2016 ABP had a post about their extension alone having 100+ million active users https://blog.adblockplus.org/blog/100-million-users-100-mill... and that's ignoring the >50% of Chrome users on mobile, where blocking has to happen without extensions.

Going with someone else's numbers instead of trying to build my own, I'm finding anything from 10% to near 50%, with most estimates in the range of ~25%.

varenc · 4 years ago
I use an adblocker, and lots of filtering lists, but most of the `navigator.sendBeacon` requests I was seeing weren't being blocked. Sometimes they were when the URL matched a pattern, but often they weren't. Which makes sense since they aren't ads and by design have nearly zero effect on the user experience.

I still wanted to block them though... so I started killing all `navigator.sendBeacon` requests by replacing it with a no-op function on page load. [0]

I have the no-op function log the results to console and it's fascinating seeing all the sites attempting to use it. Some pages on Amazon will fire a sendBeacon request every second or so.

[0] With this uBlock Origin user script: https://gist.github.com/varenc/7c0c17eea480a03d7c8844d81e608...
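
Roughly, the override amounts to something like this (a simplified sketch, not the exact scriptlet from the gist; uBlock Origin takes care of injecting it before the page's own scripts run):

    // swallow every beacon, but log it so you can see who's trying
    navigator.sendBeacon = function (url, data) {
      console.log('[blocked sendBeacon]', url, data);
      return true; // pretend the beacon was queued so pages don't fall back to other methods
    };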

kbelder · 4 years ago
From tests I've run, about 8-12% of our visitors have some sort of tracking, analytics, or JavaScript disabling or blocking. This is on an e-commerce site aimed at non-technical users. I'd expect a tech-savvy browsing audience to include 20% or more visitors with blocking.
Jabbles · 4 years ago
You're looking at the number of reviews; the number of downloads is significantly higher.
dfas23 · 4 years ago
Not always - more sophisticated analytics will proxy these requests through the website's own domain.
ForHackernews · 4 years ago
> they may even intentionally give you bad data (e.g. hijack the payload to give you intentionally malformed data).

Interesting. Are there tools we can install to send malicious payloads to surveillance companies?

HWR_14 · 4 years ago
There are plugins that do this, but I think trying to choke off the signal is more effective than just adding noise.

Deleted Comment

userbinator · 4 years ago
For those wondering: despite what the site's domain seems to imply, this does not use CSS.

On the other hand, the more I read about stuff like this, the more disenchanted I am with the "modern web". When something as seemingly innocent as a link can "stealthily" do something else when it's clicked, but otherwise looks exactly the same as a benign one (and moreover, behaves entirely as expected with JS off), it really brings into question whose interests such features are designed to serve.

capableweb · 4 years ago
Every time I hear some argument against "the modern web", I always try to imagine if you could apply the same argument to anything in computing.

And for this one, you could. With native applications, too, it might look like "the form gets submitted when you click the submit button" while something else entirely happens, because it's a general computing environment, just like the web.

So in reality, it seems you're just against "modern" computing in general, not specifically the web. Because you can have that behavior anywhere, not just in the web platform.

_moof · 4 years ago
Take "the modern web" as a cultural descriptor then, not a technical one. There has been a huge shift in the last decade-plus. Software does not serve the user's interest anymore, and it sucks. Yes, this sort of thing has been technically possible across a range of platforms for a while. But the difference is that people weren't doing this kind of garbage before, and now it's everywhere.

Casual gaming is an excellent example of how this has changed. The whole purpose of the software is different. It used to be that you would buy a game—a piece of software purpose-built to entertain you. Now you are no longer buying a game; you are buying a value extraction device that uses a game as bait. Software just isn't built for users anymore.

feanaro · 4 years ago
Well, there is a case to be made that the web should not necessarily be a general computing platform. Or at least that there should be a separate mode in which it assuredly is not, because it is a hypertext platform instead. This use case has not disappeared just because the web grew additional capabilities.

But it can feel as if the capability to act as a hypertext platform has disappeared, because you now have to worry about it doing things that are not aligned with this purpose. Remember, there is power in limitations, as type theory teaches us.

And even for use cases where more than simple hypertext is needed, there is still a vast, nuanced chasm between that and full-blown Turing completeness.

switchbak · 4 years ago
I'm also against native clients that are inherently user-hostile. Does that mean I'm against modern computing?

I'm also against people who work against my interests using only pen and paper. Perhaps this issue is more about privacy and ethics than about some judgement of being for or against a "modern" notion of computing.

Personally I'd prefer platforms that have privacy (and security) guarantees built in. Given the strong conflict of interests at play, I don't think we can expect that from any of the big players (if it's even reasonable at scale).

tsimionescu · 4 years ago
That's exactly it. The web was supposed to solve a problem of document distribution. Documents are not general computing environments, and that's exactly what you want. I want to read this news article, not run general software on my own system.

The biggest problem with the web is that it has been made into a general computing environment. Instead of creating another, separate network - the web for docs, the app web for apps, with different URL schemes etc - the web has been co-opted for this second case, thereby making it much less useful/safe for its original purpose.

liketochill · 4 years ago
Me too, I miss the days where software wasn’t constantly updated and networked for no good reason.

In fact I run a number of VMs with no internet access to keep my tools under my control.

yurishimo · 4 years ago
This website (CSS-Tricks) has been about general webdev, not just CSS, for the better part of a decade.
jbverschoor · 4 years ago
> I’ve needed to send off an HTTP request with some data to log when a user does something like navigate to a different page or submit a form.

No need for that. Stop hoarding data!

soylentgraham · 4 years ago
Maybe it was to log a player off from a game, rather than have a dangling player. (Though websocket would work better)
inglor · 4 years ago
This is true, and in general websockets (well, and WebTransport/WebRTC etc) are the only reliable way to detect navigation from the backend, since none of the methods mentioned in the article work if the user simply closes the tab.
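
A rough sketch of what that looks like on the backend, assuming Node with the `ws` package (the port and the session bookkeeping are just illustrative):

    const { WebSocketServer } = require('ws');

    const wss = new WebSocketServer({ port: 8080 });

    wss.on('connection', (socket, req) => {
      console.log('client connected', req.socket.remoteAddress);

      // fires when the tab is closed, the user navigates away,
      // or the connection otherwise drops
      socket.on('close', () => {
        console.log('client gone, clean up their session here');
      });
    });

In practice you'd also want ping/pong heartbeats, since an abrupt disconnect (lost WiFi, killed process) can take a while to surface as a close event.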

Deleted Comment

Deleted Comment

dylan604 · 4 years ago
An honest analytics-parsing question: how do you distinguish 4 different iOS users on the same device model at a Starbucks, or behind a VPN exit node sharing the same IP, from the access_log alone?
hnarn · 4 years ago
From an honest privacy advocate: what makes you feel you have the right to be able to?

Why do you think end-users install things like "uBlock Origin", or why do smartphone manufacturers like Apple introduce features like random MAC address generation when connecting to wireless networks? Why do people use VPNs in the first place?

Because data points are being collected without our consent to profile us, and most of us do not want that.

The type of differentiation you are talking about by definition lacks consent, because if you had consent it would be trivial: you could do it with cookies or other forms of identification that the user consented to.

867-5309 · 4 years ago
By using a unique id for every visitor, stored in sessions / cookies / a local db. Modern fingerprinting goes way deeper than device, OS and IP.
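
The simplest version is just a first-party id (a sketch; the storage key name is arbitrary):

    // assign a random id once and reuse it on every analytics event
    let visitorId = localStorage.getItem('visitor_id');
    if (!visitorId) {
      visitorId = crypto.randomUUID();
      localStorage.setItem('visitor_id', visitorId);
    }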
jen20 · 4 years ago
Given Private Relay, how would you know they're in a Starbucks in any case?
XCSme · 4 years ago
They might still use a different browser or a different browser version.
Ekaros · 4 years ago
Just thinking here. If you didn't have an SPA, couldn't you just, I don't know, log when they request those pages? Aren't those done with HTTP requests anyway... so why the hell do you need to do the logging on the client side?

Then again, it would be fun to send them plenty of nice bogus log information... as that is clearly what they want.

xg15 · 4 years ago
This works as long as the user is navigating around inside your site - but it won't work if the user e.g. navigates to a different domain or closes the browser window.

If you specifically want to track when a user leaves a page, you will miss a lot of events this way.

capableweb · 4 years ago
What's wrong with doing something like this? If it contains PII, then I agree: the additional hassle of dealing with everything that comes with handling PII (like GDPR) becomes too much, and it's easier to just not do it. But if it doesn't contain PII, it can be useful to see, for example, how many people drop off a form vs. submit it.
Moru · 4 years ago
Record how many opens the form and compare to how many actually submit the form?
tantalor · 4 years ago
IP addr & cookies are PII
fivea · 4 years ago
> If it contains PII (...)

How can you tell?

Deleted Comment

Deleted Comment

ninkendo · 4 years ago
Will any of these techniques inform the site owners that I close the tab the instant they put up a big “sign up for our mailing list” modal? And maybe that would help train them to not design sites that way?

Maybe that could be one good outcome of such an API.

panphora · 4 years ago
I think popups on page load are more of a symptom of short-term, myopic thinking than anything else. This type of thinking is strongly encouraged by always trying to live up to quarterly projections - "Did Sales meet their targets?"

Popups exchange long-term trust in your brand for short-term profits. Of course it's the case that if you pop a modal in front of everyone, your conversion stats will go up. But potential life-time customers who have faith in your brand and believe in your mission will lose trust in you quickly and probably never sign up in the first place.

For me, it always comes down to the primary rule of UX: if you wouldn't do it IRL, don't do it online.

You can ask for someone's email if they look like they're interested in something you're selling, but don't ask for it the moment they enter your store!

JaimeThompson · 4 years ago
>exchange long-term trust in your brand for short-term profits

That could be the motto of most every modern MBA.

Nextgrid · 4 years ago
> if you wouldn't do it IRL, don't do it online.

The lack of a Remote Strangulation Protocol (https://wiki.kluv.in/a/RemoteStrangulationProtocol) is a problem again.

jasonlotito · 4 years ago
Long term trust doesn’t pay the bills. See every underfunded open source project.

Think of it this way: if the site that you have long term trust for does this, it means the previous methods weren’t working. The ‘trust’ was broken.

Would love to see your advice applied to your own life.

dazc · 4 years ago
Since you were never going to join their mailing list, no, you are simply saving them bandwidth.

What you don't realise is that x number of people join the list and x number of people tolerate being spammed with 'did you know we sell stuff' emails every day for the rest of their lives. x number of those people actually do buy something; likely the thing they intended to buy originally, but they joined the email list in case a discount code would be the subject of the first 'welcome' email, because this is what they have been trained to expect.

All this has been A/B tested and, thus, has been 'proven' to be great business practice.

slig · 4 years ago
They learned, and now there's the "back button breaker": it shows another spam page when you click the back button.
XCSme · 4 years ago
Those pop-ups are usually shown on "exit intent", which is simply when you move your mouse outside the viewport (i.e. toward the address bar or another tab).
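
The detection itself is usually just a few lines (a sketch; showExitPopup is a placeholder for whatever modal gets rendered):

    function showExitPopup() {
      // render the "sign up for our mailing list" modal here
    }

    // "exit intent": the cursor leaves the page through the top edge,
    // i.e. heading for the address bar, the tabs or the close button
    document.documentElement.addEventListener('mouseleave', (e) => {
      if (e.clientY <= 0) {
        showExitPopup();
      }
    });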
slig · 4 years ago
Have you seen the new variation of this abomination? Now it shows another spam page after the user clicks the back button. There's no "intent" anymore, just a user trying to leave and getting spammed again.
sokoloff · 4 years ago
There was a parts supplier site that used to offer a 5% discount in this exit intent handler. (I’m not sure if they still do and I’m only on mobile at the moment.)

I’d happily trigger their exit intent handler and juice their “this feature is a total game-changer for our business!” reporting.

bhaney · 4 years ago
> Browser support is good, but not great. At the time of this writing, Firefox specifically doesn’t have it enabled by default

Little sad that Firefox not supporting an API only downgrades that API's browser support to "good" instead of "unusable" these days

danuker · 4 years ago
This is a feature built specifically for tracking. Firefox takes anti-tracking measures.

Another example is trimming referrer URLs (just keeping the domain).

bhaney · 4 years ago
I wasn't commenting on Firefox's decision to not enable the API, I was commenting on how the author appears to consider Firefox to not be very significant.
dspillett · 4 years ago
Though according to caniuse, the Beacon API is fully implemented, which has much the same effect from a tracking PoV.

As a side note, I was somehow completely unaware of that feature (and I don't think in a "heard about it and forgot" sort of way). Every day is a school day...
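
For what it's worth, the usual sending-side pattern covers both cases anyway; a minimal sketch with a hypothetical endpoint:

    function sendExitPing(data) {
      const url = '/analytics/exit';
      // prefer the Beacon API where available...
      if (navigator.sendBeacon) {
        return navigator.sendBeacon(url, data);
      }
      // ...otherwise fall back to fetch with keepalive, which is allowed to
      // outlive the page in browsers that support it
      fetch(url, { method: 'POST', body: data, keepalive: true }).catch(() => {});
      return true;
    }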

sgustard · 4 years ago
Interesting that commenters here assert "most users don't want to be tracked," whereas in fact most users happily choose Chrome over any privacy-focused platforms.
vbezhenar · 4 years ago
Firefox's market share is about the same as something called Samsung Internet. Do you test your website with the "Samsung Internet" browser?

There's Chrome and there's Safari. The rest of the browsers are rounding errors. And even those rounding errors are mostly Chromiums. Sad indeed.

SquareWheel · 4 years ago
Samsung Internet is Chromium-based. There's little need to test it separately as there is with Firefox.
pornel · 4 years ago
Firefox is unlucky in that its remaining users furiously hate everything tracking-related, and are ready to burn Mozilla to the ground over even the tiniest suspicion of any transgression. Therefore, Mozilla can't choose a lesser evil here, because it's still seen as evil.
rootlocus · 4 years ago
Firefox markets itself on privacy and openly advocates for privacy rights. To go against that would completely shatter its users' trust.
TheAceOfHearts · 4 years ago
What use-cases are there for this feature other than tracking users?

Anyway, the best solution is to pass any links through a redirect which is responsible for logging the visit. That's how Google tracks what search results get clicked on. And it doesn't require any JavaScript.

I'm surprised they didn't even mention it as a possibility.
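
The whole thing can be a one-route server; a sketch in Node (paths are made up, and in practice you'd validate the target against an allowlist so it doesn't become an open redirect):

    const http = require('node:http');

    http.createServer((req, res) => {
      const url = new URL(req.url, 'http://localhost');
      const target = url.searchParams.get('to');

      if (url.pathname === '/out' && target) {
        console.log('outbound click:', target, new Date().toISOString());
        res.writeHead(302, { Location: target }); // log, then redirect
      } else {
        res.writeHead(404);
      }
      res.end();
    }).listen(3000);

Links on the page then point at something like /out?to=https%3A%2F%2Fexample.com instead of the destination directly.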

capableweb · 4 years ago
> What use-cases are there for this feature other than tracking users?

Social services where users want to signal if they are online/offline

Users who want to know how much time they've spent on a platform.

Closing down sessions for things that are expensive to just run "forever". Think VMs and similar that get started when the user visits the page and that you want to turn off when they leave (a rough sketch of this case follows below).

The use cases are many, and there are also many other alternative solutions, this is obviously not the only one. But as always, all of them come with their own tradeoffs, so this technique might be the best for some of them.
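
For the session/VM case, the client side is basically this (hypothetical endpoint and session id; the beacon can still get lost, so the backend needs an idle timeout as a backstop):

    // best-effort "I'm leaving" signal when the page is going away
    const SESSION_ID = document.body.dataset.sessionId; // however you identify the session

    window.addEventListener('pagehide', () => {
      navigator.sendBeacon(
        '/api/session/release',
        JSON.stringify({ session: SESSION_ID })
      );
    });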

adhesive_wombat · 4 years ago
You will already need to handle the case when you don't get the last request, because there are more ways the user can just vanish on you: suspending the computer, killing the browser process, crashing the browser, turning off WiFi or going out of range, etc, etc.

The best you can hope for is a "happy path" where you can clear up a little sooner in most cases, but you can't rely on it.

If a user being on a webpage is holding open a resource, that might be an architectural issue.

dspillett · 4 years ago
Let us not forget that tracking users is not a single use case.

Closing off cached resources when a user vanishes is a good thing: it saves resources for other users and could, if I may stretch a little, even have an environmental impact. Noticing that sessions tend to end after a particular set of circumstances could mean spotting something in your application that is annoying people, something that, particularly in an SPA, you might otherwise not easily detect.

Sometimes tracking is legitimately benign. It is only when unnecessary PII is included, and kept to track each specific user over time rather than just over a session, that it gets stalky to the point of being on my "I won't do that" list (sharing the PII around further gets you on my "you are morally wrong for doing that" list).

> surprised they didn't even mention it as a possibility

The article is focused on detecting when a user leaves. While a fair amount of that will be detectable because they've clicked a link, it doesn't cover the large case set that is closing tabs and windows.

Something I might have to investigate (or search to see if someone else has): do any of these techniques work reliably on a controlled OS shutdown?

gwern · 4 years ago
"Tracking users" has lots of uses. For example, we were trying to do OP on gwern.net for a while, using several of those tricks. It failed miserably and we could never figure out why.

What evil things were we doing? Well, I wanted to know if people were using the links I put inside popups, like to Google Scholar. Was anyone using them? If not, they were just so much clutter and should be removed for the users' benefit. I also wanted to get a list of outbound links by popularity, so I could write summaries/annotations for the most popular unannotated links, and rank dead links by priority for fixing (there are too many unannotated or dead links to 'just fix', so a ranking would've helped a lot).

But the link tracking never worked, so, I couldn't do any of that.

daveoc64 · 4 years ago
Some systems lock records while users are viewing them, so it can be helpful to send a request that unlocks the record when the user leaves the page.
maccard · 4 years ago
What if they just close their laptop lid, or the browser crashes? You can't reliably detect when a user leaves with this method, so you _can't_ use it for anything critical.
XCSme · 4 years ago
Making resource locking client-side authoritative sounds like a security issue.
OtomotO · 4 years ago
Stuff like that is the reason I have disabled JS by default.

Access to client side scripts on my machine has to be earned in trust.

Most websites never get that far, because they are broken without js.

Doesn't matter, saves me a lot of time ;)

PUSH_AX · 4 years ago
This is an interesting take. I would assume at some point it would end up costing you time, because for all its faults, the modern web has some great tools (apps) that can improve your life in a number of ways. I'm also unsure how they'd earn your trust without being granted that essential JS-unblocking trust in the first place.
beebeepka · 4 years ago
I've been browsing without js for a decade. The alternative is to install blockers or live with ads.

I have zero interest in "apps that can improve my life". People have different interests.

Disclaimer: I write JS for a living.

XCSme · 4 years ago
> Access to client side scripts on my machine has to be earned in trust

I agree that it's all about trust. Why would you ever enter information on a website you don't trust? What extra information can be gathered using JS that can't be otherwise? Or what is the exact harm done if they send extra requests to a tracking API?

My point is that once you access a specific website, you consent to that website sending HTML/CSS/JS to your computer and executing it. The biggest problem these days is when that website sends your information to other entities, third parties, for purposes other than improving your experience. I think having "tracking" to detect errors and loading-time issues and to improve the user experience is perfectly reasonable if implemented in a proper way.

OtomotO · 4 years ago
> My point is that once you access a specific website, you consent to that website sending HTML/CSS/JS to your computer and executing it

Just HTML and CSS in my case.

vbezhenar · 4 years ago
It seems that the ping attribute of <a> does not require JS. So you can still be tracked.

Actually, I think that's the whole reason this attribute was introduced.
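
It's literally just markup (the tracker URL here is made up); when the link is followed, the browser sends a POST to the ping URL(s) on its own:

    <a href="https://example.com/article"
       ping="https://tracker.example/click">Read the article</a>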

Too · 4 years ago
With all browser vendors nowadays (minus Chrome) trying to present themselves as privacy protectors, who even approved and implemented this?

The only good thing coming out of this ping attribute is that it’s a lot easier for plugins to block it rather than intercepting javascript and domains.

someweirdperson · 4 years ago
What happens if the user has more than one tab/window open and closes one?

The weirdest case I've observed in the wild is a site that works perfectly without the Referer header, except for logout. As usual for anything requiring Referer, without it the logout fails with no user-readable message. If Referer is sent, the logout procedure goes through multiple redirects and some waiting while JS does something (and probably the server backend too).

Some sites make weird assumptions about the network, the browser, and the user.