dang · 2 months ago
PSA: Submitted title was "PSA: Always use a separate domain for user content". We've changed it per https://news.ycombinator.com/newsguidelines.html. Might be worth knowing for context.
SquareWheel · 2 months ago
It's generally good advice, but I don't see that Safe Browsing did anything wrong in this case. First, it sounds like they actually were briefly hosting phishing sites:

> All sites on statichost.eu get a SITE-NAME.statichost.eu domain, and during the weekend there was an influx of phishing sites.

Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged. How else is Google supposed to know that subdomains belong to different users? That's what the PSL is for.
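As a rough sketch of how PSL-aware tooling uses the list (assuming the third-party Python tldextract package here, purely for illustration):

    # Sketch: how software that consults the PSL decides where the
    # "registrable domain" boundary sits.
    # Assumes the third-party tldextract package (pip install tldextract),
    # which ships with a cached copy of the Public Suffix List.
    import tldextract

    parts = tldextract.extract("phish.statichost.eu")

    # With no PSL entry for statichost.eu, the whole site shares one
    # registrable domain, so one bad subdomain can taint all of them:
    print(parts.registered_domain)  # -> "statichost.eu"

    # If statichost.eu were listed as a public suffix, the same call would
    # treat each customer subdomain as its own registrable domain instead:
    # parts.registered_domain -> "phish.statichost.eu"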

From my reading, Safe Browsing did its job correctly in this case, and they restored the site quickly once the threat was removed.

ericselin · 2 months ago
I'm not saying that Google or Safe Browsing in particular did anything wrong per se. My point is primarily that Google has too much power over the internet. I know that in this case what actually happened was because of me not putting enough effort into fending off the bad guys.

The new separate domain is pending inclusion in the PSL, yes.

Edit: the "effort" I'm talking about above refers to more real time moderation of content.

sokoloff · 2 months ago
> My point is primarily that Google has too much power over the internet.

That is probably true, but in this case I think most people would think that they used that power for good.

It was inconvenient for you and the legitimate parts of what was hosted on your domain, but it was blocking genuinely phishing content that was also hosted on your domain.

stickfigure · 2 months ago
"Google does good thing, therefore Google has too much power over the internet" is not a convincing point to make.

This safety feature saves a nontrivial number of people from life-changing mistakes. Yes we publishers have to take extra care. Hard to see a negative here.

neon_erosion · 2 months ago
How does flagging a domain that was actively hosting phishing sites demonstrate that Google has too much power? They do, but this is a terrible example, undermining any point you are trying to make.
shadowgovt · 2 months ago
There are two aspects to the Internet: the technical and the social.

In the social, there is always someone with most of the power (distributed power is an unstable equilibrium), and it's incumbent upon us, the web developers, to know the current status quo.

Back in the day, if you weren't testing on IE6 you weren't serving a critical mass of your potential users. Nowadays, the nameplates have changed but the same principles hold.

faust201 · 2 months ago
> Google has too much power over the internet.

In this case they did use it for a good cause. And yes, alternatively, you could have prevented the whole thing from happening if you had cared about your customers.

dormento · 2 months ago
Exactly.

> Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged.

NO, Google should be "mindful" (I know companies are not people but w/e) of the power it unfortunately has. Also, Cloudflare. All my homies hate Cloudflare.

rasengan · 2 months ago
Getting on the public suffix list is easier said than done [1]. They can simply say no if they feel like it, and they are making sure to retain that right by being a "project" rather than a "business" [2], which has its pros and cons.

[1] https://github.com/publicsuffix/list/blob/main/public_suffix...

[2] https://groups.google.com/g/publicsuffix-discuss/c/xJZHBlyqq...

RandomBK · 2 months ago
> Getting on the public suffix list is easier said than done [1].

Can you elaborate on this? I didn't see anything in either link that would indicate unreasonable challenges. The PSL naturally has a series of validation requirements, but I haven't heard of any undue shenanigans.

Is it great that such vital infrastructure is held together by a ragtag band of unpaid volunteers? No; but that's hardly unique in this space.

KronisLV · 2 months ago
> Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged. How else is Google supposed to know that subdomains belong to different users? That's what the PSL is for.

How is this kind of thing not insane? https://publicsuffix.org/list/public_suffix_list.dat

A centralized list, where you have to apply to be included and it's up to someone else to decide whether you will be allowed in? How is this what they went for: "You want to specify some rules around how subdomains should be treated? Sure, name EVERY domain that this applies to."

Why not just something like https://example.com/.well-known/suffixes.dat at the main domain or whatever? Regardless of the particulars, this feels like it should have been an RFC and a standard that avoids such centralization.

schoen · 2 months ago
There was an IETF working group that was working on a more distributed alternative based on a DNS record (so you could make statements in the DNS about common administrative control of subdomains, or lack of such common control, and other related issues). I believe the working group concluded its work without successfully creating a standard for this, though.
zorked · 2 months ago
The problem is that you then have to trust the site's own statement about whether its subdomains are independent.

trod1234 · 2 months ago
Yes, it's generally good advice to keep user content on a separate domain.

That said, there are a number of IT professionals who aren't aware of the PSL, as these are largely initiatives that didn't exist prior to 2023, don't get a lot of advertisement, and aren't even a requirement. They largely just started being used silently by big players, which itself presents issues.

There are hundreds if not thousands of industry whitepapers, and afaik there are only one or two places it's mentioned in industry working groups, and those were blog posts, not whitepapers (at M3AAWG). There's no real documentation in any of the working group whitepapers of the organization, what it's for, and how it should be used. Just that it is being used and needs support; not something professionals would pay attention to, imo.

> Second, they should be using the public suffix list

This is flawed reasoning as is. It's hard to claim this with any basis when professionals don't know about it and a small subset just arbitrarily started doing this; it seems more like after-the-fact justification for throwing the baby out with the bath water.

Security is everyone's responsibility, and Google could have narrowly tailored the block to the offending domain names instead of blocking at the top level. They didn't do that. Worse, that behavior could even be automated in a way that extends the process and gives the top-level provider a notice period before the block starts hitting everyone's devices. They apparently didn't do that either.

Regardless, no single entity should be able to dictate what other people perceive or see arbitrarily from their devices (without a choice; opt-in) but that is what they've designed these systems to do.

Enumerating badness doesn't work. Worse, say the domain names get reassigned to another, unrelated customer.

Those people are different people, but they are still blocked, as happens with small mail servers quite often. Who is responsible when someone who hasn't engaged in phishing is arbitrarily punished without due process? Who is to say that Google isn't doing this purposefully to retain its monopolies for services it also provides?

It's a perilous, tortuous path where trust cannot be given, because they've violated that trust in the past and have little credibility, with all net incentives towards their own profit at the expense of others. They are even willing to regularly break the law, and have never been held to account for it (e.g. the Google Maps Wi-Fi wiretapping).

Hanlon's razor was intended as a joke, but there are people who use it literally and inappropriately to deceitfully take advantage of others.

Gross negligence coupled with some form of loss is sufficient for general intent, which makes the associated actions malicious.

Throwing out the baby with the bath water without telling anyone or without warning, is gross negligence.

SquareWheel · 2 months ago
I'm not sure what to tell you. I'm a professional with nearly two decades of experience in this industry, and I don't read any white papers. I read web publications like Smashing Magazine or CSS Tricks, and more specifically authors like Paul Irish, Jake Archibald, Josh Comeau, and Roman Komarov. Developers who talk about the latest features and standards, and best practices to adopt.

The view that professionals in this industry exclusively participate in academic circles runs counter to my experience. Unless you're following the latest AI buzz, most people are not spending their time on arXiv.

The PSL is surely an imperfect solution, but it's solving a problem for the moment. Ideally a more permanent DNS-based solution would be implemented to replace it. Though some system akin to SSL certificates would be necessary to provide an element of third-party trust, as bad actors could otherwise abuse it to segment malicious activity on their own domains.

If you're opposed to Safe Browsing as a whole, both Chromium and Firefox allow you to disable that feature. However, making it an opt-in would essentially turn off an important security feature for billions of users. This would result in a far greater influx of phishing attacks and the spread of malware. I can understand being opposed to such a filter from an idealistic perspective, but practically speaking, it would do far more harm than good.

kbolino · 2 months ago
Putting user content on another domain and adding that domain to the public suffix list is good advice.

So good, in fact, that it should have been known to an infrastructure provider in the first place. There's a lot of vitriol here that is ultimately misplaced away from the author's own ignorance.

jeroenhd · 2 months ago
The PSL is something you find out about after it goes wrong.

It's a weird thing, to be honest: a GitHub repo, mentioned nowhere in any standard, that browsers use to treat some subdomains differently.

Information like this doesn't just manifest itself into your brain once you start hosting stuff, and if I hadn't known about its existence I wouldn't have thought to look for a project like this either. I certainly wouldn't have expected it to be both open for everyone and built into every modern internet-capable computer or anti malware service.

10000truths · 2 months ago
To be pedantic, the GitHub repo is not the source of truth, this is:

https://publicsuffix.org/list/public_suffix_list.dat

It even says so in the file itself. If Microsoft goes up in flames, they can switch to another repository provider without affecting the SoT.

bawolff · 2 months ago
If you don't know what you're doing and as a result bad things happen, that's on you.

I don't have a lot of sympathy for people who allow phishing sites and then suffer reputational consequences.

dawnerd · 2 months ago
To be fair I’ve been in the space for close to 20 years now, worked on some of the largest sites and this is the first I’m hearing of the public suffix list.
j45 · 2 months ago
Maybe it was effective from obscurity?

yafinder · 2 months ago
For something that you think is a de facto standard, the public suffix list seems kinda raw to me for now.

I checked it for two popular public suffixes that came to mind: 'livejournal.com' and 'substack.com'. Both weren't there.

Maybe I'm mistaken, it's not a bug and these suffixes shouldn't be included, but I can't think of the reason why.

jeroenhd · 2 months ago
I don't know about LiveJournal, but I don't believe you can host any interactive content on substack (without hacking substack at least). You can't sign up and host a phishing site, for instance.

User-uploaded content (which does pose a risk) is all hosted on substackcdn.com.

The PSL is more for "anyone can host anything in a subdomain of any domain on this list" rather than "this domain contains user-generated content". If you're allowing people to host raw HTML and JS then the PSL is the right place to go, but if you're just offering a user post/comment section feature, you're probably better off getting an early alert if someone has managed to breach your security and hacked your system into hosting phishing.

asddubs · 2 months ago
The public suffix list interferes with cookies. So on a service like livejournal, where you want users logged in across all subdomains, it's not an option.
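To make that concrete, this is the kind of cookie a PSL entry breaks (hypothetical names; just a plain Set-Cookie header shown as a Python string):

    # Hypothetical example: a login cookie shared across user subdomains
    # (user1.example.com, user2.example.com, ...) is scoped to the parent:
    shared_login = "Set-Cookie: session=abc123; Domain=.example.com; Secure; HttpOnly"

    # Browsers refuse to set cookies whose Domain attribute is a public
    # suffix (same reason you can't set Domain=.com). So if example.com were
    # added to the PSL, the header above would be rejected and the shared
    # cross-subdomain login would stop working.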
neon_erosion · 2 months ago
Exactly, this has been documented knowledge for many years now, even decades. GitHub and other large providers of user-generated content have public-facing documentation on the risks and ways to mitigate them. Any hosting provider that chooses to ignore those practices is putting themselves, and their customers, at risk.
lcnPylGDnU4H9OF · 2 months ago
> There's a lot of vitriol here that is ultimately misplaced away from the author's own ignorance.

For what it's worth, this makes it sound like you think the vitriol should be aimed at the author's ignorance rather than the circumstances which led to it, presuming you meant the latter.

kbolino · 2 months ago
I do think the author's ignorance was a bigger problem--both in the sense that he should have known better and in the sense that the PSL needs to be more discoverable--than anything Google('s automated systems) did.

However, I'm now reflecting on what I said as "be careful what you wish for", because the comments on this HN post have done a complete 180 since I wrote it, to the point of turning into a pile-on in the opposite direction.

ericselin · 2 months ago
This is of course true! It just takes an incident like this to get one's head out of one's ass and actually do it. :)
kbolino · 2 months ago
The good news is, once known, a lesson like this is hard to forget.

The PSL is one of those load-bearing pieces of web infrastructure that is esoteric and thanklessly maintained. Maybe there ought to be a better way, both in the sense of a direct alternative (like DNS), and in the sense of a better security model.

neon_erosion · 2 months ago
This is the kind of thing that customers rely on you to do _before_ it causes an incident.
hiatus · 2 months ago
One can only imagine the other beginner mistakes made by this operator.
ericselin · 2 months ago
Since there's a lot of discussion about the Public Suffix list, let me point out that it's not just a webform where you can add any domain. There's a whole approval process where one very important criterion is that the domain to be added has a large enough user base. When you have a large enough user base, you generally have scammers as well. That's what happened here.

It basically goes: growing user base -> growing amount of malicious content -> ability to submit domain to PSL. In that order, more or less.

In terms of security, for me, there's no issue with being on the same domain as my users. My cookies are scoped to my own subdomain, and HTTPS only. For me, being blocked was the only problem, one that I can honestly admit was way bigger than I thought.

Hence, the PSA. :)

lucb1e · 2 months ago
What sort of size would be needed to get on there?

My open source project has some daily users, but not thousands. Plenty to attract malicious content. I think a lot of people are sending it to themselves, though (like onto a malware analysis VM that is firewalled off, so they look for a public website to do the transfer), but even then the content is only on the site for a few hours. After >10 years of hosting this, someone seems to have fed a page into a virus scanner, and now I'm getting blocks left and right with no end in sight. I'd be happy to give every user a unique subdomain instead of short links on the main domain, and then put the root on the PSL, if that's what solves this.

mh- · 2 months ago
> [..] projects not serving more then (sic) thousands of users are quite likely to be declined.

from PSL's GitHub repo's wiki [0].

[0]: https://github.com/publicsuffix/list/wiki/Guidelines#validat...

ericselin · 2 months ago
Based on what I've seen, there's no way to get that project into the PSL. I would recommend having the content available at projectcontent.com if the main site is project.com, though. :)
bawolff · 2 months ago
> My cookies are scoped to my own subdomain

If you mean with the Domain attribute, that's not really sufficient. You need to use the __Host- prefix.
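A minimal sketch of the difference (hypothetical cookie values, shown as plain Set-Cookie headers in Python strings):

    # Host-scoped cookie without the prefix: only sent to the setting host,
    # but a malicious sibling subdomain can still try to shadow it by setting
    # a cookie named "session" with Domain=.statichost.eu.
    plain = "Set-Cookie: session=abc123; Secure; HttpOnly; Path=/"

    # With the __Host- prefix the browser enforces the scoping: the cookie
    # must be Secure, must have Path=/, and must not carry a Domain attribute,
    # so a sibling subdomain cannot overwrite it.
    prefixed = "Set-Cookie: __Host-session=abc123; Secure; HttpOnly; Path=/"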

ArnoVW · 2 months ago
As a CISO I am happy with many of the protections that Google creates. They are in a unique position, and probably the only ones to be able to do it.

However, I think the issue is that with great power comes great responsibility.

They are better than most organisations, and working with many constraints that we cannot always imagine.

But several times a week we get a false "this mail is phishing" incident, where a mail from a customer or prospect is put in "Spam" with a red security banner saying it contains "dangerous links". Generally it is caused by domain reputation issues that block all mail using a particular e-mail scanning product. These products wrap URLs so they can scan them when the mail is read; thus, when they fail to detect a virus, they become de facto purveyors of viruses, and their entire domain is tagged as dangerous.

I raised this with Google in May (!) and have been exchanging mail on a nearly daily basis: pointing out a new security product that has been blacklisted, explaining the situation to a new agent, etc.

Not only does this mean that they are training our staff that security warnings are generally false, but it means we are missing important mail from prospects and customers. Our customers are generally huge corporations, missing a mail for us is not like missing one mail for a B2C outfit.

So far the issue is not resolved (we are in October now!) and recently they have stopped responding. I appreciate our organisation is not the US Government, but still, we pay upwards of $20K/year for "Google Workspace Enterprise" accounts. I guess I was expecting something more.

If someone within Google reads this: you need to fix this.

seethishat · 2 months ago
I'm old. I've been doing security for a very long time. Started back in the 1990s. Here's what I have learned over the last 30 years...

Half (or more) of security alerts/warnings are false positives. Whether it's the vulnerability scanner complaining about some non-existent issue (based on the Apache version string alone... when the fix was backported by the package maintainer), or an AI report generated by interns at Deloitte fresh out of college, or someone reporting www.example.com to Google Safe Browsing as malicious, etc. At least half of the things they report on are wrong.

You sort of have to have a clue (technically) and know what you are doing to weed through all the bullshit. Tools that block access based on these things do more harm than good.

SerCe · 2 months ago
What this post might be missing is that it's not just Google that can block your website. A whole variety of actors can, and any service that can host user-generated content, not just HTML (a single image is enough), is at risk; but really, any service is at risk. I've had to deal with many such cases: ISPs mistakenly blocking large IP prefixes, DPI software killing the traffic, random antivirus software blocking your JS chunk because of a hash collision, even small single-town ISPs sinkholing your domain because of auto-reports, and many more.

In the author’s case, he was at least able to reproduce the issues. In many cases, though, the problem is scoped to a small geographic region, but for large internet services, even small towns still mean thousands of people reaching out to support while the issue can’t be seen on the overall traffic graph.

The easiest steps you can take to be able to react to those issues are: 1. Set up NEL logging [1] that goes to completely separate infrastructure; 2. Use RIPE Atlas and similar services in the hope of reproducing the issue and grabbing a traceroute.
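For reference, NEL is switched on with a pair of response headers; a rough sketch (the collector URL is hypothetical, the point is that it lives on separate infrastructure):

    import json

    # Sketch: the two response headers that enable Network Error Logging.
    headers = {
        # Declare a reporting endpoint hosted on *separate* infrastructure.
        "Report-To": json.dumps({
            "group": "network-errors",
            "max_age": 2592000,  # cache the endpoint config for 30 days
            "endpoints": [{"url": "https://nel-collector.example.net/reports"}],
        }),
        # Tell browsers to report failed requests to that endpoint group.
        "NEL": json.dumps({
            "report_to": "network-errors",
            "max_age": 2592000,
            "failure_fraction": 1.0,  # report every failure
        }),
    }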

I’ve even attempted to create a hosted service for collecting NEL logs, but it seemed to be far too niche.

[1]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Net...

freehorse · 2 months ago
I don't see how a separate domain would solve the main issue here. If something on that separate domain was flagged, it would still affect all user content on that domain. If your business is about serving such user content, the main service of your business would be down, even though your main domain would still be up.
ericselin · 2 months ago
You are right, it would still affect all users. Until the pending PSL inclusion is complete, that is. But it now separates my own resources, such as the website and dashboard of statichost.eu, from that.
toast0 · 2 months ago
A separate domain may not prevent users' content from being blocked, but it may prevent blocking of the administrative interfaces. That would help affected customers get their content out, and the service could more easily put up a banner advising users of the situation, etc.
duxup · 2 months ago
It feels like unless you're one of the big social media companies, accepting user content is slowly becoming a larger and larger risk.
jacquesm · 2 months ago
It always was. You're one upload and a complaint to your ISP/Google/AWS/MS away from having your account terminated.
Bender · 2 months ago
I second this for personal sites. Having run forums and chan sites without a CDN, I found that not only is this true, it is 100% automated. The timing of emails to VPS providers/registrars matched the times their scripts would crawl my sites, submit illicit content, screenshot it, and automatically submit the screenshots to the VPS/server/registrar providers. That was incentive enough for me to take my sites private / semi-private. I would move them to .onion nodes, but that's just too slow for me. I have my own theories as to which groups are running these scripts to push people to CDNs, but no smoking gun.

Corporations are a little safer. They have mutually binding contracts with multiple internet service providers and dedicated circuits. They have binding contracts with DNS registrars. Having been on the receiving end of abuse@ reports, I can say providers notify over phone and email, giving plenty of time to figure out what is going on. I've never seen corporate circuits get nuked for such shenanigans.

wahnfrieden · 2 months ago
Any services successfully offloading UGC to other moderated platforms? E.g. developer tools relying on GitHub instead of storing source/assets in the service itself, and Microsoft can take care of most moderation needs. But are there consumer apps that do things like this?
blenderob · 2 months ago
But something has definitely changed over the past few years. Back in the days, it felt completely normal for individuals to spin up and run their own forums. Small communities built and maintained by regular people. How many new truly independent, individual-run forums can you name today? Hardly any. Instead we keep hearing about long-time community sites shutting down because individuals can no longer handle the risks of user content. I've seen so many posts right here on HN announcing those kinds of shutdowns.
dylan604 · 2 months ago
You're equally just one fake report to an automated system away from having your account shut down. So, yes, your actions have consequences, but more worrying to me is the ability of someone with a grudge to cause consequences for you as well.