All these indignant comments but nobody bothers to look up what happened since 2020:
> In a previous blog post, we quantified that upwards of 45.80% of total DNS traffic to the root servers was, at the time, the result of Chromium intranet redirection detection tests. Since then, the Chromium team has redesigned its code to disable the redirection test on Android systems and introduced a multi-state DNS interception policy that supports disabling the redirection test for desktop browsers. This functionality was released mid-November of 2020 for Android systems in Chromium 87 and, quickly thereafter, the root servers experienced a rapid decline of DNS queries.
https://blog.verisign.com/domain-names/chromiums-reduction-o...
“I’m the original author of this code, though I no longer maintain it.
“Just want to give folks a heads-up that we’ve been in discussion with various parties about this for some time now, as we agree the negative effects on the root servers are undesirable. So in terms of “do the Chromium folks know/care”; yes, and yes.
“This is a challenging problem to solve, as we’ve repeatedly seen in the past that NXDOMAIN hijacking is pervasive (I can’t speak quantitatively or claim that “exception rather than the norm” is false, but it’s certainly at least very common), and network operators often actively worked around previous attempts to detect it in order to ensure they could hijack without Chromium detecting it.
“We do have some proposed designs to fix this; at this point much of the issue is engineering bandwidth. Though I’m sure that if ISPs wanted to cooperate with us on a reliable way to detect NXDOMAIN hijacking, we wouldn’t object. Ideally, we wouldn’t have to frustrate them, nor they us.”
A way for DNS root servers and Chromium devs to cooperate that couldn’t be hijacked by domain-redirecting ISPs would be nice.
It would, on the other hand, create enormous new load on the DNS infrastructure.
Fix what, though? This is an insignificant load on a piece of core infrastructure for which all of the load from all of the users of the world's most popular Internet application is still a rounding error relative to capacity. This is what those servers are there to do!
In almost any other scenario, this traffic would be indistinguishable from a distributed denial of service (DDoS) attack.
Yes, but see, it's not a distributed denial of service attack. It's a productive use of a capability the DNS roots were designed to serve. It's a huge amount of traffic because Chrome is one of the 3 most important and popular applications on the Internet, and the most popular of those, at that. The cart doesn't drag the horse.
And the author's incorrect claim that "interception is the exception rather than the norm" doesn't help. The author tries to compare with Firefox, but Firefox's captive portal test uses an existing well-known URL [1], which has a very different goal from Chromium's NXDOMAIN interception test. In fact I think Firefox would have to implement a similar approach if it were popular enough.
[1] http://detectportal.firefox.com/success.txt etc.
I believe the point they were making is that they could use their own domain and infrastructure for the same kind of testing. Something like "<random-chars>.nxdetect.firefox.com" would keep the responsibility for the feature on Firefox's name servers. The same could be done for Chromium rather than contributing to a tragedy of the commons situation.
There is an obvious flaw which is that this behavior is detectable and avoidable by ISPs that become aware of what that domain is being used for... But that argument applies to the current detection method as well (as this post demonstrates).
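A minimal sketch of what that could look like, assuming a vendor-controlled zone (the nxdetect.firefox.com name is the hypothetical from the comment above, not a real endpoint, and gethostbyname stands in for a browser's real DNS stack):

    import secrets
    import socket

    def interception_suspected(base="nxdetect.firefox.com"):
        # A fresh random label, so ISPs can't pre-learn or special-case the name.
        name = f"{secrets.token_hex(8)}.{base}"
        try:
            socket.gethostbyname(name)
        except socket.gaierror:
            return False  # NXDOMAIN came through honestly.
        # The vendor's nameservers answer NXDOMAIN for everything under this
        # zone, so any address at all means something on the path forged it.
        return True

Because the recursive resolver caches the delegation for the vendor's zone, probes like this load the vendor's authoritative servers rather than the roots.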
I think "productive" depends on who's paying the bill and whether the uses of it were how the system was actually designed. I don't think that the roots ever had the idea of serving random requests billions of times because that's the entire point of having the downstream servers and DNS caching. I remember a time when directly pinging the root servers was considered very bad practice because you should always be using the downstream... Google's method essentially makes those downstream servers useless.
All of these norms about DNS queries are deployed situationally. The namedroppers types angry about excess root queries today will defend the notion of end-user operating systems querying the roots directly to facilitate DNSSEC (which, without direct recursive lookups from end systems, condenses down to a single "yes, I did DNSSEC" bit set by your local DNS server).
Everybody can look at this DNS probe code and kibitz about whether it could have been done more efficiently or whether it should be done at all, but this is a straightforward use of the DNS to do something important for end-users (detect whether their ISP's nameserver is lying to them about NXDOMAIN). Whatever else it is, it isn't a DDOS attack, and whatever else these servers are, they're not quiet out-of-the-way private systems; they're core Internet infrastructure.
Your house must look like a hoarder house, then. I mean, there was all that space in there, and space is meant for filling, so why not fill it up? /s
Like any computer system, Root DNS was designed around expectations of typical utilization and expected response time, with margins for burst traffic. Capacity was left unused for a reason.
Google's action here is an example of cost shifting. It doesn't cost Google or the individual user anything to test for a hijacked DNS resolver, because the costs are shifted to the Root DNS. As the article puts it, this cost shifting is indistinguishable from a DDoS because it consumes network capacity and CPU resources and electrical power in an unplanned fashion.
Root DNS was designed around resolving names, not testing for hijacked DNS. And just because you can use DNS to test for hijacked DNS doesn't mean you should.
Root DNS was designed around resolving names, not testing for hijacked DNS.
Says who? You? Have you double checked that belief with the RFCs? Last I looked, there's a lot more than just name->IP in the DNS. "Testing for hijacked DNS" is a DNS function. The root servers are there to make applications work, not the other way around. They're fine, and people watching them don't dictate terms to the browsers.
Answering the implied question in the title: ~45% of traffic to root servers is now Chromium users figuring out whether nonexistent domains get correct nxdomain responses.
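In rough terms the probe is: resolve a few random single-label names that cannot exist, and get suspicious if they nevertheless resolve. A simplified sketch (Chromium's actual intranet-redirect detector uses its own DNS stack and heuristics; the 7-15 character labels and the two-matching-answers rule here are only an approximation of the behavior described):

    import random
    import socket
    import string

    def random_label():
        # A single label with no dots, so the query can only succeed if
        # someone invents an answer for a nonexistent TLD.
        return "".join(random.choices(string.ascii_lowercase, k=random.randint(7, 15)))

    def nxdomain_hijack_suspected(probes=3):
        answers = []
        for _ in range(probes):
            try:
                answers.append(socket.gethostbyname(random_label()))
            except socket.gaierror:
                pass  # Honest NXDOMAIN for a made-up name.
        # Several made-up names "resolving" to one address means the
        # resolver is almost certainly synthesizing responses.
        return len(answers) >= 2 and len(set(answers)) == 1

Since the labels are random and uncacheable, every probe that reaches an honest resolver ends up at the root servers, which is where the ~45% figure comes from.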
That's apparently 60 billion queries per day. Doing the math, a query is some 92 bytes on the wire (50 bytes of UDP payload, but the root servers are presumably on Ethernet or something similar, so I'll include headers), so roughly 500 megabits per second across the root server system is taken up by this, assuming there are no spikes and everything is perfectly averaged. Edit: and that's just the downlink. I forgot the uplink traffic, which will be larger due to DNS's habit of echoing the query back with the response(s) attached, plus DNSSEC... Whoa. /edit
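Redoing that arithmetic with the same assumptions (60 billion queries/day, 92 bytes per query on the wire), the aggregate comes out to roughly 500 Mbit/s across the whole root system, or about 40 Mbit/s per root letter if split evenly across the 13 of them (each of which is in turn anycast across many sites):

    queries_per_day = 60e9
    bytes_per_query = 50 + 8 + 20 + 14  # UDP payload + UDP + IPv4 + Ethernet headers
    seconds_per_day = 86400

    bps = queries_per_day * bytes_per_query * 8 / seconds_per_day
    print(f"aggregate:  {bps / 1e6:.0f} Mbit/s")      # ~511 Mbit/s
    print(f"per letter: {bps / 13 / 1e6:.0f} Mbit/s")  # ~39 Mbit/s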
So what's the next step, Google generously stepping up to running (some of) those root servers and gaining further control and observability over what people do online?
Reading further, Firefox solves this by "namespace probe queries, directing them away from the root servers towards the browser’s infrastructure".
For infrastructure hosted in a legit data center, this is nothing.
Not only could you do so, if you did not care about reliability or latency you could host the entire DNS root on one desktop computer. Global root DNS traffic is nothing.
Google doesn’t need to offer anything because the servers are holding up just fine under the load. These systems are trivially simple and cheap to operate by global internet standards.
Once again a typical tragedy-of-the-commons problem: some actors (in this case, providers) became too greedy and fucked up service for people, and because of the defenses now needed against such practices, the commons (in this case, the operators of the root DNS servers) suffers.
On top of that come captive hotspot/corporate portals - and here especially, I do wonder why messing around with DPI is necessary, when every public network has a DHCP server running that could distribute DHCP options carrying a gateway URL, or information on whether the connection should be considered metered (mobile phone hotspot) or bandwidth constrained (train or bus hotspots).
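There is in fact a standardized mechanism along these lines: RFC 8910 assigns DHCPv4 option 114 (with DHCPv6 and router-advertisement equivalents) to hand out a captive-portal API URL. The encoding is the ordinary DHCP type-length-value layout; a sketch, with a made-up portal URL:

    # DHCP options are TLV-encoded: one code byte, one length byte, then the value.
    # RFC 8910 uses option 114 to carry the captive-portal API URI.
    def captive_portal_option(api_url: str) -> bytes:
        value = api_url.encode("ascii")
        assert len(value) <= 255, "a single DHCP option value is capped at 255 bytes"
        return bytes([114, len(value)]) + value

    print(captive_portal_option("https://portal.example.net/api").hex())

Of course, as noted further down, this only helps if clients actually honor the option.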
If you're going for network login pages, you can intercept any of the standard HTTP URLs. There are specific URLs by Apple, Google, Microsoft, Mozilla, and if you're willing, a bunch of Linux distros as well, that solely serve to detect MitM redirects and show login prompts.
DHCP isn't reliable as many clients don't do anything with advanced settings you provide. Good luck getting a phone to accept the proxy server you've configured over DHCP.
This legitimate-ish DPI usually runs at the network itself, it doesn't traverse the uplink to cause any load.
This isn't a cat-and-mouse game between DPIs and the browser; it's a hamfisted ad-tech tactic deployed by ISPs that straight-up breaks a browser feature, and the browser using the DNS to unbreak it.
60B queries a day is nothing; we're talking about root servers that serve as the foundation for the Internet as a whole.
The other thing that people would be very surprised about is how old the software is on those root servers. Forget modern libraries with Rust/C++ and the like; it's pretty old tech that is very inefficient.
Edit: when talking about old tech, I'm talking about the architecture of the DNS server used, the IO libraries and models, caching, data structures and the like. A lot of work has been done around web servers, for example, to serve things very efficiently; the same could be done on DNS servers.
There is nothing about Rust or C++ that makes them faster than C.
In what way are the root servers inefficient?
I can guarantee you that this "old tech" was coded with more thought invested into it than at least half of these "modern libraries".
Well-written C code can easily blow a C++/Rust application out of the water.
What an arrogant hack. It's hard to believe Google engineers would find that a reasonable approach. If I'd seen that during a code review, I would have called it out and explained that you shouldn't abuse someone else's systems.
Then again, maybe it's revenge for everyone pinging 8.8.8.8
> What an arrogant hack. It's hard to believe Google engineers would find that a reasonable approach.
Reminds me of the recent "Go module mirror fiasco" where Google found it fair to clone repositories at a rate of ~2,500 per hour in order to essentially proxy Go modules.
- "Sourcehut will blacklist the Go module mirror" - https://news.ycombinator.com/item?id=34310674
After the drama became very public, they finally decided to address the issue in a good way.
That doesn't at all seem like what happened (the changes were in the works prior to the drama), but this is a high-drama tangent unrelated to the story here.
This isn't some tiny authoritative DNS server being flooded unexpectedly with queries. These are the Internet DNS root services. They have to keep up with this kind of traffic. It's their literal job description.
But I dislike the whole GOPROXY design, like how it breaks private repos by default and makes you set env variables just to get this stupid tool to download stuff from the server I told it to download from.
This isn't an "abuse". These are the Internet root DNS servers. We don't design applications to tiptoe around them any more than we tiptoe around the capacity of core routers. These is a huge amount of capacity and all of root DNS together is a rounding error on total Internet traffic.
Chromium is imposing the extra load on the system, so yes it is the abuser in that sense. That they are doing it for what is a good reason for the app's users is immaterial to whether the effect is bad for the root DNS servers.
If Karen takes my sandwich from the company fridge, so I take some of Jon's lunch, so I don't starve, I'm not innocent because Karen started it, I've created a situation where there are two arseholes instead of one. This isn't quite what is happening here as the root servers are effectively a public resource and stuff in the fridge is all private resources, but close enough to make the point.
> The networks that hijack DNS requests should share some of the blame
They should have all the blame for deliberately breaking part of agreed protocols for their own gain.
But that doesn't make anything we do in response to that right by virtue of us doing it because we have been wronged.
They should, but Google's "solution" to the hijacking is the problem here. If you're going to hijack NXDOMAINs, you can just ignore requests to non-existent TLDs in your scheme and Chrome will be none the wiser.
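That evasion is easy to picture: because the probe names are single random labels, a hijacker only needs to forge answers when the queried name ends in a TLD that actually exists. A sketch of the middlebox's side, with a toy TLD set standing in for the real IANA list:

    # Toy stand-in for the real list at https://data.iana.org/TLD/tlds-alpha-by-domain.txt
    REAL_TLDS = {"com", "net", "org", "io", "dev"}

    def should_forge(qname: str) -> bool:
        # Probe names like "kjhgsdkjh" have no real TLD; passing their
        # NXDOMAINs through untouched keeps the probe happy, while real
        # typos under existing TLDs still get redirected.
        tld = qname.rstrip(".").rsplit(".", 1)[-1].lower()
        return tld in REAL_TLDS

    print(should_forge("kjhgsdkjh"))            # False: let the probe see NXDOMAIN
    print(should_forge("misspeled-site.com"))   # True: forge a redirect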
Google manages entire TLDs, surely they can use their own DNS servers for this purpose.
The problem is, there is no better approach to check for intercepting middleboxes. Using a well-known path (like gstatic.com/generate_204) works for detecting if the user has a captive portal between them and the Internet, but not if the user's provider messes around with DNS.
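The well-known-path check itself is simple: fetch a URL that is guaranteed to return an empty 204 response, and treat anything else as a sign of a middlebox. A sketch against the gstatic endpoint mentioned above (plain HTTP deliberately, since a portal can rewrite that without certificate errors):

    from urllib.request import urlopen

    def behind_captive_portal(url="http://www.gstatic.com/generate_204"):
        try:
            with urlopen(url, timeout=5) as resp:
                # The real endpoint returns 204 with an empty body; a portal
                # typically rewrites it into a 200/302 login page (urlopen
                # follows the redirect, so the status still won't be 204).
                return not (resp.status == 204 and resp.read() == b"")
        except OSError:
            return True  # No connectivity at all also fails the check.

    print("captive portal suspected:", behind_captive_portal())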
> there is no better approach to check for intercepting middleboxes
I can see some purposes for detecting middleboxes. I've done it. It usually doesn't involve DNS. It does involve certificate pinning though.
> detecting if the user has a captive portal between them and the Internet
That's easy. Try to browse to something. If succeeds but the certificate isn't valid then the user probably has a captive portal. That, or your pinned certificate has been revoked.
> detecting ... if the user's provider messes around with DNS
Certificate pinning, again, comes to the rescue. Pin a certificate to your own DoH server and then use DoH to look up whatever you need.
If you can't connect to your DoH server then you effectively aren't (or shouldn't be) connected to the internet.
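A sketch of that approach: check the DoH server's certificate fingerprint against a pinned value before trusting its answers. The hostname and fingerprint below are placeholders, the JSON /resolve endpoint follows the shape Google's public resolver exposes (RFC 8484 wire format works too), and a real client would pin the SPKI on the very connection it queries over rather than on a separate one:

    import hashlib
    import json
    import socket
    import ssl
    from urllib.request import urlopen

    PINNED_SHA256 = "..."  # expected certificate fingerprint (placeholder)

    def cert_fingerprint(host, port=443):
        ctx = ssl.create_default_context()
        with socket.create_connection((host, port), timeout=5) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                der = tls.getpeercert(binary_form=True)
        return hashlib.sha256(der).hexdigest()

    def pinned_doh_lookup(name, doh_host="doh.example.net"):
        if cert_fingerprint(doh_host) != PINNED_SHA256:
            raise RuntimeError("pin mismatch: the path is being intercepted")
        with urlopen(f"https://{doh_host}/resolve?name={name}&type=A") as resp:
            data = json.loads(resp.read())
        return [a["data"] for a in data.get("Answer", [])]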
Well... the root servers, and the DNS platform in general, provide name resolution. Traditionally we think of that as a feature/function mapping name -> IP. But there are non-functional aspects to consider as well. Distribution is one (DNS is really good at that); security (DNSSEC comes to mind) and multi-transport support (e.g. DNS over HTTPS) are others. And as seen in this case, there is a non-functional need for interception to be detectable. DNS does not deliver that (yet), but it is clearly something the DNS platform needs.
So calling it abuse is wrong. It is a dirty hack around a missing feature. It is technical debt of the DNS platform, and the root servers suffer for it because the ISPs and in-house DNS resolvers create the problem.
We can still be angry at Google anyway. They have enough money, enough people, and enough power to fix this either financially or as a proper feature within the DNS platform.
Google makes money when people use 8.8.8.8, so there's an even more sinister interpretation: Chrome has all its users DDoS other DNS servers to raise the table stakes on running one. Now your infrastructure needs to handle what is apparently a whole second Internet's worth of bogus traffic. Google, of course, has the resources to meet this higher bar.
All of this extra traffic is to support the misfeature where the URL bar and search bar are combined! They are two different functions. When I type a domain I don't want the browser to send that straight to Google and give me a search result page. It makes a little bit of sense on phones where space is tight, but on desktop browsers it's just annoying.
It's an extraordinarily useful feature that millions of users take advantage of every day. The idea that the operators of core Internet infrastructure would be pressuring applications not to have these kinds of features is what should alarm you, not that the features exist. The cart doesn't drag the horse.
Sadly, switching back to the old behavior has problems in Firefox. If you go to google.com and start typing in the search box, it redirects your typing to the URL bar, where the search fails because the URL bar is no longer a search box.