Industry will do absolutely anything except make lightweight sites.
We had instant internet in the late 90s, if you were lucky enough to have a fast connection. The pages were small and there was barely any JavaScript. You can still find such fast-loading lightweight pages today, and the experience is almost surreal.
It feels like the page has completely loaded before you've even released the mouse button.
If only the user experience were better it might have been tolerable, but we didn't get that either.
Honestly I used to be on the strict noscript JavaScript hate train.
But if your site works fast, loads fast, with _a little_ JS that actually improves functionality and usability? I think that's completely fine. Minimal JS for the win.
When I was finishing university I bought into the framework-based web-development hype. I thought that "enterprise" web development had to be done this way, so I got some experience by migrating my homepage to a static Vue.js version.
Binding view and state by passing the variable's name as a string felt off, extending the build environment seemed unnecessarily complex, and everything was slow and had to be done a certain way.
But since everyone was using it, I figured it must be right.
I got over this view and just finished the new version of my page: raw HTML with some static-site-generator templating. The HTML size went down 90%, the JS usage went down 97%, and the build time is now 2s instead of 20s. The user experience is better and I get 30% more hits since the new version.
Choose the right tool for the job. Every engineering decision is a trade-off. No one blames the hammer when it's used to insert a screw into a wall either.
SPA frameworks like Vue, React and Angular are ideal for web apps. Web apps and web sites are very different. For web apps, initial page load doesn't matter a lot and business requirements are often complex. For websites it's exactly the opposite. So if all you need is a static website with little to no interactivity, why did you choose a framework?
My personal projects are all server rendered HTML. My blog (a statically rendered Hugo site) has no JS at all, my project (Rails and server rendered HTML) has minimal JS that adds some nice to have stuff but nothing else (it works with no JS). I know they're my sites, but the experience is just so much better than most of the rest of the web. We've lost so much.
I have two websites written in JS that render entirely server-side. They are blazing fast, minimal in size and reach 100/100 scores on all criteria with Lighthouse. On top of that they're highly interactive, with no build step required to publish a new article.
And users clearly appreciate it. I was going over some bolt types with a design guy at my workplace yesterday for a project, and his first instinct was to pull up the McMaster-Carr site to see what was possible. I don't know if we even order from them, since that goes through the purchasing folks, but the site is just brilliantly simple and elegant.
Someone did an analysis of that site on TikTok or YouTube. It uses some tricks to speed things up, like preloading the HTML for the next page on hover and then replacing the shell of the page on click, i.e. prefetching plus pre-rendering. Pretty simple to do, and apparently effective.
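The trick is easy to sketch. Below is a minimal, hypothetical TypeScript version (not the site's actual code): fetch the linked page's HTML when the user hovers, then swap it in on click instead of doing a full navigation.

```typescript
// Cache-on-hover, swap-on-click: a rough sketch of the prefetch trick described above.
const cache = new Map<string, string>();

document.querySelectorAll<HTMLAnchorElement>("a[href]").forEach((link) => {
  link.addEventListener("mouseover", async () => {
    if (!cache.has(link.href)) {
      const res = await fetch(link.href);       // prefetch the next page's HTML
      cache.set(link.href, await res.text());
    }
  });

  link.addEventListener("click", (event) => {
    const html = cache.get(link.href);
    if (!html) return;                           // fall back to a normal navigation
    event.preventDefault();
    const doc = new DOMParser().parseFromString(html, "text/html");
    document.body.replaceWith(doc.body);         // swap in the prefetched body
    history.pushState({}, "", link.href);
  });
});
```

A real implementation would also swap the title, re-bind these listeners on the new content, and handle back/forward navigation, but the core of the trick is just this cache-on-hover, swap-on-click loop.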
At Google, I worked on a pure JS Speedtest. At the time, Ookla was still Flash-based so wouldn't work on Chromebooks. That was a problem for installers to verify an installation. I learned a lot about how TCP (I realize QUIC is UDP) responds to various factors.
I look at this article and consider the result pretty much as expected. Why? Because it pushes the flow control out of the kernel (and possibly network adapters) into userspace. TCP has flow-control and sequencing. QUICK makes you manage that yourself (sort of).
Now there can be good reasons to do that. TCP congestion control is famously out-of-date with modern connection speeds, leading to newer algorithms like BRR [1] but it comes at a cost.
But here's my biggest takeaway from all that and it's something so rarely accounted for in network testing, testing Web applications and so on: latency.
Anyone who lives in Asia or Australia should relate to this. 100ms RTT latency can be devastating. It can take something that is completely responsive to utterly unusable. It slows down the bandwidth a connection can support (because of the windows) and make it less responsive to errors and congestion control efforts (both up and down).
I would strongly urge anyone testing a network or Web application to run tests where they randomly add 100ms to the latency [2].
My point in bringing this up is that the overhead of QUIC may not practically matter because your effective bandwidth over a single TCP connection (or QUICK stream) may be MUCH lower than your actual raw bandwidth. Put another way, 45% extra data may still be a win because managing your own congestion control might give you higher effective speed over between two parties.
I did a bunch of real world testing of my file transfer app[1]. Went in with the expectation that Quic would be amazing. Came out frustrated for many reasons and switched back to TCP. It’s obvious in hindsight, but with TCP you say “hey kernel send this giant buffer please” whereas UDP is packet switched! So even pushing zeroes has a massive CPU cost on most OSs and consumer hardware, from all the mode switches. Yes, there are ways around it but no they’re not easy nor ready in my experience. Plus it limits your choice of languages/libraries/platforms.
(Fun bonus story: I noticed significant drops in throughput when using battery on a MacBook. Something to do with the efficiency cores I assume.)
Secondly, quic does congestion control poorly (I was using quic-go so mileage may vary). No tuning really helped, and TCP streams would take more bandwidth if both were present.
Third, the APIs are weird man. So, quic itself has multiple streams, which makes it non-drop in replacement with TCP. However, the idea is to have HTTP/3 be drop-in replaceable at a higher level (which I can’t speak to because I didn’t do). But worth keeping in mind if you’re working on the stream level.
In conclusion I came out pretty much defeated but also with a newfound respect for all the optimizations and resilience of our old friend tcp. It’s really an amazing piece of tech. And it’s just there, for free, always provided by the OS. Even some of the main issues with tcp are not design faults but conservative/legacy defaults (buffer limits on Linux, Nagle, etc). I really just wish we could improve it instead of reinventing the wheel..
> Because it pushes the flow control out of the kernel (and possibly network adapters) into userspace
That’s not an inherent property of the QUIC protocol, it is just an implementation decision - one that was very necessary for QUIC to get off the ground, but now it exists, maybe it should be revisited? There is no technical obstacle to implementing QUIC in the kernel, and if the performance benefits are significant, almost surely someone is going to do it sooner or later.
The Network tab in the Chrome console allows you to degrade your connection. There are presets for Slow/Fast 4G, 3G, or you can make a custom present where you can specify download and upload speeds, latency in ms, a packet loss percent, a packet queue length and can enable packet reordering.
Chrome’s network emulation is a pretty poor simulation of the real world… it throttles on a per request basis so can’t simulate congestion due to multiple requests in flight at the same time
Really need something like ipfw, dummynet, tc etc to do it at the packet level
> I look at this article and consider the result pretty much as expected. Why? Because it pushes the flow control out of the kernel (and possibly network adapters) into userspace. TCP has flow-control and sequencing. QUICK makes you manage that yourself (sort of).
This implies that user space is slow. Yet, some(most?) of the fastest high-performance TCP/IP stacks are made in user space.
If the entire stack is in usermode and it's directly talking to the NIC with no kernel involvement beyond setup at all. This isn't the case with QUIC, it uses the normal sockets API to send/recv UDP.
I would be surprised if online games use TCP.
Anyway, physics is still there and light speed is fast, but that much. In 10ms it travels about 3000km, NZ to US west coast is about 11000km, so less than 60ms is impossible. Cables are probably much longer, c speed is lower in a medium, add network devices latency and 200ms from NZ to USA is not that bad.
> I look at this article and consider the result pretty much as expected. Why? Because it pushes the flow control out of the kernel (and possibly network adapters) into userspace. TCP has flow-control and sequencing. QUICK makes you manage that yourself (sort of).
I truly hope the QUIC in Linux Kernel project [0] succeeds. I'm not looking forward to linking big HTTP/3 libraries to all applications.
I've been tasked with improving a system where a lot of the events relied on timing to be just right, so now I routinely click around the app with a 900ms delay, as that's the most that I can get away with without having the hot-reloading system complain.
Plenty of assumptions break down in such an environment and part of my work is to ensure that the user always knows that the app is really doing something and not just being unresponsive.
For reasonably long downloads (so it has a chance to calibrate), why don't congestion algorithms increase the number of inflight packets to a high enough number that bandwidth is fully utilized even over high latency connections?
It seems like it should never be the case that two parallel downloads will preform better than a single one to the same host.
There are two places a packet can be ‘in-flight’. One is light travelling down cables (or the electrical equivalent) or in memory being processed by some hardware like a switch, and the other is sat in a buffer in some networking appliance because the downstream connection is busy (eg sending packets that are further up the queue, at a slower rate than they arrive). If you just increase bandwidth it is easy to get lots of in-flight packets in the second state which increases latency (admittedly that doesn’t matter so much for long downloads) and the chance of packet loss from overly full buffers.
CUBIC tries to increase bandwidth until it hits packet loss, then cuts bandwidth (to drain buffers a bit) and ramps up and hangs around close to the rate that led to loss, before it tries sending at a higher rate and filling up buffers again. Cubic is very sensitive to packet loss, which makes things particularly difficult on very high bandwidth links with moderate latency as you need very low rates of (non-congestion-related) loss to get that bandwidth.
BBR tries to do the thing you describe while also modelling buffers and trying to keep them empty. It goes through a cycle of sending at the estimated bandwidth, sending at a lower rate to see if buffers got full, and sending at a higher rate to see if that’s possible, and the second step can be somewhat harmful if you don’t need the advantages of BBR.
I think the main thing that tends to prevent the thing you talk about is flow control rather than congestion control. In particular, the sender needs a sufficiently large send buffer to store all unacked data (which can be a lot due to various kinds of ack-delaying) in case it needs to resend packets, and if you need to resend some then your send buffer would need to be twice as large to keep going. On the receive size, you need big enough buffers to be able to fill up those buffers from the network while waiting for an earlier packet to be retransmitted.
On a high-latency fast connection, those buffers need to be big to get full bandwidth, and that requires (a) growing a lot, which can take a lot of round-trips, and (b) being allowed by the operating system to grow big enough.
I've run a big webserver that served a decent size apk/other app downloads (and a bunch of small files and what nots). I had to set the maximum outgoing window to keep the overall memory within limits.
IIRC, servers were 64GB of ram and sendbufs were capped at 2MB. I was also dealing with a kernel deficiency that would leave the sendbuf allocated if the client disappeared in LAST_ACK. (This stems from a deficiency in the state description from the 1981 rfc written before my birth)
You can in theory. You just need a accurate model of your available bandwidth and enough buffering/storage to avoid stalls while you wait for acknowledgement. It is, frankly, not even that hard to do it right. But in practice many implementations are terrible, so good luck.
A major problem with TCP is that the limitations of the kernel network stack and sometimes port allocation place absurd artificial limits on the number of active connections. A modern big server should be able to have tens of millions of open TCP connections at least, but to do that well you have to do hacks like running a bunch of pointless VMs.
> A modern big server should be able to have tens of millions of open TCP connections at least, but to do that well you have to do hacks like running a bunch of pointless VMs.
Inbound connections? You don't need to do anything other than make sure your fd limit is high and maybe not be ipv4 only and have too many users behind the same cgnat.
Outbound connections is harder, but hopefully you don't need millions of connections to the same destination, or if you do, hopefully they support ipv6.
When I ran millions of connections through HAproxy (bare tcp proxy, just some peaking to determine the upstream), I had to do a bunch of work to make it scale, but not because of port limits.
As an alternative to simulating latency: How about using a VPN service to test your website via Australia? I suppose that when it easier to do, it is more likely that people will actually do this test.
Two recommendations are for improving receiver-side implementations – optimising them and making them multithreaded. Those suggest some immaturity of the implementations. A third recommendation is UDP GRO, which means modifying kernels and ideally NIC hardware to group received UDP packets together in a way that reduces per-packet work (you do lots of per-group work instead of per-packet work). This already exists in TCP and there are similar things on the send side (eg TSO, GSO in Linux), and feels a bit like immaturity but maybe harder to remedy considering the potential lack of hardware capabilities. The abstract talks about the cost of how acks work in QUIC but I didn’t look into that claim.
Another feature you see for modern tcp-based servers is offloading tls to the hardware. I think this matters more for servers that may have many concurrent tcp streams to send. On Linux you can get this either with userspace networking or by doing ‘kernel tls’ which will offload to hardware if possible. That feature also exists for some funny stuff in Linux about breaking down a tcp stream into ‘messages’ which can be sent to different threads, though I don’t know if it allows eagerly passing some later messages when earlier packets were lost.
I think that’s a pretty good impression. Lots of features for those cases:
- better behaviour under packet loss (you don’t need to read byte n before you can see byte n+1 like in tcp)
- better behaviour under client ip changes (which happen when switching between cellular data and wifi)
- moving various tricks for getting good latency and throughput in the real world into user space (things like pacing, bbr) and not leaving enough unencrypted information in packets for middleware boxes to get too funky
That's how the internet works, there's no guaranteed delivery and TCP bandwidth estimation is based on when packets start to be dropped when you send too many.
"immaturity of the implementations" is a funny wording here. QUIC was created because there is absolutely NO WAY that all internet hardware (including all middleware etc) out there will support a new TCP or TLS standard. So QUIC is an elegant solution to get a new transport standard on top of legacy internet hardware (on top of UDP).
In an ideal World we would create a new TCP and TLS standard and replace and/or update all internet routers and hardware everywhere World Wide so that it is implemented with less CPU utilization ;)
A major mistake in IP’s design was to allow middle boxes. The protocol should have had some kind of minimal header auth feature to intentionally break them. It wouldn’t have to be strong crypto, just enough to make middle boxes impractical.
It would have forced IPv6 migration immediately (no NAT) and forced endpoints to be secured with local firewalls and better software instead of middle boxes.
The Internet would be so much simpler, faster, and more capable. Peer to peer would be trivial. Everything would just work. Protocol innovation would be possible.
Of course tech is full of better roads not taken. We are prisoners of network effects and accidents of history freezing ugly hacks into place.
Enable http/3 + quic between client browser <> edge and restrict edge <> origin connections to http/2 or http/1
Cloudflare (as an example) only supports QUIC between client <> edge and doesn’t support it for connections to origin. Makes sense if the edge <> origin connection is reusable, stable, and “fast”.
Just as important is > we identify the root cause to be high receiver-side processing overhead, in particular, excessive data packets and QUIC's user-space ACKs
It doesn't sound like there's a fundamental issue with the protocol.
They also mainly identified a throughput reduction due to latency issues caused by ineffective/too many syscalls in how browsers implement it.
But such a latency issue isn't majorly increasing battery usage (compared to a CPU usage issue which would make CPUs boost). Nor is it an issue for server-to-server communication.
It basically "only" slows down high bandwidth transmissions on end user devices with (for 2024 standards) very high speed connection (if you take effective speeds from device to server, not speeds you where advertised to have bough and at best can get when the server owner has a direct pairing agreement with you network provider and a server in your region.....).
Doesn't mean the paper is worthless, browser should improve their impl. and it highlights it.
But the title of the paper is basically 100% click bait.
Internet access is only going to become faster. Switching to a slower transport just as Gigabit internet is proliferating would be a mistake, obviously.
It depends on whether it’s meaningfully slower. QUIC is pretty optimized for standard web traffic, and more specifically for high-latency networks. Most websites also don’t send enough data for throughput to be a significant issue.
I’m not sure whether it’s possible, but could you theoretically offload large file downloads to HTTP/2 to get best of both worlds?
In terms of maximum available throughput it will obviously become greater. What's less clear is if the median and worst throughput available throughout a nation or the world will continue to become substantially greater.
It's simply not economical enough to lay fibre and put 5G masts everywhere (5G LTE bands covers less area due to being higher frequency, and so are also limited to being deployed in areas with a higher enough density to be economically justifiable).
For local purposes that's certainly true. It seems that quic trades a faster connection establishment for lower throughput. I personally prefer tcp anyway.
A gigabit connection is just one prerequisite. The server also has to be sending very big bursts of foreground/immediate data or you're very unlikely to notice anything.
It's wild that 1gbit LAN has been "standard" for so long that the internet caught up.
Meanwhile, low-end computers ship with a dozen 10+Gbit class transceivers on USB, HDMI, Displayport, pretty much any external port except for ethernet, and twice that many on the PCIe backbone. But 10Gbit ethernet is still priced like it's made from unicorn blood.
My personal takeaway from that: Perhaps we shouldn't let Google design and more or less unilaterally dictate and enforce internet protocol usage via Chromium.
Brave/Vivaldi/Opera/etc: You should make a conscious choice.
Having read through that thread, most of the (top) comments are somewhat related to the lacking performance of the UDP/QUIC stack and thoughts on the meaningfulness of the speeds in the test. There is a single comment suggesting HTTP/2 was rushed (because server push was later deprecated).
QUIC is also acknowledged as being quite different from the Google version, and incorporating input from many different people.
Could you expand more on why this seems like evidence that Google unilaterally dictating bad standards? None of the changes in protocol seem objectively wrong (except possibly Server Push).
Disclaimer: Work at Google on networking, but unrelated to QUIC and other protocol level stuff.
Maybe, but QUIC is not bad as a protocol. The problem here is that OSes are not as well optimized for QUIC as they are for TCP. Just give it time, the paper even has suggestions.
QUIC has some debatable properties, like mandatory encryption, or the use of UDP instead of being a protocol under IP like TCP, but there are good reasons for it, related to ossification.
Yes, Google pushed for it, but I think it deserves its approval as a standard. It is not perfect but it is practical, they don't want another IPv6 situation.
So because the Linux kernel isn’t as optimized for QUIC as it has been for TCP we shouldn’t design new protocols? Or it should be restricted to academics that had tried and failed for decades and would have had all the same problems even if they succeeded? And all of this only in a data center environment really and less about the general internet Quic was designed for?
This sounds really really wrong. I've achieved 900mbps speeds on quic+http3 and just quic... Seems like a bad TLS implementation? Early implementation that's not efficient? The CPU usage seemed pretty avg at around 5% on gen 2 epyc cores.
This is actually very well known: current QUIC implementation in browsers is *not stable* and is built of either rustls or in another similar hacky way.
Anecdote: I was having trouble accessing wordpress.org. When I started using Wordpress, I could access the documentation just fine, but then suddenly I couldn't access the website anymore. I dual boot Linux, so it wasn't Windows fault. I could ping them just fine. I tried three different browsers with the same issue. It's just that when I accessed the website, it would get stuck and not load at all, and sometimes pages would just stop loading mid-way.
Today I found the solution. Disable "Experimental QUIC Protocol" in Chrome settings.
This makes me kind of worried because I've had issues accessing wordpress.org for months. There was no indication that this was caused by QUIC. I just managed to realize it because there was QUIC-related error in devtools that appeared only sometimes.
I wonder what other websites are rendered inaccessible by this protocol and users have no idea what is causing it.
Here “fast internet” is 500Mbps, and the reason is that quic seems to be cpu bound above that.
I didn’t look closely enough to see what their test system was to see if this is basic consumer systems or is still a problem for high performance desktops.
We had instant internet in the late 90s, if you were lucky enough to have a fast connection. The pages were small and there were barely any javascript. You can still find such fast loading lightweight pages today and the experience is almost surreal.
It feels like the page has completely loaded before you even released the mousebutton.
If only the user experience were better it might have been tolerable but we didn't get that either.
It's a blast. It's faster and way more resilient. No more state desync between frontend and backend.
I admit there is a minimum of javascript (currently a few hundred lines) for convenience.
I'll add a bit more to add the illusion this is still a SPA.
I'll kill about 40k lines of React that way and about 20k lines of Kotlin.
I'll have to rewrite about 30k lines of backend code though.
Still, I love it.
But if your site works fast. Loads fast. With _a little_ JS that actually improves the functionality+usability in? I think that's completely fine. Minimal JS for the win.
I got over this view and just finished the new version of my page. Raw HTML with some static-site-generator templating. The HTML size went down 90%, the JS usage went down 97% and build time is now 2s instead of 20s. The user experience is better and i get 30% more hits since the new version.
The web could be so nice of we used less of it.
SPA frameworks like Vue, React and Angular are ideal for web apps. Web apps and web sites are very different. For web apps, initial page load doesn't matter a lot and business requirements are often complex. For websites it's exactly the opposite. So if all you need is a static website with little to no interactivity, why did you choose a framework?
Even on the backend, now the golden goose is to sell microservices, via headless SaaS products connected via APIs, that certainly is going to perform.
https://macharchitecture.com/
However if those are the shovels people are going to buy, then those are the ones we have to stockpile, so is the IT world.
I look at this article and consider the result pretty much as expected. Why? Because it pushes the flow control out of the kernel (and possibly network adapters) into userspace. TCP has flow-control and sequencing. QUICK makes you manage that yourself (sort of).
Now there can be good reasons to do that. TCP congestion control is famously out-of-date with modern connection speeds, leading to newer algorithms like BRR [1] but it comes at a cost.
But here's my biggest takeaway from all that and it's something so rarely accounted for in network testing, testing Web applications and so on: latency.
Anyone who lives in Asia or Australia should relate to this. 100ms RTT latency can be devastating. It can take something that is completely responsive to utterly unusable. It slows down the bandwidth a connection can support (because of the windows) and make it less responsive to errors and congestion control efforts (both up and down).
I would strongly urge anyone testing a network or Web application to run tests where they randomly add 100ms to the latency [2].
My point in bringing this up is that the overhead of QUIC may not practically matter because your effective bandwidth over a single TCP connection (or QUICK stream) may be MUCH lower than your actual raw bandwidth. Put another way, 45% extra data may still be a win because managing your own congestion control might give you higher effective speed over between two parties.
[1]: https://atoonk.medium.com/tcp-bbr-exploring-tcp-congestion-c...
[2]: https://bencane.com/simulating-network-latency-for-testing-i...
(Fun bonus story: I noticed significant drops in throughput when using battery on a MacBook. Something to do with the efficiency cores I assume.)
Secondly, quic does congestion control poorly (I was using quic-go so mileage may vary). No tuning really helped, and TCP streams would take more bandwidth if both were present.
Third, the APIs are weird man. So, quic itself has multiple streams, which makes it non-drop in replacement with TCP. However, the idea is to have HTTP/3 be drop-in replaceable at a higher level (which I can’t speak to because I didn’t do). But worth keeping in mind if you’re working on the stream level.
In conclusion I came out pretty much defeated but also with a newfound respect for all the optimizations and resilience of our old friend tcp. It’s really an amazing piece of tech. And it’s just there, for free, always provided by the OS. Even some of the main issues with tcp are not design faults but conservative/legacy defaults (buffer limits on Linux, Nagle, etc). I really just wish we could improve it instead of reinventing the wheel..
[1]: https://payload.app/
That sounds like the thread priority/QoS was incorrect, but it could be WiFi or something.
That’s not an inherent property of the QUIC protocol, it is just an implementation decision - one that was very necessary for QUIC to get off the ground, but now it exists, maybe it should be revisited? There is no technical obstacle to implementing QUIC in the kernel, and if the performance benefits are significant, almost surely someone is going to do it sooner or later.
IIRC, Chrome's network simulation just applies a delay after a connection is established
Really need something like ipfw, dummynet, tc etc to do it at the packet level
This implies that user space is slow. Yet, some(most?) of the fastest high-performance TCP/IP stacks are made in user space.
When I used to (try to) play online games in NZ a few years ago, RTT to US West servers sometimes exceeded 200ms.
DSL kills.
I truly hope the QUIC in Linux Kernel project [0] succeeds. I'm not looking forward to linking big HTTP/3 libraries to all applications.
[0] https://github.com/lxin/quic
Dead Comment
Plenty of assumptions break down in such an environment and part of my work is to ensure that the user always knows that the app is really doing something and not just being unresponsive.
It seems like it should never be the case that two parallel downloads will preform better than a single one to the same host.
CUBIC tries to increase bandwidth until it hits packet loss, then cuts bandwidth (to drain buffers a bit) and ramps up and hangs around close to the rate that led to loss, before it tries sending at a higher rate and filling up buffers again. Cubic is very sensitive to packet loss, which makes things particularly difficult on very high bandwidth links with moderate latency as you need very low rates of (non-congestion-related) loss to get that bandwidth.
BBR tries to do the thing you describe while also modelling buffers and trying to keep them empty. It goes through a cycle of sending at the estimated bandwidth, sending at a lower rate to see if buffers got full, and sending at a higher rate to see if that’s possible, and the second step can be somewhat harmful if you don’t need the advantages of BBR.
I think the main thing that tends to prevent the thing you talk about is flow control rather than congestion control. In particular, the sender needs a sufficiently large send buffer to store all unacked data (which can be a lot due to various kinds of ack-delaying) in case it needs to resend packets, and if you need to resend some then your send buffer would need to be twice as large to keep going. On the receive size, you need big enough buffers to be able to fill up those buffers from the network while waiting for an earlier packet to be retransmitted.
On a high-latency fast connection, those buffers need to be big to get full bandwidth, and that requires (a) growing a lot, which can take a lot of round-trips, and (b) being allowed by the operating system to grow big enough.
IIRC, servers were 64GB of ram and sendbufs were capped at 2MB. I was also dealing with a kernel deficiency that would leave the sendbuf allocated if the client disappeared in LAST_ACK. (This stems from a deficiency in the state description from the 1981 rfc written before my birth)
Inbound connections? You don't need to do anything other than make sure your fd limit is high and maybe not be ipv4 only and have too many users behind the same cgnat.
Outbound connections is harder, but hopefully you don't need millions of connections to the same destination, or if you do, hopefully they support ipv6.
When I ran millions of connections through HAproxy (bare tcp proxy, just some peaking to determine the upstream), I had to do a bunch of work to make it scale, but not because of port limits.
One of the things he highlighted was the higher CPU utilization of HTTP/3, to the point where CPU can limit throughput.
I wonder how much of this is due to the immaturity of the implementations, and how much this is inherit due to way QUIC was designed?
Another feature you see for modern tcp-based servers is offloading tls to the hardware. I think this matters more for servers that may have many concurrent tcp streams to send. On Linux you can get this either with userspace networking or by doing ‘kernel tls’ which will offload to hardware if possible. That feature also exists for some funny stuff in Linux about breaking down a tcp stream into ‘messages’ which can be sent to different threads, though I don’t know if it allows eagerly passing some later messages when earlier packets were lost.
I never got the impression that it was intended to make all connections faster.
If viewed from that perspective, the tradeoffs make sense. Although I’m no expert and encourage someone with more knowledge to correct me.
- better behaviour under packet loss (you don’t need to read byte n before you can see byte n+1 like in tcp)
- better behaviour under client ip changes (which happen when switching between cellular data and wifi)
- moving various tricks for getting good latency and throughput in the real world into user space (things like pacing, bbr) and not leaving enough unencrypted information in packets for middleware boxes to get too funky
https://www.youtube.com/watch?v=cdb7M37o9sU
In an ideal World we would create a new TCP and TLS standard and replace and/or update all internet routers and hardware everywhere World Wide so that it is implemented with less CPU utilization ;)
It would have forced IPv6 migration immediately (no NAT) and forced endpoints to be secured with local firewalls and better software instead of middle boxes.
The Internet would be so much simpler, faster, and more capable. Peer to peer would be trivial. Everything would just work. Protocol innovation would be possible.
Of course tech is full of better roads not taken. We are prisoners of network effects and accidents of history freezing ugly hacks into place.
His testing has CPU-bound quiche at <200MB/s and nghttp2 was >900MB/s.
I wonder if the CPU was throttled.
Because if HTTP 3 impl took 4x CPU that could be interesting but not necessarily a big problem if the absolute value was very low to begin with.
Haven't read the whole paper yet, but below 600 Mbit/s is implied as being "Slow Internet" in the intro.
Enable http/3 + quic between client browser <> edge and restrict edge <> origin connections to http/2 or http/1
Cloudflare (as an example) only supports QUIC between client <> edge and doesn’t support it for connections to origin. Makes sense if the edge <> origin connection is reusable, stable, and “fast”.
https://developers.cloudflare.com/speed/optimization/protoco...
It doesn't sound like there's a fundamental issue with the protocol.
But such a latency issue isn't majorly increasing battery usage (compared to a CPU usage issue which would make CPUs boost). Nor is it an issue for server-to-server communication.
It basically "only" slows down high bandwidth transmissions on end user devices with (for 2024 standards) very high speed connection (if you take effective speeds from device to server, not speeds you where advertised to have bough and at best can get when the server owner has a direct pairing agreement with you network provider and a server in your region.....).
Doesn't mean the paper is worthless, browser should improve their impl. and it highlights it.
But the title of the paper is basically 100% click bait.
I’m not sure whether it’s possible, but could you theoretically offload large file downloads to HTTP/2 to get best of both worlds?
I grew up with 2400 baud modems as the super fast upgrade, so talk of multiple gigabits for consumers is blowing my mind a bit.
It's simply not economical enough to lay fibre and put 5G masts everywhere (5G LTE bands covers less area due to being higher frequency, and so are also limited to being deployed in areas with a higher enough density to be economically justifiable).
In 30 years it will be even faster. It would be silly to have to use older protocols to get line speed.
Meanwhile, low-end computers ship with a dozen 10+Gbit class transceivers on USB, HDMI, Displayport, pretty much any external port except for ethernet, and twice that many on the PCIe backbone. But 10Gbit ethernet is still priced like it's made from unicorn blood.
Deleted Comment
Or rather, not "Fast Internet"
QUIC is not quick enough over fast internet (acm.org)
https://news.ycombinator.com/item?id=41484991 (327 comments)
Brave/Vivaldi/Opera/etc: You should make a conscious choice.
QUIC is also acknowledged as being quite different from the Google version, and incorporating input from many different people.
Could you expand more on why this seems like evidence that Google unilaterally dictating bad standards? None of the changes in protocol seem objectively wrong (except possibly Server Push).
Disclaimer: Work at Google on networking, but unrelated to QUIC and other protocol level stuff.
QUIC has some debatable properties, like mandatory encryption, or the use of UDP instead of being a protocol under IP like TCP, but there are good reasons for it, related to ossification.
Yes, Google pushed for it, but I think it deserves its approval as a standard. It is not perfect but it is practical, they don't want another IPv6 situation.
This is an interesting hot take.
As long as the adverts arrive quickly the rest is immaterial.
Today I found the solution. Disable "Experimental QUIC Protocol" in Chrome settings.
This makes me kind of worried because I've had issues accessing wordpress.org for months. There was no indication that this was caused by QUIC. I just managed to realize it because there was QUIC-related error in devtools that appeared only sometimes.
I wonder what other websites are rendered inaccessible by this protocol and users have no idea what is causing it.
I didn’t look closely enough to see what their test system was to see if this is basic consumer systems or is still a problem for high performance desktops.