> By relying on the default keepalive limit, NGINX prevents this type of attack. Creating additional connections to circumvent this limit exposes bad actors via standard layer 4 monitoring and alerting tools.
> However, if NGINX is configured with a keepalive that is substantially higher than the default and recommended setting, the attack may deplete system resources.
> In a typical HTTP/2 server implementation, the server will still have to do significant amounts of work for canceled requests, such as allocating new stream data structures, parsing the query and doing header decompression, and mapping the URL to a resource. For reverse proxy implementations, the request may be proxied to the backend server before the RST_STREAM frame is processed. The client on the other hand paid almost no costs for sending the requests. This creates an exploitable cost asymmetry between the server and the client.
I'm surprised this wasn't foreseen when HTTP/2 was designed. Amplification attacks were already well known from other protocols.
I'm similarly surprised it took this long for this attack to surface, but maybe HTTP/2 wasn't widely enough deployed to be a worthwhile target till recently?
Isn’t any kind of attack where a little bit of effort from the attacker causes a lot of work for the victim an amplification attack? Or do you only consider it an amplification attack if it is exploiting layer 3?
I tried looking it up and couldn’t find an authoritative answer. Can you recommend a resource that you like for this subject?
You're right. I hadn't had my coffee yet and the asymmetric cost reminded me of amplification attacks. I'm still surprised this attack wasn't foreseen though. It just doesn't seem all that clever or original.
I was surprised too, but if you look at the timelines, RST_STREAM seems to have been present in early versions of SPDY, and SPDY seems mostly to have been designed around 2009. Attacks like Slowloris were only just coming out at about the same time, so they weren't yet well-known.
On the other hand, SYN cookies were introduced in 1996, so there's definitely some historic precedent for attacks in the (victim pays Y, attacker pays X, X<<Y) class.
If you are working on the successor protocol of HTTP/1.1, and are not aware of Slowloris the moment it hits and every serious httpd implementation out there gets patched to mitigate it, I'd argue you are in the wrong line of work.
> Trying to do it on Google, with a serious effort, that's the wacky part
If I were the FBI, I'd be looking at people with recently bought Google puts expiring soon. I can't imagine anyone taking a swing at Google infra "for the lulz". Also in contention: nation-states doing a practice run.
> HTTP/2 makes the browsing experience of high-latency connections a lot more tolerable. It also makes loading web pages in general faster.
HTTP/3 does that quite a bit better in my experience (lots of train rides with spotty onboard Wi-Fi), as HTTP/2 is still affected by TCP head-of-line blocking: a single packet loss can block all other streams, even if the lost packet didn't hold data for them.
In some alternative history there would have been a push to make HTTP/1.1 pipelining work, trim fat from bloated websites (loading cookie consent banners from a third-party domain is a travesty on several levels), maybe use WebSockets for tiny API requests, and actually use the prioritization attributes on various resources.
Then shoveling everything over ~2 TCP connections would have done the job?
SCTP (Stream Control Transmission Protocol) or the equivalent. HTTP is really the wrong layer for things like bonding multiple connections, congestion adjustments, etc.
Unfortunately, most systems only pass TCP and UDP (Windows on the endpoints, middleboxes in between), so transport-protocol evolution is a dead end.
Thus you have to piggyback on what they will let through -- so you're stuck with creating an HTTP flavor of TCP.
Another reason to keep foundational protocols small. HTTP/2 has been around for more than a decade (including SPDY), and this is the first time this attack type has surfaced. I wonder what surprises HTTP/3 and QUIC hide...
“Cancelation” should really be added to the “hard CS problems” list.
Like the others on that list (off by one, cache invalidation etc) it isn’t actually hard-hard, but rather underestimated and overlooked.
I think if we took half the time we spend on creation, constructors, initialization, and spent that design time thinking about destruction, cleanup, teardown, cancelation etc, we’d have a lot fewer bugs, in particular resource exhaustion bugs.
I really like Rust's async for its ability to immediately cancel a Future, and with it the entire call stack, at any await point, without needing cooperation from the individual calls.
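For anyone who hasn't seen this in practice: dropping a Rust Future is what cancels it, and combinators like tokio's select! rely on exactly that. A minimal sketch (using the tokio crate; the handler, the delays, and the 100 ms "client reset" are made-up illustrations, not anything from the article):

```rust
// Requires the tokio crate (e.g. tokio = { version = "1", features = ["full"] }).
use std::time::Duration;
use tokio::time::sleep;

// A hypothetical handler with several await points. If the future driving it
// is dropped, execution simply never resumes past the current await point --
// no cooperation from handle_request itself is required.
async fn handle_request(id: u32) -> String {
    sleep(Duration::from_millis(500)).await; // pretend: read the request body
    sleep(Duration::from_millis(500)).await; // pretend: query a backend
    format!("response for request {id}")
}

#[tokio::main]
async fn main() {
    // Race the handler against a 100 ms "client reset". select! polls both
    // branches and drops the losing future, cancelling the whole call stack
    // of handle_request at whatever await point it had reached.
    tokio::select! {
        res = handle_request(1) => println!("finished: {res}"),
        _ = sleep(Duration::from_millis(100)) => println!("cancelled before completion"),
    }
}
```

The flip side is that cancellation only ever takes effect at await points, and any cleanup you care about has to live in Drop implementations.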
I would like to remind everyone that Google invented HTTP/2.
Now they are telling us a yarn about how they are heroically saving us from this problem, without mentioning the part where they created it.
The nerve of these tech companies! Microsoft has been doing this for decades, too.
It depends on what you think a "request flood" attack is.
With HTTP/1.1 you could send one request per RTT [0]. With HTTP/2 multiplexing you could send 100 requests per RTT. With this attack you can send an indefinite number of requests per RTT.
I'd hope the diagram in this article (disclaimer: I'm a co-author) shows the difference, but maybe you mean yet another form of attack than the above?
[0] Modulo HTTP/1.1 pipelining which can cut out one RTT component, but basically no real clients use HTTP/1.1 pipelining, so its use would be a very crisp signal that it's abusive traffic.
I think for this audience a good clarification is:
* HTTP/1.1: 1 request per RTT per connection
* HTTP/2 multiplexing: 100 requests per RTT per connection
* HTTP/2 rapid reset: indefinite requests per connection
In each case attackers are grinding down a performance limitation they had with previous generations of the attack over HTTP. It is a request flood; the thing people need to keep in mind is that HTTP made these floods annoying to generate.
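To put rough numbers on those three regimes (the 50 ms RTT, the 100-stream limit, the ~100 bytes per HEADERS+RST_STREAM pair, and the 1 Gbit/s uplink below are illustrative assumptions, not figures from the article):

```rust
fn main() {
    // Illustrative assumptions, not figures from the article.
    let rtt_s = 0.05_f64; // 50 ms round trip
    let streams_per_conn = 100.0; // a common SETTINGS_MAX_CONCURRENT_STREAMS value

    // HTTP/1.1: one request per round trip per connection.
    let http11 = 1.0 / rtt_s;

    // HTTP/2 multiplexing: up to the stream limit per round trip per connection.
    let http2_multiplex = streams_per_conn / rtt_s;

    // Rapid reset: each reset frees its stream slot immediately, so the rate is
    // bounded by uplink bandwidth rather than by RTT or the stream limit.
    // Assume ~100 bytes per HEADERS+RST_STREAM pair on a 1 Gbit/s uplink.
    let rapid_reset = (1e9 / 8.0) / 100.0;

    println!("HTTP/1.1:           {http11:>12.0} req/s per connection");
    println!("HTTP/2 multiplex:   {http2_multiplex:>12.0} req/s per connection");
    println!("HTTP/2 rapid reset: {rapid_reset:>12.0} req/s per connection");
}
```

Even granting generous assumptions for ordinary HTTP/2 multiplexing, rapid reset gets a single connection several hundred times more requests per second under these numbers.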
By request flood I mean just that: sending an insanely high number of requests per unit of time to the target server to exhaust its resources.
You're right, with HTTP/1.1 we have a single request in flight (or none, in the keep-alive state) at any moment. But that doesn't limit the number of simultaneous connections from a single IP address. An attacker could use the whole TCP port space to create (theoretically) 65535 connections to the server and send requests over them in parallel. This is a lot, too. In the pre-HTTP/2 era this could be mitigated by limiting the number of connections per IP address.
In HTTP/2, however, we could have multiple parallel connections each with multiple parallel requests at any moment, which is orders of magnitude more than is possible with HTTP/1.x. But the preceding mitigation could still be applied, this time to the number of requests across all connections per IP address.
I guess this was overlooked in the implementations, or in the protocol itself? Or rather, is it more difficult to apply such restrictions because the L7 multiplexing happens entirely in userspace?
Added: The diagram in the article (the "HTTP/2 Rapid Reset attack" figure) doesn't really explain why this is an attack. In my thinking, as soon as the request is reset, the server's resources should be freed, so they wouldn't be exhausted. I think this should be possible in modern async servers.
The new technique described avoids the cap on the number of requests per second (per client) that the attacker can get the server to process. By sending both the requests and the stream resets within a single connection, the attacker can issue far more requests per connection/client than used to be possible, so the attack is perhaps cheaper to mount and/or more difficult to stop.
Is it a fundamental HTTP/2 protocol issue or an implementation issue? Could this be an issue at all if a server enforces strict per-IP request limits, regardless of the number of connections?
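For what it's worth, much of the publicly described mitigation seems to live at the implementation/deployment level: track how often each connection resets its own streams and close connections that do so excessively, on top of the usual per-IP limits. A minimal sketch of such a per-connection counter (the window, the threshold, and the on_stream_reset hook are illustrative assumptions, not any particular server's API):

```rust
use std::time::{Duration, Instant};

/// Per-connection tracker for client-initiated stream resets.
/// Illustrative only: a real server would hook this into its HTTP/2 frame
/// handling and probably combine it with per-IP aggregation.
struct ResetRateLimiter {
    window: Duration,
    max_resets_per_window: u32,
    window_start: Instant,
    resets_in_window: u32,
}

impl ResetRateLimiter {
    fn new(window: Duration, max_resets_per_window: u32) -> Self {
        Self {
            window,
            max_resets_per_window,
            window_start: Instant::now(),
            resets_in_window: 0,
        }
    }

    /// Call whenever the client sends RST_STREAM for a stream it opened.
    /// Returns true if the connection should be closed as abusive.
    fn on_stream_reset(&mut self) -> bool {
        let now = Instant::now();
        if now.duration_since(self.window_start) >= self.window {
            self.window_start = now;
            self.resets_in_window = 0;
        }
        self.resets_in_window += 1;
        self.resets_in_window > self.max_resets_per_window
    }
}

fn main() {
    // Example policy: more than 100 client resets within one second is treated as abuse.
    let mut limiter = ResetRateLimiter::new(Duration::from_secs(1), 100);
    for i in 0..150u32 {
        if limiter.on_stream_reset() {
            println!("closing connection after {} resets in the current window", i + 1);
            break;
        }
    }
}
```

This is complementary to the per-IP request limits suggested upthread; a real deployment would likely want both, plus a cap on total requests per connection along the lines of the NGINX keepalive limit quoted at the top.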
The largest DDoS attack to date, peaking above 398M rps - https://news.ycombinator.com/item?id=37831062
HTTP/2 Zero-Day Vulnerability Results in Record-Breaking DDoS Attacks - https://news.ycombinator.com/item?id=37830998
https://www.nginx.com/blog/http-2-rapid-reset-attack-impacti...
As with most things like this, probably many hundreds of unimportant people saw it and tried it out.
Trying to do it on Google, with a serious effort, that's the wacky part.
Luckily, HTTP/1.1 still works. You can always enable it in your browser configuration and in your web servers if you don't like the protocol.