Context: WebPKI revocation is broken. It didn't have to be [0], but it always has been, and likely always will be, now that the industry is moving to short-lived certificates.
With such short-lived certificates, revocation effectively becomes a non-issue.
[0]: OCSP Stapling (esp. with Must-Staple) is actually pretty effective at distributing revocation information in a timely, secure, private manner. There were no real downsides to the spec, which is why Caddy has supported automatic OCSP stapling since, gosh, I think 2016 or so. The problem is that most other web servers didn't -- and still don't -- implement it well [1], making OCSP (without the stapling) a persistent privacy problem. Additionally, many client vendors want to choose what certificates count as "revoked" now, so they use their own CRLs, making OCSP entirely useless, since they only check their own CRLs.
Other web servers have a reliable implementation of OCSP stapling via their support for loading the OCSP response from an external file. It isn't as convenient as having it built-in but it works well.
GrapheneOS used https://github.com/tomwassenberg/certbot-ocsp-fetcher for years to provide reliable OCSP stapling for nginx prior to Let's Encrypt dropping OCSP support. We used Must-Staple through that for years.
TLS ticket key rotation also generally needs to be handled externally in a noswap tmpfs to avoid the keys being lost when restarting service, especially if syncing them across multiple nodes providing the same service is desired. Syncing them across nodes also allows preserving them across reboots while still only having them in-memory.
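For anyone wondering what "handled externally" looks like, here is a minimal sketch of the rotation step only. The file names, the `keep` count and the 80-byte key size are assumptions for illustration (nginx accepts 48- or 80-byte ticket key files), and `key_dir` is expected to live on a tmpfs you have already set up. Syncing across nodes and reloading the server are left out.

```python
import os
import secrets
from pathlib import Path

KEY_BYTES = 80  # assumption: key file size accepted by the server in use


def rotate_ticket_keys(key_dir: Path, keep: int = 3) -> list[Path]:
    """Shift key files down one slot and write a fresh key to slot 0.

    Slot 0 holds the newest key (for encrypting new tickets); higher
    slots hold older keys kept only so existing tickets still decrypt.
    """
    key_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: ticket.0.key, ticket.1.key, ...
    slots = [key_dir / f"ticket.{i}.key" for i in range(keep)]
    # Walk from the oldest slot backwards; the oldest key is overwritten.
    for i in range(keep - 1, 0, -1):
        if slots[i - 1].exists():
            os.replace(slots[i - 1], slots[i])
    # Write the new key to a temp file with tight permissions, then
    # rename it into place so the server never sees a partial file.
    tmp = key_dir / "ticket.new"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(secrets.token_bytes(KEY_BYTES))
    os.replace(tmp, slots[0])
    return [p for p in slots if p.exists()]
```

Run it from a timer, then reload the server so it picks up the new files. Keeping a couple of older keys around means tickets issued just before a rotation still resume.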
Only the default classic and opt-in tlsserver profiles are available. The allowlist mentioned there doesn't really seem to exist yet. The tlsserver profile has the same changes as shortlived without dropping 90 days to 160 hours. Both tlsserver and shortlived drop support for clients without SNI support which are unfortunately still very common for non-web usage. We found we can't deploy tlsserver for our SMTP federation (port 25) or SUPL server without having a fallback certificate for clients without SNI support. Since shortlived drops non-SNI client support, not everyone will be able to use it. It would be nice if there were 2 variants of it instead since client functionality is typically not within the control of servers.
Let's Encrypt dropped OCSP support before they had the replacement of short-lived certificates deployed. It would have been nice if OCSP support was only dropped after shortlived was enabled for production usage, but it's too late for that now. If it was possible to start using shortlived today, we would.
For the people wondering why this does not show up as revoked in their browser:
I believe the way the current revocation systems work is that browsers compile centralized lists of revoked certificates. These lists do not contain all revoked certs, only the ones that indicate some form of security issue.
Certificates can be revoked with various revocation reasons. However, it looks like this one has no specific revocation reason listed in the CRL. For a certificate that was revoked with a reason of "Key Compromise", things would be different, and most browsers would probably reject it.
Firefox / CRLite includes all revocations. The issue with this particular certificate is that the CRLite backend is behind on ingesting both of the CT logs that it appears in (Digicert wyvern2025h2 [1] and Let's Encrypt oak2025h2). So from CRLite's perspective the certificate doesn't exist yet.
In the very near future, CAs are going to start embedding signed CT timestamps from "static CT" logs [2]. Once that happens, the CRLite backend will be aware of certificates within minutes of issuance.
What would be the point of accepting a certificate that was revoked for non-security reasons?
Might as well not even revoke it…
If a CA issues a certificate to the wrong entity, they won’t have knowledge of a key compromise as there is no such thing in this case — they only know that they issued something wrong…
I believe I remember reading that Chrome has a separate system for revoking CA certificates, where they have to do a manual rollout, but propagation time is pretty fast.
Seems rather problematic that a cert that appears to have been revoked 5 days ago isn’t recognized as revoked by virtually any browser. Is this an OCSP-related issue or do browsers actually do a bad job at checking for revocation?
I was a big fan of OCSP stapling and Must-Staple, both of which are slowly being discouraged: Let's Encrypt has refused to issue Must-Staple certificates for a few months now, and I think they are shutting down their OCSP servers, if they haven't already.
The idea with OCSP stapling is that the webserver fetches the OCSP response, caches it for its validity period (roughly 24 hours), and staples it to the HTTPS handshake. That way, the browser does not need to query the issuer's OCSP servers, avoiding both performance and privacy concerns. Revoked certificates will continue to work for up to 24 hours, but that, IMO, is within an acceptable range compared to a CRL, which can take a lot longer.
The downside is that the HTTPS handshakes now contain a bit more data, and we want to keep this as minimal as possible.
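To make the fetch-and-cache part concrete, here is a rough sketch of the caching logic on the server side. Everything here is hypothetical: `OcspResponse`, `StapleCache` and the injected `fetch` callable are made-up names, and the "refresh at the midpoint of the validity window" policy is just one reasonable choice. A real server would refresh in the background rather than during a handshake.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OcspResponse:
    der: bytes          # opaque signed response from the CA
    this_update: float  # Unix time the response became valid
    next_update: float  # Unix time the response expires


class StapleCache:
    """Serve a cached OCSP response and refresh it before it expires."""

    def __init__(self, fetch: Callable[[], OcspResponse],
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch  # hypothetical: queries the CA's responder
        self._clock = clock
        self._current: Optional[OcspResponse] = None

    def _needs_refresh(self, now: float) -> bool:
        r = self._current
        if r is None:
            return True
        # Refresh once half the validity window has passed, leaving
        # plenty of time to retry if the responder is down.
        return now >= r.this_update + (r.next_update - r.this_update) / 2

    def get_staple(self) -> Optional[bytes]:
        now = self._clock()
        if self._needs_refresh(now):
            try:
                fresh = self._fetch()
                if fresh.next_update > now:
                    self._current = fresh
            except Exception:
                pass  # responder trouble: keep serving the cached copy
        r = self._current
        if r is not None and now < r.next_update:
            return r.der
        return None  # never staple an expired response
```

The important bit is the failure path: a responder outage should not throw away a response that is still valid, and an expired response should never be stapled.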
The problem with OCSP stapling is that either the client has to fall back to doing OCSP checking itself if the server doesn't staple the response, which has its own problems[1], or enough servers need to support OCSP stapling that the client can just reject connections that don't include it. And unfortunately, there was never significant uptake among servers, partly because there wasn't really any incentive to implement OCSP stapling. Maybe if there was a TLS 2.0 (or some other standard) that required OCSP stapling and had other benefits as well, it could work.
[1]: the biggest problem with non-stapled OCSP is what to do if you don't get a response for the ocsp request. If you fail open, an attacker can intercept the request to prevent you from knowing the cert is revoked, but if you fail closed, then any issue with the connection to the ocsp server results in loss of service. And then there are also issues with additional latency to wait for the ocsp response, privacy leaks from the ocsp requests, etc.
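The fail-open versus fail-closed tradeoff fits in a few lines. This is only an illustration with made-up names (`OcspStatus`, `accept_connection`), not how any particular browser implements it:

```python
from enum import Enum


class OcspStatus(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    UNAVAILABLE = "unavailable"  # timeout, network error, blocked request


def accept_connection(status: OcspStatus, fail_closed: bool) -> bool:
    """Decide whether to proceed with the TLS connection."""
    if status is OcspStatus.GOOD:
        return True
    if status is OcspStatus.REVOKED:
        return False
    # No answer from the responder: this is where the policies differ.
    # Fail-open accepts (an attacker who blocks the request wins);
    # fail-closed rejects (any responder outage takes the site down).
    return not fail_closed
```

An attacker who can drop the OCSP request turns a `REVOKED` answer into `UNAVAILABLE`, so under fail-open the check only protects against attackers who cannot interfere with the network.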
Checking for revocation doesn't scale and has serious privacy implications. There are two ways to do revocation: CRL and OCSP. CRL is a list that becomes huge over time - hosting it would require massive amounts of bandwidth and clients would need to download a lot of extra data. OCSP is more like a query API - has this cert been revoked? The problem is you need to make that query for each visit and you leak your IP address when you do that query. The hoster would need to provide capacity to run those queries and serve the result. For each visit you'd need to pay a few round-trips worth of delay before showing the content, sometimes while part of the content is downloaded: you download example.com, which has some CSS which is hosted at static.example.com, and the website redirects you to m.example.com which is the mobile version after running some JavaScript which detects the browser capabilities.
So the answer then is just much shorter-lived certs? I could definitely still see the need for an immediate revocation to be recognized near-instantaneously. Or in practice is that ultimately not necessary?
The CA/Browser Forum will be gradually reducing the maximum validity of public certs to 47 days. I expect this will help with some of the issue, since expired certs can be removed from the list.
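Back-of-the-envelope version of why shorter lifetimes shrink the list: at steady state the number of unexpired certs is roughly the issuance rate times the validity period, and only revoked certs that have not yet expired need to be listed. The issuance rate and revocation fraction below are made-up numbers purely for illustration.

```python
def max_live_revocations(issued_per_day: float,
                         revoked_fraction: float,
                         validity_days: float) -> float:
    """Upper bound on revoked-but-unexpired certs at steady state."""
    # Unexpired certs = issuance rate x validity period; at most
    # `revoked_fraction` of those ever need to appear on the list.
    return issued_per_day * validity_days * revoked_fraction


ISSUED_PER_DAY = 1_000_000  # hypothetical issuance rate
REVOKED_FRACTION = 0.01     # hypothetical share of certs that get revoked

for label, days in [("90 days", 90), ("47 days", 47), ("160 hours", 160 / 24)]:
    bound = max_live_revocations(ISSUED_PER_DAY, REVOKED_FRACTION, days)
    print(f"{label}: at most {round(bound)} entries")
```

The bound scales linearly with validity, so going from 90 days to 47 days roughly halves the list regardless of which rates you plug in.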
> CRL is a list that becomes huge over time - hosting it would require massive amounts of bandwidth and clients would need to download a lot of extra data.
Compared to what? 12MB JavaScript bundles and autoplay videos? Do CDNs still exist?
There's a finite number of CAs and browsers can be expected to perform caching. Delta CRLs also exist and the CAs can decline to include expired leaf certs.
This sounds like a made up problem that was solved 25 years ago.
Might be EOL in some theoretical sense, but by turning it off they're ignoring reality. I know some organizations think this is the way to push standards forward. But to me it seems pretty irresponsible.
Several years ago, I saw some documentation about specifying the locations of CRLs in some proxy software. My first thought was "surely I have some of those on the system for web PKI". I was wrong. That sent me down a deep dark rabbit hole of how revocations are normally handled, at the bottom of which was the conclusion that most software does nothing, because it is just too hard.
Firefox has a project (CRLite) that uses Bloom filters to make CRLs more practical, but it is still experimental. I think we are a long ways out from the technology being widely used across the industry.
It turns out it is easier to significantly reduce the validity time of WebPKI certs than to solve the problem of distributing a list of revoked certificates. Although the former actually helps a lot with the latter, as it reduces the size of said list.
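For the curious, this is roughly how a Bloom filter cascade can encode "revoked or not" exactly for a known set of certificates. It is a toy sketch, not Firefox's code: the class and function names, the bits-per-item and hash-count parameters, and the SHA-256 based hashing are all my own assumptions. Answers are only meaningful for serials that were in the `revoked` or `valid` sets when the cascade was built.

```python
import hashlib


class Bloom:
    def __init__(self, items, bits: int, hashes: int, salt: int):
        self.bits, self.hashes, self.salt = max(bits, 8), hashes, salt
        self.array = bytearray((self.bits + 7) // 8)
        for item in items:
            for pos in self._positions(item):
                self.array[pos // 8] |= 1 << (pos % 8)

    def _positions(self, item: bytes):
        # Derive `hashes` bit positions; the salt differs per level.
        for i in range(self.hashes):
            digest = hashlib.sha256(bytes([self.salt, i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def __contains__(self, item: bytes) -> bool:
        return all((self.array[p // 8] >> (p % 8)) & 1
                   for p in self._positions(item))


def build_cascade(revoked: set, valid: set, bits_per_item: int = 8,
                  hashes: int = 3) -> list:
    levels = []
    include, exclude = set(revoked), set(valid)
    while include:
        level = Bloom(include, bits_per_item * len(include), hashes,
                      salt=len(levels))
        levels.append(level)
        # False positives at this level become the next level's members,
        # and the roles of the two sets swap.
        false_pos = {x for x in exclude if x in level}
        include, exclude = false_pos, include
    return levels


def is_revoked(levels: list, serial: bytes) -> bool:
    for depth, level in enumerate(levels):
        if serial not in level:
            # Missing from an even level: not revoked; odd level: revoked.
            return depth % 2 == 1
    # Present in every level: the last level's set decides.
    return len(levels) % 2 == 1
```

Each level only has to store the false positives of the one before it, so the levels shrink quickly and the whole structure ends up far smaller than the list of serials. The catch is that the builder has to know every valid certificate too.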
I would say "majority" rather than "not all" browsers perform revocation checking.
[1]: https://gs.statcounter.com/browser-market-share
Let's Encrypt is now offering a profile for 6-day certificates: https://letsencrypt.org/docs/profiles/#shortlived
[1]: https://gist.github.com/sleevi/5efe9ef98961ecfb4da8 -- most servers have such a bad implementation of OCSP stapling that it makes sites less reliable, not more.
[1] The wyvern2025h2 shard had an outage last week, which is also part of the problem here https://groups.google.com/a/chromium.org/g/ct-policy/c/XpmIf....
[2] https://github.com/C2SP/C2SP/blob/main/static-ct-api.md
Not really sure how big of a problem a list could be?
Not really, since they now offer six-day certs, which makes revocation effectively irrelevant: https://letsencrypt.org/docs/profiles/#shortlived
Edit: I believe OCSP is tried, but silently ignored if there is no response quickly enough.