> During the TLS handshake, the client tells the server which treeheads it has.
I don’t love the idea of giving every server I connect to via TLS the ability to fingerprint me by how recently (or not) I’ve fetched MTC treeheads. Even worse if this is in client hello, where anyone on the network path can view it either per connection or for my DoH requests to bootstrap encrypted client hello.
If your browser is online on an unrestricted network, then the tree heads will be kept up to date, and this will leak nothing. If you had your laptop closed for a weekend and you open it up and visit a website before your browser has had a chance to update, well, for maybe a minute or two you leak that your laptop was closed for the weekend. So it's not that much. But we'll want to see how we can reduce this as much as possible.
It can't possibly be updating continuously in real time, can it? Especially for battery devices, a constant background thread polling for updates seems untenable.
Next week at IETF 124, there's a Birds-of-a-Feather session that will kick off the standardization process here.
I think Merkle Tree Certificates are a promising option. I'll be participating in the standardization efforts.
Chrome has signalled in multiple venues that they anticipate this to be their preferred (or only) option for post-quantum certificates, so it seems fairly likely we will deploy this in the coming years.
I work for Let's Encrypt, but this is not an official statement or promise to implement anything yet. For that you can subscribe to our newsletter :)
It worries me how we are increasingly making browser vendors a critical part of the TLS ecosystem - first with CRL bloom filters, now with signed Merkle trees. And they of course also manage the root stores.
Sure, it's nice and convenient if you're using an evergreen browser which is constantly getting updates from the mothership, but what is the rest supposed to do? How are we supposed to use this in Curl, or in whatever HTTP library your custom application code is using? How about email clients? Heck, is it even possible with embedded devices?
"The internet" is a hell of a lot bigger than "some website in Google Chrome", and we should be careful to not make all those other use cases impossible.
On some major OSes (like Windows and Mac), there’s a “platform verifier” which can handle some of this, including the fetching and sharing of out of band data. It doesn’t have to be tied to a browser.
Linux should probably get one too, but I don’t know who will lead that effort.
In the meantime, browsers aren’t willing to wait on OSes to get their act together, and reasonably so. There’s regulation (and users, especially corporate/government) pushing for post-quantum solutions soon, so folks are trying to find solutions that can actually be deployed.
Browsers have always led in this space, all the way back to Netscape introducing SSL in the first place.
I agree that there is a big gap between what browsers offer today and a non-browser client. This is a good moment to improve things. There is no reason that we can't have a system service populating /etc/ssl/treeheads, which openssl can read. Or fall back to $USER/.config/treeheads, or per application.
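For illustration, here's roughly what that could look like on the client side. The paths and the "a system service keeps this file fresh" idea are just what's suggested above, nothing standardized, and the file format is left entirely open:

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
)

// loadTreeHeads reads the freshest locally available tree heads. The idea
// (hypothetical, not standardized): a system service keeps /etc/ssl/treeheads
// current, and applications fall back to a per-user copy if needed.
func loadTreeHeads() ([]byte, error) {
	home, _ := os.UserHomeDir()
	candidates := []string{
		"/etc/ssl/treeheads",                        // maintained by a system daemon
		filepath.Join(home, ".config", "treeheads"), // per-user fallback
	}
	var lastErr error
	for _, path := range candidates {
		data, err := os.ReadFile(path)
		if err == nil {
			return data, nil
		}
		lastErr = err
	}
	return nil, lastErr
}

func main() {
	heads, err := loadTreeHeads()
	if err != nil {
		log.Fatalf("no tree heads available, fall back to full certificates: %v", err)
	}
	fmt.Printf("loaded %d bytes of tree heads\n", len(heads))
}
```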
There's no good reason to believe that quantum computers will break modern cryptography.
Shor's algorithm requires that a quantum Fourier transform is applied to the integer to be factored. The QFT essentially takes quantum data with a representation that mirrors ordinary binary, and maps it to a representation that encodes information in quantum phase (an angle).
The precision in phase needed to perform an accurate QFT scales EXPONENTIALLY with the number of qubits you're trying to transform. You manage to develop a quantum computer capable of factoring my keys? Fine, I'll add 11 bits to my key length, come back when you've developed a computer with 2000x the phase precision.
We know. They are not yet able to emulate an i4004, let alone be a threat to "computing".
Even though I agree with you, I totally understand the push for at least hybrid algorithms. We just don't know how quantum computers could feasibly scale. Until we know for sure that cracking existing crypto systems really is infeasible for physical reasons, using hybrid systems that provide an extra layer of security just seems like obvious good practice.
> Fine, I'll add 11 bits to my key length, come back when you've developed a computer with 2000x the phase precision.
The really weird thing is that this isn't true. We already have quantum error correction schemes that can take a quantum computer with O(1) error and get O(exp(-k)) error using polylog(k) inaccurate qubits (and we have empirical evidence that these schemes work to correct the error of single-digit numbers of qubits already). Adding 11 bits to the key adds ~12 logical qubits, or ~a hundred physical qubits, to the size of the QC.
Regulators require a transition to quantum-safe algorithms. In the EU, systems labeled as "highly important" must complete the transition by 2030; so you have to do it regardless of how quantum computers evolve.
> The precision in phase needed to perform an accurate QFT scales EXPONENTIALLY with the number of qubits you're trying to transform.
This is false.
The gates that appear in the textbook QFT circuit (such as the one shown on Wikipedia [1]) do mention angles that are exponentially small in N (the number of qubits being operated upon). That may be what's confusing you. But it's well known that the tolerance on those rotations is high, meaning that simply skipping all the exponentially tiny rotations introduces negligible error [2][3].
Here's a simple model. Each time you get a rotation off by an angle of X, add X to the "total algorithm rotation error" R. The chance of an algorithm failing is at most R^2. For example, if R is less than 1 degree then the chance of algorithm failure will be less than 0.03%. That's an acceptable retry chance for Shor's algorithm. The QFT circuit on N qubits performs less than N^2 rotations. So, for R to be less than 1 degree, it's sufficient for each rotation's error X to be less than (1°)/N^2. Therefore the required precision only increases polynomially (like N^2) instead of exponentially (like 2^N). Note the required precision can be improved from O(1/N^2) to O(1/N) using techniques like the qubit recycling QFT [4].
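To make that arithmetic concrete, here it is plugged in for an illustrative 2048-qubit register (the 1° budget is the number from the paragraph above; nothing else here comes from a real resource estimate):

```go
package main

import (
	"fmt"
	"math"
)

func main() {
	const n = 2048.0      // illustrative register size
	const budgetDeg = 1.0 // total rotation-error budget R, in degrees
	budgetRad := budgetDeg * math.Pi / 180

	// Failure chance is at most R^2 (with R in radians): roughly 0.03% for a 1° budget.
	fmt.Printf("failure chance <= %.3f%%\n", budgetRad*budgetRad*100)

	// The QFT performs fewer than N^2 rotations, so splitting the 1° budget
	// evenly gives the per-rotation tolerance: polynomial in N, not exponential.
	fmt.Printf("per-rotation tolerance ~= %.2e degrees (for N^2 rotations)\n", budgetDeg/(n*n))

	// With an O(1/N) construction (e.g. the qubit-recycling QFT) the budget is
	// spread over ~N rotations instead.
	fmt.Printf("per-rotation tolerance ~= %.2e degrees (for N rotations)\n", budgetDeg/n)
}
```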
Actually, even if the required precision scaled exponentially, that still wouldn't be an insurmountable problem. Quantum error correction achieves exponentially tighter tolerances from polynomially increasing resources. For example, Ross and Selinger proved that continuous rotations can be approximated to a target error of epsilon using O(log(1/epsilon)) gates from the discrete gate set Clifford+T [4]. And Clifford gates with error below epsilon can be achieved in the surface code using O(log(1/epsilon)^2) noisy qubits for O(log(1/epsilon)) time [5]. And T gates can be achieved by using those reliable Clifford gates to perform magic state distillation of log(1/epsilon)^O(1) T states [6]. Since everything scales polynomially in log(1/epsilon), making epsilon exponentially smaller adds polynomial resources.
There is no part of Shor's algorithm that requires resources growing exponentially in n (the number of digits of the number being factored). The practical scaling is more like n^3: factoring a number that's twice as large can be done with ~two times as many qubits running for ~four times as long. Even if the qubits are noisy [7].
[1]: https://en.wikipedia.org/wiki/Quantum_Fourier_transform#/med...
[2]: https://arxiv.org/abs/quant-ph/0306018
[3]: https://arxiv.org/abs/quant-ph/9601018
[4]: https://arxiv.org/pdf/quant-ph/9903071#page=12
[5]: https://arxiv.org/abs/1208.0928
[6]: https://arxiv.org/abs/1209.2426
[7]: https://arxiv.org/abs/1905.09749
Nobody needs them. The 5 eyes already have access to root certs and internet nodes.
What _really_ matters is that you are secure, and terrorists and pedophiles stand no chance. At least in theory. /s
I took some rough notes to whittle down the verbiage.
This proposal is to introduce PQ certificates into the WebPKI, e.g. for certificate authorities.
The problem is that PQ signatures are large. If the certificate chain is small that could be acceptable, but if the chain is large, then it can be expensive in terms of bandwidth and computation during the TLS handshake. That is, the exchange sends many certificates, each of which embeds a signature and a large (PQ) public key.
Merkle Tree Certificates ensure that an up-to-date client only needs 1 signature, 1 public key, and 1 Merkle tree witness.
Looking at an MTC-generated certificate, they've replaced the traditional signing algorithm and signature with a witness.
That means all a client needs is a signed Merkle root, which comes from an expanding Merkle tree signed by the MTCA (Merkle Tree CA) and is delivered somehow out of band.
So basically the TLS client receives a certificate using a new signature algorithm which embeds a witness instead of a signature, along with a root (not sure if just a hash or a signed hash, I think the former). The client gets the signed roots out of band, where they can be pre-verified, which means validating the certificate is simply a check on the witness.
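As a rough sketch of how cheap that check is, here's a toy inclusion-proof verification against a pre-fetched, already-verified root. The types and the plain SHA-256 tree are made up for illustration; the actual draft domain-separates hashes and encodes things differently:

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// verifyWitness walks a Merkle inclusion proof ("witness") from a leaf hash
// up to a root the client already trusts, because the signed root was fetched
// and verified out of band ahead of time. Toy construction, not the draft's.
func verifyWitness(leaf [32]byte, siblings [][32]byte, isRightChild []bool, trustedRoot [32]byte) bool {
	node := leaf
	for i, sib := range siblings {
		if isRightChild[i] {
			// Current node is the right child: parent = H(sibling || node).
			node = sha256.Sum256(append(sib[:], node[:]...))
		} else {
			// Current node is the left child: parent = H(node || sibling).
			node = sha256.Sum256(append(node[:], sib[:]...))
		}
	}
	return bytes.Equal(node[:], trustedRoot[:])
}

func main() {
	// Toy two-leaf tree: root = H(leafA || leafB).
	leafA := sha256.Sum256([]byte("assertion for example.com"))
	leafB := sha256.Sum256([]byte("assertion for example.org"))
	root := sha256.Sum256(append(leafA[:], leafB[:]...))

	// The "certificate" for leafA carries one sibling hash; the root is already known.
	fmt.Println("witness valid:", verifyWitness(leafA, [][32]byte{leafB}, []bool{false}, root))
}
```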
Edit: My question: is this really a concern that needs to be addressed? PQ for TLS key exchange addresses a looming threat of HNDL (Harvest Now Decrypt Later). I don't see why we need to address making the WebPKI use PQ signatures, at least for a while now.
PKI for everything else can go at its own pace.
This could take IPv6-style ages to get standardised and rolled out to most parts of the internet and IoT. Might make sense to do it now if you want to be able to shut down that last non-PQ-safe TLS device in the year 2050?
They're saying WebPKI, which means basically web browsers can more or less push this through on their evergreen release schedule when it becomes necessary.
> During the TLS handshake, the client tells the server which treeheads it has.
If, the first time, the client doesn't know what root the server's certificate will chain to, and therefore doesn't tell the server what treeheads it has, so it gets a full certificate and caches the result to remember for later connections, then... that could work, though it's a slight metadata leak.
Alternatively the client could send the treeheads for all the roots it trusts. That's going to bloat the ClientHello and... it's going to leak a bit of metadata if the client does anything other than claim to trust all roots blessed by the CA/Browser Forum, or the Chrome Root Program.
You only need to send one treehead per MTCA. From that one treehead the server can infer it must also have the previous few. If that's still too much, we can compress it even further by only sending "I trust the standard CAs of Mozilla plus/minus some CAs and the stalest treehead I have has this timestamp". That'll be just a few bytes.
Yes, a timestamp and a code for which set of trust anchors it trusts should be enough. Or even just a timestamp. The server is not going to have lots of certs chaining to many roots, so the client's trust anchor set is really not that relevant. The timestamp can be in coarse increments.
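Purely as a size illustration (this exact layout is not in the draft; the field choices are invented), a trust-store code plus a coarse staleness value really is only a handful of bytes:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

// encodeAdvert builds a hypothetical compact client advertisement: a 2-byte
// identifier for a well-known trust store plus the age of the client's
// stalest tree head in whole hours. Invented for illustration, not from the draft.
func encodeAdvert(trustStoreID uint16, stalestHead time.Time) []byte {
	buf := make([]byte, 6)
	binary.BigEndian.PutUint16(buf[0:2], trustStoreID)
	ageHours := uint32(time.Since(stalestHead) / time.Hour) // coarse increments
	binary.BigEndian.PutUint32(buf[2:6], ageHours)
	return buf
}

func main() {
	advert := encodeAdvert(1, time.Now().Add(-36*time.Hour))
	fmt.Printf("%d bytes: %x\n", len(advert), advert) // 6 bytes on the wire
}
```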
The post didn't discuss it but naively this feels like it becomes a privacy issue?
Using The Approved Set™ from your browser or OS carries no privacy issues: it's just another little bit of data your machine pulls down from some mothership periodically, along with everyone else. There's nothing distinguishing you from anyone else there.
You may want to pull landmarks from CAs outside of The Approved Set™ for inclusion in what your machine trusts, and this means you'll need data from somewhere else periodically. All the usual privacy concerns over how you get what from where apply; if you're doing a web transaction, a third party may be able to see your DNS lookup, your connection to port 443, and the amount of traffic you exchange, but they shouldn't be able to see what you asked for or what you got. Your OS or browser can snitch on you as normal, though.
I don't personally see any new privacy threats, but I may not have considered all angles.
Different machines will need to have variations in when they grab updates to avoid thundering herd problems.
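The usual fix is randomized jitter on the polling schedule; a sketch, with a made-up hourly interval:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	base := time.Hour // nominal tree-head poll interval (made up for illustration)
	// Spread clients out by up to +/-10% so a fleet doesn't hit the
	// distribution endpoint at the same instant after wake-up or an update.
	jitter := time.Duration((rand.Float64()*0.2 - 0.1) * float64(base))
	fmt.Println("next poll in", base+jitter)
}
```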
I could see the list of client-supplied available roots being added to client fingerprinting code for passive monitoring (e.g. JA4) if it’s in the client hello, or for the benefit of just the server if it’s encrypted in transit.
Vaguely. Basically each CA will run one of these. Relying parties (browsers) will need to fetch the Merkle tree heads periodically for at least the CAs that sign certificates for the sites the users browse, or maybe for all of the WebPKI CAs rather than just those that sign certs for the visited sites. There are on the order of 100 CAs for the whole web, though, so knowledge that your browser fetched the MT heads for, say, 5 CAs, or even 1 CA, wouldn't leak much of anything about which specific sites the user is visiting. Though if the user is visiting, say, only Chinese sites, then you might see them fetch only the MT heads for CN CAs, and then you might say "aha! it's a Chinese user", or something, but... that's not a lot of information leaked, nor terribly useful.
I think most folks involved are assuming the landmarks will be distributed by the browser/OS vendor, at least for end-user devices where privacy matters the most, similar to how CRLSets/CRLite/etc. are pushed today.
There's "full certificates" defined in the draft which include signatures for clients who don't have landmarks pre-distributed, too.
The privacy aspect isn’t that you fetched the heads. It’s that you are sending data which (in combination with other data being leaked) websites can likely use to fingerprint you and track you across websites, since these heads are sent by the client in the TLS hello.
There's a section missing on the inclusion proof and what exactly the clients will be getting.
If I understand this correctly, each CA publishes a signed list of landmarks at some cadence (weekly).
For the certs you get the landmark (a 256-bit hash) and the hashes along the Merkle path to the leaf cert's hash. For a landmark that contains N certs, you need to include log2(N) * hash_len bytes and perform log2(N) hash computations.
For an MTC signature that uses a 256-bit hash and N = 1 million, that's about 20*32 = 640 bytes.
Is this the gist of it?
I'm really curious about the math behind deciding the optimal landmark size and publishing cadence.
> If a new landmark is allocated every hour, signatureless certificate subtrees will span around 4,400,000 certificates, leading to 23 hashes in the inclusion proof, giving an inclusion proof size of 736 bytes, with no signatures.
https://davidben.github.io/merkle-tree-certs/draft-davidben-...
That's assuming 4.4 million certs per landmark, a bit bigger than your estimate.
There's also a "full certificate" which includes signatures, for clients who don't have up-to-date landmarks. Those are big still, but if it's just for the occasional "curl" command, that's not the end of the world for many clients.
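Both estimates fall out of the same arithmetic: ceil(log2(N)) sibling hashes of 32 bytes each, with no signatures in the proof itself. For example:

```go
package main

import (
	"fmt"
	"math"
)

// proofSize returns the number of sibling hashes and total bytes for a Merkle
// inclusion proof over a subtree of n leaves, assuming a 32-byte hash and no
// signatures in the proof itself.
func proofSize(n float64) (hashes, bytes int) {
	hashes = int(math.Ceil(math.Log2(n)))
	return hashes, hashes * 32
}

func main() {
	for _, n := range []float64{1_000_000, 4_400_000} {
		h, b := proofSize(n)
		fmt.Printf("N = %.0f: %d hashes, %d bytes\n", n, h, b) // 20/640 and 23/736
	}
}
```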