This proposed design comprehensively ratifies Linux's nonsensical legacy random/urandom division; for instance, it maintains a "512 bit emergency entropy level" in case urandom becomes "stressed". It appears to be motivated principally by concerns about the quality of entropy being fed continually into the LRNG --- which is a nonproblem.
This is silly.
There are three problems a kernel CSPRNG needs to address:
1. It needs to seed itself reliably and not provide output when it's not securely seeded.
2. It needs to re-key itself regularly to provide forward secrecy and recovery from attacks.
3. It needs to be performant enough to be used in applications to the exclusion of all other RNGs.
That's it. Once problem (1) is addressed, there is never again, for the uptime of the system, a concern about the quality or "level" of entropy in the LRNG. DRBGs like the ones this document proposes are effectively stream ciphers (in fact: Salsa20, the new hotness in bulk cipher designs, is essentially a hash DRBG --- and, in the BSDs, a [crappy] stream cipher was the RNG). "Entropy" keys the cipher. The idea that you can "run low" on entropy is the same as the idea that you can "run low" on key. No.
Kernel CSPRNGs do need continually updated entropy pools. But not because their original entropy is depleted! Rather: you want your RNG to be updated so that its "key" changes regularly, so that if your system is compromised transiently, you'll recover.
And, of course, the division between /dev/random and /dev/urandom sabotages goal (3) --- arguably the most important goal. Almost every time you've read about some crypto randomness catastrophe --- from Debian commenting out their RNG and reducing SSH key space to 16 bits, to Bitcoin wallets coughing up private keys by repeating nonces --- you've been reading about userspace RNG failures. People use userspace RNGs because they think the kernel RNG is too slow. And they think the kernel RNG is too slow because they use /dev/random, which obstinately refuses to provide output when --- wait, I'm not dignifying /dev/random: it refuses to provide output at random.
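The Bitcoin-wallet failure mentioned above is easy to demonstrate. If the same ECDSA nonce k is reused for two signatures, the private key falls out with a little modular arithmetic. A toy sketch over secp256k1 (illustrative code only, not a real signer; the key and nonce values are made up for the demo):

```python
import hashlib

# secp256k1 parameters
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(a, b):
    # Affine point addition on y^2 = x^3 + 7 (None = point at infinity).
    if a is None: return b
    if b is None: return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0: return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, P) % P
    else:
        lam = (b[1] - a[1]) * pow(b[0] - a[0], -1, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    return (x, (lam * (a[0] - x) - a[1]) % P)

def ec_mul(k, pt):
    # Double-and-add scalar multiplication.
    acc = None
    while k:
        if k & 1: acc = ec_add(acc, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return acc

def sign(d, k, msg):
    h = int.from_bytes(hashlib.sha256(msg).digest(), "big") % N
    r = ec_mul(k, G)[0] % N
    s = pow(k, -1, N) * (h + r * d) % N
    return h, r, s

d = 0xC0FFEE      # private key
k = 123456789     # the nonce that should be random --- reused below!
h1, r1, s1 = sign(d, k, b"tx one")
h2, r2, s2 = sign(d, k, b"tx two")

# Same nonce => same r, and the private key is recoverable:
k_rec = (h1 - h2) * pow(s1 - s2, -1, N) % N
d_rec = (s1 * k_rec - h1) * pow(r1, -1, N) % N
assert r1 == r2 and d_rec == d
```

This is exactly the attack that extracted keys from wallets whose userspace RNG repeated nonces.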
How difficult is #1? It's as simple as gathering 128 bits of entropy. If you have 128 bits, you can expand them into an infinite sequence of random numbers, securely, forever.
I think this is the essence of your comment. The topic is important, so it's worth being clear.
Is there a difference between re-seeding and re-keying?
I think of this analogy: you re-key/re-seed periodically not because the original lock gets weaker, but in case someone stole your key without you knowing. Accurate?
Has anybody made a set of patches to give linux the BSD behavior (block until seeded, then never block again)? I run into issues with /dev/random regularly and while I can patch every single application, that feels like playing whack-a-mole compared to patching the kernel.
The solution (currently sans kernel patching) is to just seed /dev/urandom yourself at system startup to avoid the issue of it not being seeded. Don't most distros do this anyway?
After that it's just down to using /dev/urandom[1] or the getrandom syscall (which is the same as using /dev/urandom). Also the getrandom syscall actually has the behavior you're talking about.
If you're using udev, you can configure it to make /dev/random a symlink to /dev/urandom (e.g. [1]). Sadly that doesn't give you the "block until seeded" behaviour though, and I don't know of any kernel patch doing that yet.
This would also quiet a lot of the noise around the entropy estimation blocking in `/dev/random`. I mean, if basically everyone [1] agrees that you should just use `/dev/urandom` because blocking on entropy estimation is misguided, then why not copy the behavior of every other Unix?
Practically yes. But as cryptosystem security is based on teasing out differences with respect to various oracles, bounding the security of every piece of cryptographic software to the kernel PRNG algorithm gives me the willies. The only saving grace is that the specific algorithm can be changed out with no worries of backwards incompatibility.
I agree the current implementation makes this distinction worthless, but it doesn't have to be. The biggest hurdle is that true entropy is scarce, and thus must be strictly partitioned between users.
A hardware RNG would solve this, but RDRAND not so much. I feel that if consumer hardware is being subverted by TLAs, the RNG would be the most likely avenue of attack. We may even eventually see partitioning of secure systems into deterministic/functional and non-deterministic parts, with the non-deterministic tape being saved for auditing.
Neither of your first two paragraphs are correct. I don't have an opinion about the last.
First: while I don't know what you mean by "teasing out differences with respect to various oracles", the fact that you have a kernel CSPRNG that your system relies on means that diversity in CSPRNGs is a bad thing. Simply: alternatives to the LRNG lead to multiple single points of failure.
Entropy is not "scarce". In fact, it's infinite and renewable. To generate a practically unbounded number of random bytes from just 16 bytes of starting state entropy, just use those 16 bytes as an AES key, and use it to encrypt an incrementing counter.
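The counter-mode idea above is simple enough to sketch in a few lines. The comment describes AES in CTR mode; the sketch below substitutes SHA-256 as the pseudorandom function purely to stay dependency-free --- the structure (fixed 16-byte key, incrementing counter, unbounded output) is the same:

```python
import hashlib

def keystream(key: bytes, nbytes: int) -> bytes:
    # Expand a fixed key into nbytes of output by "encrypting" an
    # incrementing counter --- here PRF(key, i) = SHA-256(key || i).
    out = bytearray()
    counter = 0
    while len(out) < nbytes:
        out += hashlib.sha256(key + counter.to_bytes(16, "big")).digest()
        counter += 1
    return bytes(out[:nbytes])

seed = b"\x00" * 16  # 16 bytes of "starting entropy" (fixed here for demo)
a = keystream(seed, 1024)
b = keystream(seed, 1024)
assert a == b                               # deterministic given the key...
assert keystream(b"\x01" * 16, 1024) != a   # ...different key, different stream
```

Nothing here "runs out": the only secret is the 16-byte key, and output length is limited only by how long you keep incrementing the counter.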
'Entropy doesn't get depleted' seems to be something that I have read in many places, yet the Linux kernel devs don't seem to focus much on that. Any reason for them missing the point?
It seems you would want to slowly change/update the entropy pool to protect against compromise. So can there be a worst-case situation (if you don't block --- for the different reason) where your entropy pool is not updating fast enough to recover from compromise? Or does the pool update require so little entropy (compared to what Linux is doing with its kind of refilling) that it's almost impossible for that to matter?
Ted T'so showed up here on HN in one of these discussions and derided any criticism as "ivory tower" and "academic".
OTOH, Kroah-Hartman and Torvalds seemed to approve of my essay ("Myths about /dev/urandom"), so maybe not all hope is lost.
Still, let's not lose perspective here: the Linux random subsystem certainly has its flaws, but it is secure for all practical purposes. Just not convenient.
No. Modern CSPRNGs based on DRBGs are relying on fundamental crypto building blocks like block ciphers and hashes. A single bit difference between the previous state and the new state after the addition of any entropy whatsoever completely scrambles the RNG state. Obviously, you're getting many more bits than one out of your regular updates from the entropy collection subsystem.
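The "single bit difference completely scrambles the state" claim is easy to check empirically. Here SHA-256 stands in for whatever mixing function updates the state (an illustrative choice, not the kernel's actual primitive): flip one input bit and roughly half of the 256 output bits change.

```python
import hashlib

state = bytes(range(64))                         # pretend 64-byte pool state
tweaked = bytes([state[0] ^ 0x01]) + state[1:]   # flip a single input bit

h1 = hashlib.sha256(state).digest()
h2 = hashlib.sha256(tweaked).digest()

# Count differing output bits; ~128 of 256 is expected for a good mixer.
diff = sum(bin(a ^ b).count("1") for a, b in zip(h1, h2))
assert 64 < diff < 192
```

The loose bounds in the assertion reflect the statistics: for a well-mixing function the count clusters tightly around 128.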
>Kernel CSPRNGs do need continually updated entropy pools. But not because their original entropy is depleted! Rather: you want your RNG to be updated so that its "key" changes regularly, so that if your system is compromised transiently, you'll recover.
It's very unlikely that a compromise will be temporary; this "transient compromise" situation will essentially never happen. Moreover, once a system is compromised, by definition you cannot trust it until replacing it from a known-good source or backup, at which point you just start the CSPRNG with new entropy.
No system should need or even attempt to recover from a transient compromise, you are just causing a false feeling of security.
Except they do use rdrand if supported. When the random function is called it does a SHA transform. The SHA state gets initialized with rdrand and then the kernel pool is mixed in. This avoids the theoretical issues with xoring rdrand directly into the output.
You should generally avoid xor when it comes to untrusted output such as rdrand since it could theoretically control what gets generated and as such could cancel out whatever you're xoring it with (or at least this is the claim AFAIK). It's better to mix it using some other method, e.g. hash or cipher[1].
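A hash-based mix has the property that even a fully attacker-controlled input cannot cancel existing pool entropy --- at worst it contributes nothing. A minimal sketch of the difference (function names are illustrative, not kernel code):

```python
import hashlib

def mix_xor(pool: bytes, hw_out: bytes) -> bytes:
    # BAD: if the hardware can predict `pool`, it can emit pool ^ target
    # and force the mixed result to any value it likes.
    return bytes(a ^ b for a, b in zip(pool, hw_out))

def mix_hash(pool: bytes, hw_out: bytes) -> bytes:
    # BETTER: the attacker would have to invert SHA-256 to steer the result.
    return hashlib.sha256(pool + hw_out).digest()

pool = hashlib.sha256(b"existing entropy").digest()
target = b"\x00" * 32
evil = mix_xor(pool, target)           # what a malicious RDRAND would emit...
assert mix_xor(pool, evil) == target   # ...XOR mixing is fully steered
assert mix_hash(pool, evil) != target  # hash mixing is not
```

This is the claim in a nutshell: XOR lets a pool-predicting adversary choose the output; a one-way mix does not.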
Refusing to XOR it in is sensible: that RNG hardware might know too much about your existing entropy pool (either malicious bias or design flaws). Refusing to mix it in cryptographically, however, is dumb.
Gah. Another brain-damaged system that implements complex (and, no doubt, broken) "entropy estimation" with a goal of delivering "information theoretic" security that doesn't even have a structure where information theoretic security could actually be proven.
To the extent that any applications exist where an information-theoretic RNG is required, this scheme does not provide one. Instead, it continues to offer an interface that causes confusion and insecurity without an apparent benefit.
A patch [0] that added Fortuna [1] support to Linux was offered over 10 years ago.
Fortuna specifically addresses the problems of entropy estimation that the grandparent comment brings up:
"making any kind of estimate of the amount of entropy is extremely difficult, if not impossible. It depends heavily on how much the attacker knows or can know, but that information is not available to the developers during the design phase."
"Fortuna solves the problem of having to define entropy estimators by getting rid of them."
If I read this correctly, this new suggestion also keeps the entropy estimation voodoo:
"By using the primary DRBG to serve /dev/random, the LRNG ensures that each random number generated by the primary DRBG must be backed by an equal amount of entropy that seeded the DRBG. Hence, the data derived from /dev/random is backed by information theoretical entropy." (page 21 of the PDF documentation)
I used to think that entropy estimation was nonsense, but last year I finally understood it. Copying my comment at https://lwn.net/Articles/642839/:
Think of the pool's entropy as a "measure of unpredictability". If the entropy is ten bits, for instance, you'd need at most 1024 guesses to find the pool state.
You should think of the random generator as having two separate and independent parts: the pool itself and the output function.
Inputs are mixed into the pool using a cryptographic function which takes two values: the previous pool state and the input value, and outputs the new pool state. This function thoroughly mixes its inputs, such that a one-bit difference on any of them would result on average to half of the output bits changing, and other than by guessing it's not possible to get from the output back to the inputs.
Suppose you start with the pool containing only zeros. You add to it an input containing one bit of entropy. Around half of the pool bits will flip, and you can't easily reverse the function to get back the input, but since it's only one bit of entropy you can make two guesses and find one which matches the new pool state. Each new bit of entropy added doubles the number of guesses you need to make; but due to the pigeonhole principle, you can't have more entropy than the number of bits in the pool.
To read from the pool, you use the second part, the output function. It again is a cryptographic function which takes the whole pool as its input, mixes it together, and outputs a number. Other than by guessing, it's not possible to get from this output to the pool state.
Now let's go back to the one-bit example. The pool started with zero entropy (a fixed initial state), and got one bit of entropy added. It can now be in one of two possible states. It goes through the output function, which prevents one reading the output from getting back to the pool state. However, since there were only two possible states (one bit of entropy), you can try both and see which one would generate the output you got... and now the pool state is completely predictable, that is, it now has zero bits of entropy again! By reading from the pool, even with the protection of the output function, you reduced its entropy. Not only that, but there were only two possible outputs, so the output itself had only one bit of entropy, no matter how many bits you had asked for.
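The guessing argument can be made concrete. With a pool holding only 8 bits of entropy and a one-way output function, 256 guesses recover the full state from a single output. Illustrative sketch (SHA-256 stands in for both the mixing and the output function):

```python
import hashlib, secrets

def update(pool: bytes, inp: bytes) -> bytes:
    # Mix an input into the pool state.
    return hashlib.sha256(pool + inp).digest()

def output(pool: bytes) -> bytes:
    # One-way output function: hides the pool, but not its low entropy.
    return hashlib.sha256(b"out" + pool).digest()

# Pool starts at a known state and absorbs 8 bits of true entropy.
start = b"\x00" * 32
secret_input = secrets.randbelow(256).to_bytes(1, "big")
pool = update(start, secret_input)
leak = output(pool)

# An attacker who saw `leak` simply tries all 256 possible inputs:
recovered = next(update(start, bytes([g])) for g in range(256)
                 if output(update(start, bytes([g]))) == leak)
assert recovered == pool  # state fully recovered: entropy is back to zero
```

The one-wayness of the output function is no protection here, because the search space --- 256 states --- is trivially enumerable.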
Now if you read a 32-bit number from a pool with 33 bits of entropy, you can make many guesses and find out a possible pool state. However, again due to the pigeonhole principle, there's on average two possible pool states which will generate the same 32-bit output. Two pool states = one bit. So by reading 32 bits, you reduced the remaining entropy to one bit.
This is important because if you can predict the pool state, you can predict what it will output next, which is obviously bad.
----
Now, why isn't the situation that bad in practice? First, the input entropy counter tends to underestimate the entropy being added (by design, since it's better to underestimate than to overestimate). Second, "by guessing" can take a long enough time to be impractical. Suppose you have a 1024-bit pool, and read 1023 bits from it. In theory, the pool state should be almost completely predictable: there should be only two possible states. In practice, you would have to do more than 2^1000 guesses (a ridiculously large number) before you could actually make it that predictable.
However, that only applies after the pool got unpredictable enough. That's why the new getrandom() system call (which you should use instead of reading /dev/random or /dev/urandom) will always block (or return failure in non-blocking mode) until the pool has gotten enough entropy at least once.
No. This would matter if entropy was somehow directly provided to applications. But no modern system does this. Instead, entropy is hashed into a key for a keystream generator (based either on a hash function, a MAC, or a stream cipher).
That's the entire point of a DRBG†: to securely incorporate inputs of wildly varying randomness into an RNG whose future outputs are as difficult to predict from past output as keys are to recover from ciphers.
To somehow "predict" the output of the system from its input, you would need to be able to work back from the output of a cryptographic keystream generator back to its key. If you can do that, forget the RNG: you've broken most modern cryptography.
† and even prior to this proposal, that's what the LRNG implemented
Nobody is advancing pathological one bit starting entropy cases. Of course you need entropy. But you don't need thousands of fresh bits every millisecond. You need a few hundred once.
More important: nobody knows if the entropy counter is truly conservative, because nobody knows how to properly count entropy. Linux makes educated guesses by assuming a model of entropy for the inputs (inter-arrival times of events). That's far less solid than the rest of the random subsystem.
If I understood it, this change is supposed to be fully compatible with the existing interfaces and assumptions; it just claims to give more bits faster. Using Fortuna would mean more changes at once, and the client code would have to change --- which is easier for OpenBSD to do than it is for Linux.
Replacing the LRNG with Yarrow or Fortuna or something like it would not break any client code, because no sane client code can rely on when /dev/random decides to block --- even by the LRNG's broken definition, that blocking happens at random, and could be deferred indefinitely if enough "entropy" were available.
How would it change the client code? If you use Fortuna to implement /dev/{u,}random, then the client code can just carry on opening that and reading as many bytes as it needs.
The getrandom() call should be used if it's available, and /dev/urandom on legacy systems. The system call blocks once when the system starts until the RNG is initialized with enough entropy, and then never again.
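In practice that advice reduces to a few lines. On Linux, Python exposes the system call as `os.getrandom()`; `os.urandom()` is the portable fallback (and on kernels >= 3.17 it uses getrandom() internally anyway). A sketch:

```python
import os

def secure_random_bytes(n: int) -> bytes:
    # Prefer getrandom(2): it blocks once at boot until the pool is
    # seeded, then never again.  Fall back to /dev/urandom elsewhere.
    try:
        return os.getrandom(n)
    except (AttributeError, OSError):
        return os.urandom(n)

key = secure_random_bytes(32)
assert len(key) == 32
```

No entropy accounting, no select loops, no blocking after boot --- which is the whole point of the system call.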
So this makes /dev/urandom more like Windows's CryptGenRandom (aka RtlGenRandom), which uses AES-256 running in CTR mode as a DRBG.
I'm curious why there still needs to be a distinct /dev/random though. Hasn't history shown that they should really just both point to the same thing? Is there really a need for the difference here? Ditto with the entropy estimation, why is this still a thing?
Page 4: "The new approach implements the modeling of a slow noise source within the LRNG based on the timing of interrupts and allowing other, even fast operating noise sources to contribute entropy. As discussed above for the legacy /dev/random, only the high-resolution time stamps deliver entropy for hardware events and other information processed by the legacy /dev/random implementation hardly have any entropy. This identification is used as a basis that solely the timing of interrupts is used."
"The cryptographic processing of the entropic data is implemented with a well-known and well-studied deterministic random number generator: an SP800-90A DRBG as defined in [2] – naturally excluding the NSA-sponsored Dual-EC DRBG."
"The concept of the LRNG allows the selection of the used DRBG and the backend cipher implementations at compile time."
Is SP800-90A DRBG really state of the art?
According to Wikipedia "the publication contains the specification for four cryptographically secure pseudorandom number generators for use in cryptography." Without Dual-EC it's still three different DRBGs. How good are they? Is there anything better?
They're fine. They're sane, straightforward, relatively simple constructions based on hashes, HMAC, and ciphers running in CTR mode. You could make a CSPRNG more complicated, but there's not much reason to.
The stuff that extends the "state of the art" beyond SP800-90A tends to be system-specific and feature-based. For instance, the "estimators" in the LRNG are an extension to SP800-90A (and a bad idea). Rolling over seed files are another extension.
I didn't RTFA, but I'm wondering why Fortuna[1] hasn't seen better adoption or excitement. I think the BSD team made the right approach of making /dev/random|urandom both a CSPRNG, which if seeded and mixed is an unlimited source of high quality entropy.
Fortuna also has the benefit of mitigating damage from a compromised generator. If the generator is compromised it's very difficult to rewind the RNG output and perform a history attack (is that even the right word for such a thing?).
Every kernel CSPRNG that continually collects entropy is effectively mitigating damage from a compromised generator. This isn't something specific to Fortuna.
1. Seed itself reliably.
2. Do nothing unless seeded.
3. Re-seed itself regularly, for forward secrecy.
4. Have good performance.
How? And why 128 only?
[1] http://sockpuppet.org/blog/2014/02/25/safely-generate-random...
1. http://superuser.com/questions/309840/how-can-i-point-dev-ra...
[1]: http://www.2uo.de/myths-about-urandom/
1 - try to read from /dev/random.
2 - While 1 has not succeeded, wait and try again
3 - Once it has succeeded, read from /dev/urandom on next reads
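Sketched out as a hypothetical helper (with a bounded retry loop so it can't spin forever):

```python
import os
import select

def wait_until_seeded(max_tries: int = 100, timeout: float = 0.1) -> bool:
    # Steps 1-2: poll /dev/random until a single read succeeds.
    fd = os.open("/dev/random", os.O_RDONLY | os.O_NONBLOCK)
    try:
        for _ in range(max_tries):
            readable, _, _ = select.select([fd], [], [], timeout)
            if readable:
                os.read(fd, 1)  # one successful read: the pool is seeded
                return True
        return False
    finally:
        os.close(fd)

def read_random(n: int) -> bytes:
    # Step 3: once seeded, all further reads come from /dev/urandom.
    with open("/dev/urandom", "rb") as f:
        return f.read(n)
```

Call `wait_until_seeded()` once at startup; after it returns True, `read_random()` never blocks.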
Why would this be a problem? Read one byte from /dev/random, and observe one of 257 possible values chosen at random. :D
Rekeying frequently makes passive attacks way harder, so it makes sense to rekey as often as you can.
[1] https://boringssl.googlesource.com/boringssl/+/master/crypto...
Funny that the term "information theoretic" is used with not even one "theorem" in the linked document.
0: https://jlcooke.ca/random/
1: https://www.schneier.com/cryptography/paperfiles/fortuna.pdf
http://www.2uo.de/myths-about-urandom/
EDIT: or read tptacek's root comment on this very page.
For goodness' sake, Fortuna already exists. What happened to "don't invent your own cryptography"?
[1] https://www.schneier.com/cryptography/fortuna/
This!
Again I think this illustrates the echo chamber some camps can't seem to escape. But I agree with your thinking.