| From: | Theodore Ts'o <tytso-AT-mit.edu> | |
| To: | Thorsten Glaser <tg-AT-mirbsd.de> | |
| Subject: | Re: [PATCH] /dev/random: Insufficient of entropy on many architectures | |
| Date: | Fri, 13 Sep 2013 15:29:15 -0400 | |
| Message-ID: | <20130913192915.GC15366@thunk.org> | |
| Cc: | linux-kernel-AT-vger.kernel.org | |
On Fri, Sep 13, 2013 at 11:54:40AM +0000, Thorsten Glaser wrote:

One of the primary requirements for the mixing function is that it
should not lose entropy in the course of doing the mixing.  For
example, if someone does the equivalent of "dd if=/dev/zero
of=/dev/random", and mixes in a whole series of zero values, we should
not lose any information.  That's another way of saying that the
mixing function:

    POOL' = MIX(POOL, entropy_input)

must be reversible.  This guarantees that:

    POOL' = MIX(POOL, 0)

will not lose any information.

The second requirement of the mixing function is that we want to make
sure any entropy we do have is distributed evenly across the pool.
This must be true even if the input is highly structured: for example,
if we are feeding timestamps into the entropy pool, where the high
bits are always the same and the entropy is in the low bits,
successive timestamps should not just affect certain bits in the pool.

This is why using a simple XOR into the pool is not a good idea, and
why we rotate the input byte by different amounts during each mixing
operation.

However, when we extract information from the pool, this is where we
want to make sure the pool gets perturbed in such a way that if the
attacker has partial knowledge of the pool, that knowledge gets harder
to use very quickly.  And here the paper "The Linux Pseudorandom
Number Generator Revisited", by Lacharme, Rock, Strubel, and Videau
has some good points that I am seriously considering.  I am still
thinking about it, but what I am thinking about doing is more mixing
at each extract operation, so that there is more avalanche-style
stirring happening each time we extract information from the pool.
Right now we mix in a SHA hash at each extract operation; but if we
call mix_pool_bytes() with a zero buffer, then we will smear that SHA
hash across more of the pool, which makes it harder for an attacker to
try to reverse engineer the pool while having partial knowledge of an
earlier state of the pool.

So let me be blunt.
The entropy estimator we have sucks.  It's pretty much impossible to
automatically estimate, given an arbitrary input stream, how much
entropy is present.  The best example I can give is the
aforementioned:

    AES_ENCRYPT(i++, NSA_KEY)

Let's assume the starting value of i is an unknown 32-bit number.  The
resulting stream will pass all statistical randomness tests, but no
matter how many times you turn the crank and generate more "random"
numbers, the amount of entropy in that stream is only 32 bits --- at
least to someone who knows the NSA_KEY.  For someone who knows that
this is how the input stream was generated, but not the 256-bit NSA
KEY, the amount of entropy in the stream would be 32 + 256 = 288
bits.  And this is true even if you generate several gigabytes' worth
of numbers using the above CRNG.

Now, if you know that the input stream is a series of timestamps, as
we do in add_timer_randomness(), we can do something slightly better
by looking at the deltas between successive timestamps, but if there
is a strong 50 or 60 Hz component to the timestamp values, perhaps due
to how the disk drive motor is powered from the AC mains, a simple
delta estimator is not going to catch this.

So the only really solid way we can get a strong entropy estimation is
by knowing the details of the entropy source, and how it might fail to
be random (e.g., if it is a microphone hooked up to gather room noise,
there might be some regular frequency patterns generated by the HVAC
equipment; so examining the input in the frequency domain and looking
for any spikes might be a good idea).

The one saving grace in all of this is there is plenty of
unpredictable information which we mix in for which we give no entropy
credit whatsoever.  But please don't assume that just because you
can read 8192 bits out of /dev/random, you are guaranteed to get
8192 bits of pure "true random bits".  For one thing, the input pool
is only 4096 bits, plus a 1024-bit output pool.
Even if both of the pools are filled with pure unpredictable bits,
that's only 5120 bits.  Sure, as you extract those bits, some more
entropy will trickle in, but it won't be a full 8192 bits in all
likelihood.

At the end of the day, there is no real replacement for a real HWRNG.
And I've never had any illusions that the random driver could be a
replacement for a real HWRNG.  The problem, though, is that most
HWRNGs can't be audited, because they are not open, and most users
aren't going to be able to grab a wirewrap gun and make their own ---
and even if they did, it's likely they would screw up in some
embarrassing way.  Really, the best you can do is hopefully have
multiple sources of entropy (RDRAND, plus the random number generator
in the TPM, etc.) and hope that mixing all of this, plus some OS-level
entropy, is enough to frustrate the attacker enough that it's no
longer the easiest way to compromise your security.

Regards,

                                        - Ted
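
The reversibility requirement described above (POOL' = MIX(POOL, input)
must be invertible for any fixed input) can be illustrated with a toy
sketch.  This is Python, not kernel C, and it is not the kernel's
mix_pool_bytes(): the pool size, tap positions, rotation amounts, and
the names mix() and unmix() are all assumptions chosen for illustration.
The only point is that each step replaces one pool word with an
invertible function of its old value, so feeding in zeros stirs the pool
without discarding any of its state.

```python
MASK32 = 0xFFFFFFFF


def rol32(x, r):
    """Rotate a 32-bit word left by r bits."""
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK32


def ror32(x, r):
    """Rotate a 32-bit word right by r bits."""
    r %= 32
    return ((x >> r) | (x << (32 - r))) & MASK32


def mix(pool, data):
    """Toy mixing function (hypothetical, not the kernel algorithm).

    Each input byte is rotated by a varying amount, XORed with the
    current pool word and two tap words, and the result is rotated
    before being stored back.  The pool needs more than 3 words so
    that the taps never coincide with the word being updated.
    """
    pool = list(pool)
    n = len(pool)
    for k, byte in enumerate(data):
        i = k % n
        rot = (7 * k) % 32  # rotate each input byte by a different amount
        w = rol32(byte, rot) ^ pool[i] ^ pool[(i + 1) % n] ^ pool[(i + 3) % n]
        pool[i] = rol32(w, 7)
    return pool


def unmix(pool, data):
    """Undo mix() for the same input, walking the steps backwards."""
    pool = list(pool)
    n = len(pool)
    for k in reversed(range(len(data))):
        i = k % n
        rot = (7 * k) % 32
        w = ror32(pool[i], 7)
        pool[i] = w ^ rol32(data[k], rot) ^ pool[(i + 1) % n] ^ pool[(i + 3) % n]
    return pool


if __name__ == "__main__":
    start = [(0x01234567 * (k + 1)) & MASK32 for k in range(8)]
    zeros = bytes(64)  # the "dd if=/dev/zero of=/dev/random" case
    stirred = mix(start, zeros)
    print("recovered original pool:", unmix(stirred, zeros) == start)
```

Because unmix() recovers the previous pool exactly, two different pool
states can never collapse into the same state under the same input,
which is the "does not lose information" property in miniature.  The
sketch says nothing about how well the entropy is spread across the
pool; that depends on the choice of taps and rotations.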