World's most popular travel blog for travel bloggers.

[Solved]: How does an operating system create entropy for random seeds?

, , No Comments
Problem Detail: 

On Linux, the files /dev/random and /dev/urandom files are the blocking and non-blocking (respectively) sources of pseudo-random bytes.

They can be read as normal files:

$ hexdump /dev/random 0000000 28eb d9e7 44bb 1ac9 d06f b943 f904 8ffa 0000010 5652 1f08 ccb8 9ee2 d85c 7c6b ddb2 bcbe 0000020 f841 bd90 9e7c 5be2 eecc e395 5971 ab7f 0000030 864f d402 74dd 1aa8 925d 8a80 de75 a0e3 0000040 cb64 4422 02f7 0c50 6174 f725 0653 2444 ... 

Many other unix variants provide /dev/random and /dev/urandom as well, without the blocking/non-blocking distinction.

The Windows equivalent is the CryptGenRandom() function.

How does the operating system generate pseudo-randomness?

Asked By : Adam Matan

Answered By : Gilles

The title and the body of your question ask two different questions: how the OS creates entropy (this should really be obtains entropy), and how it generates pseudo-randomness from this entropy. I'll start by explaining the difference.

Where does randomness come from?

Random number generators (RNG) come in two types:

Some applications, such as simulations of physical phenomena, can be content with random numbers that pass statistical tests. Other applications, such as the generation of cryptographic keys, require a stronger property: unpredictability. Unpredictability is a security property, not (only) a statistical property: it means that an adversary cannot guess the output of the random number generator. (More precisely, you can measure the quality of the RNG by measuring the probability for an adversary to guess each bit of RNG output. If the probability is measurably different from 1/2, the RNG is bad.)

There are physical phenomena that produce random data with good statistical properties — for example, radioactive decay, or some astronomical observations of background noise, or stock market fluctuations. Such physical measurements need conditioning (whitening), to turn biased probability distributions into a uniform probability distribution. A physical measurement that is known to everyone isn't good for cryptography: stock market fluctuations might be good for geohashing, but you can't use them to generate secret keys.

Cryptography requires secrecy: an adversary must not be able to find out the data that went into conditioning. There are cryptographically secure pseudo-random number generators (CSPRNG): PRNG algorithms whose output is suitable for use in cryptographic applications, in addition to having good statistical properties. One of the properties that make a CSPRNG cryptographically secure is that its output does not allow an adversary to reconstruct the internal state (knowing all the bits but one produced by a CSPRNG does not help to find the missing bit). I won't go into how to make a CSPRNG because that's the easy bit — you can follow recipes given by professional cryptographers (use a standard algorithm, such as Hash_DRBG, HMAC_DRBG or CTR_DRBG from NIST SP 800-90A) or the ANSI X9.31 PRNG. The CSPRNG requires two properties of its state in order to be secure:

  • The state must be kept secret from the start and at all times (though exposure of the state will not reveal past outputs).
  • The state must be linear: the RNG must never be started twice from the same state.

Architecture of a random number generator

In practice, almost all good random number generators combine a CSPRNG with one or more entropy sources. To put it succintly, entropy is a measure of the unpredictability of a source of data. Basing a random number generator purely on a hardware RNG is difficult:

  • The raw physical data is likely to need conditioning anyway, to turn probabilistic data into a uniform distribution.
  • The output from the source of randomness must be kept secret.
  • Entropy sources are often slow compared with the demand.

Thus the RNG in an operating system almost always works like this:

  1. Accumulate sufficient entropy to build an unpredictable internal state.
  2. Run a CSPRNG, using the accumulated entropy as the seed, i.e. as the initial value of the internal state.
  3. Optionally, periodically mix additional entropy into the internal state. (This is not strictly necessary, since entropy is not "consumed" at any measurable rate. It helps against certain threats that leak the RNG state without compromising the whole system.)

A random number generation service is part of the job of an operating system, because entropy gathering requires access to hardware, and entropy sources constitute a shared resource: the operating system must assemble them and derive output from them that will suit applications. Pseudo-random conditioning of the entropy sources is required in the operating system; it might as well be cryptographically secure, because this isn't fundamentally harder (and it is required on operating systems where applications do not trust each other; on fully cooperative systems, each application would have to run its own CSPRNG internally if the operating system didn't provide one anyway).

Most systems with persistent storage will load an RNG seed from disk (I'll use "disk" as an abbreviation for any kind of persistent storage) when they boot, and overwrite the seed with some fresh pseudo-random data generated from that seed, or if available with random data generated from that seed plus another entropy source. This way, even if entropy is not available after a reboot, the entropy from a previous session is reused.

Some care must be taken about the saved state. Remember how I said the state must be linear? If you boot twice from the same disk state, you'll get the same RNG outputs. If this is a possibility in your environment, you need another source of entropy. Take care when restoring from backups, or when cloning a virtual machine. One technique for cloning is to mix the stored entropy with some environmental data that is predictable but unique (e.g. time and MAC address); beware that if the environmental data is predictable, anyone in possession of the stored VM state can reconstruct the seed of a forked VM instance.

Entropy sources

Finding (and correctly using) entropy sources is the most challenging part of random number generation in an operating system. The available entropy sources will necessarily be dependent on the hardware and on which environment the hardware runs in.

If you're lucky, your hardware provides a peripheral which can be used as an entropy source: a hardware random number generator, either dedicated or side-purposed. For example:

NIST SP800-90B provides design guidelines for hardware RNG. Evaluating a hardware RNG is difficult. Hardware RNG are typically delicate beasts, which need to be used with care: many types require some time after boot and some time between reads in order to destabilize, they are often sensitive to environmental conditions such as the temperature, etc.

Intel x86-64 processors based on the Ivy Bridge architecture provide the RdRand instruction which provides the output from a CSPRNG seeded by thermal noise. Most smartphone processors include a hardware entropy source, though Android doesn't always use it.

Systems that lack a strong entropy source have to make do with combining weak entropy sources and hoping (ensuring would be too strong a word) that they will suffice. Random mouse movements are popular for client machines, and you might have seen the security show by certain cryptography programs that ask you to move the mouse (even though on any 21st century PC operating system the OS will have accumulated entropy without the application needing to bother).

If you want to look at an example, you can look at Linux, though beware that it isn't perfect. In particular, /dev/random blocks too often (because it blocks until enough entropy is available, with an overly conservative notion of entropy), whereas /dev/urandom is almost always good except on first boot but gives no indication when it doesn't have enough entropy. Linux has drivers for many HRNG devices, and in accumulates entropy from various devices (including input devices) and disk timings.

If you have (confidential) persistent storage, you can use it to save entropy from one boot to the next, as indicated above. The first boot is a delicate time: the system may be in a fairly predictable state at that point, especially on mass-produced devices that essentially operate out of the factory in the same way. Some embedded devices that have persistent storage are provisioned with an initial seed in the factory (produced by a RNG running on a computer in the factory). In virtualized server environments, initial entropy can be provisioned when instantiating a virtual machine from the host or from an entropy server.

Badly-seeded devices are a widespread problem in practice — a study of public RSA keys found that many servers and devices had keys that were generated with a poor RNG, most likely a good PRNG that was insufficiently seeded. As an OS designer, you cannot solve this problem on your own: it is the job of the entity in control of the deployment chain to ensure that the RNG will be properly seeded at first boot. Your task as an OS designer is to provide a proper RNG, including an interface to provide that first seed, and to ensure proper error signaling if the RNG is used before it is properly seeded.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/29880

0 comments:

Post a Comment

Let us know your responses and feedback