
Python Enhancement Proposals

PEP 524 – Make os.urandom() blocking on Linux

Author:
Victor Stinner <vstinner at python.org>
Status:
Final
Type:
Standards Track
Created:
20-Jun-2016
Python-Version:
3.6

Abstract

Modify os.urandom() to block on Linux 3.17 and newer until the OS urandom is initialized, to increase security.

Also add a new os.getrandom() function (for Linux and Solaris) to be able to choose how to handle the case where os.urandom() is going to block on Linux.

The bug

Original bug

Python 3.5.0 was enhanced to use the new getrandom() syscall introduced in Linux 3.17 and Solaris 11.3. The problem is that users started to complain that Python 3.5 blocks at startup on Linux in virtual machines and embedded devices: see issues #25420 and #26839.

On Linux, getrandom(0) blocks until the kernel has initialized urandom with 128 bits of entropy. Issue #25420 describes a Linux build platform blocking at import random. Issue #26839 describes a short Python script used to compute an MD5 hash, systemd-cron, a script called very early in the init process. The system initialization blocks on this script, which itself blocks on getrandom(0) to initialize Python.

The Python initialization requires random bytes to implement a counter-measure against the hash denial-of-service (hash DoS).

Importing the random module creates an instance of random.Random: random._inst. On Python 3.5, the random.Random constructor reads 2500 bytes from os.urandom() to seed a Mersenne Twister RNG (random number generator).
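The hash secret mentioned above can be observed from Python itself. The following is a minimal sketch, not part of the PEP: it assumes the standard PYTHONHASHSEED environment variable, which replaces the random secret with a fixed seed, and it uses a hypothetical helper named hash_in_child().

```python
import os
import subprocess
import sys


def hash_in_child(seed):
    """Return hash('abc') as computed by a child interpreter.

    hash_in_child() is a hypothetical helper for this illustration.
    The seed is passed through the PYTHONHASHSEED environment variable.
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash('abc'))"], env=env)
    return int(out)


# With the same seed, two separate processes compute the same str hash.
# Without PYTHONHASHSEED, each process draws its own secret at startup,
# which is why the interpreter needs random bytes that early.
same = hash_in_child(1) == hash_in_child(1)
print("same seed gives same hash:", same)
```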

Other platforms may be affected by this bug, but in practice, only Linux systems use Python scripts to initialize the system.

Status in Python 3.5.2

Python 3.5.2 behaves like Python 2.7 and Python 3.4. If the system urandom is not initialized, the startup does not block, but os.urandom() can return low-quality entropy (even if it is not easily guessable).

Use Cases

The following use cases help to choose the right compromise between security and practicability.

Use Case 1: init script

Use a Python 3 script to initialize the system, like systemd-cron. If the script blocks, the system initialization is stuck too. Issue #26839 is a good example of this use case.

Use case 1.1: No secret needed

If the init script doesn’t have to generate any secure secret, this use case is already handled correctly in Python 3.5.2: Python startup doesn’t block on system urandom anymore.

Use case 1.2: Secure secret required

If the init script has to generate a secure secret, there is no safe solution.

Falling back to weak entropy is not acceptable: it would reduce the security of the program.

Python cannot produce secure entropy itself; it can only wait until system urandom is initialized. But in this use case, the whole system initialization is blocked by this script, so the system fails to boot.

The real answer is that the system initialization must not be blocked by such a script. It is OK to start the script very early at system initialization, but the script may block for a few seconds until it is able to generate the secret.

Reminder: in some cases, the initialization of the system urandom never occurs, and so programs waiting for system urandom block forever.

Use Case 2: Web server

Run a Python 3 web server serving web pages using HTTP and HTTPS protocols. The server is started as soon as possible.

The first targets of the hash DoS attack were web servers: it’s important that the hash secret cannot be easily guessed by an attacker.

If serving a web page needs a secret to create a cookie, create an encryption key, …, the secret must be created with good entropy: again, it must be hard to guess the secret.

A web server requires security. If a choice must be made between security and running the server with weak entropy, security is more important. If there is no good entropy, the server must block or fail with an error.

The question is whether it makes sense to start a web server on a host before system urandom is initialized.

The issues #25420 and #26839 are restricted to the Python startup, not to generating a secret before the system urandom is initialized.

Fix system urandom

Load entropy from disk at boot

Collecting entropy can take up to several minutes. To accelerate the system initialization, operating systems store entropy on disk at shutdown, and then reload entropy from disk at boot.

If a system collects enough entropy at least once, the system urandom will be initialized quickly, as soon as the entropy is reloaded from disk.

Virtual machines

Virtual machines don’t have direct access to the hardware and so have fewer sources of entropy than bare metal. A solution is to add a virtio-rng device to pass entropy from the host to the virtual machine.

Embedded devices

A solution for embedded devices is to plug in a hardware RNG.

For example, the Raspberry Pi has a hardware RNG but it’s not used by default. See: Hardware RNG on Raspberry Pi.

Denial-of-service when reading random

Don’t use /dev/random but /dev/urandom

The /dev/random device should only be used for very specific use cases. Reading from /dev/random on Linux is likely to block. Users don’t like it when an application blocks longer than 5 seconds to generate a secret. Blocking is only expected for specific cases like explicitly generating an encryption key.

When the system has no available entropy, choosing between blocking until entropy is available or falling back on lower quality entropy is a matter of compromise between security and practicability. The choice depends on the use case.

On Linux, /dev/urandom is secure and should be used instead of /dev/random. See Myths about /dev/urandom by Thomas Hühn: “Fact: /dev/urandom is the preferred source of cryptographic randomness on UNIX-like systems”

getrandom(size, 0) can block forever on Linux

The origin of the Python issue #26839 is the Debian bug report #822431: in fact, getrandom(size, 0) blocks forever on the virtual machine. The system succeeded in booting because systemd killed the blocked process after 90 seconds.

Solutions like Load entropy from disk at boot reduce the risk of this bug.

Rationale

On Linux, reading /dev/urandom can return “weak” entropy before urandom is fully initialized, that is, before the kernel has collected 128 bits of entropy. Linux 3.17 adds a new getrandom() syscall which allows blocking until urandom is initialized.

On Python 3.5.2, os.urandom() uses getrandom(size, GRND_NONBLOCK), but falls back on reading the non-blocking /dev/urandom if getrandom(size, GRND_NONBLOCK) fails with EAGAIN.

Security experts promote os.urandom() to generate cryptographic keys because it is implemented with a Cryptographically secure pseudo-random number generator (CSPRNG). By the way, os.urandom() is preferred over ssl.RAND_bytes() for different reasons.

This PEP proposes to modify os.urandom() to use getrandom() in blocking mode so that it does not return weak entropy, while also ensuring that Python will not block at startup.

Changes

Make os.urandom() blocking on Linux

All changes described in this section are specific to the Linux platform.

Changes:

  • Modify os.urandom() to block until system urandom is initialized: os.urandom() (C function _PyOS_URandom()) is modified to always call getrandom(size, 0) (blocking mode) on Linux and Solaris.
  • Add a new private _PyOS_URandom_Nonblocking() function: try to call getrandom(size, GRND_NONBLOCK) on Linux and Solaris, but fall back on reading /dev/urandom if it fails with EAGAIN.
  • Initialize hash secret from non-blocking system urandom: _PyRandom_Init() is modified to call _PyOS_URandom_Nonblocking().
  • random.Random constructor now uses non-blocking system urandom: it is modified to use internally the new _PyOS_URandom_Nonblocking() function to seed the RNG.
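The split between the blocking and the non-blocking code paths can be sketched in Python. This is only an illustration of the logic listed above, not the actual C implementation: the function names urandom_blocking(), urandom_nonblocking() and _fill() are hypothetical, and the sketch assumes Python 3.6 or newer on a platform where os.getrandom() and /dev/urandom exist (the functions raise AttributeError elsewhere).

```python
import os


def _fill(size, flags):
    # Hypothetical helper: call getrandom() until `size` bytes are read,
    # because a single call may return fewer bytes than requested.
    data = bytearray()
    while len(data) < size:
        data += os.getrandom(size - len(data), flags)
    return bytes(data)


def urandom_blocking(size):
    # Roughly what os.urandom() does after this PEP: wait until the
    # kernel has initialized urandom, then return strong random bytes.
    return _fill(size, 0)


def urandom_nonblocking(size):
    # Roughly what the private startup path does: never wait.
    try:
        return _fill(size, os.GRND_NONBLOCK)
    except BlockingIOError:
        # EAGAIN: urandom is not initialized yet. Reading /dev/urandom
        # does not block on Linux, but the bytes may be weak.
        with open("/dev/urandom", "rb") as f:
            return f.read(size)
```

The design point is that only the hash secret and the random module seed use the non-blocking path, since neither is meant to protect secrets of the application itself.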

Add a new os.getrandom() function

A new os.getrandom(size, flags=0) function is added: it uses the getrandom() syscall on Linux and the getrandom() C function on Solaris.

The function comes with 2 new flags:

  • os.GRND_RANDOM: read bytes from /dev/random rather than reading /dev/urandom
  • os.GRND_NONBLOCK: raise a BlockingIOError if os.getrandom() would block

os.getrandom() is a thin wrapper on the getrandom() syscall/C function and so inherits its behaviour. For example, on Linux, it can return fewer bytes than requested if the syscall is interrupted by a signal.
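Because of these short reads, callers that need an exact number of bytes have to loop. A minimal sketch follows; read_exact() and its reader parameter are hypothetical names introduced for this illustration, and the default reader assumes a platform where os.getrandom() is available.

```python
import os


def read_exact(size, flags=0, reader=None):
    """Call a getrandom()-style reader until `size` bytes are collected.

    read_exact() and `reader` are hypothetical names. By default the
    reader is os.getrandom (Linux and Solaris, Python 3.6+).
    """
    if reader is None:
        reader = os.getrandom  # AttributeError if getrandom() is missing
    data = bytearray()
    while len(data) < size:
        # Ask only for the bytes that are still missing
        data += reader(size - len(data), flags)
    return bytes(data)


def short_reader(n, flags):
    # Stand-in for an interrupted syscall: never returns more than 3 bytes
    return b"x" * min(n, 3)


print(len(read_exact(10, reader=short_reader)))
```

Passing os.GRND_NONBLOCK as flags keeps the BlockingIOError behaviour described above: the exception simply propagates out of the loop.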

Examples using os.getrandom()

Best-effort RNG

Example of a portable non-blocking RNG function: try to get random bytes from the OS urandom, or fall back on the random module.

    import os
    import random

    def best_effort_rng(size):
        # getrandom() is only available on Linux and Solaris
        if not hasattr(os, 'getrandom'):
            return os.urandom(size)

        result = bytearray()
        try:
            # need a loop because getrandom() can return less bytes than
            # requested for different reasons
            while size:
                data = os.getrandom(size, os.GRND_NONBLOCK)
                result += data
                size -= len(data)
        except BlockingIOError:
            # OS urandom is not initialized yet:
            # fallback on the Python random module
            data = bytes(random.randrange(256) for byte in range(size))
            result += data
        return bytes(result)

This function can block in theory on a platform where os.getrandom() is not available but os.urandom() can block.

wait_for_system_rng()

Example of a function waiting timeout seconds until the OS urandom is initialized on Linux or Solaris:

    import os
    import time

    def wait_for_system_rng(timeout, interval=1.0):
        if not hasattr(os, 'getrandom'):
            return

        deadline = time.monotonic() + timeout
        while True:
            try:
                os.getrandom(1, os.GRND_NONBLOCK)
            except BlockingIOError:
                pass
            else:
                return

            if time.monotonic() > deadline:
                raise Exception('OS urandom not initialized after %s seconds'
                                % timeout)

            time.sleep(interval)

This function is not portable. For example, os.urandom() can block on FreeBSD in theory, at the early stage of the system initialization.

Create a best-effort RNG

Simpler example to create a non-blocking RNG on Linux: choose between random.SystemRandom and random.Random depending on whether getrandom(size) would block.

    import os
    import random

    def create_nonblocking_random():
        if not hasattr(os, 'getrandom'):
            return random.Random()

        try:
            os.getrandom(1, os.GRND_NONBLOCK)
        except BlockingIOError:
            return random.Random()
        else:
            return random.SystemRandom()

This function is not portable. For example, random.SystemRandom can block on FreeBSD in theory, at the early stage of the system initialization.

Alternative

Leave os.urandom() unchanged, add os.getrandom()

os.urandom() remains unchanged: it never blocks, but it can return weak entropy if system urandom is not initialized yet.

Only add the new os.getrandom() function (wrapper to the getrandom() syscall/C function).

The secrets.token_bytes() function should be used to write portable code.

The problem with this change is that it expects users to understand security well and to know each platform well. Python has a tradition of hiding “implementation details”. For example, os.urandom() is not a thin wrapper to the /dev/urandom device: it uses CryptGenRandom() on Windows, it uses getentropy() on OpenBSD, it tries getrandom() on Linux and Solaris or falls back on reading /dev/urandom. Python already uses the best available system RNG depending on the platform.

This PEP does not change the API:

  • os.urandom(), random.SystemRandom and secrets for security
  • random module (except random.SystemRandom) for all other usages
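The division of labour in the list above might look as follows in application code. This is a small sketch using only standard library calls; the variable names and the seed value 12345 are arbitrary choices made for the illustration.

```python
import random
import secrets

# Security-sensitive values: all backed by the OS RNG (os.urandom())
session_key = secrets.token_bytes(32)          # 32 random bytes
reset_token = secrets.token_urlsafe(16)        # URL-safe text token
pin = random.SystemRandom().randrange(10**6)   # integer in [0, 999999]

# All other usages (simulations, shuffling test data, ...): the
# Mersenne Twister, which never blocks and is reproducible when seeded
sim = random.Random(12345)
samples = [sim.random() for _ in range(3)]
```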

Raise BlockingIOError in os.urandom()

Proposition

PEP 522: Allow BlockingIOError in security sensitive APIs on Linux.

Python should not decide for the developer how to handle The bug: immediately raising a BlockingIOError if os.urandom() is going to block allows developers to choose how to handle this case:

  • catch the exception and fall back to a non-secure entropy source: read /dev/urandom on Linux, use the Python random module (which is not secure at all), use time, use process identifier, etc.
  • don’t catch the error, so the whole program fails with this fatal exception

More generally, the exception helps to notify when something goes wrong. The application can emit a warning when it starts to wait for os.urandom().

Criticism

For the use case 2 (web server), falling back on non-secure entropy is not acceptable. The application must handle BlockingIOError: poll os.urandom() until it completes. Example:

    import os
    import time

    def secret(n=16):
        try:
            return os.urandom(n)
        except BlockingIOError:
            pass

        print("Wait for system urandom initialization: move your "
              "mouse, use your keyboard, use your disk, ...")
        while 1:
            # Avoid busy-loop: sleep 1 ms
            time.sleep(0.001)
            try:
                return os.urandom(n)
            except BlockingIOError:
                pass

For correctness, all applications which must generate a secure secret must be modified to handle BlockingIOError even if The bug is unlikely.

The case of applications which use os.urandom() but don’t really require security is not well defined. Maybe these applications should not use os.urandom() in the first place, but always the non-blocking random module. If os.urandom() is used for security, we are back to the use case 2 described above: Use Case 2: Web server. If a developer doesn’t want to drop os.urandom(), the code should be modified. Example:

    import os
    import random

    def almost_secret(n=16):
        try:
            return os.urandom(n)
        except BlockingIOError:
            return bytes(random.randrange(256) for byte in range(n))

The question is whether The bug is common enough to require that so many applications be modified.

Another simpler choice is to refuse to start before the system urandom is initialized:

    import os
    import sys

    def secret(n=16):
        try:
            return os.urandom(n)
        except BlockingIOError:
            print("Fatal error: the system urandom is not initialized")
            print("Wait a bit, and rerun the program later.")
            sys.exit(1)

Compared to Python 2.7, Python 3.4 and Python 3.5.2, where os.urandom() never blocks nor raises an exception on Linux, such a behaviour change can be seen as a major regression.

Add an optional block parameter to os.urandom()

See issue #27250: Add os.urandom_block().

Add an optional block parameter to os.urandom(). The default value may be True (block by default) or False (non-blocking).

The first technical issue is to implement os.urandom(block=False) on all platforms. Only Linux 3.17 (and newer) and Solaris 11.3 (and newer) have a well defined non-blocking API (getrandom(size, GRND_NONBLOCK)).

As with Raise BlockingIOError in os.urandom(), it doesn’t seem worth it to make the API more complex for a theoretical (or at least very rare) use case.

As with Leave os.urandom() unchanged, add os.getrandom(), the problem is that it makes the API more complex and so more error-prone.

Acceptance

The PEP was accepted on 2016-08-08 by Guido van Rossum.

Annexes

Operating system random functions

os.urandom() uses the following functions:

On Linux, commands to get the status of /dev/random (results are numbers of bits):

    $ cat /proc/sys/kernel/random/entropy_avail
    2850
    $ cat /proc/sys/kernel/random/poolsize
    4096
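The same two counters can be read from Python. The sketch below is an illustration only: read_kernel_int() is a hypothetical helper, and it returns None on platforms where these Linux-specific files do not exist.

```python
def read_kernel_int(path):
    """Return the integer stored in a /proc-style file, or None.

    read_kernel_int() is a hypothetical helper for this illustration.
    """
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Missing file (non-Linux platform) or unexpected content
        return None


RANDOM_DIR = "/proc/sys/kernel/random"
avail = read_kernel_int(RANDOM_DIR + "/entropy_avail")
pool = read_kernel_int(RANDOM_DIR + "/poolsize")
if avail is None or pool is None:
    print("entropy counters are not available on this platform")
else:
    print("entropy_avail=%d poolsize=%d" % (avail, pool))
```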

Why using os.urandom()?

Since os.urandom() is implemented in the kernel, it doesn’t have the issues of a user-space RNG. For example, it is much harder to get its state. It is usually built on a CSPRNG, so even if its state is “stolen”, it is hard to compute previously generated numbers. The kernel has good knowledge of entropy sources and regularly feeds the entropy pool.

That’s also why os.urandom() is preferred over ssl.RAND_bytes().

Copyright

This document has been placed in the public domain.


Source: https://github.com/python/peps/blob/main/peps/pep-0524.rst

Last modified: 2025-02-01 08:59:27 GMT

