
Python Enhancement Proposals

PEP 504 – Using the System RNG by default


Author: Alyssa Coghlan <ncoghlan at gmail.com>
Status: Withdrawn
Type: Standards Track
Created: 15-Sep-2015
Python-Version: 3.6
Post-History: 15-Sep-2015


Abstract

Python currently defaults to using the deterministic Mersenne Twister random number generator for the module level APIs in the random module, requiring users to know that when they’re performing “security sensitive” work, they should instead switch to using the cryptographically secure os.urandom or random.SystemRandom interfaces or a third party library like cryptography.

Unfortunately, this approach has resulted in a situation where developers that aren’t aware that they’re doing security sensitive work use the default module level APIs, and thus expose their users to unnecessary risks.

This isn’t an acute problem, but it is a chronic one, and the often long delays between the introduction of security flaws and their exploitation mean that it is difficult for developers to naturally learn from experience.

In order to provide an eventually pervasive solution to the problem, this PEP proposes that Python switch to using the system random number generator by default in Python 3.6, and require developers to opt in to using the deterministic random number generator process wide either by using a new random.ensure_repeatable() API, or by explicitly creating their own random.Random() instance.

To minimise the impact on existing code, module level APIs that require determinism will implicitly switch to the deterministic PRNG.

PEP Withdrawal

During discussion of this PEP, Steven D’Aprano proposed the simpler alternative of offering a standardised secrets module that provides “one obvious way” to handle security sensitive tasks like generating default passwords and other tokens.

Steven’s proposal has the desired effect of aligning the easy way to generate such tokens and the right way to generate them, without introducing any compatibility risks for the existing random module API, so this PEP has been withdrawn in favour of further work on refining Steven’s proposal as PEP 506.
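As a rough illustration of the “one obvious way” idea, the sketch below uses the secrets module as it later shipped in Python 3.6. The helper names make_reset_token and make_default_password are hypothetical and exist only for this example; only the secrets functions themselves come from the standard library.

```python
import secrets
import string

# Hypothetical helper names; secrets.token_urlsafe and secrets.choice are
# the standard library calls being illustrated.
def make_reset_token() -> str:
    """Return a URL-safe token drawn from the system RNG."""
    return secrets.token_urlsafe(32)

def make_default_password(length: int = 16) -> str:
    """Build a password by picking each character with secrets.choice."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

Neither helper touches the module level functions in random, so existing code that relies on seeding is unaffected.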

Proposal

Currently, it is never correct to use the module level functions in the random module for security sensitive applications. This PEP proposes to change that admonition in Python 3.6+ to instead be that it is not correct to use the module level functions in the random module for security sensitive applications if random.ensure_repeatable() is ever called (directly or indirectly) in that process.

To achieve this, rather than being bound methods of a random.Random instance as they are today, the module level callables in random would change to be functions that delegate to the corresponding method of the existing random._inst module attribute.

By default, this attribute will be bound to a random.SystemRandom instance.

A new random.ensure_repeatable() API will then rebind the random._inst attribute to a random.Random instance, restoring the same module level API behaviour as existed in previous Python versions (aside from the additional level of indirection):

    def ensure_repeatable():
        """Switch to using random.Random() for the module level APIs

        This switches the default RNG instance from the cryptographically
        secure random.SystemRandom() to the deterministic random.Random(),
        enabling the seed(), getstate() and setstate() operations. This means
        a particular random scenario can be replayed later by providing the
        same seed value or restoring a previously saved state.

        NOTE: Libraries implementing security sensitive operations should
        always explicitly use random.SystemRandom() or os.urandom in order to
        correctly handle applications that call this function.
        """
        global _inst  # rebind the module attribute, not a local name
        # SystemRandom subclasses Random, so test for SystemRandom directly
        if isinstance(_inst, SystemRandom):
            _inst = Random()

To minimise the impact on existing code, calling any of the following module level functions will implicitly call random.ensure_repeatable():

  • random.seed
  • random.getstate
  • random.setstate

There are no changes proposed to the random.Random or random.SystemRandom class APIs - applications that explicitly instantiate their own random number generators will be entirely unaffected by this proposal.
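The three functions listed above are the ones that have no meaning for the system RNG, which is why they make a natural trigger for the implicit switch. A short sketch using the existing random.SystemRandom class shows this; the variable names are illustrative only.

```python
import random

system_rng = random.SystemRandom()

# seed() is accepted but ignored: the OS entropy source cannot be seeded,
# so reseeding with the same value does not replay the same output.
system_rng.seed(12345)
first = system_rng.random()
system_rng.seed(12345)
second = system_rng.random()

# getstate() has nothing to save, so it raises NotImplementedError.
try:
    system_rng.getstate()
    has_state = True
except NotImplementedError:
    has_state = False
```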

Warning on implicit opt-in

In Python 3.6, implicitly opting in to the use of the deterministic PRNG will emit a deprecation warning using the following check:

    # SystemRandom subclasses Random, so test for SystemRandom directly
    if isinstance(_inst, SystemRandom):
        warnings.warn("Implicitly ensuring repeatability. "
                      "See help(random.ensure_repeatable) for details",
                      DeprecationWarning)
        ensure_repeatable()

The specific wording of the warning should have a suitable answer added to Stack Overflow as was done for the custom error message that was added for missing parentheses in a call to print [10].

In the first Python 3 release after Python 2.7 switches to security fix only mode, the deprecation warning will be upgraded to a RuntimeWarning so it is visible by default.
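The two warning stages can be sketched with the standard warnings module. The helper name warn_implicit_opt_in is hypothetical; note that warnings.warn takes the message first and the category second.

```python
import warnings

def warn_implicit_opt_in(category=DeprecationWarning):
    """Emit the implicit opt-in warning (hypothetical helper name)."""
    warnings.warn(
        "Implicitly ensuring repeatability. "
        "See help(random.ensure_repeatable) for details",
        category,
        stacklevel=2,
    )

# Record both stages under an "always show" filter so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_implicit_opt_in()                # initial stage
    warn_implicit_opt_in(RuntimeWarning)  # later stage, shown by default

categories = [w.category for w in caught]
```

Under the interpreter's default filters, DeprecationWarning is hidden in most contexts while RuntimeWarning is displayed, which is the difference the staged upgrade relies on.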

This PEP does not propose ever removing the ability to ensure the default RNG used process wide is a deterministic PRNG that will produce the same series of outputs given a specific seed. That capability is widely used in modelling and simulation scenarios, and requiring that ensure_repeatable() be called either directly or indirectly is a sufficient enhancement to address the cases where the module level random API is used for security sensitive tasks in web applications without due consideration for the potential security implications of using a deterministic PRNG.

Performance impact

Due to the large performance difference between random.Random and random.SystemRandom, applications ported to Python 3.6 will encounter a significant performance regression in cases where:

  • the application is using the module level random API
  • cryptographic quality randomness isn’t needed
  • the application doesn’t already implicitly opt back in to the deterministic PRNG by calling random.seed, random.getstate, or random.setstate
  • the application isn’t updated to explicitly call random.ensure_repeatable

This would be noted in the Porting section of the Python 3.6 What’s New guide, with the recommendation to include the following code in the __main__ module of affected applications:

    import random

    if hasattr(random, "ensure_repeatable"):
        random.ensure_repeatable()

Applications that do need cryptographic quality randomness should be using the system random number generator regardless of speed considerations, so in those cases the change proposed in this PEP will fix a previously latent security defect.
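The size of the performance difference depends on the platform, so it is best measured rather than assumed. The following is a minimal timing sketch using timeit; the helper name time_calls and the call count are arbitrary choices for illustration, and no particular ratio is implied.

```python
import random
import timeit

deterministic = random.Random()
system = random.SystemRandom()

def time_calls(rng, number=20_000):
    """Return the seconds taken by `number` calls to rng.random()."""
    return timeit.timeit(rng.random, number=number)

mt_seconds = time_calls(deterministic)
sys_seconds = time_calls(system)
print(f"random.Random:       {mt_seconds:.4f}s")
print(f"random.SystemRandom: {sys_seconds:.4f}s")
```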

Documentation changes

The random module documentation would be updated to move the documentation of the seed, getstate and setstate interfaces later in the module, along with the documentation of the new ensure_repeatable function and the associated security warning.

That section of the module documentation would also gain a discussion of the respective use cases for the deterministic PRNG enabled by ensure_repeatable (games, modelling & simulation, software testing) and the system RNG that is used by default (cryptography, security token generation). This discussion will also recommend the use of third party security libraries for the latter task.

Rationale

Writing secure software under deadline and budget pressures is a hard problem. This is reflected in regular notifications of data breaches involving personally identifiable information [1], as well as with failures to take security considerations into account when new systems, like motor vehicles [2], are connected to the internet. It’s also the case that a lot of the programming advice readily available on the internet [4] simply doesn’t take the mathematical arcana of computer security into account. Compounding these issues is the fact that defenders have to cover all of their potential vulnerabilities, as a single mistake can make it possible to subvert other defences [11].

One of the factors that contributes to making this last aspect particularly difficult is APIs where using them inappropriately creates a silent security failure - one where the only way to find out that what you’re doing is incorrect is for someone reviewing your code to say “that’s a potential security problem”, or for a system you’re responsible for to be compromised through such an oversight (and you’re not only still responsible for that system when it is compromised, but your intrusion detection and auditing mechanisms are good enough for you to be able to figure out after the event how the compromise took place).

This kind of situation is a significant contributor to “security fatigue”, where developers (often rightly [9]) feel that security engineers spend all their time saying “don’t do that the easy way, it creates a security vulnerability”.

As the designers of one of the world’s most popular languages [8], we can help reduce that problem by making the easy way the right way (or at least the “not wrong” way) in more circumstances, so developers and security engineers can spend more time worrying about mitigating actually interesting threats, and less time fighting with default language behaviours.

Discussion

Why “ensure_repeatable” over “ensure_deterministic”?

This is a case where the meaning of a word as specialist jargon conflicts with the typical meaning of the word, even though it’s technically the same.

From a technical perspective, a “deterministic RNG” means that given knowledge of the algorithm and the current state, you can reliably compute arbitrary future states.

The problem is that “deterministic” on its own doesn’t convey those qualifiers, so it’s likely to instead be interpreted as “predictable” or “not random” by folks that are familiar with the conventional meaning, but aren’t familiar with the additional qualifiers on the technical meaning.

A second problem with “deterministic” as a description for the traditional RNG is that it doesn’t really tell you what you can do with the traditional RNG that you can’t do with the system one.

“ensure_repeatable” aims to address both of those problems, as its common meaning accurately describes the main reason for preferring the deterministic PRNG over the system RNG: ensuring you can repeat the same series of outputs by providing the same seed value, or by restoring a previously saved PRNG state.
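Both forms of repeatability mentioned above can be seen with explicit random.Random instances in current Python versions. This is a small sketch; the seed value 2015 and the variable names are arbitrary.

```python
import random

# Same seed value, same series of outputs.
run_a = random.Random(2015)
run_b = random.Random(2015)
series_a = [run_a.randint(1, 100) for _ in range(5)]
series_b = [run_b.randint(1, 100) for _ in range(5)]

# Restoring a saved state replays the outputs that followed the save.
rng = random.Random()
saved = rng.getstate()
before = [rng.random() for _ in range(3)]
rng.setstate(saved)
replayed = [rng.random() for _ in range(3)]
```

Here series_a equals series_b and before equals replayed, which is exactly the property a simulation or test suite depends on and the system RNG cannot offer.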

Only changing the default for Python 3.6+

Some other recent security changes, such as upgrading the capabilities of the ssl module and switching to properly verifying HTTPS certificates by default, have been considered critical enough to justify backporting the change to all currently supported versions of Python.

The difference in this case is one of degree - the additional benefits from rolling out this particular change a couple of years earlier than will otherwise be the case aren’t sufficient to justify either the additional effort or the stability risks involved in making such an intrusive change in a maintenance release.

Keeping the module level functions

In addition to general backwards compatibility considerations, Python is widely used for educational purposes, and we specifically don’t want to invalidate the wide array of educational material that assumes the availability of the current random module API. Accordingly, this proposal ensures that most of the public API can continue to be used not only without modification, but without generating any new warnings.

Warning when implicitly opting in to the deterministic RNG

It’s necessary to implicitly opt in to the deterministic PRNG as Python is widely used for modelling and simulation purposes where this is the right thing to do, and in many cases, these software models won’t have a dedicated maintenance team tasked with ensuring they keep working on the latest versions of Python.

Unfortunately, explicitly calling random.seed with data from os.urandom is also a mistake that appears in a number of the flawed “how to generate a security token in Python” guides readily available online.

Using first DeprecationWarning, and then eventually a RuntimeWarning, to advise against implicitly switching to the deterministic PRNG aims to nudge future users that need a cryptographically secure RNG away from calling random.seed() and those that genuinely need a deterministic generator towards explicitly calling random.ensure_repeatable().
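The mistake described above, and one way to avoid it with today's standard library, can be sketched as follows. The function names flawed_token and system_token are hypothetical, and the flawed version seeds a private random.Random instance instead of calling random.seed so that the example does not disturb global state.

```python
import os
import random
import string

ALPHABET = string.ascii_letters + string.digits

def flawed_token(length=32):
    """Anti-pattern: a strong seed still feeds a deterministic generator."""
    rng = random.Random(os.urandom(16))
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def system_token(length=32):
    """Draw every character from the operating system's RNG instead."""
    rng = random.SystemRandom()
    return "".join(rng.choice(ALPHABET) for _ in range(length))
```

Both functions return strings that look equally random, which is what makes the first one a silent failure: only the second keeps each character independent of any recoverable generator state.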

Avoiding the introduction of a userspace CSPRNG

The original discussion of this proposal on python-ideas [5] suggested introducing a cryptographically secure pseudo-random number generator and using that by default, rather than defaulting to the relatively slow system random number generator.

The problem [7] with this approach is that it introduces an additional point of failure in security sensitive situations, for the sake of applications where the random number generation may not even be on a critical performance path.

Applications that do need cryptographic quality randomness should be using the system random number generator regardless of speed considerations.

Isn’t the deterministic PRNG “secure enough”?

In a word, “No” - that’s why there’s a warning in the module documentation that says not to use it for security sensitive purposes. While we’re not currently aware of any studies of Python’s random number generator specifically, studies of PHP’s random number generator [3] have demonstrated the ability to use weaknesses in that subsystem to facilitate a practical attack on password recovery tokens in popular PHP web applications.

However, one of the rules of secure software development is that “attacks only get better, never worse”, so it may be that by the time Python 3.6 is released we will actually see a practical attack on Python’s deterministic PRNG publicly documented.

Security fatigue in the Python ecosystem

Over the past few years, the computing industry as a whole has been making a concerted effort to upgrade the shared network infrastructure we all depend on to a “secure by default” stance. As one of the most widely used programming languages for network service development (including the OpenStack Infrastructure-as-a-Service platform) and for systems administration on Linux systems in general, a fair share of that burden has fallen on the Python ecosystem, which is understandably frustrating for Pythonistas using Python in other contexts where these issues aren’t of as great a concern.

This consideration is one of the primary factors driving the substantial backwards compatibility improvements in this proposal relative to the initial draft concept posted to python-ideas [6].

Acknowledgements

  • Theo de Raadt, for making the suggestion to Guido van Rossum that we seriously consider defaulting to a cryptographically secure random number generator
  • Serhiy Storchaka, Terry Reedy, Petr Viktorin, and anyone else in the python-ideas threads that suggested the approach of transparently switching to the random.Random implementation when any of the functions that only make sense for a deterministic RNG are called
  • Nathaniel Smith for providing the reference on practical attacks against PHP’s random number generator when used to generate password reset tokens
  • Donald Stufft for pursuing additional discussions with network security experts that suggested the introduction of a userspace CSPRNG would mean additional complexity for insufficient gain relative to just using the system RNG directly
  • Paul Moore for eloquently making the case for the current level of security fatigue in the Python ecosystem

References

[1] Visualization of data breaches involving more than 30k records (each) (http://www.informationisbeautiful.net/visualizations/worlds-biggest-data-breaches-hacks/)
[2] Remote UConnect hack for Jeep Cherokee (http://www.wired.com/2015/07/hackers-remotely-kill-jeep-highway/)
[3] PRNG based attack against password reset tokens in PHP applications (https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf)
[4] Search link for “python password generator” (https://www.google.com.au/search?q=python+password+generator)
[5] python-ideas thread discussing using a userspace CSPRNG (https://mail.python.org/pipermail/python-ideas/2015-September/035886.html)
[6] Initial draft concept that eventually became this PEP (https://mail.python.org/pipermail/python-ideas/2015-September/036095.html)
[7] Safely generating random numbers (http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/)
[8] IEEE Spectrum 2015 Top Ten Programming Languages (http://spectrum.ieee.org/computing/software/the-2015-top-ten-programming-languages)
[9] OWASP Top Ten Web Security Issues for 2013 (https://www.owasp.org/index.php/OWASP_Top_Ten_Project#tab=OWASP_Top_10_for_2013)
[10] Stack Overflow answer for missing parentheses in call to print (http://stackoverflow.com/questions/25445439/what-does-syntaxerror-missing-parentheses-in-call-to-print-mean-in-python/25445440#25445440)
[11] Bypassing bcrypt through an insecure data cache (http://arstechnica.com/security/2015/09/once-seen-as-bulletproof-11-million-ashley-madison-passwords-already-cracked/)

Copyright

This document has been placed in the public domain.


Source: https://github.com/python/peps/blob/main/peps/pep-0504.rst

Last modified: 2025-12-27 18:19:14 GMT

