[Security-sig] PEP 522: Allow BlockingIOError in security sensitive APIs on Linux
Nick Coghlan ncoghlan at gmail.com
Fri Jun 24 20:07:37 EDT 2016
On 24 June 2016 at 16:21, Victor Stinner <victor.stinner at gmail.com> wrote:
> 2016-06-24 22:05 GMT+02:00 Nick Coghlan <ncoghlan at gmail.com>:
>> As such, the idioms I currently have in PEP 522 are wrong - the "wait
>> for the system RNG or not" decision wouldn't be one to be made on a
>> per-call basis, but rather on a per-__main__ execution basis, with
>> developers choosing which user experience they want to support on
>> systems with a non-blocking /dev/urandom:
>>
>> * this application will fail if you run it before the system RNG is
>> ready (so you may need to add "ExecStartPre=python3 -c 'import
>> secrets; secrets.wait_for_system_rng()'" in your systemd unit file)
>
> In short, if an application is not run using systemd but directly on
> the command line, it *can* fail with a fatal BlockingIOError?

From the command line, the answer is equally simple: just run "python3
-c 'import secrets; secrets.wait_for_system_rng()'" before the command
you actually care about.

As an added bonus, that will work even if the command you care about
isn't written in Python 3, and even if it reads from /dev/urandom
rather than using the new syscall.

> Wait, I don't think that it is an acceptable behaviour from the user
> point of view.
>
> Compared to Python 2.7, Python 3.4 and Python 3.5.2 where os.urandom()
> never blocks nor raises an exception on Linux, such behaviour change
> can be seen as a major regression.

The *only* way to get it to block (your PEP) or raise an exception
(PEP 522) is to call os.urandom() (directly or indirectly) when the
kernel RNG isn't ready - I consider the relevant analogy to be to PEP
476, where we turned the silent security failure of accepting an
invalid or untrusted certificate (or one that didn't cover the named
host) into the noisy error of failing to make the connection.

>> * this application implicitly calls "secrets.wait_for_system_rng()"
>> and hence may block waiting for the system RNG if you run it before
>> the system RNG is ready
>
> It's hard to guess if os.urandom() is used in a third-party library.
> Maybe it's not. What if a new library version starts to use
> os.urandom()? Should you start to call secrets.wait_for_system_rng()?
>
> To be safe, I expect that *all* applications should start with
> secrets.wait_for_system_rng()... It doesn't make sense to have to put
> such code in *all* applications.

Application developers porting to Python 3.6 can wait and see what
their own testing reports and what their users report - they don't
need to guess.

> The main advantage of the PEP 522 is to control how the "system
> urandom not initialized yet" case is handled. But you are more and
> more saying that secrets.wait_for_system_rng() should be used to not
> get BlockingIOError in most cases. Am I wrong?

I'm saying I think it's an application level decision, not a library
level decision.

> I expect that some libraries will start to use
> secrets.wait_for_system_rng() in their own code.
>
> ... At the end, it looks you basically reimplemented a blocking
> os.urandom(), no?

Potentially, but one of the important aspects of PEP 522 is that we're
not imposing that outcome by fiat - we're letting developers choose
the behaviour they want on a case by case basis, and seeing what the
emergent consensus on correct behaviour turns out to be.

It's equally possible that the outcome will be that both Python and
Linux developers conclude that this is an operating system integration
issue, so systemd ends up adding a standard "kernelrng" target that
components can wait for, and that then gets included as a requirement
for getting to the singleuser state on most distros.

If we *do* reach a point where "always call
secrets.wait_for_system_rng() before using secrets,
random.SystemRandom or os.urandom" is the idiomatic advice for
Pythonistas, *then* we can make os.urandom() blocking, and
secrets.wait_for_system_rng() would be reduced to:

    def wait_for_system_rng():
        os.urandom(1)

> --
>
> Why do we have to bother *all* users with
> secrets.wait_for_system_rng(), while only a very few will really care
> of the exceptional case?

We don't - only the ones that actually get the exception, since
they're necessarily the ones the problem is relevant to. Runtime
system configuration related exceptions aren't something to be avoided
at all costs - if they were, we'd never have made the changes we did
to the way Unicode handling works.

A good example of this at the library level is Armin Ronacher's click
command line helper - when you run that in the C locale under Python
3, it just fails immediately, since the actual problem is that
something has gone wrong and your system locale isn't configured
properly. The right answer is almost always to fix the locale
configuration settings, not to change anything in the Python code.

> Why not adding something for users who want to handle the exceptional
> case, but make os.urandom() blocking?

The main problem I have with the blocking solution is that if someone
hits it unexpectedly, they're left staring at a blinking cursor (at
best), and no helpful hints to get started on debugging the problem.
If it's a component they didn't write, they also can't really give a
good bug report beyond "It hangs when I try to run it".

By contrast, PEP 522 gives them an immediate exception and error
message: "BlockingIOError: system random number generator is not
ready".

If they're a developer themselves, they can plug that into Google and
hopefully find a relevant answer (which we can virtually guarantee by
preseeding Stack Overflow with a suitable response).

If they're *not* the application developer, they can paste the
traceback into a bug report or support ticket and say "Hey, what's
going on here?". At which point, the developer or support tech
handling the ticket can do the appropriate Google search and respond
accordingly.

Now, we could gain most of those debuggability benefits for a blocking
solution by trying in non-blocking mode first, then falling back to
blocking only if we get EAGAIN - that would let us print a
Google-friendly warning message before we implicitly block.

That's where the argument of adopting a consistent approach of "try
non-blocking first, then maybe fall back to something else if it
doesn't work" comes into play - if os.urandom() (and hence indirectly
the secrets module) is trying in non-blocking mode and falling back to
an alternative, *and* SipHash initialisation is doing that, *and*
importing the random module is doing that, it sends a strong message
to me that the base primitive here is actually "try to read the system
RNG, and maybe fail to do so", rather than "read the system RNG and
only return when the requested data is available"

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia