PEP 499 – `python -m foo` should also bind `'foo'` in `sys.modules`

Author: Cameron Simpson <cs at cskk.id.au>, Chris Angelico <rosuav at gmail.com>, Joseph Jevnik <joejev at gmail.com>
BDFL-Delegate: Alyssa Coghlan
Status: Deferred

PEP Deferral

The implementation of this PEP isn’t currently expected to be ready for the Python 3.9 feature freeze in April 2020, so it has been deferred 12 months to Python 3.10.

Abstract

When a module is used as a main program on the Python command line, such as by:

python -m module.name …

it is easy to accidentally end up with two independent instances of the module if that module is again imported within the program. This PEP proposes a way to fix this problem.

When a module is invoked via Python’s -m option the module is bound to sys.modules['__main__'] and its .__name__ attribute is set to '__main__'. This enables the standard “main program” boilerplate code at the bottom of many modules, such as:

if __name__ == '__main__':
    sys.exit(main(sys.argv))

However, when the above command line invocation is used it is a natural inference to presume that the module is actually imported under its official name module.name, and therefore that if the program again imports that name then it will obtain the same module instance.

In actuality, the module was imported only as '__main__'. Another import will obtain a distinct module instance, which can lead to confusing bugs, all stemming from having two instances of module global objects: one in each module.

Examples include:

module level data structures: Some modules provide features such as caches or registries as module level global variables, typically private. A second instance of a module creates a second data structure. If that structure is a cache, such as in the re module, then two caches exist, leading to wasteful memory use. If that structure is a shared registry, such as a mapping of values to handlers, then it is possible to register a handler to one registry and to try to use it via the other registry, where it is unknown.
sentinels: The standard test for a sentinel value provided by a module is the identity comparison using is, as this avoids unreliable “looks like” comparisons such as equality, which can both mismatch two values as “equal” (for example being zeroish) or raise a TypeError when the objects are incompatible. When there are two instances of a module there are two sentinel instances and only one will be recognised via is.
classes: With two modules there are duplicate class definitions of any classes provided. All operations which depend on recognising these classes and subclasses of these are prone to failure, depending on where the reference class (from one of the modules) is obtained and where the comparison class or instance is obtained. This impacts isinstance, issubclass and also try/except constructs.
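The three failure modes above can be reproduced without the command line. The following is a minimal sketch that simulates the situation by executing one source file twice, once under the name '__main__' and once under a hypothetical name tool. The file contents, the load_as helper and all names in it are assumptions made for illustration; it is not what runpy does internally.

```python
import importlib.util
import os
import tempfile

# Hypothetical module source, standing in for module.name.
SOURCE = """
MISSING = object()      # sentinel
HANDLERS = {}           # module-level registry

class ToolError(Exception):
    pass
"""


def load_as(name, path):
    """Execute the file at path as a fresh module object called name."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tool.py")
    with open(path, "w") as f:
        f.write(SOURCE)
    as_main = load_as("__main__", path)   # stands in for what -m executes
    as_named = load_as("tool", path)      # stands in for a later import

# Sentinels: two distinct objects, so the identity test fails.
sentinel_shared = as_main.MISSING is as_named.MISSING

# Registries: a handler registered in one copy is unknown to the other.
as_main.HANDLERS["save"] = print
handler_visible = "save" in as_named.HANDLERS

# Classes: an exception raised from one copy is not caught via the other.
try:
    try:
        raise as_main.ToolError("boom")
    except as_named.ToolError:
        caught = True
except as_main.ToolError:
    caught = False

print(sentinel_shared, handler_visible, caught)  # prints: False False False
```

Neither module object is placed in sys.modules here, so the sketch leaves the running interpreter untouched.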

Proposal

It is suggested that to fix this situation all that is needed is a simple change to the way the -m option is implemented: in addition to binding the module object to sys.modules['__main__'], it is also bound to sys.modules['module.name'].

Alyssa (Nick) Coghlan has suggested that this is as simple as modifying the runpy module’s _run_module_as_main function as follows:

main_globals = sys.modules["__main__"].__dict__

to instead be:

main_module = sys.modules["__main__"]
sys.modules[mod_spec.name] = main_module
main_globals = main_module.__dict__
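The effect of that extra binding can be sketched outside runpy. The example below assumes a stand-in module object rather than the real __main__, and a hypothetical canonical name pep499_demo_tool; the helper bind_main_under_canonical_name is invented for illustration and only mirrors the three lines above.

```python
import importlib
import sys
import types


def bind_main_under_canonical_name(main_module, canonical_name):
    """Alias main_module in sys.modules and return its globals."""
    sys.modules[canonical_name] = main_module
    return main_module.__dict__


# Stand-in for the real __main__ module, so the demo does not disturb
# the running interpreter. "pep499_demo_tool" is a hypothetical name.
stand_in_main = types.ModuleType("__main__")
stand_in_main.MISSING = object()

main_globals = bind_main_under_canonical_name(stand_in_main, "pep499_demo_tool")
try:
    # A later import by canonical name now finds the existing module.
    reimported = importlib.import_module("pep499_demo_tool")
    same_instance = reimported is stand_in_main
    still_main = reimported.__name__ == "__main__"
finally:
    del sys.modules["pep499_demo_tool"]  # clean up the demo alias

print(same_instance, still_main)  # prints: True True
```

Because the import system consults sys.modules first, the aliased entry is returned as-is and the module source is not executed a second time.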

Joseph Jevnik has pointed out that modules which are packages already do something very similar to this proposal: the __init__.py file is bound to the module’s canonical name and the __main__.py file is bound to “__main__”. As such, the double import issue does not occur. Therefore, this PEP proposes to affect only simple non-package modules.

Considerations and Prerequisites

Pickling Modules

Alyssa has mentioned issue 19702 which proposes (quoted from the issue):

runpy will ensure that when __main__ is executed via the import system, it will also be aliased in sys.modules as __spec__.name
if __main__.__spec__ is set, pickle will use __spec__.name rather than __name__ to pickle classes, functions and methods defined in __main__
multiprocessing is updated appropriately to skip creating __mp_main__ in child processes when __main__.__spec__ is set in the parent process

The first point above covers this PEP’s specific proposal.

A Normal Module’s `__name__` Is No Longer Canonical

Chris Angelico points out that it becomes possible to import a module whose __name__ is not what you gave to “import”, since “__main__” is now present at “module.name”, so a subsequent import module.name finds it already present. Therefore, __name__ is no longer the canonical name for some normal imports.

Some counter arguments follow:

As of PEP 451 a module’s canonical name is stored at __spec__.name.
Very little code should actually care about __name__ being the canonical name and any that does should arguably be updated to consult __spec__.name with fallback to __name__ for older Pythons, should that be relevant. This is true even if this PEP is not approved.
Should this PEP be approved, it becomes possible to introspect a module by its canonical name and ask “was this the main program?” by inferring from __name__. This was not previously possible.
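The lookup suggested in the counter arguments above could be written roughly as follows. This is a sketch only: canonical_name and was_main_program are hypothetical helper names, and the module pkg.tool is built by hand to imitate what a module run with -m looks like.

```python
import importlib.machinery
import json
import types


def canonical_name(module):
    """Prefer __spec__.name, falling back to __name__ when no spec exists."""
    spec = getattr(module, "__spec__", None)
    name = getattr(spec, "name", None)
    return name if name else module.__name__


def was_main_program(module):
    """Infer "was this the main program?" from __name__."""
    return module.__name__ == "__main__"


# Hand-built imitation of a module run via -m: __name__ is "__main__"
# while the spec records the hypothetical canonical name "pkg.tool".
run_as_main = types.ModuleType("__main__")
run_as_main.__spec__ = importlib.machinery.ModuleSpec("pkg.tool", None)

# A module without a spec, as older code might produce.
no_spec = types.ModuleType("legacy")

print(canonical_name(json))           # prints: json
print(canonical_name(run_as_main))    # prints: pkg.tool
print(canonical_name(no_spec))        # prints: legacy
print(was_main_program(run_as_main))  # prints: True
```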

The glaring counter example is the standard “am I the main program?” boilerplate, where __name__ is expected to be “__main__”. This PEP explicitly preserves that semantic.

Reference Implementation

BPO 36375 is the issue tracker entry for the PEP’s reference implementation, with the current draft PR being available on GitHub.

Open Questions

This proposal does raise some backwards compatibility concerns, and these will need to be well understood, and either a deprecation process designed, or clear porting guidelines provided.

Pickle compatibility

If no changes are made to the pickle module, then pickles that were previously being written with the correct module name (due to a dual import) may start being written with __main__ as their module name instead, and hence fail to be loaded correctly by other projects.

Scenarios to be checked:

python script.py writing, python -m script reading
python -m script writing, python script.py reading
python -m script writing, python some_other_app.py reading
old_python -m script writing, new_python -m script reading
new_python -m script writing, old_python -m script reading
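The underlying mechanism is that pickle records the defining module’s name for each class and the reader must be able to import that name. The sketch below illustrates this with a hand-built module whose name, pep499_writer_main, is a hypothetical stand-in for the writer’s __main__; it does not exercise the real -m code path, and make_module is a helper invented for the example.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the writing program's __main__ module.
WRITER_NAME = "pep499_writer_main"


def make_module(name):
    """Build a module object that defines a small picklable class."""
    module = types.ModuleType(name)
    source = "class Point:\n    def __init__(self, x):\n        self.x = x\n"
    exec(source, module.__dict__)
    return module


writer = make_module(WRITER_NAME)
sys.modules[WRITER_NAME] = writer       # the writer can see its own module
try:
    data = pickle.dumps(writer.Point(3))
    round_trip = pickle.loads(data)     # works while the name is importable
finally:
    del sys.modules[WRITER_NAME]

# The pickle stores the module name recorded in Point.__module__.
name_recorded = WRITER_NAME.encode() in data

# A reader that cannot import that name fails to load the pickle.
try:
    pickle.loads(data)
    loadable_elsewhere = True
except ImportError:
    loadable_elsewhere = False

print(round_trip.x, name_recorded, loadable_elsewhere)  # prints: 3 True False
```

Each scenario listed above comes down to which module name the writer records and whether the reader can resolve it to the same class.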

Projects that special-case `__main__`

In order to get the regression test suite to pass, the current reference implementation had to patch pdb to avoid destroying its own global namespace.

This suggests there may be a broader compatibility issue where some scripts are relying on direct execution and import giving different namespaces (just as package execution keeps the two separate by executing the __main__ submodule in the __main__ namespace, while the package name references the __init__ file as usual).

Background

I tripped over this issue while debugging a main program via a module which tried to monkey patch a named module, that being the main program module. Naturally, the monkey patching was ineffective as it imported the main module by name and thus patched the second module instance, not the running module instance.

However, the problem has been around as long as the -m command line option and is encountered regularly, if infrequently, by others.

In addition to issue 19702, the discrepancy around __main__ is alluded to in PEP 451 and a similar proposal (predating PEP 451) is described in PEP 395 under Fixing dual imports of the main module.

Copyright

This document has been placed in the public domain.

Source: https://github.com/python/peps/blob/main/peps/pep-0499.rst

Last modified: 2025-02-01 08:55:40 GMT
