Built-in Package Support in Python 1.5
Built-in Package Support in Python 1.5
Starting with Python version 1.5a4, package support is built intothe Python interpreter. This implements a slightly simplified andmodified version of the package import semantics pioneered by the "ni"module.
"Package import" is a method to structure Python's module namespaceby using "dotted module names". For example, the module name A.Bdesignates a submodule named B in a package named A. Just like theuse of modules saves the authors of different modules from having toworry about each other's global variable names, the use of dottedmodule names saves the authors of multi-module packages like NumPy orPIL from having to worry about each other's module names.
Starting with Python version 1.3, package import was supported by astandard Python library module, "ni". (The name is supposed to be anacronym for New Import, but really referrs to theKnights Who SayNi in the movieMonty Python and the Holy Grail, who, afterKing Arthur's knights return with a shrubbery, have changed theirnames to theKnights Who Say Neeeow ... Wum ... Ping - butthat's another story.)
The ni module was all user code except for a few modifications tothe Python parser (also introduced in 1.3) to accept import statementsof the for "import A.B.C" and "from A.B.C import X". When ni was notenabled, using this syntax resulted in a run-time error "No suchmodule". Once ni was enabled (by executing "import ni" beforeimporting other modules), ni's import hook would look for thesubmodule of the correct package.
The new package support is designed to resemble ni, but has beenstreamlined, and a few features have been changed or removed.
An Example
Suppose you want to design a package for the uniform handling ofsound files and sound data. There are many different sound fileformats (usually recognized by their extension, e.g. .wav, .aiff,.au), so you may need to create and maintain a growing collection ofmodules for the conversion between the various file formats. Thereare also many different operations you might want to perform on sounddata (e.g. mixing, adding echo, applying an equalizer function,creating an artificial stereo effect), so in addition you will bewriting a never-ending stream of modules to perform these operations.Here's a possible structure for your package (expressed in terms of ahierarchical filesystem):
Sound/Top-level package __init__.pyInitialize the sound package Utils/Subpackage for internal use __init__.py iobuffer.py errors.py ... Formats/Subpackage for file format conversions __init__.py wavread.py wavwrite.py aiffread.py aiffwrite.py auread.py auwrite.py ... Effects/Subpackage for sound effects __init__.py echo.py surround.py reverse.py ... Filters/Subpackage for filters __init__.py equalizer.py vocoder.py karaoke.py dolby.py ...
Users of the package can import individual modules from thepackage, for example:
import Sound.Effects.echo- This loads the submodule Sound.Effects.echo. It must be referencedwith its full name, e.g.
Sound.Effects.echo.echofilter(input, output, delay=0.7, atten=4) from Sound.Effects import echo- This also loads the submodule echo, and makes it available withoutits package prefix, so it can be used as follows:
echo.echofilter(input, output, delay=0.7, atten=4) from Sound.Effects.echo import echofilter- Again, this loads the submodule echo, but this makes its functionechofilter directly available:
echofilter(input, output, delay=0.7, atten=4)
Note that when usingfrompackageimportitem, the item can be either a submodule(or subpackage) of the package, or some other name defined in a thepackage, like a function, class or variable. The import statementfirst tests whether the item is defined in the package; if not, itassumes it is a module and attempts to load it. If it fails to findit, ImportError is raised.
Contrarily, when using syntax likeimportitem.subitem.subsubitem, each item except for the last must bea package; the last item can be a module or a package but can't bea class or function or variable defined in the previous item.
Importing * From a Package; the__all__ Attribute
Now what happens when the user writesfrom Sound.Effectsimport *? Ideally, one would hope that this somehow goes outto the filesystem, finds which submodules are present in the package,and imports them all. Unfortunately, this operation does not workvery well on Mac and Windows platforms, where the filesystem does notalways have accurate information about the case of a filename! Onthese platforms, there is no guaranteed way to know whether a fileECHO.PY should be imported as a module echo, Echo or ECHO. (Forexample, Windows 95 has the annoying practice of showing all filenames with a capitalized first letter.) The DOS 8+3 filenamerestriction adds another interesting problem for long module names.
The only solution is for the package author to provide an explicitindex of the package. The import statement uses the followingconvention: if a package's __init__.py code defines a list named__all__, it is taken to be the list of module names that should be importedwhenfrompackageimport * isencountered. It is up to the package author to keep this listup-to-date when a new version of the package is released. Packageauthors may also decide not to support it, if they don't see a use forimporting * from their package. For example, the fileSounds/Effects/__init__.py could contain the following code:
__all__ = ["echo", "surround", "reverse"]This would mean that
from Sound.Effects import * wouldimport the three named submodules of the Sound package.If __all__ is not defined, the statementfrom Sound.Effectsimport * doesnot import all submodules from the packageSound.Effects into the current namespace; it only ensures that thepackage Sound.Effects has been imported (possibly running itsinitialization code, __init__.py) and then imports whatever names aredefined in the package. This includes any names defined (andsubmodules explicitly loaded) by __init__.py. It also includes anysubmodules of the package that were explicitly loaded by previousimport statements, e.g.
In this example, the echo and surround modules are imported in thecurrent namespace because they are defined in the Sound.Effectspackage when the from...import statement is executed. (This alsoworks when __all__ is defined.)import Sound.Effects.echoimport Sound.Effects.surroundfrom Sound.Effects import *
Note that in general the practicing of importing * from a module orpackage is frowned upon, since it often causes poorly readable code.However, it is okay to use it to save typing in interactive sessions,and certain modules are designed to export only names that followcertain patterns.
Remember, there is nothing wrong with usingfrom Packageimport specific_submodule! In fact this becomes therecommended notation unless the importing module needs to usesubmodules with the same name from different packages.
Intra-package References
The submodules often need to refer to each other. For example, thesurround module might use the echo module. In fact, such referencesare so common that the import statement first looks in the containingpackage before looking in the standard module search path. Thus, thesurround module can simply useimport echo orfromecho import echofilter. If the imported module is not foundin the current package (the package of which the current module is asubmodule), the import statement looks for a top-level module with thegiven name.
When packages are structured into subpackage (as with the Soundpackage in the example), there's no shortcut to refer to submodules ofsibling packages - the full name of the subpackage must be used. Forexample, if the module Sound.Filters.vocoder needs to use the echomodule in the Sound.Effects package, it can usefromSound.Effects import echo.
(One could design a notation to refer to parent packages, similarto the use of ".." to refer to the parent directory in Unix andWindows filesystems. In fact, ni supported this using __ for thepackage containing the current module, __.__ for the parent package,and so on. This feature was dropped because of its awkwardness; sincemost packages will have a relative shallow substructure, this is nobig loss.)
Details
Packages Are Modules, Too!
Warning: the following may be confusing for those who are familiarwith Java's package notation, which is similar to Python's, butdifferent.
Whenever a submodule of a package is loaded, Python makes sure thatthe package itself is loaded first, loading its __init__.py file ifnecessary. The same for packages. Thus, when the statementimport Sound.Effects.echo is executed, it first ensuresthat Sound is loaded; then it ensures that Sound.Effects is loaded;and only then does it ensure that Sound.Effects.echo is loaded(loading it if it hasn't been loaded before).
Once loaded, the difference between a package and a module isminimal. In fact, both are represented by module objects, and bothare stored in the table of loaded modules, sys.modules. The key insys.modules is the full dotted name of a module (which is not alwaysthe same name as used in the import statement). This is also thecontents of the __name__ variable (which gives the full name of themodule or package).
The __path__ Variable
The one distinction between packages and modules lies in thepresence or absence of the variable __path__. This is only presentfor packages. It is initialized to a list of one item, containing thedirectory name of the package (a subdirectory of a directory onsys.path). Changing __path__ changes the list of directories that aresearched for submodules of the package. For example, theSound.Effects package might contain platform specific submodules. Itcould use the following directory structure:
Sound/ __init__.py Effects/# Generic versions of effects modules __init__.py echo.py surround.py reverse.py ... plat-ix86/# Intel x86 specific effects modules echo.pysurround.py plat-PPC/# PPC specific effects modules echo.py
The Effects/__init__.py file could manipulate its __path__ variableso that the appropriate platform specific subdirectory comesbefore the main Effects directory, so that the platformspecific implementations of certain effects (if available) overridethe generic (probably slower) implementations. For example:
platform = ...# Figure out which platform appliesdirname = __path__[0]# Package's main folder__path__.insert(0, os.path.join(dirname, "plat-" + platform))
If it is not desirable that platform specific submodules hidegeneric modules with the same name, __path__.append(...) should beused instead of __path__.insert(0, ...).
Note that the plat-* subdirectories arenot subpackages ofEffects - the file Sound/Effects/plat-PPC/echo.py correspondes to themodule Sound.Effects.echo.
Dummy Entries insys.modules
When using packages, you may occasionally find spurious entries insys.modules, e.g. sys.modules['Sound.Effects.string'] could be foundwith the value None. This is an "indirection" entry created becausesome submodule in the Sound.Effects package imported the top-levelstring module. Its purpose is an important optimization: because theimport statement cannot tell whether a local or global module iswanted, and because the rules state that a local module (in the samepackage) hides a global module with the same name, the importstatement must search the package's search pathbefore lookingfor a (possibly already imported) global module. Since searching thepackage's path is a relatively expensive operation, and importing analready imported module is supposed to be cheap (in the order of oneor two dictionary lookups) an optimization is in order. The dummyentry avoids searching the package's path when the same global moduleis imported from the second time by a submodule of the same package.
Dummy entries are only created for modules that are found at thetop level; if the module is not found at all, the import fails and theoptimization is generally not needed. Moreover, in interactive use,the user could create the module as a package-local submodule andretry the import; if a dummy entry had been created this would not befound. If the user changes the package structure by creating a localsubmodule with the same name as a global module that has already beenused in the package, the result is generally known as a "mess", andthe proper solution is to quit the interpreter and start over.
What If I Have a Module and a Package With The Same Name?
You may have a directory (on sys.path) which has both a modulespam.py and a subdirectory spam that contains an __init__.py (withoutthe __init__.py, a directory is not recognized as a package). In thiscase, the subdirectory has precedence, and importing spam will ignorethe spam.py file, loading the package spam instead. If you want themodule spam.py to have precedence, it must be placed in a directorythat comes earlier in sys.path.
(Tip: the search order is determined by the list of suffixesreturned by the function imp.get_suffixes(). Usually the suffixes aresearched in the following order: ".so", "module.so", ".py", ".pyc".Directories don't explicitly occur in this list, but precede allentries in it.)
A Proposal For Installing Packages
In order for a Python program to use a package, the package must befindable by the import statement. In other words, the package must bea subdirectory of a directory that is on sys.path.
Traditionally, the easiest way to ensure that a package was onsys.path was to either install it in the standard library or to haveusers extend sys.path by setting their $PYTHONPATH shell environmentvariable. In practice, both solutions quickly cause chaos.
Dedicated Directories
In Python 1.5, a convention has been established that shouldprevent chaos, by giving the system administrator more control. Firstof all, two extra directories are added to the end of the defaultsearch path (four if the install prefix and exec_prefix differ).These are relative to the install prefix (which defaults to/usr/local):
- $prefix/lib/python1.5/site-packages
- $prefix/lib/site-python
The site-packages directory can be used for packages that arelikely to depend on the Python version (e.g. package containing sharedlibraries or using new features). The site-python directory is usedfor backward compatibility with Python 1.4 and for pure Pythonpackages or modules that are not sensitive to the Python versionused.
Recommended use of these directories is to place each package in asubdirectory of its own in either the site-packages or the site-pythondirectory. The subdirectory should be the package name, which shouldbe acceptable as a Python identifier. Then, any Python program canimport modules in the package by giving their full name. For example,the Sound package used in the example could be installed in thedirectory $prefix/lib/python1.5/site-packages/Sound to enable importsstatements likeimport Sound.Effects.echo).
Adding a Level of Indirection
Some sites wish to install their packages in other places, butstill wish them to to be importable by all Python programs run by alltheir users. This can be accomplished by two different means:
- Symbolic Links
- If the package is structured for dotted-name import, place asymbolic link to its top-level directory in the site-packages orsite-python directory. The name of the symbolic link should be thepackage name; for example, the Sound package could have a symboliclink $prefix/lib/python1.5/site-packages/Sound pointing to/usr/home/soundguru/lib/Sound-1.1/src.
- Path Configuration Files
- If the package really requires adding one or more directories onsys.path (e.g. because it has not yet been structured to supportdotted-name import), a "path configuration file" namedpackage.pth can be placed in either the site-python orsite-packages directory. Each line in this file (except for commentsand blank lines) is considered to contain a directory name which isappended to sys.path. Relative pathnames are allowed and interpretedrelative to the directory containing the .pth file.
The .pth files are read in alphabetic order, with case sensitivitythe same as the local file system. This means that if you find theirresistable urge to play games with the order in which directoriesare searched, at least you can do it in a predictable way. (This isnot the same as an endorsement. A typical installation should have noor very few .pth files or something is wrong, and if you need to playwith the search order, something isvery wrong. Nevertheless,sometimes the need arises, and this is how you can do it of you must.)
Notes for Mac and Windows Platforms
On Mac and Windows, the conventions are slightly different. Theconventional directory for package installation on these platformsis the root (or a subdirectory) of the Python installation directory,which is specific to the installed Python version. This is also the(only) directory searched for path configuration files (*.pth).
Subdirectories of the Standard Library Directory
Since any subdirectory of a directory on sys.path is now implicitlyusable as a package, one could easily be confused about whether theseare intended as such. For example, assume there's a subdirectorycalled tkinter containing a module Tkinter.py. Should one writeimport Tkinter or import tkinter.Tkinter? If the tkintersubdirectory os on the path, both will work, but that's creatingunnecessary confusion.
I have established a simple naming convention that should removethis confusion: non-package directories must have a hyphen in theirname. In particular, all platform-specific subdirectories (sunos5,win, mac, etc.) have been renamed to a name with the prefix "plat-".The subdirectories specific to optional Python components that haven'tbeen converted to packages yet have been renamed to a name with theprefix "lib-". The dos_8x3 sundirectory has been renamed to dos-8x3.The following tables gives all renamed directories:
| Old Name | New Name |
| tkinter | lib-tk |
| stdwin | lib-stdwin |
| sharedmodules | lib-dynload |
| dos_8x3 | dos-8x3 |
| aix3 | plat-aix3 |
| aix4 | plat-aix4 |
| freebsd2 | plat-freebsd2 |
| generic | plat-generic |
| irix5 | plat-irix5 |
| irix6 | plat-irix6 |
| linux1 | plat-linux1 |
| linux2 | plat-linux2 |
| next3 | plat-next3 |
| sunos4 | plat-sunos4 |
| sunos5 | plat-sunos5 |
| win | plat-win |
| test | test |
Note that the test subdirectory is not renamed. It is now apackage. To invoke it, use a statement likeimporttest.autotest.
Other Stuff
XXX I haven't had the time to write up discussions of the followingitems yet:Changes Fromni
The following features of ni have not been duplicated exactly.Ignore this section unless you are currently using the ni module andwish to migrate to the built-in package support.
Dropped__domain__
By default, when a submodule of package A.B.C imports a module X,ni would search for A.B.C.X, A.B.X, A.X and X, in that order. Thiswas defined by the __domain__ variable in the package which could beset to a list of package names to be searched. This feature isdropped in the built-in package support. Instead, the search alwayslooks for A.B.C.X first and then for X. (This a reversal to the "twoscope" approach that is used successfully for namespace resolutionelsewhere in Python.)
Dropped__
Using ni, packages could use explicit "relative" module namesusing the special name "__" (two underscores). For example, modulesin package A.B.C can refer to modules defined in package A.B.K vianames of the form __.__.K.module. This feature has been droppedbecause of its limited use and poor readability.
Incompatible Semantics For__init__
Using ni, the __init__.py file inside a package (if present) wouldbe imported as a standard submodule of the package. The built-inpackage support instead loads the __init__.py file in the package'snamespace. This means that if __init__.py in package A defines a namex, if can be referred to as A.x without further effort. Using ni, the__init__.py would have to contain an assignment of the form__.x= x to get the same effect.
Also, the new package supportrequires that an__init__ module is present; under ni, it was optional.This is a change introduced in Python 1.5b1; it is designed to avoiddirectories with common names, like "string", to unintentionally hidevalid modules that occur later on the module search path.
Packages that wish to be backwards compatible with ni can testwhether the special variable __ exists, e.g.:
# Define a function to be visible at the package leveldef f(...): ...try: __except NameError: # new built-in package support passelse: # backwards compatibility for ni __.f = f
