This PEP adds the ability to import Python modules *.py, *.py[co] and packages from zip archives. The same code is used to speed up normal directory imports provided os.listdir is available.
Zip imports were added to Python 2.3, but the final implementation uses an approach different from the one described in this PEP. The 2.3 implementation is SourceForge patch #652586 [1], which adds new import hooks described in PEP 302.
The rest of this PEP is therefore only of historical interest.
Currently, sys.path is a list of directory names as strings. If this PEP is implemented, an item of sys.path can be a string naming a zip file archive. The zip archive can contain a subdirectory structure to support package imports. The zip archive satisfies imports exactly as a subdirectory would.
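The behaviour described above can be tried out with a short sketch. It relies on the zip import support that shipped in Python 2.3 and later (built on the PEP 302 hooks), not on the C implementation this PEP proposes. The archive and module names (demo_archive.zip, zipdemo_mod, zipdemo_pkg) are hypothetical and exist only for illustration.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive holding one module and one package.
# All file and module names below are hypothetical.
tmp_dir = tempfile.mkdtemp()
archive_path = os.path.join(tmp_dir, "demo_archive.zip")
with zipfile.ZipFile(archive_path, "w") as zf:
    zf.writestr("zipdemo_mod.py", "VALUE = 42\n")
    zf.writestr("zipdemo_pkg/__init__.py", "")
    zf.writestr(
        "zipdemo_pkg/inner.py",
        "def greet():\n    return 'hello from the archive'\n",
    )

# The archive is named on sys.path exactly as a directory would be.
sys.path.insert(0, archive_path)
importlib.invalidate_caches()

import zipdemo_mod
from zipdemo_pkg import inner

print(zipdemo_mod.VALUE)
print(inner.greet())
```

The archive is written uncompressed (the zipfile default), so the sketch does not depend on zlib.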
The implementation is in C code in the Python core and works on all supported Python platforms.
Any files may be present in the zip archive, but only files *.py and *.py[co] are available for import. Zip import of dynamic modules (*.pyd, *.so) is disallowed.
Just as sys.path currently has default directory names, a default zip archive name is added too. Otherwise there is no way to import all Python library files from an archive.
The zip archive must be treated exactly as a subdirectory tree so we can support package imports based on current and future rules. All zip data is taken from the Central Directory, the data must be correct, and brain dead zip files are not accommodated.
Suppose sys.path contains “/A/B/SubDir” and “/C/D/E/Archive.zip”, and we are trying to import modfoo from the Q package. Then import.c will generate a list of paths and extensions and will look for the file. The list of generated paths does not change for zip imports. Suppose import.c generates the path “/A/B/SubDir/Q/R/modfoo.pyc”. Then it will also generate the path “/C/D/E/Archive.zip/Q/R/modfoo.pyc”. Finding the SubDir path is exactly equivalent to finding “Q/R/modfoo.pyc” in the archive.
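The path generation in this example can be sketched in Python. This is an illustration only, not the logic of import.c: the function names candidate_paths and split_archive_path, the suffix order, and the ".zip" test used to recognise an archive are all assumptions made for the sketch.

```python
import posixpath


def candidate_paths(sys_path, dotted_name, suffixes=(".pyc", ".pyo", ".py")):
    """Return every path the search would try, in order (hypothetical helper)."""
    relative = dotted_name.replace(".", "/")
    return [
        posixpath.join(entry, relative + suffix)
        for entry in sys_path
        for suffix in suffixes
    ]


def split_archive_path(path, archive_suffix=".zip"):
    """Split a generated path into (archive, member) if it runs through a zip file.

    Returns None for an ordinary directory path. Recognising archives by
    their suffix is an assumption of this sketch.
    """
    index = path.find(archive_suffix + "/")
    if index == -1:
        return None
    cut = index + len(archive_suffix)
    return path[:cut], path[cut + 1:]


paths = candidate_paths(["/A/B/SubDir", "/C/D/E/Archive.zip"], "Q.R.modfoo")
for path in paths:
    print(path, split_archive_path(path))
```

The same list of candidates is produced for both kinds of sys.path entry; only the final lookup differs, which is the point the paragraph above makes.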
Suppose you zip up /A/B/SubDir/* and all its subdirectories. Then your zip file will satisfy imports just as your subdirectory did.
Well, not quite. You can’t satisfy dynamic modules from a zip file. Dynamic modules have extensions like .dll, .pyd, and .so. They are operating system dependent, and probably can’t be loaded except from a file. It might be possible to extract the dynamic module from the zip file, write it to a plain file and load it. But that would mean creating temporary files, and dealing with all the dynload_*.c, and that’s probably not a good idea.
When trying to import *.pyc, if it is not available then *.pyo will be used instead. And vice versa when looking for *.pyo. If neither *.pyc nor *.pyo is available, or if the magic numbers are invalid, then *.py will be compiled and used to satisfy the import, but the compiled file will not be saved. Python would normally write it to the same directory as *.py, but surely we don’t want to write to the zip file. We could write to the directory of the zip archive, but that would clutter it up, not good if it is /usr/bin for example.
Failing to write the compiled files will make zip imports very slow, and the user will probably not figure out what is wrong. So it is best to put *.pyc and *.pyo in the archive with the *.py.
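The fallback order described above can be written down as a small decision function. This is a sketch under stated assumptions: choose_source, its parameters, and the valid_magic callback are hypothetical names, and the archive contents are modelled as a plain set of member names.

```python
def choose_source(available, base, want_optimized=False,
                  valid_magic=lambda name: True):
    """Pick which archive member satisfies an import of `base`.

    `available` is the set of member names in the archive and
    `valid_magic` reports whether a compiled file has a usable magic
    number. Returns (member_name, needs_compile).
    """
    # Prefer the requested compiled form, then fall back to the other one.
    order = (".pyo", ".pyc") if want_optimized else (".pyc", ".pyo")
    for suffix in order:
        name = base + suffix
        if name in available and valid_magic(name):
            return name, False
    name = base + ".py"
    if name in available:
        # Compiled in memory only; nothing is written back to the archive.
        return name, True
    return None, False


print(choose_source({"Q/modfoo.pyo", "Q/modfoo.py"}, "Q/modfoo"))
print(choose_source({"Q/modfoo.py"}, "Q/modfoo"))
```

The second call returns a needs_compile flag of True, which is the slow path the text warns about: the source is recompiled on every run because the result is never saved.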
The only way to find files in a zip archive is linear search. So for each zip file in sys.path, we search for its names once, and put the names plus other relevant data into a static Python dictionary. The key is the archive name from sys.path joined with the file name (including any subdirectories) within the archive. This is exactly the name generated by import.c, and makes lookup easy.
This same mechanism is used to speed up directory (non-zip) imports. See below.
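A minimal sketch of such a dictionary, using the standard zipfile module rather than the C code this PEP describes. The function name index_archive is hypothetical, and the tuple stored as the value is only a guess at what "other relevant data" might include.

```python
import io
import zipfile


def index_archive(archive_name, zip_file, index):
    """Record every member of an open ZipFile in `index`, keyed by full path."""
    for info in zip_file.infolist():
        # Key: archive name from sys.path joined with the member name.
        key = archive_name + "/" + info.filename
        # Value: assumed to be whatever is needed to read the member later.
        index[key] = (
            info.compress_type,
            info.compress_size,
            info.file_size,
            info.header_offset,
        )
    return index


# Build a tiny in-memory archive so the sketch is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("Q/R/modfoo.py", "x = 1\n")

index = {}
with zipfile.ZipFile(buffer) as zf:
    index_archive("/C/D/E/Archive.zip", zf, index)

print("/C/D/E/Archive.zip/Q/R/modfoo.py" in index)
```

The archive is scanned once; every later import attempt is a single dictionary lookup on the generated path.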
Compressed zip archives require zlib for decompression. Prior to any other imports, we attempt an import of zlib. Import of compressed files will fail with a message “missing zlib” unless zlib is available.
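The check can be sketched as follows. The names HAVE_ZLIB and check_member_readable are hypothetical, and raising ImportError is an assumption; the text only says the import fails with the message “missing zlib”.

```python
import zipfile

# Attempt the zlib import once, before any archive member is read.
try:
    import zlib  # noqa: F401
    HAVE_ZLIB = True
except ImportError:
    HAVE_ZLIB = False


def check_member_readable(compress_type, have_zlib=HAVE_ZLIB):
    """Return True if a member can be read, else raise ImportError."""
    if compress_type == zipfile.ZIP_DEFLATED and not have_zlib:
        raise ImportError("missing zlib")
    return True


print(check_member_readable(zipfile.ZIP_STORED, have_zlib=False))
```

Uncompressed (stored) members remain importable without zlib, which is why the check looks at the compression type of each member and not at the archive as a whole.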
Python imports site.py itself, and this imports os, nt, ntpath, stat, and UserDict. It also imports sitecustomize.py which may import more modules. Zip imports must be available before site.py is imported.
Just as there are default directories in sys.path, there must be one or more default zip archives too.
The problem is what the name should be. The name should be linked with the Python version, so the Python executable can correctly find its corresponding libraries even when there are multiple Python versions on the same machine.
We add one name to sys.path. On Unix, the directory is sys.prefix + "/lib", and the file name is "python%s%s.zip" % (sys.version[0], sys.version[2]). So for Python 2.2 and prefix /usr/local, the path /usr/local/lib/python2.2/ is already on sys.path, and /usr/local/lib/python22.zip would be added. On Windows, the file is the full path to python22.dll, with “dll” replaced by “zip”. The zip archive name is always inserted as the second item in sys.path. The first is the directory of the main.py (thanks Tim).
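The Unix naming rule can be checked with a short sketch. The helper names default_zip_path and with_default_archive are hypothetical, and the version string and prefix are passed in explicitly so the example reproduces the Python 2.2 case from the text on any interpreter.

```python
import posixpath


def default_zip_path(prefix, version):
    """Build the Unix default archive path from a version string such as '2.2'."""
    # Same formula as in the text: first and third characters of the version.
    name = "python%s%s.zip" % (version[0], version[2])
    return posixpath.join(prefix, "lib", name)


def with_default_archive(path_entries, prefix, version):
    """Return a copy of `path_entries` with the archive as the second item."""
    entries = list(path_entries)
    entries.insert(1, default_zip_path(prefix, version))
    return entries


print(default_zip_path("/usr/local", "2.2"))
print(with_default_archive(
    ["/home/user/scripts", "/usr/local/lib/python2.2"], "/usr/local", "2.2"))
```

Indexing single characters of the version string only works while the major and minor numbers are each one digit, as they were for Python 2.2.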
The static Python dictionary used to speed up zip imports can be used to speed up normal directory imports too. For each item in sys.path that is not a zip archive, we call os.listdir, and add the directory contents to the dictionary. Then instead of calling fopen() in a double loop, we just check the dictionary. This greatly speeds up imports. If os.listdir doesn’t exist, the dictionary is not used.
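A minimal sketch of the directory cache, assuming hypothetical helper names index_directory and find_module_file and an assumed suffix order. It replaces the per-candidate file open with a dictionary membership test.

```python
import os
import tempfile


def index_directory(directory, index):
    """Add every name in `directory` to `index`, keyed by full path."""
    try:
        names = os.listdir(directory)
    except OSError:
        # Unreadable or missing directory: leave the index unchanged.
        return index
    for name in names:
        index[os.path.join(directory, name)] = True
    return index


def find_module_file(index, sys_path, module_name,
                     suffixes=(".pyc", ".pyo", ".py")):
    """Double loop over path entries and suffixes, with no file system calls."""
    for entry in sys_path:
        for suffix in suffixes:
            candidate = os.path.join(entry, module_name + suffix)
            if candidate in index:
                return candidate
    return None


# Demonstrate with a temporary directory holding one hypothetical module.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "cached_mod.py"), "w") as handle:
    handle.write("x = 1\n")

listing_cache = index_directory(demo_dir, {})
found = find_module_file(listing_cache, [demo_dir], "cached_mod")
print(found)
```

Each directory is listed once, so a failed candidate costs a dictionary lookup instead of a failed open, which matters most on slow or networked file systems.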
| Case | Original 2.2a3 | Using os.listdir | Zip Uncomp | Zip Compr |
|---|---|---|---|---|
| 1 | 3.2 2.5 3.2->1.02 | 2.3 2.5 2.3->0.87 | 1.66->0.93 | 1.5->1.07 |
| 2 | 2.8 3.9 3.0->1.32 | Same as Case 1. | | |
| 3 | 5.7 5.7 5.7->5.7 | 2.1 2.1 2.1->1.8 | 1.25->0.99 | 1.19->1.13 |
| 4 | 9.4 9.4 9.3->9.35 | Same as Case 3. | | |
Case 1: Local drive C:, sys.path has its default value.
Case 2: Local drive C:, directory with files is at the end of sys.path.
Case 3: Network drive, sys.path has its default value.
Case 4: Network drive, directory with files is at the end of sys.path.
Benchmarks were performed on a Pentium 4 clone, 1.4 GHz, 256 Meg. The machine was running Windows 2000 with a Linux/Samba network server. Times are in seconds, and are the time to import about 100 Lib modules. Cases 2 and 4 have the “correct” directory moved to the end of sys.path. “Uncomp” means uncompressed zip archive, “Compr” means compressed.
Initial times are after a re-boot of the system; the time after “->” is the time after repeated runs. Times to import from C: after a re-boot are rather highly variable for the “Original” case, but are more realistic.
The logic demonstrates the ability to import using default searching until a needed Python module (in this case, os) becomes available. This can be used to bootstrap custom importers. For example, if “importer()” in __init__.py exists, then it could be used for imports. The “importer()” can freely import os and other modules, and these will be satisfied from the default mechanism. This PEP does not define any custom importers, and this note is for information only.
A C implementation is available as SourceForge patch 492105. Superseded by patch 652586 and current CVS. [2]
A newer version (updated for recent CVS by Paul Moore) is 645650. Superseded by patch 652586 and current CVS. [3]
A competing implementation by Just van Rossum is 652586, which is the basis for the final implementation of PEP 302. PEP 273 has been implemented using PEP 302’s import hooks. [1]
This document has been placed in the public domain.
Source: https://github.com/python/peps/blob/main/peps/pep-0273.rst
Last modified: 2025-02-01 08:55:40 GMT