Important
This PEP is a historical document. The up-to-date, canonical spec, Source distribution file name, is maintained on the PyPA specs page.
See the PyPA specification update process for how to propose changes.
This PEP describes a standard naming scheme for a Source Distribution, also known as an sdist. An sdist is distinct from an arbitrary archive file containing source code of Python packages, and can be used to communicate information about the distribution to packaging tools.
A standard sdist specified here is a gzipped tar file with a specially formatted filename and the usual .tar.gz suffix. This PEP does not specify the contents of the tarball, as that is covered in other specifications.
An sdist is a Python package distribution that contains the “source code” of the Python package, and requires a build step to be turned into a wheel on installation. This format is often considered the unbuilt counterpart of a PEP 427 wheel, and is given special treatment in various parts of the packaging ecosystem.
The content of an sdist is specified in PEP 517 and PEP 643, but currently the filename of the sdist is incompletely specified, meaning that consumers of the format must download and process the sdist to confirm the name and version of the distribution included within.
Installers currently rely on heuristics to infer the name and/or version from the filename, to help the installation process. pip, for example, parses the filename of an sdist from a PEP 503 index, to obtain the distribution’s project name and version for dependency resolution purposes. But due to the lack of specification, the installer does not have any guarantee as to the correctness of the inferred data, and must verify it at some point by locally building the distribution metadata.
This build step is awkward for a certain class of operations, when the user does not expect the build process to occur. pypa/pip#8387 describes an example. The command pip download --no-deps --no-binary=numpy numpy is expected to only download an sdist for numpy, since we do not need to check for dependencies, and both the name and version are available by introspecting the downloaded filename. pip, however, cannot assume the downloaded archive follows the convention, and must build and check the metadata. For a PEP 518 project, this means running the prepare_metadata_for_build_wheel hook specified in PEP 517, which incurs significant overhead.
By creating a special filename scheme for the sdist format, this PEP frees tools from the time-consuming metadata verification step when they only need the metadata available in the filename.
This PEP also serves as the formal specification of the long-standing filename convention used by the current sdist implementations. The filename contains the distribution name and version, to help tools identify a distribution without needing to download the file, unarchive it, and perform costly metadata generation for introspection, if all the information they need is available in the filename.
The name of an sdist should be {distribution}-{version}.tar.gz.
distribution is the name of the distribution as defined in PEP 345, and normalised as described in the wheel spec, e.g. 'pip', 'flit_core'.
version is the version of the distribution as defined in PEP 440, e.g. 20.2, and normalised according to the rules in that PEP.
An sdist must be a gzipped tar archive in pax format, that is able to be extracted by the standard library tarfile module with the open flag 'r:gz'.
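As an illustration of the naming rule above, the following is a minimal sketch of how a tool might build a conforming filename. The function names are hypothetical. The regular expression reflects one reading of the wheel spec's name escaping (runs of "-", "_" and "." become a single "_", and the result is lowercased). The version string is assumed to have been normalised per PEP 440 already, since that step is not reproduced here.

```python
import re


def normalise_distribution(name: str) -> str:
    # Assumed wheel-spec escaping: collapse runs of "-", "_" and "."
    # into a single "_" and lowercase the result.
    return re.sub(r"[-_.]+", "_", name).lower()


def sdist_filename(name: str, version: str) -> str:
    # The version is assumed to be normalised per PEP 440 by the caller.
    return f"{normalise_distribution(name)}-{version}.tar.gz"


print(sdist_filename("Flit-Core", "3.9.0"))
```

Because the normalised distribution name never contains a hyphen, the hyphen that separates name and version is unambiguous.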
Code that produces an sdist file MUST give the file a name that matches this specification. The specification of the build_sdist hook from PEP 517 is extended to require this naming convention.
Code that processes sdist files MAY determine the distribution name and version by simply parsing the filename, and is not required to verify that information by generating or reading the metadata from the sdist contents.
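To make the producer-side requirements concrete, here is a rough sketch (not a real build backend) that writes a gzipped tar archive in pax format under a conforming name and then reopens it with the 'r:gz' flag. The helper names and the in-memory file mapping are illustrative assumptions. The top-level directory name follows the {name}-{version} layout that sdist contents are expected to use.

```python
import io
import tarfile
from typing import Dict, List, Tuple


def build_sdist_bytes(
    distribution: str, version: str, files: Dict[str, bytes]
) -> Tuple[str, bytes]:
    # Both arguments are assumed to be normalised already.
    filename = f"{distribution}-{version}.tar.gz"
    top_level = f"{distribution}-{version}"
    buffer = io.BytesIO()
    # "w:gz" writes a gzip-compressed tar; PAX_FORMAT selects the pax format.
    with tarfile.open(
        fileobj=buffer, mode="w:gz", format=tarfile.PAX_FORMAT
    ) as archive:
        for relative_path, data in files.items():
            info = tarfile.TarInfo(name=f"{top_level}/{relative_path}")
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return filename, buffer.getvalue()


def list_members(data: bytes) -> List[str]:
    # Reopen with the 'r:gz' flag, as a consumer would.
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        return archive.getnames()


name, payload = build_sdist_bytes(
    "flit_core", "3.9.0", {"PKG-INFO": b"Metadata-Version: 2.2\n"}
)
print(name, list_members(payload))
```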
Conforming sdist files can be recognised by the presence of the .tar.gz suffix and a single hyphen in the filename. Note that some legacy files may also match these criteria, but this is not expected to be an issue in practice. See the “Backwards Compatibility” section of this document for more details.
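The recognition rule above (a .tar.gz suffix and exactly one hyphen) can be sketched as a small parser. This is an illustrative helper with a hypothetical name, not the code used by pip or any other tool. It returns None for filenames that do not match, leaving the caller to fall back to whatever legacy handling it already has.

```python
from typing import Optional, Tuple

SUFFIX = ".tar.gz"


def parse_sdist_filename(filename: str) -> Optional[Tuple[str, str]]:
    # A conforming name ends in ".tar.gz" ...
    if not filename.endswith(SUFFIX):
        return None
    stem = filename[: -len(SUFFIX)]
    # ... and contains exactly one hyphen, separating name from version.
    if stem.count("-") != 1:
        return None
    distribution, version = stem.split("-")
    if not distribution or not version:
        return None
    return distribution, version


print(parse_sdist_filename("pip-20.2.tar.gz"))
print(parse_sdist_filename("my-package-1.0.tar.gz"))
```

A filename with several hyphens is rejected rather than guessed at, which mirrors the way this document treats such names as obviously legacy.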
The new filename scheme is a subset of the current informal naming convention for sdist files, so files that are created or published in conformance with this standard will be readable by older tools that only understand the previous naming conventions.
Tools that consume sdist filenames would technically not be able to determine whether a file is using the new standard or a legacy form. However, a review of the filenames on PyPI determined that 37% of files are obviously legacy (because they contain multiple or no hyphens) and of the remainder, parsing according to this PEP gives the correct answer in all but 0.004% of cases.
Currently, tools that consume sdists should, if they are to be fully correct, treat the name and version parsed from the filename as provisional, and verify them by downloading the file and generating the actual metadata (or reading it, if the sdist conforms to PEP 643). Tools supporting this specification can treat the name and version from the filename as definitive. In theory, this could risk mistakes if a legacy filename is assumed to conform to this PEP, but in practice the chance of this appears to be vanishingly small.
Since this PEP was first written, PEP 643 has been accepted, defining a trustworthy, standard sdist metadata format. This allows distribution metadata (and in particular name and version) to be determined statically.
This is not considered sufficient, however, as in a number of significant cases (for example, reading filenames from a package index) the application only has access to the filename, and reading metadata would involve a potentially costly download.
The original version of this PEP proposed a filename of {distribution}-{version}.sdist. This has the advantage of being explicit, as well as allowing a future change to the storage format without needing a further change of the file naming convention.
However, there are significant compatibility issues with a new extension. Index servers may currently disallow unknown extensions, and if we introduced a new one, it is not clear how to handle cases like a legacy index trying to mirror an index that hosts new-style sdists. Is it acceptable to only partially mirror, omitting sdists for newer versions of projects? Also, build backends that produce the new format would be incompatible with index servers that only accept the old format, and as there is often no way for a user to request an older version of a backend when doing a build, this could make it impossible to build and upload sdists.
A scheme {distribution}-{version}.sdist.tar.gz was raised during the initial discussion. This was abandoned due to backwards compatibility issues with currently available installation tools. pip 20.1, for example, would parse distribution-1.0.sdist.tar.gz as project distribution with version 1.0.sdist. This would cause the sdist to be downloaded, but fail to install due to inconsistent metadata.
The main advantage of this proposal was that it is easier for tools to recognise the new-style naming. But this is not a particularly significant benefit, given that all sdists with a single hyphen in the name are parsed the same way under the old and new rules.
The contents of an sdist are required to contain a single top-level directory named {name}-{version}. Currently no normalisation rules are required for the components of this name. Should this PEP require that the same normalisation rules are applied here as for the filename? Note that in practice, it is likely that tools will create the two names using the same code, so normalisation is likely to happen naturally, even if it is not explicitly required.
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0625.rst
Last modified: 2025-05-06 21:28:00 GMT