Python Enhancement Proposals

PEP 527 – Removing Un(der)used file types/extensions on PyPI

Author: Donald Stufft <donald at stufft.io>
BDFL-Delegate: Alyssa Coghlan <ncoghlan at gmail.com>
Discussions-To: Distutils-SIG list
Status: Final
Type: Standards Track
Topic: Packaging
Created: 23-Aug-2016
Post-History: 23-Aug-2016
Resolution: Distutils-SIG message

Abstract

This PEP recommends deprecating, and ultimately removing, support for uploading certain unused or underused file types and extensions to PyPI. In particular, it recommends disallowing further uploads of any files of the types bdist_dumb, bdist_rpm, bdist_dmg, bdist_msi, and bdist_wininst, leaving PyPI to accept new uploads of only the sdist, bdist_wheel, and bdist_egg file types.

In addition, this PEP proposes removing support for new uploads of sdists using the .tar, .tar.bz2, .tar.xz, .tar.Z, .tgz, .tbz, and any other extension besides .tar.gz and .zip.

Finally, this PEP also proposes limiting the number of allowed sdist uploads for each individual release of a project on PyPI to one instead of one for each allowed extension.

Rationale

File Formats

Currently PyPI supports the following file types:

  • sdist
  • bdist_wheel
  • bdist_egg
  • bdist_wininst
  • bdist_msi
  • bdist_dmg
  • bdist_rpm
  • bdist_dumb

However, these different types of files have varying amounts of usefulness or general use in the ecosystem. Continuing to support them adds a maintenance burden on PyPI as well as tool authors and incurs a cost in both bandwidth and disk space not only on PyPI itself, but also on any mirrors of PyPI.

Python packaging is a multi-level ecosystem where PyPI is primarily suited and used to distribute virtual environment compatible packages directly from their respective project owners. These packages are then consumed either directly by end-users, or by downstream distributors that take these packages and turn them into their respective system level packages (such as RPM, deb, MSI, etc).

While PyPI itself only directly works with these Python specific but platform agnostic packages, we encourage community-driven and commercial conversions of these packages to downstream formats for particular target environments, like:

  • The conda cross-platform data analysis ecosystem (conda-forge)
  • The deb based Linux ecosystem (Debian, Ubuntu, etc)
  • The RPM based Linux ecosystem (Fedora, openSuSE, Mageia, etc)
  • The homebrew, MacPorts and fink ecosystems for Mac OS X
  • The Windows Package Management ecosystem (NuGet, Chocolatey, etc)
  • 3rd party creation of Windows MSIs and installers (e.g. Christoph Gohlke’s work at http://www.lfd.uci.edu/~gohlke/pythonlibs/ )
  • other commercial redistribution formats (ActiveState’s PyPM, Enthought Canopy, etc)
  • other open source community redistribution formats (Nix, Gentoo, Arch, *BSD, etc)

It is the belief of this PEP that the entire ecosystem is best supported by keeping PyPI focused on the platform agnostic formats, where the limited amount of volunteer time can be best used instead of being spread out amongst several platforms. Furthermore, this PEP believes that the people best positioned to provide well integrated packages for a particular platform are people focused on that platform, and not across all possible platforms.

bdist_dumb

As its name implies, bdist_dumb is not a very complex format; however, it is so simple as to be worthless for actual usage.

For instance, if you’re using something like pyenv on macOS and you’re building a library using Python 3.5, then bdist_dumb will produce a .tar.gz file named something like exampleproject-1.0.macosx-10.11-x86_64.tar.gz. Right off the bat this file name is somewhat difficult to differentiate from an sdist since they both use the same file extension (and with the legacy pre-PEP 440 versions, 1.0-macosx-10.11-x86_64 is a valid, although quite silly, version number). However, once you open up the created .tar.gz, you’d find that there is no metadata inside that could be used for things like dependency discovery. In fact, it is quite simply a tarball containing hardcoded paths to wherever files would have been installed on the computer creating the bdist_dumb. Going back to our pyenv on macOS example, this means that if I created it, it would contain files like:

Users/dstufft/.pyenv/versions/3.5.2/lib/python3.5/site-packages/example.py
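The file name ambiguity described above can be illustrated with a minimal sketch. The helper naive_sdist_parse is hypothetical and is not how PyPI or any packaging tool parses file names; it only assumes that a tool splits "<name>-<version>.tar.gz" at the first hyphen, which is enough to show that the bdist_dumb file name yields a plausible-looking name and version.

```python
def naive_sdist_parse(filename):
    """Hypothetical helper: split '<name>-<version>.tar.gz' at the first hyphen."""
    suffix = ".tar.gz"
    if not filename.endswith(suffix):
        raise ValueError("not a .tar.gz file")
    stem = filename[: -len(suffix)]
    name, sep, version = stem.partition("-")
    if not sep:
        raise ValueError("no version component found")
    return name, version


# A real sdist name and the bdist_dumb name from the example above
# both parse without error, so the extension alone cannot tell them apart.
sdist = naive_sdist_parse("exampleproject-1.0.tar.gz")
dumb = naive_sdist_parse("exampleproject-1.0.macosx-10.11-x86_64.tar.gz")
print(sdist)
print(dumb)
```

Under this assumption, the second call returns the platform tag as part of the "version", which is the confusion the paragraph above describes.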

bdist_rpm

The bdist_rpm format on PyPI allows people to upload .rpm files for end users to download and install by hand. However, rpm is commonly used with a specially designed repository that allows automatic installation of dependencies, upgrades, etc., which PyPI does not provide. Thus, it is a type of file that is barely used on PyPI, with only ~460 files of this type having been uploaded to PyPI (out of a total of 662,544).

In addition, services like COPR provide a better supported mechanism for publishing and using RPM files than we’re ever likely to get on PyPI.

bdist_dmg, bdist_msi, and bdist_wininst

The bdist_dmg, bdist_msi, and bdist_wininst formats are similar in that they are OS specific installers that will only install a library into an environment and are not designed for real user facing installs of applications (which would require things like bundling a Python interpreter and the like).

Out of these three, the usage for bdist_dmg and bdist_msi is very low, with only ~500 bdist_msi files and ~50 bdist_dmg files having been uploaded to PyPI. The bdist_wininst format has more use, with ~14,000 files having ever been uploaded to PyPI.

It’s quite easy to look at the low usage of bdist_dmg and bdist_msi and conclude that removing them will be fairly low impact; however, bdist_wininst has one to two orders of magnitude more uploads. This is somewhat misleading though, because although more people upload those files, the actual usage of the uploaded files is fairly low. Taking a look at the previous 30 days, we can see that 90% of all downloads of bdist_wininst files from PyPI were generated by the mirroring infrastructure and 7% of them were generated by setuptools (which can currently be better covered by bdist_egg files).

Given the small number of files uploaded for bdist_dmg and bdist_msi, and that bdist_wininst files largely either consume bandwidth and disk space via the mirroring infrastructure or could be trivially replaced with bdist_egg files, this PEP proposes to include these three formats in the list of those to be disallowed.

File Extensions

Currently sdist supports a wide variety of file extensions like .tar.gz, .tar, .tar.bz2, .tar.xz, .zip, .tar.Z, .tgz, and .tbz. However, of those, the only extensions which get anything more than negligible usage are .tar.gz with 444,338 sdists currently, .zip with 58,774 sdists currently, and .tar.bz2 with 3,265 sdists currently.
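To put those counts in proportion, the short sketch below computes each extension's share of the three figures quoted above. It is only illustrative arithmetic: the total here covers just these three extensions, since the text gives no counts for the others.

```python
# Upload counts quoted in the text; other extensions are omitted
# because no figures are given for them.
sdist_counts = {".tar.gz": 444_338, ".zip": 58_774, ".tar.bz2": 3_265}

total = sum(sdist_counts.values())
shares = {ext: 100 * count / total for ext, count in sdist_counts.items()}

for ext, share in shares.items():
    print(f"{ext:9} {share:5.1f}%")
```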

Having multiple formats accepted requires tooling both within PyPI and outside of PyPI to handle all of the various extensions that might be used (even if nobody is currently using them). This doesn’t only affect PyPI, but ripples out throughout the ecosystem. In addition, the different formats all have different requirements for what optional C libraries Python was linked against and different requirements for what versions of Python they support. Finally, multiple formats also create a weird situation where there may be two sdist files for a particular project/release with subtly different content.

It’s easy to advocate that anything outside of .tar.gz, .zip, and .tar.bz2 should be disallowed. Outside of a tiny handful, nobody has actively been uploading these other types of files in the ~15 years of PyPI’s existence so they’ve obviously not been particularly useful. In addition, while .tar.xz is theoretically a nicer format than the other .tar.* formats due to the better compression ratio achieved by LZMA, it is only available in Python 3.3+ and has an optional dependency on the lzma C library.

Looking at the three extensions we do have in current use, it’s also fairly easy to conclude that .tar.bz2 can be disallowed as well. It has a fairly small number of files ever uploaded with it and it requires an additional optional C library to handle the bzip2 compression.

Finally we get down to .tar.gz and .zip. Looking at the pure numbers for these two, we can see that .tar.gz is by far the most uploaded format, with 444,338 total uploaded compared to .zip’s 58,774, and on POSIX operating systems .tar.gz is also the default produced by all currently released versions of Python and setuptools. In addition, these two file types both use the same C library (zlib), which is also required for bdist_wheel and bdist_egg. The two wrinkles with deciding between .tar.gz and .zip are that while .tar.gz is the default on POSIX operating systems, .zip is the default on Windows, and that the bdist_wheel format also uses zip.

Instead of trying to standardize on either .tar.gz or .zip, this PEP proposes that we allow either .tar.gz or .zip for sdists.
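Taken together, the file type and extension rules proposed so far could be checked roughly as follows. This is a minimal sketch, not PyPI's actual upload code: the function name upload_allowed and the constant names are hypothetical, and it assumes the upload declares its file type as one of the strings used in this PEP.

```python
# File types and sdist extensions that remain uploadable under this PEP.
ALLOWED_FILETYPES = {"sdist", "bdist_wheel", "bdist_egg"}
ALLOWED_SDIST_EXTENSIONS = (".tar.gz", ".zip")


def upload_allowed(filetype, filename):
    """Hypothetical check: return True if a new upload fits the proposed rules."""
    if filetype not in ALLOWED_FILETYPES:
        return False
    if filetype == "sdist":
        # str.endswith accepts a tuple of suffixes.
        return filename.endswith(ALLOWED_SDIST_EXTENSIONS)
    # Extension checks for wheels and eggs are outside this sketch.
    return True
```

For example, upload_allowed("sdist", "exampleproject-1.0.tar.bz2") and upload_allowed("bdist_wininst", "exampleproject-1.0.win32.exe") would both be rejected, while a .tar.gz or .zip sdist would be accepted.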

Limiting number of sdists per release

An sdist on PyPI should be a single source of truth for a particular release of software. However, currently PyPI allows you to upload one sdist for each of the sdist file extensions it allows. Currently this allows something like 10 different sdists for a project, but even with the extension limits in this PEP it would allow two different sources of truth for a single version. Having multiple sdists can lead to strange bugs that surface only depending on which sdist a person used.

To resolve this, this PEP proposes to allow one, and only one, sdist per release of a project.
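A minimal sketch of the one-sdist-per-release rule follows. The function can_upload is hypothetical, and it assumes a release is represented simply as the list of file types already uploaded for it; a real index would track considerably more state.

```python
def can_upload(release_filetypes, new_filetype):
    """Hypothetical check: allow at most one sdist per release.

    release_filetypes: file types already uploaded for this release.
    """
    if new_filetype == "sdist":
        return "sdist" not in release_filetypes
    # Other file types are not limited by this particular rule.
    return True


release = ["sdist", "bdist_wheel"]
print(can_upload(release, "sdist"))
print(can_upload(release, "bdist_wheel"))
```

Here a second sdist is refused even if it uses the other allowed extension, while additional wheels are unaffected by this rule.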

Removal Process

This PEP does NOT propose removing any existing files from PyPI, only disallowing new ones from being uploaded. This restriction will be phased in on a per-project basis to allow projects to adjust to the new restrictions where applicable.

First, any existing projects will be flagged to allow legacy file types to be uploaded, and any project without that flag (i.e. new projects) will not be able to upload anything but sdist with a .tar.gz or .zip extension, bdist_wheel, and bdist_egg. Then, any existing projects that have never uploaded a file that requires the legacy file type flag will have that flag removed, also making them fall under the new restrictions. Next, an email will be generated to the maintainers of all projects still given the legacy flag, which will inform them of the upcoming new restrictions on uploads and tell them that these restrictions will be applied to future uploads to their projects starting in 1 month. Finally, after 1 month all projects will have the legacy file type flag removed, and support for uploading these types of files will cease to exist on PyPI.
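The phases above can be sketched as a small simulation. Everything here is illustrative: the Project class, the allow_legacy_files attribute, and the function names are assumptions made for this sketch and do not describe PyPI's data model or implementation.

```python
from dataclasses import dataclass, field

LEGACY_BDIST_TYPES = {"bdist_dumb", "bdist_rpm", "bdist_dmg", "bdist_msi", "bdist_wininst"}
ALLOWED_SDIST_EXTENSIONS = (".tar.gz", ".zip")


@dataclass
class Project:
    """Hypothetical project record; uploads holds (filetype, filename) pairs."""
    name: str
    uploads: list = field(default_factory=list)
    allow_legacy_files: bool = False


def needs_legacy_flag(filetype, filename):
    """True if this upload would be disallowed under the new rules."""
    if filetype in LEGACY_BDIST_TYPES:
        return True
    return filetype == "sdist" and not filename.endswith(ALLOWED_SDIST_EXTENSIONS)


def flag_existing(projects):
    # Phase 1: every existing project may still upload legacy files.
    for project in projects:
        project.allow_legacy_files = True


def unflag_unaffected(projects):
    # Phase 2: drop the flag where no legacy file was ever uploaded.
    for project in projects:
        if not any(needs_legacy_flag(ft, fn) for ft, fn in project.uploads):
            project.allow_legacy_files = False


def projects_to_notify(projects):
    # Phase 3: maintainers of still-flagged projects get the email.
    return [project.name for project in projects if project.allow_legacy_files]


def remove_all_flags(projects):
    # Phase 4: one month later, the flag is removed everywhere.
    for project in projects:
        project.allow_legacy_files = False


projects = [
    Project("alpha", [("sdist", "alpha-1.0.tar.gz")]),
    Project("beta", [("bdist_wininst", "beta-1.0.win32.exe")]),
]
flag_existing(projects)
unflag_unaffected(projects)
notified = projects_to_notify(projects)
remove_all_flags(projects)
print(notified)
```

In this toy data set only the project with a bdist_wininst upload keeps the flag long enough to be notified; no existing files are touched at any phase.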

This plan should provide minimal disruption since it does not remove any existing files, and the types of files it does prevent from being uploaded are either not particularly useful (or used), or their uploaders can continue to upload a similar type of file with a slight change to their process.

Copyright

This document has been placed in the public domain.


Source: https://github.com/python/peps/blob/main/peps/pep-0527.rst

Last modified: 2025-02-01 08:59:27 GMT

