This PEP proposes the deprecation and removal of support for hosting files externally to PyPI as well as the deprecation and removal of the functionality added by PEP 438, particularly the rel information used to classify different types of links and the meta-tag used to indicate the API version.
Historically PyPI did not have any method of hosting files nor any method of automatically retrieving installables; it was instead focused on providing a central registry of names, to prevent naming collisions, and on serving as a means of discovery for finding projects to use. In the course of time setuptools began to scrape these human-facing pages, as well as pages linked from those pages, looking for things it could automatically download and install. Eventually this became the “Simple” API, which used a similar URL structure but eliminated the extraneous links and information to make the API more efficient. Additionally PyPI grew the ability for a project to upload release files directly to PyPI, enabling PyPI to act as a repository in addition to an index.
This gives PyPI two equally important roles in the Python ecosystem: that of an index, to enable easy discovery of Python projects, and that of a central repository, to enable easy hosting, download, and installation of Python projects. Due to the history behind PyPI and the very organic growth it has experienced, the lines between these two roles are blurry. This blurring has caused confusion for the end users of both roles, which has in turn caused ire between people attempting to use PyPI in different capacities, most often when end users want to use PyPI as a repository but the author wants to use PyPI solely as an index.
This confusion comes down to end users of projects not realizing whether a project is hosted on PyPI or relies on an external service. This often manifests itself when the external service is down but PyPI is not. People will see that PyPI works, and that other projects work, but that this one specific project does not. They oftentimes do not realize who they need to contact in order to get this fixed or what their remediation steps are.
PEP 438 attempted to solve this issue by allowing projects to explicitly declare whether they were using the repository features or not, and if they were not, it had the installers classify the links they found as either “internal”, “verifiable external” or “unverifiable external”. PEP 438 was accepted and implemented in pip 1.4 (released on Jul 23, 2013) with the final transition implemented in pip 1.5 (released on Jan 2, 2014).
PEP 438 was successful in getting more people to use PyPI’s repository features, an altogether good thing given that the global CDN powering PyPI provides speedups for a lot of people. However, it did so by introducing a new point of confusion and pain for both end users and authors.
By moving to explicit multiple repositories we can make the lines between these two roles much clearer and remove the “hidden” surprises caused by the current handling of people who do not want to use PyPI as a repository.
The two common installer tools, pip and easy_install/setuptools, both support the concept of additional locations to search for files to satisfy the installation requirements and have done so for many years. This means that there is no need to “phase” in a new flag or concept, and the solution to installing a project from a repository other than PyPI will function regardless of how old (within reason) the end user’s installer is. Not only has this concept existed in the Python tooling for some time, but it also exists across languages and even extends to the OS level, where OS package tools almost universally support multiple repositories, making it extremely likely that someone is already familiar with the concept.
Additionally, the multiple repository approach is useful outside of the narrow scope of projects that wish to be included in the index portion of PyPI but do not wish to use the repository portion of PyPI. This includes cases where a company may wish to host a repository that contains its internal packages, or where a project may wish to have multiple “channels” of releases, such as alpha, beta, release candidate, and final release. This could also be used by projects wishing to host files which cannot be uploaded to PyPI, such as multi-gigabyte data files or, currently at least, Linux Wheels.
While the additional search location support has existed in pip and setuptools for quite some time, support for PEP 438 has only existed in pip since the 1.4 version, and has yet to be implemented in setuptools. The design of PEP 438 did mean that users still benefited for projects which did not require external files even with older installers; however, for projects which did require external files, users are still silently being given either potentially unreliable or, even worse, unsafe files to download. This system is also unique to Python as it arises out of the history of PyPI, which means that it is almost certain that this concept will be foreign to most, if not all, users until they encounter it while attempting to use the Python toolchain.
Additionally, the classification system proposed by PEP 438 has, in practice, turned out to be extremely confusing to end users, so much so that it is a position of this PEP that the situation as it stands is completely untenable. The common pattern for a user with this system is to attempt to install a project and possibly get an error message (or maybe not, if the project ever uploaded something to PyPI but later switched without removing old files). The user sees that the error message suggests --allow-external and reissues the command adding that flag, most likely getting another error message. This time the error message suggests also adding --allow-unverified, so the user issues the command a third time, finally getting the thing they wish to install.
This UX failure exists for several reasons.
--allow-external) largely useless.

    $ pip install --allow-external myproject --allow-unverified myproject myproject
    $ pip install --allow-all-external --allow-unverified myproject myproject
Installers SHOULD implement, or continue to offer, the ability to point the installer at multiple URL locations. The exact mechanisms for a user to indicate they wish to use an additional location are left up to each individual implementation.
Additionally, the mechanism for discovering an installation candidate when multiple repositories are being used is also up to each individual implementation. However, once configured, an implementation should not discourage, warn, or otherwise cast a negative light upon the use of a repository simply because it is not the default repository.
Currently both pip and setuptools implement multiple repository support by using the best installation candidate they can find in any of the configured repositories, essentially treating them as if they were one large repository.
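The "one large repository" behavior can be pictured with a minimal sketch. This is not the code of pip or setuptools: the function name, the repository URLs, and the use of plain version tuples are assumptions made for illustration, and real installers also weigh file format and platform compatibility.

```python
def best_candidate(repositories):
    """Pick the highest version offered by any configured repository.

    ``repositories`` maps a repository URL to a list of
    (version_tuple, filename) pairs. Both the shape of this mapping and
    the tuple-based version comparison are illustrative assumptions.
    """
    # Pool every file from every repository into a single candidate list,
    # which is what "treating them as one large repository" amounts to.
    pooled = [
        (version, filename, url)
        for url, files in repositories.items()
        for version, filename in files
    ]
    if not pooled:
        return None
    # The repository a file came from plays no part in the ranking.
    return max(pooled, key=lambda candidate: candidate[0])


if __name__ == "__main__":
    repos = {
        "https://pypi.python.org/simple/": [((1, 0), "myproject-1.0.tar.gz")],
        "https://repo.example.com/simple/": [((1, 2), "myproject-1.2.tar.gz")],
    }
    print(best_candidate(repos))
```

Because candidates are ranked only by version in this sketch, a newer file in an additional repository wins over an older one in the default repository.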
Installers SHOULD also implement some mechanism for removing or otherwise disabling use of the default repository. The exact specifics of how that is achieved are up to each individual implementation.
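As one concrete illustration, pip already exposes `--extra-index-url` for adding a repository and `--index-url` for replacing the default one. The helper below is a hypothetical sketch that only assembles such a command line; the function name and the example URL are assumptions, and other installers may offer different mechanisms.

```python
def pip_install_args(project, extra_indexes=(), default_index=None):
    """Build a pip command line for installing ``project``.

    extra_indexes: repositories searched in addition to the default one.
    default_index: if given, replaces the default repository entirely.
    (Hypothetical helper; only the two pip flags are real.)
    """
    args = ["pip", "install"]
    if default_index is not None:
        # --index-url swaps out the default repository.
        args += ["--index-url", default_index]
    for url in extra_indexes:
        # --extra-index-url adds one more repository to search.
        args += ["--extra-index-url", url]
    args.append(project)
    return args


if __name__ == "__main__":
    print(" ".join(pip_install_args(
        "myproject", extra_indexes=["https://repo.example.com/simple/"]
    )))
```

The resulting list can be handed to `subprocess.run` unchanged, which avoids shell quoting issues with URLs.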
Installers SHOULD also implement some mechanism for whitelisting and blacklisting which projects a user wishes to install from a particular repository. The exact specifics of how that is achieved are up to each individual implementation.
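Since the PEP leaves the mechanism open, the following is only one possible shape for such a per-repository policy. The function name, its parameters, and the case-insensitive name matching are all assumptions for the sake of the sketch.

```python
def repository_allowed(project, allow=None, deny=()):
    """Decide whether ``project`` may be installed from one repository.

    allow: optional whitelist; ``None`` means every project is allowed.
    deny:  blacklist; it always takes precedence over the whitelist.
    (Hypothetical policy, not taken from any existing installer.)
    """
    name = project.lower()
    if name in {entry.lower() for entry in deny}:
        return False
    if allow is None:
        return True
    return name in {entry.lower() for entry in allow}


if __name__ == "__main__":
    # Only "internal-tool" may come from this repository.
    print(repository_allowed("internal-tool", allow=["internal-tool"]))
    print(repository_allowed("requests", allow=["internal-tool"]))
```

Letting the blacklist win over the whitelist is a design choice of this sketch: it makes an explicit denial impossible to override by accident.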
The Python packaging guide MUST be updated with a section detailing the options for projects to set up their own repository, so that any project that wishes to not host on PyPI in the future can reference that documentation. This should include the suggestion that projects relying on hosting their own repositories should document in their project description how to install their project.
A new hosting mode will be added to PyPI. This hosting mode will be called pypi-only and will be in addition to the three that PEP 438 has already given us, which are pypi-explicit, pypi-scrape, and pypi-scrape-crawl. This new hosting mode will modify a project’s simple API page so that it only lists the files which are directly hosted on PyPI and will not link to anything else.
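The effect of the pypi-only mode on a project's page can be sketched as a simple filter over its links. This is an illustration, not PyPI's implementation: the function name, the host set, and the assumption that every link is an absolute URL are all hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: files hosted on PyPI can be recognized by their hostname.
PYPI_HOSTS = {"pypi.python.org"}


def pypi_only_links(links, pypi_hosts=PYPI_HOSTS):
    """Keep only the links that point at files hosted on PyPI itself.

    ``links`` is assumed to be a list of absolute URLs; a relative link
    has no hostname and would be dropped by this sketch.
    """
    return [url for url in links if urlsplit(url).hostname in pypi_hosts]


if __name__ == "__main__":
    page_links = [
        "https://pypi.python.org/packages/source/m/myproject/myproject-1.0.tar.gz",
        "https://downloads.example.com/myproject-1.1.tar.gz",
        "https://example.com/myproject/download.html",
    ]
    for url in pypi_only_links(page_links):
        print(url)
```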
Upon acceptance of this PEP and the addition of the pypi-only mode, all new projects will default to the pypi-only mode; they will be locked to this mode and unable to change this particular setting.
An email will then be sent out to all of the projects which are hosted only on PyPI, informing them that in one month their project will be automatically converted to the pypi-only mode. A month after these emails have been sent, any of the emailed projects which are still hosted only on PyPI will have their mode set permanently to pypi-only.
At the same time, an email will be sent to projects which rely on hosting external to PyPI. This email will warn these projects that externally hosted files have been deprecated on PyPI and that, 3 months from the time of that email, all external links will be removed from the installer APIs. This email MUST include instructions for converting their projects to be hosted on PyPI and MUST include links to a script or package that will enable them to enter their PyPI credentials and package name and have it automatically download and re-host all of their files on PyPI. This email MUST also include instructions for setting up their own index page. This email must also contain a link to the Terms of Service for PyPI, as many users may have signed up a long time ago and may not recall what those terms are. Finally, this email must also contain a list of the links registered with PyPI where we were able to detect an installable file was located.
Two months after the initial email, another email must be sent to any projects still relying on external hosting. This email will include all of the same information that the first email contained, except that the removal date will be one month away instead of three.
Finally, a month later, all projects will be switched to the pypi-only mode and PyPI will be modified to remove the externally linked files functionality.
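The timeline above can be laid out as dates relative to the day the first emails go out. The sketch below is only an aid for reading the schedule: the function names are hypothetical, the start date in the example is made up, and clamping to the last day of a shorter month is an assumption about how "one month later" would be interpreted.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return ``start`` shifted by whole months, clamping the day if needed."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def migration_schedule(emails_sent):
    """List the milestones of the transition, keyed by calendar date."""
    return [
        (emails_sent, "email PyPI-only projects and externally hosted projects"),
        (add_months(emails_sent, 1), "lock still-PyPI-only projects to pypi-only"),
        (add_months(emails_sent, 2), "send reminder to projects still hosting externally"),
        (add_months(emails_sent, 3), "switch all projects to pypi-only, remove external links"),
    ]


if __name__ == "__main__":
    # Hypothetical start date, chosen only to show the month arithmetic.
    for when, action in migration_schedule(date(2015, 1, 31)):
        print(when.isoformat(), action)
```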
To determine impact, we’ve looked at all projects using a method of searching PyPI which is similar to what pip and setuptools use and searched for all files available on PyPI, safely linked from PyPI, unsafely linked from PyPI, and finally unsafely available outside of PyPI. When the same file was found in multiple locations it was deduplicated and counted in only one location based on the following preferences: PyPI > Safely Off PyPI > Unsafely Off PyPI. This gives us the broadest possible definition of impact: it means that any single file for a project may no longer be visible by default, even though that file could be years old, or it could be a binary file while there is an sdist available on PyPI. This means that the real impact will likely be much smaller, but in an attempt not to miscount we take the broadest possible definition.
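The deduplication rule can be expressed as a small sketch. It is not the script used for the analysis; the location labels, the function name, and the input shape are assumptions chosen to mirror the preference order PyPI > Safely Off PyPI > Unsafely Off PyPI.

```python
# Preference order from most to least preferred (labels are hypothetical).
PREFERENCE = ["pypi", "safe-external", "unsafe-external"]


def classify_files(sightings):
    """Assign each file to the single most preferred place it was seen.

    ``sightings`` is an iterable of (filename, location) pairs, where the
    same filename may appear several times with different locations.
    """
    rank = {location: position for position, location in enumerate(PREFERENCE)}
    best = {}
    for filename, location in sightings:
        # Keep the location with the lowest rank, i.e. the most preferred.
        if filename not in best or rank[location] < rank[best[filename]]:
            best[filename] = location
    return best


if __name__ == "__main__":
    seen = [
        ("myproject-1.0.tar.gz", "unsafe-external"),
        ("myproject-1.0.tar.gz", "pypi"),
        ("myproject-1.1.tar.gz", "unsafe-external"),
        ("myproject-1.1.tar.gz", "safe-external"),
    ]
    print(classify_files(seen))
```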
At the time of this writing there are 65,232 projects hosted on PyPI and, of those, 59 rely on external files that are safely hosted outside of PyPI and 931 rely on external files which are unsafely hosted outside of PyPI. This shows us that 1.5% of projects will be affected in some way by this change while 98.5% will continue to function as they always have. In addition, only about 6% of the affected projects are using the features provided by PEP 438 to safely host outside of PyPI, while about 94% of them are exposing their users to Remote Code Execution via a Man In The Middle attack.
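The shares can be rechecked directly from the three counts quoted above. The variable names are illustrative; the only inputs are the figures from the text.

```python
# Counts quoted in the text.
total_projects = 65232
safe_external = 59
unsafe_external = 931

# Projects relying on any kind of external hosting.
affected = safe_external + unsafe_external

affected_share = affected / total_projects
unaffected_share = 1 - affected_share
safe_share = safe_external / affected
unsafe_share = unsafe_external / affected

if __name__ == "__main__":
    print(f"affected projects:       {affected}")
    print(f"share of all projects:   {affected_share:.1%}")
    print(f"unaffected share:        {unaffected_share:.1%}")
    print(f"safely hosted share:     {safe_share:.1%}")
    print(f"unsafely hosted share:   {unsafe_share:.1%}")
```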
First you should decide if <X> is something inherent to PyPI, or if PyPI could grow a feature to solve <X> for you. If PyPI can add a feature to enable you to host your project on PyPI then you should propose that feature. However, if <X> is something inherent to PyPI, such as wanting to maintain control over your own files, then you should set up your own package repository and instruct your users, in your project’s description, to add it to the list of repositories their installer of choice will use.
Part of this answer is going to be specific to each individual project: you’ll need to explain to your users what caused you to decide to host in your own repository instead of using one that they already have in their installer’s default list of repositories. However, part of this answer will also be explaining that the previous behavior of transparently including external links was both a security hazard (given that in most cases it allowed a MITM to execute arbitrary Python code on the end user’s machine) and a reliability concern, and that PEP 438 attempted to resolve this by making them explicitly opt in, but that PEP 438 brought along with it a number of serious usability issues. PEP 470 represents a simplification to a model that many users will be familiar with, one which is common amongst Linux distributions.
There are a number of cheap or free hosts that would gladly support what is required for a repository. In particular, you don’t actually need to upload your files anywhere different as long as you can set up a host with the correct structure that points to where your files are actually located. Many of these hosts provide free HTTPS using a shared domain name, and free HTTPS certificates can be obtained from StartSSL or, in the near future, LetsEncrypt, or they can be obtained cheaply from any number of providers.
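To make "the correct structure" a little more concrete, here is a minimal sketch that renders a project page consisting of plain links to wherever the files actually live. It assumes the layout of one HTML page per project served at `/simple/<project>/`; the function name and the file URL are hypothetical, and a real repository may need more than this (for example, hashes in the link fragments).

```python
from html import escape


def render_simple_page(project, files):
    """Render a bare project page that links to externally stored files.

    ``files`` maps each filename to the absolute URL where it is stored,
    so the page itself can be hosted separately from the files.
    """
    rows = "\n".join(
        f'    <a href="{escape(url, quote=True)}">{escape(name)}</a><br/>'
        for name, url in sorted(files.items())
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"  <head><title>Links for {escape(project)}</title></head>\n"
        "  <body>\n"
        f"{rows}\n"
        "  </body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    page = render_simple_page(
        "myproject",
        {"myproject-1.0.tar.gz": "https://files.example.com/myproject-1.0.tar.gz"},
    )
    print(page)
```

Writing the returned string to `simple/myproject/index.html` on any static host would be enough to try the page with an installer's additional-repository option.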
The answer here will depend on what <X> is, however the answers typically are one of:
Additional PEPs to propose additional features are always welcome; however, they would need someone with the time and expertise to accurately design <X>. This particular PEP is intended to focus on getting us to a point where the capabilities of PyPI are straightforward, with an easily understood baseline that is similar to existing models such as Linux distribution repositories.
PyPI serves two critical functions for the Python ecosystem. One of those is as a central repository for the actual files that get downloaded and installed by pip or another package manager; it is this function that this PEP is concerned with and that you’d be replacing if you’re running your own repository. However, it also provides a central registry of who owns what name in order to prevent naming collisions; think of it as something like DNS but for Python packages. In addition to making sure that names are handed out in a first-come, first-served manner, it also provides a single place for users to go to search for and discover new projects. So the simple answer is: you should still register your project with PyPI to avoid naming collisions and to make it so people can still easily discover your project.
A previous version of this PEP included a new feature added to both PyPI and installers that would allow project authors to enter into PyPI a list of URLs that would instruct installers to ignore any files uploaded to PyPI and instead return an error telling the end user about these extra URLs that they can add to their installer to make the installation work.
This feature has been removed from the scope of the PEP because it proved too difficult to develop a solution that avoided UX issues similar to those that caused so many problems with the PEP 438 solution. If needed, a future PEP could revisit this idea.
This PEP rejects several related proposals which attempt to fix some of the usability problems with the current system while still keeping the general gist of PEP 438.
This includes:
These proposals are rejected because:
/simple/<project>/ page, and possibly any URLs linked from that page.

--allow-* options as well as the inability to determine if a link is expected to fail or not.

This is essentially the backwards compatible version of this PEP. It attempts to allow people using older clients, or clients which do not implement this PEP, to continue on as if nothing had changed. This proposal is rejected because the vast bulk of those scenarios are unsafe uses of the deprecated features. It is the opinion of this PEP that silently allowing unsafe actions to take place on behalf of end users is simply not an acceptable solution.
This document has been placed in the public domain.
Source: https://github.com/python/peps/blob/main/peps/pep-0470.rst
Last modified: 2025-02-01 08:59:27 GMT