woodruffw (William Woodruff)1 Draft PEP: PEP 807 – Index support for Trusted Publishing | peps.python.org
Pre-PEP thread: Pre-PEP: Trusted Publishing token exchange
Summary of the rationale and motivation
PyPI currently implements a technique that it calls Trusted Publishing, which is essentially a misuse-resistant token exchange mechanism that allows users to establish trust directly through a CI/CD or other identity provider instead of having to manually issue long-lived API credentials. Trusted Publishing has seen broad adoption on PyPI in the roughly 2 years it’s been publicly available; some numbers are in the PEP itself.
While Trusted Publishing has been a success for PyPI itself, it isn’t a standard aspect of Python packaging. This means that third-party indices can’t easily implement it (without making implementation-specific assumptions from the Warehouse codebase), and that official PyPA tooling (like twine and gh-action-pypi-publish) is tied to Warehouse’s specific implementation decisions.
This PEP seeks to address both of these problems, and to make Trusted Publishing discovery and token exchange a fully-standard aspect of Python packaging (like other interoperability mechanisms). Specifically, this PEP seeks to standardize the Trusted Publishing discovery and token exchange flows so all parties (indices, including 3p indices, and clients) can implement them regardless of how similar their registry “topology” is to the PyPI topology (of a single registry on a domain).
Summary of the proposed changes
The bulk of the PEP’s standard language is a human description of the Trusted Publishing flow as currently implemented on PyPI. The main proposed changes are:
- A “discovery” flow that augments the current “implicit discovery” process used with PyPI, which currently makes service/topological assumptions about the registry that aren’t guaranteed to be true for other indices (namely that a single host only has a single registry, which isn’t true for many third-party registry hosts). The discovery flow also makes it possible for both PyPI and other hosts to make changes to their Trusted Publishing URLs without breaking existing clients.
- An “exchange” flow that closely mirrors the existing flow implemented on PyPI. This flow consists of two endpoints: an “audience” endpoint that tells the uploading client which OIDC audience they need to obtain a credential for, and a “token minting” endpoint that the uploading client must submit their OIDC credential to in order to receive an (index-specific) upload credential.
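To make the exchange flow concrete, here is a minimal client-side sketch. It is not taken from the PEP: the function name, the injected `get_json`/`post_json`/`fetch_oidc_credential` callables, and the `"audience"` and `"token"` response keys are assumptions chosen for illustration (the `"token"` request key matches the minting example later in this thread).

```python
from typing import Any, Callable, Dict

JsonDict = Dict[str, Any]


def exchange_for_upload_token(
    audience_url: str,
    mint_url: str,
    get_json: Callable[[str], JsonDict],
    post_json: Callable[[str, JsonDict], JsonDict],
    fetch_oidc_credential: Callable[[str], str],
) -> str:
    """Run the two-step exchange and return an index-specific upload credential.

    All three callables are hypothetical hooks: the first two wrap whatever
    HTTP client is in use, the third asks the CI/CD identity provider for an
    OIDC credential bound to the given audience.
    """
    # Step 1: ask the index which OIDC audience it expects.
    audience = get_json(audience_url)["audience"]  # assumed response key

    # Step 2: obtain an OIDC credential for that audience from the provider.
    oidc_credential = fetch_oidc_credential(audience)

    # Step 3: submit the OIDC credential to the token minting endpoint.
    response = post_json(mint_url, {"token": oidc_credential})
    return response["token"]  # assumed response key
```

Passing the transport in as callables keeps the sketch independent of any particular HTTP library, and makes the ordering of the two endpoints easy to see.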
As always, thanks in advance to everyone who provides feedback below! I look forward to hearing the community’s thoughts on this proposal.
CC @dstufft as sponsor/delegate
woodruffw (William Woodruff)2 One thing I’m flagging for discussion/that I want feedback on: right now the error payload/model is a pretty bespoke one, and it mirrors what PyPI currently serves on its endpoints. I could see this being kind of annoying for integrators/third-party registries, which might prefer to use a more standard error response.
I only just today learned about RFC 9457 from this thread, so that’s one possible option! But I’m curious if others with more relevant experience here have other/better ideas too; I freely admit that HTTP API design is not my primary subject area.
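For reference, a rough sketch of what an RFC 9457 "problem details" body could look like for a failed exchange. The member names (`type`, `title`, `status`, `detail`), the `about:blank` default type, and the `application/problem+json` media type come from the RFC; the helper name, the status code, and the message text are made up for illustration and are not what PyPI or the PEP specify.

```python
import json
from typing import Any, Dict

# Media type registered by RFC 9457 for problem details in JSON.
PROBLEM_CONTENT_TYPE = "application/problem+json"


def problem_details(
    status: int, title: str, detail: str, type_uri: str = "about:blank"
) -> Dict[str, Any]:
    """Build a problem details object using RFC 9457's standard members."""
    return {
        "type": type_uri,  # "about:blank" is the RFC's default problem type
        "title": title,
        "status": status,
        "detail": detail,
    }


# Hypothetical error for a rejected OIDC credential; the values are assumptions.
body = problem_details(
    status=422,
    title="Invalid OIDC credential",
    detail="The submitted credential did not match any registered publisher.",
)
payload = json.dumps(body)
```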
AA-Turner (Adam Turner)3 cc @kpfleming re RFC 9457 et al
woodruffw (William Woodruff)4 Another thing I wanted to flag: right now the token exchange part of this PEP mirrors exactly what PyPI does. However, one bottleneck that PyPI has observed is with publishers that correspond to hundreds (or thousands) of packages: when that happens PyPI needs to issue a scoped credential for all packages that could be uploaded by the publisher, even though only a very small minority are likely being uploaded.
One potential solution to that would be to allow the token exchange endpoint to accept a set of packages that the publisher actually intends to upload to, which would then be intersected with the eligible set. This would both make life easier on the PyPI side (no gigantic issued API tokens) and would have a more general security benefit (in that we get automatic scoping with TP, but the user can also restrict beyond what the automatic scope would grant).
In practice, this would look something like this in the token minting endpoint:
    POST /blahblah/mint-token
    {
      "token": "oidc-cred-here",
      "packages": ["abc", "def"]
    }
…so if the matching publisher was registered for abc, def, foo, and bar, it would only provide a scoped credential for abc and def instead of all four (which would be the default).
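The intersection step described above could be sketched on the index side roughly as follows. The function name and the choice to silently drop ineligible names (rather than reject the request) are assumptions for illustration; the proposal as written only says the requested set is intersected with the eligible set.

```python
from typing import Iterable, Optional, Set


def scope_for_request(
    eligible: Iterable[str], requested: Optional[Iterable[str]]
) -> Set[str]:
    """Return the package names the minted credential should be scoped to.

    `eligible` is every package the matching publisher is registered for.
    `requested` is the optional "packages" list from the minting request.
    """
    eligible_set = set(eligible)
    if requested is None:
        # No "packages" key in the request: keep the default of all
        # eligible packages.
        return eligible_set
    # Only names that are both requested and eligible survive.
    return eligible_set & set(requested)
```

With the example above, `scope_for_request(["abc", "def", "foo", "bar"], ["abc", "def"])` yields a scope of just `abc` and `def`. A real index would likely also normalize project names before comparing them.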
CC @miketheman and @dustin in particular for thoughts on the above
dustin (Dustin Ingram)5 FWIW, I think the edge case that PyPI experiences here with a publisher w/ a large number of packages is more of a bug/implementation detail in PyPI than a general issue with indexes supporting trusted publishing (for context, https://github.com/pypi/warehouse/issues/18514).
I think slightly reducing the scope of an API token would be nice but I’m not totally convinced the incremental benefit is really worth the effort – given that the underlying identity would still have the ability to mint tokens for all the projects it’s configured for, this doesn’t protect a user with a compromised workflow, just in the narrow case where an API token leaks (within the expiry window, without the workflow also being compromised somehow).
If reducing the impact of API token leak is the ultimate goal, I would suggest instead that we find a way to make tokens single-use instead. This could even be a configurable setting that the index could support, which clients would detect based on the response from the token-minting endpoint, and handle refreshing the token automatically for uploads that need multiple separate requests.
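A rough sketch of how a client might handle that refresh. Everything here is hypothetical: the `"single-use"` flag in the minting response, and the `mint_token`/`upload_file` callables, are placeholders for whatever the index and client actually define.

```python
from typing import Any, Callable, Dict, Iterable


def upload_all(
    files: Iterable[str],
    mint_token: Callable[[], Dict[str, Any]],
    upload_file: Callable[[str, str], None],
) -> None:
    """Upload each file, re-minting when the index issues single-use tokens.

    `mint_token` performs the token exchange and returns the minting
    response; `upload_file(path, token)` performs one upload request.
    """
    minted = None
    for path in files:
        # Mint before the first upload, and again before every later upload
        # if the previous response said the token was single-use (assumed flag).
        if minted is None or minted.get("single-use", False):
            minted = mint_token()
        upload_file(path, minted["token"])
```

A multi-use token is minted once and reused for every file, while a single-use token triggers one exchange per upload request.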
dustin (Dustin Ingram)7 Yep, makes sense to me! The only question I would have is whether it should always be possible to request a single-use token when minting, or if this is a configuration set on the index that happens automatically (or, both?).
woodruffw (William Woodruff)10 That all makes sense to me! To spitball a design, what about this for the discovery response?
    {
      "audience-endpoint": "https://upload.example.com/_/oidc/audience",
      "token-mint-endpoint": "https://upload.example.com/_/oidc/mint-token",
      "features": ["single-use-token", "multi-use-token"],
      "default-features": ["multi-use-token"]
    }
In the above case, the server would be advertising that it supports both single- and multi-use tokens, with multi-use being the default. Then, the token-minting request:
    {
      "token": "oidc-token",
      "features": ["single-use"]
    }
…would change the default. The server would then gain a bit of complexity in terms of deconflicting feature requests, but that’s not too bad IMO.
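One way the server-side deconflicting could look, as a sketch only. The rules here are assumptions: that the two token features are mutually exclusive, that requesting an unsupported or conflicting feature is an error, and that the request uses the same feature names as the discovery response (`single-use-token`, `multi-use-token`).

```python
from typing import Iterable, List, Set

# Assumed: features within a group cannot be active at the same time.
EXCLUSIVE_GROUPS: List[Set[str]] = [{"single-use-token", "multi-use-token"}]


def resolve_features(
    supported: Iterable[str], defaults: Iterable[str], requested: Iterable[str]
) -> Set[str]:
    """Combine the index defaults with the client's requested features."""
    requested_set = set(requested)

    unknown = requested_set - set(supported)
    if unknown:
        raise ValueError("unsupported features: %s" % sorted(unknown))

    for group in EXCLUSIVE_GROUPS:
        if len(requested_set & group) > 1:
            raise ValueError("conflicting features: %s" % sorted(requested_set & group))

    effective = set(defaults)
    for group in EXCLUSIVE_GROUPS:
        # A request for one member of a group displaces the group's default.
        if requested_set & group:
            effective -= group
    return effective | requested_set
```

With the discovery response above, a request for `single-use-token` resolves to single-use only, and a request with no features keeps the multi-use default.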
woodruffw (William Woodruff)12 After a bit of a hiatus, I come bearing updates!
The latest version of the proposed PEP includes two key changes:
- A feature negotiation design, so that a given index (like PyPI) can advertise support for Trusted Publishing behaviors beyond the default. For example, an index may wish to make its temporary publishing tokens single-use only; this negotiation design allows them to communicate this to clients so they can perform additional exchanges as necessary.
- A change to the “discovery” protocol, which both simplifies it and makes it “reversible” from the server’s perspective.
Apart from these changes, the PEP is the same as it was in November. I’d greatly appreciate any thoughts on it!
Edit: PEP link: PEP 807 – Index support for Trusted Publishing | peps.python.org