barry (Barry Warsaw)1 I’m very happy to share round 2 of PEP 694 – PyPI upload API 2.0. This latest round incorporates all the changes discussed in the first round of DPO discussion, plus changes from @EWDurbin based on our discussion at PyCon 2025. Please use this thread for any further feedback.
Given that @dstufft would normally be the PEP Delegate for PEPs affecting PyPI, and that he along with Ee are co-authors, we will be putting this PEP on the Steering Council’s agenda for pronouncement. Should they wish to delegate, @dustin has agreed to serve in that role.

This PEP proposes an extensible API for uploading files to a Python package index such as PyPI. Along with standardization, the upload API provides additional useful features such as support for:
woodruffw (William Woodruff)2 Thanks for sharing this @barry! I’m really excited to see the upload 2.0 designs move along.
Some scatterbrained thoughts from reading the current proposal:
- The expires-at key is currently described as an ISO 8601 formatted timestamp. What do you think about constraining this a bit further, and requiring that it (1) be a UTC timestamp with the Zulu marker, and (2) contain no fractional seconds (i.e. whole seconds only)? IME (in other contexts) these are unnecessary sources of malleability, especially since these are server-originated times that should only ever be UTC anyways.
- This is pedantic, but perhaps the filename and version fields should specify that they correspond to their relevant PEPs/living standards? For filename I think that would be the Source distribution file name and Binary distribution file name convention, and for version I think it’s Version specifiers. I think this is arguably obvious from context, but maybe being explicit will help future implementers.
- Two thoughts about the nonce/session design:
The current gentoken implementation lacks domain separation between the fields. I think this could result in some niche cases where two users end up with the same session token. Specifically, if two users both don’t include a nonce and have (name, version) pairs that compose into the same “domain,” then they’ll end up with the same session key.
This is pretty easy to contrive: for example, (foo, 11) (foo, version v11) and (foo1, 1) (foo1, version v1) both compose into foo11 for the purpose of the session key computation. The “classic” way to solve this is to encode each field such that it’s implicitly domain-separated, e.g. length-prefix each component.
Another option (which is probably sound but harder to formally assert) is adding delimiters between the fields that can’t occur in each field, e.g. \t or ASCII SOH.
More generally, I’m curious if anybody would object to making the nonce fully required – I think the above domain separation should still be addressed, but having the nonce would have effectively prevented it as well.
I completely understand the value of being able to disclose the staging URL (I want that feature!), but I think maybe we can preserve that property while reducing the flexibility of the session creation itself: uploading clients are in full control of the session URL regardless of the nonce state, so the UX of a --publish-staging (or similar) flag isn’t contingent on the upload API itself allowing the nonce to be unset.
I think this has two virtuous effects: the first is that it’s a defense in depth against the scenario above, and the second is that it eliminates a source of user error/unfamiliarity (i.e. users needing to understand what a nonce is to get their intended behavior, versus the self-describing behavior of “you’ve disclosed the staging URL, regardless of how the machinery did that internally”).
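To make the collision concrete, here is a minimal sketch. It does not reproduce the PEP's gentoken; the function names, the use of SHA-256, and the plain-concatenation scheme in naive_token are all assumptions made for illustration. It shows the (foo, 11) versus (foo1, 1) case colliding under concatenation, and a length-prefixed encoding keeping the two apart.

```python
import hashlib


def naive_token(name: str, version: str, nonce: str = "") -> str:
    """Hypothetical scheme: hash the fields joined with no separation."""
    joined = f"{name}{version}{nonce}"
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def length_prefixed_token(name: str, version: str, nonce: str = "") -> str:
    """Hash each field preceded by its byte length and a colon.

    The decimal length plus ':' marks where each field starts and ends,
    so different field tuples never produce the same byte stream.
    """
    digest = hashlib.sha256()
    for field in (name, version, nonce):
        data = field.encode("utf-8")
        digest.update(f"{len(data)}:".encode("ascii"))
        digest.update(data)
    return digest.hexdigest()


# Both pairs concatenate to "foo11", so the naive tokens are identical.
collides = naive_token("foo", "11") == naive_token("foo1", "1")

# With length prefixes the inputs are "3:foo2:110:" and "4:foo11:10:".
separated = length_prefixed_token("foo", "11") != length_prefixed_token("foo1", "1")
```

A required nonce would also avoid this particular collision in practice, but the encoding fix holds even when the nonce is empty.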
EpicWink (Laurie O)7 Using fixed-precision, standardised-timezone ISO-8601 strings means you can compare and sort these date-times without needing to do any parsing.
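A small sketch of that property, assuming the format ends up being whole-second UTC with a trailing Z (the format_expiry helper is a hypothetical name, not something from the PEP): plain string sorting gives the same order as sorting the parsed datetimes.

```python
from datetime import datetime, timezone

FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def format_expiry(moment: datetime) -> str:
    """Render an aware datetime as whole-second UTC with a trailing Z."""
    utc = moment.astimezone(timezone.utc).replace(microsecond=0)
    return utc.strftime(FORMAT)


stamps = [
    format_expiry(datetime(2025, 7, 1, 12, 0, 30, tzinfo=timezone.utc)),
    format_expiry(datetime(2025, 2, 14, 8, 5, 0, tzinfo=timezone.utc)),
    format_expiry(datetime(2025, 7, 1, 9, 59, 59, 999999, tzinfo=timezone.utc)),
]

# Sorting the strings directly...
by_text = sorted(stamps)
# ...matches sorting by the parsed values.
by_time = sorted(stamps, key=lambda s: datetime.strptime(s, FORMAT))
```

This only works because every string has the same width and the same zone; mixed offsets or optional fractional seconds would break the equivalence.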
Regarding the nonce and gentoken, what’s the rationale for making the token generation algorithm standard, rather than simply returning the token in the get/create response and saying the token’s value is opaque?
encukou (Petr Viktorin)8 ISO 8601 has cruft like week dates and fractional minutes, as in 2024-W33-4T09:59,5+0200. You probably want RFC 3339 instead of that.
But of course, integer seconds are even simpler.
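For a sense of how little a client would need to check under the stricter profile floated earlier in the thread (UTC, trailing Z, whole seconds), here is a rough validator. The function name and the exact pattern are assumptions for this sketch; the PEP does not define them.

```python
import re
from datetime import datetime

# Fixed-width shape: YYYY-MM-DDTHH:MM:SSZ, ASCII digits only.
_STRICT = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z")


def is_strict_utc_timestamp(text: str) -> bool:
    """Accept only whole-second UTC timestamps with a trailing Z."""
    if _STRICT.fullmatch(text) is None:
        return False
    try:
        # The regex checks the shape; strptime rejects impossible dates.
        datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return False
    return True
```

The week-date example above, fractional seconds, and numeric offsets such as +00:00 all fail the shape check, which is the reduction in malleability being asked for.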
Jost (Jost Migenda)10 In the context of a public server like PyPI, I agree with everything you say. However, this PEP is more broadly applicable to Python package indices; and in the context of an internal package index, I think many of your points no longer apply.
Let me give a bit more context: I’m working in the e-Research team at a university. Among other things, we’re building Trusted Research Environments (VMs used for working with e.g. sensitive medical data), which have restrictions on network access and package installation. If this PEP is accepted, we will likely use it in a local package index and write various custom scripts to interact with the API.
In this context, I do know exactly which time zone (and which data centre and rack) the server is located in, I know this time zone is not going to change and I do actually want to see the server local time.
Most of those custom scripts are going to be fairly simple; in many cases, they’ll probably just echo the server response verbatim[1]. Enforcing UTC in the protocol and requiring all these scripts to explicitly parse the datetime and perform time zone conversion would add a bit of unnecessary friction.
(And yes, this isn’t too much work in the grand scheme of things; so if I haven’t convinced you after this post, I’m happy to drop the topic.)
[1] The proposed API is nicely designed and mostly human readable, at least if those humans are sysadmins.
fungi (fungi)12 Let me give a bit more context: I’m working in the e-Research team at a university. Among other things, we’re building Trusted Research Environments (VMs used for working with e.g. sensitive medical data), which have restrictions on network access and package installation. If this PEP is accepted, we will likely use it in a local package index and write various custom scripts to interact with the API.
In this context, I do know exactly which time zone (and which data centre and rack) the server is located in, I know this time zone is not going to change and I do actually want to see the server local time.
Does server local time change in this context? That is to say, does the server exist in a locale that observes twice-yearly time changes? Taking my time zone as a reference, if I upload a package in February then the local time is UTC-0500, but if I’m downloading it in July my local time is UTC-0400. Which one is the relevant time to display? Does the server or client need to perform some conversion to “correct” it to now-local vs then-local time? Do you simply swap the offset indicator or do you add/subtract an hour on the timestamp too? Do you care about uploads that occur during the overlap of “fall back” hours where a later upload can have an earlier “local” timestamp?
Using UTC for the storage and transmission protocols simplifies this into a purely front-end implementation detail that doesn’t require standardizing.
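The “fall back” overlap can be shown with fixed offsets. This sketch assumes a US Eastern-style change on 2025-11-02 (UTC-0400 before 06:00 UTC, UTC-0500 after) and hard-codes both offsets rather than consulting a time zone database.

```python
from datetime import datetime, timedelta, timezone

BEFORE_CHANGE = timezone(timedelta(hours=-4))  # offset in effect before "fall back"
AFTER_CHANGE = timezone(timedelta(hours=-5))   # offset in effect after it

# Two uploads 40 minutes apart, straddling the 06:00 UTC change.
first_upload = datetime(2025, 11, 2, 5, 30, tzinfo=timezone.utc)
second_upload = datetime(2025, 11, 2, 6, 10, tzinfo=timezone.utc)

# Wall-clock renderings with the offset dropped, as a naive display would show.
first_local = first_upload.astimezone(BEFORE_CHANGE).strftime("%Y-%m-%d %H:%M")
second_local = second_upload.astimezone(AFTER_CHANGE).strftime("%Y-%m-%d %H:%M")

# In UTC the order is unambiguous; in local wall-clock text it is reversed.
utc_order_ok = first_upload < second_upload
local_order_reversed = first_local > second_local
```

Here first_local is 01:30 and second_local is 01:10 on the same date, so the later upload carries the earlier local timestamp.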
EpicWink (Laurie O)17 An alternative is to encode the field lengths as 64-bit integers using struct.pack rather than including their textual representation.
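A sketch of that variant, with the caveat that the field order, the big-endian byte order, and SHA-256 are assumptions chosen for illustration and the function names are hypothetical:

```python
import hashlib
import struct


def encode_fields(*fields: str) -> bytes:
    """Prefix each UTF-8 field with its length as an 8-byte big-endian integer."""
    parts = []
    for field in fields:
        data = field.encode("utf-8")
        parts.append(struct.pack(">Q", len(data)))  # unsigned 64-bit length
        parts.append(data)
    return b"".join(parts)


def packed_token(*fields: str) -> str:
    """Hash the length-prefixed encoding of the fields."""
    return hashlib.sha256(encode_fields(*fields)).hexdigest()
```

Because every length occupies exactly eight bytes, the decoder never has to scan for a terminator, and (foo, 11) and (foo1, 1) encode to different byte strings.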
mgorny (Michał Górny)20 Overall, it looked great, and the second iteration is even better. A few minor comments:
[…]
Servers MUST NOT advertise support for API versions beyond those defined in approved PEPs. Any new versions or formats require standardization through a new PEP.
This sounds like it belongs in (and is already stated in) the previous section.
If an error occurs, the appropriate 4xx code will be returned, as described in the Errors section.
Looks like the “Errors section” doesn’t describe error codes. (Also in a few other places.)
In either case, the server should include a Location header pointing back to the Publishing Session status URL, and if the server returned a 202 Accepted, the client may poll that URL to watch for the status to change.
I’m sorry for being picky but this sounds like the client “may” poll it only for a 202. Perhaps something like:
In either case, the server should include a Location header pointing back to the Publishing Session status URL that can be used to query the current status. If the server returned a 202 Accepted, polling it can be used to watch for the status to change.
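On the client side, that polling could look roughly like the loop below. This is only a sketch: fetch_status stands in for whatever HTTP call retrieves the status document from the Location URL, and the status key with a pending value is an assumption made for this example, not a quotation from the PEP.

```python
import time
from typing import Callable


def poll_status(
    fetch_status: Callable[[], dict],
    waiting_value: str = "pending",  # assumed name of the not-yet-done state
    interval: float = 1.0,
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Fetch the status document until its status leaves the waiting state."""
    for _ in range(max_attempts):
        document = fetch_status()
        if document.get("status") != waiting_value:
            return document
        sleep(interval)
    raise TimeoutError("status did not change within the allowed attempts")


# Usage with a canned sequence of responses instead of real HTTP calls.
responses = iter(
    [{"status": "pending"}, {"status": "pending"}, {"status": "published"}]
)
result = poll_status(lambda: next(responses), sleep=lambda _seconds: None)
```

The same loop works whether the original response was a 202 or not; a session that has already settled just returns on the first fetch.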
The server MAY allow parallel uploads of files, but is not required to.
Presumably also returning 409 Conflict if it doesn’t and you attempt a second parallel upload? I wonder if it should advertise the support for them somewhere up front, so clients wouldn’t have to determine-by-trying.