Important
This PEP is a historical document. The up-to-date, canonical spec, Index hosted attestations, is maintained on the PyPA specs page.
See the PyPA specification update process for how to propose changes.
Important
This PEP is a historical document. The up-to-date, canonical documentation can now be found at PyPI - Digital Attestations.
See PEP 1 for how to propose changes.
This PEP proposes a collection of changes related to the upload and distribution of digitally signed attestations and metadata used to verify them on a Python package repository, such as PyPI.
These changes have two subcomponents:
This PEP does not make a policy recommendation around mandatory digital attestations on release uploads or their subsequent verification by installing clients like `pip`.
Desire for digital signatures on Python packages has been repeatedly expressed by both package maintainers and downstream users:
This proposal seeks to accommodate each of the above use cases.
Additionally, this proposal identifies the following motivations:
Digital attestations impose additional sophistication requirements: the attacker must be sufficiently sophisticated to access private signing material (or signing identities).
This PEP proposes a generic attestation format, containing an attestation statement for signature generation, with the expectation that index providers adopt the format with a suitable source of identity for signature verification, such as Trusted Publishing.
This PEP identifies the following design considerations when evaluating both its own proposed changes and previous work in the same or adjacent areas of Python packaging:
This both simplifies some compatibility concerns (by avoiding the need to modify the distribution formats themselves) and also simplifies the behavior of potential installing clients (by allowing them to retrieve each attestation before its corresponding package without needing to do streaming decompression).
This both increases the overall quality of attestations uploaded to the index (preventing, for example, users from accidentally uploading incorrect or invalid attestations) and also enables UI and UX refinements on the index itself (such as a “provenance” view for each uploaded package).
For example, to prevent domain separation between a distribution’s name and its contents, this PEP uses ‘Statements’ from the in-toto project to bind the distribution’s contents (via SHA-256 digest) to its filename.
PyPI and other indices have historically supported PGP signatures on uploaded distributions. These could be supplied during upload, and could be retrieved by installing clients via the `data-gpg-sig` attribute in the PEP 503 API, the `gpg-sig` key on the PEP 691 API, or via an adjacent `.asc`-suffixed URL.
PGP signature uploads have been disabled on PyPI since May 2023, after an investigation determined that the majority of signatures (which, themselves, constituted a tiny percentage of overall uploads) could not be associated with a public key or otherwise meaningfully verified.
In their previously supported form on PyPI, PGP signatures satisfied considerations (1) and (3) above but not (2) (owing to the need for external keyservers and key distribution) or (4) (due to PGP signatures typically being constructed over just an input file, without any associated signed metadata).
PEP 427 (and its living PyPA counterpart) specify the wheel format.
This format includes accommodations for digital signatures embedded directly into the wheel, in either JWS or S/MIME format. These signatures are specified over a PEP 376 RECORD, which is modified to include a cryptographic digest for each recorded file in the wheel.
While wheel signatures are fully specified, they do not appear to be broadly used; the official `wheel` tooling deprecated signature generation and verification support in 0.32.0, which was released in 2018.
Additionally, wheel signatures do not satisfy any of the above considerations (due to the “attached” nature of the signatures, non-verifiability on the index itself, and support for wheels only).
The current upload API is not standardized. However, we propose the following changes to it:
- In addition to the current top-level `content` and `gpg_signature` fields, the index SHALL accept `attestations` as an additional multipart form field.
- The new `attestations` field SHALL be a JSON array.
- The `attestations` array SHALL have one or more items, each a JSON object representing an individual attestation.
- If the index fails to verify any attestation in `attestations`, it MUST reject the upload.

The format of attestation objects is defined under Attestation objects and the process for verifying attestations is defined under Attestation verification.

The following changes are made to the simple repository API:

- The location of the provenance file is signaled by the index via the `data-provenance` attribute.
- When a provenance file is present, the index MAY include a `data-provenance` attribute on its file link. The value of the `data-provenance` attribute SHALL be a fully qualified URL, signaling that the file’s provenance can be found at that URL. This URL MUST represent a secure origin.

The following table provides examples of release file URLs, `data-provenance` values, and their resulting provenance file URLs.
| File URL | data-provenance | Provenance URL |
|---|---|---|
| https://example.com/sampleproject-1.2.3.tar.gz | https://example.com/sampleproject-1.2.3.tar.gz.provenance | https://example.com/sampleproject-1.2.3.tar.gz.provenance |
| https://example.com/sampleproject-1.2.3.tar.gz | https://other.example.com/sampleproject-1.2.3.tar.gz/provenance | https://other.example.com/sampleproject-1.2.3.tar.gz/provenance |
| https://example.com/sampleproject-1.2.3.tar.gz | ../relative | (invalid: not a fully qualified URL) |
| https://example.com/sampleproject-1.2.3.tar.gz | http://unencrypted.example.com/provenance | (invalid: not a secure origin) |
See Changes to provenance objects for an additional discussion of reasons why a file’s provenance may change.
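The `data-provenance` rules illustrated in the table above can be checked mechanically. The following is a minimal sketch and not part of this PEP: the function name `is_valid_data_provenance` is hypothetical, and the sketch assumes that only `https` URLs count as a secure origin, which is narrower than the general definition of that term.

```python
from urllib.parse import urlsplit


def is_valid_data_provenance(value: str) -> bool:
    """Return True if `value` is a fully qualified URL on a secure origin."""
    try:
        parts = urlsplit(value)
    except ValueError:
        # urlsplit rejects some malformed inputs (e.g. a broken IPv6 host).
        return False
    # A fully qualified URL needs both a scheme and a host.
    if not parts.scheme or not parts.netloc:
        return False
    # Assumption: only HTTPS is treated as a secure origin in this sketch.
    return parts.scheme == "https"


# The data-provenance values from the table above.
for candidate in (
    "https://example.com/sampleproject-1.2.3.tar.gz.provenance",
    "https://other.example.com/sampleproject-1.2.3.tar.gz/provenance",
    "../relative",
    "http://unencrypted.example.com/provenance",
):
    print(candidate, is_valid_data_provenance(candidate))
```

A client applying this check would ignore any `data-provenance` value that fails it rather than resolving it relative to the file URL.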
The following changes are made to the JSON simple API:

- When an uploaded file has one or more attestations, the index MAY include a `provenance` key in the `file` dictionary for that file. The value of the `provenance` key SHALL be either a JSON string or `null`. If `provenance` is not `null`, it SHALL be a URL to the associated provenance file.

See Appendix 3: Simple JSON API size considerations for an explanation of the technical decision to embed the SHA-256 digest in the JSON API, rather than the full provenance object.
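As an illustration of how a client might consume the `provenance` key, the sketch below collects provenance URLs from a project page. It is not part of this PEP: the function name `provenance_urls` and the sample response are hypothetical, and the `filename` key is assumed to be present in each file dictionary as in the PEP 691 layout.

```python
import json


def provenance_urls(project_page: dict) -> dict[str, str]:
    """Map each filename to its provenance URL, skipping files without one."""
    urls = {}
    for file in project_page.get("files", []):
        # The key may be absent (index without attestation support)
        # or null (no provenance for this file).
        provenance = file.get("provenance")
        if isinstance(provenance, str):
            urls[file["filename"]] = provenance
    return urls


# Hypothetical response body, trimmed to the fields used here.
page = json.loads("""
{
  "meta": {"api-version": "1.3"},
  "name": "sampleproject",
  "files": [
    {
      "filename": "sampleproject-1.2.3.tar.gz",
      "url": "https://example.com/sampleproject-1.2.3.tar.gz",
      "hashes": {},
      "provenance": "https://example.com/sampleproject-1.2.3.tar.gz.provenance"
    },
    {
      "filename": "sampleproject-1.2.2.tar.gz",
      "url": "https://example.com/sampleproject-1.2.2.tar.gz",
      "hashes": {},
      "provenance": null
    }
  ]
}
""")

print(provenance_urls(page))
```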
These changes require a version change to the JSON API:
- The `api-version` SHALL specify version 1.3 or later.

An attestation object is a JSON object with several required keys; applications or signers may include additional keys so long as all explicitly listed keys are provided. The required layout of an attestation object is provided as pseudocode below.
    from __future__ import annotations

    from dataclasses import dataclass
    from typing import Literal


    @dataclass
    class Attestation:
        version: Literal[1]
        """
        The attestation object's version, which is always 1.
        """

        verification_material: VerificationMaterial
        """
        Cryptographic materials used to verify `envelope`.
        """

        envelope: Envelope
        """
        The enveloped attestation statement and signature.
        """


    @dataclass
    class Envelope:
        statement: bytes
        """
        The attestation statement.

        This is represented as opaque bytes on the wire (encoded as base64),
        but it MUST be a JSON in-toto v1 Statement.
        """

        signature: bytes
        """
        A signature for the above statement, encoded as base64.
        """


    @dataclass
    class VerificationMaterial:
        certificate: str
        """
        The signing certificate, as `base64(DER(cert))`.
        """

        transparency_entries: list[object]
        """
        One or more transparency log entries for this attestation's signature
        and certificate.
        """
A full data model for each object in `transparency_entries` is provided in Appendix 2: Data models for Transparency Log Entries. Attestation objects SHOULD include one or more transparency log entries, and MAY include additional keys for other sources of signed time (such as an RFC 3161 Time Stamping Authority or a Roughtime server).
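To make the wire layout concrete, the following sketch parses an attestation object and decodes its base64 fields. It performs structural checks only and no cryptographic verification. The function name `decode_attestation`, the returned dictionary shape, and the placeholder values are hypothetical, and the sketch assumes the standard (not URL-safe) base64 alphabet.

```python
import base64
import json

REQUIRED_KEYS = ("version", "verification_material", "envelope")


def decode_attestation(raw: str) -> dict:
    """Parse an attestation object and decode its base64-encoded fields."""
    obj = json.loads(raw)
    missing = [key for key in REQUIRED_KEYS if key not in obj]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    if obj["version"] != 1:
        raise ValueError(f"unsupported attestation version: {obj['version']!r}")

    envelope = obj["envelope"]
    material = obj["verification_material"]
    return {
        # Assumption: standard base64 alphabet; validate=True rejects stray characters.
        "statement": base64.b64decode(envelope["statement"], validate=True),
        "signature": base64.b64decode(envelope["signature"], validate=True),
        "certificate_der": base64.b64decode(material["certificate"], validate=True),
        "transparency_entries": material.get("transparency_entries", []),
    }


def _b64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")


# Placeholder values only; a real attestation carries a DER certificate
# and a real signature.
sample = json.dumps({
    "version": 1,
    "verification_material": {
        "certificate": _b64(b"placeholder DER bytes"),
        "transparency_entries": [{}],
    },
    "envelope": {
        "statement": _b64(b'{"_type": "https://in-toto.io/Statement/v1"}'),
        "signature": _b64(b"placeholder signature"),
    },
})

decoded = decode_attestation(sample)
print(decoded["statement"])
```

Unknown extra keys are ignored rather than rejected, matching the allowance for additional keys described above.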
Attestation objects are versioned; this PEP specifies version 1. Each version is tied to a single cryptographic suite to minimize unnecessary cryptographic agility. In version 1, the suite is as follows:
Future PEPs may change this suite (and the overall shape of the attestation object) by selecting a new version number.
The attestation statement is the actual claim that is cryptographically signed over within the attestation object (i.e., the `envelope.statement`).
The attestation statement is encoded as a v1 in-toto Statement object, in JSON form. When serialized the statement is treated as an opaque binary blob, avoiding the need for canonicalization. An example JSON-encoded statement is provided in Appendix 4: Example attestation statement.
In addition to being a v1 in-toto Statement, the attestation statement is constrained in the following ways:
- The in-toto `subject` MUST contain only a single subject.
- `subject[0].name` is the distribution’s filename, which MUST be a valid source distribution or wheel distribution filename.
- `subject[0].digest` MUST contain a SHA-256 digest. Other digests MAY be present. The digests MUST be represented as hexadecimal strings.
- The following `predicateType` values are supported:
  - `https://slsa.dev/provenance/v1`
  - `https://docs.pypi.org/attestations/publish/v1`

The signature over this statement is constructed using the v1 DSSE signature protocol, with a `PAYLOAD_TYPE` of `application/vnd.in-toto+json` and a `PAYLOAD_BODY` of the JSON-encoded statement above. No other `PAYLOAD_TYPE` is permitted.
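In the v1 DSSE protocol, the signature is computed over a "pre-authentication encoding" (PAE) of the payload type and body, not over the bare statement. The sketch below shows that encoding as the DSSE specification defines it. The function name `dsse_pae` is hypothetical, and the signing step itself is left out.

```python
PAYLOAD_TYPE = "application/vnd.in-toto+json"


def dsse_pae(payload_type: str, payload_body: bytes) -> bytes:
    """Return the DSSE v1 pre-authentication encoding of a payload.

    Layout: "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body,
    where LEN is the byte length written as ASCII decimal.
    """
    type_bytes = payload_type.encode("utf-8")
    return b" ".join([
        b"DSSEv1",
        str(len(type_bytes)).encode("ascii"),
        type_bytes,
        str(len(payload_body)).encode("ascii"),
        payload_body,
    ])


# The statement bytes are used exactly as serialized; no canonicalization.
statement = b'{"_type":"https://in-toto.io/Statement/v1"}'
to_be_signed = dsse_pae(PAYLOAD_TYPE, statement)
print(to_be_signed)
```

A verifier rebuilds the same bytes from `envelope.statement` and checks `envelope.signature` against them.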
The index will serve uploaded attestations along with metadata that can assist in verifying them in the form of JSON serialized objects.
These provenance objects will be available via both the Simple Index and JSON-based Simple API as described above, and will have the following layout:

    {
      "version": 1,
      "attestation_bundles": [
        {
          "publisher": {
            "kind": "important-ci-service",
            "claims": {},
            "vendor-property": "foo",
            "another-property": 123
          },
          "attestations": [
            { /* attestation 1 ... */ },
            { /* attestation 2 ... */ }
          ]
        }
      ]
    }

or, as pseudocode:
    from __future__ import annotations

    from dataclasses import dataclass
    from typing import Literal


    @dataclass
    class Publisher:
        kind: str
        """
        The kind of Trusted Publisher.
        """

        claims: object | None
        """
        Any context-specific claims retained by the index during Trusted Publisher
        authentication.
        """

        _rest: object
        """
        Each publisher object is open-ended, meaning that it MAY contain additional
        fields beyond the ones specified explicitly above. This field signals that,
        but is not itself present.
        """


    @dataclass
    class AttestationBundle:
        publisher: Publisher
        """
        The publisher associated with this set of attestations.
        """

        # `Attestation` is the attestation object model given earlier.
        attestations: list[Attestation]
        """
        The set of attestations included in this bundle.
        """


    @dataclass
    class Provenance:
        version: Literal[1]
        """
        The provenance object's version, which is always 1.
        """

        attestation_bundles: list[AttestationBundle]
        """
        One or more attestation "bundles".
        """
- `version` is `1`. Like attestation objects, provenance objects are versioned, and this PEP only defines version `1`.
- `attestation_bundles` is a required JSON array, containing one or more “bundles” of attestations. Each bundle corresponds to a signing identity (such as a Trusted Publishing identity), and contains one or more attestation objects.

As noted in the `Publisher` model, each `AttestationBundle.publisher` object is specific to its Trusted Publisher but must include at minimum:

- A `kind` key, which MUST be a JSON string that uniquely identifies the kind of Trusted Publisher.
- A `claims` key, which MUST be a JSON object containing any context-specific claims retained by the index during Trusted Publisher authentication.

All other keys in the publisher object are publisher-specific. A full illustrative example of a publisher object is provided in Appendix 1: Example Trusted Publisher Representation.
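A client walking a provenance object might apply the structural requirements above as in the following sketch. The function name `iter_attestations` is hypothetical. Because the pseudocode model types `claims` as `object | None`, the sketch accepts either a JSON object or `null`, which is a lenient reading and an assumption of this example.

```python
def iter_attestations(provenance: dict):
    """Yield (publisher_kind, attestation) pairs from a provenance object."""
    if provenance.get("version") != 1:
        raise ValueError("unsupported provenance version")
    bundles = provenance.get("attestation_bundles")
    if not isinstance(bundles, list) or not bundles:
        raise ValueError("attestation_bundles must be a non-empty array")

    for bundle in bundles:
        publisher = bundle["publisher"]
        if not isinstance(publisher.get("kind"), str):
            raise ValueError("publisher.kind must be a string")
        # Assumption: the key must be present; null is tolerated.
        if "claims" not in publisher or not isinstance(
            publisher["claims"], (dict, type(None))
        ):
            raise ValueError("publisher.claims must be an object")
        # Any other publisher keys are publisher-specific and ignored here.
        for attestation in bundle["attestations"]:
            yield publisher["kind"], attestation


provenance = {
    "version": 1,
    "attestation_bundles": [
        {
            "publisher": {
                "kind": "important-ci-service",
                "claims": {},
                "vendor-property": "foo",
            },
            "attestations": [{"version": 1}, {"version": 1}],
        }
    ],
}

pairs = list(iter_attestations(provenance))
print(len(pairs), pairs[0][0])
```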
Each array of attestation objects is a superset of the `attestations` array supplied by the uploader through the `attestations` field at upload time, as described in Upload endpoint changes and Changes to provenance objects.
Provenance objects are not immutable, and may change over time. Reasons for changes to the provenance object include but are not limited to:
Verifying an attestation object against a distribution file requires verification of each of the following:

- `version` is `1`. The verifier MUST reject any other version.
- `verification_material.certificate` is a valid signing certificate, as issued by an a priori trusted authority (such as a root of trust already present within the verifying client).
- `verification_material.certificate` identifies an appropriate signing subject, such as the machine identity of the Trusted Publisher that published the package.
- `envelope.statement` is a valid in-toto v1 Statement, with a subject and digest that MUST match the distribution’s filename and contents. For the distribution’s filename, matching MUST be performed by parsing using the appropriate source distribution or wheel filename format, as the statement’s subject may be equivalent but normalized.
- `envelope.signature` is a valid signature for `envelope.statement` corresponding to `verification_material.certificate`, as reconstituted via the v1 DSSE signature protocol.

In addition to the above required steps, a verifier MAY additionally verify `verification_material.transparency_entries` on a policy basis, e.g. requiring at least one transparency log entry or a threshold of entries. When verifying transparency entries, the verifier MUST confirm that the inclusion time for each entry lies within the signing certificate’s validity period.
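Two of the checks above need no cryptographic library and can be sketched directly: comparing the statement's SHA-256 digest against the distribution's contents, and confirming that a transparency entry's inclusion time falls within the certificate's validity period. The function names are hypothetical, and the validity bounds are assumed to be timezone-aware `datetime` values already extracted from the certificate. Certificate validation, signature verification, and filename matching are deliberately left out.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone


def digest_matches(statement: bytes, dist_contents: bytes) -> bool:
    """Compare the statement's SHA-256 subject digest with the file contents."""
    subject = json.loads(statement)["subject"][0]
    expected = subject["digest"]["sha256"]
    actual = hashlib.sha256(dist_contents).hexdigest()
    # Hex digests are compared case-insensitively.
    return expected.lower() == actual


def inclusion_time_ok(
    integrated_time: int, not_before: datetime, not_after: datetime
) -> bool:
    """Check that a log entry's UNIX inclusion time is within the validity period."""
    included_at = datetime.fromtimestamp(integrated_time, tz=timezone.utc)
    return not_before <= included_at <= not_after


# Illustrative inputs: an empty file and a ten-minute validity window.
statement = json.dumps({
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [{
        "name": "sampleproject-1.2.3.tar.gz",
        "digest": {"sha256": hashlib.sha256(b"").hexdigest()},
    }],
}).encode("utf-8")

not_before = datetime(2024, 1, 1, tzinfo=timezone.utc)
not_after = not_before + timedelta(minutes=10)
integrated_time = int(not_before.timestamp()) + 60

print(digest_matches(statement, b""))
print(inclusion_time_ok(integrated_time, not_before, not_after))
```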
This PEP is primarily “mechanical” in nature; it provides layouts for structuring and serving verifiable digital attestations without specifying higher level security “policies” around attestation validity, thresholds between attestations, and so forth.
Algorithmic agility is a common source of exploitable vulnerabilities in cryptographic schemes. This PEP limits algorithmic agility in two ways:
This PEP does not increase (or decrease) trust in the index itself: the index is still effectively trusted to honestly deliver unmodified package distributions, since a dishonest index capable of modifying package contents could also dishonestly modify or omit package attestations. As a result, this PEP’s presumption of index trust is equivalent to the unstated presumption with earlier mechanisms, like PGP and wheel signatures.
This PEP does not preclude or exclude future index trust mechanisms, such as PEP 458 and/or PEP 480.
This PEP recommends, but does not mandate, that attestation objects contain one or more verifiable sources of signed time that corroborate the signing certificate’s claimed validity period. Indices that implement this PEP may choose to strictly enforce this requirement.
This appendix provides a fictional example of a `publisher` key within a simple JSON API `project.files[].provenance` listing:

    "publisher": {
      "kind": "GitHub",
      "claims": {
        "ref": "refs/tags/v1.0.0",
        "sha": "da39a3ee5e6b4b0d3255bfef95601890afd80709"
      },
      "repository_name": "HolyGrail",
      "repository_owner": "octocat",
      "repository_owner_id": "1",
      "workflow_filename": "publish.yml",
      "environment": null
    }

This appendix contains pseudocoded data models for transparency log entries in attestation objects. Each transparency log entry serves as a source of signed inclusion time, and can be verified either online or offline.
    from __future__ import annotations

    from dataclasses import dataclass


    @dataclass
    class TransparencyLogEntry:
        log_index: int
        """
        The global index of the log entry, used when querying the log.
        """

        log_id: str
        """
        An opaque, unique identifier for the log.
        """

        entry_kind: str
        """
        The kind (type) of log entry.
        """

        entry_version: str
        """
        The version of the log entry's submitted format.
        """

        integrated_time: int
        """
        The UNIX timestamp from the log from when the entry was persisted.
        """

        inclusion_proof: InclusionProof
        """
        The actual inclusion proof of the log entry.
        """


    @dataclass
    class InclusionProof:
        log_index: int
        """
        The index of the entry in the tree it was written to.
        """

        root_hash: str
        """
        The digest stored at the root of the Merkle tree at the time of proof
        generation.
        """

        tree_size: int
        """
        The size of the Merkle tree at the time of proof generation.
        """

        hashes: list[str]
        """
        A list of hashes required to complete the inclusion proof, sorted
        in order from leaf to root. The leaf and root hashes are not themselves
        included in this list; the root is supplied via `root_hash` and the client
        must calculate the leaf hash.
        """

        checkpoint: str
        """
        The signed tree head's signature, at the time of proof generation.
        """

        cosigned_checkpoints: list[str]
        """
        Cosigned checkpoints from zero or more log witnesses.
        """
A previous draft of this PEP required embedding each provenance object directly into its appropriate part of the JSON Simple API.
The current version of this PEP embeds the SHA-256 digest of the provenance object instead. This is done for size and network bandwidth reasons:
These numbers are significantly worse in “pathological” cases, where projects have hundreds or thousands of releases and/or dozens of files per release.
Given a source distribution `sampleproject-1.2.3.tar.gz` with a SHA-256 digest of `e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855`, the following is an appropriate in-toto Statement, as a JSON object:

    {
      "_type": "https://in-toto.io/Statement/v1",
      "subject": [
        {
          "name": "sampleproject-1.2.3.tar.gz",
          "digest": {
            "sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
          }
        }
      ],
      "predicateType": "https://some-arbitrary-predicate.example.com/v1",
      "predicate": {
        "something-else": "foo"
      }
    }
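A statement of this shape can be produced from a distribution's filename and contents, as in the sketch below. The function name `build_statement` is hypothetical. The digest in the example above is the SHA-256 of empty input, so the sketch reproduces it by using empty contents in place of a real archive.

```python
import hashlib
import json


def build_statement(
    filename: str, contents: bytes, predicate_type: str, predicate: dict
) -> bytes:
    """Serialize an in-toto v1 Statement binding a filename to its SHA-256 digest."""
    statement = {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [
            {
                "name": filename,
                "digest": {"sha256": hashlib.sha256(contents).hexdigest()},
            }
        ],
        "predicateType": predicate_type,
        "predicate": predicate,
    }
    # The serialized bytes are what gets signed; they are never re-canonicalized.
    return json.dumps(statement).encode("utf-8")


# Empty contents stand in for a real archive here.
statement_bytes = build_statement(
    "sampleproject-1.2.3.tar.gz",
    b"",
    "https://some-arbitrary-predicate.example.com/v1",
    {"something-else": "foo"},
)
print(statement_bytes.decode("utf-8"))
```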
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0740.rst
Last modified: 2024-12-03 18:16:41 GMT