RFC 9709 | Encryption Key Derivation in CMS | January 2025 |
Housley | Standards Track |
This document specifies the derivation of the content-encryption key or the content-authenticated-encryption key in the Cryptographic Message Syntax (CMS) using the HMAC-based Extract-and-Expand Key Derivation Function (HKDF) with SHA-256. The use of this mechanism provides protection against an attacker that manipulates the content-encryption algorithm identifier or the content-authenticated-encryption algorithm identifier.
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9709.
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.
This document specifies the derivation of the content-encryption key for the Cryptographic Message Syntax (CMS) enveloped-data content type [RFC5652], the content-encryption key for the CMS encrypted-data content type [RFC5652], or the content-authenticated-encryption key for the authenticated-enveloped-data content type [RFC5083].
The use of this mechanism provides protection against an attacker that manipulates the content-encryption algorithm identifier or the content-authenticated-encryption algorithm identifier. Johannes Roth and Falko Strenzke presented such an attack at IETF 118 [RS2023], where:
1. The attacker intercepts a CMS authenticated-enveloped-data content [RFC5083] that uses either AES-CCM or AES-GCM [RFC5084].
2. The attacker turns the intercepted content into a "garbage" CMS enveloped-data content (Section 6 of [RFC5652]) that is composed of AES-CBC guess blocks.
3. The attacker sends the "garbage" message to the victim, and the victim reveals the result of the decryption to the attacker.
4. If any of the transformed plaintext blocks match the guess for that block, then the attacker learns the plaintext for that block.
5. With highly structured messages, one block can reveal the only sensitive part of the original message.
This attack is thwarted if the encryption key depends upon the delivery of the unmodified algorithm identifier.
The mitigation for this attack has three parts:
Perform encryption with a derived content-encryption key or content-authenticated-encryption key:
   CEK' = HKDF(CEK, AlgorithmIdentifier)
CMS values are generated using ASN.1 [X680], using the Basic Encoding Rules (BER) and the Distinguished Encoding Rules (DER) [X690].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
There is no provision for key derivation functions other than HKDF, and there is no provision for hash functions other than SHA-256. If there is ever a need to support another key derivation function or another hash function, it will be very straightforward to assign a new object identifier. At this point, keeping the design very simple seems most important.
The mitigation uses the HMAC-based Extract-and-Expand Key Derivation Function (HKDF) [RFC5869] to derive output keying material (OKM) from input keying material (IKM). HKDF is used with the SHA-256 hash function [FIPS180]. The derivation includes the DER-encoded AlgorithmIdentifier as the optional info input value. The encoded value includes the ASN.1 tag for SEQUENCE (0x30), the length, and the value. This AlgorithmIdentifier is carried as the parameter to the id-alg-cek-hkdf-sha256 algorithm identifier. If an attacker were to change the originator-provided AlgorithmIdentifier, then the recipient will derive a different content-encryption key or content-authenticated-encryption key.
The CMS_CEK_HKDF_SHA256 function uses the HKDF-Extract and HKDF-Expand functions to derive the OKM from the IKM:
The output OKM is calculated as follows:
   OKM_SIZE = len(IKM)  /* length in octets */
   IF OKM_SIZE > 8160 THEN raise error

   salt = "The Cryptographic Message Syntax"
   PRK = HKDF-Extract(salt, IKM)

   OKM = HKDF-Expand(PRK, info, OKM_SIZE)
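The calculation above can be sketched in Python using only the standard library. This is a minimal illustration, not a reference implementation: the function names (hkdf_extract, hkdf_expand, cms_cek_hkdf_sha256) are chosen here for readability and do not come from this document, and the HKDF steps follow the definitions in [RFC5869]. The 8160-octet limit is 255 times the 32-octet SHA-256 output length.

```python
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size  # 32 octets for SHA-256
SALT = b"The Cryptographic Message Syntax"


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC-SHA256(salt, IKM)."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): concatenate T(1), T(2), ... and truncate."""
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(n) = HMAC-SHA256(PRK, T(n-1) || info || n), with T(0) empty.
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def cms_cek_hkdf_sha256(ikm: bytes, info: bytes) -> bytes:
    """Derive OKM with the same length as IKM.

    info is the DER-encoded AlgorithmIdentifier, including the
    SEQUENCE tag and length octets.
    """
    okm_size = len(ikm)  # length in octets
    if okm_size > 255 * HASH_LEN:  # 8160 octets
        raise ValueError("IKM is too long for HKDF with SHA-256")
    prk = hkdf_extract(SALT, ikm)
    return hkdf_expand(prk, info, okm_size)
```

Because the output length always equals the input length, a 16-octet AES-128 key yields a 16-octet derived key, and any change to the info octets yields an unrelated key.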
The id-alg-cek-hkdf-sha256 algorithm identifier indicates that the CMS_CEK_HKDF_SHA256 function defined in Section 2 is used to derive the content-encryption key or the content-authenticated-encryption key.
The following object identifier identifies the id-alg-cek-hkdf-sha256 algorithm:
   id-alg-cek-hkdf-sha256 OBJECT IDENTIFIER ::= { iso(1) member-body(2)
       us(840) rsadsi(113549) pkcs(1) pkcs-9(9) id-smime(16) alg(3) 31 }
The id-alg-cek-hkdf-sha256 parameters field has an ASN.1 type of AlgorithmIdentifier.
Using the conventions from [RFC5911], the id-alg-cek-hkdf-sha256 algorithm identifier is defined as:
   ContentEncryptionAlgorithmIdentifier ::=
       AlgorithmIdentifier{CONTENT-ENCRYPTION, { ... } }

   cea-CEKHKDFSHA256 CONTENT-ENCRYPTION ::= {
       IDENTIFIER id-alg-cek-hkdf-sha256
       PARAMS TYPE ContentEncryptionAlgorithmIdentifier ARE required
       SMIME-CAPS { IDENTIFIED BY id-alg-cek-hkdf-sha256 } }
The SMIMECapabilities attribute is defined in Section 2.5.2 of [RFC8551]. An S/MIME client announces the set of cryptographic functions it supports using the SMIMECapabilities attribute.
If an S/MIME client supports the mechanism in this document, the id-alg-cek-hkdf-sha256 object identifier SHOULD be included in the set of cryptographic functions. The parameter with this encoding MUST be absent.
The encoding for id-alg-cek-hkdf-sha256, in hexadecimal, is:
   30 0d 06 0b 2a 86 48 86 f7 0d 01 09 10 03 1f
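The hexadecimal encoding above can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the helper names (encode_base128, encode_oid, encode_sequence) are illustrative and not part of this document, and only short-form DER lengths (fewer than 128 content octets) are handled, which is enough for this capability.

```python
def encode_base128(value: int) -> bytes:
    """Encode one OID arc in base 128; all but the last octet set bit 8."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))


def encode_oid(dotted: str) -> bytes:
    """DER-encode an OBJECT IDENTIFIER given in dotted-decimal form."""
    arcs = [int(arc) for arc in dotted.split(".")]
    # The first two arcs are combined into a single value: 40 * X + Y.
    body = encode_base128(40 * arcs[0] + arcs[1])
    for arc in arcs[2:]:
        body += encode_base128(arc)
    if len(body) > 127:
        raise ValueError("sketch handles short-form lengths only")
    return bytes([0x06, len(body)]) + body


def encode_sequence(content: bytes) -> bytes:
    """Wrap already-encoded content in a DER SEQUENCE."""
    if len(content) > 127:
        raise ValueError("sketch handles short-form lengths only")
    return bytes([0x30, len(content)]) + content


ID_ALG_CEK_HKDF_SHA256 = "1.2.840.113549.1.9.16.3.31"

# SMIMECapability with the parameters field absent.
capability = encode_sequence(encode_oid(ID_ALG_CEK_HKDF_SHA256))
print(capability.hex(" "))  # 30 0d 06 0b 2a 86 48 86 f7 0d 01 09 10 03 1f
```

The SEQUENCE contains only the object identifier, which reflects the requirement that the parameter be absent in the capability.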
This section describes the originator and recipient processing to implement this mitigation for each of the CMS encrypting content types.
The fourth step of constructing an enveloped-data content type is repeated below from Section 6 of [RFC5652]:
To implement this mitigation, the originator expands this step as follows:
Derive the new content-encryption key (CEK') from the original content-encryption key (CEK) and the ContentEncryptionAlgorithmIdentifier, which is carried in the contentEncryptionAlgorithm.parameters field:
   CEK' = CMS_CEK_HKDF_SHA256(CEK, ContentEncryptionAlgorithmIdentifier)
The presence of the id-alg-cek-hkdf-sha256 algorithm identifier in the contentEncryptionAlgorithm.algorithm field of the EncryptedContentInfo structure tells the recipient to derive the new content-encryption key (CEK') as shown above, and then use it for decryption of the EncryptedContent. If the id-alg-cek-hkdf-sha256 algorithm identifier is not present in the contentEncryptionAlgorithm.algorithm field of the EncryptedContentInfo structure, then the recipient uses the original content-encryption key (CEK) for decryption of the EncryptedContent.
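The recipient's decision described above can be sketched as follows. The function names (wrap_algorithm_identifier, select_algorithm) are hypothetical and chosen for illustration only. The sketch assumes short-form DER lengths and compares raw octets instead of using a full ASN.1 parser, so it is not suitable for production parsing. The AES-128-GCM AlgorithmIdentifier used in the example is the DER encoding given in the first test vector of this document.

```python
from typing import Tuple

# DER encoding (tag, length, value) of the id-alg-cek-hkdf-sha256 OID.
CEK_HKDF_OID = bytes.fromhex("060b2a864886f70d010910031f")


def wrap_algorithm_identifier(inner: bytes) -> bytes:
    """Originator: carry the real AlgorithmIdentifier as the parameters."""
    content = CEK_HKDF_OID + inner
    if len(content) > 127:
        raise ValueError("sketch handles short-form DER lengths only")
    return bytes([0x30, len(content)]) + content


def select_algorithm(content_encryption_alg: bytes) -> Tuple[bool, bytes]:
    """Recipient: return (derive CEK'?, AlgorithmIdentifier for decryption)."""
    if len(content_encryption_alg) < 2 or content_encryption_alg[0] != 0x30:
        raise ValueError("expected a DER SEQUENCE")
    length = content_encryption_alg[1]
    if length > 127 or len(content_encryption_alg) != 2 + length:
        raise ValueError("sketch handles short-form DER lengths only")
    body = content_encryption_alg[2:]
    if body.startswith(CEK_HKDF_OID):
        # Mitigation in use: the parameters field holds the real
        # AlgorithmIdentifier, whose DER encoding is the HKDF info input.
        return True, body[len(CEK_HKDF_OID):]
    # Mitigation not in use: decrypt with the original CEK.
    return False, content_encryption_alg


aes_gcm = bytes.fromhex(
    "301b0609608648016503040106300e040c5c79058ba2f43447639d29e2")
wrapped = wrap_algorithm_identifier(aes_gcm)
derive, alg_id = select_algorithm(wrapped)
print(derive, alg_id == aes_gcm)
```

When the first returned value is true, the recipient passes the returned AlgorithmIdentifier octets to CMS_CEK_HKDF_SHA256 as the info input and decrypts with the resulting CEK'.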
As specified in Section 8 of [RFC5652], the content-encryption key is managed by other means.
To implement this mitigation, the originator performs the following:
Derive the new content-encryption key (CEK') from the original content-encryption key (CEK) and the ContentEncryptionAlgorithmIdentifier, which is carried in the contentEncryptionAlgorithm.parameters field:
   CEK' = CMS_CEK_HKDF_SHA256(CEK, ContentEncryptionAlgorithmIdentifier)
The presence of the id-alg-cek-hkdf-sha256 algorithm identifier in the contentEncryptionAlgorithm.algorithm field of the EncryptedContentInfo structure tells the recipient to derive the new content-encryption key (CEK') as shown above, and then use it for decryption of the EncryptedContent. If the id-alg-cek-hkdf-sha256 algorithm identifier is not present in the contentEncryptionAlgorithm.algorithm field of the EncryptedContentInfo structure, then the recipient uses the original content-encryption key (CEK) for decryption of the EncryptedContent.
The fifth step of constructing an authenticated-enveloped-data content type is repeated below from Section 2 of [RFC5083]:
- The attributes collected in step 4 are authenticated and the CMS content is authenticated and encrypted with the content-authenticated-encryption key. If the authenticated encryption algorithm requires either the additional authenticated data (AAD) or the content to be padded to a multiple of some block size, then the padding is added as described in Section 6.3 of [CMS].
Note that [CMS] refers to [RFC3852], which has been obsoleted by [RFC5652], but the text in Section 6.3 was unchanged in RFC 5652.
To implement this mitigation, the originator expands this step as follows:
Derive the new content-authenticated-encryption key (CEK') from the original content-authenticated-encryption key (CEK) and the ContentEncryptionAlgorithmIdentifier:
   CEK' = CMS_CEK_HKDF_SHA256(CEK, ContentEncryptionAlgorithmIdentifier)
The presence of the id-alg-cek-hkdf-sha256 algorithm identifier in the contentEncryptionAlgorithm.algorithm field of the EncryptedContentInfo structure tells the recipient to derive the new content-authenticated-encryption key (CEK') as shown above, and then use it for authenticated decryption of the EncryptedContent and the authentication of the AAD. If the id-alg-cek-hkdf-sha256 algorithm identifier is not present in the contentEncryptionAlgorithm.algorithm field of the EncryptedContentInfo structure, then the recipient uses the original content-authenticated-encryption key (CEK) for decryption and authentication of the EncryptedContent and the authentication of the AAD.
This mitigation always uses HKDF with SHA-256. One KDF algorithm was selected to avoid the need for negotiation. In the future, if a weakness is found in the KDF algorithm, a new attribute will need to be assigned for use with an alternative KDF algorithm.
If the attacker removes the id-alg-cek-hkdf-sha256 object identifier from the contentEncryptionAlgorithm.algorithm field of the EncryptedContentInfo structure prior to delivery to the recipient, then the recipient will not attempt to derive CEK', which will deny the recipient access to the content but will not assist the attacker in recovering the plaintext content.
If the attacker changes the contentEncryptionAlgorithm.parameters field of the EncryptedContentInfo structure prior to delivery to the recipient, then the recipient will derive a different CEK', which will not assist the attacker in recovering the plaintext content. Providing the object identifier as an input to the key derivation function is sufficient to mitigate the attack described in [RS2023], but this mitigation includes both the object identifier and the parameters to protect against some yet-to-be-discovered attack that only manipulates the parameters.
Implementations MUST protect the content-encryption keys and content-authenticated-encryption keys, including the CEK and CEK'. Compromise of a content-encryption key may result in disclosure of the associated encrypted content. Compromise of a content-authenticated-encryption key may result in disclosure of the associated encrypted content or allow modification of the authenticated content and the AAD.
Implementations MUST randomly generate content-encryption keys and content-authenticated-encryption keys. Using an inadequate pseudorandom number generator (PRNG) to generate cryptographic keys can result in little or no security. An attacker may find it much easier to reproduce the PRNG environment that produced the keys and then search the resulting small set of possibilities, rather than brute-force searching the whole key space. The generation of quality random numbers is difficult. [RFC4086] offers important guidance on this topic.
If the message-digest attribute is included in the AuthAttributes, then the attribute value will contain the unencrypted one-way hash value of the plaintext of the content. Disclosure of this hash value enables content tracking, and it can be used to determine if the content matches one or more candidates. For these reasons, the AuthAttributes SHOULD NOT contain the message-digest attribute.
CMS is often used to provide encryption in messaging environments, where various forms of unsolicited messages (such as spam and phishing) represent a significant volume of unwanted traffic. Mitigation strategies for unwanted message traffic involve analysis of plaintext message content. When recipients accept unsolicited encrypted messages, they become even more vulnerable to unwanted traffic since many mitigation strategies will be unable to access the plaintext message content. Therefore, software that receives messages that have been encrypted using CMS ought to provide alternate mechanisms to handle the unwanted message traffic. One approach that does not require disclosure of keying material to a server is to reject or discard encrypted messages unless they purport to come from a member of a previously approved originator list.
For the ASN.1 module in Appendix A of this document, IANA has assigned the following object identifier (OID) in the "SMI Security for S/MIME Module Identifier (1.2.840.113549.1.9.16.0)" registry:
Decimal | Description | References |
---|---|---|
80 | id-mod-CMS-CEK-HKDF-SHA256-2023 | RFC 9709 |
IANA has allocated the id-alg-cek-hkdf-sha256 algorithm identifier as specified in Section 3 in the "SMI Security for S/MIME Algorithms (1.2.840.113549.1.9.16.3)" registry as follows:
Decimal | Description | References |
---|---|---|
31 | id-alg-cek-hkdf-sha256 | RFC 9709 |
This ASN.1 module builds upon the conventions established in [RFC5911].
<CODE BEGINS>
CMS-CEK-HKDF-SHA256-Module-2023
  { iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1) pkcs-9(9)
    id-smime(16) id-mod(0) id-mod-CMS-CEK-HKDF-SHA256-2023(80) }

DEFINITIONS IMPLICIT TAGS ::= BEGIN

EXPORTS ALL;

IMPORTS
  AlgorithmIdentifier{}, CONTENT-ENCRYPTION, SMIME-CAPS
  FROM AlgorithmInformation-2009  -- in RFC 5911
    { iso(1) identified-organization(3) dod(6) internet(1)
      security(5) mechanisms(5) pkix(7) id-mod(0)
      id-mod-algorithmInformation-02(58) } ;

--
-- CEK-HKDF-SHA256 Algorithm
--

id-alg-cek-hkdf-sha256 OBJECT IDENTIFIER ::= { iso(1) member-body(2)
    us(840) rsadsi(113549) pkcs(1) pkcs-9(9) id-smime(16) alg(3) 31 }

ContentEncryptionAlgorithmIdentifier ::=
    AlgorithmIdentifier{CONTENT-ENCRYPTION, { ... } }

cea-CEKHKDFSHA256 CONTENT-ENCRYPTION ::= {
    IDENTIFIER id-alg-cek-hkdf-sha256
    PARAMS TYPE ContentEncryptionAlgorithmIdentifier ARE required
    SMIME-CAPS { IDENTIFIED BY id-alg-cek-hkdf-sha256 } }

--
-- S/MIME Capability for CEK-HKDF-SHA256 Algorithm
--

SMimeCaps SMIME-CAPS ::= { cap-CMSCEKHKDFSHA256, ... }

cap-CMSCEKHKDFSHA256 SMIME-CAPS ::=
    { -- No value -- IDENTIFIED BY id-alg-cek-hkdf-sha256 }

END
<CODE ENDS>
This appendix provides two test vectors for the CMS_CEK_HKDF_SHA256 function.
This test vector includes an AlgorithmIdentifier for AES-128-GCM.
   IKM = c702e7d0a9e064b09ba55245fb733cf3

   The AES-128-GCM AlgorithmIdentifier:
     algorithm=2.16.840.1.101.3.4.1.6
     parameters=GCMParameters:
       aes-nonce=0x5c79058ba2f43447639d29e2
       aes-ICVlen is omitted; it indicates the DEFAULT of 12

   DER-encoded AlgorithmIdentifier:
     301b0609608648016503040106300e040c5c79058ba2f43447639d29e2

   OKM = 2124ffb29fac4e0fbbc7d5d87492bff3
This test vector includes an AlgorithmIdentifier for AES-128-CBC.
   IKM = c702e7d0a9e064b09ba55245fb733cf3

   The AES-128-CBC AlgorithmIdentifier:
     algorithm=2.16.840.1.101.3.4.1.2
     parameters=AES-IV=0x651f722ffd512c52fe072e507d72b377

   DER-encoded AlgorithmIdentifier:
     301d06096086480165030401020410651f722ffd512c52fe072e507d72b377

   OKM = 9cd102c52f1e19ece8729b35bfeceb50
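Both test vectors can be checked with a short Python script. This is a sketch, not a conformance tool. It relies on one simplification that is an assumption of the example: because both keys are 16 octets, which is no longer than one SHA-256 output, HKDF-Expand needs only its first block, T(1) = HMAC-SHA256(PRK, info || 0x01). The function name derive_single_block is illustrative. The hexadecimal values are copied from the two vectors in this appendix.

```python
import hashlib
import hmac

SALT = b"The Cryptographic Message Syntax"


def derive_single_block(ikm: bytes, info: bytes) -> bytes:
    """CMS_CEK_HKDF_SHA256 for keys of at most 32 octets."""
    if len(ikm) > hashlib.sha256().digest_size:
        raise ValueError("single-block shortcut needs len(IKM) <= 32")
    # HKDF-Extract: PRK = HMAC-SHA256(salt, IKM)
    prk = hmac.new(SALT, ikm, hashlib.sha256).digest()
    # HKDF-Expand, first block only: T(1) = HMAC-SHA256(PRK, info || 0x01)
    t1 = hmac.new(prk, info + b"\x01", hashlib.sha256).digest()
    return t1[:len(ikm)]


# (label, IKM, DER-encoded AlgorithmIdentifier, expected OKM)
VECTORS = [
    ("AES-128-GCM",
     "c702e7d0a9e064b09ba55245fb733cf3",
     "301b0609608648016503040106300e040c5c79058ba2f43447639d29e2",
     "2124ffb29fac4e0fbbc7d5d87492bff3"),
    ("AES-128-CBC",
     "c702e7d0a9e064b09ba55245fb733cf3",
     "301d06096086480165030401020410651f722ffd512c52fe072e507d72b377",
     "9cd102c52f1e19ece8729b35bfeceb50"),
]

results = {}
for label, ikm_hex, info_hex, okm_hex in VECTORS:
    okm = derive_single_block(bytes.fromhex(ikm_hex), bytes.fromhex(info_hex))
    results[label] = okm.hex()
    status = "match" if okm.hex() == okm_hex else "MISMATCH"
    print(label, status, okm.hex())
```

Both vectors share the same IKM, so the differing OKM values show that the derived key depends on the AlgorithmIdentifier alone.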
Thanks to Mike Ounsworth, Carl Wallace, and Joe Mandel for their careful review and constructive comments.