RFC 9741    CDDL: More Control Operators for Text    March 2025
Bormann    Standards Track
Stream:
Internet Engineering Task Force (IETF)
RFC:
9741
Category:
Standards Track
Published:
March 2025
ISSN:
2070-1721
Author:
C. Bormann
Universität Bremen TZI

RFC 9741

Concise Data Definition Language (CDDL): Additional Control Operators for the Conversion and Processing of Text

Abstract

The Concise Data Definition Language (CDDL), standardized in RFC 8610, provides "control operators" as its main language extension point. RFCs have added to this extension point in both an application-specific and a more general way.

The present document defines a number of additional generally applicable control operators for text conversion (bytes, integers, printf-style formatting, and JSON) and for an operation on text.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9741.

Copyright Notice

Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

1. Introduction

The Concise Data Definition Language (CDDL), standardized in [RFC8610], provides "control operators" as its main language extension point (Section 3.8 of [RFC8610]). RFCs have added to this extension point in both an application-specific [RFC9090] and a more general [RFC9165] way.

The present document defines a number of additional generally applicable control operators. In Table 1, the column marked t is for "target type" (left-hand side), and the column marked c is for "controller type" (right-hand side).

Table 1: Summary of New Control Operators in This Document

Name                        t              c      Purpose
--------------------------  -------------  -----  --------------------------------------------------
.b64u, .b64c                text           bytes  Base64 representation of byte strings
.b64u-sloppy, .b64c-sloppy  text           bytes  Sloppy-tolerant variants of the above
.hex, .hexlc, .hexuc        text           bytes  Base16 representation of byte strings
.b32, .h32                  text           bytes  Base32 representation of byte strings
.b45                        text           bytes  Base45 representation of byte strings
.base10                     text           int    Text representation of integer numbers
.printf                     text           array  Printf-formatted text representation of data items
.json                       text           any    Text representation of JSON values
.join                       text or bytes  array  Build text or byte string from array of components

1.1. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

Regular expressions mentioned in the text are as defined in [RFC9485].

This specification uses terminology from [RFC8610]. In particular, with respect to control operators, "target" refers to the left-hand-side operand and "controller" to the right-hand-side operand. "Tool" refers to tools along the lines of that described in Appendix F of [RFC8610]. Note also that the data model underlying CDDL provides for text strings as well as byte strings as two separate types, which are then collectively referred to as "strings".

The term "opinionated" is used in this document to explain that the selection of operators included is somewhat frugal, based on opinions about what the preferred (and likely) usage scenarios will be. Specifically, not including a potential choice doesn't by itself intend to express that the choice is unacceptable; it might still be added in a future registration if these opinions evolve.

2. Text Conversion

2.1. Byte Strings: Base 16 (Hex), Base 32, Base 45, and Base 64

A CDDL model often defines data that are byte strings in essence but need to be transported in various encoded forms, such as base64 or hex. This section defines a number of control operators to model these conversions.

The control operators generally are of a form that could be used like this:

    signature-for-json = text .b64u signature
    signature = bytes .cbor COSE_Sign1

The specification of these control operators cannot provide full coverage of the large number of transformations in use; it focuses on [RFC4648] and additionally [RFC9285], as shown in Table 2. For the representations defined in [RFC4648], this specification uses names as inspired by Section 8 of RFC 8949 [STD94]:

Table 2: Control Operators for Text Conversion of Byte Strings

Name          Meaning                                          Reference
------------  -----------------------------------------------  ----------------------
.b64u         Base64url, no padding                            Section 5 of [RFC4648]
.b64u-sloppy  Base64url, no padding, sloppy                    Section 5 of [RFC4648]
.b64c         Base64 classic, padding                          Section 4 of [RFC4648]
.b64c-sloppy  Base64 classic, padding, sloppy                  Section 4 of [RFC4648]
.b32          Base32, no padding                               Section 6 of [RFC4648]
.h32          Base32 with "Extended Hex" alphabet, no padding  Section 7 of [RFC4648]
.hex          Base16 (hex), either case                        Section 8 of [RFC4648]
.hexlc        Base16 (hex), lower case                         Section 8 of [RFC4648]
.hexuc        Base16 (hex), upper case                         Section 8 of [RFC4648]
.b45          Base45                                           [RFC9285]

Note that this specification is somewhat opinionated here: It does not provide base64url or base32(hex) encoding with padding or base64 classic without padding. Experience indicates that these combinations only ever occur in error, so the usability of CDDL is increased by not providing them in the first place. Also, adding "c" makes sure that any decision for classic base64 is actively taken.

These control operators are "strict" in their matching, i.e., they only match base encodings that conform to the mandates of their defining documents. Note that this also means that .b64u and .b64c only match text strings composed of the set of characters defined for each of them, respectively. (This is perhaps worth pointing out explicitly as it contrasts with the "b64" literal prefix that can be used to notate byte strings in CDDL source code, which simply accepts characters from either alphabet. This behavior is different from the matching behavior of the four base64 control operators defined here.)

The additional designation "sloppy" indicates that the text string is not validated for any additional bits being zero, in variance to what is specified in the paragraph that follows Table 1 in Section 4 of [RFC4648]. Note that the present specification is opinionated again in not specifying a sloppy variant of base32 or base32hex, as no legacy use of sloppy base32(hex) was known at the time of writing.

Base45 [RFC9285] is known to be suboptimal for use in environments with limited data transparency (such as URLs) but is included because of its close relationship to QR codes and its wide use in health informatics (note that base45 is strongly specified not to allow sloppy forms of encoding).
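The difference between strict and sloppy matching can be illustrated with a minimal Python sketch for base64url. It is not part of the specification: the function name `b64u_decode` and its `sloppy` parameter are hypothetical, and the sketch assumes that Python's `base64.urlsafe_b64decode` silently discards non-zero trailing bits, which is why the strict check re-encodes the result and compares it with the input.

```python
import base64
import re
from typing import Optional

# base64url alphabet only; "=" padding and the classic "+" and "/" are excluded.
B64U_ALPHABET = re.compile(r"[A-Za-z0-9_-]*")


def b64u_decode(text: str, sloppy: bool = False) -> Optional[bytes]:
    """Return the decoded bytes if `text` matches, else None (hypothetical helper)."""
    if not B64U_ALPHABET.fullmatch(text):
        return None  # wrong alphabet, or padding present
    if len(text) % 4 == 1:
        return None  # no byte string encodes to this length
    # The standard library wants padding, so add it back for decoding only.
    padded = text + "=" * (-len(text) % 4)
    data = base64.urlsafe_b64decode(padded)
    if not sloppy:
        # Strict: re-encoding must reproduce the input exactly, which fails
        # when the unused trailing bits of the last character are not zero.
        canonical = base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")
        if canonical != text:
            return None
    return data
```

Under these assumptions, "QQ" and "QR" both decode to the single byte "A", but only "QQ" has zero trailing bits, so only "QQ" passes the strict check.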

2.2. Numerals

Table 3: Control Operator for Text Conversion of Integers

Name     Meaning                     Reference
-------  --------------------------  ---------
.base10  Base-ten (decimal) integer  ---

The control operator .base10 allows the modeling of text strings that carry an integer number in decimal form (as a text string with digits in the usual base-ten positional numeral system), such as in the uint64/int64 formats of YANG-JSON [RFC7951].

    yang-json-sid = text .base10 (0..9223372036854775807)

Again, the specification is opinionated by only providing for integer numbers represented without leading zeros, i.e., the decimal integer numerals match the regular expression 0|-?[1-9][0-9]* (of course, this is further restricted by the control type). See Section 2.3 for more flexibility and for other numeric bases such as octal, hexadecimal, or binary conversions.
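As a rough illustration of the numeral rule above, the following Python sketch checks a text string against the regular expression and then against an integer range. The function name `base10_matches` and the `lo`/`hi` parameters are hypothetical; they stand in for a controller that happens to be a simple integer range, which is only one of the types a CDDL controller could be.

```python
import re

# The numeral syntax quoted in the text: no leading zeros, no "+", no "-0".
BASE10 = re.compile(r"0|-?[1-9][0-9]*")


def base10_matches(text: str, lo: int, hi: int) -> bool:
    """True if `text` is a canonical decimal integer within lo..hi (inclusive)."""
    if not BASE10.fullmatch(text):
        return False
    return lo <= int(text) <= hi
```

For the yang-json-sid example, `lo` would be 0 and `hi` would be 9223372036854775807.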

Note that this control operator governs text representations of integers and should not be confused with the control operators governing text representations of byte strings (such as .b64u). This contrast is somewhat reinforced by spelling out "base" in the name .base10 as opposed to those of the byte string operators.

2.3. Printf-Style Formatting

Table 4: Control Operator for Printf-Style Formatting of Data Item(s)

Name     Meaning                                  Reference
-------  ---------------------------------------  ---------
.printf  Printf-style formatting of data item(s)  ---

The control operator .printf allows the modeling of text strings that carry various formatted information, as long as the format can be represented in printf-style formatting strings as they are used in the C language (see Section 7.23.6.1 of [C]; note that the "C23" standard includes %b and %B for formatting into binary digits).

The controller (right-hand side) of the .printf control is an array of one printf-style format string and zero or more data items that fit the individual conversion specifications in the format string. The construct matches a text string representing the textual output of an equivalent C-language printf function call that receives as arguments the format string and the data items following it in the array.

Out of the functionality described for printf formatting in Section 7.23.6.1 of the C language specification [C], length modifiers (paragraph 7) are not used and MUST NOT be included in the format string. The "s" conversion specifier (paragraph 8) is used to interpolate a text string in UTF-8 form. The "c" conversion specifier (paragraph 8) represents a single Unicode scalar value as a UTF-8 character. The "p" and "n" conversion specifiers (paragraph 8) are not used and MUST NOT be included in the format string.

In the following example, my_alg_19 matches the text string "0x0013":

    my_alg_19 = hexlabel<19>
    hexlabel<K> = text .printf (["0x%04x", K])

The data items in the controller array do not need to be literals, as in the following example:

    any_alg = hexlabel<1..20>
    hexlabel<K> = text .printf (["0x%04x", K])

Here, any_alg matches the text strings "0x0013" or "0x0001" but not "0x1234".
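The hexlabel examples can be checked with a small Python sketch. It relies on the assumption that Python's `%` string formatting behaves like C printf for the format string "0x%04x"; the function name `hexlabel_matches` is hypothetical, and enumerating every candidate value is only practical for small ranges such as 1..20.

```python
def hexlabel_matches(text: str, values: range) -> bool:
    """True if `text` equals the printf output "0x%04x" for some K in `values`."""
    # Brute force: format every admissible K and compare with the target text.
    return any("0x%04x" % k == text for k in values)


# any_alg = hexlabel<1..20>; the CDDL range 1..20 is inclusive, hence range(1, 21).
ANY_ALG = range(1, 21)
```

A real validator would more likely parse the text back into a number and test that number against the controller type, rather than enumerate all possible outputs.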

2.4. JSON Values

Some applications store complete JSON texts [STD90] into text strings. The JSON value of these can easily be defined in CDDL by using the default JSON-to-CBOR conversion rules provided in Section 6.2 of RFC 8949 [STD94]. This is supported by a control operator similar to .cbor as defined in Section 3.8.4 of [RFC8610].

Table 5: Control Operator for Text Conversion of JSON Values

Name   Meaning  Reference
-----  -------  ---------
.json  JSON     [STD90]

    embedded-claims = text .json claims
    claims = {iss: text, exp: text}

Notes:

  • JSON has known interoperability problems [RFC7493]. While Section 4 of [RFC7493] probably is not relevant to this specification, Section 2 of [RFC7493] provides requirements that need to be followed to make use of the generic data model underlying CDDL. Note that the intention of Section 2.2 of [RFC7493] is directly supported by Section 6.2 of RFC 8949 [STD94]. The recommendation to use text strings for representing numbers outside JSON's interoperable range is a requirement on the application data model and therefore needs to be reflected on the right-hand side of the .json control operator.

  • This control operator provides no way to constrain the use of blank space or other serialization variants in the JSON representation of the data items; restrictions on the serialization to specific variants (e.g., not providing for the addition of any insignificant blank space and prescribing an order in which map entries are serialized) could be defined in future control operators.

  • A .jsonseq is not provided in this document for JSON text sequences [RFC7464], as no use case for inclusion in CDDL is known at the time of writing; again, future control operators could address this use case.
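The embedded-claims example can be approximated by the following Python sketch, which parses the text as JSON and then checks the resulting value against the shape of claims. The function name `embedded_claims_matches` is hypothetical, and the hand-written shape check stands in for a real CDDL validator. Python's `json.loads` is also more lenient than [STD90] in some corners (for example, it accepts NaN by default), so this is an illustration rather than a conforming implementation.

```python
import json


def embedded_claims_matches(text: str) -> bool:
    """True if `text` is JSON whose value fits claims = {iss: text, exp: text}."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return False  # not a JSON text at all
    return (
        isinstance(value, dict)
        and set(value) == {"iss", "exp"}  # exactly the two map keys
        and all(isinstance(v, str) for v in value.values())
    )
```

Blank space and member order in the JSON text do not affect the result, in line with the second note above.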

3. Text Processing

3.1. Join

Often, text strings need to be constructed out of parts that can best be modeled as an array.

Table 6: Control Operator for Text Generation from Arrays

Name   Meaning                           Reference
-----  --------------------------------  ---------
.join  Concatenate elements of an array  ---

For example, an IPv4 address in dotted-decimal might be modeled as in Figure 1.

    legacy-ip-address = text .join legacy-ip-address-elements
    legacy-ip-address-elements = [bytetext, ".", bytetext, ".",
                                  bytetext, ".", bytetext]
    bytetext = text .base10 byte
    byte = 0..255

Figure 1: Using the .join Operator to Build Dotted-Decimal IPv4 Addresses

The elements of the controller array need to be strings (text or byte strings). The control operator matches a data item if that data item is also a string, built by concatenating the strings in the array. The result of this concatenation is of the same kind of string (text or bytes) as the first element of the array. (If there is no element in the array, the .join construct matches either kind of empty string, obviously further constrained by the control operator target.) The concatenation is performed on the sequences of bytes in the strings. If the result of the concatenation is a text string, the resulting sequence of bytes only matches the target data item if that result is a valid text string (i.e., valid UTF-8). Note that in contrast to the algorithm used in Section 3.2.3 of RFC 8949 [STD94], there is no need for all individual byte sequences going into the concatenation to constitute valid text strings.
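The concatenation rule can be sketched in Python for the simple case where the controller array is already a concrete list of strings, not a list of types. The function name `join_matches` is hypothetical; Python `str` stands in for CDDL text strings and `bytes` for byte strings.

```python
from typing import Sequence, Union

String = Union[str, bytes]


def join_matches(target: String, parts: Sequence[String]) -> bool:
    """True if `target` is the concatenation of `parts` under the .join rules."""
    if not parts:
        # An empty array matches either kind of empty string.
        return len(target) == 0
    # Concatenation happens on the byte level, whatever the kind of each part.
    raw = b"".join(p.encode("utf-8") if isinstance(p, str) else p for p in parts)
    if isinstance(parts[0], str):
        # The result is text, so only the concatenation as a whole
        # needs to be valid UTF-8.
        try:
            joined: String = raw.decode("utf-8")
        except UnicodeDecodeError:
            return False
    else:
        joined = raw
    return type(target) is type(joined) and target == joined
```

In this sketch, the three parts "a", b"\xe2\x82" and b"\xac" join to the text string "a€", even though the two byte strings are not valid UTF-8 on their own.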

Note that this control operator is hard to validate in the most general case, as this would require full parser functionality. Simple implementation strategies will use array elements with constant values as guideposts ("markers", such as the "." in Figure 1) for isolating the variable elements that need further validation at the CDDL data model level. Therefore, it is recommended to limit the use of .join to simple arrangements where the array elements are laid out explicitly and there are no adjacent variable elements without intervening constant values, and where these constant values do not occur within the text described by the variable elements. If more complex parsing functionality is required, the ABNF control operators (see Section 3 of [RFC9165]) may be useful; however, these cannot reach back into CDDL-specified elements like .join can.

Implementation note: A validator implementation can use the marker elements to scan the text and isolate the variable elements. It also can build a parsing regexp from the elements of the controller array, with capture groups for each element, and validate the captures against the elements of the array. (For more about parsing regexps, see Section 6 of [RFC9485]; see also Section 8 of [RFC9485] for security considerations related to regexps.) In the most general case, these implementation strategies can exhibit false negatives, where the implementation cannot find the structure that would be successfully validated using the controller; it is RECOMMENDED that implementations provide full coverage at least for the marker-based subset outlined in the previous paragraph.
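The regexp-based strategy from the implementation note might look as follows for the dotted-decimal example of Figure 1. This is a minimal sketch under stated assumptions: the names `ELEMENTS`, `PATTERN`, `is_bytetext` and `legacy_ip_address_matches` are hypothetical, `None` is used as a placeholder for each variable element (bytetext), and Python's `re` syntax is used, not the I-Regexp subset of [RFC9485].

```python
import re

BASE10 = re.compile(r"0|-?[1-9][0-9]*")


def is_bytetext(text: str) -> bool:
    """bytetext = text .base10 byte, with byte = 0..255."""
    return bool(BASE10.fullmatch(text)) and 0 <= int(text) <= 255


# Controller array of Figure 1: None marks a variable element,
# strings are the constant marker elements.
ELEMENTS = [None, ".", None, ".", None, ".", None]

# One capture group per variable element, markers matched literally.
PATTERN = re.compile(
    "".join("(.*?)" if e is None else re.escape(e) for e in ELEMENTS),
    re.DOTALL,
)


def legacy_ip_address_matches(text: str) -> bool:
    """Isolate the variable elements via the markers, then validate each capture."""
    m = PATTERN.fullmatch(text)
    return m is not None and all(is_bytetext(g) for g in m.groups())
```

Because the marker "." cannot occur inside a bytetext, the first split the regexp finds is the only one that could succeed, so this sketch has no false negatives for this particular model. That would not hold in general.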

4. IANA Considerations

IANA has registered the contents of Table 7 into the "CDDL Control Operators" registry of [IANA.cddl]:

Table 7: New Control Operators

Name          Reference
------------  ---------
.b64u         RFC 9741
.b64u-sloppy  RFC 9741
.b64c         RFC 9741
.b64c-sloppy  RFC 9741
.b45          RFC 9741
.b32          RFC 9741
.h32          RFC 9741
.hex          RFC 9741
.hexlc        RFC 9741
.hexuc        RFC 9741
.base10       RFC 9741
.printf       RFC 9741
.json         RFC 9741
.join         RFC 9741

5. Security Considerations

The security considerations in Section 5 of [RFC8610] apply. In addition, for the control operators defined in Section 2.1, the security considerations in Section 12 of [RFC4648] apply.

6. References

6.1. Normative References

[C]
International Organization for Standardization, "Information technology - Programming languages - C", Fourth Edition, ISO/IEC 9899:2024, <https://www.iso.org/standard/82075.html>. Technically equivalent specification text is available at <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf>.
[IANA.cddl]
IANA, "Concise Data Definition Language (CDDL)", <https://www.iana.org/assignments/cddl>.
[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC4648]
Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, <https://www.rfc-editor.org/info/rfc4648>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8610]
Birkholz, H., Vigano, C., and C. Bormann, "Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610, <https://www.rfc-editor.org/info/rfc8610>.
[RFC9165]
Bormann, C., "Additional Control Operators for the Concise Data Definition Language (CDDL)", RFC 9165, DOI 10.17487/RFC9165, <https://www.rfc-editor.org/info/rfc9165>.
[RFC9285]
Fältström, P., Ljunggren, F., and D.W. van Gulik, "The Base45 Data Encoding", RFC 9285, DOI 10.17487/RFC9285, <https://www.rfc-editor.org/info/rfc9285>.
[RFC9485]
Bormann, C. and T. Bray, "I-Regexp: An Interoperable Regular Expression Format", RFC 9485, DOI 10.17487/RFC9485, <https://www.rfc-editor.org/info/rfc9485>.
[STD90]
Internet Standard 90, <https://www.rfc-editor.org/info/std90>.
At the time of writing, this STD comprises the following:
Bray, T., Ed., "The JavaScript Object Notation (JSON) Data Interchange Format", STD 90, RFC 8259, DOI 10.17487/RFC8259, <https://www.rfc-editor.org/info/rfc8259>.
[STD94]
Internet Standard 94, <https://www.rfc-editor.org/info/std94>.
At the time of writing, this STD comprises the following:
Bormann, C. and P. Hoffman, "Concise Binary Object Representation (CBOR)", STD 94, RFC 8949, DOI 10.17487/RFC8949, <https://www.rfc-editor.org/info/rfc8949>.

6.2. Informative References

[RFC7464]
Williams, N., "JavaScript Object Notation (JSON) Text Sequences", RFC 7464, DOI 10.17487/RFC7464, <https://www.rfc-editor.org/info/rfc7464>.
[RFC7493]
Bray, T., Ed., "The I-JSON Message Format", RFC 7493, DOI 10.17487/RFC7493, <https://www.rfc-editor.org/info/rfc7493>.
[RFC7951]
Lhotka, L., "JSON Encoding of Data Modeled with YANG", RFC 7951, DOI 10.17487/RFC7951, <https://www.rfc-editor.org/info/rfc7951>.
[RFC9090]
Bormann, C., "Concise Binary Object Representation (CBOR) Tags for Object Identifiers", RFC 9090, DOI 10.17487/RFC9090, <https://www.rfc-editor.org/info/rfc9090>.

List of Figures

Figure 1: Using the .join Operator to Build Dotted-Decimal IPv4 Addresses

List of Tables

Table 1: Summary of New Control Operators in This Document

Table 2: Control Operators for Text Conversion of Byte Strings

Table 3: Control Operator for Text Conversion of Integers

Table 4: Control Operator for Printf-Style Formatting of Data Item(s)

Table 5: Control Operator for Text Conversion of JSON Values

Table 6: Control Operator for Text Generation from Arrays

Table 7: New Control Operators

Acknowledgements

Henk Birkholz suggested the need for many of the control operators defined here. The author would like to thank Laurence Lundblade and Jeremy O'Donoghue for sharpening some of the mandates, Mikolai Gütschow for improvements to some examples, A.J. Stein for serving as shepherd for this document and for his shepherd review, the IESG and Directorate reviewers (notably Ari Keränen, Darrel Miller, and Éric Vyncke), and Orie Steele for serving as responsible AD and for providing a detailed AD review.

Author's Address

Carsten Bormann
Universität Bremen TZI
Postfach 330440
D-28359 Bremen
Germany
Phone: +49-421-218-63921
Email: cabo@tzi.org
