RFC 9193 | SenML Data Content-Format Indication | June 2022 |
Keränen & Bormann | Standards Track | [Page] |
The Sensor Measurement Lists (SenML) media types support multiple types of values, from numbers to text strings and arbitrary binary Data Values. In order to facilitate processing of binary Data Values, this document specifies a pair of new SenML fields for indicating the content format of those binary Data Values, i.e., their Internet media type, including parameters as well as any content codings applied.
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9193.
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.
The Sensor Measurement Lists (SenML) media types [RFC8428] can be used to send various kinds of data. In the example given in Figure 1, a temperature value, an indication whether a lock is open, and a Data Value (with SenML field "vd") read from a Near Field Communication (NFC) reader are sent in a single SenML Pack. The example is given in SenML JSON representation, so the "vd" (Data Value) field is encoded as a base64url string (without padding), as per Section 5 of [RFC8428].
[
  {"bn":"urn:dev:ow:10e2073a01080063:","n":"temp","u":"Cel","v":7.1},
  {"n":"open","vb":false},
  {"n":"nfc-reader","vd":"aGkgCg"}
]
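As a non-normative illustration, the "vd" value of the "nfc-reader" Record above can be decoded as sketched below. The helper names are hypothetical and not part of SenML; the sketch assumes only that the padding omitted from the base64url string has to be restored before the string is handed to a standard base64 decoder.

```python
import base64


def b64url_decode_nopad(text: str) -> bytes:
    """Decode a base64url string whose trailing '=' padding was omitted."""
    # Restore the padding so that the length is a multiple of four.
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)


def b64url_encode_nopad(data: bytes) -> str:
    """Encode bytes as base64url and strip the trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# The "vd" value from the example Pack decodes to four bytes of text.
raw = b64url_decode_nopad("aGkgCg")
print(raw)  # b'hi \n'
```

The decoded bytes carry no indication of their type, which is the gap that the fields defined in this document are meant to close.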
The receiver is expected to know how to interpret the data in the "vd" field based on the context, e.g., the name of the data source and out-of-band knowledge of the application. However, this context may not always be easily available to entities processing the SenML Pack, especially if the Pack is propagated over time and via multiple entities. To facilitate automatic interpretation, it is useful to be able to indicate an Internet media type and, optionally, content codings right in the SenML Record.
The Constrained Application Protocol (CoAP) Content-Format (Section 12.3 of [RFC7252]) provides this information in the form of a single unsigned integer. For instance, [RFC8949] defines the Content-Format number 60 for Content-Type application/cbor. Enclosing this Content-Format number in the Record is illustrated in Figure 2. All registered CoAP Content-Format numbers are listed in the "CoAP Content-Formats" registry [IANA.core-parameters], as specified by Section 12.3 of [RFC7252]. Note that, at the time of writing, the structure of this registry only provides for zero or one content coding; nothing in the present document needs to change if the registry is extended to allow sequences of content codings.
{"n":"nfc-reader", "vd":"gmNmb28YKg", "ct":"60"}
In this example SenML Record, the Data Value contains a string "foo" and a number 42 encoded in a Concise Binary Object Representation (CBOR) [RFC8949] array. Since the example above uses the JSON format of SenML, the Data Value containing the binary CBOR value is base64 encoded (Section 5 of [RFC4648]). The Data Value after base64 decoding is shown with CBOR diagnostic notation in Figure 3.
82           # array(2)
   63        # text(3)
      666F6F # "foo"
   18 2A     # unsigned(42)
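The byte sequence above can be reproduced from the "vd" string with a short, non-normative sketch. The function decode_item is a hypothetical helper written for this illustration; it handles only the three CBOR major types that occur in the example (unsigned integers, text strings, and definite-length arrays) and is not a substitute for a complete CBOR library.

```python
import base64


def decode_item(data: bytes, pos: int = 0):
    """Decode one CBOR item starting at data[pos]; return (value, next_pos)."""
    if pos >= len(data):
        raise ValueError("truncated CBOR data")
    major, info = data[pos] >> 5, data[pos] & 0x1F
    pos += 1
    if info < 24:
        arg = info  # argument is carried in the initial byte
    elif info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4, or 8 following bytes
        if pos + size > len(data):
            raise ValueError("truncated CBOR data")
        arg = int.from_bytes(data[pos:pos + size], "big")
        pos += size
    else:
        raise ValueError("reserved or indefinite lengths are not supported")

    if major == 0:  # unsigned integer
        return arg, pos
    if major == 3:  # text string of arg bytes
        if pos + arg > len(data):
            raise ValueError("truncated CBOR data")
        return data[pos:pos + arg].decode("utf-8"), pos + arg
    if major == 4:  # array of arg items
        items = []
        for _ in range(arg):
            item, pos = decode_item(data, pos)
            items.append(item)
        return items, pos
    raise ValueError(f"major type {major} is not supported in this sketch")


vd = "gmNmb28YKg"
raw = base64.urlsafe_b64decode(vd + "=" * (-len(vd) % 4))  # restore padding
value, end = decode_item(raw)
print(raw.hex(), value)  # 8263666f6f182a ['foo', 42]
```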
As with SenML in general, there is no expectation that the creator of a SenML Pack knows (or has negotiated with) each consumer of that Pack, which may be very remote in space and particularly in time. This means that the SenML creator in general has no way to know whether the consumer knows:
What SenML, as well as the new fields defined here, guarantees is that a recipient implementation knows when it needs to be updated to understand these field values and the values controlled by them; registries are used to evolve these name spaces in a controlled way. SenML Packs can be processed by a consumer while not understanding all the information in them, and information can generally be preserved in this processing such that it is useful for further consumers.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
Content-Type
header field.
Readers should also be familiar with the terms and concepts discussed in [RFC8428].
When a SenML Record contains a Data Value field ("vd"), the Record MAY also include a Content-Format indication field, using label "ct". The value of this field is a Content-Format-Spec, i.e., one of the following:
The syntax of this field is formally defined in Section 6.
The CoAP Content-Format number provides a simple and efficient way to indicate the type of the data. Since some Internet media types and their content coding and parameter alternatives do not have assigned CoAP Content-Format numbers, using Content-Type and zero or more content codings is also allowed. Both methods use a string value in the "ct" field to keep its data type consistent across uses. When the "ct" field contains only digits, it is interpreted as a CoAP Content-Format number.
To indicate that one or more content codings are used with a Content-Type, each of the content coding values is appended to the Content-Type value (media type and parameters, if any), separated by an "@" sign, in the order of when the content codings were applied (the same order as in Section 8.4 of [RFC9110]). For example (using a content coding value of "deflate", as defined in Section 8.4.1.2 of [RFC9110]):
text/plain; charset=utf-8@deflate
If no "@" sign is present after the media type and parameters, then no content coding has been specified, and the "identity" content coding is used -- no encoding transformation is employed.
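The interpretation rules above can be sketched in a non-normative way as follows. The function name and the shape of its return value are assumptions made for this illustration. Content codings are stripped from the right-hand end of the string, because an "@" sign may also occur inside a quoted parameter value; the sketch does not otherwise validate the Content-Type part.

```python
import re

# Characters allowed in a content coding name (an HTTP "token").
_TCHAR = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]"
_TRAILING_CODING = re.compile(r"@(" + _TCHAR + r"+)\Z")


def parse_content_format_spec(value: str):
    """Return (number, content_type, codings) for a "ct" or "bct" value.

    Exactly one of number and content_type is not None. The codings are
    listed in the order in which they were applied.
    """
    if re.fullmatch(r"[0-9]+", value):
        if len(value) > 1 and value[0] == "0":
            raise ValueError("Content-Format number with leading zero")
        return int(value), None, []

    codings = []
    rest = value
    while True:
        match = _TRAILING_CODING.search(rest)
        if match is None:
            break
        codings.insert(0, match.group(1))  # keep the order of application
        rest = rest[:match.start()]
    return None, rest, codings


print(parse_content_format_spec("60"))
print(parse_content_format_spec("text/plain; charset=utf-8@deflate"))
```

A consumer that wants to recover the original representation would undo the codings in reverse order, i.e., starting with the last element of the returned list.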
The Base Content-Format field, label "bct", provides a default value for the Content-Format field (label "ct") within its range. The range of the base field includes the Record containing it, up to (but not including) the next Record containing a "bct" field, if any, or up to the end of the Pack otherwise. The process of resolving (Section 4.6 of [RFC8428]) this base field is performed by adding its value with the label "ct" to all Records in this range that carry a "vd" field but do not already contain a Content-Format ("ct") field.
Figure 4 shows a variation of Figure 2 with multiple records, with the "nfc-reader" records resolving to the base field value "60" and the "iris-photo" record overriding this with the "image/png" media type (actual data left out for brevity).
[
  {"n":"nfc-reader", "vd":"gmNmb28YKg", "bct":"60", "bt":1627430700},
  {"n":"nfc-reader", "vd":"gmNiYXIYKw", "t":10},
  {"n":"iris-photo", "vd":".....", "ct":"image/png", "t":10},
  {"n":"nfc-reader", "vd":"gmNiYXoYLA", "t":20}
]
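The resolution step for the "bct" field can be sketched, non-normatively, as shown below. The function resolve_bct is a hypothetical helper; it handles only the Base Content-Format field and leaves other base fields such as "bt" untouched, so it is not a complete implementation of the resolution process of [RFC8428]. Dropping "bct" from the output Records is an assumption of this sketch.

```python
def resolve_bct(pack):
    """Return a copy of the Pack with "bct" applied as "ct" where needed."""
    resolved = []
    base = None  # no Base Content-Format is in effect at the start
    for record in pack:
        rec = dict(record)  # do not modify the caller's Records
        if "bct" in rec:
            base = rec.pop("bct")  # starts a new range, including this Record
        if base is not None and "vd" in rec and "ct" not in rec:
            rec["ct"] = base
        resolved.append(rec)
    return resolved


pack = [
    {"n": "nfc-reader", "vd": "gmNmb28YKg", "bct": "60", "bt": 1627430700},
    {"n": "nfc-reader", "vd": "gmNiYXIYKw", "t": 10},
    {"n": "iris-photo", "vd": ".....", "ct": "image/png", "t": 10},
    {"n": "nfc-reader", "vd": "gmNiYXoYLA", "t": 20},
]
for rec in resolve_bct(pack):
    print(rec["n"], rec["ct"])
```

With the Pack from the figure above, the three "nfc-reader" Records end up with "ct" set to "60", while the "iris-photo" Record keeps "image/png".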
The following examples are valid values for the "ct" and "bct" fields (explanation/comments in parentheses):
This specification provides a formal definition of the syntax of Content-Format-Spec strings using ABNF notation [RFC5234], which contains three new rules and a number of rules collected and adapted from various RFCs [RFC9110] [RFC6838] [RFC5234] [RFC8866].
; New in this document

Content-Format-Spec = Content-Format-Number / Content-Format-String

Content-Format-Number = "0" / (POS-DIGIT *DIGIT)
Content-Format-String = Content-Type *("@" Content-Coding)

; Cleaned up from RFC 9110,
; leaving only SP as blank space,
; removing legacy 8-bit characters, and
; leaving the parameter as mandatory with each semicolon:

Content-Type   = Media-Type-Name *( *SP ";" *SP parameter )
parameter      = token "=" ( token / quoted-string )

token          = 1*tchar
tchar          = "!" / "#" / "$" / "%" / "&" / "'" / "*"
               / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
               / DIGIT / ALPHA
quoted-string  = %x22 *( qdtext / quoted-pair ) %x22
qdtext         = SP / %x21 / %x23-5B / %x5D-7E
quoted-pair    = "\" ( SP / VCHAR )

; Adapted from Section 8.4.1 of RFC 9110

Content-Coding = token

; Adapted from various specs

Media-Type-Name = type-name "/" subtype-name

; From RFC 6838

type-name = restricted-name
subtype-name = restricted-name

restricted-name = restricted-name-first *126restricted-name-chars
restricted-name-first  = ALPHA / DIGIT
restricted-name-chars  = ALPHA / DIGIT / "!" / "#" /
                         "$" / "&" / "-" / "^" / "_"
restricted-name-chars =/ "." ; Characters before first dot always
                             ; specify a facet name
restricted-name-chars =/ "+" ; Characters after last plus always
                             ; specify a structured syntax suffix

; Boilerplate from RFC 5234 and RFC 8866

DIGIT     = %x30-39           ; 0 - 9
POS-DIGIT = %x31-39           ; 1 - 9
ALPHA     = %x41-5A / %x61-7A ; A - Z / a - z
SP        = %x20
VCHAR     = %x21-7E           ; printable ASCII (no SP)
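For illustration only, the ABNF above can be transcribed rule by rule into a regular expression, as in the following sketch. The variable and function names are assumptions of this example, and the ABNF remains the authoritative definition of the syntax. The sketch checks syntax only; it does not check whether a number, media type, or content coding is actually registered.

```python
import re

TCHAR = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]"
TOKEN = TCHAR + "+"
QDTEXT = r"[ !\x23-\x5B\x5D-\x7E]"   # SP / %x21 / %x23-5B / %x5D-7E
QUOTED_PAIR = r"\\[\x20-\x7E]"       # backslash followed by SP or VCHAR
QUOTED_STRING = '"(?:' + QDTEXT + "|" + QUOTED_PAIR + ')*"'
PARAMETER = TOKEN + "=(?:" + TOKEN + "|" + QUOTED_STRING + ")"

# restricted-name: one leading ALPHA/DIGIT plus up to 126 further characters
RESTRICTED_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&\-^_.+]{0,126}"
MEDIA_TYPE_NAME = RESTRICTED_NAME + "/" + RESTRICTED_NAME

CONTENT_TYPE = MEDIA_TYPE_NAME + "(?: *; *" + PARAMETER + ")*"
CONTENT_FORMAT_STRING = CONTENT_TYPE + "(?:@" + TOKEN + ")*"
CONTENT_FORMAT_NUMBER = "(?:0|[1-9][0-9]*)"

CONTENT_FORMAT_SPEC = re.compile(
    "(?:" + CONTENT_FORMAT_NUMBER + "|" + CONTENT_FORMAT_STRING + ")"
)


def is_content_format_spec(value: str) -> bool:
    """Check a "ct" or "bct" value against the Content-Format-Spec syntax."""
    return CONTENT_FORMAT_SPEC.fullmatch(value) is not None


for candidate in ["60", "060", "text/plain; charset=utf-8@deflate", "text/plain;"]:
    print(repr(candidate), is_content_format_spec(candidate))
```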
The indication of a media type in the data does not exempt a consuming application from properly checking its inputs. Also, the ability for an attacker to supply crafted SenML data that specifies media types chosen by the attacker may expose vulnerabilities of handlers for these media types to the attacker. This includes "decompression bombs", compressed data that is crafted to decompress to extremely large data items.
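One possible mitigation for decompression bombs is to cap the amount of output that a decoder is allowed to produce. The following non-normative sketch does this for the "deflate" content coding, which uses the zlib data format. The function name and the limit value are assumptions of this example; a suitable limit depends on the application.

```python
import zlib


def bounded_inflate(data: bytes, limit: int) -> bytes:
    """Undo a "deflate" content coding, refusing to exceed limit bytes."""
    decompressor = zlib.decompressobj()
    # Ask for at most one byte more than the limit, so that oversized
    # output is detected without ever being fully expanded in memory.
    out = decompressor.decompress(data, limit + 1)
    if len(out) > limit:
        raise ValueError("decompressed data exceeds the configured limit")
    if not decompressor.eof:
        raise ValueError("compressed data is truncated")
    return out


small = zlib.compress(b"hello")
print(bounded_inflate(small, 1024))  # b'hello'

bomb = zlib.compress(b"\0" * 10_000_000)  # a few kilobytes of input
try:
    bounded_inflate(bomb, 1024)
except ValueError as error:
    print("rejected:", error)
```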
IANA has assigned the following new labels in the "SenML Labels" subregistry of the "Sensor Measurement Lists (SenML)" registry [IANA.senml] (as defined in Section 12.2 of [RFC8428]) for the Content-Format indication, as per Table 1:
Name | Label | JSON Type | XML Type | Reference |
---|---|---|---|---|
Base Content-Format | bct | String | string | RFC 9193 |
Content-Format | ct | String | string | RFC 9193 |
Note that, per Section 12.2 of [RFC8428], no CBOR labels nor Efficient XML Interchange (EXI) schemaId values (EXI ID column) are supplied.
The authors would like to thank Sérgio Abreu for the discussions leading to the design of this extension and Isaac Rivera for reviews and feedback. Klaus Hartke suggested not burdening this document with a separate mandatory-to-implement version of the fields. Alexey Melnikov, Jim Schaad, and Thomas Fossati provided helpful comments at Working Group Last Call. Marco Tiloca asked for clarifying and using the term Content-Format-Spec.