RFC 9571 | SFL | May 2024 |
Bryant, et al. | Standards Track | [Page] |
RFC 6374 describes methods of making loss and delay measurements on Label Switched Paths (LSPs) primarily as they are used in MPLS Transport Profile (MPLS-TP) networks. This document describes a method of extending the performance measurements (specified in RFC 6374) from flows carried over MPLS-TP to flows carried over generic MPLS LSPs. In particular, it extends the technique to allow loss and delay measurements to be made on multipoint-to-point LSPs and introduces some additional techniques to allow more sophisticated measurements to be made in both MPLS-TP and generic MPLS networks.
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9571.
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.
[RFC6374] was originally designed for use as an Operations, Administration, and Maintenance (OAM) protocol for use with MPLS Transport Profile (MPLS-TP) [RFC5921] LSPs. MPLS-TP only supports point-to-point and point-to-multipoint LSPs. This document describes how to use [RFC6374] in the generic MPLS case and also introduces a number of more sophisticated measurements of applicability to both cases.
[RFC8372] describes the requirement for introducing flow identities when using packet loss measurements described in [RFC6374]. In summary, [RFC6374] describes use of the loss measurement (LM) message as the packet accounting demarcation point. Unfortunately, this gives rise to a number of problems that may lead to significant packet accounting errors in certain situations. For example:
An approach to mitigating these synchronization issues is described in [RFC9341]: the packets are batched by the sender, and each batch is marked in some way such that adjacent batches can be easily recognized by the receiver.
An additional problem arises where the LSP is a multipoint-to-point LSP since MPLS does not include a source address in the packet. Network management operations require the measurement of packet loss between a source and destination. It is thus necessary to introduce some source-specific information into the packet to identify packet batches from a specific source.
[RFC8957] describes a method of encoding per-flow instructions in an MPLS label stack using a technique called Synonymous Flow Labels (SFLs), in which labels that mimic the behavior of other labels provide the packet batch identifiers and enable the per-batch packet accounting. This memo specifies how SFLs are used to perform packet loss and delay measurements as described in [RFC6374].
When the terms "performance measurement method," "Query," "packet," or "message" are used in this document, they refer to a performance measurement method, Query, packet, or message as specified in [RFC6374].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
The data service packets of the flow being instrumented are grouped into batches, and all the packets within a batch are marked with the SFL [RFC8372] corresponding to that batch. The sender counts the number of packets in the batch. When the batch has completed and the sender is confident that all of the packets in that batch will have been received, the sender issues a Query message to determine the number actually received and hence the number of packets lost. The Query message is sent using the same SFL as the corresponding batch of data service packets. The format of the Query and Response packets is described in Section 9.
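The batch accounting described above can be sketched as follows. This is only an illustration, not part of the specification: the class names, the in-memory counters, the `drop` flag used to simulate loss, and the label values are all hypothetical, and the `query` method merely stands in for the Query/Response exchange whose format is given in Section 9.

```python
from collections import Counter


class Receiver:
    """Counts packets received per SFL (hypothetical helper)."""

    def __init__(self):
        self.received = Counter()

    def receive(self, sfl):
        self.received[sfl] += 1

    def query(self, sfl):
        # Stand-in for a Query/Response exchange sent on the same SFL.
        return self.received[sfl]


class Sender:
    """Counts packets sent per SFL batch (hypothetical helper)."""

    def __init__(self):
        self.sent = Counter()

    def send(self, sfl, receiver, drop=False):
        self.sent[sfl] += 1
        if not drop:  # 'drop' simulates a packet lost in the network
            receiver.receive(sfl)


def packets_lost(sender, receiver, sfl):
    """Loss for one batch: packets sent minus packets reported received."""
    return sender.sent[sfl] - receiver.query(sfl)


SFL_A, SFL_B = 1001, 1002  # arbitrary label values for two adjacent batches
rx = Receiver()
tx = Sender()
for i in range(10):
    tx.send(SFL_A, rx, drop=(i == 3))  # one packet of batch A is lost
for i in range(10):
    tx.send(SFL_B, rx)

loss_a = packets_lost(tx, rx, SFL_A)
loss_b = packets_lost(tx, rx, SFL_B)
```

Because each batch has its own label, the count for batch A is unaffected by batch B packets that arrive before the Query for batch A.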
[RFC6374] describes how to measure the packet delay by measuring the transit time of a packet over an LSP. Such a packet may not need to be carried over an SFL since the delay over a particular LSP should be a function of the Traffic Class (TC) bits.
However, where SFLs are being used to monitor packet loss or where label-inferred scheduling is used [RFC3270], then the SFL would be REQUIRED to ensure that the packet that was being used as a proxy for a data service packet experienced a representative delay. The format of a packet carried over the LSP using an SFL is shown in Section 9.
Where it is desired to more thoroughly instrument a packet flow and to determine the delay of a number of packets, it is undesirable to send a large number of packets acting as proxy data service packets (see Section 4). A method of directly measuring the delay characteristics of a batch of packets is therefore needed.
Given the long intervals over which it is necessary to measure packet loss, it is not necessarily the case that the batch times for the two measurement types would be identical. Thus, we use a technique that permits the two measurements to be made concurrently and yet relatively independently from each other. The notion that they are relatively independent arises from the potential for the two batches to overlap in time, in which case either the delay batch time will need to be cut short or the loss time will need to be extended to allow correct reconciliation of the various counters.
The problem is illustrated in Figure 1.

    (Case 1)  AAAAAAAAAABBBBBBBBBBAAAAAAAAAABBBBBBBBBB

              SFL marking of a packet batch for loss measurement

    (Case 2)  AADDDDAAAABBBBBBBBBBAAAAAAAAAABBBBBBBBBB

              SFL marking of a subset of the packets for delay

    (Case 3)  AAAAAAAADDDDBBBBBBBBAAAAAAAAAABBBBBBBBBB

              SFL marking of a subset of the packets across a
              packet loss measurement boundary

    (Case 4)  AACDCDCDAABBBBBBBBBBAAAAAAAAAABBBBBBBBBB

              A case of multiple delay measurements within
              a packet loss measurement

    where A and B are packets where loss is being measured.
    C and D are packets where loss and delay are being measured.

In Case 1, we show where loss measurement alone is being carried out on the flow under analysis. For illustrative purposes, consider that 10 packets are used in each flow in the time interval being analyzed.
Now consider Case 2, where a small batch of packets needs to be analyzed for delay. These are marked with a different SFL type, indicating that they are to be monitored for both loss and delay. The SFL=A indicates loss batch A, and SFL=D indicates a batch of packets that are to be instrumented for delay, but SFL D is synonymous with SFL A, which in turn is synonymous with the underlying Forwarding Equivalence Class (FEC). Thus, a packet marked "D" will be accumulated into the A loss batch, into the delay statistics, and will be forwarded as normal. Whether the packet is actually counted twice (for loss and delay) or whether the two counters are reconciled during reporting is a local matter.
Now consider Case 3, where a small batch of packets is marked for delay across a loss batch boundary. These packets need to be considered as a part of batch A or a part of batch B, and any Query needs to take place after all packets A or D (whichever option is chosen) have arrived at the receiving Label Switching Router (LSR).
Now consider Case 4. Here, we have a case where it is required to take a number of delay measurements within a batch of packets that we are measuring for loss. To do this, we need two SFLs for delay (C and D) and alternate between them (on a delay-batch-by-delay-batch basis) for the purposes of measuring the delay characteristics of the different batches of packets.
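The receiver-side accounting for Cases 2 and 4 can be sketched as below. This is a hedged illustration: the `SFL_ACTIONS` table and the `account` function are hypothetical names, the single-letter markers stand in for real label values, and the choice to count C and D packets into loss batch A is one of the options the text leaves open.

```python
from collections import Counter

# Hypothetical action table: marker -> (loss batch, delay batch or None).
# C and D are delay SFLs that are synonymous with loss SFL A.
SFL_ACTIONS = {
    "A": ("A", None),
    "B": ("B", None),
    "C": ("A", "C"),
    "D": ("A", "D"),
}


def account(stream):
    """Return (loss counters, delay counters) for a stream of markers."""
    loss = Counter()
    delay = Counter()
    for marker in stream:
        loss_batch, delay_batch = SFL_ACTIONS[marker]
        loss[loss_batch] += 1          # every packet counts toward loss
        if delay_batch is not None:    # delay-marked packets are also timed
            delay[delay_batch] += 1
    return loss, delay


# First 20 packets of Case 2 and Case 4 from Figure 1.
loss2, delay2 = account("AADDDDAAAABBBBBBBBBB")
loss4, delay4 = account("AACDCDCDAABBBBBBBBBB")
```

In both cases the loss counter for batch A still reaches 10, which is the property that lets loss and delay be measured concurrently.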
It is possible to construct a large set of overlapping measurement types in terms of loss, delay, loss and delay, and batch overlap. If we allow all combinations of cases, this leads to configuration, testing, and implementation complexity and, hence, increased costs. The following simplifying rules represent the default case:
Given that the sender controls both the start and duration of a loss and a delay packet batch, these rules are readily implemented in the control plane.
A number of methods are described that add to the set of measurements originally specified in [RFC6374]. Each of these methods has different characteristics and different processing demands on the packet forwarder. The choice of method will depend on the type of diagnostic that the operator seeks.
Three methods are discussed:
In this method, the receiving LSR measures the inter-packet gap, classifies the delay into a number of delay buckets, and records the number of packets in each bucket. As an example, if the operator were concerned about packets with a delay of up to 1 μs, 2 μs, 4 μs, 8 μs, and over 8 μs, then there would be five buckets, and packets that arrived up to 1 μs would cause the "up to 1 μs" bucket counter to increase. Likewise, for those that arrived between 1 μs and 2 μs, the "2 μs" bucket counter would increase, etc. In practice, it might be better in terms of processing and potential parallelism if both the "up to 1 μs" and "2 μs" counters were incremented when a packet had a delay relative to its predecessor of 2 μs, and any more detailed information was calculated in the analytics system.
This method allows the operator to see more structure in the jitter characteristics than simply measuring the average jitter and avoids the complication of needing to perform a per-packet multiply but will probably need the time intervals between buckets to be programmable by the operator.
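A minimal sketch of the bucket classification follows, using the five example buckets from the text. The function names and the nanosecond units are assumptions made for illustration. The second counting function reads the suggestion about incrementing several counters as "increment every counter up to and including the bucket the gap falls in"; that reading is an assumption, and `to_exclusive` shows how an analytics system could recover the per-bucket counts from it.

```python
from bisect import bisect_left

# Upper bounds (ns) of the "up to 1 us", "2 us", "4 us", and "8 us" buckets.
# A fifth, implicit bucket holds every gap over 8 us.
BOUNDS_NS = [1_000, 2_000, 4_000, 8_000]


def bucket_index(gap_ns):
    # A gap exactly equal to a bound is placed in that bound's bucket.
    return bisect_left(BOUNDS_NS, gap_ns)


def count_buckets(arrivals_ns):
    """One counter incremented per inter-packet gap."""
    counts = [0] * (len(BOUNDS_NS) + 1)
    for prev, cur in zip(arrivals_ns, arrivals_ns[1:]):
        counts[bucket_index(cur - prev)] += 1
    return counts


def count_buckets_cumulative(arrivals_ns):
    """Every counter up to and including the matching bucket is incremented."""
    counts = [0] * (len(BOUNDS_NS) + 1)
    for prev, cur in zip(arrivals_ns, arrivals_ns[1:]):
        for i in range(bucket_index(cur - prev) + 1):
            counts[i] += 1
    return counts


def to_exclusive(cumulative):
    """Recover per-bucket counts by differencing adjacent counters."""
    return [c - nxt for c, nxt in zip(cumulative, cumulative[1:] + [0])]


arrivals = [0, 500, 2_500, 5_500, 15_500, 16_000]  # gaps: 500, 2000, 3000, 10000, 500
exclusive = count_buckets(arrivals)
cumulative = count_buckets_cumulative(arrivals)
```

Neither variant needs a per-packet multiply; the cumulative form trades extra increments for comparisons that can run in parallel.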
The packet format of a Time Bucket Jitter Measurement message is shown below:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |Version| Flags |  Control Code |        Message Length         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  QTF  |  RTF  | RPTF  |              Reserved                 |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Session Identifier          |    DS     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |   Number of   |      Reserved 1                               |
    |   Buckets     |                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                   Interval (in 10 ns units)                   |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                  Number of Pkts in Bucket 1                   |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    ~                                                               ~
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                  Number of Pkts in Bucket N                   |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    ~                                                               ~
    ~                           TLV Block                           ~
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Version, Flags, Control Code, Message Length, Querier Timestamp Format (QTF), Responder Timestamp Format (RTF), Responder's Preferred Timestamp Format (RPTF), Session Identifier, Reserved, and Differentiated Services (DS) fields are as defined in Section 3.2 of [RFC6374]. The remaining fields, which are unsigned integers, are as follows:
There will be a number of Interval/Number pairs depending on the number of buckets being specified by the Querier. If a message is being used to configure the buckets (i.e., the responder is creating or modifying the buckets according to the intervals in the Query message), then the responder MUST respond with 0 packets in each bucket until it has been configured for a full measurement period. This indicates that it was configured at the time of the last response message, and thus, the response is valid for the whole interval. As per the convention in [RFC6374], the Number of Pkts in Bucket fields are included in the Query message and set to zero.
Out-of-band configuration is permitted by this mode of operation.
Note this is a departure from the normal fixed format used in [RFC6374].
The Time Bucket Jitter Measurement message is carried over an LSP in the way described in [RFC6374] and over an LSP with an SFL as described in Section 9.
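The header fields that this message shares with the other formats in this document (Version through DS, as defined in Section 3.2 of [RFC6374]) occupy three 32-bit words. The sketch below packs and unpacks only that common part; the function names are hypothetical, the example values are arbitrary, and the method-specific fields that follow the header are deliberately left out.

```python
import struct


def pack_common_header(version, flags, control_code, message_length,
                       qtf, rtf, rptf, session_id, ds):
    """Pack the three 32-bit header words in network byte order."""
    word1 = (((version & 0xF) << 28) | ((flags & 0xF) << 24)
             | ((control_code & 0xFF) << 16) | (message_length & 0xFFFF))
    # The low 20 bits of the second word are Reserved and sent as zero.
    word2 = ((qtf & 0xF) << 28) | ((rtf & 0xF) << 24) | ((rptf & 0xF) << 20)
    # 26-bit Session Identifier followed by the 6-bit DS field.
    word3 = ((session_id & 0x3FFFFFF) << 6) | (ds & 0x3F)
    return struct.pack("!III", word1, word2, word3)


def unpack_common_header(data):
    """Inverse of pack_common_header; returns a dict of field values."""
    word1, word2, word3 = struct.unpack("!III", data[:12])
    return {
        "version": word1 >> 28,
        "flags": (word1 >> 24) & 0xF,
        "control_code": (word1 >> 16) & 0xFF,
        "message_length": word1 & 0xFFFF,
        "qtf": word2 >> 28,
        "rtf": (word2 >> 24) & 0xF,
        "rptf": (word2 >> 20) & 0xF,
        "session_id": word3 >> 6,
        "ds": word3 & 0x3F,
    }


# Arbitrary example values; message_length here covers the header only.
header = pack_common_header(version=0, flags=0, control_code=0,
                            message_length=12, qtf=3, rtf=0, rptf=0,
                            session_id=5, ds=0)
fields = unpack_common_header(header)
```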
In this method, provision is made for reporting the following delay characteristics:
Characteristics 1 and 2 give the mean delay. Measuring the delay of each pair in the batch is discussed in Section 7.3.
Characteristics 3 and 4 give the outliers.
Characteristics 1, 2, and 5 can be used to calculate the variance of the inter-packet gap and hence the standard deviation, which gives a view of the distribution of packet delays and therefore of the jitter. The equation for the variance (var) is given by:
var = (SumS - S*S/n)/(n-1)
There is some concern over the use of this algorithm for measuring variance because SumS and S*S/n can be similar numbers, particularly where variance is low. However, the method is acceptable because it does not require a division in the hardware.
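The division of labor implied above can be sketched as follows: the forwarder only accumulates sums, and the division is left to the analytics system. The sketch takes n as the number of samples, S as their sum, and SumS as the sum of their squares, which is how the equation reads; that mapping, the class name, and the sample values are assumptions made for illustration.

```python
class GapAccumulator:
    """Running totals that need only additions and one multiply per sample."""

    def __init__(self):
        self.n = 0      # number of samples
        self.S = 0      # sum of samples
        self.SumS = 0   # sum of squared samples

    def add(self, gap):
        self.n += 1
        self.S += gap
        self.SumS += gap * gap


def variance(n, S, SumS):
    """var = (SumS - S*S/n)/(n-1), evaluated off the forwarding path."""
    return (SumS - S * S / n) / (n - 1)


acc = GapAccumulator()
for gap in [2, 4, 4, 4, 5, 5, 7, 9]:  # arbitrary sample gaps
    acc.add(gap)

var = variance(acc.n, acc.S, acc.SumS)
```

Keeping the accumulators as integers until the final division avoids adding rounding error on top of the cancellation between SumS and S*S/n that the text warns about.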
The packet format of a Multi-packet Delay Measurement message is shown below:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |Version| Flags |  Control Code |        Message Length         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  QTF  |  RTF  | RPTF  |              Reserved                 |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Session Identifier          |    DS     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Number of Packets                       |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Sum of Delays for Batch                    |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         Minimum Delay                         |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                         Maximum Delay                         |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |             Sum of squares of Inter-packet delay              |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    ~                                                               ~
    ~                           TLV Block                           ~
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Version, Flags, Control Code, Message Length, QTF, RTF, RPTF, Session Identifier, Reserved, and DS fields are as defined in Section 3.2 of [RFC6374]. The remaining fields are as follows:
The Multi-packet Delay Measurement message is carried over an LSP in the way described in [RFC6374] and over an LSP with an SFL as described in Section 9.
If detailed packet delay measurement is required, then it might be possible to record the inter-packet gap for each packet pair. In cases other than the exceptions of slow flows or small batch sizes, this would create a large (per-packet) demand on storage in the instrumentation system and would require large bandwidth both to that storage system and to the analytics system. Such a measurement technique is outside the scope of this document.
Introduced in [RFC9341] is the concept of a one-way delay measurement in which the average time of arrival of a set of packets is measured. In this approach, the packet is timestamped at arrival, and the responder returns the sum of the timestamps and the number of timestamps. From this, the analytics engine can determine the mean delay. An alternative model is that the responder returns the timestamp of the first and last packet and the number of packets. This latter method has the advantage of allowing the average delay to be determined at a number of points along the packet path and allowing the components of the delay to be characterized. Unless specifically configured otherwise, the responder may return either or both types of response, and the analytics engine should process the response appropriately.
The packet format of an Average Delay Measurement message is shown below:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |Version| Flags |  Control Code |        Message Length         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  QTF  |  RTF  | RPTF  |              Reserved                 |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Session Identifier          |    DS     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Number of Packets                       |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                     Time of First Packet                      |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      Time of Last Packet                      |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                  Sum of Timestamps of Batch                   |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    ~                                                               ~
    ~                           TLV Block                           ~
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Version, Flags, Control Code, Message Length, QTF, RTF, RPTF, Session Identifier, and DS fields are as defined in Section 3.2 of [RFC6374]. The remaining fields are as follows:
The Average Delay Measurement message is carried over an LSP in the way described in [RFC6374] and over an LSP with an SFL as described in Section 9. As is the convention with [RFC6374], the Query message contains placeholders for the Response message. The placeholders are sent as zero.
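How an analytics engine might turn the sum-of-timestamps response into a mean delay is sketched below. The sketch assumes that the sender keeps the matching departure timestamps, that the two clocks are synchronized, and that no packet in the batch was lost; none of these assumptions is stated by this document, and the function name and timestamp values are hypothetical.

```python
def mean_delay_from_sums(tx_timestamps, rx_sum, rx_count):
    """Mean one-way delay from the responder's timestamp sum and count.

    Assumes synchronized clocks and a loss-free batch (illustrative only).
    """
    if rx_count != len(tx_timestamps):
        raise ValueError("packet counts differ; sums are not comparable")
    return (rx_sum - sum(tx_timestamps)) / rx_count


# Departure times (ns) and per-packet delays chosen for the example.
tx = [0, 100, 200, 300]
delays = [50, 60, 40, 50]
rx = [t + d for t, d in zip(tx, delays)]

# What the responder would report in the Response message.
rx_sum, rx_count = sum(rx), len(rx)
rx_first, rx_last = rx[0], rx[-1]

mean_delay = mean_delay_from_sums(tx, rx_sum, rx_count)
first_packet_delay = rx_first - tx[0]
last_packet_delay = rx_last - tx[-1]
```

The responder never stores individual timestamps in the first model, only a running sum and a count, which keeps its per-packet work to a single addition.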
In the discussion so far, it has been assumed that we would measure the delay characteristics of every packet in a delay measurement interval defined by an SFL of constant color. In [RFC9341], the concept of a sampled measurement is considered. That is, the responder only measures a packet at the start of a group of packets being marked for delay measurement by a particular color, rather than every packet in the marked batch. A measurement interval is not defined by the duration of a marked batch of packets but the interval between a pair of packets taking a readout of the delay characteristic. This approach has the advantage that the measurement is not impacted by ECMP effects.
This sampled approach may be used if supported by the responder and configured by the operator.
We illustrate the packet format of a Query message using SFLs for the case of an MPLS Direct Loss Measurement in Figure 5.

    +-------------------------------+
    |                               |
    |             LSP               |
    |            Label              |
    +-------------------------------+
    |                               |
    |        Synonymous Flow        |
    |            Label              |
    +-------------------------------+
    |                               |
    |            GAL                |
    |                               |
    +-------------------------------+
    |                               |
    |      ACH Type = 0xA           |
    |                               |
    +-------------------------------+
    |                               |
    |  Measurement Message          |
    |                               |
    |  +-------------------------+  |
    |  |                         |  |
    |  |     Fixed-format        |  |
    |  |     portion of msg      |  |
    |  |                         |  |
    |  +-------------------------+  |
    |  |                         |  |
    |  |     Optional SFL TLV    |  |
    |  |                         |  |
    |  +-------------------------+  |
    |  |                         |  |
    |  |     Optional Return     |  |
    |  |     Information         |  |
    |  |                         |  |
    |  +-------------------------+  |
    |                               |
    +-------------------------------+

The MPLS label stack is exactly the same as that used for the user data service packets being instrumented except for the inclusion of the Generic Associated Channel Label (GAL) [RFC5586] to allow the receiver to distinguish between normal data packets and OAM packets. Since the packet loss measurements are being made on the data service packets, an MPLS Direct Loss Measurement is being made, which is indicated by the type field in the Associated Channel Header (ACH) (Type = 0x000A).
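The encapsulation that precedes the measurement message can be sketched as below. The label stack entry layout (20-bit label, 3-bit TC, bottom-of-stack bit, 8-bit TTL) and the GAL value of 13 are standard MPLS encodings; the helper names, the example LSP and SFL label values, and the TTL choices are assumptions made for illustration.

```python
import struct

GAL = 13               # Generic Associated Channel Label [RFC5586]
ACH_TYPE_DLM = 0x000A  # MPLS Direct Loss Measurement channel type


def lse(label, tc=0, bottom=0, ttl=255):
    """Encode one 32-bit MPLS label stack entry."""
    return struct.pack("!I", (label << 12) | (tc << 9) | (bottom << 8) | ttl)


def ach(channel_type):
    """Encode the ACH: first nibble 0001, version 0, reserved 0, type."""
    return struct.pack("!I", (0x1 << 28) | channel_type)


def query_encapsulation(lsp_label, sfl):
    """LSP label, then SFL, then GAL at the bottom of stack, then the ACH."""
    return (lse(lsp_label) + lse(sfl)
            + lse(GAL, bottom=1, ttl=1)
            + ach(ACH_TYPE_DLM))


# Hypothetical label values for the LSP and its Synonymous Flow Label.
prefix = query_encapsulation(lsp_label=100, sfl=200)
```

The measurement message itself, starting with the fixed-format portion, would be appended directly after these 16 bytes.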
The measurement message consists of up to three components as follows.
The fixed-format portion of the message is carried over the ACH channel. The ACH channel type specifies the type of measurement being made (currently: loss, delay, or loss and delay).
(Optional) The SFL TLV specified in Section 9.1 MAY be carried if needed. It is used to provide the implementation with a reminder of the SFL that was used to carry the message. This is needed because a number of MPLS implementations do not provide the MPLS label stack to the MPLS OAM handler. This TLV is required if messages are sent over UDP [RFC7876]. This TLV MUST be included unless, by some method outside the scope of this document, it is known that this information is not needed by the responder.
(Optional) The Return Information MAY be carried if needed. It allows the responder to send the response to the Querier. This is not needed if the response is requested in band and the MPLS construct being measured is a point-to-point LSP, but it otherwise MUST be carried. The Return Address TLV is defined in [RFC6374], and the optional UDP Return Object is defined in [RFC7876].
Where a measurement other than an MPLS Direct Loss Measurement is to be made, the appropriate measurement message is used (for example, one of the new types defined in this document), and this is indicated to the receiver by the use of the corresponding ACH type.
The [RFC6374] SFL TLV is shown in Figure 6. This contains the SFL that was carried in the label stack, the FEC that was used to allocate the SFL, and the index (into the batch of SFLs that were allocated for the FEC) that corresponds to this SFL.

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     Type      |    Length     |MBZ| SFL Batch |    SFL Index  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                 SFL                   |        Reserved       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                 FEC                                           |
    .                                                               .
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where:
The index of this SFL within the list of SFLs that were assigned for the FEC.
Multiple SFLs can be assigned to a FEC, each with different actions. This index is an optional convenience for use in mapping between the TLV and the associated data structures in the LSRs. The use of this feature is agreed upon between the two parties during configuration. It is not required but is a convenience for the receiver if both parties support the facility.
This information is needed to allow for operation with hardware that discards the MPLS label stack before passing the remainder of the stack to the OAM handler. By providing both the SFL and the FEC plus index into the array of allocated SFLs, a number of implementation types are supported.
This mode of operation is not currently supported by this specification.
The inclusion of originating and/or flow information in a packet provides more identity information and hence potentially degrades the privacy of the communication. While the inclusion of the additional granularity does allow greater insight into the flow characteristics, it does not specifically identify which node originated the packet other than by inspection of the network at the point of ingress or inspection of the control protocol packets. This privacy threat may be mitigated by encrypting the control protocol packets, regularly changing the synonymous labels, and by concurrently using a number of such labels.
The security considerations documented in [RFC6374] and [RFC8372] (which in turn calls up [RFC5920] and [RFC7258]) are applicable to this protocol.
The issue noted in Section 11 is a security consideration. There are no other new security issues associated with the MPLS data plane. Any control protocol used to request SFLs will need to ensure the legitimacy of the request.
An attacker that manages to corrupt the [RFC6374] SFL TLV in Section 9.1 could disrupt the measurements in a way that the [RFC6374] responder is unable to detect. However, the network operator is likely to notice the anomalous network performance measurements, and in any case, normal MPLS network security procedures make this type of attack extremely unlikely.
As per the IANA considerations in [RFC5586] updated by [RFC7026] and [RFC7214], IANA has allocated the following values in the "MPLS Generalized Associated Channel (G-ACh) Types" registry, in the "Generic Associated Channel (G-ACh) Parameters" registry group:
Value | Description | Reference |
---|---|---|
0x0010 | Time Bucket Jitter Measurement | RFC 9571 |
0x0011 | Multi-packet Delay Measurement | RFC 9571 |
0x0012 | Average Delay Measurement | RFC 9571 |
IANA has allocated the following TLV from the 0-127 range of the "MPLS Loss/Delay Measurement TLV Object" registry in the "Generic Associated Channel (G-ACh) Parameters" registry group:
Type | Description | Reference |
---|---|---|
4 | Synonymous Flow Label | RFC 9571 |
The authors thank Benjamin Kaduk and Elwyn Davies for their thorough and thoughtful review of this document.