RFC 9262 | BIER-TE ARCH | October 2022 |
Eckert, et al. | Standards Track |
This memo describes per-packet stateless strict and loose path steered replication and forwarding for "Bit Index Explicit Replication" (BIER) packets (RFC 8279); it is called "Tree Engineering for Bit Index Explicit Replication" (BIER-TE) and is intended to be used as the path steering mechanism for Traffic Engineering with BIER.
BIER-TE introduces a new semantic for "bit positions" (BPs). These BPs indicate adjacencies of the network topology, as opposed to (non-TE) BIER in which BPs indicate "Bit-Forwarding Egress Routers" (BFERs). A BIER-TE "packets BitString" therefore indicates the edges of the (loop-free) tree across which the packets are forwarded by BIER-TE. BIER-TE can leverage BIER forwarding engines with few changes. Co-existence of BIER and BIER-TE forwarding in the same domain is possible -- for example, by using separate BIER "subdomains" (SDs). Except for the optional routed adjacencies, BIER-TE does not require a BIER routing underlay and can therefore operate without depending on a routing protocol such as the "Interior Gateway Protocol" (IGP).
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9262.
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.
"Tree Engineering for Bit Index Explicit Replication" (BIER-TE) is based on the (non-TE) BIER architecture, terminology, and packet formats as described in [RFC8279] and [RFC8296]. This document describes BIER-TE, with the expectation that the reader is familiar with these two documents.
BIER-TE introduces a new semantic for "bit positions" (BPs). These BPs indicate adjacencies of the network topology, as opposed to (non-TE) BIER in which BPs indicate "Bit-Forwarding Egress Routers" (BFERs). A BIER-TE "packets BitString" therefore indicates the edges of the (loop-free) tree across which the packets are forwarded by BIER-TE. With BIER-TE, the "Bit Index Forwarding Table" (BIFT) of each "Bit-Forwarding Router" (BFR) is only populated with BPs that are adjacent to the BFR in the BIER-TE topology. Other BPs are empty in the BIFT. The BFR replicates and forwards BIER packets to adjacent BPs that are set in the packets. BPs are normally also cleared upon forwarding to avoid duplicates and loops.
BIER-TE can leverage BIER forwarding engines with little or no changes. It can also co-exist with BIER forwarding in the same domain -- for example, by using separate BIER subdomains. Except for the optional routed adjacencies, BIER-TE does not require a BIER routing underlay and can therefore operate without depending on a routing protocol such as the "Interior Gateway Protocol" (IGP).
This document is structured as follows:¶
Note that related work [CONSTRAINED-CAST] uses Bloom filters [Bloom70] to represent leaves or edges of the intended delivery tree. Bloom filters in general can support larger trees/topologies with fewer addressing bits than explicit BitStrings, but they introduce the heuristic risk of false positives and cannot clear bits in the BitStrings during forwarding to avoid loops. For these reasons, BIER-TE, like BIER, uses explicit BitStrings. Explicit BitStrings as used by BIER-TE can also be seen as a special type of Bloom filter, and this is how other related work [ICC] describes it.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
BIER-TE forwarding is best introduced with simple examples. These examples use formal terms defined later in this document (Figure 4 in Section 4.1), including forward_connected(), forward_routed(), and local_decap().
Consider the simple network in the BIER-TE overview example shown in Figure 1, with six BFRs. p1...p15 are the bit positions used. All BFRs can act as a "Bit-Forwarding Ingress Router" (BFIR); BFR1, BFR3, BFR4, and BFR6 can also be BFERs. "Forward_connected()" is the name used for adjacencies that represent subnet adjacencies of the network. "Local_decap()" is the name used for the adjacency that decapsulates BIER-TE packets and passes their payload to higher-layer processing.
   BIER-TE Topology:

   Diagram:

                     p5    p6
                   --- BFR3 ---
                p3/    p13     \p7           p15
    BFR1 ---- BFR2              BFR5 ----- BFR6
   p1   p2    p4\    p14       /p10 p11   p12
                   --- BFR4 ---
                     p8    p9

   (simplified) BIER-TE Bit Index Forwarding Tables (BIFTs):

   BFR1:   p1  -> local_decap()
           p2  -> forward_connected() to BFR2

   BFR2:   p1  -> forward_connected() to BFR1
           p5  -> forward_connected() to BFR3
           p8  -> forward_connected() to BFR4

   BFR3:   p3  -> forward_connected() to BFR2
           p7  -> forward_connected() to BFR5
           p13 -> local_decap()

   BFR4:   p4  -> forward_connected() to BFR2
           p10 -> forward_connected() to BFR5
           p14 -> local_decap()

   BFR5:   p6  -> forward_connected() to BFR3
           p9  -> forward_connected() to BFR4
           p12 -> forward_connected() to BFR6

   BFR6:   p11 -> forward_connected() to BFR5
           p15 -> local_decap()
Assume that a packet from BFR1 should be sent via BFR4 to BFR6. This requires a BitString (p2,p8,p10,p12,p15). When this packet is examined by BIER-TE on BFR1, the only bit position from the BitString that is also set in the BIFT is p2. This will cause BFR1 to send the only copy of the packet to BFR2. Similarly, BFR2 will forward to BFR4 because of p8, BFR4 to BFR5 because of p10, and BFR5 to BFR6 because of p12. p15 finally makes BFR6 receive and decapsulate the packet.
To send a copy to BFR6 via BFR4 and also a copy to BFR3, the BitString needs to be (p2,p5,p8,p10,p12,p13,p15). When this packet is examined by BFR2, p5 causes one copy to be sent to BFR3 and p8 one copy to BFR4. When BFR3 receives the packet, p13 will cause it to receive and decapsulate the packet.
If instead the BitString were (p2,p6,p8,p10,p12,p13,p15), the packet would be copied by BFR5 towards BFR3 because of p6 instead of being copied by BFR2 to BFR3 because of p5 in the prior case. This demonstrates the ability of the BIER-TE topology, as shown in Figure 1, to make the traffic pass across any possible path and be replicated where desired.
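The three BitStrings discussed above can be checked with a small simulation. The following Python sketch is illustrative only and not part of the specification: it encodes the Figure 1 BIFTs as dictionaries, represents a BitString as a set of BP numbers, and applies the replicate-and-clear behavior described in this section. The names `BIFTS` and `simulate` are assumptions made for this example.

```python
from collections import deque

# BIFTs copied from Figure 1. Each bit position (BP) maps to
# ("fwd", neighbor) for forward_connected() or ("decap", None) for local_decap().
BIFTS = {
    "BFR1": {1: ("decap", None), 2: ("fwd", "BFR2")},
    "BFR2": {1: ("fwd", "BFR1"), 5: ("fwd", "BFR3"), 8: ("fwd", "BFR4")},
    "BFR3": {3: ("fwd", "BFR2"), 7: ("fwd", "BFR5"), 13: ("decap", None)},
    "BFR4": {4: ("fwd", "BFR2"), 10: ("fwd", "BFR5"), 14: ("decap", None)},
    "BFR5": {6: ("fwd", "BFR3"), 9: ("fwd", "BFR4"), 12: ("fwd", "BFR6")},
    "BFR6": {11: ("fwd", "BFR5"), 15: ("decap", None)},
}


def simulate(ingress, bitstring):
    """Return (hops, receivers) for a packet injected at `ingress`."""
    hops, receivers = [], []
    queue = deque([(ingress, frozenset(bitstring))])
    while queue:
        bfr, bits = queue.popleft()
        # BPs set in the packet that this BFR has an adjacency for.
        local = bits & set(BIFTS[bfr])
        # Those BPs are cleared in every copy this BFR sends.
        remaining = bits - local
        for bp in sorted(local):
            kind, neighbor = BIFTS[bfr][bp]
            if kind == "decap":
                receivers.append(bfr)
            else:
                hops.append((bfr, neighbor))
                queue.append((neighbor, remaining))
    return hops, receivers
```

With these tables, `simulate("BFR1", {2, 8, 10, 12, 15})` visits BFR2, BFR4, BFR5, and BFR6 in turn and reports BFR6 as the only receiver. Replacing p5 with p6 in the second BitString moves the replication point for BFR3 from BFR2 to BFR5.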
BIER-TE has various options for minimizing BP assignments, many of which are based on out-of-band knowledge about the required multicast traffic paths and bandwidth consumption in the network, e.g., from predeployment planning.
Figure 2 shows a modified example, in which Rtr2 and Rtr5 are assumed not to support BIER-TE, so traffic has to be unicast encapsulated across them. To explicitly distinguish routed/tunneled forwarding of BIER-TE packets from Layer 2 forwarding (forward_connected()), these adjacencies are called "forward_routed()" adjacencies. Otherwise, there is no difference in their processing over the aforementioned forward_connected() adjacencies.
In addition, bits are saved in the following example by assuming that BFR1 only needs to be a BFIR -- not a BFER or a transit BFR.
   BIER-TE Topology:

   Diagram:

                 p1  p3  p7
              ....> BFR3 <....       p5
      ........                ........>
   BFR1       (Rtr2)     (Rtr5)      BFR6
      ........                ........>  p9
              ....> BFR4 <....       p6
                 p2  p4  p8

   (simplified) BIER-TE Bit Index Forwarding Tables (BIFTs):

   BFR1:   p1  -> forward_routed() to BFR3
           p2  -> forward_routed() to BFR4

   BFR3:   p3  -> local_decap()
           p5  -> forward_routed() to BFR6

   BFR4:   p4  -> local_decap()
           p6  -> forward_routed() to BFR6

   BFR6:   p7  -> forward_routed() to BFR3
           p8  -> forward_routed() to BFR4
           p9  -> local_decap()
To send a BIER-TE packet from BFR1 via BFR3 to be received by BFR6, the BitString is (p1,p5,p9). A packet from BFR1 via BFR4 to be received by BFR6 uses the BitString (p2,p6,p9). A packet from BFR1 to be received by BFR3, BFR4 and from BFR3 to be received by BFR6 uses (p1,p2,p3,p4,p5,p9). A packet from BFR1 to be received by BFR3, BFR4 and from BFR4 to be received by BFR6 uses (p1,p2,p3,p4,p6,p9). A packet from BFR1 to be received by BFR4, then from BFR4 to be received by BFR6, and finally from BFR6 to be received by BFR3, uses (p2,p3,p4,p6,p7,p9). A packet from BFR1 to be received by BFR3, then from BFR3 to be received by BFR6, and finally from BFR6 to be received by BFR4, uses (p1,p3,p4,p5,p8,p9).
The key new component in BIER-TE compared to (non-TE) BIER is the BIER-TE topology as introduced through the two examples in Section 2.2. It is used to control where replication can or should happen and how to minimize the required number of BPs for adjacencies.
The BIER-TE topology consists of the BIFTs of all the BFRs and can also be expressed as a directed graph where the edges are the adjacencies between the BFRs labeled with the BP used for the adjacency. Adjacencies are naturally unidirectional. A BP can be reused across multiple adjacencies as long as this does not lead to undesired duplicates or loops, as explained in Section 5.2.
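The directed-graph view can be made concrete with a short sketch. The Python fragment below is only an illustration under stated assumptions: it restates the forward_connected() adjacencies of the first example as BP-labeled directed edges and derives a BitString from a chosen set of tree edges plus the local_decap() BPs of the intended receivers. The names `EDGES`, `DECAP`, and `bitstring_for_tree` are hypothetical and do not describe how any particular BIER-TE controller works.

```python
# Directed, BP-labeled edges taken from the BIFTs of the first example:
# (upstream BFR, downstream BFR) -> bit position.
EDGES = {
    ("BFR1", "BFR2"): 2, ("BFR2", "BFR1"): 1,
    ("BFR2", "BFR3"): 5, ("BFR3", "BFR2"): 3,
    ("BFR2", "BFR4"): 8, ("BFR4", "BFR2"): 4,
    ("BFR3", "BFR5"): 7, ("BFR5", "BFR3"): 6,
    ("BFR4", "BFR5"): 10, ("BFR5", "BFR4"): 9,
    ("BFR5", "BFR6"): 12, ("BFR6", "BFR5"): 11,
}

# local_decap() bit positions of the BFRs that can act as BFERs.
DECAP = {"BFR1": 1, "BFR3": 13, "BFR4": 14, "BFR6": 15}


def bitstring_for_tree(tree_edges, receivers):
    """Collect the BPs of the tree's edges and of the receivers' local_decap()."""
    bps = {EDGES[edge] for edge in tree_edges}
    bps |= {DECAP[receiver] for receiver in receivers}
    return sorted(bps)


path = [("BFR1", "BFR2"), ("BFR2", "BFR4"), ("BFR4", "BFR5"), ("BFR5", "BFR6")]
print(bitstring_for_tree(path, ["BFR6"]))  # [2, 8, 10, 12, 15]
```

Note that p1 appears twice in these tables, once for the edge from BFR2 to BFR1 and once for local_decap() on BFR1, which is a simple instance of the BP reuse mentioned above.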
If the BIER-TE topology represents (a subset of) the underlying (Layer 2) topology of the network as shown in the first example, this may be called an "underlay" BIER-TE topology. A topology consisting only of "forward_routed()" adjacencies as shown in the second example may be called an "overlay" BIER-TE topology. A BIER-TE topology with both forward_connected() and forward_routed() adjacencies may be called a "hybrid" BIER-TE topology.
BIER-TE is designed so that its forwarding plane is a simple extension to the (non-TE) BIER forwarding plane, hence allowing it to be added to BIER deployments where it can be beneficial.¶
BIER-TE is also intended as an option to expand the BIER architecture into deployments where (non-TE) BIER may not be the best fit, such as statically provisioned networks that need path steering but do not want distributed routing protocols.¶
BIER-TE inherits the following aspects from BIER unchanged:¶
BIER-TE has the following key changes with respect to BIER:¶
The following elements/functions described in the BIER architecture are not required by the BIER-TE architecture:¶
Co-existence of BIER and BIER-TE in the same network requires the following:¶
BIER-TE forwarding rules, especially BitString parsing, are designed to be as close as possible to those of BIER, with the expectation that this eases the programming of BIER-TE forwarding code and/or BIER-TE forwarding hardware on platforms supporting BIER. The pseudocode in Section 4.4 shows how existing (non-TE) BIER/BIFT forwarding can be modified to support the required BIER-TE forwarding functionality (Section 4.5), by using the BIER BIFT's "Forwarding Bit Mask" (F-BM): only the clearing of bits to avoid sending duplicate packets to a BFR's neighbor is skipped in BIER-TE forwarding, because it is not necessary and could not be done when using a BIER F-BM.
Whether to use BIER or BIER-TE forwarding is simply a choice of the mode of the BIFT indicated by the packet (BIER or BIER-TE BIFT). This is determined by the BFR configuration for the encapsulation; see Section 4.3.
BIER-TE can be thought of as being composed of the same three layers as BIER: the "multicast flow overlay", the "BIER layer", and the "routing underlay". Figure 3 also shows how the BIER layer is composed of the "BIER-TE forwarding plane" and the "BIER-TE control plane" as represented by the "BIER-TE controller".

                     <------BGP/PIM----->
      |<-IGMP/PIM->  multicast flow   <-PIM/IGMP->|
                        overlay

     BIER-TE  [BIER-TE Controller] <=> [BIER-TE Topology]
     control            ^      ^     ^
     plane             /       |      \   BIER-TE control protocol
                      |        |       |  (e.g., YANG/NETCONF/RESTCONF
                      |        |       |         PCEP/...)
                      v        v       v
    Src -> Rtr1 -> BFIR-----BFR-----BFER -> Rtr2 -> Rcvr

                   |<----------------->|
                 BIER-TE forwarding plane

                   |<- BIER-TE domain->|

                |<--------------------->|
                    Routing underlay
The multicast flow overlay has the same role as that described for BIER in [RFC8279], Section 4.3. See also Section 3.2.1.2.
When a BIER-TE controller is used, it might also be preferable that multicast flow overlay signaling be performed through a central point of control. For BGP-based overlay flow services such as "Multicast VPN Using Bit Index Explicit Replication (BIER)" [RFC8556], this can be achieved by making the BIER-TE controller operate as a BGP Route Reflector [RFC4456] and combining it with signaling through BGP or a different protocol for the BIER-TE controller's calculated BitStrings. See Sections 3.2.1.2 and 5.3.4.
In the (non-TE) BIER architecture [RFC8279], the BIER layer is summarized in Section 4.2 of [RFC8279]. This summary includes both the functions of the BIER-layer control plane and forwarding plane, without using those terms. Example standardized options for the BIER control plane include IS-IS and OSPF extensions for BIER, as specified in [RFC8401] and [RFC8444], respectively.
For BIER-TE, the control plane includes, at a minimum, the following functionality.¶
BIER-TE topology control: During initial provisioning of the network and/or during modifications of its topology and/or services, the protocols and/or procedures to establish BIER-TE BIFTs:¶
BIER-TE tree control: During network operations, protocols and/or procedures to support creation/change/removal of overlay flows on BFIRs:¶
This architecture describes the BIER-TE control plane, as shown in Figure 3, as consisting of:
The single, centralized BIER-TE controller is used in this document as the reference option for the BIER-TE control plane, but other options are equally feasible. The BIER-TE control plane could equally be implemented without automated configuration/protocols, by an operator via a CLI on the BFRs. In that case, operator-configured local policy on the BFIR would have to determine how to set the appropriate BIER header fields. The BIER-TE control plane could also be decentralized and/or distributed, but this document does not consider any additional protocols and/or procedures that would then be necessary to coordinate its (distributed/decentralized) entities to achieve the above-described functionality.
The first item listed for BIER-TE topology control (Section 3.2, point 1.a.) includes network topology discovery and BIER-TE topology creation. The latter describes the process by which a controller determines which routers are to be configured as BFRs and the adjacencies between them.
In statically managed networks, e.g., industrial environments, both discovery and creation can be a manual/offline process.¶
In other networks, topology discovery may rely on such protocols as those that include extending an IGP based on a link-state protocol into the BIER-TE controller itself, e.g., BGP-LS [RFC7752] or YANG topology [RFC8345], as well as methods specific to BIER-TE -- for example, via [BIER-TE-YANG]. These options are non-exhaustive.
Dynamic creation of the BIER-TE topology can be as easy as mapping the network topology 1:1 to the BIER-TE topology by assigning a BP for every network subnet adjacency. In larger networks, it likely involves more complex policy and optimization decisions, including how to minimize the number of BPs required and how to assign BPs across different BitStrings to minimize the number of duplicate packets across links when delivering an overlay flow to BFERs using different SIs:BitStrings. These topics are discussed in Section 5.
When the BIER-TE topology has been determined, the BIER-TE controller pushes the BPs/adjacencies to the BIFT of the BFRs. On each BFR, only those SIs:BPs that are adjacencies to other BFRs in the BIER-TE topology are populated.
Communications between the BIER-TE controller and BFRs for both BIER-TE topology control and BIER-TE tree control are ideally via standardized protocols and data models such as NETCONF/RESTCONF/YANG/PCEP. A vendor-specific CLI on the BFRs is also an option (as in many other "Software-Defined Network" (SDN) solutions lacking definitions of standardized data models).
In BIER, the same set of BFERs in a single subdomain is always encoded as the same BitString. In BIER-TE, the BitString used to reach the same set of BFERs in the same subdomain can be different for different overlay flows because the BitString encodes the paths towards the BFERs, so the BitStrings from different BFIRs to the same set of BFERs will often be different. Likewise, the BitString from the same BFIR to the same set of BFERs can be different for different overlay flows if different policies should be applied to those overlay flows, such as shortest path trees, Steiner trees (minimum cost trees), diverse path trees for redundancy, and so on.
See also [BIER-MCAST-OVERLAY] for an application leveraging BIER-TE engineered trees.
If the network topology changes (not failure based) so that adjacencies that are assigned to bit positions are no longer needed, the BIER-TE controller can reuse those bit positions for new adjacencies. First, these bit positions need to be removed from any BFIR flow state and BFR BIFT state. Then, they can be repopulated, first into the BIFT and then into the BFIR.
When links or nodes fail or recover in the topology, BIER-TE could quickly respond with "Fast Reroute" (FRR) procedures such as those described in [BIER-TE-PROTECTION], the details of which are out of scope for this document. It can also more slowly react by recalculating the BitStrings of affected multicast flows. This reaction is slower than the FRR procedure because the BIER-TE controller needs to receive link/node up/down indications, recalculate the desired BitStrings, and push them down into the BFIRs. With FRR, this is all performed locally on a BFR receiving the adjacency up/down notification.
The BIER-TE forwarding plane consists of the following components:¶
When the BIER-TE forwarding plane receives a packet, it simply looks up the bit positions that are set in the BitString of the packet in the BIFT that was populated by the BIER-TE controller. For every BP that is set in the BitString and has one or more adjacencies in the BIFT, a copy is made according to the types of adjacencies for that BP in the BIFT. Before sending any copies, the BFR clears all BPs in the BitString of the packet for which the BFR has one or more adjacencies in the BIFT. Clearing these bits prevents packets from looping when a BitString erroneously includes a forwarding loop. When a forward_connected() adjacency has the "DoNotClear" (DNC) flag set, this BP is set again (that is, not cleared) for the packet copied to that adjacency. See Section 4.2.1.
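The per-BFR behavior described above (replicate per adjacency, clear all locally known BPs, keep the BP set for DNC adjacencies) can be sketched as follows. This Python fragment is a minimal illustration, not the normative forwarding procedure: the BitString is modeled as an integer in which bit position n corresponds to mask `1 << (n - 1)`, and the `Adjacency` class and `forward` function are names assumed for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adjacency:
    neighbor: str        # hypothetical label for where the copy is sent
    dnc: bool = False    # DoNotClear flag (forward_connected() only)


def bp_mask(bp):
    """Bit mask for a 1-based bit position."""
    return 1 << (bp - 1)


def forward(bitstring, bift):
    """Return [(neighbor, bitstring_of_copy)] for one BFR.

    `bift` maps a bit position to the list of adjacencies configured for it.
    """
    local = [bp for bp, adjs in bift.items() if adjs]
    # Clear every BP this BFR has adjacencies for before sending any copy.
    cleared = bitstring
    for bp in local:
        cleared &= ~bp_mask(bp)
    copies = []
    for bp in sorted(local):
        if not bitstring & bp_mask(bp):
            continue  # BP not set in the packet, no copy for it
        for adj in bift[bp]:
            # With DNC, the adjacency's own BP stays set in this copy only.
            out = (cleared | bp_mask(bp)) if adj.dnc else cleared
            copies.append((adj.neighbor, out))
    return copies
```

For a BFR with p2 towards a neighbor "A" and p3 towards a neighbor "B" with DNC, a packet carrying (p2,p3,p5) yields one copy to "A" carrying only p5 and one copy to "B" carrying (p3,p5).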
For forward_connected() adjacencies, BIER-TE sends BIER packets to directly connected BIER-TE neighbors as L2 (unicast) BIER packets without requiring a routing underlay. For forward_routed() adjacencies, BIER-TE forwarding encapsulates a copy of the BIER packet so that it can be delivered by the forwarding plane of the routing underlay to the routable destination address indicated in the adjacency. See Section 4.2.2 for details on forward_routed() adjacencies.
BIER relies on the routing underlay to calculate paths towards BFERs and derive next-hop BFR adjacencies for those paths. These two steps commonly rely on BIER-specific extensions to the routing protocols of the routing underlay but may also be established by a controller. In BIER-TE, the next hops for a packet are determined by the BitString through the BIER-TE controller-established adjacencies on the BFR for the BPs of the BitString. There is thus no need for BFR-specific routing underlay extensions to forward BIER packets with BIER-TE semantics.
Encapsulation parameters can be provisioned by the BIER-TE controller into the forward_connected() or forward_routed() adjacencies directly without relying on a routing underlay.¶
If the BFR intends to support FRR for BIER-TE, then the BIER-TE forwarding plane needs to receive fast adjacency up/down notifications: link up/down or neighbor up/down, e.g., from "Bidirectional Forwarding Detection" (BFD). Providing these notifications is considered to be part of the routing underlay in this document.
Traffic Engineering [TE-OVERVIEW] provides performance optimization of operational IP networks while utilizing network resources economically and reliably. The key elements needed to effect Traffic Engineering are policy, path steering, and resource management. These elements require support at the control/controller level and within the forwarding plane.
Policy decisions are made within the BIER-TE control plane, i.e., within BIER-TE controllers. Controllers use policy when composing BitStrings and BFR BIFT state. The mapping of user/IP traffic to specific BitStrings / BIER-TE flows is made based on policy. The specific details of BIER-TE policies and how a controller uses them are out of scope for this document.
Path steering is supported via the definition of a BitString. BitStrings used in BIER-TE are composed based on policy and resource management considerations. For example, when composing BIER-TE BitStrings, a controller must take into account the resources available at each BFR and for each BP when it is providing congestion-loss-free services such as Rate-Controlled Service Disciplines [RCSD94]. Resource availability could be provided, for example, via routing protocol information but may also be obtained via a BIER-TE control protocol such as NETCONF or any other protocol commonly used by a controller to understand the resources of the network on which it operates. The resource usage of the BIER-TE traffic admitted by the BIER-TE controller can be solely tracked on the BIER-TE controller based on local accounting as long as no forward_routed() adjacencies are used (see Section 4.2.2 for the definition of forward_routed() adjacencies). When forward_routed() adjacencies are used, the paths selected by the underlying routing protocol need to be tracked as well.
Resource management has implications for the forwarding plane beyond the BIER-TE-defined steering of packets; this includes allocation of buffers to guarantee the worst-case requirements for admitted RCSD traffic and potentially policing and/or rate-shaping mechanisms, typically done via various forms of queuing. This level of resource control, while optional, is important in networks that wish to support congestion management policies to control or regulate the offered traffic to deliver different levels of service and alleviate congestion problems, or those networks that wish to control latencies experienced by specific traffic flows.
The BIER-TE BIFT is equivalent to the (non-TE) BIER BIFT. It exists on every BFR running BIER-TE. For every BIER "subdomain" (SD) in use for BIER-TE, the BIFT is constructed per the example shown in Figure 4. The BIFT in the figure assumes a BSL of 8 "bit positions" (BPs) in the packet's BitString. As in [RFC8279], this BSL is purely used as an example and is not a BSL supported by BIER/BIER-TE (minimum BSL is 64).
A BIER-TE BIFT is compared to a BIER BIFT as shown in [RFC8279] as follows.
In both BIER and BIER-TE, BIFT rows/entries are indexed in their respective BIER pseudocode ([RFC8279], Section 6.5) and BIER-TE pseudocode (Section 4.4) by the BIFT-index derived from the packet's SI, BSL, and the one bit position of the packet's BitString (BP) addressing the BIFT row: BIFT-index = SI * BSL + BP - 1. BPs within a BitString are numbered from 1 to BSL -- hence, the - 1 offset when converting to a BIFT-index. This document also uses the notion "SI:BP" to indicate BIFT rows. [RFC8279] uses the equivalent notion "SI:BitString", where the BitString is filled with only the BPs for the BIFT row.
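The index arithmetic above can be written out directly. The following Python helpers are a small sketch of the stated formula and its inverse; the function names are assumptions made for illustration, and the example BSL of 8 is the same illustrative value used in this section, not a BSL supported by BIER or BIER-TE.

```python
def bift_index(si, bsl, bp):
    """BIFT-index = SI * BSL + BP - 1, with BP numbered from 1 to BSL."""
    if not 1 <= bp <= bsl:
        raise ValueError("BP must be in the range 1..BSL")
    return si * bsl + bp - 1


def si_bp(index, bsl):
    """Inverse mapping from a BIFT-index back to the (SI, BP) pair."""
    si, offset = divmod(index, bsl)
    return si, offset + 1


print(bift_index(0, 8, 1))  # 0, the row written as 0:1
print(bift_index(0, 8, 8))  # 7, the row written as 0:8
print(bift_index(1, 8, 1))  # 8, the row written as 1:1
```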
In BIER, each BIFT-index addresses one BFER by its BFR-id = BIFT-index + 1 and is populated on each BFR with the next-hop "BFR Neighbor" (BFR-NBR) towards that BFER.
In BIER-TE, each BIFT-index and, therefore, SI:BP indicates one or, in the case of reuse of SI:BP, more than one adjacency between BFRs in the topology. The SI:BP is populated with the adjacency on the upstream BFR of the adjacency. The BIFT entries are empty on all other BFRs.
In BIER, each BIFT row also requires a "Forwarding Bit Mask" (F-BM) entry for BIER forwarding rules. In BIER-TE forwarding, an F-BM is not required but can be used when implementing BIER-TE on forwarding hardware, derived from BIER forwarding, that must use an F-BM. This is discussed in the first variation of BIER-TE forwarding pseudocode shown in Section 4.4.
   -------------------------------------------------------------------
   | BIFT-index |      |  Adjacencies:                               |
   |  (SI:BP)   |(F-BM)|  <empty> or one or more per entry           |
   ===================================================================
   | BIFT indices for Packets with SI=0                              |
   -------------------------------------------------------------------
   | 0 (0:1)    | ...  | forward_connected(interface,neighbor{,DNC}) |
   -------------------------------------------------------------------
   | 1 (0:2)    | ...  | forward_connected(interface,neighbor{,DNC}) |
   |            | ...  | forward_connected(interface,neighbor{,DNC}) |
   -------------------------------------------------------------------
   | ...        | ...  | ...                                         |
   -------------------------------------------------------------------
   | 4 (0:5)    | ...  | local_decap({VRF})                          |
   -------------------------------------------------------------------
   | 5 (0:6)    | ...  | forward_routed({VRF,}l3-neighbor)           |
   -------------------------------------------------------------------
   | 6 (0:7)    | ...  | <empty>                                     |
   -------------------------------------------------------------------
   | 7 (0:8)    | ...  | ECMP((adjacency1,...adjacencyN){,seed})     |
   -------------------------------------------------------------------
   | BIFT indices for BitString/Packet with SI=1                     |
   -------------------------------------------------------------------
   | 8 (1:1)    |      | ...                                         |
   | ...        | ...  | ...                                         |
   -------------------------------------------------------------------
The BIFT is configured for the BIER-TE data plane of a BFR by the BIER-TE controller through an appropriate protocol and data model. The BIFT is then used to forward packets, according to the procedures for the BIER-TE forwarding plane as specified in Section 3.3.
Note that a BIFT-index (SI:BP) may be populated in the BIFT of more than one BFR to save BPs. See Section 5.1.6 for an example of how a BIER-TE controller could assign BPs to (logical) adjacencies shared across multiple BFRs, Section 5.1.3 for an example of assigning the same BP to different adjacencies, and Section 5.1.9 for general guidelines regarding the reuse of BPs across different adjacencies.
{VRF} indicates the Virtual Routing and Forwarding context into which the BIER payload is to be delivered. This is optional and depends on the multicast flow overlay.
A "forward_connected()" adjacency is an adjacency towards a directly connected BFR-NBR using an interface address of that BFR on the connecting interface. A forward_connected() adjacency does not route packets; it only forwards them at L2 to the neighbor.
Packets sent to an adjacency with "DoNotClear" (DNC) set in the BIFT MUST NOT have the bit position for that adjacency cleared when the BFR creates a copy for it. The bit position will still be cleared for copies of a packet made towards other adjacencies. This can be used, for example, in ring topologies as explained in Section 5.1.6.
For protection against loops caused by misconfiguration (see Section 5.2.1), DNC is only permissible for forward_connected() adjacencies. No need or benefit of DNC for other types of adjacencies was identified, and associated risks were not analyzed.
A "forward_routed()" adjacency is an adjacency towards a BFR that uses a (tunneling) encapsulation that will cause a packet to be forwarded by the routing underlay towards the adjacent BFR indicated via the l3-neighbor parameter of the forward_routed() adjacency. This can leverage any feasible encapsulation, such as MPLS or tunneling over IP/IPv6, as long as the BIER-TE packet can be identified as a payload. This identification can rely on either the BIER/BIER-TE co-existence mechanisms described in Section 4.3 or explicit support for a BIER-TE payload type in the tunneling encapsulation.
Forward_routed() adjacencies are necessary to pass BIER-TE traffic across routers that are not BIER-TE capable or to minimize the number of required BPs by tunneling over (BIER-TE-capable) routers on which neither replication nor path steering is desired, or simply to leverage the routing underlay's path redundancy and FRR towards the next BFR. They may also be useful to a multi-subnet adjacent BFR for leveraging the routing underlay ECMP independently of BIER-TE ECMP (Section 4.2.3).
(Non-TE) BIER ECMP is tied to the BIER BIFT processing semantic and is therefore not directly usable with BIER-TE.
A BIER-TE "Equal-Cost Multipath" (ECMP()) adjacency as shown in Figure 4 for BIFT-index 7 has a list of two or more non-ECMP() adjacencies as parameters and an optional seed parameter. When a BIER-TE packet is copied onto such an ECMP() adjacency, an implementation-specific so-called hash function will select one out of the list's adjacencies to which the packet is forwarded. If the packet's encapsulation contains an entropy field, the entropy field SHOULD be respected; two packets with the same value of the entropy field SHOULD be sent on the same adjacency. The seed parameter permits the design of hash functions that are easy to implement at high speed without running into polarization issues across multiple consecutive ECMP hops. See Section 5.1.7 for details.
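Because the hash function is implementation specific, the following Python fragment is only one possible sketch of ECMP() selection. It assumes the entropy value and the seed are small integers and uses SHA-256 purely as a convenient, deterministic stand-in; a forwarding plane would normally use a much cheaper function. The name `select_ecmp` is hypothetical.

```python
import hashlib


def select_ecmp(adjacencies, entropy, seed=0):
    """Pick one adjacency from the list, based on the entropy value and the seed.

    Packets with the same entropy value always map to the same adjacency
    on a given BFR; different seeds give different mappings on different BFRs.
    """
    digest = hashlib.sha256(f"{seed}:{entropy}".encode()).digest()
    return adjacencies[int.from_bytes(digest[:4], "big") % len(adjacencies)]


links = ["adjacency1", "adjacency2", "adjacency3"]
# Two packets of the same flow (same entropy) use the same adjacency.
assert select_ecmp(links, entropy=42, seed=7) == select_ecmp(links, entropy=42, seed=7)
```

Configuring a different seed on consecutive ECMP hops is what keeps a flow's choice at one hop from being correlated with its choice at the next hop.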
A "local_decap()" adjacency passes a copy of the payload of the BIER-TE packet to the protocol ("NextProto") within the BFR (IP/IPv6, Ethernet,...) responsible for that payload according to the packet header fields. A local_decap() adjacency turns the BFR into a BFER for matching packets. Local_decap() adjacencies require the BFER to support routing or switching for NextProto to determine how to further process the packets.
Specifications for BIER-TE encapsulation are outside the scope of this document. This section gives explanations and guidelines.
The handling of "Maximum Transmission Unit" (MTU) limitations is outside the scope of this document and is not discussed in [RFC8279] either. Instead, this process is part of the BIER-TE packet encapsulation and/or flow overlay; for example, see [RFC8296], Section 3. It applies equally to BIER-TE and BIER.
Because a BFR needs to interpret the BitString of a BIER-TE packet differently from a (non-TE) BIER packet, it is necessary to distinguish BIER packets from BIER-TE packets. In BIER encapsulation [RFC8296], the BIFT-id field of the packet indicates the BIFT of the packet. BIER and BIER-TE can therefore be run simultaneously, when the BIFT-id address space is shared across BIER BIFTs and BIER-TE BIFTs. Partitioning the BIFT-id address space is subject to BIER-TE/BIER control plane procedures.
When [RFC8296] is used for BIER with MPLS, BIFT-id address ranges can be dynamically allocated from MPLS label space only for the set of actually used SD:BSL BIFTs. This also permits the allocation of non-overlapping label ranges for BIFT-ids that are to be used with BIER-TE BIFTs.
With MPLS, it is also possible to reuse the same SD space for both BIER-TE and BIER, so that the same SD has both a BIER BIFT with a corresponding range of BIFT-ids and disjoint BIER-TE BIFTs with a non-overlapping range of BIFT-ids.
Assume that a fixed mapping from BSL, SD, and SI to a BIFT-id is used, which does not explicitly partition the BIFT-id space between BIER and BIER-TE -- for example, as proposed for non-MPLS forwarding with BIER encapsulation [RFC8296] in [NON-MPLS-BIER-ENCODING], Section 5. In this case, it is necessary to allocate disjoint SDs to BIER and BIER-TE BIFTs so that both can be addressed by the BIFT-ids. The encoding proposed in Section 6 of [NON-MPLS-BIER-ENCODING] does not statically encode the BSL or SD into the BIFT-id, but the encoding permits a mapping and hence could provide the same freedom as when MPLS is being used (the same SD, or different SDs for BIER/BIER-TE).
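The partitioning of the BIFT-id space between BIER and BIER-TE can be pictured as a lookup table keyed by BIFT-id. The Python sketch below is purely illustrative: the `BiftIdRegistry` class, its methods, and the numeric BIFT-id values are assumptions for this example and do not correspond to any standardized allocation procedure.

```python
class BiftIdRegistry:
    """Hypothetical registry mapping each BIFT-id to its forwarding mode."""

    def __init__(self):
        self._by_id = {}

    def allocate(self, bift_id, mode, sd, bsl, si):
        """Bind a BIFT-id to exactly one BIFT, either BIER or BIER-TE."""
        if mode not in ("BIER", "BIER-TE"):
            raise ValueError("mode must be 'BIER' or 'BIER-TE'")
        if bift_id in self._by_id:
            raise ValueError(f"BIFT-id {bift_id} is already allocated")
        self._by_id[bift_id] = (mode, sd, bsl, si)

    def lookup(self, bift_id):
        """Return (mode, SD, BSL, SI) for the BIFT-id carried in a packet."""
        return self._by_id[bift_id]


registry = BiftIdRegistry()
# Same SD for both modes, as is possible when BIFT-ids are freely allocated.
registry.allocate(100, "BIER", sd=0, bsl=256, si=0)      # arbitrary example id
registry.allocate(200, "BIER-TE", sd=0, bsl=256, si=0)   # arbitrary example id
assert registry.lookup(200)[0] == "BIER-TE"
```

With a fixed mapping from (BSL, SD, SI) to BIFT-id, the two `allocate` calls above would have to use different SD values instead, since the BIFT-id would no longer be chosen independently.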
Forward_routed() requires an encapsulation that permits directing unicast encapsulated BIER-TE packets to a specific interface address on a target BFR. With MPLS encapsulation, this can simply be done via a label stack with that address's label as the top label, followed by the label assigned to the (BSL,SD,SI) BitString. With non-MPLS encapsulation, some form of IP encapsulation would be required (for example, IP/GRE).¶
The encapsulation used for forward_routed() adjacencies can equally support existing advanced adjacency information such as "loose source routes" via, for example, MPLS label stacks or appropriate header extensions (e.g., for IPv6).¶
The pseudocode for BIER-TE forwarding, as shown in Figure 5, is based on the (non-TE) BIER forwarding pseudocode provided in [RFC8279], Section 6.5, with one modification.¶
   void ForwardBitMaskPacket_withTE (Packet)
   {
       SI=GetPacketSI(Packet);
       Offset=SI*BitStringLength;
       for (Index = GetFirstBitPosition(Packet->BitString); Index ;
            Index = GetNextBitPosition(Packet->BitString, Index)) {
           F-BM = BIFT[Index+Offset]->F-BM;
           if (!F-BM) continue;                            [3]
           BFR-NBR = BIFT[Index+Offset]->BFR-NBR;
           PacketCopy = Copy(Packet);
           PacketCopy->BitString &= F-BM;                  [2]
           PacketSend(PacketCopy, BFR-NBR);
           // The following must not be done for BIER-TE:
           // Packet->BitString &= ~F-BM;                  [1]
       }
   }
In step [2], the F-BM is used to clear one or more bits in PacketCopy. This step exists in both BIER and BIER-TE, but the F-BMs need to be populated differently for BIER-TE than for BIER for the desired clearing.¶
In BIER, multiple bits of a BitString can have the same BFR-NBR. When a received packet's BitString has more than one of those bits set, BIER's replication logic has to prevent more than one PacketCopy from being sent to that BFR-NBR ([1]). Likewise, the PacketCopy sent to a BFR-NBR must clear all bits in its BitString that are not routed across that BFR-NBR. This prevents BIER's replication logic from creating duplicates on any possible further BFRs ([2]).¶
To solve both [1] and [2] for BIER, the F-BM of each bit index needs to have all bits set that this BFR wants to route across a BFR-NBR. [2] clears all other bits in PacketCopy->BitString, and [1] clears those bits from Packet->BitString after the first PacketCopy.¶
In BIER-TE, a BFR-NBR in this pseudocode is an adjacency -- forward_connected(), forward_routed(), or local_decap(). There is no need for [2] to suppress duplicates in the same way that BIER does, because in general, different BPs would never have the same adjacency. If a BIER-TE controller actually finds some optimization in which this would be desirable, then the controller is also responsible for ensuring that only one of those bits is set in any Packet->BitString, unless the controller explicitly wants duplicates to be created.¶
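The difference between the two uses of the same loop can be sketched in Python. This is a simplified model, not a forwarding implementation: BitStrings are plain integers, the BIFT is a dictionary, and the names forward_with_fbm, bier_bift, and te_bift are hypothetical. The BIER-TE F-BM values assume that each locally adjacent BP carries an F-BM with only the non-local bits set.

```python
def forward_with_fbm(bitstring, bift, clear_after_copy):
    """Replicate a packet as in the pseudocode above.

    bift maps a 1-based bit index to (f_bm, neighbor).
    clear_after_copy=True applies step [1] (non-TE BIER);
    False skips it, as BIER-TE requires.
    """
    copies = []
    index = 1
    while bitstring >> (index - 1):
        if bitstring & (1 << (index - 1)):
            f_bm, neighbor = bift.get(index, (0, None))
            if f_bm:                                         # step [3]
                copies.append((neighbor, bitstring & f_bm))  # step [2]
                if clear_after_copy:
                    bitstring &= ~f_bm                       # step [1]
        index += 1
    return copies

# (non-TE) BIER: bits 1 and 2 are both reached via N1, bit 3 via N2.
bier_bift = {1: (0b011, "N1"), 2: (0b011, "N1"), 3: (0b100, "N2")}
bier = forward_with_fbm(0b111, bier_bift, clear_after_copy=True)

# BIER-TE: bits 1 and 2 are local adjacencies; bit 3 belongs to another BFR.
# Assumed F-BM population: only the non-local bits are set.
te_bift = {1: (0b100, "adjacency-A"), 2: (0b100, "adjacency-B")}
te = forward_with_fbm(0b111, te_bift, clear_after_copy=False)
```

In the BIER case, step [1] keeps N1 from receiving a second copy. In the BIER-TE case, each set local bit yields its own copy, and both copies carry only the non-local bit 3.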
The following points describe how the F-BM for each BP is configured in the BIFT and how this impacts the BitString of the packet being processed with that BIFT:¶
This forwarding pseudocode can support the required BIER-TE forwarding functions (see Section 4.5) -- forward_connected(), forward_routed(), and local_decap() -- but cannot support the recommended functions (DNC flag and multiple adjacencies per bit) or the optional function (i.e., ECMP() adjacencies). The DNC flag cannot be supported when using only [1] to mask bits.¶
The modified and expanded forwarding pseudocode in Figure 6 specifies how to support all BIER-TE forwarding functions (required, recommended, and optional):¶
This pseudocode eliminates per-bit F-BMs, therefore reducing the size of BIFT state by SI*BSL^2 and eliminating the need for per-packet-copy BitString masking operations, except for adjacencies with the DNC flag set:¶
   void ForwardBitMaskPacket_withTE (Packet)
   {
       SI = GetPacketSI(Packet);
       Offset = SI * BitStringLength;
       // Determine adjacent bits in the packet's BitString
       PktAdjacentBits = Packet->BitString & AdjacentBits[SI];

       // Clear adjacent bits in the packet header to avoid loops
       Packet->BitString &= ~AdjacentBits[SI];

       // Loop over PktAdjacentBits to create packet copies
       for (Index = GetFirstBitPosition(PktAdjacentBits); Index ;
            Index = GetNextBitPosition(PktAdjacentBits, Index)) {
           for adjacency in BIFT[Index+Offset]->Adjacencies {
               if(adjacency.type == ECMP(ListOfAdjacencies,seed) ) {
                   I = ECMP_hash(sizeof(ListOfAdjacencies),
                                 Packet->Entropy,seed);
                   adjacency = ListOfAdjacencies[I];
               }
               PacketCopy = Copy(Packet);
               switch(adjacency.type) {
                   case forward_connected(interface,neighbor,DNC):
                       if(DNC)
                           PacketCopy->BitString |= 1<<(Index-1);
                       SendToL2Unicast(PacketCopy,interface,neighbor);

                   case forward_routed({VRF,}l3-neighbor):
                       SendToL3(PacketCopy,{VRF,}l3-neighbor);

                   case local_decap({VRF},neighbor):
                       DecapBierHeader(PacketCopy);
                       PassTo(PacketCopy,{VRF,}Packet->NextProto);
               }
           }
       }
   }
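The core of this pseudocode, without the ECMP() case, can be sketched in Python as follows. BitStrings are modeled as integers and a single SI is assumed; forward_te, the tuple layout of an adjacency, and the example BIFT are hypothetical names and values chosen for illustration.

```python
def forward_te(bitstring, adjacent_bits, bift):
    """Return the copies (kind, target, BitString) created by one BFR.

    adjacent_bits: mask of all BPs that have an adjacency on this BFR.
    bift: 1-based bit index -> list of (kind, target, dnc) adjacencies.
    """
    pkt_adjacent = bitstring & adjacent_bits
    remaining = bitstring & ~adjacent_bits     # cleared once, for all copies
    copies = []
    index = 1
    while pkt_adjacent >> (index - 1):
        if pkt_adjacent & (1 << (index - 1)):
            for kind, target, dnc in bift[index]:
                copy = remaining
                if kind == "forward_connected" and dnc:
                    copy |= 1 << (index - 1)   # DoNotClear: set this bit again
                copies.append((kind, target, copy))
        index += 1
    return copies

example_bift = {
    1: [("forward_connected", "BFR2", False)],
    2: [("forward_connected", "BFR3", True)],   # DNC set
    3: [("local_decap", None, False)],
}
# Bits 1, 2, and 4 are set; bit 4 has no adjacency on this BFR.
copies = forward_te(0b1011, adjacent_bits=0b0111, bift=example_bift)
```

Only the copy created for bit 2 still carries a local bit, because its adjacency has the DNC flag set; the copy for bit 1 carries only the non-local bit 4.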
BFRs that support BIER-TE and BIER MUST support a configuration that enables BIER-TE instead of (non-TE) BIER forwarding rules for all BIFTs of one or more BIER subdomains. Every BP in a BIER-TE BIFT MUST support having zero or one adjacency. BIER-TE forwarding MUST support the adjacency types forward_connected() with the DNC flag not set, forward_routed(), and local_decap(). As explained in Section 4.4, these required BIER-TE forwarding functions can be implemented via the same forwarding pseudocode as that used for BIER forwarding, except for one modification (skipping one masking with an F-BM).¶
BIER-TE forwarding SHOULD support forward_connected() adjacencies with the DNC flag set, as this is very useful for saving bits in rings (see Section 5.1.6).¶
BIER-TE forwarding SHOULD support more than one adjacency on a bit. This allows bits to be saved in hub-and-spoke scenarios (see Section 5.1.5).¶
BIER-TE forwarding MAY support ECMP() adjacencies to save bits in ECMP scenarios; see Section 5.1.7 for an example. This is an optional requirement, because for ECMP deployments using BIER-TE one can also leverage the routing underlay ECMP via forward_routed() adjacencies and/or might prefer to have more explicit control of the path chosen via explicit BPs/adjacencies for each ECMP path alternative.¶
This section describes how the BIER-TE controller can use the different BIER-TE adjacency types to define the bit positions of a BIER-TE domain.¶
Because the size of the BitString limits the size of the BIER-TE domain, many of the options described here exist to support larger topologies with fewer bit positions.¶
On a "point-to-point" (P2P) link that connects two BFRs, the same bit position can be used on both BFRs for the adjacency to the neighboring BFR. A P2P link therefore requires only one bit position.¶
A leaf BFER is one where incoming BIER-TE packets never need to be forwarded to another BFR but are only sent to the BFER to exit the BIER-TE domain. For example, in networks where "Provider Edge" (PE) routers are spokes connected to Provider (P) routers, those PEs are leaf BFERs, unless there is a U-turn between two PEs.¶
Consider how redundant disjoint traffic can reach BFER1/BFER2 as shown in Figure 7: when BFER1/BFER2 are non-leaf BFERs as shown on the right-hand side, one traffic copy would be forwarded to BFER1 from BFR1, but the other one could only reach BFER1 via BFER2, which makes BFER2 a non-leaf BFER. Likewise, BFER1 is a non-leaf BFER when forwarding traffic to BFER2. Note that the BFERs on the left-hand side of the figure are only guaranteed to be leaf BFERs by correctly applying a routing configuration that prohibits transit traffic from passing through the BFERs, which is commonly applied in these topologies.¶
   BFR1(P)      BFR2(P)            BFR1(P)      BFR2(P)
     |  \     /  |                   |           |
     |    X      |                   |           |
     |  /     \  |                   |           |
   BFER1(PE)  BFER2(PE)            BFER1(PE)----BFER2(PE)
                                             ^ U-turn link

    Leaf BFER /                 Non-leaf BFER /
    PE router                   PE router
In most situations, leaf BFERs that are to be addressed via the same BitString can share a single bit position for their local_decap() adjacency in that BitString and therefore save bit positions. On a non-leaf BFER, a received BIER-TE packet may only need to transit the BFER, or it may also need to be decapsulated. Whether or not to decapsulate the packet therefore needs to be indicated by a unique bit position populated only on the BIFT of this BFER with a local_decap() adjacency. On a leaf BFER, packets never need to pass through; any packet received is therefore usually intended to be decapsulated. This can be expressed by a single, shared bit position that is populated with a local_decap() adjacency on all leaf BFERs addressed by the BitString.¶
The possible exceptions to this leaf BFER bit position optimization scenario can be cases where the bit position on the prior BIER-TE BFR (which created the packet copy for the leaf BFER in question) is populated with multiple adjacencies as an optimization -- for example, as described in Sections 5.1.4 and 5.1.5. With either of these two optimizations, the sender of the packet could only control explicitly whether the packet was to be decapsulated on the leaf BFER in question, if the leaf BFER has a unique bit position for its local_decap() adjacency.¶
However, if the bit position is shared across a leaf BFER and packets are therefore decapsulated -- potentially unnecessarily -- this may still be appropriate if the decapsulated payload of the BIER-TE packet indicates whether or not the packets need to be further processed/received. This is typically true, for example, if the payload is IP multicast, because IP multicast on a BFER would know the membership state of the IP multicast payload and be able to discard it if the packets were delivered unnecessarily by the BIER-TE layer. If the payload has no such membership indication and the BFIR wants to have explicit control regarding which BFERs are to receive and decapsulate a packet, then these two optimizations cannot be used together with shared bit position optimization for a leaf BFER.¶
In a LAN, the adjacency to each neighboring BFR is given a unique bit position. The adjacency of this bit position is a forward_connected() adjacency towards the BFR, and this bit position is populated into the BIFT of all the other BFRs on that LAN.¶
         BFR1
          |p1
   LAN1-+-+---+-----+
      p3|   p4|   p2|
     BFR3  BFR4  BFR7
If bandwidth on the LAN is not an issue and most BIER-TE traffic should be copied to all neighbors on a LAN, then bit positions can be saved by assigning just a single bit position to the LAN and populating the bit position of the BIFTs of each BFR on the LAN with a list of forward_connected() adjacencies to all other neighbors on the LAN.¶
This optimization does not work in the case of BFRs redundantly connected to more than one LAN with this optimization. These BFRs would receive duplicates and forward those duplicates into the other LANs. Such BFRs require separate bit positions for each LAN they connect to.¶
In a setup with a hub and multiple spokes connected via separate P2P links to the hub, all P2P adjacencies from the hub to the spokes' links can share the same bit position. The bit position on the hub's BIFT is set up with a list of forward_connected() adjacencies, one for each spoke.¶
This option is similar to the bit position optimization in LANs: redundantly connected spokes need their own bit positions, unless they are themselves leaf BFERs.¶
This type of optimized BP could be used, for example, when all traffic is "broadcast" traffic (very dense receiver sets), such as live TV or many-to-many telemetry, including situational awareness. This BP optimization can then be used to explicitly steer different traffic flows across different ECMP paths in data-center or broadband-aggregation networks with minimal use of BPs.¶
In L3 rings, instead of assigning a single bit position for every P2P link in the ring, it is possible to save bit positions by setting the "DoNotClear" (DNC) flag on forward_connected() adjacencies.¶
For the ring shown in Figure 9, a single bit position will suffice to forward traffic entering the ring at BFRa or BFRb all the way up to BFR1, as follows.¶
On BFRa, BFRb, BFR30,... BFR3, the bit position is populated with a forward_connected() adjacency pointing to the clockwise neighbor on the ring and with DNC set. On BFR2, the adjacency also points to the clockwise neighbor BFR1, but without DNC set.¶
Handling DNC this way ensures that copies forwarded from any BFRs in the ring to a BFR outside the ring will not have the ring bit position set, therefore minimizing the risk of creating loops.¶
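The behavior of the single ring bit position can be traced with a short simulation. This is a sketch under simplifying assumptions: only the ring bit is modeled, the bit number is arbitrary, and the names RING_BIT, ORDER, and travel are hypothetical.

```python
RING_BIT = 0b1  # the single ring bit position (arbitrary numbering)

# Clockwise order as described above: BFRa, BFRb, BFR30, ..., BFR2, BFR1.
ORDER = ["BFRa", "BFRb"] + [f"BFR{i}" for i in range(30, 0, -1)]

def travel(entry, bitstring):
    """Return (receiving BFR, BitString of the copy) for each ring hop."""
    hops = []
    pos = ORDER.index(entry)
    while bitstring & RING_BIT and pos + 1 < len(ORDER):
        dnc = ORDER[pos] != "BFR2"   # BFR2 -> BFR1 is configured without DNC
        bitstring &= ~RING_BIT       # adjacent bits are cleared in every copy
        if dnc:
            bitstring |= RING_BIT    # DNC sets the bit again
        pos += 1
        hops.append((ORDER[pos], bitstring))
    return hops

hops = travel("BFRa", RING_BIT)
```

The copy that BFR1 receives no longer has the ring bit set, so BFR1 does not forward it further around the ring.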
             v         v
             |         |
      L1     |   L2    |   L3
  /-------- BFRa ---- BFRb --------------------\
  |                                            |
  \- BFR1 - BFR2 - BFR3 - ... - BFR29 - BFR30 -/
      |      |   L4               |       |
   p33|                        p15|
      BFRd                        BFRc
Note that this example only permits packets intended to make it all the way around the ring to enter it at BFRa and BFRb. Note also that packets will always travel clockwise. If packets should be allowed to enter the ring at any of the ring's BFRs, then one would have to use two ring bit positions, one for each direction: clockwise and counterclockwise.¶
Both would be set up to stop rotating on the same link, e.g., L1. When the ring's BFIR creates the clockwise copy, it will clear the counterclockwise bit position because the DNC bit only applies to the bit for which the replication is done (likewise for the clockwise bit position for the counterclockwise copy). As a result, the ring's BFIR will send a copy in both directions, serving BFRs on either side of the ring up to L1.¶
An ECMP() adjacency allows the use of just one BP to deliver packets to one of N adjacencies instead of one BP for each adjacency. In the common example case shown in Figure 10, a link bundle of three links L1,L2,L3 connects BFR1 and BFR2, and only one BP is used instead of three BPs to deliver packets from BFR1 to BFR2.¶
                 --L1-----
            BFR1 --L2----- BFR2
                 --L3-----

   BIFT entry in BFR1:
   ------------------------------------------------------------------
   | Index | Adjacencies                                            |
   ==================================================================
   | 0:6   | ECMP({forward_connected(L1, BFR2),                     |
   |       |       forward_connected(L2, BFR2),                     |
   |       |       forward_connected(L3, BFR2)}, seed)              |
   ------------------------------------------------------------------

   BIFT entry in BFR2:
   ------------------------------------------------------------------
   | Index | Adjacencies                                            |
   ==================================================================
   | 0:6   | ECMP({forward_connected(L1, BFR1),                     |
   |       |       forward_connected(L2, BFR1),                     |
   |       |       forward_connected(L3, BFR1)}, seed)              |
   ------------------------------------------------------------------
This document does not standardize any ECMP algorithm because it is sufficient for implementations to document their freely chosen ECMP algorithm. Figure 11 shows an example ECMP algorithm and would double as its documentation: a BIER-TE controller could determine which adjacency is chosen based on the seed and adjacencies parameters and on packet entropy.¶
   forward(packet, ECMP(adj(0), adj(1),... adj(N-1), seed)):
      i = (packet(bier-header-entropy) XOR seed) % N
      forward packet to adj(i)
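The example algorithm translates directly into a few lines of Python. The function name ecmp_select and the representation of adjacencies as strings are assumptions made for this sketch.

```python
def ecmp_select(adjacencies, entropy, seed):
    """Pick one adjacency as in the example: (entropy XOR seed) mod N."""
    return adjacencies[(entropy ^ seed) % len(adjacencies)]

bundle = ["L1", "L2", "L3"]
first = ecmp_select(bundle, entropy=5, seed=3)    # (5 XOR 3) % 3 == 0
second = ecmp_select(bundle, entropy=4, seed=3)   # (4 XOR 3) % 3 == 1
```

Because the result depends only on the adjacency list, the seed, and the packet entropy, a controller that knows all three can predict which link a given flow will use.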
In the example shown in Figure 12, all traffic from BFR1 towards BFR10 is intended to be ECMP load-split equally across the topology. This example is not meant as a likely setup; rather, it illustrates that ECMP can be used to share BPs not only across link bundles but also across alternative paths across different transit BFRs, and it explains the use of the seed parameter.¶
                      BFR1 (BFIR)
                     /L11      \L12
                    /           \
            BFR2                 BFR3
           /L21  \L22          /L31  \L32
          /       \           /       \
       BFR4      BFR5      BFR6      BFR7
          \       /           \       /
           \     /             \     /
            BFR8                 BFR9
                \               /
                 \             /
                  BFR10 (BFER)

   BIFT entry in BFR1:
   ------------------------------------------------------------------
   | 0:6   | ECMP({forward_connected(L11, BFR2),                    |
   |       |       forward_connected(L12, BFR3)}, seed1)            |
   ------------------------------------------------------------------

   BIFT entry in BFR2:
   ------------------------------------------------------------------
   | 0:7   | ECMP({forward_connected(L21, BFR4),                    |
   |       |       forward_connected(L22, BFR5)}, seed1)            |
   ------------------------------------------------------------------

   BIFT entry in BFR3:
   ------------------------------------------------------------------
   | 0:7   | ECMP({forward_connected(L31, BFR6),                    |
   |       |       forward_connected(L32, BFR7)}, seed1)            |
   ------------------------------------------------------------------

   BIFT entry in BFR4, BFR5:
   ------------------------------------------------------------------
   | 0:8   | forward_connected(Lxx, BFR8)   |xx differs on BFR4/BFR5|
   ------------------------------------------------------------------

   BIFT entry in BFR6, BFR7:
   ------------------------------------------------------------------
   | 0:8   | forward_connected(Lxx, BFR9)   |xx differs on BFR6/BFR7|
   ------------------------------------------------------------------

   BIFT entry in BFR8, BFR9:
   ------------------------------------------------------------------
   | 0:9   | forward_connected(Lxx, BFR10)  |xx differs on BFR8/BFR9|
   ------------------------------------------------------------------
Note that for the following discussion of ECMP, only the BIFT ECMP() adjacencies on BFR1, BFR2, and BFR3 are relevant. The reuse of BPs across BFRs in this example is further explained in Section 5.1.9 below.¶
With the ECMP setup shown in the topology above, traffic would not be equally load-split. Instead, links L22 and L31 would see no traffic at all: BFR2 will only see traffic from BFR1, for which the ECMP hash in BFR1 selected the first adjacency in the list of two adjacencies given as parameters to the ECMP: link L11-to-BFR2. BFR2 again performs ECMP with two adjacencies on that subset of traffic using the same seed1 and will therefore again select the first of its two adjacencies: L21-to-BFR4. Therefore, L22 and BFR5 see no traffic (likewise for L31 and BFR6).¶
This issue in BFR2/BFR3 is called "polarization". It results from the reuse of the same hash function across multiple consecutive hops in topologies like these. To resolve this issue, the ECMP() adjacency on BFR1 can be set up with a different seed2 than the ECMP() adjacencies on BFR2/BFR3. BFR2/BFR3 can use the same hash because packets will not sequentially pass across both of them. Therefore, they can also use the same BP (i.e., 0:7).¶
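Polarization and its resolution can be reproduced with a small simulation of the first two ECMP hops. The hash used here is a hypothetical one in which the seed selects which bits of the packet entropy are used; it is chosen so that different seeds yield independent choices, and it is not the XOR-based example given earlier (for N=2, XORing with a different seed would only swap which of the two links is picked at the second hop). The function names are likewise assumptions.

```python
def ecmp_index(entropy, seed, n):
    # Hypothetical hash: the seed selects which entropy bits are used.
    return (entropy >> seed) % n

def second_hop_links(seed_bfr1, seed_bfr23, entropies):
    """Return the set of second-hop links that carry any traffic."""
    used = set()
    for e in entropies:
        if ecmp_index(e, seed_bfr1, 2) == 0:       # BFR1 picks L11 -> BFR2
            used.add(("L21", "L22")[ecmp_index(e, seed_bfr23, 2)])
        else:                                      # BFR1 picks L12 -> BFR3
            used.add(("L31", "L32")[ecmp_index(e, seed_bfr23, 2)])
    return used

same_seed = second_hop_links(0, 0, range(16))
different_seeds = second_hop_links(0, 1, range(16))
```

With the same seed on both hops, only L21 and L32 carry traffic, which matches the polarization described above; with different seeds, all four second-hop links are used.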
Note that ECMP solutions outside of BIER often hide the seed by auto-selecting it from local entropy such as unique local or next-hop identifiers. Allowing the BIER-TE controller to explicitly set the seed gives the BIER-TE controller the ability to control the selection of the same path or different paths across multiple consecutive ECMP hops.¶
Forward_routed() adjacencies can reduce the number of bit positions required when the path steering requirement is not hop-by-hop explicit path selection but rather is loose-hop selection. Forward_routed() adjacencies can also permit BIER-TE operation across intermediate-hop routers that do not support BIER-TE.¶
Assume that the requirement in Figure 13 is to explicitly steer traffic flows that have arrived at BFR1 or BFR4 via a path in the routing underlay "Network Area 1" to one of the following next three segments: (1) BFR2 via link L1, (2) BFR2 via link L2, or (3) via BFR3 and then not caring whether the packet is forwarded via L3 or L4.¶
                   ...............
           ...BFR1--...       ...--L1-- BFR2...
                   ... .Routers. ...--L2--/
           ...BFR4--...       ...--L3-- BFR3...
                   ...        ...--L4--/   |
                   ...............         |
                                          LO
                    Network Area 1
To enable this, both BFR1 and BFR4 are set up with a forward_routed() adjacency bit position towards an address of BFR2 on link L1, another forward_routed() bit position towards an address of BFR2 on link L2, and a third forward_routed() bit position towards a node address LO of BFR3.¶
Forward_routed() adjacencies also enable incremental deployment of BIER-TE. Only the nodes through which BIER-TE traffic needs to be steered -- with or without replication -- need to support BIER-TE. Where they are not directly connected to each other, forward_routed() adjacencies are used to pass over nodes that are not BIER-TE enabled.¶
BPs can be reused across multiple BFRs to minimize the number of BPs needed. This happens when adjacencies on multiple BFRs use the DNC flag as described above, but it can also be done for non-DNC adjacencies. This section only discusses this non-DNC case.¶
Because a given BP is cleared when passing a BFR with an adjacency for that BP, reusing BPs across multiple BFRs does not introduce any problems with duplicates or loops that do not also exist when every adjacency has a unique BP. Instead, the challenge when reusing BPs is whether the desired Tree Engineering goals can still be achieved.¶
A BP cannot be reused across two BFRs that would need to be passed sequentially for some path: the first BFR will clear the BP, so those paths cannot be built. A BP can be reused across BFRs that would only occur across (A) different paths or (B) different branches of the same tree.¶
An example of (A) was given in Figure 12, where BP 0:7, BP 0:8, and BP 0:9 are each reused across multiple BFRs because a single packet/path would never be able to reach more than one BFR sharing the same BP.¶
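A controller could check this reuse rule mechanically. The following sketch flags any BP that has adjacencies on two BFRs of the same path; the function name reuse_conflicts and the data layout are assumptions made for illustration, while the BP assignment is the one from the ECMP topology above.

```python
def reuse_conflicts(paths, bps_on_bfr):
    """Return (BP, first BFR, later BFR) for every BP met twice on one path."""
    conflicts = set()
    for path in paths:
        seen = {}
        for bfr in path:
            for bp in bps_on_bfr.get(bfr, ()):
                if bp in seen:
                    conflicts.add((bp, seen[bp], bfr))
                else:
                    seen[bp] = bfr
    return conflicts

bps_on_bfr = {
    "BFR1": {"0:6"},
    "BFR2": {"0:7"}, "BFR3": {"0:7"},
    "BFR4": {"0:8"}, "BFR5": {"0:8"}, "BFR6": {"0:8"}, "BFR7": {"0:8"},
    "BFR8": {"0:9"}, "BFR9": {"0:9"},
}
paths = [
    ["BFR1", "BFR2", "BFR4", "BFR8", "BFR10"],
    ["BFR1", "BFR2", "BFR5", "BFR8", "BFR10"],
    ["BFR1", "BFR3", "BFR6", "BFR9", "BFR10"],
    ["BFR1", "BFR3", "BFR7", "BFR9", "BFR10"],
]
ok = reuse_conflicts(paths, bps_on_bfr)

# Counter-example: if BFR4 also used BP 0:7, BFR2 would clear it first.
broken = reuse_conflicts(paths, {**bps_on_bfr, "BFR4": {"0:7"}})
```

The assignment from the example produces no conflicts, whereas the counter-example is reported because BFR2 and BFR4 are passed sequentially on the first path.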
Assume that the example was changed: BFR1 has no ECMP() adjacency for BP 0:6 but instead has BP 0:5 with forward_connected() to BFR2 and BP 0:6 with forward_connected() to BFR3. Packets with both BP 0:5 and BP 0:6 would now be able to reach both BFR2 and BFR3, and the still-existing reuse of BP 0:7 between BFR2 and BFR3 is a case of (B) where reusing a BP is perfect because it does not limit the set of useful path choices, as in the following example.¶
If, instead of reusing BP 0:7, BFR3 used a separate BP 0:10 for its ECMP() adjacency, no useful additional path steering options would be enabled. If duplicates at BFR10 were undesirable, they would be avoided by not setting BP 0:5 and BP 0:6 for the same packet. If the duplicates were desirable (e.g., resilient transmission), the additional BP 0:10 would also not render additional value.¶
Reuse may also save BPs in larger topologies. Consider the topology shown in Figure 14.¶
                  area1
              BFR1a BFR1b
                /        \
     ....................................
     .                Core              .
     ....................................
       |       /    \        /     \      |
     BFR2a   BFR2b  BFR3a  BFR3b  BFR6a BFR6b
     /-------\  /---------\     /--------\
     | area2 |  |  area3  | ... | area6  |
     | ring  |  |  ring   |     |  ring  |
     \-------/  \---------/     \--------/
     more BFRs  more BFRs       more BFRs
A BFIR/sender (e.g., video headend) is attached to area 1, and the five areas 2...6 contain receivers/BFERs. Assume that each area has a distribution ring, each with two BPs to indicate the direction (as explained before). These two BPs could be reused across the five areas. Packets would be replicated through other BPs from the core to the desired subset of areas, and once a packet copy reaches the ring of the area, the two ring BPs come into play. This reuse is a case of (B), but it limits the topology choices: packets can only flow around the same direction in the rings of all areas. This may or may not be acceptable based on the desired path steering options: if resilient transmission is the path engineering goal, then it is likely a good optimization; however, if the bandwidth of each ring were to be optimized separately, it would not be a good limitation.¶
In this section, we reviewed a range of techniques by which a BIER-TE controller can create a BIER-TE topology in a way that minimizes the number of necessary BPs.¶
Without any optimization, a BIER-TE controller would attempt to map the network subnet topology 1:1 into the BIER-TE topology, every adjacent neighbor in the subnet would require a forward_connected() BP, and every BFER would require a local_decap() BP.¶
The optimizations described in this document are then as follows:¶
Note that this list of optimizations is not exhaustive. Further optimizations of BPs are possible, especially when both the set of required path steering choices and the possible subsets of BFERs that should be able to receive traffic are limited. The hub-and-spoke optimization is a simple example of such traffic-pattern-dependent optimizations.¶
Whenever BIER-TE creates a copy of a packet, the BitString of that copy will have all bit positions cleared that are associated with adjacencies on the BFR. This prevents packets from looping. The only exceptions are adjacencies with DNC set.¶
With DNC set, looping can happen. Consider in Figure 15 that link L4 from BFR3 is (inadvertently) plugged into the L1 interface of BFRa (instead of BFR2). This creates a loop where the ring's clockwise bit position is never cleared for copies of the packets traveling clockwise around the ring.¶
             v         v
             |         |
      L1     |   L2    |   L3
  /-------- BFRa ---- BFRb --------------------\
  |          .                                 |
  |           ...... Wrong link wiring         |
  |                 .                          |
  \- BFR1 - BFR2   BFR3 - ... - BFR29 - BFR30 -/
      |      |   L4               |       |
   p33|                        p15|
      BFRd                        BFRc
To inhibit looping in the face of such physical misconfiguration, only forward_connected() adjacencies are permitted to have DNC set, and the link layer port unique unicast destination address of the adjacency (e.g., "Media Access Control" (MAC) address) protects against closing the loop. Link layers without port unique link layer addresses should not be used with the DNC flag set.¶
Duplicates happen when the graph expressed by a BitString is not a tree but is redundantly connecting BFRs with each other. In Figure 16, a BitString of p2,p3,p4,p5 would result in duplicate packets arriving on BFER4. The BIER-TE controller must therefore ensure that only BitStrings that are trees are created.¶
           BFIR1
          /   \
         / p2  \ p3
      BFR2     BFR3
         \ p4  / p5
          \   /
          BFER4
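A controller can detect such BitStrings before installing them by counting how many copies would arrive at each BFR. The sketch below models the four bit positions of this figure as directed edges; the names EDGES and arrivals are hypothetical, and the traversal assumes a loop-free set of edges.

```python
from collections import Counter

# Bit position -> (sending BFR, receiving BFR), as in the figure above.
EDGES = {
    "p2": ("BFIR1", "BFR2"), "p3": ("BFIR1", "BFR3"),
    "p4": ("BFR2", "BFER4"), "p5": ("BFR3", "BFER4"),
}

def arrivals(bits, root="BFIR1"):
    """Count the packet copies each BFR receives for a set of bit positions."""
    counts = Counter()
    pending = [root]
    while pending:
        node = pending.pop()
        for bp in sorted(bits):
            src, dst = EDGES[bp]
            if src == node:
                counts[dst] += 1
                pending.append(dst)
    return counts

redundant = arrivals({"p2", "p3", "p4", "p5"})
tree = arrivals({"p2", "p4"})
```

Any count greater than one indicates that the BitString does not describe a tree; here, BFER4 would receive two copies when all four bits are set.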
When links are incorrectly physically reconnected before the BIER-TE controller updates BitStrings in BFIRs, duplicates can happen. Like loops, these can be inhibited by link layer addressing in forward_connected() adjacencies.¶
If interface or loopback addresses used in forward_routed() adjacencies are moved from one BFR to another, duplicates are equally likely to happen. Such readdressing operations must be coordinated with the BIER-TE controller.¶
When the number of bits required to represent the necessary hops in the topology and BFERs exceeds the supported "BitStringLength" (BSL), multiple SIs and/or subdomains must be used. This section discusses how this is done.¶
BIER-TE forwarding does not require the concept of BFR-ids, but routing underlay, flow overlay, and BIER headers may. This section also discusses how BFR-ids can be assigned to BFIRs/BFERs for BIER-TE.¶
For (non-TE) BIER and BIER-TE forwarding, the most important result of using multiple SIs and/or subdomains is the same: multicast flow overlay packets that need to be sent to BFERs in different SIs or subdomains require multiple BIER packets, each one with a BitString for a different (SI,subdomain) combination. Each such BitString uses one BSL-sized SI block in the BIFT of the subdomain. We call this a BIFT:SI (block).¶
SIs and subdomains have different purposes in the BIER architecture and also the BIER-TE architecture. This impacts how operators manage them and especially how flow overlays will likely use them.¶
By default, every possible BFIR/BFER in a BIER network would likely be given a BFR-id in subdomain 0 (unless there are > 64k BFIRs/BFERs).¶
If there are different flow services (or service instances) requiring replication to different subsets of BFERs, then it will likely not be possible to achieve the best replication efficiency for all of these service instances via subdomain 0. Ideal replication efficiency for N BFERs exists in a subdomain if they are split over no more than ceiling(N/BitStringLength) SIs.¶
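The efficiency criterion can be written down as a short check. The function names min_sis and is_ideal, and the dictionary mapping each BFER of a service to its SI, are assumptions made for this sketch.

```python
import math

def min_sis(n_bfers: int, bsl: int) -> int:
    """Smallest number of SIs that can hold n_bfers: ceiling(N/BSL)."""
    return math.ceil(n_bfers / bsl)

def is_ideal(si_of_bfer: dict, bsl: int) -> bool:
    """True if the service's BFERs are spread over no more SIs than needed."""
    used_sis = set(si_of_bfer.values())
    return len(used_sis) <= min_sis(len(si_of_bfer), bsl)

# Three receivers that would fit into one SI but are spread over two.
scattered = is_ideal({"BFER-a": 0, "BFER-b": 0, "BFER-c": 1}, bsl=256)
compact = is_ideal({"BFER-a": 0, "BFER-b": 0, "BFER-c": 0}, bsl=256)
```

A service whose receivers are scattered over more SIs than ceiling(N/BSL) requires more BIER packets per flow overlay packet than strictly necessary.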
If service instances justify additional BIER:SI state in the network, additional subdomains will be used: BFIRs/BFERs are assigned BFR-ids in those subdomains, and each service instance is configured to use the most appropriate subdomain. This results in improved replication efficiency for different services.¶
Even if creation of subdomains and assignment of BFR-ids to BFIRs/BFERs in those subdomains is automated, it is not expected that individual service instances can deal with BFERs in different subdomains. A service instance may only support configuration of a single subdomain it should rely on.¶
To be able to easily reuse (and modify as little as possible) existing BIER procedures (including flow overlay and routing underlay), when BIER-TE forwarding is added, we therefore reuse SIs and subdomains logically in the same way as they are used in BIER: all necessary BFIRs/BFERs for a service use a single BIER-TE BIFT and are split across as many SIs as necessary (see Section 5.3.2). Different services may use different subdomains that primarily exist to provide more efficient replication (and, for BIER-TE, desirable path steering) for different subsets of BFIRs/BFERs.¶
In BIER, BitStrings only need to carry bits for BFERs; this leads to the model where BFR-ids map 1:1 to each bit in a BitString.¶
In BIER-TE, BitStrings need to carry bits to indicate not only the receiving BFER but also the intermediate hops/links across which the packet must be sent. The maximum number of BFERs that can be supported in a single BitString or BIFT:SI depends on the number of bits necessary to represent the desired topology between them.¶
"Desired" topology means that it depends on the physical topology and the operator's desire to¶
The total number of bits to describe the topology vs. the number of BFERs in a BIFT:SI can range widely based on the size of the topology and the number of alternative paths in it. In a BIER-TE topology crafted by a BIER-TE expert, the higher the percentage of non-BFER bits, the higher the likelihood that those topology bits are not just BIER-TE overhead without additional benefit but instead will allow the expression of desirable path steering alternatives.¶
BIER-TE forwarding does not use BFR-ids, nor does it require that the BFIR-id field of the BIER header be set to a particular value. However, other parts of a BIER-TE deployment may need a BFR-id -- specifically, multicast flow overlay signaling and multicast flow overlay packet disposition; in that case, BFRs need to also have BFR-ids for BIER-TE SDs.¶
For example, for BIER overlay signaling, BFIRs need to have a BFR-id, because this BFIR BFR-id is carried in the BFIR-id field of the BIER header to indicate to the overlay signaling on the receiving BFER which BFIR originated the packet.¶
In BIER, BFR-id = SI * BSL + BP, such that the SI and BP of a BFER can be calculated from the BFR-id and vice versa. This also means that every BFR with a BFR-id has a reserved BP in an SI, even if that is not necessary for BIER forwarding, because the BFR may never be a BFER (i.e., will only be a BFIR).¶
In BIER-TE, for a non-leaf BFER, there is usually a single BP for that BFER with a local_decap() adjacency on the BFER. The BFR-id for such a BFER can therefore be determined using the same procedure as that used for (non-TE) BIER: BFR-id = SI * BSL + BP.¶
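The mapping and its inverse can be sketched as follows, assuming that BPs are numbered from 1 to BSL within an SI, as in [RFC8279]. The function names are hypothetical.

```python
def bfr_id(si: int, bp: int, bsl: int) -> int:
    """BFR-id = SI * BSL + BP, with BP in the range 1..BSL."""
    if not 1 <= bp <= bsl:
        raise ValueError("BP must be in 1..BSL")
    return si * bsl + bp

def si_and_bp(bfr_id_value: int, bsl: int) -> tuple:
    """Inverse mapping: recover (SI, BP) from a BFR-id."""
    si, remainder = divmod(bfr_id_value - 1, bsl)
    return si, remainder + 1

# local_decap() BP 3 in SI 1 with a BSL of 256.
example_id = bfr_id(si=1, bp=3, bsl=256)
```

Subtracting one before the division keeps the highest BP of an SI (BP = BSL) in that SI instead of spilling it into the next one.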
As explained in Section 5.1.3, leaf BFERs do not need such a unique local_decap() adjacency. Likewise, BFIRs that are not also BFERs may not have a unique local_decap() adjacency either. For all those BFIRs and (leaf) BFERs, the controller needs to determine unique BFR-ids that do not collide with the BFR-ids derived from the non-leaf BFER local_decap() BPs.¶
While this document defines no requirements on how to allocate such BFR-ids, a simple option is to derive them from the (SI,BP) of an adjacency that is unique to the BFR in question. For a BFIR, this can be the first adjacency that is only populated on this BFIR; for a leaf BFER, this could be the first BP with an adjacency towards that BFER.¶
In BIER, applications of the flow overlay on a BFIR can calculate the (SI,BP) of a BFER from the BFR-id of the BFER and can therefore easily determine the BitStrings for a BIER packet to a set of BFERs with known BFR-ids.¶
In BIER-TE, this mapping needs to be equally supported for flow overlays. This section outlines two core options, based on what type of Tree Engineering the BIER-TE controller needs to perform for a particular application.¶
If "independent branches" are used, the BIER-TE controller can signal to the BFIR flow overlay for every BFER an SI:BitString that represents the branch to that BFER. The flow overlay on the BFIR can then, independently of the controller, calculate the SI:BitString for all desired BFERs by ORing their BitStrings. This allows flow overlay applications to operate independently of the controller whenever they need to determine which subset of BFERs needs to receive a particular packet.¶
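For the independent-branches case, the calculation on the BFIR reduces to a bitwise OR per SI. In this sketch, the branch table stands for what the controller signaled; its contents and the function name bitstrings_for are assumptions chosen for illustration.

```python
def bitstrings_for(bfers, branches):
    """OR the per-BFER branch BitStrings, grouped by SI.

    branches: BFER name -> (SI, BitString as int), as signaled by the
    controller. Returns SI -> combined BitString; one packet per SI.
    """
    per_si = {}
    for bfer in bfers:
        si, bits = branches[bfer]
        per_si[si] = per_si.get(si, 0) | bits
    return per_si

branches = {
    "BFER1": (0, 0b00101),
    "BFER2": (0, 0b01001),
    "BFER3": (1, 0b00011),
}
packets = bitstrings_for(["BFER1", "BFER2", "BFER3"], branches)
```

BFER1 and BFER2 share bit 1 of their branches, so their combined BitString contains it only once; BFER3 lies in a different SI and therefore requires a second packet.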
If "interdependent branches" are required, an application would need to query the SI:BitString for a given set of BFERs whenever the set changes.
Note that in either case (unlike the scenario for BIER), the bits may need to change upon link/node failure/recovery, network expansion, or network resource consumption by other traffic as part of achieving Traffic Engineering goals (e.g., reoptimization of lower-priority traffic flows). Interactions between such BFIR applications and the BIER-TE controller therefore need to support dynamic updates to the SIs:BitStrings.
Communications between the BFIR flow overlay and the BIER-TE controller require some way to identify the BFERs. If BFR-ids are used in the deployment, as outlined in Section 5.3.3, then those are the "natural" BFR-ids. If BFR-ids are not used, then any other unique identifier, such as a BFR's BFR-prefix [RFC8279], could be used.
It is not currently determined if a single subdomain could or should be allowed to forward both (non-TE) BIER and BIER-TE packets. If this should be supported, there are two options:
Consider a network setup with a BSL of 256 for a network topology as shown in Figure 17. The network has six areas, each with 170 BFERs, connecting via a core with four (core) BFRs. To address all BFERs with BIER, four SIs are required. To send a BIER packet to all BFERs in the network, four copies need to be sent by the BFIR. On the BFIR, it does not matter how the BFR-ids are allocated to BFERs in the network, but it does matter for efficiency further down in the network.
     area1           area2        area3
   BFR1a BFR1b  BFR2a BFR2b   BFR3a BFR3b
     |  \         /    \        /  |
     ................................
     .                Core          .
     ................................
     |    /       \    /        \  |
   BFR4a BFR4b  BFR5a BFR5b   BFR6a BFR6b
     area4          area5        area6
With random allocation of BFR-ids to BFERs, each receiving area would (most likely) have to receive all four copies of the BIER packet because there would be BFR-ids for each of the four SIs in each of the areas. Only further towards each BFER would this duplication subside -- when each of the four trees runs out of branches.
If BFR-ids are allocated intelligently, then all the BFERs in an area would be given BFR-ids with as few different SIs as possible. Each area would only have to forward one or two packets instead of four.
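The arithmetic behind this example can be checked with a short Python sketch. It assumes one possible "intelligent" scheme, in which each area receives a contiguous block of 170 BFR-ids in area order; the function and constant names are hypothetical, and BFR-ids are assumed to start at 1 as in [RFC8279].

```python
import math

BSL = 256             # BitString length used in the example
AREAS = 6             # number of areas in Figure 17
BFERS_PER_AREA = 170  # BFERs per area in the example


def si_of(bfr_id: int, bsl: int = BSL) -> int:
    """SI that holds a given BFR-id (BFR-id = SI * BSL + BP, BP in 1..BSL)."""
    return (bfr_id - 1) // bsl


def sis_per_area_contiguous() -> dict[int, list[int]]:
    """SIs needed in each area when area k holds one contiguous BFR-id block."""
    result = {}
    for area in range(1, AREAS + 1):
        first = (area - 1) * BFERS_PER_AREA + 1
        last = area * BFERS_PER_AREA
        result[area] = sorted({si_of(i) for i in range(first, last + 1)})
    return result


# 6 * 170 = 1020 BFERs need ceil(1020 / 256) = 4 SIs in total.
total_sis = math.ceil(AREAS * BFERS_PER_AREA / BSL)

per_area = sis_per_area_contiguous()
# per_area == {1: [0], 2: [0, 1], 3: [1], 4: [1, 2], 5: [2, 3], 6: [3]}
```

Under this assumed allocation, every area needs only one or two of the four copies, which matches the statement above.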
Given how networks can grow over time, replication efficiency in an area will then also go down over time when BFR-ids are only allocated sequentially, network wide. An area that initially only has BFR-ids in one SI might end up with many SIs over a longer period of growth. Allocating SIs to areas that initially have sufficiently many spare bits for growth can help alleviate this issue. Alternatively, BFERs can be renumbered after network expansion. In this example, one may consider using six SIs and assigning one to each area.
This example shows that intelligent BFR-id allocation within at least subdomain 0 can be helpful or even necessary in BIER.
In BIER-TE, one needs to determine a subset of the physical topology and attached BFERs so that the "desired" representation of this topology and the BFERs fit into a single BitString. This process needs to be repeated until the whole topology is covered.
Once bits/SIs are assigned to the topology and BFERs, BFR-ids are just a derived set of identifiers from the operator / BIER-TE controller as explained above.
Whenever different subtopologies have overlap, bits need to be repeated across the BitStrings, increasing the overall number of bits required across all BitStrings/SIs. In the worst case, one assigns random subsets of BFERs to different SIs. This will result in an outcome much worse than in (non-TE) BIER: it maximizes the amount of unnecessary topology overlap across SIs and therefore reduces the number of BFERs that can be reached across each individual SI. Intelligent BFER-to-SI assignment and selecting specific "desired" subtopologies can minimize this problem.
To set up BIER-TE efficiently for the topology shown in Figure 17, the following bit allocation method can be used. This method can easily be expanded to other, similarly structured larger topologies.
Each area is allocated one or more SIs, depending on the number of future expected BFERs and the number of bits required for the topology in the area. In this example, six SIs are used, one per area.
In addition, we use four bits in each SI:
These bits will be used to pass BIER packets from any BFIR via any combination of ingress area a/b BFRs and egress area a/b BFRs into a specific target area. These bits are then set up with the right forward_routed() adjacencies on the BFIRs and area edge BFRs as follows.
On all BFIRs in an area, j|j=1...6, bia in each BIFT:SI is populated with the same forward_routed(BFRja) and bib with forward_routed(BFRjb). On all area edge BFRs, bea in BIFT:SI=k|k=1...6 is populated with forward_routed(BFRka) and beb in BIFT:SI=k with forward_routed(BFRkb).
For BIER-TE forwarding of a packet to a subset of BFERs across all areas, a BFIR would create at most six copies, with SI=1...SI=6. In each packet, the BitString includes bits for one area and the BFERs in that area, plus the four bits to indicate whether to pass this packet via the ingress area a or b border BFR and the egress area a or b border BFR, therefore allowing path steering for those two "unicast" legs: 1) BFIR to ingress area edge and 2) core to egress area edge. Replication only happens inside the egress areas. For BFERs that are in the same area as the BFIR, these four bits are not used.
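The per-SI BitString construction described above can be sketched as follows. This is an illustration under stated assumptions only: the BP numbers chosen for bia, bib, bea, and beb are hypothetical, the function name is invented for the example, and BitStrings are modeled as Python integers with BP 1 as the low-order bit. The sketch also applies a single ingress and egress choice to all copies, although a BFIR could choose per copy.

```python
# Hypothetical BP numbers for the four steering bits in every SI.
BIA, BIB, BEA, BEB = 1, 2, 3, 4


def bit(bp: int) -> int:
    """BitString with only bit position `bp` set (BP 1 = low-order bit)."""
    return 1 << (bp - 1)


def build_copies(
    bfir_area: int,
    receivers: dict[int, list[int]],
    ingress: str = "a",
    egress: str = "a",
) -> dict[int, int]:
    """Return one BitString per SI (SI = area number, as in the example).

    `receivers` maps an area to the BPs (area topology and BFER bits)
    needed inside that area. `ingress` and `egress` select the a or b
    edge BFR for the two "unicast" legs.
    """
    if ingress not in ("a", "b") or egress not in ("a", "b"):
        raise ValueError("ingress/egress must be 'a' or 'b'")
    copies = {}
    for area, bps in receivers.items():
        if not bps:
            continue  # no receivers in this area, so no copy for this SI
        bitstring = 0
        for bp in bps:
            bitstring |= bit(bp)
        if area != bfir_area:
            # The steering bits are only used for remote areas.
            bitstring |= bit(BIA if ingress == "a" else BIB)
            bitstring |= bit(BEA if egress == "a" else BEB)
        copies[area] = bitstring
    return copies


# A BFIR in area 1 sends to BPs 5 and 6 locally and BP 7 in area 3,
# leaving via its b edge BFR and entering area 3 via the a edge BFR.
copies = build_copies(1, {1: [5, 6], 3: [7]}, ingress="b", egress="a")
# copies == {1: 0b0110000, 3: 0b1000110}
```

At most six entries result, one per SI, and the copy for the BFIR's own area carries none of the four steering bits.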
BIER-TE can, like BIER, support multiple SIs within a subdomain. This allows application of the mapping BFR-id = SI * BSL + BP. This also permits the reuse of the BIER architecture concept of BFR-ids and, therefore, minimization of BIER-TE-specific functions in possible BIER layer control plane mechanisms with BIER-TE, including flow overlay methods and BIER header fields.
The number of BFIRs/BFERs possible in a subdomain is smaller than in BIER because BIER-TE uses additional bits for the topology.
Subdomains in BIER-TE can be used as they are in BIER to create more efficient replication to known subsets of BFERs.
Assigning bits for BFERs intelligently into the right SI is more important in BIER-TE than in BIER because of replication efficiency and the overall number of bits required.
If "Encapsulation for Bit Index Explicit Replication (BIER) in MPLS and Non-MPLS Networks" [RFC8296] is used, its security considerations also apply to BIER-TE.
The security considerations of "Multicast Using Bit Index Explicit Replication (BIER)" [RFC8279] also apply to BIER-TE, with the following overriding or additional considerations.
BIER-TE forwarding explicitly supports unicast "tunneling" of BIER packets via forward_routed() adjacencies. The BIER domain security model is based on a subset of interfaces on a BFR that connect to other BFRs of the same BIER domain. For BIER-TE, this security model equally applies to such unicast "tunneled" BIER packets. This not only includes the need to filter received unicast "tunneled" BIER packets to prohibit the injection of such "tunneled" BIER packets from outside the BIER domain but also the need to prohibit forward_routed() adjacencies from leaking BIER packets from the BIER domain. It SHOULD be possible to configure interfaces to be part of a BIER domain solely for sending and receiving unicast "tunneled" BIER packets even if the interface cannot send/receive BIER encapsulated packets.
In BIER, the standardized methods for the routing underlays are IGPs with extensions to distribute BFR-ids and BFR-prefixes. [RFC8401] specifies the extensions for IS-IS, and [RFC8444] specifies the extensions for OSPF. Attacking the protocols for the BIER routing underlay or (non-TE) BIER layer control plane, or the impairment of any BFRs in a domain, may lead to successful attacks against the information that BIER-TE learns from the routing protocol (routes, next hops, BFR-ids, ...), enabling DoS attacks against paths or the addressing (BFR-ids, BFR-prefixes) used by BIER.
The reference model for the BIER-TE layer control plane is a BIER-TE controller. When such a controller is used, the impairment of an individual BFR in a domain causes no impairment of the BIER-TE control plane on other BFRs. If a routing protocol is used to support forward_routed() adjacencies, then this is still an attack vector as in BIER, but only for BIER-TE forward_routed() adjacencies and not other adjacencies.
Whereas IGP routing protocols are most often not well secured through cryptographic authentication and confidentiality, communications between controllers and routers such as those to be considered for the BIER-TE controller / control plane can be, and are, much more commonly secured with those security properties -- for example, by using "Secure Shell" (SSH) [RFC4253] for NETCONF [RFC6242]; or via "Transport Layer Security" (TLS), such as [RFC8253] for PCEP [RFC5440] or [RFC7589] for NETCONF. BIER-TE controllers SHOULD use security equal to or better than these mechanisms.
When any of these security mechanisms/protocols are used for communications between a BIER-TE controller and BFRs, their security considerations apply to BIER-TE. In addition, the security considerations of "A Path Computation Element (PCE)-Based Architecture" [RFC4655] apply.
The most important attack vector in BIER-TE is misconfiguration, either on the BFRs themselves or via the BIER-TE controller. Forwarding entries with DNC could be set up to create persistent loops, in which packets only expire because of TTL. To minimize the impact of such attacks (or, more likely, unintentional misconfiguration by operators and/or bad BIER-TE controller software), the BIER-TE forwarding rules are defined to be as strict in clearing bits as possible. The clearing of all bits with an adjacency on a BFR prohibits a looping packet from creating additional packet amplification through the misconfigured loop on the packet's second time or subsequent times around the loop, because all relevant adjacency bits would have been cleared on the first round through the loop. As a result, looping packets can occur in BIER-TE to the same degree as is possible with unintentional or malicious loops in the routing underlay with BIER, or even with unicast traffic.
Deployments where BIER-TE would likely be beneficial may include operational models where actual configuration changes from the controller are only required during non-production phases of the network's life cycle, e.g., in embedded networks or in manufacturing networks during such activities as plant reworking or repairs. In these types of deployments, configuration changes could be locked out when the network is in production state and could only be (re-)enabled through reverting the network/installation to non-production state. Such security designs would not only allow a deployment to provide additional layers of protection against configuration attacks but would, first and foremost, protect the active production process from such configuration attacks.
This document has no IANA actions.
SR [RFC8402] aims to enable lightweight path steering via loose source routing. For example, compared to its more heavyweight predecessor, RSVP-TE, SR does not require per-path signaling to each of these hops.
BIER-TE supports the same design philosophy for multicast. Like SR, BIER-TE
Any other hops can be skipped via the use of routed adjacencies.
BIER-TE "bit positions" (BPs) can be understood as the BIER-TE equivalent of "forwarding segments" in SR, but they have a different scope than do forwarding segments in SR. Whereas forwarding segments in SR are global or local, BPs in BIER-TE have a scope that consists of one or more BFRs that have adjacencies for the BPs in their BIFTs. These segments can be called "adjacency-scoped" forwarding segments.
Adjacency scope could be global, but then every BFR would need an adjacency for a given BP -- for example, a forward_routed() adjacency with encapsulation to the global SR "Segment Identifier" (SID) of the destination. Such a BP would always result in ingress replication, though (as in [RFC7988]). The first BFR encountering this BP would directly replicate traffic on it. Only by using non-global adjacency scope for BPs can traffic be steered and replicated on a non-BFIR.
SR can naturally be combined with BIER-TE and can help optimize it. For example, instead of defining bit positions for non-replicating hops, it is equally possible to use SR encapsulations (e.g., SR-MPLS label stacks) for the encapsulation of "forward_routed()" adjacencies.
Note that (non-TE) BIER itself can also be seen as being similar to SR. BIER BPs act as global destination Node-SIDs, and the BIER BitString is simply a highly optimized mechanism to indicate multiple such SIDs and let the network take care of effectively replicating the packet hop by hop to each destination Node-SID. BIER does not allow the indication of intermediate hops or, in terms of SR, the ability to indicate a sequence of SIDs to reach the destination. On the other hand, BIER-TE and its adjacency-scoped BPs provide these capabilities.
The authors would like to thank Greg Shepherd, IJsbrand Wijnands, Neale Ranns, Dirk Trossen, Sandy Zheng, Lou Berger, Jeffrey Zhang, Carsten Bormann, and Wolfgang Braun for their reviews and suggestions.
Special thanks to Xuesong Geng for shepherding this document. Special thanks also for IESG review/suggestions by Alvaro Retana (responsible AD/RTG), Benjamin Kaduk (SEC), Tommy Pauly (TSV), Zaheduzzaman Sarker (TSV), Éric Vyncke (INT), Martin Vigoureux (RTG), Robert Wilton (OPS), Erik Kline (INT), Lars Eggert (GEN), Roman Danyliw (SEC), Ines Robles (RTGDIR), Robert Sparks (Gen-ART), Yingzhen Qu (RTGDIR), and Martin Duke (TSV).