RFC 9638    NVO3 Encapsulation Considerations    September 2024
Boutros & Eastlake 3rd    Informational
Stream:
Internet Engineering Task Force (IETF)
RFC:
9638
Category:
Informational
Published:
September 2024
ISSN:
2070-1721
Authors:
S. Boutros, Ed.
Ciena Corporation
D. Eastlake 3rd, Ed.
Independent

RFC 9638

Network Virtualization over Layer 3 (NVO3) Encapsulation Considerations

Abstract

The IETF Network Virtualization Overlays (NVO3) Working Group developed considerations for a common encapsulation that addresses various network virtualization overlay technical concerns. This document provides a record, for the benefit of the IETF community, of the considerations arrived at by the NVO3 Working Group starting from the output of the NVO3 encapsulation Design Team. These considerations may be helpful with future deliberations by working groups over the choice of encapsulation formats.

There are implications of having different encapsulations in real environments consisting of both software and hardware implementations and within and spanning multiple data centers. For example, Operations, Administration, and Maintenance (OAM) functions such as path MTU discovery become challenging with multiple encapsulations along the data path.

Based on these considerations, the NVO3 Working Group determined that Generic Network Virtualization Encapsulation (Geneve) with a few modifications is the common encapsulation. This document provides more details, particularly in Section 7.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are candidates for any level of Internet Standard; see Section 2 of RFC 7841.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9638.

Copyright Notice

Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.

1. Introduction

The NVO3 Working Group is chartered to gather requirements and develop solutions for network virtualization data planes based on encapsulation of virtual network traffic over an IP-based underlay data plane. Requirements include due consideration for OAM and security. Based on these requirements, the WG was to select, extend, and/or develop one or more data plane encapsulation formats.

This led to WG Internet-Drafts and an RFC describing three encapsulations as follows:

Discussion on the list and in face-to-face meetings identified a number of technical problems with each of these encapsulations. Furthermore, there was a clear consensus at the 96th IETF meeting in Berlin that the working group should progress only one data plane encapsulation, to maximize interoperability. In order to overcome a deadlock on the encapsulation decision, the WG consensus was to form a Design Team [RFC2418] to resolve this issue and provide initial considerations.

2. Design Team and Working Group Process

The Design Team was to select one of the proposed encapsulations and enhance it to address the technical concerns. The goals were simple evolution of deployed networks as well as applicability to all locations in the NVO3 architecture. The Design Team was to specifically select a design that allows for future extensibility but is not burdensome on hardware implementations. The selected design also needed to operate well with the Internet Control Message Protocol (ICMP) and in Equal-Cost Multipath (ECMP) environments. If further extensibility is required, then it should be done in such a manner that it does not require the consent of an entity outside of the IETF.

The output of the Design Team was then processed through the working group, resulting in a working group consensus for this document.

3. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

4. Abbreviations, Acronyms, and Definitions

The following abbreviations and acronyms are used in this document:

ACL:
Access Control List
ECMP:
Equal-Cost Multipath
EVPN:
Ethernet VPN [RFC8365]
Geneve:
Generic Network Virtualization Encapsulation [RFC8926]
GPE:
Generic Protocol Extension
GUE:
Generic UDP Encapsulation [GUE]
HMAC:
Hash-Based Message Authentication Code [RFC2104]
IEEE:
Institute of Electrical and Electronics Engineers (<https://www.ieee.org/>)
NIC:
Network Interface Card (refers to network interface hardware that is not necessarily a discrete "card")
NSH:
Network Service Header [RFC8300]
NVA:
Network Virtualization Authority
NVE:
Network Virtual Edge (refers to an NVE device)
NVO3:
Network Virtualization over Layer 3
OAM:
Operations, Administration, and Maintenance [RFC6291]
PWE3:
Pseudowire Emulation Edge-to-Edge
TCAM:
Ternary Content-Addressable Memory
TLV:
Type-Length-Value
Transit device:
Refers to underlay network devices between NVEs.
UUID:
Universally Unique Identifier
VNI:
Virtual Network Identifier
VXLAN:
Virtual eXtensible Local Area Network [RFC7348]

5. Encapsulation Issues and Background

The following subsections describe issues with current encapsulations as discussed by the NVO3 WG. Numerous extensions and options have been designed for GUE and Geneve that may help resolve some of these issues, but these have not yet been validated by the WG.

Also included are diagrams and information on the candidate encapsulations. These are mostly copied from other documents. Since each protocol is assumed to be sent over UDP, an initial UDP header is shown that would be preceded by an IPv4 or IPv6 header.

5.1. Geneve

The Geneve packet format, taken from [RFC8926], is shown in Figure 1 below.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

Outer UDP Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |    Dest Port = 6081 Geneve    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          UDP Length           |        UDP Checksum           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Geneve Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|  Opt Len  |O|C|    Rsvd.  |          Protocol Type        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Virtual Network Identifier (VNI)       |    Reserved   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                    Variable-Length Options                    ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 1: Geneve Header

The type of payload being carried is indicated by an Ethertype [RFC9542] in the Protocol Type field in the Geneve header; Ethernet itself is represented by Ethertype 0x6558. See [RFC8926] for details concerning UDP header fields. The O bit indicates an OAM packet. The Geneve C bit is the "Critical" bit, which means that the options must be processed or the packet discarded.
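As an illustration of the layout in Figure 1, the following is a minimal Python sketch that decodes the fixed 8-byte Geneve header and computes where the inner payload starts. The names `GeneveHeader`, `parse_geneve`, and `inner_payload_offset` are hypothetical and not taken from any specification. The sketch assumes that the buffer begins immediately after the outer UDP header and that Opt Len counts 4-byte units, as defined in [RFC8926].

```python
import struct
from typing import NamedTuple

GENEVE_BASE_LEN = 8  # fixed part of the Geneve header, in bytes


class GeneveHeader(NamedTuple):
    version: int
    opt_len_bytes: int   # Opt Len field converted from 4-byte units to bytes
    oam: bool            # O bit
    critical: bool       # C bit
    protocol_type: int   # Ethertype of the payload
    vni: int             # 24-bit Virtual Network Identifier


def parse_geneve(buf: bytes) -> GeneveHeader:
    """Decode the fixed Geneve header found at the start of buf."""
    if len(buf) < GENEVE_BASE_LEN:
        raise ValueError("truncated Geneve header")
    # Byte 0: Ver (2 bits) + Opt Len (6 bits); byte 1: O, C, Rsvd (6 bits);
    # bytes 2-3: Protocol Type; bytes 4-7: VNI (24 bits) + Reserved (8 bits).
    first, second, proto, vni_rsvd = struct.unpack("!BBHI", buf[:GENEVE_BASE_LEN])
    return GeneveHeader(
        version=first >> 6,
        opt_len_bytes=(first & 0x3F) * 4,
        oam=bool(second & 0x80),
        critical=bool(second & 0x40),
        protocol_type=proto,
        vni=vni_rsvd >> 8,
    )


def inner_payload_offset(hdr: GeneveHeader) -> int:
    """Offset of the inner payload, skipping all options in one step."""
    return GENEVE_BASE_LEN + hdr.opt_len_bytes
```

Because the total options length sits in the first byte, a parser can locate the inner payload without examining any individual option.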

Issues with Geneve [RFC8926] are as follows:

The selection of Geneve despite these issues may be the result of the Geneve design effort, assuming that the Geneve header would typically be delivered to a server and parsed in software.

5.2. Generic UDP Encapsulation (GUE)

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

UDP Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Source Port            |     Dest Port = 6080 GUE      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        UDP Length             |          UDP Checksum         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

GUE Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | 0 |C|   Hlen  |  Proto/ctype  |             Flags             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   ~                  Extensions Fields (optional)                 ~
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 2: GUE Header

The type of payload being carried is indicated by an IANA protocol number in the Proto/ctype field. The GUE C bit (Control bit) indicates a control packet.
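The first four bytes of the GUE header in Figure 2 can be decoded along the following lines. This is only a sketch: `GueHeader` and `parse_gue` are hypothetical names, and the interpretation of Hlen as a count of 32-bit words following the first four bytes is an assumption based on [GUE] rather than on anything stated in this document.

```python
import struct
from typing import NamedTuple


class GueHeader(NamedTuple):
    version: int
    control: bool          # C bit: control packet rather than data
    header_len_bytes: int  # total GUE header length, including extensions
    proto_ctype: int       # IANA protocol number (or control type if C is set)
    flags: int             # 16 flag bits announcing which extensions follow


def parse_gue(buf: bytes) -> GueHeader:
    """Decode the fixed 4-byte part of a GUE header at the start of buf."""
    if len(buf) < 4:
        raise ValueError("truncated GUE header")
    first, proto, flags = struct.unpack("!BBH", buf[:4])
    hlen_words = first & 0x1F  # assumed: 32-bit words after the first 4 bytes
    return GueHeader(
        version=first >> 6,
        control=bool(first & 0x20),
        header_len_bytes=4 + hlen_words * 4,
        proto_ctype=proto,
        flags=flags,
    )
```

As with Geneve, the length field lets a receiver skip all extension fields at once; interpreting the extension fields themselves requires knowing the size associated with each flag bit.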

Issues with GUE [GUE] are as follows:

5.3. Generic Protocol Extension (GPE) for VXLAN

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

Outer UDP Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Source Port         |     Dest Port = 4790 GPE      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           UDP Length          |           UDP Checksum        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

VXLAN-GPE Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |R|R|Ver|I|P|B|O|       Reserved                | Next Protocol |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Virtual Network Identifier (VNI) |   Reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 3: GPE Header

The type of payload being carried is indicated by the Next Protocol field using a registry specific to VXLAN-GPE. The I bit indicates that the VNI is valid. The P bit indicates that the Next Protocol field is valid. The B bit indicates that the packet is an ingress replicated Broadcast, Unknown Unicast, or Multicast packet. The O bit indicates an OAM packet.
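The flag semantics described above can be made concrete with a short decoder for the header in Figure 3. This is a sketch with hypothetical names (`VxlanGpeHeader`, `parse_vxlan_gpe`); bit positions are read directly off the figure, and Next Protocol values are treated as opaque numbers because the registry contents are not reproduced in this document.

```python
import struct
from typing import NamedTuple

VXLAN_GPE_LEN = 8  # the header is fixed length; there is no length field


class VxlanGpeHeader(NamedTuple):
    version: int
    vni_valid: bool        # I bit
    next_proto_valid: bool # P bit
    bum: bool              # B bit: ingress-replicated BUM traffic
    oam: bool              # O bit
    next_protocol: int
    vni: int


def parse_vxlan_gpe(buf: bytes) -> VxlanGpeHeader:
    """Decode the 8-byte VXLAN-GPE header found at the start of buf."""
    if len(buf) < VXLAN_GPE_LEN:
        raise ValueError("truncated VXLAN-GPE header")
    # Byte 0: R R Ver(2) I P B O; bytes 1-2: Reserved; byte 3: Next Protocol;
    # bytes 4-7: VNI (24 bits) + Reserved (8 bits).
    flags, _reserved, next_proto, vni_rsvd = struct.unpack(
        "!BHBI", buf[:VXLAN_GPE_LEN]
    )
    return VxlanGpeHeader(
        version=(flags >> 4) & 0x3,
        vni_valid=bool(flags & 0x08),
        next_proto_valid=bool(flags & 0x04),
        bum=bool(flags & 0x02),
        oam=bool(flags & 0x01),
        next_protocol=next_proto,
        vni=vni_rsvd >> 8,
    )
```

Since the header carries no length, anything placed after these eight bytes (for example, an NSH) has to be parsed in its own right before the inner payload can be located.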

Issues with VXLAN-GPE [VXLAN-GPE] are as follows:

6. Common Encapsulation Considerations

6.1. Current Encapsulations

Appendix A includes a detailed comparison between the three proposed encapsulations. The comparison indicates several common properties but also three major differences among the encapsulations:

  • Extensibility: Geneve and GUE were defined with built-in extensibility, while VXLAN-GPE is not inherently extensible. Note that any of the three encapsulations can be extended using the Network Service Header (NSH) [RFC8300].
  • Extension method: Geneve is extensible using Type-Length-Value (TLV) fields, while GUE uses a small set of possible extensions and a set of flags that indicate which extensions are present.
  • Length field: Geneve and GUE include a Length field, indicating the length of the encapsulation header, while VXLAN-GPE does not include such a field. Thus, it may be harder to skip the encapsulation header with VXLAN-GPE.

6.2. Useful Extensions Use Cases

Extensions that are not vendor-specific, such as TLVs, MUST follow the standardization process. The following use cases for extensions show that there is a strong requirement to support variable-length extensions with possible different subtypes.

6.2.1. Telemetry Extensions

In several scenarios, it is beneficial to make information available to the operator about the path a packet took through the network or through a network device as well as information about associated telemetry.

This includes not only tasks like debugging, troubleshooting, and network planning and optimization but also policy or service level agreement compliance checks.

Packet scheduling algorithms, especially for balancing traffic across equal-cost paths or links, often leverage information contained within the packet, such as protocol number, IP address, or Media Access Control (MAC) address. Thus, probe packets would need to be either sent between the exact same endpoints with the exact same parameters or artificially constructed as "fake" packets and inserted along the path. Both approaches are often not feasible from an operational perspective because access to the end system is not feasible or the diversity of parameters and associated probe packets to be created is simply too large. An extension providing an in-band telemetry mechanism [RFC9197] is an alternative in those cases.

6.2.2. Security/Integrity Extensions

Since the currently proposed NVO3 encapsulations do not protect their headers, a single bit corruption in the VNI field could deliver a packet to the wrong tenant. Extension headers are needed to use any sophisticated security.

The possibility of VNI spoofing with an NVO3 protocol is exacerbated by using UDP. Systems typically have no restrictions on applications being able to send to any UDP port, so an unprivileged application can trivially spoof VXLAN [RFC7348] packets, using arbitrary VNIs, for instance.

One can envision support of an HMAC-like Message Authentication Code (MAC) [RFC2104] in an NVO3 extension to authenticate the header and the outer IP addresses, thereby preventing attackers from injecting packets with spoofed VNIs.
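A minimal sketch of such a header-integrity check is shown below, using Python's standard `hmac` module. Everything about the construction is an assumption made for illustration: the choice of HMAC-SHA-256, the 16-byte truncated tag, the order in which the outer addresses and header are concatenated, and the function names `header_mac` and `verify_header_mac`. No NVO3 specification defines this format, and key distribution is out of scope.

```python
import hashlib
import hmac
import ipaddress


def header_mac(key: bytes, src_ip: str, dst_ip: str,
               nvo3_header: bytes, tag_len: int = 16) -> bytes:
    """Compute a truncated HMAC over the outer IP addresses and NVO3 header.

    Assumed input layout: packed source address, packed destination
    address, then the encapsulation header bytes (which include the VNI).
    """
    msg = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + nvo3_header
    )
    return hmac.new(key, msg, hashlib.sha256).digest()[:tag_len]


def verify_header_mac(key: bytes, src_ip: str, dst_ip: str,
                      nvo3_header: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = header_mac(key, src_ip, dst_ip, nvo3_header, len(tag))
    return hmac.compare_digest(expected, tag)
```

With a check of this kind, a sender that does not hold the key cannot produce a valid tag for a header carrying a different VNI, and a corrupted VNI bit causes verification to fail rather than misdelivery.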

Another aspect of security is payload security. Essentially, this makes packets that look like the following:

  IP|UDP|NVO3 Encap|DTLS/IPsec-ESP Extension|payload.

This is desirable because:

  • we still have the UDP header for ECMP,
  • the NVO3 header is in plain text so it can be read by network elements, and
  • different security or other payload transforms can be supported on a single UDP port (we don't need a separate UDP port for DTLS/IPsec; see [RFC9147] and [RFC6071], respectively).

6.2.3. Group-Based Policy

Another use case would be to carry the Group-Based Policy (GBP) source group information within an NVO3 header extension in a similar manner as has been implemented for VXLAN [VXLAN-GROUP]. This allows various forms of policy such as access control and QoS to be applied between abstract groups rather than coupled to specific endpoint addresses.

6.3. Hardware Considerations

Hardware restrictions should be taken into consideration along with future hardware enhancements that may provide more flexible metadata (MD) processing. However, the set of options that need to and will be implemented in hardware will be a subset of what is implemented in software. This is because software NVEs are likely to grow features, and hence option support, at a more rapid rate.

It is hard to predict which options will be implemented in which piece of hardware and when. That depends on whether the hardware will be in the form of:

  • a NIC providing increasing offload capabilities to software NVEs, or
  • a switch chip being used as an NVE gateway towards non-NVO3 parts of the network, or even
  • a transit device that participates in the NVO3 data plane, e.g., for OAM purposes.

A result of this is that it doesn't look useful to prescribe some order to the options so that the ones that are likely to be implemented in hardware come first. We can't decide such an order when we define the options; however, a control plane can enforce such an order for some hardware implementations.

We do know that hardware initially needs to be able to efficiently skip over the NVO3 header to find the inner payload. That is needed both for NICs implementing various TCP offload mechanisms and for transit devices and NVEs applying policy or ACLs to the inner payload.

6.4. Extension Size

Extension header length has a significant impact on hardware and software implementations. A maximum total header length that is too small will unnecessarily constrain software flexibility. A maximum total header length that is too large will place a nontrivial cost on hardware implementations. Thus, the Design Team recommends that there be a minimum and maximum total available extension header length specified. The maximum total header length is determined by the size of the bit field allocated for the total extension header length field. The risk with this approach is that it may be difficult to extend the total header size in the future. The minimum total header length is determined by a requirement in the specifications that all implementations must meet. The risk with this approach is that all implementations will only implement support for the minimum total header length, which would then become the de facto maximum total header length.

The recommended minimum total available header length is 64 bytes.

The size of an extension header should always be 4-byte aligned.

The maximum length of a single option should be large enough to meet the different extension use case requirements, e.g., for in-band telemetry and future use.
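The arithmetic behind these sizing rules can be sketched as follows. The helper names are hypothetical. The worked case uses the 6-bit Opt Len field visible in Figure 1, with the 4-byte unit taken from [RFC8926]; other encapsulations would plug in their own field width and unit.

```python
ALIGNMENT = 4            # extension headers should be 4-byte aligned
MIN_TOTAL_EXT_LEN = 64   # recommended minimum total available length, in bytes


def padded_len(data_len: int) -> int:
    """Round a length up to the next multiple of the 4-byte alignment."""
    return (data_len + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT


def max_total_len(field_bits: int, unit: int = ALIGNMENT) -> int:
    """Largest total extension length a length field of this width can express."""
    return (2 ** field_bits - 1) * unit


def meets_recommended_minimum(field_bits: int, unit: int = ALIGNMENT) -> bool:
    """Check that the length field can at least express the 64-byte minimum."""
    return max_total_len(field_bits, unit) >= MIN_TOTAL_EXT_LEN
```

For a 6-bit field counting 4-byte units, `max_total_len(6)` gives 63 x 4 = 252 bytes, which is above the 64-byte recommended minimum; a 4-bit field in the same unit would top out at 60 bytes and fall short.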

6.5. Ordering of Extension Headers

To support hardware nodes at the target NVE or at a transit device that can process one or a few extension headers in TCAM, a control plane in such a deployment could signal a capability to ensure that a specific extension header will always appear in a specific order, for example, that such a specific extension header appear first in the packet.

The order of the extension headers should be hardware friendly for both the sender and the receiver and possibly some transit devices as well. This may require that the extension headers and their order be determined dynamically based on the hardware of those devices.

Transit devices don't participate in control plane communication between the endpoints and are not required to process the extension headers; however, if they do, they may need to process only a small subset of the extension headers that will be consumed by target NVEs.

6.6. TLV versus Bit Fields

If there is a well-known initial set of options that is likely to be implemented in software and in hardware, it can be efficient to use the bit fields approach to indicate the presence of extensions as in GUE. However, as described in Section 6.3, if options are added over time and different subsets of options are likely to be implemented in different pieces of hardware, then it would be hard for the IETF to specify which options should get the early bit fields. TLVs are a lot more flexible, which avoids the need to determine the relative importance of different options. However, general TLVs of arbitrary order, size, and repetition are difficult to implement in hardware. A middle ground is to use TLVs with restrictions on their size and alignment, observing that individual TLVs can have a fixed length, and to support via the control plane a method such that an NVE will only receive options that it needs and implements. The control plane approach can potentially be used to control the order of the TLVs sent to a particular NVE. Note that transit devices are not likely to participate in the control plane; hence, to the extent that they need to participate in option processing, some other method must be used. Transit devices would have issues with future GUE bit fields being defined for future options as well.

A benefit of TLVs from a hardware perspective is that they are self-describing, i.e., all the information is in the TLV. In a bit field approach, the hardware needs to look up the bit to determine the length of the data associated with the bit through some separate table, which would add hardware complexity.
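The contrast between the two styles can be sketched in a few lines. The TLV walker below assumes the Geneve option layout from [RFC8926] (16-bit option class, 8-bit type, 3 reserved bits, and a 5-bit length in 4-byte units). The flag table `FLAG_FIELD_LEN` is entirely hypothetical and does not correspond to any real GUE flag assignment; it only stands in for the "separate table" mentioned above.

```python
import struct
from typing import Dict, Iterator, Tuple


def iter_tlvs(options: bytes) -> Iterator[Tuple[int, int, bytes]]:
    """Walk self-describing TLVs: each carries its own length."""
    offset = 0
    while offset < len(options):
        if offset + 4 > len(options):
            raise ValueError("truncated option header")
        opt_class, opt_type, len_byte = struct.unpack_from("!HBB", options, offset)
        data_len = (len_byte & 0x1F) * 4  # low 5 bits, in 4-byte units
        end = offset + 4 + data_len
        if end > len(options):
            raise ValueError("option overruns the options area")
        yield opt_class, opt_type, options[offset + 4:end]
        offset = end


# Hypothetical flag-to-length table: with bit fields, the size of each
# extension is not in the packet and must be known to the parser in advance.
FLAG_FIELD_LEN: Dict[int, int] = {0x8000: 4, 0x4000: 8}


def split_flag_fields(flags: int, data: bytes) -> Dict[int, bytes]:
    """Slice extension data using the external table, highest bit first."""
    fields: Dict[int, bytes] = {}
    offset = 0
    for bit in sorted(FLAG_FIELD_LEN, reverse=True):
        if flags & bit:
            size = FLAG_FIELD_LEN[bit]
            fields[bit] = data[offset:offset + size]
            offset += size
    return fields
```

A parser that meets an unfamiliar TLV type can still step over it using the embedded length, whereas a flag bit missing from the table leaves the parser unable to tell how many bytes belong to it.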

There are use cases where multiple modules of software are running on an NVE. These can be modules such as a diagnostic module by one vendor that does packet sampling and another module from a different vendor that implements a firewall. Using a TLV format, it is easier to have different software modules process different TLVs without conflicting with each other. Such TLVs could be standard extensions or vendor-specific extensions. This can help with hardware modularity as well. There are some implementations with options that allow different software modules, like MAC learning and security, to process different options.

6.7. Control Plane Considerations

Given that we want to allow considerable flexibility and extensibility (e.g., for software NVEs), yet want to be able to support important extensions in less flexible contexts such as hardware NVEs, it is useful to consider the control plane. By control plane in this section we mean protocols, such as EVPN [RFC8365] and others, and deployment-specific configurations.

If each NVE can express in the control plane that it only supports certain extensions (which could be a single extension, or a few), and the source NVEs only include supported extensions in the NVO3 packets, then the target NVE can use a simpler parser (e.g., a TCAM might be usable to look for a single NVO3 extension) and the depth of the inner payload in the NVO3 packet will be minimized. Furthermore, if the target NVE cares about a few extensions and can express in the control plane the desired order of those extensions in the NVO3 packets, then the deployment can provide useful functionality with simplified hardware requirements for the target NVE.
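The sender-side behavior described here can be sketched as a simple filter-and-reorder step. The representation of an option as a `(type, data)` tuple and the function name `options_for_target` are hypothetical; how a target NVE actually advertises its supported extensions and their order is left to the control plane and is not specified here.

```python
from typing import Dict, List, Sequence, Tuple

Option = Tuple[int, bytes]  # (option type, option data); illustrative only


def options_for_target(wanted: Sequence[Option],
                       supported_in_order: Sequence[int]) -> List[Option]:
    """Keep only options the target advertised, in the order it asked for.

    `supported_in_order` stands for whatever the control plane learned
    about the target NVE: the option types it can parse, listed in the
    order in which it wants to receive them.
    """
    by_type: Dict[int, Option] = {}
    for opt in wanted:
        by_type.setdefault(opt[0], opt)  # first occurrence of a type wins
    return [by_type[t] for t in supported_in_order if t in by_type]
```

A source NVE would run this per destination, so a software NVE that advertises many option types and a hardware NVE that advertises a single one can each receive a packet shaped to what it can parse.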

Transit devices that are not aware of the NVO3 extensions somewhat benefit from such an approach, since the inner payload is less deep in the packet if no extraneous extension headers are included in the packet. In general, a transit device is not likely to participate in the NVO3 control plane. However, configuration mechanisms can take into account limitations of the transit devices used in particular deployments.

Note that with this approach, different NVEs could desire different extensions or sets of extensions, which means that the source NVE needs to be able to place different sets of extensions in different NVO3 packets, and perhaps in a different order. It also assumes that underlay multicast or replication servers are not used together with NVO3 extension headers.

There is a need to consider mandatory extensions versus optional extensions. Mandatory extensions require the receiver to drop the packet if the extension is unknown. A control plane mechanism can prevent the need for dropping unknown extensions, since they would not be included to target NVEs that do not support them.
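The receiver-side rule for mandatory versus optional extensions can be sketched as below. The sketch assumes that the high-order bit of the option type marks an option as mandatory, which is how Geneve option types are defined in [RFC8926]; other encapsulations may signal this differently. The function name `accept_packet` is hypothetical.

```python
from typing import Collection, Iterable

CRITICAL_TYPE_BIT = 0x80  # assumed marker for a mandatory option type


def accept_packet(option_types: Iterable[int],
                  known_types: Collection[int]) -> bool:
    """Return False if any unknown option is marked mandatory.

    Unknown optional extensions are skipped; an unknown mandatory
    extension forces the whole packet to be dropped.
    """
    for opt_type in option_types:
        if opt_type not in known_types and opt_type & CRITICAL_TYPE_BIT:
            return False
    return True
```

If the control plane ensures that a source NVE sends only extensions the target has advertised, the drop branch is never taken in normal operation and serves only as a safety net.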

The control planes defined today need to add the ability to describe the different encapsulations. Thus, perhaps EVPN [RFC8365] and any other control plane protocol that the IETF defines should have a way to indicate the supported NVO3 extensions and their order for each of the encapsulations supported.

Developing a separate document on guidance for option processing and control plane participation should be considered. This should provide examples and guidance on the range of usage models and deployment scenarios for specific options. It should also provide examples of option ordering that are relevant for that specific deployment. This includes endpoints and middleboxes that are using the options. Having the control plane negotiate the constraints is the most appropriate and flexible way to address these requirements.

6.8. Split NVE

If there is a need for hosts to send and receive options in a split NVE case [RFC8394], this is possible using any of the existing extensible encapsulations (GPE with NSH, GUE, or Geneve) by defining a way to carry those over other transports. An NSH can already be used over different transports.

If this is needed with other encapsulations, it can be done by defining an Ethertype so that it can be carried over Ethernet and IEEE Std 802.1Q [IEEE802.1Q].

If there is a need to carry other encapsulations over MPLS, it would require an EVPN control plane to signal that other encapsulation headers and options will be present in front of the Layer 2 (L2) packet. The VNI can be ignored in the header, and the MPLS label will be the one used to identify the EVPN L2 instance.

6.9. Larger VNI Considerations

Whether we should make the VNI 32 bits or larger was one of the topics considered. The benefit of a 24-bit VNI would be to avoid unnecessary changes with existing proposals and implementations that are almost all, if not all, using a 24-bit VNI. If we need a larger VNI, perhaps for a telemetry case, an extension can be used to support that.

7. Recommendations

The Design Team reported that Geneve was most suitable as a starting point for a proposed standard for network virtualization, for the reasons given below. This conclusion was supported by the NVO3 Working Group.

  1. On whether the VNI should be in the base header or in an extension header and whether it should be a 24-bit or 32-bit field (see Section 6.9), it was agreed that the VNI is critical information for network virtualization and MUST be present in all packets. It was also agreed that a 24-bit VNI, which is supported by Geneve, matches the existing widely used encapsulation formats, i.e., VXLAN [RFC7348] and Network Virtualization Using Generic Routing Encapsulation (NVGRE) [RFC7637], and hence is more suitable to use going forward.
  2. The Geneve header has the total options length, which allows skipping over the options for NIC offload operations and transit devices to view flow information in the inner payload.
  3. The option of using an NSH [RFC8300] with VXLAN-GPE was considered, but given that an NSH is targeted at service chaining and contains service chaining information, it is less suitable for the network virtualization use case. The other downside of VXLAN-GPE was the lack of a header length in VXLAN-GPE, which makes skipping over the headers to process inner payloads more difficult. A total options length is present in Geneve. It is not possible to skip any options in the middle with VXLAN-GPE. In principle, a split between a base header and a header with options is interesting (whether that options header is an NSH or some new header without ties to a service path). Whether it would make sense to either use an NSH for this or define a new NVO3 options header was explored. However, this makes it slightly harder to find the inner payload since the Length field is not in the NVO3 header itself. Thus, one more field would have to be extracted to compute the start of the inner payload. Also, if the experience with IPv6 extension headers is a guide, there would be a risk that key pieces of hardware might not implement the options header, resulting in future calls to deprecate its use. Making the options part of the base NVO3 header has fewer of those issues. Even though the implementation of any particular option can't be predicted ahead of time, the option mechanism and ability to skip the options is likely to be broadly implemented.
  4. The TLV style and bit field style of extension mechanisms were compared. It was deemed that parsing either TLVs or bit fields is expensive, and while bit fields may be simpler to parse, they are also more restrictive and require guessing which extensions will be widely implemented in order to get early bit assignments. Given that half the bits are already assigned in GUE, a widely deployed extension may appear in a flag extension, and this will require extra processing to dig the flag from the flag extension and then look for the extension itself. Also, bit fields are not flexible enough to address the requirements from OAM, telemetry, and security extensions for variable-length options and different subtypes of the same option. While TLVs are more flexible, a control plane can restrict the number of option TLVs as well as the order and size of the TLVs to limit this flexibility and make the TLVs simpler for a data plane implementation to handle.
  5. The multi-vendor NVE case was briefly discussed, as was the need to allow vendors to put their own extensions in the NVE header. This is possible with TLVs.
  6. It was agreed that the C bit (Critical bit) in Geneve is helpful. This bit indicates that the header includes options that must be parsed, or else the packet must be discarded. The bit allows a receiver NVE to easily decide whether or not to process options (such as a UUID-based packet trace) and decide how an optional extension can be ignored. Thus, a Critical bit makes it easy for the NVE to skip over the options not marked with such a bit. Thus, the C bit should remain as defined in Geneve.
  7. There are already some extensions of varying sizes that are being discussed (see Section 6.2). By using Geneve options, it is possible to get in-band parameters like switch id, ingress port, egress port, internal delay, and queue size using TLV extensions for telemetry purposes from switches. It is also possible to add security extension TLVs like HMAC [RFC2104] and DTLS/IPsec (see [RFC9147] and [RFC6071], respectively) to authenticate the Geneve packet header and secure the Geneve packet payload by software or hardware tunnel endpoints. A Group-Based Policy extension TLV can be carried as well.
  8. There are already implementations of Geneve options deployed in production networks. There is new hardware supporting Geneve TLV parsing as well. In addition, an In-band Telemetry (INT) specification [INT] is being developed by P4.org that illustrates the option of INT metadata carried over Geneve. Open Virtual Network (OVN) and Open vSwitch (OVS) [OVN] have also defined one or more option TLVs for Geneve.
  9. Usage requirements (see Section 6) have been addressed while also considering requirements and implementations in general (including those for software and hardware).

There seems to be interest in standardizing some well-known secure option TLVs to secure the header and payload to guarantee encapsulation header integrity and tenant data privacy. The working group should consider standardizing such option(s).

The following enhancements to Geneve are recommended to make it more suitable to hardware and yet provide flexibility for software:

8. Security Considerations

This document does not introduce any additional security constraints; however, Section 6.2.2 discusses security/integrity extensions and this document suggests, in Section 7, that the NVO3 WG work on security options for Geneve.

9. IANA Considerations

This document has no IANA actions.

10. References

10.1. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/info/rfc2119>.
[RFC8174]
Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, <https://www.rfc-editor.org/info/rfc8174>.

10.2. Informative References

[GUE]
Herbert, T., Yong, L., and O. Zia, "Generic UDP Encapsulation", Work in Progress, Internet-Draft, draft-ietf-intarea-gue-09, <https://datatracker.ietf.org/doc/html/draft-ietf-intarea-gue-09>.
[GUE-ENCAPSULATION]
Yong, L., Herbert, T., and O. Zia, "Generic UDP Encapsulation (GUE) for Network Virtualization Overlay", Work in Progress, Internet-Draft, draft-hy-nvo3-gue-4-nvo-04, <https://datatracker.ietf.org/doc/html/draft-hy-nvo3-gue-4-nvo-04>.
[GUE-EXTENSIONS]
Herbert, T., Yong, L., and F. Templin, "Extensions for Generic UDP Encapsulation", Work in Progress, Internet-Draft, draft-ietf-intarea-gue-extensions-06, <https://datatracker.ietf.org/doc/html/draft-ietf-intarea-gue-extensions-06>.
[IEEE802.1Q]
IEEE, "IEEE Standard for Local and Metropolitan Area Networks--Bridges and Bridged Networks", IEEE Std 802.1Q-2022, DOI 10.1109/IEEESTD.2022.10004498, <https://doi.org/10.1109/IEEESTD.2022.10004498>.
[INT]
P4.org Applications Working Group, "In-band Network Telemetry (INT) Dataplane Specification", <https://p4.org/p4-spec/docs/INT_v2_1.pdf>.
[OVN]
Linux Foundation, "Open vSwitch", <https://www.openvswitch.org/>.
[RFC2104]
Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-Hashing for Message Authentication", RFC 2104, DOI 10.17487/RFC2104, <https://www.rfc-editor.org/info/rfc2104>.
[RFC2418]
Bradner, S., "IETF Working Group Guidelines and Procedures", BCP 25, RFC 2418, DOI 10.17487/RFC2418, <https://www.rfc-editor.org/info/rfc2418>.
[RFC3985]
Bryant, S., Ed. and P. Pate, Ed., "Pseudo Wire Emulation Edge-to-Edge (PWE3) Architecture", RFC 3985, DOI 10.17487/RFC3985, <https://www.rfc-editor.org/info/rfc3985>.
[RFC6071]
Frankel, S. and S. Krishnan, "IP Security (IPsec) and Internet Key Exchange (IKE) Document Roadmap", RFC 6071, DOI 10.17487/RFC6071, <https://www.rfc-editor.org/info/rfc6071>.
[RFC6291]
Andersson, L., van Helvoort, H., Bonica, R., Romascanu, D., and S. Mansfield, "Guidelines for the Use of the "OAM" Acronym in the IETF", BCP 161, RFC 6291, DOI 10.17487/RFC6291, <https://www.rfc-editor.org/info/rfc6291>.
[RFC7348]
Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, L., Sridhar, T., Bursell, M., and C. Wright, "Virtual eXtensible Local Area Network (VXLAN): A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks", RFC 7348, DOI 10.17487/RFC7348, <https://www.rfc-editor.org/info/rfc7348>.
[RFC7637]
Garg, P., Ed. and Y. Wang, Ed., "NVGRE: Network Virtualization Using Generic Routing Encapsulation", RFC 7637, DOI 10.17487/RFC7637, <https://www.rfc-editor.org/info/rfc7637>.
[RFC8300]
Quinn, P., Ed., Elzur, U., Ed., and C. Pignataro, Ed., "Network Service Header (NSH)", RFC 8300, DOI 10.17487/RFC8300, <https://www.rfc-editor.org/info/rfc8300>.
[RFC8365]
Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R., Uttaro, J., and W. Henderickx, "A Network Virtualization Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365, DOI 10.17487/RFC8365, <https://www.rfc-editor.org/info/rfc8365>.
[RFC8394]
Li, Y., Eastlake 3rd, D., Kreeger, L., Narten, T., and D. Black, "Split Network Virtualization Edge (Split-NVE) Control-Plane Requirements", RFC 8394, DOI 10.17487/RFC8394, <https://www.rfc-editor.org/info/rfc8394>.
[RFC8926]
Gross, J., Ed., Ganga, I., Ed., and T. Sridhar, Ed., "Geneve: Generic Network Virtualization Encapsulation", RFC 8926, DOI 10.17487/RFC8926, <https://www.rfc-editor.org/info/rfc8926>.
[RFC9147]
Rescorla, E., Tschofenig, H., and N. Modadugu, "The Datagram Transport Layer Security (DTLS) Protocol Version 1.3", RFC 9147, DOI 10.17487/RFC9147, <https://www.rfc-editor.org/info/rfc9147>.
[RFC9197]
Brockners, F., Ed., Bhandari, S., Ed., and T. Mizrahi, Ed., "Data Fields for In Situ Operations, Administration, and Maintenance (IOAM)", RFC 9197, DOI 10.17487/RFC9197, <https://www.rfc-editor.org/info/rfc9197>.
[RFC9542]
Eastlake 3rd, D., Abley, J., and Y. Li, "IANA Considerations and IETF Protocol and Documentation Usage for IEEE 802 Parameters", BCP 141, RFC 9542, DOI 10.17487/RFC9542, <https://www.rfc-editor.org/info/rfc9542>.
[VXLAN-GPE]
Maino, F., Ed., Kreeger, L., Ed., and U. Elzur, Ed., "Generic Protocol Extension for VXLAN (VXLAN-GPE)", Work in Progress, Internet-Draft, draft-ietf-nvo3-vxlan-gpe-13, <https://datatracker.ietf.org/doc/html/draft-ietf-nvo3-vxlan-gpe-13>.
[VXLAN-GROUP]
Smith, M. and L. Kreeger, "VXLAN Group Policy Option", Work in Progress, Internet-Draft, draft-smith-vxlan-group-policy-05, <https://datatracker.ietf.org/doc/html/draft-smith-vxlan-group-policy-05>.

Appendix A. Encapsulation Comparison

A.1. Overview

This section presents a comparison of the three NVO3 encapsulation proposals: Geneve [RFC8926], GUE [GUE], and VXLAN-GPE [VXLAN-GPE]. The three encapsulations use an outer UDP/IP transport. Geneve and VXLAN-GPE use an 8-octet header, while GUE uses a 4-octet header. In addition to the base header, optional extensions may be included in the encapsulation, as discussed in Appendix A.2 below.

A.2. Extensibility

A.2.1. Innate Extensibility Support

The Geneve and GUE encapsulations both enable optional headers to be incorporated at the end of the base encapsulation header.

VXLAN-GPE does not provide innate support for header extensions. However, as discussed in [VXLAN-GPE], extensibility can be attained to some extent if the Network Service Header (NSH) [RFC8300] is used immediately following the VXLAN-GPE header. The NSH supports either a fixed-size extension (MD Type 1) or a variable-size TLV-based extension (MD Type 2). Note that NSH-over-VXLAN-GPE implies an additional overhead of the 8-octet NSH, in addition to the VXLAN-GPE header.

A.2.2. Extension Parsing

The Geneve variable-length options are defined as Type-Length-Value (TLV) extensions. Similarly, VXLAN-GPE, when using an NSH, can include NSH TLV-based extensions. In contrast, GUE defines a small set of possible extension fields (proposed in [GUE-EXTENSIONS] and [GUE-ENCAPSULATION]), and a set of flags in the GUE header that indicate for each extension type whether it is present or not.

TLV-based extensions, as defined in Geneve, provide the flexibility for a large number of possible extension types. Similar behavior can be supported in NSH-over-VXLAN-GPE when using MD Type 2. The flag-based approach taken in GUE strives to simplify implementations by defining a small number of possible extensions used in a fixed order.

The Geneve and GUE headers both include a Length field that defines the total length of the encapsulation, including the optional extensions. This Length field simplifies the parsing by transit devices that skip the encapsulation header without parsing its extensions.
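
As a rough illustration of how a transit device might use these Length fields, the Python sketch below computes how many octets to skip. The field positions are assumptions taken from [RFC8926] (a 6-bit Opt Len in the low bits of the first octet, counted in 4-octet units and excluding the 8-octet base header) and from [GUE] (a 5-bit Hlen in the low bits of the first octet, counted in 4-octet units and excluding the first 4 octets). The function names are hypothetical and are not part of either specification.

```python
GENEVE_BASE_LEN = 8  # octets, per Appendix A.1
GUE_BASE_LEN = 4     # octets, per Appendix A.1


def geneve_header_length(packet: bytes) -> int:
    """Return the total Geneve header length (base header plus options)."""
    if len(packet) < GENEVE_BASE_LEN:
        raise ValueError("truncated Geneve header")
    opt_len_words = packet[0] & 0x3F  # assumed 6-bit Opt Len field
    return GENEVE_BASE_LEN + 4 * opt_len_words


def gue_header_length(packet: bytes) -> int:
    """Return the total GUE header length (base header plus extensions)."""
    if len(packet) < GUE_BASE_LEN:
        raise ValueError("truncated GUE header")
    hlen_words = packet[0] & 0x1F  # assumed 5-bit Hlen field
    return GUE_BASE_LEN + 4 * hlen_words
```

With either function, the inner payload starts at `packet[header_length:]`, so no extension needs to be interpreted in order to reach it.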

A.2.3. Critical Extensions

The Geneve encapsulation header includes the C field, which indicates whether the current Geneve header includes critical options, that is to say, options which must be parsed by the target NVE. If the endpoint is not able to process a critical option, the packet is discarded.
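
The discard rule can be sketched as follows. This is a minimal illustration, not a reference implementation: the bit positions (C as 0x40 in the second octet, a per-option critical flag in the high-order bit of the option Type, and a 5-bit option Length in 4-octet units) are assumptions taken from [RFC8926], and `must_discard` and `supported` are hypothetical names.

```python
import struct


def must_discard(packet: bytes, supported: set) -> bool:
    """Return True if the packet carries a critical option we cannot process.

    `supported` holds (option_class, option_type) pairs the endpoint implements.
    """
    if len(packet) < 8:
        raise ValueError("truncated Geneve header")
    options_end = 8 + (packet[0] & 0x3F) * 4  # base header plus Opt Len
    if len(packet) < options_end:
        raise ValueError("truncated Geneve options")
    if not packet[1] & 0x40:  # C bit clear: no critical options present
        return False
    offset = 8
    while offset + 4 <= options_end:
        opt_class, opt_type, len_byte = struct.unpack_from("!HBB", packet, offset)
        if opt_type & 0x80 and (opt_class, opt_type) not in supported:
            return True  # unknown critical option: drop the packet
        offset += 4 + (len_byte & 0x1F) * 4  # option header plus option data
    return False
```

Checking the C bit first lets an endpoint that implements no options decide without walking the option list at all.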

A.2.4. Maximal Header Length

The maximal header length in Geneve, including options, is 260 octets. GUE defines the maximal header to be 128 octets. VXLAN-GPE uses a fixed-length header of 8 octets, unless NSH-over-VXLAN-GPE is used, yielding an encapsulation header of up to 264 octets.
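
The Geneve and GUE maxima follow from the width of each length field. The short check below assumes, based on [RFC8926] and [GUE] rather than on this document, that Geneve's Opt Len is 6 bits wide, that GUE's Hlen is 5 bits wide, and that both count 4-octet units beyond the base header. The helper name is hypothetical.

```python
def max_header_octets(base_octets: int, length_field_bits: int) -> int:
    """Largest header expressible by a length field counting 4-octet units."""
    max_units = (1 << length_field_bits) - 1
    return base_octets + 4 * max_units


geneve_max = max_header_octets(base_octets=8, length_field_bits=6)  # 8 + 63 * 4
gue_max = max_header_octets(base_octets=4, length_field_bits=5)     # 4 + 31 * 4
print(geneve_max, gue_max)  # prints "260 128"
```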

A.3. Encapsulation Header

A.3.1. Virtual Network Identifier (VNI)

The Geneve and VXLAN-GPE headers both include a 24-bit VNI field. GUE, on the other hand, enables the use of a 32-bit field called VNID; this field is not included in the GUE header but was defined as an optional extension in [GUE-ENCAPSULATION].

The VXLAN-GPE header includes the I bit, indicating that the VNI field is valid in the current header. A similar indicator is defined as a flag in the GUE header [GUE-EXTENSIONS].

A.3.2. Next Protocol

All three encapsulation headers include a field that specifies the type of the next protocol header, which resides after the NVO3 encapsulation header. The Geneve header includes a 16-bit field that uses the IEEE Ethertype convention. GUE uses an 8-bit field, which uses the IANA protocol numbering. The VXLAN-GPE header incorporates an 8-bit Next Protocol field, using a registry specific to VXLAN-GPE, defined in [VXLAN-GPE].

The VXLAN-GPE header also includes the P bit, which explicitly indicates whether the Next Protocol field is present in the current header.

A.3.3. Other Header Fields

The OAM bit, which is defined in Geneve and in VXLAN-GPE, indicates whether the current packet is an OAM packet. The GUE header includes a similar field but uses different terminology; the GUE C bit (Control bit) specifies whether the current packet is a control packet. Note that the GUE C bit can potentially be used in a large set of protocols that are not OAM protocols. However, the control packet examples discussed in [GUE] are related to OAM.

Each of the three NVO3 encapsulation headers includes a 2-bit Version field, which is currently defined to be zero.

The Geneve and VXLAN-GPE headers include reserved fields; 14 bits in the Geneve header and 27 bits in the VXLAN-GPE header are reserved.
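
To make the Geneve field sizes above concrete, the sketch below unpacks the 8-octet base header. The bit layout is an assumption taken from [RFC8926], not from this document: 2-bit Version, 6-bit Opt Len, O and C bits, 6 reserved bits, 16-bit Protocol Type, 24-bit VNI, and 8 reserved bits. The two reserved fields total the 14 bits noted above. `GeneveBase` and `parse_geneve_base` are hypothetical names.

```python
import struct
from typing import NamedTuple


class GeneveBase(NamedTuple):
    version: int        # 2-bit Version field
    opt_len_bytes: int  # length of the options in octets
    oam: bool           # O bit: packet is an OAM packet
    critical: bool      # C bit: critical options are present
    protocol_type: int  # 16-bit Ethertype of the payload
    vni: int            # 24-bit Virtual Network Identifier


def parse_geneve_base(data: bytes) -> GeneveBase:
    """Decode the fixed 8-octet Geneve base header (reserved bits ignored)."""
    if len(data) < 8:
        raise ValueError("truncated Geneve header")
    first, second, protocol_type, vni_and_reserved = struct.unpack_from("!BBHI", data)
    return GeneveBase(
        version=first >> 6,
        opt_len_bytes=(first & 0x3F) * 4,
        oam=bool(second & 0x80),
        critical=bool(second & 0x40),
        protocol_type=protocol_type,
        vni=vni_and_reserved >> 8,  # drop the trailing 8 reserved bits
    )
```

For example, `parse_geneve_base(bytes([0x00, 0x80, 0x65, 0x58, 0x00, 0x12, 0x34, 0x00]))` describes an OAM packet with no options, protocol type 0x6558, and VNI 0x001234.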

A.4. Comparison Summary

The following table summarizes the comparison between the three NVO3 encapsulations. In some cases, a plus sign ("+") or minus sign ("-") is used to indicate that the header is stronger or weaker in an area, respectively.

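Table 1: Encapsulations Comparison

+====================+=================+===================+===================+
|                    | Geneve          | GUE               | VXLAN-GPE         |
+====================+=================+===================+===================+
| Outer transport    | UDP/IP 6081     | UDP/IP 6080       | UDP/IP 4790       |
| UDP Port Number    |                 |                   |                   |
+--------------------+-----------------+-------------------+-------------------+
| Base header length | 8 octets        | 4 octets          | 8 octets (16      |
|                    |                 |                   | octets using an   |
|                    |                 |                   | NSH)              |
+--------------------+-----------------+-------------------+-------------------+
| Extensibility      | Variable-length | Extension fields  | No innate         |
|                    | options         |                   | extensibility.    |
|                    |                 |                   | Might use an NSH. |
+--------------------+-----------------+-------------------+-------------------+
| Extension parsing  | TLV-based       | Flag-based        | TLV-based (using  |
| method             |                 |                   | an NSH with MD    |
|                    |                 |                   | Type 2)           |
+--------------------+-----------------+-------------------+-------------------+
| Extension order    | Variable        | Fixed             | Variable (using   |
|                    |                 |                   | an NSH)           |
+--------------------+-----------------+-------------------+-------------------+
| Length field       | +               | +                 | -                 |
+--------------------+-----------------+-------------------+-------------------+
| Max header length  | 260 octets      | 128 octets        | 8 octets (264     |
|                    |                 |                   | using an NSH)     |
+--------------------+-----------------+-------------------+-------------------+
| Critical extension | +               | -                 | -                 |
| bit                |                 |                   |                   |
+--------------------+-----------------+-------------------+-------------------+
| VNI field size     | 24 bits         | 32 bits           | 24 bits           |
|                    |                 | (extension)       |                   |
+--------------------+-----------------+-------------------+-------------------+
| Next Protocol      | 16 bits         | 8 bits Internet   | 8 bits New        |
| field              | Ethertype       | protocol registry | registry          |
|                    | registry        |                   |                   |
+--------------------+-----------------+-------------------+-------------------+
| Next protocol      | -               | -                 | +                 |
| indicator          |                 |                   |                   |
+--------------------+-----------------+-------------------+-------------------+
| OAM / Control      | OAM bit         | Control bit       | OAM bit           |
| field              |                 |                   |                   |
+--------------------+-----------------+-------------------+-------------------+
| Version field      | 2 bits          | 2 bits            | 2 bits            |
+--------------------+-----------------+-------------------+-------------------+
| Reserved bits      | 14 bits         | none              | 27 bits           |
+--------------------+-----------------+-------------------+-------------------+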

Acknowledgements

The authors would like to thank Tom Herbert for providing the motivation for the security/integrity extension and for his valuable comments; T. Sridhar for his valuable comments and feedback; Anoop Ghanwani for his extensive comments; and Ignas Bagdonas.

Contributors

The following coauthors have contributed to this document:

Ilango Ganga
Intel
Email: ilango.s.ganga@intel.com

Pankaj Garg
Microsoft
Email: pankajg@microsoft.com

Rajeev Manur
Broadcom
Email: rajeev.manur@broadcom.com

Tal Mizrahi
Huawei
Email: tal.mizrahi.phd@gmail.com

David Mozes
Email: mosesster@gmail.com

Erik Nordmark
ZEDEDA
Email: nordmark@sonic.net

Michael Smith
Cisco
Email: michsmit@cisco.com

Sam Aldrin
Google
Email: aldrin.ietf@gmail.com

Authors' Addresses

Sami Boutros (editor)
Ciena Corporation
United States of America
Email: sboutros@ciena.com

Donald E. Eastlake 3rd (editor)
Independent
2386 Panoramic Circle
Apopka, FL 32703
United States of America
Phone: +1-508-333-2270
Email: d3e3e3@gmail.com
