RFC 9312 | QUIC Manageability | September 2022 |
Kühlewind & Trammell | Informational |
This document discusses manageability of the QUIC transport protocol and focuses on the implications of QUIC's design and wire image on network operations involving QUIC traffic. It is intended as a "user's manual" for the wire image to provide guidance for network operators and equipment vendors who rely on the use of transport-aware network functions.
This document is not an Internet Standards Track specification; it is published for informational purposes.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are candidates for any level of Internet Standard; see Section 2 of RFC 7841.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9312.
Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.
QUIC [QUIC-TRANSPORT] is a new transport protocol that is encapsulated in UDP. QUIC integrates TLS [QUIC-TLS] to encrypt all payload data and most control information. QUIC version 1 was designed primarily as a transport for HTTP with the resulting protocol being known as HTTP/3 [QUIC-HTTP].
This document provides guidance for network operations that manage QUIC traffic. This includes guidance on how to interpret and utilize information that is exposed by QUIC to the network, requirements and assumptions of the QUIC design with respect to network treatment, and a description of how common network management practices will be impacted by QUIC.
QUIC is an end-to-end transport protocol; therefore, no information in the protocol header is intended to be mutable by the network. This property is enforced through integrity protection of the wire image [WIRE-IMAGE]. Encryption of most transport-layer control signaling means that less information is visible to the network in comparison to TCP.
Integrity protection can also simplify troubleshooting at the endpoints as none of the nodes on the network path can modify transport-layer information. However, it means in-network operations that depend on modification of data (for examples, see [RFC9065]) are not possible without the cooperation of a QUIC endpoint. Such cooperation might be possible with the introduction of a proxy that authenticates as an endpoint. Proxy operations are not in scope for this document.
Network management is not a one-size-fits-all endeavor; for example, practices considered necessary or even mandatory within enterprise networks with certain compliance requirements would be impermissible on other networks without those requirements. Therefore, presence of a particular practice in this document should not be construed as a recommendation to apply it. For each practice, this document describes what is and is not possible with the QUIC transport protocol as defined.
This document focuses solely on network management practices that observe traffic on the wire. For example, replacement of troubleshooting based on observation with active measurement techniques is therefore out of scope. A more generalized treatment of network management operations on encrypted transports is given in [RFC9065].
QUIC-specific terminology used in this document is defined in [QUIC-TRANSPORT].
This section discusses aspects of the QUIC transport protocol that have an impact on the design and operation of devices that forward QUIC packets. Therefore, this section is primarily considering the unencrypted part of QUIC's wire image [WIRE-IMAGE], which is defined as the information available in the packet header in each QUIC packet, and the dynamics of that information. Since QUIC is a versioned protocol, the wire image of the header format can also change from version to version. However, the field that identifies the QUIC version in some packets and the format of the Version Negotiation packet are both inspectable and invariant [QUIC-INVARIANTS].
This document addresses version 1 of the QUIC protocol, whose wire image is fully defined in [QUIC-TRANSPORT] and [QUIC-TLS]. Features of the wire image described herein may change in future versions of the protocol, except when specified as an invariant [QUIC-INVARIANTS], and cannot be used to identify QUIC as a protocol or to infer the behavior of future versions of QUIC.
QUIC packets may have either a long header or a short header. The first bit of the QUIC header is the Header Form bit and indicates which type of header is present. The purpose of this bit is invariant across QUIC versions.
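As a minimal sketch of this check (Python is used for illustration only; the function name is hypothetical and not taken from any QUIC specification), the header form can be read from the most significant bit of the first octet:

```python
def header_form(first_octet: int) -> str:
    """Classify a packet by its Header Form bit (0x80 of the first octet).

    Assumes the datagram is already known to carry QUIC; this bit alone
    does not identify QUIC traffic.
    """
    return "long" if first_octet & 0x80 else "short"
```

For instance, `header_form(0xC3)` yields `"long"` and `header_form(0x43)` yields `"short"`.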
The long header exposes more information. It contains a version number, as well as Source and Destination Connection IDs for associating packets with a QUIC connection. The definition and location of these fields in the QUIC long header are invariant for future versions of QUIC, although future versions of QUIC may provide additional fields in the long header [QUIC-INVARIANTS].
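The invariant long header fields can be extracted without knowing the QUIC version. The following is a sketch only, assuming the layout given in [QUIC-INVARIANTS] (one octet of flags, a 32-bit version, then length-prefixed Destination and Source Connection IDs); the type and function names are hypothetical:

```python
from typing import NamedTuple, Optional


class LongHeaderInvariants(NamedTuple):
    version: int
    dcid: bytes
    scid: bytes
    rest_offset: int  # offset of the first version-specific octet


def parse_long_header_invariants(datagram: bytes) -> Optional[LongHeaderInvariants]:
    """Return the version-independent long header fields, or None."""
    # Smallest possible long header: flags, version, two zero length octets.
    if len(datagram) < 7 or not datagram[0] & 0x80:
        return None
    version = int.from_bytes(datagram[1:5], "big")
    dcid_end = 6 + datagram[5]
    if dcid_end >= len(datagram):  # the SCID length octet must be present
        return None
    scid_start = dcid_end + 1
    scid_end = scid_start + datagram[dcid_end]
    if scid_end > len(datagram):
        return None
    return LongHeaderInvariants(
        version, datagram[6:dcid_end], datagram[scid_start:scid_end], scid_end
    )
```

Everything from `rest_offset` onward is version specific and cannot be interpreted without knowledge of the version in use.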
In version 1 of QUIC, the long header is used during connection establishment to transmit CRYPTO handshake data, perform version negotiation, retry, and send 0-RTT data.
Short headers are used after connection establishment in version 1 of QUIC and expose only an optional Destination Connection ID and the initial flags byte with the spin bit for RTT measurement.
The following information is exposed in QUIC packet headers in all versions of QUIC (as specified in [QUIC-INVARIANTS]):
In version 1 of QUIC, the following additional information is exposed:
Retry (Section 17.2.5 of [QUIC-TRANSPORT]) and Version Negotiation (Section 17.2.1 of [QUIC-TRANSPORT]) packets are not encrypted. Retry packets are integrity protected. Transport parameters are used to authenticate the contents of Retry packets later in the handshake. For other kinds of packets, version 1 of QUIC cryptographically protects other information in the packet headers:
Multiple QUIC packets may be coalesced into a single UDP datagram with a datagram carrying one or more long header packets followed by zero or one short header packets. When packets are coalesced, the Length fields in the long headers are used to separate QUIC packets; see Section 12.2 of [QUIC-TRANSPORT]. The Length field is a variable-length field, and its position in the header also varies depending on the lengths of the Source and Destination Connection IDs; see Section 17.2 of [QUIC-TRANSPORT].
Applications that have a mapping for TCP and QUIC are expected to use the same port number for both services. However, as for all other IETF transports [RFC7605], there is no guarantee that a specific application will use a given registered port or that a given port carries traffic belonging to the respective registered service, especially when application layer information is encrypted. For example, [QUIC-HTTP] specifies the use of the HTTP Alternative Services mechanism [RFC7838] for discovery of HTTP/3 services on other ports.
Further, as QUIC has a connection ID, it is also possible to maintain multiple QUIC connections over one 5-tuple (protocol, source, and destination IP address and source and destination port). However, if the connection ID is zero length, all packets of the 5-tuple likely belong to the same QUIC connection.
New QUIC connections are established using a handshake that is distinguishable on the wire (see Section 3.1 for details) and contains some information that can be passively observed.
To illustrate the information visible in the QUIC wire image during the handshake, we first show the general communication pattern visible in the UDP datagrams containing the QUIC handshake. Then, we examine each of the datagrams in detail.
The QUIC handshake can normally be recognized on the wire through four flights of datagrams labeled "Client Initial", "Server Initial", "Client Completion", and "Server Completion" as illustrated in Figure 1.
A handshake starts with the client sending one or more datagrams containing Initial packets (detailed in Figure 2), which elicits the Server Initial response (detailed in Figure 3), which typically contains three types of packets: Initial packet(s) with the beginning of the server's side of the TLS handshake, Handshake packet(s) with the rest of the server's portion of the TLS handshake, and 1-RTT packet(s), if present.
Client                                    Server
  |                                          |
  +----Client Initial----------------------->|
  +----(zero or more 0-RTT)----------------->|
  |                                          |
  |<-----------------------Server Initial----+
  |<--------(1-RTT encrypted data starts)----+
  |                                          |
  +----Client Completion-------------------->|
  +----(1-RTT encrypted data starts)-------->|
  |                                          |
  |<--------------------Server Completion----+
  |                                          |
As shown here, the client can send 0-RTT data as soon as it has sent its ClientHello and the server can send 1-RTT data as soon as it has sent its ServerHello. The Client Completion flight contains at least one Handshake packet and could also include an Initial packet. During the handshake, QUIC packets in separate contexts can be coalesced (see Section 2.2) in order to reduce the number of UDP datagrams sent during the handshake.
Handshake packets can arrive out-of-order without impacting the handshake as long as the reordering was not accompanied by extensive delays that trigger a spurious Probe Timeout (Section 6.2 of [QUIC-RECOVERY]). If QUIC packets get lost or reordered, packets belonging to the same flight might not be observed in close time succession, though the sequence of the flights will not change because one flight depends upon the peer's previous flight.
Datagrams that contain an Initial packet (Client Initial, Server Initial, and some Client Completion) contain at least 1200 octets of UDP payload. This protects against amplification attacks and verifies that the network path meets the requirements for the minimum QUIC IP packet size; see Section 14 of [QUIC-TRANSPORT]. This is accomplished by either adding PADDING frames within the Initial packet, coalescing other packets with the Initial packet, or leaving unused payload in the UDP packet after the Initial packet. A network path needs to be able to forward packets of at least this size for QUIC to be used.
The content of Initial packets is encrypted using Initial Secrets, which are derived from a per-version constant and the client's Destination Connection ID. That content is therefore observable by any on-path device that knows the per-version constant and is considered visible in this illustration. The content of QUIC Handshake packets is encrypted using keys established during the initial handshake exchange and is therefore not visible.
Initial, Handshake, and 1-RTT packets belong to different cryptographic and transport contexts. The Client Completion (Figure 4) and the Server Completion (Figure 5) flights conclude the Initial and Handshake contexts by sending final acknowledgments and CRYPTO frames.
+----------------------------------------------------------+
| UDP header (source and destination UDP ports)            |
+----------------------------------------------------------+
| QUIC long header (type = Initial, Version, DCID, SCID) (Length)
+----------------------------------------------------------+  |
| QUIC CRYPTO frame header                                 |  |
+----------------------------------------------------------+  |
| | TLS ClientHello (incl. TLS SNI)                      | |  |
+----------------------------------------------------------+  |
| QUIC PADDING frames                                      |  |
+----------------------------------------------------------+<-+
A Client Initial packet exposes the Version, Source, and Destination Connection IDs without encryption. The payload of the Initial packet is protected using the Initial secret. The complete TLS ClientHello, including any TLS Server Name Indication (SNI) present, is sent in one or more CRYPTO frames across one or more QUIC Initial packets.
+------------------------------------------------------------+
| UDP header (source and destination UDP ports)              |
+------------------------------------------------------------+
| QUIC long header (type = Initial, Version, DCID, SCID)   (Length)
+------------------------------------------------------------+  |
| QUIC CRYPTO frame header                                   |  |
+------------------------------------------------------------+  |
| TLS ServerHello                                            |  |
+------------------------------------------------------------+  |
| QUIC ACK frame (acknowledging client hello)                |  |
+------------------------------------------------------------+<-+
| QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
+------------------------------------------------------------+  |
| encrypted payload (presumably CRYPTO frames)               |  |
+------------------------------------------------------------+<-+
| QUIC short header                                          |
+------------------------------------------------------------+
| 1-RTT encrypted payload                                    |
+------------------------------------------------------------+
The Server Initial datagram also exposes the version number and the Source and Destination Connection IDs in the clear; the payload of the Initial packet is protected using the Initial secret.
+------------------------------------------------------------+
| UDP header (source and destination UDP ports)              |
+------------------------------------------------------------+
| QUIC long header (type = Initial, Version, DCID, SCID)   (Length)
+------------------------------------------------------------+  |
| QUIC ACK frame (acknowledging Server Initial)              |  |
+------------------------------------------------------------+<-+
| QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
+------------------------------------------------------------+  |
| encrypted payload (presumably CRYPTO/ACK frames)           |  |
+------------------------------------------------------------+<-+
| QUIC short header                                          |
+------------------------------------------------------------+
| 1-RTT encrypted payload                                    |
+------------------------------------------------------------+
The Client Completion flight does not expose any additional information; however, as the Destination Connection ID is server-selected, it usually is not the same ID that is sent in the Client Initial. Client Completion flights contain 1-RTT packets that indicate the handshake has completed on the client (see Section 3.2); these packets can also be used for three-way handshake RTT estimation as in Section 3.8.
+------------------------------------------------------------+
| UDP header (source and destination UDP ports)              |
+------------------------------------------------------------+
| QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
+------------------------------------------------------------+  |
| encrypted payload (presumably ACK frame)                   |  |
+------------------------------------------------------------+<-+
| QUIC short header                                          |
+------------------------------------------------------------+
| 1-RTT encrypted payload                                    |
+------------------------------------------------------------+
Similar to Client Completion, Server Completion does not expose additional information; observing it serves only to determine that the handshake has completed.
When the client uses 0-RTT data, the Client Initial flight can also include one or more 0-RTT packets as shown in Figure 6.
+----------------------------------------------------------+
| UDP header (source and destination UDP ports)            |
+----------------------------------------------------------+
| QUIC long header (type = Initial, Version, DCID, SCID) (Length)
+----------------------------------------------------------+  |
| QUIC CRYPTO frame header                                 |  |
+----------------------------------------------------------+  |
| TLS ClientHello (incl. TLS SNI)                          |  |
+----------------------------------------------------------+<-+
| QUIC long header (type = 0-RTT, Version, DCID, SCID)   (Length)
+----------------------------------------------------------+  |
| 0-RTT encrypted payload                                  |  |
+----------------------------------------------------------+<-+
When a 0-RTT packet is coalesced with an Initial packet, the datagram will be padded to 1200 bytes. Additional datagrams that contain only 0-RTT packets with long headers, carrying more 0-RTT data, can be sent after the client Initial packet. The amount of 0-RTT protected data that can be sent in the first flight is limited by the initial congestion window, typically to around 10 packets (see Section 7.2 of [QUIC-RECOVERY]).
As soon as the cryptographic context is established, all information in the QUIC header, including exposed information, is integrity protected. Further, information that was exposed in packets sent before the cryptographic context was established is validated during the cryptographic handshake. Therefore, devices on path cannot alter any information or bits in QUIC packets. Such alterations would cause the integrity check to fail, which results in the receiver discarding the packet. Some parts of Initial packets could be altered by removing and reapplying the authenticated encryption without immediate discard at the receiver. However, the cryptographic handshake validates most fields and any modifications in those fields will result in a connection establishment failure later.
The connection ID in the QUIC packet headers allows association of QUIC packets using information independent of the 5-tuple. This allows rebinding of a connection after one of the endpoints (usually the client) has experienced an address change. Further, it can be used by in-network devices to ensure that related 5-tuple flows are appropriately balanced together (see Section 4.4).
Client and server each choose a connection ID during the handshake; for example, a server might request that a client use a connection ID, whereas the client might choose a zero-length value. Connection IDs for either endpoint may change during the lifetime of a connection, with the new connection ID being supplied via encrypted frames (see Section 5.1 of [QUIC-TRANSPORT]). Therefore, observing a new connection ID does not necessarily indicate a new connection.
[QUIC-LB] specifies algorithms for encoding the server mapping in a connection ID in order to share this information with selected on-path devices such as load balancers. Server mappings should only be exposed to selected entities. Uncontrolled exposure would allow linkage of multiple IP addresses to the same host if the server also supports migration, which opens an attack vector on specific servers or pools. The best way to obscure an encoding is to appear random to any other observers, which is most rigorously achieved with encryption. As a result, any attempt to infer information from specific parts of a connection ID is unlikely to be useful.
The Packet Number field is always present in the QUIC packet header in version 1; however, it is always encrypted. The encryption key for packet number protection on Initial packets (which are sent before cryptographic context establishment) is specific to the QUIC version while packet number protection on subsequent packets uses secrets derived from the end-to-end cryptographic context. Packet numbers are therefore not part of the wire image that is visible to on-path observers.
Version Negotiation packets are used by the server to indicate that a requested version from the client is not supported (see Section 6 of [QUIC-TRANSPORT]). Version Negotiation packets are not intrinsically protected, but future QUIC versions could use later encrypted messages to verify that they were authentic. Therefore, any modification of the list of versions in such a packet will be detected and may cause the endpoints to terminate the connection attempt.
Also note that the list of versions in the Version Negotiation packet may contain reserved versions. This mechanism is used to avoid ossification in the implementation of the selection mechanism. Further, a client may send an Initial packet with a reserved version number to trigger version negotiation. In the Version Negotiation packet, the connection IDs of the client's Initial packet are reflected to provide a proof of return-routability. Therefore, changing this information will also cause the connection to fail.
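For illustration, the version list of a Version Negotiation packet can be read using only the invariant header layout. The sketch below rests on two assumptions drawn from [QUIC-INVARIANTS] and [QUIC-TRANSPORT] rather than from the text above: a Version Negotiation packet carries a Version field of zero, and reserved versions follow the pattern 0x?a?a?a?a. The function names are hypothetical.

```python
from typing import List, Optional


def is_reserved_version(version: int) -> bool:
    """True for versions of the form 0x?a?a?a?a (assumed reserved pattern)."""
    return (version & 0x0F0F0F0F) == 0x0A0A0A0A


def parse_version_negotiation(datagram: bytes) -> Optional[List[int]]:
    """Return the versions listed in a Version Negotiation packet, or None."""
    if len(datagram) < 7 or not datagram[0] & 0x80:
        return None
    if datagram[1:5] != b"\x00\x00\x00\x00":  # Version field must be zero
        return None
    pos = 5
    for _ in range(2):  # skip the Destination, then the Source Connection ID
        if pos >= len(datagram):
            return None
        pos += 1 + datagram[pos]
    # The remainder is a sequence of 32-bit version numbers.
    if pos > len(datagram) or (len(datagram) - pos) % 4:
        return None
    return [
        int.from_bytes(datagram[i:i + 4], "big")
        for i in range(pos, len(datagram), 4)
    ]
```

An observer that lists versions in this way should expect reserved values to appear and should not treat them as an error.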
QUIC is expected to evolve rapidly. Therefore, new versions (both experimental and IETF standard versions) will be deployed on the Internet more often than with other commonly deployed Internet and transport-layer protocols. Use of the Version field for traffic recognition will therefore behave differently than with these protocols. Using a particular version number to recognize valid QUIC traffic is likely to persistently miss a fraction of QUIC flows and completely fail in the near future. Reliance on the Version field for the purpose of admission control is also likely to lead to unintended failure modes. Admission of QUIC traffic regardless of version avoids these failure modes, avoids unnecessary deployment delays, and supports continuous version-based evolution.
This section addresses the different kinds of observations and inferences that can be made about QUIC flows by a passive observer in the network based on the wire image in Section 2. Here, we assume a bidirectional observer (one that can see packets in both directions in the sequence in which they are carried on the wire) unless noted, but typically without access to any keying information.
The QUIC wire image is not specifically designed to be distinguishable from other UDP traffic by a passive observer in the network. While certain QUIC applications may be heuristically identifiable on a per-application basis, there is no general method for distinguishing QUIC traffic from otherwise unclassifiable UDP traffic on a given link. Therefore, any unrecognized UDP traffic may be QUIC traffic.
At the time of writing, two application bindings for QUIC have been published or adopted by the IETF: HTTP/3 [QUIC-HTTP] and DNS over Dedicated QUIC Connections [RFC9250]. These are both known to have active Internet deployments, so an assumption that all QUIC traffic is HTTP/3 is not valid. HTTP/3 uses UDP port 443 by convention but various methods can be used to specify alternate port numbers. Other applications (e.g., Microsoft's SMB over QUIC) also use UDP port 443 by default. Therefore, simple assumptions about whether a given flow is using QUIC (or indeed which application might be using QUIC) based solely upon a UDP port number may not hold; see Section 5 of [RFC7605].
While the second-most-significant bit (0x40) of the first octet is set to 1 in most QUIC packets of the current version (see Section 2.1 and Section 17 of [QUIC-TRANSPORT]), this method of recognizing QUIC traffic is not reliable. First, it only provides one bit of information and is prone to collision with UDP-based protocols other than those considered in [RFC7983]. Second, this feature of the wire image is not invariant [QUIC-INVARIANTS] and may change in future versions of the protocol or even be negotiated during the handshake via the use of an extension [QUIC-GREASE].
Even though transport parameters transmitted in the client's Initial packet are observable by the network, they cannot be modified by the network without causing a connection failure. Further, the reply from the server cannot be observed, so observers on the network cannot know which parameters are actually in use.
An in-network observer assuming that a set of packets belongs to a QUIC flow might infer the version number in use by observing the handshake. If the version number in an Initial packet of the server response is subsequently seen in a packet from the client, that version has been accepted by both endpoints to be used for the rest of the connection (see Section 2 of [QUIC-VERSION-NEGOTIATION]).
The negotiated version cannot be identified for flows in which a handshake is not observed, such as in the case of connection migration. However, it might be possible to associate a flow with a flow for which a version has been identified; see Section 3.5.
A related question is whether the first packet of a given flow on a port known to be associated with QUIC is a valid QUIC packet. This determination supports in-network filtering of garbage UDP packets (reflection attacks, random backscatter, etc.). While heuristics based on the first byte of the packet (packet type) could be used to separate valid from invalid first packet types, the deployment of such heuristics is not recommended as bits in the first byte may have different meanings in future versions of the protocol.
This document focuses on QUIC version 1, and this Connection Confirmation section applies only to packets belonging to QUIC version 1 flows; for purposes of on-path observation, it assumes that these packets have been identified as such through the observation of a version number exchange as described above.
Connection establishment uses Initial and Handshake packets containing a TLS handshake and Retry packets that do not contain parts of the handshake. Connection establishment can therefore be detected using heuristics similar to those used to detect TLS over TCP. A client initiating a connection may also send data in 0-RTT packets directly after the Initial packet containing the TLS ClientHello. Since packets may be reordered or lost in the network, 0-RTT packets could be seen before the Initial packet.
Note that in this version of QUIC, clients send Initial packets before servers do, servers send Handshake packets before clients do, and only clients send Initial packets with tokens. Therefore, an endpoint can be identified as a client or server by an on-path observer. An attempted connection after Retry can be detected by correlating the contents of the Retry packet with the Token and the Destination Connection ID fields of the new Initial packet.
Some deployed in-network functions distinguish packets that carry only acknowledgment (ACK-only) information from packets carrying upper-layer data in order to attempt to enhance performance (for example, by queuing ACKs differently or manipulating ACK signaling [RFC3449]). Distinguishing ACK packets is possible in TCP, but is not supported by QUIC since acknowledgment signaling is carried inside QUIC's encrypted payload and ACK manipulation is impossible. Specifically, heuristics attempting to distinguish ACK-only packets from payload-carrying packets based on packet size are likely to fail. Their use as a way to construe internals of QUIC's operation is not recommended, as those mechanisms can change, e.g., due to the use of extensions.
The client's TLS ClientHello may contain a Server Name Indication (SNI) extension [RFC6066] by which the client reveals the name of the server it intends to connect to in order to allow the server to present a certificate based on that name. If present, SNI information is available to unidirectional observers on the client-to-server path.
The TLS ClientHello may also contain an Application-Layer Protocol Negotiation (ALPN) extension [RFC7301], by which the client exposes the names of application-layer protocols it supports; an observer can deduce that one of those protocols will be used if the connection continues.
Work is currently underway in the TLS working group to encrypt the contents of the ClientHello in TLS 1.3 [TLS-ECH]. This would make SNI-based application identification impossible by on-path observation for QUIC and other protocols that use TLS.
If the ClientHello is not encrypted, SNI can be derived from the client's Initial packets by calculating the Initial secret to decrypt the packet payload and parsing the QUIC CRYPTO frames containing the TLS ClientHello.
As both the derivation of the Initial secret and the structure of the Initial packet itself are version specific, the first step is always to parse the version number (the second through fifth bytes of the long header). Note that only long header packets carry the version number, so it is necessary to also check if the first bit of the QUIC packet is set to 1, which indicates a long header.
Note that proprietary QUIC versions that have been deployed before standardization might not set the first bit in a QUIC long header packet to 1. However, it is expected that these versions will gradually disappear over time and therefore do not require any special consideration or treatment.
When the version has been identified as QUIC version 1, the packet type needs to be verified as an Initial packet by checking that the third and fourth bits of the header are both set to 0. Then, the Destination Connection ID needs to be extracted from the packet. The Initial secret is calculated using the version-specific Initial salt as described in Section 5.2 of [QUIC-TLS]. The length of the connection ID is indicated in the 6th byte of the header followed by the connection ID itself.
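The steps above can be sketched as follows using only the Python standard library. This is an illustration under stated assumptions, not a reference implementation: the salt constant is the version 1 value given in Section 5.2 of [QUIC-TLS], the "client in" label and the HKDF-Expand-Label construction (with the TLS 1.3 "tls13 " prefix and SHA-256) are taken from that section as well, and all function names are hypothetical. Deriving the packet protection key, IV, and header protection key from this secret is not shown.

```python
import hashlib
import hmac

# Initial salt for QUIC version 1 (assumed from Section 5.2 of [QUIC-TLS]).
INITIAL_SALT_V1 = bytes.fromhex("38762cf7f55934b34d179ae6a4c80cadccbb7f0a")


def is_v1_initial(first_octet: int, version: int) -> bool:
    """Long header, version 1, and packet type bits (0x30) both zero."""
    return bool(first_octet & 0x80) and version == 1 and (first_octet & 0x30) == 0


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand_label(secret: bytes, label: bytes, length: int) -> bytes:
    """HKDF-Expand-Label with an empty context, as used by TLS 1.3."""
    full_label = b"tls13 " + label
    info = length.to_bytes(2, "big") + bytes([len(full_label)]) + full_label + b"\x00"
    output, block, counter = b"", b"", 1
    while len(output) < length:
        block = hmac.new(secret, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def client_initial_secret(dcid: bytes) -> bytes:
    """Secret protecting client Initial packets for the given DCID."""
    initial_secret = hkdf_extract(INITIAL_SALT_V1, dcid)
    return hkdf_expand_label(initial_secret, b"client in", 32)
```

The `dcid` argument is the Destination Connection ID taken from the client's first Initial packet, read from the long header as described above.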
Note that subsequent Initial packets might contain a Destination Connection ID other than the one used to generate the Initial secret. Therefore, attempts to decrypt these packets using the procedure above might fail unless the Initial secret is retained by the observer.
To determine the end of the packet header and find the start of the payload, the Packet Number Length, the Source Connection ID Length, and the Token Length need to be extracted. The Packet Number Length is defined by the seventh and eighth bits of the header as described in Section 17.2 of [QUIC-TRANSPORT], but is protected as described in Section 5.4 of [QUIC-TLS]. The Source Connection ID Length is specified in the byte after the Destination Connection ID. The Token Length, which follows the Source Connection ID, is a variable-length integer as specified in Section 16 of [QUIC-TRANSPORT].
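A sketch of this header walk is given below. It assumes the variable-length integer encoding of Section 16 of [QUIC-TRANSPORT] (the two most significant bits of the first octet select a total length of 1, 2, 4, or 8 octets) and the version 1 Initial packet layout; the function names are hypothetical. Removing header protection to recover the Packet Number Length is not shown.

```python
from typing import Tuple


def decode_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a variable-length integer at pos; return (value, next_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    length = 1 << (buf[pos] >> 6)  # 1, 2, 4, or 8 octets
    if pos + length > len(buf):
        raise ValueError("truncated varint")
    value = buf[pos] & 0x3F
    for octet in buf[pos + 1:pos + length]:
        value = (value << 8) | octet
    return value, pos + length


def locate_initial_fields(packet: bytes) -> Tuple[bytes, int, int]:
    """Return (token, length_field, pn_offset) for a version 1 Initial packet.

    pn_offset is the offset of the (still protected) Packet Number field.
    Out-of-range reads raise IndexError or ValueError on truncated input.
    """
    pos = 5                        # skip first octet and 32-bit version
    pos += 1 + packet[pos]         # Destination Connection ID
    pos += 1 + packet[pos]         # Source Connection ID
    token_len, pos = decode_varint(packet, pos)
    token = packet[pos:pos + token_len]
    pos += token_len
    length_field, pos = decode_varint(packet, pos)
    return token, length_field, pos
```

The Length value covers the packet number and the payload, so `pn_offset + length_field` gives the end of this QUIC packet within a coalesced datagram.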
After decryption, the client's Initial packets can be parsed to detect the CRYPTO frames that contain the TLS ClientHello, which then can be parsed similarly to TLS over TCP connections. Note that there can be multiple CRYPTO frames spread out over one or more Initial packets and they might not be in order, so reassembling the CRYPTO stream by parsing offsets and lengths is required. Further, the client's Initial packets may contain other frames, so the first bytes of each frame need to be checked to identify the frame type and determine whether the frame can be skipped over. Note that the length of the frames is dependent on the frame type; see Section 19 of [QUIC-TRANSPORT]. For example, PADDING frames (each consisting of a single zero byte) may occur before, after, or between CRYPTO frames. However, extensions might define additional frame types. If an unknown frame type is encountered, it is impossible to know the length of that frame, which prevents skipping over it; therefore, parsing fails.
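The frame walk and reassembly described above might look like the following sketch. It assumes the version 1 frame type values from [QUIC-TRANSPORT] (PADDING 0x00, PING 0x01, ACK 0x02 and 0x03, CRYPTO 0x06) and handles only those; any other type is treated as unknown and aborts parsing, as the text describes. Function names are hypothetical, and the input is assumed to be already decrypted Initial packet payloads.

```python
from typing import Dict, Iterable, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a QUIC variable-length integer; return (value, next_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated frame")
    length = 1 << (buf[pos] >> 6)
    if pos + length > len(buf):
        raise ValueError("truncated frame")
    value = buf[pos] & 0x3F
    for octet in buf[pos + 1:pos + length]:
        value = (value << 8) | octet
    return value, pos + length


def skip_ack(buf: bytes, pos: int, with_ecn: bool) -> int:
    """Skip an ACK frame body and return the offset just past it."""
    _, pos = read_varint(buf, pos)            # Largest Acknowledged
    _, pos = read_varint(buf, pos)            # ACK Delay
    range_count, pos = read_varint(buf, pos)  # ACK Range Count
    _, pos = read_varint(buf, pos)            # First ACK Range
    for _ in range(2 * range_count):          # (Gap, ACK Range Length) pairs
        _, pos = read_varint(buf, pos)
    for _ in range(3 if with_ecn else 0):     # ECN counts
        _, pos = read_varint(buf, pos)
    return pos


def reassemble_crypto(payloads: Iterable[bytes]) -> bytes:
    """Return the contiguous CRYPTO stream prefix found in the payloads."""
    chunks: Dict[int, bytes] = {}
    for payload in payloads:
        pos = 0
        while pos < len(payload):
            frame_type, pos = read_varint(payload, pos)
            if frame_type in (0x00, 0x01):    # PADDING, PING: no body
                continue
            if frame_type in (0x02, 0x03):    # ACK, ACK with ECN counts
                pos = skip_ack(payload, pos, frame_type == 0x03)
                continue
            if frame_type == 0x06:            # CRYPTO: offset, length, data
                offset, pos = read_varint(payload, pos)
                length, pos = read_varint(payload, pos)
                if pos + length > len(payload):
                    raise ValueError("truncated CRYPTO frame")
                chunks[offset] = payload[pos:pos + length]
                pos += length
                continue
            raise ValueError(f"unknown frame type 0x{frame_type:x}; cannot skip")
    stream = bytearray()
    for offset in sorted(chunks):
        if offset > len(stream):
            break                             # gap: later data not yet usable
        stream[offset:offset + len(chunks[offset])] = chunks[offset]
    return bytes(stream)
```

The returned bytes are the TLS handshake stream; locating the ClientHello and its SNI extension within it follows ordinary TLS parsing and is outside this sketch.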
The QUIC connection ID (seeSection 2.6) is designed to allow a coordinatingon-path device, such as a load balancer, to associate two flows when one of theendpoints changes address. This change can be due to NAT rebinding or addressmigration.¶
The connection ID must change upon intentional address change by an endpointand connection ID negotiation is encrypted; therefore, it is not possible for apassive observer to link intended changes of address using the connection ID.¶
When one endpoint's address unintentionally changes, as is the case with NATrebinding, an on-path observer may be able to use the connection ID toassociate the flow on the new address with the flow on the old address.¶
A network function that attempts to use the connection ID to associate flowsmust be robust to the failure of this technique. Since the connection ID maychange multiple times during the lifetime of a connection, packets with thesame 5-tuple but different connection IDs might or might not belong tothe same connection. Likewise, packets with the same connection ID butdifferent 5-tuples might not belong to the same connection either.¶
Connection IDs should be treated as opaque; see Section 4.4 for caveats regarding connection ID selection at servers.¶
QUIC does not expose the end of a connection; the only indication to on-path devices that a flow has ended is that packets are no longer observed. Therefore, stateful devices on path such as NATs and firewalls must use idle timeouts to determine when to drop state for QUIC flows; see Section 4.2.¶
QUIC explicitly exposes which side of a connection is a client and which side is a server during the handshake. In addition, the symmetry of a flow (whether it is primarily client-to-server, primarily server-to-client, or roughly bidirectional, as input to basic traffic classification techniques) can be inferred through the measurement of data rate in each direction. Note that QUIC packets containing only control frames (such as ACK-only packets) may be padded. Padding, though optional, may conceal connection roles or flow symmetry information.¶
The round-trip time (RTT) of QUIC flows can be inferred by observation once per flow during the handshake, as in passive TCP measurement; this requires parsing of the QUIC packet header and recognition of the handshake, as illustrated in Section 2.4. It can also be inferred during the flow's lifetime if the endpoints use the spin bit facility described below and in Section 17.3.1 of [QUIC-TRANSPORT]. RTT measurement is available to unidirectional observers when the spin bit is enabled.¶
In the common case, the delay between the client's Initial packet (containing the TLS ClientHello) and the server's Initial packet (containing the TLS ServerHello) represents the RTT component on the path between the observer and the server. The delay between the server's first Handshake packet and the Handshake packet sent by the client represents the RTT component on the path between the observer and the client. While the client may send 0-RTT packets after the Initial packet during connection re-establishment, these can be ignored for RTT measurement purposes.¶
Handshake RTT can be measured by adding the client-to-observer and observer-to-server RTT components together. This measurement necessarily includes all transport- and application-layer delay at both endpoints.¶
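As a worked illustration of this arithmetic, the following sketch computes both RTT components and their sum from four timestamps taken at the observer. The `HandshakeTimes` record and its field names are hypothetical; how the observer recognizes each packet is assumed to be solved elsewhere.

```python
from dataclasses import dataclass


@dataclass
class HandshakeTimes:
    """Observer-side timestamps (seconds) of the four handshake packets."""
    client_initial: float     # client's Initial carrying the TLS ClientHello
    server_initial: float     # server's Initial carrying the TLS ServerHello
    server_handshake: float   # server's first Handshake packet
    client_handshake: float   # Handshake packet sent by the client


def handshake_rtt(t: HandshakeTimes) -> tuple[float, float, float]:
    """Return (observer-to-server, client-to-observer, total) RTT in seconds."""
    observer_to_server = t.server_initial - t.client_initial
    client_to_observer = t.client_handshake - t.server_handshake
    return observer_to_server, client_to_observer, observer_to_server + client_to_observer
```

For example, timestamps of 0.000, 0.030, 0.031, and 0.051 seconds give components of about 30 ms and 20 ms and a handshake RTT of about 50 ms. Any 0-RTT packets seen between the Initial packets are simply not recorded.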
The spin bit provides a version-specific method to measure per-flow RTT from observation points on the network path throughout the duration of a connection. See Section 17.4 of [QUIC-TRANSPORT] for the definition of the spin bit in Version 1 of QUIC. Endpoint participation in spin bit signaling is optional. While its location is fixed in this version of QUIC, an endpoint can unilaterally choose to not support "spinning" the bit.¶
Use of the spin bit for RTT measurement by devices on path is only possible when both endpoints enable it. Some endpoints may disable use of the spin bit by default, others only in specific deployment scenarios, e.g., for servers and clients where the RTT would reveal the presence of a VPN or proxy. To avoid making these connections identifiable based on the usage of the spin bit, all endpoints randomly disable "spinning" for at least one eighth of connections, even if otherwise enabled by default. An endpoint not participating in spin bit signaling for a given connection can use a fixed spin value for the duration of the connection or can set the bit randomly on each packet sent.¶
When in use, the latency spin bit in each direction changes value once per RTT any time that both endpoints are sending packets continuously. An on-path observer can observe the time difference between edges (changes from 1 to 0 or 0 to 1) in the spin bit signal in a single direction to measure one sample of end-to-end RTT. This mechanism follows the principles of protocol measurability laid out in [IPIM].¶
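A minimal sketch of such a single-direction observer follows. It assumes QUIC version 1, where the spin bit is the 0x20 bit of the first byte of a short header packet, and it applies no filtering of reordered or lost edges. The names `spin_bit` and `SpinObserver` are hypothetical.

```python
from typing import Optional


def spin_bit(first_byte: int) -> Optional[int]:
    """Return the spin bit of a QUIC v1 short header packet, else None."""
    if first_byte & 0x80:                 # long header: no spin bit present
        return None
    return (first_byte >> 5) & 1


class SpinObserver:
    """Measures edge-to-edge time of the spin signal in one direction."""

    def __init__(self) -> None:
        self.last_spin: Optional[int] = None
        self.last_edge_time: Optional[float] = None

    def on_packet(self, timestamp: float, spin: int) -> Optional[float]:
        """Return one RTT sample when an edge follows an earlier edge."""
        sample = None
        if self.last_spin is not None and spin != self.last_spin:
            if self.last_edge_time is not None:
                sample = timestamp - self.last_edge_time
            self.last_edge_time = timestamp
        self.last_spin = spin
        return sample
```

The first edge only starts the clock; every later edge yields one sample. Whether a sample reflects network RTT or something else has to be decided by the observer using additional heuristics.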
Note that this measurement, as with passive RTT measurement for TCP, includes all transport protocol delay (e.g., delayed sending of acknowledgments) and/or application layer delay (e.g., waiting for a response to be generated). It therefore provides devices on path a good instantaneous estimate of the RTT as experienced by the application.¶
However, application-limited and flow-control-limited senders can have application- and transport-layer delay, respectively, that are much greater than network RTT. For example, if the sender only sends small amounts of application traffic periodically, where the periodicity is longer than the RTT, spin bit measurements provide information about the application period rather than network RTT.¶
Since the spin bit logic at each endpoint considers only samples from packets that advance the largest packet number, signal generation itself is resistant to reordering. However, reordering can cause problems at an observer by causing spurious edge detection and therefore inaccurate (i.e., lower) RTT estimates, if reordering occurs across a spin bit flip in the stream.¶
Simple heuristics based on the observed data rate per flow or changes in the RTT series can be used to reject bad RTT samples due to lost or reordered edges in the spin signal, as well as application or flow control limitation; for example, QoF [TMA-QOF] rejects component RTTs significantly higher than RTTs over the history of the flow. These heuristics may use the handshake RTT as an initial RTT estimate for a given flow. Usually such heuristics would also detect if the spin is either constant or randomly set for a connection.¶
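One possible shape of such a heuristic is sketched below. It is not the QoF algorithm: the class name `RttFilter`, the window size, and the factors 3.0 and 0.3 are assumptions made for illustration. The filter is seeded with the handshake RTT and rejects samples that are far above or far below the median of recently accepted samples.

```python
from collections import deque
from statistics import median


class RttFilter:
    """Rejects spin bit RTT samples that deviate strongly from recent history."""

    def __init__(self, initial_rtt: float, window: int = 8,
                 high_factor: float = 3.0, low_factor: float = 0.3) -> None:
        # The handshake RTT serves as the initial estimate for the flow.
        self.history = deque([initial_rtt], maxlen=window)
        self.high_factor = high_factor    # assumed threshold, not from the text
        self.low_factor = low_factor      # assumed threshold, not from the text

    def accept(self, sample: float) -> bool:
        """Return True and record the sample if it looks plausible."""
        reference = median(self.history)
        if sample > self.high_factor * reference:
            return False                  # e.g., application-limited sender
        if sample < self.low_factor * reference:
            return False                  # e.g., spurious edge from reordering
        self.history.append(sample)
        return True
```

A fixed-ratio filter like this cannot follow a genuine, sudden change in path RTT larger than the chosen factors; a deployed heuristic would need an escape path for that case, and a long run of rejections can also hint that the spin value is constant or random.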
An on-path observer that can see traffic in both directions (from client to server and from server to client) can also use the spin bit to measure "upstream" and "downstream" component RTT; i.e., the component of the end-to-end RTT attributable to the paths between the observer and the server and between the observer and the client, respectively. It does this by measuring the delay between a spin edge observed in the upstream direction and that observed in the downstream direction, and vice versa.¶
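The pairing of edges across directions can be sketched as follows. The input format, a time-ordered list of `(timestamp, direction)` spin edges with direction `"up"` (client to server) or `"down"` (server to client), and the function name `component_rtts` are assumptions for illustration; edge detection itself is assumed to happen per direction beforehand.

```python
def component_rtts(edges: list[tuple[float, str]]) -> tuple[list[float], list[float]]:
    """Split spin edge delays into upstream and downstream RTT samples.

    An "up" edge followed by a "down" edge spans observer -> server -> observer
    (upstream component); a "down" edge followed by an "up" edge spans
    observer -> client -> observer (downstream component).
    """
    upstream: list[float] = []
    downstream: list[float] = []
    previous = None
    for timestamp, direction in edges:
        if previous is not None and previous[1] != direction:
            delay = timestamp - previous[0]
            if previous[1] == "up":
                upstream.append(delay)
            else:
                downstream.append(delay)
        previous = (timestamp, direction)
    return upstream, downstream
```

Two consecutive edges in the same direction (for instance, because an edge in the other direction was lost) produce no sample in this sketch; only the later edge is kept as the new reference.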
Raw RTT samples generated using these techniques can be processed in various ways to generate useful network performance metrics. A simple linear smoothing or moving minimum filter can be applied to the stream of RTT samples to get a more stable estimate of application-experienced RTT. RTT samples measured from the spin bit can also be used to generate RTT distribution information, including minimum RTT (which approximates network RTT over longer time windows) and RTT variance (which approximates one-way packet delay variance as seen by an application end-point).¶
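Both filters are short enough to sketch. The exponentially weighted moving average below is used as one example of a linear smoother; the gain `alpha` and the window length are arbitrary illustrative values, and the function names are hypothetical.

```python
from collections import deque


def ewma(samples: list[float], alpha: float = 0.125) -> list[float]:
    """Exponentially weighted moving average of the RTT sample stream."""
    estimate = None
    smoothed = []
    for sample in samples:
        if estimate is None:
            estimate = sample             # first sample initializes the estimate
        else:
            estimate = (1 - alpha) * estimate + alpha * sample
        smoothed.append(estimate)
    return smoothed


def moving_min(samples: list[float], window: int = 4) -> list[float]:
    """Minimum over the most recent `window` samples at each step."""
    recent = deque(maxlen=window)
    minima = []
    for sample in samples:
        recent.append(sample)
        minima.append(min(recent))
    return minima
```

The moving minimum discards transient endpoint delay and so tracks network RTT more closely, while the smoothed average stays closer to what the application experiences.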
In this section, we review specific network management and measurement techniques and how QUIC's design impacts them.¶
Limited RTT measurement is possible by passive observation of QUIC traffic; see Section 3.8. No passive measurement of loss is possible with the present wire image. Limited observation of upstream congestion may be possible via the observation of Congestion Experienced (CE) markings in the IP header [RFC3168] on ECN-enabled QUIC traffic.¶
On-path devices can also make measurements of RTT, loss, and other performance metrics when information is carried in an additional network-layer packet header (Section 6 of [RFC9065] describes the use of Operations, Administration, and Management (OAM) information). Using network-layer approaches also has the advantage that common observation and analysis tools can be consistently used for multiple transport protocols; however, these techniques are often limited to measurements within one or multiple cooperating domains.¶
Stateful treatment of QUIC traffic (e.g., at a firewall or NAT middlebox) is possible through QUIC traffic and version identification (Section 3.1) and observation of the handshake for connection confirmation (Section 3.2). The lack of any visible end-of-flow signal (Section 3.6) means that this state must be purged either through timers or least-recently-used eviction depending on application requirements.¶
While QUIC has no clear network-visible end-of-flow signal and therefore does require timer-based state removal, the QUIC handshake indicates confirmation by both ends of a valid bidirectional transmission. As soon as the handshake has completed, timers should be set long enough to also allow for short idle time during a valid transmission.¶
[RFC4787] requires a network state timeout that is not less than 2 minutes for most UDP traffic. However, in practice, a QUIC endpoint can experience lower timeouts in the range of 30 to 60 seconds [QUIC-TIMEOUT].¶
In contrast, [RFC5382] recommends a state timeout of more than 2 hours for TCP given that TCP is a connection-oriented protocol with well-defined closure semantics. Even though QUIC has explicitly been designed to tolerate NAT rebindings, decreasing the NAT timeout is not recommended as it may negatively impact application performance or incentivize endpoints to send very frequent keep-alive packets.¶
Therefore, a state timeout of at least two minutes is recommended for QUIC traffic, even when lower state timeouts are used for other UDP traffic.¶
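A minimal sketch of timer-based state removal under this recommendation is shown below. State is keyed on the 5-tuple, and time is passed in explicitly so that the behavior is easy to follow. The class name `FlowTable` and its methods are hypothetical; the 120-second constant reflects the two-minute minimum stated above.

```python
QUIC_IDLE_TIMEOUT = 120.0   # seconds; the recommended minimum of two minutes


class FlowTable:
    """Per-flow state keyed on the 5-tuple and purged by an idle timer."""

    def __init__(self, idle_timeout: float = QUIC_IDLE_TIMEOUT) -> None:
        self.idle_timeout = idle_timeout
        self.last_seen: dict[tuple, float] = {}

    def on_packet(self, five_tuple: tuple, now: float) -> None:
        """Create or refresh state for the flow this packet belongs to."""
        self.last_seen[five_tuple] = now

    def expire(self, now: float) -> list[tuple]:
        """Remove and return flows idle for longer than the timeout."""
        expired = [key for key, seen in self.last_seen.items()
                   if now - seen > self.idle_timeout]
        for key in expired:
            del self.last_seen[key]
        return expired
```

Because QUIC exposes no end-of-flow signal, `expire` is the only way state leaves the table in this sketch; a device with tight memory limits could add least-recently-used eviction on top.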
If state is removed too early, this could lead to black-holing of incoming packets after a short idle period. To detect this situation, a timer at the client needs to expire before a re-establishment can happen (if at all), which would lead to unnecessarily long delays in an otherwise working connection.¶
Furthermore, not all endpoints use routing architectures where connections will survive a port or address change. Even when the client revives the connection, a NAT rebinding can cause a routing mismatch where a packet is not even delivered to the server that might support address migration. For these reasons, the limits in [RFC4787] are important to avoid black-holing of packets (and hence avoid interrupting the flow of data to the client), especially where devices are able to distinguish QUIC traffic from other UDP payloads.¶
The QUIC header optionally contains a connection ID, which could provide additional entropy beyond the 5-tuple. The QUIC handshake needs to be observed in order to understand whether the connection ID is present and what length it has. However, connection IDs may be renegotiated after the handshake, and this renegotiation is not visible to the path. Therefore, using the connection ID as a flow key field for stateful treatment of flows is not recommended as connection ID changes will cause undetectable and unrecoverable loss of state in the middle of a connection. In particular, the use of the connection ID for functions that require state to make a forwarding decision is not viable as it will break connectivity, or at minimum, cause long timeout-based delays before this problem is detected by the endpoints and the connection can potentially be re-established.¶
Use of connection IDs is specifically discouraged for NAT applications. If a NAT hits an operational limit, it is recommended instead to drop the initial packets of a flow (see also Section 4.5), which potentially triggers TCP fallback. Use of the connection ID to multiplex multiple connections on the same IP address/port pair is not a viable solution as it risks connectivity breakage in case the connection ID changes.¶
While QUIC's migration capability makes it possible for a connection to survive client address changes, this does not work if the routers or switches in the server infrastructure route using the address-port 4-tuple. If infrastructure routes on addresses only, NAT rebinding or address migration will cause packets to be delivered to the wrong server. [QUIC-LB] describes a way to address this problem by coordinating the selection and use of connection IDs between load balancers and servers.¶
Applying address translation at a middlebox to maintain a stable address-port mapping for flows based on connection ID might seem like a solution to this problem. However, hiding information about the change of the IP address or port conceals important and security-relevant information from QUIC endpoints, and as such, would facilitate amplification attacks (see Section 8 of [QUIC-TRANSPORT]). A NAT function that hides peer address changes prevents the other end from detecting and mitigating attacks as the endpoint cannot verify connectivity to the new address using QUIC PATH_CHALLENGE and PATH_RESPONSE frames.¶
In addition, a change of IP address or port is also an input signal to other internal mechanisms in QUIC. When a path change is detected, path-dependent variables like congestion control parameters will be reset, which protects the new path from overload.¶
In the case of networking architectures that include load balancers, the connection ID can be used as a way for the server to signal information about the desired treatment of a flow to the load balancers. Guidance on assigning connection IDs is given in [QUIC-APPLICABILITY]. [QUIC-LB] describes a system for coordinating selection and use of connection IDs between load balancers and servers.¶
[RFC4787] describes possible packet-filtering behaviors that relate to NATs but are often also used in other scenarios where packet filtering is desired. Though the guidance there holds, a particularly unwise behavior admits a handful of UDP packets and then makes a decision as to whether or not to filter later packets in the same connection. QUIC applications are encouraged to fall back to TCP if early packets do not arrive at their destination [QUIC-APPLICABILITY], as QUIC is based on UDP and there are known blocks of UDP traffic (see Section 4.6). Admitting a few packets allows the QUIC endpoint to determine that the path accepts QUIC. Sudden drops afterwards will result in slow and costly timeouts before abandoning the connection.¶
Today, UDP is the most prevalent DDoS vector, since it is easy for compromised non-admin applications to send a flood of large UDP packets (while with TCP the attacker gets throttled by the congestion controller) or to craft reflection and amplification attacks; therefore, some networks block UDP traffic. With increased deployment of QUIC, there is also an increased need to allow UDP traffic on ports used for QUIC. However, if UDP is generally enabled on these ports, UDP flood attacks may also use the same ports. One possible response to this threat is to throttle UDP traffic on the network, allocating a fixed portion of the network capacity to UDP and blocking UDP datagrams over that cap. As the portion of QUIC traffic compared to TCP is also expected to increase over time, using such a limit is not recommended; if this is done, limits might need to be adapted dynamically.¶
Further, if UDP traffic is desired to be throttled, it is recommended to block individual QUIC flows entirely rather than dropping packets indiscriminately. When the handshake is blocked, QUIC-capable applications may fall back to TCP. However, blocking a random fraction of QUIC packets across 4-tuples will allow many QUIC handshakes to complete, preventing TCP fallback, but these connections will suffer from severe packet loss (see also Section 4.5). Therefore, UDP throttling should be realized by per-flow policing as opposed to per-packet policing. Note that this per-flow policing should be stateless to avoid problems with stateful treatment of QUIC flows (see Section 4.2), for example, blocking a portion of the space of values of a hash function over the addresses and ports in the UDP datagram. While QUIC endpoints are often able to survive address changes, e.g., by NAT rebindings, blocking a portion of the traffic based on 5-tuple hashing increases the risk of black-holing an active connection when the address changes.¶
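The hash-based approach can be sketched as follows. The choice of SHA-256, the optional `salt`, and the function name `flow_blocked` are assumptions for illustration; any hash with reasonably uniform output over the addresses and ports would serve.

```python
import hashlib


def flow_blocked(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                 blocked_fraction: float, salt: bytes = b"") -> bool:
    """Decide statelessly whether every packet of this UDP flow is dropped.

    The addresses and ports are hashed into [0, 1); flows that land in the
    lowest `blocked_fraction` of the hash space are blocked entirely.
    """
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(salt + key).digest()
    position = int.from_bytes(digest[:8], "big") / 2**64
    return position < blocked_fraction
```

Every packet of a flow gets the same verdict without any per-flow state, so a blocked handshake fails cleanly and the application can fall back to TCP. The caveat stated above still applies: after an address change the flow hashes to a different position and may become blocked.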
Note that some source ports are assumed to be reflection attack vectors by some servers; see Section 8.1 of [QUIC-APPLICABILITY]. As a result, NAT binding to these source ports can result in that traffic being blocked.¶
On-path observation of the transport headers of packets can be used for various security functions. For example, Denial of Service (DoS) and Distributed DoS (DDoS) attacks against the infrastructure or against an endpoint can be detected and mitigated by characterizing anomalous traffic. Other uses include support for security audits (e.g., verifying the compliance with cipher suites), client and application fingerprinting for inventory, and providing alerts for network intrusion detection and other next-generation firewall functions.¶
Current practices in detection and mitigation of DDoS attacks generally involve classification of incoming traffic (as packets, flows, or some other aggregate) into "good" (productive) and "bad" (DDoS) traffic, and then differential treatment of this traffic to forward only good traffic. This operation is often done in a separate specialized mitigation environment through which all traffic is filtered; a generalized architecture for separation of concerns in mitigation is given in [DOTS-ARCH].¶
Efficient classification of this DDoS traffic in the mitigation environment is key to the success of this approach. Limited first packet garbage detection as in Section 3.1.2 and stateful tracking of QUIC traffic as mentioned in Section 4.2 above may be useful during classification.¶
Note that using a connection ID to support connection migration renders 5-tuple-based filtering insufficient to detect active flows and requires more state to be maintained by DDoS defense systems if support of migration of QUIC flows is desired. For the common case of NAT rebinding, where the client's address changes without the client's intent or knowledge, DDoS defense systems can detect a change in the client's endpoint address by linking flows based on the server's connection IDs. However, QUIC's linkability resistance ensures that a deliberate connection migration is accompanied by a change in the connection ID. In this case, the connection ID cannot be used to distinguish valid, active traffic from new attack traffic.¶
It is also possible for endpoints to directly support security functions such as DoS classification and mitigation. Endpoints can cooperate with an in-network device directly by e.g., sharing information about connection IDs.¶
Another potential method could use an on-path network device that relies on pattern inferences in the traffic and heuristics or machine learning instead of processing observed header information.¶
However, it is questionable whether connection migrations must be supported during a DDoS attack. While unintended migration without a connection ID change can be supported much more easily, it might be acceptable to not support migrations of active QUIC connections that are not visible to the network functions performing the DDoS detection. As soon as the connection blocking is detected by the client, the client may be able to rely on the 0-RTT data mechanism provided by QUIC. When clients migrate to a new path, they should be prepared for the migration to fail and attempt to reconnect quickly.¶
Beyond in-network DDoS protection mechanisms, TCP SYN cookies [RFC4987] are a well-established method of mitigating some kinds of TCP DDoS attacks. QUIC Retry packets are the functional analogue to SYN cookies, forcing clients to prove possession of their IP address before committing server state. However, there are safeguards in QUIC against unsolicited injection of these packets by intermediaries who do not have consent of the end server. See [QUIC-RETRY] for standard ways for intermediaries to send Retry packets on behalf of consenting servers.¶
It is expected that any QoS handling in the network, e.g., based on use of Diffserv Code Points (DSCPs) [RFC2475] as well as Equal-Cost Multi-Path (ECMP) routing, is applied on a per-flow basis (and not per-packet) and as such that all packets belonging to the same active QUIC connection get uniform treatment.¶
Using ECMP to distribute packets from a single flow across multiple network paths or any other nonuniform treatment of packets belonging to the same connection could result in variations in order, delivery rate, and drop rate. As feedback about loss or delay of each packet is used as input to the congestion controller, these variations could adversely affect performance. Depending on the loss recovery mechanism that is implemented, QUIC may be more tolerant of packet reordering than typical TCP traffic (see Section 2.7). However, the recovery mechanism used by a flow cannot be known by the network and therefore reordering tolerance should be considered as unknown.¶
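Per-flow (rather than per-packet) path selection is commonly achieved by hashing the flow identifier. The sketch below uses CRC-32 over the 5-tuple purely as an illustrative hash; the function name `pick_path` and the tuple layout are assumptions, and real routers use their own hash functions and field sets.

```python
import zlib


def pick_path(five_tuple: tuple, n_paths: int) -> int:
    """Map a flow to one of `n_paths` equal-cost paths, stable per flow."""
    key = "|".join(str(field) for field in five_tuple).encode()
    return zlib.crc32(key) % n_paths
```

Since the result depends only on the 5-tuple, all packets of a flow follow the same path and see the same ordering, delay, and loss behavior for as long as the 5-tuple stays the same.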
Note that the 5-tuple of a QUIC connection can change due to migration. In this case different flows are observed by the path and may be treated differently, as congestion control is usually reset on migration (see also Section 3.5).¶
Datagram Packetization Layer PMTU Discovery (DPLPMTUD) can be used by QUIC to probe for the supported PMTU. DPLPMTUD optionally uses ICMP messages (e.g., IPv6 Packet Too Big (PTB) messages). Given known attacks with the use of ICMP messages, the use of DPLPMTUD in QUIC has been designed to safely use but not rely on receiving ICMP feedback (see Section 14.2.1 of [QUIC-TRANSPORT]).¶
Networks are recommended to forward these ICMP messages and retain as much of the original packet as possible without exceeding the minimum MTU for the IP version when generating ICMP messages as recommended in [RFC1812] and [RFC4443].¶
Some network segments support 1500-byte packets, but can only do so by fragmenting at a lower layer before traversing a network segment with a smaller MTU, and then reassembling within the network segment. This is permissible even when the IP layer is IPv6 or IPv4 with the Don't Fragment (DF) bit set, because fragmentation occurs below the IP layer. However, this process can add to compute and memory costs, leading to a bottleneck that limits network capacity. In such networks, this generates a desire to influence a majority of senders to use smaller packets to avoid exceeding limited reassembly capacity.¶
For TCP, Maximum Segment Size (MSS) clamping (Section 3.2 of [RFC4459]) is often used to change the sender's TCP maximum segment size, but QUIC requires a different approach. Section 14 of [QUIC-TRANSPORT] advises senders to probe larger sizes using DPLPMTUD [DPLPMTUD] or Path Maximum Transmission Unit Discovery (PMTUD) [RFC1191] [RFC8201]. This mechanism encourages senders to approach the maximum packet size, which could then cause fragmentation within a network segment of which they may not be aware.¶
If path performance is limited when forwarding larger packets, an on-path device should support a maximum packet size for a specific transport flow and then consistently drop all packets that exceed the configured size when the inner IPv4 packet has DF set or IPv6 is used.¶
Networks with configurations that would lead to fragmentation of large packets within a network segment should drop such packets rather than fragmenting them. Network operators who plan to implement a more selective policy may start by focusing on QUIC.¶
QUIC flows cannot always be easily distinguished from other UDP traffic, but we assume at least some portion of QUIC traffic can be identified (see Section 3.1). For networks supporting QUIC, it is recommended that a path drops any packet larger than the fragmentation size. When a QUIC endpoint uses DPLPMTUD, it will use a QUIC probe packet to discover the PMTU. If this probe is lost, it will not impact the flow of QUIC data.¶
IPv4 routers generate an ICMP message when a packet is dropped because the link MTU was exceeded. [RFC8504] specifies how an IPv6 node generates an ICMPv6 PTB in this case. PMTUD relies upon an endpoint receiving such PTB messages [RFC8201], whereas DPLPMTUD does not rely upon these messages, but can still optionally use these to improve performance (see Section 4.6 of [DPLPMTUD]).¶
A network cannot know in advance which discovery method is used by a QUIC endpoint, so it should send a PTB message in addition to dropping an oversized packet. A generated PTB message should be compliant with the validation requirements of Section 14.2.1 of [QUIC-TRANSPORT], otherwise it will be ignored for PMTU discovery. This provides a signal to the endpoint to prevent the packet size from growing too large, which can entirely avoid network segment fragmentation for that flow.¶
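The resulting forwarding decision can be summarized in a short sketch. The `Action` names and the function `handle_packet` are hypothetical, and constructing a PTB message that satisfies the validation requirements is not shown. The case of an oversized IPv4 packet without DF set is deliberately left to local policy in this sketch.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"
    DROP_AND_SEND_PTB = "drop and send PTB"
    LOCAL_POLICY = "local policy"         # oversized IPv4 packet without DF set


def handle_packet(size: int, max_size: int, is_ipv6: bool, df_set: bool) -> Action:
    """Decide how to treat a packet against the configured maximum size."""
    if size <= max_size:
        return Action.FORWARD
    if is_ipv6 or df_set:
        # Drop consistently and signal the sender with a PTB message.
        return Action.DROP_AND_SEND_PTB
    return Action.LOCAL_POLICY
```

Applying the same verdict to every oversized packet of a flow is what lets a DPLPMTUD probe fail cleanly, while a PMTUD sender relies on the accompanying PTB message.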
Endpoints can cache PMTU information in the IP-layer cache. This short-term consistency between the PMTU for flows can help avoid an endpoint using a PMTU that is inefficient. The IP cache can also influence the PMTU value of other IP flows that use the same path [RFC8201] [DPLPMTUD], including IP packets carrying protocols other than QUIC. The representation of an IP path is implementation specific [RFC8201].¶
This document has no actions for IANA.¶
QUIC is an encrypted and authenticated transport. That means once the cryptographic handshake is complete, QUIC endpoints discard most packets that are not authenticated, greatly limiting the ability of an attacker to interfere with existing connections.¶
However, some information is still observable as supporting manageability of QUIC traffic inherently involves trade-offs with the confidentiality of QUIC's control information; this entire document is therefore security-relevant.¶
More security considerations for QUIC are discussed in [QUIC-TRANSPORT] and [QUIC-TLS], which generally consider active or passive attackers in the network as well as attacks on specific QUIC mechanisms.¶
Version Negotiation packets do not contain any mechanism to prevent version downgrade attacks. However, future versions of QUIC that use Version Negotiation packets are required to define a mechanism that is robust against version downgrade attacks. Therefore, a network node should not attempt to impact version selection, as version downgrade may result in connection failure.¶
Special thanks to last call reviewers Elwyn Davies, Barry Leiba, Al Morton, and Peter Saint-Andre.¶
This work was partially supported by the European Commission under Horizon 2020 grant agreement no. 688421 Measurement and Architecture for a Middleboxed Internet (MAMI), and by the Swiss State Secretariat for Education, Research, and Innovation under contract no. 15.0268. This support does not imply endorsement.¶
The following people have contributed significant text to and/or feedback on this document:¶