RFC 9413 | Maintaining Robust Protocols | June 2023
Thomson & Schinazi | Informational
The main goal of the networking standards process is to enable the long-term interoperability of protocols. This document describes active protocol maintenance, a means to accomplish that goal. By evolving specifications and implementations, it is possible to reduce ambiguity over time and create a healthy ecosystem.

The robustness principle, often phrased as "be conservative in what you send, and liberal in what you accept", has long guided the design and implementation of Internet protocols. However, it has been interpreted in a variety of ways. While some interpretations help ensure the health of the Internet, others can negatively affect interoperability over time. When a protocol is actively maintained, protocol designers and implementers can avoid these pitfalls.
This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Architecture Board (IAB) and represents information that the IAB has deemed valuable to provide for permanent record. It represents the consensus of the Internet Architecture Board (IAB). Documents approved for publication by the IAB are not candidates for any level of Internet Standard; see Section 2 of RFC 7841.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc9413.

Copyright (c) 2023 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
There is good evidence to suggest that many important protocols are routinely maintained beyond their inception. In particular, a sizable proportion of IETF activity is dedicated to the stewardship of existing protocols. This document first discusses hazards in applying the robustness principle too broadly (see Section 2) and offers an alternative strategy for handling interoperability problems in deployments (see Section 5).

Ideally, protocol implementations can be actively maintained so that unexpected conditions are proactively identified and resolved. Short-term mitigations might still be needed for deployments that cannot be easily updated, but such cases need not be permanent. This is discussed further in Section 5.
The robustness principle has been hugely influential in shaping the design of the Internet. As stated in the IAB document "Architectural Principles of the Internet" [RFC1958], the robustness principle advises to:

Be strict when sending and tolerant when receiving. Implementations must follow specifications precisely when sending to the network, and tolerate faulty input from the network. When in doubt, discard faulty input silently, without returning an error message unless this is required by the specification.
This simple statement captures a significant concept in the design of interoperable systems. Many consider the application of the robustness principle to be instrumental in the success of the Internet as well as the design of interoperable protocols in general.

There are three main aspects to the robustness principle:
No software is perfect, and failures can lead to unexpected behavior. Well-designed software strives to be resilient to such issues, whether they occur in the local software or in software that it communicates with. In particular, it is critical for software to gracefully recover from these issues without aborting unrelated processing.

Since not all actors on the Internet are benevolent, networking software needs to be resilient to input that is intentionally crafted to cause unexpected consequences. For example, software must ensure that invalid input doesn't allow the sender to access data, change data, or perform actions that it would otherwise not be allowed to.
It can be possible for an implementation to receive inputs that the specification did not prepare it for. This scenario excludes those cases where the specification explicitly defines how a faulty message is handled. Instead, this refers to cases where handling is not defined or where there is some ambiguity in the specification. In this case, some interpretations of the robustness principle advocate that the implementation tolerate the faulty input and silently discard it. Some interpretations even suggest that a faulty or ambiguous message be processed according to the inferred intent of the sender.

The facets of the robustness principle that protect against defects or attacks are understood to be necessary guiding principles for the design and implementation of networked systems. However, an interpretation that advocates for tolerating unexpected inputs is no longer considered best practice in all scenarios.
Time and experience show that negative consequences to interoperability accumulate over time if implementations silently accept faulty input. This problem originates from an implicit assumption that it is not possible to effect change in a system the size of the Internet. When one assumes that changes to existing implementations are not presently feasible, tolerating flaws feels inevitable.

Many problems that this third aspect of the robustness principle was intended to solve can instead be better addressed by active maintenance. Active protocol maintenance is where a community of protocol designers, implementers, and deployers work together to continuously improve and evolve protocol specifications alongside implementations and deployments of those protocols. A community that takes an active role in the maintenance of protocols will no longer need to rely on the robustness principle to avoid interoperability issues.
The context from which the robustness principle was developed provides valuable insights into its intent and purpose. The earliest form of the principle in the RFC Series (the Internet Protocol specification [RFC0760]) is preceded by a sentence that reveals a motivation for the principle:

While the goal of this specification is to be explicit about the protocol there is the possibility of differing interpretations. In general, an implementation should be conservative in its sending behavior, and liberal in its receiving behavior.
This formulation of the principle expressly recognizes the possibility that the specification could be imperfect. This contextualizes the principle in an important way.

Imperfect specifications are unavoidable, largely because it is more important to proceed to implementation and deployment than it is to perfect a specification. A protocol benefits greatly from experience with its use. A deployed protocol is immeasurably more useful than a perfect protocol specification. This is particularly true in early phases of system design, to which the robustness principle is best suited.
As demonstrated by the IAB document "What Makes for a Successful Protocol?" [RFC5218], success or failure of a protocol depends far more on factors like usefulness than on technical excellence. Timely publication of protocol specifications, even with the potential for flaws, likely contributed significantly to the eventual success of the Internet.

This premise that specifications will be imperfect is correct. However, ignoring faulty or ambiguous input is almost always the incorrect solution to the problem.
Good extensibility [EXT] can make it easier to respond to new use cases or changes in the environment in which the protocol is deployed.

The ability to extend a protocol is sometimes mistaken for an application of the robustness principle. After all, if one party wants to start using a new feature before another party is prepared to receive it, it might be assumed that the receiving party is being tolerant of new types of input.
A well-designed extensibility mechanism establishes clear rules for the handling of elements like new messages or parameters. This depends on specifying the handling of malformed or illegal inputs so that implementations behave consistently in all cases that might affect interoperation. New messages or parameters thereby become entirely expected. If extension mechanisms and error handling are designed and implemented correctly, new protocol features can be deployed with confidence in the understanding of the effect they have on existing implementations.

In contrast, relying on implementations to consistently handle unexpected input is not a good strategy for extensibility. Using undocumented or accidental features of a protocol as the basis of an extensibility mechanism can be extremely difficult, as is demonstrated by the case study in Appendix A.3 of [EXT]. It is better and easier to design a protocol for extensibility initially than to retrofit the capability (see also [EDNS0]).
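The kind of predictable extension handling described above can be sketched with a simple type-length-value (TLV) parser. The message format and the skip-unknown rule below are hypothetical, not taken from any particular protocol; the point is that both unknown elements and malformed input have behavior defined in advance:

```python
import struct

def parse_tlv(data: bytes, known_types: set[int]) -> list[tuple[int, bytes]]:
    """Parse a sequence of TLV records: 1-byte type, 2-byte big-endian length, value.

    The extension rule is fixed up front: records with unknown types are
    skipped, while structurally malformed input (a truncated record) is
    always rejected, so every implementation behaves identically on every
    input rather than guessing at the sender's intent.
    """
    records = []
    offset = 0
    while offset < len(data):
        if offset + 3 > len(data):
            raise ValueError("truncated TLV header")  # malformed: reject, never guess
        rtype = data[offset]
        (length,) = struct.unpack_from("!H", data, offset + 1)
        offset += 3
        if offset + length > len(data):
            raise ValueError("TLV length exceeds message")  # malformed: reject
        value = data[offset:offset + length]
        offset += length
        if rtype in known_types:
            records.append((rtype, value))
        # Unknown type: skipped by rule, so new record types are "entirely expected".
    return records
```

Under this rule, a sender can deploy a new record type with confidence that older receivers will skip it rather than fail unpredictably.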
A protocol could be designed to permit a narrow set of valid inputs, or it could be designed to treat a wide range of inputs as valid.

A more flexible protocol is more complex to specify and implement; variations, especially those that are not commonly used, can create potential interoperability hazards. In the absence of strong reasons to be flexible, a simpler protocol is more likely to successfully interoperate.
Where input is provided by users, allowing flexibility might serve to make the protocol more accessible, especially for non-expert users. HTML authoring [HTML] is an example of this sort of design.

In protocols where there are many participants that might generate messages based on data from other participants, some flexibility might contribute to resilience of the system. A routing protocol is a good example of where this might be necessary.
In BGP [BGP], a peer generates UPDATE messages based on messages it receives from other peers. Peers can copy attributes without validation, potentially propagating invalid values. RFC 4271 [BGP] mandated a session reset for invalid UPDATE messages, a requirement that was not widely implemented. In many deployments, peers would treat a malformed UPDATE in less stringent ways, such as by treating the affected route as having been withdrawn. Ultimately, RFC 7606 [BGP-REH] documented this practice and provided precise rules, including mandatory actions for different error conditions.
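The shape of those rules can be illustrated with a sketch. The error condition names and the mapping below are a simplified, hypothetical rendering in the spirit of RFC 7606, not the full set of rules it defines:

```python
from enum import Enum

class ErrorAction(Enum):
    SESSION_RESET = "session-reset"          # tear down the BGP session
    TREAT_AS_WITHDRAW = "treat-as-withdraw"  # treat affected routes as withdrawn
    ATTRIBUTE_DISCARD = "attribute-discard"  # drop only the faulty attribute

# Hypothetical mapping from error condition to mandated action: errors that
# corrupt message framing still reset the session, while per-attribute
# errors degrade more gracefully.
ERROR_ACTIONS = {
    "malformed-message-header": ErrorAction.SESSION_RESET,
    "malformed-nlri": ErrorAction.TREAT_AS_WITHDRAW,
    "invalid-origin-attribute": ErrorAction.TREAT_AS_WITHDRAW,
    "malformed-optional-transitive": ErrorAction.ATTRIBUTE_DISCARD,
}

def handle_update_error(condition: str) -> ErrorAction:
    """Return the mandated action for an UPDATE error; a condition with no
    defined action falls back to the most conservative choice rather than
    being silently ignored."""
    return ERROR_ACTIONS.get(condition, ErrorAction.SESSION_RESET)
```

The key property is that every peer consults the same table, so a given error produces the same outcome everywhere instead of each implementation improvising.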
A protocol can explicitly allow for a range of valid expressions of the same semantics, with precise definitions for error handling. This is distinct from a protocol that relies on the application of the robustness principle. With the former, interoperation depends on specifications that capture all relevant details, whereas interoperation in the latter depends more extensively on implementations making compatible decisions, as noted in Section 4.2.
The guidance in this document is intended for protocols that are deployed to the Internet. There are some situations in which this guidance might not apply to a protocol due to conditions on its implementation or deployment.

In particular, this guidance depends on an ability to update and deploy implementations. Being able to rapidly update implementations that are deployed to the Internet helps manage security risks, but in reality, some software deployments have lifecycles that make software updates either rare or altogether impossible.
Where implementations are not updated, there is no opportunity to apply the practices that this document recommends. In particular, some practices -- such as those described in Section 5.1 -- only exist to support the development of protocol maintenance and evolution. Employing this guidance is therefore only applicable where there is the possibility of improving deployments through timely updates of their implementations.

Problems in other implementations can create an unavoidable need to temporarily tolerate unexpected inputs. However, this course of action carries risks.
Tolerating unexpected input might be an expedient tool for systems in early phases of deployment, which was the case for the early Internet. Being lenient in this way defers the effort of dealing with interoperability problems and prioritizes progress. However, this deferral can amplify the ultimate cost of handling interoperability problems.

Divergent implementations of a specification emerge over time. When variations occur in the interpretation or expression of semantic components, implementations cease to be perfectly interoperable.
Implementation bugs are often identified as the cause of variation, though it is often a combination of factors. Using a protocol in ways that were not anticipated in the original design or ambiguities and errors in the specification are often contributing factors. Disagreements on the interpretation of specifications should be expected over the lifetime of a protocol.

Even with the best intentions to maintain protocol correctness, the pressure to interoperate can be significant. No implementation can hope to avoid having to trade correctness for interoperability indefinitely.
An implementation that reacts to variations in the manner recommended in the robustness principle enters a pathological feedback cycle. Over time:

A flaw can become entrenched as a de facto standard. Any implementation of the protocol is required to replicate the aberrant behavior, or it is not interoperable. This is both a consequence of tolerating the unexpected and a product of a natural reluctance to avoid fatal error conditions. Ensuring interoperability in this environment is often referred to as aiming to be "bug-for-bug compatible".
For example, in TLS [TLS], extensions use a tag-length-value format and can be added to messages in any order. However, some server implementations terminated connections if they encountered a TLS ClientHello message that ended with an empty extension. To maintain interoperability with these servers, which were widely deployed, client implementations were required to be aware of this bug and ensure that a ClientHello message ends in a non-empty extension.
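The client-side workaround can be sketched as a reordering pass over the extension list before serialization. The in-memory representation of extensions as (type, body) pairs here is hypothetical; the sketch relies only on the fact that TLS permits extensions in any order, so reordering is safe:

```python
def avoid_empty_final_extension(extensions: list[tuple[int, bytes]]) -> list[tuple[int, bytes]]:
    """Reorder a ClientHello extension list so that it does not end with a
    zero-length extension, working around servers that reject such messages."""
    if not extensions or extensions[-1][1]:
        return extensions  # empty list, or already ends with a non-empty extension
    for i, (_, body) in enumerate(extensions):
        if body:
            # Move the first non-empty extension to the end of the list.
            return extensions[:i] + extensions[i + 1:] + [extensions[i]]
    return extensions  # every extension is empty; nothing can be done
```

A workaround of this kind illustrates the cost described above: every client must now carry knowledge of one server bug indefinitely.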
Overapplication of the robustness principle therefore encourages a chain reaction that can create interoperability problems over time. Tolerating unexpected behavior is particularly deleterious for early implementations of new protocols, as quirks in early implementations can affect all subsequent deployments.

From observing widely deployed protocols, it appears there are two stable points on the spectrum between being strict versus permissive in the presence of protocol errors:
This happens because interoperability requirements for protocol implementations are set by other deployments. Specifications and test suites -- where they exist -- can guide the initial development of implementations. Ultimately, the need to interoperate with deployed implementations is a de facto conformance test suite that can supersede any formal protocol definition.

For widely used protocols, the massive scale of the Internet makes large-scale interoperability testing infeasible for all but a privileged few. The cost of building a new implementation using reverse engineering increases as the number of implementations and bugs increases. Worse, the set of tweaks necessary for wide interoperability can be difficult to discover. In the worst case, a new implementer might have to choose between deployments that have diverged so far as to no longer be interoperable.

Consequently, new implementations might be forced into niche uses, where the problems arising from interoperability issues can be more closely managed. However, restricting new implementations into limited deployments risks causing forks in the protocol. If implementations do not interoperate, little prevents those implementations from diverging more over time.
This has a negative impact on the ecosystem of a protocol. New implementations are key to the continued viability of a protocol. New protocol implementations are also more likely to be developed for new and diverse use cases and are often the origin of features and capabilities that can be of benefit to existing users.

The need to work around interoperability problems also reduces the ability of established implementations to change. An accumulation of mitigations for interoperability issues makes implementations more difficult to maintain and can constrain extensibility (see also the IAB document "Long-Term Viability of Protocol Extension Mechanisms" [RFC9170]).

Sometimes, what appear to be interoperability problems are symptomatic of issues in protocol design. A community that is willing to make changes to the protocol, by revising or extending specifications and then deploying those changes, makes the protocol better. Tolerating unexpected input instead conceals problems, making it harder, if not impossible, to fix them later.
The robustness principle can be highly effective in safeguarding against flaws in the implementation of a protocol by peers. Especially when a specification remains unchanged for an extended period of time, the incentive to be tolerant of errors accumulates over time. Indeed, when faced with divergent interpretations of an immutable specification, the only way for an implementation to remain interoperable is to be tolerant of differences in interpretation and implementation errors. However, when official specifications fail to be updated, then deployed implementations -- including their quirks -- often become a substitute standard.

Tolerating unexpected inputs from another implementation might seem logical, even necessary. However, that conclusion relies on an assumption that existing specifications and implementations cannot change. Applying the robustness principle in this way disproportionately values short-term gains over the negative effects on future implementations and the protocol as a whole.
For a protocol to have sustained viability, it is necessary for both specifications and implementations to be responsive to changes, in addition to handling new and old problems that might arise over time. For example, when an implementer discovers a scenario where a specification defines some input as faulty but does not define how to handle that input, the implementer can provide significant value to the ecosystem by reporting the issue and helping to evolve the specification.

When a discrepancy is found between a specification and its implementation, a maintenance discussion inside the standards process allows reaching consensus on how best to evolve the specification. Subsequently, updating implementations to match evolved specifications ensures that implementations are consistently interoperable and removes needless barriers for new implementations. Maintenance also enables continued improvement of the protocol. New use cases are an indicator that the protocol could be successful [RFC5218].
Protocol designers are strongly encouraged to continue to maintain and evolve protocol specifications beyond their initial inception and definition. This might require the development of revised specifications, extensions, or other supporting material that evolves in concert with implementations. Involvement of those who implement and deploy the protocol is a critical part of this process, as they provide input on their experience with how the protocol is used.

Most interoperability problems do not require revision of protocols or protocol specifications, as software defects can happen even when the specification is unambiguous. For instance, the most effective means of dealing with a defective implementation in a peer could be to contact the developer responsible. It is far more efficient in the long term to fix one isolated bug than it is to deal with the consequences of workarounds.
Early implementations of protocols have a stronger obligation to closely follow specifications, as their behavior will affect all subsequent implementations. In addition to specifications, later implementations will be guided by what existing deployments accept. Tolerance of errors in early deployments is most likely to result in problems. Protocol specifications might need more frequent revision during early deployments to capture feedback from early rounds of deployment.

Neglect can quickly produce the negative consequences this document describes. Restoring the protocol to a state where it can be maintained involves first discovering the properties of the protocol as it is deployed rather than the protocol as it was originally documented. This can be difficult and time-consuming, particularly if the protocol has a diverse set of implementations. Such a process was undertaken for HTTP [HTTP] after a period of minimal maintenance. Restoring HTTP specifications to relevance took significant effort.
Maintenance is most effective if it is responsive, which is greatly affected by how rapidly protocol changes can be deployed. For protocol deployments that operate on longer time scales, temporary workarounds following the spirit of the robustness principle might be necessary. For this, improvements in software update mechanisms ensure that the cost of reacting to changes is much lower than it was in the past. Alternatively, if specifications can be updated more readily than deployments, details of the workaround can be documented, including the desired form of the protocols once the need for workarounds no longer exists and plans for removing the workaround.

A well-specified protocol includes rules for consistent handling of aberrant conditions. This increases the chances that implementations will have consistent and interoperable handling of unusual conditions.
Choosing to generate fatal errors for unspecified conditions instead of attempting error recovery can ensure that faults receive attention. This intolerance can be harnessed to reduce occurrences of aberrant implementations.

Intolerance toward violations of specification improves feedback for new implementations in particular. When a new implementation encounters a peer that is intolerant of an error, it receives strong feedback that allows the problem to be discovered quickly.

To be effective, intolerant implementations need to be sufficiently widely deployed so that they are encountered by new implementations with high probability. This could depend on multiple implementations deploying strict checks.
Interoperability problems also need to be made known to those in a position to address them. In particular, systems with human operators, such as user-facing clients, are ideally suited to surfacing errors. Other systems might need to use less direct means of making errors known.

This does not mean that intolerance of errors in early deployments of protocols has the effect of preventing interoperability. On the contrary, when existing implementations follow clearly specified error handling, new implementations or features can be introduced more readily, as the effect on existing implementations can be easily predicted; see also Section 2.2.

Any intolerance also needs to be strongly supported by specifications; otherwise, it risks fracturing the protocol community or encouraging a proliferation of workarounds. See Section 5.2.
Intolerance can be used to motivate compliance with any protocol requirement. For instance, the INADEQUATE_SECURITY error code and associated requirements in HTTP/2 [HTTP/2] resulted in improvements in the security of the deployed base.

A notification for a fatal error is best sent as an explicit error message to the entity that made the error. Error messages benefit from being able to carry arbitrary information that might help the implementer of the sender of the faulty input understand and fix the issue in their software. QUIC error frames [QUIC] are an example of a fatal error mechanism that helped implementers improve software quality throughout the protocol lifecycle. Similarly, the use of Extended DNS Errors [EDE] has been effective in providing better descriptions of DNS resolution errors to clients.
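An error message carrying diagnostic detail might look like the following sketch. The field names and the JSON encoding are hypothetical, loosely inspired by mechanisms such as QUIC's error frames; a real protocol would define its own wire format:

```python
import json

def build_fatal_error(code: int, offending_field: str, reason: str) -> bytes:
    """Encode a fatal error that tells the peer's implementer exactly what
    was wrong, rather than silently discarding the faulty message."""
    return json.dumps({
        "error_code": code,
        "offending_field": offending_field,
        "reason": reason,
    }).encode()

# Example: reject a message whose length field disagrees with its payload.
frame = build_fatal_error(
    code=0x10,
    offending_field="payload_length",
    reason="declared length 512 exceeds received payload of 128 bytes",
)
```

The free-form reason string is what makes the mechanism useful for maintenance: it turns an interoperability failure into an actionable bug report.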
Stateless protocol endpoints might generate denial-of-service attacks if they send an error message in response to every message that is received from an unauthenticated sender. These implementations might need to silently discard these messages.
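A middle ground between answering every faulty message and total silence is to rate limit error responses. This token-bucket sketch is illustrative and not drawn from any specific protocol:

```python
import time

class ErrorRateLimiter:
    """Token bucket that caps how many error messages a stateless endpoint
    sends per second, bounding its usefulness as a denial-of-service
    amplifier while still surfacing some errors to well-behaved peers."""

    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow_error_response(self) -> bool:
        # Refill tokens based on elapsed time, then spend one per response.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over budget: silently discard instead of replying
```

Under such a policy, an attacker cannot amplify traffic beyond the configured rate, yet ordinary implementers still receive enough error responses to diagnose their bugs.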
Any protocol participant that is affected by changes arising from maintenance might be excluded if they are unwilling or unable to implement or deploy changes that are made to the protocol.

Deliberate exclusion of problematic implementations is an important tool that can ensure that the interoperability of a protocol remains viable. While backward-compatible changes are always preferable to incompatible ones, it is not always possible to produce a design that protects the ability of all current and future protocol participants to interoperate.
Accidentally excluding unexpected participants is not usually a good outcome. When developing and deploying changes, it is best to first understand the extent to which the change affects existing deployments. This ensures that any exclusion that occurs is intentional.

In some cases, existing deployments might need to change in order to avoid being excluded. Though it might be preferable to avoid forcing deployments to change, this might be considered necessary. To avoid unnecessarily excluding deployments that might take time to change, developing a migration plan can be prudent.
Exclusion is a direct goal when choosing to be intolerant of errors (see Section 5.1). Exclusionary actions are employed with the deliberate intent of protecting future interoperability.

Excluding implementations or deployments can lead to a fracturing of the protocol system that could be more harmful than any divergence that might arise from tolerating the unexpected. The IAB document "Uncoordinated Protocol Development Considered Harmful" [RFC5704] describes how conflict or competition in the maintenance of protocols can lead to similar problems.
Careless implementations, lax interpretations of specifications, and uncoordinated extrapolation of requirements to cover gaps in specification can result in security problems. Hiding the consequences of protocol variations encourages the hiding of issues, which can conceal bugs and make them difficult to discover.

The consequences of the problems described in this document are especially acute for any protocol where security depends on agreement about semantics of protocol elements. For instance, weak primitives [MD5] and obsolete mechanisms [SSL3] are good examples of the use of unsafe security practices where forcing exclusion (Section 5.2) can be desirable.
This document has no IANA actions.

Internet Architecture Board members at the time this document was approved for publication were:
Jari Arkko
Deborah Brungard
Lars Eggert
Wes Hardaker
Cullen Jennings
Mallory Knodel
Mirja Kühlewind
Zhenbin Li
Tommy Pauly
David Schinazi
Russ White
Qin Wu
Jiankang Yao
The document had broad but not unanimous approval within the IAB, reflecting that while the guidance is valid, concerns were expressed in the IETF community about how broadly it applies in all situations.

Constructive feedback on this document has been provided by a surprising number of people including, but not limited to, the following: Bernard Aboba, Brian Carpenter, Stuart Cheshire, Joel Halpern, Wes Hardaker, Russ Housley, Cullen Jennings, Mallory Knodel, Mirja Kühlewind, Mark Nottingham, Eric Rescorla, Henning Schulzrinne, Job Snijders, Robert Sparks, Dave Thaler, Brian Trammell, and Anne van Kesteren. Some of the properties of protocols described in Section 4.1 were observed by Marshall Rose in Section 4.5 of [RFC3117].