RFC 8767 | DNS Serve-Stale | March 2020 |
Lawrence, et al. | Standards Track | [Page] |
This document defines a method (serve-stale) for recursive resolvers to use stale DNS data to avoid outages when authoritative nameservers cannot be reached to refresh expired data. One of the motivations for serve-stale is to make the DNS more resilient to DoS attacks and thereby make them less attractive as an attack vector. This document updates the definitions of TTL from RFCs 1034 and 1035 so that data can be kept in the cache beyond the TTL expiry; it also updates RFC 2181 by interpreting values with the high-order bit set as being positive, rather than 0, and suggests a cap of 7 days.¶
This is an Internet Standards Track document.¶
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 7841.¶
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc8767.¶
Copyright (c) 2020 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.¶
Traditionally, the Time To Live (TTL) of a DNS Resource Record (RR) has been understood to represent the maximum number of seconds that a record can be used before it must be discarded, based on its description and usage in [RFC1035] and clarifications in [RFC2181].¶
This document expands the definition of the TTL to explicitly allow for expired data to be used in the exceptional circumstance that a recursive resolver is unable to refresh the information. It is predicated on the observation that authoritative answer unavailability can cause outages even when the underlying data those servers would return is typically unchanged.¶
We describe a method below for this use of stale data, balancing the competing needs of resiliency and freshness.¶
This document updates the definitions of TTL from [RFC1034] and [RFC1035] so that data can be kept in the cache beyond the TTL expiry; it also updates [RFC2181] by interpreting values with the high-order bit set as being positive, rather than 0, and also suggests a cap of 7 days.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
There are a number of reasons why an authoritative server may become unreachable, including Denial-of-Service (DoS) attacks, network issues, and so on. If a recursive server is unable to contact the authoritative servers for a query but still has relevant data that has aged past its TTL, that information can still be useful for generating an answer under the metaphorical assumption that "stale bread is better than no bread."¶
[RFC1035], Section 3.2.1 says that the TTL "specifies the time interval that the resource record may be cached before the source of the information should again be consulted." [RFC1035], Section 4.1.3 further says that the TTL "specifies the time interval (in seconds) that the resource record may be cached before it should be discarded."¶
A natural English interpretation of these remarks would seem to be clear enough that records past their TTL expiration must not be used. However, [RFC1035] predates the more rigorous terminology of [RFC2119], which softened the interpretation of "may" and "should".¶
[RFC2181] aimed to provide "the precise definition of the Time to Live," but Section 8 of [RFC2181] was mostly concerned with the numeric range of values rather than data expiration behavior. It does, however, close that section by noting, "The TTL specifies a maximum time to live, not a mandatory time to live." This wording again does not contain BCP 14 key words [RFC2119], but it does convey the natural language connotation that data becomes unusable past TTL expiry.¶
As of the time of this writing, several large-scale operators use stale data for answers in some way. A number of recursive resolver packages, including BIND, Knot Resolver, OpenDNS, and Unbound, provide options to use stale data. Apple macOS can also use stale data as part of the Happy Eyeballs algorithms in mDNSResponder. The collective operational experience is that using stale data can provide significant benefit with minimal downside.¶
The definition of TTL in Sections 3.2.1 and 4.1.3 of [RFC1035] is amended to read:¶
TTL: a 32-bit unsigned integer number of seconds in the range 0-2147483647 that specifies the time interval that the resource record MAY be cached before the source of the information MUST again be consulted. Zero values are interpreted to mean that the RR can only be used for the transaction in progress, and should not be cached. Values SHOULD be capped on the order of days to weeks, with a RECOMMENDED cap of 604,800 seconds (7 days). If the data is unable to be authoritatively refreshed when the TTL expires, the record MAY be used as though it is unexpired.¶
Interpreting values that have the high-order bit set as being positive, rather than 0, is a change from [RFC2181], the rationale for which is explained in Section 6. Suggesting a cap of 7 days, rather than the 68 years allowed by the full 31 bits of Section 8 of [RFC2181], reflects the current practice of major modern DNS resolvers.¶
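As a non-normative illustration, this interpretation of a received 32-bit TTL field can be sketched as follows; the function and constant names are hypothetical, not from this document:

```python
# Illustrative, non-normative sketch of the TTL interpretation above:
# treat the full 32-bit field as an unsigned (positive) value, then cap it.
# Names here are hypothetical, not taken from this document.

TTL_CAP = 604_800  # suggested cap of 7 days, in seconds

def effective_ttl(raw_ttl: int) -> int:
    """Interpret a received 32-bit TTL field: values with the high-order
    bit set are positive rather than 0, and all values are capped at 7 days."""
    value = raw_ttl & 0xFFFFFFFF  # unsigned interpretation of all 32 bits
    return min(value, TTL_CAP)
```

Under this reading, a TTL of 0x80000000 is simply a very large positive value that the cap reduces to 7 days, rather than being forced to 0 as in [RFC2181].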
When returning a response containing stale records, a recursive resolver MUST set the TTL of each expired record in the message to a value greater than 0, with a RECOMMENDED value of 30 seconds. See Section 6 for explanation.¶
Answers from authoritative servers that have a DNS response code of either 0 (NoError) or 3 (NXDomain) and the Authoritative Answer (AA) bit set MUST be considered to have refreshed the data at the resolver. Answers from authoritative servers that have any other response code SHOULD be considered a failure to refresh the data and therefore leave any previous state intact. See Section 6 for a discussion.¶
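The refresh rule above reduces to a small predicate; this is a non-normative sketch with a hypothetical helper name:

```python
# Non-normative sketch of the refresh rule: only an authoritative NoError
# or NXDomain answer counts as refreshing cached data; any other response
# code leaves the previous cache state intact.

NOERROR, NXDOMAIN = 0, 3  # DNS RCODE values

def refreshes_data(rcode: int, aa_bit_set: bool) -> bool:
    """Return True if this answer should be treated as refreshing the data."""
    return aa_bit_set and rcode in (NOERROR, NXDOMAIN)
```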
There is more than one way a recursive resolver could responsibly implement this resiliency feature while still respecting the intent of the TTL as a signal for when data is to be refreshed.¶
In this example method, four notable timers drive considerations for the use of stale data:¶
- The client response timer, which is the maximum amount of time a recursive resolver should allow between the receipt of a resolution request and the sending of its response.¶
- The query resolution timer, which caps the total amount of time a recursive resolver spends processing the query.¶
- The failure recheck timer, which limits the frequency at which a failed lookup will be attempted again.¶
- The maximum stale timer, which caps the amount of time that records will be kept past their expiration.¶
Most recursive resolvers already have the query resolution timer and, effectively, some kind of failure recheck timer. The client response timer and maximum stale timer are new concepts for this mechanism.¶
When a recursive resolver receives a request, it should start the client response timer. This timer is used to avoid client timeouts. It should be configurable, with a recommended value of 1.8 seconds as being just under a common timeout value of 2 seconds while still giving the resolver a fair shot at resolving the name.¶
The resolver then checks its cache for any unexpired records that satisfy the request and returns them if available. If it finds no relevant unexpired data and the Recursion Desired flag is not set in the request, it should immediately return the response without consulting the cache for expired records. Typically, this response would be a referral to authoritative nameservers covering the zone, but the specifics are implementation dependent.¶
If iterative lookups will be done, then the failure recheck timer is consulted. Attempts to refresh from non-responsive or otherwise failing authoritative nameservers are recommended to be done no more frequently than every 30 seconds. If this request was received within this period, the cache may be immediately consulted for stale data to satisfy the request.¶
Outside the period of the failure recheck timer, the resolver should start the query resolution timer and begin the iterative resolution process. This timer bounds the work done by the resolver when contacting external authorities and is commonly around 10 to 30 seconds. If this timer expires on an attempted lookup that is still being processed, the resolution effort is abandoned.¶
If the answer has not been completely determined by the time the client response timer has elapsed, the resolver should then check its cache to see whether there is expired data that would satisfy the request. If so, it adds that data to the response message with a TTL greater than 0 (as specified in Section 4). The response is then sent to the client while the resolver continues its attempt to refresh the data.¶
When no authorities are able to be reached during a resolution attempt, the resolver should attempt to refresh the delegation and restart the iterative lookup process with the remaining time on the query resolution timer. This resumption should be done only once per resolution effort.¶
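The decision points of this example method can be condensed into a non-normative sketch. The cache interface, the resolve function, and the last-failure bookkeeping below are all hypothetical illustrations, and the background refresh that continues after a stale response has been sent is elided:

```python
import time

# Non-normative sketch of the example method. `cache`, `resolve_fn`, and
# `last_failure` are hypothetical interfaces, not defined by this document.
CLIENT_RESPONSE_TIMEOUT = 1.8   # seconds; recommended client response timer
FAILURE_RECHECK_WINDOW = 30.0   # seconds; recommended failure recheck interval
STALE_RESPONSE_TTL = 30         # recommended TTL to place on stale records

def answer_query(name, cache, resolve_fn, last_failure, now=time.monotonic):
    """Return records for `name`, falling back to stale data when needed."""
    fresh = cache.fresh(name)
    if fresh is not None:
        return fresh                                  # ordinary cache hit
    # Within the failure recheck window, do not retry failing authorities;
    # consult the cache for stale data immediately.
    if now() - last_failure.get(name, float("-inf")) < FAILURE_RECHECK_WINDOW:
        stale = cache.stale(name)
        if stale is not None:
            return [(rr, STALE_RESPONSE_TTL) for rr in stale]
    # Attempt resolution, bounded here by the client response timer; a real
    # resolver would keep resolving in the background after responding.
    answer = resolve_fn(name, timeout=CLIENT_RESPONSE_TIMEOUT)
    if answer is not None:
        return answer                                 # refreshed in time
    last_failure[name] = now()
    stale = cache.stale(name)                         # fall back to stale data
    if stale is not None:
        return [(rr, STALE_RESPONSE_TTL) for rr in stale]
    return None                                       # nothing usable to serve
```

Note that the sketch returns stale records with a TTL of 30 seconds, per Section 4, and skips the authority retry entirely while a recent failure is within the recheck window.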
Outside the resolution process, the maximum stale timer is used for cache management and is independent of the query resolution process. This timer is conceptually different from the maximum cache TTL that exists in many resolvers, the latter being a clamp on the value of TTLs as received from authoritative servers and recommended to be 7 days in the TTL definition in Section 4. The maximum stale timer should be configurable. It defines the length of time after a record expires that it should be retained in the cache. The suggested value is between 1 and 3 days.¶
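As a small illustration (the names and record shape are assumptions, not from this document), the retention decision reduces to comparing a record's time past expiry against this locally configured limit:

```python
# Non-normative sketch: a record past its TTL remains usable as stale data
# only while it is within the maximum stale timer. Names are hypothetical.

MAX_STALE_SECONDS = 3 * 86_400  # suggested value: between 1 and 3 days

def usable_as_stale(expired_at: float, now: float) -> bool:
    """True if an expired record is still within the maximum stale timer
    and so may be retained in the cache and served as stale data."""
    age_past_expiry = now - expired_at
    return 0 <= age_past_expiry <= MAX_STALE_SECONDS
```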
This document mainly describes the issues behind serving stale data and intentionally does not provide a formal algorithm. The concept is not overly complex, and the details are best left to resolver authors to implement in their codebases. The processing of serve-stale is a local operation, and consistent variables between deployments are not needed for interoperability. However, we would like to highlight the impact of various implementation choices, starting with the timers involved.¶
The most obvious of these is the maximum stale timer. If this variable is too large, it could cause excessive cache memory usage, but if it is too small, the serve-stale technique becomes less effective, as the record may not be in the cache to be used if needed. Shorter values, even less than a day, can effectively handle the vast majority of outages. Longer values, as much as a week, give time for monitoring systems to notice a resolution problem and for human intervention to fix it; operational experience has been that sometimes the right people can be hard to track down and unfortunately slow to remedy the situation.¶
Increased memory consumption could be mitigated by prioritizing removal of stale records over non-expired records during cache exhaustion. Eviction strategies could consider additional factors, including the last time of use or the popularity of a record, to retain active but stale records. A feature to manually flush only stale records could also be useful.¶
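One way such a policy could be realized (illustrative only; the record shape and field names are assumptions, not from this document) is to rank eviction candidates so that stale entries go first, least recently used first:

```python
def eviction_order(records):
    """Sort cache-eviction candidates: stale records before fresh ones,
    and within each group, least recently used first. Each record is a
    dict with hypothetical "is_stale" and "last_used" fields."""
    # not is_stale -> False (sorts first) for stale records; then by recency.
    return sorted(records, key=lambda r: (not r["is_stale"], r["last_used"]))
```

Additional signals such as record popularity could be folded into the sort key in the same way.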
The client response timer is another variable that deserves consideration. If this value is too short, there exists the risk that stale answers may be used even when the authoritative server is actually reachable but slow; this may result in undesirable answers being returned. Conversely, waiting too long will negatively impact user experience.¶
The balance for the failure recheck timer is responsiveness in detecting the renewed availability of authorities versus the extra resource use for resolution. If this variable is set too large, stale answers may continue to be returned even after the authoritative server is reachable; per [RFC2308], Section 7, this should be no more than 5 minutes. If this variable is too small, authoritative servers may be targeted with a significant amount of excess traffic.¶
Regarding the TTL to set on stale records in the response, historically TTLs of 0 seconds have been problematic for some implementations, and negative values can't effectively be communicated to existing software. Other very short TTLs could lead to congestive collapse as TTL-respecting clients rapidly try to refresh. The recommended value of 30 seconds not only sidesteps those potential problems with no practical negative consequences, it also rate-limits further queries from any client that honors the TTL, such as a forwarding resolver.¶
As for the change to treat a TTL with the high-order bit set as positive and then clamping it, as opposed to [RFC2181] treating it as zero, the rationale here is basically one of engineering simplicity versus an inconsequential operational history. Negative TTLs had no rational intentional meaning that wouldn't have been satisfied by just sending 0 instead, and similarly there was realistically no practical purpose for sending TTLs of 2^25 seconds (1 year) or more. There's also no record of TTLs in the wild having the most significant bit set in the DNS Operations, Analysis, and Research Center's (DNS-OARC's) "Day in the Life" samples [DITL]. With no apparent reason for operators to use them intentionally, that leaves either errors or non-standard experiments as explanations as to why such TTLs might be encountered, with neither providing an obviously compelling reason as to why having the leading bit set should be treated differently from having any of the next eleven bits set and then capped per Section 4.¶
Another implementation consideration is the use of stale nameserver addresses for lookups. This is mentioned explicitly because, in some resolvers, getting the addresses for nameservers is a separate path from a normal cache lookup. If authoritative server addresses are not able to be refreshed, resolution can possibly still be successful if the authoritative servers themselves are up. For instance, consider an attack on a top-level domain that takes its nameservers offline; serve-stale resolvers that had expired glue addresses for subdomains within that top-level domain would still be able to resolve names within those subdomains, even those they had not previously looked up.¶
The directive in Section 4 that only NoError and NXDomain responses should invalidate any previously associated answer stems from the fact that no other RCODEs that a resolver normally encounters make any assertions regarding the name in the question or any data associated with it. This comports with existing resolver behavior where a failed lookup (say, during prefetching) doesn't impact the existing cache state. Some authoritative server operators have said that they would prefer stale answers to be used in the event that their servers are responding with errors like ServFail instead of giving true authoritative answers. Implementers MAY decide to return stale answers in this situation.¶
Since the goal of serve-stale is to provide resiliency for all obvious errors to refresh data, these other RCODEs are treated as though they are equivalent to not getting an authoritative response. Although NXDomain for a previously existing name might well be an error, it is not handled that way because there is no effective way to distinguish operator intent for legitimate cases versus error cases.¶
During discussion in the IETF, it was suggested that, if all authorities return responses with an RCODE of Refused, it may be an explicit signal to take down the zone from servers that still have the zone's delegation pointed to them. Refused, however, is also overloaded to mean multiple possible failures that could represent transient configuration failures. Operational experience has shown that purposely returning Refused is a poor way to achieve an explicit takedown of a zone compared to either updating the delegation or returning NXDomain with a suitable SOA for extended negative caching. Implementers MAY nonetheless consider whether to treat all authorities returning Refused as preempting the use of stale data.¶
Stale data is used only when refreshing has failed, in order to adhere to the original intent of the design of the DNS and the behavior expected by operators. If stale data were always used immediately, with a cache refresh attempted only after the client response had been sent, the resolver would frequently be sending data that it would have had no trouble refreshing. Because modern resolvers use techniques like prefetching and request coalescing for efficiency, not every client request needs to trigger a new lookup flow in the presence of stale data; rather, a good-faith effort must have recently been made to refresh the stale data before it is delivered to any client.¶
It is important to continue the resolution attempt after the stale response has been sent, until the query resolution timeout, because some pathological resolutions can take many seconds to succeed as they cope with unavailable servers, bad networks, and other problems. Stopping the resolution attempt when the response with expired data has been sent would mean that answers in these pathological cases would never be refreshed.¶
The continuing prohibition against using data with a 0-second TTL beyond the current transaction explicitly extends to it being unusable even for stale fallback, as it is not to be cached at all.¶
Be aware that Canonical Name (CNAME) and DNAME records [RFC6672] mingled in the expired cache with other records at the same owner name can cause surprising results. This was observed with an initial implementation in BIND when a hostname changed from having an IPv4 Address (A) record to a CNAME. The version of BIND being used did not evict other types in the cache when a CNAME was received, which in normal operations is not a significant issue. However, after both records expired and the authorities became unavailable, the fallback to stale answers returned the older A instead of the newer CNAME.¶
The algorithm described in Section 5 was originally implemented as a patch to BIND 9.7.0. It has been in use on Akamai's production network since 2011; it effectively smoothed over transient failures and longer outages that would have resulted in major incidents. The patch was contributed to the Internet Systems Consortium, and the functionality is now available in BIND 9.12 and later via the options stale-answer-enable, stale-answer-ttl, and max-stale-ttl.¶
Unbound has a similar feature for serving stale answers and will respond with stale data immediately if it has recently tried and failed to refresh the answer by prefetching. Starting from version 1.10.0, Unbound can also be configured to follow the algorithm described in Section 5. Both behaviors can be configured and fine-tuned with the available serve-expired-* options.¶
Knot Resolver has a demo module here: <https://knot-resolver.readthedocs.io/en/stable/modules-serve_stale.html>.¶
Apple's system resolvers are also known to use stale answers, but the details are not readily available.¶
In the research paper "When the Dike Breaks: Dissecting DNS Defenses During DDoS" [DikeBreaks], the authors detected some use of stale answers by resolvers when authorities came under attack. Their research results suggest that more widespread adoption of the technique would significantly improve resiliency for the large number of requests that fail or experience abnormally long resolution times during an attack.¶
During the discussion of serve-stale in the IETF, it was suggested that an EDNS option [RFC6891] should be available. One proposal was to use it to opt in to getting data that is possibly stale, and another was to signal when stale data has been used for a response.¶
The opt-in use case was rejected, as the technique was meant to be immediately useful in improving DNS resiliency for all clients.¶
The reporting case was ultimately also rejected because even the simpler version of a proposed option was still too much bother to implement for too little perceived value.¶
The most obvious security issue is the increased likelihood of DNSSEC validation failures when using stale data because signatures could be returned outside their validity period. Stale negative records can increase the time window where newly published TLSA or DS RRs may not be used due to cached NSEC or NSEC3 records. These scenarios would only be an issue if the authoritative servers are unreachable (the only time the techniques in this document are used), and thus serve-stale does not introduce a new failure in place of what would have otherwise been success.¶
Additionally, bad actors have been known to use DNS caches to keep records alive even after their authorities have gone away. The serve-stale feature potentially makes the attack easier, although without introducing a new risk. In addition, attackers could combine this with a DDoS attack on authoritative servers with the explicit intent of having stale information cached for a longer period of time. But if attackers have this capacity, they probably could do much worse than prolonging the life of old data.¶
In [CloudStrife], it was demonstrated how stale DNS data, namely hostnames pointing to addresses that are no longer in use by the owner of the name, can be used to co-opt security -- for example, to get domain-validated certificates fraudulently issued to an attacker. While this document does not create a new vulnerability in this area, it does potentially enlarge the window in which such an attack could be made. A proposed mitigation is that certificate authorities should fully look up each name starting at the DNS root for every name lookup. Alternatively, certificate authorities should use a resolver that is not serving stale data.¶
This document does not add any practical new privacy issues.¶
The method described here is not affected by the use of NAT devices.¶
This document has no IANA actions.¶
The authors wish to thank Brian Carpenter, Vladimir Cunat, Robert Edmonds, Tony Finch, Bob Harold, Tatuya Jinmei, Matti Klock, Jason Moreau, Giovane Moura, Jean Roy, Mukund Sivaraman, Davey Song, Paul Vixie, Ralf Weber, and Paul Wouters for their review and feedback. Paul Hoffman deserves special thanks for submitting a number of Pull Requests.¶
Thank you also to the following members of the IESG for their final review: Roman Danyliw, Benjamin Kaduk, Suresh Krishnan, Mirja Kühlewind, and Adam Roach.¶