

Obsoleted by: 7805 HISTORIC

RFC:  816

FAULT ISOLATION AND RECOVERY

David D. Clark
MIT Laboratory for Computer Science
Computer Systems and Communications Group
July, 1982

1.  Introduction

Occasionally, a network or a gateway will go down, and the sequence of hops which the packet takes from source to destination must change. Fault isolation is that action which hosts and gateways collectively take to determine that something is wrong; fault recovery is the identification and selection of an alternative route which will serve to reconnect the source to the destination. In fact, the gateways perform most of the functions of fault isolation and recovery. There are, however, a few actions which hosts must take if they wish to provide a reasonable level of service. This document describes the portion of fault isolation and recovery which is the responsibility of the host.

2.  What Gateways Do

Gateways collectively implement an algorithm which identifies the best route between all pairs of networks. They do this by exchanging packets which contain each gateway's latest opinion about the operational status of its neighbor networks and gateways. Assuming that this algorithm is operating properly, one can expect the gateways to go through a period of confusion immediately after some network or gateway

has failed, but one can assume that once a period of negotiation has passed, the gateways are equipped with a consistent and correct model of the connectivity of the internet. At present this period of negotiation may actually take several minutes, and many TCP implementations time out within that period, but it is a design goal of the eventual algorithm that the gateway should be able to reconstruct the topology quickly enough that a TCP connection should be able to survive a failure of the route.

3.  Host Algorithm for Fault Recovery

Since the gateways always attempt to have a consistent and correct model of the internetwork topology, the host strategy for fault recovery is very simple. Whenever the host feels that something is wrong, it asks the gateway for advice, and, assuming the advice is forthcoming, it believes the advice completely. The advice will be wrong only during the transient period of negotiation, which immediately follows an outage, but will otherwise be reliably correct.

In fact, it is never necessary for a host to explicitly ask a gateway for advice, because the gateway will provide it as appropriate. When a host sends a datagram to some distant net, the host should be prepared to receive back either of two advisory messages which the gateway may send. The ICMP "redirect" message indicates that the gateway to which the host sent the datagram is no longer the best gateway to reach the net in question. The gateway will have forwarded the datagram, but the host should revise its routing table to have a different immediate address for this net. The ICMP "destination

unreachable" message indicates that as a result of an outage, it is currently impossible to reach the addressed net or host in any manner. On receipt of this message, a host can either abandon the connection immediately without any further retransmission, or resend slowly to see if the fault is corrected in reasonable time.

If a host could assume that these two ICMP messages would always arrive when something was amiss in the network, then no other action on the part of the host would be required in order to maintain its tables in an optimal condition. Unfortunately, there are two circumstances under which the messages will not arrive properly. First, during the transient following a failure, error messages may arrive that do not correctly represent the state of the world. Thus, hosts must take an isolated error message with some scepticism. (This transient period is discussed more fully below.) Second, if the host has been sending datagrams to a particular gateway, and that gateway itself crashes, then all the other gateways in the internet will reconstruct the topology, but the gateway in question will still be down, and therefore cannot provide any advice back to the host. As long as the host continues to direct datagrams at this dead gateway, the datagrams will simply vanish off the face of the earth, and nothing will come back in return. Hosts must detect this failure.

If some gateway many hops away fails, this is not of concern to the host, for then the discovery of the failure is the responsibility of the immediate neighbor gateways, which will perform this action in a manner invisible to the host. The problem only arises if the very first

gateway, the one to which the host is immediately sending the datagrams, fails. We thus identify one single task which the host must perform as its part of fault isolation in the internet: the host must use some strategy to detect that a gateway to which it is sending datagrams is dead.

Let us assume for the moment that the host implements some algorithm to detect failed gateways; we will return later to discuss what this algorithm might be. First, let us consider what the host should do when it has determined that a gateway is down. In fact, with the exception of one small problem, the action the host should take is extremely simple. The host should select some other gateway, and try sending the datagram to it. Assuming that gateway is up, this will either produce correct results, or some ICMP advice. Since we assume that, ignoring temporary periods immediately following an outage, any gateway is capable of giving correct advice, once the host has received advice from any gateway, that host is in as good a condition as it can hope to be.

There is always the unpleasant possibility that when the host tries a different gateway, that gateway too will be down. Therefore, whatever algorithm the host uses to detect a dead gateway must continuously be applied, as the host tries every gateway in turn that it knows about.

The only difficult part of this algorithm is to specify the means by which the host maintains the table of all of the gateways to which it has immediate access. Currently, the specification of the internet protocol does not architect any message by which a host can ask to be

supplied with such a table. The reason is that different networks may provide very different mechanisms by which this table can be filled in. For example, if the net is a broadcast net, such as an ethernet or a ringnet, every gateway may simply broadcast such a table from time to time, and the host need do nothing but listen to obtain the required information. Alternatively, the network may provide the mechanism of logical addressing, by which a whole set of machines can be provided with a single group address, to which a request can be sent for assistance. Failing those two schemes, the host can build up its table of neighbor gateways by remembering all the gateways from which it has ever received a message. Finally, in certain cases, it may be necessary for this table, or at least the initial entries in the table, to be constructed manually by a manager or operator at the site. In cases where the network in question provides absolutely no support for this kind of host query, at least some manual intervention will be required to get started, so that the host can find out about at least one gateway.

4.  Host Algorithms for Fault Isolation

We now return to the question raised above. What strategy should the host use to detect that it is talking to a dead gateway, so that it can know to switch to some other gateway in the list? In fact, there are several algorithms which can be used. All are reasonably simple to implement, but they have very different implications for the overhead on the host, the gateway, and the network. Thus, to a certain extent, the algorithm picked must depend on the details of the network and of the host.
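Before turning to the individual detection algorithms, the table of neighbor gateways described in Section 3 can be illustrated with a minimal sketch. It assumes the simplest scheme from the text: the table is seeded manually and then grows as the host remembers every gateway it hears from. The class name, method names, and addresses are hypothetical and do not come from any specification.

```python
class GatewayTable:
    """Neighbor gateways known to this host (all names are hypothetical)."""

    def __init__(self, manual_entries=()):
        # Manually configured entries let the host get started when the
        # network offers no way to discover gateways. Duplicates are dropped
        # while the configured order is preserved.
        self._gateways = list(dict.fromkeys(manual_entries))

    def learn(self, gateway):
        """Remember a gateway from which a message or broadcast was received."""
        if gateway not in self._gateways:
            self._gateways.append(gateway)

    def candidates(self, exclude=None):
        """Gateways to try in turn, skipping one currently believed dead."""
        return [g for g in self._gateways if g != exclude]


table = GatewayTable(manual_entries=["10.0.0.1"])
table.learn("10.0.0.2")   # heard from a second gateway
table.learn("10.0.0.1")   # already known, so ignored
```

When a gateway is judged dead, the host would walk `table.candidates(exclude=dead_gateway)` in order, applying its dead-gateway test to each one in turn.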

1.  NETWORK LEVEL DETECTION

Many networks, particularly the Arpanet, perform precisely the required function internal to the network. If a host sends a datagram to a dead gateway on the Arpanet, the network will return a "host dead" message, which is precisely the information the host needs to know in order to switch to another gateway. Some early implementations of Internet on the Arpanet threw these messages away. That is an exceedingly poor idea.

2.  CONTINUOUS POLLING

The ICMP protocol provides an echo mechanism by which a host may solicit a response from a gateway. A host could simply send this message at a reasonable rate, to assure itself continuously that the gateway was still up. This works, but, since the message must be sent fairly often to detect a fault in a reasonable time, it can imply an unbearable overhead on the host itself, the network, and the gateway. This strategy is prohibited except where a specific analysis has indicated that the overhead is tolerable.

3.  TRIGGERED POLLING

If the use of polling could be restricted to only those times when something seemed to be wrong, then the overhead would be bearable. Provided that one can get the proper advice from one's higher level protocols, it is possible to implement such a strategy. For example, one could program the TCP level so that whenever it retransmitted a

segment more than once, it sent a hint down to the IP layer which triggered polling. This strategy does not have excessive overhead, but does have the problem that the host may be somewhat slow to respond to an error, since only after polling has started will the host be able to confirm that something has gone wrong, and by then the TCP above may have already timed out.

Both forms of polling suffer from a minor flaw. Hosts as well as gateways respond to ICMP echo messages. Thus, polling cannot be used to detect the error that a foreign address thought to be a gateway is actually a host. Such a confusion can arise if the physical addresses of machines are rearranged.

4.  TRIGGERED RESELECTION

There is a strategy which makes use of a hint from a higher level, as did the previous strategy, but which avoids polling altogether. Whenever a higher level complains that the service seems to be defective, the Internet layer can pick the next gateway from the list of available gateways, and switch to it. Assuming that this gateway is up, no real harm can come of this decision, even if it was wrong, for the worst that will happen is a redirect message which instructs the host to return to the gateway originally being used. If, on the other hand, the original gateway was indeed down, then this immediately provides a new route, so the period of time until recovery is shortened. This last strategy seems particularly clever, and is probably the most generally suitable for those cases where the network itself does not provide fault isolation. (Regrettably, I have forgotten who suggested this idea to me. It is not my invention.)
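Triggered reselection can be sketched in a few lines. This is an illustration only, under simplifying assumptions: it tracks a single current gateway rather than one per destination net, and the class name, method names, and gateway labels are hypothetical.

```python
class InternetLayer:
    """Triggered reselection over a list of neighbor gateways (a sketch)."""

    def __init__(self, gateways):
        if not gateways:
            raise ValueError("at least one gateway is required")
        self.gateways = list(gateways)
        self.current = 0  # index of the gateway now in use

    def gateway(self):
        return self.gateways[self.current]

    def on_higher_level_complaint(self):
        """Hint from above, e.g. TCP retransmitted a segment more than once.

        No polling is done: the layer simply moves to the next gateway in
        the list, wrapping around at the end.
        """
        self.current = (self.current + 1) % len(self.gateways)
        return self.gateway()

    def on_redirect(self, better_gateway):
        """An ICMP redirect is believed completely: switch as instructed."""
        if better_gateway not in self.gateways:
            self.gateways.append(better_gateway)
        self.current = self.gateways.index(better_gateway)


ip = InternetLayer(["gw-a", "gw-b", "gw-c"])
ip.on_higher_level_complaint()   # switch from gw-a to gw-b
ip.on_redirect("gw-a")           # gw-b is up and sends the host back to gw-a
```

The usage lines show the harmless wrong guess from the text: the complaint moves the host off a healthy gateway, and the redirect returns it there.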

5.  Higher Level Fault Detection

The previous discussion has concentrated on fault detection and recovery at the IP layer. This section considers what the higher layers such as TCP should do.

TCP has a single fault recovery action; it repeatedly retransmits a segment until either it gets an acknowledgement or its connection timer expires. As discussed above, it may use retransmission as an event to trigger a request for fault recovery to the IP layer. In the other direction, information may flow up from IP, reporting such things as ICMP Destination Unreachable or error messages from the attached network. The only subtle question about TCP and faults is what TCP should do when such an error message arrives or its connection timer expires.

The TCP specification discusses the timer. In the description of the open call, the timeout is described as an optional value that the client of TCP may specify; if any segment remains unacknowledged for this period, TCP should abort the connection. The default for the timeout is 30 seconds. Early TCPs were often implemented with a fixed timeout interval, but this did not work well in practice, as the following discussion may suggest.

Clients of TCP can be divided into two classes: those running on immediate behalf of a human, such as Telnet, and those supporting a program, such as a mail sender. Humans require a sophisticated response to errors. Depending on exactly what went wrong, they may want to

abandon the connection at once, or wait for a long time to see if things get better. Programs do not have this human impatience, but also lack the power to make complex decisions based on details of the exact error condition. For them, a simple timeout is reasonable.

Based on these considerations, at least two modes of operation are needed in TCP. One, for programs, abandons the connection without exception if the TCP timer expires. The other mode, suitable for people, never abandons the connection on its own initiative, but reports to the layer above when the timer expires. Thus, the human user can see error messages coming from all the relevant layers, TCP and ICMP, and can request TCP to abort as appropriate. This second mode requires that TCP be able to send an asynchronous message up to its client to report the timeout, and it requires that error messages arriving at lower layers similarly flow up through TCP.

At levels above TCP, fault detection is also required. Either of the following can happen. First, the foreign client of TCP can fail, even though TCP is still running, so data is still acknowledged and the timer never expires. Alternatively, the communication path can fail, without the TCP timer going off, because the local client has no data to send. Both of these have caused trouble.

Sending mail provides an example of the first case. When sending mail using SMTP, there is an SMTP level acknowledgement that is returned when a piece of mail is successfully delivered. Several early mail receiving programs would crash just at the point where they had received all of the mail text (so TCP did not detect a timeout due to outstanding

unacknowledged data) but before the mail was acknowledged at the SMTP level. This failure would cause early mail senders to wait forever for the SMTP level acknowledgement. The obvious cure was to set a timer at the SMTP level, but the first attempt to do this did not work, for there was no simple way to select the timer interval. If the interval selected was short, it expired in normal operation when sending a large file to a slow host. An interval of many minutes was needed to prevent false timeouts, but that meant that failures were detected only very slowly. The current solution in several mailers is to pick a timeout interval proportional to the size of the message.

Server telnet provides an example of the other kind of failure. It can easily happen that the communications link fails while there is no traffic flowing, perhaps because the user is thinking. Eventually, the user will attempt to type something, at which time he will discover that the connection is dead and abort it. But the host end of the connection, having nothing to send, will not discover anything wrong, and will remain waiting forever. In some systems there is no way for a user in a different process to destroy or take over such a hanging process, so there is no way to recover.

One solution to this would be to have the host server telnet query the user end now and then, to see if it is still up. (Telnet does not have an explicit query feature, but the host could negotiate some unimportant option, which should produce either agreement or disagreement in return.) The only problem with this is that a reasonable sample interval, if applied to every user on a large system,

can generate an unacceptable amount of traffic and system overhead. A smart server telnet would use this query only when something seems wrong, perhaps when there had been no user activity for some time.

In both these cases, the general conclusion is that client level error detection is needed, and that the details of the mechanism are very dependent on the application. Application programmers must be made aware of the problem of failures, and must understand that error detection at the TCP or lower level cannot solve the whole problem for them.

6.  Knowing When to Give Up

It is not obvious, when error messages such as ICMP Destination Unreachable arrive, whether TCP should abandon the connection. The reason that error messages are difficult to interpret is that, as discussed above, after a failure of a gateway or network, there is a transient period during which the gateways may have incorrect information, so that irrelevant or incorrect error messages may sometimes return. An isolated ICMP Destination Unreachable may arrive at a host, for example, if a packet is sent during the period when the gateways are trying to find a new route. To abandon a TCP connection based on such a message arriving would be to ignore the valuable feature of the Internet that for many internal failures it reconstructs its function without any disruption of the end points.

But if failure messages do not imply a failure, what are they for? In fact, error messages serve several important purposes. First, if

they arrive in response to opening a new connection, they probably are caused by opening the connection improperly (e.g., to a non-existent address) rather than by a transient network failure. Second, they provide valuable information, after the TCP timeout has occurred, as to the probable cause of the failure. Finally, certain messages, such as ICMP Parameter Problem, imply a possible implementation problem. In general, error messages give valuable information about what went wrong, but are not to be taken as absolutely reliable. A general alerting mechanism, such as the TCP timeout discussed above, provides a good indication that whatever is wrong is a serious condition, but without the advisory messages to augment the timer, there is no way for the client to know how to respond to the error. The combination of the timer and the advice from the error messages provides a reasonable set of facts for the client layer to have. It is important that error messages from all layers be passed up to the client module in a useful and consistent way.
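The interplay between the timer and the advisory messages, together with the two TCP modes from Section 5, might be sketched as follows. This is a hedged illustration, not an implementation of any TCP: the class, the enum members, and the mode strings are hypothetical, and reporting an error immediately while a connection is still opening is an assumption drawn from the observation that such errors probably reflect an improper open.

```python
from enum import Enum


class State(Enum):
    OPENING = "opening"
    ESTABLISHED = "established"


class Action(Enum):
    CONTINUE = "continue"  # keep retransmitting
    REPORT = "report"      # tell the client and let it decide
    ABORT = "abort"        # abandon the connection


class Connection:
    def __init__(self, mode):
        if mode not in ("program", "human"):
            raise ValueError("mode must be 'program' or 'human'")
        self.mode = mode
        self.state = State.OPENING
        self.advice = []  # advisory error messages seen so far

    def on_error_message(self, message):
        """Record an advisory message such as ICMP Destination Unreachable."""
        self.advice.append(message)
        if self.state is State.OPENING:
            # Assumption: during open, the error probably reflects a bad
            # address rather than a transient fault, so report it at once.
            return Action.REPORT
        # Established: an isolated message may be a transient artifact,
        # so it is kept as advice but does not end the connection.
        return Action.CONTINUE

    def on_timer_expired(self):
        """The timer raises the alarm; the saved advice explains the cause."""
        action = Action.ABORT if self.mode == "program" else Action.REPORT
        return action, list(self.advice)


conn = Connection("human")
conn.state = State.ESTABLISHED
first = conn.on_error_message("ICMP Destination Unreachable")
outcome = conn.on_timer_expired()
```

In this sketch the error message alone never ends an established connection; only the timer does, and even then only in program mode. The human-mode client receives both the timeout report and the accumulated advice.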
