RFC 789

Vulnerabilities of Network Control Protocols: An Example

Eric C. Rosen
Bolt Beranek and Newman Inc.
This paper has appeared in the January 1981 edition of the SIGSOFT Software Engineering Notes, and will soon appear in the SIGCOMM Computer Communications Review. It is being circulated as an RFC because it is thought that it may be of interest to a wider audience, particularly to the internet community. It is a case study of a particular kind of problem that can arise in large distributed systems, and of the approach used in the ARPANET to deal with one such problem.

On October 27, 1980, there was an unusual occurrence on the ARPANET. For a period of several hours, the network appeared to be unusable, due to what was later diagnosed as a high priority software process running out of control. Network-wide disturbances are extremely unusual in the ARPANET (none has occurred in several years), and as a result, many people have expressed interest in learning more about the etiology of this particular incident. The purpose of this note is to explain what the symptoms of the problem were, what the underlying causes were, and what lessons can be drawn. As we shall see, the immediate cause of the problem was a rather freakish hardware malfunction (which is not likely to recur) which caused a faulty sequence of network control packets to be generated. This faulty sequence of control packets in turn affected the apportionment of software resources in the IMPs, causing one of the IMP processes to use an excessive amount of resources, to the detriment of other IMP processes. Restoring the network to operational
condition was a relatively straightforward task. There was no damage other than the outage itself, and no residual problems once the network was restored. Nevertheless, it is quite interesting to see the way in which unusual (indeed, unique) circumstances can bring out vulnerabilities in network control protocols, and that shall be the focus of this paper.

The problem began suddenly when we discovered that, with very few exceptions, no IMP was able to communicate reliably with any other IMP. Attempts to go from a TIP to a host on some other IMP only brought forth the "net trouble" error message, indicating that no physical path existed between the pair of IMPs. Connections which already existed were summarily broken. A flood of phone calls to the Network Control Center (NCC) from all around the country indicated that the problem was not localized, but rather seemed to be affecting virtually every IMP.

As a first step towards trying to find out what the state of the network actually was, we dialed up a number of TIPs around the country. What we generally found was that the TIPs were up, but that their lines were down. That is, the TIPs were communicating properly with the user over the dial-up line, but no connections to other IMPs were possible.

We tried manually restarting a number of IMPs which are in our own building (after taking dumps, of course). This procedure initializes all of the IMPs' dynamic data structures, and will
often clear up problems which arise when, as sometimes happens in most complex software systems, the IMPs' software gets into a "funny" state. The IMPs which were restarted worked well until they were connected to the rest of the net, after which they exhibited the same complex of symptoms as the IMPs which had not been restarted.

From the facts so far presented, we were able to draw a number of conclusions. Any problem which affects all IMPs throughout the network is usually a routing problem. Restarting an IMP re-initializes the routing data structures, so the fact that restarting an IMP did not alleviate the problem in that IMP suggested that the problem was due to one or more "bad" routing updates circulating in the network. IMPs which were restarted would just receive the bad updates from those of their neighbors which were not restarted. The fact that IMPs seemed unable to keep their lines up was also a significant clue as to the nature of the problem. Each pair of neighboring IMPs runs a line up/down protocol to determine whether the line connecting them is of sufficient quality to be put into operation. This protocol involves the sending of HELLO and I-HEARD-YOU messages. We have noted in the past that under conditions of extremely heavy CPU utilization, so many buffers can pile up waiting to be served by the bottleneck CPU process that the IMPs are unable to acquire the buffers needed for receiving the HELLO or I-HEARD-YOU messages. If a condition like this lasts for any length of time,
the IMPs may not be able to run the line up/down protocol, and lines will be declared down by the IMPs' software. On the basis of all these facts, our tentative conclusion was that some malformed update was causing the routing process in the IMPs to use an excessive amount of CPU time, possibly even to be running in an infinite loop. (This would be quite a surprise though, since we tried very hard to protect ourselves against malformed updates when we designed the routing process.) As we shall see, this tentative conclusion, although on the right track, was not quite correct, and the actual situation turned out to be much more complex.

When we examined core dumps from several IMPs, we noted that most, in some cases all, of the IMPs' buffers contained routing updates waiting to be processed. Before describing this situation further, it is necessary to explain some of the details of the routing algorithm's updating scheme. (The following explanation will of course be very brief and incomplete. Readers with a greater level of interest are urged to consult the references.)

Every so often, each IMP generates a routing update indicating which other IMPs are its immediate neighbors over operational lines, and the average per-packet delay (in milliseconds) over each such line. Every IMP is required to generate such an update at least once per minute, and no IMP is permitted to generate more than a dozen such updates over the course of a minute. Each update has a 6-bit sequence number which is
advanced by 1 (modulo 64) for each successive update generated by a particular IMP. If two updates generated by the same IMP have sequence numbers n and m, update n is considered to be LATER (i.e., more recently generated) than update m if and only if one of the following two conditions holds:

    (a) n > m, and n - m <= 32

    (b) n < m, and m - n > 32

(where the comparisons and subtractions treat n and m as unsigned 6-bit numbers, with no modulus). When an IMP generates an update, it sends a copy of the update to each neighbor. When an IMP A receives an update u1 which was generated by a different IMP B, it first compares the sequence number of u1 with the sequence number of the last update, u2, that it accepted from B. If this comparison indicates that u2 is LATER than u1, u1 is simply discarded. If, on the other hand, u1 appears to be the LATER update, IMP A will send u1 to all its neighbors (including the one from which it was received). The sequence number of u1 will be retained in A's tables as the LATEST received update from B. Of course, u1 is always accepted if A has seen no previous update from B. Note that this procedure is designed to ensure that an update generated by a particular IMP is received, unchanged, by all other IMPs in the network, IN THE PROPER SEQUENCE. Each routing update is broadcast (or flooded) to all IMPs, not just to the immediate neighbors of the IMP which generated
the update (as in some other routing algorithms). The purpose of the sequence numbers is to ensure that all IMPs will agree as to which update from a given IMP is the most recently generated update from that IMP.

For reliability, there is a protocol for retransmitting updates over individual links. Let X and Y be neighboring IMPs, and let A be a third IMP. Suppose X receives an update which was generated by A, and transmits it to Y. Now if, in the next 100 ms or so, X does not receive from Y an update which originated at A and whose sequence number is at least as recent as that of the update X sent to Y, X concludes that its transmission of the update did not get through to Y, and that a retransmission is required. (This conclusion is warranted, since an update which is received and adjudged to be the most recent from its originating IMP is sent to all neighbors, including the one from which it was received.) The IMPs do not keep the original update packets buffered pending retransmission. Rather, all the information in the update packet is kept in tables, and the packet is re-created from the tables if necessary for a retransmission.

This transmission protocol ("flooding") distributes the routing updates in a very rapid and reliable manner. Once generated by an IMP, an update will almost always reach all other IMPs in a time period on the order of 100 ms. Since an IMP can generate no more than a dozen updates per minute, and there are
64 possible sequence numbers, sequence number wrap-around is not a problem. There is only one exception to this. Suppose two IMPs A and B are out of communication for a period of time because there is no physical path between them. (This may be due either to a network partition, or to a more mundane occurrence, such as one of the IMPs being down.) When communication is re-established, A and B have no way of knowing how long they have been out of communication, or how many times the other's sequence numbers may have wrapped around. Comparing the sequence number of a newly received update with the sequence number of an update received before the outage may give an incorrect result. To deal with this problem, the following scheme is adopted. Let t0 be the time at which IMP A receives update number n generated by IMP B. Let t1 be t0 plus 1 minute. If by t1 A receives no update generated by B with a LATER sequence number than n, A will accept any update from B as being more recent than n. So if two IMPs are out of communication for a period of time which is long enough for the sequence numbers to have wrapped around, this procedure ensures that proper resynchronization of sequence numbers is effected when communication is re-established.

There is just one more facet of the updating process which needs to be discussed. Because of the way the line up/down protocol works, a line cannot be brought up until 60 seconds after its performance becomes good enough to warrant operational use. (Roughly speaking, this is the time it takes to determine
that the line's performance is good enough.) During this 60-second period, no data is sent over the line, but routing updates are transmitted. Remember that every node is required to generate a routing update at least once per minute. Therefore, this procedure ensures that if two IMPs are out of communication because of the failure of some line, each has the most recent update from the other by the time communication is re-established.

This very short introduction to the routing algorithm's updating protocol should provide enough background to enable the reader to understand the particular problem under discussion; further justification and detail can be found in the references.

Let us return now to the discussion of the network outage. I have already mentioned that the core dumps showed almost all buffers holding routing updates which were waiting to be processed. Close inspection showed that all the updates were from a single IMP, IMP 50. By a strange "coincidence," IMP 50 had been malfunctioning just before the network-wide outage occurred, and was off the net during the period of the outage. Hence it was not generating any updates during the period of the outage. In addition, IMP 29, an immediate neighbor of IMP 50, was also suffering hardware malfunctions (in particular, dropping bits), but was up (though somewhat flakey) while the network was in bad shape. Furthermore, the malfunction in IMP 50 had to do with its ability to communicate properly with the neighboring IMP
29. Although we did not yet understand how it was possible for so many updates from one IMP to be extant simultaneously, we did understand enough to be able to get the network to recover. All that was necessary was to patch the IMPs to disregard any updates from IMP 50, which after all was down anyway. When the network is operating normally, broadcasting a patch to all IMPs can be done in a matter of minutes. With the network operating as it was during the period of the outage, this can take as much as 3 or 4 hours. (Remember that the IMPs are generally unmanned, and that the only way of controlling them from the NCC is via the network itself. This is perfectly satisfactory when an outage affects only a small group of IMPs, but is an obvious problem when the outage has network-wide effects.) This procedure was fully successful in bringing the network back up.

When we looked closely at the dumps, we saw that not only were all the updates on the queue from IMP 50, but they all had one of three sequence numbers (either 8, 40, or 44), and were ordered in the queue as follows:

    8, 40, 44, 8, 40, 44, 8, 40, 44, ...

Note that by the definition of LATER, 44 is LATER than 40 (44 > 40 and 44 - 40 <= 32), 40 is LATER than 8 (40 > 8 and 40 - 8 <= 32), and 8 is LATER than 44 (8 < 44 and 44 - 8 > 32). Given the presence of three updates from the same IMP with these three sequence numbers, this is what would be expected. Since each update is LATER than one of the others, a cycle is formed which keeps the three updates floating
around the network indefinitely. Thus the IMPs spend most of their CPU time and buffer space in processing these updates. The problem was to figure out how these three updates could possibly have existed at the same time. After all, getting from update 8 to update 40 should require 2 or 3 full minutes, plus 31 intervening sequence numbers. So how could 8 still be around when 40 was generated, especially since no updates with intervening sequence numbers were present?

Our first thought was that maybe the real-time clock in IMP 50 was running one or two orders of magnitude faster than normal, invalidating our assumptions about the maximum number of updates which could be generated in a given time. An alternative hypothesis suggested itself, however, when we looked at the binary representations of the three sequence numbers:

     8 - 001000
    40 - 101000
    44 - 101100

Note that 44 has only one more bit than 40, which has only one more bit than 8. Furthermore, the three different updates were completely identical, except for their sequence numbers. This suggests that there was really only one update, 44, whose sequence number was twice corrupted by dropped bits. (Of course, it's also possible that the "real" update was 8, and was corrupted by added bits. However, bit-dropping has proven itself
to be a much more common sort of hardware malfunction than bit-adding, although spontaneously dropped bits may sometimes come back on spontaneously.)

Surely, the reader will object, there must be protection against dropped bits. Yes, there is protection, but apparently not enough. The update packets themselves are checksummed, so a dropped bit in an update packet is readily detected. Remember, though, that if an update needs to be retransmitted, it is recreated from tabled information. For maximal reliability, the tables must be checksummed also, and the checksum must be recomputed every time the table is accessed. However, this would require either a large number of CPU cycles (for frequent checksumming of a large area of memory) or a large amount of memory (to store the checksums for a lot of small areas). Since CPU cycles and memory are both potentially scarce resources, this did not seem to us to be a cost-effective way to deal with problems that arise, say, once per year (this is the first such problem encountered in a year and a half of running this routing algorithm). Time and space can be saved by recomputing the checksum at a somewhat slower frequency, but this is less reliable, in that it allows a certain number of dropped bits to "fall between the cracks." It seems likely, then, that one of the malfunctioning IMPs had to retransmit update 44 at least twice (recreating it each time from tabled information), retransmitting it at least once with the corrupted sequence number 40, and at
least once with the corrupted sequence number 8. This would cause those three sequence numbers to be extant in the network simultaneously, even though the protocol is supposed to ensure that this is impossible.

Actually, the detection of dropped bits is most properly a hardware function. The next generation of IMP hardware (the "C30 IMP") will be able to detect and correct all single-bit errors, and will detect all other bit errors. Uncorrectable bit errors will cause the IMP to go into its "loader/dumper." (An IMP in its loader/dumper is not usable for transferring data, and is officially in the "down" state. However, an IMP in its loader/dumper is easily controllable from the NCC, and can be restarted or reloaded without on-site intervention.) Current hardware does have parity checking (which should detect single dropped bits), but this feature has had to be turned off since (a) there are too many spurious parity "errors," i.e., most of the time when the machines complain of parity errors there don't really seem to be any, and (b) parity errors cause the machines to simply halt, rather than go into their loader/dumpers, which means that on-site intervention is required to restart them.

Pending the introduction of improved hardware, what can be done to prevent problems like this from recurring in the future? It is easy to think of many ways of avoiding this particular problem, especially if one does not consider the problems that may arise from the "fixes." For example, we might be able to
avoid this sort of problem by spending a lot more CPU cycles on checksumming, but this may be too expensive because of the side effects it would introduce. (Also, it is not clear that any memory checksumming strategy can be totally free of "cracks.") A very simple and conservative fix to prevent this particular problem from recurring is to modify clause (a) of the definition of LATER so that the "<=" is replaced by "<" (strictly less than). We will implement this fix, but it cannot be guaranteed that no related problems will ever arise.

What is really needed is not some particular fix to the routing algorithm, but a more general fix. In some sense, the problem we saw was not really a routing problem. The routing code was working correctly, and the routes that were generated were correct and consistent. The real problem is that a freakish hardware malfunction caused a high priority process to run wild, devouring resources needed by other processes, thereby making the network unusable. The fact that the wild process was the routing process is incidental. In designing the routing process, we carefully considered the amount of resource utilization it would require. By strictly controlling and limiting the rate at which updates can be generated, we tried to prevent any situation in which the routing process would make excessive demands on the system. As we have seen, though, even our carefully designed mechanisms were unable to protect against every possible sort of hardware failure. We need a better means of detecting that some
high priority process in the IMP, despite all the safeguards we have built in, is still consuming too many resources. Once this is detected, the IMP can be automatically placed in its loader/dumper. In the case under discussion, we would have liked to have all the IMPs go into their loader/dumpers when the problem arose. This would have enabled us to re-initialize and restart all the IMPs much more quickly. (Although restarting individual IMPs did little good, restarting all the IMPs simultaneously would have cleared up the problem instantly, since all routing tables in all IMPs would have been initialized simultaneously.) It took us no more than an hour to figure out how to restore the network; several additional hours were required because it took so long for us to gain control of the misbehaving IMPs and get them back to normal. A built-in software alarm system (assuming, of course, that it was not subject to false alarms) might have enabled us to restore the network more quickly, significantly reducing the duration of the outage. This is not to say that a better alarm and control system could ever be a replacement for careful study and design which attempts to properly distribute the utilization of important resources, but only that it is a necessary adjunct, to handle the cases that will inevitably fall between the cracks of even the most careful design.
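The pathological interaction of the three corrupted sequence numbers is easy to reproduce in isolation. The following short Python sketch (not taken from the IMP code; the function name and the strict-mode flag are illustrative inventions) implements the two clauses of the LATER definition and shows both the 8/40/44 cycle and the effect of the proposed "<" fix:

```python
def later(n, m, strict=False):
    """True if 6-bit sequence number n is LATER than m.

    Clause (a): n > m and n - m <= 32 (the fix tightens <= to <).
    Clause (b): n < m and m - n > 32.
    n and m are treated as unsigned 6-bit numbers, with no modulus.
    """
    if n > m:
        return (n - m < 32) if strict else (n - m <= 32)
    if n < m:
        return m - n > 32
    return False

# Under the original rule the three corrupted sequence numbers form a
# cycle: each one is LATER than another, so every copy of every update
# always looks newer than some update already accepted.
assert later(44, 40) and later(40, 8) and later(8, 44)

# With the strict "<" fix, 40 is no longer LATER than 8 (40 - 8 == 32
# fails the strict test), so the cycle is broken.
assert later(44, 40, strict=True)
assert not later(40, 8, strict=True)
assert later(8, 44, strict=True)
```

Note that under the strict rule 40 and 8 are simply incomparable (neither is LATER than the other), which is enough to stop the three updates from endlessly superseding one another.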
REFERENCES

J.M. McQuillan, I. Richer, and E.C. Rosen, "The New Routing Algorithm for the ARPANET," IEEE Transactions on Communications, May 1980.

E.C. Rosen, "The Updating Protocol of ARPANET's New Routing Algorithm," Computer Networks, February 1980.