Movatterモバイル変換

[RFC Home] [TEXT|PDF|HTML] [Tracker] [IPR] [Info page]

UNKNOWN

Network Working Group L. Kleinrock (UCLA-NMC)Request for Comments: 626 H. Opderbeck (UCLA-NMC)NIC #22161 Mar 1974 "On a Possible Lockup Condition in the IMP Subnet Due to Message Sequencing"Lockup or deadlock conditions are one of the most serious system malfunctionsthat can occur in a computer system or network. Communication protocols haveto be designed very carefully to avoid the occurrence of these lockups. Theircommon characteristic is that they occur only under unusual circumstances whichwere not foreseen or deemed too unlikely to occur by the protocol designers.(However, these designers often are not the ones in a position to evaluatesuch likelihoods quantitatively.)The best known lockup that has occurred in the ARPANET is the reassemblylockup [1]. The store-and-forward lockup, also described in Reference 1, hasbeen avoided in the new IMPSYS by carefully observing Kahn's heuristics [1].The last lockup in the subnet we know of occurred on December 21, 1973(Christmas lockup"). This dormant lockup conditions was brought to lightby collecting snapshot measurement messages from all sites simultaneously.The Christmas lockup happened when snapshot messages arrived at ther UCLA IMPwhich had allocated reassembly storage for them and no reassembly blocks werefree. (A reassembly block is a piece of storage used in the actual processof reassembling packets back into messages) [2].Deadlock conditions have not only been observed in the subnet but also inhigher level protocols. The original design of the ICP, for example, had aflaw that was discovered only after a few months of use [3,4]. More recentlyBBN reported a deadlock problem involving the exchange of HOST statusinformation by the RSEXEC server (RSSER) programs [5].As long as it is not possible to design practical communication protocolswhich guarantee deadlock-free operation it is vital to continually check thoseprotocols that are currently in use for any such failures - even if they appear"very unlikely" to occur. In this RFC we comment on a possible deadlockcondition in the IMP subnet which, to our knowledge, has not yet occurred,neither has it been identified. Though we have never seen this problemactually happen it may occur in the future if no precautions are taken. Thispossible lockup condition is due to the sequencing of messages in the subset.To avoid the occurrence of reassembly lockup, the flow control mechanism inthe subnet was modified in some significant ways. Currently there is a limitof four messages that can simultaneously be in transmission between any pair ofsource and destination IMPs. As a result of removing the link-handling fromthe old IMPSYS, it became necessary to introduce a message sequencingmechanism. -1-

Network Working GroupRequest for Comments: 626Messages leave the destination IMP in the same order as they entered the sourceIMP. (Note that the sequencing is done on an IMP-to-IMP basis, not a HOST-to-Host basis. This may introduce undesirable "sequencing delay" if two HOSTsattached to the same destination IMP receive messages from the same sourceIMP).Sequencing of messages has, in general, the potential of introducing deadlockconditions. The reason for this is that any message, say message (n+1), whichis out of order (and therefore cannot be delivered to its destination HOST)may use up resources that are required by message (n) which must be deliverednext. Therefore, message (n) cannot reach its destination IMP which, in turn,prevents the other messages (n+1), etc) that are out of order from beingdelivered to their destination HOST(s). For this reason one has to be verycareful not to allocate too many resources (e.g., buffers) to messages thatare out of order.To avoid lockup conditions the current IMPSYS takes the following twoprecautions: 1. Requests for buffer allocation are always services in order of message number; i.e., no 'ALLOCATE' is returned for message (n+1) if message (n) (or a request for buffer allocation for message (n)) has not yet been received and serviced. 2. Single packet messages (regular and priority) that arrive at the destination IMP out of order are not accepted unless they were retransmitted in response to a previous buffer allocation. These messages are treated rather as a request for the allocation of one buffer (according to 1 above) and then discarded.With these two precautions a deadlock condition appears to be impossible tooccur. However, there is a second buffer allocation mechanism that is nottried to the message sequencing, namely the 'ALLOCATE' that is piggy-backedon the RFNM for a multiple-packet message. The piggy-backed ALLOCATErepresents a buffer allocation for the next multiple-packet message, and NOTfor the next message in sequence. Thus, if the next message in sequence isa single-packet message, the piggy-backed ALLOCATE in effect allocatesbuffers to a message that is out of order.Let us see how this situation can lead to a deadlock condition. Assume thereis a maximum number of 24 reassembly buffers in each IMP. -2-

Network Working GroupRequest for Comments:  626Let IMPs A, B, and C continually transmit 8-packet messages to the samedestination IMP D such that all 24 reassembly buffers in IMP D are used up bythis transmission of multiple packet messages.  If now, in the stream of8-packet messages, IMP A sends a single-packet message it will generally notbe accepted by destination IMP D since there is no reassembly buffer spaceavailable.  (There may be a free reassembly buffer if the single-packet messagehappens to arrive during the time one of the three 8-packet messages is beingtransmitted to its HOST).  The single-packet message will therefore be treatedas a request for buffer allocation.  This request will not be serviced beforethe RFNM of the previous multiple-packet message is sent.  At this time,however, all the free reassembly buffers have already been allocated to thenext multiple-packet message via the piggy-backed ALLOCATE mechanism.  Theonly chance for the single-packet message to get its allocation requestsatisfied is to grab a reassembly buffer from one of the other two 8-packetmessages.  This attempt may be unsuccessful because it depends on the timingof events in the IMP. A deadlock condition can occur if IMP B and IMP Calso send a single-packet message in their stream of 8-packet measures whichcannot be serviced for the same reason.  In this case, the three 8-packetmessages that will arrive later at IMP D cannot be delivered to theirdestination HOST(s) because they are out of order.  The three single-packetmessages that should be delivered next, however, will never reach thedestination IMP since there is no reassembly space available.A possible sequence of event that leads to a deadlock condition can bedescribed as follows:  Examples for Notation:        event:  A8 arrives  ->          all 8 packets of the 8-packet message                                        from IMP A have arrived at IMP D        event:  C1 arrives  ->          a single packet message from IMP C has                                        arrived at IMP D (and is treated as a                                        request for buffer allocation)        event:  B8 complete ->          the last packet of the 8-packet                                        message from IMP B has been received                                        by its destination HOST        event:  A8 RFNM/ALL ->          a RFNM with the piggy-backed ALLOCATE                                        is sent to IMP A                                   -3-

Network Working GroupRequest for Comments:  626                    # of Allocated      # of Reassembly         # of Free                     Reassembly            Buffers              Reassembly                      Buffers              in Use                Buffers     Initially          24                    0                     0 1.  A8 arrives         16                    8                     0 2.  B8 arrives          8                   16                     0 3.  C8 arrives          0                   24                     0 4.  Al arrives          0                   24                     0 5.  B1 arrives          0                   24                     0 6.  C1 arrives          0                   24                     0 7.  A8 complete         0                   16                     8 8.  B8 complete         0                    8                    16 9.  C8 complete         0                    0                    2410.  A8 RFNM/ALL       8                    0                    1611.  B8 RFNM/ALL      16                    0                     812.  C8 RFNM/ALL      24                    0                     013.  A8 arrives       16                    8                     014.  B8 arrives        8                   16                     015.  C8 arrives        0                   24                     016.  - deadlock -Note that an ALLOCATE for one of the single-packet messages A1, B1 and C1 canonly be returned to source IMP A, B, and C, respectively, after the RFNM(with its piggy-backed ALLOCATE) for the previous 8-packet message has beensent.  If these RFNMs are sent in sequence, i.e., without an ALLOCATE forone of the single-packet messages in between, the temporarily freed reassemblystorage (events (7) through (9) is implicitly allocated to the next multiple-packet messages (events (10) through (12) and not to any of the single-packetmessages.  The next 8-packet messages are, however, out of order andcannot be delivered to their destination HOST(s).Right now it looks as if such a lockup can only occur if the number ofreassembly buffers is a multiple of eight.  Indeed, the probability of alockup in this latter case is much higher.  However, deadlocks can also occurif the number of reassembly buffers is not a multiple of eight.Let us assume there are 26 instead of 24 reassembly buffers.  Assume also that,due to alternate paths, line failure, or retransmission, the second packet ofa 2-packet message arrives at IMP D before a single-packet message from thesame source IMP A.  The single-packet message has a smaller sequence numberand must therefore be delivered to its destination HOST before the 2-packetmessage.  When the second packet of the 2-packet message arrives at IMP D theIMP realizes that only 2 of the allocated 8 buffers will be needed.  Therefore                                   -4-

Network Working GroupRequest for Comments: 6266 buffers are returned to the pool of free reassembly buffers.If there were26-3x8=2 buffers in the pool before, the pool size is increased by 6 to 8buffers. These 8 buffers, however, are just enough to send out anotherpiggy-backed ALLOCATE. The single-packet message will therefore find the poolof free reassembly buffers empty although the total number of reassemblybuffers is not a multiple of eight. The 2-packet message cannot be deliveredto its destination HOST because it is out of order. Therefore the deadlockcondition we described before may occur again.We agree that the above mentioned sequence of events is unlikely to occur(otherwise one would have observed it already). This is particularly truesince the current maximum number of reassembly buffers (58) is much largerthan 24. Judging from past experience with computer systems and networks,however, we know that even very unlikely events have a tendency to occur in thelong run. Also, the probability of this deadlock condition increases withincreasing traffic in the net. Therefore, it is certainly worthwhile tomodify the IMPSYS in such a way that this deadlock cannot occur. It turns outthat a minor modification already achieves the desired effect. Recall thatthat described deadlock can only occur because single- and multiple-packetmessages use the same pool of reassembly buffers. If we set aside a singlereassembly buffer (or one for each destination HOST) that can be used only bysingle-packet messages this lockup condition due to message sequencing cannotoccur.Another solution to this problem is, of course, to more the message sequencingfrom the IMP to the HOST level. This does not mean that similar lockupproblems cannot occur on the HOST level. Also, it is currently much easierto resolve this problem by a slight modification of the IMPSYS rather thanhaving to coordinate all the various HOSTs. Another alternative is todiscontinue the use of multiple-packet messages. However, this also involvesmuch more drastic changes which require careful consideration. -5-

Network Working GroupRequest for Comments:  626                              REFERENCES1.  Kahn, R. E. and W. R. Crowther, "Flow Control in a Resource-Sharing    Computer Network," IEEE Transactions on Communication, Volume COM-20,3    June 1972.2.  BBN Report No. 2717, "Interface Message Processors for the ARPA Computer    Network," Quarterly Technical Report No. 4, January 1974.3.  Naylor, W., J. Wong, C. Kline, and J. Postel, "Regarding Proffered    Official ICP,"RFC 143, NIC 6728.4.  Postel, J. B. "A Graph Model Analysis of Computer Communications    Protocols," PhD dissertation, University of California, Los Angeles.5.  BBN Report No. 2670."Natural Communication with Computers IV,"  Quarterly    Progress Report No. 12, October 1973.                                   -6-

[8]ページ先頭