Obsoleted by: RFC 7805                                      Status: HISTORIC

RFC:  813

              WINDOW AND ACKNOWLEDGEMENT STRATEGY IN TCP

                            David D. Clark
                  MIT Laboratory for Computer Science
               Computer Systems and Communications Group

                               July, 1982


1.  Introduction

This document describes implementation strategies to deal with two mechanisms in TCP, the window and the acknowledgement. These mechanisms are described in the specification document, but it is possible, while complying with the specification, to produce implementations which yield very bad performance. Happily, the pitfalls possible in window and acknowledgement strategies are very easy to avoid.

It is a much more difficult exercise to verify the performance of a specification than its correctness. Certainly, we have less experience in this area, and we certainly lack any useful formal technique. Nonetheless, it is important to attempt a specification in this area, because different implementors might otherwise choose superficially reasonable algorithms which interact poorly with each other. This document presents a particular set of algorithms which have received testing in the field, and which appear to work properly with each other. With more experience, these algorithms may become part of the formal specification; until such time their use is recommended.

2.  The Mechanisms

The acknowledgement mechanism is at the heart of TCP. Very simply, when data arrives at the recipient, the protocol requires that it send back an acknowledgement of this data. The protocol specifies that the bytes of data are sequentially numbered, so that the recipient can acknowledge data by naming the highest numbered byte of data it has received, which also acknowledges the previous bytes (actually, it identifies the first byte of data which it has not yet received, but this is a small detail). The protocol contains only a general assertion that data should be acknowledged promptly, but gives no more specific indication as to how quickly an acknowledgement must be sent, or how much data should be acknowledged in each separate acknowledgement.

The window mechanism is a flow control tool. Whenever appropriate, the recipient of data returns to the sender a number, which is (more or less) the size of the buffer which the receiver currently has available for additional data. This number of bytes, called the window, is the maximum which the sender is permitted to transmit until the receiver returns some additional window. Sometimes, the receiver will have no buffer space available, and will return a window value of zero. Under these circumstances, the protocol requires the sender to send a small segment to the receiver now and then, to see if more data is accepted. If the window remains closed at zero for some substantial period, and the sender can obtain no response from the receiver, the protocol requires the sender to conclude that the receiver has failed, and to close the connection. Again, there is very little performance information in the specification, describing under what circumstances the window should be increased, and how the sender should respond to such revised information.

A bad implementation of the window algorithm can lead to extremely poor performance overall. The degradations which occur in throughput and CPU utilization can easily be several factors of ten, not just a fractional increase. This particular phenomenon is specific enough that it has been given the name of Silly Window Syndrome, or SWS. Happily, SWS is easy to avoid if a few simple rules are observed. The most important function of this memo is to describe SWS, so that implementors will understand the general nature of the problem, and to describe algorithms which will prevent its occurrence. This document also describes performance enhancing algorithms which relate to acknowledgement, and discusses the way acknowledgement and window algorithms interact as part of SWS.

3.  Silly Window Syndrome

In order to understand SWS, we must first define two new terms. Superficially, the window mechanism is very simple: there is a number, called "the window", which is returned from the receiver to the sender. However, we must have a more detailed way of talking about the meaning of this number. The receiver of data computes a value which we will call the "offered window". In a simple case, the offered window corresponds to the amount of buffer space available in the receiver. This correspondence is not necessarily exact, but is a suitable model for the discussion to follow.

It is the offered window which is actually transmitted back from the receiver to the sender. The sender uses the offered window to compute a different value, the "usable window", which is the offered window minus the amount of outstanding unacknowledged data. The usable window is less than or equal to the offered window, and can be much smaller.

Consider the following simple example. The receiver initially provides an offered window of 1,000. The sender uses up this window by sending five segments of 200 bytes each. The receiver, on processing the first of these segments, returns an acknowledgement which also contains an updated window value. Let us assume that the receiver of the data has removed the first 200 bytes from the buffer, so that the receiver once again has 1,000 bytes of available buffer. Therefore, the receiver would return, as before, an offered window of 1,000 bytes. The sender, on receipt of this first acknowledgement, now computes the additional number of bytes which may be sent. In fact, of the 1,000 bytes which the recipient is prepared to receive at this time, 800 are already in transit, having been sent in response to the previous offered window. In this case, the usable window is only 200 bytes.
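
The arithmetic of this example can be shown directly. The following fragment, in C, is purely illustrative: the structure and names are not taken from any particular TCP implementation, and the numbers are those of the example above.

    /* Sketch of the usable-window computation described above. */
    #include <stdio.h>

    struct send_state {
        unsigned long offered_window;   /* last window advertised by the receiver */
        unsigned long bytes_in_flight;  /* sent but not yet acknowledged          */
    };

    /* usable window = offered window minus outstanding unacknowledged data */
    static unsigned long usable_window(const struct send_state *s)
    {
        if (s->bytes_in_flight >= s->offered_window)
            return 0;
        return s->offered_window - s->bytes_in_flight;
    }

    int main(void)
    {
        /* Offered window of 1,000 bytes; five 200-byte segments sent and
         * the first one acknowledged, so 800 bytes remain in transit.    */
        struct send_state s = { 1000, 800 };
        printf("usable window = %lu bytes\n", usable_window(&s));  /* 200 */
        return 0;
    }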

Let us now consider how SWS arises. To continue the previous example, assume that at some point, when the sender computes a usable window of 200 bytes, it has only 50 bytes to send until it reaches a "push" point. It thus sends 50 bytes in one segment, and 150 bytes in the next segment. Sometime later, this 50-byte segment will arrive at the recipient, which will process and remove the 50 bytes and once again return an offered window of 1,000 bytes. However, the sender will now compute that there are 950 bytes in transit in the network, so that the usable window is now only 50 bytes. Thus, the sender will once again send a 50-byte segment, even though there is no longer a natural boundary to force it.

In fact, whenever the acknowledgement of a small segment comes back, the usable window associated with that acknowledgement will cause another segment of the same small size to be sent, until some abnormality breaks the pattern. It is easy to see how small segments arise, because natural boundaries in the data occasionally cause the sender to take a computed usable window and divide it up between two segments. Once that division has occurred, there is no natural way for those usable window allocations to be recombined; thus the breaking up of the usable window into small pieces will persist.

Thus, SWS is a degeneration in the throughput which develops over time, during a long data transfer. If the sender ever stops, as for example when it runs out of data to send, the receiver will eventually acknowledge all the outstanding data, so that the usable window computed by the sender will equal the full offered window of the receiver. At this point the situation will have healed, and further data transmission over the link will occur efficiently. However, in large file transfers, which occur without interruption, SWS can cause appalling performance. The network between the sender and the receiver becomes clogged with many small segments, and an equal number of acknowledgements, which in turn causes lost segments, which triggers massive retransmission. Bad cases of SWS have been seen in which the average segment size was one-tenth of the size the sender and receiver were prepared to deal with, and the average number of retransmissions per successful segment sent was five.
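
The persistence of the small segments can be seen in a toy replay of this example. In the sketch below, each acknowledgement covers exactly one segment and the receiver always re-advertises its full 1,000-byte buffer; both are simplifying assumptions made only to expose the pattern, not properties of any real implementation.

    /* Toy replay of the SWS example: once a usable window has been split
     * at a push boundary, the small pieces keep coming back.             */
    #include <stdio.h>

    #define OFFERED 1000UL

    int main(void)
    {
        unsigned long in_flight[64];   /* FIFO of unacknowledged segment sizes */
        unsigned long outstanding = 0;
        int head = 0, tail = 0, i;

        /* Fill the initial window; a "push" boundary splits the last
         * usable window of 200 bytes into 50 + 150.                      */
        unsigned long initial[] = { 200, 200, 200, 200, 50, 150 };
        for (i = 0; i < 6; i++) {
            in_flight[tail++] = initial[i];
            outstanding += initial[i];
        }

        /* Acknowledge segments one at a time; the sender always fills
         * whatever usable window each acknowledgement opens up.          */
        for (i = 0; i < 12; i++) {
            unsigned long acked = in_flight[head++];
            unsigned long usable;
            outstanding -= acked;
            usable = OFFERED - outstanding;     /* usable window          */
            in_flight[tail++] = usable;         /* send a segment of it   */
            outstanding += usable;
            printf("ack of %4lu bytes -> next segment is %4lu bytes\n",
                   acked, usable);
        }
        return 0;
    }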

Happily, SWS is trivial to avoid. The following sections describe two algorithms, one executed by the sender, and one by the receiver, which appear to eliminate SWS completely. Actually, either algorithm by itself is sufficient to prevent SWS, and thus protect a host from a foreign implementation which has failed to deal properly with this problem. The two algorithms taken together produce an additional reduction in CPU consumption, observed in practice to be as high as a factor of four.

4.  Improved Window Algorithms

The receiver of data can take a very simple step to eliminate SWS. When it disposes of a small amount of data, it can artificially reduce the offered window in subsequent acknowledgements, so that the usable window computed by the sender does not permit the sending of any further data. At some later time, when the receiver has processed a substantially larger amount of incoming data, the artificial limitation on the offered window can be removed all at once, so that the sender computes a sudden large jump rather than a sequence of small jumps in the usable window.

At this level, the algorithm is quite simple, but in order to determine exactly when the window should be opened up again, it is necessary to look at some of the other details of the implementation. Depending on whether the window is held artificially closed for a short or long time, two problems will develop. The one we have already discussed -- never closing the window artificially -- will lead to SWS. On the other hand, if the window is only opened infrequently, the pipeline of data in the network between the sender and the receiver may have emptied out while the sender was being held off, so that a delay is introduced before additional data arrives from the sender. This delay does reduce throughput, but it does not consume network resources or CPU resources in the process, as does SWS. Thus, it is in this direction that one ought to overcompensate. For a simple implementation, a rule of thumb that seems to work in practice is to artificially reduce the offered window until the reduction constitutes one half of the available space, at which point increase the window to advertise the entire space again. In any event, one ought to make the chunk by which the window is opened at least permit one reasonably large segment. (If the receiver is so short of buffers that it can never advertise a large enough buffer to permit at least one large segment, it is hopeless to expect any sort of high throughput.)
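
A sketch of this receiver rule follows. It holds the advertised window fixed until the amount being withheld reaches one half of the buffer (read here as half of the receiver's total buffer space), and then advertises the entire free space at once. The names and the demonstration scenario are illustrative assumptions, not part of the rule itself.

    /* Receiver-side SWS avoidance: open the window in one large step. */
    #include <stdio.h>

    #define BUFFER 1000UL   /* total receive buffer, bytes */

    struct recv_state {
        unsigned long available;    /* buffer space actually free        */
        unsigned long advertised;   /* window last offered to the sender */
    };

    static unsigned long offered_window(struct recv_state *r)
    {
        unsigned long withheld = r->available - r->advertised;

        if (withheld >= BUFFER / 2)       /* enough space reclaimed:      */
            r->advertised = r->available; /* advertise all of it at once  */
        return r->advertised;
    }

    int main(void)
    {
        /* The sender has just filled the buffer; the application now
         * consumes the data 100 bytes at a time.  The offered window
         * stays pinned and then jumps, instead of creeping up in
         * 100-byte dribbles.                                             */
        struct recv_state r = { 0, 0 };
        unsigned long freed;
        for (freed = 100; freed <= BUFFER; freed += 100) {
            r.available = freed;
            printf("free space %4lu -> offered window %4lu\n",
                   freed, offered_window(&r));
        }
        return 0;
    }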

There is an algorithm that the sender can use to achieve the same effect described above: a very simple and elegant rule first described by Michael Greenwald at MIT. The sender of the data uses the offered window to compute a usable window, and then compares the usable window to the offered window, and refrains from sending anything if the ratio of usable to offered is less than a certain fraction. Clearly, if the computed usable window is small compared to the offered window, this means that a substantial amount of previously sent information is still in the pipeline from the sender to the receiver, which in turn means that the sender can count on being granted a larger usable window in the future. Until the usable window reaches a certain amount, the sender should simply refuse to send anything.

Simple experiments suggest that the exact value of the ratio is not very important, but that a value of about 25 percent is sufficient to avoid SWS and achieve reasonable throughput, even for machines with a small offered window. An additional enhancement which might help throughput would be to attempt to hold off sending until one can send a maximum size segment. Another enhancement would be to send anyway, even if the ratio is small, if the usable window is sufficient to hold the data available up to the next "push point".

This algorithm at the sender end is very simple. Notice that it is not necessary to set a timer to protect against protocol lockup when postponing the send operation. Further acknowledgements, as they arrive, will inevitably change the ratio of offered to usable window. (To see this, note that when all the data in the catenet pipeline has arrived at the receiver, the resulting acknowledgement must yield an offered window and usable window that equal each other.) If the expected acknowledgements do not arrive, the retransmission mechanism will come into play to assure that something finally happens. Thus, to add this algorithm to an existing TCP implementation usually requires one line of code. As part of the send algorithm it is already necessary to compute the usable window from the offered window. It is a simple matter to add a line of code which, if the ratio is less than a certain percent, sets the usable window to zero. The results of SWS are so devastating that no sender should be without this simple piece of insurance.
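
The "one line of code" just described might look as follows. The 25 percent ratio is the value suggested above; the surrounding structure and names are illustrative assumptions.

    /* Sender-side SWS insurance: treat a small usable window as zero. */
    #include <stdio.h>

    static unsigned long usable_window(unsigned long offered,
                                       unsigned long unacked)
    {
        unsigned long usable = (unacked >= offered) ? 0 : offered - unacked;

        /* The added check: refuse to send into a window smaller than
         * one quarter of the offered window.                             */
        if (usable * 4 < offered)
            usable = 0;
        return usable;
    }

    int main(void)
    {
        /* Offered window of 1,000 bytes, varying amounts outstanding. */
        unsigned long unacked[] = { 0, 600, 800, 950 };
        int i;
        for (i = 0; i < 4; i++)
            printf("outstanding %4lu -> usable %4lu\n",
                   unacked[i], usable_window(1000, unacked[i]));
        return 0;
    }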

5.  Improved Acknowledgement Algorithms

In the beginning of this paper, an overly simplistic implementation of TCP was described, which led to SWS. One of the characteristics of this implementation was that the recipient of data sent a separate acknowledgement for every segment that it received. This compulsive acknowledgement was one of the causes of SWS, because each acknowledgement provided some new usable window. Even if one of the algorithms described above is used to eliminate SWS, overly frequent acknowledgement still has a substantial problem, which is that it greatly increases the processing time at the sender's end. Measurement of TCP implementations, especially on large operating systems, indicates that most of the overhead of dealing with a segment is not in the processing at the TCP or IP level, but simply in the scheduling of the handler which is required to deal with the segment. A steady dribble of acknowledgements causes a high overhead in scheduling, with very little to show for it. This waste is to be avoided if possible.

There are two reasons for prompt acknowledgement. One is to prevent retransmission. We will discuss later how to determine whether unnecessary retransmission is occurring. The other reason one acknowledges promptly is to permit further data to be sent. However, the previous section makes quite clear that it is not always desirable to send a little bit of data, even though the receiver may have room for it.

Therefore, one can state a general rule that under normal operation, the receiver of data need not, and for efficiency reasons should not, acknowledge the data unless the acknowledgement is intended to produce an increased usable window, is necessary in order to prevent retransmission, or is being sent as part of a reverse direction segment being sent for some other reason. We will consider an algorithm to achieve these goals.

Only the recipient of the data can control the generation of acknowledgements. Once an acknowledgement has been sent from the receiver back to the sender, the sender must process it. Although the extra overhead is incurred at the sender's end, it is entirely under the receiver's control. Therefore, we must now describe an algorithm which occurs at the receiver's end. Obviously, the algorithm must have the following general form: sometimes the receiver of data, upon processing a segment, decides not to send an acknowledgement now, but to postpone the acknowledgement until some time in the future, perhaps by setting a timer. The peril of this approach is that on many large operating systems it is extremely costly to respond to a timer event, almost as costly as to respond to an incoming segment. Clearly, if the receiver of the data, in order to avoid extra overhead at the sender end, spends a great deal of time responding to timer interrupts, no overall benefit has been achieved, for efficiency at the sender end is achieved by great thrashing at the receiver end. We must find an algorithm that avoids both of these perils.

The following scheme seems a good compromise. The receiver of data will refrain from sending an acknowledgement under certain circumstances, in which case it must set a timer which will cause the acknowledgement to be sent later. However, the receiver should do this only where it is a reasonable guess that some other event will intervene and prevent the necessity of the timer interrupt. The most obvious event on which to depend is the arrival of another segment. So, if a segment arrives, postpone sending an acknowledgement if both of the following conditions hold. First, the push bit is not set in the segment, since it is a reasonable assumption that there is more data coming in a subsequent segment. Second, there is no revised window information to be sent back.
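
The decision just described can be sketched as a small test applied to each arriving segment. The two conditions for postponing are the ones given above; the types and the timer interface are assumptions made only for illustration.

    /* Receiver-side delayed acknowledgement decision. */
    #include <stdbool.h>
    #include <stdio.h>

    struct segment {
        bool push_set;              /* PSH bit on this segment              */
    };

    struct receiver {
        bool window_update_due;     /* revised window information to return */
        bool ack_timer_running;     /* timer that forces a delayed ack      */
    };

    /* Returns true if an acknowledgement should be sent immediately. */
    static bool ack_now(struct receiver *r, const struct segment *seg)
    {
        if (!seg->push_set && !r->window_update_due) {
            /* More data is probably on the way: postpone the ack, but arm
             * a timer so it cannot be withheld forever.                   */
            r->ack_timer_running = true;
            return false;
        }
        r->ack_timer_running = false;   /* ack goes out now                 */
        return true;
    }

    int main(void)
    {
        struct receiver r = { false, false };
        struct segment middle = { false }, last = { true };

        printf("mid-burst segment: ack now? %d\n", ack_now(&r, &middle)); /* 0 */
        printf("pushed segment:    ack now? %d\n", ack_now(&r, &last));   /* 1 */
        return 0;
    }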

This algorithm will insure that the timer, although set, is seldom used. The interval of the timer is related to the expected inter-segment delay, which is in turn a function of the particular network through which the data is flowing. For the Arpanet, a reasonable interval seems to be 200 to 300 milliseconds. Appendix A describes an adaptive algorithm for measuring this delay.

The section on improved window algorithms described both a receiver algorithm and a sender algorithm, and suggested that both should be used. The reason for this is now clear. While the sender algorithm is extremely simple, and useful as insurance, the receiver algorithm is required in order that this improved acknowledgement strategy work. If the receipt of every segment causes a new window value to be returned, then of necessity an acknowledgement will be sent for every data segment. When, according to the strategy of the previous section, the receiver determines to artificially reduce the offered window, that is precisely the circumstance under which an acknowledgement need not be sent. When the receiver window algorithm and the receiver acknowledgement algorithm are used together, it will be seen that sending an acknowledgement will be triggered by one of the following events. First, a push bit has been received. Second, a temporary pause in the data stream is detected. Third, the offered window has been artificially reduced to one-half its actual value.

In the beginning of this section, it was pointed out that there are two reasons why one must acknowledge data. Our consideration at this point has been concerned only with the first, that an acknowledgement must be returned as part of triggering the sending of new data. It is also necessary to acknowledge whenever the failure to do so would trigger retransmission by the sender. Since the retransmission interval is selected by the sender, the receiver of the data cannot make a precise determination of when the acknowledgement must be sent. However, there is a rough rule the sender can use to avoid retransmission, provided that the receiver is reasonably well behaved.

We will assume that the sender of the data uses the optional algorithm described in the TCP specification, in which the roundtrip delay is measured using an exponential decay smoothing algorithm. Retransmission of a segment occurs if the measured delay for that segment exceeds the smoothed average by some factor. To see how retransmission might be triggered, one must consider the pattern of segment arrivals at the receiver. The goal of our strategy was that the sender should send off a number of segments in close sequence, and receive one acknowledgement for the whole burst. The acknowledgement will be generated by the receiver at the time that the last segment in the burst arrives at the receiver. (To ensure the prompt return of the acknowledgement, the sender could turn on the "push" bit in the last segment of the burst.) The delay observed at the sender between the initial transmission of a segment and the receipt of the acknowledgement will include both the network transit time and the holding time at the receiver. The holding time will be greatest for the first segments in the burst, and smallest for the last segments in the burst. Thus, the smoothing algorithm will measure a delay which is roughly proportional to the average roundtrip delay for all the segments in the burst. Problems will arise if the average delay is substantially smaller than the maximum delay and the smoothing algorithm used has a very small threshold for triggering retransmission. The widest variation between average and maximum delay will occur when network transit time is negligible, and all delay is processing time. In this case, the maximum will be twice the average (by simple algebra), so the threshold that controls retransmission should be somewhat more than a factor of two.
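
The factor-of-two claim is easy to check. The sketch below assumes a burst of segments arriving a fixed interval apart, with the single acknowledgement generated when the last one arrives; the burst size and spacing are invented for the illustration.

    /* Holding times within a burst: segment i waits (n - 1 - i) intervals. */
    #include <stdio.h>

    int main(void)
    {
        const int n = 8;            /* segments per burst             */
        const double gap = 50.0;    /* ms between segment arrivals    */
        double total = 0.0, max = 0.0;
        int i;

        for (i = 0; i < n; i++) {
            double hold = (n - 1 - i) * gap;   /* wait until the ack  */
            total += hold;
            if (hold > max)
                max = hold;
        }
        /* With negligible network transit time the maximum holding time
         * is twice the average, so the retransmission threshold should
         * be somewhat more than a factor of two.                        */
        printf("average %.1f ms, maximum %.1f ms, ratio %.2f\n",
               total / n, max, max / (total / n));
        return 0;
    }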

In practice, retransmission of the first segments of a burst has not been a problem, because the delay measured consists of the network roundtrip delay as well as the delay due to withholding the acknowledgement, and the roundtrip tends to dominate except in very low roundtrip time situations (such as when sending to one's self for test purposes). This low roundtrip situation can be covered very simply by including a minimum value below which the roundtrip estimate is not permitted to drop.

In our experiments with this algorithm, retransmission due to faulty calculation of the roundtrip delay occurred only once, when the parameters of the exponential smoothing algorithm had been misadjusted so that they were only taking into account the last two or three segments sent. Clearly, this will cause trouble, since the last two or three segments of any burst are the ones whose holding time at the receiver is minimal, so the resulting total estimate was much lower than appropriate. Once the parameters of the algorithm had been adjusted so that the number of segments taken into account was approximately twice the number of segments in a burst of average size, with a threshold factor of 1.5, no further retransmission has ever been identified due to this problem, including when sending to ourselves and when sending over high delay nets.

6.  Conservative vs. Optimistic Windows

According to the TCP specification, the offered window is presumed to have some relationship to the amount of data which the receiver is actually prepared to receive. However, it is not necessarily an exact correspondence. We will use the term "conservative window" to describe the case where the offered window is precisely no larger than the actual buffering available. The drawback to conservative window algorithms is that they can produce very low throughput in long delay situations. It is easy to see that the maximum input of a conservative window algorithm is one bufferfull every roundtrip delay in the net, since the next bufferfull cannot be launched until the updated window/acknowledgement information from the previous transmission has made the roundtrip.
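
This ceiling is easy to put numbers on; the buffer size and roundtrip delays below are invented for the illustration.

    /* Conservative-window throughput limit: one bufferfull per roundtrip. */
    #include <stdio.h>

    int main(void)
    {
        double buffer_bytes = 4096.0;               /* receiver's actual buffer  */
        double rtt_seconds[] = { 0.05, 0.5, 2.0 };  /* assorted roundtrip delays */
        int i;

        for (i = 0; i < 3; i++)
            printf("buffer %5.0f bytes, roundtrip %4.2f s -> at most %7.0f bytes/s\n",
                   buffer_bytes, rtt_seconds[i], buffer_bytes / rtt_seconds[i]);
        return 0;
    }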

In certain cases, it may be possible to increase the overall throughput of the transmission by increasing the offered window over the actual buffer available at the receiver. Such a strategy we will call an "optimistic window" strategy. The optimistic strategy works if the network delivers the data to the recipient sufficiently slowly that it can process the data fast enough to keep the buffer from overflowing. If the receiver is faster than the sender, one could, with luck, permit an infinitely optimistic window, in which the sender is simply permitted to send full-speed. If the sender is faster than the receiver, however, and the window is too optimistic, then some segments will cause a buffer overflow, and will be discarded. Therefore, the correct strategy to implement an optimistic window is to increase the window size until segments start to be lost. This only works if it is possible to detect that the segment has been lost. In some cases, it is easy to do, because the segment is partially processed inside the receiving host before it is thrown away. In other cases, overflows may actually cause the network interface to be clogged, which will cause the segments to be lost elsewhere in the net. It is inadvisable to attempt an optimistic window strategy unless one is certain that the algorithm can detect the resulting lost segments. However, the increase in throughput which is possible from optimistic windows is quite substantial. Any systems with small buffer space should seriously consider the merit of optimistic windows.
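
A minimal sketch of this probing idea appears below. The step size, the overflow-detection hook, and the decision to fall back to the real buffer size when loss is seen are all assumptions invented for the illustration; as the text stresses, the approach is only safe when lost segments can actually be detected.

    /* Probe upward with the offered window until overflow is detected. */
    #include <stdbool.h>
    #include <stdio.h>

    static unsigned long adjust_offered(unsigned long offered,
                                        unsigned long real_buffer,
                                        bool overflow_seen,
                                        unsigned long step)
    {
        if (overflow_seen)
            return real_buffer;   /* fall back to the space we truly have */
        return offered + step;    /* probe a little further               */
    }

    int main(void)
    {
        unsigned long real_buffer = 2000, offered = 2000, step = 500;
        int round;

        for (round = 1; round <= 6; round++) {
            /* Pretend the receiver starts discarding data once the offered
             * window exceeds 3,500 bytes.                                  */
            bool overflow = (offered > 3500);
            offered = adjust_offered(offered, real_buffer, overflow, step);
            printf("round %d: offered window %lu\n", round, offered);
        }
        return 0;
    }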

The selection of an appropriate window algorithm is actually more complicated than even the above discussion suggests. The following considerations are not presented with the intention that they be incorporated in current implementations of TCP, but as background for the sophisticated designer who is attempting to understand how his TCP will respond to a variety of networks, with different speed and delay characteristics. The particular pattern of windows and acknowledgements sent from receiver to sender influences two characteristics of the data being sent. First, they control the average data rate. Clearly, the average rate of the sender cannot exceed the average rate of the receiver, or long-term buffer overflow will occur. Second, they influence the burstiness of the data coming from the sender. Burstiness has both advantages and disadvantages. The advantage of burstiness is that it reduces the CPU processing necessary to send the data. This follows from the observed fact, especially on large machines, that most of the cost of sending a segment is not the TCP or IP processing, but the scheduling overhead of getting started.

On the other hand, the disadvantage of burstiness is that it may cause buffers to overflow, either in the eventual recipient, which was discussed above, or in an intermediate gateway, a problem ignored in this paper. The algorithms described above attempt to strike a balance between excessive burstiness, which in extreme cases can cause delays because a burst is not requested soon enough, and excessive fragmentation of the data stream into small segments, which we identified as Silly Window Syndrome.

Under conditions of extreme delay in the network, none of the algorithms described above will achieve adequate throughput. Conservative window algorithms have a predictable throughput limit, which is one windowfull per roundtrip delay.

Attempts to solve this by optimistic window strategies may cause buffer overflows due to the bursty nature of the arriving data. A very sophisticated way to solve this is for the receiver, having measured by some means the roundtrip delay and intersegment arrival rate of the actual connection, to open its window, not in one optimistic increment of gigantic proportion, but in a number of smaller optimistic increments, which have been carefully spaced using a timer so that the resulting smaller bursts which arrive are each sufficiently small to fit into the existing buffers. One could visualize this as a number of requests flowing backwards through the net which trigger in return a number of bursts which flow back spaced evenly from the sender to the receiver. The overall result is that the receiver uses the window mechanism to control the burstiness of the arrivals, and the average rate.
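
The arithmetic behind such pacing might look like the rough sketch below. Every quantity and name is an assumption made only to illustrate spacing the increments; as the next paragraph notes, no TCP is known to implement this.

    /* Open the window one bufferful at a time, spaced so each burst can
     * drain before the next one arrives.                                 */
    #include <stdio.h>

    int main(void)
    {
        double target_window = 16000.0;  /* desired optimistic window, bytes  */
        double buffer_bytes  = 2000.0;   /* what the receiver can really hold */
        double drain_rate    = 10000.0;  /* bytes per second it can process   */

        int steps = (int)(target_window / buffer_bytes);
        double spacing = buffer_bytes / drain_rate;   /* seconds per increment */

        printf("%d increments of %.0f bytes, one every %.2f s\n",
               steps, buffer_bytes, spacing);
        return 0;
    }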

To my knowledge, no such strategy has been implemented in any TCP. First, we do not normally have delays high enough to require this kind of treatment. Second, the strategy described above is probably not stable unless it is very carefully balanced. Just as buses on a single bus route tend to bunch up, bursts which start out equally spaced could well end up piling into each other, and forming the single large burst which the receiver was hoping to avoid. It is important to understand this extreme case, however, in order to understand the limits beyond which TCP, as normally implemented, with either conservative or simple optimistic windows, can be expected to deliver throughput which is a reasonable percentage of the actual network capacity.

7.  Conclusions

This paper describes three simple algorithms for performance enhancement in TCP, one at the sender end and two at the receiver. The sender algorithm is to refrain from sending if the usable window is smaller than 25 percent of the offered window. The receiver algorithms are first, to artificially reduce the offered window when processing new data if the resulting reduction does not represent more than some fraction, say 50 percent, of the actual space available, and second, to refrain from sending an acknowledgment at all if two simple conditions hold.

Either of these algorithms will prevent the worst aspects of Silly Window Syndrome, and when these algorithms are used together, they will produce substantial improvement in CPU utilization, by eliminating the processing of excess acknowledgements.

Preliminary experiments with these algorithms suggest that they work, and work very well. Both the sender and receiver algorithms have been shown to eliminate SWS, even when talking to fairly silly algorithms at the other end. The Multics mailer, in particular, had suffered substantial attacks of SWS while sending large mail to a number of hosts. We believe that implementation of the sender side algorithm has eliminated every known case of SWS detected in our mailer. Implementation of the receiver side algorithm produced substantial improvements of CPU time when Multics was the sending system. Multics is a typical large operating system, with scheduling costs which are large compared to the actual processing time for protocol handlers.

Tests were done sending from Multics to a host which implemented the SWS suppression algorithm, and which could either refrain or not from sending acknowledgements on each segment. As predicted, suppressing the return acknowledgements did not influence the throughput for large data transfer at all, since the throttling effect was elsewhere. However, the CPU time required to process the data at the Multics end was cut by a factor of four. (In this experiment, the bursts of data which were being sent were approximately eight segments. Thus, the number of acknowledgements in the two experiments differed by a factor of eight.)

An important consideration in evaluating these algorithms is that they must not cause the protocol implementations to deadlock. All of the recommendations in this document have the characteristic that they suggest one refrain from doing something even though the protocol specification permits one to do it. The possibility exists that if one refrains from doing something now one may never get to do it later, and both ends will halt, even though it would appear superficially that the transaction can continue.

Formally, the idea that things continue to work is referred to as "liveness". One of the defects of ad hoc solutions to performance problems is the possibility that two different approaches will interact to prevent liveness. It is believed that the algorithms described in this paper are always live, and that is one of the reasons why there is a strong advantage in uniform use of this particular proposal, except in cases where it is explicitly demonstrated not to work.

The argument for liveness in these solutions proceeds as follows. First, the sender algorithm can only be stopped by one thing, a refusal of the receiver to acknowledge sent data. As long as the receiver continues to acknowledge data, the ratio of usable window to offered window will approach one, and eventually the sender must continue to send. However, notice that the receiver algorithm we have advocated involves refraining from acknowledging. Therefore, we certainly do have a situation where improper operation of this algorithm can prevent liveness.

What we must show is that the receiver of the data, if it chooses to refrain from acknowledging, will do so only for a short time, and not forever. The design of the algorithm described above was intended to achieve precisely this goal: whenever the receiver of data refrained from sending an acknowledgement, it was required to set a timer. The only event that was permitted to clear that timer was the receipt of another segment, which essentially reset the timer, and started it going again. Thus, an acknowledgement will be sent as soon as data stops arriving. This has precisely the effect desired: if the data flow appears to be disrupted for any reason, the receiver responds by sending an up-to-date acknowledgement. In fact, the receiver algorithm is designed to be more robust than this, for transmission of an acknowledgment is triggered by two events, either a cessation of data or a reduction in the amount of offered window to 50 percent of the actual value. This is the condition which will normally trigger the transmission of this acknowledgement.

APPENDIX A

Dynamic Calculation of Acknowledgement Delay

The text suggested that when setting a timer to postpone the sending of an acknowledgement, a fixed interval of 200 to 300 milliseconds would work properly in practice. This has not been verified over a wide variety of network delays, and clearly if there is a very slow net which stretches out the intersegment arrival time, a fixed interval will fail. In a sophisticated TCP, which is expected to adjust dynamically (rather than manually) to changing network conditions, it would be appropriate to measure this interval and respond dynamically. The following algorithm, which has been relegated to an appendix because it has not been tested, seems sensible. Whenever a segment arrives which does not have the push bit on in it, start a timer, which runs until the next segment arrives. Average these interarrival intervals, using an exponential decay smoothing function tuned to take into account perhaps the last ten or twenty segments that have come in. Occasionally, there will be a long interarrival period, even for a segment which does not terminate a piece of data being pushed, perhaps because a window has gone to zero or some glitch in the sender or the network has held up the data. Therefore, examine each interarrival interval, and discard it from the smoothing algorithm if it exceeds the current estimate by some amount, perhaps a ratio of two or four times. By rejecting the larger intersegment arrival intervals, one should obtain a smoothed estimate of the interarrival of segments inside a burst.

The number need not be exact, since the timer which triggers acknowledgement can add a fairly generous fudge factor to this without causing trouble with the sender's estimate of the retransmission interval, so long as the fudge factor is constant.
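
An untested sketch of this procedure follows. The smoothing weight (covering very roughly the last ten to twenty segments), the rejection ratio, and the sample values are illustrative choices, not recommendations.

    /* Smoothed intersegment arrival estimate with rejection of glitches. */
    #include <stdio.h>

    #define ALPHA         0.1    /* weight given to each new sample          */
    #define REJECT_RATIO  3.0    /* discard samples this far above estimate  */

    static double smoothed = 0.0;   /* current interarrival estimate, ms     */

    static void record_interarrival(double interval_ms)
    {
        if (smoothed == 0.0) {
            smoothed = interval_ms;                 /* first sample           */
            return;
        }
        if (interval_ms > REJECT_RATIO * smoothed)  /* closed window, glitch  */
            return;
        smoothed = (1.0 - ALPHA) * smoothed + ALPHA * interval_ms;
    }

    int main(void)
    {
        /* A burst of closely spaced segments with one long gap; the gap is
         * rejected rather than inflating the estimate.                     */
        double samples[] = { 40, 45, 42, 38, 900, 41, 44, 39 };
        int i;

        for (i = 0; i < 8; i++) {
            record_interarrival(samples[i]);
            printf("sample %5.0f ms -> estimate %.1f ms\n",
                   samples[i], smoothed);
        }
        return 0;
    }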
