Network Working Group                                        D.L. Mills
Request for Comments: 889                                 December 1983

                       Internet Delay Experiments

This memo reports on some measurement experiments and suggests some possible improvements to the TCP retransmission timeout calculation. This memo is both a status report on the measurements and advice to implementers of TCP.

1. Introduction

This memorandum describes two series of experiments designed to explore the transmission characteristics of the Internet system. One series of experiments was designed to determine the network delays with respect to packet length, while the other was designed to assess the effectiveness of the TCP retransmission-timeout algorithm specified in the standards documents. Both sets of experiments were conducted during the October - November 1983 time frame and used many hosts distributed throughout the Internet system.

The objectives of these experiments were first to accumulate experimental data on actual network paths that could be used as a benchmark of Internet system performance, and second to apply these data to refine individual TCP implementations and improve their performance.

The experiments were done using a specially instrumented measurement host called a Fuzzball, which consists of an LSI-11 running IP/TCP and various application-layer protocols, including TELNET, FTP and SMTP mail. Among the various measurement packages is the original PING (Packet InterNet Groper) program, used over the last six years for numerous tests and measurements of the Internet system and its client nets. This program contains facilities to send various kinds of probe packets, including ICMP Echo messages, process the reply and record elapsed times and other information in a data file, as well as produce real-time snapshot histograms and traces.

Following an experiment run, the data collected in the file were reduced by another set of programs and plotted on a Peritek bit-map display with color monitor.
The plots have been found invaluable in the identification and understanding of the causes of network glitches and other "zoo" phenomena. Finally, summary data were extracted and presented in this memorandum. The raw data files, including bit-map image files of the various plots, are available to other experimenters upon request.

The Fuzzballs and their local-net architecture, called DCN, have about two-dozen clones scattered worldwide, including one (DCN1) at the Linkabit Corporation offices in McLean, Virginia, and another at the Norwegian Telecommunications Administration (NTA) near Oslo, Norway. The DCN1 Fuzzball is connected to the ARPANET at the Mitre IMP by means of 1822 Error Control Units operating over a 56-Kbps line. The NTA Fuzzball is connected to the NTARE Gateway by an 1822 interface and then via VDH/HAP operating over a 9.6-Kbps line to SATNET at the Tanum (Sweden) SIMP. For most experiments described below, these details of the local connectivity can be ignored, since only relatively small delays are involved.
The remote test hosts were selected to represent canonical paths in the Internet system and were scattered all over the world. They included some on the ARPANET, MILNET, MINET, SATNET, TELENET and numerous local nets reachable via these long-haul nets. As an example of the richness of the Internet system connectivity and the experimental data base, data are included for three different paths from the ARPANET-based measurement host to London hosts, two via different satellite links and one via an undersea cable.

2. Packet Length Versus Delay

This set of experiments was designed to determine whether delays across the Internet are significantly influenced by packet length. In cases where the intrinsic propagation delays are high relative to the time to transmit an individual packet, one would expect that delays would not be strongly affected by packet length. This is the case with satellite nets, including SATNET and WIDEBAND, but also with terrestrial nets where the degree of traffic aggregation is high, so that the measured traffic is a small proportion of the total traffic on the path. However, in cases where the intrinsic propagation delays are low and the measured traffic represents the bulk of the traffic on the path, quite the opposite would be expected.

The objective of the experiments was to assess the degree to which TCP performance could be improved by refining the retransmission-timeout algorithm to include a dependency on packet length. Another objective was to determine the nature of the delay characteristic versus packet length on tandem paths spanning networks of widely varying architectures, including local nets, terrestrial long-haul nets and satellite nets.

2.1. Experiment Design

There were two sets of experiments to measure delays as a function of packet length. One of these was based at DCN1, while the other was based at NTA.
All experiments used ICMP Echo/Reply messages with embedded timestamps. A cycle consisted of sending an ICMP Echo message of specified length, waiting for the corresponding ICMP Reply message to come back and recording the elapsed time (normalized to one-way delay). An experiment run, resulting in one line of the table below, consisted of 512 of these volleys.

The length of each ICMP message was determined by a random-number generator uniformly distributed between zero and 256. Lengths less than 40 were rounded up to 40, which is the minimum datagram size for an ICMP message containing timestamps and also happens to be the minimum TCP segment size. The maximum length was chosen to avoid complications due to fragmentation and reassembly, since ICMP messages are not ordinarily fragmented or reassembled by the gateways.

The data collected were first plotted as a scatter diagram on a color bit-map display. For all paths involving the ARPANET, this immediately revealed two distinct characteristics, one for short (single-packet) messages less than 126 octets in length and the other for long (multi-packet) messages
longer than this. Linear regression lines were then fitted to each characteristic, with the results shown in the following table. (Only one characteristic was assumed for ARPANET-exclusive paths.) The table shows for each host the delays, in milliseconds, for each type of message, along with a rate computed on the basis of these delays. The "Host ID" column designates the host at the remote end of the path, with a letter suffix used when necessary to identify a particular run.
Host     Single-packet    Rate   Multi-packet     Rate   Comments
ID         40     125    (bps)     125     256   (bps)
---------------------------------------------------------------------------

DCN1 to nearby local-net hosts (calibration)

DCN5        9      13   366422                           DMA 1822
DCN8       14      20   268017                           Ethernet
IMP17      22      60    45228                           56K 1822/ECU
FORD1      93     274     9540                           9600 DDCMP base
UMD1      102     473     4663                           4800 synch
DCN6      188     550     4782                           4800 DDCMP
FACC      243     770     3282                           9600/4800 DDCMP
FOE       608    1917     1320                           9600/14.4K stat mux

DCN1 to ARPANET hosts and local nets

MILARP     61     105    15358     133     171   27769   MILNET gateway
ISID-L    166     263     6989     403     472   15029   low-traffic period
SCORE     184     318     5088     541     608   15745   low-traffic period
RVAX      231     398     4061     651     740   11781   Purdue local net
AJAX      322     578     2664     944    1081    7681   MIT local net
ISID-H    333     520     3643     715     889    6029   high-traffic period
BERK      336     967     1078    1188    1403    4879   UC Berkeley
WASH      498     776     2441    1256    1348   11379   U Washington

DCN1 to MILNET/MINET hosts and local nets

ISIA-L    460     563     6633    1049    1140   11489   low-traffic period
ISIA-H    564     841     2447    1275    1635    2910   high-traffic period
BRL       560     973     1645    1605    1825    4768   BRL local net
LON       585     835     2724    1775    1998    4696   MINET host (London)
HAWAII    679     980     2257    1817    1931    9238   a long way off
OFFICE3   762    1249     1396    2283    2414    8004   heavily loaded host
KOREA     897    1294     1712    2717    2770   19652   a long, long way off

DCN1 to TELENET hosts via ARPANET

RICE     1456    2358      754    3086    3543    2297   via VAN gateway

DCN1 to SATNET hosts and local nets via ARPANET

UCL      1089    1240     4514    1426    1548    8558   UCL zoo
NTA-L    1132    1417     2382    1524    1838    3339   low-traffic period
NTA-H    1247    1504     2640    1681    1811    8078   high-traffic period

NTA to SATNET hosts

TANUM     107     368     6625                           9600 bps Tanum line
ETAM      964    1274     5576                           Etam channel echo
GOONY     972    1256     6082                           Goonhilly channel echo
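As an illustration of the fitting procedure just described, the sketch below (not part of the original measurement apparatus) fits least-squares lines to 512 synthetic (length, delay) volleys, split at the 126-octet single/multi-packet boundary named in the text. The sample data and their coefficients are invented for demonstration; only the split point, the 40-octet minimum and the least-squares fit come from the text.

```python
# Illustrative sketch: fitting the two linear regression
# characteristics (delay vs. length) described above.
import random

def fit_line(samples):
    """Ordinary least-squares fit of delay = a + b*length."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    b = (sum((x - mx) * (y - my) for x, y in samples)
         / sum((x - mx) ** 2 for x, _ in samples))
    a = my - b * mx
    return a, b

random.seed(1)
# Synthetic (length, one-way delay in ms) volleys: a short-message
# characteristic below 126 octets and a long-message one above it.
volleys = []
for _ in range(512):
    length = max(40, random.randint(0, 256))  # lengths < 40 rounded up
    if length < 126:
        delay = 60 + 1.2 * length + random.gauss(0, 5)
    else:
        delay = 120 + 0.4 * length + random.gauss(0, 5)
    volleys.append((length, delay))

short = [v for v in volleys if v[0] < 126]   # single-packet messages
long_ = [v for v in volleys if v[0] >= 126]  # multi-packet messages
a1, b1 = fit_line(short)
a2, b2 = fit_line(long_)
print("single-packet: delay = %.1f + %.2f*len" % (a1, b1))
print("multi-packet:  delay = %.1f + %.2f*len" % (a2, b2))
```

The flatter multi-packet slope recovered by the fit corresponds to the higher effective rate reported in the multi-packet columns of the table.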
2.2. Analysis of Results

The data clearly show a strong correlation between delay and length, with the longest packets showing delays two to three times those of the shortest. On paths via ARPANET clones the delay characteristic shows a stronger correlation with length for single-packet messages than for multi-packet messages, which is consistent with a design that favors low delays for short messages and high throughputs for longer ones.

Most of the runs were made during off-peak hours. In the few cases where runs were made for a particular host during both on-peak and off-peak hours, comparison shows a greater dependency on packet length than on traffic shift.

TCP implementors should be advised that some dependency on packet length may have to be built into the retransmission-timeout estimation algorithm to ensure good performance over lossy nets like SATNET. They should also be advised that some Internet paths may require stupendous timeout intervals ranging to many seconds for the net alone, not to mention additional delays on host-system queues.

I call to your attention the fact that the delays (at least for the larger packets) from ARPANET hosts (e.g. DCN1) to MILNET hosts (e.g. ISIA) are in the same ballpark as the delays to SATNET hosts (e.g. UCL)! I have also observed that the packet-loss rates on the MILNET path are at present not negligible (18 in 512 for ISIA-2). Presumably, the loss is in the gateways; however, there may well be a host or two out there swamping the gateways with retransmitted data and which have a funny idea of the "normal" timeout interval. The recent discovery of a bug in the TOPS-20 TCP implementation, where spurious ACKs were generated at an alarming rate, would seem to confirm that suspicion.

3. Retransmission-Timeout Algorithm

One of the basic features of TCP that allows it to be used on paths spanning many nets of widely varying delay and packet-loss characteristics is the retransmission-timeout algorithm, sometimes known as the "RSRE Algorithm" after its original designers. The algorithm operates by recording the time and initial sequence number when a segment is transmitted, then computing the elapsed time for that sequence number to be acknowledged. There are various degrees of sophistication in the implementation of the algorithm, ranging from allowing only one such computation to be in progress at a time to allowing one for each segment outstanding on the connection.

The retransmission-timeout algorithm is basically an estimation process. It maintains an estimate of the current roundtrip delay time and updates it as new delay samples are computed. The algorithm smooths these samples and then establishes a timeout which, if exceeded, causes a retransmission. The selection of the parameters of this algorithm is vitally important in order to provide effective data transmission and avoid abuse of the Internet system by excessive retransmissions. I have long been suspicious of the parameters
suggested in the specification and used in some implementations, especially in cases involving long-delay paths over lossy nets. The experiment was designed to simulate the operation of the algorithm using data collected from real paths involving some pretty leaky Internet plumbing.

3.1. Experiment Design

The experiment data base was constructed of well over a hundred runs using ICMP Echo/Reply messages bounced off hosts scattered all over the world. Most runs, including all those summarized here, consisted of 512 echo/reply cycles lasting from several seconds to twenty minutes or so. Other runs designed to detect network glitches lasted several hours. Some runs used packets of constant length, while others used different lengths distributed from 40 to 256 octets. The maximum length was chosen to avoid complications due to fragmentation and reassembly, since ICMP messages are not ordinarily fragmented or reassembled by the gateways.

The object of the experiment was to simulate the packet-delay distribution seen by TCP over the paths measured. Only the network delay is of interest here, not the queueing delays within the hosts themselves, which can be considerable. Also, only a single packet was allowed in flight, so that stress on the network itself was minimal. Some tests were conducted during busy periods of network activity, while others were conducted during quiet hours.

The 512 data points collected during each run were processed by a program which plotted each data point (x,y) on a color bit-map display, where x represents the time since initiation of the experiment and y the measured delay, normalized to the one-way delay. Then, the simulated retransmission-timeout algorithm was run on these data and its computed timeout plotted in the same way. The display immediately reveals how the algorithm behaves in the face of varying traffic loads, network glitches, lost packets and superfluous retransmissions.

Each experiment run also produced summary statistics, which are summarized in the table below.
Each line includes the Host ID, which identifies the run. The suffix -1 indicates 40-octet packets, -2 indicates 256-octet packets and no suffix indicates uniformly distributed lengths between 40 and 256. The Lost Packets columns refer to instances when no ICMP Reply message was received for thirty seconds after transmission of the ICMP Echo message, indicating probable loss of one or both messages. The RTX Packets columns refer to instances when the computed timeout is less than the measured delay, which would result in a superfluous retransmission. For each of these two types of packets the first column indicates the number of instances and the Time column indicates the total accumulated time required for the recovery action.

For reference purposes, the Mean column indicates the computed mean delay of the echo/reply cycles, excluding those cycles involving packet loss, while the CoV column indicates the coefficient of variation. Finally, the Eff
column indicates the efficiency, computed as the ratio of the total time accumulated while sending good data to this time plus the lost-packet and rtx-packet time.

Complete sets of runs were made for each of the hosts in the table below for each of several selections of algorithm parameters. The table itself reflects values, selected as described later, believed to be a good compromise for use on existing paths in the Internet system.
Host      Total   Lost Packets   RTX Packets    Mean    CoV    Eff
ID         Time          Time           Time
-------------------------------------------------------------------

DCN1 to nearby local-net hosts (calibration)

DCN5          5    0      0      0      0        11    .15      1
DCN8          8    0      0      0      0        16    .13      1
IMP17        19    0      0      0      0        38    .33      1
FORD1        86    0      0      1     .2       167    .33    .99
UMD1        135    0      0      2     .5       263    .45    .99
DCN6        177    0      0      0      0       347    .34      1
FACC        368  196  222.1      6    9.2       267    1.1    .37
FOE         670    3    7.5     21   73.3      1150    .69    .87
FOE-1       374    0      0     26   61.9       610    .75    .83
FOE-2      1016    3   16.7     10   47.2      1859    .41    .93

DCN1 to ARPANET hosts and local nets

MILARP       59    0      0      2     .5       115    .39    .99
ISID        163    0      0      1    1.8       316    .47    .98
ISID-1       84    0      0      2      1       163    .18    .98
ISID-2      281    0      0      3     17       516    .91    .93
ISID *      329    0      0      5   12.9       619    .81    .96
SCORE       208    0      0      1     .8       405    .46    .99
RVAX        256    1    1.3      0      0       499    .42    .99
AJAX        365    0      0      0      0       713    .44      1
WASH        494    0      0      2    2.8       960    .39    .99
WASH-1      271    0      0      5      8       514    .34    .97
WASH-2      749    1    9.8      2   17.5      1411     .4    .96
BERK        528   20   50.1      4     35       865   1.13    .83

DCN1 to MILNET/MINET hosts and local nets

ISIA        436    4    7.4      2   15.7       807    .68    .94
ISIA-1      197    0      0      0      0       385    .27      1
ISIA-2      615    0      0      2     15      1172    .36    .97
ISIA *      595   18   54.1      6   33.3       992    .77    .85
BRL         644    1      3      1    1.9      1249    .43    .99
BRL-1       318    0      0      4   13.6       596    .68    .95
BRL-2       962    2    8.4      0      0      1864    .12    .99
LON         677    0      0      3   11.7      1300    .51    .98
LON-1       302    0      0      0      0       589    .06      1
LON-2      1047    0      0      0      0      2044    .03      1
HAWAII      709    4   12.9      3   18.5      1325    .55    .95
OFFICE3     856    3   12.9      3   10.3      1627    .54    .97
OFF3-1      432    2    4.2      2    6.9       823    .31    .97
OFF3-2     1277    7     39      3   41.5      2336    .44    .93
KOREA      1048    3   14.5      2   18.7      1982    .48    .96
KOREA-1     506    4    8.6      1    2.2       967    .18    .97
KOREA-2    1493    6   35.5      2   19.3      2810    .19    .96

DCN1 to TELENET hosts via ARPANET

RICE        677    2    6.8      3   12.1      1286    .41    .97
RICE-1      368    1     .1      3    2.3       715    .11    .99
RICE-2     1002    1    4.4      1    9.5      1930    .19    .98

DCN1 to SATNET hosts and local nets via ARPANET

UCL         689    9   26.8      0      0      1294    .21    .96
UCL-1       623   39   92.8      2    5.3      1025    .32    .84
UCL-2       818    4   13.5      0      0      1571    .15    .98
NTA         779   12   38.7      1    3.7      1438    .24    .94
NTA-1       616   24   56.6      2    5.3      1083    .25    .89
NTA-2       971   19   71.1      0      0      1757     .2    .92

NTA to SATNET hosts and local nets

TANUM       110    3    1.6      0      0       213    .41    .98
GOONY       587   19   44.2      1    2.9      1056    .23    .91
ETAM        608   32   76.3      1    3.1      1032    .29    .86
UCL         612    5   12.6      2    8.5      1154    .24    .96

Note: * indicates randomly distributed packets during periods of high ARPANET activity. The same entry without the * indicates randomly distributed packets during periods of low ARPANET activity.
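The per-run bookkeeping described in section 3.1 can be sketched as follows. This is an illustrative simulation only: the delay trace is synthetic, the thirty-second loss cutoff and the efficiency ratio are taken from the text, the filter constants F1, F2 and G are the compromise values described later in section 3.3, and the exact charging of recovery time (the full cycle delay for a superfluous retransmission) is an assumption.

```python
# Sketch: replaying a delay trace through the simulated
# retransmission-timeout algorithm and accumulating the per-run
# summary statistics (lost packets, RTX packets, efficiency).
import random

F1, F2, G = 15/16, 3/4, 2      # discharge/charge weights, threshold
LOSS_TIMEOUT = 30000           # ms; thirty seconds counts as a loss

random.seed(2)
# Synthetic one-way delays (ms): mostly well-behaved, with an
# occasional "glitch" and an occasional loss (None).
trace = []
for _ in range(512):
    r = random.random()
    if r < 0.01:
        trace.append(None)                        # lost packet
    elif r < 0.03:
        trace.append(random.uniform(3000, 8000))  # glitch
    else:
        trace.append(random.gauss(500, 60))       # normal volley

E = 500.0                      # initial delay estimate (ms)
good = lost_t = rtx_t = 0.0
lost_n = rtx_n = 0
for R in trace:
    if R is None:
        lost_n += 1
        lost_t += LOSS_TIMEOUT
        continue
    if R > G * E:      # computed timeout shorter than actual delay:
        rtx_n += 1     # a superfluous retransmission would occur
        rtx_t += R
    else:
        good += R
    F = F1 if R < E else F2    # two-constant update (section 3.3)
    E = F * E + (1 - F) * R

eff = good / (good + lost_t + rtx_t)
print("lost %d  rtx %d  eff %.2f" % (lost_n, rtx_n, eff))
```

Even a handful of lost packets dominates the lost-time column because each one is charged the full thirty-second cutoff, which is why the efficiency of the lossy SATNET runs in the table suffers so much.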
3.2. Discussion of Results

It is immediately obvious from visual inspection of the bit-map display that the delay distribution is more-or-less Poisson, concentrated about a relatively narrow range, with important exceptions. The exceptions are characterized by occasional spasms where one or more packets can be delayed many times the typical value. Such glitches have been commonly noted before on paths involving ARPANET and SATNET, but the true impact of their occurrence on the timeout algorithm is much greater than I expected. What commonly happens is that the algorithm, when confronted with a short burst of long-delay packets after a relatively long interval of well-mannered behavior, takes much too long to adapt to the spasm, thus inviting many superfluous retransmissions and leading to congestion.

The incidence of long-delay bursts, or glitches, varied widely during the experiments. Some runs were glitch-free, but most had at least one glitch in 512 echo/reply volleys. Glitches did not seem to correlate well with increases in baseline delay, which occur as the result of traffic surges, nor did they correlate well with instances of packet loss. I did not notice any particular periodicity, such as might be expected with regular pinging, for example; however, I did not process the data specially for that.

There was no correction for packet length used in any of these experiments, in spite of the results of the first set of experiments described previously. This may be done in a future set of experiments. The algorithm does cope well in the case of constant-length packets and in the case of randomly distributed packet lengths between 40 and 256 octets, as indicated in the table. Future experiments may involve bursts of short packets followed by bursts of longer ones, so that the speed of adaptation of the algorithm can be directly determined.
One particularly interesting experiment involved the FOE host (FORD-FOE), which is located in London and reached via a 14.4-Kbps undersea cable and statistical multiplexor. The multiplexor introduces a moderate mean delay, but with an extremely large delay dispersion. The specified retransmission-timeout algorithm had a hard time with this circuit, as might be expected; however, with the improvements described below, TCP performance was acceptable. It is unlikely that many instances of such ornery circuits will occur in the Internet system, but it is comforting to know that the algorithm can deal effectively with them.

3.3. Improvements to the Algorithm

The specified retransmission-timeout algorithm, really a first-order linear recursive filter, is characterized by two parameters, a weighting factor F and a threshold factor G. For each measured delay sample R the delay estimator E is updated:

    E = F*E + (1 - F)*R
Then, if an interval equal to G*E expires after transmitting a packet, the packet is retransmitted. The current TCP specification suggests values in the range 0.8 to 0.9 for F and 1.5 to 2.0 for G. These values have been believed reasonable up to now over ARPANET and SATNET paths.

I found that a simple change to the algorithm made a worthwhile change in the efficiency. The change amounts to using two values of F, one (F1) when R < E in the expression above and the other (F2) when R >= E, with F1 > F2. The effect is to make the algorithm more responsive to upward-going trends in delay and less responsive to downward-going trends. After a number of trials I concluded that values of F1 = 15/16 and F2 = 3/4 (with G = 2) gave the best all-around performance. The results on some paths (FOE, ISID, ISIA) were better by some ten percent in efficiency, as compared to the values now used in typical implementations, where F = 7/8 and G = 2. The results on most paths were better by five percent, while on a couple (FACC, UCL) the results were worse by a few percent.

There was no clear-cut gain in fiddling with G. The value G = 2 seemed to represent the best overall compromise. Note that increasing G makes superfluous retransmissions less likely, but increases the total delay when packets are lost. Also, note that increasing F2 too much tends to cause overshoot in the case of network glitches and leads to the same result. The table above was constructed using F1 = 15/16, F2 = 3/4 and G = 2.

Readers familiar with signal-detection theory will recognize my suggestion as analogous to an ordinary peak-detector circuit. F1 represents the discharge time-constant, while F2 represents the charge time-constant. G represents a "squelch" threshold, as used in voice-operated switches, for example. Some wag may even go on to suggest that a network glitch should be called a netspurt.
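The update rule and the suggested two-constant modification can be sketched side by side. The constants are those given in the text; the step-change input (a 500 ms path suddenly jumping to 2000 ms, as during a glitch) is an illustration:

```python
# Sketch: single-constant filter from the specification vs. the
# suggested two-constant ("peak detector") variant, driven by a
# step increase in delay.
def update_single(E, R, F=7/8):
    """Specified first-order filter with one weighting factor."""
    return F * E + (1 - F) * R

def update_dual(E, R, F1=15/16, F2=3/4):
    """Suggested variant: discharge slowly (F1), charge quickly (F2)."""
    F = F1 if R < E else F2
    return F * E + (1 - F) * R

G = 2.0                # threshold factor; timeout = G*E
E1 = E2 = 500.0        # both filters tracking a 500 ms path
for n in range(1, 6):  # delay suddenly steps up to 2000 ms
    E1 = update_single(E1, 2000.0)
    E2 = update_dual(E2, 2000.0)
    print("n=%d  single timeout %5.0f   dual timeout %5.0f"
          % (n, G * E1, G * E2))
```

With these constants the dual filter's timeout G*E climbs past the new 2000 ms delay after two samples, while the single filter needs four, illustrating the faster upward adaptation (and hence fewer superfluous retransmissions) described above.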
Appendix. Index of Test Hosts

Name      Address        NIC Host Name
--------------------------------------

DCN1 to nearby local-net hosts (calibration)

DCN5      128.4.0.5      DCN5
DCN8      128.4.0.8      DCN8
IMP17     10.3.0.17      DCN-GATEWAY
FORD1     128.5.0.1      FORD1
UMD1      128.8.0.1      UMD1
DCN6      128.4.0.6      DCN6
FACC      128.5.32.1     FORD-WDL1
FOE       128.5.0.15     FORD-FOE

DCN1 to ARPANET hosts and local nets

MILARP    10.2.0.28      ARPA-MILNET-GW
ISID      10.0.0.27      USC-ISID
SCORE     10.3.0.11      SU-SCORE
RVAX      128.10.0.2     PURDUE-MORDRED
AJAX      18.10.0.64     MIT-AJAX
WASH      10.0.0.91      WASHINGTON
BERK      10.2.0.78      UCB-VAX

DCN1 to MILNET/MINET hosts and local nets

ISIA      26.3.0.103     USC-ISIA
BRL       192.5.21.6     BRL-VGR
LON       24.0.0.7       MINET-LON-EM
HAWAII    26.1.0.36      HAWAII-EMH
OFFICE3   26.2.0.43      OFFICE-3
KOREA     26.0.0.117     KOREA-EMH

DCN1 to TELENET hosts via ARPANET

RICE      14.0.0.12      RICE

DCN1 to SATNET hosts and local nets via ARPANET

UCL       128.16.9.0     UCL-SAM
NTA       128.39.0.2     NTARE1

NTA to SATNET hosts and local nets

TANUM     4.0.0.64       TANUM-ECHO
GOONY     4.0.0.63       GOONHILLY-ECHO
ETAM      4.0.0.62       ETAM-ECHO