Request for Comments: 713                    Jack Haverty (JFH@MIT-DMS)
NIC #34739                                   Apr 1976

I. ABSTRACT

A mechanism is defined for use by message servers in transferring data between hosts. The mechanism, called the MSDTP, is defined in terms of a model of the process as a translation between two sets of items, the abstract entities such as 'strings' and 'integers', and the formats used to represent such data as a byte stream.

A proposed organization of a general data transfer mechanism is described, and the manner in which the MSDTP would be used in that environment is presented.
II. REFERENCES

Black, Edward H., "The DMS Message Composer", MIT Project MAC, Programming Technology Division Document SYS.16.02.

Burchfiel, Jerry D., Leavitt, Elsie M., Shapiro, Sonya and Strollo, Theodore R., compilers, "Tenex Users' Guide", Bolt Beranek and Newman, Cambridge, Mass., May 1971, revised January 1975, Descriptive sections on the TENEX subsystems: MAILER, p. 116-11; MAILSTAT, p. 118-119; READMAIL, p. 137; and SNDMSG, p. 165-170.

Haverty, Jack, "Communications System Overview", MIT Project MAC, Programming Technology Division Document SYS.16.00.

Haverty, Jack, "Communications System Daemon Manual", MIT Project MAC, Programming Technology Division Document SYS.16.01.

ISI Information Automation Project, "Military Message Processing System Design," Internal Project Documentation (Out of Print), Jan. 1975.

Message Services Committee, "Interim Report", Jan. 28, 1975.

Mooers, Charlotte D., "Mailsys Message System: Manual For Users", Bolt Beranek and Newman, Cambridge, Mass., June 1975 (draft).

Myer, Theodore H., "Notes On The BBN Mail System", Bolt Beranek and Newman, November 8, 1974.

Myer, Theodore H., and Henderson, D. Austin, "Message Transmission Protocol", Network Working Group RFC 680, NIC 32116, April 30, 1975.

Postel, Jon, "The PCPB8 Format", NSW Proposal, June 5, 1975.

Tugender, R., and D. R. Oestreicher, "Basic Functional Capabilities for a Military Message Processing Service," ISI/RR-74-23, May 1975.

Vezza, Al, "Message Services Committee Minority Report", Jan. 1975.
III. OVERVIEW

This document describes a mechanism developed for use by message servers communicating over an eight-bit byte-oriented network connection to move data structures and associated data-typing information. It is presented here in the hope that it may be of use to other projects which need to transfer data structures between dissimilar hosts.

A set of abstract entities called PRIMITIVE ITEMS is enumerated. These are intended to include traditional data types of general utility, such as integers, strings, and arrays.

A mechanism is defined for augmenting the set of abstract data entities handled, to allow the introduction of application-specific data, whose format and semantics are understood by the application programs involved, but which can be transmitted using common coding facilities. An example might be a data structure called a 'file specification', or a 'date'. Abstract data entities defined using this mechanism will be termed SEMANTIC ITEMS, since they are typically used to carry data having semantic content in the application involved.

Semantic and primitive items are collectively referred to simply as ITEMS.

The protocol next involves the definition of the format of the byte stream used to convey items from machine to machine. These encodings are described in terms of OBJECTS, which are the physical byte streams transmitted.

To complete the protocol, the rules for translating between objects and items are presented as each object is defined.

An item is transmitted by being translated into an object which is transmitted over the connection as a stream of bytes to the receiver, and reconstructed there as an item. The protocol mechanism may thus be viewed as a simple translator. It enumerates a set of abstract entities, the items, which are known to programmers, a set of entities in byte-stream format, the objects, and the translation rules for conversion between the sets.
A site implementing the MSDTP would typically provide a facility to convert between objects and the local representation of the various items handled. Applications using the MSDTP define their interactions using items, without regard to the actual formats in which such items are represented at various machines. This permits programs to handle higher-level concepts such as a character string, without concern for its numerous representational formats. Such detail is handled by the MSDTP.
Finally, a discussion of a general data transfer mechanism for communication between programs is presented, and the manner in which the particular byte-oriented protocol defined herein would be used in that environment is discussed.

Terminology, as introduced, is defined and highlighted by capitalizing.

IV. PRIMITIVE DATA ITEMS

The primitive data items include a variety of traditional, well-understood types, such as integers and strings. Primitive data items will be presented using mnemonic names preceded by the character pair "p-", to serve as a reminder that the named item is primitive.

These items may be represented in various computer systems in whatever fashion their programmers desire.

IV.1 -- Set Of Primitive Items

The set of primitive items defined includes p-INT, p-STRING, p-STRUC, p-BITS, p-CHAR, p-BOOL, p-EMPTY, and p-XTRA.

Since the protocol was developed primarily for use in message services, items such as p-FLOAT, which were unnecessary there, are not included. Additional items may be easily added as necessary.

A p-INT performs the traditional role of representing integer numbers. A p-BITS (BIT Stream) item represents a bit stream. The two possible p-BOOL (BOOLean) items are used to represent the logical values of *TRUE* and *FALSE*. The single p-EMPTY item is used, for example, to indicate that a given field of a message is empty. It is provided to act as a place-holder, representing 'no data', and appears as *EMPTY*.

The p-STRUC (STRUCture) item is used to group together a collection of items as a single value, maintaining the ordering of the elements, such as a p-STRUC of p-INTs.

A p-CHAR is a single character. The most common occurrence of character data, however, will be as p-STRINGs. A p-STRING should be considered to be a synonym for a p-STRUC containing only p-CHARs. This concept is important for generality and consistency, especially when considering definitions of permissible operations on structures, such as extracting subsequences of elements, etc.
Four p-XTRA items, which can be transmitted in a single byte, are made available for higher level protocols to use when a frequently used datum is handled which can be represented just by its name. An example would be an acknowledgment between two servers. Using p-XTRAs to represent such data permits them to be handled in a single byte. There are four possible p-XTRA items, termed *XTRA0*, *XTRA1*, *XTRA2*, and *XTRA3*. These may be assigned meanings by user protocols as desired.

IV.2 -- Printing Conventions

The following printing conventions are introduced to facilitate discussion of the primitive items.

When a specific instance of a primitive data item is presented, it will be shown in a traditional representation for that kind of data. For example, p-INTs are shown as sequences of digits, e.g. 100, and p-STRINGs as sequences of characters enclosed in double-quote characters, for example "ABCDEF".

As shown above, the two possible p-BOOL items are shown as *TRUE* or *FALSE*. The item p-EMPTY appears as *EMPTY*. A bit stream, i.e. p-BITS, appears as a stream of 1s and 0s enclosed in asterisks, for example *100101001*. A p-CHAR will be presented as the character enclosed in single quote characters, e.g., 'A'.

P-STRUCs are printed as the representations of their elements, enclosed in parentheses, for example (1 2 3 4) or ("XYZ" "ABC" 1 2) or ((1 2 3) "A" "B"). Note that because p-STRINGs are simply a class of p-STRUCs assigned a special name and printing format for brevity and convenience, the items "ABC" and ('A' 'B' 'C') are identical, and the latter format should not be used.

To present a generic p-STRUC, as in specifying formats of the contents of something, the items are presented as a mnemonic name, optionally followed by a colon and the permissible types of values for that datum. When one of several items may appear as the value for some component, the permissible ones appear separated by vertical-bar characters.
For example, p-INT|p-STRING represents a single item, which may be either a p-INT or a p-STRING.

To represent a succession of items, the Kleene star convention is used. The specification p-INT[*] represents any number of p-INTs. Similarly, p-INT[3,5] represents from 3 to 5 p-INTs, while p-INT[*,5] specifies up to 5 and p-INT[5,*] specifies at least 5 p-INTs.
For example, a p-STRUC which is used to carry names and numbers might be specified as follows.

   (name:p-STRING number:p-INT)

In discussing items in general, when a specific data value is not intended, the name and types representation may be used, e.g., offset:p-INT to discuss an 'offset' which has a numeric value.

V. SEMANTIC ITEM MECHANISM

The semantic item mechanism provides a means for program designers to use a variety of application-specific data items.

This mechanism is implemented using a special tagged structure to carry the data type information as well as the actual components of the particular semantic item. For discussion purposes, such a special p-STRUC will be termed a p-EDT (Extended Data Type).

When p-EDTs are transferred, their identity as a p-EDT is maintained, so that an applications program receives the corresponding semantic item instead of a simple p-STRUC. A p-EDT is identical to a p-STRUC in all other respects.

V.1 -- Format of p-EDTs

A prototypical p-EDT follows. It is printed as if it were a normal p-STRUC. Since p-EDTs are converted to semantic items for presentation to the user, a p-EDT will never be used except in this protocol definition.

   (type:p-INT|p-STRING version:p-INT com1:any com2:any ...)

The first element, the 'type', is generally a p-INT, and is used to identify the particular type of semantic item. Types are assigned numeric codes in a controlled fashion. The type may alternatively be specified by a p-STRING, to permit development of new data types for possible later assignment of codes. Each type has an equivalent p-STRING name. These may be used interchangeably as 'type' elements, primarily to maintain upward compatibility.

The second element of a p-EDT is always a p-INT, the 'version', and specifies the exact format of the particular datum. A semantic item may undergo several revisions of its internal structure, which would be evident through assigning different versions to each revision.
Successive components, the 'com' elements, if any, carry the actual data of the semantic item. As each semantic item is defined, conventions on permissible values and interpretation of these components are presented. Such definitions may use any types of items to specify the format of the semantic item. Use of lower level concepts, such as objects, in these definitions is prohibited.

Semantic items will be printed as the mnemonic for the type involved, preceded by the character pair "s-", to signify that the data item is handled by this mechanism.

V.2 -- Printing Conventions

A semantic item is represented as if it were a p-STRUC containing only the components, if any, but preceded by the semantic type name and a # character. The version number is assumed to be 1 if unspecified. For later versions, the version number is attached to the type name, as in, for example, FILE-2 to represent version 2 of the FILE data type.

For example, a semantic item called a 'file specification' might be defined, containing two components, a host number and pathname. A specific instance of such an item might appear as #FILE(69 "DIRECTORY.NAME-OF-FILE"), while a generic s-FILE might be presented as the following.

   #FILE(host:p-INT|p-STRING pathname:p-STRING)

Here 'host' is the first component of the item, which may be either a p-INT or p-STRING, and 'pathname' is the second component, which must be a p-STRING. The full definition would present interpretation rules for these components.

VI. ENCODING OBJECTS

This section presents the set of objects which are used to represent items as byte streams for inter-server transmission. Objects will be presented using mnemonic type-names preceded by the character pair "b-", indicating their existence only as byte streams.

All servers are required to be capable of decoding the entire set of objects. Servers are not required to transmit certain objects which are available to improve channel efficiency.
The encodings are designed to facilitate programming and efficiency of the receiving decoder. In all cases, the type and length in bytes of an object are supplied as the first information sent. This characteristic is important for ease of implementation. The type information permits a decoder to be constructed in a modular fashion. The most important advantage of including size information is that the receiver always knows how many bytes it must read to discover what to do next, and knows when each object terminates. This requirement avoids many potential problems with timing and synchronization of processes.

Two varieties of objects are defined. The first will be called ATOMIC, and includes objects used to efficiently encode the most common data. The second variety is termed NON-ATOMIC, and is used to encode larger or less common items.

In all cases, a data object begins with a single byte, which will be termed the TYPE-BYTE, a field of which contains the type code of the object. The following bytes, if any, are interpreted according to the type involved.

VI.1 -- Presentation Conventions

In discussing formats of bytes, the following conventions will be employed. The individual bits of a byte will be referenced by using capital letters from A to H, where A signifies the highest order bit, and H the lowest. The entire eight bit value, for example, could be referred to as ABCDEFGH. Similarly, subfields of the byte will be identified by such sequences. The CDEF field specifies the middle four bits of a byte.

In referring to values of fields, binary format will be used, and small letters near the end of the alphabet will be used to identify particular bits for discussion. For example, we might say that the BCD field of a byte contains a specifier for some type, and define its value to be BCD=11z.
In discussions of the specifier usage, we could refer to the cases where z=1 and where z=0, as shorthand notation to identify BCD=111 and BCD=110, respectively.

VI.2 -- Type-Byte Bit Assignment

To assist in understanding the assignment of the various type-byte values, the table and graph below are included, showing representations of the eight bits.
   0XXXXXXX -- CHAR7    (CHARacter, 7 bit)
   10XXXXXX -- SINTEGER (Small INTEGER)
   110XXXXX -- NON-ATOM (NON-ATOMic objects)
   11100XXX -- LINTEGER (Large INTEGER)
   11101XXX -- reserved
   11110XXX -- SBITSTR  (Short BIT STReam)
   111110XX -- XTRA     (eXTRA single-byte objects)
   1111110X -- BOOL     (BOOLean)
   11111110 -- EMPTY    (EMPTY data item)
   11111111 -- PADDING  (unused byte)

In each case, the bits identified by X's are used to contain information specific to the type involved. These are explained when each type is defined.

An equivalent tree representation follows, for those who prefer it.

   start with high order bit
     |
     o--1--o--1--o--1--o--1--o--1--o--1--o--1--o--1--X PADDING
     |     |     |     |     |     |     |     |
    0|    0|    0|    0|    0|    0|    0|    0|
     |     |     |     |     |     |     |     |
     X     |     X     |     X     |     X     X
   CHAR7   | NON-ATOM  |  SBITSTR  |   BOOL  EMPTY
    (7)    |    (5)    |    (3)    |    (1)
           X           |           X
       SINTEGER        |          XTRA
          (6)          |          (2)
                       o--1--X reserved (3)
                       |
                      0|
                       |
                       X
                   LINTEGER
                      (3)

             Type-Byte Bit Assignment Scheme

This picture is interpreted by entering at the top, and taking the appropriate branch at each node to correspond to the next bit of the type-byte, as it is scanned from left to right. When a type is assigned, the branch terminates with an 'X' and the name of the type of the object, with the number of remaining bits in parentheses. The individual object definitions specify how these bits are used for that particular type.

VI.3 -- Atomic Objects

Atomic objects are identified by specific patterns in a type-byte. Receiving servers must be capable of recognizing
and handling all atomic types, since the size of the object is not explicitly present in a uniform fashion.

   ================================
   | Atomic Object: b-CHAR7       |
   ================================

The b-CHAR7 (CHARacter 7 bit) object is introduced to handle transmission of characters, in 7-bit ASCII format. Since the vast majority of message-related data involves such objects, they are designed to be very efficient in transmission. Other formats, such as eight bit values, can be introduced as non-atomic objects. The format of a b-CHAR7 follows:

   A=0              identifying the b-CHAR7 data type
   BCDEFGH=tuvwxyz  containing the character code

The tuvwxyz bits contain the ASCII code of the character. For example, transmission of a 'space' (ASCII code 32, 40 octal) would be accomplished by the following byte.

   00100000
   ABCDEFGH

A=0 to identify this byte as a b-CHAR7. The remaining bits contain the 7 bit code, octal 40, for space.

A b-CHAR7 standing alone is presented as a p-CHAR. Such occurrences will probably be rare if they are used at all. The most common use of b-CHAR7s is as elements of b-USTRUCs used to transmit p-STRINGs, as explained later.

   ================================
   | Atomic Object: b-SINTEGER    |
   ================================

The b-SINTEGER (Small INTEGER) object is used to transmit very small non-negative integers, with values from 0 to 63. It always translates to a p-INT, and any p-INT between 0 and 63 may be encoded as a b-SINTEGER for transmission. The format of a b-SINTEGER follows.

   AB=10          identifying the object as a b-SINTEGER
   CDEFGH=uvwxyz  containing the actual number

For example, to transmit the integer 10 (12 octal), the following byte would be transmitted:

   10001010
   ABCDEFGH
AB=10 to specify a b-SINTEGER. The remaining six bits contain the number 10 expressed in binary.

   ================================
   | Atomic Object: b-LINTEGER    |
   ================================

The b-LINTEGER (Large INTEGER) object is used to transmit p-INTs to any precision up to 64 bits. It is always translated as a p-INT. Sending servers are permitted to choose either b-LINTEGER or b-SINTEGER format for transmission of numbers, as appropriate. When possible, b-SINTEGERs can be used for better channel efficiency. The format of a b-LINTEGER follows:

   ABCDE=11100  specifying that this is a b-LINTEGER
   FGH=xyz      containing a count of the number of bytes to follow

The xyz bits are interpreted as a number of bytes to follow which contain the actual binary code of the integer in 2's complement format. Since a zero-byte integer is disallowed, the pattern xyz=000 is interpreted as 1000, specifying that 8 bytes follow. The number is transmitted with high-order bits first. This format permits transmission of integers as large as 64 bits in magnitude.

For example, if the number 4096 (10000 octal) is to be transmitted, the following sequence of bytes would be sent:

   11100010 00010000 00000000
   ABCDEFGH ---actual data---

ABCDE=11100, identifying this as a b-LINTEGER, and FGH=010, specifying that 2 bytes follow, containing the actual binary number. The high-order bit of the first data byte is 0, so the 2's complement value is positive.

   ================================
   | Atomic Object: b-SBITSTR     |
   ================================

The b-SBITSTR (Short BIT STReam) object is used to transmit a p-BITS of length 63 or less. For longer bit streams, the non-atomic object b-LBITSTR may be used. The format of a b-SBITSTR follows.

   ABCDE=11110  specifying the type as b-SBITSTR
   FGH=xyz      specifying the number of bytes following
The xyz value specifies the number of additional bytes to be read to obtain the bit stream values. As in the case of b-LINTEGER, the value xyz=000 is interpreted as 1000, specifying that 8 bytes follow.

To avoid requiring specification of exactly the number of bits contained, the following convention is used. The first data byte is scanned from left to right until the first 1 bit is encountered. The bit stream is defined to begin with the immediately following bit, and run through the last bit of the last byte read. In other words, the bit stream is 'right-adjusted' in the collected bytes, with its left end delimited by the first 'on' bit.

For example, to send the bit stream *001010011* (9 bits), the following bytes are transmitted.

   11110010 00000010 01010011
   ABCDEhij klmnopqr stuvwxyz

The hij=010 value specifies that two bytes follow. The q bit, which is the first 1 bit encountered, identifies the start of the bit stream as being the r bit. The rstuvwxyz bits are the bit stream being handled.

   ================================
   | Atomic Object: b-BOOL        |
   ================================

The b-BOOL (BOOLean) object is used to transmit p-BOOLs. The format of b-BOOL objects follows.

   ABCDEFG=1111110  specifying the type as b-BOOL
   H=z              specifying the value

The two possible translations of a b-BOOL are *FALSE* and *TRUE*.

   11111100  represents *FALSE*
   11111101  represents *TRUE*
   ABCDEFGz

If z=0, the value is FALSE, otherwise TRUE.

   ================================
   | Atomic Object: b-EMPTY       |
   ================================

The b-EMPTY object type is used to transmit a 'null' object, i.e. an *EMPTY*. The format of a b-EMPTY follows.

   ABCDEFGH=11111110  specifying *EMPTY*
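The integer and short bit stream encodings above can be illustrated with a short Python sketch. It is not part of the protocol definition; the function names (encode_int, decode_int, encode_sbitstr, decode_sbitstr) and the use of a string of '0' and '1' characters to stand for a p-BITS are assumptions made for this illustration only.

```python
def encode_int(n):
    """Encode a p-INT as a b-SINTEGER when it fits, else as a b-LINTEGER."""
    if 0 <= n <= 63:
        return bytes([0b10000000 | n])             # AB=10, six value bits
    for count in range(1, 9):                       # shortest 2's complement form
        if -(1 << (8 * count - 1)) <= n < (1 << (8 * count - 1)):
            # FGH holds the byte count; a count of 8 is sent as 000.
            type_byte = 0b11100000 | (count % 8)
            return bytes([type_byte]) + n.to_bytes(count, "big", signed=True)
    raise ValueError("p-INT does not fit in 64 bits")


def decode_int(data):
    """Decode a b-SINTEGER or b-LINTEGER; return (value, bytes consumed)."""
    type_byte = data[0]
    if (type_byte & 0b11000000) == 0b10000000:      # b-SINTEGER
        return type_byte & 0b00111111, 1
    if (type_byte & 0b11111000) != 0b11100000:
        raise ValueError("not an integer object")
    count = (type_byte & 0b111) or 8                # xyz=000 means 8 bytes
    body = data[1:1 + count]
    if len(body) != count:
        raise ValueError("truncated b-LINTEGER")
    return int.from_bytes(body, "big", signed=True), 1 + count


def encode_sbitstr(bits):
    """Encode a string of '0'/'1' characters (63 bits or fewer) as a b-SBITSTR."""
    if len(bits) > 63 or set(bits) - {"0", "1"}:
        raise ValueError("expected at most 63 characters, each '0' or '1'")
    marked = "1" + bits                             # marker bit delimits the left end
    count = (len(marked) + 7) // 8                  # right-adjusted in whole bytes
    type_byte = 0b11110000 | (count % 8)
    return bytes([type_byte]) + int(marked, 2).to_bytes(count, "big")


def decode_sbitstr(data):
    """Decode a b-SBITSTR; return (bit string, bytes consumed)."""
    type_byte = data[0]
    if (type_byte & 0b11111000) != 0b11110000:
        raise ValueError("not a b-SBITSTR")
    count = (type_byte & 0b111) or 8
    body = data[1:1 + count]
    if len(body) != count:
        raise ValueError("truncated b-SBITSTR")
    all_bits = "".join(f"{b:08b}" for b in body)
    marker = all_bits[:8].find("1")                 # first 1 bit of first data byte
    if marker < 0:
        raise ValueError("no marker bit in the first data byte")
    return all_bits[marker + 1:], 1 + count
```

With these definitions, encode_int(4096) yields the three bytes 11100010 00010000 00000000 of the b-LINTEGER example, and encode_sbitstr("001010011") yields 11110010 00000010 01010011, matching the b-SBITSTR example.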
   ================================
   | Atomic Object: b-XTRA        |
   ================================

The b-XTRA objects are used to carry the four possible p-XTRA items, i.e., *XTRA0*, *XTRA1*, *XTRA2*, and *XTRA3*. These four items correspond to the binary coding of the remaining two bits after the b-XTRA type code bits. The format of a b-XTRA follows.

   ABCDEF=111110  to specify the type b-XTRA
   GH=yz          to identify the particular p-XTRA item carried

The GH bits of the byte are decoded to produce a particular p-XTRA item, as follows.

   GH=00 -- *XTRA0*
   GH=01 -- *XTRA1*
   GH=10 -- *XTRA2*
   GH=11 -- *XTRA3*

The b-XTRA object is included to make several single-byte data items available to higher levels. These items may be assigned by individual applications to improve the efficiency of transmission of several very frequent data items. For example, the message services protocols will use these items to convey positive and negative acknowledgments, two very common items in every interaction.

   ================================
   | Atomic Object: b-PADDING     |
   ================================

This object is anomalous, since it represents no data at all. Whenever it is encountered in a byte stream in a position where a type-byte is expected, it is completely ignored, and the succeeding byte examined instead. Its purpose is to serve as a filler in byte streams, providing servers with an aid in handling internal problems related to their specific word lengths, etc. The encoders may freely use this object to serve as padding when necessary.

All b-PADDING data objects exist only within an encoded byte stream. They never cause any data item whatsoever to be presented externally to the coder module. The format of a b-PADDING follows.

   ABCDEFGH=11111111

Note that this does not imply that all such 'null' bytes in a stream are to be ignored, since they could be encountered as a byte within some other type, such as b-LINTEGER. Only bytes of this format which, by their position in the stream, appear as a 'type' byte are to be ignored.
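With all of the atomic types defined, the type-byte assignment of section VI.2 can be sketched as a table of bit prefixes in Python. This is an illustration only; TYPE_TABLE and classify_type_byte are names invented here, and the tuple returned is just one convenient way to hand the X bits to a type-specific decoder.

```python
# One row per entry of the type-byte table in section VI.2:
# (mask selecting the prefix bits, prefix value, type name, number of X bits).
TYPE_TABLE = [
    (0b10000000, 0b00000000, "CHAR7", 7),
    (0b11000000, 0b10000000, "SINTEGER", 6),
    (0b11100000, 0b11000000, "NON-ATOM", 5),
    (0b11111000, 0b11100000, "LINTEGER", 3),
    (0b11111000, 0b11101000, "reserved", 3),
    (0b11111000, 0b11110000, "SBITSTR", 3),
    (0b11111100, 0b11111000, "XTRA", 2),
    (0b11111110, 0b11111100, "BOOL", 1),
    (0b11111111, 0b11111110, "EMPTY", 0),
    (0b11111111, 0b11111111, "PADDING", 0),
]


def classify_type_byte(byte):
    """Return (type name, value of the X bits) for a single type-byte."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("a type-byte must fit in eight bits")
    for mask, prefix, name, x_bits in TYPE_TABLE:
        if (byte & mask) == prefix:
            return name, byte & ((1 << x_bits) - 1)
    raise AssertionError("unreachable: the table covers all 256 values")
```

For instance, classify_type_byte(0b11111011) gives ("XTRA", 3), i.e. *XTRA3*, and a decoder would simply skip any byte classified as "PADDING" when it appears where a type-byte is expected.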
VI.4 -- Non-Atomic Objects

Non-atomic objects are always transmitted preceded by both a single type byte and some small number of size byte(s). The type byte identifies that the data object concerned is of a non-atomic type, as well as uniquely specifying the particular type involved. All non-atomic objects have type byte values of the following form.

   ABC=110      specifying that the object is non-atomic
   DEFGH=vwxyz  specifying the particular type of object

The vwxyz value is used to specify one of 31 possible non-atomic types. The value vwxyz=00000 is reserved for use in future expansion.

In all non-atomic data objects, the byte(s) following the type-byte specify the number of bytes to follow which contain the data object. In all cases, if the number of bytes specified are processed, the next byte to be seen should be another type-byte, the beginning of the next object in the stream.

The number of bytes containing the object size information is variable. These bytes will be termed the SIZE-BYTES. The first byte encountered has the following format.

   A=s              specifying the manner in which the size information is encoded
   BCDEFGH=tuvwxyz  specifying the size, or number of bytes containing the size

The tuvwxyz values supply a positive binary number. If the s value is a one, the tuvwxyz value specifies the number of bytes to follow which should be read and concatenated as a binary number, which will then specify the size of the object. These bytes will appear with high order bits first. Thus, if s=1, up to 128 bytes may follow, containing the count of the succeeding data bytes, which should certainly be sufficient.

Since many non-atomic objects will be fairly short, the s=0 condition is used to indicate that the 7 bits contained in tuvwxyz specify the actual data byte count. This permits objects of sizes up to 128 bytes to be specified using one size-information byte.
The case tuvwxyz=0000000 is interpreted as specifying 128 bytes.

For example, a data object of some non-atomic type which requires 100 (144 octal) bytes to be transmitted would be sent as follows.
   110XXXXX -- identifying a specific non-atomic object
   01100100 -- specifying that 100 bytes follow
      .
      .
     data   -- the 100 data bytes
      .
      .

Note that the size count does not include the size-specifier byte(s) themselves, but does include all succeeding bytes in the stream used to encode the object.

A data object requiring 20000 (47040 octal) bytes would appear in the stream as follows.

   110XXXXX -- identifying a specific non-atomic object
   10000010 -- specifying that the next 2 bytes contain the stream length
   01001110 -- first byte of number 20000
   00100000 -- second byte
      .
      .
     data   -- 20,000 bytes
      .
      .

Interpretation of the contents of the 20000 bytes in the stream can be performed by a module which knows the specific format of the non-atomic type specified by DEFGH in the type-byte.

The remainder of this section defines an initial set of non-atomic types, the format of their encoding, and the semantics of their interpretation.

   ================================
   | Non-atomic Object: b-LBITSTR |
   ================================

The b-LBITSTR (Long BIT Stream) data type is introduced to transmit p-BITS which cannot be handled by a b-SBITSTR. A b-LBITSTR may be used to transmit short p-BITS as well. Its format follows.
   11000001 size-bytes data-bytes
   ABCDEFGH

ABC=110 identifies this as a non-atomic object. DEFGH=00001 specifies that it is a b-LBITSTR. The standard sizing information specifies the number of succeeding bytes. Within the data-bytes, the first object encountered must decode to a p-INT. This number conveys the length of the bit stream to follow. The actual bit stream begins with the next byte, and is left-adjusted in the byte stream. For example, to encode *101010101010*, the following b-LBITSTR could be used, although a b-SBITSTR would be more compact. Since the size count includes all bytes used to encode the object, it covers the length object as well as the two data bytes.

   11000001 -- identifies a b-LBITSTR
   00000011 -- size = 3
   10001100 -- b-SINTEGER, to specify length (12 bits)
   10101010 -- first 8 data bits
   10100000 -- last 4 data bits

   ================================
   | Non-atomic Object: b-STRUC   |
   ================================

The b-STRUC (STRUCture) data type is used to transmit any p-STRUC. The translation rules for converting a b-STRUC into a primitive item are presented following the discussion of b-REPEATs. The b-STRUC format appears as follows.

   11000010 size-bytes data-bytes
   ABCDEFGH

ABC=110 identifies this as a non-atomic type. DEFGH=00010 specifies that the object is a b-STRUC. Within the data-bytes stream, objects simply follow in order. This implies that the b-STRUC encoder and decoder modules can simply make use of recursive calls to a standard encoder/decoder for processing each element of the b-STRUC. Note that any type of object is permitted as an element of a b-STRUC, but the size information of the b-STRUC must include all bytes used to represent the elements.

Containment of b-STRUCs within other b-STRUCs is permitted to any reasonable level. That is, a b-STRUC may contain as an element another b-STRUC, which contains another b-STRUC, and so on. All servers are required to handle such containment to at least a minimum depth of three.

Examples of encoded structures appear in a later section.
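Both non-atomic objects defined so far are framed by the type-byte and size-bytes of section VI.4. The Python sketch below illustrates that framing. The names encode_size, decode_size, frame, encode_length and encode_lbitstr are invented for this illustration. Three assumptions are made: objects of size zero are not handled, the zero-means-128 convention is applied to the byte count when s=1 (suggested by the phrase "up to 128 bytes may follow"), and the size of a b-LBITSTR is taken to cover its length object as well as its data bytes, as the size-byte rule requires.

```python
def encode_size(n):
    """Return the size-bytes for an object of n data bytes (n >= 1)."""
    if n < 1:
        raise ValueError("this sketch does not cover empty objects")
    if n <= 128:
        return bytes([n % 128])                     # s=0; 128 is sent as 0000000
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(raw) > 128:
        raise ValueError("size too large for the size-bytes format")
    return bytes([0b10000000 | (len(raw) % 128)]) + raw   # s=1, count, then size


def decode_size(data, pos):
    """Read size-bytes at data[pos]; return (size, position of first data byte)."""
    first = data[pos]
    low = first & 0b01111111
    if first & 0b10000000:                          # s=1: low counts the size bytes
        count = low or 128
        raw = data[pos + 1:pos + 1 + count]
        if len(raw) != count:
            raise ValueError("truncated size-bytes")
        return int.from_bytes(raw, "big"), pos + 1 + count
    return (low or 128), pos + 1                    # s=0: low is the size itself


def frame(type_code, body):
    """Wrap body as a non-atomic object with the 5-bit type code DEFGH."""
    return bytes([0b11000000 | type_code]) + encode_size(len(body)) + body


def encode_length(n):
    """Encode a non-negative p-INT as a b-SINTEGER or b-LINTEGER."""
    if n <= 63:
        return bytes([0b10000000 | n])
    count = n.bit_length() // 8 + 1                 # leaves room for the sign bit
    return bytes([0b11100000 | (count % 8)]) + n.to_bytes(count, "big")


def encode_lbitstr(bits):
    """Encode a non-empty string of '0'/'1' characters as a b-LBITSTR."""
    padded = bits + "0" * (-len(bits) % 8)          # left-adjusted, zero filled
    data = int(padded, 2).to_bytes(len(padded) // 8, "big")
    return frame(0b00001, encode_length(len(bits)) + data)
```

Under these assumptions, encode_size(100) is the single byte 01100100 and encode_size(20000) is 10000010 01001110 00100000, as in the two size examples of section VI.4, while frame(0b00010, ...) produces a b-STRUC around already encoded elements.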
   ================================
   | Non-atomic Object: b-EDT     |
   ================================

A b-EDT is the object used as the carrier for p-EDTs in transmission of semantic items. It is functionally identical to a b-STRUC, but has a different type code to permit it to be identified and converted to a semantic item instead of a p-STRUC. The format of a b-EDT follows.

   11000011 size-bytes data-bytes
   ABCDEFGH

As with all non-atomic types, ABC=110 to identify this as such, and DEFGH=00011 to specify a b-EDT. The objects in the data-bytes are decoded as for b-STRUCs. However, the first object must decode to a p-INT or p-STRING and the second to a p-INT, to conform to the format of p-EDTs.

   ================================
   | Non-atomic Object: b-REPEAT  |
   ================================

The b-REPEAT object is never translated directly into an item. It is legal only as a component of an enclosing b-STRUC, b-USTRUC, b-EDT, or b-REPEAT. A b-REPEAT is used to concisely specify a set of elements to be treated as if they appeared in the enclosing structure in place of the b-REPEAT. This provides a mechanism for encoding a sequence of identical data items or patterns efficiently for transmission.

A common example of this would be in transmission of text, where line images containing long sequences of spaces, or pages containing multiple carriage-return, line-feed pairs, are often encountered. Such sequences could be encoded as an appropriate b-REPEAT to compact the data for transmission. The format of a b-REPEAT is as follows.

   11000100   -- identifying the object as a b-REPEAT
   size-bytes -- the standard non-atomic object size information
   countspec  -- an object which translates to a p-INT
      .
      .
     data     -- the objects which define the pattern
      .
      .

The 'countspec' object must translate to a p-INT to specify the number of times that the following data pattern should be repeated in the object enclosing the b-REPEAT.
The remaining objects in the b-REPEAT constitute the data pattern which is to be repeated. The decoding of the enclosing structure will be continued as if the data pattern objects appeared 'countspec' times in place of the b-REPEAT. Zero repeat counts are permitted, for generality. They cause no objects to be simulated in the enclosing structure.

An encoder does not have to use b-REPEATs at all, if simplicity of coding outweighs the benefits of data compression. In message services, for example, an encoder might limit itself to only compressing long text strings. It is important for compatibility, however, to have the ability in the decoders to handle b-REPEATs.

   ================================
   | Non-atomic Object: b-USTRUC  |
   ================================

The b-USTRUC (Uniform Structure) object type is provided to enable servers to convey the fact that a p-STRUC being transferred contains items of only a single type. The most common example would involve a b-USTRUC which translates to a p-STRUC of only p-CHARs, and hence may be considered to be a p-STRING. Servers may use this information to assist them in decoding objects efficiently. No server is required to generate b-USTRUCs.

The internal construction of a b-USTRUC is identical to that of a b-STRUC, except for the type-byte. The format of a b-USTRUC follows.

   11000101 size-bytes data-bytes
   ABCDEFGH

ABC=110 to identify a non-atomic object. DEFGH=00101 specifies the object as a b-USTRUC.

   ================================
   | Non-atomic Object: b-STRING  |
   ================================

The b-STRING object is included to permit explicit specification of a structure as a p-STRING. This information will permit receiving servers to process the incoming structure more efficiently. A b-STRING is formatted similarly to a b-USTRUC, except that its type-byte identifies the object as a b-STRING. The normal sizing information is followed by a stream of bytes which are interpreted as b-CHAR7s, ignoring the high-order bit.
The format of a b-STRING follows.

     11000110 size-bytes data-bytes
     ABCDEFGH

ABC=110 to identify a non-atomic object. DEFGH=00110
specifies the object as a b-STRING.

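The type-byte layout shared by the non-atomic objects can be
sketched in Python as follows. This is only an illustration:
the names NON_ATOMIC_TYPES and classify_non_atomic are
hypothetical, and the b-STRUC code is taken from the type byte
11000010 used for b-STRUCs in the structure coding examples.

```python
# Hypothetical helper, not part of the MSDTP definition itself.
# DEFGH codes as given by the type bytes of each non-atomic object.
NON_ATOMIC_TYPES = {
    0b00010: "b-STRUC",
    0b00011: "b-EDT",
    0b00100: "b-REPEAT",
    0b00101: "b-USTRUC",
    0b00110: "b-STRING",
}


def classify_non_atomic(type_byte):
    """Return the object name for a non-atomic type byte, else None."""
    if not 0 <= type_byte <= 0xFF:
        raise ValueError("a type byte must fit in eight bits")
    abc = type_byte >> 5           # bits ABC
    defgh = type_byte & 0b11111    # bits DEFGH
    if abc != 0b110:
        return None                # ABC is not 110: not non-atomic
    return NON_ATOMIC_TYPES.get(defgh)
```

For instance, classify_non_atomic(0b11000110) returns "b-STRING",
while a byte whose ABC bits are not 110 yields None.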
VI.5 -- Structure Translation Rules

A b-STRUC is translated into a p-STRUC. This is
performed by translating each object of the b-STRUC into its
corresponding item, and saving it for inclusion in the
p-STRUC being generated. A b-USTRUC is handled similarly,
but the coding programs may utilize the information that the
resultant p-STRUC will contain items of uniform type. The
preferred method of coding p-STRINGs is to use b-USTRUCs.

If all of the elements of the resultant p-STRUC are
p-CHARs, it is presented to the user of the decoder as a
p-STRING. A p-STRING should be considered to be a synonym
for a p-STRUC containing only characters. It need not
necessarily exist at particular sites, which would instead
present p-STRUCs of p-CHARs to their application programs.

The object b-REPEAT is handled in a special fashion
when encountered as an element. When this occurs, the data
pattern of the b-REPEAT is translated into a sequence of
items, and that sequence is repeated in the next higher
level as many times as specified in the b-REPEAT.
Therefore, b-REPEATs are legal only as elements of a
surrounding b-STRUC, b-USTRUC, b-EDT, or b-REPEAT.

In encoding a p-STRUC or p-STRING for transmission, a
translator may use b-REPEATs as desired to effect data
compression, but their use is not mandatory. Similarly,
b-STRINGs may be used, but are not mandatory.

A b-EDT is translated into a p-EDT to identify it as a
carrier for a semantic item.
Otherwise, it is treated
identically to a b-STRUC.

VI.6 -- Translation Summary

The following table summarizes the possible
translations between primitive items and objects.

     p-INT    <--> b-LINTEGER, b-SINTEGER
     p-STRING <--> b-STRING, b-STRUC, b-USTRUC
     p-STRUC  <--> b-STRING, b-STRUC, b-USTRUC
     p-BITS   <--> b-SBITSTR, b-LBITSTR
     p-CHAR   <--> b-CHAR7
     p-BOOL   <--> b-BOOL
     p-EMPTY  <--> b-EMPTY
     p-XTRA   <--> b-XTRA
     p-EDT    <--> b-EDT (all semantic items)
     -none-   <--> b-PADDING
     -none-   <--> b-REPEAT (only within structure)

Note that all semantic items are represented as p-EDTs
which always exist as b-EDTs in byte-stream format.

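The structure translation rules can be illustrated by a small
decoder, sketched here in Python under several assumptions that
are not part of the definitions above. The names PChar, present,
decode_one and decode_all are hypothetical. The size is assumed
to occupy a single byte giving the length of the data-bytes. A
b-CHAR7 is assumed to be a byte with the high-order bit clear,
and a b-SINTEGER a byte of the form 100xxxxx holding a value
from 0 to 31. Other atomic objects are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PChar:
    """A p-CHAR, kept distinct from a one-character p-STRING."""
    char: str


# DEFGH codes taken from the type bytes of the non-atomic objects.
STRUC, EDT, REPEAT, USTRUC, STRING = 0b00010, 0b00011, 0b00100, 0b00101, 0b00110


def present(elements):
    """Present a p-STRUC made only of p-CHARs as a p-STRING (a str)."""
    if elements and all(isinstance(e, PChar) for e in elements):
        return "".join(e.char for e in elements)
    return elements


def decode_one(data, pos):
    """Decode the object at data[pos]; return (items, next position)."""
    byte = data[pos]
    if (byte & 0x80) == 0:                  # assumed b-CHAR7: 0xxxxxxx
        return [PChar(chr(byte))], pos + 1
    if (byte & 0xE0) == 0x80:               # assumed b-SINTEGER, 0..31
        return [byte & 0x1F], pos + 1
    if (byte >> 5) != 0b110:
        raise ValueError("object type not covered by this sketch")
    if pos + 1 >= len(data):
        raise ValueError("missing size byte")
    size = data[pos + 1]                    # assumption: one size byte
    end = pos + 2 + size
    body = data[pos + 2:end]
    if len(body) != size:
        raise ValueError("truncated object")
    kind = byte & 0x1F
    if kind == REPEAT:
        if not body:
            raise ValueError("b-REPEAT without a countspec")
        count, after = decode_one(body, 0)
        if len(count) != 1 or not isinstance(count[0], int):
            raise ValueError("countspec must translate to a p-INT")
        # The pattern items are spliced into the enclosing structure.
        return decode_all(body[after:]) * count[0], end
    if kind == STRING:                      # high-order bit is ignored
        return ["".join(chr(b & 0x7F) for b in body)], end
    if kind in (STRUC, USTRUC):
        return [present(decode_all(body))], end
    if kind == EDT:                         # tagged as a semantic carrier
        return [("p-EDT", decode_all(body))], end
    raise ValueError("unknown non-atomic type")


def decode_all(data):
    """Decode consecutive objects into a flat list of items."""
    items, pos = [], 0
    while pos < len(data):
        decoded, pos = decode_one(data, pos)
        items.extend(decoded)
    return items
```

Under these assumptions, decoding the bytes 11000010 00000011
10000001 10000010 10000011 with decode_all yields [[1, 2, 3]],
a single p-STRUC of three p-INTs. The sketch does not enforce
the rule that a b-REPEAT is legal only inside a structure.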
VI.7 -- Structure Coding Examples

The following stream transmits a b-STRUC containing 3
b-SINTEGERs, with values 1, 2, and 3, representing a p-STRUC
containing three p-INTs, i.e. (1 2 3).

     11000010 -- b-STRUC
     00000011 -- size=3
     10000001 -- b-SINTEGER=1
     10000010 -- b-SINTEGER=2
     10000011 -- b-SINTEGER=3

The next example represents a b-STRUC containing the
characters X and Y, followed by the b-LINTEGER 10,
representing a p-STRUC of 2 p-CHARs and a p-INT, i.e., ('X'
'Y' 10). Note that the p-INT prevents this from being
considered a p-STRING.

     11000010 -- b-STRUC
     00000100 -- size=4
     01011000 -- b-CHAR7 'X'
     01011001 -- b-CHAR7 'Y'
     11100001 -- b-LINTEGER
     00001010 -- 10

Note that a better way to send this p-STRUC would be to
represent the integer as a b-SINTEGER, as shown below.

     11000010 -- b-STRUC
     00000011 -- size=3
     01011000 -- b-CHAR7 'X'
     01011001 -- b-CHAR7 'Y'
     10001010 -- b-SINTEGER=10

The next example shows a b-STRUC of b-CHAR7s. It is
the translation of the p-STRING "HELLO".

     11000010 -- b-STRUC
     00000101 -- size=5
     01001000 -- b-CHAR7 'H'
     01000101 -- b-CHAR7 'E'
     01001100 -- b-CHAR7 'L'
     01001100 -- b-CHAR7 'L'
     01001111 -- b-CHAR7 'O'

This datum could also be transmitted as a b-STRING.
Note that the character bytes are not necessarily b-CHAR7s,
since the high-order bit is ignored.

     11000110 -- b-STRING
     00000101 -- size=5
     01001000 -- 'H'
     01000101 -- 'E'
     01001100 -- 'L'
     01001100 -- 'L'
     01001111 -- 'O'

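The byte streams above can be reproduced by a small encoder,
sketched here in Python. The function names are hypothetical,
and the sketch rests on assumptions drawn only from the
examples: the size fits in a single byte, a b-CHAR7 is the
seven-bit character code, and a b-SINTEGER is written as
100xxxxx for values from 0 to 31. The complete definitions of
the size-bytes and of b-SINTEGER are not given in this section.

```python
def encode_char7(ch):
    """Encode one character as an assumed b-CHAR7 (high-order bit 0)."""
    code = ord(ch)
    if code > 0x7F:
        raise ValueError("character does not fit in seven bits")
    return bytes([code])


def encode_sinteger(value):
    """Encode a small integer as an assumed b-SINTEGER (100xxxxx)."""
    if not 0 <= value <= 31:
        raise ValueError("this sketch handles only 0 through 31")
    return bytes([0x80 | value])


def wrap(type_byte, body):
    """Prefix the data-bytes with a type byte and a one-byte size."""
    if len(body) > 0x7F:
        raise ValueError("size does not fit the single-byte form assumed here")
    return bytes([type_byte, len(body)]) + body


def encode_struc(parts):
    """Build a b-STRUC from already encoded objects."""
    return wrap(0b11000010, b"".join(parts))


def encode_string(text):
    """Build a b-STRING from seven-bit text."""
    return wrap(0b11000110, text.encode("ascii"))


def encode_repeat(count, parts):
    """Build a b-REPEAT: a countspec followed by the data pattern."""
    return wrap(0b11000100, encode_sinteger(count) + b"".join(parts))
```

Calling encode_string("HELLO") produces the seven bytes of the
b-STRING example, and encode_struc applied to the encodings of
1, 2 and 3 produces the five bytes of the first example.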
To encode a p-STRING containing 20 carriage-return
line-feed pairs, the following b-STRUC containing a b-REPEAT
could be used.

     11000010 -- b-STRUC
     00000101 -- size=5
     11000100 -- b-REPEAT
     00000011 -- size=3
     10010100 -- count, b-SINTEGER=20
     00001101 -- b-CHAR7, 'CR'
     00001010 -- b-CHAR7, 'LF'

To encode a p-STRUC of p-INTs consisting of a single 1
followed by thirty 0's, the following b-STRUC could be used.

     11000010 -- b-STRUC
     00000101 -- size=5
     10000001 -- b-SINTEGER=1
     11000100 -- b-REPEAT
     00000010 -- size=2
     10011110 -- count, b-SINTEGER=30
     10000000 -- b-SINTEGER=0

VII. A GENERAL DATA TRANSFER SCHEME

This section considers a possible scheme for extending
the concept of a data translator into a multi-purpose data
transfer mechanism.

The proposed environment would provide a set of
primitive items, including those enumerated herein but
extended as necessary to accommodate a variety of
applications. Communication between processes would be
defined solely in terms of these items, and would
specifically avoid any consideration of the actual formats
in which the data is transferred.

A repertoire of translators would be provided, one of
which is the MSDTP machinery, for use in converting items to
any of a number of transmission formats. Borrowing a
concept from radio terminology, each translator would be
analogous to a different type of modulation scheme, to be
used to transfer data through some communications medium.
Such media could be an eight-bit byte-oriented connection,
a 36-bit connection, etc., and could conceivably have other
distinguishing features, such as bandwidth, cost, and delay.
For each medium which a site supports, it would provide its
programmers with a module for performing the translations
required.

Certain media or translators might not handle various
items. For example, the MSDTP does not handle items which
might be termed p-FLOATs, p-COMPLEXs, p-ARRAYs, and so on. In
addition, the efficiency of various media for transfer of
specific items may differ drastically. MSDTP, for example,
transfers data frequently used in message handling very
efficiently, but is relatively poor at transfer of very
large or deep tree structures.

Available at each site as a process or subroutine
package would be a module responsible for interfacing with
its counterpart at the other end of the medium. These
modules would use a protocol, not yet defined, to match
their capabilities, and choose a particular medium and
translator, when more than one exists, for transfer of data
items.

Such a facility could totally insulate applications
from the need to consider encoding formats, machine
differences, and so on, as well as eliminate duplication of
effort in producing such facilities for every new project
which requires them. In addition, as new translators or
media are introduced, they would become immediately
available to existing users without reprogramming.

Implementation of such a protocol should not be very
difficult or time-consuming, since it need not be very
sophisticated in choosing the most appropriate transfer
mechanism in initial implementations. The system is
inherently upward-compatible and easily expandable.
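
Since the matching protocol is not yet defined, the following
Python sketch is purely hypothetical. It shows one simple way
two modules might agree on a translator: the first locally
preferred translator that the remote side also offers and that
handles every item type to be sent. The function name, the
table layout and the translator "HYPOTHETICAL-36BIT" are all
invented for illustration.

```python
# Hypothetical capability table: translator name -> item types handled.
# The MSDTP entry lists the primitive items of the translation summary.
# The second translator is invented purely for this illustration.
TRANSLATORS = {
    "MSDTP": {
        "p-INT", "p-STRING", "p-STRUC", "p-BITS", "p-CHAR",
        "p-BOOL", "p-EMPTY", "p-XTRA", "p-EDT",
    },
    "HYPOTHETICAL-36BIT": {"p-INT", "p-STRUC", "p-FLOAT"},
}


def choose_translator(local_preference, remote_offers, needed_items):
    """Pick the first locally preferred translator usable by both sides.

    local_preference: translator names in local order of preference
    remote_offers:    set of translator names the remote side supports
    needed_items:     set of item types the application wants to send
    """
    for name in local_preference:
        handled = TRANSLATORS.get(name, set())
        if name in remote_offers and needed_items <= handled:
            return name
    return None  # no common translator handles all the items
```

With this table, a request to send only p-INTs and p-STRINGs
selects "MSDTP", while a request that includes a p-FLOAT falls
through to the invented second translator. A real protocol
could also weigh bandwidth, cost, and delay.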