RFC 242
NIC 7672
Categories: D.4, D.7

            DATA DESCRIPTIVE LANGUAGE FOR SHARED DATA

                            L. Haibt
                           A. Mullery
                Thomas J. Watson Research Center
                     Yorktown Heights, N.Y.

                         July 19, 1971

Introduction

   A primary consequence of the use of networks of computers is the
   demand for more efficient shared use of data.

   Many of the impediments to easy sharing of data follow from the many
   diverse ways of representing and making reference to the same data.
   Almost all of these problems were known before data was shared
   through computer networks, but the network facility has simply
   emphasized the problems.

   For convenience of discussion, representation differences will be
   classified in three categories. The first category is one of very
   local representation - the bit patterns for the character set, for
   fixed point and floating point numbers. These differences are usually
   imposed by differences in CPU's and storage devices. Translations
   from one representation to another at this level can usually be made
   a unit at a time (e.g. computer word by computer word), with the most
   serious problems occurring when there are some values in one
   representation scheme which have no corresponding meaning in the
   other representation scheme, as, for example, when trying to
   translate eight-bit bytes to six-bit bytes.

   A second category of differences has to do with the representation
   of collections of data, e.g., their size, ordering and location.

   A third category of representation differences, which is a little
   difficult to characterize, has to do with all the more complex
   structures that data collections may have - for example, files with
   indexes, fields with internal pointers and cross references, and
   collections of files such as partitioned data sets and generation
   data sets in OS 360.

   The approach to coping with these problems within our project of
   Network/440 has been to work on the development of a descriptive
   language which would permit the specification of those aspects of
   data representation which would be subject to transformation in
   moving data about in a network. Then, the network data management
   system would be able to refer to the descriptions as needed in the
   data management function. For example, to a large extent, one could
   supply two descriptions to the data manager, one which indicates how
   data is now represented, and one which indicates how a copy of it
   should look, and the data management system could invoke the
   necessary transformations to make the proper copy.

   This approach to specifying data transformation contrasts somewhat
   with systems, such as the RAND Form Machine, which provide a
   formalism for specifying the particular translation algorithms for
   changing from one form to another. The descriptor-to-descriptor
   approach seems to simplify the programming burden when creating new
   field formats. Neither method of specifying translations precludes
   the use of a Network Standard Representation.

Structure

   The descriptive language assumes that data may have an inherent
   structure independent of other groupings, such as name groupings,
   locking groupings, etc., imposed on it. A data structure description
   consists of groupings of established data value type codes. The list
   of established data value types should be sufficient, through
   appropriate groupings, to describe any hierarchical structure of
   data.

   The data type identifies how the data value is to be interpreted. A
   list of data type codes is given below.
   This list must be able to identify each data type that may exist in
   a data set in any machine in the network. However, for data sets
   that contain only data types of the machine on which they are
   stored, it is not necessary that a different code be defined for
   different forms of any single type that may exist among different
   machines. The data type specified in the description, along with the
   identification of the machine at which the data is stored, is
   sufficient to completely describe all such forms of the data types.
   A tentative list of machine dependent type codes, compiled by
   G. Howe and T. Kibler, is as follows:

      F   floating point
      I   fixed
      D   double precision floating point
      C   character string
      X   complex
      P   packed decimal
      L   logical

   It is desirable to be able to construct data sets that contain
   either data types not allowable at the machine at which the data set
   is stored or, possibly, even types that may not exist at any machine
   in the network. For example, one may wish to store eight bit data on
   a six bit machine. This may, in principle at least, be done by
   defining a logical data set containing eight bit bytes in terms of a
   real data set containing six bit characters. For this, however, data
   value type descriptors have to be defined that are machine
   independent. The basic machine independent data type is as follows:

      B   bit.

   It is not clear at this time that any others are necessary, since
   others can be built from this one. For convenience, other standard
   machine independent data types may later be defined.

   Two other machine independent types are useful in describing
   structures. These are:

      Z   null
      O   omit.

   The null type indicates that there is no data corresponding to this
   item; however, the item should be counted as existing in the
   structure. The omit type indicates just the opposite: there is data
   that should not be counted as an item; it should be ignored.

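   The eight bit on six bit example above can be sketched in code. The
   following is only an illustration of the idea, not something the
   memo specifies: the function names, the most-significant-bit-first
   ordering, and the zero padding of the final six bit character are
   all assumptions made here.

```python
def to_bits(values, width):
    """Flatten integers into a list of bits (type 'B'), most significant bit first."""
    bits = []
    for v in values:
        if not 0 <= v < (1 << width):
            raise ValueError(f"{v} does not fit in {width} bits")
        bits.extend((v >> shift) & 1 for shift in range(width - 1, -1, -1))
    return bits


def from_bits(bits, width):
    """Regroup a list of bits into integers of the given width."""
    if len(bits) % width:
        raise ValueError("bit count is not a multiple of the unit width")
    values = []
    for start in range(0, len(bits), width):
        v = 0
        for bit in bits[start:start + width]:
            v = (v << 1) | bit
        values.append(v)
    return values


def store_on_six_bit_machine(byte_values):
    """Logical eight bit bytes -> real six bit characters (assumed zero padding)."""
    bits = to_bits(byte_values, 8)
    bits += [0] * (-len(bits) % 6)  # pad the last character with zero bits
    return from_bits(bits, 6)


def load_from_six_bit_machine(chars, byte_count):
    """Real six bit characters -> logical eight bit bytes; byte_count drops padding."""
    bits = to_bits(chars, 6)[:byte_count * 8]
    return from_bits(bits, 8)


logical = [0x48, 0x69, 0x21]              # three eight bit bytes = 24 bits
real = store_on_six_bit_machine(logical)  # 24 bits = four six bit characters
assert len(real) == 4
assert load_from_six_bit_machine(real, len(logical)) == logical
```

   Because both directions go through the bit type 'B', no eight bit
   value is lost, which is the property the word-by-word translation
   discussed in the Introduction cannot guarantee.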
   A grouping of data values is described by the list of elements of
   the grouping enclosed in parentheses. An element of a grouping may
   be either a data value, as described by one of the data value type
   codes, or a grouping. The list consists of these elements, separated
   by commas, and indicates that the elements appear in the grouping in
   the order indicated. For example, the description:

      ((C,C),(F,F,I))

   describes a data collection consisting of two subgroupings, the
   first subgrouping consisting of two data values of type 'C', and the
   second subgrouping consisting of two data values of type 'F'
   followed by a data value of type 'I'. The structure of this data
   collection is thus a three level tree which may be shown in two
   dimensions as follows:

              ( )
               |
          ( )-----( )
           |       |
          C-C    F-F-I

Properties

   Other properties of data, beyond the structure and composition of
   the data set, have also to be described. These may be assigned to
   items of the data collection, where an item may be defined as an
   individual data value or a grouping of these, by modifying the item
   description with the specification of the properties that apply to
   it. The notation that will be used will be an infix notation of the
   form:

      operand operator |[extent]|

   where the operator indicates the property type, the operand the
   property value, and the optional extent the number of items to which
   the property applies. Normally the property is assumed to apply to
   just the following item in the description. If the property is to
   apply to more than just the following item description, this is
   indicated by specifying a number as the extent, this number
   indicating how many of the following item definitions at the same
   level the property is to apply to.

   Type - The structure description of the data set is a constitutional
   or syntactic description of the data set. In some cases it is
   necessary to give a description of the use or meaning of an element.
   For example, in some complex data structures, the linkages of the
   structures may be represented as data values in the data set. Thus,
   though the more
   complex data structure is represented in a hierarchical form and, as
   a result, is in a form describable by the above notation, the data
   values that represent the linkages, and their meaning, must somehow
   be represented in the data description in order for the complex data
   structure to be truly described. As another example, one may wish to
   ascribe to some level of the data structure the type 'record' so
   that the data set can be used by some system which uses the concept
   'record' in accessing the data.

   What an initial set of such types should be has not been decided.

   Names - Items of a data structure can be given names by modifying
   the item's description with a notation of the form:

      name n |[extent]|.

   Depending on the context of its use, the name can refer to the
   description itself or to the data pertaining to the named part of
   the description. The name is assumed to be unique only within the
   scope of extent of the next higher encompassing name unless
   otherwise indicated by giving another encompassing name as the
   scope. This may be the name of the whole data set or description,
   for example. The scope of a name is specified by preceding the inner
   name by the outer name or names, separated by dots. The name:

      A.BETA

   indicates that the scope of the name BETA is A.

   The name applies to just the following item in the description
   unless otherwise indicated by including the extent parameter. For
   example, the description:

      (An(C,C),(Bn[ 2 ]F,Cn[ 2 ]F,I))

   indicates that name A is given to the item that contains two data
   values of type 'C', the name B to two data values, both of type 'F',
   and the name C to the last two data values, one of type 'F' and the
   other of type 'I'. Notice that with this notation, extents can
   overlap. For example, in the above description, the extent of name B
   overlaps that of name C.

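   The grouping and naming notation described so far is regular enough
   to be read mechanically. The following is a minimal sketch of such a
   reader, assuming (the memo does not say so) that names are written
   in upper case so that they can be told apart from the lower case
   operator 'n', and handling only groupings, type codes and name
   modifiers. The function name parse_description and the index-path
   representation of items are hypothetical.

```python
import re

TYPE_CODES = set("FIDCXPLBZO")  # the type codes listed in the memo
# Assumed name syntax: upper case name, operator 'n', optional [extent].
NAME_MODIFIER = re.compile(r"([A-Z][A-Z0-9]*)n(?:\[(\d+)\])?")


def parse_description(text):
    """Parse a grouping that may carry name modifiers.

    Returns (tree, names). tree is a nested list of type codes; names maps
    each name to the 0-based index paths of the items it covers.
    """
    s = re.sub(r"\s+", "", text)  # blanks inside "[ 2 ]" carry no meaning here
    pos = 0
    names = {}

    def peek():
        return s[pos] if pos < len(s) else ""

    def parse_group(path):
        nonlocal pos
        if peek() != "(":
            raise ValueError(f"expected '(' at offset {pos}")
        pos += 1
        items = []
        pending = []  # [name, remaining extent] pairs still open at this level
        while True:
            # Collect every name modifier that precedes the next item.
            while True:
                m = NAME_MODIFIER.match(s, pos)
                if m is None:
                    break
                pending.append([m.group(1), int(m.group(2) or 1)])
                pos = m.end()
            index = len(items)
            if peek() == "(":
                items.append(parse_group(path + (index,)))
            elif peek() in TYPE_CODES:
                items.append(peek())
                pos += 1
            else:
                raise ValueError(f"unexpected input at offset {pos}")
            # Each open name covers this item and uses up one unit of extent.
            for entry in pending:
                names.setdefault(entry[0], []).append(path + (index,))
                entry[1] -= 1
            pending = [entry for entry in pending if entry[1] > 0]
            if peek() == ",":
                pos += 1
            elif peek() == ")":
                pos += 1
                return items
            else:
                raise ValueError(f"expected ',' or ')' at offset {pos}")

    tree = parse_group(())
    if pos != len(s):
        raise ValueError("trailing input after description")
    return tree, names


tree, names = parse_description("(An(C,C),(Bn[ 2 ]F,Cn[ 2 ]F,I))")
assert tree == [["C", "C"], ["F", "F", "I"]]
assert names["B"] == [(1, 0), (1, 1)]
assert names["C"] == [(1, 1), (1, 2)]
```

   The item at path (1, 1), the second 'F', appears under both B and C,
   which is the overlap of extents noted above.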
   In a description, the same name can be applied to more than one item
   definition either by use of the extent parameter, or by actually
   specifying the name at each item to be included in the extent of the
   name. If a name is multiply-applied within the same scope, then the
   name is assumed to apply to the aggregate of the items to which it
   has been given. Thus it is possible to apply names to aggregates of
   items that are
   not necessarily sequential.

   Lock - During the course of processing data, it may be necessary to
   lock out use of some portion of data to other users. Segmentation of
   a data set into units for locking purposes may be indicated by the
   notation:

      k |[extent]|.

   Whether or not the data is locked and the type of lock applied (for
   example, write protect or read/write protect) is specified at the
   time the data is used.

   Authorization - Authorization for a user to access data may be
   governed by some access code assigned to the data. This access code
   can be specified in the description by modifying the desired
   elements of the description with an indication of the code. The
   notation is:

      code a |[extent]|

Control

   Two modifiers are provided which govern the existence of items in
   the definition. The first is the repetition modifier:

      factor r |[extent]|.

   This causes the following item definition or item definitions (if
   the extent indicates more than one) to be repeated. Thus the
   description

      (3rC)

   is equivalent to the description

      (C,C,C).

   The other control modifier is the condition modifier:

      condition c |[extent]|.

   If the condition specified is not true, then the following item
   definition is ignored. The condition is specified by a Boolean
   expression.

   Since several modifiers may apply to an item definition, there is a
   problem concerning the relationship among them. For example, if a
   repetition modifier and a conditional modifier apply to an item,
   does the condition apply to all the repeated items, or only to the
   first,
   assuming the extent of the condition modifier is one? The effect of
   multiple modifiers is dependent on the order in which they are
   evaluated. Two possible conventions come to mind. One says that
   repetitions are expanded first, then properties applied, and finally
   conditions applied to the resulting expanded item definitions,
   independent of the order in which the modifiers were specified in
   the description. Thus the description

      (A=3c[ 4 ]4rF,I)

   is equivalent to

      (4rA=3c[ 4 ]F,I),

   and if the condition is true, is equivalent to

      (F,F,F,F,I),

   or, if the condition is not true, is equivalent to

      (I).

   The other convention is that the modifiers are evaluated in the
   order in which they appear in the description, perhaps in reverse
   order - the modifier immediately preceding an item definition is
   evaluated first, then the one next preceding, etc. This gives more
   flexibility of meaning to the multiple modifiers. For example, the
   descriptions

      (A=3c3rC)

   and

      (3rA=3cC)

   are not equivalent. In the first, only the first of the three
   repetitions is affected by the condition, whereas in the second, all
   three repetitions are affected. Since this second convention is more
   flexible, it shall be the one assumed. This convention allows, for
   example, the repetition modifier to be applied to a named item as
   shown:

      (3rAnC).

   The name A applies to the three items (in effect, the name A is
   applied three times). This facility allows a name to be applied to a
   vertical column in a two dimensioned array by, for example, the
   description:

      (3r[ 3 ]C,AnC,C)
   which gives the name A to the second column of the 3x3 array.

Reference

   Named descriptions, or parts of descriptions, that have already been
   defined may be inserted into a description using the notation:

      $ specification.

   In a description, the reference is used as an item definition or a
   string of item definitions. The item definitions used are those
   defined by the name given. Names that apply to the named item or
   items as a whole in the description in which it is defined are
   ignored by the description at which it is referred. However, names
   that apply to parts of the named item are carried over to the
   description at which it is referred. For example, the description

      (An(F,F),I,$A)

   is equivalent to

      (An(F,F),I,(F,F)).

   Notice that the name "A" was not carried over to where the
   description was referenced, since it applied to the referred-to item
   or items as a whole.

   Parts of a data set or description must be able to be specified for
   use in a reference. This specification is in terms of the structure
   of the data set or description. The specification has the form of a
   data set name, or description name, followed by modifiers which
   particularize the specification to the part desired. The four types
   of modifiers are for going down a level, going up a level, going
   forward along a level, and going backward.

   Down - To go down a level from that previously specified, the
   modifier has the following form:

      . item

   or

      . (item |,extent| |,=value|).

   Having gone down a level, the item indicates which particular item
   at this level is the first (or only one) desired. This may be a
   number or a
   name. If more than one item is desired, then the extent indicates
   how many items. (* as extent means all remaining items at that
   level; ! means the first item that meets the conditions that may be
   set on it or its subitems in following modifiers.) The items
   selected may be conditioned by their contents. If a value is given,
   then only those items with the value indicated are selected. For
   example,

      A.1.1

   specifies the first field of the first record of data set A,

      A.(1,*).1

   specifies the first field of all the records of A,

      A.1.(1,2)

   specifies the first two fields of the first record of data set A,

      A.(1,*).(1,="768174")

   specifies only the first fields of all the records of A that have
   value "768174", and

      A.(1,!).(1,="768174")

   finds the first field that has value "768174".

   Up - To go up a level from that previously specified, the modifier
   has the following form:

      ' item

   or

      ' (item |,extent| |,=value|).

   Going up a level specifies the item up one level that contains the
   item previously specified. The item indicates which particular item
   at this level is desired, where the containing item is considered
   the first. For example,

      A.(1,!).(1,="768174")'1
   specifies the first record whose first field has value "768174".

   Forward - To go forward on the same level as that previously
   specified, the modifier is as follows:

      + item

   or

      + (item |,extent| |,=value|).

   This modifier is useful when an item following the one which has a
   certain value in a field is desired. It may also be used when the
   data set name is really a pointer, into the data set, which has been
   set previously. Pointers may or may not be described in a section
   elsewhere.

   Backward - To go backward on the same level as that previously
   specified, the modifier has the following form:

      - item

   or

      - (item |,extent| |,=value|).

   An example of the use of this modifier is when an item preceding the
   one which has a certain value in a field is desired. This might be
   specified:

      A.(1,!).(2,="768174")'1-1

       [ This RFC was put into machine readable form for entry ]
       [ into the online RFC archives Gottfried Janik 9/97     ]

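   As an illustration of the "down" modifier in the Reference section,
   the sketch below applies a chain of (item, extent, value) steps to a
   data set held as a nested list. It is a simplification under stated
   assumptions: the function name select and the tuple encoding of a
   step are hypothetical, items are numbered from 1 as in the memo's
   examples, and the '!' extent and the up, forward and backward
   modifiers are not handled.

```python
def select(items, steps):
    """Apply a chain of 'down' modifiers to a nested list.

    Each step is (item, extent, value): item is a 1-based position, extent
    is a count or "*" for all remaining items at that level, and value
    (or None) keeps only the selected items equal to it.
    """
    current = [items]
    for item, extent, value in steps:
        selected = []
        for parent in current:
            start = item - 1
            stop = len(parent) if extent == "*" else start + extent
            chosen = parent[start:stop]
            if value is not None:
                chosen = [c for c in chosen if c == value]
            selected.extend(chosen)
        current = selected
    return current


# A hypothetical data set A: three records of two fields each.
A = [
    ["768174", "SMITH"],
    ["100200", "JONES"],
    ["768174", "BROWN"],
]

# A.1.1 : first field of the first record
assert select(A, [(1, 1, None), (1, 1, None)]) == ["768174"]
# A.(1,*).1 : first field of all the records
assert select(A, [(1, "*", None), (1, 1, None)]) == ["768174", "100200", "768174"]
# A.1.(1,2) : first two fields of the first record
assert select(A, [(1, 1, None), (1, 2, None)]) == ["768174", "SMITH"]
# A.(1,*).(1,="768174") : only the first fields whose value is "768174"
assert select(A, [(1, "*", None), (1, 1, "768174")]) == ["768174", "768174"]
```

   The sample records are invented for the illustration; only the four
   specifications in the comments come from the memo.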