Record (computer science)

From Wikipedia, the free encyclopedia

Composite data type

In computer science, a record (also called a structure, struct, user-defined type (UDT), or compound data type) is a composite data structure: a collection of fields, possibly of different data types, typically fixed in number and sequence.^[1]

For example, a date could be stored as a record containing a numeric year field, a month field represented as a string, and a numeric day-of-month field. A circle record might contain a numeric radius and a center that is a point record containing x and y coordinates.
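The date and circle examples above can be sketched with Python's `dataclasses` module. This is only an illustration: the type names `Date`, `Point`, and `Circle` and the sample values are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Date:
    year: int        # numeric year field
    month: str       # month represented as a string
    day: int         # numeric day-of-month field


@dataclass
class Point:
    x: float
    y: float


@dataclass
class Circle:
    radius: float
    center: Point    # a field that is itself a record


date = Date(year=2021, month="September", day=23)
circle = Circle(radius=2.0, center=Point(x=1.0, y=-1.0))

print(date.month)          # September
print(circle.center.x)     # 1.0
```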

Notable applications include the programming language record type and row-based storage, in which data is organized as a sequence of records, such as a database table, spreadsheet, or comma-separated values (CSV) file. In general, a record type value is stored in memory, whereas row-based storage resides in mass storage.

A record type is a data type that describes such values and variables. Most modern programming languages allow the programmer to define new record types. The definition includes specifying the data type of each field and an identifier (name or label) by which it can be accessed. In type theory, product types (with no field names) are generally preferred due to their simplicity, but proper record types are studied in languages such as System F-sub. Since type-theoretical records may contain first-class function-typed fields in addition to data, they can express many features of object-oriented programming.

Terminology

In the context of storage, such as in a database or spreadsheet, a record is often called a row and each field is called a column.^[2]^[3]^[4]^[5]

In object-oriented programming, an object is a record that contains state and method fields.

A record is similar to a mathematical tuple, although a tuple may or may not be considered a record, and vice versa, depending on conventions and the programming language. In the same vein, a record type can be viewed as the computer language analog of the Cartesian product of two or more mathematical sets, or the implementation of an abstract product type in a specific language.

A record differs from an array in that a record's elements (fields) are determined by the definition of the record and may be heterogeneous, whereas an array is a collection of elements with the same type.^[6]

The parameters of a function can be viewed collectively as the fields of a record, and passing arguments to the function can be viewed as assigning the input parameters to the record fields. At a low level, a function call includes an activation record or call frame that contains the parameters as well as other fields such as local variables and the return address.

History

Journal sheet from the 1880 United States census, showing tabular data with rows of data, each a record corresponding to a single person.

The concept of a record can be traced to various types of tables and ledgers used in accounting since remote times. The modern notion of records in computer science, with fields of well-defined type and size, was already implicit in 19th century mechanical calculators, such as Babbage's Analytical Engine.^[7]^[8]

The original machine-readable medium used for data (as opposed to control) was the punch card used for records in the 1890 United States census: each punch card was a single record. Compare the journal entry from 1880 and the punch card from 1895. Records were well-established in the first half of the 20th century, when most data processing was done using punched cards. Typically, each record of a data file would be recorded on one punched card, with specific columns assigned to specific fields. Generally, a record was the smallest unit that could be read from external storage (e.g., card reader, tape, or disk). The contents of punchcard-style records were originally called "unit records" because punchcards had pre-determined document lengths.^[9] When storage systems became more advanced with the use of hard drives and magnetic tape, variable-length records became the standard. A variable-length record is a record in which the size of the record in bytes is approximately equal to the sum of the sizes of its fields. This was not possible before more advanced storage hardware was invented, because all punchcards had to conform to pre-determined document lengths that the computer could read, since at the time the cards had to be physically fed into a machine.

Most machine language implementations and early assembly languages did not have special syntax for records, but the concept was available (and extensively used) through the use of index registers, indirect addressing, and self-modifying code. Some early computers, such as the IBM 1620, had hardware support for delimiting records and fields, and special instructions for copying such records.

The concept of records and fields was central in some early file sorting and tabulating utilities, such as IBM's Report Program Generator (RPG).

COBOL was the first widespread programming language to support record types,^[10] and its record definition facilities were quite sophisticated at the time. The language allows for the definition of nested records with alphanumeric, integer, and fractional fields of arbitrary size and precision, and fields that automatically format any value assigned to them (e.g., insertion of currency signs, decimal points, and digit group separators). Each file is associated with a record variable into which data is read or from which it is written. COBOL also provides a MOVE CORRESPONDING statement that assigns corresponding fields of two records according to their names.

The early languages developed for numeric computing, such as FORTRAN (up to FORTRAN IV) and ALGOL 60, did not support record types, but later versions of those languages, such as Fortran 90 and ALGOL 68, did add them. The original Lisp programming language also lacked records (except for the built-in cons cell), but its S-expressions provided an adequate surrogate. The Pascal programming language was one of the first languages to fully integrate record types with other basic types into a logically consistent type system. The PL/I language provided for COBOL-style records. The C language provides the record concept using structs. Most languages designed after Pascal (such as Ada, Modula, and Java) also supported records. Java added a dedicated record construct as a standard feature in Java 16 (after previews in Java 14 and 15), and C# added records in C# 9.0. Records were introduced to Java to simplify data aggregate classes with less boilerplate, making all fields final and private, and automatically generating all-argument constructors, getters, and the methods boolean equals(), int hashCode(), and String toString(). Java records all implicitly extend java.lang.Record.

Although records are not often used in their original context anymore (i.e. being used solely for the purpose of containing data), records influenced newer object-oriented programming languages and relational database management systems. Since records provided more modularity in the way data was stored and handled, they are better suited to representing complex, real-world concepts than the primitive data types provided by default in languages. This influenced later languages such as C++, Python, JavaScript, and Objective-C, which address the same modularity needs of programming.^[11] Objects in these languages are essentially records with the addition of methods and inheritance, which allow programmers to manipulate the way data behaves instead of only the contents of a record. Many programmers regard records as obsolete now, since object-oriented languages have features that far surpass what records are capable of. On the other hand, many programmers argue that the low overhead and the ability to use records in assembly language make records still relevant when programming with low levels of abstraction. Today, the most popular languages on the TIOBE index, an indicator of the popularity of programming languages, have been influenced in some way by records because they are object-oriented.^[12] Query languages such as SQL and Object Query Language were also influenced by the concept of records. These languages allow the programmer to store sets of data, which are essentially records, in tables.^[13] This data can then be retrieved using a primary key. The tables themselves are also records, which may have a foreign key: a key that references data in another table.

Record type

Operations

Operations for a record type include:

Declaration of a record type, including the position, type, and (possibly) name of each field
Declaration of a record variable, that is, a variable typed as a record type
Construction of a record value, possibly with field value initialization
Reading and writing of a record field value
Comparison of two records for equality
Computation of a standard hash value for the record

Some languages provide facilities that enumerate the fields of a record. This facility is needed to implement certain services such as debugging, garbage collection, and serialization. It requires some degree of type polymorphism.
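As an illustration, the operations listed above, together with field enumeration, might look as follows in Python. The `Point` type is a hypothetical example. Because the record is declared immutable here so that it can be hashed, "writing" a field is shown as building an updated copy.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)          # frozen=True makes instances hashable
class Point:                     # declaration of a record type
    x: int
    y: int


p = Point(1, 2)                  # construction with field initialization
q = replace(p, y=5)              # "write" by building an updated copy

print(p.x)                                # 1 (reading a field)
print(p == Point(1, 2))                   # True (field-by-field equality)
print(hash(p) == hash(Point(1, 2)))       # True (equal records hash alike)

# Enumerating the fields, e.g. as a step toward serialization
serialized = {f.name: getattr(q, f.name) for f in fields(q)}
print(serialized)                         # {'x': 1, 'y': 5}
```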

In contexts that support record subtyping, operations include adding and removing fields of a record. A specific record type implies that a specific set of fields are present, but values of that type may contain additional fields. A record with fields x, y, and z would thus belong to the type of records with fields x and y, as would a record with fields x, y, and r. The rationale is that passing an (x,y,z) record to a function that expects an (x,y) record as argument should work, since that function will find all the fields it requires within the record. Many ways of practically implementing records in programming languages would have trouble with allowing such variability, but the matter is a central characteristic of record types in more theoretical contexts.
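The subtyping idea can be approximated in Python with structural typing via `typing.Protocol`. This is a sketch only: the names `HasXY`, `XYZ`, `XYR`, and `norm_squared` are invented for the example, and the static check is performed by an external type checker, not at run time.

```python
from dataclasses import dataclass
from typing import Protocol


class HasXY(Protocol):
    """The type of records with fields x and y (and possibly more)."""
    x: float
    y: float


@dataclass
class XYZ:
    x: float
    y: float
    z: float


@dataclass
class XYR:
    x: float
    y: float
    r: float


def norm_squared(p: HasXY) -> float:
    # Only x and y are read; any extra fields are ignored.
    return p.x * p.x + p.y * p.y


print(norm_squared(XYZ(3.0, 4.0, 12.0)))   # 25.0
print(norm_squared(XYR(1.0, 2.0, 0.5)))    # 5.0
```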

Assignment and comparison

Most languages allow assignment between records that have exactly the same record type (including same field types and names, in the same order). Depending on the language, however, two record data types defined separately may be regarded as distinct types even if they have exactly the same fields.

Some languages may also allow assignment between records whose fields have different names, matching each field value with the corresponding field variable by their positions within the record; so that, for example, a complex number with fields called real and imag can be assigned to a 2D point record variable with fields X and Y. In this alternative, the two operands are still required to have the same sequence of field types. Some languages may also require that corresponding types have the same size and encoding as well, so that the whole record can be assigned as an uninterpreted bit string. Other languages may be more flexible in this regard, and require only that each value field can be legally assigned to the corresponding variable field; so that, for example, a short integer field can be assigned to a long integer field, or vice versa.

Other languages (such as COBOL) may match fields and values by their names, rather than positions.
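Python has no built-in assignment between unrelated record types, but both matching strategies can be sketched with small helper functions. The helpers `assign_by_position` and `assign_by_name` and all type names below are hypothetical, written only to illustrate the two approaches.

```python
from dataclasses import dataclass, astuple, fields


@dataclass
class Complex:
    real: float
    imag: float


@dataclass
class Point2D:
    X: float
    Y: float


@dataclass
class Employee:
    name: str
    dept: str
    salary: int


@dataclass
class Badge:
    dept: str
    name: str


def assign_by_position(target_type, source):
    # Match fields by position; field names are ignored.
    return target_type(*astuple(source))


def assign_by_name(target, source):
    # Copy every field of source whose name also exists in target.
    target_names = {f.name for f in fields(target)}
    for f in fields(source):
        if f.name in target_names:
            setattr(target, f.name, getattr(source, f.name))


p = assign_by_position(Point2D, Complex(1.5, -2.0))
print(p)            # Point2D(X=1.5, Y=-2.0)

badge = Badge(dept="", name="")
assign_by_name(badge, Employee("Ada", "R&D", 100))
print(badge)        # Badge(dept='R&D', name='Ada')
```

The name-based helper is loosely modeled on the behavior described for COBOL's MOVE CORRESPONDING; it makes no attempt to reproduce that statement's full rules.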

These same possibilities apply to the comparison of two record values for equality. Some languages may also allow order comparisons ('<' and '>'), using the lexicographic order based on the comparison of individual fields.^{[citation needed]}

PL/I allows both of the preceding types of assignment, and also allows structure expressions, such as a = a+1; where "a" is a record, or structure in PL/I terminology.
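As one concrete instance of the equality and lexicographic order comparisons described above, Python's `dataclasses` can generate both. The `Version` type is a hypothetical example.

```python
from dataclasses import dataclass


@dataclass(order=True)   # generates <, <=, >, >= comparing fields in order
class Version:
    major: int
    minor: int


print(Version(1, 9) == Version(1, 9))   # True
print(Version(1, 9) < Version(2, 0))    # True: major is compared first
print(Version(2, 1) > Version(2, 0))    # True: tie on major, minor decides
```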

Algol 68's distributive field selection

In Algol 68, if Pts was an array of records, each with integer fields X and Y, one could write Y of Pts to obtain an array of integers, consisting of the Y fields of all the elements of Pts. As a result, the statements Y of Pts[3] := 7 and (Y of Pts)[3] := 7 would have the same effect.

Pascal's "with" statement

In Pascal, the command with R do S would execute the command sequence S as if all the fields of record R had been declared as variables. Similarly to entering a different namespace in an object-oriented language like C#, it is no longer necessary to use the record name as a prefix to access the fields. So, instead of writing Pt.X := 5; Pt.Y := Pt.X + 3 one could write with Pt do begin X := 5; Y := X + 3 end.

Representation in memory

The representation of a record in memory varies depending on the programming language. Often, fields are stored in consecutive memory locations, in the same order as they are declared in the record type. This may result in two or more fields stored into the same word of memory; indeed, this feature is often used in systems programming to access specific bits of a word. On the other hand, most compilers will add padding fields, mostly invisible to the programmer, in order to comply with alignment constraints imposed by the machine, for example that a floating point field must occupy a single word.
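Both effects can be observed from Python through the `ctypes` module, which lays out structures following the platform's C conventions. The field names below are hypothetical, and the sizes printed for `Sample` depend on the machine, so no specific values are claimed for them.

```python
import ctypes


class Sample(ctypes.Structure):
    # Fields are laid out in declaration order with native alignment.
    _fields_ = [
        ("flag", ctypes.c_char),     # 1 byte
        ("value", ctypes.c_double),  # usually aligned to its own size
    ]


class Flags(ctypes.Structure):
    # Bit fields: three fields sharing the bits of one 8-bit unit.
    _fields_ = [
        ("ready", ctypes.c_uint8, 1),
        ("mode", ctypes.c_uint8, 3),
        ("level", ctypes.c_uint8, 4),
    ]


payload = ctypes.sizeof(ctypes.c_char) + ctypes.sizeof(ctypes.c_double)
print("sum of field sizes:", payload)
print("size of record:    ", ctypes.sizeof(Sample))   # >= payload (padding)
print("offset of value:   ", Sample.value.offset)

f = Flags(ready=1, mode=5, level=9)
print("size of Flags:     ", ctypes.sizeof(Flags))
print(f.ready, f.mode, f.level)                       # 1 5 9
```

Any difference between the sum of the field sizes and the size of `Sample` is padding inserted to satisfy alignment.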

Some languages may implement a record as an array of addresses pointing to the fields (and, possibly, to their names and/or types). Objects in object-oriented languages are often implemented in rather complicated ways, especially in languages that allow multiple class inheritance.

Self-defining records

A self-defining record is a type of record which contains information to identify the record type and to locate information within the record. It may contain the offsets of elements; the elements can therefore be stored in any order or may be omitted.^[14] The information stored in a self-defining record can be interpreted as metadata for the record, similar to the metadata that UNIX keeps about a file, such as the record's creation time and its size in bytes. Alternatively, various elements of the record, each including an element identifier, can simply follow one another in any order.
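The second variant, in which each element carries its own identifier, can be sketched as a simple tag-length-value encoding. The tag numbers, the 1-byte tag and 2-byte length layout, and the function names are assumptions made for this example, not a standard format.

```python
import struct

# Hypothetical element identifiers for this sketch.
TAG_NAME, TAG_AGE = 1, 2


def encode(elements):
    """Pack (tag, payload) pairs as tag byte, 2-byte length, then payload."""
    out = bytearray()
    for tag, payload in elements:
        out += struct.pack(">BH", tag, len(payload)) + payload
    return bytes(out)


def decode(blob):
    """Return {tag: payload}; order and presence of elements may vary."""
    result, pos = {}, 0
    while pos < len(blob):
        tag, length = struct.unpack_from(">BH", blob, pos)
        pos += 3                              # size of the ">BH" header
        result[tag] = blob[pos:pos + length]
        pos += length
    return result


a = encode([(TAG_NAME, b"Ada"), (TAG_AGE, bytes([36]))])
b = encode([(TAG_AGE, bytes([36])), (TAG_NAME, b"Ada")])   # other order
c = encode([(TAG_NAME, b"Ada")])                           # age omitted

print(decode(a) == decode(b))     # True
print(TAG_AGE in decode(c))       # False
```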

Key field

A record, especially in the context of row-based storage, may include key fields that allow indexing the records of a collection. A primary key is unique throughout all stored records: no two records may share the same primary key value.^[15] For example, an employee file might contain employee number, name, department, and salary. The employee number will be unique in the organization and will be the primary key. Depending on the storage medium and file organization, the employee number might be indexed, that is, also stored in a separate file to make the lookup faster. The department code is not necessarily unique; it may also be indexed, in which case it would be considered a secondary key, or alternate key.^[16] If it is not indexed, the entire employee file would have to be scanned to produce a listing of all employees in a specific department. Keys are usually chosen so as to minimize the chance that one key value maps to multiple records. For example, the salary field would not normally be considered usable as a key, since many employees will likely have the same salary.
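The employee example can be sketched with in-memory indexes. The records and field names below are invented sample data: a dictionary keyed by employee number plays the role of the primary index, and a dictionary of lists plays the role of the secondary index on department.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Employee:
    number: int      # primary key: unique
    name: str
    department: str  # secondary key: may repeat
    salary: int


employees = [
    Employee(101, "Ada", "R&D", 5000),
    Employee(102, "Grace", "R&D", 5000),
    Employee(103, "Edsger", "Ops", 4800),
]

# Primary index: exactly one record per key value.
by_number = {}
for e in employees:
    if e.number in by_number:
        raise ValueError(f"duplicate primary key {e.number}")
    by_number[e.number] = e

# Secondary index: a key value maps to every record that shares it.
by_department = defaultdict(list)
for e in employees:
    by_department[e.department].append(e.number)

print(by_number[102].name)        # Grace
print(by_department["R&D"])       # [101, 102]
```

Without `by_department`, listing the employees of one department would require scanning the whole `employees` list.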

References

[1] Felleisen, Matthias (2001). How To Design Programs. MIT Press. pp. 53, 60. ISBN 978-0262062183.
[2] "Computer Science Dictionary Definitions". Computing Students. Retrieved Jan 22, 2018.
[3] Radványi, Tibor (2014). Database Management Systems. Eszterházy Károly College. p. 19. Archived from the original on 2018-09-23. Retrieved 23 September 2018.
[4] Kahate, Atul (2006). Introduction to Database Management Systems. Pearson. p. 3. ISBN 978-81-317-0078-5. Retrieved 23 September 2018.
[5] Connolly, Thomas (2004). Database Solutions: A Step by Step Guide to Building Databases (2nd ed.). Pearson. p. 7. ISBN 978-0-321-17350-8.
[6] Pape, Tobias; Kirilichev, Vasily; Bolz, Carl Friedrich; Hirschfeld, Robert (2017-01-13). "Record data structures in racket: usage analysis and optimization". ACM SIGAPP Applied Computing Review. 16 (4): 25–37. doi:10.1145/3040575.3040578. ISSN 1559-6915. S2CID 14306162.
[7] Bromley, Allan (October 1998). "Charles Babbage's Analytical Engine, 1838". IEEE Annals of the History of Computing. 20 (4): 29–45. doi:10.1109/85.728228. S2CID 2285332. Retrieved 23 September 2018.
[8] Swade, Doron. "Automatic Computation: Charles Babbage and Computational Method". The Rutherford Journal. Retrieved 23 September 2018.
[9] Edwin D. Reilly; Anthony Ralston; David Hemmendinger, eds. (2003). Encyclopedia of computer science (4th ed.). Chichester, UK: Wiley. ISBN 978-1-84972-160-8. OCLC 436846454.
[10] Sebesta, Robert W. (1996). Concepts of Programming Languages (Third ed.). Addison-Wesley Publishing Company, Inc. p. 218. ISBN 0-8053-7133-8.
[11] Leavens, Gary T.; Weihl, William E. (1990). "Reasoning about object-oriented programs that use subtypes". Proceedings of the European conference on object-oriented programming on Object-oriented programming systems, languages, and applications - OOPSLA/ECOOP '90. New York, New York, USA: ACM Press. pp. 212–223. doi:10.1145/97945.97970. ISBN 0-201-52430-9. S2CID 46526.
[12] "Index: The Software Quality Company". TIOBE.com. Retrieved 2022-03-01.
[13] "What is a Relational Database (RDBMS)?". Oracle. Retrieved February 28, 2022.
[14] Kraimer, Martin R. "EPICS Input / Output Controller (IOC) Application Developer's Guide". Argonne National Laboratory. Retrieved November 25, 2015.
[15] "Add or change a table's primary key in Access". support.microsoft.com. Retrieved 2022-03-01.
[16] "Alternate key - Oracle FAQ". www.orafaq.com. Retrieved 2022-03-01.
