Flat-file database

From Wikipedia, the free encyclopedia
Database stored as flat data
Not to be confused with Flat file system or Flat file (hand tool).
Example of a flat-file model[1]

A flat-file database is a tabular flat file in which each record is semantically independent: it can be meaningfully interpreted and manipulated independently of the other records in the table. The term flat loosely refers to data that is record-based and sequential yet lacks more complicated aspects such as nesting, relationships and metadata (with the exception of column headers). Relationships can be inferred from the data, but the format provides no special accommodations for them.

Format


A flat-file database may be stored as plain text or as binary (not character-encoded) data. Plain-text files are typically formatted as one record per line,[2] with fields that are either delimiter-separated or fixed-width.

In delimiter-separated values files, the fields are separated by a character or string called the delimiter. Common variants are comma-separated values (CSV), where the delimiter is a comma; tab-separated values (TSV), where the delimiter is the tab character; space-separated values; and vertical-bar-separated values, where the delimiter is |. If the delimiter is allowed inside a field, there must be a way to distinguish delimiter characters or strings that are meant literally from those that separate fields. For example, consider the sentence "If I have to, I'll do it myself." To encode it in CSV, there needs to be a way to prevent the comma from splitting the field. Several strategies to prevent delimiter collision exist.
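One widely used strategy is to wrap the affected field in quotation marks. The following is a minimal sketch using Python's standard csv module; the two-field record layout (an id followed by the sentence) is an assumption made for illustration.

```python
import csv
import io

sentence = "If I have to, I'll do it myself."

# Write one record with two fields. With its default settings the csv
# module quotes any field that contains the delimiter.
buffer = io.StringIO()
csv.writer(buffer).writerow(["1", sentence])
encoded = buffer.getvalue()

# A naive split treats the comma inside the sentence as a field boundary.
naive_fields = encoded.rstrip("\r\n").split(",")

# The csv reader honors the quotes and recovers the original two fields.
parsed_fields = next(csv.reader(io.StringIO(encoded)))

print(len(naive_fields), len(parsed_fields))
```

The naive split yields three pieces, while the quote-aware reader returns the two fields that were written.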

With fixed-width formats, each field has a fixed length, with extra spaces added as padding where needed. The field lengths can be predefined and known ahead of time (i.e. stated in the format's specification), or parsed from a header. With predefined lengths, fields are limited to a maximum length, and the need for longer fields may arise some time after the format is defined. Possible workarounds include abbreviating phrases, replacing values with links (e.g. a URI pointing to the value), and splitting a file into multiple files. With delimiter-separated formats, determining the field boundaries requires finding the delimiters, which incurs some computational overhead. This is not needed for fixed-width formats. However, fixed-width formats can lead to unnecessarily large files if fields tend to be shorter than the lengths reserved for them.
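The trade-offs above can be sketched with a predefined layout. The field names and widths in LAYOUT are hypothetical and stand in for what a real format's specification would state.

```python
# Hypothetical layout: (field name, width in characters).
LAYOUT = [("id", 4), ("name", 10), ("team", 6)]


def format_record(record):
    """Pad each value with spaces to its reserved width."""
    parts = []
    for name, width in LAYOUT:
        value = str(record[name])
        if len(value) > width:
            # Predefined widths impose a hard maximum on each field.
            raise ValueError(f"{name!r} exceeds its {width}-character limit")
        parts.append(value.ljust(width))
    return "".join(parts)


def parse_record(line):
    """Slice the line at fixed offsets; no delimiter search is needed."""
    record = {}
    offset = 0
    for name, width in LAYOUT:
        record[name] = line[offset:offset + width].rstrip(" ")
        offset += width
    return record


line = format_record({"id": 4, "name": "Richard", "team": "Blues"})
print(repr(line))
print(parse_record(line))
```

Every record occupies the full 20 characters regardless of how short its values are, which illustrates the file-size cost, and a name longer than 10 characters is rejected rather than stored.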

Delimiters can be used alongside a notation stating the length of each field. For example, 5apple|9pineapple specifies the length (5 and 9) of each field. This is called declarative notation. It has low overhead and trivially avoids delimiter collisions, but it is brittle when edited manually.
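A reader for this notation might look like the sketch below. It assumes decimal length prefixes, a | delimiter, and fields that do not begin with a digit (otherwise the prefix would be ambiguous in this simple form); the function names are illustrative only.

```python
DIGITS = "0123456789"


def encode_fields(fields):
    """Prefix each field with its length and join the results with '|'."""
    return "|".join(f"{len(field)}{field}" for field in fields)


def decode_fields(text):
    """Read length-prefixed fields; the content is never scanned for '|'."""
    fields = []
    pos = 0
    while pos < len(text):
        # Read the decimal length prefix.
        start = pos
        while pos < len(text) and text[pos] in DIGITS:
            pos += 1
        if pos == start:
            raise ValueError(f"expected a length at offset {start}")
        length = int(text[start:pos])
        # Take exactly `length` characters, whatever they contain.
        field = text[pos:pos + length]
        if len(field) != length:
            raise ValueError("field is shorter than its declared length")
        fields.append(field)
        pos += length
        # Consume the delimiter that separates this field from the next.
        if pos < len(text):
            if text[pos] != "|":
                raise ValueError(f"expected '|' at offset {pos}")
            pos += 1
    return fields


print(decode_fields("5apple|9pineapple"))
```

Because the reader trusts the declared length, a field such as "a|b" survives a round trip unchanged. The same property makes hand edits fragile: changing apple to apples without updating the 5 makes decode_fields raise an error.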

History


Herman Hollerith's work for the US Census Bureau, first exercised in the 1890 United States census and involving data tabulated via hole punches in paper cards,[3] is sometimes considered the first computerized flat-file database, as it included no cards indexing other cards or otherwise relating the individual cards to one another, save by their group membership.[citation needed]

In the 1980s, configurable flat-file database computer applications were popular on the IBM PC and the Macintosh. These programs were designed to make it easy for individuals to design and use their own databases, and were almost on par with word processors and spreadsheets in popularity.[citation needed] Examples of flat-file database software include early versions of FileMaker, the shareware PC-File, and the popular dBase.

Flat-file databases are ubiquitous because they are easy to write and edit, and they suit many purposes in an uncomplicated way.

Linear stores of NoSQL data, JSON data, primitive spreadsheets (perhaps comma-separated or tab-delimited), and text files can all be seen as flat-file databases because they lack integrated indexes, built-in references between data elements, and complex data types. Programs to manage collections of books or appointments and address books may use single-purpose flat-file databases, storing and retrieving information from flat files unadorned with indexes or pointing systems.

While a user can write a table of contents into a text file, the text file format itself does not include a concept of a table of contents. While a user may write "friends with Kathy" in the "Notes" section for John's contact information, this is interpreted by the user rather than a built-in feature of the database. When a database system begins to recognize and codify relationships between records, it begins to drift away from being "flat," and when it has a detailed system for describing types and hierarchical relationships, it is now too structured to be considered "flat."

Examples


In the context of Unix-like systems, the files /etc/passwd and /etc/group are flat-file databases.
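As an illustration, /etc/passwd stores one record per line with seven colon-separated fields. The sketch below parses text in that format. The sample records are invented, the dictionary keys are labels chosen for this example, and skipping lines that start with # is an assumption that holds on only some systems.

```python
# Conventional order of the seven fields in a passwd record.
PASSWD_FIELDS = ("name", "password", "uid", "gid", "gecos", "home", "shell")

# Invented sample data in passwd format, used instead of the real file
# so that the sketch runs on any platform.
SAMPLE = """\
root:x:0:0:root:/root:/bin/bash
amy:x:1000:1000:Amy:/home/amy:/bin/sh
"""


def parse_passwd(text):
    """Turn passwd-style text into a list of dictionaries, one per record."""
    records = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # assumption: blank and comment lines are skipped
        values = line.split(":")
        if len(values) != len(PASSWD_FIELDS):
            raise ValueError(f"expected 7 fields, got {len(values)}: {line!r}")
        records.append(dict(zip(PASSWD_FIELDS, values)))
    return records


users = parse_passwd(SAMPLE)
shells = {user["name"]: user["shell"] for user in users}
print(shells)
```

Each record can be interpreted on its own, and nothing in the format links one line to another, which is what makes the file flat.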

The following illustrates typical elements of a flat-file database.

id    name    team
1     Amy     Blues
2     Bob     Reds
3     Chuck   Blues
4     Richard Blues
5     Ethel   Reds
6     Fred    Blues
7     Gilly   Blues
8     Hank    Reds
9     Hank    Blues

The information is arranged as a table – a series of rows and columns.

The first row specifies the field names that are associated with the values of each row. The columns consist of an identifier (id), a person's name (name) and a team name (team).

Columns are separated by whitespace characters. This is also called indentation or "fixed-width" data formatting. Another common convention is to separate columns using one or more delimiter characters, such as a tab or comma.

Each column may be restricted to a specific data type, with the restriction usually enforced by convention rather than by the format itself.

Each row or record meets the standard definition of a tuple under relational algebra. This example depicts a series of 3-tuples.

Since the formal operations possible with a text file are usually more limited than desired, the text in the above example would ordinarily represent an intermediate state of the data before it is transferred into a database management system.
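Such a transfer can be sketched as follows, using Python's built-in sqlite3 module as a stand-in for a database management system. The table name members and the column types are assumptions made for this example.

```python
import sqlite3

# The example table, with columns separated by runs of spaces.
TABLE = """\
id    name    team
1     Amy     Blues
2     Bob     Reds
3     Chuck   Blues
4     Richard Blues
5     Ethel   Reds
6     Fred    Blues
7     Gilly   Blues
8     Hank    Reds
9     Hank    Blues
"""

lines = TABLE.splitlines()
header = lines[0].split()  # field names from the first row

# Each remaining line becomes a 3-tuple. The id is converted to int here
# because the flat file itself enforces no data types.
rows = []
for line in lines[1:]:
    record_id, name, team = line.split()
    rows.append((int(record_id), name, team))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT, team TEXT)"
)
conn.executemany("INSERT INTO members VALUES (?, ?, ?)", rows)

# A grouped query that the plain text file cannot answer by itself.
counts = dict(
    conn.execute("SELECT team, COUNT(*) FROM members GROUP BY team")
)
conn.close()

print(header, counts)
```

Splitting on whitespace works here only because no value contains a space; a name such as "Mary Ann" would need a delimiter or fixed column offsets instead.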

See also

  • Awk – Text processing programming language
  • Berkeley DB – Software library providing embedded database for key/value data
  • Recutils – Toolset for using plain text files as a database

References

  1. "Data Integration Glossary" (PDF). U.S. Department of Transportation. August 2001. p. 10. Archived from the original (PDF) on March 20, 2009. Retrieved April 16, 2025.
  2. Fowler, Glenn (1994), "cql: Flat-file database query language", WTEC'94: Proceedings of the USENIX Winter 1994 Technical Conference on USENIX Winter 1994 Technical Conference
  3. Blodgett, John H.; Schultz, Claire K. (1969). "Herman Hollerith: data processing pioneer". American Documentation. 20 (3): 221–226. doi:10.1002/asi.4630200307. ISSN 1936-6108.
Retrieved from "https://en.wikipedia.org/w/index.php?title=Flat-file_database&oldid=1311048852"