Backus–Naur form

From Wikipedia, the free encyclopedia
Formalism to describe programming languages
Not to be confused with Boyce–Codd normal form.

In computer science, Backus–Naur form (BNF, pronounced /ˌbækəsˈnaʊər/), also known as Backus normal form, is a notation system for defining the syntax of programming languages and other formal languages, developed by John Backus and Peter Naur. It is a metasyntax for context-free grammars, providing a precise way to outline the rules of a language's structure.

It has been widely used in official specifications, manuals, and textbooks on programming language theory, as well as to describe document formats, instruction sets, and communication protocols. Over time, variations such as extended Backus–Naur form (EBNF) and augmented Backus–Naur form (ABNF) have emerged, building on the original framework with added features.

Structure

BNF specifications outline how symbols are combined to form syntactically valid sequences. Each BNF specification consists of three core components: a set of non-terminal symbols, a set of terminal symbols, and a series of derivation rules.[1] Non-terminal symbols represent categories or variables that can be replaced, while terminal symbols are the fixed, literal elements (such as keywords or punctuation) that appear in the final sequence. Derivation rules provide the instructions for replacing non-terminal symbols with specific combinations of symbols.

A derivation rule is written in the format:

<symbol> ::= __expression__

where:

  • <symbol>[2] is a non-terminal symbol, enclosed in angle brackets (<>), identifying the category to be replaced;
  • ::= is a metasymbol meaning "is replaced by";
  • __expression__ is the replacement, consisting of one or more sequences of symbols, either terminal symbols (e.g., literal text like "Sr." or ",") or non-terminal symbols (e.g., <last-name>), with options separated by a vertical bar (|) to indicate alternatives.

For example, in the rule <opt-suffix-part> ::= "Sr." | "Jr." | "", the entire line is the derivation rule; "Sr.", "Jr.", and "" (an empty string) are terminal symbols; and <opt-suffix-part> is a non-terminal symbol.

Generating a valid sequence involves starting with a designated start symbol and iteratively applying the derivation rules.[3] This process can extend sequences incrementally. To allow flexibility, some BNF definitions include an optional "delete" symbol (represented as an empty alternative, e.g., <item> ::= <thing> | ), enabling the removal of certain elements while maintaining syntactic validity.[3]
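The derivation process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy grammar, the names GRAMMAR, derive, pick_first and pick_last, and the dictionary representation are all hypothetical and not part of BNF itself. Each step rewrites the leftmost non-terminal using one of its alternatives until only terminal symbols remain.

```python
# Hypothetical toy grammar (not taken from any specification): each
# non-terminal maps to a list of alternatives, and each alternative is a
# list of symbols. Anything that is not a key is treated as a terminal.
GRAMMAR = {
    "<full-name>": [["<first-name>", " ", "<last-name>", "<opt-suffix-part>"]],
    "<first-name>": [["Ada"], ["Grace"]],
    "<last-name>": [["Lovelace"], ["Hopper"]],
    "<opt-suffix-part>": [[" Sr."], [" Jr."], []],  # [] is the empty alternative
}


def derive(grammar, start, choose):
    """Rewrite the leftmost non-terminal until only terminals remain.

    `choose(symbol, alternatives)` decides which alternative to apply.
    Returns the final string and the list of intermediate sequences.
    """
    sequence = [start]
    steps = [list(sequence)]
    while True:
        # Find the leftmost symbol that still has a derivation rule.
        index = next((i for i, s in enumerate(sequence) if s in grammar), None)
        if index is None:
            return "".join(sequence), steps
        replacement = choose(sequence[index], grammar[sequence[index]])
        sequence[index:index + 1] = replacement  # apply the rule
        steps.append(list(sequence))


def pick_first(symbol, alternatives):
    return alternatives[0]


def pick_last(symbol, alternatives):
    return alternatives[-1]


text, steps = derive(GRAMMAR, "<full-name>", pick_first)
print(text)                                           # Ada Lovelace Sr.
print(derive(GRAMMAR, "<full-name>", pick_last)[0])   # Grace Hopper
```

Choosing the empty alternative for <opt-suffix-part> simply removes that symbol from the sequence. With a recursive grammar, a chooser that always picks a recursive alternative would never terminate, so a practical generator would need a depth limit or a random choice.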

Example

A practical illustration of BNF is a specification for a simplified U.S. postal address:

 <postal-address> ::= <name-part> <street-address> <zip-part>

      <name-part> ::= <personal-part> <last-name> <opt-suffix-part> <EOL> | <personal-part> <name-part>

  <personal-part> ::= <first-name> | <initial> "."

 <street-address> ::= <house-num> <street-name> <opt-apt-num> <EOL>

       <zip-part> ::= <town-name> "," <state-code> <ZIP-code> <EOL>

<opt-suffix-part> ::= "Sr." | "Jr." | <roman-numeral> | ""
    <opt-apt-num> ::= "Apt" <apt-num> | ""

This translates into English as:

  • A postal address consists of a name-part, followed by a street-address part, followed by a zip-code part.
  • A name-part consists of either: a personal-part followed by a last name followed by an optional suffix (Jr., Sr., or dynastic number) and end-of-line, or a personal part followed by a name part (this rule illustrates the use of recursion in BNFs, covering the case of people who use multiple first and middle names and initials).[4]
  • A personal-part consists of either a first name or an initial followed by a dot.
  • A street address consists of a house number, followed by a street name, followed by an optional apartment specifier, followed by an end-of-line.
  • A zip-part consists of a town-name, followed by a comma, followed by a state code, followed by a ZIP-code followed by an end-of-line.
  • An opt-suffix-part consists of a suffix, such as "Sr.", "Jr." or a roman-numeral, or an empty string (i.e. nothing).
  • An opt-apt-num consists of a prefix "Apt" followed by an apartment number, or an empty string (i.e. nothing).

Note that many things (such as the format of a first-name, apartment number, ZIP-code, and Roman numeral) are left unspecified here. If necessary, they may be described using additional BNF rules.
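As a rough illustration of how the recursive <name-part> rule can be checked mechanically, the following Python sketch recognizes a line of whitespace-separated tokens. Because the grammar leaves <first-name>, <initial>, <last-name> and <roman-numeral> unspecified, the regular expressions below are assumptions made only for this example, as are the function names; an initial and its dot are treated as a single token such as "Q.".

```python
import re


def is_personal_part(token):
    # Assumption: a first name is a capitalized word; an initial is one
    # capital letter immediately followed by a dot.
    return re.fullmatch(r"[A-Z][a-z]+|[A-Z]\.", token) is not None


def is_last_name(token):
    # Assumption: a last name is a capitalized word, possibly with ' or -.
    return re.fullmatch(r"[A-Z][A-Za-z'-]+", token) is not None


def is_suffix(token):
    # "Sr.", "Jr.", or (assumed) a run of Roman-numeral letters.
    return token in ("Sr.", "Jr.") or re.fullmatch(r"[IVXLCDM]+", token) is not None


def match_name_part(tokens):
    """Return True if the token list matches <name-part>."""
    if not tokens or not is_personal_part(tokens[0]):
        return False
    rest = tokens[1:]
    # Alternative 1: <personal-part> <last-name> <opt-suffix-part> <EOL>
    if rest and is_last_name(rest[0]):
        tail = rest[1:]
        if not tail or (len(tail) == 1 and is_suffix(tail[0])):
            return True
    # Alternative 2: <personal-part> <name-part> (the recursive case)
    return match_name_part(rest)


print(match_name_part("John Q. Smith Jr.".split()))    # True
print(match_name_part("John Paul Smith III".split()))  # True
print(match_name_part("Smith".split()))                # False
```

The recognizer tries the non-recursive alternative first and falls back to the recursive one, which is how additional first names and initials are consumed one at a time.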

History

The concept of using rewriting rules to describe language structure traces back to at least Pāṇini, an ancient Indian Sanskrit grammarian who lived sometime between the 6th and 4th centuries BC.[5] His notation for describing Sanskrit word structure is equivalent in power to that of BNF and exhibits many similar properties.[6]

In Western society, grammar was long regarded as a subject for teaching rather than scientific study; descriptions were informal and targeted at practical usage. This perspective shifted in the first half of the 20th century, when linguists such as Leonard Bloomfield and Zellig Harris began attempts to formalize language description, including phrase structure. Meanwhile, mathematicians explored related ideas through string rewriting rules as formal logical systems, such as Axel Thue in 1914, Emil Post in the 1920s–40s,[7] and Alan Turing in 1936. Noam Chomsky, teaching linguistics to students of information theory at MIT, combined linguistics and mathematics, adapting Thue's formalism to describe natural language syntax. In 1956, he introduced a clear distinction between generative rules (those of context-free grammars) and transformation rules.[8][9]

BNF itself emerged when John Backus, a programming language designer at IBM, proposed a metalanguage of metalinguistic formulas to define the syntax of the new programming language IAL, known today as ALGOL 58, in 1959.[10] This notation was formalized in the ALGOL 60 report, where Peter Naur named it Backus normal form in the committee's 1963 report.[11] Whether Backus was directly influenced by Chomsky's work is uncertain.[12][13]

Donald Knuth argued in 1964 that BNF should be read as Backus–Naur form, as it is "not a normal form in the conventional sense," unlike Chomsky normal form.[14] In 1967, Peter Zilahy Ingerman suggested renaming it Pāṇini Backus form to acknowledge Pāṇini's earlier, independent development of a similar notation.[15]

In the ALGOL 60 report, Naur described BNF as a metalinguistic formula:[16]

Sequences of characters enclosed in the brackets <> represent metalinguistic variables whose values are sequences of symbols. The marks "::=" and "|" (the latter with the meaning of "or") are metalinguistic connectives. Any mark in a formula, which is not a variable or a connective, denotes itself. Juxtaposition of marks or variables in a formula signifies juxtaposition of the sequence denoted.

This is exemplified in the report's section 2.3, where comments are specified:

For the purpose of including text among the symbols of a program the following "comment" conventions hold:

The sequence of basic symbols:                                   is equivalent to
; comment <any sequence not containing ';'> ;                    ;
begin comment <any sequence not containing ';'> ;                begin
end <any sequence not containing 'end' or ';' or 'else'>         end

Equivalence here means that any of the three structures shown in the left column may be replaced, in any occurrence outside of strings, by the symbol shown in the same line in the right column without any effect on the action of the program.

Naur altered Backus's original symbols for ALGOL 60, changing :≡ to ::= and the overbarred "or" to |, using commonly available characters.[17]: 14

BNF is very similar to canonical-form Boolean algebra equations (used in logic-circuit design), reflecting Backus's mathematical background as a FORTRAN designer.[18] Studies of Boolean algebra were commonly part of a mathematics curriculum, which may have informed Backus's approach. Neither Backus nor Naur described the names enclosed in < > as non-terminals; Chomsky's terminology was not originally used in describing BNF. Naur later called them "classes" in 1961 course materials.[18] In the ALGOL 60 report, they were "metalinguistic variables," with other symbols defining the target language.

Saul Rosen, involved with the Association for Computing Machinery since 1947, contributed to the transition from IAL to ALGOL and edited Communications of the ACM. He described BNF as a metalanguage for ALGOL in his 1967 book.[19] Early ALGOL manuals from IBM, Honeywell, Burroughs, and Digital Equipment Corporation followed this usage.

Impact

BNF significantly influenced programming language development, notably as the basis for early compiler-compiler systems. Examples include Edgar T. Irons' "A Syntax Directed Compiler for ALGOL 60" and Brooker and Morris' "A Compiler Building System," which directly utilized BNF.[20] Others, like Schorre's META II, adapted BNF into a programming language, replacing < > with quoted strings and adding operators like $ for repetition, as in:

EXPR = TERM $('+' TERM .OUT('ADD') | '-' TERM .OUT('SUB'));

This influenced tools like yacc, a widely used parser generator rooted in BNF principles.[21] BNF remains one of the oldest computer-related notations still referenced today, though its variants often dominate modern applications.

Examples of its use as a metalanguage include defining arithmetic expressions:

<expr> ::= <term> | <expr> <addop> <term>

Here, <expr> can recursively include itself, allowing repeated additions.
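A left-recursive rule like this cannot be turned directly into a recursive function, because the function for <expr> would call itself before consuming any input. A common workaround, sketched below in Python, is to parse one <term> and then loop over any following <addop> <term> pairs, building a left-nested tree. The sketch assumes, purely for illustration, that <term> is an unsigned integer token and that <addop> is "+" or "-"; the function names are hypothetical.

```python
def parse_expr(tokens):
    """Parse <expr> ::= <term> | <expr> <addop> <term> from a token list.

    Returns an int for a single term, or a nested tuple (op, left, right).
    """
    pos = 0

    def parse_term():
        nonlocal pos
        # Assumption: a <term> is an unsigned integer literal.
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"expected a number at token {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    tree = parse_term()
    # Each further "<addop> <term>" wraps the tree built so far, which
    # mirrors the left recursion <expr> <addop> <term>.
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        pos += 1
        tree = (op, tree, parse_term())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


def evaluate(tree):
    """Evaluate a tree produced by parse_expr."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) - evaluate(right)


tree = parse_expr(["1", "-", "2", "+", "3"])
print(tree)            # ('+', ('-', 1, 2), 3)
print(evaluate(tree))  # 2
```

The left-nested tree groups the input as (1 - 2) + 3, which is the grouping the left-recursive rule implies; a right-nested reading would give 1 - (2 + 3) instead.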

BNF representation of itself

BNF's syntax itself may be represented with a BNF like the following:

 <syntax>         ::= <rule> | <rule> <syntax>
 <rule>           ::= <opt-whitespace> "<" <rule-name> ">" <opt-whitespace> "::=" <opt-whitespace> <expression> <line-end>
 <opt-whitespace> ::= " " <opt-whitespace> | ""
 <expression>     ::= <list> | <list> <opt-whitespace> "|" <opt-whitespace> <expression>
 <line-end>       ::= <opt-whitespace> <EOL> | <line-end> <line-end>
 <list>           ::= <term> | <term> <opt-whitespace> <list>
 <term>           ::= <literal> | "<" <rule-name> ">"
 <literal>        ::= '"' <text1> '"' | "'" <text2> "'"
 <text1>          ::= "" | <character1> <text1>
 <text2>          ::= "" | <character2> <text2>
 <character>      ::= <letter> | <digit> | <symbol>
 <letter>         ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z" | "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
 <digit>          ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
 <symbol>         ::= "|" | " " | "!" | "#" | "$" | "%" | "&" | "(" | ")" | "*" | "+" | "," | "-" | "." | "/" | ":" | ";" | ">" | "=" | "<" | "?" | "@" | "[" | "\" | "]" | "^" | "_" | "`" | "{" | "}" | "~"
 <character1>     ::= <character> | "'"
 <character2>     ::= <character> | '"'
 <rule-name>      ::= <letter> | <rule-name> <rule-char>
 <rule-char>      ::= <letter> | <digit> | "-"

Note that "" is the empty string.

The original BNF did not use quotes as shown in the <literal> rule. This assumes that no whitespace is necessary for proper interpretation of the rule.

<EOL> represents the appropriate line-end specifier (in ASCII, carriage-return, line-feed or both depending on the operating system). <rule-name> and <text> are to be substituted with a declared rule's name/label or literal text, respectively.

In the U.S. postal address example above, the entire block-quote is a <syntax>. Each line or unbroken grouping of lines is a rule; for example one rule begins with <name-part> ::=. The other part of that rule (aside from a line-end) is an expression, which consists of two lists separated by a vertical bar |. These two lists consist of some terms (four terms and two terms, respectively). Each term in this particular rule is a rule-name.
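The decomposition of a rule into a rule-name, an expression, lists and terms can be made concrete with a small Python sketch. It is only an approximation of the grammar above, under stated assumptions: it handles one rule per call, accepts any whitespace between terms, allows any character other than the closing quote inside a literal, and does not reject empty lists. The names TOKEN, HEAD and parse_rule are hypothetical.

```python
import re

# One term or separator: <rule-name>, "double-quoted", 'single-quoted', or |
TOKEN = re.compile(r"""<([A-Za-z][A-Za-z0-9-]*)>|"([^"]*)"|'([^']*)'|(\|)""")
HEAD = re.compile(r"\s*<([A-Za-z][A-Za-z0-9-]*)>\s*")


def parse_rule(line):
    """Split one BNF rule into (rule_name, lists).

    `lists` holds one entry per alternative; each entry is a list of
    ("rule", name) or ("literal", text) terms.
    """
    head, separator, body = line.partition("::=")
    head_match = HEAD.fullmatch(head)
    if not separator or head_match is None:
        raise ValueError(f"not a BNF rule: {line!r}")
    lists = [[]]
    pos = 0
    while pos < len(body):
        if body[pos].isspace():  # skip optional whitespace
            pos += 1
            continue
        token = TOKEN.match(body, pos)
        if token is None:
            raise ValueError(f"unexpected text in rule body: {body[pos:]!r}")
        name, double_quoted, single_quoted, bar = token.groups()
        if bar:
            lists.append([])  # a vertical bar starts the next list
        elif name is not None:
            lists[-1].append(("rule", name))
        else:
            literal = double_quoted if double_quoted is not None else single_quoted
            lists[-1].append(("literal", literal))
        pos = token.end()
    return head_match.group(1), lists


name, lists = parse_rule(
    "<name-part> ::= <personal-part> <last-name> <opt-suffix-part> <EOL>"
    " | <personal-part> <name-part>"
)
print(name)                      # name-part
print([len(l) for l in lists])   # [4, 2]
```

Because a quoted literal is matched as a whole, a "|" or "<" inside quotes is read as literal text rather than as a separator or the start of a rule-name.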

Variants

EBNF

Main article: Extended Backus–Naur form

There are many variants and extensions of BNF, generally either for the sake of simplicity and succinctness, or to adapt it to a specific application. One common feature of many variants is the use of regular expression repetition operators such as * and +. The extended Backus–Naur form (EBNF) is a common one.

Another common extension is the use of square brackets around optional items. Although not present in the original ALGOL 60 report (instead introduced a few years later in IBM's PL/I definition), the notation is now universally recognised.

ABNF

Main article: ABNF

Augmented Backus–Naur form (ABNF) and Routing Backus–Naur form (RBNF)[22] are extensions commonly used to describe Internet Engineering Task Force (IETF) protocols.

Parsing expression grammars build on the BNF and regular expression notations to form an alternative class of formal grammar, which is essentially analytic rather than generative in character.

Others

Many BNF specifications found online today are intended to be human-readable and are non-formal. These often include many of the following syntax rules and extensions:

  • Optional items enclosed in square brackets: [<item-x>].
  • Items existing 0 or more times are enclosed in curly brackets or suffixed with an asterisk (*) such as <word> ::= <letter> {<letter>} or <word> ::= <letter> <letter>* respectively.
  • Items existing 1 or more times are suffixed with an addition (plus) symbol, +, such as <word> ::= <letter>+.
  • Terminals may appear in bold rather than italics, and non-terminals in plain text rather than angle brackets.
  • Where items are grouped, they are enclosed in simple parentheses.
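These extensions are conveniences rather than additions of expressive power: each can be rewritten using only alternation, recursion and the empty string. The Python sketch below shows one possible rewriting for a rule whose body is a single item followed by an operator. The function name desugar and the use of "?" to stand for square-bracket optionality are assumptions made for this example, not part of any BNF standard.

```python
def desugar(name, item, operator):
    """Return plain-BNF text equivalent to the extended rule <name> ::= item operator.

    operator is "?" (optional, i.e. [item]), "*" (zero or more, i.e. {item})
    or "+" (one or more).
    """
    if operator == "?":
        return f'<{name}> ::= {item} | ""'
    if operator == "*":
        # Zero or more: either one item followed by more, or nothing.
        return f'<{name}> ::= {item} <{name}> | ""'
    if operator == "+":
        # One or more: a single item, or one item followed by more.
        return f'<{name}> ::= {item} | {item} <{name}>'
    raise ValueError(f"unknown operator: {operator!r}")


print(desugar("word", "<letter>", "+"))       # <word> ::= <letter> | <letter> <word>
print(desugar("letters", "<letter>", "*"))    # <letters> ::= <letter> <letters> | ""
print(desugar("opt-item", "<item-x>", "?"))   # <opt-item> ::= <item-x> | ""
```

When a repeated or optional item appears in the middle of a longer rule, the same idea applies after first moving that item into a helper rule of its own.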

Software using BNF or variants

Software that accepts BNF (or a superset) as input

  • ANTLR, a parser generator written in Java
  • Coco/R, compiler generator accepting an attributed grammar in EBNF
  • DMS Software Reengineering Toolkit, program analysis and transformation system for arbitrary languages
  • GOLD, a BNF parser generator
  • RPA BNF parser.[23] Online (PHP) demo parsing: JavaScript, XML
  • XACT X4MR System,[24] a rule-based expert system for programming language translation
  • XPL Analyzer, a tool which accepts simplified BNF for a language and produces a parser for that language in XPL; it may be integrated into the supplied SKELETON program, with which the language may be debugged[25] (a SHARE contributed program, which was preceded by A Compiler Generator[26])
  • bnfparser2,[27] a universal syntax verification utility
  • bnf2xml,[28] marks up input with XML tags using advanced BNF matching
  • JavaCC,[29] the Java Compiler Compiler, a Java parser generator

Similar software

  • GNU bison, GNU version of yacc
  • Yacc, parser generator (most commonly used with the Lex preprocessor)
  • Racket's parser tools, lex and yacc-style parsing (Beautiful Racket edition)
  • Qlik Sense, a BI tool, uses a variant of BNF for scripting[30]
  • BNF Converter (BNFC[31]), operating on a variant called "labeled Backus–Naur form" (LBNF). In this variant, each production for a given non-terminal is given a label, which can be used as a constructor of an algebraic data type representing that nonterminal. The converter is capable of producing types and parsers for abstract syntax in several languages, including Haskell and Java

References

  1. ^ Janikow, Cezary Z. "What is BNF?" (PDF).
  2. ^ Naur, Peter (1961). "A COURSE OF ALGOL 60 PROGRAMMING with special reference to the DASK ALGOL system" (PDF). Copenhagen: Regnecentralen. Retrieved 26 March 2015.
  3. ^ a b Janikow, Cezary Z. "What is BNF?" (PDF).
  4. ^ This article is based on material taken from Backus-Naur+Form at the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.
  5. ^ "Panini biography". School of Mathematics and Statistics, University of St Andrews, Scotland. Retrieved 2014-03-22.
  6. ^ Ingerman, Peter Zilahy (March 1967). "'Pāṇini-Backus Form' Suggested". Communications of the ACM. 10 (3): 137. doi:10.1145/363162.363165. S2CID 52817672.
  7. ^ Post, Emil L. (1943). "Formal Reductions of the General Combinatorial Decision Problem". American Journal of Mathematics. 65 (2): 197–215. doi:10.2307/2371804.
  8. ^ Chomsky, Noam (1956). "Three models for the description of language". IRE Transactions on Information Theory. 2 (3): 113–24. doi:10.1109/TIT.1956.1056813. S2CID 19519474.
  9. ^ Chomsky, Noam (1957). Syntactic Structures. The Hague: Mouton.
  10. ^ Backus, J. W. (1959). "The syntax and semantics of the proposed international algebraic language of the Zurich ACM-GAMM Conference". Proceedings of the International Conference on Information Processing. UNESCO. pp. 125–132.
  11. ^ Revised ALGOL 60 report section 1.1. "ALGOL 60". Retrieved April 18, 2015.
  12. ^ Fulton, Scott M., III (20 March 2007). "John W. Backus (1924 - 2007)". BetaNews, Inc. Retrieved Jun 3, 2014.
  13. ^ John Backus (Sep 2006). Grady Booch (ed.). Oral History of John Backus (PDF) (Report). Computer History Museum. Here: p.25
  14. ^ Knuth, Donald E. (1964). "Backus Normal Form vs. Backus Naur Form". Communications of the ACM. 7 (12): 735–736. doi:10.1145/355588.365140. S2CID 47537431.
  15. ^ Ingerman, P. Z. (1967). "'Pāṇini Backus Form' suggested". Communications of the ACM. 10 (3): 137. doi:10.1145/363162.363165. S2CID 52817672.
  16. ^ Revised ALGOL 60 report section 1.1. "ALGOL 60". Retrieved April 18, 2015.
  17. ^ Backus, J. W. (1959). "The syntax and semantics of the proposed international algebraic language of the Zurich ACM-GAMM Conference". Proceedings of the International Conference on Information Processing. UNESCO. pp. 125–132.
  18. ^ a b Naur, Peter (1961). "A COURSE OF ALGOL 60 PROGRAMMING with special reference to the DASK ALGOL system" (PDF). Copenhagen: Regnecentralen. Retrieved 26 March 2015.
  19. ^ Saul Rosen (Jan 1967). Programming Systems and Languages. McGraw Hill Computer Science Series. New York: McGraw Hill. ISBN 978-0070537088.
  20. ^ McKeeman, W. M.; Horning, J. J.; Wortman, D. B. (1970). A Compiler Generator. Prentice-Hall. ISBN 978-0-13-155077-3.
  21. ^ "BNF parser²", Source forge (project)
  22. ^ RBNF.
  23. ^ "Online demo", RPatk, archived from the original on 2012-11-02, retrieved 2011-07-03
  24. ^ "Tools", Act world, archived from the original on 2013-01-29
  25. ^ If the target processor is System/360, or related, even up to z/System, and the target language is similar to PL/I (or, indeed, XPL), then the required code "emitters" may be adapted from XPL's "emitters" for System/360.
  26. ^ McKeeman, W. M.; Horning, J. J.; Wortman, D. B. (1970). A Compiler Generator. Prentice-Hall. ISBN 978-0-13-155077-3.
  27. ^ "BNF parser²", Source forge (project)
  28. ^ bnf2xml
  29. ^ "JavaCC". Archived from the original on 2013-06-08. Retrieved 2013-09-25.
  30. ^ "Script Syntax - Qlik Sense on Windows". Qlik.com. QlikTech International AB. Retrieved 10 January 2022.
  31. ^ "BNFC", Language technology, SE: Chalmers

External links

  • Garshol, Lars Marius, BNF and EBNF: What are they and how do they work?, NO: Priv.
  • RFC 5234 — Augmented BNF for Syntax Specifications: ABNF.
  • RFC 5511 — Routing BNF: A Syntax Used in Various Protocol Specifications.
  • ISO/IEC 14977:1996(E) Information technology – Syntactic metalanguage – Extended BNF, available from "Publicly available", Standards, ISO or from Kuhn, Marcus, Iso 14977 (PDF), UK: CAM (the latter is missing the cover page, but is otherwise much cleaner)
