Movatterモバイル変換


[0]ホーム

URL:


Jump to content
WikipediaThe Free Encyclopedia
Search

Left recursion

From Wikipedia, the free encyclopedia
Theory of computer sciences
This articlemay be too technical for most readers to understand. Pleasehelp improve it tomake it understandable to non-experts, without removing the technical details.(November 2015) (Learn how and when to remove this message)

In theformal language theory ofcomputer science,left recursion is a special case ofrecursion where a string is recognized as part of a language by the fact that it decomposes into a string from that same language (on the left) and a suffix (on the right). For instance,1+2+3{\displaystyle 1+2+3} can be recognized as a sum because it can be broken into1+2{\displaystyle 1+2}, also a sum, and+3{\displaystyle {}+3}, a suitable suffix.

In terms ofcontext-free grammar, anonterminal is left-recursive if the leftmost symbol in one of its productions is itself (in the case of direct left recursion) or can be made itself by some sequence of substitutions (in the case of indirect left recursion).

Definition

[edit]

A grammar is left-recursive if and only if there exists a nonterminal symbolA{\displaystyle A} that can derive to asentential form with itself as the leftmost symbol.[1] Symbolically,

A+Aα{\displaystyle A\Rightarrow ^{+}A\alpha },

where+{\displaystyle \Rightarrow ^{+}} indicates the operation of making one or more substitutions, andα{\displaystyle \alpha } is any sequence of terminal and nonterminal symbols.

Direct left recursion

[edit]

Direct left recursion occurs when the definition can be satisfied with only one substitution. It requires a rule of the form

AAα{\displaystyle A\to A\alpha }

whereα{\displaystyle \alpha } is a sequence of nonterminals and terminals . For example, the rule

ExpressionExpression+Term{\displaystyle {\mathit {Expression}}\to {\mathit {Expression}}+{\mathit {Term}}}

is directly left-recursive. A left-to-rightrecursive descent parser for this rule might look like

voidExpression(){Expression();match('+');Term();}

and such code would fall into infinite recursion when executed.

Indirect left recursion

[edit]

Indirect left recursion occurs when the definition of left recursion is satisfied via several substitutions. It entails a set of rules following the pattern

A0β0A1α0{\displaystyle A_{0}\to \beta _{0}A_{1}\alpha _{0}}
A1β1A2α1{\displaystyle A_{1}\to \beta _{1}A_{2}\alpha _{1}}
{\displaystyle \cdots }
AnβnA0αn{\displaystyle A_{n}\to \beta _{n}A_{0}\alpha _{n}}

whereβ0,β1,,βn{\displaystyle \beta _{0},\beta _{1},\ldots ,\beta _{n}} are sequences that can each yield theempty string, whileα0,α1,,αn{\displaystyle \alpha _{0},\alpha _{1},\ldots ,\alpha _{n}} may be any sequences of terminal and nonterminal symbols at all. Note that these sequences may be empty. The derivation

A0β0A1α0+A1α0β1A2α1α0++A0αnα1α0{\displaystyle A_{0}\Rightarrow \beta _{0}A_{1}\alpha _{0}\Rightarrow ^{+}A_{1}\alpha _{0}\Rightarrow \beta _{1}A_{2}\alpha _{1}\alpha _{0}\Rightarrow ^{+}\cdots \Rightarrow ^{+}A_{0}\alpha _{n}\dots \alpha _{1}\alpha _{0}}

then givesA0{\displaystyle A_{0}} as leftmost in its final sentential form.

Uses

[edit]

Left recursion is commonly used as an idiom for making operationsleft-associative: that an expressiona+b-c-d+e is evaluated as(((a+b)-c)-d)+e. In this case, that evaluation order could be achieved as a matter of syntax via the three grammatical rules

ExpressionTerm{\displaystyle {\mathit {Expression}}\to {\mathit {Term}}}
ExpressionExpression+Term{\displaystyle {\mathit {Expression}}\to {\mathit {Expression}}+{\mathit {Term}}}
ExpressionExpressionTerm{\displaystyle {\mathit {Expression}}\to {\mathit {Expression}}-{\mathit {Term}}}

These only allow parsing theExpression{\displaystyle {\mathit {Expression}}}a+b-c-d+e as consisting of theExpression{\displaystyle {\mathit {Expression}}}a+b-c-d andTerm{\displaystyle {\mathit {Term}}}e, wherea+b-c-d in turn consists of theExpression{\displaystyle {\mathit {Expression}}}a+b-c andTerm{\displaystyle {\mathit {Term}}}d, whilea+b-c consists of theExpression{\displaystyle {\mathit {Expression}}}a+b andTerm{\displaystyle {\mathit {Term}}}c, etc.

Removing left recursion

[edit]

Left recursion often poses problems for parsers, either because it leads them into infinite recursion (as in the case of mosttop-down parsers) or because they expect rules in a normal form that forbids it (as in the case of manybottom-up parsers[clarification needed]). Therefore, a grammar is often preprocessed to eliminate the left recursion.

Removing direct left recursion

[edit]

The general algorithm to remove direct left recursion follows. Several improvements to this method have been made.[2]For a left-recursive nonterminalA{\displaystyle A}, discard any rules of the formAA{\displaystyle A\rightarrow A} and consider those that remain:

AAα1Aαnβ1βm{\displaystyle A\rightarrow A\alpha _{1}\mid \ldots \mid A\alpha _{n}\mid \beta _{1}\mid \ldots \mid \beta _{m}}

where:

Replace these with two sets of productions, one set forA{\displaystyle A}:

Aβ1AβmA{\displaystyle A\rightarrow \beta _{1}A^{\prime }\mid \ldots \mid \beta _{m}A^{\prime }}

and another set for the fresh nonterminalA{\displaystyle A'} (often called the "tail" or the "rest"):

Aα1AαnAϵ{\displaystyle A^{\prime }\rightarrow \alpha _{1}A^{\prime }\mid \ldots \mid \alpha _{n}A^{\prime }\mid \epsilon }

Repeat this process until no direct left recursion remains.

As an example, consider the rule set

ExpressionExpression+ExpressionIntegerString{\displaystyle {\mathit {Expression}}\rightarrow {\mathit {Expression}}+{\mathit {Expression}}\mid {\mathit {Integer}}\mid {\mathit {String}}}

This could be rewritten to avoid left recursion as

ExpressionIntegerExpressionStringExpression{\displaystyle {\mathit {Expression}}\rightarrow {\mathit {Integer}}\,{\mathit {Expression}}'\mid {\mathit {String}}\,{\mathit {Expression}}'}
Expression+ExpressionExpressionϵ{\displaystyle {\mathit {Expression}}'\rightarrow {}+{\mathit {Expression}}\,{\mathit {Expression}}'\mid \epsilon }

Removing all left recursion

[edit]

The above process can be extended to eliminate all left recursion, by first converting indirect left recursion to direct left recursion on the highest numbered nonterminal in a cycle.

InputsA grammar: a set of nonterminalsA1,,An{\displaystyle A_{1},\ldots ,A_{n}} and their productions
OutputA modified grammar generating the same language but without left recursion
  1. For each nonterminalAi{\displaystyle A_{i}}:
    1. Repeat until an iteration leaves the grammar unchanged:
      1. For each ruleAiαi{\displaystyle A_{i}\rightarrow \alpha _{i}},αi{\displaystyle \alpha _{i}} being a sequence of terminals and nonterminals:
        1. Ifαi{\displaystyle \alpha _{i}} begins with a nonterminalAj{\displaystyle A_{j}} andj<i{\displaystyle j<i}:
          1. Letβi{\displaystyle \beta _{i}} beαi{\displaystyle \alpha _{i}} without its leadingAj{\displaystyle A_{j}}.
          2. Remove the ruleAiαi{\displaystyle A_{i}\rightarrow \alpha _{i}}.
          3. For each ruleAjαj{\displaystyle A_{j}\rightarrow \alpha _{j}}:
            1. Add the ruleAiαjβi{\displaystyle A_{i}\rightarrow \alpha _{j}\beta _{i}}.
    2. Remove direct left recursion forAi{\displaystyle A_{i}} as described above.

Step 1.1.1 amounts to expanding the initial nonterminalAj{\displaystyle A_{j}} in the right hand side of some ruleAiAjβ{\displaystyle A_{i}\to A_{j}\beta }, but only ifj<i{\displaystyle j<i}. IfAiAjβ{\displaystyle A_{i}\to A_{j}\beta } was one step in a cycle of productions giving rise to a left recursion, then this has shortened that cycle by one step, but often at the price of increasing the number of rules.

The algorithm may be viewed as establishing atopological ordering on nonterminals: afterwards there can only be a ruleAiAjβ{\displaystyle A_{i}\to A_{j}\beta } ifj>i{\displaystyle j>i}.Note that this algorithm is highly sensitive to the nonterminal ordering; optimizations often focus on choosing this ordering well.[clarification needed]

Pitfalls

[edit]

Although the above transformations preserve the language generated by a grammar, they may change theparse trees thatwitness strings' recognition. With suitable bookkeeping,tree rewriting can recover the originals, but if this step is omitted, the differences may change the semantics of a parse.

Associativity is particularly vulnerable; left-associative operators typically appear in right-associative-like arrangements under the new grammar. For example, starting with this grammar:

ExpressionExpressionTermTerm{\displaystyle {\mathit {Expression}}\rightarrow {\mathit {Expression}}\,-\,{\mathit {Term}}\mid {\mathit {Term}}}
TermTermFactorFactor{\displaystyle {\mathit {Term}}\rightarrow {\mathit {Term}}\,*\,{\mathit {Factor}}\mid {\mathit {Factor}}}
Factor(Expression)Integer{\displaystyle {\mathit {Factor}}\rightarrow ({\mathit {Expression}})\mid {\mathit {Integer}}}

the standard transformations to remove left recursion yield the following:

ExpressionTerm Expression{\displaystyle {\mathit {Expression}}\rightarrow {\mathit {Term}}\ {\mathit {Expression}}'}
ExpressionTerm Expressionϵ{\displaystyle {\mathit {Expression}}'\rightarrow {}-{\mathit {Term}}\ {\mathit {Expression}}'\mid \epsilon }
TermFactor Term{\displaystyle {\mathit {Term}}\rightarrow {\mathit {Factor}}\ {\mathit {Term}}'}
TermFactor Termϵ{\displaystyle {\mathit {Term}}'\rightarrow {}*{\mathit {Factor}}\ {\mathit {Term}}'\mid \epsilon }
Factor(Expression)Integer{\displaystyle {\mathit {Factor}}\rightarrow ({\mathit {Expression}})\mid {\mathit {Integer}}}

Parsing the string "1 - 2 - 3" with the first grammar in an LALR parser (which can handle left-recursive grammars) would have resulted in the parse tree:

Left-recursive parsing of a double subtraction
Left-recursive parsing of a double subtraction

This parse tree groups the terms on the left, giving the correct semantics(1 - 2) - 3.

Parsing with the second grammar gives

Right-recursive parsing of a double subtraction
Right-recursive parsing of a double subtraction

which, properly interpreted, signifies1 + (-2 + (-3)), also correct, but less faithful to the input and much harder to implement for some operators. Notice how terms to the right appear deeper in the tree, much as a right-recursive grammar would arrange them for1 - (2 - 3).

Accommodating left recursion in top-down parsing

[edit]

Aformal grammar that contains left recursion cannot beparsed by aLL(k)-parser or other naiverecursive descent parser unless it is converted to aweakly equivalent right-recursive form. In contrast, left recursion is preferred forLALR parsers because it results in lower stack usage thanright recursion. However, more sophisticated top-down parsers can implement generalcontext-free grammars by use of curtailment. In 2006, Frost and Hafiz described an algorithm which accommodatesambiguous grammars with direct left-recursiveproduction rules.[3] That algorithm was extended to a completeparsing algorithm to accommodate indirect as well as direct left recursion inpolynomial time, and to generate compact polynomial-size representations of the potentially exponential number of parse trees for highly ambiguous grammars by Frost, Hafiz and Callaghan in 2007.[4] The authors then implemented the algorithm as a set ofparser combinators written in theHaskell programming language.[5]

See also

[edit]

References

[edit]
  1. ^Notes on Formal Language Theory and Parsing at theWayback Machine (archived 2007-11-27). James Power, Department of Computer Science National University of Ireland, Maynooth Maynooth, Co. Kildare, Ireland.JPR02
  2. ^Moore, Robert C. (May 2000)."Removing Left Recursion from Context-Free Grammars"(PDF).6th Applied Natural Language Processing Conference:249–255.
  3. ^Frost, R.; R. Hafiz (2006)."A New Top-Down Parsing Algorithm to Accommodate Ambiguity and Left Recursion in Polynomial Time".ACM SIGPLAN Notices.41 (5):46–54.doi:10.1145/1149982.1149988.S2CID 8006549., available from the author athttp://hafiz.myweb.cs.uwindsor.ca/pub/p46-frost.pdfArchived 2015-01-08 at theWayback Machine
  4. ^Frost, R.; R. Hafiz; P. Callaghan (June 2007)."Modular and Efficient Top-Down Parsing for Ambiguous Left-Recursive Grammars"(PDF).10th International Workshop on Parsing Technologies (IWPT), ACL-SIGPARSE:109–120. Archived fromthe original(PDF) on 2011-05-27.
  5. ^Frost, R.; R. Hafiz; P. Callaghan (January 2008). "Parser Combinators for Ambiguous Left-Recursive Grammars".Practical Aspects of Declarative Languages(PDF). Lecture Notes in Computer Science. Vol. 4902. pp. 167–181.doi:10.1007/978-3-540-77442-6_12.ISBN 978-3-540-77441-9.

External links

[edit]
Retrieved from "https://en.wikipedia.org/w/index.php?title=Left_recursion&oldid=1260110960"
Categories:
Hidden categories:

[8]ページ先頭

©2009-2025 Movatter.jp