Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

A pure-Python module that implements an LR(1) parser generator, as well as CFSM and GLR parser drivers.

License

NotificationsYou must be signed in to change notification settings

MagicStack/parsing

Repository files navigation

Theparsing module implements an LR(1) parser generator, as well as theruntime support for using a generated parser, via the Lr and Glr parserdrivers. There is no special parser generator input file format, but theparser generator still needs to know what classes/methods correspond tovarious aspects of the parser. This information is specified viadocstrings, which the parser generator introspects in order to generate aparser. Only one parser specification can be embedded in each module, butit is possible to share modules between parser specifications so that, forexample, the same token definitions can be used by multiple parserspecifications.

The parsing tables are LR(1), but they are generated using a fast algorithmthat avoids creating duplicate states that result when using the genericLR(1) algorithm. Creation time and table size are on par with the LALR(1)algorithm. However, LALR(1) can create reduce/reduce conflicts that don'texist in a true LR(1) parser. For more information on the algorithm, see:

A Practical General Method for Constructing LR(k) ParsersDavid PagerActa Informatica 7, 249-268 (1977)

Parsing table generation requires non-trivial amounts of time for largegrammars, however it is still quite fast. Internal pickling support makesit possible to cache the most recent version of the parsing table on disk,and use the table if the current parser specification is still compatiblewith the one that was used to generate the pickled parsing table. Sincethe compatibility checking is quite fast, even for large grammars, thisremoves the need to use the standard code generation method that is usedby most parser generators.

Parser specifications are encapsulated by theSpec class. Parserinstances useSpec instances, but are themselves based on separateclasses. This allows multiple parser instances to exist simultaneously,without requiring multiple copies of the parsing tables. There are twoseparate parser driver classes:

Lr:
Standard Characteristic Finite State Machine (CFSM) driver, based onunambiguous LR(1) parsing tables. This driver is faster than the Glrdriver, but it cannot deal with all parsing tables that the Glrdriver can.
Glr:

Generalized LR driver, capable of tracking multiple parse treessimultaneously, if the %split precedence is used to mark ambiguousactions. This driver is closely based on Elkhound's design, whichis described in a technical report:

Elkhound: A Fast, Practical GLR Parser GeneratorScott McPeakReport No. UCB/CSD-2-1214 (December 2002)http://www.cs.berkeley.edu/~smcpeak/elkhound/

Parser generator directives are embedded in docstrings, and must begin witha '%' character, followed immediately by one of several keywords:

Precedence:
%fail%nonassoc%left%right%split
Token:
%token
Non-terminal:
%start%nonterm
Production:
%reduce

All of these directives are associated with classes except for %reduce.%reduce is associated with methods within non-terminal classes. The Parsingmodule provides base classes from which precedences, tokens, andnon-terminals must be derived. This is not as restrictive as it sounds,since there is nothing preventing, for example, a master Token class thatsubclasses Parsing.Token, which all of the actual token types then subclass.Also, nothing prevents using multiple inheritance.

Folowing are the base classes to be subclassed by parser specifications:

  • Precedence
  • Token
  • Nonterm

The Parsing module implements the following exception classes:

  • SpecError - when there is a problem with the grammar specification
  • ParsingException - any problem that occurs during parsing
  • UnexpectedToken - when the input sequence contains a token that isnot allowed by the grammar (including end-of-input)

In order to maintain compatibility with legacy code, the Parsing moduledefines the following aliases. New code should use the exceptions abovethat do not shadow Python's builtin exceptions.

  • Exception - superclass for all exceptions that can be raised
  • SyntaxError - alias for UnexpectedToken
Additionally, trying to set private attributes may raise:
  • AttributeError

Author: Jason Evansjasone@canonware.com

Github repo:http://github.com/MagicStack/parsing

About

A pure-Python module that implements an LR(1) parser generator, as well as CFSM and GLR parser drivers.

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors7

Languages


[8]ページ先頭

©2009-2025 Movatter.jp