Much like the parser module exposes the Python parser, this PEP proposes that the parser generator used to create the Python parser, pgen, be exposed as a module in Python.
Through the course of Pythonic history, there have been numerous discussions about the creation of a Python compiler [1]. These have resulted in several implementations of Python parsers, most notably the parser module currently provided in the Python standard library [2] and Jeremy Hylton’s compiler module [3]. However, while multiple language changes have been proposed [4] [5], experimentation with the Python syntax has lacked the benefit of a Python binding to the actual parser generator used to build Python.
By providing a Python wrapper analogous to Fred Drake Jr.’s parser wrapper, but targeted at the pgen library, the following assertions are made:
pgen tool from the command line. The resulting parser data structure would then either have to be reworked to interface with a custom CPython implementation, or wrapped as a C extension module.
The proposed module will be called pgen. The pgen module will contain the following functions:
parseGrammarFile(fileName) -> AST
The parseGrammarFile() function will read the file pointed to by fileName and create an AST object. The AST nodes will contain the nonterminal, numeric values of the parser generator meta-grammar. The output AST will be an instance of the AST extension class as provided by the parser module. Syntax errors in the input file will cause the SyntaxError exception to be raised.
parseGrammarString(text) -> AST
The parseGrammarString() function will follow the semantics of parseGrammarFile(), but accept the grammar text as a string for input, as opposed to the file name.
buildParser(grammarAst) -> DFA
The buildParser() function will accept an AST object for input and return a DFA (deterministic finite automaton) data structure. The DFA data structure will be a C extension class, much like the AST structure provided in the parser module. If the input AST does not conform to the nonterminal codes defined for the pgen meta-grammar, buildParser() will raise a ValueError exception.
parseFile(fileName, dfa, start) -> AST
The parseFile() function will essentially be a wrapper for the PyParser_ParseFile() C API function. The wrapper code will accept the DFA C extension class, and the file name. An AST instance that conforms to the lexical values in the token module and the nonterminal values contained in the DFA will be output.
parseString(text, dfa, start) -> AST
The parseString() function will operate in a similar fashion to the parseFile() function, but accept the parse text as an argument. Much like parseFile() will wrap the PyParser_ParseFile() C API function, parseString() will wrap the PyParser_ParseString() function.
symbolToStringMap(dfa) -> dict
The symbolToStringMap() function will accept a DFA instance and return a dictionary object that maps from the DFA’s numeric values for its nonterminals to the string names of the nonterminals as found in the original grammar specification for the DFA.
stringToSymbolMap(dfa) -> dict
The stringToSymbolMap() function will output a dictionary mapping the nonterminal names of the input DFA to their corresponding numeric values.
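
The relationship between the two map functions can be illustrated with a small pure-Python sketch. Nothing here is the proposed implementation: `ToyDFA` is a hypothetical stand-in for the C extension class, and the starting code of 256 is only an illustrative choice. The point is that the two dictionaries are expected to be inverses of each other.

```python
class ToyDFA:
    """Hypothetical stand-in for the proposed DFA extension class."""

    def __init__(self, nonterminal_names, first_code=256):
        # Assumption: nonterminals get consecutive numeric codes in
        # grammar order, starting from an illustrative base value.
        self._code_to_name = {
            first_code + offset: name
            for offset, name in enumerate(nonterminal_names)
        }


def symbolToStringMap(dfa):
    """Return a dict mapping numeric nonterminal codes to their names."""
    return dict(dfa._code_to_name)


def stringToSymbolMap(dfa):
    """Return a dict mapping nonterminal names to their numeric codes."""
    return {name: code for code, name in dfa._code_to_name.items()}


dfa = ToyDFA(["file_input", "stmt", "expr"])
sym2str = symbolToStringMap(dfa)
str2sym = stringToSymbolMap(dfa)
```

Looking a code up in one map and the result in the other returns the original code, which is a convenient sanity check for any real implementation of the pair.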
Extra credit will be awarded if the map generation functions and parsing functions are also methods of the DFA extension class.
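
The intended call sequence (grammar text to AST, then AST to DFA) and the error behaviour described for these functions can be sketched with a pure-Python stand-in. Everything below is hypothetical: `ToyAST` and `ToyDFA` are invented placeholders for the C extension classes, the one-rule-per-line `name: body` grammar format is a simplification of the pgen meta-grammar, the type check in `buildParser()` stands in for the nonterminal-code check, and parsing with the DFA is omitted entirely. The sketch also shows the "extra credit" variant, where a map function is available as a DFA method and the module-level function delegates to it.

```python
class ToyAST:
    """Placeholder for the AST extension class: a list of (name, body) rules."""

    def __init__(self, rules):
        self.rules = rules


class ToyDFA:
    """Placeholder for the DFA extension class."""

    def __init__(self, names, first_code=256):
        # Illustrative numbering only; not taken from the proposal.
        self._code_to_name = {first_code + i: n for i, n in enumerate(names)}

    # "Extra credit": the map function is also a method of the DFA.
    def symbolToStringMap(self):
        return dict(self._code_to_name)


def parseGrammarString(text):
    """Turn grammar text into a ToyAST; raise SyntaxError on malformed rules."""
    rules = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        name, sep, body = line.partition(":")
        if not sep or not name.strip().isidentifier():
            raise SyntaxError(f"line {lineno}: expected 'name: body'")
        rules.append((name.strip(), body.strip()))
    return ToyAST(rules)


def buildParser(grammarAst):
    """Build a ToyDFA from a ToyAST; raise ValueError for anything else."""
    if not isinstance(grammarAst, ToyAST):
        raise ValueError("input does not look like a grammar AST")
    return ToyDFA([name for name, _ in grammarAst.rules])


def symbolToStringMap(dfa):
    """Module-level form; simply delegates to the method."""
    return dfa.symbolToStringMap()


grammar = "start: expr NEWLINE\nexpr: NAME ('+' NAME)*\n"
dfa = buildParser(parseGrammarString(grammar))
names = symbolToStringMap(dfa)
```

Keeping the module-level functions as thin wrappers over methods means both spellings stay in sync, at the cost of a slightly larger surface to document.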
A cunning plan has been devised to accomplish this enhancement:
pgen functions to conform to the CPython naming standards. This action may involve adding some header files to the Include subdirectory.
pgen C modules in the Makefile.pre.in from unique pgen elements to the Python C library.
parser module so the AST extension class understands that there are AST types it may not understand. Cursory examination of the AST extension class shows that it keeps track of whether the tree is a suite or an expression.
Modules directory. The C extension module will implement the DFA extension class and the functions outlined in the previous section.
Under this proposal, would-be designers of Python 3000 will still be constrained to Python’s lexical conventions. The addition, subtraction or modification of the Python lexer is outside the scope of this PEP.
No reference implementation is currently provided. A patch was provided at some point in http://sourceforge.net/tracker/index.php?func=detail&aid=599331&group_id=5470&atid=305470 but that patch is no longer maintained.
This document has been placed in the public domain.
Source: https://github.com/python/peps/blob/main/peps/pep-0269.rst
Last modified: 2025-02-01 08:55:40 GMT