| Python Library Reference |
The parser module provides an interface to Python's internal parser and byte-code compiler. The primary purpose for this interface is to allow Python code to edit the parse tree of a Python expression and create executable code from this. This is better than trying to parse and modify an arbitrary Python code fragment as a string because parsing is performed in a manner identical to the code forming the application. It is also faster.
There are a few things to note about this module which are important to making use of the data structures created. This is not a tutorial on editing the parse trees for Python code, but some examples of using the parser module are presented.
Most importantly, a good understanding of the Python grammar processed by the internal parser is required. For full information on the language syntax, refer to the Python Language Reference. The parser itself is created from a grammar specification defined in the file Grammar/Grammar in the standard Python distribution. The parse trees stored in the AST objects created by this module are the actual output from the internal parser when created by the expr() or suite() functions, described below. The AST objects created by sequence2ast() faithfully simulate those structures. Be aware that the values of the sequences which are considered "correct" will vary from one version of Python to another as the formal grammar for the language is revised. However, transporting code from one Python version to another as source text will always allow correct parse trees to be created in the target version, with the only restriction being that migrating to an older version of the interpreter will not support more recent language constructs. The parse trees are not typically compatible from one version to another, whereas source code has always been forward-compatible.
Each element of the sequences returned by ast2list() or ast2tuple() has a simple form. Sequences representing non-terminal elements in the grammar always have a length greater than one. The first element is an integer which identifies a production in the grammar. These integers are given symbolic names in the C header file Include/graminit.h and the Python module symbol. Each additional element of the sequence represents a component of the production as recognized in the input string: these are always sequences which have the same form as the parent. An important aspect of this structure which should be noted is that keywords used to identify the parent node type, such as the keyword if in an if_stmt, are included in the node tree without any special treatment. For example, the if keyword is represented by the tuple (1, 'if'), where 1 is the numeric value associated with all NAME tokens, including variable and function names defined by the user. In an alternate form returned when line number information is requested, the same token might be represented as (1, 'if', 12), where the 12 represents the line number at which the terminal symbol was found.
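The two token forms described above can be unpacked uniformly. The following is a minimal sketch, not part of the parser module: the helper name describe_terminal is hypothetical, and it relies only on the tok_name mapping in the standard token module.

```python
import token


def describe_terminal(node):
    """Return (token_name, text, line_number_or_None) for a terminal tuple.

    Hypothetical helper for illustration; it accepts both the plain form
    (type, text) and the form with line information (type, text, lineno).
    """
    token_type, text = node[0], node[1]
    # A third element is present only when line numbers were requested.
    line_number = node[2] if len(node) > 2 else None
    return token.tok_name[token_type], text, line_number
```

Called with (1, 'if', 12), the helper reports the token name NAME, the text 'if', and line 12; called with (1, 'if'), the line number comes back as None.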
Terminal elements are represented in much the same way, but without any child elements and the addition of the source text which was identified. The example of the if keyword above is representative. The various types of terminal symbols are defined in the C header file Include/token.h and the Python module token.
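Because terminals and non-terminals share the same nested shape, a tree in tuple form can be walked with a short recursive function. The sketch below is illustrative only: SAMPLE_TREE is written by hand rather than produced by the parser, the non-terminal numbers 300 and 301 are made up (real values come from the grammar and change between Python versions), and the function names are hypothetical. It assumes the tree carries no line number information.

```python
import token

# Hand-written stand-in for a parse tree in tuple form. The non-terminal
# numbers 300 and 301 are invented for this example.
SAMPLE_TREE = (
    300,
    (301, (token.NAME, 'a')),
    (token.PLUS, '+'),
    (301, (token.NUMBER, '1')),
)


def iter_terminals(node):
    """Yield (token_type, text) for each terminal, from left to right."""
    if token.ISTERMINAL(node[0]):
        # Terminal: the second element is the matched source text.
        yield node[0], node[1]
    else:
        # Non-terminal: every remaining element is a child node.
        for child in node[1:]:
            yield from iter_terminals(child)


def terminal_text(tree):
    """Join the text of all terminals with single spaces."""
    return ' '.join(text for _, text in iter_terminals(tree))
```

The test token.ISTERMINAL() is used instead of checking the length of the sequence, since a terminal with line information also has more than two elements.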
The AST objects are not required to support the functionality of this module, but are provided for three purposes: to allow an application to amortize the cost of processing complex parse trees, to provide a parse tree representation which conserves memory space when compared to the Python list or tuple representation, and to ease the creation of additional modules in C which manipulate parse trees. A simple "wrapper" class may be created in Python to hide the use of AST objects.
The parser module defines functions for a few distinct purposes. The most important purposes are to create AST objects and to convert AST objects to other representations such as parse trees and compiled code objects, but there are also functions which serve to query the type of parse tree represented by an AST object.