Description
leoAst.py unifies python's token-oriented and ast-oriented worlds. This project is intended to be a major contribution to python's tool set.
Overview
The TokenOrderGenerator (TOG) class in leoAst.py creates the following data:
- A children array from each ast node to its children. Order matters!
- A parent link from each ast.node to its parent.
- Two-way links between tokens in the token list, a list of Token objects, and the ast nodes in the parse tree:
- For each token, token.node contains the ast.node "responsible" for the token.
- For each ast node, node.first_i and node.last_i are indices into the token list.
These indices give the range of tokens that can be said to be "generated" by the ast node.
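To make these links concrete, here is a minimal sketch that uses only python's standard library (python 3.8 or later). It is not the TOG's algorithm: `link_tree` and `token_range` are hypothetical helpers, and the token range is approximated from the position attributes that `ast` already provides rather than by visiting nodes in token order.

```python
import ast
import io
import tokenize

def link_tree(tree):
    """Attach .parent and .children attributes to the nodes of the tree.

    Note: this sketch does not special-case operator or context nodes.
    """
    tree.parent = None
    for node in ast.walk(tree):
        node.children = list(ast.iter_child_nodes(node))
        for child in node.children:
            child.parent = node

def token_range(node, tokens):
    """Return (first_i, last_i): indices of the tokens inside the node's span.

    Assumption: the node has lineno/col_offset/end_lineno/end_col_offset.
    """
    start = (node.lineno, node.col_offset)
    end = (node.end_lineno, node.end_col_offset)
    inside = [
        i for i, tok in enumerate(tokens)
        if start <= tok.start and tok.end <= end
    ]
    return inside[0], inside[-1]

source = "a = b + 1\n"
tree = ast.parse(source)
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
link_tree(tree)

binop = tree.body[0].value  # The ast.BinOp node for "b + 1".
first_i, last_i = token_range(binop, tokens)
covered = [tokens[i].string for i in range(first_i, last_i + 1)]
```

Here `covered` holds the strings of the tokens "generated" by the BinOp node, and `binop.parent` is the enclosing ast.Assign node. Note that `ast.iter_child_nodes` yields children in field order, which is not always token order.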
These links promise to collapse the complexity of any code that changes text, including the asttokens, fstringify, and black projects.
This project is a general solution to frequently-asked questions such as this. It fills gaps in python's ast module, as discussed in this python issue.
leoAst.py contains:
- an entirely new implementation of the fstringify tool.
- orange, an entirely new implementation of black.
Project status
Work is substantially complete. The "fstrings" branch has been merged into devel.
leoAst.py is completely independent of Leo itself. Naturally, I recommend using Leo to view this code. You will see the outline structure of the code.
leoAst.py contains a complete suite of unit tests. Those unit tests completely cover the principal classes of the file.
Figures of merit
The code in leoAst.py is simpler, easier to understand, more flexible, more robust and faster than the corresponding code in the asttokens, fstringify and black projects:
Simplicity: The code (all of it) is the simplest thing that could possibly work. This is in stark contrast to the generators used in the asttokens, fstringify, and black projects.
Flexibility: Flexibility comes from simplicity, not special cases. The code contains no hacks of any kind. Again, this is in stark contrast to the generators used by asttokens, fstringify and black.
Speed: The TOG creates two-way links between tokens and ast nodes in roughly the time taken by python's tokenize.tokenize and ast.parse library methods. This is substantially faster than the asttokens, fstringify, and black tools. The TOT class traverses trees annotated with parent/child links even more quickly.
Memory: The TOG class makes no significant demands on python's resources. Generators add nothing to python's call stack. TOG.node_stack is the only variable length data. This stack resides in python's heap, so its length is unimportant. In the worst case, it might contain a few thousand entries. The TOT class uses no variable-length data whatever.
Innovation: The code's speed, simplicity, robustness and flexibility are the result of months of work. For more details, see the project's history in this issue's third comment.
Testing and inspection: The AstDumper class shows all the links in various formats. Strong unit tests cover all parts of the TOG and TOT classes.
To do
Infrastructure:
- Create Token, Tokenizer, TokenOrderGenerator and TokenOrderTraverser classes.
- Create a set of dumping tools.
- Create nearest_common_ancestor and tokens_for_node functions.
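As an illustration of what parent links make possible, the following sketch shows one way a nearest-common-ancestor function could be written. The name mirrors the to-do item above, but the signature and implementation are assumptions, not the code in leoAst.py; `add_parents` and `ancestors` are hypothetical helpers.

```python
import ast

def add_parents(tree):
    """Attach a .parent attribute to every node (hypothetical helper)."""
    tree.parent = None
    for node in ast.walk(tree):
        for child in ast.iter_child_nodes(node):
            child.parent = node

def ancestors(node):
    """Return [node, parent, grandparent, ...] up to the root."""
    result = []
    while node is not None:
        result.append(node)
        node = node.parent
    return result

def nearest_common_ancestor(node1, node2):
    """Return the deepest node that contains both node1 and node2."""
    seen = {id(n) for n in ancestors(node1)}
    for n in ancestors(node2):
        if id(n) in seen:
            return n
    return None

tree = ast.parse("x = f(a) + g(b)\n")
add_parents(tree)
binop = tree.body[0].value
arg_a = binop.left.args[0]   # The Name node for "a".
arg_b = binop.right.args[0]  # The Name node for "b".
common = nearest_common_ancestor(arg_a, arg_b)
```

In this example `common` is the ast.BinOp node, the smallest subtree that contains both arguments.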
Testing:
- tokenizer.check_results verifies token round-tripping.
- tog.sync_tokens is an ever-present unit test that verifies that the TOG visits nodes in token order.
- Create unit tests that cover the TOG, TOT and Fstringify classes.
- 100% coverage of all non-testing code.
- Beautify all files.
- Fstringify all files.
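Token round-tripping, the first item in the testing list above, can be sketched with python's tokenize module. This is not how tokenizer.check_results is implemented: `round_trips` is a hypothetical helper that relies on the documented guarantee that the output of tokenize.untokenize tokenizes back to the same token types and strings.

```python
import io
import tokenize

def round_trips(source):
    """Return True if untokenizing and re-tokenizing preserves every token."""
    def pairs(text):
        # Compare only token types and strings, not positions.
        readline = io.StringIO(text).readline
        return [(tok.type, tok.string) for tok in tokenize.generate_tokens(readline)]

    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    regenerated = tokenize.untokenize(tokens)
    return pairs(regenerated) == pairs(source)

ok = round_trips("def f(x):\n    return x + 1\n")
```

A stricter check would compare the regenerated text with the original source character by character, which requires tokens that also record inter-token whitespace.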
Real world tools:
- Create the Fstringify class.
- Create fstringify-files and diff-fstringify-files commands.
- Rewrite Leo's beautify-files and diff-beautify-files commands using the Orange class.
- Complete the Orange class.
- Won't do: Create orange.py, for stand-alone operation.
- Black is good for non-Leo files.
- Leo's beautifier commands are needed for all Leo files.