This PEP adds support for syntactic macros to Python. A macro is a compile-time function that transforms a part of the program to allow functionality that cannot be expressed cleanly in normal library code.
The term “syntactic” means that this sort of macro operates on the program’s syntax tree. This reduces the chance of mistranslation that can happen with text-based substitution macros, and allows the implementation of hygienic macros.
Syntactic macros allow libraries to modify the abstract syntax tree during compilation, providing the ability to extend the language for specific domains without adding to the complexity of the language as a whole.
New language features can be controversial, disruptive and sometimes divisive. Python is now sufficiently powerful and complex that many proposed additions are a net loss for the language due to the additional complexity.
Although a language change may make certain patterns easy to express, it will have a cost. Each new feature makes the language larger, harder to learn and harder to understand. Python was once described as “Python Fits Your Brain”, but that becomes less and less true as more and more features are added.
Because of the high cost of adding a new feature, it is very difficult or impossible to add a feature that would benefit only some users, regardless of how many users, or how beneficial that feature would be to them.
The use of Python in data science and machine learning has grown very rapidly over the last few years. However, most of the core developers of Python do not have a background in data science or machine learning. This makes it extremely difficult for the core developers to determine whether a language extension for machine learning is worthwhile.
By allowing language extensions to be modular and distributable, like libraries, domain-specific extensions can be implemented without negatively impacting users outside of that domain. A web developer is likely to want a very different set of extensions from a data scientist. We need to let the community develop their own extensions.
Without some form of user-defined language extensions, there will be a constant battle between those wanting to keep the language compact and fitting their brains, and those wanting a new feature that suits their domain or programming style.
Many domains see repeated patterns that are difficult or impossible to express as a library. Macros can express those patterns in a more concise and less error-prone way.
It is possible to demonstrate potential language extensions using macros. For example, macros would have enabled the with statement and yield from expression to have been trialed. Doing so might well have led to a higher quality implementation at first release, by allowing more testing before those features were included in the language.
It is nearly impossible to make sure that a new feature is completely reliable before it is released; bugs relating to the with and yield from features were still being fixed many years after they were released.
Historically, new language features have been implemented by naive compilation of the AST into new, complex bytecode instructions. Those bytecodes have often had their own internal flow-control, performing operations that could, and should, have been done in the compiler.
For example, until recently flow control within the try-finally and with statements was managed by complicated bytecodes with context-dependent semantics. The control flow within those statements is now implemented in the compiler, making the interpreter simpler and faster.
By implementing new features as AST transformations, the existing compiler can generate the bytecode for a feature without having to modify the interpreter.
A stable interpreter is necessary if we are to improve the performance and portability of the CPython VM.
Python is both expressive and easy to learn; it is widely recognized as the easiest-to-learn, widely used programming language. However, it is not the most flexible. That title belongs to lisp.
Because lisp is homoiconic, meaning that lisp programs are lisp data structures, lisp programs can be manipulated by lisp programs. Thus much of the language can be defined in itself.
We would like that ability in Python, without the many parentheses that characterize lisp. Fortunately, homoiconicity is not needed for a language to be able to manipulate itself; all that is needed is the ability to manipulate programs after parsing, but before translation to an executable form.
Python already has the components needed. The syntax tree of Python is available through the ast module. All that is needed is a marker to tell the compiler that a macro is present, and the ability for the compiler to call back into user code to manipulate the AST.
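As an illustration of the first of those components, the sketch below uses the existing ast module to rewrite a program after parsing and before translation to bytecode. The SwapAddForMul transformer and the sample source are invented for this example and are not part of this PEP.

```python
import ast


class SwapAddForMul(ast.NodeTransformer):
    """Hypothetical transformation: rewrite every `a + b` as `a * b`."""

    def visit_BinOp(self, node):
        # Visit children first so that nested expressions are rewritten too.
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            new_node = ast.BinOp(left=node.left, op=ast.Mult(), right=node.right)
            return ast.copy_location(new_node, node)
        return node


def compile_with_transform(source):
    tree = ast.parse(source)                # parse to a syntax tree
    tree = SwapAddForMul().visit(tree)      # manipulate the tree in Python code
    ast.fix_missing_locations(tree)
    return compile(tree, "<demo>", "exec")  # translate to an executable form


namespace = {}
exec(compile_with_transform("result = 3 + 4"), namespace)
print(namespace["result"])  # 12, because 3 + 4 was rewritten to 3 * 4
```

What is missing today is only the hook: here the transformation has to be invoked by hand, whereas a macro would let the compiler invoke it.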
Any sequence of identifier characters followed by an exclamation point (exclamation mark, UK English) will be tokenized as a MACRO_NAME.
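The lexical rule can be approximated with a regular expression. This is only a sketch: the pattern name and the find_macro_names helper are hypothetical, and the lookahead that stops `x!=1` from being read as the macro `x!` is an assumption, since the rule above does not say how that case is resolved. A real tokenizer would also skip string literals and comments, which this pattern does not.

```python
import re

# Assumption: a macro name is an ordinary identifier immediately followed by "!".
# (?<!\w) prevents matching from the middle of a longer word;
# (?!=) keeps the "!=" operator from being consumed as part of a macro name.
MACRO_NAME = re.compile(r"(?<!\w)[^\W\d]\w*!(?!=)")


def find_macro_names(source):
    """Return the MACRO_NAME-like tokens found in a piece of source text."""
    return MACRO_NAME.findall(source)


print(find_macro_names("from! my.compiler import compile"))  # ['from!']
print(find_macro_names("y = f!(x) if x!=1 else 0"))          # ['f!']
```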
    macro_stmt = MACRO_NAME testlist [ "import" NAME ] [ "as" NAME ] [ ":" NEWLINE suite ]
    macro_expr = MACRO_NAME "(" testlist ")"
The statement form of a macro takes precedence, so that the code macro_name!(x) will be parsed as a macro statement, not as an expression statement containing a macro expression.
Upon encountering a macro during translation to bytecode, the code generator will look up the macro processor registered for the macro, and pass the AST rooted at the macro to the processor function. The returned AST will then be substituted for the original tree.
For macros with multiple names, several trees will be passed to the macro processor, but only one will be returned and substituted, shortening the enclosing block of statements.
This process can be repeated, to enable macros to return AST nodes including other macros.
The compiler will not look up a macro processor until that macro is reached, so that inner macros do not need to have processors registered. For example, in a switch macro, the case and default macros wouldn’t need processors registered as they would be eliminated by the switch processor.
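The expansion order described above can be sketched with a toy tree. Everything here is a stand-in: the Macro class, the expand function and both processors are hypothetical, and plain strings and lists take the place of real AST nodes. The point is only that processors are looked up lazily and outermost-first, and that expansion repeats while a processor returns another macro.

```python
from dataclasses import dataclass, field


@dataclass
class Macro:
    """Stand-in for a macro node in the tree (not the proposed macro_stmt)."""
    name: str
    body: list = field(default_factory=list)


def expand(node, registry):
    """Expand macros outermost-first, repeating while a macro is returned."""
    while isinstance(node, Macro):
        processor = registry[node.name]  # looked up only when the macro is reached
        node = processor(node)
    if isinstance(node, list):
        return [expand(child, registry) for child in node]
    return node


def switch_processor(node):
    # Consumes the inner case!/default! macros, so they are never looked up.
    return [f"{inner.name}: {inner.body[0]}" for inner in node.body]


def outer_processor(node):
    # Returns a tree that is itself a macro, which triggers another pass.
    return Macro("switch", node.body)


registry = {"switch": switch_processor, "outer": outer_processor}
tree = Macro("outer", [Macro("case", ["a"]), Macro("default", ["b"])])
print(expand(tree, registry))  # ['case: a', 'default: b']
```

Note that no processor is registered for case or default, yet expansion succeeds, because the switch processor removes them before they are reached.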
To enable definition of macros to be imported, the macros import! and from! are predefined. They support the following syntax:

    "import!" dotted_name "as" name

    "from!" dotted_name "import" name [ "as" name ]

The import! macro performs a compile-time import of dotted_name to find the macro processor, then registers it under name for the scope currently being compiled.
The from! macro performs a compile-time import of dotted_name.name to find the macro processor, then registers it under name (using the name following “as”, if present) for the scope currently being compiled.
Note that, since import! and from! only define the macro for the scope in which the import is present, all uses of a macro must be preceded by an explicit import! or from! to improve clarity.
For example, to import the macro “compile” from “my.compiler”:
    from! my.compiler import compile
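What the from! macro does at compile time might look roughly like the following. This is a sketch under assumptions: the from_macro function, the per-scope registry dictionary and the demo_compiler module are all hypothetical, and the lookup is done here with importlib.import_module followed by getattr rather than by whatever mechanism the compiler would actually use.

```python
import importlib
import sys
import types


def from_macro(scope_registry, dotted_name, name, asname=None):
    """Sketch of `from! dotted_name import name [as asname]`."""
    module = importlib.import_module(dotted_name)  # compile-time import
    processor = getattr(module, name)
    # Register under the name following "as" if present, else under `name`.
    scope_registry[asname or name] = processor
    return processor


# Stand-in for a real macro library, so that the example is self-contained.
demo = types.ModuleType("demo_compiler")
demo.compile = (lambda tree: tree, "STMT_MACRO", 1, ())
sys.modules["demo_compiler"] = demo

registry = {}  # one registry per scope being compiled
from_macro(registry, "demo_compiler", "compile")
from_macro(registry, "demo_compiler", "compile", asname="cc")
print(sorted(registry))  # ['cc', 'compile']
```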
A macro processor is defined by a four-tuple, consisting of (func, kind, version, additional_names):

* func must be a callable that takes len(additional_names)+1 arguments, all of which are abstract syntax trees, and returns a single abstract syntax tree.
* kind must be one of the following:
  * macros.STMT_MACRO: A statement macro where the body of the macro is indented. This is the only form allowed to have additional names.
  * macros.SIBLING_MACRO: A statement macro where the body of the macro is the next statement in the same block. The following statement is moved into the macro as its body.
  * macros.EXPR_MACRO: An expression macro.
* version is used to track versions of macros, so that generated bytecodes can be correctly cached. It must be an integer.
* additional_names are the names of the additional parts of the macro, and must be a tuple of strings.

    # (func, _ast.STMT_MACRO, VERSION, ())
    stmt_macro!:
        multi_statement_body

    # (func, _ast.SIBLING_MACRO, VERSION, ())
    sibling_macro!
    single_statement_body

    # (func, _ast.EXPR_MACRO, VERSION, ())
    x = expr_macro!(...)

    # (func, _ast.STMT_MACRO, VERSION, ("subsequent_macro_part",))
    multi_part_macro!:
        multi_statement_body
    subsequent_macro_part!:
        multi_statement_body

The compiler will check that the syntax used matches the declared kind.
For convenience, the decorator macro_processor is provided in the macros module to mark a function as a macro processor:

    def macro_processor(kind, version, *additional_names):
        def deco(func):
            return func, kind, version, additional_names
        return deco
Which can be used to help declare macro processors, for example:
    @macros.macro_processor(macros.STMT_MACRO, 1_08)
    def switch(astnode):
        ...
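To make the resulting four-tuple concrete, the sketch below reproduces the decorator and checks a decorated function against the rules for func, kind, version and additional_names. The three kind constants are hypothetical stand-ins for those of the proposed macros module, and check_processor is an illustrative helper, not something this PEP defines.

```python
import inspect

# Hypothetical stand-ins for the constants of the proposed macros module.
STMT_MACRO, SIBLING_MACRO, EXPR_MACRO = "stmt", "sibling", "expr"


def macro_processor(kind, version, *additional_names):
    def deco(func):
        return func, kind, version, additional_names
    return deco


@macro_processor(STMT_MACRO, 1_08, "finally")
def try_(try_node, finally_node):
    return try_node


def check_processor(processor):
    """Raise ValueError if the four-tuple breaks one of the stated rules."""
    func, kind, version, additional_names = processor
    if kind not in (STMT_MACRO, SIBLING_MACRO, EXPR_MACRO):
        raise ValueError("unknown kind")
    if not isinstance(version, int):
        raise ValueError("version must be an integer")
    if not all(isinstance(n, str) for n in additional_names):
        raise ValueError("additional names must be strings")
    if additional_names and kind != STMT_MACRO:
        raise ValueError("only STMT_MACRO may have additional names")
    arity = len(inspect.signature(func).parameters)
    if arity != len(additional_names) + 1:
        raise ValueError("func must take len(additional_names)+1 arguments")
    return True


print(try_[1:])               # ('stmt', 108, ('finally',))
print(check_processor(try_))  # True
```

The literal 1_08 is simply the integer 108; the underscore is a digit separator.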
Two new AST nodes will be needed to express macros, macro_stmt and macro_expr.

    class macro_stmt(_ast.stmt):
        _fields = "name", "args", "importname", "asname", "body"

    class macro_expr(_ast.expr):
        _fields = "name", "args"
In addition, macro processors will need a means to express control flow or side-effecting code that produces a value. A new AST node called stmt_expr will be added, combining a statement and an expression. This new AST node will be a subtype of expr, but include a statement to allow side effects. It will be compiled to bytecode by compiling the statement, then compiling the value.

    class stmt_expr(_ast.expr):
        _fields = "stmt", "value"
Macro processors will often need to create new variables. Those variables need to be named in such a way as to avoid contaminating the original code and other macros. No rules for naming will be enforced, but to ensure hygiene and help debugging, the following naming scheme is recommended:
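* All generated names should start with a $.
* Purely artificial variable names should start $$mname, where mname is the name of the macro.
* Variables derived from real variables should start $vname, where vname is the name of the variable.
* All variable names should include the line number and the column offset, separated by an underscore.

Examples:

* Purely generated name: $$macro_17_0
* Name derived from a variable for an expression macro: $var_12_5

It is common to encode tables of data in Python as large dictionaries. However, these can be hard to maintain and error prone. Macros allow such data to be written in a more readable format. Then, at compile time, the data can be verified and converted to an efficient format.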
For example, suppose we have two dictionary literals mapping codes to names, and vice versa. This is error prone, as the dictionaries may have duplicate keys, or one table may not be the inverse of the other. A macro could generate the two mappings from a single table and, at the same time, verify that no duplicates are present.

    color_to_code = {
        "red": 1,
        "blue": 2,
        "green": 3,
    }

    code_to_color = {
        1: "red",
        2: "blue",
        3: "yellow", # error
    }
would become:
    bijection! color_to_code, code_to_color:
        "red" = 1
        "blue" = 2
        "green" = 3
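The check that such a macro would perform at compile time can be sketched as an ordinary function. The build_bijection name and its list-of-pairs input are assumptions made for illustration; a real processor would read the pairs from the macro body's AST and emit two dictionary literals.

```python
def build_bijection(pairs):
    """Build forward and inverse mappings, rejecting duplicates on either side."""
    forward, inverse = {}, {}
    for key, value in pairs:
        if key in forward:
            raise ValueError(f"duplicate key: {key!r}")
        if value in inverse:
            raise ValueError(f"duplicate value: {value!r}")
        forward[key] = value
        inverse[value] = key
    return forward, inverse


color_to_code, code_to_color = build_bijection(
    [("red", 1), ("blue", 2), ("green", 3)]
)
print(code_to_color[3])  # green
```

Because both mappings come from one table, the inverse cannot drift out of step with the forward mapping, which is the error shown in the hand-written version.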
Where I see macros having real value is in specific domains, not in general-purpose language features.
For example, parsers. Here’s part of a parser definition for Python, using macros:

    choice! single_input:
        NEWLINE
        simple_stmt
        sequence!:
            compound_stmt
            NEWLINE
Runtime compilers, such as numba, have to reconstitute the Python source, or attempt to analyze the bytecode. It would be simpler and more reliable for them to get the AST directly:

    from! my.jit.library import jit

    jit!
    def func():
        ...
When matching something representing syntax, such as a Python ast node or a sympy expression, it is convenient to match against the actual syntax, not the data structure representing it. For example, a calculator could be implemented using a domain-specific macro for matching syntax:

    from! ast_matcher import match

    def calculate(node):
        if isinstance(node, Num):
            return node.n
        match! node:
            case! a + b:
                return calculate(a) + calculate(b)
            case! a - b:
                return calculate(a) - calculate(b)
            case! a * b:
                return calculate(a) * calculate(b)
            case! a / b:
                return calculate(a) / calculate(b)
Which could be converted to:
    def calculate(node):
        if isinstance(node, Num):
            return node.n
        $$match_4_0 = node
        if isinstance($$match_4_0, _ast.Add):
            a, b = $$match_4_0.left, $$match_4_0.right
            return calculate(a) + calculate(b)
        elif isinstance($$match_4_0, _ast.Sub):
            a, b = $$match_4_0.left, $$match_4_0.right
            return calculate(a) - calculate(b)
        elif isinstance($$match_4_0, _ast.Mul):
            a, b = $$match_4_0.left, $$match_4_0.right
            return calculate(a) * calculate(b)
        elif isinstance($$match_4_0, _ast.Div):
            a, b = $$match_4_0.left, $$match_4_0.right
            return calculate(a) / calculate(b)
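For comparison, a hand-written calculator that runs on the current ast module is sketched below. It is not the output of any macro: it tests the operator of an ast.BinOp node rather than the node itself, uses ast.Constant where the listing above uses Num (recent Python versions no longer provide ast.Num), and raises an error for unsupported syntax, all of which are choices made for this example.

```python
import ast


def calculate(node):
    """Evaluate an arithmetic expression given as an ast node."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        a, b = node.left, node.right
        if isinstance(node.op, ast.Add):
            return calculate(a) + calculate(b)
        if isinstance(node.op, ast.Sub):
            return calculate(a) - calculate(b)
        if isinstance(node.op, ast.Mult):
            return calculate(a) * calculate(b)
        if isinstance(node.op, ast.Div):
            return calculate(a) / calculate(b)
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")


expression = ast.parse("1 + 2 * 3 - 4 / 2", mode="eval").body
print(calculate(expression))  # 5.0
```

The amount of isinstance boilerplate here is what the match! macro is meant to hide.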
Annotations, either decorators or PEP 3107 function annotations, have a runtime cost even if they serve only as markers for checkers or as documentation.

    @do_nothing_marker
    def foo(...):
        ...
can be replaced with the zero-cost macro:
    do_nothing_marker!:
    def foo(...):
        ...
Although macros would be most valuable for domain-specific extensions, it is possible to demonstrate potential language extensions using macros.
The f-string f"..." could be implemented as a macro as f!("..."). Not quite as nice to read, but it would still be useful for experimenting with.

    try_!:
        body
    finally!:
        closing
Would be translated roughly as:
    try:
        body
    except:
        closing
    else:
        closing
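A runnable approximation of that translation in current Python is sketched below. It makes explicit one step that the rough translation leaves out, and which is an assumption of this sketch: after running the closing code, the except branch re-raises so that the error propagates as it would with a real finally. The run_with_closing helper is hypothetical, and the sketch does not cover a return, break or continue inside the body.

```python
def run_with_closing(body, closing):
    """Emulate try/finally using only try/except/else."""
    try:
        result = body()
    except BaseException:
        closing()
        raise  # assumption: propagate the error, as finally would
    else:
        closing()
        return result


log = []
print(run_with_closing(lambda: 42, lambda: log.append("closed")))  # 42

try:
    run_with_closing(lambda: 1 / 0, lambda: log.append("closed"))
except ZeroDivisionError:
    log.append("error propagated")

print(log)  # ['closed', 'closed', 'error propagated']
```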
    with! open(filename) as fd:
        return fd.read()

The above would require handling open specially. An alternative that would be more explicit would be:

    with! open!(filename) as fd:
        return fd.read()
Languages that have syntactic macros usually provide a macro for defining macros. This PEP intentionally does not do that, as it is not yet clear what a good design would be, and we want to allow the community to define their own macros.
One possible form could be:

    macro_def! name:
        input:
            ... # input pattern, defining meta-variables
        output:
            ... # output pattern, using meta-variables
This PEP is fully backwards compatible.
For code that doesn’t use macros, there will be no effect on performance.
For code that does use macros and has already been compiled to bytecode, there will be some slight overhead to check that the versions of the macros used to compile the code match the imported macro processors.
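That check could be as simple as comparing recorded version numbers. The sketch below assumes, purely for illustration, that the cached bytecode carries a mapping from macro name to the version it was compiled with; this PEP does not specify how or where that information is stored, and cache_is_valid is a hypothetical helper.

```python
def cache_is_valid(recorded_versions, registry):
    """Return True if every macro used at compile time still has the same version.

    recorded_versions: {macro name: version stored alongside the cached bytecode}
    registry: {macro name: (func, kind, version, additional_names)}
    """
    for name, version in recorded_versions.items():
        processor = registry.get(name)
        if processor is None or processor[2] != version:
            return False  # missing or changed processor: recompile
    return True


registry = {"switch": (lambda tree: tree, "stmt", 108, ())}
print(cache_is_valid({"switch": 108}, registry))  # True
print(cache_is_valid({"switch": 107}, registry))  # False
```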
For code that has not been compiled, or has been compiled with different versions of the macro processors, there would be the usual overhead of bytecode compilation, plus any additional overhead of macro processing.
It is worth noting that the speed of source to bytecode compilation is largely irrelevant for Python performance.
In order to allow transformation of the AST at compile time by Python code, all AST nodes in the compiler will have to be Python objects.
To do that efficiently will mean making all the nodes in the _ast module immutable, so as not to degrade performance by much. They will need to be immutable to guarantee that the AST remains a tree, to avoid having to support cyclic GC. Making them immutable means they will not have a __dict__ attribute, making them compact.
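The properties being relied on can be demonstrated with ordinary Python classes. The Name and Add classes below are hypothetical stand-ins written with frozen dataclasses and __slots__; they are not the proposed _ast implementation, which would presumably be written in C.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Name:
    __slots__ = ("id",)
    id: str


@dataclass(frozen=True)
class Add:
    __slots__ = ("left", "right")
    left: object
    right: object


# Children must exist before their parent is built, so the result is a tree.
tree = Add(Name("a"), Name("b"))
print(hasattr(tree, "__dict__"))  # False: no per-instance dictionary

try:
    tree.left = tree  # an attempt to create a cycle
except FrozenInstanceError:
    print("nodes cannot be modified after construction")
```

Since a node can only refer to nodes created before it, and cannot be changed afterwards, no reference cycle between nodes can be formed.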
AST nodes in the ast module will remain mutable.
Currently, all AST nodes are allocated using an arena allocator. Changing to use the standard allocator might slow compilation down a little, but has advantages in terms of maintenance, as much code can be deleted.
None as yet.
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Source: https://github.com/python/peps/blob/main/peps/pep-0638.rst
Last modified: 2025-02-01 08:55:40 GMT