Procedural Macros
Procedural macros allow creating syntax extensions as execution of a function. Procedural macros come in one of three flavors:
- Function-like macros - custom!(...)
- Derive macros - #[derive(CustomDerive)]
- Attribute macros - #[CustomAttribute]
Procedural macros allow you to run code at compile time that operates over Rust syntax, both consuming and producing Rust syntax. You can sort of think of procedural macros as functions from an AST to another AST.
Procedural macros must be defined in the root of a crate with the crate type of proc-macro. The macros may not be used from the crate where they are defined, and can only be used when imported in another crate.
Note
When using Cargo, procedural macro crates are defined with the proc-macro key in your manifest:
    [lib]
    proc-macro = true
As functions, they must either return syntax, panic, or loop endlessly. Returned syntax either replaces or adds the syntax depending on the kind of procedural macro. Panics are caught by the compiler and are turned into a compiler error. Endless loops are not caught by the compiler, which causes the compiler to hang.
Procedural macros run during compilation, and thus have the same resources that the compiler has. For example, standard input, error, and output are the same that the compiler has access to. Similarly, file access is the same. Because of this, procedural macros have the same security concerns that Cargo’s build scripts have.
Procedural macros have two ways of reporting errors. The first is to panic. The second is to emit a compile_error macro invocation.
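As a minimal sketch of the second approach, the helper below builds the source text of a compile_error invocation. The name compile_error_source is hypothetical and not part of proc_macro; inside a real procedural macro the returned text would still have to be parsed into a TokenStream. The brace delimiter and the ::core:: prefix are assumptions of this sketch, chosen so that the invocation needs no trailing semicolon and does not depend on what the caller has in scope.

```rust
/// Builds the source text of a `compile_error!` invocation carrying `msg`.
///
/// Hypothetical helper for illustration only. A procedural macro would turn
/// the returned text into tokens with `.parse()` before returning it.
pub fn compile_error_source(msg: &str) -> String {
    // `{:?}` renders the message as a quoted string with escapes applied,
    // so embedded quotes do not end the literal early.
    format!("::core::compile_error! {{ {:?} }}", msg)
}
```

Compared with panicking, returning such an invocation lets the macro decide exactly which message the compiler reports.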
The proc_macro crate
Procedural macro crates almost always will link to the compiler-provided proc_macro crate. The proc_macro crate provides types required for writing procedural macros and facilities to make it easier.
This crate primarily contains a TokenStream type. Procedural macros operate over token streams instead of AST nodes, which is a far more stable interface over time for both the compiler and for procedural macros to target. A token stream is roughly equivalent to Vec<TokenTree> where a TokenTree can roughly be thought of as a lexical token. For example, foo is an Ident token, . is a Punct token, and 1.2 is a Literal token. The TokenStream type, unlike Vec<TokenTree>, is cheap to clone.
All tokens have an associated Span. A Span is an opaque value that cannot be modified but can be manufactured. Spans represent an extent of source code within a program and are primarily used for error reporting. While you cannot modify a Span itself, you can always change the Span associated with any token, such as through getting a Span from another token.
Procedural macro hygiene
Procedural macros are unhygienic. This means they behave as if the output token stream was simply written inline to the code it’s next to. As a result, the output is affected by external items and also affects external imports.
Macro authors need to be careful to ensure their macros work in as many contexts as possible given this limitation. This often includes using absolute paths to items in libraries (for example, ::std::option::Option instead of Option) or ensuring that generated functions have names that are unlikely to clash with other functions (like __internal_foo instead of foo).
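The value of absolute paths can be sketched without a separate proc-macro crate. The example below uses a macro_rules macro as a stand-in for generated code, on the assumption that item paths in its output are resolved where the macro is invoked, as they are for procedural macro output. The module shadowed, the macro some_answer, and the function answer are hypothetical names chosen for illustration.

```rust
pub mod shadowed {
    // A local type that shadows the prelude's `Option` inside this module.
    #[allow(dead_code)]
    pub struct Option;

    // Stand-in for generated code: the absolute path still reaches the
    // standard library even though a bare `Option` now names the struct above.
    macro_rules! some_answer {
        () => {
            ::std::option::Option::Some(42u32)
        };
    }

    // The return type uses the absolute path for the same reason.
    pub fn answer() -> ::std::option::Option<u32> {
        some_answer!()
    }
}
```

Had the macro emitted a bare Option::Some(42u32) instead, the path would have resolved to the local struct and the module would fail to compile.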
Function-like procedural macros
Function-like procedural macros are procedural macros that are invoked using the macro invocation operator (!).
These macros are defined by a public function with the proc_macro attribute and a signature of (TokenStream) -> TokenStream. The input TokenStream is what is inside the delimiters of the macro invocation and the output TokenStream replaces the entire macro invocation.
The proc_macro attribute defines the macro in the macro namespace in the root of the crate.
For example, the following macro definition ignores its input and outputs a function answer into its scope.
    #![crate_type = "proc-macro"]
    extern crate proc_macro;
    use proc_macro::TokenStream;

    #[proc_macro]
    pub fn make_answer(_item: TokenStream) -> TokenStream {
        "fn answer() -> u32 { 42 }".parse().unwrap()
    }
And then we use it in a binary crate to print “42” to standard output.
    extern crate proc_macro_examples;
    use proc_macro_examples::make_answer;

    make_answer!();

    fn main() {
        println!("{}", answer());
    }
Function-like procedural macros may be invoked in any macro invocation position, which includes statements, expressions, patterns, type expressions, item positions, including items in extern blocks, inherent and trait implementations, and trait definitions.
Derive macros
Derive macros define new inputs for the derive attribute. These macros can create new items given the token stream of a struct, enum, or union. They can also define derive macro helper attributes.
Custom derive macros are defined by a public function with the proc_macro_derive attribute and a signature of (TokenStream) -> TokenStream.
The proc_macro_derive attribute defines the custom derive in the macro namespace in the root of the crate.
The input TokenStream is the token stream of the item that has the derive attribute on it. The output TokenStream must be a set of items that are then appended to the module or block that the item from the input TokenStream is in.
The following is an example of a derive macro. Instead of doing anything useful with its input, it just appends a function answer.
    #![crate_type = "proc-macro"]
    extern crate proc_macro;
    use proc_macro::TokenStream;

    #[proc_macro_derive(AnswerFn)]
    pub fn derive_answer_fn(_item: TokenStream) -> TokenStream {
        "fn answer() -> u32 { 42 }".parse().unwrap()
    }
And then using said derive macro:
    extern crate proc_macro_examples;
    use proc_macro_examples::AnswerFn;

    #[derive(AnswerFn)]
    struct Struct;

    fn main() {
        assert_eq!(42, answer());
    }
Derive macro helper attributes
Derive macros can add additional attributes into the scope of the item they are on. Said attributes are called derive macro helper attributes. These attributes are inert, and their only purpose is to be fed into the derive macro that defined them. That said, they can be seen by all macros.
The way to define helper attributes is to put an attributes key in the proc_macro_derive macro with a comma separated list of identifiers that are the names of the helper attributes.
For example, the following derive macro defines a helper attribute helper, but ultimately doesn’t do anything with it.
    #![crate_type = "proc-macro"]
    extern crate proc_macro;
    use proc_macro::TokenStream;

    #[proc_macro_derive(HelperAttr, attributes(helper))]
    pub fn derive_helper_attr(_item: TokenStream) -> TokenStream {
        TokenStream::new()
    }
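And then usage of the derive macro on a struct:
    // Assumption: HelperAttr lives in the same proc_macro_examples crate
    // that the earlier examples import from.
    extern crate proc_macro_examples;
    use proc_macro_examples::HelperAttr;

    #[derive(HelperAttr)]
    struct Struct {
        #[helper] field: ()
    }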
Attribute macros
Attribute macros define new outer attributes which can be attached to items, including items in extern blocks, inherent and trait implementations, and trait definitions.
Attribute macros are defined by a public function with the proc_macro_attribute attribute that has a signature of (TokenStream, TokenStream) -> TokenStream. The first TokenStream is the delimited token tree following the attribute’s name, not including the outer delimiters. If the attribute is written as a bare attribute name, the attribute TokenStream is empty. The second TokenStream is the rest of the item including other attributes on the item. The returned TokenStream replaces the item with an arbitrary number of items.
The proc_macro_attribute attribute defines the attribute in the macro namespace in the root of the crate.
For example, this attribute macro takes the input stream and returns it as is, effectively being the no-op of attributes.
    #![crate_type = "proc-macro"]
    extern crate proc_macro;
    use proc_macro::TokenStream;

    #[proc_macro_attribute]
    pub fn return_as_is(_attr: TokenStream, item: TokenStream) -> TokenStream {
        item
    }
The following example shows the stringified TokenStreams that the attribute macros see. The output will show in the output of the compiler. The output is shown in the comments after the function prefixed with “out:”.
    // my-macro/src/lib.rs
    extern crate proc_macro;
    use proc_macro::TokenStream;

    #[proc_macro_attribute]
    pub fn show_streams(attr: TokenStream, item: TokenStream) -> TokenStream {
        println!("attr: \"{attr}\"");
        println!("item: \"{item}\"");
        item
    }
    // src/lib.rs
    extern crate my_macro;

    use my_macro::show_streams;

    // Example: Basic function
    #[show_streams]
    fn invoke1() {}
    // out: attr: ""
    // out: item: "fn invoke1() {}"

    // Example: Attribute with input
    #[show_streams(bar)]
    fn invoke2() {}
    // out: attr: "bar"
    // out: item: "fn invoke2() {}"

    // Example: Multiple tokens in the input
    #[show_streams(multiple => tokens)]
    fn invoke3() {}
    // out: attr: "multiple => tokens"
    // out: item: "fn invoke3() {}"

    // Example:
    #[show_streams { delimiters }]
    fn invoke4() {}
    // out: attr: "delimiters"
    // out: item: "fn invoke4() {}"
Declarative macro tokens and procedural macro tokens
Declarative macro_rules macros and procedural macros use similar, but different definitions for tokens (or rather TokenTrees).
Token trees in macro_rules (corresponding to tt matchers) are defined as
- Delimited groups ((...), {...}, etc)
- All operators supported by the language, both single-character and multi-character ones (+, +=).
  - Note that this set doesn’t include the single quote '.
- Literals ("string", 1, etc)
  - Note that negation (e.g. -1) is never a part of such literal tokens, but a separate operator token.
- Identifiers, including keywords (ident, r#ident, fn)
- Lifetimes ('ident)
- Metavariable substitutions in macro_rules (e.g. $my_expr in macro_rules! mac { ($my_expr: expr) => { $my_expr } } after the mac’s expansion, which will be considered a single token tree regardless of the passed expression)
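The macro_rules definition above can be observed with a small counting macro. This is a sketch: count_tts, count_forwarded_expr, and the wrapper functions are hypothetical names introduced here, and each function returns how many tt fragments the matcher sees for one input.

```rust
// Counts the token trees in its input, one `tt` at a time.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

// Parses its input as an `expr` first, then forwards the substitution.
macro_rules! count_forwarded_expr {
    ($e:expr) => { count_tts!($e) };
}

// A multi-character operator is one token tree.
pub fn operator_count() -> usize { count_tts!(+=) }

// Negation is a separate operator token, so `-1` is two token trees.
pub fn negative_literal_count() -> usize { count_tts!(-1) }

// A lifetime is one token tree.
pub fn lifetime_count() -> usize { count_tts!('a) }

// A delimited group is one token tree regardless of its contents.
pub fn group_count() -> usize { count_tts!((a b c)) }

// Written out directly, `1 + 2` is three token trees.
pub fn raw_expr_count() -> usize { count_tts!(1 + 2) }

// Forwarded as an `expr` substitution, the same expression is one token tree.
pub fn forwarded_expr_count() -> usize { count_forwarded_expr!(1 + 2) }
```

The contrast between the last two functions is the metavariable substitution rule: once an expression has been captured as an expr fragment, later tt matchers treat it as a single unit.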
Token trees in procedural macros are defined as
- Delimited groups ((...), {...}, etc)
- All punctuation characters used in operators supported by the language (+, but not +=), and also the single quote ' character (typically used in lifetimes, see below for lifetime splitting and joining behavior)
- Literals ("string", 1, etc)
  - Negation (e.g. -1) is supported as a part of integer and floating point literals.
- Identifiers, including keywords (ident, r#ident, fn)
Mismatches between these two definitions are accounted for when token streams are passed to and from procedural macros.
Note that the conversions below may happen lazily, so they might not happen if the tokens are not actually inspected.
When passed to a proc-macro
- All multi-character operators are broken into single characters.
- Lifetimes are broken into a ' character and an identifier.
- All metavariable substitutions are represented as their underlying token streams.
  - Such token streams may be wrapped into delimited groups (Group) with implicit delimiters (Delimiter::None) when it’s necessary for preserving parsing priorities.
  - tt and ident substitutions are never wrapped into such groups and always represented as their underlying token trees.
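The first two conversions can be modelled with a toy data structure. The sketch below does not use the proc_macro crate, whose types only work inside a procedural macro; Spacing and Tree are hypothetical local stand-ins loosely modelled on proc_macro's Spacing and TokenTree, and split_operator and split_lifetime are illustrative helpers. It assumes the operator is followed by whitespace or a non-punctuation token, so the last character is marked Alone.

```rust
// Toy stand-ins for illustration; these are NOT the real proc_macro types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Spacing {
    /// Immediately followed by another punctuation character.
    Joint,
    /// Followed by anything else.
    Alone,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Tree {
    Punct(char, Spacing),
    Ident(String),
}

/// Breaks a multi-character operator such as `+=` into one `Punct` per
/// character. Every character except the last is `Joint`.
pub fn split_operator(op: &str) -> Vec<Tree> {
    let chars: Vec<char> = op.chars().collect();
    chars
        .iter()
        .enumerate()
        .map(|(i, &c)| {
            let spacing = if i + 1 < chars.len() {
                Spacing::Joint
            } else {
                Spacing::Alone
            };
            Tree::Punct(c, spacing)
        })
        .collect()
}

/// Breaks a lifetime such as `'a` into a joint `'` and an identifier.
pub fn split_lifetime(lifetime: &str) -> Vec<Tree> {
    let name = lifetime
        .strip_prefix('\'')
        .expect("a lifetime starts with a single quote");
    vec![
        Tree::Punct('\'', Spacing::Joint),
        Tree::Ident(name.to_string()),
    ]
}
```

The Joint marker is what allows the pieces to be glued back together when the tokens are emitted again.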
When emitted from a proc macro
- Punctuation characters are glued into multi-character operators when applicable.
- Single quotes ' joined with identifiers are glued into lifetimes.
- Negative literals are converted into two tokens (the - and the literal) possibly wrapped into a delimited group (Group) with implicit delimiters (Delimiter::None) when it’s necessary for preserving parsing priorities.
Note that neither declarative nor procedural macros support doc comment tokens (e.g. /// Doc), so they are always converted to token streams representing their equivalent #[doc = r"str"] attributes when passed to macros.
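The doc comment conversion can be seen from the declarative side with a matcher that only accepts doc attributes. In this sketch, doc_strings and collected_docs are hypothetical names; the macro is invoked with /// comments, yet its pattern mentions only the attribute form.

```rust
// Collects the string literal of every `#[doc = ...]` attribute it is given.
macro_rules! doc_strings {
    ($(#[doc = $text:literal])*) => {
        [$($text),*]
    };
}

// The input is written as doc comments, but the matcher above receives
// them as `#[doc = "..."]` attributes.
pub fn collected_docs() -> [&'static str; 2] {
    doc_strings!(
        /// first line
        /// second line
    )
}
```

The captured strings hold the comment text after the /// marker, so callers may want to trim surrounding whitespace before using them.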