Code generation attributes
The following attributes are used for controlling code generation.
The inline attribute
The inline attribute suggests whether a copy of the attributed function’s code should be placed in the caller rather than generating a call to the function.
Example
    #[inline]
    pub fn example1() {}

    #[inline(always)]
    pub fn example2() {}

    #[inline(never)]
    pub fn example3() {}
Note
rustc automatically inlines functions when doing so seems worthwhile. Use this attribute carefully as poor decisions about what to inline can slow down programs.
The syntax for the inline attribute is:
Syntax
InlineAttribute →
      inline(always)
    | inline(never)
    | inline
The inline attribute may only be applied to functions with bodies: closures, async blocks, free functions, associated functions in an inherent impl or trait impl, and associated functions in a trait definition when those functions have a default definition.
Note
rustc ignores use in other positions but lints against it. This may become an error in the future.
Note
Though the attribute can be applied to closures and async blocks, the usefulness of this is limited as we do not yet support attributes on expressions.
    // We allow attributes on statements.
    #[inline] || (); // OK
    #[inline] async {}; // OK

    // We don't yet allow attributes on expressions.
    let f = #[inline] || (); // ERROR
Only the first use of inline on a function has effect.
Note
rustc lints against any use following the first. This may become an error in the future.
The inline attribute supports these modes:
- #[inline] suggests performing inline expansion.
- #[inline(always)] suggests that inline expansion should always be performed.
- #[inline(never)] suggests that inline expansion should never be performed.
Note
In every form the attribute is a hint. The compiler may ignore it.
When inline is applied to a function in a trait, it applies only to the code of the default definition.
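As a small sketch of this rule (the trait `Shape` and the types `Square` and `Unit` are hypothetical names chosen for illustration), the attribute on a provided method annotates only that default body. An impl that overrides the method supplies a separate body, so it needs its own attribute if a hint is wanted there as well.

```rust
/// Hypothetical trait used only for illustration.
trait Shape {
    fn area(&self) -> f64;

    // The hint applies to this default body only.
    #[inline]
    fn doubled_area(&self) -> f64 {
        self.area() * 2.0
    }
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }

    // This override is a separate function body, so it carries its own hint.
    #[inline]
    fn doubled_area(&self) -> f64 {
        2.0 * self.0 * self.0
    }
}

struct Unit;

impl Shape for Unit {
    fn area(&self) -> f64 {
        1.0
    }
    // No override: `Unit` uses the default `doubled_area` defined in the trait.
}

fn main() {
    println!("{}", Square(3.0).doubled_area());
    println!("{}", Unit.doubled_area());
}
```

Because the attribute is only a hint, neither form changes what the program computes.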
When inline is applied to an async function or async closure, it applies only to the code of the generated poll function.
Note
For more details, see Rust issue #129347.
The inline attribute is ignored if the function is externally exported with no_mangle or export_name.
The cold attribute
The cold attribute suggests that the attributed function is unlikely to be called, which may help the compiler produce better code.
Example
    #[cold]
    pub fn example() {}
The cold attribute uses the MetaWord syntax.
The cold attribute may only be applied to functions with bodies: closures, async blocks, free functions, associated functions in an inherent impl or trait impl, and associated functions in a trait definition when those functions have a default definition.
Note
rustc ignores use in other positions but lints against it. This may become an error in the future.
Note
Though the attribute can be applied to closures and async blocks, the usefulness of this is limited as we do not yet support attributes on expressions.
Only the first use of cold on a function has effect.
Note
rustc lints against any use following the first. This may become an error in the future.
When cold is applied to a function in a trait, it applies only to the code of the default definition.
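One common place for this hint is a rarely taken error path. The following is a minimal sketch under that assumption; `parse_digit` and `not_a_digit` are hypothetical names, and whether the hint changes the generated code is up to the compiler.

```rust
/// Hypothetical parser: the failure path is expected to be rare.
fn parse_digit(c: char) -> Result<u32, String> {
    match c.to_digit(10) {
        Some(d) => Ok(d),
        None => Err(not_a_digit(c)),
    }
}

// Marked cold as a hint that calls to this function are unlikely.
#[cold]
fn not_a_digit(c: char) -> String {
    format!("not a digit: {c:?}")
}

fn main() {
    println!("{:?}", parse_digit('7'));
    println!("{:?}", parse_digit('x'));
}
```

Keeping the unlikely work in its own function gives the attribute a function body to attach to, which matters because attributes on expressions are not yet supported.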
The naked attribute
The naked attribute prevents the compiler from emitting a function prologue and epilogue for the attributed function.
The function body must consist of exactly one naked_asm! macro invocation.
No function prologue or epilogue is generated for the attributed function. The assembly code in the naked_asm! block constitutes the full body of a naked function.
The naked attribute is an unsafe attribute. Annotating a function with #[unsafe(naked)] comes with the safety obligation that the body must respect the function’s calling convention, uphold its signature, and either return or diverge (i.e., not fall through past the end of the assembly code).
The assembly code may assume that the call stack and register state are valid on entry as per the signature and calling convention of the function.
The assembly code may not be duplicated by the compiler except when monomorphizing polymorphic functions.
Note
Guaranteeing when the assembly code may or may not be duplicated is important for naked functions that define symbols.
The unused_variables lint is suppressed within naked functions.
The inline attribute cannot be applied to a naked function.
The track_caller attribute cannot be applied to a naked function.
The testing attributes cannot be applied to a naked function.
The no_builtins attribute
The no_builtins attribute disables optimization of certain code patterns related to calls to library functions that are assumed to exist.
Example
    #![no_builtins]

    fn main() {}
The no_builtins attribute uses the MetaWord syntax.
The no_builtins attribute can only be applied to the crate root.
Only the first use of the no_builtins attribute has effect.
Note
rustc lints against any use following the first.
The target_feature attribute
The target_feature attribute may be applied to a function to enable code generation of that function for specific platform architecture features. It uses the MetaListNameValueStr syntax with a single key of enable whose value is a string of comma-separated feature names to enable.
    #[cfg(target_feature = "avx2")]
    #[target_feature(enable = "avx2")]
    fn foo_avx2() {}
Each target architecture has a set of features that may be enabled. It is an error to specify a feature for a target architecture that the crate is not being compiled for.
Closures defined within a target_feature-annotated function inherit the attribute from the enclosing function.
It is undefined behavior to call a function that is compiled with a feature that is not supported on the current platform the code is running on, except if the platform explicitly documents this to be safe.
The following restrictions apply unless otherwise specified by the platform rules below:
- Safe #[target_feature] functions (and closures that inherit the attribute) can only be safely called within a caller that enables all the target_features that the callee enables. This restriction does not apply in an unsafe context.
- Safe #[target_feature] functions (and closures that inherit the attribute) can only be coerced to safe function pointers in contexts that enable all the target_features that the coercee enables. This restriction does not apply to unsafe function pointers.
Implicitly enabled features are included in this rule. For example, an sse2 function can call ones marked with sse.
    #[target_feature(enable = "sse")]
    fn foo_sse() {}

    fn bar() {
        // Calling `foo_sse` here is unsafe, as we must ensure that SSE is
        // available first, even if `sse` is enabled by default on the target
        // platform or manually enabled as compiler flags.
        unsafe {
            foo_sse();
        }
    }

    #[target_feature(enable = "sse")]
    fn bar_sse() {
        // Calling `foo_sse` here is safe.
        foo_sse();
        || foo_sse();
    }

    #[target_feature(enable = "sse2")]
    fn bar_sse2() {
        // Calling `foo_sse` here is safe because `sse2` implies `sse`.
        foo_sse();
    }
A function with a #[target_feature] attribute never implements the Fn family of traits, although closures inheriting features from the enclosing function do.
The #[target_feature] attribute is not allowed in the following places:
- the main function
- a panic_handler function
- safe trait methods
- safe default functions in traits
Functions marked with target_feature are not inlined into a context that does not support the given features. The #[inline(always)] attribute may not be used with a target_feature attribute.
Available features
The following is a list of the available feature names.
x86 or x86_64
Executing code with unsupported features is undefined behavior on this platform. Hence on this platform usage of #[target_feature] functions follows the above restrictions.
| Feature | Implicitly Enables | Description |
|---|---|---|
| adx | | ADX — Multi-Precision Add-Carry Instruction Extensions |
| aes | sse2 | AES — Advanced Encryption Standard |
| avx | sse4.2 | AVX — Advanced Vector Extensions |
| avx2 | avx | AVX2 — Advanced Vector Extensions 2 |
| avx512bf16 | avx512bw | AVX512-BF16 — Advanced Vector Extensions 512-bit - Bfloat16 Extensions |
| avx512bitalg | avx512bw | AVX512-BITALG — Advanced Vector Extensions 512-bit - Bit Algorithms |
| avx512bw | avx512f | AVX512-BW — Advanced Vector Extensions 512-bit - Byte and Word Instructions |
| avx512cd | avx512f | AVX512-CD — Advanced Vector Extensions 512-bit - Conflict Detection Instructions |
| avx512dq | avx512f | AVX512-DQ — Advanced Vector Extensions 512-bit - Doubleword and Quadword Instructions |
| avx512f | avx2, fma, f16c | AVX512-F — Advanced Vector Extensions 512-bit - Foundation |
| avx512fp16 | avx512bw | AVX512-FP16 — Advanced Vector Extensions 512-bit - Float16 Extensions |
| avx512ifma | avx512f | AVX512-IFMA — Advanced Vector Extensions 512-bit - Integer Fused Multiply Add |
| avx512vbmi | avx512bw | AVX512-VBMI — Advanced Vector Extensions 512-bit - Vector Byte Manipulation Instructions |
| avx512vbmi2 | avx512bw | AVX512-VBMI2 — Advanced Vector Extensions 512-bit - Vector Byte Manipulation Instructions 2 |
| avx512vl | avx512f | AVX512-VL — Advanced Vector Extensions 512-bit - Vector Length Extensions |
| avx512vnni | avx512f | AVX512-VNNI — Advanced Vector Extensions 512-bit - Vector Neural Network Instructions |
| avx512vp2intersect | avx512f | AVX512-VP2INTERSECT — Advanced Vector Extensions 512-bit - Vector Pair Intersection to a Pair of Mask Registers |
| avx512vpopcntdq | avx512f | AVX512-VPOPCNTDQ — Advanced Vector Extensions 512-bit - Vector Population Count Instruction |
| avxifma | avx2 | AVX-IFMA — Advanced Vector Extensions - Integer Fused Multiply Add |
| avxneconvert | avx2 | AVX-NE-CONVERT — Advanced Vector Extensions - No-Exception Floating-Point conversion Instructions |
| avxvnni | avx2 | AVX-VNNI — Advanced Vector Extensions - Vector Neural Network Instructions |
| avxvnniint16 | avx2 | AVX-VNNI-INT16 — Advanced Vector Extensions - Vector Neural Network Instructions with 16-bit Integers |
| avxvnniint8 | avx2 | AVX-VNNI-INT8 — Advanced Vector Extensions - Vector Neural Network Instructions with 8-bit Integers |
| bmi1 | | BMI1 — Bit Manipulation Instruction Sets |
| bmi2 | | BMI2 — Bit Manipulation Instruction Sets 2 |
| cmpxchg16b | | cmpxchg16b — Compare and exchange 16 bytes (128 bits) of data atomically |
| f16c | avx | F16C — 16-bit floating point conversion instructions |
| fma | avx | FMA3 — Three-operand fused multiply-add |
| fxsr | | fxsave and fxrstor — Save and restore x87 FPU, MMX Technology, and SSE State |
| gfni | sse2 | GFNI — Galois Field New Instructions |
| kl | sse2 | KEYLOCKER — Intel Key Locker Instructions |
| lzcnt | | lzcnt — Leading zeros count |
| movbe | | movbe — Move data after swapping bytes |
| pclmulqdq | sse2 | pclmulqdq — Packed carry-less multiplication quadword |
| popcnt | | popcnt — Count of bits set to 1 |
| rdrand | | rdrand — Read random number |
| rdseed | | rdseed — Read random seed |
| sha | sse2 | SHA — Secure Hash Algorithm |
| sha512 | avx2 | SHA512 — Secure Hash Algorithm with 512-bit digest |
| sm3 | avx | SM3 — ShangMi 3 Hash Algorithm |
| sm4 | avx2 | SM4 — ShangMi 4 Cipher Algorithm |
| sse | | SSE — Streaming SIMD Extensions |
| sse2 | sse | SSE2 — Streaming SIMD Extensions 2 |
| sse3 | sse2 | SSE3 — Streaming SIMD Extensions 3 |
| sse4.1 | ssse3 | SSE4.1 — Streaming SIMD Extensions 4.1 |
| sse4.2 | sse4.1 | SSE4.2 — Streaming SIMD Extensions 4.2 |
| sse4a | sse3 | SSE4a — Streaming SIMD Extensions 4a |
| ssse3 | sse3 | SSSE3 — Supplemental Streaming SIMD Extensions 3 |
| tbm | | TBM — Trailing Bit Manipulation |
| vaes | avx2, aes | VAES — Vector AES Instructions |
| vpclmulqdq | avx, pclmulqdq | VPCLMULQDQ — Vector Carry-less multiplication of Quadwords |
| widekl | kl | KEYLOCKER_WIDE — Intel Wide Keylocker Instructions |
| xsave | | xsave — Save processor extended states |
| xsavec | | xsavec — Save processor extended states with compaction |
| xsaveopt | | xsaveopt — Save processor extended states optimized |
| xsaves | | xsaves — Save processor extended states supervisor |
aarch64
On this platform the usage of #[target_feature] functions follows the above restrictions.
Further documentation on these features can be found in the ARM Architecture Reference Manual, or elsewhere on developer.arm.com.
Note
The following pairs of features should both be marked as enabled or disabled together if used:
- paca and pacg, which LLVM currently implements as one feature.
| Feature | Implicitly Enables | Feature Name |
|---|---|---|
| aes | neon | FEAT_AES & FEAT_PMULL — Advanced SIMD AES & PMULL instructions |
| bf16 | | FEAT_BF16 — BFloat16 instructions |
| bti | | FEAT_BTI — Branch Target Identification |
| crc | | FEAT_CRC — CRC32 checksum instructions |
| dit | | FEAT_DIT — Data Independent Timing instructions |
| dotprod | neon | FEAT_DotProd — Advanced SIMD Int8 dot product instructions |
| dpb | | FEAT_DPB — Data cache clean to point of persistence |
| dpb2 | dpb | FEAT_DPB2 — Data cache clean to point of deep persistence |
| f32mm | sve | FEAT_F32MM — SVE single-precision FP matrix multiply instruction |
| f64mm | sve | FEAT_F64MM — SVE double-precision FP matrix multiply instruction |
| fcma | neon | FEAT_FCMA — Floating point complex number support |
| fhm | fp16 | FEAT_FHM — Half-precision FP FMLAL instructions |
| flagm | | FEAT_FLAGM — Conditional flag manipulation |
| fp16 | neon | FEAT_FP16 — Half-precision FP data processing |
| frintts | | FEAT_FRINTTS — Floating-point to int helper instructions |
| i8mm | | FEAT_I8MM — Int8 Matrix Multiplication |
| jsconv | neon | FEAT_JSCVT — JavaScript conversion instruction |
| lor | | FEAT_LOR — Limited Ordering Regions extension |
| lse | | FEAT_LSE — Large System Extensions |
| mte | | FEAT_MTE & FEAT_MTE2 — Memory Tagging Extension |
| neon | | FEAT_AdvSimd & FEAT_FP — Floating Point and Advanced SIMD extension |
| paca | | FEAT_PAUTH — Pointer Authentication (address authentication) |
| pacg | | FEAT_PAUTH — Pointer Authentication (generic authentication) |
| pan | | FEAT_PAN — Privileged Access-Never extension |
| pmuv3 | | FEAT_PMUv3 — Performance Monitors extension (v3) |
| rand | | FEAT_RNG — Random Number Generator |
| ras | | FEAT_RAS & FEAT_RASv1p1 — Reliability, Availability and Serviceability extension |
| rcpc | | FEAT_LRCPC — Release consistent Processor Consistent |
| rcpc2 | rcpc | FEAT_LRCPC2 — RcPc with immediate offsets |
| rdm | neon | FEAT_RDM — Rounding Double Multiply accumulate |
| sb | | FEAT_SB — Speculation Barrier |
| sha2 | neon | FEAT_SHA1 & FEAT_SHA256 — Advanced SIMD SHA instructions |
| sha3 | sha2 | FEAT_SHA512 & FEAT_SHA3 — Advanced SIMD SHA instructions |
| sm4 | neon | FEAT_SM3 & FEAT_SM4 — Advanced SIMD SM3/4 instructions |
| spe | | FEAT_SPE — Statistical Profiling Extension |
| ssbs | | FEAT_SSBS & FEAT_SSBS2 — Speculative Store Bypass Safe |
| sve | neon | FEAT_SVE — Scalable Vector Extension |
| sve2 | sve | FEAT_SVE2 — Scalable Vector Extension 2 |
| sve2-aes | sve2, aes | FEAT_SVE_AES & FEAT_SVE_PMULL128 — SVE AES instructions |
| sve2-bitperm | sve2 | FEAT_SVE2_BitPerm — SVE Bit Permute |
| sve2-sha3 | sve2, sha3 | FEAT_SVE2_SHA3 — SVE SHA3 instructions |
| sve2-sm4 | sve2, sm4 | FEAT_SVE2_SM4 — SVE SM4 instructions |
| tme | | FEAT_TME — Transactional Memory Extension |
| vh | | FEAT_VHE — Virtualization Host Extensions |
loongarch
On this platform the usage of #[target_feature] functions follows the above restrictions.
| Feature | Implicitly Enables | Description |
|---|---|---|
| f | | F — Single-precision floating-point instructions |
| d | f | D — Double-precision floating-point instructions |
| frecipe | | FRECIPE — Reciprocal approximation instructions |
| lasx | lsx | LASX — 256-bit vector instructions |
| lbt | | LBT — Binary translation instructions |
| lsx | d | LSX — 128-bit vector instructions |
| lvz | | LVZ — Virtualization instructions |
riscv32 or riscv64
On this platform the usage of #[target_feature] functions follows the above restrictions.
Further documentation on these features can be found in their respective specification. Many specifications are described in the RISC-V ISA Manual or in another manual hosted on the RISC-V GitHub Account.
| Feature | Implicitly Enables | Description |
|---|---|---|
| a | | A — Atomic instructions |
| c | | C — Compressed instructions |
| m | | M — Integer Multiplication and Division instructions |
| zb | zba, zbc, zbs | Zb — Bit Manipulation instructions |
| zba | | Zba — Address Generation instructions |
| zbb | | Zbb — Basic bit-manipulation |
| zbc | | Zbc — Carry-less multiplication |
| zbkb | | Zbkb — Bit Manipulation Instructions for Cryptography |
| zbkc | | Zbkc — Carry-less multiplication for Cryptography |
| zbkx | | Zbkx — Crossbar permutations |
| zbs | | Zbs — Single-bit instructions |
| zk | zkn, zkr, zks, zkt, zbkb, zbkc, zbkx | Zk — Scalar Cryptography |
| zkn | zknd, zkne, zknh, zbkb, zbkc, zbkx | Zkn — NIST Algorithm suite extension |
| zknd | | Zknd — NIST Suite: AES Decryption |
| zkne | | Zkne — NIST Suite: AES Encryption |
| zknh | | Zknh — NIST Suite: Hash Function Instructions |
| zkr | | Zkr — Entropy Source Extension |
| zks | zksed, zksh, zbkb, zbkc, zbkx | Zks — ShangMi Algorithm Suite |
| zksed | | Zksed — ShangMi Suite: SM4 Block Cipher Instructions |
| zksh | | Zksh — ShangMi Suite: SM3 Hash Function Instructions |
| zkt | | Zkt — Data Independent Execution Latency Subset |
wasm32 or wasm64
Safe #[target_feature] functions may always be used in safe contexts on Wasm platforms. It is impossible to cause undefined behavior via the #[target_feature] attribute because attempting to use instructions unsupported by the Wasm engine will fail at load time without the risk of being interpreted in a way different from what the compiler expected.
| Feature | Implicitly Enables | Description |
|---|---|---|
| bulk-memory | | WebAssembly bulk memory operations proposal |
| extended-const | | WebAssembly extended const expressions proposal |
| mutable-globals | | WebAssembly mutable global proposal |
| nontrapping-fptoint | | WebAssembly non-trapping float-to-int conversion proposal |
| relaxed-simd | simd128 | WebAssembly relaxed simd proposal |
| sign-ext | | WebAssembly sign extension operators proposal |
| simd128 | | WebAssembly simd proposal |
| multivalue | | WebAssembly multivalue proposal |
| reference-types | | WebAssembly reference-types proposal |
| tail-call | | WebAssembly tail-call proposal |
Additional information
See the target_feature conditional compilation option for selectively enabling or disabling compilation of code based on compile-time settings. Note that this option is not affected by the target_feature attribute, and is only driven by the features enabled for the entire crate.
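A minimal sketch of the conditional compilation option, assuming a hypothetical function named `backend`: exactly one of the two definitions is compiled, depending on whether avx2 is enabled for the whole crate.

```rust
/// Compiled only when AVX2 is enabled crate-wide.
/// (`backend` is a hypothetical name used for illustration.)
#[cfg(target_feature = "avx2")]
fn backend() -> &'static str {
    "avx2"
}

/// Compiled in every other configuration.
#[cfg(not(target_feature = "avx2"))]
fn backend() -> &'static str {
    "portable"
}

fn main() {
    // Reports "avx2" only when the feature is enabled for the entire crate,
    // for example with `-C target-feature=+avx2`.
    println!("compiled backend: {}", backend());
}
```

Adding `#[target_feature(enable = "avx2")]` to some function in the crate would not change which definition is selected here, since the option only looks at crate-wide settings.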
See the is_x86_feature_detected or is_aarch64_feature_detected macros in the standard library for runtime feature detection on these platforms.
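Runtime detection is typically paired with a fallback. The sketch below assumes hypothetical function names (`sum`, `sum_avx2`, `sum_portable`) and declares the feature-specific function as `unsafe fn`, a design choice that forces every call site to justify that the feature is present.

```rust
/// Portable fallback used when AVX2 is not available.
fn sum_portable(data: &[u32]) -> u32 {
    let mut total = 0u32;
    for &x in data {
        total = total.wrapping_add(x);
    }
    total
}

/// Same computation, compiled with AVX2 enabled so the optimizer is free to
/// use those instructions. Callers must ensure the CPU supports AVX2.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "avx2")]
unsafe fn sum_avx2(data: &[u32]) -> u32 {
    let mut total = 0u32;
    for &x in data {
        total = total.wrapping_add(x);
    }
    total
}

/// Picks an implementation at run time.
pub fn sum(data: &[u32]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: the check above confirmed that this CPU supports AVX2.
            return unsafe { sum_avx2(data) };
        }
    }
    sum_portable(data)
}

fn main() {
    println!("{}", sum(&[1, 2, 3, 4]));
}
```

Both paths compute the same wrapping sum, so the result does not depend on which one is taken.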
Note
rustc has a default set of features enabled for each target and CPU. The CPU may be chosen with the -C target-cpu flag. Individual features may be enabled or disabled for an entire crate with the -C target-feature flag.
The track_caller attribute
The track_caller attribute may be applied to any function with "Rust" ABI with the exception of the entry point fn main.
When applied to functions and methods in trait declarations, the attribute applies to all implementations. If the trait provides a default implementation with the attribute, then the attribute also applies to override implementations.
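As a sketch of the trait rule, assume a hypothetical trait `Probe` and type `Sensor`. The attribute appears only on the declaration, yet the implementation is expected to observe its caller. Since the location is a hint, treat the agreement printed below as what current rustc is expected to do rather than a guarantee.

```rust
use std::panic::Location;

/// Hypothetical trait: the attribute sits on the declaration.
trait Probe {
    #[track_caller]
    fn call_line(&self) -> u32;
}

struct Sensor;

impl Probe for Sensor {
    // No attribute here; it is taken from the trait declaration.
    fn call_line(&self) -> u32 {
        Location::caller().line()
    }
}

fn main() {
    // Both expressions sit on the same source line, so the reported line
    // can be compared against `line!()`.
    let (reported, expected) = (Sensor.call_line(), line!());
    println!("reported line {reported}, call is on line {expected}");
}
```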
When applied to a function in an extern block the attribute must also be applied to any linked implementations, otherwise undefined behavior results. When applied to a function which is made available to an extern block, the declaration in the extern block must also have the attribute, otherwise undefined behavior results.
Behavior
Applying the attribute to a function f allows code within f to get a hint of the Location of the “topmost” tracked call that led to f’s invocation. At the point of observation, an implementation behaves as if it walks up the stack from f’s frame to find the nearest frame of an unattributed function outer, and it returns the Location of the tracked call in outer.

    #[track_caller]
    fn f() {
        println!("{}", std::panic::Location::caller());
    }

Note
core provides core::panic::Location::caller for observing caller locations. It wraps the core::intrinsics::caller_location intrinsic implemented by rustc.
Note
Because the resulting Location is a hint, an implementation may halt its walk up the stack early. See Limitations for important caveats.
Examples
When f is called directly by calls_f, code in f observes its callsite within calls_f:

    #[track_caller]
    fn f() {
        println!("{}", std::panic::Location::caller());
    }

    fn calls_f() {
        f(); // <-- f() prints this location
    }

When f is called by another attributed function g which is in turn called by calls_g, code in both f and g observes g’s callsite within calls_g:

    #[track_caller]
    fn f() {
        println!("{}", std::panic::Location::caller());
    }

    #[track_caller]
    fn g() {
        println!("{}", std::panic::Location::caller());
        f();
    }

    fn calls_g() {
        g(); // <-- g() prints this location twice, once itself and once from f()
    }

When g is called by another attributed function h which is in turn called by calls_h, all code in f, g, and h observes h’s callsite within calls_h:

    #[track_caller]
    fn f() {
        println!("{}", std::panic::Location::caller());
    }

    #[track_caller]
    fn g() {
        println!("{}", std::panic::Location::caller());
        f();
    }

    #[track_caller]
    fn h() {
        println!("{}", std::panic::Location::caller());
        g();
    }

    fn calls_h() {
        h(); // <-- prints this location three times, once itself, once from g(), once from f()
    }
And so on.
Limitations
This information is a hint and implementations are not required to preserve it.
In particular, coercing a function with #[track_caller] to a function pointer creates a shim which appears to observers to have been called at the attributed function’s definition site, losing actual caller information across virtual calls. A common example of this coercion is the creation of a trait object whose methods are attributed.
Note
The aforementioned shim for function pointers is necessary because rustc implements track_caller in a codegen context by appending an implicit parameter to the function ABI, but this would be unsound for an indirect call because the parameter is not a part of the function’s type and a given function pointer type may or may not refer to a function with the attribute. The creation of a shim hides the implicit parameter from callers of the function pointer, preserving soundness.
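The function pointer limitation can be observed with a small sketch. The name `caller_line` is hypothetical, and because the location is only a hint, the exact line reported through the pointer is left unspecified here; the point is that it is no longer the indirect call site.

```rust
use std::panic::Location;

#[track_caller]
fn caller_line() -> u32 {
    Location::caller().line()
}

fn main() {
    // Direct call: the reported line is the line of this call expression.
    let (direct, here) = (caller_line(), line!());

    // Coercing to a function pointer goes through a shim, so the reported
    // location is no longer the line of the indirect call below.
    let pointer: fn() -> u32 = caller_line;
    let (indirect, there) = (pointer(), line!());

    println!("direct call on line {here} reported line {direct}");
    println!("pointer call on line {there} reported line {indirect}");
}
```

Code that needs accurate locations across such an indirection could pass a `&'static Location<'static>` explicitly as an ordinary argument instead.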
The instruction_set attribute
The instruction_set attribute specifies the instruction set that a function will use during code generation. This allows mixing more than one instruction set in a single program.
Example
    #[instruction_set(arm::a32)]
    fn arm_code() {}

    #[instruction_set(arm::t32)]
    fn thumb_code() {}
The instruction_set attribute uses the MetaListPaths syntax to specify a single path consisting of the architecture family name and instruction set name.
The instruction_set attribute may only be applied to functions with bodies: closures, async blocks, free functions, associated functions in an inherent impl or trait impl, and associated functions in a trait definition when those functions have a default definition.
Note
rustc ignores use in other positions but lints against it. This may become an error in the future.
Note
Though the attribute can be applied to closures and async blocks, the usefulness of this is limited as we do not yet support attributes on expressions.
The instruction_set attribute may be used only once on a function.
The instruction_set attribute may only be used with a target that supports the given value.
When the instruction_set attribute is used, any inline assembly in the function must use the specified instruction set instead of the target default.
instruction_set on ARM
When targeting the ARMv4T and ARMv5te architectures, the supported values for instruction_set are:
- arm::a32 — Generate the function as A32 “ARM” code.
- arm::t32 — Generate the function as T32 “Thumb” code.
If the address of the function is taken as a function pointer, the low bit of the address will depend on the selected instruction set:
- For arm::a32 (“ARM”), it will be 0.
- For arm::t32 (“Thumb”), it will be 1.
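The low-bit rule can be inspected with a sketch like the following. It assumes the attribute is attached through `cfg_attr` only when building for an ARM target, so the code still compiles elsewhere. The helper `low_bit` is a hypothetical name, and on non-ARM targets the printed values say nothing about instruction sets. An ARM target that does not support one of the two values would reject the corresponding attribute.

```rust
// Assumption: attach the attribute only on ARM targets so that this sketch
// also builds on other architectures.
#[cfg_attr(target_arch = "arm", instruction_set(arm::a32))]
fn arm_code() -> u32 {
    1
}

#[cfg_attr(target_arch = "arm", instruction_set(arm::t32))]
fn thumb_code() -> u32 {
    2
}

/// Returns the low bit of a function's address (hypothetical helper).
fn low_bit(f: fn() -> u32) -> usize {
    (f as usize) & 1
}

fn main() {
    println!("arm_code low bit: {}", low_bit(arm_code));
    println!("thumb_code low bit: {}", low_bit(thumb_code));
}
```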