Julia has a number of custom LLVM passes. Broadly, they can be classified into passes that are required to be run to maintain Julia semantics, and passes that take advantage of Julia semantics to optimize LLVM IR.
These passes are used to transform LLVM IR into code that is legal to be run on a CPU. Their main purpose is to enable simpler IR to be emitted by codegen, which then enables other LLVM passes to optimize common patterns.
llvm-cpufeatures.cpp
CPUFeaturesPass
module(CPUFeatures)
This pass lowers the julia.cpu.have_fma.(f32|f64) intrinsic to either true or false, depending on the target architecture and the target features present on the function. This intrinsic is often used to decide whether an algorithm that depends on fast fused multiply-add operations is a better choice than a standard algorithm that does not depend on such instructions.
llvm-demote-float16.cpp
DemoteFloat16Pass
function(DemoteFloat16)
This pass replaces float16 operations with float32 operations on architectures that do not natively support float16 operations. This is done by inserting fpext and fptrunc instructions around any float16 operation. On architectures that do support native float16 operations, this pass is a no-op.
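The extend, operate, truncate pattern described above can be modeled outside of LLVM. The following is a minimal sketch, not the pass itself: the helper names to_f16 and demoted_add are hypothetical, and a Python float (which is a float64) stands in for the wider float32 type.

```python
import struct

def to_f16(x: float) -> float:
    """Round x to the nearest float16 value (the role fptrunc plays)."""
    # The "e" format code is IEEE 754 binary16.
    return struct.unpack("<e", struct.pack("<e", x))[0]

def demoted_add(a: float, b: float) -> float:
    """Model a float16 add that is carried out in a wider type."""
    # fpext: widen the operands. Assumption: a Python float (float64)
    # stands in for float32 in this sketch.
    wide_a, wide_b = float(a), float(b)
    # The arithmetic itself happens in the wider type.
    wide_sum = wide_a + wide_b
    # fptrunc: round the result back to float16.
    return to_f16(wide_sum)

print(to_f16(0.1))                # 0.0999755859375
print(demoted_add(2048.0, 1.0))   # 2048.0, since float16 spacing at 2048 is 2
```

Because the result is truncated after every operation, each individual operation still produces a float16 value, which is the property the inserted fptrunc instructions provide.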
llvm-late-gc-lowering.cpp
LateLowerGCPass
function(LateLowerGCFrame)
This pass performs most of the GC rooting work required to track pointers between GC safepoints. It also lowers several intrinsics to their corresponding instructions, and is permitted to violate the non-integral invariants previously established (for example, pointer_from_objref is lowered to a ptrtoint instruction here). This pass typically takes the most time of all the custom Julia passes, because of the dataflow algorithm it uses to minimize the number of objects live at any safepoint.
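To give a feel for what "live at a safepoint" means, here is a minimal sketch of a backward liveness scan over a single straight-line block. It is an illustration under strong assumptions, not the algorithm the pass uses: the tuple encoding of instructions and the function name live_at_safepoints are hypothetical, and control flow is ignored entirely.

```python
def live_at_safepoints(block):
    """Return {instruction index: values live across that safepoint}.

    block is a list of (kind, defined, used) tuples describing one
    straight-line basic block (hypothetical encoding for this sketch).
    """
    live = set()      # values needed by instructions that come later
    result = {}
    for index in range(len(block) - 1, -1, -1):
        kind, defined, used = block[index]
        if kind == "safepoint":
            # Anything still needed after the safepoint must survive it.
            result[index] = set(live)
        # Standard backward transfer: kill definitions, add uses.
        live = (live - set(defined)) | set(used)
    return result

block = [
    ("alloc", ["a"], []),
    ("alloc", ["b"], []),
    ("safepoint", [], []),   # e.g. a call during which the GC may run
    ("use", [], ["b"]),      # last use of b
    ("safepoint", [], []),
    ("use", [], ["a"]),
]

roots = live_at_safepoints(block)
print(roots[2])  # a and b are both live across the first safepoint
print(roots[4])  # only a is live across the second one
```

In this model, b needs to be tracked at the first safepoint only, which is the kind of information that lets the number of objects considered live at each safepoint be kept small.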
llvm-final-gc-lowering.cpp
FinalLowerGCPass
module(FinalLowerGC)
This pass lowers a few last intrinsics to their final form, targeting functions in the libjulia library. Separating this from LateGCLowering enables other backends (GPU compilation) to supply their own custom lowerings for these intrinsics, enabling the Julia pipeline to be used on those backends as well.
llvm-lower-handlers.cpp
LowerExcHandlersPass
function(LowerExcHandlers)
This pass lowers exception handling intrinsics into calls to runtime functions that are actually called when handling exceptions.
llvm-remove-ni.cpp
RemoveNIPass
module(RemoveNI)
This pass removes the non-integral address spaces from the module's datalayout string. This enables the backend to lower Julia's custom address spaces directly to machine code, without a costly rewrite of every pointer operation to address space 0.
llvm-simdloop.cpp
LowerSIMDLoopPass
loop(LowerSIMDLoop)
This pass acts as the main driver of the @simd annotation. Codegen inserts a !llvm.loopid marker at the back branch of a loop, which this pass uses to identify loops that were originally marked with @simd. Then, this pass looks for a chain of floating point operations that form a reduction and adds the contract and reassoc fast math flags to allow reassociation (and thus vectorization). This pass does not preserve either loop information or inference correctness, so it may violate Julia semantics in surprising ways. If the loop was annotated with ivdep as well, then the pass marks the loop as having no loop-carried dependencies (the resulting behavior is undefined if the user annotation was incorrect or gets applied to the wrong loop).
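Reassociation matters because floating point addition is not associative, so a vectorized reduction can return a different value than the in-order loop. The sketch below illustrates this with a hypothetical two-lane accumulation; the function names and the lane count are assumptions chosen for the example, not something the pass prescribes.

```python
def sequential_sum(values):
    """Add the values strictly in order, as the unannotated loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

def two_lane_sum(values):
    """Accumulate even and odd elements separately, then combine.

    This mimics the order of operations of a 2-wide vectorized
    reduction, which is only legal if reassociation is allowed.
    """
    lanes = [0.0, 0.0]
    for i, v in enumerate(values):
        lanes[i % 2] += v
    return lanes[0] + lanes[1]

values = [1e16, 1.0, -1e16, 1.0]
in_order = sequential_sum(values)     # 1.0: the first 1.0 is absorbed by 1e16
reassociated = two_lane_sum(values)   # 2.0: the large terms cancel in lane 0
print(in_order, reassociated)
```

Both answers are valid floating point evaluations of the same sum; the reassoc flag tells LLVM that either order is acceptable.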
llvm-ptls.cpp
LowerPTLSPass
module(LowerPTLSPass)
This pass lowers thread-local Julia intrinsics to assembly instructions. Julia relies on thread-local storage for garbage collection and multithreading task scheduling. When compiling code for system images and package images, this pass replaces calls to intrinsics with loads from global variables that are initialized at load time.
If codegen produces a function with a swiftself argument and calling convention, this pass assumes the swiftself argument is the pgcstack and will replace the intrinsics with that argument. Doing so provides speedups on architectures that have slow thread-local storage accesses.
llvm-remove-addrspaces.cpp
RemoveAddrspacesPass
module(RemoveAddrspaces)
This pass renames pointers in one address space to another address space. This is used to remove Julia-specific address spaces from LLVM IR.
llvm-remove-addrspaces.cpp
RemoveJuliaAddrspacesPass
module(RemoveJuliaAddrspaces)
This pass removes Julia-specific address spaces from LLVM IR. It is mostly used for displaying LLVM IR in a less cluttered format. Internally, it is implemented on top of the RemoveAddrspaces pass.
llvm-multiversioning.cpp
MultiVersioningPass
module(JuliaMultiVersioning)
This pass performs modifications to a module to create functions that are optimized for running on different architectures (see sysimg.md and pkgimg.md for more details). Implementation-wise, it clones functions and applies different target-specific attributes to them to allow the optimizer to use advanced features such as vectorization and instruction scheduling for that platform. It also creates some infrastructure to enable the Julia image loader to select the appropriate version of the function to call based on the architecture the loader is running on. The target-specific attributes are controlled by the julia.mv.specs module flag, which during compilation is derived from the JULIA_CPU_TARGET environment variable. The pass must also be enabled by providing a julia.mv.enable module flag with a value of 1.
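The idea of cloning a function per target and picking one clone at load time can be sketched as follows. This is a conceptual model only: the clone names, the feature sets, and the "most features wins" rule in select_clone are all assumptions made for illustration, and the actual selection logic of the Julia image loader is not described here.

```python
def select_clone(clones, host_features):
    """Pick the usable clone that requires the most features.

    A clone is usable when every feature it requires is present on
    the host. The tie-breaking rule is an assumption of this sketch.
    """
    usable = [c for c in clones if c["requires"] <= host_features]
    return max(usable, key=lambda c: len(c["requires"]))

# Hypothetical clones of one function; the baseline requires nothing,
# so at least one clone is always usable.
clones = [
    {"name": "f.baseline", "requires": frozenset()},
    {"name": "f.avx2", "requires": frozenset({"avx2", "fma"})},
    {"name": "f.avx512", "requires": frozenset({"avx2", "fma", "avx512f"})},
]

older_host = {"sse4.2"}
newer_host = {"sse4.2", "avx2", "fma"}
print(select_clone(clones, older_host)["name"])  # f.baseline
print(select_clone(clones, newer_host)["name"])  # f.avx2
```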
Use of llvmcall with multiversioning is dangerous. llvmcall enables access to features not typically exposed by the Julia APIs, which are therefore usually not available on all architectures. If multiversioning is enabled and code generation is requested for a target architecture that does not support the feature required by an llvmcall expression, LLVM will probably error out, likely with an abort and the message "LLVM ERROR: Do not know how to split the result of this operator!".
llvm-gc-invariant-verifier.cpp
GCInvariantVerifierPass
module(GCInvariantVerifier)
This pass is used to verify Julia's invariants about LLVM IR. This includes things such as the nonexistence of ptrtoint in Julia's non-integral address spaces [nislides] and the existence of only blessed addrspacecast instructions (Tracked -> Derived, 0 -> Tracked, etc). It performs no transformations on IR.
These passes are used to perform transformations on LLVM IR that LLVM will not perform itself, e.g. fast math flag propagation, escape analysis, and optimizations on Julia-specific internal functions. They use knowledge about Julia's semantics to perform these optimizations.
llvm-muladd.cpp
CombineMulAddPass
function(CombineMulAdd)
This pass serves to optimize the particular combination of a regular fmul with a fast fadd into a contract fmul with a fast fadd. This is later optimized by the backend to a fused multiply-add instruction, which can provide significantly faster operations at the cost of more unpredictable semantics.
This optimization only occurs when the fmul has a single use, which is the fast fadd.
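The "unpredictable semantics" come from rounding: a fused multiply-add rounds once, while a separate multiply and add round twice, so the two can disagree. The sketch below emulates the fused result with exact rational arithmetic; the helper name fused_multiply_add is hypothetical and the emulation is an illustration, not how hardware or LLVM computes it.

```python
from fractions import Fraction

def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a*b + c with a single rounding at the end."""
    # Fractions represent the floats exactly, so no rounding happens
    # until the final conversion back to float.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

unfused = a * b + c                    # a*b rounds to 1.0 first, giving 0.0
fused = fused_multiply_add(a, b, c)    # keeps the exact product, giving -2**-60
print(unfused, fused)
```

Whether a given fmul/fadd pair ends up fused depends on the target, which is why the same source code can produce slightly different results on different machines once contraction is allowed.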
llvm-alloc-opt.cpp
AllocOptPass
function(AllocOpt)
Julia does not have the concept of a program stack as a place to allocate mutable objects. However, allocating objects on the stack reduces GC pressure and is critical for GPU compilation. Thus, AllocOpt performs heap to stack conversion of objects that it can prove do not escape the current function. It also performs a number of other optimizations on allocations, such as removing allocations that are never used, optimizing typeof calls on freshly allocated objects, and removing stores to allocations that are immediately overwritten. The escape analysis implementation is located in llvm-alloc-helpers.cpp. Currently, this pass does not use information from EscapeAnalysis.jl, though that may change in the future.
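As a rough intuition for "proving that an object does not escape", consider classifying every use of an allocation and refusing the stack conversion if any use could let the object outlive the function. The sketch below is a deliberately simplified model: the use categories and the name can_stack_allocate are hypothetical and do not mirror the analysis in llvm-alloc-helpers.cpp.

```python
# Uses that could make the object reachable after the function returns
# (hypothetical categories chosen for this sketch).
ESCAPING_USES = {"return", "stored_elsewhere", "passed_to_unknown_call"}

def can_stack_allocate(uses):
    """Return True only if no use of the allocation may let it escape."""
    return not any(use in ESCAPING_USES for use in uses)

# The object is only read and written locally, so it may live on the stack.
local_only = ["field_store", "field_load", "typeof"]
# The object is returned to the caller, so it must stay on the heap.
returned = ["field_store", "return"]

print(can_stack_allocate(local_only))  # True
print(can_stack_allocate(returned))    # False
```

The model is conservative in the same spirit as the text: any use it cannot classify as harmless should be treated as escaping.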
llvm-propagate-addrspaces.cpp
PropagateJuliaAddrspacesPass
function(PropagateJuliaAddrspaces)
This pass is used to propagate Julia-specific address spaces through operations on pointers. LLVM is not allowed to introduce or remove addrspacecast instructions by optimizations, so this pass acts to eliminate redundant addrspace casts by replacing operations with their equivalent in a Julia address space. For more information on Julia's address spaces, see (TODO link to llvm.md).
llvm-julia-licm.cpp
JuliaLICMPass
loop(JuliaLICM)
This pass is used to hoist Julia-specific intrinsics out of loops. Specifically, it performs the following transformations:
- Hoist gc_preserve_begin and sink gc_preserve_end out of loops when the preserved objects are loop-invariant. This can reduce the number of gc_preserve_begin/gc_preserve_end pairs in the IR, which makes it easier for the LateLowerGCPass to identify where particular objects are preserved.
- Hoist allocations out of loops when they do not escape the loop, using the same definition of escape as AllocOptPass. This transformation can reduce the number of allocations in the IR, even when an allocation escapes the function altogether.
This pass is required to preserve LLVM's MemorySSA (Short Video, Longer Video) and ScalarEvolution (Newer Slides, Older Slides) analyses.
This document was generated with Documenter.jl version 1.8.0 on Wednesday 9 July 2025. Using Julia version 1.11.6.