Julia uses a static single assignment intermediate representation (SSA IR) to perform optimization. This IR is different from LLVM IR, and unique to Julia. It allows for Julia specific optimizations.
goto
statements.For example the following Julia code:
function foo(x) y = sin(x) if x > 5.0 y = y + cos(x) end return exp(2) + yend
when called with aFloat64
argument is translated into:
using InteractiveUtils@code_typed foo(1.0)
CodeInfo(1 ─ %1 = invoke Main.sin(x::Float64)::Float64│ %2 = Base.lt_float(x, 5.0)::Bool└── goto #3 if not %22 ─ %4 = invoke Main.cos(x::Float64)::Float64└── %5 = Base.add_float(%1, %4)::Float643 ┄ %6 = φ (#2 => %5, #1 => %1)::Float64│ %7 = Base.add_float(7.38905609893065, %6)::Float64└── return %7) => Float64
In this example, we can see all of these changes.
1 ─ %1 = invoke Main.sin(x::Float64)::Float64│ %2 = Base.lt_float(x, 5.0)::Bool└── goto #3 if not %2
if
statement is translated intogoto #3 if not %2
which goes to the 3rd basic block ifx>5
isn't met and otherwise goes to the second basic block.%2
is an SSA value introduced to representx > 5
.Beginning in Julia 0.7, parts of the compiler use a newSSA-form intermediate representation (IR). Historically, the compiler would directly generate LLVM IR from a lowered form of the Julia AST. This form had most syntactic abstractions removed, but still looked a lot like an abstract syntax tree. Over time, in order to facilitate optimizations, SSA values were introduced to this IR and the IR was linearized (i.e. turned into a form where function arguments could only be SSA values or constants). However, non-SSA values (slots) remained in the IR due to the lack of Phi nodes in the IR (necessary for back-edges and re-merging of conditional control flow). This negated much of the usefulness of SSA form representation when performing middle end optimizations. Some heroic effort was put into making these optimizations work without a complete SSA form representation, but the lack of such a representation ultimately proved prohibitive.
The SSA IR representation has four categories of IR nodes: Phi, Pi, PhiC, and Upsilon nodes (the latter two are only used for exception handling).
Phi nodes are part of generic SSA abstraction (see the link above if you're not familiar with the concept). In the Julia IR, these nodes are represented as:
struct PhiNode edges::Vector{Int32} values::Vector{Any}end
where we ensure that both vectors always have the same length. In the canonical representation (the one handled by codegen and the interpreter), the edge values indicate come-from statement numbers (i.e. if edge has an entry of15
, there must be agoto
,gotoifnot
or implicit fall through from statement15
that targets this phi node). Values are either SSA values or constants. It is also possible for a value to be unassigned if the variable was not defined on this path. However, undefinedness checks get explicitly inserted and represented as booleans after middle end optimizations, so code generators may assume that any use of a Phi node will have an assigned value in the corresponding slot. It is also legal for the mapping to be incomplete, i.e. for a Phi node to have missing incoming edges. In that case, it must be dynamically guaranteed that the corresponding value will not be used.
Note that SSA uses semantically occur after the terminator of the corresponding predecessor ("on the edge"). Consequently, if multiple Phi nodes appear at the start of a basic block, they are run simultaneously. This means that in the following IR snippet, if we came from block23
,%46
will take the value associated to%45
before we entered this block.
%45 = φ (#18 => %23, #23 => %50)%46 = φ (#18 => 1.0, #23 => %45)
PiNodes encode statically proven information that may be implicitly assumed in basic blocks dominated by a given pi node. They are conceptually equivalent to the technique introduced in the paperABCD: Eliminating Array Bounds Checks on Demand or the predicate info nodes in LLVM. To see how they work, consider, e.g.
%x::Union{Int, Float64} # %x is some Union{Int, Float64} typed ssa valueif isa(x, Int) # use xelse # use xend
We can perform predicate insertion and turn this into:
%x::Union{Int, Float64} # %x is some Union{Int, Float64} typed ssa valueif isa(x, Int) %x_int = PiNode(x, Int) # use %x_intelse %x_float = PiNode(x, Float64) # use %x_floatend
Pi nodes are generally ignored in the interpreter, since they don't have any effect on the values, but they may sometimes lead to code generation in the compiler (e.g. to change from an implicitly union split representation to a plain unboxed representation). The main usefulness of PiNodes stems from the fact that path conditions of the values can be accumulated simply by def-use chain walking that is generally done for most optimizations that care about these conditions anyway.
Exception handling complicates the SSA story moderately, because exception handling introduces additional control flow edges into the IR across which values must be tracked. One approach to do so, which is followed by LLVM, is to make calls which may throw exceptions into basic block terminators and add an explicit control flow edge to the catch handler:
invoke @function_that_may_throw() to label %regular unwind to %catchregular:# Control flow continues herecatch:# Exceptions go here
However, this is problematic in a language like Julia, where at the start of the optimization pipeline, we do not know which calls throw. We would have to conservatively assume that every call (which in Julia is every statement) throws. This would have several negative effects. On the one hand, it would essentially reduce the scope of every basic block to a single call, defeating the purpose of having operations be performed at the basic block level. On the other hand, every catch basic block would haven*m
phi node arguments (n
, the number of statements in the critical region,m
the number of live values through the catch block).
To work around this, we use a combination ofUpsilon
andPhiC
nodes (the C standing forcatch
, writtenφᶜ
in the IR pretty printer, because unicode subscript c is not available). There are several ways to think of these nodes, but perhaps the easiest is to think of eachPhiC
as a load from a unique store-many, read-once slot, withUpsilon
being the corresponding store operation. ThePhiC
has an operand list of all the upsilon nodes that store to its implicit slot. TheUpsilon
nodes however, do not record whichPhiC
node they store to. This is done for more natural integration with the rest of the SSA IR. E.g. if there are no more uses of aPhiC
node, it is safe to delete it, and the same is true of anUpsilon
node. In most IR passes,PhiC
nodes can be treated likePhi
nodes. One can follow use-def chains through them, and they can be lifted to newPhiC
nodes and newUpsilon
nodes (in the same places as the originalUpsilon
nodes). The result of this scheme is that the number ofUpsilon
nodes (andPhiC
arguments) is proportional to the number of assigned values to a particular variable (before SSA conversion), rather than the number of statements in the critical region.
To see this scheme in action, consider the function
@noinline opaque() = invokelatest(identity, nothing) # Something opaquefunction foo() local y x = 1 try y = 2 opaque() y = 3 error() catch end (x, y)end
The corresponding IR (with irrelevant types stripped) is:
1 ─ nothing::Nothing2 ─ %2 = $(Expr(:enter, #4))3 ─ %3 = ϒ (false)│ %4 = ϒ (#undef)│ %5 = ϒ (1)│ %6 = ϒ (true)│ %7 = ϒ (2)│ invoke Main.opaque()::Any│ %9 = ϒ (true)│ %10 = ϒ (3)│ invoke Main.error()::Union{}└── $(Expr(:unreachable))::Union{}4 ┄ %13 = φᶜ (%3, %6, %9)::Bool│ %14 = φᶜ (%4, %7, %10)::Core.Compiler.MaybeUndef(Int64)│ %15 = φᶜ (%5)::Core.Const(1)└── $(Expr(:leave, Core.SSAValue(2)))5 ─ $(Expr(:pop_exception, :(%2)))::Any│ $(Expr(:throw_undef_if_not, :y, :(%13)))::Any│ %19 = Core.tuple(%15, %14)└── return %19
Note in particular that every value live into the critical region gets an upsilon node at the top of the critical region. This is because catch blocks are considered to have an invisible control flow edge from outside the function. As a result, no SSA value dominates the catch blocks, and all incoming values have to come through aφᶜ
node.
The mainSSAIR
data structure is worthy of discussion. It draws inspiration from LLVM and Webkit's B3 IR. The core of the data structure is a flat vector of statements. Each statement is implicitly assigned an SSA value based on its position in the vector (i.e. the result of the statement at idx 1 can be accessed usingSSAValue(1)
etc). For each SSA value, we additionally maintain its type. Since, SSA values are definitionally assigned only once, this type is also the result type of the expression at the corresponding index. However, while this representation is rather efficient (since the assignments don't need to be explicitly encoded), it of course carries the drawback that order is semantically significant, so reorderings and insertions change statement numbers. Additionally, we do not keep use lists (i.e. it is impossible to walk from a def to all its uses without explicitly computing this map–def lists however are trivial since you can look up the corresponding statement from the index), so the LLVM-style RAUW (replace-all-uses-with) operation is unavailable.
Instead, we do the following:
SSAValue(13)
).nothing
(this is essentially just a special-case convention of the above).nothing
.There is acompact!
function that compacts the above data structure by performing the insertion of nodes in the appropriate place, trivial copy propagation, and renaming of uses to any changed SSA values. However, the clever part of this scheme is that this compaction can be done lazily as part of the subsequent pass. Most optimization passes need to walk over the entire list of statements, performing analysis or modifications along the way. We provide anIncrementalCompact
iterator that can be used to iterate over the statement list. It will perform any necessary compaction and return the new index of the node, as well as the node itself. It is legal at this point to walk def-use chains, as well as make any modifications or deletions to the IR (insertions are disallowed however).
The idea behind this arrangement is that, since the optimization passes need to touch the corresponding memory anyway and incur the corresponding memory access penalty, performing the extra housekeeping should have comparatively little overhead (and save the overhead of maintaining these data structures during IR modification).
Settings
This document was generated withDocumenter.jl version 1.8.0 onWednesday 9 July 2025. Using Julia version 1.11.6.