The Optimizer

The Solidity compiler uses two different optimizer modules: the “old” optimizer that operates at the opcode level and the “new” optimizer that operates on Yul IR code.

The opcode-based optimizer applies a set of simplification rules to opcodes. It also combines equal code sets and removes unused code.

The Yul-based optimizer is much more powerful, because it can work across function calls. For example, arbitrary jumps are not possible in Yul, so it is possible to compute the side-effects of each function. Consider two function calls, where the first does not modify storage and the second does modify storage. If their arguments and return values do not depend on each other, we can reorder the function calls. Similarly, if a function is side-effect free and its result is multiplied by zero, you can remove the function call completely.

Currently, the parameter --optimize activates the opcode-based optimizer for the generated bytecode and the Yul optimizer for the Yul code generated internally, for example for ABI coder v2. One can use solc --ir-optimized --optimize to produce an optimized experimental Yul IR for a Solidity source. Similarly, one can use solc --strict-assembly --optimize for a stand-alone Yul mode.

You can find more details on both optimizer modules and their optimization steps below.

Benefits of Optimizing Solidity Code

Overall, the optimizer tries to simplify complicated expressions, which reduces both code size and execution cost, i.e., it can reduce the gas needed for contract deployment as well as for external calls made to the contract. It also specializes or inlines functions. Function inlining in particular is an operation that can result in much bigger code, but it is often done because it creates opportunities for more simplifications.

Differences between Optimized and Non-Optimized Code

Generally, the most visible difference is that constant expressions are evaluated at compile time. When it comes to the ASM output, one can also notice a reduction of equivalent or duplicate code blocks (compare the output of the flags --asm and --asm --optimize). However, when it comes to the Yul/intermediate representation, there can be significant differences, for example, functions may be inlined, combined, or rewritten to eliminate redundancies, etc. (compare the output between the flags --ir and --optimize --ir-optimized).

Optimizer Parameter Runs

The number of runs (--optimize-runs) specifies roughly how often each opcode of the deployed code will be executed across the life-time of the contract. This means it is a trade-off parameter between code size (deploy cost) and code execution cost (cost after deployment). A “runs” parameter of “1” will produce short but expensive code. In contrast, a larger “runs” parameter will produce longer but more gas-efficient code. The maximum value of the parameter is 2**32-1.
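
The trade-off can be made concrete with a toy cost model. All contract figures below (byte sizes, per-call gas, call counts) are invented for illustration; only the 200 gas charged per byte of deployed code is a real EVM constant.

```python
# Toy model of the size-vs-execution-cost trade-off behind --optimize-runs.
# The contract numbers are hypothetical; 200 is the EVM code deposit cost
# charged per byte of deployed code at deployment time.

def lifetime_gas(deployed_size, gas_per_call, expected_calls):
    GAS_PER_DEPLOYED_BYTE = 200
    return deployed_size * GAS_PER_DEPLOYED_BYTE + gas_per_call * expected_calls

# a small-but-expensive variant vs. a larger-but-cheaper variant
few_calls  = (lifetime_gas(2000, 50_000, 10),  lifetime_gas(3000, 40_000, 10))
many_calls = (lifetime_gas(2000, 50_000, 100), lifetime_gas(3000, 40_000, 100))
print(few_calls)   # (900000, 1000000)  -> low "runs": the short code wins
print(many_calls)  # (5400000, 4600000) -> high "runs": the longer code wins
```

With few expected executions the deployment cost dominates, so the compiler should prefer small code; with many executions, cheaper execution pays for the larger deployment.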

Note

A common misconception is that this parameter specifies the number of iterations of the optimizer.This is not true: The optimizer will always run as many times as it can still improve the code.

Opcode-Based Optimizer Module

The opcode-based optimizer module operates on assembly code. It splits the sequence of instructions into basic blocks at JUMPs and JUMPDESTs. Inside these blocks, the optimizer analyzes the instructions and records every modification to the stack, memory, or storage as an expression which consists of an instruction and a list of arguments which are pointers to other expressions.

Additionally, the opcode-based optimizer uses a component called “CommonSubexpressionEliminator” that, amongst other tasks, finds expressions that are always equal (on every input) and combines them into an expression class. It first tries to find each new expression in a list of already known expressions. If no such matches are found, it simplifies the expression according to rules like constant + constant = sum_of_constants or X * 1 = X. Since this is a recursive process, we can also apply the latter rule if the second factor is a more complex expression which we know always evaluates to one.
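
The recursive rule application can be sketched in a few lines. This is a toy model, not compiler code: expressions are nested tuples, and only the two rules mentioned above (constant folding and X * 1 = X) are implemented.

```python
# Toy recursive simplifier: constant + constant is folded, and X * 1
# collapses to X.  Because the arguments are simplified first, X * (2 - 1)
# also collapses to X, as described above.

def simplify(expr):
    if not isinstance(expr, tuple):          # a constant or a symbolic value
        return expr
    op, a, b = expr
    a, b = simplify(a), simplify(b)          # recurse into the arguments first
    if isinstance(a, int) and isinstance(b, int):
        return {"add": a + b, "sub": a - b, "mul": a * b}[op]
    if op == "mul" and b == 1:               # X * 1 = X
        return a
    return (op, a, b)

print(simplify(("mul", "x", ("sub", 2, 1))))  # the factor simplifies to 1 -> 'x'
print(simplify(("add", 3, 4)))                # constant folding -> 7
```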

Certain optimizer steps symbolically track the storage and memory locations. For example, this information is used to compute Keccak-256 hashes that can be evaluated at compile time. Consider the sequence:

PUSH 32
PUSH 0
CALLDATALOAD
PUSH 100
DUP2
MSTORE
KECCAK256

or the equivalent Yul

let x := calldataload(0)
mstore(x, 100)
let value := keccak256(x, 32)

In this case, the optimizer tracks the value at the memory location calldataload(0) and then realizes that the Keccak-256 hash can be evaluated at compile time. This only works if there is no other instruction that modifies memory between the mstore and keccak256. So if there is an instruction that writes to memory (or storage), then we need to erase the knowledge of the current memory (or storage). There is, however, an exception to this erasing, when we can easily see that the instruction doesn’t write to a certain location.

For example,

let x := calldataload(0)
mstore(x, 100)
// Current knowledge memory location x -> 100
let y := add(x, 32)
// Does not clear the knowledge that x -> 100, since y does not write to [x, x + 32)
mstore(y, 200)
// This Keccak-256 can now be evaluated
let value := keccak256(x, 32)

Therefore, modifications to storage and memory locations, of say location l, must erase knowledge about storage or memory locations which may be equal to l. More specifically, for storage, the optimizer has to erase all knowledge of symbolic locations that may be equal to l and, for memory, the optimizer has to erase all knowledge of symbolic locations that may not be at least 32 bytes away. If m denotes an arbitrary location, then this decision on erasure is done by computing the value sub(l, m). For storage, if this value evaluates to a literal that is non-zero, then the knowledge about m will be kept. For memory, if the value evaluates to a literal that is between 32 and 2**256 - 32, then the knowledge about m will be kept. In all other cases, the knowledge about m will be erased.
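
The erasure decision can be sketched as follows. This is a simplified model of the rule just described, not the compiler's implementation; a non-literal (symbolic) difference is represented here by a string.

```python
# Sketch of the erasure rule: after a write to location l, knowledge about
# location m survives only when the literal difference sub(l, m) proves the
# two accesses cannot overlap.  A symbolic difference means "unknown".

def keep_knowledge(diff, is_storage):
    if not isinstance(diff, int):        # sub(l, m) is not a literal
        return False
    if is_storage:
        return diff != 0                 # distinct storage slots never overlap
    # memory: writes are 32 bytes wide, so the wrapped distance must be >= 32
    return 32 <= diff % 2**256 <= 2**256 - 32

print(keep_knowledge(0x20, is_storage=False))  # True: exactly 32 bytes apart
print(keep_knowledge(5, is_storage=False))     # False: the ranges may overlap
print(keep_knowledge(5, is_storage=True))      # True: a different storage slot
print(keep_knowledge("x", is_storage=True))    # False: difference is symbolic
```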

After this process, we know which expressions have to be on the stack atthe end, and have a list of modifications to memory and storage. This informationis stored together with the basic blocks and is used to link them. Furthermore,knowledge about the stack, storage and memory configuration is forwarded tothe next block(s).

If we know the targets of all JUMP and JUMPI instructions, we can build a complete control flow graph of the program. If there is only one target we do not know (this can happen as in principle, jump targets can be computed from inputs), we have to erase all knowledge about the input state of a block as it can be the target of the unknown JUMP. If the opcode-based optimizer module finds a JUMPI whose condition evaluates to a constant, it transforms it to an unconditional jump.

As the last step, the code in each block is re-generated. The optimizer createsa dependency graph from the expressions on the stack at the end of the block,and it drops every operation that is not part of this graph. It generates codethat applies the modifications to memory and storage in the order they weremade in the original code (dropping modifications which were found not to beneeded). Finally, it generates all values that are required to be on thestack in the correct place.

These steps are applied to each basic block and the newly generated code is used as replacement if it is smaller. If a basic block is split at a JUMPI and during the analysis the condition evaluates to a constant, the JUMPI is replaced based on the value of the constant. Thus code like

uint x = 7;
data[7] = 9;
if (data[x] != x + 2) // this condition is never true
    return 2;
else
    return 1;

simplifies to this:

data[7] = 9;
return 1;

Simple Inlining

Since Solidity version 0.8.2, there is another optimizer step that replaces certain jumps to blocks containing “simple” instructions ending with a “jump” by a copy of these instructions. This corresponds to inlining of simple, small Solidity or Yul functions. In particular, the sequence PUSHTAG(tag) JUMP may be replaced, whenever the JUMP is marked as jump “into” a function and behind tag there is a basic block (as described above for the “CommonSubexpressionEliminator”) that ends in another JUMP which is marked as a jump “out of” a function.

In particular, consider the following prototypical example of assembly generated for acall to an internal Solidity function:

  tag_return
  tag_f
  jump      // in
tag_return:
  ...opcodes after call to f...

tag_f:
  ...body of function f...
  jump      // out

As long as the body of the function is a continuous basic block, the “Inliner” can replace tag_f jump by the block at tag_f resulting in:

  tag_return
  ...body of function f...
  jump
tag_return:
  ...opcodes after call to f...

tag_f:
  ...body of function f...
  jump      // out

Now ideally, the other optimizer steps described above will result in the return tag push being moved towards the remaining jump resulting in:

  ...body of function f...
  tag_return
  jump
tag_return:
  ...opcodes after call to f...

tag_f:
  ...body of function f...
  jump      // out

In this situation the “PeepholeOptimizer” will remove the return jump. Ideally, all of this can be done for all references to tag_f leaving it unused, such that it can be removed, yielding:

...body of function f...
...opcodes after call to f...

So the call to function f is inlined and the original definition of f can be removed.

Inlining like this is attempted whenever a heuristic suggests that inlining is cheaper over the lifetime of a contract than not inlining. This heuristic depends on the size of the function body, the number of other references to its tag (approximating the number of calls to the function) and the expected number of executions of the contract (the global optimizer parameter “runs”).
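
The exact heuristic is internal to the compiler and changes between versions; the sketch below only illustrates the shape of such a decision. The function name, the CALL_OVERHEAD constant, and the weighting are all invented for illustration.

```python
# Hypothetical inlining decision: compare gas saved by avoiding call
# jumps over the contract's lifetime against the deposit cost of the
# extra code size.  The weights are made up; the real formula differs.

def should_inline(body_size, num_call_sites, runs):
    size_increase = body_size * (num_call_sites - 1)  # extra copies of the body
    CALL_OVERHEAD = 6                                 # gas per avoided jump in/out (invented)
    runtime_savings = CALL_OVERHEAD * num_call_sites * runs
    return runtime_savings >= size_increase * 200     # 200 = deposit gas per code byte

print(should_inline(body_size=10, num_call_sites=3, runs=500))  # True: tiny body, many runs
print(should_inline(body_size=500, num_call_sites=5, runs=1))   # False: large body, one run
```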

Yul-Based Optimizer Module

The Yul-based optimizer consists of several stages and components that all transform the AST in a semantically equivalent way. The goal is to end up either with code that is shorter or at least only marginally longer but will allow further optimization steps.

Warning

Since the optimizer is under heavy development, the information here might be outdated. If you rely on a certain functionality, please reach out to the team directly.

The optimizer currently follows a purely greedy strategy and does not do any backtracking.

All components of the Yul-based optimizer module are explained below. The following transformation steps are the main components:

  • SSA Transform

  • Common Subexpression Eliminator

  • Expression Simplifier

  • Redundant Assign Eliminator

  • Full Function Inliner

Optimizer Steps

This is a list of all steps of the Yul-based optimizer, sorted alphabetically. You can find more information on the individual steps and their sequence below.

Selecting Optimizations

By default the optimizer applies its predefined sequence of optimization steps to the generated assembly. You can override this sequence and supply your own using the --yul-optimizations option:

solc --optimize --ir-optimized --yul-optimizations 'dhfoD[xarrscLMcCTU]uljmul'

The sequence inside [...] will be applied multiple times in a loop until the Yul code remains unchanged or until the maximum number of rounds (currently 12) has been reached.
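
The loop semantics of the bracketed segment can be sketched as below. This is a toy model: run_sequence and apply_step are hypothetical stand-ins, and the "code" is just a string.

```python
# Sketch of how the bracketed part of a step sequence is applied: the
# loop body repeats until the code stops changing or a round limit is hit.

MAX_ROUNDS = 12  # current limit mentioned above

def run_sequence(code, loop_steps, apply_step):
    for _ in range(MAX_ROUNDS):
        before = code
        for step in loop_steps:
            code = apply_step(step, code)
        if code == before:        # fixed point reached: stop early
            break
    return code

# toy "step": repeatedly collapse "aa" -> "a" until nothing changes
result = run_sequence("aaaabaa", ["squash"], lambda s, c: c.replace("aa", "a"))
print(result)  # "aba"
```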

Available abbreviations are listed in the Yul optimizer docs.

Preprocessing

The preprocessing components perform transformations to get the program into a certain normal form that is easier to work with. This normal form is kept during the rest of the optimization process.

Disambiguator

The disambiguator takes an AST and returns a fresh copy where all identifiers have names that are unique in the input AST. This is a prerequisite for all other optimizer stages. One of the benefits is that identifier lookup does not need to take scopes into account, which simplifies the analysis needed for other steps.

All subsequent stages have the property that all names stay unique. This means if a new identifier needs to be introduced, a new unique name is generated.

FunctionHoister

The function hoister moves all function definitions to the end of the topmost block. This is a semantically equivalent transformation as long as it is performed after the disambiguation stage. The reason is that moving a definition to a higher-level block cannot decrease its visibility and it is impossible to reference variables defined in a different function.

The benefit of this stage is that function definitions can be looked up more easily and functions can be optimized in isolation without having to traverse the AST completely.

FunctionGrouper

The function grouper has to be applied after the disambiguator and the function hoister. Its effect is that all topmost elements that are not function definitions are moved into a single block which is the first statement of the root block.

After this step, a program has the following normal form:

{ I F... }

Where I is a (potentially empty) block that does not contain any function definitions (not even recursively) and F is a list of function definitions such that no function contains a function definition.

The benefit of this stage is that we always know where the list of functions begins.

ForLoopConditionIntoBody

This transformation moves the loop-iteration condition of a for-loop into the loop body. We need this transformation because ExpressionSplitter will not apply to iteration condition expressions (the C in the following example).

for { Init... } C { Post... } {
    Body...
}

is transformed to

for { Init... } 1 { Post... } {
    if iszero(C) { break }
    Body...
}

This transformation can also be useful when paired with LoopInvariantCodeMotion, since invariants in the loop-invariant conditions can then be taken outside the loop.

ForLoopInitRewriter

This transformation moves the initialization part of a for-loop to before the loop:

for { Init... } C { Post... } {
    Body...
}

is transformed to

{
    Init...
    for {} C { Post... } {
        Body...
    }
}

This eases the rest of the optimization process because we can ignore the complicated scoping rules of the for loop initialization block.

VarDeclInitializer

This step rewrites variable declarations so that all of them are initialized. Declarations like let x, y are split into multiple declaration statements.

Only supports initializing with the zero literal for now.

Pseudo-SSA Transformation

The purpose of this component is to get the program into a longer form, so that other components can more easily work with it. The final representation will be similar to a static-single-assignment (SSA) form, with the difference that it does not make use of explicit “phi” functions which combine the values from different branches of control flow, because such a feature does not exist in the Yul language. Instead, when control flow merges, if a variable is re-assigned in one of the branches, a new SSA variable is declared to hold its current value, so that the following expressions still only need to reference SSA variables.

An example transformation is the following:

{
    let a := calldataload(0)
    let b := calldataload(0x20)
    if gt(a, 0) {
        b := mul(b, 0x20)
    }
    a := add(a, 1)
    sstore(a, add(b, 0x20))
}

When all the following transformation steps are applied, the program will look as follows:

{
    let _1 := 0
    let a_9 := calldataload(_1)
    let a := a_9
    let _2 := 0x20
    let b_10 := calldataload(_2)
    let b := b_10
    let _3 := 0
    let _4 := gt(a_9, _3)
    if _4
    {
        let _5 := 0x20
        let b_11 := mul(b_10, _5)
        b := b_11
    }
    let b_12 := b
    let _6 := 1
    let a_13 := add(a_9, _6)
    let _7 := 0x20
    let _8 := add(b_12, _7)
    sstore(a_13, _8)
}

Note that the only variable that is re-assigned in this snippet is b. This re-assignment cannot be avoided because b has different values depending on the control flow. All other variables never change their value once they are defined. The advantage of this property is that variables can be freely moved around and references to them can be exchanged by their initial value (and vice-versa), as long as these values are still valid in the new context.

Of course, the code here is far from being optimized. To the contrary, it is much longer. The hope is that this code will be easier to work with and, furthermore, there are optimizer steps that undo these changes and make the code more compact again at the end.

ExpressionSplitter

The expression splitter turns expressions like add(mload(0x123), mul(mload(0x456), 0x20)) into a sequence of declarations of unique variables that are assigned sub-expressions of that expression so that each function call has only variables or literals as arguments.

The above would be transformed into

{
    let _1 := mload(0x456)
    let _2 := mul(_1, 0x20)
    let _3 := mload(0x123)
    let z := add(_3, _2)
}

Note that this transformation does not change the order of opcodes or function calls.

It is not applied to loop iteration conditions, because the loop control flow does not allow this “outlining” of the inner expressions in all cases. We can sidestep this limitation by applying ForLoopConditionIntoBody to move the iteration condition into the loop body.

The final program should be in a form such that (with the exception of loop conditions) function calls cannot appear nested inside expressions and all function call arguments have to be literals or variables.

The benefits of this form are that it is much easier to re-order the sequence of opcodes and it is also easier to perform function call inlining. Furthermore, it is simpler to replace individual parts of expressions or re-organize the “expression tree”. The drawback is that such code is much harder to read for humans.
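
The splitting itself can be sketched over nested tuple expressions. This is a toy model (split is a hypothetical helper, not compiler code); it processes arguments right to left, mirroring Yul's evaluation order, so the order of the calls is preserved.

```python
# Toy expression splitter: every function call gets its own fresh variable,
# so afterwards all call arguments are variables or literals only.

def split(expr, stmts, counter):
    if not isinstance(expr, tuple):
        return expr                       # a variable or literal stays as-is
    op, *args = expr
    new_args = [None] * len(args)
    for i in reversed(range(len(args))):  # right to left, like Yul evaluation
        new_args[i] = split(args[i], stmts, counter)
    counter[0] += 1
    var = f"_{counter[0]}"
    stmts.append(f"let {var} := {op}({', '.join(new_args)})")
    return var

stmts = []
split(("add", ("mload", "0x123"), ("mul", ("mload", "0x456"), "0x20")), stmts, [0])
print("\n".join(stmts))
```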

SSATransform

This stage tries to replace repeated assignments to existing variables by declarations of new variables as much as possible. The reassignments are still there, but all references to the reassigned variables are replaced by the newly declared variables.

Example:

{
    let a := 1
    mstore(a, 2)
    a := 3
}

is transformed to

{
    let a_1 := 1
    let a := a_1
    mstore(a_1, 2)
    let a_3 := 3
    a := a_3
}

Exact semantics:

For any variable a that is assigned to somewhere in the code (variables that are declared with value and never re-assigned are not modified) perform the following transforms:

  • replace let a := v by let a_i := v  let a := a_i

  • replace a := v by let a_i := v  a := a_i where i is a number such that a_i is yet unused.

Furthermore, always record the current value of i used for a and replace each reference to a by a_i. The current value mapping is cleared for a variable a at the end of each block in which it was assigned to and at the end of the for loop init block if it is assigned inside the for loop body or post block. If a variable’s value is cleared according to the rule above and the variable is declared outside the block, a new SSA variable will be created at the location where control flow joins; this includes the beginning of the loop post/body block and the location right after If/Switch/ForLoop/Block statements.

After this stage, the Redundant Assign Eliminator is recommended to remove the unnecessary intermediate assignments.

This stage provides best results if the Expression Splitter and the Common Subexpression Eliminator are run right before it, because then it does not generate excessive amounts of variables. On the other hand, the Common Subexpression Eliminator could be more efficient if run after the SSA transform.

RedundantAssignEliminator

The SSA transform always generates an assignment of the form a := a_i, even though these might be unnecessary in many cases, like the following example:

{
    let a := 1
    a := mload(a)
    a := sload(a)
    sstore(a, 1)
}

The SSA transform converts this snippet to the following:

{
    let a_1 := 1
    let a := a_1
    let a_2 := mload(a_1)
    a := a_2
    let a_3 := sload(a_2)
    a := a_3
    sstore(a_3, 1)
}

The Redundant Assign Eliminator removes all three assignments to a, because the value of a is not used, and thus turns this snippet into strict SSA form:

{
    let a_1 := 1
    let a_2 := mload(a_1)
    let a_3 := sload(a_2)
    sstore(a_3, 1)
}

Of course, the intricate parts of determining whether an assignment is redundant or not are connected to joining control flow.

The component works as follows in detail:

The AST is traversed twice: in an information gathering step and in the actual removal step. During information gathering, we maintain a mapping from assignment statements to the three states “unused”, “undecided” and “used”, which signifies whether the assigned value will be used later by a reference to the variable.

When an assignment is visited, it is added to the mapping in the “undecided” state (see the remark about for loops below) and every other assignment to the same variable that is still in the “undecided” state is changed to “unused”. When a variable is referenced, the state of any assignment to that variable still in the “undecided” state is changed to “used”.

At points where control flow splits, a copy of the mapping is handed over to each branch. At points where control flow joins, the two mappings coming from the two branches are combined in the following way: statements that are only in one mapping or have the same state are used unchanged. Conflicting values are resolved in the following way:

  • “unused”, “undecided” -> “undecided”

  • “unused”, “used” -> “used”

  • “undecided”, “used” -> “used”

For for-loops, the condition, body and post-part are visited twice, taking the joining control-flow at the condition into account. In other words, we create three control flow paths: zero runs of the loop, one run and two runs, and then combine them at the end.

Simulating a third run or even more is unnecessary, which can be seen as follows:

A state of an assignment at the beginning of the iteration will deterministically result in a state of that assignment at the end of the iteration. Let this state mapping function be called f. The combination of the three different states unused, undecided and used as explained above is the max operation where unused = 0, undecided = 1 and used = 2.

The proper way would be to compute

max(s, f(s), f(f(s)), f(f(f(s))), ...)

as the state after the loop. Since f just has a range of three different values, iterating it has to reach a cycle after at most three iterations, and thus f(f(f(s))) has to equal one of s, f(s), or f(f(s)) and thus

max(s, f(s), f(f(s))) = max(s, f(s), f(f(s)), f(f(f(s))), ...).

In summary, running the loop at most twice is enough because there are only three different states.
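
This pigeonhole argument is small enough to verify exhaustively. The sketch below brute-forces all 27 possible functions f on the three states and checks that the maximum over s, f(s), f(f(s)) already equals the maximum over arbitrarily many further iterations.

```python
# Exhaustive check that simulating two extra runs of the loop suffices.
from itertools import product

STATES = (0, 1, 2)  # unused, undecided, used

def max_over_iterates(f, s, n):
    # max(s, f(s), ..., f^n(s))
    best, x = s, s
    for _ in range(n):
        x = f(x)
        best = max(best, x)
    return best

for table in product(STATES, repeat=3):   # every possible function f
    f = lambda s, t=table: t[s]
    for s in STATES:
        assert max_over_iterates(f, s, 2) == max_over_iterates(f, s, 10)
print("max(s, f(s), f(f(s))) suffices for all 27 functions")
```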

For switch statements that have a “default”-case, there is no control-flow part that skips the switch.

When a variable goes out of scope, all statements still in the “undecided” state are changed to “unused”, unless the variable is the return parameter of a function - there, the state changes to “used”.

In the second traversal, all assignments that are in the “unused” state are removed.

This step is usually run right after the SSA transform to complete the generation of the pseudo-SSA.

Tools

Movability

Movability is a property of an expression. It roughly means that the expression is side-effect free and its evaluation only depends on the values of variables and the call-constant state of the environment. Most expressions are movable. The following parts make an expression non-movable:

  • function calls (might be relaxed in the future if all statements in the function are movable)

  • opcodes that (can) have side-effects (like call or selfdestruct)

  • opcodes that read or write memory, storage or external state information

  • opcodes that depend on the current PC, memory size or returndata size

DataflowAnalyzer

The Dataflow Analyzer is not an optimizer step itself but is used as a tool by other components. While traversing the AST, it tracks the current value of each variable, as long as that value is a movable expression. It records the variables that are part of the expression that is currently assigned to each other variable. Upon each assignment to a variable a, the current stored value of a is updated and the stored values of all variables b are cleared whenever a is part of the currently stored expression for b.

At control-flow joins, knowledge about variables is cleared if they have or would be assigned in any of the control-flow paths. For instance, upon entering a for loop, all variables are cleared that will be assigned during the body or the post block.

Expression-Scale Simplifications

These simplification passes change expressions and replace them by equivalent and hopefully simpler expressions.

CommonSubexpressionEliminator

This step uses the Dataflow Analyzer and replaces subexpressions that syntactically match the current value of a variable by a reference to that variable. This is an equivalence transform because such subexpressions have to be movable.

All subexpressions that are identifiers themselves are replaced by their current value if the value is an identifier.

The combination of the two rules above makes it possible to compute a local value numbering, which means that if two variables have the same value, one of them will always be unused. The Unused Pruner or the Redundant Assign Eliminator will then be able to fully eliminate such variables.

This step is especially efficient if the expression splitter is run before. If the code is in pseudo-SSA form, the values of variables are available for a longer time and thus we have a higher chance that expressions can be replaced.

The expression simplifier will be able to perform better replacements if the common subexpression eliminator was run right before it.

Expression Simplifier

The Expression Simplifier uses the Dataflow Analyzer and makes use of a list of equivalence transforms on expressions, like X + 0 -> X, to simplify the code.

It tries to match patterns like X + 0 on each subexpression. During the matching procedure, it resolves variables to their currently assigned expressions to be able to match more deeply nested patterns even when the code is in pseudo-SSA form.

Some of the patterns like X - X -> 0 can only be applied as long as the expression X is movable, because otherwise it would remove its potential side-effects. Since variable references are always movable, even if their current value might not be, the Expression Simplifier is again more powerful in split or pseudo-SSA form.
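
A movability check guarding a rule like X - X -> 0 can be sketched like this. Toy tuple expressions again; the op classification is illustrative, not the compiler's actual table.

```python
# The X - X -> 0 rule, guarded by movability: sub(e, e) may only be
# rewritten to 0 when e has no side effects.

MOVABLE_OPS = {"add", "sub", "mul"}        # no side effects, no state access

def movable(expr):
    if not isinstance(expr, tuple):
        return True                        # variables and literals are movable
    return expr[0] in MOVABLE_OPS and all(movable(a) for a in expr[1:])

def simplify_sub(expr):
    if isinstance(expr, tuple) and expr[0] == "sub" and expr[1] == expr[2]:
        if movable(expr[1]):
            return 0
    return expr

print(simplify_sub(("sub", ("add", "x", 1), ("add", "x", 1))))  # 0
print(simplify_sub(("sub", ("call", "g"), ("call", "g"))))      # kept: call has side effects
```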

LiteralRematerialiser

To be documented.

LoadResolver

Optimization stage that replaces expressions of type sload(x) and mload(x) by the value currently stored in storage or memory, respectively, if known.

Works best if the code is in SSA form.

Prerequisite: Disambiguator, ForLoopInitRewriter.

ReasoningBasedSimplifier

This optimizer uses SMT solvers to check whether if conditions are constant.

  • If constraints AND condition is UNSAT, the condition is never true and the whole body can be removed.

  • If constraints AND NOT condition is UNSAT, the condition is always true and can be replaced by 1.

The simplifications above can only be applied if the condition is movable.

It is only effective on the EVM dialect, but safe to use on other dialects.

Prerequisite: Disambiguator, SSATransform.

Statement-Scale Simplifications

CircularReferencesPruner

This stage removes functions that call each other but are neither externally referenced nor referenced from the outermost context.

ConditionalSimplifier

The Conditional Simplifier inserts assignments to condition variables if the value can be determined from the control-flow.

Destroys SSA form.

Currently, this tool is very limited, mostly because we do not yet have support for boolean types. Since conditions only check for expressions being nonzero, we cannot assign a specific value.

Current features:

  • switch cases: insert “<condition> := <caseLabel>”

  • after if statement with terminating control-flow, insert “<condition> := 0”

Future features:

  • allow replacements by “1”

  • take termination of user-defined functions into account

Works best with SSA form and if dead code removal has run before.

Prerequisite: Disambiguator.

ConditionalUnsimplifier

Reverse of Conditional Simplifier.

ControlFlowSimplifier

Simplifies several control-flow structures:

  • replace if with empty body with pop(condition)

  • remove empty default switch case

  • remove empty switch case if no default case exists

  • replace switch with no cases with pop(expression)

  • turn switch with single case into if

  • replace switch with only default case with pop(expression) and body

  • replace switch with const expr with matching case body

  • replace for with terminating control flow and without other break/continue by if

  • remove leave at the end of a function.

None of these operations depend on the data flow. The StructuralSimplifier performs similar tasks that do depend on data flow.

The ControlFlowSimplifier does record the presence or absence of break and continue statements during its traversal.

Prerequisite: Disambiguator, FunctionHoister, ForLoopInitRewriter. Important: Introduces EVM opcodes and thus can only be used on EVM code for now.

DeadCodeEliminator

This optimization stage removes unreachable code.

Unreachable code is any code within a block which is preceded by a leave, return, invalid, break, continue, selfdestruct or revert.

Function definitions are retained as they might be called by earlier code and thus are considered reachable.

Because variables declared in a for loop’s init block have their scope extended to the loop body, we require ForLoopInitRewriter to run before this step.

Prerequisite: ForLoopInitRewriter, Function Hoister, Function Grouper

UnusedPruner

This step removes the definitions of all functions that are never referenced.

It also removes the declaration of variables that are never referenced. If the declaration assigns a value that is not movable, the expression is retained, but its value is discarded.

All movable expression statements (expressions that are not assigned) are removed.

StructuralSimplifier

This is a general step that performs various kinds of simplifications on a structural level:

  • replace if statement with empty body by pop(condition)

  • replace if statement with true condition by its body

  • remove if statement with false condition

  • turn switch with single case into if

  • replace switch with only default case by pop(expression) and body

  • replace switch with literal expression by matching case body

  • replace for loop with false condition by its initialization part

This component uses the Dataflow Analyzer.

BlockFlattener

This stage eliminates nested blocks by inserting the statement in the inner block at the appropriate place in the outer block:

{
    let x := 2
    {
        let y := 3
        mstore(x, y)
    }
}

is transformed to

{
    let x := 2
    let y := 3
    mstore(x, y)
}

As long as the code is disambiguated, this does not cause a problem because the scopes of variables can only grow.
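
The flattening itself is a simple splice. A toy sketch over nested Python lists (statements as strings, inner blocks as sub-lists; flatten is a hypothetical helper, not compiler code):

```python
# Toy block flattener: inner blocks (sub-lists) are spliced into their
# parent.  Safe only when all names are unique, as the disambiguator
# guarantees.

def flatten(block):
    out = []
    for stmt in block:
        if isinstance(stmt, list):        # a nested block
            out.extend(flatten(stmt))
        else:
            out.append(stmt)
    return out

nested = ["let x := 2", ["let y := 3", "mstore(x, y)"]]
print(flatten(nested))  # ['let x := 2', 'let y := 3', 'mstore(x, y)']
```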

LoopInvariantCodeMotion

This optimization moves movable SSA variable declarations outside the loop.

Only statements at the top level in a loop’s body or post block are considered, i.e., variable declarations inside conditional branches will not be moved out of the loop.

Requirements:

  • The Disambiguator, ForLoopInitRewriter and FunctionHoister must be run upfront.

  • Expression splitter and SSA transform should be run upfront to obtain better results.

Function-Level Optimizations

FunctionSpecializer

This step specializes the function with its literal arguments.

If a function, say, function f(a, b) { sstore(a, b) }, is called with literal arguments, for example, f(x, 5), where x is an identifier, it could be specialized by creating a new function f_1 that takes only one argument, i.e.,

function f_1(a_1) {
    let b_1 := 5
    sstore(a_1, b_1)
}

Other optimization steps will be able to make more simplifications to the function. The optimization step is mainly useful for functions that would not be inlined.

Prerequisites: Disambiguator, FunctionHoister

LiteralRematerialiser is recommended as a prerequisite, even though it is not required for correctness.

UnusedFunctionParameterPruner

This step removes unused parameters in a function.

If a parameter is unused, like c and y in function f(a, b, c) -> x, y { x := div(a, b) }, we remove the parameter and create a new “linking” function as follows:

function f(a, b) -> x {
    x := div(a, b)
}
function f2(a, b, c) -> x, y {
    x := f(a, b)
}

and replace all references to f by f2. The inliner should be run afterwards to make sure that all references to f2 are replaced by f.

Prerequisites: Disambiguator, FunctionHoister, LiteralRematerialiser.

The step LiteralRematerialiser is not required for correctness. It helps deal with cases such as function f(x) -> y { revert(y, y) }, where the literal y will be replaced by its value 0, allowing us to rewrite the function.

EquivalentFunctionCombiner

If two functions are syntactically equivalent, while allowing variable renaming but not any re-ordering, then any reference to one of the functions is replaced by the other.

The actual removal of the function is performed by the Unused Pruner.
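For example, the two hypothetical functions

function f(x) -> y { y := mul(x, 2) }
function g(a) -> b { b := mul(a, 2) }

are equivalent up to variable renaming, so every call to g can be replaced by a call to f; the now-unreferenced g is then removed by the Unused Pruner.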

Function Inlining

FunctionalInliner

This component of the optimizer performs restricted function inlining by inlining functions that can be inlined inside functional expressions, i.e. functions that:

  • return a single value.

  • have a body like r := <functional expression>.

  • neither reference themselves nor r in the right hand side.

Furthermore, for all parameters, all of the following need to be true:

  • The argument is movable.

  • The parameter is either referenced less than twice in the function body, or the argument is rather cheap (“cost” of at most 1, like a constant up to 0xff).

Example: The function to be inlined has the form of function f(...) -> r { r := E } where E is an expression that does not reference r and all arguments in the function call are movable expressions.

The result of this inlining is always a single expression.

This component can only be used on sources with unique names.
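A hypothetical function that satisfies these conditions is

function f(a) -> r { r := add(a, 1) }

A call with a movable, cheap argument, such as let x := f(2), could then be inlined to let x := add(2, 1), which later simplification steps can reduce to let x := 3.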

FullFunctionInliner

The Full Function Inliner replaces certain calls of certain functions by the function’s body. This is not very helpful in most cases, because it just increases the code size but does not have a benefit. Furthermore, code is usually very expensive and we would often rather have shorter code than more efficient code. In some cases, though, inlining a function can have positive effects on subsequent optimizer steps. This is the case if one of the function arguments is a constant, for example.

During inlining, a heuristic is used to tell if the function call should be inlined or not. The current heuristic does not inline into “large” functions unless the called function is tiny. Functions that are only used once are inlined, as well as medium-sized functions, while function calls with constant arguments allow slightly larger functions.

In the future, we may include a backtracking component that, instead of inlining a function right away, only specializes it, which means that a copy of the function is generated where a certain parameter is always replaced by a constant. After that, we can run the optimizer on this specialized function. If it results in heavy gains, the specialized function is kept, otherwise the original function is used instead.
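As a sketch of the constant-argument case (the names double, x_1 and y_1 are invented; the exact shape of the inlined code is an internal detail), inlining

function double(x) -> y { y := mul(x, 2) }
let a := double(3)

produces code roughly of the form

let x_1 := 3
let y_1 := mul(x_1, 2)
let a := y_1

which subsequent steps such as the Rematerialiser and expression simplification can reduce to let a := 6.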

Cleanup

The cleanup is performed at the end of the optimizer run. It tries to combine split expressions into deeply nested ones again and also improves the “compilability” for stack machines by eliminating variables as much as possible.

ExpressionJoiner

This is the opposite operation of the expression splitter. It turns a sequence of variable declarations that have exactly one reference into a complex expression. This stage fully preserves the order of function calls and opcode executions. It does not make use of any information concerning the commutativity of the opcodes; if moving the value of a variable to its place of use would change the order of any function call or opcode execution, the transformation is not performed.

Note that the component will not move the assigned value of a variable assignmentor a variable that is referenced more than once.
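For illustration, a simple case that is transformed (a hypothetical snippet):

let x := mload(0)
sstore(x, 3)

becomes

sstore(mload(0), 3)

because x is declared with exactly one reference and moving mload(0) to its place of use does not change the order of any opcode execution.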

The snippet let x := add(0, 2) let y := mul(x, mload(2)) is not transformed, because it would cause the order of the calls to the opcodes add and mload to be swapped, even though this would not make a difference because add is movable.

When reordering opcodes like that, variable references and literals are ignored. Because of that, the snippet let x := add(0, 2) let y := mul(x, 3) is transformed to let y := mul(add(0, 2), 3), even though the add opcode would be executed after the evaluation of the literal 3.

SSAReverser

This is a tiny step that helps in reversing the effects of the SSA transform if it is combined with the Common Subexpression Eliminator and the Unused Pruner.

The SSA form we generate is detrimental to code generation on the EVM and WebAssembly alike because it generates many local variables. It would be better to just re-use existing variables with assignments instead of fresh variable declarations.

The SSA transform rewrites

let a := calldataload(0)
mstore(a, 1)
a := calldataload(0x20)

to

let a_1 := calldataload(0)
let a := a_1
mstore(a_1, 1)
let a_2 := calldataload(0x20)
a := a_2

The problem is that instead of a, the variable a_1 is used whenever a was referenced. The SSA reverser changes statements of this form by just swapping out the declaration and the assignment. The above snippet is turned into

let a := calldataload(0)
let a_1 := a
mstore(a_1, 1)
a := calldataload(0x20)
let a_2 := a

This is a very simple equivalence transform, but when we now run the Common Subexpression Eliminator, it will replace all occurrences of a_1 by a (until a is re-assigned). The Unused Pruner will then eliminate the variable a_1 altogether and thus fully reverse the SSA transform.

StackCompressor

One problem that makes code generation for the Ethereum Virtual Machine hard is the fact that there is a hard limit of 16 slots for reaching down the expression stack. This more or less translates to a limit of 16 local variables. The stack compressor takes Yul code and compiles it to EVM bytecode. Whenever the stack difference is too large, it records the function this happened in.

For each function that caused such a problem, the Rematerialiser is called with a special request to aggressively eliminate specific variables sorted by the cost of their values.

On failure, this procedure is repeated multiple times.

Rematerialiser

The rematerialisation stage tries to replace variable references by the expression that was last assigned to the variable. This is of course only beneficial if this expression is comparatively cheap to evaluate. Furthermore, it is only semantically equivalent if the value of the expression did not change between the point of assignment and the point of use. The main benefit of this stage is that it can save stack slots if it leads to a variable being eliminated completely (see below), but it can also save a DUP opcode on the EVM if the expression is very cheap.

The Rematerialiser uses the Dataflow Analyzer to track the current values of variables, which are always movable. If the value is very cheap or the variable was explicitly requested to be eliminated, the variable reference is replaced by its current value.
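For illustration, in the hypothetical snippet

let x := 3
mstore(0, x)

the value 3 is cheap enough that the reference to x can be replaced by it:

let x := 3
mstore(0, 3)

The now-unused declaration of x can afterwards be removed by the Unused Pruner, saving a stack slot.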

ForLoopConditionOutOfBody

Reverses the transformation of ForLoopConditionIntoBody.

For any movablec, it turns

for { ... } 1 { ... } {
    if iszero(c) { break }
    ...
}

into

for { ... } c { ... } {
    ...
}

and it turns

for { ... } 1 { ... } {
    if c { break }
    ...
}

into

for { ... } iszero(c) { ... } {
    ...
}

The LiteralRematerialiser should be run before this step.

WebAssembly specific

MainFunction

Changes the topmost block to be a function with a specific name (“main”) which has no inputs or outputs.

Depends on the Function Grouper.
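Schematically (a hypothetical snippet; details of the surrounding structure produced by the Function Grouper are omitted), the topmost block

{
    sstore(0, 1)
}

becomes

function main() {
    sstore(0, 1)
}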