The LLVM Lexicon

Note

This document is a work in progress!

Definitions

A

ADCE

Aggressive Dead Code Elimination

AST

Abstract Syntax Tree.

Due to Clang’s influence (mostly the fact that parsing and semantic analysis are so intertwined for C and especially C++), the typical working definition of AST in the LLVM community is roughly “the compiler’s first complete symbolic (as opposed to textual) representation of an input program”. As such, an “AST” might be a more general graph instead of a “tree” (consider the symbolic representation for the type of a typical “linked list node”). This working definition is closer to what some authors call an “annotated abstract syntax tree”.

Consult your favorite compiler book or search engine for more details.

B

BB Vectorization

Basic-Block Vectorization

BDCE

Bit-tracking dead code elimination. Some bit-wise instructions (shifts, ands, ors, etc.) “kill” some of their input bits – that is, they make it such that those bits can be either zero or one without affecting control or data flow of a program. The BDCE pass removes instructions that only compute these dead bits.
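As a rough illustration of dead bits, here is a hand-written C++ sketch (not output from the pass; the function names are hypothetical). The shift only produces bits 8 and above, and the final mask discards exactly those bits, so a bit-tracking analysis could treat the shift as dead:

```cpp
#include <cstdint>

// Hypothetical "before" form: the shifted value only contributes bits 8..31,
// and the final mask keeps bits 0..7, so every bit the shift produces is dead.
std::uint32_t low_byte_before(std::uint32_t x, std::uint32_t y) {
  std::uint32_t shifted = x << 8;  // low 8 bits of this value are always zero
  return (shifted | y) & 0xFFu;    // only bits 0..7 of the OR are observed
}

// Hypothetical "after" form: the shift is gone because none of the bits it
// computes can affect the result.
std::uint32_t low_byte_after(std::uint32_t x, std::uint32_t y) {
  (void)x;                         // x no longer influences the result
  return y & 0xFFu;
}
```

Both functions return the low byte of `y` for every input, which is why dropping the shift is safe in this sketch.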

BURS

Bottom Up Rewriting System. A method of instruction selection for code generation. An example is the BURG tool.

C

CFI

This abbreviation has two meanings. Either: Call Frame Information. Used in DWARF debug info and in C++ unwind info to show how the function prolog lays out the stack frame.

Or: Control Flow Integrity. A general term for computer security techniques that prevent a wide variety of malware attacks from redirecting the flow of execution (the control flow) of a program.

CIE

Common Information Entry. A kind of CFI used to reduce the size of FDEs. The compiler creates a CIE which contains the information common across all the FDEs. Each FDE then points to its CIE.

CSE

Common Subexpression Elimination. An optimization that removes redundant computation of common subexpressions. For example, (a+b)*(a+b) has two subexpressions that are the same: (a+b). This optimization would perform the addition only once and then perform the multiply (but only if it’s computationally correct/safe).
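The (a+b)*(a+b) example can be written out by hand as a small C++ sketch. The function names are hypothetical, and the "after" form only shows the shape of the result, not what any particular pass emits:

```cpp
// Before: (a + b) appears twice and, read naively, is evaluated twice.
unsigned square_of_sum_before(unsigned a, unsigned b) {
  return (a + b) * (a + b);
}

// After: the common subexpression is computed once and reused.
unsigned square_of_sum_after(unsigned a, unsigned b) {
  unsigned sum = a + b;  // the single remaining addition
  return sum * sum;
}
```

Unsigned arithmetic is used here so that both forms are well defined for all inputs.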

D

DAG

Directed Acyclic Graph

Derived Pointer

A pointer to the interior of an object, such that a garbage collector is unable to use the pointer for reachability analysis. While a derived pointer is live, the corresponding object pointer must be kept in a root, otherwise the collector might free the referenced object. With copying collectors, derived pointers pose an additional hazard that they may be invalidated at any safe point. This term is used in opposition to object pointer.

DSA

Data Structure Analysis

DSE

Dead Store Elimination

E

ento

This namespace houses the Clang Static Analyzer. It is an abbreviation of entomology.

“Entomology is the scientific study of insects.”

In the past, this namespace had not only the name GR (aka. Graph Reachability) but also entoSA.

F

FCA

First Class Aggregate

FDE

Frame Description Entry. A kind of CFI used to describe the stack frame of one function.

G

GC

Garbage Collection. The practice of using reachability analysis instead of explicit memory management to reclaim unused memory.

GEP

GetElementPtr. An LLVM IR instruction that is used to get the address of a subelement of an aggregate data structure. It is documented in detail here.
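As a loose C++ analogy (this is not LLVM IR, and the `Pair` type and function names are hypothetical), taking the address of a subelement is pure address arithmetic and reads no memory. The sketch below forms the same address twice, once with ordinary member syntax and once with explicit byte offsets:

```cpp
#include <cstddef>

// Hypothetical aggregate used only for this illustration.
struct Pair {
  int first;
  int values[4];
};

// Address of pairs[i].values[j] using ordinary C++ syntax. Nothing is loaded.
int *element_address(Pair *pairs, std::size_t i, std::size_t j) {
  return &pairs[i].values[j];
}

// The same address as explicit byte arithmetic:
//   base + i * sizeof(Pair) + offsetof(Pair, values) + j * sizeof(int)
int *element_address_by_offset(Pair *pairs, std::size_t i, std::size_t j) {
  char *base = reinterpret_cast<char *>(pairs);
  std::size_t offset =
      i * sizeof(Pair) + offsetof(Pair, values) + j * sizeof(int);
  return reinterpret_cast<int *>(base + offset);
}
```

The two functions are expected to return the same pointer for any in-bounds `i` and `j`.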

GVN

Global Value Numbering. GVN is a pass that partitions values computed by a function into congruence classes. Values ending up in the same congruence class are guaranteed to be the same for every execution of the program. In that respect, congruency is a compile-time approximation of equivalence of values at runtime.

H

Heap

In garbage collection, the region of memory which is managed using reachability analysis.

I

ICE

Internal Compiler Error. This abbreviation is used to describe errors that occur in LLVM or Clang as they are compiling source code. For example, if a valid C++ source program were to trigger an assert in Clang when compiled, that could be referred to as an “ICE”.

ICF

Identical Code Folding

ICP

Indirect Call Promotion

IPA

Inter-Procedural Analysis. Refers to any variety of code analysis that occurs between procedures, functions or compilation units (modules).

IPO

Inter-Procedural Optimization. Refers to any variety of code optimization that occurs between procedures, functions or compilation units (modules).

ISel

Instruction Selection

L

LCSSA

Loop-Closed Static Single Assignment Form

LGTM

“Looks Good To Me”. In a review thread, this indicates that the reviewer thinks that the patch is okay to commit.

LICM

Loop Invariant Code Motion

LSDA

Language Specific Data Area. C++ “zero cost” unwinding is built on top of a generic unwinding mechanism. As the unwinder walks each frame, it calls a “personality” function to do language-specific analysis. Each function’s FDE points to an optional LSDA which is passed to the personality function. For C++, the LSDA contains info about the type and location of catch statements in that function.

Load-VN

Load Value Numbering

LTO

Link-Time Optimization

M

MC

Machine Code

N

NFC

“No functional change”. Used in a commit message to indicate that a patch is a pure refactoring/cleanup. Usually used in the first line, so it is visible without opening the actual commit email.

O

Object Pointer

A pointer to an object such that the garbage collector is able to trace references contained within the object. This term is used in opposition to derived pointer.

P

PGO

Profile-Guided Optimization

PR

Problem report. A bug filed on the LLVM Bug Tracking System.

PRE

Partial Redundancy Elimination

R

RAUW

Replace All Uses With. The functions User::replaceUsesOfWith(), Value::replaceAllUsesWith(), and Constant::replaceUsesOfWithOnConstant() implement the replacement of one Value with another by iterating over its def/use chain and fixing up all of the pointers to point to the new value. See also def/use chains.
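The idea of walking a def/use chain and repointing each use can be sketched with toy types. Everything below (`ToyValue`, `ToyUser`, `add_operand`, `replace_all_uses_with`) is hypothetical and far simpler than the real LLVM classes; it only illustrates the bookkeeping:

```cpp
#include <vector>

// Hypothetical toy types; these are NOT the LLVM classes.
struct ToyUser;

struct ToyValue {
  // One entry per operand slot that refers to this value.
  std::vector<ToyUser *> users;
};

struct ToyUser {
  std::vector<ToyValue *> operands;
};

// Record that `user` takes `value` as its next operand.
void add_operand(ToyUser &user, ToyValue &value) {
  user.operands.push_back(&value);
  value.users.push_back(&user);
}

// Redirect every operand slot that points at `from` so it points at `to`.
void replace_all_uses_with(ToyValue &from, ToyValue &to) {
  if (&from == &to) return;  // nothing to do, and avoids self-modification
  for (ToyUser *user : from.users) {
    for (ToyValue *&operand : user->operands) {
      if (operand == &from) {
        operand = &to;             // fix up the pointer
        to.users.push_back(user);  // keep the use list of `to` in sync
      }
    }
  }
  from.users.clear();  // `from` now has no remaining uses
}
```

After the call, `from` has an empty use list, which is the usual precondition for deleting a value in this kind of design.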

Reassociation

Rearranging associative expressions to promote better redundancy elimination and other optimization. For example, changing (A+B-A) into (B+A-A), permitting it to be optimized into (B+0) then (B).
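The (A+B-A) example can be followed step by step in a small C++ sketch. The function names are hypothetical, and unsigned integers are assumed so that wraparound keeps every step exactly equivalent:

```cpp
// Original form: (A + B) - A.
unsigned original_form(unsigned a, unsigned b) {
  return a + b - a;
}

// After reassociation: B + (A - A). The redundant pair is now adjacent.
unsigned reassociated_form(unsigned a, unsigned b) {
  return b + (a - a);
}

// After simplification: (A - A) folds to 0, and B + 0 folds to B.
unsigned simplified_form(unsigned a, unsigned b) {
  (void)a;  // a no longer influences the result
  return b;
}
```

With signed or floating-point operands the same rewrite may not be valid without further checks, which is why the sketch sticks to unsigned arithmetic.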

RFC

Request for Comment. An email sent to a project mailing list in order to solicit feedback on a proposed change.

Root

In garbage collection, a pointer variable lying outside of the heap from which the collector begins its reachability analysis. In the context of code generation, “root” almost always refers to a “stack root”: a local or temporary variable within an executing function.
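How a collector might start from roots and discover live objects can be sketched with a toy heap. The representation below (`ToyHeap`, objects named by index, `mark_reachable`) is an assumption made for illustration and does not reflect any LLVM interface:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy heap: objects are identified by index, and each object
// lists the indices of the objects it references.
using ToyHeap = std::vector<std::vector<std::size_t>>;

// Returns one flag per heap object: true if it is reachable from any root.
// Assumes every index in `roots` and `heap` is a valid object index.
std::vector<bool> mark_reachable(const ToyHeap &heap,
                                 const std::vector<std::size_t> &roots) {
  std::vector<bool> marked(heap.size(), false);
  std::vector<std::size_t> worklist(roots.begin(), roots.end());
  while (!worklist.empty()) {
    std::size_t object = worklist.back();
    worklist.pop_back();
    if (marked[object]) continue;  // already visited
    marked[object] = true;
    for (std::size_t referenced : heap[object]) {
      worklist.push_back(referenced);  // follow outgoing references
    }
  }
  return marked;
}
```

Objects left unmarked are unreachable from the roots and are therefore candidates for reclamation, even if they themselves point at live objects.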

RPO

Reverse postorder

RTTI

Run-time Type Information

S

Safe Point

In garbage collection, it is necessary to identify stack roots so that reachability analysis may proceed. It may be infeasible to provide this information for every instruction, so instead the information is calculated only at designated safe points. With a copying collector, derived pointers must not be retained across safe points and object pointers must be reloaded from stack roots.

SDISel

Selection DAG Instruction Selection.

SCC

Strongly Connected Component

SCCP

Sparse Conditional Constant Propagation

SLP

Superword-Level Parallelism, same as Basic-Block Vectorization.

Splat

Splat refers to a vector of identical scalar elements.

The term is based on the PowerPC Altivec instructions that provided this functionality in hardware. For example, “vsplth” and the corresponding software intrinsic “vec_splat()”. Examples of other hardware names for this action include “duplicate” (ARM) and “broadcast” (x86).
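In plain C++, with no hardware intrinsics involved, a splat amounts to filling every lane of a fixed-size vector with the same scalar. The `splat` helper below is a hypothetical name used only for this sketch:

```cpp
#include <array>
#include <cstddef>

// Build an N-lane vector in which every lane holds the same scalar.
template <typename T, std::size_t N>
std::array<T, N> splat(T scalar) {
  std::array<T, N> lanes{};
  lanes.fill(scalar);  // copy the scalar into every lane
  return lanes;
}
```

For example, `splat<int, 4>(7)` yields a four-element array whose lanes are all 7.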

SRoA

Scalar Replacement of Aggregates

SSA

Static Single Assignment

Stack Map

In garbage collection, metadata emitted by the code generator which identifies roots within the stack frame of an executing function.

T

TBAA

Type-Based Alias Analysis

W

WPD

Whole Program Devirtualization