JITLink and ORC’s ObjectLinkingLayer
Introduction
This document aims to provide a high-level overview of the design and API of the JITLink library. It assumes some familiarity with linking and relocatable object files, but should not require deep expertise. If you know what a section, symbol, and relocation are then you should find this document accessible. If it is not, please submit a patch (Contributing to LLVM) or file a bug (How to submit an LLVM bug report).
JITLink is a library for JIT Linking. It was built to support the ORC JIT APIs and is most commonly accessed via ORC’s ObjectLinkingLayer API. JITLink was developed with the aim of supporting the full set of features provided by each object format, including static initializers, exception handling, thread local variables, and language runtime registration. Supporting these features enables ORC to execute code generated from source languages which rely on these features (e.g. C++ requires object format support for static initializers to support static constructors, eh-frame registration for exceptions, and TLV support for thread locals; Swift and Objective-C require language runtime registration for many features). For some object format features support is provided entirely within JITLink, and for others it is provided in cooperation with the (prototype) ORC runtime.
JITLink aims to support the following features, some of which are still under development:
- Cross-process and cross-architecture linking of single relocatable objects into a target executor process.
- Support for all object format features.
- Open linker data structures (LinkGraph) and pass system.
JITLink and ObjectLinkingLayer
ObjectLinkingLayer is ORC’s wrapper for JITLink. It is an ORC layer that allows objects to be added to a JITDylib, or emitted from some higher level program representation. When an object is emitted, ObjectLinkingLayer uses JITLink to construct a LinkGraph (see Constructing LinkGraphs) and calls JITLink’s link function to link the graph into the executor process.
The ObjectLinkingLayer class provides a plugin API, ObjectLinkingLayer::Plugin, which users can subclass in order to inspect and modify LinkGraph instances at link time, and react to important JIT events (such as an object being emitted into target memory). This enables many features and optimizations that were not possible under MCJIT or RuntimeDyld.
ObjectLinkingLayer Plugins
The ObjectLinkingLayer::Plugin class provides the following methods:
- modifyPassConfig is called each time a LinkGraph is about to be linked. It can be overridden to install JITLink Passes to run during the link process.

      void modifyPassConfig(MaterializationResponsibility &MR,
                            jitlink::LinkGraph &G,
                            jitlink::PassConfiguration &Config)

- notifyLoaded is called before the link begins, and can be overridden to set up any initial state for the given MaterializationResponsibility if needed.

      void notifyLoaded(MaterializationResponsibility &MR)

- notifyEmitted is called after the link is complete and code has been emitted to the executor process. It can be overridden to finalize state for the MaterializationResponsibility if needed.

      Error notifyEmitted(MaterializationResponsibility &MR)

- notifyFailed is called if the link fails at any point. It can be overridden to react to the failure (e.g. to deallocate any already allocated resources).

      Error notifyFailed(MaterializationResponsibility &MR)

- notifyRemovingResources is called when a request is made to remove any resources associated with the ResourceKey K for the MaterializationResponsibility.

      Error notifyRemovingResources(JITDylib &JD, ResourceKey K)

- notifyTransferringResources is called if/when a request is made to transfer tracking of any resources associated with ResourceKey SrcKey to DstKey.

      void notifyTransferringResources(JITDylib &JD, ResourceKey DstKey,
                                       ResourceKey SrcKey)

Plugin authors are required to implement the notifyFailed, notifyRemovingResources, and notifyTransferringResources methods in order to safely manage resources in the case of resource removal or transfer, or link failure. If no resources are managed by the plugin then these methods can be implemented as no-ops (returning Error::success() where an Error result is expected).
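To make the removal and transfer obligations above more concrete, here is a minimal, self-contained sketch of the bookkeeping a plugin might keep. It does not use the LLVM API: ToyResourceKey and ToyResourceTable are hypothetical names invented for this illustration, and the "resources" are just string labels.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for ORC's ResourceKey (an opaque integer id).
using ToyResourceKey = std::uintptr_t;

// Toy bookkeeping: each key owns a list of registered resources (labels).
class ToyResourceTable {
public:
  // Record a resource under key K.
  void add(ToyResourceKey K, std::string Label) {
    Resources[K].push_back(std::move(Label));
  }

  // Plays the role of notifyRemovingResources: drop everything tracked
  // under K and report how many resources were released.
  std::size_t remove(ToyResourceKey K) {
    auto It = Resources.find(K);
    if (It == Resources.end())
      return 0;
    std::size_t N = It->second.size();
    Resources.erase(It);
    return N;
  }

  // Plays the role of notifyTransferringResources: everything tracked
  // under SrcKey becomes tracked under DstKey.
  void transfer(ToyResourceKey DstKey, ToyResourceKey SrcKey) {
    assert(DstKey != SrcKey && "cannot transfer a key to itself");
    auto It = Resources.find(SrcKey);
    if (It == Resources.end())
      return;
    std::vector<std::string> Moved = std::move(It->second);
    Resources.erase(It);
    auto &Dst = Resources[DstKey];
    Dst.insert(Dst.end(), Moved.begin(), Moved.end());
  }

  // Number of resources currently tracked under K.
  std::size_t count(ToyResourceKey K) const {
    auto It = Resources.find(K);
    return It == Resources.end() ? 0 : It->second.size();
  }

private:
  std::map<ToyResourceKey, std::vector<std::string>> Resources;
};
```

A real plugin would additionally need to guard such a table against concurrent access, since links may run on multiple threads; that is omitted here.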
Plugin instances are added to an ObjectLinkingLayer by calling the addPlugin method [1]. E.g.

    // Plugin class to print the set of defined symbols in an object when that
    // object is linked.
    class MyPlugin : public ObjectLinkingLayer::Plugin {
    public:

      // Add passes to print the set of defined symbols after dead-stripping.
      void modifyPassConfig(MaterializationResponsibility &MR,
                            jitlink::LinkGraph &G,
                            jitlink::PassConfiguration &Config) override {
        Config.PostPrunePasses.push_back([this](jitlink::LinkGraph &G) {
          return printAllSymbols(G);
        });
      }

      // Implement mandatory overrides:
      Error notifyFailed(MaterializationResponsibility &MR) override {
        return Error::success();
      }
      Error notifyRemovingResources(JITDylib &JD, ResourceKey K) override {
        return Error::success();
      }
      void notifyTransferringResources(JITDylib &JD, ResourceKey DstKey,
                                       ResourceKey SrcKey) override {}

      // JITLink pass to print all defined symbols in G.
      Error printAllSymbols(jitlink::LinkGraph &G) {
        for (auto *Sym : G.defined_symbols())
          if (Sym->hasName())
            dbgs() << Sym->getName() << "\n";
        return Error::success();
      }
    };

    // Create our LLJIT instance using a custom object linking layer setup.
    // This gives us a chance to install our plugin.
    auto J = ExitOnErr(LLJITBuilder()
               .setObjectLinkingLayerCreator(
                 [](ExecutionSession &ES, const Triple &T) {
                   // Manually set up the ObjectLinkingLayer for our LLJIT
                   // instance.
                   auto OLL = std::make_unique<ObjectLinkingLayer>(
                       ES, std::make_unique<jitlink::InProcessMemoryManager>());

                   // Install our plugin:
                   OLL->addPlugin(std::make_unique<MyPlugin>());

                   return OLL;
                 })
               .create());

    // Add an object to the JIT. Nothing happens here: linking isn't triggered
    // until we look up some symbol in our object.
    ExitOnErr(J->addObject(loadFromDisk("main.o")));

    // Plugin triggers here when our lookup of main triggers linking of main.o
    auto MainSym = J->lookup("main");

LinkGraph
JITLink maps all relocatable object formats to a generic LinkGraph type that is designed to make linking fast and easy (LinkGraph instances can also be created manually. See Constructing LinkGraphs).
Relocatable object formats (e.g. COFF, ELF, MachO) differ in their details, but share a common goal: to represent machine level code and data with annotations that allow them to be relocated in a virtual address space. To this end they usually contain names (symbols) for content defined inside the file or externally, chunks of content that must be moved as a unit (sections or subsections, depending on the format), and annotations describing how to patch content based on the final address of some target symbol/section (relocations).
At a high level, the LinkGraph type represents these concepts as a decorated graph. Nodes in the graph represent symbols and content, and edges represent relocations. Each of the elements of the graph is listed here:
- Addressable – A node in the link graph that can be assigned an address in the executor process’s virtual address space.
  Absolute and external symbols are represented using plain Addressable instances. Content defined inside the object file is represented using the Block subclass.
- Block – An Addressable node that has Content (or is marked as zero-filled), a parent Section, a Size, an Alignment (and an AlignmentOffset), and a list of Edge instances.
  Blocks provide a container for binary content which must remain contiguous in the target address space (a layout unit). Many interesting low level operations on LinkGraph instances involve inspecting or mutating block content or edges.
  - Content is represented as an llvm::StringRef, and accessible via the getContent method. Content is only available for content blocks, and not for zero-fill blocks (use isZeroFill to check, and prefer getSize when only the block size is needed as it works for both zero-fill and content blocks).
  - Section is represented as a Section& reference, and accessible via the getSection method. The Section class is described in more detail below.
  - Size is represented as a size_t, and is accessible via the getSize method for both content and zero-filled blocks.
  - Alignment is represented as a uint64_t, and available via the getAlignment method. It represents the minimum alignment requirement (in bytes) of the start of the block.
  - AlignmentOffset is represented as a uint64_t, and accessible via the getAlignmentOffset method. It represents the offset from the alignment required for the start of the block. This is required to support blocks whose minimum alignment requirement comes from data at some non-zero offset inside the block. E.g. if a block consists of a single byte (with byte alignment) followed by a uint64_t (with 8-byte alignment), then the block will have 8-byte alignment with an alignment offset of 7.
  - The list of Edge instances. An iterator range for this list is returned by the edges method. The Edge class is described in more detail below.
- Symbol – An offset from an Addressable (often a Block), with an optional Name, a Linkage, a Scope, a Callable flag, and a Live flag.
  Symbols make it possible to name content (blocks and addressables are anonymous), or target content with an Edge.
  - Name is represented as an llvm::StringRef (equal to llvm::StringRef() if the symbol has no name), and accessible via the getName method.
  - Linkage is one of Strong or Weak, and is accessible via the getLinkage method. The JITLinkContext can use this flag to determine whether this symbol definition should be kept or dropped.
  - Scope is one of Default, Hidden, or Local, and is accessible via the getScope method. The JITLinkContext can use this to determine who should be able to see the symbol. A symbol with default scope should be globally visible. A symbol with hidden scope should be visible to other definitions within the same simulated dylib (e.g. ORC JITDylib) or executable, but not from elsewhere. A symbol with local scope should only be visible within the current LinkGraph.
  - Callable is a boolean which is set to true if this symbol can be called, and is accessible via the isCallable method. This can be used to automate the introduction of call-stubs for lazy compilation.
  - Live is a boolean that can be set to mark this symbol as a root for dead-stripping purposes (see Generic Link Algorithm). JITLink’s dead-stripping algorithm will propagate liveness flags through the graph to all reachable symbols before deleting any symbols (and blocks) that are not marked live.
- Edge – A quad of an Offset (implicitly from the start of the containing Block), a Kind (describing the relocation type), a Target, and an Addend.
  Edges represent relocations, and occasionally other relationships, between blocks and symbols.
  - Offset, accessible via getOffset, is an offset from the start of the Block containing the Edge.
  - Kind, accessible via getKind, is a relocation type – it describes what kinds of changes (if any) should be made to block content at the given Offset based on the address of the Target.
  - Target, accessible via getTarget, is a pointer to the Symbol whose address is relevant to the fixup calculation specified by the edge’s Kind.
  - Addend, accessible via getAddend, is a constant whose interpretation is determined by the edge’s Kind.
- Section – A set of Symbol instances, plus a set of Block instances, with a Name, a set of ProtectionFlags, and an Ordinal.
  Sections make it easy to iterate over the symbols or blocks associated with a particular section in the source object file.
  - blocks() returns an iterator over the set of blocks defined in the section (as Block* pointers).
  - symbols() returns an iterator over the set of symbols defined in the section (as Symbol* pointers).
  - Name is represented as an llvm::StringRef, and is accessible via the getName method.
  - ProtectionFlags are represented as a sys::Memory::ProtectionFlags enum, and accessible via the getProtectionFlags method. These flags describe whether the section is readable, writable, executable, or some combination of these. The most common combinations are RW- for writable data, R-- for constant data, and R-X for code.
  - SectionOrdinal, accessible via getOrdinal, is a number used to order the section relative to others. It is usually used to preserve section order within a segment (a set of sections with the same memory protections) when laying out memory.
For the graph-theorists: The LinkGraph is bipartite, with one set of Symbol nodes and one set of Addressable nodes. Each Symbol node has one (implicit) edge to its target Addressable. Each Block has a set of edges (possibly empty, represented as Edge instances) back to elements of the Symbol set. For convenience and performance of common algorithms, symbols and blocks are further grouped into Sections.
The LinkGraph itself provides operations for constructing, removing, and iterating over sections, symbols, and blocks. It also provides metadata and utilities relevant to the linking process:
- Graph element operations
  - sections returns an iterator over all sections in the graph.
  - findSectionByName returns a pointer to the section with the given name (as a Section*) if it exists, otherwise returns a nullptr.
  - blocks returns an iterator over all blocks in the graph (across all sections).
  - defined_symbols returns an iterator over all defined symbols in the graph (across all sections).
  - external_symbols returns an iterator over all external symbols in the graph.
  - absolute_symbols returns an iterator over all absolute symbols in the graph.
  - createSection creates a section with a given name and protection flags.
  - createContentBlock creates a block with the given initial content, parent section, address, alignment, and alignment offset.
  - createZeroFillBlock creates a zero-fill block with the given size, parent section, address, alignment, and alignment offset.
  - addExternalSymbol creates a new addressable and symbol with a given name, size, and linkage.
  - addAbsoluteSymbol creates a new addressable and symbol with a given name, address, size, linkage, scope, and liveness.
  - addCommonSymbol is a convenience function for creating a zero-filled block and weak symbol with a given name, scope, section, initial address, size, alignment and liveness.
  - addAnonymousSymbol creates a new anonymous symbol for a given block, offset, size, callable-ness, and liveness.
  - addDefinedSymbol creates a new symbol for a given block with a name, offset, size, linkage, scope, callable-ness and liveness.
  - makeExternal transforms a formerly defined symbol into an external one by creating a new addressable and pointing the symbol at it. The existing block is not deleted, but can be manually removed (if unreferenced) by calling removeBlock. All edges to the symbol remain valid, but the symbol must now be defined outside this LinkGraph.
  - removeExternalSymbol removes an external symbol and its target addressable. The target addressable must not be referenced by any other symbols.
  - removeAbsoluteSymbol removes an absolute symbol and its target addressable. The target addressable must not be referenced by any other symbols.
  - removeDefinedSymbol removes a defined symbol, but does not remove its target block.
  - removeBlock removes the given block.
  - splitBlock splits a given block in two at a given index (useful where it is known that a block contains decomposable records, e.g. CFI records in an eh-frame section).
- Graph utility operations
  - getName returns the name of this graph, which is usually based on the name of the input object file.
  - getTargetTriple returns an llvm::Triple for the executor process.
  - getPointerSize returns the size of a pointer (in bytes) in the executor process.
  - getEndianness returns the endianness of the executor process.
  - allocateString copies data from a given llvm::Twine into the link graph’s internal allocator. This can be used to ensure that content created inside a pass outlives that pass’s execution.
Generic Link Algorithm
JITLink provides a generic link algorithm which can be extended / modified at certain points by the introduction of JITLink Passes.
At the end of each phase the linker packages its state into a continuation and calls the JITLinkContext object to perform a (potentially high-latency) asynchronous operation: allocating memory, resolving external symbols, and finally transferring linked memory to the executing process.
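The continuation-passing structure described above can be illustrated with a small, self-contained model. Everything here is hypothetical (ToyContext, ToyLinker, and the phase method names are invented for this sketch and are not JITLink classes): each phase does its work, then hands "the rest of the link" to the context, which is free to run it later.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical context: accepts a continuation and runs it later, standing
// in for an asynchronous allocate / resolve / finalize request.
struct ToyContext {
  std::queue<std::function<void()>> Pending;

  void async(std::function<void()> Continuation) {
    Pending.push(std::move(Continuation));
  }

  // Drain pending operations, as an event loop or thread pool might.
  void runAll() {
    while (!Pending.empty()) {
      std::function<void()> Next = std::move(Pending.front());
      Pending.pop();
      Next();
    }
  }
};

// Each phase records what it did, then packages the next phase as a
// continuation instead of calling it directly.
class ToyLinker {
public:
  explicit ToyLinker(ToyContext &C) : Ctx(C) {}

  void phase1() {
    Log.push_back("prune, then request allocation");
    Ctx.async([this] { phase2(); });
  }

  std::vector<std::string> Log;

private:
  void phase2() {
    Log.push_back("notify resolved, then request external symbols");
    Ctx.async([this] { phase3(); });
  }
  void phase3() {
    Log.push_back("apply fixups, then request finalization");
    Ctx.async([this] { phase4(); });
  }
  void phase4() { Log.push_back("notify emitted"); }

  ToyContext &Ctx;
};
```

In this model, calling phase1 returns as soon as the first continuation is queued; the remaining phases only run when the context gets around to them, which is the property that lets a real context overlap high-latency operations across many links.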
Phase 1
This phase is called immediately by the link function as soon as the initial configuration (including the pass pipeline setup) is complete.
1. Run pre-prune passes.
   These passes are called on the graph before it is pruned. At this stage LinkGraph nodes still have their original vmaddrs. A mark-live pass (supplied by the JITLinkContext) will be run at the end of this sequence to mark the initial set of live symbols.
   Notable use cases: marking nodes live, accessing/copying graph data that will be pruned (e.g. metadata that’s important for the JIT, but not needed for the link process).
2. Prune (dead-strip) the LinkGraph.
   Removes all symbols and blocks not reachable from the initial set of live symbols.
   This allows JITLink to remove unreachable symbols / content, including overridden weak and redundant ODR definitions.
3. Run post-prune passes.
   These passes are run on the graph after dead-stripping, but before memory is allocated or nodes assigned their final target vmaddrs.
   Passes run at this stage benefit from pruning, as dead functions and data have been stripped from the graph. However new content can still be added to the graph, as target and working memory have not been allocated yet.
   Notable use cases: Building Global Offset Table (GOT), Procedure Linkage Table (PLT), and Thread Local Variable (TLV) entries.
4. Asynchronously allocate memory.
   Calls the JITLinkContext’s JITLinkMemoryManager to allocate both working and target memory for the graph. As part of this process the JITLinkMemoryManager will update the addresses of all nodes defined in the graph to their assigned target address.
   Note: This step only updates the addresses of nodes defined in this graph. External symbols will still have null addresses.
Phase 2
1. Run post-allocation passes.
   These passes are run on the graph after working and target memory have been allocated, but before the JITLinkContext is notified of the final addresses of the symbols in the graph. This gives these passes a chance to set up data structures associated with target addresses before any JITLink clients (especially ORC queries for symbol resolution) can attempt to access them.
   Notable use cases: Setting up mappings between target addresses and JIT data structures, such as a mapping between __dso_handle and JITDylib*.
2. Notify the JITLinkContext of the assigned symbol addresses.
   Calls JITLinkContext::notifyResolved on the link graph, allowing clients to react to the symbol address assignments made for this graph. In ORC this is used to notify any pending queries for resolved symbols, including pending queries from concurrently running JITLink instances that have reached the next step and are waiting on the address of a symbol in this graph to proceed with their link.
3. Identify external symbols and resolve their addresses asynchronously.
   Calls the JITLinkContext to resolve the target address of any external symbols in the graph.
Phase 3
1. Apply external symbol resolution results.
   This updates the addresses of all external symbols. At this point all nodes in the graph have their final target addresses, however node content still points back to the original data in the object file.
2. Run pre-fixup passes.
   These passes are called on the graph after all nodes have been assigned their final target addresses, but before node content is copied into working memory and fixed up. Passes run at this stage can make late optimizations to the graph and content based on address layout.
   Notable use cases: GOT and PLT relaxation, where GOT and PLT accesses are bypassed for fixup targets that are directly accessible under the assigned memory layout.
3. Copy block content to working memory and apply fixups.
   Copies all block content into allocated working memory (following the target layout) and applies fixups. Graph blocks are updated to point at the fixed up content.
4. Run post-fixup passes.
   These passes are called on the graph after fixups have been applied and blocks updated to point to the fixed up content.
   Post-fixup passes can inspect block contents to see the exact bytes that will be copied to the assigned target addresses.
5. Finalize memory asynchronously.
   Calls the JITLinkMemoryManager to copy working memory to the executor process and apply the requested permissions.
Phase 4
1. Notify the context that the graph has been emitted.
   Calls JITLinkContext::notifyFinalized and hands off the JITLinkMemoryManager::FinalizedAlloc object for this graph’s memory allocation. This allows the context to track/hold memory allocations and react to the newly emitted definitions. In ORC this is used to update the ExecutionSession instance’s dependence graph, which may result in these symbols (and possibly others) becoming Ready if all of their dependencies have also been emitted.
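The prune step in Phase 1 is, at heart, a reachability computation from the initially live symbols. The sketch below shows that idea on a deliberately simplified graph. ToyGraph and prune are hypothetical names for this illustration; the real LinkGraph distinguishes symbols from blocks, which this toy model collapses into plain nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy graph: node I has outgoing edges to the nodes listed in Edges[I].
// Live[I] is pre-set for roots, as a mark-live pass would do.
struct ToyGraph {
  std::vector<std::vector<std::size_t>> Edges;
  std::vector<bool> Live;
};

// Propagates liveness from the pre-marked roots along edges and returns
// the indices of the nodes that dead-stripping would remove.
inline std::vector<std::size_t> prune(ToyGraph &G) {
  assert(G.Edges.size() == G.Live.size() && "one liveness flag per node");

  // Seed the worklist with the roots.
  std::vector<std::size_t> Worklist;
  for (std::size_t I = 0; I != G.Live.size(); ++I)
    if (G.Live[I])
      Worklist.push_back(I);

  // Mark everything reachable from a live node as live.
  while (!Worklist.empty()) {
    std::size_t N = Worklist.back();
    Worklist.pop_back();
    for (std::size_t Target : G.Edges[N]) {
      assert(Target < G.Live.size() && "edge target out of range");
      if (!G.Live[Target]) {
        G.Live[Target] = true;
        Worklist.push_back(Target);
      }
    }
  }

  // Whatever is still unmarked is unreachable.
  std::vector<std::size_t> Dead;
  for (std::size_t I = 0; I != G.Live.size(); ++I)
    if (!G.Live[I])
      Dead.push_back(I);
  return Dead;
}
```

Note that a dead node pointing at a live node (an edge in the "wrong" direction) does not keep the dead node alive; only edges leaving live nodes matter.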
Passes
JITLink passes are std::function<Error(LinkGraph&)> instances. They are free to inspect and modify the given LinkGraph subject to the constraints of whatever phase they are running in (see Generic Link Algorithm). If a pass returns Error::success() then linking continues. If a pass returns a failure value then linking is stopped and the JITLinkContext is notified that the link failed.
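The stop-on-first-failure behaviour described above can be modelled without any LLVM types. In this sketch, ToyError (an optional string, with an empty optional standing in for Error::success()), ToyState, ToyPass, and runPasses are all hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Stand-in for the graph being linked; it only counts how many passes ran.
struct ToyState {
  int PassesRun = 0;
};

// An empty optional plays the role of Error::success().
using ToyError = std::optional<std::string>;
using ToyPass = std::function<ToyError(ToyState &)>;

// Runs passes in order. Returns the first failure and skips the remaining
// passes, or returns success if every pass succeeded.
inline ToyError runPasses(const std::vector<ToyPass> &Passes, ToyState &S) {
  for (const ToyPass &P : Passes) {
    assert(P && "pass list must not contain empty functions");
    if (ToyError Err = P(S))
      return Err;
  }
  return std::nullopt;
}
```

A caller would treat a returned message as the signal to abandon the link and report the failure, which corresponds to the context being notified in the real system.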
Passes may be used by both JITLink backends (e.g. MachO/x86-64 implements GOT and PLT construction as a pass), and external clients like ObjectLinkingLayer::Plugin.
In combination with the open LinkGraph API, JITLink passes enable the implementation of powerful new features. For example:
Relaxation optimizations – A pre-fixup pass can inspect GOT accesses and PLT calls and identify situations where the addresses of the entry target and the access are close enough to be accessed directly. In this case the pass can rewrite the instruction stream of the containing block and update the fixup edges to make the access direct.
Code for this looks like:

    Error relaxGOTEdges(LinkGraph &G) {
      for (auto *B : G.blocks())
        for (auto &E : B->edges())
          if (E.getKind() == x86_64::GOTLoad) {
            auto &GOTTarget = getGOTEntryTarget(E.getTarget());
            if (isInRange(B->getFixupAddress(E), GOTTarget)) {
              // Rewrite B->getContent() at fixup address from
              // MOVQ to LEAQ

              // Update edge target and kind.
              E.setTarget(GOTTarget);
              E.setKind(x86_64::PCRel32);
            }
          }

      return Error::success();
    }

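The isInRange check in the relaxation example is left abstract. One plausible reading for x86-64 is "can a signed 32-bit PC-relative displacement reach the target from this fixup?". The sketch below implements that reading on raw addresses. The function name fitsInPCRel32 is hypothetical, and the "+ 4" assumes the 4-byte displacement field is the last part of the instruction, so the displacement is measured from the byte after the field.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True if TargetAddr can be reached from a 4-byte PC-relative field located
// at FixupAddr using a signed 32-bit displacement.
// Assumption: the displacement is relative to FixupAddr + 4 (the address of
// the next instruction when the field ends the instruction).
inline bool fitsInPCRel32(std::uint64_t FixupAddr, std::uint64_t TargetAddr) {
  // Unsigned subtraction wraps; reinterpreting as signed gives the delta.
  std::int64_t Delta = static_cast<std::int64_t>(TargetAddr - (FixupAddr + 4));
  return Delta >= std::numeric_limits<std::int32_t>::min() &&
         Delta <= std::numeric_limits<std::int32_t>::max();
}
```

A pass would only rewrite the instruction and retarget the edge when such a check succeeds; otherwise the original GOT access must be kept.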
Metadata registration – Post allocation passes can be used to record the address range of sections in the target. This can be used to register the metadata (e.g. exception handling frames, language metadata) in the target once memory has been finalized.

    Error registerEHFrameSection(LinkGraph &G) {
      if (auto *Sec = G.findSectionByName("__eh_frame")) {
        SectionRange SR(*Sec);
        registerEHFrameSection(SR.getStart(), SR.getEnd());
      }
      return Error::success();
    }

Record call sites for later mutation – A post-allocation pass can record the call sites of all calls to a particular function, allowing those call sites to be updated later at runtime (e.g. for instrumentation, or to enable the function to be lazily compiled but still called directly after compilation).

    StringRef FunctionName = "foo";
    std::vector<ExecutorAddr> CallSitesForFunction;

    auto RecordCallSites =
      [&](LinkGraph &G) -> Error {
        for (auto *B : G.blocks())
          for (auto &E : B->edges())
            if (E.getKind() == CallEdgeKind &&
                E.getTarget().hasName() &&
                E.getTarget().getName() == FunctionName)
              CallSitesForFunction.push_back(B->getFixupAddress(E));
        return Error::success();
      };

Memory Management with JITLinkMemoryManager
JIT linking requires allocation of two kinds of memory: working memory in the JIT process and target memory in the execution process (these processes and memory allocations may be one and the same, depending on how the user wants to build their JIT). It also requires that these allocations conform to the requested code model in the target process (e.g. MachO/x86-64’s Small code model requires that all code and data for a simulated dylib is allocated within 4GB). Finally, it is natural to make the memory manager responsible for transferring memory to the target address space and applying memory protections, since the memory manager must know how to communicate with the executor, and since sharing and protection assignment can often be efficiently managed (in the common case of running across processes on the same machine for security) via the host operating system’s virtual memory management APIs.
To satisfy these requirements JITLinkMemoryManager adopts the following design: The memory manager itself has just two virtual methods for asynchronous operations (each with convenience overloads for calling synchronously):

    /// Called when allocation has been completed.
    using OnAllocatedFunction =
        unique_function<void(Expected<std::unique_ptr<InFlightAlloc>>)>;

    /// Called when deallocation has completed.
    using OnDeallocatedFunction = unique_function<void(Error)>;

    /// Call to allocate memory.
    virtual void allocate(const JITLinkDylib *JD, LinkGraph &G,
                          OnAllocatedFunction OnAllocated) = 0;

    /// Call to deallocate memory.
    virtual void deallocate(std::vector<FinalizedAlloc> Allocs,
                            OnDeallocatedFunction OnDeallocated) = 0;

The allocate method takes a JITLinkDylib* representing the target simulated dylib, a reference to the LinkGraph that must be allocated for, and a callback to run once an InFlightAlloc has been constructed. JITLinkMemoryManager implementations can (optionally) use the JD argument to manage a per-simulated-dylib memory pool (since code model constraints are typically imposed on a per-dylib basis, and not across dylibs) [2]. The LinkGraph describes the object file that we need to allocate memory for. The allocator must allocate working memory for all of the Blocks defined in the graph, assign address space for each Block within the executing process’s memory, and update the Blocks’ addresses to reflect this assignment. Block content should be copied to working memory, but does not need to be transferred to executor memory yet (that will be done once the content is fixed up). JITLinkMemoryManager implementations can take full responsibility for these steps, or use the BasicLayout utility to reduce the task to allocating working and executor memory for segments: chunks of memory defined by permissions, alignments, content sizes, and zero-fill sizes. Once the allocation step is complete the memory manager should construct an InFlightAlloc object to represent the allocation, and then pass this object to the OnAllocated callback.
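The idea of reducing allocation to segments (defined by permissions, alignment, content size, and zero-fill size) can be sketched as follows. This is a simplified model and not the BasicLayout implementation: ToyProt, ToyBlock, ToySegment, alignUp, and layout are hypothetical names, and the policy of placing all content blocks before all zero-fill blocks within a segment is an assumption made for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

enum class ToyProt { R, RW, RX };

struct ToyBlock {
  ToyProt Prot;
  std::uint64_t Size;
  std::uint64_t Alignment; // must be a power of two
  bool ZeroFill;
};

struct ToySegment {
  std::uint64_t ContentSize = 0;  // bytes that need working memory
  std::uint64_t ZeroFillSize = 0; // bytes only reserved in target memory
  std::uint64_t Alignment = 1;    // strictest alignment of any block
};

inline std::uint64_t alignUp(std::uint64_t V, std::uint64_t A) {
  assert(A != 0 && (A & (A - 1)) == 0 && "alignment must be a power of two");
  return (V + A - 1) & ~(A - 1);
}

// Groups blocks by protection. Within each segment, content blocks are laid
// out first and zero-fill blocks after them, with padding for alignment.
inline std::map<ToyProt, ToySegment> layout(const std::vector<ToyBlock> &Blocks) {
  std::map<ToyProt, ToySegment> Segs;
  for (const ToyBlock &B : Blocks) {
    if (B.ZeroFill)
      continue;
    ToySegment &S = Segs[B.Prot];
    S.ContentSize = alignUp(S.ContentSize, B.Alignment) + B.Size;
    S.Alignment = std::max(S.Alignment, B.Alignment);
  }
  for (const ToyBlock &B : Blocks) {
    if (!B.ZeroFill)
      continue;
    ToySegment &S = Segs[B.Prot];
    std::uint64_t End =
        alignUp(S.ContentSize + S.ZeroFillSize, B.Alignment) + B.Size;
    S.ZeroFillSize = End - S.ContentSize;
    S.Alignment = std::max(S.Alignment, B.Alignment);
  }
  return Segs;
}
```

With per-segment totals like these, a memory manager only has to obtain one working-memory region and one executor address range per segment, rather than reasoning about individual blocks.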
The InFlightAlloc object has two virtual methods:

    using OnFinalizedFunction = unique_function<void(Expected<FinalizedAlloc>)>;
    using OnAbandonedFunction = unique_function<void(Error)>;

    /// Called prior to finalization if the allocation should be abandoned.
    virtual void abandon(OnAbandonedFunction OnAbandoned) = 0;

    /// Called to transfer working memory to the target and apply finalization.
    virtual void finalize(OnFinalizedFunction OnFinalized) = 0;

The linking process will call the finalize method on the InFlightAlloc object if linking succeeds up to the finalization step, otherwise it will call abandon to indicate that some error occurred during linking. A call to the InFlightAlloc::finalize method should cause content for the allocation to be transferred from working to executor memory, and permissions to be applied. A call to abandon should result in both kinds of memory being deallocated.
On successful finalization, the InFlightAlloc::finalize method should construct a FinalizedAlloc object (an opaque uint64_t id that the JITLinkMemoryManager can use to identify executor memory for deallocation) and pass it to the OnFinalized callback.
Finalized allocations (represented by FinalizedAlloc objects) can be deallocated by calling the JITLinkMemoryManager::deallocate method. This method takes a vector of FinalizedAlloc objects, since it is common to deallocate multiple objects at the same time and this allows us to batch these requests for transmission to the executing process.
JITLink provides a simple in-process implementation of this interface: InProcessMemoryManager. It allocates pages once and re-uses them as both working and target memory.
ORC provides a cross-process-capable MapperJITLinkMemoryManager that can use shared memory or ORC-RPC-based communication to transfer content to the executing process.
JITLinkMemoryManager and Security
JITLink’s ability to link JIT’d code for a separate executor process can be used to improve the security of a JIT system: The executor process can be sandboxed, run within a VM, or even run on a fully separate machine.
JITLink’s memory manager interface is flexible enough to allow for a range of trade-offs between performance and security. For example, on a system where code pages must be signed (preventing code from being updated), the memory manager can deallocate working memory pages after linking to free memory in the process running JITLink. Alternatively, on a system that allows RWX pages, the memory manager may use the same pages for both working and target memory by marking them as RWX, allowing code to be modified in place without further overhead. Finally, if RWX pages are not permitted but dual-virtual-mappings of physical memory pages are, then the memory manager can dual map physical pages as RW- in the JITLink process and R-X in the executor process, allowing modification from the JITLink process but not from the executor (at the cost of extra administrative overhead for the dual mapping).
Error Handling
JITLink makes extensive use of the llvm::Error type (see the error handling section of LLVM Programmer’s Manual for details). The link process itself, all passes, the memory manager interface, and operations on the JITLinkContext are all permitted to fail. Link graph construction utilities (especially parsers for object formats) are encouraged to validate input, and validate fixups (e.g. with range checks) before application.
Any error will halt the link process and notify the context of failure. In ORC, reported failures are propagated to queries pending on definitions provided by the failing link, and also through edges of the dependence graph to any queries waiting on dependent symbols.
Connection to the ORC Runtime
The ORC Runtime (currently under development) aims to provide runtime support for advanced JIT features, including object format features that require non-trivial action in the executor (e.g. running initializers, managing thread local storage, registering with language runtimes, etc.).
ORC Runtime support for object format features typically requires cooperation between the runtime (which executes in the executor process) and JITLink (which runs in the JIT process and can inspect LinkGraphs to determine what actions must be taken in the executor). For example: Execution of MachO static initializers in the ORC runtime is performed by the jit_dlopen function, which calls back to the JIT process to ask for the list of address ranges of __mod_init sections to walk. This list is collated by the MachOPlatformPlugin, which installs a pass to record this information for each object as it is linked into the target.
Constructing LinkGraphs
Clients usually access and manipulate LinkGraph instances that were created for them by an ObjectLinkingLayer instance, but they can be created manually:
1. By directly constructing and populating a LinkGraph instance.
2. By using the createLinkGraph family of functions to create a LinkGraph from an in-memory buffer containing an object file. This is how ObjectLinkingLayer usually creates LinkGraphs.
   - createLinkGraph_<Object-Format>_<Architecture> can be used when both the object format and architecture are known ahead of time.
   - createLinkGraph_<Object-Format> can be used when the object format is known ahead of time, but the architecture is not. In this case the architecture will be determined by inspection of the object header.
   - createLinkGraph can be used when neither the object format nor the architecture are known ahead of time. In this case the object header will be inspected to determine both the format and architecture.
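"Inspecting the object header" to pick a format usually starts with the file's leading magic bytes. The self-contained sketch below recognises two well-known magic numbers: ELF (0x7f 'E' 'L' 'F') and 64-bit Mach-O as written by a little-endian producer (0xFEEDFACF stored as cf fa ed fe). ToyFormat and detectFormat are hypothetical names; this is not the logic createLinkGraph uses, and formats such as COFF that are not identified by a simple leading magic number are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

enum class ToyFormat { Unknown, ELF, MachO64 };

// Classifies a buffer by its first four bytes. Buffers shorter than four
// bytes, or with unrecognised leading bytes, are reported as Unknown.
inline ToyFormat detectFormat(const unsigned char *Data, std::size_t Size) {
  assert((Data != nullptr || Size == 0) && "null buffer must be empty");
  static const unsigned char ELFMagic[4] = {0x7f, 'E', 'L', 'F'};
  // 0xFEEDFACF in little-endian byte order.
  static const unsigned char MachO64Magic[4] = {0xcf, 0xfa, 0xed, 0xfe};
  if (Size < 4)
    return ToyFormat::Unknown;
  if (std::memcmp(Data, ELFMagic, 4) == 0)
    return ToyFormat::ELF;
  if (std::memcmp(Data, MachO64Magic, 4) == 0)
    return ToyFormat::MachO64;
  return ToyFormat::Unknown;
}
```

A full implementation would go on to read architecture fields from the header (for example the ELF machine type) once the format is known, which is the second half of what the generic entry point has to do.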
JIT Linking
The JIT linker concept was introduced in LLVM’s earlier generation of JIT APIs, MCJIT. In MCJIT the RuntimeDyld component enabled re-use of LLVM as an in-memory compiler by adding an in-memory link step to the end of the usual compiler pipeline. Rather than dumping relocatable objects to disk as a compiler usually would, MCJIT passed them to RuntimeDyld to be linked into a target process.
This approach to linking differs from standard static or dynamic linking:
- A static linker takes one or more relocatable object files as input and links them into an executable or dynamic library on disk.
- A dynamic linker applies relocations to executables and dynamic libraries that have been loaded into memory.
- A JIT linker takes a single relocatable object file at a time and links it into a target process, usually using a context object to allow the linked code to resolve symbols in the target.
RuntimeDyld¶
In order to keep RuntimeDyld’s implementation simple MCJIT imposed some restrictions on compiled code:
- It had to use the Large code model, and often restricted available relocation models in order to limit the kinds of relocations that had to be supported.
- It required strong linkage and default visibility on all symbols; behavior for other linkages/visibilities was not well defined.
- It constrained and/or prohibited the use of features requiring runtime support, e.g. static initializers or thread local storage.
As a result of these restrictions not all language features supported by LLVM worked under MCJIT, and objects to be loaded under the JIT had to be compiled to target it (precluding the use of precompiled code from other sources under the JIT).
RuntimeDyld also provided very limited visibility into the linking process itself: Clients could access conservative estimates of section size (RuntimeDyld bundled stub size and padding estimates into the section size value) and the final relocated bytes, but could not access RuntimeDyld’s internal object representations.
Eliminating these restrictions and limitations was one of the primary motivations for the development of JITLink.
The llvm-jitlink tool¶
The `llvm-jitlink` tool is a command line wrapper for the JITLink library. It loads some set of relocatable object files and then links them using JITLink. Depending on the options used it will then execute them, or validate the linked memory.
The `llvm-jitlink` tool was originally designed to aid JITLink development by providing a simple environment for testing.
Basic usage¶
By default, `llvm-jitlink` will link the set of objects passed on the command line, then search for a “main” function and execute it:

    % cat hello-world.c
    #include <stdio.h>

    int main(int argc, char *argv[]) {
      printf("hello, world!\n");
      return 0;
    }

    % clang -c -o hello-world.o hello-world.c
    % llvm-jitlink hello-world.o
    hello, world!

Multiple objects may be specified, and arguments may be provided to the JIT’d main function using the `-args` option:

    % cat print-args.c
    #include <stdio.h>

    void print_args(int argc, char *argv[]) {
      for (int i = 0; i != argc; ++i)
        printf("arg %i is \"%s\"\n", i, argv[i]);
    }

    % cat print-args-main.c
    void print_args(int argc, char *argv[]);

    int main(int argc, char *argv[]) {
      print_args(argc, argv);
      return 0;
    }

    % clang -c -o print-args.o print-args.c
    % clang -c -o print-args-main.o print-args-main.c
    % llvm-jitlink print-args.o print-args-main.o -args a b c
    arg 0 is "a"
    arg 1 is "b"
    arg 2 is "c"

Alternative entry points may be specified using the `-entry <entry point name>` option.
Other options can be found by calling `llvm-jitlink -help`.
llvm-jitlink as a regression testing utility¶
One of the primary aims of `llvm-jitlink` was to enable readable regression tests for JITLink. To do this it supports two options:
- The `-noexec` option tells llvm-jitlink to stop after looking up the entry point, and before attempting to execute it. Since the linked code is not executed, this can be used to link for other targets even if you do not have access to the target being linked (the `-define-abs` or `-phony-externals` options can be used to supply any missing definitions in this case).
- The `-check <check-file>` option can be used to run a set of `jitlink-check` expressions against working memory. It is typically used in conjunction with `-noexec`, since the aim is to validate JIT’d memory rather than to run the code and `-noexec` allows us to link for any supported target architecture from the current process. In `-check` mode, `llvm-jitlink` will scan the given check-file for lines of the form `# jitlink-check: <expr>`. See examples of this usage in `llvm/test/ExecutionEngine/JITLink`.
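The scanning step in `-check` mode can be sketched as a small line filter. This is an illustration only, not the tool's code: `collectCheckExprs` is a hypothetical helper, and it does not evaluate the expressions it finds.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Collect the <expr> part of every "# jitlink-check: <expr>" line in a
// check-file. Hypothetical helper; the real tool's scanning may differ.
std::vector<std::string> collectCheckExprs(const std::string &FileText) {
  static const std::string Prefix = "# jitlink-check:";
  std::vector<std::string> Exprs;
  std::istringstream In(FileText);
  std::string Line;
  while (std::getline(In, Line)) {
    std::size_t Pos = Line.find(Prefix);
    if (Pos == std::string::npos)
      continue; // not a check line
    std::string Expr = Line.substr(Pos + Prefix.size());
    // Trim surrounding whitespace; skip lines with an empty expression.
    std::size_t B = Expr.find_first_not_of(" \t\r");
    if (B == std::string::npos)
      continue;
    std::size_t E = Expr.find_last_not_of(" \t\r");
    assert(E != std::string::npos && E >= B);
    Exprs.push_back(Expr.substr(B, E - B + 1));
  }
  return Exprs;
}
```

Because the check lines begin with `#`, they can be written as comments in an assembly test input, so the test input and its checks can share one file.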
Remote execution via llvm-jitlink-executor¶
By default `llvm-jitlink` will link the given objects into its own process, but this can be overridden by two options:
- The `-oop-executor[=/path/to/executor]` option tells `llvm-jitlink` to execute the given executor (which defaults to `llvm-jitlink-executor`) and communicate with it via file descriptors which it passes to the executor as the first argument with the format `filedescs=<in-fd>,<out-fd>`.
- The `-oop-executor-connect=<host>:<port>` option tells `llvm-jitlink` to connect to an already running executor via TCP on the given host and port. To use this option you will need to start `llvm-jitlink-executor` manually with `listen=<host>:<port>` as the first argument.
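An executor that accepts the `filedescs=<in-fd>,<out-fd>` argument has to split it into two descriptors. The following is a minimal sketch of that parsing, assuming decimal descriptor numbers; `parseFileDescs` is a hypothetical helper and is not taken from `llvm-jitlink-executor`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Parse "filedescs=<in-fd>,<out-fd>". Returns true and sets InFD/OutFD on
// success; returns false (leaving them untouched) on malformed input.
bool parseFileDescs(const std::string &Arg, int &InFD, int &OutFD) {
  static const std::string Prefix = "filedescs=";
  if (Arg.compare(0, Prefix.size(), Prefix) != 0)
    return false;
  std::size_t Comma = Arg.find(',', Prefix.size());
  if (Comma == std::string::npos)
    return false;
  std::string In = Arg.substr(Prefix.size(), Comma - Prefix.size());
  std::string Out = Arg.substr(Comma + 1);
  // Accept 1 to 9 decimal digits so that std::stoi cannot overflow.
  auto IsSmallNumber = [](const std::string &S) {
    if (S.empty() || S.size() > 9)
      return false;
    for (char C : S)
      if (C < '0' || C > '9')
        return false;
    return true;
  };
  if (!IsSmallNumber(In) || !IsSmallNumber(Out))
    return false;
  InFD = std::stoi(In);
  OutFD = std::stoi(Out);
  assert(InFD >= 0 && OutFD >= 0);
  return true;
}
```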
Harness mode¶
The `-harness` option allows a set of input objects to be designated as a test harness, with the regular object files implicitly treated as objects to be tested. Definitions of symbols in the harness set override definitions in the test set, and external references from the harness cause automatic scope promotion of local symbols in the test set (these modifications to the usual linker rules are accomplished via an `ObjectLinkingLayer::Plugin` installed by `llvm-jitlink` when it sees the `-harness` option).
With these modifications in place we can selectively test functions in an object file by mocking those functions’ callees. For example, suppose we have an object file, `test_code.o`, compiled from the following C source (which we need not have access to):

    void irrelevant_function() { irrelevant_external(); }

    int function_to_mock(int X) {
      return /* some function of X */;
    }

    static void function_to_test() {
      ...
      int Y = function_to_mock();
      printf("Y is %i\n", Y);
    }

If we want to know how `function_to_test` behaves when we change the behavior of `function_to_mock` we can test it by writing a test harness:

    #include <stdio.h>

    void function_to_test();

    int function_to_mock(int X) {
      printf("used mock utility function\n");
      return 42;
    }

    int main(int argc, char *argv[]) {
      function_to_test();
      return 0;
    }

Under normal circumstances these objects could not be linked together: `function_to_test` is static and could not be resolved outside `test_code.o`, the two `function_to_mock` functions would result in a duplicate definition error, and `irrelevant_external` is undefined. However, using `-harness` and `-phony-externals` we can run this code with:

    % clang -c -o test_code_harness.o test_code_harness.c
    % llvm-jitlink -phony-externals test_code.o -harness test_code_harness.o
    used mock utility function
    Y is 42

The `-harness` option may be of interest to people who want to perform some very late testing on build products to verify that compiled code behaves as expected. On basic C test cases this is relatively straightforward. Mocks for more complicated languages (e.g. C++) are much trickier: Any code involving classes tends to have a lot of non-trivial surface area (e.g. vtables) that would require great care to mock.
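The two rule changes that `-harness` introduces (harness definitions override test definitions, and harness references promote local test symbols) can be modelled with a toy resolver. This is a sketch of the rules as described in this section only, not of the plugin that `llvm-jitlink` installs; `Origin`, `ToyLinkInputs`, and `resolve` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Where a cross-object reference ends up binding in the toy model.
enum class Origin { Undefined, TestObject, Harness };

// Toy description of the link inputs; NOT a real JITLink data structure.
struct ToyLinkInputs {
  std::set<std::string> HarnessDefs;      // symbols defined by the harness
  std::set<std::string> HarnessExternals; // symbols the harness refers to
  std::map<std::string, bool> TestDefs;   // test symbol -> is it local?
};

// Decide which definition a reference to Name binds to.
Origin resolve(const ToyLinkInputs &In, const std::string &Name) {
  // Rule 1: harness definitions override definitions in the test set.
  if (In.HarnessDefs.count(Name))
    return Origin::Harness;
  auto It = In.TestDefs.find(Name);
  if (It == In.TestDefs.end())
    return Origin::Undefined;
  bool IsLocal = It->second;
  // Rule 2: a local test symbol becomes visible only when the harness
  // references it (automatic scope promotion).
  if (IsLocal && !In.HarnessExternals.count(Name))
    return Origin::Undefined;
  return Origin::TestObject;
}
```

With the example above, `function_to_mock` binds to the harness definition, the static `function_to_test` is promoted because the harness calls it, and `irrelevant_external` stays undefined (which is why `-phony-externals` is needed).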
Tips for JITLink backend developers¶
- Make liberal use of assert and `llvm::Error`. Do not assume that the input object is well formed: Return any errors produced by libObject (or your own object parsing code) and validate as you construct. Think carefully about the distinction between contract (which should be validated with asserts and llvm_unreachable) and environmental errors (which should generate `llvm::Error` instances).
- Don’t assume you’re linking in-process. Use libSupport’s sized, endian-specific types when reading/writing content in the `LinkGraph`.
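The reason for the second tip is that the JIT process and the executor may differ in byte order, so content must be read and written with an explicit endianness rather than through host-native loads and stores. The sketch below shows the idea in plain C++; `readLE32` and `writeLE32` are hypothetical stand-ins written for this example, and in LLVM code you would reach for the facilities in `llvm/Support/Endian.h` instead.

```cpp
#include <cassert>
#include <cstdint>

// Read a 32-bit little-endian value regardless of host byte order.
std::uint32_t readLE32(const unsigned char *P) {
  assert(P && "null content pointer");
  return std::uint32_t(P[0]) | (std::uint32_t(P[1]) << 8) |
         (std::uint32_t(P[2]) << 16) | (std::uint32_t(P[3]) << 24);
}

// Write a 32-bit value in little-endian order regardless of host byte order.
void writeLE32(unsigned char *P, std::uint32_t V) {
  assert(P && "null content pointer");
  P[0] = static_cast<unsigned char>(V);
  P[1] = static_cast<unsigned char>(V >> 8);
  P[2] = static_cast<unsigned char>(V >> 16);
  P[3] = static_cast<unsigned char>(V >> 24);
}
```

A relocation fixup written this way produces the same bytes whether the linker runs on a little-endian or a big-endian host.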
As a “minimum viable” JITLink wrapper, the `llvm-jitlink` tool is an invaluable resource for developers bringing in a new JITLink backend. A standard workflow is to start by throwing an unsupported object at the tool and seeing what error is returned, then fixing that (you can often make a reasonable guess at what should be done based on existing code for other formats or architectures).
In debug builds of LLVM, the `-debug-only=jitlink` option dumps logs from the JITLink library during the link process. These can be useful for spotting some bugs at a glance. The `-debug-only=llvm_jitlink` option dumps logs from the `llvm-jitlink` tool, which can be useful for debugging both testcases (it is often less verbose than `-debug-only=jitlink`) and the tool itself.
The `-oop-executor` and `-oop-executor-connect` options are helpful for testing handling of cross-process and cross-architecture use cases.
Roadmap¶
JITLink is under active development. Work so far has focused on the MachO implementation. In LLVM 12 there is limited support for ELF on x86-64.
Major outstanding projects include:
- Refactor architecture support to maximize sharing across formats.
  All formats should be able to share the bulk of the architecture specific code (especially relocations) for each supported architecture.
- Refactor ELF link graph construction.
  ELF’s link graph construction is currently implemented in the `ELF_x86_64.cpp` file, and tied to the x86-64 relocation parsing code. The bulk of the code is generic and should be split into an ELFLinkGraphBuilder base class along the same lines as the existing generic MachOLinkGraphBuilder.
- Implement support for arm32.
- Implement support for other new architectures.
JITLink Availability and Feature Status¶
The following table describes the status of the JITLink backends for various format / architecture combinations (as of July 2023).
Support levels:
- None: No backend. JITLink will return an “architecture not supported” error. Represented by empty cells in the table below.
- Skeleton: A backend exists, but does not support commonly used relocations. Even simple programs are likely to trigger an “unsupported relocation” error. Backends in this state may be easy to improve by implementing new relocations. Consider getting involved!
- Basic: The backend supports simple programs, but isn’t ready for general use yet.
- Usable: The backend is usable for general use for at least one code and relocation model.
- Good: The backend supports almost all relocations. Advanced features like native thread local storage may not be available yet.
- Complete: The backend supports all relocations and object format features.
Architecture | ELF | COFF | MachO |
---|---|---|---|
arm32 | Skeleton | | |
arm64 | Usable | | Good |
LoongArch | Good | | |
PowerPC 64 | Usable | | |
RISC-V | Good | | |
x86-32 | Basic | | |
x86-64 | Good | Usable | Good |
See `llvm/examples/OrcV2Examples/LLJITWithObjectLinkingLayerPlugin` for a full worked example.
If not for hidden scoped symbols we could eliminate the `JITLinkDylib*` argument to `JITLinkMemoryManager::allocate` and treat every object as a separate simulated dylib for the purposes of memory layout. Hidden symbols break this by generating in-range accesses to external symbols, requiring the access and symbol to be allocated within range of one another. That said, providing a pre-reserved address range pool for each simulated dylib guarantees that the relaxation optimizations will kick in for all intra-dylib references, which is good for performance (at the cost of whatever overhead is introduced by reserving the address-range up-front).