Whole Program LLVM: wllvm ported to go

SRI-CSL/gllvm

Whole Program LLVM in Go

TL;DR: A drop-in replacement for wllvm that builds the bitcode in parallel and is faster. A comparison between the two tools can be gleaned from building the Linux kernel.

Quick Start Comparison Table

wllvm command/env variable | gllvm command/env variable
---------------------------+---------------------------
wllvm                      | gclang
wllvm++                    | gclang++
wfortran                   | gflang
extract-bc                 | get-bc
wllvm-sanity-checker       | gsanity-check
LLVM_COMPILER_PATH         | LLVM_COMPILER_PATH
LLVM_CC_NAME ...           | LLVM_CC_NAME ...
                           | LLVM_F_NAME
WLLVM_CONFIGURE_ONLY       | WLLVM_CONFIGURE_ONLY
WLLVM_OUTPUT_LEVEL         | WLLVM_OUTPUT_LEVEL
WLLVM_OUTPUT_FILE          | WLLVM_OUTPUT_FILE
LLVM_COMPILER              | not supported (clang only)
LLVM_GCC_PREFIX            | not supported (clang only)
LLVM_DRAGONEGG_PLUGIN      | not supported (clang only)
LLVM_LINK_FLAGS            | LLVM_LINK_FLAGS

This project, gllvm, provides tools for building whole-program (or whole-library) LLVM bitcode files from an unmodified C or C++ source package. It currently runs on *nix platforms such as Linux, FreeBSD, and Mac OS X. It is a Go port of wllvm.

gllvm provides compiler wrappers that work in two phases. The wrappers first invoke the compiler as normal. Then, for each object file, they call a bitcode compiler to produce LLVM bitcode. The wrappers then store the location of the generated bitcode file in a dedicated section of the object file. When object files are linked together, the contents of the dedicated sections are concatenated (so we don't lose the locations of any of the constituent bitcode files). After the build completes, one can use a gllvm utility to read the contents of the dedicated section and link all of the bitcode into a single whole-program bitcode file. This utility works for both executables and native libraries.
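To see why concatenating the sections loses nothing, here is a minimal Go sketch that splits the raw bytes of such a section back into individual bitcode paths. It assumes each entry is a path terminated by a newline, which the description above does not state; splitBitcodePaths and the example paths are hypothetical and not taken from gllvm.

```go
package main

import (
	"fmt"
	"strings"
)

// splitBitcodePaths turns the raw contents of the dedicated section into a
// list of bitcode file paths.
// ASSUMPTION: every entry is a path terminated by a newline.
func splitBitcodePaths(section []byte) []string {
	var paths []string
	for _, line := range strings.Split(string(section), "\n") {
		line = strings.TrimSpace(line)
		if line != "" {
			paths = append(paths, line)
		}
	}
	return paths
}

func main() {
	// Two object files, each carrying one (made-up) bitcode path.
	a := []byte("/build/.a.o.bc\n")
	b := []byte("/build/.b.o.bc\n")

	// Linking concatenates the section contents of the inputs.
	linked := append(append([]byte{}, a...), b...)

	for _, p := range splitBitcodePaths(linked) {
		fmt.Println(p)
	}
}
```

Because the entries are simply appended, the linked artifact still lists one path per constituent object file.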

For more details see wllvm.

Prerequisites

To install gllvm you need the go language tool.

To use gllvm you need clang/clang++/flang and the llvm tools llvm-link and llvm-ar. gllvm is agnostic to the actual llvm version. gllvm also relies on standard build tools such as objcopy and ld.

Installation

To install, simply run the following, making sure to include the literal ... in the path:

go install github.com/SRI-CSL/gllvm/cmd/...@latest

This should install six binaries: gclang, gclang++, gflang, get-bc, gparse, and gsanity-check in the $GOPATH/bin directory.

Usage

gclang and gclang++ are the wrappers used to compile C and C++.
gflang is the wrapper used to compile Fortran. get-bc is used for extracting the bitcode from a build product (either an object file, executable, library or archive). gsanity-check can be used for detecting configuration errors. gparse can be used to examine how gllvm parses compiler/linker lines.

Here is a simple example. Assuming that clang is in your PATH, you can build bitcode for pkg-config as follows:

tar xf pkg-config-0.26.tar.gz
cd pkg-config-0.26
CC=gclang ./configure
make

This should produce the executable pkg-config. To extract the bitcode:

get-bc pkg-config

which will produce the bitcode module pkg-config.bc. For more on this example see here.

Advanced Configuration

If clang and the llvm tools are not in your PATH, you will need to set some environment variables.

  • LLVM_COMPILER_PATH can be set to the absolute path of the directory that contains the compiler and the other LLVM tools to be used.

  • LLVM_CC_NAME can be set if your clang compiler is not called clang but something like clang-3.7. Similarly, LLVM_CXX_NAME and LLVM_F_NAME can be used to describe what the C++ and Fortran compilers are called, respectively. We also pay attention to the environment variables LLVM_LINK_NAME and LLVM_AR_NAME in an analogous way.
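The lookup these variables describe can be sketched in a few lines of Go. This is only an illustration of the rule as stated above, not gllvm's code: resolveTool is a hypothetical helper, and the fallback names (clang, clang++, flang, llvm-link, llvm-ar) are the tool names mentioned under Prerequisites.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveTool returns the command to run for one tool.
// nameVar is the variable that may override the tool's name (for example
// LLVM_CC_NAME); defaultName is used when it is unset. If
// LLVM_COMPILER_PATH is set, the name is looked up in that directory.
// getenv is passed in so the logic can be exercised without touching the
// real environment.
func resolveTool(getenv func(string) string, nameVar, defaultName string) string {
	name := getenv(nameVar)
	if name == "" {
		name = defaultName
	}
	if dir := getenv("LLVM_COMPILER_PATH"); dir != "" {
		return filepath.Join(dir, name)
	}
	return name // rely on PATH
}

func main() {
	fmt.Println("C compiler:      ", resolveTool(os.Getenv, "LLVM_CC_NAME", "clang"))
	fmt.Println("C++ compiler:    ", resolveTool(os.Getenv, "LLVM_CXX_NAME", "clang++"))
	fmt.Println("Fortran compiler:", resolveTool(os.Getenv, "LLVM_F_NAME", "flang"))
	fmt.Println("Bitcode linker:  ", resolveTool(os.Getenv, "LLVM_LINK_NAME", "llvm-link"))
	fmt.Println("Archiver:        ", resolveTool(os.Getenv, "LLVM_AR_NAME", "llvm-ar"))
}
```

Running it with, say, LLVM_COMPILER_PATH and LLVM_CC_NAME exported shows which binaries such a lookup would pick; gsanity-check remains the authoritative way to see what gllvm itself resolves.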

Another useful, and sometimes necessary, environment variable is WLLVM_CONFIGURE_ONLY.

  • WLLVM_CONFIGURE_ONLY can be set to anything. If it is set, gclang and gclang++ behave like a normal C or C++ compiler. They do not produce bitcode. Setting WLLVM_CONFIGURE_ONLY may prevent configuration errors caused by the unexpected production of hidden bitcode files. It is sometimes required when configuring a build. For example:
    WLLVM_CONFIGURE_ONLY=1 CC=gclang ./configure
    make

Extracting the Bitcode

The get-bc tool is used to extract the bitcode from a build artifact, such as an executable, object file, thin archive, archive, or library. In the simplest use case, as seen above, one simply does:

get-bc -o <name of bitcode file> <path to executable>

This will produce the desired bitcode file. The situation is similar for an object file. For an archive or library, there is a choice as to whether you produce a bitcode module or a bitcode archive. This choice is made by using the -b switch.

Another useful switch is -m, which, in addition to producing the bitcode, also produces a manifest of the bitcode files that made up the final product. As is typical,

get-bc -h

will list all the command-line switches. Since we use the golang flag module, the switches must precede the artifact path.
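The ordering requirement comes from how Go's standard flag package works: parsing stops at the first argument that is not a flag. The sketch below shows this with a stand-in flag set that has a single -o switch; parseArgs is a hypothetical helper and not the real get-bc command line.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseArgs parses a get-bc-like command line with one -o switch and
// returns the value of -o plus whatever was left unparsed.
func parseArgs(args []string) (string, []string, error) {
	fs := flag.NewFlagSet("get-bc-sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet on errors
	out := fs.String("o", "", "name of the bitcode file")
	if err := fs.Parse(args); err != nil {
		return "", nil, err
	}
	return *out, fs.Args(), nil
}

func main() {
	// Switch first: -o is recognised and the artifact is what remains.
	out, rest, _ := parseArgs([]string{"-o", "prog.bc", "prog"})
	fmt.Printf("output=%q remaining=%q\n", out, rest)

	// Artifact first: parsing stops at "prog", so -o is never seen.
	out, rest, _ = parseArgs([]string{"prog", "-o", "prog.bc"})
	fmt.Printf("output=%q remaining=%q\n", out, rest)
}
```

In the second call the -o switch ends up among the leftover arguments instead of being applied, which is why the switches have to come before the artifact path.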

Preserving bitcode files in a store

Sometimes, because of pathological build systems, it can be useful to preserve the bitcode files produced in a build, either to prevent their deletion or to retrieve them later. If the environment variable WLLVM_BC_STORE is set to the absolute path of an existing directory, then gllvm will copy each produced bitcode file into that directory. The name of the copied bitcode file is the hash of the path to the original bitcode file. For convenience, when using both the manifest feature of get-bc and the store, the manifest will contain both the original path and the store path.
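As a rough picture of the naming scheme, the Go sketch below derives a store path from an original bitcode path. The text above only says that the name is a hash of the path; the choice of SHA-256 rendered as hex is an assumption here, and storePath and the example paths are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
)

// storePath returns where a bitcode file would be copied inside the store.
// ASSUMPTION: the hash is SHA-256 of the original path, written as hex.
func storePath(storeDir, bcPath string) string {
	sum := sha256.Sum256([]byte(bcPath))
	return filepath.Join(storeDir, hex.EncodeToString(sum[:]))
}

func main() {
	// Made-up paths, for illustration only.
	fmt.Println(storePath("/tmp/bc-store", "/build/src/.main.o.bc"))
	fmt.Println(storePath("/tmp/bc-store", "/build/lib/.util.o.bc"))
}
```

Hashing the full path rather than the file name means two bitcode files with the same base name in different directories do not collide in the store.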

Debugging

The gllvm tools can show various levels of output to aid with debugging. To show this output set the WLLVM_OUTPUT_LEVEL environment variable to one of the following levels:

  • ERROR
  • WARNING
  • AUDIT
  • INFO
  • DEBUG

For example:

    export WLLVM_OUTPUT_LEVEL=DEBUG

Output will be directed to the standard error stream, unless you specify the path of a logfile via the WLLVM_OUTPUT_FILE environment variable. The AUDIT level, new in 2022, logs only the calls to the compiler, and indicates whether each call is compiling or linking, the compiler used, and the arguments provided.

For example:

    export WLLVM_OUTPUT_FILE=/tmp/gllvm.log

Dragons Begone

gllvm does not support the dragonegg plugin.

Sanity Checking

Too many environment variables? Try doing a sanity check:

gsanity-check

It might point out what is wrong.

Under the hood

Both the wllvm and gllvm toolsets do much the same thing, but the way they do it is slightly different. The gllvm toolset's code base is written in golang, and is largely derived from wllvm's python codebase.

Both generate object files and bitcode files using the compiler. wllvm can use gcc and dragonegg; gllvm can only use clang. The gllvm toolset does these two tasks in parallel, while wllvm does them sequentially. This, together with the slowness of python's fork exec-ing and its interpreted nature, accounts for the large efficiency gap between the two toolsets.
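To make the parallel versus sequential distinction concrete, here is a minimal Go sketch that runs two build steps concurrently and waits for both. The two callbacks are placeholders standing in for the native compile and the bitcode compile; runBoth is a hypothetical name and this is not gllvm's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// runBoth starts the object-file step and the bitcode step at the same
// time and returns once both have finished, with one error per step.
func runBoth(buildObject, buildBitcode func() error) (error, error) {
	var objErr, bcErr error
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		objErr = buildObject()
	}()
	go func() {
		defer wg.Done()
		bcErr = buildBitcode()
	}()
	wg.Wait() // both results are safe to read after this point
	return objErr, bcErr
}

func main() {
	objErr, bcErr := runBoth(
		func() error { fmt.Println("compiling object file"); return nil },
		func() error { fmt.Println("compiling bitcode"); return nil },
	)
	fmt.Println("object error:", objErr, "bitcode error:", bcErr)
}
```

A sequential wrapper would pay for the two compiler invocations one after the other; here the wall-clock cost is roughly that of the slower step.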

Both inject the path of the bitcode version of the .o file into a dedicated segment of the .o file itself. This segment is the same across toolsets, so extracting the bitcode can be done by the appropriate tool in either toolset. On *nix both toolsets use objcopy to add the segment, while on OS X they use ld.

When the object files are linked into the resulting library or executable, the bitcode path segments are appended, so the resulting binary contains the paths of all the bitcode files that constitute the binary. To extract the sections the gllvm toolset uses the golang packages "debug/elf" and "debug/macho", while the wllvm toolset uses objdump on *nix, and otool on OS X.
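For the ELF case, reading such a section with "debug/elf" takes only a few calls. The sketch below is a minimal illustration rather than gllvm's implementation; it assumes the section is named .llvm_bc, which the text above does not state, and readSection is a hypothetical helper.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// ASSUMPTION: the name of the dedicated section; adjust it if your
// binaries use a different one.
const sectionName = ".llvm_bc"

// readSection returns the raw contents of the named section of an ELF file.
func readSection(path, name string) ([]byte, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	sec := f.Section(name)
	if sec == nil {
		return nil, fmt.Errorf("%s: no section named %q", path, name)
	}
	return sec.Data()
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: readsection <elf-file>")
		return
	}
	data, err := readSection(os.Args[1], sectionName)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The section holds the recorded bitcode paths.
	fmt.Printf("%s", data)
}
```

Pointing it at an executable built with gclang should print the recorded bitcode paths if the section name assumption holds; on a binary built without the wrappers it reports that the section is missing.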

Both tools then use llvm-link or llvm-ar to combine the bitcode files into the desired form.

Customization under the hood.

You can specify the exact version of objcopy and ld that gllvm uses to manipulate the artifacts by setting the GLLVM_OBJCOPY and GLLVM_LD environment variables. For more details of what's under the gllvm hood, try

gsanity-check -e

Customizing the BitCode Generation (e.g. LTO)

In some situations it is desirable to pass certain flags to clang in the step that produces the bitcode. This can be fulfilled by setting the LLVM_BITCODE_GENERATION_FLAGS environment variable to the desired flags, for example "-flto -fwhole-program-vtables".

In other situations it is desirable to pass certain flags to llvm-link in the step that merges multiple individual bitcode files together (i.e., within get-bc). This can be fulfilled by setting the LLVM_LINK_FLAGS environment variable to the desired flags, for example "-internalize -only-needed".

Beware of link time optimization.

If the package you are building happens to take advantage of recent clang developments such as link time optimization (indicated by the presence of compiler flag -flto), then your build is unlikely to produce anything that get-bc will work on. This is to be expected. When working under these flags, the compiler actually produces object files that are bitcode, so your only recourse is to try to save these object files and retrieve them yourself. This can be done by setting LTO_LINKING_FLAGS to something like "-g -Wl,-plugin-opt=save-temps", which will be appended to the flags at link time. This will at least preserve the bitcode files, even if get-bc will not be able to retrieve them for you.

Cross-compilation notes

When cross-compiling a project (i.e. you pass the --target= or -target flag to the compiler), you'll need to set the GLLVM_OBJCOPY variable to either:

  • llvm-objcopy to use LLVM's objcopy, which naturally supports all targets that clang does.
  • YOUR-TARGET-TRIPLE-objcopy to use GNU's objcopy for that target, since the plain objcopy only supports the native architecture.

Example:

# test program
echo 'int main() { return 0; }' > a.c

clang --target=aarch64-linux-gnu a.c # works
gclang --target=aarch64-linux-gnu a.c # breaks
GLLVM_OBJCOPY=llvm-objcopy gclang --target=aarch64-linux-gnu a.c # works
GLLVM_OBJCOPY=aarch64-linux-gnu-objcopy gclang --target=aarch64-linux-gnu a.c # works if you have GNU's arm64 toolchain

Developer tools

Debugging usually boils down to looking in the logs, maybe adding a print statement or two. There is an additional executable, gparse, that gets installed along with gclang, gclang++, gflang, get-bc and gsanity-check. gparse takes the command line arguments to the compiler, and outputs how it parsed them. This can sometimes be helpful.

License

gllvm is released under a BSD license. See the file LICENSE for details.


This material is based upon work supported by the National Science Foundation under Grant ACI-1440800. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

