arnetheduck/nlvm

LLVM-based compiler for the Nim language

Introduction

nlvm (the nim-level virtual machine?) is an LLVM-based compiler for the Nim programming language.

From Nim's point of view, it's a backend just like C or JavaScript - from LLVM's point of view, it's a language frontend that emits IR.

Questions, patches, improvement suggestions and reviews welcome. When you find bugs, feel free to fix them as well :)

Fork and enjoy!

Jacek Sieka (arnetheduck on gmail point com)

Features

nlvm works as a drop-in replacement for nim with the following notable differences:

Fast compile times - no intermediate C compiler step
DWARF ("zero-cost") exception handling
High-quality gdb/lldb debug information with source stepping, type information etc
Smart code generation and optimisation
- LTO and whole-program optimisation out-of-the-box
- compiler-intrinsic guided optimisation for overflow checking, memory operations, exception handling
- heap allocation elision
- native constant initialization
Native cross compiler, including wasm32 support with no extra tooling
Native integrated fast linker (lld)
Just-in-time execution and REPL (nlvm r) using the LLVM ORCv2 JIT
Built-in cross-compiler

Most things from nim work just fine (see the porting guide below!):

the same standard library is used
similar command line options are supported (just change nim to nlvm!)
C header files are not used - the declaration in the .nim file needs to be accurate
If your program has {.compile.} dependencies, these work as usual but require a corresponding compiler to be installed (ie clang, gcc)

Test coverage is not too bad either:

bootstrapping and compiling itself
~95% of all upstream tests - most failures can be traced to the standard library and compiler relying on C implementation details - see skipped-tests.txt for an updated list of issues
compiling most applications
platforms with tests:
- Linux/x86_64
- Windows/x86_64
majority of the nim standard library (the rest can be fixed easily - requires upstream changes however)

How you could contribute:

work on making skipped-tests.txt smaller
improve platform support (osx should be easy, arm would be nice)
help nlvm generate better IR - optimizations, builtins, exception handling..
help upstream make std library smaller and more nlvm-compatible
send me success stories :)
leave the computer for a bit and do something real for your fellow earthlings

nlvm does not:

understand C - as a consequence, header, emit and similar pragmas are ignored - neither will the fancy importcpp/C++ features work - see the porting guide below!
support all nim compiler flags and features - do file bugs for anything useful that's missing

Installation

Binaries

Binaries are available from the GitHub releases page.

Source code

To do what I do, you will need:

A C/C++ compiler
- gcc on Linux
- clang on Windows
A cup of tea and a good book
- Compiling llvm takes about an hour the first time (then it's cached)

Start with a clone:

cd $SRC
git clone https://github.com/arnetheduck/nlvm.git --recurse-submodules
cd nlvm

We will need a few development libraries installed, mainly due to how nlvm processes library dependencies (see dynlib section below):

# Fedora
sudo dnf install pcre-devel openssl-devel sqlite-devel ninja-build cmake clang libzstd-devel

# Debian, ubuntu etc
sudo apt-get install libpcre3-dev libssl-dev libsqlite3-dev ninja-build cmake clang libzstd-dev

# MSYS2 CLANG64 (note that we need the gcc shim for clang)
pacboy -S toolchain cmake ninja gcc

Compile nlvm (if needed, this will also build nim and llvm):

make

Compile with itself and compare:

make compare

Run test suite:

make test
make stats

You can link statically to LLVM to create a stand-alone binary - this will use a more optimized version of LLVM as well, but takes longer to build:

make STATIC_LLVM=1

If you want a faster nlvm, you can also try the release build - it will be called nlvmr:

make STATIC_LLVM=1 nlvmr

When you update nlvm from git, don't forget the submodule:

git pull && git submodule update

To build a docker image, use:

make docker

To run the built nlvm docker image, use:

docker run -v $(pwd):/code/ nlvm c -r /code/test.nim

Compiling your code

On the command line, nlvm is mostly compatible with nim.

For compiling pure Nim code, you do not need a C compiler - nlvm will compile and link the code using the built-in lld linker:

cd $SRC/nlvm/Nim/examples
../../nlvm/nlvm c fizzbuzz

If you want to see the generated LLVM IR, use the -c option:

cd $SRC/nlvm/Nim/examples
../../nlvm/nlvm c -c fizzbuzz
less fizzbuzz.ll

You can then run the LLVM optimizer on it:

opt -Os fizzbuzz.ll | llvm-dis

... or compile it to assembly (.s):

llc fizzbuzz.ll
less fizzbuzz.s

Apart from the code of your .nim files, the compiler will also mix in the compiler runtime library in nlvm-lib/.

Pipeline

Generally, the nim compiler pipeline looks something like this:

nim --> c files --> IR --> object files --> linker --> executable

In nlvm, we remove one step and bunch all the code together:

nim --> single IR file --> built-in LTO linker --> executable

Going straight to the IR means it's possible to express nim constructs more clearly, allowing llvm to understand the code better and thus do a better job at optimization. It also helps keep compile times down, because the c-to-IR step can be avoided.

The practical effect of generating a single object file is similar to clang -fwhole-program -flto - it is a bit more expensive in terms of memory, but results in slightly smaller and faster binaries. Notably, the IR-to-machine-code step, including any optimizations, is repeated in full for each recompile.

Porting guide

dynlib

nim uses a runtime dynamic library loading scheme to gain access to shared libraries. When compiling, no linking is done - instead, when running your application, nim will try to open anything the user has installed.

nlvm does not support the {.dynlib.} pragma - instead you can use {.passL.} using normal system linking.

# works with `nim`
proc f() {.importc, dynlib: "mylib" .}

# works with both `nim` and `nlvm`
{.passL: "-lmylib".}
proc f() {.importc .}
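
When porting an existing {.dynlib.} declaration, the library name has to be turned into a linker flag. The mapping can be sketched in shell as below. Note that passl_flag is a hypothetical helper, not part of nlvm, and it assumes the conventional lib<name>.so[.version] naming used on Linux:

```shell
# Hypothetical helper (not part of nlvm): turn a shared-library file name
# into the -l flag you would put in a {.passL.} pragma.
passl_flag() (
  name=${1##*/}       # drop any directory part
  name=${name#lib}    # drop the conventional "lib" prefix
  name=${name%%.so*}  # drop ".so" and any version suffix
  printf '%s\n' "-l$name"
)

passl_flag libmylib.so.1   # prints -lmylib
```

The resulting flag only helps if the system linker can actually find the library, which is why the development packages listed under Installation are needed at build time.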

{.header.}

When nim compiles code, it will generate c code which may include other c code, from headers or directly via emit statements. This means nim has direct access to symbols declared in the c file, which can be both a feature and a problem.

In nlvm, {.header.} directives are ignored - nlvm looks strictly at the signature of the declaration, meaning the declaration must exactly match the c header file or subtle ABI issues and crashes ensue!

# When `nim` encounters this, it will emit `jmp_buf` in the `c` code without
# knowing the true size of the type, letting the `c` compiler determine it
# instead.
type C_JmpBuf {.importc: "jmp_buf", header: "<setjmp.h>".} = object

# nlvm instead ignores the `header` directive completely and will use the
# declaration as written. Failure to correctly declare the type will result
# in crashes and subtle bugs - memory will be overwritten or fields will be
# read from the wrong offsets.
#
# The following works with both `nim` and `nlvm`, but requires you to be
# careful to match the binary size and layout exactly (note how `bycopy`
# sometimes helps to further nail down the ABI):

when defined(linux) and defined(amd64):
  type
    C_JmpBuf {.importc: "jmp_buf", bycopy.} = object
      abi: array[200 div sizeof(clong), clong]

# In `nim`, `C` constant defines are often imported using the following trick,
# which makes `nim` emit the right `C` code so that the value from the header
# can be read (no writing of course, even though it's a `var`!)
#
# assuming a c header with: `#define RTLD_NOW 2`

# works for nim:
var RTLD_NOW* {.importc: "RTLD_NOW", header: "<dlfcn.h>".}: cint

# both nlvm and nim (note how these values often can be platform-specific):
when defined(linux) and defined(amd64):
  const RTLD_NOW* = cint(2)
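
The abi field above is sized by dividing the byte size of the C type by the size of clong. That arithmetic can be sketched in shell as below. Note that abi_array_len is a hypothetical helper, and the figures 200 and 8 are the Linux/amd64 values implied by the example above (8 being the size of clong on that platform), so treat them as assumptions for any other target:

```shell
# Hypothetical helper: how many elements of elem_size bytes are needed to
# cover a C type of type_size bytes. Fails if the sizes do not divide evenly,
# since the opaque array would then be too small for the real type.
abi_array_len() (
  type_size=$1
  elem_size=$2
  if [ $((type_size % elem_size)) -ne 0 ]; then
    echo "size $type_size is not a multiple of $elem_size" >&2
    exit 1
  fi
  echo $((type_size / elem_size))
)

abi_array_len 200 8   # prints 25
```

The real sizes should come from the target platform's C headers, not from a guess, because nlvm will not consult the header to correct a wrong declaration.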

{.emit.}

To deal with emit, the recommendation is to put the emitted code in a C file and {.compile.} it.

proc myEmittedFunction() {.importc.}
{.compile: "myemits.c".}

void myEmittedFunction() { /* ... */ }

{.asm.}

Similar to {.emit.}, {.asm.} functions must be moved to a separate file and included in the compilation with {.compile.} - this works both with .S and .c files.

Cross compiler

nlvm can be used to cross-compile code for a different platform, for example to create Windows executables on a Linux machine.

Prerequisites

For cross-compilation, a clang-based environment for that platform must first be set up - the environment consists of:

a sysroot - the basic libraries needed to create executables for the platform
- for Windows, this is llvm-mingw
- It can be obtained from the environment you're targeting.
The compiler runtime for that target (compiler-rt or libgcc)
- provided by llvm-mingw
clang - for finding libraries in the sysroot and dealing with {.compile.}
- a cross-compilation version of gcc may also work, though this hasn't been tested

Compiling

Once the sysroot is set up, compiling is as easy as selecting an alternative OS / CPU using the standard Nim flags, --os: and --cpu:.
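
As a rough illustration of how the two flags combine, the sketch below prints (but does not run) a command line for a given target. Note that nlvm_cross_cmd is a hypothetical helper, and amd64 is assumed here as the CPU name because it is the name used in the `defined(amd64)` checks elsewhere in this document:

```shell
# Hypothetical helper: print (rather than run) an nlvm command line that
# targets another OS / CPU using the standard Nim flags.
nlvm_cross_cmd() (
  target_os=$1
  target_cpu=$2
  source_file=$3
  printf 'nlvm c --os:%s --cpu:%s %s\n' "$target_os" "$target_cpu" "$source_file"
)

nlvm_cross_cmd windows amd64 test.nim
# prints: nlvm c --os:windows --cpu:amd64 test.nim
```

Printing the command first makes it easy to check the target selection before the sysroot and clang environment are involved.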

Windows

A helper script exists to set up llvm-mingw:

./dl-llvm-mingw.sh

# Set up $PATH to include the `clang` compiler that comes with llvm-mingw
. env.sh

# Assuming we're on Linux, compile for Windows and link dependencies statically
# to create a (mostly) stand-alone binary:
nlvm c --os:windows --passl:-static test.nim

# You can also use a target triple
nlvm c --nlvm.triple=x86_64-w64-mingw32 --passl:-static Nim/examples/fizzbuzz.nim

# Run the compiled program via wine
wine test.exe

wasm32

Use --cpu:wasm32 --os:standalone --gc:none to compile Nim to (barebones) WASM.

You will need to provide a runtime (ie WASI) and use manual memory allocation as the garbage collector hasn't yet been ported to WASM and the Nim standard library lacks WASM / WASI support.

Apart from WASI, an implementation of panicoverride.nim also needs to be provided - here's one that discards all panics:

# panicoverride.nim
proc rawoutput(s: string) = discard
proc panic(s: string) {.noreturn.} = discard

After placing the above code in your project folder, you can compile .nim code to wasm32:

# myfile.nim
proc adder*(v: int): int {.exportc.} =
  v + 4

nlvm c --cpu:wasm32 --os:standalone --gc:none --passl:-Wl,--no-entry myfile.nim
wasm2wat -l myfile.wasm

Most WASM-compiled code ends up needing WASM extensions - in particular, the bulk memory extension is needed to process data.

Extensions are enabled by passing --passc:-mattr=+feature,+feature2, for example:

nlvm c --cpu:wasm32 --os:standalone --gc:none --passl:-Wl,--no-entry --passc:-mattr=+bulk-memory

Passing --passc:-mattr=help will print available features (only works while compiling, for now!)

To use functions from the environment (with importc), compile with --passl:-Wl,--allow-undefined.
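
The -mattr option described above takes a comma-separated list in which every feature is prefixed with +. A small shell sketch of assembling it is shown below. Note that wasm_mattr is a hypothetical helper, not an nlvm feature, and bulk-memory is the only feature name taken from this document:

```shell
# Hypothetical helper: build the --passc:-mattr=... option from a list of
# WASM feature names, prefixing each with "+" and joining them with commas.
wasm_mattr() (
  attrs=""
  for feature in "$@"; do
    attrs="${attrs:+$attrs,}+$feature"
  done
  printf '%s\n' "--passc:-mattr=$attrs"
)

wasm_mattr bulk-memory   # prints --passc:-mattr=+bulk-memory
```

The printed option can then be appended to the other wasm32 flags on the nlvm command line. Called with no arguments, the helper prints an empty feature list, so it is only meaningful with at least one feature name.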

REPL / running your code

nlvm supports directly running Nim code using just-in-time compilation:

# Compile and run `myfile.nim` without creating a binary first
nlvm r myfile.nim

This mode can also be used to run code directly from the standard input:

$ nlvm r
.......................................................
>>> log2(100.0)
stdin(1, 1) Error: undeclared identifier: 'log2'
candidates (edit distance, scope distance); see '--spellSuggest':
 (2, 2): 'low' [proc declared in /home/arnetheduck/src/nlvm/Nim/lib/system.nim(1595, 6)]
...
>>> import math
.....
>>> log2(100.0)
6.643856189774724: float64

Random notes

Upstream is pinned using a submodule - nlvm relies heavily on internals that keep changing - it's unlikely that it works with any other versions, patches welcome to update it
The nim standard library likes to import C headers directly which works because the upstream nim compiler uses a C compiler underneath - ergo, large parts of the standard library don't work with nlvm.
Happy to take patches for anything, including better platform support!
For development, it's convenient to build LLVM with assertions turned on - the API is pretty unforgiving
When I started on this little project, I knew neither llvm nor Nim. Therefore, I'd especially like to thank the friendly folks at the #nim channel that never seemed to tire of my nooby questions. Also, thanks to all tutorial writers out there, on llvm, programming and other topics for providing such fine sources of copy-pa... er, inspiration!
