vnmakarov/mirPublic

NotificationsYou must be signed in to change notification settings
Fork155
Star2.4k

A lightweight JIT compiler based on MIR (Medium Internal Representation) and C11 JIT compiler and interpreter based on MIR

License

MIT license

2.4k stars 155 forks Branches Tags Activity

Star

Notifications

You must be signed in to change notification settings

Branches Tags

Folders and files

Name		Name	Last commit message	Last commit date
Latest commit History 4,162 Commits
.github/workflows		.github/workflows
adt-tests		adt-tests
c-benchmarks		c-benchmarks
c-tests		c-tests
c2mir		c2mir
llvm2mir		llvm2mir
mir-tests		mir-tests
mir-utils		mir-utils
mir2c		mir2c
.clang-format		.clang-format
.gitignore		.gitignore
.travis.yml		.travis.yml
CMakeLists.txt		CMakeLists.txt
CUSTOM-ALLOCATORS.md		CUSTOM-ALLOCATORS.md
GNUmakefile		GNUmakefile
HOW-TO-PORT-MIR.md		HOW-TO-PORT-MIR.md
INSTALL.md		INSTALL.md
LICENSE		LICENSE
MIR.md		MIR.md
README.md		README.md
check-threads.sh		check-threads.sh
csmith-c2m-gcc.sh		csmith-c2m-gcc.sh
csmith-c2m.sh		csmith-c2m.sh
mir-aarch64.c		mir-aarch64.c
mir-aarch64.h		mir-aarch64.h
mir-alloc-default.c		mir-alloc-default.c
mir-alloc.h		mir-alloc.h
mir-bin-driver.c		mir-bin-driver.c
mir-bin-run.c		mir-bin-run.c
mir-bitmap.h		mir-bitmap.h
mir-code-alloc-default.c		mir-code-alloc-default.c
mir-code-alloc.h		mir-code-alloc.h
mir-dlist.h		mir-dlist.h
mir-gen-aarch64.c		mir-gen-aarch64.c
mir-gen-ppc64.c		mir-gen-ppc64.c
mir-gen-riscv64.c		mir-gen-riscv64.c
mir-gen-s390x.c		mir-gen-s390x.c
mir-gen-stub.c		mir-gen-stub.c
mir-gen-x86_64.c		mir-gen-x86_64.c
mir-gen.c		mir-gen.c
mir-gen.h		mir-gen.h
mir-gen.svg		mir-gen.svg
mir-hash.h		mir-hash.h
mir-htab.h		mir-htab.h
mir-interp.c		mir-interp.c
mir-ppc64.c		mir-ppc64.c
mir-ppc64.h		mir-ppc64.h
mir-reduce.h		mir-reduce.h
mir-riscv64.c		mir-riscv64.c
mir-riscv64.h		mir-riscv64.h
mir-s390x.c		mir-s390x.c
mir-s390x.h		mir-s390x.h
mir-varr.h		mir-varr.h
mir-x86_64.c		mir-x86_64.c
mir-x86_64.h		mir-x86_64.h
mir.c		mir.c
mir.h		mir.h
mir3.svg		mir3.svg
mirall.svg		mirall.svg
real-time.h		real-time.h
sieve.c		sieve.c

Repository files navigation

MIR Project

MIR meansMediumInternalRepresentation
MIR project goal is to provide a basis to implement fast and lightweight JITs
Plans to try MIR light-weight JIT first for CRuby or/and MRuby implementation
Motivations for the project can be found inthis blog post
C2MIR compiler description can be found inthis blog post
Future of code specialization in MIR for dynamic language JITs can be found inthis blog post

Disclaimer

There is absolutely no warranty that the code will work for any tests except ones given here and on platformsother than x86_64 Linux/OSX, aarch64 Linux/OSX(Apple M1), and ppc64le/s390x/riscv64 Linux

MIR

MIR is strongly typed IR
MIR can represent machine 32-bit and 64-bit insns of different architectures
MIR.md contains detail description of MIR and its API.Here is a brief MIR description:
MIR consists ofmodules
- Each module can containfunctions and some declarations and data
- Each function hassignature (parameters and return types),local variables(including function arguments) andinstructions
  - Each local variable hastype which can be only 64-bit integer, float, double, or long doubleand can be bound to a particular target machine register
  - Each instruction hasopcode andoperands
    - Operand can be a local variable(or a function argument),immediate,memory,label, orreference
      - Immediate operand can be 64-bit integer, float, double, or long double value
  - Memory operand has atype,displacement,base andindex integer local variable,and integer constant as ascale for the index
    - Memory type can be 8-, 16-, 32- and 64-bit signed or unsigned integer type,float type, double, or long double type
      - When integer memory value is used it is expanded with sign or zero promotingto 64-bit integer value first
  - Label operand has name and used for control flow instructions
  - Reference operand is used to refer to functions and declarations in the current module,in other MIR modules, or for C external functions or declarations
- opcode describes what the instruction does
- There areconversion instructions for conversion between different32- and 64-bit signed and unsigned values, float, double, and long double values
- There arearithmetic instructions (addition, subtraction, multiplication, division,modulo) working on 32- and 64-bit signed and unsigned values, float, double, and long double values
- There arelogical instructions (and, or, xor, different shifts) working on32- and 64-bit signed and unsigned values
- There arecomparison instructions working on 32- and 64-bitsigned and unsigned values, float, double, and long double values
- There arelocal variable address instructions to get address of local variable
- There arebranch insns (unconditional jump, and jump on zero or non-zero value)which take a label as one their operand
- There arecombined comparison and branch instructions taking a label as one operandand two 32- and 64-bit signed and unsigned values, float, double, and long double values
- There isswitch instruction to jump to a label from labels given as operandsdepending on index given as the first operand
- There islabel address instruction to get a label addressandunconditional indirect jump instruction whose operand contains previously taken label address
- There arefunction and procedural call instructions
- There arereturn instructions optionally returning 32- and 64-bitinteger values, float, double, and long double values
- There arespecialized light-weight call and return instructions can be used forfast switching from threaded interpreter to JITted code and vice verse
- There areproperty instructions to generated specialized machine code when lazy basic block versioning is used

MIR Example

You can create MIR throughAPI consisting of functions for creation of modules,functions, instructions, operands etc
You can also create MIR from MIRbinary ortext file
The best way to get a feel about MIR is to use textual MIR representation
Example of Eratosthenes sieve on C

#defineSize 819000intsieve (intN) {int64_ti,k,prime,count,n;charflags[Size];for (n=0;n<N;n++) {count=0;for (i=0;i<Size;i++)flags[i]=1;for (i=0;i<Size;i++)if (flags[i]) {prime=i+i+3;for (k=i+prime;k<Size;k+=prime)flags[k]=0;count++;      }  }returncount;}voidex100 (void) {printf ("sieve (100) = %d\", sieve (100));}

Example of MIR textual file for the same function:

m_sieve:moduleexport sievesieve:func i32, i32:Nlocal i64:iter, i64:count, i64:i, i64:k, i64:prime, i64:temp, i64:flagsalloca flags, 819000mov iter, 0loop:bge fin, iter, Nmov count, 0;  mov i, 0loop2:bge fin2, i, 819000mov u8:(flags, i), 1;  add i, i, 1jmp loop2fin2:mov i, 0loop3:bge fin3, i, 819000beq cont3, u8:(flags,i), 0add temp, i, i;  add prime, temp, 3;  add k, i, primeloop4:bge fin4, k, 819000mov u8:(flags, k), 0;  add k, k, primejmp loop4fin4:add count, count, 1cont3:add i, i, 1jmp loop3fin3:add iter, iter, 1jmp loopfin:ret countendfuncendmodulem_ex100:moduleformat:string "sieve (10) = %d\n"p_printf:proto p:fmt, i32:resultp_sieve:proto i32, i32:iterexport ex100import sieve, printfex100:func v, 0local i64:rcall p_sieve, sieve, r, 100call p_printf, printf, format, rendfuncendmodule

func describes signature of the function (taking 32-bit signedinteger argument and returning 32-bit signed integer value)and function argumentN which will be localvariable of 64-bit signed integer type
- Function results are described first by their types and have no names.Parameters always have names and go after the result description
- Function may have more than one result but possible number and combinationof result types are currently machine defined
You can write several instructions on one line if you separate them by;
The instruction result, if any, is always the first operand
We use 64-bit instructions in calculations
We could use 32-bit instructions in calculations which would have sense if we use 32-bit CPU
- When we use 32-bit instructions we take only 32-bit significant part of 64-bit operandand high 32-bit part of the result is machine defined (so if you write a portable MIR codeconsider the high 32-bit part value is undefined)
string describes data in form of C string
- C string can be used directly as an insn operand. In this case the data will be addedto the module and the data address will be used as an operand
export describes the module functions or data which are visible outside the current module
import describes the module functions or data which should be defined in other MIR modules
proto describes function prototypes. Its syntax is the same asfunc syntax
call are MIR instruction to call functions

Running MIR code

After creating MIR modules (through MIR API or reading MIR binary or textual files),you should load the modules
- Loading modules makes visible exported module functions and data
- You can load external C function withMIR_load_external
After loading modules, you should link the loaded modules
- Linking modules resolves imported module references, initializes data,and set up call interfaces
After linking, you can interpret functions from the modules or call machine codefor the functions generated with MIR JIT compiler (generator). What way the function can be executedis usually defined by set up interface. How the generated code is produced (lazily on the first call or ahead of time)can be also dependent on the interface
Running code from the above example could look like the following (herem1 andm2 are modulesm_sieve andm_e100,func is functionex100,sieve is functionsieve):

/* ctx is a context created by MIR_init / MIR_init2 */MIR_load_module (ctx,m1);MIR_load_module (ctx,m2);MIR_load_external (ctx,"printf",printf);MIR_link (ctx,MIR_set_interp_interface,import_resolver);/* or use MIR_set_gen_interface to generate and use the machine code *//* or use MIR_set_lazy_gen_interface to generate function code on its 1st call *//* use MIR_gen (ctx, func) to explicitly generate the function machine code */MIR_interp (ctx,func,&result,0);/* zero here is arguments number  *//* or ((void (*) (void)) func->addr) (); to call interpr. or gen. code through the interface */

Running binary MIR files on Linux through`binfmt_misc`

Themir-bin-run binary is prepared to be used frombinfmt_misc with thefollowing line (example):

line=:mir:M::MIR::/usr/local/bin/mir-bin-run:Pecho$line> /proc/sys/fs/binfmt_misc/register

Do adapt the mir-bin-run binary path to your system, that is the default one

And run with

c2m your-file.c -o your-filechmod +x your-file./your-file your args

The executable is "configurable" with environment variables:

MIR_TYPE sets the interface for code execution:interp (for interpretation),jit (for generation) andlazy (for lazy generation, default);
MIR_LIBS (colon separated list) defines a list of extra libraries to load;
MIR_LIB_DIRS orLD_LIBRARY_PATH (colon separated list) defines an extra listof directories to search the libraries on.

Due to the tied nature ofmir-bin-run withbinfmt_misc, it may be a bit weirdto callmir-bin-run directly.TheP flag on the binfmt_misc passes an extra argument with the full pathto the MIR binary.

The current state of MIR project

You can use Csetjmp/longjmp functions to implementlongjump in MIR
Binary MIR code is usually upto10 times more compact and upto10 times faster to readthan analogous MIR textual code
MIR interpreter is about 6-10 times slower than code generated by MIR JIT compiler
LLVM IR to MIR translator has not been finished and probably will be never fully implementedas LLVM IR is much richer than MIR but translation of LLVM IR generated from standard C/C++ to MIRis a doable task

The possible future state of MIR project

WASM to MIR translation should be pretty straightforward
- Only small WASM runtime for WASM floating point round insns needed to be provided for MIR
Porting GCC to MIR is possible too. An experienced GCC developer can implement thisfor 6 to 12 months
On my estimation porting MIR JIT compiler to mips64 or sparc64 will take1-2 months of work for each target
Performance minded porting MIR JIT compiler to 32-bit targets will need an implementation ofadditional small analysis pass to get info what 64-bit variables are used onlyin 32-bit instructions

MIR JIT compiler

Very short optimization pipeline for speed and light-weight
Only themost valuable optimization usage:
- function inlining
- global common sub-expression elimination
- variable renaming
- register pressure sensitive loop invariant code motion
- conditional constant propagation
- dead code elimination
- code selection
- fastregister allocator with
  - aggressive coalescing registers and stack slots for copy elimination
  - live range splitting
Different optimization levels to tune compilation speed vs generated code performance
SSA form of MIR is used before register allocation
- We use a form of Braun's algorithm to build SSA (M. Braun et al. "Simple and EfficientConstruction of Static Single Assignment Form")
Simplicity of optimizations implementation over extreme generated code performance
More details aboutfull JIT compiler pipeline:
Simplify: lowering MIR
Inline: inlining MIR calls
Build CFG: building Control Flow Graph (basic blocks and CFG edges)
Build SSA: Building Single Static Assignment Form by adding phi nodes and SSA edges to operands
Address Transformation: remove or change MIR ADDR instructions
Global Value Numbering: removing redundant insns through GVN. This includes constantpropagation and redundant load eliminations
Copy Propagation: SSA copy propagation and removing redundant extension instructions
Dead store elimination: removing redundant stores
Dead Code Elimination: removing insns with unused outputs
Pressure relief: moving insns to decrease register pressure
SSA combine: combining addresses and compare and branch instruction pairs
Out of SSA: Removing phi nodes and SSA edges
Jump opts: Different jump optimizations
Machinize: run machine-dependent code transforming MIR for calls ABI, 2-op insns, etc
Find Loops: finding natural loops and building loop tree
Build Live Info: calculating live in and live out for the basic blocks
Build Register Conflicts: building conflict matrix for registers involved in moves.It is used for register coalescing
Coalesce: aggressive register coalescing
Register Allocator (RA): priority-based linear scan RA with live range splitting
Build Live Ranges: calculating program point ranges for registers
Assign: fast RA for-O0 or priority-based linear scan RA for-O1 and above
Rewrite: transform MIR according to the assign using reserved hard regs
Combine (code selection): merging data-depended insns into one
Dead Code Elimination: removing insns with unused outputs
Generate Machine Insns: run machine-dependent code creating machine insns

C to MIR translation

We implemented a small C11 (2011 ANSI C standard with some GCC extensions) to MIR compilerc2m.SeeREADME.md
C code can be used as an input of JIT compiler besides MIR
- Usage of C as an input to JIT compiler can slow down compilation speed up to 2 times

Structure of the project code

Filesmir.h andmir.c contain major API code including input/output of MIR binaryand MIR text representation
Filesmir-dlist.h,mir-mp.h,mir-varr.h,mir-bitmap.h,mir-hash.h,mir-htab.h,mir-reduce.hcontain generic code correspondingly for double-linked lists, memory pools, variable length arrays, bitmaps,hash calculations, hash tables, and compressing/decompressing data. Filemir-hash.h is a general, simple,high quality hash function used by hashtables
Filemir-interp.c contains code for interpretation of MIR code. It is included inmir.cand never compiled separately
Filesmir-gen.h,mir-gen.c,mir-gen-x86_64.c,mir-gen-aarch64.c,mir-gen-ppc64.c,mir-gen-s390x.c,andmir-gen-riscv64.c contain code for MIR JIT compiler
- Filesmir-gen-x86_64.c,mir-gen-aarch64.c,mir-gen-ppc64.c,mir-gen-s390x.c,andmir-gen-riscv64.c is machine dependent code of JIT compiler
Filesmir-<target>.c contain simple machine dependent code common for interpreter andJIT compiler
Filesmir-<target>.h contain declarations common for interpreter and JIT compiler
Filesmir2c/mir2c.h andmir2c/mir2c.c contain code for MIR to C compiler. The generated code might be not portable
Filesc2mir/c2mir.h,c2mir/c2mir.c,c2mir/c2mir-driver.c, andc2mir/mirc.h contain code forC to MIR compiler. Files in directoriesc2mir/x86_64 andc2mir/aarch64,c2mir/ppc64,c2mir/s390x,andc2mir/riscv64 contain correspondingly x86_64, aarch64, ppc64le, s390x, and riscv machine-dependentcode for C to MIR compiler
Filemir-bin-run.c contains code formir-bin-run described above
Filemir-bin-driver.c withb2ctab utility can be used for portable way to generate binary from MIR binary files
Directorymir-utils contains different utilities to work with MIR,e.g. transforming binary MIR to textual MIR and vice verse
Directoryadt-tests,mir-tests,c-tests, andc-benchmarks contains code for testing and benchmarking MIR andc2m

Playing with current MIR project code

You can run some benchmarks and tests bymake bench andmake test

Current MIR Performance Data

Intel i5-13600K with 64GB memory under FC37 with GCC-12.3.1

	MIR-generator	MIR-interpreter	gcc -O2	gcc -O0
compilation [1]	1.0 (249us)	0.09 (22us)	109 (27.1ms)	105 (26.1ms)
execution [2]	1.0 (1.74s)	13.7 (23.8s)	0.92 (1.6s)	2.28 (3.97s)
code size [3]	1.0 (557KB)	0.43 (240KB)	58 (32.2MB)	58 (32.2MB)
LOC [4]	1.0 (23.4K)	0.48 (11.3K)	103 (2420K)	103 (2402K)

[1] is based on wall time of compilation of C sieve code (w/o any include file and withusing memory file system for GCC) and the corresponding MIR sieve code by MIR-interpreterand MIR-generator with optimization level 2

[2] is based on the best wall time of 10 runs with used MIR-generator optimization level 2

[3] is based on stripped sizes of cc1 for GCC and MIR core and interpreter or generator for MIR

[4] my estimation based only on files required for x86-64 GNU C compiler and MIR files for minimal program to createand run MIR code

Current C2MIR Performance Data

Intel i5-13600K with 64GB memory under FC37 with GCC-12.3.1

	c2m -O2 -eg (generator)	c2m -ei (interpreter)	gcc -O2	gcc -O0
compilation [1]	1.0 (336us)	1.0 (337us)	80 (27.1ms)	77 (26.1ms)
execution [2]	1.0 (1.74s)	13.7 (23.8s)	0.92 (1.6s)	2.28 (3.97s)
code size [3]	1.0 (961KB)	1.0 (961KB)	34 (32.2MB)	34 (32.2MB)
LOC [4]	1.0 (54.8K)	1.0 (54.8K)	44 (2420K)	44 (2420K)

[1] is based on wall time of compilation of C sieve code (w/o any include file and withusing memory file system for GCC)

[2] is based on the best wall time of 10 runs with used MIR-generator optimization level 2

[3] is based on stripped sizes of cc1 for GCC and C2MIR, MIR core, interpreter, and generator for MIR

[4] is based on all source files excluding tests

Here is generated code performance related to GCC -O2 for different C compilers on 15 small C benchmarks (from directoryc-benchmarks) on the same machine where

gcc version is 12.3.1
clang version is 15.0.7
chibicc is Rui Ueyama's latest C11 implementation
cparser is a C99 implementation based on a pretty sophisticated backend, libFirm version 1.22
cproc is Michael Forney's C11 implementation based on theQBE compiler backend
lacc is a C89 implementation
pcc (1.2.0.DEVEL) is a modern version of the Portable C compiler
tcc (0.9.27) is the tiny C11 compiler
emcc (2.0.20) is emscripten compiler to Webassembly with wasmer (1.0.2) runtime
wasi cranelift is a C to webassember clang compiler (11.0.0) with wasmer (1.0.2) based on cranelift backend
wasi LLVM is a C to webassember clang compiler (11.0.0) with wasmer (1.0.2) based on LLVM backend
wasi singlepass is a C to webassember clang compiler (11.0.0) with wasmer (1.0.2) based on singlepass backend
wasi wasmtime is a C to webassember clang compiler (11.0.0) with wasmtime (0.26.0) runtime based on cranelift backend

	Average	Geomean
gcc -O2	1.00	1.00
gcc -O0	0.63	0.57
c2m -eg	0.96	0.91
c2m -eb	0.92	0.85
chibicc	0.38	0.30
clang -O2	1.12	1.09
cparser -O3	1.02	0.98
cproc	0.68	0.65
lacc -O3	0.47	0.39
pcc -O	0.80	0.78
tcc	0.54	0.50
emcc -O2/wasmer	0.60	0.55
wasi -O2/wasmer cranelift	0.60	0.54
wasi -O2/wasmer LLVM	0.78	0.72
wasi -O2/wasmer singlepass	0.45	0.36
wasi -O2/wasmtime	0.92	0.87

MIR project competitors

I only see three projects which could be considered or adapted as real universal light-weight JIT competitors
QBE:
- It is small (10K C lines)
- It uses SSA based IR (kind of simplified LLVM IR)
- It has the same optimizations as MIR-generator plus aliasing but QBE has no inlining
- It generates assembler code which makes QBE 30 slower in machine code generation than MIR-generator
- On my benchmarks it generates code whose geomean performance is only 65% of GCC with -O2(performance of MIR generated code is 91% of GCC with -O2) while having the same compilation speed as MIR
LIBJIT started as a part of DotGNU Project:
- LIBJIT is bigger:
  - 80K C lines (for LIBJIT w/o dynamic Pascal compiler) vs 20K C lines for MIR(excluding C to MIR compiler)
- LIBJIT has fewer optimizations: only copy propagation and register allocation
RyuJITis a part of runtime for .NET Core:
- RyuJIT is even bigger: 360K SLOC
- RyuJIT optimizations is basically MIR-generator optimizations
- RyuJIT uses SSA
Other candidates:
- LIBFirm: less standalone-, big- (140K LOC), SSA,ASM generation-, LGPL2
- CraneLift: less standalone-,big- (70K LOC of Rust-), SSA, Apache License
- NanoJIT, standalone+, medium (40K C++ LOC), only simple RA-,Mozilla Public License

Porting MIR

Currently MIR works on x86_64, aarch64, ppc64le, s390x, riscv64 Linux and x86_64/aarch64 (Apple M1) MacOS
HOW-TO-PORT-MIR.md outlines process of porting MIR
- On my estimation an experienced developer can port MIR (includingc2m) to another target for 1-2 months