- Notifications
You must be signed in to change notification settings - Fork3
A programmable virtual CPU written in untyped lambda calculus
License
woodrush/lambdavm
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
LambdaVM is a programmable virtual CPU written as a closed untyped lambda calculus term.It supports an extended version of theELVM instruction set and architecture written byShinichiro Hamaji.Using LambdaVM, you can enjoy assembly programming to write programs in untyped lambda calculus.
LambdaVM supports 8 instructions including standard I/O and virtual memory operations, and has an arbitrarily configurable ROM/RAM address size and word size, and an arbitrarily configurable number of registers.
Despite its rather rich capability, LambdaVM's lambda term is quite small. Here is its entire lambda term written in plaintext:
LambdaVM = \x.\y.\z.\a.\b.((\c.((\d.((\e.((\f.((\g.((\h.(a ((\i.(i (d (\j.\k.(k (\l.\m.\n.\o.(o k (j m))) k)) a) (\j.(i z (d (\k.\l.\m.\n.\o.\p.((\q.((\r.((\s.(n (\t.\u.\v.\w.v) (\t.t) (\t.\u.\v.u) (\t.\u.u) (o (\t.\u.\v.(o (k l m) p)) o) (n (\t.\u.((\v.(t (\w.\A.\B.((\C.(A (C B) (s B C))) (\C.\D.(w (D ((\E.(m (\F.\G.\H.(E (y (\I.\J.(J (\K.\L.K) I)) F) G)) (E c m))) (\E.\F.(r B E (k l F u o)))) (\E.(E (y (\F.(F (\G.\H.H))) C) (v p))) A) (D (\E.\F.\G.\H.((\I.(F (I G) (s G I))) (s H (\I.\J.(E (e I C) (q J) (v p))))))) (D (\E.\F.((\G.(f (\H.\I.I) (E (s F e C)) G G (\H.(r F)))) c)) v) (q C) (h l C (r D) v) (s D (g l C) k m u o p) (D (\E.\F.(s E (f F F) C (\G.(r E)))) v) (r D C v))))))) (k l m u o)))))) (h p))) (g p))) (\q.(h j q (\r.(r (k l m) p))))))))))) (\i.\j.(d (\k.\l.\m.\n.(l (\o.\p.\q.(m (\r.\s.\t.(k l s (\u.\v.(k v s (\w.(n (\A.(A u w)))))))) (l n))) (n l l))) i c (\k.\l.(j k)))) b) (\i.\j.j))) (d (\h.\i.\j.\k.(i (\l.\m.\n.(j (\o.\p.\q.(o (h l) (h m) p k)) (k i))) (k c)))))) (d (\g.\h.\i.\j.\k.(i (\l.\m.\n.((\o.(h (\p.\q.\r.(l (h o) (o q p))) (o (\p.\q.q) (\p.\q.q)))) (\o.(g o m j (\p.\q.(l (k (\r.(r p q))) (k (\r.(r q p))))))))) (k j)))))) (d (\f.\g.\h.\i.\j.\k.(i (\l.\m.\n.(j (\o.\p.(f g h m p (\q.\r.((\s.((\t.((\u.((\v.(t s q (v (\w.\A.w)) (v (\w.\A.A)))) (t q (q (\v.\w.w) (\v.\w.v)) (u (\v.\w.v)) (u (\v.\w.w))))) (\u.\v.(k v (\w.(w u r)))))) (\t.\u.(l (s t u) (s u t))))) (h o (o (\s.\t.t) (\s.\t.s))))))))) (k g i)))))) (d (\e.\f.\g.(f (\h.\i.\j.(g (\k.\l.((\m.(h (k m (\n.\o.\p.o)) (k (\n.\o.\p.p) m))) (e i l))))) (\h.\i.\j.h)))))) (\d.((\e.(d (e e))) (\e.(d (e e))))))) ((\c.(y c (x c (\d.\e.e)))) (\c.\d.(d (\e.\f.e) c))))
Shown here is a lambda calculus term featuring a RAM unit with 8 instructions including I/O and memory operations.With LambdaVM, you can write an assembly program that runs on this virtual machine.
The image on the top of this repo is LambdaVM'slambda calculus diagram.
Various designs for LambdaVM are borrowed fromKunihiko Sakamoto'sUnlambdaVM (in Japanese), with many modifications. Details are described later.
LambdaVM is built usingLambdaCraft, a Common Lisp DSL that I wrote for building large lambda calculus programs, also used to buildLambdaLisp, a Lisp interpreter implemented in untyped lambda calculus.
Using LambdaVM, you can write lambda calculus programs in assembly style.But what does it mean to write programs in lambda calculus?
Lambda calculus terms can be interpreted as programs, by interpreting it as a program that takes an input string and returns an output string.Characters and bytes are encoded as a list of bits with
Various lambda calculus interpreters automatically handle this I/O format so that it runs on the terminal - standard input is encoded into lambda terms, and the output lambda term is decoded and shown on the terminal.Using these interpreters, lambda calculus programs can be run on the terminal just like any other terminal utility with I/O.
A thorough explanation of programming in lambda calculus is described inmy blog post aboutLambdaLisp, a Lisp interpreter written as an untyped lambda calculus term.
There are various programming languages that use this I/O strategy. Examples are:
Seen as a programming language, these languages are purely functional languages with lazy evaluation.The "purely" part is emphasized even more than languages such as Haskell, since no primitive data types such as integers exist except for lambdas exist in these languages.Each of these languages have a different I/O encoding and a different notation for expressing lambda calculus terms.The I/O encoding can be absorbed by using awrapper,and the notation can be adapted by using acompiler to write lambdas in the expected format.
Using LambdaVM, you can write programs for these languages.A FizzBuzz program written in LambdaVM assembly looks like./examples/fizzbuzz.cl.
Compiled programs can be run on the terminal using various interpreters, including:
- SectorLambda, the521-byte lambda calculus interpreter written by Justine Tunney
- TheIOCCC 2012"Most functional" interpreter written by John Tromp(thesource is in the shape of a λ)
- Universal Lambda interpreterclamb and Lazy K interpreterlazyk written by Kunihiko Sakamoto
I have integrated LambdaVM into ELVM to implement ELVM's lambda calculus backend.Using this backend, you can even compile interactive C code to LambdaVM's assembly.Since ELVM implements itsown libc, you can even#include <stdio.h>
and use library functions such asprintf
andscanf
.
This is entirely contained in the realm of the ELVM repository. Please see the documentation for ELVM for details.Instead of compiling C to assembly, you can enjoy hand-assembling programs for lambda calculus using the examples in this repository.
- Instruction set:
mov
load
store
addsub
cmp
jmpcmp
jmp
putchar
getchar
exit
- The behavior of each instruction is mostly the same as ELVM. Please seeelvm.md from the ELVM repo for details.
- The lambda calculus data structure for each instruction is described later.
- ROM/RAM address size and word size:
- Pre-configurable to an arbitrary integer
- I/O bit size:
- Pre-configurable to an arbitrary integer
- Registers:
- Word size is pre-configurable to an arbitrary integer
- Number of registers is arbitrarily pre-configurable
You can hand-assemble your own LambdaVM programs usingLambdaCraft,a Common Lisp DSL I wrote for building lambda calculus programs, also used to buildLambdaLisp.
Theexamples directory in this repo contains 3 example LambdaVM assembly programs:
- fizzbuzz.cl: Prints the FizzBuzz sequence in unary.
- rot13.cl: Encodes/decodes standard input to/from theROT13 cipher.
- yes.cl: The Unix
yes
command, printing infinite lines ofy
.
Here is what the beginning of the assembly listing for rot13.cl looks like:
(def-lazy asm (list;; Initialization (PC == 0) (list;; Store 26/2 = 13 at reg-B (mov reg-B"N") (sub reg-B"A") );; tag-main (PC == 1) (list (getc reg-A);; Exit at EOF (jmpcmp reg-A == EOF -> tag-exit);; "a" <= reg-A < "n" : add 13 (mov reg-C reg-A) (cmp reg-C>="a") (mov reg-D reg-A) (cmp reg-D<"n") (add reg-C reg-D) (jmpcmp reg-C == int-2 -> tag-plus13)...
As shown here, the assembly is written as Common Lisp macros.These listings can be compiled by running *.cl on a Common Lisp interpreter such as SBCL.
Since these programs are based on LambdaCraft and LambdaCraft runs on LambdaLisp,it is expected that these programs run on LambdaLisp as well, although it takes a lot of time compared to fast interpreters such as SBCL.
Please use these example programs as a template for hand-assembling your own LambdaVM programs.
LambdaVM is written as the following lambda calculus term:
LambdaVM = \iobitsize. \suppbitsize. \proglist. \memlist. \stdin. ...
- The first 2 arguments
${\rm iobitsize}$ and${\rm suppbitsize}$ are configuration parameters specifying the CPU's I/O word size and RAM word size. ${\rm proglist}$ represents the assembly listing to be executed.${\rm memlist}$ represents the memory initialization state. Unspecified memory regions are initialized to 0.${\rm stdin}$ is the input string provided by the interpreter.
By applying the first 4 arguments except(LambdaVM iobitsize suppbitsize proglist memlist)
behaves as a lambda calculus program that accepts a string
Various designs for LambdaVM are borrowed fromKunihiko Sakamoto'sUnlambdaVM (in Japanese):
- Using a binary tree structure to represent the RAM
- Using a list of lists of instructions to represent the program
LambdaVM has the following differences:
- While Unlambda is a strictly evaluated language, LambdaVM assumes a lazily evaluated language.While UnlambdaVM is written in direct style using function applications to mutate the VM's global state,LambdaVM is written using continuation-passing style to handle monadic I/O.
- The binary tree structure is modified so that an empty tree can be initialized with
nil = \x.\y.y
.
By loading the program and initialization configurations to LambdaVM, the resulting lambda calculus term(LambdaVM iobitsize suppbitsize proglist memlist)
behaves as a function that accepts a string as an input and outputs a string.
Here, characters and bytes are encoded as a list of bits with
The variable
Various lambda calculus languages use a slightly different I/O encoding, such as usingChurch numerals to express each character.Such differences can be absorbed by using awrapper that adapts the program to its surrounding environment.This is discussed later in theAdaptation to Different Languages section.
iobitsize
andsuppbitsize
are integers encoded as lambda terms in theChurch encoding.
iobitsize
specifies the number of bits used for the input string.In theIOCCC 2012"Most functional" interpreter,iobitsize == 8
since it uses 8 bits for encoding the I/O.
suppbitsize
represents the additional number of bits added toiobitsize
to make the machine's RAM and register word size.The word size becomesiobitsize + suppbitsize
.
In ELVM, the machine word size is 24 and the I/O bit size is 8, soiobitsize
andsuppbitsize
are set to 8 and 16, respectively.
proglist
is represented as a list of lists, where each sublist is atag containing a list of instructions.The instruction format is described later.The beginning of each list represents a tag that can be jumped to using thejmp
orjmpcmp
instructions.When thejmp
orjmpcmp
instruction is run, the program proceeds to the beginning of the specified tag.
memlist
is represented as a list of N-bit unsigned integers with the machine's word size, where each integer is represented as a list of bits with
Each instruction
${\rm cons4}$ is a 4-tuple constructor,${\rm cons4} = \lambda x. \lambda y. \lambda z. \lambda w. \lambda f. (f x y z w)$ .${\rm insttag}$ is an 8-tuple specifying the instruction.${\rm srcisimm}$ is a boolean of either${\rm t} = \lambda x. \lambda y. y$ or${\rm nil} = \lambda x. \lambda y. y$ .It specifies whether if the following${\rm src}$ is an immediate value or a register.${\rm src}$ is either an N-bit integer or a register address.${\rm opt}$ is an instruction-specific option with an instruction-specific data structure.
Instruction | |
---|---|
mov | |
load | |
store | |
addsub | |
cmp | |
jmpcmp | |
jmp | |
io |
Instruction | |
---|---|
mov | |
load | |
store | |
addsub | |
cmp | |
jmpcmp | |
jmp | (unused) |
io |
The examples under./examples wrap this instruction structure as Lisp macrosso that they can be written in an assembly-like notation.Macros are defined in./src/lambda-asm-header.cl.For further details on the instruction structure, please see this source.
As mentioned earlier, various lambda calculus languages use a slightly different I/O encoding, such as usingChurch numerals to express each character.Such differences can be absorbed by using awrapper that adapts the program to its surrounding environment.
To run a lambda program
When
After calculuations,
Using this strategy, any program
Please seedetails.md.
LambdaVM was written by Hikaru Ikuta, inspired byKunihiko Sakamoto'sUnlambdaVM (in Japanese).The instruction set for LambdaVM is based on and is extended from theELVM architecture written byShinichiro Hamaji.LambdaVM is written usingLambdaCraft written by Hikaru Ikuta.