- Notifications
You must be signed in to change notification settings - Fork5
#F (Sharp-F or False) is a portable compiler/runtime for a minimalistic subset of the Scheme programming language. Compatibility with R5RS/R7RS Scheme programs is provided in a form of libraries written in #F itself.
License
false-schemers/sharpF
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
#F (Sharp-F or False) is a portable compiler/runtime for a subset of the Scheme programming language. Its main purpose is to facilitate creation of specialized code fragments which use Scheme-like computational model (garbage-collected memory, proper tail recursion, call/cc, closures, multiple return values) and can be easily integrated with regular C code in a direct way (no FFI or "scripting language bindings").
#F's portability is a consequence of its ability to compile itself into a standard portable C code. There is a single source distribution for all platforms; there are no OS- or hardware-specific parts, no compiler-specific tricks, no dependency on platform-specific building tools. There is no distributive to install, just a bunch of C files you can compile with your favorite C compiler, link the object files together and be done with it.
#F compiler (SFC) produces reasonably fast and very readable C code which can be tailoredto fit into a regular C project with minimal effort. The main distinguishing feature ofthe #F system is its scalable runtime - the compiler itself has no embedded knowledge ofany standard functions or data types, with a single exception -#f
(hence the name).The#f
value (represented as immediate 0) is hard-wired since SFC needs to compiletraditional conditional forms; everything else should be explicitly defined beforeuse. If one's code does not use lists or strings, they do not need to be implemented!
The starting point is the obligatory "Hello, World!" example:hello.sf
It can be compiled and executed as follows (you may use clang, cl, or any other C compiler):
$ sfc hello.sf # sfc produces hello.c (273 lines, ~8K in size)$ gcc hello.c # gcc produces a.out (~50K in size)$ ./a.outHello, World!
You can see the compiled C program here:hello.c. As it defines no data types or functions, the whole fileis basically just startup/shutdown/gc code.
Please note that some C compilers may issue annoying warnings; we recommend to add
-Wno-parentheses-equality
for Clang and-D_CRT_SECURE_NO_WARNINGS
for Windows headers(unless you want to hear thatfopen
is no longer a reasonable way to open files etc.)
Here is an example of what a bigger #F "Scheme" program might look like and whathappens when it is compiled and executed:
$ sfc tak.sf # sfc produces tak.c (370 lines, ~11K in size)$ gcc -O3 -DNDEBUG tak.c # gcc produces a.out (~61K in size)$ ./a.out70000
The files can be seen here:tak.sftak.c
As in the first example, since tak.sf defines a global function called "main", SFC added the #F runtime systemto the output C file. There is no need to link with anything else except for thestandard C runtime.
SFC supports C-style separate compilation for #F programs. Here's a slightly morecomplicated example, with lists and 3 separate compilation units, one containingall tak runtime/library functions, another for tak and ltak functions themselves,and the third one for the main test function. It is compiled and linked in the samedirect manner:
$ sfc tlib.sf tfun.sf tmain.sf # sfc produces 3 C files (~33K total size)$ gcc tlib.c tfun.c tmain.c # gcc produces a.out (~67K in size)$ ./a.out70007000
The files can be seen here:tlib.sftfun.sftmain.sftlib.ctfun.ctmain.c
There's no need to compile all .sf files in a single run; doing it one fileat a time is just as good. Dependent units (files) are scanned for macros duringcompilation of a unit, so they should be present and in a good shape. Dependenciesare declared via(load "foo.sf")
special form (or its syntactic shortcut#fload "foo.sf"
),which serves two purposes: to tellSFC where to look for macros and how to order initialization code. SFC turnsall globally defined Scheme identifiers into C identifiers (e.g. foo => cx_foo),generating "extern" specifications as needed. To do that, SFC assumes thatno set! can cross a compilation unit boundary; if this rule is violated,there will be duplicate symbol definition warnings from the linker.
Note that cross-module function calls are slower than intra-module calls,which can sometimes be compiled as direct jumps (tail call to a known entrypoint). On the other hand, very big output files can choke some C compilers,so some planning may be needed to organize the code for best performance.
"Bare-bones" #F is in no way a standard-compliant Scheme implementation; itcuts many corners to get a much simpler system that can serve as a basefor a more ambitious dialect of Scheme-like language. Built-in support forScheme not only omits all functionality related to any particular data type,it also lacks such features as apply, variable-arity functions, and dynamicwind. All missing functionality can be defined using the facilities existingin the core #F language.
As an example of language-building, and as an aid to porting Scheme programs,#F distribution includes a simple Scheme portability library, LibS (seelibs.sf).
LibS is just a regular #F source file, and it is included in the project in a regularmanner:
$ sfc libs.sf myprog.sf # sfc produces 2 C files$ gcc libs.c myprog.c -lm # gcc produces a.out (libs refers to math functions, so -lm may be needed)
To dress an existing pre-R6RS Scheme source file as a #F program thatuses LibS, one has to add(load "libs.sf")
line to the beginning of thefile and(define (main argv) #f)
to the end. LibS generally targetsR5RS feature set, but it has the following known limitations:
- SFC reader used to read #F source code is case-sensitive
read
andstring->symbol
are also case-sensitive- no support for
s
,f
,d
,l
exponent markers and#
digit placeholders - there is no support for
eval
and environment functions - no dynamic
load
or dynamic macroexpansion/compilation - fixnums are limited to 24 bits, flonums are doubles
- no support for bignums/rational/complex numbers
set!
to built-in bindings is not allowed- there is no REPL and no transcript functions
In addition to formal limitations listed above, there are issues thatmay affect the real-life performance of your converted Schemeprograms:
- very large list/vector literals converted to C code can choke your C compiler
- there is a noticeable performance penalty related to use of
apply
and functions with improper argument lists - compared to #F's native
letcc
/withcc
, there is a performance penalty for usingcall/cc
because ofdynamic-wind
's need to preserve multiple return values in a list - run-time errors such as fixnum overflows trigger asserts in C code unless
NDEBUG
is defined during compilation - deciphering syntax errors in the source code can be tricky
In addition to R5RS-level functionality, LibS supports many popular extensionsdefined in pre-R5RS Scheme standards, SRFIs, and R7RS libraries:
- many fixnum (
fx
) and flonum (fl
) - specific operations letrec*
,rec
,receive
,case-lambda
forms- operations on boxes:
box?
,box
,unbox
,set-box!
error
,assertion-violation
(not based on exceptions)file-exists?
,delete-file
,rename-file
,open-input-string
exit
,abort
,reset
,command-line
get-environment-variable
,system
,current-jiffy
,jiffies-per-second
Please check the/lib
directory to see the full list of available libraries. Some of them have corresponding interpretersto simplify development and debugging. Interpreters are located in the/int
directory.
#F memory model was designed to be C-friendly. #F programs can freelyuse pointers to C objects because #F's garbage collector does not scanpointers pointing outside its heap. For #F's wrapped C objects requiring finalization,the collector provides support for a finalization call. Finalizationprocedures are registered on per-datatype basis, e.g.:
(%localdef "static cxtype_t cxt_string = { \"string\", free };")
C code can access objects from a garbage-collected heap via #F'sglobal variables, which are regular C's global variables, updatedby the garbage collector when the corresponding object is moved inmemory. Since the hard-wired parts of the runtime are minimal,each project can have the set of data types and operations bestsuited for the task at hand, with all the necessary glue writtenexplicitly as C code fragments.
#F uses a slightly modified version of Al Petrofsky's R5RS-compatibleAlexpander macro system. It has many advanced features critical for what#F tries to accomplish, such as extending the definition of an existingmacro, local syntax definitions, syntax-binding a code to a name etc.In addition, #F's expander features non-hygienic extensions, allowingon-the-fly construction of identifiers, predicates in patterns etc.Here are some examples:
(let-syntax ([foo (let-syntax ([bar (syntax-rules () [(bar x) (- x)])]) (syntax-rules () [(foo) (bar 2)]))]) (foo)) => -2((syntax-rules () [(_ ([var init] ...) . body) ((lambda (var ...) . body) init ...)]) ([x 1] [y 2]) (+ x y)) => 3(let-syntax ([q quote]) (q x)) => x(let-syntax ([x append]) ((x x))) => ()(define-syntax variant-case (syntax-rules (else) [(_ (a . d) clause ...) (let ([var (a . d)]) (variant-case var clause ...))] [(_ var) (error 'variant-case "no clause matches ~s" var)] [(_ var (else exp1 exp2 ...)) (begin exp1 exp2 ...)] [(_ var (name (field ...) exp1 exp2 ...) clause ...) (if (#&(string->id #&(string-append #&(id->string name) "?")) var) (let ([field (#&(string->id #&(string-append #&(id->string name) "->" #&(id->string field))) var)] ...) exp1 exp2 ...) (variant-case var clause ...))]))
SFC is not a very sophisticated Scheme compiler, but it does many thingsother compilers do:
(define (compile-file filename) (write-code filename (code-generate (unbox-values (beta-substitute3 (lambda-lift (beta-substitute2 (cps-convert (beta-substitute (constant-fold (remove-assignments (fix-letrecs (parse-file filename)))))))))))))
Its code performance is not very far behind that of some more sophisticatedScheme-to-C compilers which support full R5RS Scheme, but are many timesmore complex and have large run-time systems, dominating any C code theymight link to. SFC, while being rather simple and ignorant as to what datatypes its user will decide to support, does a reasonably good job atoptimizing function calls and unboxing non-immediate objects like floating-pointnumbers. As an example, here's a piece of code generated for the expression
(fl- 0.0 (fl+ (fl/ Vc (fl* R C)) (fl/ Il C))
from RnRS's Runge-Kutta example (rk.sf,rk.c):
{ const flonum_t v2500_C = flonum_from_obj(r[3]); r[5] = (vectorref((r[1]), (0))); r[6] = (vectorref((r[1]), (1))); r[7] = obj_from_flonum(7, flonum_from_obj(r[5]) / flonum_from_obj(r[4])); { const flonum_t v2505_tmp = (0.0); { flonum_t v2504_tmp; { flonum_t v2503_tmp; { const flonum_t v2502_tmp = (flonum_from_obj(r[6]) / (v2500_C)); { const flonum_t v2501_tmp = (flonum_from_obj(r[2]) * (v2500_C)); v2503_tmp = (flonum_from_obj(r[5]) / (v2501_tmp)); } v2504_tmp = ((v2503_tmp) + (v2502_tmp)); } } r[8] = obj_from_flonum(8, (v2505_tmp) - (v2504_tmp)); } }
Since the library provided unboxed typeflonum_t
for floating pointnumbers, SFC used unboxed representation for all intermediate results,converting them to a boxed form only when they have to be storedin general-purpose registers or storage locations. Many compilerscan do a better job optimizing floating point operations, but SFC cango surprisingly far with just a few simple tricks and no embeddedknowledge of supported datatypes.
SFC does not insert any run-time checks into a release version of thecode, so getting the car of a number or passing a wrong number of argumentsto a function will likely lead to a coredump. IfNDEBUG
is not defined(which may or may not be your C compiler's default), the code producedby SFC switches to debug versions of data access primitives, catchingdata access and parameter passing violations with C run-time asserts.
To install SFC, grab the C files from thefixpoint
subdirectory of the master tree and compile them with your favorite C compiler.
If you are too lazy to learn GIT, the latest versions of the C source files are here:
Here's how you can compile SFC on a unix box:
gcc -o sfc -Wall -O3 -DNDEBUG 0.c 1.c 2.c 3.c 4.c 5.c 6.c 7.c c.c
The rest is up to you - the compiler has no dependencies and can be run from any location.
The documentation is not yet complete and might be slightly out-of-date.The .html files are located in thedoc
subdirectory of the master tree.
You may also learn some #F basics by looking at the examples:
They can be compiled in the following manner:
sfc tak.sf # produces tak.c gcc -o tak -Wall -O3 -DNDEBUG tak.c
As a demonstration of how far you can go, here is a full-scale R7RS interpreter in a single #F file:
(please seeSIOF repository for details)
And yes, SFC is also written in #F and compiles itself into C files in thefixpoint
directory.
SFC is based on ideas from Marc Feeley's "90 minute Scheme to C compiler" presentedat the Montreal Scheme/Lisp User Group on October 20, 2004. SFC's hygienic macroexpander is derived fromAl Petrofsky's alexpander v1.65 (please see the 2.sf source file for the original copyright notice).
About
#F (Sharp-F or False) is a portable compiler/runtime for a minimalistic subset of the Scheme programming language. Compatibility with R5RS/R7RS Scheme programs is provided in a form of libraries written in #F itself.