Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up

CS383 course project

NotificationsYou must be signed in to change notification settings

QDelta/simpl

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Introduction

In this project I implement an interpreter for a ML dialect SimPL.

Implementation

The implementation is pretty straightforward given the specification and the skeleton, but there are some points that are worth mentioning.

Type checker

For type checker I refers tothis blog for most of my implementation, which demonstrates the development of Rémy's algorithm for OCaml type inference.

The T-LetPoly rule is simple but rather inefficient for implementation, so explicit polymorphic types should be introduced, which brings the procedure of type generalization and instantiation.

The instantiation is just replacing all type quantifiers with fresh type variables, but the generalization should be taken care of since type variables still in the context are not supposed to be generalized since generalization here will break the constraints.

The naive solution is scanning the environment, but Rémy came up with a more efficient way introducing generalization levels, relaxing environment traversing to integer comparison. The original idea comes from the similarity between memory management and type generalization (ownership of memory region/type variable).

Equality type constraint

Actually we are going to implement a special case of type class, supporting ad hoc polymorphism for equality check.

A common technique to implement type classes is dictionary passing style, where each type records what classes it belongs to.

But the original definition with equality type have a bit problem. The T-NIL rule:

$$\frac{}{\Gamma\vdash \texttt{nil}:\alpha\ \texttt{list}}$$

According to that rule, the type of$\texttt{nil}$ is always a list of an equality type, which makes the construction of list of other types impossible, and the definition of common list operations on arbitrary element impossible, likefold andmap.

So I relax this constraint, requiring it to be equality type only when it is compared to other expressions. It is also the usual way how type classes are implemented.

Extra features

I also implement some extra features which are common in functional programming languages.

Sum type

The classical sum type.

Syntax:

$$\begin{align*}e ::= &\ \cdots\\|\ \ &\texttt{inl}\ e\\|\ \ &\texttt{inr}\ e\\|\ \ &\texttt{case}\ e\ \texttt{of}\ \texttt{inl}\ x\ \texttt{=>}\ e\ |\ \texttt{inr}\ x\ \texttt{=>}\ e\end{align*}$$

Type definition:

$$\begin{align*}t ::= &\ \cdots\\|\ \ &t+t\end{align*}$$

Typing rules:

$$\frac{\Gamma\vdash e:t_1}{\Gamma\vdash\texttt{inl}\ e:t_1+t_2}\qquad\frac{\Gamma\vdash e:t_2}{\Gamma\vdash\texttt{inr}\ e:t_1+t_2}$$

$$\frac{\Gamma\vdash e_1:t_1+t_2\quad\Gamma[x_1:t_1]\vdash e_2:t_3\quad\Gamma[x_2:t_2]\vdash e_3:t_3}{\Gamma\vdash\texttt{case}\ e_1\ \texttt{of}\ \texttt{inl}\ x_1\ \texttt{=>}\ e_2\ |\ \texttt{inr}\ x_2\ \texttt{=>}\ e_3:t_3}$$

Value definition:

$$V_{t_1+t_2}=({\texttt{inl}}\times V_{t_1})\cup({\texttt{inr}}\times V_{t_2})$$

Evaluation rules:

$$\frac{E,M,p;e\Downarrow M',p';v}{E,M,p;\texttt{inl}\ e\Downarrow M',p';(\texttt{inl},v)}\qquad\frac{E,M,p;e\Downarrow M',p';v}{E,M,p;\texttt{inr}\ e\Downarrow M',p';(\texttt{inr},v)}$$

$$\frac{E,M,p;e_1\Downarrow M',p';(\texttt{inl},v_1)\quad E[x_1\mapsto v_1],M',p';e_2\Downarrow M'',p'';v_2}{E,M,p;\texttt{case}\ e_1\ \texttt{of}\ \texttt{inl}\ x_1\ \texttt{=>}\ e_2\ |\ \texttt{inr}\ x_2\ \texttt{=>}\ e_3\Downarrow M'',p'';v_2}$$

$$\frac{E,M,p;e_1\Downarrow M',p';(\texttt{inr},v_1)\quad E[x_2\mapsto v_1],M',p';e_3\Downarrow M'',p'';v_2}{E,M,p;\texttt{case}\ e_1\ \texttt{of}\ \texttt{inl}\ x_1\ \texttt{=>}\ e_2\ |\ \texttt{inr}\ x_2\ \texttt{=>}\ e_3\Downarrow M'',p'';v_2}$$

Example:

(* new.case.spl*)let f= fn b=> fn x=> fn y=>if bthen inl xelse inr yin  (ftrue()1, ffalse()1)end(* : (unit + int) * (unit + int)   ==> pair@inl@unit@inr@1*)

Case on list

After relaxing the type constraint on$\texttt{nil}$, case expressions on lists is also necessary for defining list operations on arbitrary elements since expressions like$l=\texttt{nil}$ also require an equality type.

Syntax:

$$\begin{align*}e ::= &\ \cdots\\|\ \ &\texttt{case}\ e\ \texttt{of}\ \texttt{nil}\ \texttt{=>}\ e\ |\ x::x\ \texttt{=>}\ e\end{align*}$$

Typing rule:

$$\frac{\Gamma\vdash e_1:t_1\ \texttt{list}\quad\Gamma\vdash e_2:t_2\quad\Gamma[h:t_1][t:t_1\ \texttt{list}]\vdash e_3:t_2}{\Gamma\vdash\texttt{case}\ e_1\ \texttt{of}\ \texttt{nil}\ \texttt{=>}\ e_2\ |\ h::t\ \texttt{=>}\ e_3:t_2}$$

Evaluation rules:

$$\frac{E,M,p;e_1\Downarrow M',p';\texttt{nil}\quad E,M',p';e_2\Downarrow M'',p'';v_2}{E,M,p;\texttt{case}\ e_1\ \texttt{of}\ \texttt{nil}\ \texttt{=>}\ e_2\ |\ h::t\ \texttt{=>}\ e_3\Downarrow M'',p'';v_2}$$

$$\frac{E,M,p;e_1\Downarrow M',p';(\texttt{cons},v_1,v_2)\quad E[h\mapsto v_1][t\mapsto v_2],M',p';e_3\Downarrow M'',p'';v_3}{E,M,p;\texttt{case}\ e_1\ \texttt{of}\ \texttt{nil}\ \texttt{=>}\ e_2\ |\ h::t\ \texttt{=>}\ e_3\Downarrow M'',p'';v_3}$$

Example:

(* new.fold.spl*)let foldl=rec foldl=>  fn f=> fn x=> fn l=>    case l of      nil=> x| h :: t=> foldl f (f x h) tin  foldl (fn x=> fn y=> x+ y)0 (1::2::3::4::nil)end(* : int   ==> 10*)

Mutual recursion

Mutual recursion in ML dialects usually have two forms. The first form is mutual recursion bindings, likelet rec ... and ... in OCaml. The second form is in the original Y combinator style where a recursion is an expression rather than a binding. I see that SimPL prefers the latter so I implement mutual recursion in that form.

For syntax, I refer tosimilar terms in Coq likefix <ident> := ... with <ident> := ... for <ident>, where multiple values are defined and one of them is returned. The other values should be considered and designed as helpers since they can not be accessed outside, which is also a limitation of mutual recursion in the second form.

The original recursion syntax is modified to:

$$\begin{align*}e ::= &\ \cdots\\|\ \ &\texttt{rec}\ r\\r ::=&\ r\ \texttt{with}\ x\ \texttt{=>}\ e\\|\ \ &x\ \texttt{=>}\ e\end{align*}$$

I sacrifices thefor <ident> part for a uniform syntax and make the parser generator happy. The first value is returned. But in the following part I will use$\texttt{rec}\ r_n\ \texttt{for}\ x_k$ syntax for convenience, where$r_n$ represents$x_1\ \texttt{=&gt;}\ e_1\ \texttt{with}\ x_2\ \texttt{=&gt;}\ e_2\cdots\texttt{with}\ x_n\ \texttt{=&gt;}\ e_n$.

The typing rule is modified to:

$$\frac{\forall i,\Gamma[x_i:t_i]\vdash e_i:t_i}{\Gamma\vdash\texttt{rec}\ r_n\ \texttt{for}\ x_k:t_k}$$

Value definition:

$$\bf{Rec}={\texttt{rec}}\times\bf{Env}\times\bf{Var}\times(\bf{Var}\rightarrow\bf{Exp})$$

Here$\bf{Var}\rightarrow\bf{Exp}$ means an mapping from$\bf{Var}$ to$\bf{Exp}$.

Evaluation rules:

$$\frac{E[x_i\mapsto(\texttt{rec},E,x_i,{x_j\mapsto e_j}^* )]^*,M,p;e_k\Downarrow M',p';v}{E,M,p;\texttt{rec}\ r_n\ \texttt{for}\ x_k\Downarrow M',p';v}$$

$$\frac{E(x)=(\texttt{rec},E_1,x_1,{x_i\mapsto e_i}^*)\quad E_1,M,p;\texttt{rec}\ r_n\ \texttt{for}\ x_1\Downarrow M',p';v}{E,M,p;x\Downarrow M',p';v}$$

Example:

(* new.mutualrec.spl*)let iseven=rec e=> fn x=>if iszero xthentrueelse o (x-1)with o=> fn x=>if iszero (pred x)thentrueelse e (x-1)in  iseven6end(* : bool   ==> true*)

Garbage collection of reference cells

I implement a classical mark-and-sweep garbage collector for reference cells.

publicintallocate(Enve,Valuev) {if (mem.size() >=gcThreshold) {gc(e);gcThreshold =2 *mem.size();  }intp =pointerGen;pointerGen +=1;mem.put(p,v);returnp;}

Each time the interpreter allocates memory cell for value, if the current active cells exceeds the threshold then the garbage collector will scan the environment and collect the unreachable cells. The next stage threshold will be the double of number of active cells after garbage collection.

Lazy evaluation

Make arbitrary part of the expression can be lazily evaluated is a little complex. Implementing it in a naive way have some disadvantages:

  • Explicit forces across the interpreter, which will result in a verbose code.
  • Performance penalty. Forced thunks is hard to be shared when arbitrary part of the expression can be lazily evaluated, resulting in multiple evaluations of the same expression.

One way to solve the above problems is using graph reduction technique. But it requires a complete refactor of the interpreter architecture. For simplicity without a great loss of performance and functionality, lazy bindings and functions evaluating arguments lazily (which is actually a lazy binding) can be introduced.

The syntax:

$$\begin{align*}e ::= &\ \cdots\\|\ \ &\texttt{lazy}\ x=e\ \texttt{in}\ e\ \texttt{end}\\|\ \ &\texttt{lazyfn}\ x\ \texttt{=>}\ e\end{align*}$$

The typing rules are similar to those of$\texttt{let}$ and$\texttt{fn}$ expressions.

Value definition:

$$\begin{align*}V_{thunk}&={\texttt{thunk}}\times\bf{Env}\times\bf{Exp}\\V_{t_1\rightarrow t_2}&={\texttt{fun}}\times\bf{Env}\times\bf{Var}\times\bf{Exp}\times\mathbb{B}\end{align*}$$

Evaluation rules:

$$\frac{E[x\mapsto (\texttt{thunk},E,e_1)],M,p;e_2\Downarrow M',p',v}{E,M,p;\texttt{lazy}\ x=e\ \texttt{in}\ e\ \texttt{end}\Downarrow M',p';v}$$

$$\frac{}{E,M,p;\texttt{fn}\ x\ \texttt{=>}\ e\Downarrow M,p;(\texttt{fun},E,x,e,\bf{ff})}\qquad\frac{}{E,M,p;\texttt{lazyfn}\ x\ \texttt{=>}\ e\Downarrow M,p;(\texttt{fun},E,x,e,\bf{tt})}$$

$$\frac{E,M,p;e_1\Downarrow M',p';(\texttt{fun},E_1,x,e,\bf{ff})\quad E,M',p';e_2\Downarrow M'',p'';v_2\quad E_1[x\mapsto v_2],M'',p'';e\Downarrow M''',p''';v}{E,M,p;e_1\ e_2\Downarrow M''',p''';v}$$

$$\frac{E,M,p;e_1\Downarrow M',p';(\texttt{fun},E_1,x,e,\bf{tt})\quad E_1[x\mapsto (\texttt{thunk},E,e_2)],M',p';e\Downarrow M'',p'';v}{E,M,p;e_1\ e_2\Downarrow M',p';v}$$

$$\frac{E(x)=(\texttt{thunk},E_1,e)\quad E_1,M,p;e\Downarrow M',p';v\quad E{x\mapsto v}}{E,M,p;x\Downarrow M,p;v}$$

In the last rule$E{x\mapsto v}$ means a local update of the environment, it guarantees that lazy bindings are evaluated at most once. It is not correct to write it in this form while theoretically the judgement form and all rules should be changed where$E$ can gets updated sequentially like$M$ and$p$.

In real implementation, while the environments originally behaves like immutable linked lists, it is actually a persistent data structure where two environments share the most common parts of name-value mapping, which enables me to implement local environment updating easily without changing previous code.

Example1: the lazy binding will not be evaluated if not accessed.

(* new.lazy1.spl*)let const= fn x=> lazyfn y=> xinlazy a= hd nilin    const() (hd nil)  endend(* : unit   ==> unit*)

Example2: the lazy binding will be evaluated at most once.

(* new.lazy2.spl*)let x=ref0inlazy y= (x:=!x+1)in    y;y;!x  endend(* : int   ==> 1*)

Demos

Sieve of Eratosthenes:

(* new.primes.spl*)let filter=rec filter=>  fn f=> fn l=>    case l of      nil=> nil| h :: t=>let rest= filter f tinif (f h)then (h :: rest)else rest        endinlet fromTo=rec fromTo=>  fn m=> fn n=>if (m> n)then nilelse (m :: (fromTo (succ m) n))inlet sieve=rec sieve=>  fn l=>    case l of      nil=> nil| p :: rest=>        p :: (sieve (filter (fn n=> n% p<>0) rest))inlet primes= fn n=>  sieve (fromTo2 n)in  primes1000end end end end(* : int list   ==> list@168*)(* prime list in [2,1000]*)

N-queens:

(* new.8queen.spl*)let foldr=rec foldr=>  fn f=> fn x=> fn l=>    case l of      nil=> x| h :: t=> f h (foldr f x t)inlet map=rec map=>  fn f=> fn l=>    case l of      nil=> nil| h :: t=> (f h) :: (map f t)inlet filter=rec filter=>  fn f=> fn l=>    case l of      nil=> nil| h :: t=>let rest= filter f tinif (f h)then (h :: rest)else rest        endinlet append=rec append=>  fn l1=> fn l2=>    case l1 of      nil=> l2| h :: t=> h :: (append t l2)inlet concat= foldr append nilinlet concatMap= fn f=> fn l=> concat (map f l)inlet fromTo=rec fromTo=>  fn m=> fn n=>if (m> n)then nilelse (m :: (fromTo (succ m) n))inlet queens= fn n=>let loop=rec loop=> fn boards=> fn counter=>if (counter= n)then boardselse (loop (concatMap expand boards) (succ counter))with expand=> fn board=>      map (fn x=> x :: board) (filter (fn x=> safe x board1) (fromTo1 n))with safe=> fn x=> fn l=> fn n=>      case l of        nil=>true| c :: y=> (x<> c) andalso (x<> (c+ n)) andalso (x<> (c- n))                           andalso (safe x y (succ n))in    loop (nil::nil)0  endin  queens8end end end end end end end end(* : int list list   ==> list@92*)(* all solutions for 8-queen problem   92 solutions   each is a int list of positions at each row*)

About

CS383 course project

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

[8]ページ先頭

©2009-2025 Movatter.jp