Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

A Programming Languages Compiler Compiler

License

NotificationsYou must be signed in to change notification settings

ourPLCC/plcc

Repository files navigation

PLCC is designed for teaching and learning programming language concepts.

Related repositories:

Options for Installation and Use

PLCC can be installed and used in different environments. The table belowmay help you determine which option is best for you and your class.

OptionSoftware RequirementsNon-Software RequirementsConsistent, Pre-configured Environment
GitPodWeb Browser* Account on GitPod
* Account on hosting service (GitLab/GitHub/Bitbucket)
* Knowledge of above and Git
Yes
DockerDocker DesktopMinimal understanding of DockerYes
Native* Bash/Linux-like environment
* Java >= 11
* Python >= 3.9
System administration knowledgeNo

The advantages of GitPod or Docker are (1) few or no software dependenciesand (2) the ability to provide your class/developers a consistent developmentenvironment with no installation necessary.

Having your students/developers perform native installations on theirindividual machines can lead to unexpected challenges due to the variety ofdifferent environments this creates. This can be mitigated by having yourIT staff install PLCC on a shared server or into a computer lab and havingyour students/developers use those if their native install stops workingfor them for some strange, inexplicable reason.

Install PLCC for Use in GitPod

Add the following to.gitpod.yml in the root ofyour GitLab/GitHub/Bitbucket repository.

image:gitpod/workspace-full:latesttasks:  -name:Install PLCCcommand:|        # To pin to a specific version of PLCC,        # in the next line, change main to something like v8.0.1        PLCC_GIT_BRANCH=main \          /bin/bash -c \          "$(\curl -fsSL https://github.com/ourPLCC/plcc/raw/main/installers/plcc/install.bash)" \          >> ~/.bashrc        exec bash

When the project is edited in a GitPod workspace, PLCC will be installed andavailable in the environments terminal.

Install for Use in Docker

InstallDocker Desktop.

Installplcc-con ...

  • On macOS

    /bin/bash -c"$(curl -fsSL https://github.com/ourPLCC/plcc/raw/main/installers/plcc-con/install.bash)">>~/.zshrc
  • On Windows >= 10, firstinstall WSL, and then ina Bash terminal in Ubuntu

    /bin/bash -c"$(curl -fsSL https://github.com/ourPLCC/plcc/raw/main/installers/plcc-con/install.bash)">>~/.bashrc
  • On Linux

    /bin/bash -c"$(curl -fsSL https://github.com/ourPLCC/plcc/raw/main/installers/plcc-con/install.bash)">>~/.bashrc

After starting a new shell, you can start a shell inside a PLCC containerthat has access to the files in your current directory.

plcc-con

Inside this shell, all of PLCC's commands are available. When you are done,typeexit.

You can also run a single PLCC command inside a PLCC container by passingthe command toplcc-con. For example, let's find out what version ofPLCC, Java, and Python are installed in the container.

plcc-con plcc --versionplcc-con java -versionplcc-con python3 --version

plcc-con is a Bash function that wraps the following Docker command

docker run --rm -it -v"${PWD}:/workdir" --user"$(id -u):$(id -g)" ghcr.io/ourplcc/plcc:latest

Native Install

Install a Bash/Linux environment

  • On Windows >= 10,pleaseinstall WSL. Then runa Terminal and open Ubuntu from its dropdown menu. You are now running inBash inside an Ubuntu running inside (or next to) Windows. Use this environment to installand use PLCC. From now on, when an instruction refers to Linux, make sureyou are running in this environment. Including the next line.

  • On Linux, we assume you are running in Bash on a Debian-based Linuxdistributed (this includes Ubuntu) which usesapt-get as its packagemanager. If this is not your situation, you will have to adapt the instructionsappropriately for your environment.

  • On macOS, pleaseinstall Homebrew.

Install Java

Check if you havejava andjavac >= 11

java -versionjavac -version

If you are missing either, or if their versions don't match, or either isoutdated, pleaseinstall SDKMAN!,and use it to install Java.

Install Python

Check if you have Python >= 3.9

python3 --version

If not, then install Python.

  • On macOS

    brew install python3
  • On Linux or Windows underWSL

    sudo apt-get updatesudo apt-get install python3

Install PLCC

  • On macOS (remove ">> ~/.zshrc" if you would like to update this file manually)

    brew install curl git/bin/bash -c"$(curl -fsSL https://github.com/ourPLCC/plcc/raw/main/installers/plcc/install.bash)" \>>~/.zshrc
  • On Linux or Windows underWSL (remove ">> ~/.bashrc" if you would like to update this file manually)

    sudo apt-get updatesudo apt-get install curl git/bin/bash -c"$(curl -fsSL https://github.com/ourPLCC/plcc/raw/main/installers/plcc/install.bash)">>~/.bashrc

Use

Now that you have a Linux-like, Bash-like environment installed withPLCC and its dependencies, here's how you use it.

mkdir mylangcd mylangvim samples# Write sample programs in your language.vim grammar# Write a grammar file defining your language.plccmk -c grammar# Compile grammar into a scanner, parser, and interpreter.scan< samples# Run the scanner on your samples.parse -n -t< samples# Run the parser on your samples.rep -n -t< samples# Run the interpreter on your samples.

Example

Create a filesamples with the following example programs.

3-(3,2)-(-(4,1), -(3,2))

Write agrammar file.

skipWHITESPACE'\s+'tokenWHOLE'\d+'tokenMINUS'\-'tokenLP'\('tokenRP'\)'tokenCOMMA','%<prog> ::= <exp><exp>WholeExp ::= <WHOLE><exp>SubExp ::=MINUSLP <exp>exp1COMMA <exp>exp2RP%Prog%%%publicvoid$run() {System.out.println(exp.eval());  }%%%Exp%%%abstractpublic inteval();%%%SubExp%%%publicinteval() {returnexp1.eval() -exp2.eval();  }%%%WholeExp%%%publicinteval() {returnInteger.parseInt(whole.toString());  }%%%

Compile it.

$ plccmk -c grammarNonterminals (* indicates start symbol):<exp>*<prog>Abstract classes:  ExpJavasource files created:  Exp.java  Prog.java  SubExp.java  WholeExp.java

Test the scanner.

$ scan< samples   1: WHOLE'3'   3: MINUS'-'   3: LP'('   3: WHOLE'3'   3: COMMA','   3: WHOLE'2'   3: RP')'   5: MINUS'-'   5: LP'('   5: MINUS'-'   5: LP'('   5: WHOLE'4'   5: COMMA','   5: WHOLE'1'   5: RP')'   5: COMMA','   5: MINUS'-'   5: LP'('   5: WHOLE'3'   5: COMMA','   5: WHOLE'2'   5: RP')'   5: RP')'

Test the parser.

$ parse -t -n< samples   1:<prog>   1:|<exp>WholeExp   1:|| WHOLE"3"OK   3:<prog>   3:|<exp>SubExp   3:|| MINUS"-"   3:|| LP"("   3:||<exp>WholeExp   3:||| WHOLE"3"   3:|| COMMA","   3:||<exp>WholeExp   3:||| WHOLE"2"   3:|| RP")"OK   5:<prog>   5:|<exp>SubExp   5:|| MINUS"-"   5:|| LP"("   5:||<exp>SubExp   5:||| MINUS"-"   5:||| LP"("   5:|||<exp>WholeExp   5:|||| WHOLE"4"   5:||| COMMA","   5:|||<exp>WholeExp   5:|||| WHOLE"1"   5:||| RP")"   5:|| COMMA","   5:||<exp>SubExp   5:||| MINUS"-"   5:||| LP"("   5:|||<exp>WholeExp   5:|||| WHOLE"3"   5:||| COMMA","   5:|||<exp>WholeExp   5:|||| WHOLE"2"   5:||| RP")"   5:|| RP")"OK

Test the interpreter.

$ rep -n< samples312$

Commands

This section provides a brief reference to the commands PLCC provides.

plcc file  Run plcc.py on 'file' to generate code in a directory named 'Java/'.plccmk [-c] [--json_ast] [file]  Run plcc.py on 'file' and compile its results.  '-c' Removes 'Java/' before regenerating it.  '--json_ast' add support to print JSON ASTs.      Required if you want to call parse with --json_ast.  'file' defaults to 'grammar'scan [file...]    Run Java/Scan on each file and then stdin printing recognized tokens.parse [-t] [-n] [--json_ast] [file...]  Run Java/Parser on each file and then stdin,  printing OK for recognized programs and errors otherwise.  '-t' Print trace (i.e., parse tree).  '-n' Suppress prompt.  '--json_ast' print JSON AST to stdout.rep [-t] [-n] [file...]    Run Java/Rep on each file and then stdin.    REP Reads, Executes, and Prints each program in the input.    '-t' Print trace (i.e., parse tree).    '-n' Suppress prompt.

Grammar Files

A grammar file consist of three sections separated by a line containinga single percent.

[Lexical specification]%[Syntactic specification]%[Semantic specification]

PLCC generates a different tool from each section.

Grammar SectionTool Generated
Lexical SpecificationScanner
Syntactic SpecificationParser
Semantic SpecificationInterpreter

The tools are dependent on each other as follows:

Interpreter -> Parser -> Scanner

Likewise the corresponding sections are dependent oneach other:

Semantic -> Syntactic -> Lexical

For example, to build a parser, you don't need a semantic spec,but you do need a lexical and syntactic specs.

An external file can be include from anywhere in the spec(replace FILENAME with the file you want to include).

include FILENAME

Lexical Specification

The lexical specification containstoken andskip rules,one per line. Lines starting with# are comments.For example,

# Skip rules discard the text they match.skip WHITESPACE '\s+'# Token rules emits a Token containing their name and the match.token PLUS "\+"token WORD "\w+"
  • Names must be all-caps and may have underscores.
  • Patterns are regular expressions (regex) enclosed in either singleor double quotes. Here are some resources on regex.

Scan algorithm

Below is PLCC's scan algorithm in pseudo-code. For clarity and simplicity, a couple details related to advanced features have been omitted.

DEFINE:Scan input for tokens.

While there is more unscanned input ...

  1. Identify the specification rule to apply. (defined below)
  2. Remove the non-empty string matched by the rule from the start of unscanned input.
  3. If rule is not a "skip rule", create and emit a token.
DEFINE:Identify the specification rule to apply.
  1. Identify rules that match a non-empty sequence of characters from the start of the unscanned input.
  2. If no such rule exists, emit a lexical error and stop scanning.
  3. Otherwise, from the matching rules, identify the rule that appears first in the specification.
  4. If the matching rule that appears first is a skip rule, then return it as the rule to apply.
  5. Otherwise, from the matching rules, remove all skip rules, leaving only token rules.
  6. From the matching token rules, identify rules with the longest match.
  7. From the rules with the longest match, identify the rule that appears first in the lexical specification.
  8. Return this first, longest-match, token rule.
Notes:
  • Rules do not match across newline characters.

Syntactic Specification

A syntax specification is a flavor ofBNF (Backus-Naur From).

<prog> ::= <exp><exp>WholeExp ::= <WHOLE><exp>SubExp ::= MINUS LP <exp>exp1 COMMA <exp>exp2 RP
  • Non-terminals are always enclosed in angles and startwith a lowercase. E.g.,<exp>.
  • Each non-terminal must be defined by appearing on the left-hand-sideof at least one rule.
  • Terminals are always all-caps, and MAY be enclosedin angles. E.g.,<WHOLE> andMINUS.
  • Terminals represent tokens which are generated by the scanner fromthe input program.
  • Any symbol enclosed in angles will be included inthe parse tree. So<WHOLE> will be included,butMINUS will not.
  • When a symbol appears more than once on the right-handside of a rule, each must be given a name to distinguish it from the others. For example, in<exp>exp1 the distinguishing name isexp1. That name must start with a lower case.
  • When a non-terminal appears multiple times on the left-hand-side, each must be given a name to distinguish itfrom the others. The name must start with an upper case letter. For example, in<exp>SubExp the distinguishing name isSubExp.
  • Alternative definitions for a non-terminal is accomplished byproviding multiple rules that define the same non-terminal.

Parse Tree Class Hierarchy

PLCC translates semantic rules into a class hierarchy. For example:

<prog> ::= <exp><exp>WholeExp ::= <WHOLE><exp>SubExp ::= MINUS LP <exp>exp1 COMMA <exp>exp2 RP

becomes (some details omitted):

classProgextends_Start {Expexp; }abstractclassExp {}classWholeExpextendsExp {Tokenwhole; }classSubExpextendsExp {Expexp1;Expexp2; }
  • A class is generated for the non-terminal defined by a rule (the LHS) with instance variables defined for each captured symbols (within<>) on the RHS.
  • The first rule defines the start symbol,and its class inherits from the standard, built-in class _Start.
  • A non-terminal defined more than once becomes an abstract base class,and the distinguishing names become its subclasses.
  • Tokens always have the type of Token.

Repetition Rule

The repetition rule simplifies defining a repeating structure.

<pairs> **= LP <WHOLE>x <WHOLE>y RP +COMMA

<pairs> matches zero or more (x,y) pairs separated by comma: e.g.,(3 4), (5 6), (7 8). The separator clause (e.g.,+COMMA) is optional. For example,

PLCC translates the above rule into:

classPairs {List<Val>xList;List<Val>yList; }

The captured symbols become parallel lists. That is,xList.git(i) andyList.get(i) correspond to the ith value pair.

Parse Algorithm

The parsing algorithm is a recursive-descent parser that parses languagesin LL(1). Let's take a look.

Code Generated for Parser

Each rule in the syntactic spec turns into a static parse method embeddedin the class generated by the same rule. For example,

<prog> ::= <exp><exp>WholeExp ::= <WHOLE><exp>SubExp ::= MINUS LP <exp>exp1 COMMA <exp>exp2 RP

This generates (slightly simplified):

classProg{staticProgparse(Scanscn$) {Expexp =Exp.parse(scn$);returnnewProg(exp);  }}classExp{publicstaticExpparse(Scanscn$) {Tokent$ =scn$.cur();Token.Matchmatch$ =t$.match;switch(match$) {caseWHOLE:returnWholeExp.parse(scn$);caseMINUS:returnSubExp.parse(scn$);default:thrownewPLCCException("Parse error");    }  }}classWholeExp{staticWholeExpparse(Scanscn$) {Tokenwhole =scn$.match(Token.Match.WHOLE);returnnewWholeExp(whole);  }}classSubExp{staticSubExpparse(Scanscn$) {scn$.match(Token.Match.MINUS);scn$.match(Token.Match.LP);Expexp1 =Exp.parse(scn$);scn$.match(Token.Match.COMMA);Expexp2 =Exp.parse(scn$);scn$.match(Token.Match.RP);returnnewSubExp(exp1,exp2);  }}

Parsing starts with the start symbol: hereProg.Each parse method is responsible for matching the structure of the RHS ofthe syntactic rule it represents. It does so from left to right.Tokens are matched directly by telling the scanner what type of tokenit should match and consume. Non-terminals are matched by calling theirparse method which matches its structure, returning an instance of thatnon-terminal's class.

The parse function of an abstract base class (e.g., inExp above)determines which subclass's method to call by looking at the next token.

Semantic specification

The semantic specification injects code intopredefined locations (called hooks) within each classgenerated from the syntactic specification.

Prog%%%publicvoid$run() {System.out.println(exp.eval());  }%%%Exp%%%abstractpublic inteval();%%%SubExp%%%@Overridepublicinteval() {returnexp1.eval() -exp2.eval();  }%%%WholeExp%%%@Overridepublicinteval() {returnInteger.parseInt(whole.toString());  }%%%
  • The class representing the start symbol should override the$runmethod. Execution of the interpreter begins here: seeProg above.
  • To enable polymorphism, add an abstract method to the abstract baseclass representing a non-terminal that has alternatives: seeExp above.Then override this method in the subclasses representing theconcrete alternatives: seeSubExp andWholeExp above.

Hooks

By default, you can only inject code at the end of a class.Hooks allow you to inject code elsewhere.

  • <classname>:top - Top of file.
  • <classname>:import - Addimports.
  • <classname>:class - Declareextends orimplements.
  • <classname>:init - Constructor.

As an example, we update our original example by replacing thedefinition for WholeExp with this.

WholeExp:import%%%importjava.util.HashMap;%%%WholeExp%%%publicstaticHashMap<Integer,Integer> =newHashMap<Integer,Integer>();publicinteval() {intx =Integer.parseInt(whole.toString());checkDuplicate(x);returnx;  }publicvoidcheckDuplicate(intx) {if (seen.containsKey(x)) {System.out.println("Duplicate: " +x);    }else {seen.put(x,x);    }  }%%%

Now our interpreter reports when it sees a duplicate whole number.

Adding additional Java files/classes

Entire Java files can be added by naming,and providing a complete definition of,a class that is not generated from the syntactic specification.

Helper%%%importjava.util.List;publicclassHelper {  \\ ...}%%%

Serializing AST in JSON

To print a JSON AST for a program, pass--json_ast to bothplccmkandparse, like so:

plccmk --json_ast -c YOUR_GRAMMAR_FILEparse --json_ast< YOUR_PROGRAM_FILE

This feature allows other tools to be written in different languagesthat reads the JSON AST as input. In particular, there are plans toextend PLCC to allow semantics to be written in Python. This optionallows the parser implemented in Java to be reused by and interpreterwritten in Python.

Copyright and Licensing

ourPLCC is a community of developers that maintain a number of projectsrelated to PLCC. The contents of the projects were originally createdby Timothy Fossumplcc@pithon.net.

Thank you Tim!

This program is free software: you can redistribute it and/or modify itunder the terms of the GNU General Public License as published bythe Free Software Foundation, either version 3 of the License,or (at your option) any later version.

This program is distributed in the hope that it will be useful,but WITHOUT ANY WARRANTY; without even the implied warranty ofMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.See the GNU General Public License for more details.

You should have received a copy of theGNU General Public Licensealong with this program. If not, seehttps://www.gnu.org/licenses/.

Third Party Libraries

PLCC uses and distributesJackson JSON Processor,under theApache 2.0.

About

A Programming Languages Compiler Compiler

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors6


[8]ページ先頭

©2009-2025 Movatter.jp