streamich/jit-parser
Top-down recursive descent backtracking PEG scanner-less JIT parser combinator generator.
A high-performance parser library that compiles grammar definitions into efficient JavaScript parsing functions at runtime. It generates both Concrete Syntax Trees (CST) and Abstract Syntax Trees (AST) from textual input.
- Installation
- Quick Start
- Grammar Node Types
- Tree Types
- Grammar Compilation
- Debug Mode
- Examples
- API Reference
Install the package from npm:

    npm install jit-parser
Quick start: define a grammar, compile it, and parse some input:

    import {CodegenGrammar, ParseContext} from 'jit-parser';

    // Define a simple grammar
    const grammar = {
      start: 'Value',
      cst: {
        Value: 'hello',
      },
    };

    // Compile the grammar to JavaScript
    const parser = CodegenGrammar.compile(grammar);

    // Parse input
    const ctx = new ParseContext('hello', false);
    const cst = parser(ctx, 0);
    console.log(cst); // CST node representing the parse result
JIT Parser supports five main grammar node types for defining parsing rules. Grammar rules can be fully defined in JSON, making them language-agnostic and easy to serialize.
A reference node (RefNode) refers to a named node defined elsewhere in the grammar.

Interface:

    type RefNode<Name extends string = string> = {r: Name};

Syntax:

    {r: 'NodeName'}

Example:

    const grammar = {
      start: 'Program',
      cst: {
        Program: {r: 'Statement'},
        Statement: 'return;',
      },
    };
A terminal node (TerminalNode) matches a literal string, a regular expression, or one of an array of strings. Terminal nodes are the leaf nodes of the parse tree.

Interface:

    interface TerminalNode {
      type?: string; // Type name (default: "Text")
      t: RegExp | string | '' | string[]; // Pattern(s) to match
      repeat?: '*' | '+'; // Repetition (only for string arrays)
      sample?: string; // Sample text for generation
      ast?: AstNodeExpression; // AST transformation
    }

    // Shorthand: string, RegExp, or empty string
    type TerminalNodeShorthand = RegExp | string | '';

Syntax:

    // String literal
    'hello'

    // Regular expression
    /[a-z]+/

    // Array of alternatives
    {t: ['true', 'false']}

    // With repetition
    {t: [' ', '\t', '\n'], repeat: '*'}

    // Full terminal node
    {t: /\d+/, type: 'Number', sample: '123'}

Examples:

    // Simple string terminal
    Value: 'null'

    // RegExp terminal
    Number: /\-?\d+(\.\d+)?/

    // Alternative strings
    Boolean: {t: ['true', 'false']}

    // Repeating whitespace
    WS: {t: [' ', '\t', '\n'], repeat: '*'}
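
To make the terminal forms above concrete, here is a minimal sketch of how each one could be matched at a fixed offset without a separate tokenizer. The helpers `matchTerminal` and `matchRepeated` are hypothetical and written for illustration only. They are not part of jit-parser, and the use of the sticky (`y`) RegExp flag is an assumption about one reasonable technique, not a description of the code the library generates.

```javascript
// Hypothetical helper, not part of jit-parser: match a terminal at offset `pos`.
// Returns the end offset on success, or -1 on failure.
function matchTerminal(t, str, pos) {
  if (typeof t === 'string') {
    return str.startsWith(t, pos) ? pos + t.length : -1;
  }
  if (t instanceof RegExp) {
    // The sticky flag anchors the match exactly at `lastIndex`.
    const flags = t.flags.includes('y') ? t.flags : t.flags + 'y';
    const sticky = new RegExp(t.source, flags);
    sticky.lastIndex = pos;
    const m = sticky.exec(str);
    return m ? pos + m[0].length : -1;
  }
  // Array of alternative strings: the first one that matches wins.
  for (const s of t) {
    if (str.startsWith(s, pos)) return pos + s.length;
  }
  return -1;
}

// Hypothetical helper: apply `repeat: '*'` or `repeat: '+'` to a terminal.
function matchRepeated(t, repeat, str, pos) {
  let end = pos;
  let count = 0;
  for (;;) {
    const next = matchTerminal(t, str, end);
    if (next <= end) break; // no match, or an empty match that makes no progress
    end = next;
    count++;
  }
  return repeat === '+' && count === 0 ? -1 : end;
}

console.log(matchTerminal(/\-?\d+(\.\d+)?/, 'a-12.5b', 1));
console.log(matchRepeated([' ', '\t', '\n'], '*', '  \tx', 0));
```

Returning an end offset instead of a substring keeps the sketch allocation-free, which is in the spirit of a scanner-less parser that works directly on positions in the input.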
A production node (ProductionNode) matches a sequence of grammar nodes in order. Every node in the sequence must match for the production to succeed.

Interface:

    interface ProductionNode {
      p: GrammarNode[]; // Sequence of nodes to match
      type?: string; // Type name (default: "Production")
      children?: Record<number, string>; // Child index to property mapping
      ast?: AstNodeExpression; // AST transformation
    }

    // Shorthand: array of grammar nodes
    type ProductionNodeShorthand = GrammarNode[];

Syntax:

    // Shorthand array
    ['{', {r: 'Content'}, '}']

    // Full production node
    {
      p: ['{', {r: 'Content'}, '}'],
      type: 'Block',
      children: {
        1: 'content', // Maps index 1 to 'content' property
      },
    }

Examples:

    // Function call: func()
    FunctionCall: ['func', '(', ')']

    // Object with named children
    Object: {
      p: ['{', {r: 'Members'}, '}'],
      children: {1: 'members'},
    }
A union node (UnionNode) matches one of several alternative patterns. Alternatives are tried in order and the first one that matches is selected (ordered choice).

Interface:

    interface UnionNode {
      u: GrammarNode[]; // Array of alternative nodes
      type?: string; // Type name (default: "Union")
      ast?: AstNodeExpression; // AST transformation
    }

Syntax:

    {u: [pattern1, pattern2, pattern3]}

Examples:

    // Literal values
    Literal: {
      u: ['null', 'true', 'false', {r: 'Number'}, {r: 'String'}],
    }

    // Statement types
    Statement: {
      u: [
        {r: 'IfStatement'},
        {r: 'ReturnStatement'},
        {r: 'ExpressionStatement'},
      ],
    }
A list node (ListNode) matches zero or more repetitions of a pattern.

Interface:

    interface ListNode {
      l: GrammarNode; // Node to repeat
      type?: string; // Type name (default: "List")
      ast?: AstNodeExpression; // AST transformation
    }

Syntax:

    {l: pattern}

Examples:

    // Zero or more statements
    Statements: {l: {r: 'Statement'}}

    // Comma-separated list
    Arguments: {
      l: {
        p: [',', {r: 'Expression'}],
        ast: ['$', '/children/1'], // Extract the expression, ignore comma
      },
    }
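
The semantics of the five node types can be summarized with a small interpreter. This is a sketch for illustration only: jit-parser compiles grammars to JavaScript functions instead of interpreting them, and the `parse` function below is hypothetical. It handles only the shorthand terminal forms (string and RegExp), ignores `ast` and `children`, and returns simplified `{type, pos, end, children}` nodes in place of real CST nodes.

```javascript
// Hypothetical interpreter for illustration; jit-parser generates code instead.
function parse(grammar, str) {
  const rules = grammar.cst;

  // Returns a CST-like node {type, pos, end, children?}, or undefined on failure.
  function match(node, pos, name) {
    // Terminal shorthand: string.
    if (typeof node === 'string') {
      return str.startsWith(node, pos)
        ? {type: name || 'Text', pos, end: pos + node.length}
        : undefined;
    }
    // Terminal shorthand: RegExp, anchored at `pos` with the sticky flag.
    if (node instanceof RegExp) {
      const flags = node.flags.includes('y') ? node.flags : node.flags + 'y';
      const re = new RegExp(node.source, flags);
      re.lastIndex = pos;
      const m = re.exec(str);
      return m ? {type: name || 'Text', pos, end: pos + m[0].length} : undefined;
    }
    // Production shorthand: array of nodes.
    if (Array.isArray(node)) return match({p: node}, pos, name);
    // Reference: look up the named rule.
    if ('r' in node) return match(rules[node.r], pos, node.r);
    // Production: every part must match, one after another.
    if ('p' in node) {
      const children = [];
      let end = pos;
      for (const part of node.p) {
        const child = match(part, end);
        if (!child) return undefined; // the whole sequence fails
        children.push(child);
        end = child.end;
      }
      return {type: node.type || name || 'Production', pos, end, children};
    }
    // Union: ordered choice, first matching alternative wins.
    if ('u' in node) {
      for (const alt of node.u) {
        const child = match(alt, pos);
        if (child) {
          return {type: node.type || name || 'Union', pos, end: child.end, children: [child]};
        }
      }
      return undefined;
    }
    // List: zero or more repetitions, so it never fails.
    if ('l' in node) {
      const children = [];
      let end = pos;
      for (;;) {
        const child = match(node.l, end);
        if (!child || child.end === end) break; // stop on failure or no progress
        children.push(child);
        end = child.end;
      }
      return {type: node.type || name || 'List', pos, end, children};
    }
    throw new Error('Unsupported grammar node');
  }

  return match({r: grammar.start}, 0);
}

// Example grammar (hypothetical): a parenthesized, comma-separated argument list.
const grammar = {
  start: 'Args',
  cst: {
    Args: ['(', {u: [{r: 'ArgList'}, '']}, ')'],
    ArgList: [{r: 'Arg'}, {l: [',', {r: 'Arg'}]}],
    Arg: {u: [/\d+/, 'true', 'false']},
  },
};

const cst = parse(grammar, '(1,true,22)');
console.log(cst.type, cst.pos, cst.end); // Args 0 11
```

Because the union is an ordered choice, the empty alternative `''` is listed last in `Args`. Placed first, it would always succeed with an empty match and `ArgList` would never be tried.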
JIT Parser works with four types of tree structures:
The grammar tree is the grammar definition that describes the parsing rules. It is built from the node types described above, which define how to parse the input text.

The Concrete Syntax Tree (CST) is the parse tree. It contains every matched token and preserves the complete structure of the parsed input.

Interface:

    interface CstNode {
      ptr: Pattern; // Reference to grammar pattern
      pos: number; // Start position in input
      end: number; // End position in input
      children?: CstNode[]; // Child nodes
    }

Example CST:

    // For input: '{"foo": 123}'
    {
      ptr: ObjectPattern,
      pos: 0,
      end: 12,
      children: [
        {ptr: TextPattern, pos: 0, end: 1}, // '{'
        {
          ptr: MembersPattern,
          pos: 1,
          end: 11, // '"foo": 123'
          children: [...],
        },
        {ptr: TextPattern, pos: 11, end: 12}, // '}'
      ],
    }
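
Since a CST node stores only offsets, the text it covers is recovered by slicing the original input with `pos` and `end`. The sketch below shows this on a hand-written tree for the 12-character input `{"foo": 123}`. The `dumpCst` helper is hypothetical, and the `ptr` field is omitted because its shape is not specified here.

```javascript
// Hypothetical helper: list each CST-like node with the source text it spans.
function dumpCst(node, src, indent = '') {
  const text = JSON.stringify(src.slice(node.pos, node.end));
  const lines = [`${indent}${node.pos}:${node.end} ${text}`];
  for (const child of node.children || []) {
    lines.push(...dumpCst(child, src, indent + '  '));
  }
  return lines;
}

const src = '{"foo": 123}';

// Hand-written tree; `ptr` is left out for brevity.
const cst = {
  pos: 0,
  end: 12,
  children: [
    {pos: 0, end: 1}, // '{'
    {pos: 1, end: 11}, // '"foo": 123'
    {pos: 11, end: 12}, // '}'
  ],
};

console.log(dumpCst(cst, src).join('\n'));
```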
The Abstract Syntax Tree (AST) is a simplified tree derived from the CST. It typically contains only the semantically meaningful nodes.

Default AST Interface:

    interface CanonicalAstNode {
      type: string; // Node type
      pos: number; // Start position
      end: number; // End position
      raw?: string; // Raw matched text
      children?: (CanonicalAstNode | unknown)[]; // Child nodes
    }

Example AST:

    // For input: '{"foo": 123}'
    {
      type: 'Object',
      pos: 0,
      end: 12,
      children: [
        {
          type: 'Entry',
          key: {type: 'String', value: 'foo'},
          value: {type: 'Number', value: 123},
        },
      ],
    }
Default Conversion: Each CST node becomes an AST node with type, pos, end, and children properties.

AST Expressions: Use the ast property in grammar nodes to customize AST generation:

- ast: null - Skip this node in the AST
- ast: ['$', '/children/0'] - Use the first child's AST
- ast: {...} - Custom JSON expression for transformation

Children Mapping: Use the children property to map CST child indices to AST properties:

    {
      children: {
        0: 'key', // CST child 0 -> AST property 'key'
        2: 'value', // CST child 2 -> AST property 'value'
      },
    }

Type Override: Specify a custom type property instead of the default node type names.
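
The two mechanisms above, path lookups such as `['$', '/children/0']` and the `children` mapping, can be illustrated with plain functions. This is a sketch under assumptions: `get` and `mapChildren` are hypothetical helpers, the path is treated as a simple slash-separated list of keys with no escaping, and the library's actual expression evaluator may behave differently.

```javascript
// Hypothetical helper: resolve a slash-separated path such as '/children/0/raw'.
function get(node, path) {
  let value = node;
  for (const key of path.split('/').slice(1)) {
    if (value === null || typeof value !== 'object') return undefined;
    value = value[key];
  }
  return value;
}

// Hypothetical helper: apply a children mapping such as {0: 'key', 2: 'value'}.
function mapChildren(node, mapping) {
  const out = {type: node.type};
  for (const [index, prop] of Object.entries(mapping)) {
    out[prop] = node.children[Number(index)];
  }
  return out;
}

// Hand-written node standing in for a parsed `"foo":123` entry.
const entry = {
  type: 'Entry',
  children: [
    {type: 'String', raw: '"foo"'},
    {type: 'Text', raw: ':'},
    {type: 'Number', raw: '123'},
  ],
};

console.log(get(entry, '/children/2/raw'));
console.log(mapChildren(entry, {0: 'key', 2: 'value'}));
```

Note how the mapping drops child 1, the `:` separator, which is the usual reason to name only the meaningful children.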
If debug mode is enabled during compilation, the parser captures all grammar node tree paths that were attempted during parsing. This debug trace tree is useful for debugging parser behavior and improving parser performance by understanding which rules were tried and failed.
Interface:

    interface TraceNode {
      type: string; // Grammar rule name that was attempted
      pos: number; // Start position where rule was tried
      end?: number; // End position if rule succeeded
      children?: TraceNode[]; // Nested rule attempts
      success: boolean; // Whether the rule matched successfully
    }

The debug trace captures the complete parsing process, including failed attempts, making it invaluable for understanding complex parsing scenarios and optimizing grammar rules.
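
As one way to use such a trace for grammar tuning, the sketch below counts failed attempts per rule name. The `countFailures` helper is hypothetical, and the trace object is written by hand to follow the TraceNode interface above. It is not output produced by the library.

```javascript
// Hypothetical helper: count failed attempts per rule in a TraceNode-shaped tree.
function countFailures(node, counts = new Map()) {
  if (node.success === false) {
    counts.set(node.type, (counts.get(node.type) || 0) + 1);
  }
  for (const child of node.children || []) {
    countFailures(child, counts);
  }
  return counts;
}

// Hand-written trace: three alternatives fail before 'Object' matches.
const trace = {
  type: 'TValue',
  pos: 1,
  end: 22,
  success: true,
  children: [
    {type: 'Null', pos: 1, success: false},
    {type: 'Boolean', pos: 1, success: false},
    {type: 'String', pos: 1, success: false},
    {type: 'Object', pos: 1, end: 22, success: true},
  ],
};

console.log([...countFailures(trace)]);
```

Rules that fail often before a later alternative succeeds are candidates for reordering within a union, since alternatives are tried in order.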
Grammars are compiled to efficient JavaScript functions that can parse input strings rapidly.

    import {CodegenGrammar} from 'jit-parser';

    const grammar = {
      start: 'Value',
      cst: {
        Value: {r: 'Number'},
        Number: /\d+/,
      },
    };

    // Compile to parser function
    const parser = CodegenGrammar.compile(grammar);

Compilation options are passed through a CodegenContext:

    import {CodegenContext, CodegenGrammar} from 'jit-parser';

    const ctx = new CodegenContext(
      true, // positions: Include pos/end in AST
      true, // astExpressions: Process AST transformations
      false, // debug: Generate debug trace code
    );

    const parser = CodegenGrammar.compile(grammar, ctx);

You can print the grammar structure by converting it to a string:

    import {GrammarPrinter} from 'jit-parser';

    const grammarString = GrammarPrinter.print(grammar);
    console.log(grammarString);

Example output:

    Value (reference)
    └─ Number (terminal): /\d+/

A larger example, an excerpt of a JSON grammar:

    import {CodegenGrammar, GrammarPrinter} from 'jit-parser';

    const jsonGrammar = {
      start: 'Value',
      cst: {
        WOpt: {t: [' ', '\n', '\t', '\r'], repeat: '*', ast: null},
        Value: [{r: 'WOpt'}, {r: 'TValue'}, {r: 'WOpt'}],
        TValue: {
          u: ['null', {r: 'Boolean'}, {r: 'Number'}, {r: 'String'}, {r: 'Object'}, {r: 'Array'}],
        },
        Boolean: {t: ['true', 'false']},
        Number: /\-?\d+(\.\d+)?([eE][\+\-]?\d+)?/,
        String: /"[^"\\]*(?:\\.[^"\\]*)*"/,
        Object: ['{', {r: 'Members'}, '}'],
        Members: {
          u: [
            {
              p: [
                {r: 'Entry'},
                {l: {p: [',', {r: 'Entry'}], ast: ['$', '/children/1']}},
              ],
              ast: ['concat', ['push', [[]], ['$', '/children/0']], ['$', '/children/1']],
            },
            {r: 'WOpt'},
          ],
        },
        Entry: {p: [{r: 'String'}, ':', {r: 'Value'}], children: {0: 'key', 2: 'value'}},
        Array: ['[', {r: 'Elements'}, ']'],
        // ... more rules
      },
      ast: {
        Value: ['$', '/children/1'], // Extract middle child (TValue)
        Boolean: ['==', ['$', '/raw'], 'true'], // Convert to boolean
        Number: ['num', ['$', '/raw']], // Convert to number
      },
    };

    const parser = CodegenGrammar.compile(jsonGrammar);
    console.log(GrammarPrinter.print(jsonGrammar));

Debug mode captures a trace of the parsing process, showing which grammar rules were attempted at each position.

    import {CodegenContext, CodegenGrammar, ParseContext, printTraceNode} from 'jit-parser';

    // Enable debug during compilation
    const debugCtx = new CodegenContext(true, true, true); // debug = true
    const parser = CodegenGrammar.compile(grammar, debugCtx);

    // Create trace collection
    const rootTrace = {pos: 0, children: []};
    const parseCtx = new ParseContext('input text', false, [rootTrace]);

    // Parse with debug trace
    const cst = parser(parseCtx, 0);

    // Print debug trace
    console.log(printTraceNode(rootTrace, '', 'input text'));

The debug trace shows:
- Which grammar rules were attempted
- At what positions in the input
- Whether each attempt succeeded or failed
- The hierarchical structure of rule attempts
Example trace output:

    Root
    └─ Value 0:22 → ' {"foo": ["bar", 123]}'
       ├─ WOpt 0:1 → " "
       ├─ TValue 1:22 → '{"foo": ["bar", 123]}'
       │  ├─ Null
       │  ├─ Boolean
       │  ├─ String
       │  └─ Object 1:22 → '{"foo": ["bar", 123]}'
       │     ├─ Text 1:2 → "{"
       │     ├─ Members 2:21 → '"foo": ["bar", 123]'
       │     │  └─ Production 2:21 → '"foo": ["bar", 123]'
       │     │     ├─ Entry 2:21 → '"foo": ["bar", 123]'
       │     │     │  ├─ String 2:7 → '"foo"'
       │     │     │  ├─ Text 7:8 → ":"
       │     │     │  └─ Value 8:21 → ' ["bar", 123]'
       │     │     │     └─ ...
       │     │     └─ List 21:21 → ""
       │     └─ Text 21:22 → "}"
       └─ WOpt 22:22 → ""

A minimal expression grammar, parsed and converted to an AST:

    import {CodegenGrammar, ParseContext} from 'jit-parser';

    const exprGrammar = {
      start: 'Expression',
      cst: {
        Expression: {r: 'Number'},
        Number: {t: /\d+/, type: 'Number'},
      },
    };

    const parser = CodegenGrammar.compile(exprGrammar);
    const ctx = new ParseContext('42', true);
    const cst = parser(ctx, 0);
    const ast = cst.ptr.toAst(cst, '42');
    console.log(ast); // {type: 'Number', pos: 0, end: 2, raw: '42'}

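Parsing JSON with the bundled JSON grammar:

    import {CodegenGrammar, ParseContext} from 'jit-parser';
    import {grammar as jsonGrammar} from 'jit-parser/lib/grammars/json';

    const parser = CodegenGrammar.compile(jsonGrammar);
    const json = '{"name": "John", "age": 30}';
    const ctx = new ParseContext(json, true);
    const cst = parser(ctx, 0);
    const ast = cst.ptr.toAst(cst, json);
    console.log(ast);
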
A grammar with a custom AST transformation:

    const grammar = {
      start: 'KeyValue',
      cst: {
        KeyValue: {
          p: [{r: 'Key'}, '=', {r: 'Value'}],
          children: {0: 'key', 2: 'value'},
          type: 'Assignment',
        },
        Key: /[a-zA-Z]+/,
        Value: /\d+/,
      },
      ast: {
        KeyValue: {
          type: 'Assignment',
          key: ['$', '/children/0/raw'],
          value: ['num', ['$', '/children/2/raw']],
        },
      },
    };

A grammar for a bracketed, comma-separated list:

    const listGrammar = {
      start: 'List',
      cst: {
        List: ['[', {r: 'Items'}, ']'],
        Items: {
          u: [
            {
              p: [
                {r: 'Item'},
                {l: {p: [',', {r: 'Item'}], ast: ['$', '/children/1']}},
              ],
              ast: ['concat', ['push', [[]], ['$', '/children/0']], ['$', '/children/1']],
            },
            '', // Empty list
          ],
        },
        Item: /\w+/,
      },
    };

CodegenGrammar:

    static compile(grammar: Grammar, ctx?: CodegenContext): Parser
    compileRule(ruleName: string): Pattern

ParseContext:

    constructor(str: string, ast: boolean, trace?: RootTraceNode[])

CodegenContext:

    constructor(positions: boolean, astExpressions: boolean, debug: boolean)

GrammarPrinter:

    static print(grammar: Grammar, tab?: string): string

Print a formatted CST tree
Print a formatted debug trace
See the Grammar Node Types section for complete interface definitions.
This parser generator provides a powerful and efficient way to build custom parsers with minimal code while maintaining high performance through JIT compilation.