Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork23
Universal Syntax Tree used by@unifiedjs
syntax-tree/unist
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
UniversalSyntaxTree.
unist is a specification for syntax trees.It has a bigecosystem of utilities in JavaScript forworking with these trees.It’s implemented by several other specifications.
This document may not be released.Seereleases for released documents.The latest released version is3.0.0.
This document defines a general-purpose format for syntax trees.Development of unist started in July 2015.This specification is written in aWeb IDL-like grammar.
Syntax trees are representations of source code or even natural language.These trees are abstractions that make it possible to analyze, transform, andgenerate code.
Syntax treescome in two flavors:
- concrete syntax trees: structures that represent every detail (such aswhite-space in white-space insensitive languages)
- abstract syntax trees: structures that only represent details relatingto the syntactic structure of code (such as ignoring whether a double orsingle quote was used in languages that support both, such as JavaScript).
This specification can express both abstract and concrete syntax trees.
unist is not intended to be self-sufficient.Instead, it is expected that other specifications implement unist and extend itto express language specific nodes.For example, see projects such ashast (for HTML),nlcst (fornatural language),mdast (for Markdown), andxast (for XML).
unist relates toJSON in that compliant syntax trees can be expressedcompletely in JSON.However, unist is not limited to JSON and can be expressed in other dataformats, such asXML.
unist relates toJavaScript in that it has a richecosystem ofutilities for working with compliant syntax trees inJavaScript.The five most used utilities combined are downloaded thirty million times eachmonth.However, unist is not limited to JavaScript and can be used in other programminglanguages.
unist relates to theunified,remark,rehype, andretextprojects in that unist syntax trees are used throughout their ecosystems.
unist relates to thevfile project in that it accepts unist nodes for itsmessage store, and that vfile can be a sourcefile of a syntaxtree.
If you are using TypeScript, you can use the unist types by installing themwith npm:
npm install @types/unist
Syntactic units in unist syntax trees are called nodes, and implement theNode interface.
interface Node { type:string data: Data? position: Position?}Thetype field is a non-empty string representing the variant of a node.This field can be used to determine thetype a node implements.
Thedata field represents information from the ecosystem.The value of thedata field implements theData interface.
Theposition field represents the location of a node in a source document.The value of theposition field implements thePositioninterface.Theposition field must not be present if a node isgenerated.
Specifications implementing unist are encouraged to define more fields.Ecosystems can define fields onData.
Any value in unistmust be expressible in JSON values:string,number,object,array,true,false, ornull.This means that the syntax tree should be able to be converted to and from JSONand produce the same tree.For example, in JavaScript, a tree can be passed throughJSON.parse(JSON.stringify(tree)) and result in the same tree.
interface Position { start: Pointend: Point}Position represents the location of a node in a sourcefile.
Thestart field ofPosition represents the place of the first character ofthe parsed source region.Theend field ofPosition represents the place of the first characterafter the parsed source region, whether it exists or not.The value of thestart andend fields implement thePointinterface.
If the syntactic unit represented by a node is not present in the sourcefile at the time of parsing, the node is said to begenerated and it must not have positional information.
For example, if the following value was represented as unist:
alphabravo
…the first word (alpha) would start at line1, column1, offset0, andend at line1, column6, offset5.The line feed would start at line1, column6, offset5, and end at line2, column1, offset6.The last word (bravo) would start at line2, column1, offset6, and endat line2, column6, offset11.
interface Point { line: number>=1 column: number>=1 offset: number>=0?}Point represents one place in a sourcefile.
Theline field (1-indexed integer) represents a line in a source file.Thecolumn field (1-indexed integer) represents a column in a source file.Theoffset field (0-indexed integer) represents a character in a source file.
The term character means a (UTF-16) code unit which is defined in theWeb IDL specification.
interface Data { }Data represents information associated by the ecosystem with the node.
This space is guaranteed to never be specified by unist or specificationsimplementing unist.
interface Parent<: Node { children: [Node]}
Nodes containing other nodes (said to bechildren) extend theabstract interfaceParent (Node).
Thechildren field is a list representing the children of a node.
interface Literal<: Node { value: any}
Nodes containing a value extend the abstract interfaceLiteral(Node).
Thevalue field can contain any value.
Atree is a node and all of itsdescendants (if any).
Node X ischild of node Y, if Y’schildren include X.
Node X isparent of node Y, if Y is achild of X.
Theindex of achild is its number of precedingsiblings, or0 if it has none.
Node X is asibling of node Y, if X and Y have the sameparent (if any).
Theprevious sibling of achild is itssibling at itsindex minus 1.
Thenext sibling of achild is itssibling at itsindex plus 1.
Theroot of a node is itself, if withoutparent, or theroot of itsparent.
Theroot of atree is any node in thattreewithoutparent.
Node X isdescendant of node Y, if X is achild of Y, or ifX is achild of node Z that is adescendant of Y.
Aninclusive descendant is a node or one of itsdescendants.
Node X is anancestor of node Y, if Y is adescendantof X.
Aninclusive ancestor is a node or one of itsancestors.
Thehead of a node is its firstchild (if any).
Thetail of a node is its lastchild (if any).
Aleaf is a node with nochildren.
Abranch is a node with one or morechildren.
A node isgenerated if it does not havepositionalinformation.
Thetype of a node is the value of itstype field.
Thepositional information of a node is the value of itsposition field.
Afile is a source document that represents the original file that wasparsed to produce the syntax tree.Positional information represents the place of a nodein this file.Files are provided by the host environment and not defined by unist.
For example, see projects such asvfile.
Inpreorder (NLR) isdepth-firsttreetraversal that performs the following steps for each nodeN:
- N: visitN itself
- L: traversehead (then itsnext sibling, recursivelymoving forward until reachingtail)
- R: traversetail
Inpostorder (LRN) isdepth-firsttreetraversal that performs the following steps for each nodeN:
- L: traversehead (then itsnext sibling, recursivelymoving forward until reachingtail)
- R: traversetail
- N: visitN itself
Enter is a step right before other steps performed on a given nodeN whentraversing a tree.
For example, when performingpreorder traversal,enter is the first steptaken, right before visitingN itself.
Exit is a step right after other steps performed on a given nodeN whentraversing a tree.
For example, when performingpreorder traversal,exit is the last steptaken, right after traversing thetail ofN.
Tree traversal is a common task when working with atree tosearch it.Tree traversal is typically eitherbreadth-first ordepth-first.
In the following examples, we’ll work with this tree:
graph TD A-->B-->C B-->D B-->E A-->F-->GBreadth-first traversal is visiting a node and all itssiblings to broaden the search at that level, beforetraversingchildren.
For the syntax tree defined in the diagram, a breadth-first traversal firstsearches the root of the tree (A), then its children (B andF), thentheir children (C,D,E, andG).
Alternatively, and more commonly,depth-first traversal is used.The search is first deepened, by traversingchildren, beforetraversingsiblings.
For the syntax tree defined in the diagram, a depth-first traversal firstsearches the root of the tree (A), then one of its children (B orF), then their children (C,D, andE, orG).
For a given nodeN withchildren, adepth-first traversalperforms three steps, simplified to only binary trees (every node hashead andtail, but no other children):
These steps can be done in any order, but for non-binary trees,L andRoccur together.IfL is done beforeR, the traversal is calledleft-to-righttraversal, otherwise it is calledright-to-left traversal.In the case of non-binary trees, the other children betweenhead andtailare processed in that order as well, so forleft-to-right traversal, firsthead is traversed (L), then itsnext sibling is traversed, etcetera,until finallytail (R) is traversed.
BecauseL andR occur together for non-binary trees, we can produce fourtypes of orders: NLR, NRL, LRN, RLN.
NLR and LRN (the twoleft-to-right traversal options) are most commonly usedand respectively namedpreorder andpostorder.
For the syntax tree defined in the diagram,preorder andpostorder traversalthus first search the root of the tree (A), then its head (B), then itschildren from left-to-right (C,D, and thenE).After alldescendants ofB are traversed, its nextsibling (F) is traversed and then finally its only child (G).
Utilities are functions that work with nodes.
There are several projects that deal with nodes from specifications implementingunist:
unist-util-ancestor— get the common ancestor of one or more nodesunist-util-assert— assert nodesunist-util-filter— create a new tree with all nodes that pass the given functionunist-util-find— find a node by conditionunist-util-find-after— find a node after another nodeunist-util-find-all-after— find nodes after another node or indexunist-util-find-all-before— find nodes before another node or indexunist-util-find-all-between— find nodes between two nodes or positionsunist-util-find-before— find a node before another nodeunist-util-flat-filter— flat map version ofunist-util-filterunist-util-flatmap— create a new tree by expanding a node into manyunist-util-generated— check if a node is generatedunist-util-index— index nodes by property or computed keyunist-util-inspect— node inspectorunist-util-is— check if a node passes a testunist-util-map— create a new tree by mapping nodesunist-util-modify-children— modify direct children of a parentunist-util-parents—parentreferences on nodesunist-util-position— get positional info of nodesunist-util-reduce— recursively reduce a treeunist-util-remove— remove nodes from treesunist-util-remove-position— remove positional info from treesunist-util-replace-all-between— replace nodes between two nodes or positionsunist-util-select— select nodes with CSS-like selectorsunist-util-size— calculate the number of nodes in a treeunist-util-source— get the source of a value (node or position) in a fileunist-util-stringify-position— stringify a node, position, or pointunist-util-visit— recursively walk over nodesunist-util-visit-parents— recursively walk over nodes, with a stack of parentsunist-util-visit-children— visit direct children of a parentunist-util-visit-all-after— visit nodes after another nodeunist-builder— helper for creating trees
- JavaScript:ECMAScript Language Specification.Ecma International.
- JSON:The JavaScript Object Notation (JSON) Data Interchange Format,T. Bray.IETF.
- XML:Extensible Markup Language,T. Bray, J. Paoli, C. Sperberg-McQueen, E. Maler, F. Yergeau.W3C.
- Web IDL:Web IDL,C. McCormack.W3C.
Seecontributing.md insyntax-tree/.github forways to get started.Seesupport.md for ways to get help.
A curated list of awesome syntax-tree, unist, hast, xast, mdast, and nlcstresources can be found inawesome syntax-tree.
This project has acode of conduct.By interacting with this repository, organization, or community you agree toabide by its terms.
The initial release of this project was authored by@wooorm.
Special thanks to@eush77 for their work,ideas, and incredibly valuable feedback!Thanks to@anandthakker,@anko,@arobase-che,@azu,@BarryThePenguin,@ben-eb,@blahah,@blakeembrey,@brainkim,@ChristianMurphy,@davidtheclark,@denysdovhan,@derhuerst,@dozoisch,@fazouane-marouane,@gibson042,@hrajchert,@ikatyang,@inklesspen,@izumin5210,@jasonLaster,@JDvorak,@jlevy,@justjake,@kmck,@kt3k,@KyleAMathews,@luca3m,@mattdesl,@muraken720,@mrzmmr,@nwtn,@rhysd,@Rokt33r,@Sarah-Seo,@sethvincent,@shawnbot,@simov,@staltz,@TitanSnow,@tmcw,and@vhf,for contributing to unist and related projects!
About
Universal Syntax Tree used by@unifiedjs
Topics
Resources
Code of conduct
Contributing
Security policy
Uh oh!
There was an error while loading.Please reload this page.
Stars
Watchers
Forks
Sponsor this project
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.