Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Universal Syntax Tree used by@unifiedjs

NotificationsYou must be signed in to change notification settings

syntax-tree/unist

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

unist

UniversalSyntaxTree.


unist is a specification for syntax trees.It has a bigecosystem of utilities in JavaScript forworking with these trees.It’s implemented by several other specifications.

This document may not be released.Seereleases for released documents.The latest released version is3.0.0.

Contents

Intro

This document defines a general-purpose format for syntax trees.Development of unist started in July 2015.This specification is written in aWeb IDL-like grammar.

Syntax tree

Syntax trees are representations of source code or even natural language.These trees are abstractions that make it possible to analyze, transform, andgenerate code.

Syntax treescome in two flavors:

  • concrete syntax trees: structures that represent every detail (such aswhite-space in white-space insensitive languages)
  • abstract syntax trees: structures that only represent details relatingto the syntactic structure of code (such as ignoring whether a double orsingle quote was used in languages that support both, such as JavaScript).

This specification can express both abstract and concrete syntax trees.

Where this specification fits

unist is not intended to be self-sufficient.Instead, it is expected that other specifications implement unist and extend itto express language specific nodes.For example, see projects such ashast (for HTML),nlcst (fornatural language),mdast (for Markdown), andxast (for XML).

unist relates toJSON in that compliant syntax trees can be expressedcompletely in JSON.However, unist is not limited to JSON and can be expressed in other dataformats, such asXML.

unist relates toJavaScript in that it has a richecosystem ofutilities for working with compliant syntax trees inJavaScript.The five most used utilities combined are downloaded thirty million times eachmonth.However, unist is not limited to JavaScript and can be used in other programminglanguages.

unist relates to theunified,remark,rehype, andretextprojects in that unist syntax trees are used throughout their ecosystems.

unist relates to thevfile project in that it accepts unist nodes for itsmessage store, and that vfile can be a sourcefile of a syntaxtree.

Types

If you are using TypeScript, you can use the unist types by installing themwith npm:

npm install @types/unist

Nodes

Syntactic units in unist syntax trees are called nodes, and implement theNode interface.

Node

interface Node {  type:string  data: Data?  position: Position?}

Thetype field is a non-empty string representing the variant of a node.This field can be used to determine thetype a node implements.

Thedata field represents information from the ecosystem.The value of thedata field implements theData interface.

Theposition field represents the location of a node in a source document.The value of theposition field implements thePositioninterface.Theposition field must not be present if a node isgenerated.

Specifications implementing unist are encouraged to define more fields.Ecosystems can define fields onData.

Any value in unistmust be expressible in JSON values:string,number,object,array,true,false, ornull.This means that the syntax tree should be able to be converted to and from JSONand produce the same tree.For example, in JavaScript, a tree can be passed throughJSON.parse(JSON.stringify(tree)) and result in the same tree.

Position

interface Position {  start: Pointend: Point}

Position represents the location of a node in a sourcefile.

Thestart field ofPosition represents the place of the first character ofthe parsed source region.Theend field ofPosition represents the place of the first characterafter the parsed source region, whether it exists or not.The value of thestart andend fields implement thePointinterface.

If the syntactic unit represented by a node is not present in the sourcefile at the time of parsing, the node is said to begenerated and it must not have positional information.

For example, if the following value was represented as unist:

alphabravo

…the first word (alpha) would start at line1, column1, offset0, andend at line1, column6, offset5.The line feed would start at line1, column6, offset5, and end at line2, column1, offset6.The last word (bravo) would start at line2, column1, offset6, and endat line2, column6, offset11.

Point

interface Point {  line: number>=1  column: number>=1  offset: number>=0?}

Point represents one place in a sourcefile.

Theline field (1-indexed integer) represents a line in a source file.Thecolumn field (1-indexed integer) represents a column in a source file.Theoffset field (0-indexed integer) represents a character in a source file.

The term character means a (UTF-16) code unit which is defined in theWeb IDL specification.

Data

interface Data { }

Data represents information associated by the ecosystem with the node.

This space is guaranteed to never be specified by unist or specificationsimplementing unist.

Parent

interface Parent<: Node {  children: [Node]}

Nodes containing other nodes (said to bechildren) extend theabstract interfaceParent (Node).

Thechildren field is a list representing the children of a node.

Literal

interface Literal<: Node {  value: any}

Nodes containing a value extend the abstract interfaceLiteral(Node).

Thevalue field can contain any value.

Glossary

Tree

Atree is a node and all of itsdescendants (if any).

Child

Node X ischild of node Y, if Y’schildren include X.

Parent

Node X isparent of node Y, if Y is achild of X.

Index

Theindex of achild is its number of precedingsiblings, or0 if it has none.

Sibling

Node X is asibling of node Y, if X and Y have the sameparent (if any).

Theprevious sibling of achild is itssibling at itsindex minus 1.

Thenext sibling of achild is itssibling at itsindex plus 1.

Root

Theroot of a node is itself, if withoutparent, or theroot of itsparent.

Theroot of atree is any node in thattreewithoutparent.

Descendant

Node X isdescendant of node Y, if X is achild of Y, or ifX is achild of node Z that is adescendant of Y.

Aninclusive descendant is a node or one of itsdescendants.

Ancestor

Node X is anancestor of node Y, if Y is adescendantof X.

Aninclusive ancestor is a node or one of itsancestors.

Head

Thehead of a node is its firstchild (if any).

Tail

Thetail of a node is its lastchild (if any).

Leaf

Aleaf is a node with nochildren.

Branch

Abranch is a node with one or morechildren.

Generated

A node isgenerated if it does not havepositionalinformation.

Type

Thetype of a node is the value of itstype field.

Positional information

Thepositional information of a node is the value of itsposition field.

File

Afile is a source document that represents the original file that wasparsed to produce the syntax tree.Positional information represents the place of a nodein this file.Files are provided by the host environment and not defined by unist.

For example, see projects such asvfile.

Preorder

Inpreorder (NLR) isdepth-firsttreetraversal that performs the following steps for each nodeN:

  1. N: visitN itself
  2. L: traversehead (then itsnext sibling, recursivelymoving forward until reachingtail)
  3. R: traversetail
Postorder

Inpostorder (LRN) isdepth-firsttreetraversal that performs the following steps for each nodeN:

  1. L: traversehead (then itsnext sibling, recursivelymoving forward until reachingtail)
  2. R: traversetail
  3. N: visitN itself
Enter

Enter is a step right before other steps performed on a given nodeN whentraversing a tree.

For example, when performingpreorder traversal,enter is the first steptaken, right before visitingN itself.

Exit

Exit is a step right after other steps performed on a given nodeN whentraversing a tree.

For example, when performingpreorder traversal,exit is the last steptaken, right after traversing thetail ofN.

Tree traversal

Tree traversal is a common task when working with atree tosearch it.Tree traversal is typically eitherbreadth-first ordepth-first.

In the following examples, we’ll work with this tree:

graph TD    A-->B-->C        B-->D        B-->E    A-->F-->G
Loading
Breadth-first traversal

Breadth-first traversal is visiting a node and all itssiblings to broaden the search at that level, beforetraversingchildren.

For the syntax tree defined in the diagram, a breadth-first traversal firstsearches the root of the tree (A), then its children (B andF), thentheir children (C,D,E, andG).

Depth-first traversal

Alternatively, and more commonly,depth-first traversal is used.The search is first deepened, by traversingchildren, beforetraversingsiblings.

For the syntax tree defined in the diagram, a depth-first traversal firstsearches the root of the tree (A), then one of its children (B orF), then their children (C,D, andE, orG).

For a given nodeN withchildren, adepth-first traversalperforms three steps, simplified to only binary trees (every node hashead andtail, but no other children):

  • N: visitN itself
  • L: traversehead
  • R: traversetail

These steps can be done in any order, but for non-binary trees,L andRoccur together.IfL is done beforeR, the traversal is calledleft-to-righttraversal, otherwise it is calledright-to-left traversal.In the case of non-binary trees, the other children betweenhead andtailare processed in that order as well, so forleft-to-right traversal, firsthead is traversed (L), then itsnext sibling is traversed, etcetera,until finallytail (R) is traversed.

BecauseL andR occur together for non-binary trees, we can produce fourtypes of orders: NLR, NRL, LRN, RLN.

NLR and LRN (the twoleft-to-right traversal options) are most commonly usedand respectively namedpreorder andpostorder.

For the syntax tree defined in the diagram,preorder andpostorder traversalthus first search the root of the tree (A), then its head (B), then itschildren from left-to-right (C,D, and thenE).After alldescendants ofB are traversed, its nextsibling (F) is traversed and then finally its only child (G).

Utilities

Utilities are functions that work with nodes.

There are several projects that deal with nodes from specifications implementingunist:

List of utilities

References

Contribute

Seecontributing.md insyntax-tree/.github forways to get started.Seesupport.md for ways to get help.

A curated list of awesome syntax-tree, unist, hast, xast, mdast, and nlcstresources can be found inawesome syntax-tree.

This project has acode of conduct.By interacting with this repository, organization, or community you agree toabide by its terms.

Acknowledgments

The initial release of this project was authored by@wooorm.

Special thanks to@eush77 for their work,ideas, and incredibly valuable feedback!Thanks to@anandthakker,@anko,@arobase-che,@azu,@BarryThePenguin,@ben-eb,@blahah,@blakeembrey,@brainkim,@ChristianMurphy,@davidtheclark,@denysdovhan,@derhuerst,@dozoisch,@fazouane-marouane,@gibson042,@hrajchert,@ikatyang,@inklesspen,@izumin5210,@jasonLaster,@JDvorak,@jlevy,@justjake,@kmck,@kt3k,@KyleAMathews,@luca3m,@mattdesl,@muraken720,@mrzmmr,@nwtn,@rhysd,@Rokt33r,@Sarah-Seo,@sethvincent,@shawnbot,@simov,@staltz,@TitanSnow,@tmcw,and@vhf,for contributing to unist and related projects!

License

CC-BY-4.0 ©Titus Wormer


[8]ページ先頭

©2009-2025 Movatter.jp