posthtml/posthtml-parser

PostHTML Parser

Parse HTML/XML to PostHTML AST

Installation

npm install posthtml-parser

Usage

Input HTML:

<a class="animals" href="#">
    <span class="animals__cat" style="background: url(cat.png)">Cat</span>
</a>

Parse with posthtml-parser:

import fs from 'fs'
import { parser } from 'posthtml-parser'

const html = fs.readFileSync('path/to/input.html', 'utf-8')

console.log(parser(html))

Resulting PostHTML AST:

[
  {
    tag: 'a',
    attrs: {
      class: 'animals',
      href: '#'
    },
    content: [
      '\n    ',
      {
        tag: 'span',
        attrs: {
          class: 'animals__cat',
          style: 'background: url(cat.png)'
        },
        content: ['Cat']
      },
      '\n'
    ]
  }
]

PostHTML AST Format

Any parser used with PostHTML should return a standard PostHTML Abstract Syntax Tree (AST).

Fortunately, this is a very easy format to produce and understand. The AST is an array that can contain strings and objects. Strings represent plain text content, while objects represent HTML tags.

Tag objects generally look like this:

{
  tag: 'div',
  attrs: {
    class: 'foo'
  },
  content: ['hello world!']
}

Tag objects can contain three keys:

  • The tag key takes the name of the tag as the value. This can include custom tags.
  • The optional attrs key takes an object with key/value pairs representing the attributes of the HTML tag. A boolean attribute has an empty string as its value.
  • The optional content key takes an array as its value, which is itself a PostHTML AST. In this manner, the AST is a tree that should be walked recursively.
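The recursive structure described above can be sketched as a small tree walker. This is only an illustration: walk and collectTagNames are hypothetical helper names, not part of posthtml-parser, and the tree is the AST from the Usage section written out by hand.

```javascript
// Recursively walk a PostHTML AST and call `visit` for every tag object.
// `walk` and `collectTagNames` are illustrative helpers, not library functions.
function walk(tree, visit) {
  for (const node of tree) {
    // Strings are plain text content; only objects represent tags.
    if (typeof node !== 'object' || node === null) continue
    visit(node)
    // `content` is optional and, when present, is itself a PostHTML AST.
    if (Array.isArray(node.content)) walk(node.content, visit)
  }
}

function collectTagNames(tree) {
  const names = []
  walk(tree, node => names.push(node.tag))
  return names
}

// The AST from the Usage section, written out by hand.
const tree = [
  {
    tag: 'a',
    attrs: { class: 'animals', href: '#' },
    content: [
      '\n    ',
      {
        tag: 'span',
        attrs: { class: 'animals__cat', style: 'background: url(cat.png)' },
        content: ['Cat']
      },
      '\n'
    ]
  }
]

console.log(collectTagNames(tree)) // [ 'a', 'span' ]
```

Text nodes are skipped here; a visitor that also needs text could handle the string case instead of ignoring it.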

Options

directives

Type: Array
Default: [{name: '!doctype', start: '<', end: '>'}]

Adds processing of custom directives.

The name property in custom directives can be of String or RegExp type.
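As a rough sketch of what a String or RegExp name means in practice, the snippet below builds a directives array and checks tag names against it. The matchesDirective helper and the `?php` pattern are assumptions made for illustration, not the parser's internal logic; in real use the array would be passed to the parser as options, e.g. parser(html, { directives }).

```javascript
// Hypothetical helper: decide whether a tag name matches one directive entry.
// This mirrors the documented String-or-RegExp `name`, not the library's internals.
function matchesDirective(directive, tagName) {
  if (directive.name instanceof RegExp) return directive.name.test(tagName)
  return directive.name === tagName
}

// A directives option in the documented shape: name, start, end.
const directives = [
  { name: '!doctype', start: '<', end: '>' }, // the documented default entry
  { name: /^\?(php|=)/, start: '<', end: '>' } // assumed example: PHP-style directives
]

const isDirective = tagName => directives.some(d => matchesDirective(d, tagName))

console.log(isDirective('!doctype')) // true
console.log(isDirective('?php')) // true
console.log(isDirective('div')) // false
```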

xmlMode

Type: Boolean
Default: false

Indicates whether special tags (<script> and <style>) should get special treatment and whether "empty" tags (e.g. <br>) can have children. If false, the content of special tags will be text only.

For feeds and other XML content (documents that don't consist of HTML), set this to true.

decodeEntities

Type: Boolean
Default: false

If set to true, entities within the document will be decoded.

lowerCaseTags

Type: Boolean
Default: false

If set to true and xmlMode is disabled, all tag names will be lowercased.

lowerCaseAttributeNames

Type: Boolean
Default: false

If set to true, all attribute names will be lowercased.

This has a noticeable impact on speed.

recognizeCDATA

Type: Boolean
Default: false

If set to true, CDATA sections will be recognized as text even if the xmlMode option is not enabled.

If xmlMode is set to true, then CDATA sections will always be recognized as text.

recognizeSelfClosing

Type: Boolean
Default: false

If set to true, self-closing tags will trigger the onclosetag event even if xmlMode is not set to true.

If xmlMode is set to true, then self-closing tags will always be recognized.

sourceLocations

Type: Boolean
Default: false

If set to true, AST nodes will have a location property containing the start and end line and column position of the node.
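To show how such a location property might be consumed, here is a minimal sketch that formats a node's position for a diagnostic message. The node is hand-written rather than parser output, the key names line and column under start and end are an assumption based on the description above, and describeLocation is a hypothetical helper.

```javascript
// Hand-written node in the shape described above. The exact key names
// (`line`, `column`) and the position values are assumptions for illustration.
const node = {
  tag: 'span',
  content: ['Cat'],
  location: {
    start: { line: 2, column: 5 },
    end: { line: 2, column: 27 }
  }
}

// Hypothetical helper: format a node's position, tolerating nodes parsed
// without sourceLocations (which carry no `location` property).
function describeLocation(node) {
  if (!node.location) return `<${node.tag}> (no location)`
  const { start, end } = node.location
  return `<${node.tag}> ${start.line}:${start.column}-${end.line}:${end.column}`
}

console.log(describeLocation(node)) // <span> 2:5-2:27
console.log(describeLocation({ tag: 'div' })) // <div> (no location)
```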

recognizeNoValueAttribute

Type: Boolean
Default: false

If set to true, attributes with no value are recognized and marked as true in the AST, which the posthtml-render package renders correctly.
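The difference between the two representations can be sketched with a toy attribute serializer. renderAttrs is a hypothetical function written for this example and is not how posthtml-render is implemented; it only illustrates why a value of true lets a renderer tell a valueless attribute apart from an empty one.

```javascript
// Hypothetical attribute serializer, not posthtml-render's implementation.
// A value of `true` is written as a bare attribute name; strings are quoted.
function renderAttrs(attrs = {}) {
  return Object.entries(attrs)
    .map(([name, value]) => (value === true ? name : `${name}="${value}"`))
    .join(' ')
}

// Default behaviour: a boolean attribute is stored as an empty string.
console.log(renderAttrs({ type: 'checkbox', disabled: '' }))
// type="checkbox" disabled=""

// With recognizeNoValueAttribute set to true, it is stored as `true` instead.
console.log(renderAttrs({ type: 'checkbox', disabled: true }))
// type="checkbox" disabled
```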
