
Java library for parsing and rendering CommonMark (Markdown)


commonmark/commonmark-java


Java library for parsing and rendering Markdown text according to the CommonMark specification (and some extensions).


Introduction

Provides classes for parsing input to an abstract syntax tree (AST), visiting and manipulating nodes, and rendering to HTML or back to Markdown. It started out as a port of commonmark.js, but has since evolved into an extensible library with the following features:

  • Small (core has no dependencies, extensions in separate artifacts)
  • Fast (10-20 times faster than pegdown, which used to be a popular Markdown library; see benchmarks in repo)
  • Flexible (manipulate the AST after parsing, customize HTML rendering)
  • Extensible (tables, strikethrough, autolinking and more, see below)

The library is supported on Java 11 and later. It works on Android too, but only on a best-effort basis; please report problems. For Android the minimum API level is 19, see the commonmark-android-test directory.

Coordinates for core library (see all on Maven Central):

    <dependency>
        <groupId>org.commonmark</groupId>
        <artifactId>commonmark</artifactId>
        <version>0.24.0</version>
    </dependency>

The module names to use in Java 9 and later are org.commonmark, org.commonmark.ext.autolink, etc, corresponding to package names.
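As a rough illustration of where those module names go, a module descriptor for an application might look like the following. The module name com.example.app is hypothetical; only the two required module names come from the text above.

```java
// module-info.java for a hypothetical application module
module com.example.app {
    // Core parser and renderers
    requires org.commonmark;
    // Only needed if the autolink extension artifact is on the module path
    requires org.commonmark.ext.autolink;
}
```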

Note that for 0.x releases of this library, the API is not considered stable yet and may break between minor releases. After 1.0, Semantic Versioning will be followed. A package containing beta means it's not subject to stable API guarantees yet; but for normal usage it should not be necessary to use.

See the spec.txt file if you're wondering which version of the spec is currently implemented. Also check out the CommonMark dingus for getting familiar with the syntax or trying out edge cases. If you clone the repository, you can also use the DingusApp class to try out things interactively.

Usage

Parse and render to HTML

    import org.commonmark.node.*;
    import org.commonmark.parser.Parser;
    import org.commonmark.renderer.html.HtmlRenderer;

    Parser parser = Parser.builder().build();
    Node document = parser.parse("This is *Markdown*");
    HtmlRenderer renderer = HtmlRenderer.builder().build();
    renderer.render(document);  // "<p>This is <em>Markdown</em></p>\n"

This uses the parser and renderer with default options. Both builders have methods for configuring their behavior:

  • escapeHtml(true) on HtmlRenderer will escape raw HTML tags and blocks.
  • sanitizeUrls(true) on HtmlRenderer will strip potentially unsafe URLs from <a> and <img> tags.
  • For all available options, see methods on the builders.
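A minimal sketch of turning on both options listed above for untrusted input. The class and method names (SafeRenderingExample, renderUntrusted) are made up for illustration; the builder methods are the ones named in the list.

```java
import org.commonmark.node.Node;
import org.commonmark.parser.Parser;
import org.commonmark.renderer.html.HtmlRenderer;

public class SafeRenderingExample {

    // Renders Markdown with raw HTML escaped and unsafe URLs stripped.
    static String renderUntrusted(String markdown) {
        Parser parser = Parser.builder().build();
        HtmlRenderer renderer = HtmlRenderer.builder()
                .escapeHtml(true)    // raw HTML tags and blocks are output as escaped text
                .sanitizeUrls(true)  // potentially unsafe URLs in <a> and <img> are stripped
                .build();
        Node document = parser.parse(markdown);
        return renderer.render(document);
    }

    public static void main(String[] args) {
        System.out.println(renderUntrusted("<b>bold</b> and [click](javascript:alert(1))"));
    }
}
```

These options reduce risk but are not a replacement for a dedicated HTML sanitizer.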

Note that this library doesn't try to sanitize the resulting HTML with regard to which tags are allowed, etc. That is the responsibility of the caller, and if you expose the resulting HTML, you probably want to run a sanitizer on it after this.

Render to Markdown

    import org.commonmark.node.*;
    import org.commonmark.renderer.markdown.MarkdownRenderer;

    MarkdownRenderer renderer = MarkdownRenderer.builder().build();
    Node document = new Document();
    Heading heading = new Heading();
    heading.setLevel(2);
    heading.appendChild(new Text("My title"));
    document.appendChild(heading);

    renderer.render(document);  // "## My title\n"

For rendering to plain text with minimal markup, there's also TextContentRenderer.
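A short sketch of the plain-text renderer with default options. The wrapper class PlainTextExample and its toPlainText method are illustrative names, not part of the library.

```java
import org.commonmark.node.Node;
import org.commonmark.parser.Parser;
import org.commonmark.renderer.text.TextContentRenderer;

public class PlainTextExample {

    // Parses Markdown and renders it as plain text, dropping inline markup such as emphasis.
    static String toPlainText(String markdown) {
        Node document = Parser.builder().build().parse(markdown);
        TextContentRenderer renderer = TextContentRenderer.builder().build();
        return renderer.render(document);
    }

    public static void main(String[] args) {
        System.out.println(toPlainText("This is *Markdown*"));
    }
}
```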

Use a visitor to process parsed nodes

After the source text has been parsed, the result is a tree of nodes. That tree can be modified before rendering, or just inspected without rendering:

    Node node = parser.parse("Example\n=======\n\nSome more text");
    WordCountVisitor visitor = new WordCountVisitor();
    node.accept(visitor);
    visitor.wordCount;  // 4

    class WordCountVisitor extends AbstractVisitor {
        int wordCount = 0;

        @Override
        public void visit(Text text) {
            // This is called for all Text nodes. Override other visit methods for other node types.

            // Count words (this is just an example, don't actually do it this way for various reasons).
            wordCount += text.getLiteral().split("\\W+").length;

            // Descend into children (could be omitted in this case because Text nodes don't have children).
            visitChildren(text);
        }
    }
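The example above only inspects the tree. As a sketch of the other case, modifying nodes before rendering, the following visitor rewrites link destinations that start with "/" to absolute URLs. LinkPrefixExample, LinkPrefixer and the base URL are hypothetical names chosen for this illustration.

```java
import org.commonmark.node.AbstractVisitor;
import org.commonmark.node.Link;
import org.commonmark.node.Node;
import org.commonmark.parser.Parser;
import org.commonmark.renderer.html.HtmlRenderer;

public class LinkPrefixExample {

    // Visitor that prepends a base URL to every link destination starting with "/".
    static class LinkPrefixer extends AbstractVisitor {
        private final String baseUrl;

        LinkPrefixer(String baseUrl) {
            this.baseUrl = baseUrl;
        }

        @Override
        public void visit(Link link) {
            String destination = link.getDestination();
            if (destination.startsWith("/")) {
                link.setDestination(baseUrl + destination);
            }
            // Links have children (the link text), so keep descending.
            visitChildren(link);
        }
    }

    static String render(String markdown, String baseUrl) {
        Node document = Parser.builder().build().parse(markdown);
        document.accept(new LinkPrefixer(baseUrl));
        return HtmlRenderer.builder().build().render(document);
    }

    public static void main(String[] args) {
        System.out.println(render("[docs](/docs)", "https://example.org"));
    }
}
```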

Source positions

If you want to know where a parsed Node appeared in the input source text, you can request the parser to return source positions like this:

    var parser = Parser.builder().includeSourceSpans(IncludeSourceSpans.BLOCKS_AND_INLINES).build();

Then parse nodes and inspect source positions:

    var source = "foo\n\nbar *baz*";
    var doc = parser.parse(source);
    var emphasis = doc.getLastChild().getLastChild();
    var s = emphasis.getSourceSpans().get(0);
    s.getLineIndex();    // 2 (third line)
    s.getColumnIndex();  // 4 (fifth column)
    s.getInputIndex();   // 9 (string index 9)
    s.getLength();       // 5
    source.substring(s.getInputIndex(), s.getInputIndex() + s.getLength());  // "*baz*"

If you're only interested in blocks and not inlines, use IncludeSourceSpans.BLOCKS.
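For the blocks-only mode, a sketch that lists the starting line of each top-level block could look like this. The class BlockLinesExample and its method are illustrative; the parser option and span accessors are the ones shown above.

```java
import java.util.ArrayList;
import java.util.List;
import org.commonmark.node.Node;
import org.commonmark.node.SourceSpan;
import org.commonmark.parser.IncludeSourceSpans;
import org.commonmark.parser.Parser;

public class BlockLinesExample {

    // Returns the 0-based line index on which each top-level block starts.
    static List<Integer> topLevelBlockLines(String source) {
        Parser parser = Parser.builder()
                .includeSourceSpans(IncludeSourceSpans.BLOCKS)
                .build();
        Node doc = parser.parse(source);

        List<Integer> lines = new ArrayList<>();
        for (Node block = doc.getFirstChild(); block != null; block = block.getNext()) {
            List<SourceSpan> spans = block.getSourceSpans();
            if (!spans.isEmpty()) {
                lines.add(spans.get(0).getLineIndex());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(topLevelBlockLines("foo\n\nbar *baz*"));
    }
}
```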

Add or change attributes of HTML elements

Sometimes you might want to customize how HTML is rendered. If all you want to do is add or change attributes on some elements, there's a simple way to do that.

In this example, we register a factory for an AttributeProvider on the renderer to set a class="border" attribute on img elements.

    Parser parser = Parser.builder().build();
    HtmlRenderer renderer = HtmlRenderer.builder()
            .attributeProviderFactory(new AttributeProviderFactory() {
                public AttributeProvider create(AttributeProviderContext context) {
                    return new ImageAttributeProvider();
                }
            })
            .build();

    Node document = parser.parse("![text](/url.png)");
    renderer.render(document);
    // "<p><img src=\"/url.png\" alt=\"text\" class=\"border\" /></p>\n"

    class ImageAttributeProvider implements AttributeProvider {
        @Override
        public void setAttributes(Node node, String tagName, Map<String, String> attributes) {
            if (node instanceof Image) {
                attributes.put("class", "border");
            }
        }
    }

Customize HTML rendering

If you want to do more than just change attributes, there's also a way to take complete control over how HTML is rendered.

In this example, we're changing the rendering of indented code blocks to only wrap them in pre instead of pre and code:

    Parser parser = Parser.builder().build();
    HtmlRenderer renderer = HtmlRenderer.builder()
            .nodeRendererFactory(new HtmlNodeRendererFactory() {
                public NodeRenderer create(HtmlNodeRendererContext context) {
                    return new IndentedCodeBlockNodeRenderer(context);
                }
            })
            .build();

    Node document = parser.parse("Example:\n\n    code");
    renderer.render(document);
    // "<p>Example:</p>\n<pre>code\n</pre>\n"

    class IndentedCodeBlockNodeRenderer implements NodeRenderer {

        private final HtmlWriter html;

        IndentedCodeBlockNodeRenderer(HtmlNodeRendererContext context) {
            this.html = context.getWriter();
        }

        @Override
        public Set<Class<? extends Node>> getNodeTypes() {
            // Return the node types we want to use this renderer for.
            return Set.of(IndentedCodeBlock.class);
        }

        @Override
        public void render(Node node) {
            // We only handle one type as per getNodeTypes, so we can just cast it here.
            IndentedCodeBlock codeBlock = (IndentedCodeBlock) node;
            html.line();
            html.tag("pre");
            html.text(codeBlock.getLiteral());
            html.tag("/pre");
            html.line();
        }
    }

Add your own node types

In case you want to store additional data in the document or have custom elements in the resulting HTML, you can create your own subclass of CustomNode and add instances as child nodes to existing nodes.

To define the HTML rendering for them, you can use a NodeRenderer as explained above.
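A minimal sketch of both steps, assuming a hypothetical Badge node that carries a label and should render as a span element. Badge, BadgeRenderer and the "badge" class name are invented for this example.

```java
import java.util.Map;
import java.util.Set;
import org.commonmark.node.CustomNode;
import org.commonmark.node.Document;
import org.commonmark.node.Node;
import org.commonmark.node.Paragraph;
import org.commonmark.node.Text;
import org.commonmark.renderer.NodeRenderer;
import org.commonmark.renderer.html.HtmlNodeRendererContext;
import org.commonmark.renderer.html.HtmlRenderer;
import org.commonmark.renderer.html.HtmlWriter;

public class CustomNodeExample {

    // Hypothetical custom inline node that stores a short label.
    static class Badge extends CustomNode {
        final String label;

        Badge(String label) {
            this.label = label;
        }
    }

    // Renders Badge nodes as <span class="badge">label</span>.
    static class BadgeRenderer implements NodeRenderer {
        private final HtmlWriter html;

        BadgeRenderer(HtmlNodeRendererContext context) {
            this.html = context.getWriter();
        }

        @Override
        public Set<Class<? extends Node>> getNodeTypes() {
            return Set.of(Badge.class);
        }

        @Override
        public void render(Node node) {
            Badge badge = (Badge) node;
            html.tag("span", Map.of("class", "badge"));
            html.text(badge.label);
            html.tag("/span");
        }
    }

    static String render() {
        // Build a small document by hand and add the custom node as a child of a paragraph.
        Document document = new Document();
        Paragraph paragraph = new Paragraph();
        paragraph.appendChild(new Text("Status: "));
        paragraph.appendChild(new Badge("beta"));
        document.appendChild(paragraph);

        HtmlRenderer renderer = HtmlRenderer.builder()
                .nodeRendererFactory(BadgeRenderer::new)
                .build();
        return renderer.render(document);
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```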

Customize parsing

There are a few ways to extend parsing or even override built-in parsing, all of them via methods on Parser.Builder (see Blocks and inlines in the spec for an overview of blocks/inlines):

  • Parsing of specific block types (e.g. headings, code blocks, etc) can be enabled/disabled with enabledBlockTypes
  • Parsing of blocks can be extended/overridden with customBlockParserFactory
  • Parsing of inline content can be extended/overridden with customInlineContentParserFactory
  • Parsing of delimiters in inline content can be extended with customDelimiterProcessor
  • Processing of links can be customized with linkProcessor and linkMarker
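As a sketch of the first option in the list, the parser below only recognizes headings as a special block type, so list syntax is left as ordinary paragraph text. The class name HeadingsOnlyExample is illustrative.

```java
import java.util.Set;
import org.commonmark.node.Heading;
import org.commonmark.node.Node;
import org.commonmark.parser.Parser;
import org.commonmark.renderer.html.HtmlRenderer;

public class HeadingsOnlyExample {

    // Parses with only heading blocks enabled; other block syntax falls back to paragraphs.
    static String render(String markdown) {
        Parser parser = Parser.builder()
                .enabledBlockTypes(Set.of(Heading.class))
                .build();
        Node document = parser.parse(markdown);
        return HtmlRenderer.builder().build().render(document);
    }

    public static void main(String[] args) {
        System.out.println(render("# Title\n\n- item\n"));
    }
}
```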

Thread-safety

Both the Parser and HtmlRenderer are designed so that you can configure them once using the builders and then use them multiple times/from multiple threads. This is done by separating the state for parsing/rendering from the configuration.

Having said that, there might be bugs of course. If you find one, please report an issue.
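In practice, this design suggests building one parser and one renderer and sharing them, for example as static fields. A sketch under that assumption, using a parallel stream to call them from several threads (SharedInstancesExample is an illustrative name):

```java
import java.util.List;
import java.util.stream.Collectors;
import org.commonmark.parser.Parser;
import org.commonmark.renderer.html.HtmlRenderer;

public class SharedInstancesExample {

    // Configured once, then reused by every caller.
    private static final Parser PARSER = Parser.builder().build();
    private static final HtmlRenderer RENDERER = HtmlRenderer.builder().build();

    static String toHtml(String markdown) {
        return RENDERER.render(PARSER.parse(markdown));
    }

    // Renders all inputs, potentially on multiple threads; result order matches input order.
    static List<String> renderAll(List<String> inputs) {
        return inputs.parallelStream()
                .map(SharedInstancesExample::toHtml)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        renderAll(List.of("*a*", "**b**")).forEach(System.out::print);
    }
}
```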

API documentation

Javadocs are available online on javadoc.io.

Extensions

Extensions need to extend the parser, or the HTML renderer, or both. To use an extension, the builder objects can be configured with a list of extensions. Because extensions are optional, they live in separate artifacts, so additional dependencies need to be added as well.

Let's look at how to enable tables from GitHub Flavored Markdown. First, add an additional dependency (see Maven Central for others):

    <dependency>
        <groupId>org.commonmark</groupId>
        <artifactId>commonmark-ext-gfm-tables</artifactId>
        <version>0.24.0</version>
    </dependency>

Then, configure the extension on the builders:

    import org.commonmark.ext.gfm.tables.TablesExtension;

    List<Extension> extensions = List.of(TablesExtension.create());
    Parser parser = Parser.builder()
            .extensions(extensions)
            .build();
    HtmlRenderer renderer = HtmlRenderer.builder()
            .extensions(extensions)
            .build();

To configure another extension in the above example, just add it to the list.
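For instance, combining tables with strikethrough might look like the sketch below. It assumes the commonmark-ext-gfm-strikethrough artifact has been added as a dependency in the same way as the tables artifact; MultipleExtensionsExample is an illustrative class name.

```java
import java.util.List;
import org.commonmark.Extension;
import org.commonmark.ext.gfm.strikethrough.StrikethroughExtension;
import org.commonmark.ext.gfm.tables.TablesExtension;
import org.commonmark.node.Node;
import org.commonmark.parser.Parser;
import org.commonmark.renderer.html.HtmlRenderer;

public class MultipleExtensionsExample {

    static String render(String markdown) {
        // Both extensions go into the same list, which is passed to both builders.
        List<Extension> extensions = List.of(
                TablesExtension.create(),
                StrikethroughExtension.create());
        Parser parser = Parser.builder()
                .extensions(extensions)
                .build();
        HtmlRenderer renderer = HtmlRenderer.builder()
                .extensions(extensions)
                .build();
        Node document = parser.parse(markdown);
        return renderer.render(document);
    }

    public static void main(String[] args) {
        System.out.println(render("hey ~~you~~\n\n| a | b |\n|---|---|\n| 1 | 2 |\n"));
    }
}
```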

The following extensions are developed with this library, each in their own artifact.

Autolink

Turns plain links such as URLs and email addresses into links (based on autolink-java).

Use class AutolinkExtension from artifact commonmark-ext-autolink.

Strikethrough

Enables strikethrough of text by enclosing it in ~~. For example, in hey ~~you~~, you will be rendered as strikethrough text.

Use class StrikethroughExtension in artifact commonmark-ext-gfm-strikethrough.

Tables

Enables tables using pipes as in GitHub Flavored Markdown.

Use class TablesExtension in artifact commonmark-ext-gfm-tables.

Footnotes

Enables footnotes like in GitHub or Pandoc:

    Main text[^1]

    [^1]: Additional text in a footnote

Inline footnotes like ^[inline footnote] are also supported when enabled via FootnotesExtension.Builder#inlineFootnotes.

Use class FootnotesExtension in artifact commonmark-ext-footnotes.

Heading anchor

Enables adding auto-generated "id" attributes to heading tags. The "id" is based on the text of the heading.

# Heading will be rendered as:

    <h1 id="heading">Heading</h1>

Use class HeadingAnchorExtension in artifact commonmark-ext-heading-anchor.

In case you want custom rendering of the heading instead, you can use the IdGenerator class directly together with a HtmlNodeRendererFactory (see example above).

Ins

Enables underlining of text by enclosing it in ++. For example, in hey ++you++, you will be rendered as underlined text. Uses the <ins> tag.

Use class InsExtension in artifact commonmark-ext-ins.

YAML front matter

Adds support for metadata through a YAML front matter block. This extension only supports a subset of YAML syntax. Here's an example of what's supported:

    ---
    key: value
    list:
      - value 1
      - value 2
    literal: |
      this is literal value.
      literal values 2
    ---
    document start here

Use class YamlFrontMatterExtension in artifact commonmark-ext-yaml-front-matter. To fetch metadata, use YamlFrontMatterVisitor.
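A sketch of fetching the metadata with the visitor, assuming the commonmark-ext-yaml-front-matter artifact is on the classpath. FrontMatterExample and readMetadata are illustrative names; the keys in the sample input are made up.

```java
import java.util.List;
import java.util.Map;
import org.commonmark.ext.front.matter.YamlFrontMatterExtension;
import org.commonmark.ext.front.matter.YamlFrontMatterVisitor;
import org.commonmark.node.Node;
import org.commonmark.parser.Parser;

public class FrontMatterExample {

    // Parses the document and returns the front matter as key -> list of values.
    static Map<String, List<String>> readMetadata(String markdown) {
        Parser parser = Parser.builder()
                .extensions(List.of(YamlFrontMatterExtension.create()))
                .build();
        Node document = parser.parse(markdown);

        YamlFrontMatterVisitor visitor = new YamlFrontMatterVisitor();
        document.accept(visitor);
        return visitor.getData();
    }

    public static void main(String[] args) {
        String source = "---\ntitle: Hello\ntags:\n  - a\n  - b\n---\n\nBody text";
        System.out.println(readMetadata(source));
    }
}
```

Every value is returned as a list, so a single-valued key such as title yields a one-element list.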

Image Attributes

Adds support for specifying attributes (specifically height and width) for images.

The attribute elements are given as key=value pairs inside curly braces { } after the image node to which they apply, for example:

![text](/url.png){width=640 height=480}

will be rendered as:

<img src="/url.png" alt="text" width="640" height="480" />

Use class ImageAttributesExtension in artifact commonmark-ext-image-attributes.

Note: since this extension uses curly braces {} as its delimiters (in StylesDelimiterProcessor), this means that other delimiter processors cannot use curly braces for delimiting.

Task List Items

Adds support for tasks as list items.

A task can be represented as a list item where the first non-whitespace character is a left bracket [, then a single whitespace character or the letter x in lowercase or uppercase, then a right bracket ] followed by at least one whitespace before any other content.

For example:

    - [ ] task #1
    - [x] task #2

will be rendered as:

    <ul>
    <li><input type="checkbox" disabled=""> task #1</li>
    <li><input type="checkbox" disabled="" checked=""> task #2</li>
    </ul>

Use class TaskListItemsExtension in artifact commonmark-ext-task-list-items.

Third-party extensions

You can also find other extensions in the wild:

Used by

Some users of this library (feel free to raise a PR if you want to be added):

See also

  • Markwon: Android library for rendering markdown as system-native Spannables
  • flexmark-java: Fork that added support for a lot more syntax and flexibility

Contributing

See CONTRIBUTING.md file.

License

Copyright (c) 2015, Robin Stocker

BSD (2-clause) licensed, see LICENSE.txt file.

