To champion the single-responsibility and open/closed principles, we have tried to make it relatively painless to extend Marked. If you are looking to add custom functionality, this is the place to start.
`marked.use()` is the recommended way to extend Marked. The `extension` object can contain any option available in Marked:

    import { marked } from 'marked';

    marked.use({
      pedantic: false,
      gfm: true,
      breaks: false
    });

You can also supply multiple `extension` objects at once.

    marked.use(myExtension, extension2, extension3);

    // EQUIVALENT TO:

    marked.use(myExtension);
    marked.use(extension2);
    marked.use(extension3);

All options will overwrite those previously set, except for the following options, which are merged with the existing framework and can be used to change or extend the functionality of Marked: `renderer`, `tokenizer`, `hooks`, `walkTokens`, and `extensions`.
The `renderer`, `tokenizer`, and `hooks` options are objects with functions that will be merged into the built-in `renderer`, `tokenizer`, and hooks respectively.
The `walkTokens` option is a function that will be called to post-process every token before rendering.
The `extensions` option is an array of objects that can contain additional custom `renderer` and `tokenizer` steps that will execute before any of the default parsing logic occurs.
Before building your custom extensions, it is important to understand the components that Marked uses to translate from Markdown to HTML:
The `lexer` feeds segments of the input text string into each `tokenizer`, and from their output, generates a series of tokens in a nested tree structure.
Each `tokenizer` receives a segment of Markdown text and, if it matches a particular pattern, generates a token object containing any relevant information.
The `walkTokens` function will traverse every token in the tree and perform any final adjustments to the token contents.
The `parser` traverses the token tree and feeds each token into the appropriate `renderer`, and concatenates their outputs into the final HTML result.
Each `renderer` receives a token and manipulates its contents to generate a segment of HTML.
Marked provides methods for directly overriding the `renderer` and `tokenizer` for any existing token type, as well as inserting additional custom `renderer` and `tokenizer` functions to handle entirely custom syntax. For example, using `marked.use({renderer})` would modify a renderer, whereas `marked.use({extensions: [{renderer}]})` would add a new renderer. See the custom extensions example for insight on how to execute this.
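The flow above can be sketched with a deliberately tiny, self-contained pipeline. This is not Marked's implementation: `toyLexer`, `toyWalk`, `toyParser`, and the two-rule grammar are hypothetical stand-ins that only illustrate how tokenizers, `walkTokens`, and renderers hand data to one another.

```javascript
// Toy pipeline (illustration only, not Marked's internals).
// Each tokenizer inspects the start of `src` and returns a token or undefined.
const tokenizers = [
  (src) => {
    const m = /^# (.*)(?:\n|$)/.exec(src);
    if (m) return { type: 'heading', raw: m[0], text: m[1] };
  },
  (src) => {
    const m = /^(.+)(?:\n|$)/.exec(src);
    if (m) return { type: 'paragraph', raw: m[0], text: m[1] };
  }
];

// One renderer per token type.
const renderers = {
  heading: (token) => `<h1>${token.text}</h1>\n`,
  paragraph: (token) => `<p>${token.text}</p>\n`
};

// Lexer: feed the remaining source to each tokenizer and consume `raw`.
function toyLexer(src) {
  const tokens = [];
  while (src.length > 0) {
    if (src[0] === '\n') { src = src.slice(1); continue; } // skip blank lines
    let token;
    for (const tokenizer of tokenizers) {
      token = tokenizer(src);
      if (token) break;
    }
    if (!token) throw new Error('No tokenizer matched');
    tokens.push(token);
    src = src.slice(token.raw.length);
  }
  return tokens;
}

// walkTokens: final adjustments to each token before rendering.
function toyWalk(tokens, callback) {
  for (const token of tokens) callback(token);
}

// Parser: route each token to its renderer and concatenate the output.
function toyParser(tokens) {
  return tokens.map((token) => renderers[token.type](token)).join('');
}

const tokens = toyLexer('# Title\n\nhello world');
toyWalk(tokens, (token) => { token.text = token.text.trim(); });
const html = toyParser(tokens);
console.log(html);
```

Overriding a renderer corresponds to replacing an entry in `renderers`, while adding an extension corresponds to pushing a new tokenizer and renderer pair ahead of the defaults.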
The renderer defines the HTML output of a given token. If you supply a `renderer` in the options object passed to `marked.use()`, any functions in the object will override the default handling of that token type.
Calling `marked.use()` to override the same function multiple times will give priority to the version that was assigned last. Overriding functions can return `false` to fall back to the previous override in the sequence, or resume default behavior if all overrides return `false`. Returning any other value (including nothing) will prevent fallback behavior.
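The fallback rule can be pictured as a chain of functions tried from the most recently registered to the oldest. The sketch below is a simplified model under that assumption, not Marked's source; `resolveOverride` and the sample renderers are hypothetical names.

```javascript
// Simplified model of override priority (illustration only).
// `overrides` is listed in registration order; the last one is tried first.
function resolveOverride(overrides, defaultFn, token) {
  for (let i = overrides.length - 1; i >= 0; i--) {
    const result = overrides[i](token);
    // Only a strict `false` triggers fallback; any other value
    // (including undefined) is taken as the final result.
    if (result !== false) return result;
  }
  return defaultFn(token);
}

const defaultHeading = (token) => `<h${token.depth}>${token.text}</h${token.depth}>`;

// Registered first: handles only level-1 headings.
const first = (token) =>
  token.depth === 1 ? `<h1 class="title">${token.text}</h1>` : false;

// Registered last: handles only headings whose text starts with "!".
const last = (token) =>
  token.text.startsWith('!')
    ? `<h${token.depth} class="alert">${token.text.slice(1)}</h${token.depth}>`
    : false;

const overrides = [first, last];

const a = resolveOverride(overrides, defaultHeading, { depth: 1, text: '!Stop' });
const b = resolveOverride(overrides, defaultHeading, { depth: 1, text: 'Intro' });
const c = resolveOverride(overrides, defaultHeading, { depth: 2, text: 'Body' });
console.log(a); // handled by `last`
console.log(b); // `last` declined, handled by `first`
console.log(c); // both declined, default used
```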
Example: Overriding output of the default `heading` token by adding an embedded anchor tag like on GitHub.

    // Create reference instance
    import { marked } from 'marked';

    // Override function
    const renderer = {
      heading({ tokens, depth }) {
        const text = this.parser.parseInline(tokens);
        const escapedText = text.toLowerCase().replace(/[^\w]+/g, '-');

        return `
                <h${depth}>
                  <a name="${escapedText}" class="anchor" href="#${escapedText}">
                    <span class="header-link"></span>
                  </a>
                  ${text}
                </h${depth}>`;
      }
    };

    marked.use({ renderer });

    // Run marked
    console.log(marked.parse('# heading+'));

Output:

    <h1>
      <a name="heading-" class="anchor" href="#heading-">
        <span class="header-link"></span>
      </a>
      heading+
    </h1>

Note: Calling `marked.use()` in the following way will avoid overriding the `heading` token output but create a new `heading` renderer in the process.

    marked.use({
      extensions: [{
        name: 'heading',
        renderer(token) {
          return /* ... */
        }
      }]
    })

`html(token: Tokens.HTML | Tokens.Tag): string`
`text(token: Tokens.Text | Tokens.Escape | Tokens.Tag): string`
The Tokens.* properties can be found here.
The tokenizer defines how to turn markdown text into tokens. If you supply a `tokenizer` object to the Marked options, it will be merged with the built-in tokenizer and any functions inside will override the default handling of that token type.
Calling `marked.use()` to override the same function multiple times will give priority to the version that was assigned last. Overriding functions can return `false` to fall back to the previous override in the sequence, or resume default behavior if all overrides return `false`. Returning any other value (including nothing) will prevent fallback behavior.
Example: Overriding default `codespan` tokenizer to include LaTeX.

    // Create reference instance
    import { marked } from 'marked';

    // Override function
    const tokenizer = {
      codespan(src) {
        const match = src.match(/^\$+([^\$\n]+?)\$+/);
        if (match) {
          return {
            type: 'codespan',
            raw: match[0],
            text: match[1].trim()
          };
        }

        // return false to use original codespan tokenizer
        return false;
      }
    };

    marked.use({ tokenizer });

    // Run marked
    console.log(marked.parse('$ latex code $\n\n` other code `'));

Output:

    <p><code>latex code</code></p>
    <p><code>other code</code></p>

NOTE: This does not fully support LaTeX, see issue #1948.
`link(src: string): Tokens.Link | Tokens.Image`
`reflink(src: string, links: object): Tokens.Link | Tokens.Image | Tokens.Text`
`emStrong(src: string, maskedSrc: string, prevChar: string): Tokens.Em | Tokens.Strong`
The Tokens.* properties can be found here.
The `walkTokens` function gets called with every token. Child tokens are called before moving on to sibling tokens. Each token is passed by reference, so updates are persisted when passed to the parser. When `async` mode is enabled, the return value is awaited. Otherwise the return value is ignored.
`marked.use()` can be called multiple times with different `walkTokens` functions. Each function will be called in order, starting with the function that was assigned last.
Example: Overriding heading tokens to start at h2.

    import { marked } from 'marked';

    // Override function
    const walkTokens = (token) => {
      if (token.type === 'heading') {
        token.depth += 1;
      }
    };

    marked.use({ walkTokens });

    // Run marked
    console.log(marked.parse('# heading 2\n\n## heading 3'));

Output:

    <h2 id="heading-2">heading 2</h2>
    <h3 id="heading-3">heading 3</h3>

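The traversal order described for `walkTokens` (a token, then its children, then its next sibling) can be illustrated with a small stand-alone walker. This is a simplified sketch: `walk` and the hand-written `tree` are hypothetical, and the real traversal also handles tables, lists, and `childTokens` declared by extensions.

```javascript
// Simplified traversal (illustration only): visit a token, then its
// children, and only then move on to the next sibling.
function walk(tokens, callback) {
  for (const token of tokens) {
    callback(token);
    if (Array.isArray(token.tokens)) {
      walk(token.tokens, callback);
    }
  }
}

// Hand-written token tree shaped like lexer output.
const tree = [
  {
    type: 'heading',
    depth: 1,
    tokens: [{ type: 'text', text: 'Title' }]
  },
  {
    type: 'paragraph',
    tokens: [
      { type: 'strong', tokens: [{ type: 'text', text: 'bold' }] },
      { type: 'text', text: ' tail' }
    ]
  }
];

const visited = [];
walk(tree, (token) => {
  visited.push(token.type);
  // Tokens are passed by reference, so this change is visible afterwards.
  if (token.type === 'heading') token.depth += 1;
});

console.log(visited.join(' > '));
console.log(tree[0].depth);
```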
Hooks are methods that hook into some part of marked. The following hooks are available:
| signature | description |
|---|---|
| `preprocess(markdown: string): string` | Process markdown before sending it to marked. |
| `postprocess(html: string): string` | Process html after marked has finished parsing. |
| `processAllTokens(tokens: Token[]): Token[]` | Process all tokens before walk tokens. |
| `provideLexer(): (src: string, options?: MarkedOptions) => Token[]` | Provide function to tokenize markdown. |
| `provideParser(): (tokens: Token[], options?: MarkedOptions) => string` | Provide function to parse tokens. |
`marked.use()` can be called multiple times with different `hooks` functions. Each function will be called in order, starting with the function that was assigned last.
Example: Set options based on front-matter

    import { marked } from 'marked';
    import fm from 'front-matter';

    // Override function
    function preprocess(markdown) {
      const { attributes, body } = fm(markdown);
      for (const prop in attributes) {
        if (prop in this.options) {
          this.options[prop] = attributes[prop];
        }
      }
      return body;
    }

    marked.use({ hooks: { preprocess } });

    // Run marked
    console.log(marked.parse(`
    ---
    breaks: true
    ---

    line1
    line2
    `.trim()));

Example: Sanitize HTML with isomorphic-dompurify

    import { marked } from 'marked';
    import DOMPurify from 'isomorphic-dompurify';

    // Override function
    function postprocess(html) {
      return DOMPurify.sanitize(html);
    }

    marked.use({ hooks: { postprocess } });

    // Run marked
    console.log(marked.parse(`<img src=x onerror=alert(1)//>`));

Example: Save reflinks for chunked rendering

    import { marked, Lexer } from 'marked';

    let refLinks = {};

    // Override function
    function processAllTokens(tokens) {
      refLinks = tokens.links;
      return tokens;
    }

    function provideLexer(src, options) {
      return (src, options) => {
        const lexer = new Lexer(options);
        lexer.tokens.links = refLinks;
        return this.block ? lexer.lex(src) : lexer.inlineTokens(src);
      };
    }

    marked.use({ hooks: { processAllTokens, provideLexer } });

    // Parse reflinks separately from markdown that uses them
    marked.parse(`[test]:`);

    console.log(marked.parse(`[test link][test]`));

Output:

    <p><a href="">test link</a></p>

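Because `preprocess` and `postprocess` are plain string-to-string functions, they can be written and checked without loading Marked at all. The sketch below is an assumption-laden illustration: the `{{name}}` placeholder syntax, the `variables` map, and the wrapper markup are all hypothetical and not part of Marked.

```javascript
// Hypothetical data used by the hook below.
const variables = { product: 'Marked', version: '1.0' };

// Hypothetical preprocess hook: expand {{name}} placeholders before parsing.
// Unknown placeholders are left untouched.
function preprocess(markdown) {
  return markdown.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(variables, key) ? variables[key] : match);
}

// Hypothetical postprocess hook: wrap the generated HTML in a container.
function postprocess(html) {
  return `<article class="markdown-body">\n${html}</article>\n`;
}

// The object that would be handed to marked.use().
const hooksExtension = { hooks: { preprocess, postprocess } };

console.log(preprocess('# {{product}} {{version}} {{unknown}}'));
console.log(postprocess('<p>hi</p>\n'));
```

Registering it would then be a matter of calling `marked.use(hooksExtension)`.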
You may supply an `extensions` array to the `options` object. This array can contain any number of `extension` objects, using the following properties:
`name`
A string used to identify the token that will be handled by this extension.
If the name matches an existing extension name, or an existing method in the tokenizer/renderer methods listed above, they will override the previously assigned behavior, with priority on the extension that was assigned last. An extension can return `false` to fall back to the previous behavior.
`level`
A string that determines when to run the extension tokenizer. Must be equal to 'block' or 'inline'.
A block-level extension will be handled before any of the block-level tokenizer methods listed above, and generally consists of 'container-type' text (paragraphs, tables, blockquotes, etc.).
An inline-level extension will be handled inside each block-level token, before any of the inline-level tokenizer methods listed above. These generally consist of 'style-type' text (italics, bold, etc.).
`start(string src)`
A function that returns the index of the next potential start of the custom token.
The index can be the result of a `src.match().index`, or even a simple `src.indexOf()`. Marked will use this function to ensure that it does not skip over any text that should be part of the custom token.
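`tokenizer(string src, array tokens)`
A function that reads a string of Markdown text and returns a generated token. The token pattern should be found at the beginning of the `src` string. Accordingly, if using a regular expression to detect a token, it should be anchored to the string start (`^`). The `tokens` parameter contains the array of tokens that have been generated by the lexer up to that point, and can be used to access the previous token, for instance.
The return value should be an object with the following parameters:
`type`
A string that matches the `name` parameter of the extension.
`raw`
A string containing all of the text that this token consumes from the source.
`tokens` [optional]
An array of child tokens that will be traversed by the `walkTokens` function by default.
The returned token can also contain any other custom parameters of your choice that your custom `renderer` might need to access.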
The tokenizer function has access to the lexer in the `this` object, which can be used if any internal section of the string needs to be parsed further, such as in handling any inline syntax on the text within a block token. The key functions that may be useful include:
`this.lexer.blockTokens(string text, array tokens)`
Runs the block tokenizer functions (including any block-level extensions) on the provided text, and appends any resulting tokens onto the `tokens` array. The `tokens` array is also returned by the function. You might use this, for example, if your extension creates a "container"-type token (such as a blockquote) that can potentially include other block-level tokens inside.
`this.lexer.inline(string text, array tokens)`
Parsing of inline-level tokens only occurs after all block-level tokens have been generated. This function adds `text` and `tokens` to a queue to be processed using inline-level tokenizers (including any inline-level extensions) at that later step. Tokens will be generated using the provided `text`, and any resulting tokens will be appended to the `tokens` array. Note that this function does **NOT** return anything since the inline processing cannot happen until the block-level processing is complete.
`this.lexer.inlineTokens(string text, array tokens)`
Sometimes an inline-level token contains further nested inline tokens (such as a `**strong**` token inside of a `### Heading`). This runs the inline tokenizer functions (including any inline-level extensions) on the provided text, and appends any resulting tokens onto the `tokens` array. The `tokens` array is also returned by the function.
`renderer(object token)`
A function that reads a token and returns the generated HTML output string.
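The renderer function has access to the parser in the `this` object, which can be used if any part of the token needs to be parsed further, such as any child tokens. The key functions that may be useful include:
`this.parser.parse(array tokens)`
Runs the block renderer functions (including any extensions) on the provided array of tokens, and returns the resulting HTML string output.
`this.parser.parseInline(array tokens)`
Runs the inline renderer functions (including any extensions) on the provided array of tokens, and returns the resulting HTML string output.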
`childTokens` [optional]
An array of strings that match the names of any token parameters that should be traversed by the `walkTokens` functions. For instance, if you want to use a second custom parameter to contain child tokens in addition to `tokens`, it could be listed here. If `childTokens` is provided, the `tokens` array will not be walked by default unless it is also included in the `childTokens` array.
Note: If you would like to release an extension as an npm package, you may use the Marked Extension Template, which includes all of the things you need to get started. Feel free to create an issue in that repo if you need help.
Example: Add a custom syntax to generate `<dl>` description lists.

    const descriptionList = {
      name: 'descriptionList',
      level: 'block', // Is this a block-level or inline-level tokenizer?
      start(src) { return src.match(/:[^:\n]/)?.index; }, // Hint to Marked.js to stop and check for a match
      tokenizer(src, tokens) {
        const rule = /^(?::[^:\n]+:[^:\n]*(?:\n|$))+/; // Regex for the complete token, anchor to string start
        const match = rule.exec(src);
        if (match) {
          const token = {            // Token to generate
            type: 'descriptionList', // Should match "name" above
            raw: match[0],           // Text to consume from the source
            text: match[0].trim(),   // Additional custom properties
            tokens: []               // Array where child inline tokens will be generated
          };
          this.lexer.inline(token.text, token.tokens); // Queue this data to be processed for inline tokens
          return token;
        }
      },
      renderer(token) {
        return `<dl>${this.parser.parseInline(token.tokens)}\n</dl>`; // parseInline to turn child tokens into HTML
      }
    };

    const description = {
      name: 'description',
      level: 'inline', // Is this a block-level or inline-level tokenizer?
      start(src) { return src.match(/:/)?.index; }, // Hint to Marked.js to stop and check for a match
      tokenizer(src, tokens) {
        const rule = /^:([^:\n]+):([^:\n]*)(?:\n|$)/; // Regex for the complete token, anchor to string start
        const match = rule.exec(src);
        if (match) {
          return {                                        // Token to generate
            type: 'description',                          // Should match "name" above
            raw: match[0],                                // Text to consume from the source
            dt: this.lexer.inlineTokens(match[1].trim()), // Additional custom properties, including
            dd: this.lexer.inlineTokens(match[2].trim())  //   any further-nested inline tokens
          };
        }
      },
      renderer(token) {
        return `\n<dt>${this.parser.parseInline(token.dt)}</dt><dd>${this.parser.parseInline(token.dd)}</dd>`;
      },
      childTokens: ['dt', 'dd'], // Any child tokens to be visited by walkTokens
    };

    function walkTokens(token) { // Post-processing on the completed token tree
      if (token.type === 'strong') {
        token.text += ' walked';
        token.tokens = this.Lexer.lexInline(token.text)
      }
    }
    marked.use({ extensions: [descriptionList, description], walkTokens });

    // EQUIVALENT TO:

    marked.use({ extensions: [descriptionList] });
    marked.use({ extensions: [description] });
    marked.use({ walkTokens })

    console.log(marked.parse('A Description List:\n'
                     + ': Topic 1 : Description 1\n'
                     + ': **Topic 2** : *Description 2*'));

Output:

    <p>A Description List:</p>
    <dl>
    <dt>Topic 1</dt><dd>Description 1</dd>
    <dt><strong>Topic 2 walked</strong></dt><dd><em>Description 2</em></dd>
    </dl>

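A smaller inline-only extension may make the roles of `start`, `tokenizer`, and `renderer` easier to see in isolation. The following is a minimal sketch, assuming a hypothetical `==text==` highlight syntax; `fakeLexer` and `fakeParser` are stand-ins for the `this` context that Marked supplies, used here only so the functions can be exercised without loading the library.

```javascript
// Hypothetical inline extension: ==text== becomes <mark>text</mark>.
const highlight = {
  name: 'highlight',
  level: 'inline',
  // Report where the next candidate starts so preceding text is not swallowed.
  start(src) {
    const index = src.indexOf('==');
    return index === -1 ? undefined : index;
  },
  tokenizer(src) {
    const match = /^==([^=\n]+)==/.exec(src); // anchored to the string start
    if (match) {
      return {
        type: 'highlight', // matches `name` above
        raw: match[0],     // text consumed from the source
        tokens: this.lexer.inlineTokens(match[1]) // nested inline syntax
      };
    }
  },
  renderer(token) {
    return `<mark>${this.parser.parseInline(token.tokens)}</mark>`;
  }
};

// Stand-ins (not Marked's classes) that mimic the two calls used above.
const fakeLexer = { inlineTokens: (text) => [{ type: 'text', raw: text, text }] };
const fakeParser = { parseInline: (tokens) => tokens.map((t) => t.text).join('') };

const token = highlight.tokenizer.call({ lexer: fakeLexer }, '==important== rest');
const rendered = highlight.renderer.call({ parser: fakeParser }, token);
console.log(token.raw);
console.log(rendered);
console.log(highlight.start('plain ==x=='));
```

With Marked loaded, the extension would be registered through `marked.use({ extensions: [highlight] })`.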
Marked will return a promise if the `async` option is true. The `async` option will tell marked to await any `walkTokens` functions before parsing the tokens and returning an HTML string.
Simple Example:

    const walkTokens = async (token) => {
      if (token.type === 'link') {
        try {
          await fetch(token.href);
        } catch (ex) {
          token.title = 'invalid';
        }
      }
    };

    marked.use({ walkTokens, async: true });

    const markdown = `[valid link]([invalid link](`;

    const html = await marked.parse(markdown);

Custom Extension Example:

    const importUrl = {
      extensions: [{
        name: 'importUrl',
        level: 'block',
        start(src) { return src.indexOf('\n:'); },
        tokenizer(src) {
          const rule = /^:(https?:\/\/.+?):/;
          const match = rule.exec(src);
          if (match) {
            return {
              type: 'importUrl',
              raw: match[0],
              url: match[1],
              html: '' // will be replaced in walkTokens
            };
          }
        },
        renderer(token) {
          return token.html;
        }
      }],
      async: true, // needed to tell marked to return a promise
      async walkTokens(token) {
        if (token.type === 'importUrl') {
          const res = await fetch(token.url);
          token.html = await res.text();
        }
      }
    };

    marked.use(importUrl);

    const markdown = `#`;

    const html = await marked.parse(markdown);

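An async `walkTokens` function is easier to test when the network call is injected rather than hard-coded. The sketch below assumes a hypothetical `createLinkChecker` factory and a stand-in `fakeCheck` function; neither is part of Marked, and the `Promise.all` call only imitates the awaiting that Marked performs when `async: true` is set.

```javascript
// Hypothetical factory: build an async walkTokens function around an
// injected `checkUrl(href)` that resolves to true when the link is reachable.
function createLinkChecker(checkUrl) {
  return async function walkTokens(token) {
    if (token.type !== 'link') return;
    const ok = await checkUrl(token.href).catch(() => false);
    if (!ok) token.title = 'invalid';
  };
}

// Stand-in checker so the sketch runs without network access.
const knownGood = new Set(['https://example.com/']);
const fakeCheck = async (href) => knownGood.has(href);

const walkTokens = createLinkChecker(fakeCheck);

// Hand-written tokens; Marked would pass every token of the tree instead.
const sampleTokens = [
  { type: 'link', href: 'https://example.com/', title: null },
  { type: 'link', href: 'https://example.invalid/', title: null },
  { type: 'text', text: 'plain' }
];

// Wait for every walkTokens promise, then collect the resulting titles.
const done = Promise.all(sampleTokens.map((token) => walkTokens(token)))
  .then(() => sampleTokens.map((token) => token.title ?? null));

done.then((titles) => console.log(titles));
```

The resulting function could be registered with `marked.use({ walkTokens, async: true })`, with `fakeCheck` swapped for a real request.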
The lexer takes a markdown string and calls the tokenizer functions.
The parser takes tokens as input and calls the renderer functions.
You also have direct access to the lexer and parser if you so desire. The lexer and parser options are the same as passed to `marked.setOptions()`, except they have to be full options objects; they don't get merged with the current or default options.

    const tokens = marked.lexer(markdown, options);
    console.log(marked.parser(tokens, options));

Or, using a `Lexer` instance:

    const lexer = new marked.Lexer(options);
    const tokens = lexer.lex(markdown);
    console.log(tokens);
    console.log(lexer.tokenizer.rules.block);  // block level rules used
    console.log(lexer.tokenizer.rules.inline); // inline level rules used
    console.log(marked.Lexer.rules.block);     // all block level rules
    console.log(marked.Lexer.rules.inline);    // all inline level rules

Note that the lexer can be used in two different ways:
`marked.lexer()`: this method tokenizes a string and returns its tokens. Subsequent calls to `lexer()` ignore any previous calls.
`new marked.Lexer().lex()`: this instance tokenizes a string and returns its tokens along with any previous tokens. Subsequent calls to `lex()` accumulate tokens.

    $ node
    > require('marked').lexer('> I am using marked.')
    [
      {
        type: "blockquote",
        raw: "> I am using marked.",
        tokens: [
          {
            type: "paragraph",
            raw: "I am using marked.",
            text: "I am using marked.",
            tokens: [
              {
                type: "text",
                raw: "I am using marked.",
                text: "I am using marked."
              }
            ]
          }
        ]
      },
      links: {}
    ]

The Lexer builds an array of tokens, which will be passed to the Parser. The Parser processes each token in the token array:

    import { marked } from 'marked';

    const md = `
      # heading

      [link][1]

      [1]: #heading "heading"
    `;

    const tokens = marked.lexer(md);
    console.log(tokens);

    const html = marked.parser(tokens);
    console.log(html);

Output:

    [
      {
        type: "heading",
        raw: " # heading\n\n",
        depth: 1,
        text: "heading",
        tokens: [
          { type: "text", raw: "heading", text: "heading" }
        ]
      },
      {
        type: "paragraph",
        raw: " [link][1]",
        text: " [link][1]",
        tokens: [
          { type: "text", raw: " ", text: " " },
          {
            type: "link",
            raw: "[link][1]",
            text: "link",
            href: "#heading",
            title: "heading",
            tokens: [
              { type: "text", raw: "link", text: "link" }
            ]
          }
        ]
      },
      { type: "space", raw: "\n\n" },
      links: { "1": { href: "#heading", title: "heading" } }
    ]

    <h1 id="heading">heading</h1>
    <p> <a href="#heading" title="heading">link</a></p>

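Since the lexer output is an ordinary array of objects, it can be inspected or transformed before it reaches the parser. As a minimal sketch, the hypothetical helper `collectHeadings` below builds a flat outline from a token array shaped like the one above; it reads only `type`, `depth`, and `text`, and looks at top-level tokens only.

```javascript
// Hypothetical helper: build a flat outline from a token array such as the
// one returned by marked.lexer(). Headings nested inside other tokens
// (for example blockquotes) are not visited.
function collectHeadings(tokens) {
  return tokens
    .filter((token) => token.type === 'heading')
    .map((token) => ({ depth: token.depth, text: token.text }));
}

// Hand-written sample shaped like the lexer output (trimmed to a few fields).
const sample = [
  { type: 'heading', raw: '# heading\n\n', depth: 1, text: 'heading', tokens: [] },
  { type: 'paragraph', raw: '[link][1]', text: '[link][1]', tokens: [] },
  { type: 'space', raw: '\n\n' }
];

const outline = collectHeadings(sample);
console.log(outline);
```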