Creating Custom Pandoc Writers in Lua
Introduction
If you need to render a format not already handled by pandoc, or you want to change how pandoc renders a format, you can create a custom writer using the Lua language. Pandoc has a built-in Lua interpreter, so you needn’t install any additional software to do this.
A custom writer is a Lua file that defines how to render the document. Writers must define just a single function, named either Writer or ByteStringWriter, which gets passed the document and writer options, and then handles the conversion of the document, rendering it into a string. This interface was introduced in pandoc 2.17.2, with ByteString writers becoming available in pandoc 3.0.
Pandoc also supports “classic” custom writers, where a Lua function must be defined for each AST element type. Classic style writers are deprecated and should be replaced with new-style writers if possible.
Writers
Custom writers using the new style must contain a global function named Writer or ByteStringWriter. Pandoc calls this function with the document and writer options as arguments, and expects the function to return a UTF-8 encoded string.
    function Writer (doc, opts)
      -- ...
    end
Writers that return binary data rather than text should define a function named ByteStringWriter instead. The function must still return a string, but it does not have to be UTF-8 encoded and can contain arbitrary binary data.
If both Writer and ByteStringWriter functions are defined, then only the Writer function will be used.
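As a minimal sketch of a binary writer, the following assumes a made-up output format in which the plain-text rendering of the document is preceded by its byte length, stored as a 4-byte big-endian integer. The helper name frame and the format itself are hypothetical and serve only as illustration; the library calls used are pandoc.write and string.pack (available since Lua 5.3).

```lua
-- Hypothetical helper: prefix a payload with its length, encoded as a
-- 4-byte big-endian unsigned integer.
local function frame (payload)
  return string.pack('>I4', #payload) .. payload
end

-- Returns arbitrary bytes, so the function is named ByteStringWriter.
function ByteStringWriter (doc, opts)
  local text = pandoc.write(doc, 'plain', opts)
  return frame(text)
end
```

Because the length prefix can contain bytes that are not valid UTF-8, a plain Writer function would not be appropriate for output like this.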
Format extensions
Writers can be customized through format extensions, such as smart, citations, or hard_line_breaks. The global Extensions table indicates supported extensions with a key. Extensions enabled by default are assigned a true value, while those that are supported but disabled are assigned a false value.
Example: A writer with the following global table supports the extensions smart, citations, and foobar, with smart enabled and the others disabled by default:
    Extensions = {
      smart = true,
      citations = false,
      foobar = false
    }
Users control extensions as usual, e.g., pandoc -t my-writer.lua+citations. The extensions are accessible through the writer options’ extensions field, e.g.:
    function Writer (doc, opts)
      print(
        'The citations extension is',
        opts.extensions:includes 'citations' and 'enabled' or 'disabled'
      )
      -- ...
    end
Default template
The default template of a custom writer is defined by the return value of the global function Template. Pandoc uses the default template for rendering when the user has not specified a template but has invoked pandoc with the -s/--standalone flag.
The Template global can be left undefined, in which case pandoc will throw an error when it would otherwise use the default template.
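A small sketch of such a function is given below. It assumes that a template written inline in pandoc's template syntax is sufficient; the layout is purely illustrative, and only the standard template variables title and body are used.

```lua
-- Illustrative default template: an optional title line followed by
-- the rendered document body.
function Template ()
  return [[
$if(title)$
$title$

$endif$
$body$
]]
end
```

Keeping the template in the writer file means the writer can be used with -s/--standalone without shipping a separate template file.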
Example: modified Markdown writer
Writers have access to all modules described in the Lua filters documentation. This includes pandoc.write, which can be used to render a document in a format already supported by pandoc. The document can be modified before this conversion, as demonstrated in the following short example. It renders a document as GitHub Flavored Markdown, but always uses fenced code blocks, never indented code.
    function Writer (doc, opts)
      local filter = {
        CodeBlock = function (cb)
          -- only modify if code block has no attributes
          if cb.attr == pandoc.Attr() then
            local delimited = '```\n' .. cb.text .. '\n```'
            return pandoc.RawBlock('markdown', delimited)
          end
        end
      }
      return pandoc.write(doc:walk(filter), 'gfm', opts)
    end

    Template = pandoc.template.default 'gfm'
Reducing boilerplate with pandoc.scaffolding.Writer
The pandoc.scaffolding.Writer structure is a custom writer scaffold that serves to avoid common boilerplate code when defining a custom writer. The object can be used as a function and makes it possible to skip details like metadata and template handling, requiring only the render functions for each AST element type.
The value of pandoc.scaffolding.Writer is a function that should usually be assigned to the global Writer:
    Writer = pandoc.scaffolding.Writer
The render functions for Block and Inline values can then be added to Writer.Block and Writer.Inline, respectively. The functions are passed the element and the WriterOptions.
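    -- shorthands for the layout elements used below
    local cr, space = pandoc.layout.cr, pandoc.layout.space

    Writer.Inline.Str = function (str)
      return str.text
    end
    Writer.Inline.SoftBreak = function (_, opts)
      return opts.wrap_text == "wrap-preserve"
        and cr
        or space
    end
    Writer.Inline.LineBreak = cr

    Writer.Block.Para = function (para)
      return {Writer.Inlines(para.content), pandoc.layout.blankline}
    end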
The render functions must return a string, a pandoc.layout Doc element, or a list of such elements. In the latter case, the values are concatenated as if they were passed to pandoc.layout.concat. If the value does not depend on the input, a constant can be used as well.
The tables Writer.Block and Writer.Inline can be used as functions; they apply the right render function for an element of the respective type. E.g., Writer.Block(pandoc.Para 'x') will delegate to the Writer.Block.Para render function and will return the result of that call.
Similarly, the functions Writer.Blocks and Writer.Inlines can be used to render lists of elements, and Writer.Pandoc renders the document’s blocks. The function Writer.Blocks can take a separator as an optional second argument, e.g., Writer.Blocks(blks, pandoc.layout.cr); the default block separator is pandoc.layout.blankline.
All predefined functions can be overwritten when needed.
The resulting Writer uses the render functions to handle metadata values and converts them to template variables. The template is applied automatically if one is given.
Classic style
A writer using the classic style defines rendering functions for each element of the pandoc AST. Note that this style is deprecated and may be removed in later versions.
For example,
    function Para (s)
      return "<paragraph>" .. s .. "</paragraph>"
    end
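To illustrate how such functions fit together, here is a slightly larger sketch. In classic style each function receives the already rendered content of its children as a string, so the functions compose by plain string concatenation. The output element names (emphasis, paragraph) are made up for illustration and do not correspond to a real target format.

```lua
-- Inline elements: receive plain strings, return strings.
function Str (s)
  return s
end

function Space ()
  return " "
end

function Emph (s)
  return "<emphasis>" .. s .. "</emphasis>"
end

-- Block elements: `s` is the rendered inline content.
function Para (s)
  return "<paragraph>" .. s .. "</paragraph>"
end

-- Separator placed between consecutive blocks.
function Blocksep ()
  return "\n\n"
end

-- Final assembly: `body` is the rendered sequence of blocks.
function Doc (body, metadata, variables)
  return body .. "\n"
end
```

A real classic writer needs a function for every element type that can occur in the input; only a handful are covered here.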
Template variables
New template variables can be added, or existing ones modified, by returning a second value from function Doc.
For example, the following will add the current date in variable date, unless date is already defined as either a metadata value or a variable:
    function Doc (body, meta, vars)
      -- keep an existing variable or metadata value; otherwise use today's date
      vars.date = vars.date or meta.date or os.date '%B %e, %Y'
      return body, vars
    end
Changes in pandoc 3.0
Custom writers were reworked in pandoc 3.0. For technical reasons, the global variables PANDOC_DOCUMENT and PANDOC_WRITER_OPTIONS are set to the empty document and default values, respectively. The old behavior can be restored by adding the following snippet, which turns a classic writer into a new-style writer.
    function Writer (doc, opts)
      PANDOC_DOCUMENT = doc
      PANDOC_WRITER_OPTIONS = opts
      loadfile(PANDOC_SCRIPT_FILE)()
      return pandoc.write_classic(doc, opts)
    end