
Thin wrapper for "pandoc" (MIT)


JessicaTegner/pypandoc


Pypandoc provides a thin wrapper for pandoc, a universal document converter.

Installation

Pypandoc uses pandoc, so it needs an available installation of pandoc. Pypandoc provides 2 packages, "pypandoc" and "pypandoc_binary", with the second one including pandoc out of the box. The 2 packages are otherwise identical: the only difference is that one includes pandoc while the other doesn't.

If pandoc is already installed (i.e. pandoc is in the PATH), pypandoc uses the version with the higher version number, and if both are the same, the already installed version. See Specifying the location of pandoc binaries for more.
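
The selection rule above can be sketched in a few lines. This is only an illustration, not pypandoc's actual code: `parse_version` and `pick_pandoc` are hypothetical helpers, the paths are made up, and the sketch assumes purely numeric, dot-separated version strings.

```python
def parse_version(text):
    """Turn '2.19.2' into (2, 19, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def pick_pandoc(bundled, installed):
    """Choose between two (path, version_string) pairs.

    The higher version wins; on a tie the already installed pandoc is used.
    """
    if parse_version(bundled[1]) > parse_version(installed[1]):
        return bundled[0]
    return installed[0]


# Numeric comparison matters: as plain strings, "2.9" would sort after "2.19".
choice = pick_pandoc(("/bundled/pandoc", "2.9"), ("/usr/bin/pandoc", "2.19"))
print(choice)
```

Comparing tuples of integers rather than raw strings is the design point here.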

To use pandoc filters, you must have the relevant filters installed on your machine.

Installing via pip

If you want to install pandoc yourself or are on an unsupported platform, you'll need to install "pypandoc" and install pandoc manually:

pip install pypandoc

If you want pandoc included out of the box, you can use our pypandoc_binary package, which is identical to the "pypandoc" package, but with pandoc included.

pip install pypandoc_binary

Prebuilt wheels for Windows and Mac OS X

If you use Linux and have your own wheelhouse, you can build a wheel which includes pandoc with python setup_binary.py download_pandoc; python setup.py bdist_wheel. Be aware that this works only on 64-bit Intel systems, as we only download it from the official releases.

Installing via conda

Pypandoc is included in conda-forge. The conda packages will also install the pandoc package, so pandoc is available in the installation.

Install via conda install -c conda-forge pypandoc.

You can also add the channel to your conda config via conda config --add channels conda-forge. This makes it possible to use conda install pypandoc directly and also lets you update via conda update pypandoc.

Installing pandoc

If you don't already have pandoc on your system and have not installed the pypandoc_binary package (which includes pandoc), you need to install pandoc yourself.

Installing pandoc via pypandoc

Installing via pypandoc is possible on Windows, Mac OS X or Linux (Intel-based, 64-bit):

    pip install pypandoc

    from pypandoc.pandoc_download import download_pandoc
    # see the documentation how to customize the installation path
    # but be aware that you then need to include it in the `PATH`
    download_pandoc()

The default install location is included in the search path for pandoc, so you don't need to add it to the PATH.

By default, the latest pandoc version is installed. If you want to specify your own version, say 1.19.1, use download_pandoc(version='1.19.1') instead.

Installing pandoc manually

Installing manually via the system mechanism is also possible. Such installation mechanisms make pandoc available on many more platforms:

  • Ubuntu/Debian: sudo apt-get install pandoc
  • Fedora/Red Hat: sudo yum install pandoc
  • Arch: sudo pacman -S pandoc
  • Mac OS X with Homebrew: brew install pandoc pandoc-citeproc Caskroom/cask/mactex
  • Machine with Haskell: cabal install pandoc
  • Windows: There is an installer available here
  • FreeBSD with pkg: pkg install hs-pandoc
  • Or see Pandoc - Installing pandoc

Be aware that not all install mechanisms put pandoc in the PATH, so you either have to change the PATH yourself or set the full path to pandoc in PYPANDOC_PANDOC. See the next section for more information.

Specifying the location of pandoc binaries

You can point to a specific pandoc version by setting the environment variable PYPANDOC_PANDOC to the full path to the pandoc binary (PYPANDOC_PANDOC=/home/x/whatever/pandoc or PYPANDOC_PANDOC=c:\pandoc\pandoc.exe). If this environment variable is set, this is the only place where pandoc is searched for.

In certain cases, e.g. pandoc is installed but a web server with its own user cannot find the binaries, it is useful to specify the location at runtime:

    import os
    os.environ.setdefault('PYPANDOC_PANDOC', '/home/x/whatever/pandoc')
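
To illustrate the lookup rule described above, here is a minimal sketch of "environment variable first, otherwise search". It is not pypandoc's implementation: `resolve_pandoc` is a hypothetical helper, and falling back to `shutil.which` is an assumption (pypandoc also checks other locations, such as its default download directory).

```python
import os
import shutil


def resolve_pandoc(environ=None):
    """Return the pandoc path to use, or None if nothing was found."""
    environ = os.environ if environ is None else environ
    override = environ.get("PYPANDOC_PANDOC")
    if override:
        # When the variable is set, it is the only place that is consulted.
        return override
    # Assumed fallback for this sketch: look pandoc up on the PATH.
    return shutil.which("pandoc")


chosen = resolve_pandoc({"PYPANDOC_PANDOC": "/home/x/whatever/pandoc"})
fallback = resolve_pandoc({})
print(chosen, fallback)
```

Note that `os.environ.setdefault` only sets the variable when it is not already defined, so a value exported by the surrounding environment still takes precedence.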

Usage

There are two basic ways to use pypandoc: with input files or with input strings.

    import pypandoc

    # With an input file: it will infer the input format from the filename
    output = pypandoc.convert_file('somefile.md', 'rst')

    # ...but you can overwrite the format via the `format` argument:
    output = pypandoc.convert_file('somefile.txt', 'rst', format='md')

    # alternatively you could just pass some string. In this case you need to
    # define the input format:
    output = pypandoc.convert_text('# some title', 'rst', format='md')
    # output == 'some title\r\n==========\r\n\r\n'

convert_text expects this string to be unicode or utf-8 encoded bytes. convert_* will always return a unicode string.

It's also possible to directly let pandoc write the output to a file. This is the only way to convert to some output formats (e.g. odt, docx, epub, epub3, pdf). In that case convert_*() will return an empty string.

    import pypandoc

    output = pypandoc.convert_file('somefile.md', 'docx', outputfile="somefile.docx")
    assert output == ""
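
If your input arrives as bytes in some other encoding, one option is to decode it yourself before handing it to convert_text. The helper below is a hypothetical sketch (`to_unicode` is not part of pypandoc) and assumes you know the source encoding.

```python
def to_unicode(data, encoding="utf-8"):
    """Return `data` as a str, decoding bytes with the given encoding."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


# Bytes read from a Latin-1 file are decoded explicitly; str passes through.
latin1_bytes = "Überschrift".encode("latin-1")
title = to_unicode(latin1_bytes, encoding="latin-1")
same = to_unicode("# some title")
print(title, same)
```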

It's also possible to specify multiple input files to pandoc, either as absolute paths, relative paths or file patterns.

    import pypandoc

    # convert all markdown files in a chapters/ subdirectory.
    pypandoc.convert_file('chapters/*.md', 'docx', outputfile="somefile.docx")

    # convert all markdown files in the book1 and book2 directories.
    pypandoc.convert_file(['book1/*.md', 'book2/*.md'], 'docx', outputfile="somefile.docx")

    # convert the front from another drive, and all markdown files in the chapter directory.
    pypandoc.convert_file(['D:/book_front.md', 'book2/*.md'], 'docx', outputfile="somefile.docx")
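
To preview which files a pattern would pick up before converting, the expansion can be approximated with the standard library. This is a sketch under assumptions: `expand_patterns` and its `sort_files` parameter are hypothetical names for this example, and pypandoc's own expansion may differ in detail.

```python
import glob
import os
import tempfile


def expand_patterns(patterns, sort_files=True):
    """Expand one pattern or a list of patterns into concrete file paths."""
    if isinstance(patterns, str):
        patterns = [patterns]
    files = []
    for pattern in patterns:
        matches = glob.glob(pattern)
        # glob does not guarantee an order, so sort within each pattern if asked.
        files.extend(sorted(matches) if sort_files else matches)
    return files


# Demo in a throwaway directory so the example is self-contained.
with tempfile.TemporaryDirectory() as root:
    chapters = os.path.join(root, "chapters")
    os.mkdir(chapters)
    for name in ("02-body.md", "01-intro.md", "notes.txt"):
        open(os.path.join(chapters, name), "w").close()
    found = expand_patterns(os.path.join(chapters, "*.md"))
    names = [os.path.basename(path) for path in found]

print(names)
```

Numbering chapter files (01-, 02-, ...) keeps the sorted order equal to the reading order.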

pathlib is also supported.

    import pypandoc
    from pathlib import Path

    # single file
    input = Path('somefile.md')
    output = input.with_suffix('.docx')
    pypandoc.convert_file(input, 'docx', outputfile=output)

    # convert all markdown files in a chapters/ subdirectory.
    pypandoc.convert_file(Path('chapters').glob('*.md'), 'docx', outputfile="somefile.docx")

    # convert all markdown files in the book1 and book2 directories.
    pypandoc.convert_file([*Path('book1').glob('*.md'), *Path('book2').glob('*.md')], 'docx', outputfile="somefile.docx")
    # pathlib globs must be unpacked if they are inside lists.

In addition to format, it is possible to pass extra_args. That makes it possible to access various pandoc options easily.

    output = pypandoc.convert_text(
        '<h1>Primary Heading</h1>',
        'md', format='html',
        extra_args=['--atx-headers'])
    # output == '# Primary Heading\r\n'

    output = pypandoc.convert_text(
        '# Primary Heading',
        'html', format='md',
        extra_args=['--base-header-level=2'])
    # output == '<h2>Primary Heading</h2>\r\n'

pypandoc now supports easy addition of pandoc filters.

    filters = ['pandoc-citeproc']
    pdoc_args = ['--mathjax', '--smart']
    output = pypandoc.convert_file(filename,
                                   to='html5',
                                   format='md',
                                   extra_args=pdoc_args,
                                   filters=filters)

Please pass any filters in as a list and not as a string.
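
As a rough picture of how a list of filters can map onto pandoc's command line, consider the sketch below. It is an assumption for illustration, not pypandoc's code: `filter_args` is a hypothetical helper, and rejecting plain strings is a design choice of this example. The `--filter` and `--lua-filter` options themselves are pandoc's own.

```python
def filter_args(filters):
    """Build pandoc command-line arguments for a list of filters."""
    if isinstance(filters, str):
        # A bare string is ambiguous, so this sketch insists on a list.
        raise TypeError("pass filters as a list, e.g. ['pandoc-citeproc']")
    args = []
    for name in filters:
        name = str(name)
        # Lua filters use a dedicated pandoc option.
        flag = "--lua-filter" if name.endswith(".lua") else "--filter"
        args.extend([flag, name])
    return args


built = filter_args(["pandoc-citeproc", "fix-links.lua"])
print(built)
```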

Please refer to pandoc -h and the official documentation for further details.

Dealing with Formatting Arguments

Pandoc supports custom formatting through the -V parameter. To use it through pypandoc, use code such as this:

    output = pypandoc.convert_file('demo.md', 'pdf', outputfile='demo.pdf',
                                   extra_args=['-V', 'geometry:margin=1.5cm'])

Note: it's important to separate -V and its argument within a list like that, or else it won't work. This gotcha has to do with the way subprocess.Popen works.
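
The effect can be observed without pandoc. The sketch below starts a small Python child process that simply echoes the arguments it received; `argv_seen_by_child` is a hypothetical helper written for this demonstration. When a list is passed to subprocess, each element becomes exactly one argument, and no shell splits it on spaces.

```python
import json
import subprocess
import sys


def argv_seen_by_child(extra_args):
    """Start a child process and report the arguments it actually received."""
    echo = "import json, sys; print(json.dumps(sys.argv[1:]))"
    result = subprocess.run(
        [sys.executable, "-c", echo, *extra_args],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


# Two list elements arrive as two arguments...
separate = argv_seen_by_child(["-V", "geometry:margin=1.5cm"])
# ...while one element containing a space arrives as a single argument.
joined = argv_seen_by_child(["-V geometry:margin=1.5cm"])
print(separate, joined)
```

In the second case the receiving program gets one argument that contains a space instead of an option followed by its value.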

Logging Messages

Pypandoc logs messages using the Python logging library. By default, it will send messages to the console, including any messages generated by Pandoc. If desired, this behaviour can be changed by adding handlers to the pypandoc logger before calling any functions. For example, to mute all logging add a null handler:

    import logging
    logging.getLogger('pypandoc').addHandler(logging.NullHandler())
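
Instead of muting, a handler can also redirect the messages, for example into an in-memory buffer for later inspection. This is a sketch using only the standard logging module; the message logged here is a made-up stand-in for whatever the library would emit.

```python
import io
import logging

logger = logging.getLogger("pypandoc")

# Collect messages in memory instead of printing them to the console.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.WARNING)

# Hypothetical stand-in for a message emitted during a conversion.
logger.warning("example message")

captured = buffer.getvalue()
print(captured)
```

The same pattern works with logging.FileHandler if the messages should go to a log file.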

Getting Pandoc Version

It can sometimes be useful to check which pandoc version is available on your system, or which particular pandoc binary pypandoc uses. For that, pypandoc provides the following utility functions. Example:

    print(pypandoc.get_pandoc_version())
    print(pypandoc.get_pandoc_path())
    print(pypandoc.get_pandoc_formats())

Related

  • pydocverter is a client for a service called Docverter, which offers pandoc as a service (plus some extra goodies).
  • See pyandoc for an alternative implementation of a pandoc wrapper from Kenneth Reitz. This one hasn't been active in a while though.
  • See panflute which provides convert_text similar to pypandoc's. Its focus is on writing and running pandoc filters though.

Contributing

Contributions are welcome. When opening a PR, please keep the following guidelines in mind:

  1. Before implementing, please open an issue for discussion.
  2. Make sure you have tests for the new logic.
  3. Make sure your code passes flake8 pypandoc/*.py tests.py
  4. Add yourself to the contributors list in README.md unless you are already there. In that case, update your list of contributions.

Note that for citeproc tests to pass you'll need to have pandoc-citeproc installed. If you installed a prebuilt wheel or conda package, it is already included.

Contributors

  • Jessica Tegner - New maintainer as of July 1, 2021
  • Valentin Haenel - String conversion fix
  • Daniel Sanchez - Automatic parsing of input/output formats
  • Thomas G. - Python 3 support
  • Ben Jao Ming - Fail gracefully if pandoc is missing
  • Ross Crawford-d'Heureuse - Encode input in UTF-8 and add Django example
  • Michael Chow - Decode output in UTF-8
  • Janusz Skonieczny - Support Windows newlines and allow encoding to be specified.
  • gabeos - Fix help parsing
  • Marc Abramowitz - Make setup.py fail hard if pandoc is missing, Travis, Dockerfile, PyPI badge, Tox, PEP-8, improved documentation
  • Daniel L. - Add extra_args example to README
  • Amy Guy - Exception handling for unicode errors
  • Florian Eßer - Allow Markdown extensions in output format
  • Philipp Wendler - Allow Markdown extensions in input format
  • Jan Katins - Handling output to a file, Travis to work on newer version of pandoc, return code checking, get_pandoc_version. Helped to fix the Travis build, new convert_* API. Former maintainer of pypandoc
  • Aaron Gonzales - Added better filter handling
  • David Lukes - Enabled input from non-plain-text files and made sure tests clean up template files correctly if they fail
  • valholl - Set up licensing information correctly and include examples to distribution version
  • Cyrille Rossant - Fixed bug by trimming out stars in the list of pandoc formats. Helped to fix the Travis build.
  • Paul Osborne - Don't require pandoc to install pypandoc.
  • Felix Yan - Added installation instructions for Arch Linux.
  • Kolen Cheung - Implement _get_pandoc_urls for installing arbitrary version as well as the latest version of pandoc. Minor: README, Travis, setup.py.
  • Rebecca Heineman - Added scanning code for finding pandoc in Windows
  • Andrew Barraford - Download destination.
  • Jesse Widner & Dominic Thorn - Add support for lua filters
  • Alex Kneisel - Added pathlib.Path support to convert_file.
  • Juho Vepsäläinen - Creator and former maintainer of pypandoc
  • Connor - Updated Dockerfile to Python 3.9 image and added docker compose file
  • Colin Bull - Added ability to control whether files are sorted before being passed to pandoc process.
  • Kurt McKee - Project infrastructure improvements

License

Pypandoc is available under the MIT license. See LICENSE for more details. Pandoc itself is available under the GPL2 license.

