Universal Office Converter - Convert between any document format supported by LibreOffice/OpenOffice.
Please note that there is a rewrite of Unoconv called "Unoserver": https://github.com/unoconv/unoserver/
We are running Unoserver successfully in production, and it’s now the recommended solution.
Unoserver does not have all the features of Unoconv. Which features it will get depends on what people want and on whether someone is willing to implement them.
Until Unoserver has all the major features people need, Unoconv is in bugfix mode and there will be no major changes. Once Unoserver has the major features of Unoconv, Unoconv will become unsupported.
Universal Office Converter (unoconv) is a command line tool to convert any document format that LibreOffice can import to any document format that LibreOffice can export. It makes use of LibreOffice’s UNO bindings for non-interactive conversion of documents.
For practical reasons we mention LibreOffice, but OpenOffice is supported by unoconv as well.
unoconv can be installed using packages coming from your distribution, or simply by copying the unoconv python script to your system.
If you installed unoconv by hand, make sure you have the required LibreOffice or OpenOffice packages installed. A hard requirement is the UNO python bindings, which are often inside a subpackage named libreoffice-pyuno or libobasis4.4-pyuno.
Various sub-packages are needed for specific import or export filters. For example, XML-based filters require the xsltfilter subpackage (e.g. libobasis4.4-xsltfilter).
Important | Neglecting these requirements will cause unoconv to fail with unhelpful and confusing error messages. |
To find a good Python installation to use to run unoconv, you can try a script I made:
cd /tmp
wget https://gist.githubusercontent.com/regebro/036da022dc7d5241a0ee97efdf1458eb/raw/1bc0655423d196acd79a5d9fa60d2baada8dd534/find_uno.py
python3 find_uno.py
It should list all Pythons that have the LibreOffice libraries installed.
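As a rough illustration of the kind of check involved, the following Python sketch tests whether a given interpreter can import the UNO bindings. It is not the implementation of find_uno.py, and the function name `can_import_uno` is hypothetical.

```python
import subprocess
import sys


def can_import_uno(python_exe):
    """Return True if the interpreter at `python_exe` can import the UNO bindings."""
    try:
        result = subprocess.run(
            [python_exe, "-c", "import uno, unohelper"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The interpreter is missing, not executable, or hung while loading.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Check the interpreter that is running this sketch.
    print(sys.executable, can_import_uno(sys.executable))
```

Passing the path of another interpreter (for example the one shipped with LibreOffice) tests that interpreter instead of the current one.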
unoconv starts its own office instance (if it cannot find an existing listener) that it then uses. There are some challenges to do this correctly, but in general this works fine.
Typically you would convert an ODT document to PDF by running:
unoconv -f pdf some-file.odt
However, you can always start an instance yourself at the default port 2002 (or specify another port with -p/--port) and after use you can tear it down:
unoconv --listener &
sleep 20
unoconv -f pdf *.odt
unoconv -f doc *.odt
unoconv -f html *.odt
kill -15 %-
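The same sequence can be driven from Python. The following is a minimal sketch, assuming unoconv is on PATH. The helper names (`listener_command`, `convert_command`, `running_listener`, `convert_all_odt`) are made up for illustration, and the 20 second delay simply mirrors the sleep used in the shell version.

```python
import contextlib
import glob
import subprocess
import time


def listener_command(port=None):
    """Command line that starts a listener, optionally on a non-default port."""
    cmd = ["unoconv", "--listener"]
    if port is not None:
        cmd += ["--port", str(port)]
    return cmd


def convert_command(output_format, paths, port=None):
    """Command line that converts `paths` to `output_format`."""
    cmd = ["unoconv", "-f", output_format]
    if port is not None:
        cmd += ["--port", str(port)]
    return cmd + list(paths)


@contextlib.contextmanager
def running_listener(port=None, startup_delay=20):
    """Start a listener, wait for it to come up, and stop it on exit."""
    listener = subprocess.Popen(listener_command(port))
    try:
        time.sleep(startup_delay)
        yield listener
    finally:
        listener.terminate()  # sends SIGTERM on POSIX, like `kill -15`
        listener.wait()


def convert_all_odt(formats=("pdf", "doc", "html")):
    """Convert every .odt file in the current directory to each format."""
    documents = sorted(glob.glob("*.odt"))
    if not documents:
        return
    with running_listener():
        for fmt in formats:
            subprocess.run(convert_command(fmt, documents), check=True)
```

Using a context manager means the listener is stopped even when one of the conversions raises an error.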
It is also possible to use a listener or LibreOffice instance that accepts connections on another system and use it from unoconv remotely. This way the conversion tasks are performed on a dedicated system instead of on the client system. This works only if you have a shared filesystem mounted at the same location.
Beware that the pyuno python module needs to be compiled with the exact same version of python that you are using to load it. A lot of people that run into problems loading pyuno are actually using a precompiled LibreOffice that they downloaded somewhere and that is incompatible with the python version on their system.
To solve this issue, the office suite ships with its own python interpreter, located in the 'program' directory. This one should work flawlessly.
The most recent unoconv works around this issue by automatically detecting incompatibilities, and restarting itself using a compatible python (the same one that ships with LibreOffice).
You can influence the automatic detection by setting the UNO_PATH environment variable to point to an alternative LibreOffice installation, e.g.:
UNO_PATH=/opt/libreoffice4.4 unoconv -f pdf some-file.odt
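When unoconv is launched from another program, the same effect can be had by passing a modified environment to the child process. This is a small sketch, assuming unoconv is on PATH; `env_with_uno_path` and `convert_with_uno_path` are hypothetical helper names.

```python
import os
import subprocess


def env_with_uno_path(uno_path, base_env=None):
    """Return a copy of the environment with UNO_PATH set to `uno_path`."""
    env = dict(os.environ if base_env is None else base_env)
    env["UNO_PATH"] = uno_path
    return env


def convert_with_uno_path(uno_path, input_path, output_format="pdf"):
    """Run unoconv against the LibreOffice installation at `uno_path`."""
    subprocess.run(
        ["unoconv", "-f", output_format, input_path],
        env=env_with_uno_path(uno_path),
        check=True,
    )
```

Copying the environment rather than editing `os.environ` keeps the override local to the one unoconv call.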
But you can also force another python by using it to execute unoconv, e.g.:
/opt/libreoffice4.4/program/python.bin unoconv -f pdf some-file.odt
or on macOS:
/Applications/LibreOffice.app/Contents/MacOS/python unoconv -f pdf some-file.odt
or on Windows:
"C:\Program Files (x86)\LibreOffice 4.4\program\python.exe" unoconv -f pdf some-file.odt
Tip | If you plan to use unoconv extensively (or in an automated fashion) it is more efficient to use the correct python interpreter directly, or even to put it directly in the shebang (the first line) of the unoconv script! |
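For automated use, the choice of interpreter can be made once in code. The sketch below only knows the example locations quoted above, which belong to a LibreOffice 4.4 install and will differ on other systems; `CANDIDATES`, `find_bundled_python` and `forced_command` are illustrative names, not part of unoconv.

```python
import os
import sys

# Example locations taken from the commands above; adjust to your installation.
CANDIDATES = {
    "linux": ["/opt/libreoffice4.4/program/python.bin"],
    "darwin": ["/Applications/LibreOffice.app/Contents/MacOS/python"],
    "win32": [r"C:\Program Files (x86)\LibreOffice 4.4\program\python.exe"],
}


def find_bundled_python(platform=None, exists=os.path.exists):
    """Return the first known bundled interpreter that exists, or None."""
    platform = sys.platform if platform is None else platform
    for prefix, paths in CANDIDATES.items():
        if platform.startswith(prefix):
            for path in paths:
                if exists(path):
                    return path
    return None


def forced_command(python_exe, unoconv_script, args):
    """Build a command line that runs the unoconv script with `python_exe`."""
    return [python_exe, unoconv_script, *args]
```

As in the shell examples, `unoconv_script` is the path to the unoconv script itself, so a bare `unoconv` only works when the script is in the current directory.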
Since OpenOffice 2.3 you do not need an X display for starting ooffice. However, you may need the openoffice.org-headless package from your distribution. Since OpenOffice 2.4 nothing special is needed; running in headless mode does not require X.
For any older OpenOffice releases, remember that ooffice requires an X display, even when using it in headless mode. One solution is to use Xvfb to create a headless X display for ooffice.
LibreOffice 3.6.0.1 or later is required to use unoconv under macOS. This is the first version distributed with an internal python script that works. No version of OpenOffice for macOS (3.4 is the current version) works because the necessary internal files are not included inside the application.
Some people have had difficulties using unoconv through webservices. Here is a list of probable causes and recommendations:
Use the latest version of unoconv (or GitHub master branch)
Use the most recent stable release of LibreOffice (less memory, more stable, fewer crashes)
Use the native LibreOffice python binary to run unoconv
Hardcode this native python path in the unoconv script shebang (or ensure PATH is set)
Ensure that the user running unoconv has write access to its HOME directory (ensure HOME is set)
Test with SELinux in permissive mode
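The HOME recommendation in this list is easy to verify before a web service starts handing work to unoconv. The following sketch is one possible preflight check; the function name `home_problem` and the wording of its messages are assumptions, and it does not cover the SELinux or interpreter items.

```python
import os


def home_problem(env=None):
    """Return a description of what is wrong with HOME, or None if it looks usable."""
    env = os.environ if env is None else env
    home = env.get("HOME")
    if not home:
        return "HOME is not set"
    if not os.path.isdir(home):
        return "HOME directory does not exist: " + home
    if not os.access(home, os.W_OK):
        return "HOME directory is not writable: " + home
    return None
```

A service could call `home_problem()` at startup, running as the same user that will run unoconv, and log the returned message instead of waiting for a conversion to fail.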
It is recommended to open the unoconv script and modify the very first line to point directly to your installed LibreOffice python binary, so replace this:
#!/usr/bin/env python
with something like this:
#!/opt/libreoffice4.4/program/python
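If the script is deployed automatically, this edit can be scripted as well. The following is a minimal sketch with a hypothetical helper, `replace_shebang`, that works on the script text. It assumes the script starts with a shebang line and refuses to touch it otherwise.

```python
def replace_shebang(script_text, interpreter):
    """Return `script_text` with its first line replaced by a shebang for `interpreter`."""
    first_line, _, rest = script_text.partition("\n")
    if not first_line.startswith("#!"):
        raise ValueError("script does not start with a shebang line")
    return "#!" + interpreter + "\n" + rest


def rewrite_script(path, interpreter):
    """Rewrite the shebang of the script stored at `path` in place."""
    with open(path, "r", encoding="utf-8") as handle:
        text = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(replace_shebang(text, interpreter))
```

For example, `rewrite_script("unoconv", "/opt/libreoffice4.4/program/python")` would apply the change shown above to a copy of the script in the current directory.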
If you encounter problems converting files, it often helps to try again. If you are using a listener, restarting the listener may help as well.
The reasons for conversion failures are unclear, and the failures are not deterministic. unoconv is not the only project to have noticed problems with import and export filters using PyUNO. We assume these are related to internal state or timing issues that cause the filters to fail under certain conditions.
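Because the failures are not deterministic, automated callers may want to retry a failed conversion a few times before giving up. The sketch below is one way to do that; `run_with_retries` is a hypothetical helper, and the attempt count and delay are arbitrary defaults rather than recommended values.

```python
import subprocess
import time


def run_with_retries(command, attempts=3, delay=2.0, runner=subprocess.run):
    """Run `command` until it succeeds; return the number of attempts used."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            runner(command, check=True)
            return attempt
        except subprocess.CalledProcessError:
            if attempt == attempts:
                raise  # out of attempts: report the last failure
            time.sleep(delay)
```

A call such as `run_with_retries(["unoconv", "-f", "pdf", "some-file.odt"])` raises `subprocess.CalledProcessError` only if every attempt fails.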
If you can reproduce the problem on a specific file, please take the time to open the file in LibreOffice directly and export it to the desired format. If this fails, it needs to be reported to the LibreOffice project directly. If that works, we need to know!
We are looking into this with the LibreOffice developers to:
Collaborate closer to find, report and fix unexpected failures
Allow end-users to increase debugging and improve reporting to the project
If you encounter a problem with converting documents using unoconv, please consider that this could be caused by a number of things:
incomplete LibreOffice installation
LibreOffice bug or regression specific to your version/distribution
LibreOffice import or export filter issue
problem related to stale lock files
problem related to the source document
problem related to permissions or SELinux
problem related to the python UNO bindings
problem related to the unoconv python script
It is recommended to follow all of the below steps to pinpoint the problem:
if this is the first time you are using LibreOffice/OpenOffice, make sure you have all the required sub-packages installed, depending on the distribution this could be the xsltfilter, headless, writer, calc, impress or draw sub-packages.
check if there is no existing LibreOffice process running on the system that could interfere with proper functioning
# pgrep -l 'office|writer|calc'
check that there are no stale lock files present, e.g. '.~lock.file.pdf#' or '.~lock.index.html#'
check that the LibreOffice instance handling UNO requests is not handling multiple requests at the same time
try using the latest unoconv release, or the latest version on GitHub at: https://github.com/dagwieers/unoconv/downloads
try the conversion by opening the file in LibreOffice and exporting it through LibreOffice directly
try unoconv with a different minor or major LibreOffice version to test whether it is a regression in LibreOffice
try to load the UNO bindings in python manually:
do this with the python executable that ships with the LibreOffice package/installer
# /opt/libreoffice4.4/program/python.bin -c 'import uno, unohelper'
or alternatively, run the distribution python (with the distribution LibreOffice)
# python -c 'import uno, unohelper'
try unoconv with a different python interpreter manually:
# /opt/libreoffice4.4/program/python.bin unoconv -f pdf test-file.odt
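The stale lock file check from the steps above can be automated. This sketch assumes that lock files are named like the two examples given, that is, starting with `.~lock.` and ending with `#`; the helper name `find_lock_files` is made up for illustration.

```python
from pathlib import Path


def find_lock_files(directory):
    """Return lock files such as '.~lock.file.pdf#' found directly in `directory`."""
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file()
        and path.name.startswith(".~lock.")
        and path.name.endswith("#")
    )


if __name__ == "__main__":
    for lock_file in find_lock_files("."):
        print(lock_file)
```

The sketch only reports the files. Whether a lock file is really stale depends on whether a LibreOffice process is still using the document, so removing it is left to the user.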
If you tried all of the above, and the issue still remains, the issue might still be related to import/export filters, LibreOffice or unoconv, so please report any information to reproduce the problem on the GitHub issue-tracker at: https://github.com/dagwieers/unoconv/issues
And do mention that you already tried the above hints to troubleshoot the issue.
If you’re interested in helping out with development, here are some pointers to interesting sources:
[Tutorial] Import uno module to a different Python install: http://user.services.openoffice.org/en/forum/viewtopic.php?f=45&t=36370&p=166783
UDK: UNO Development Kit: http://udk.openoffice.org/
Python-UNO bridge: http://www.openoffice.org/udk/python/python-bridge.html
Python and OpenOffice.org: http://wiki.services.openoffice.org/wiki/Python
OpenOffice.org developer manual: http://api.openoffice.org/DevelopersGuide/DevelopersGuide.html
Framework/Article/Filter/FilterList OOo 2 1: http://wiki.services.openoffice.org/wiki/Framework/Article/Filter/FilterList_OOo_2_1
Framework/Article/Filter/FilterList OOo 3 0: http://wiki.services.openoffice.org/wiki/Framework/Article/Filter/FilterList_OOo_3_0
Other implementations using python and UNO:
convwatch: http://cgit.freedesktop.org/libreoffice/core/tree/bin/convwatch.py
oooconv: https://svn.infrae.com/oooconv/trunk/src/oooconv/filters.py
officeshots.org: http://code.officeshots.org/trac/officeshots/browser/trunk/factory/src/backends/oooserver.py
cloudooo: http://svn.erp5.org/erp5/trunk/utils/cloudooo.handler/ooo/cloudooo/handler/ooo/
Other tools that are useful or similar in operation:
Text based document generation: http://www.methods.co.nz/asciidoc/
DocBook to OpenDocument XSLT: http://open.comsultia.com/docbook2odf/
Simple (and stupid) converter from OpenDocument Text to plain text: http://stosberg.net/odt2txt/
Another python tool to aid in converting files using UNO: http://www.artofsolving.com/files/DocumentConverter.py http://www.artofsolving.com/opensource/pyodconverter