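PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents that fully preserves the layout, supporting services such as Google/DeepL/Ollama/OpenAI and available via CLI/GUI/MCP/Docker/Zotero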
Byaidu/PDFMathTranslate
PDF scientific paper translation and bilingual comparison.
- 📊 Preserve formulas, charts, table of contents, and annotations (preview).
- 🌐 Support multiple languages and diverse translation services.
- 🤖 Provide a command-line tool, an interactive user interface, and Docker.
Feel free to provide feedback in GitHub Issues or the Telegram Group.
For details on how to contribute, please consult the Contribution Guide.
- [May 9, 2025] pdf2zh 2.0 Preview Version #586: The Windows ZIP file and Docker image are now available.
Note
2.0 has moved to a new repository under the organization: PDFMathTranslate/PDFMathTranslate-next
The official release of version 2.0 has been published.
- [Mar. 3, 2025] Experimental support for the new backend BabelDOC; WebUI added as an experimental option (by @awwaawwa)
- [Feb. 22 2025] Better release CI and well-packaged windows-amd64 exe (by @awwaawwa)
- [Dec. 24 2024] The translator now supports local models on Xinference (by @imClumsyPanda)
- [Dec. 19 2024] Non-PDF/A documents are now supported using -cp (by @reycn)
- [Dec. 13 2024] Additional support for backend by (by @YadominJinta)
- [Dec. 10 2024] The translator now supports OpenAI models on Azure (by @yidasanqian)
You can try our application out using either of the following demos:
- Public free service online without installation (recommended).
- Immersive Translate - BabelDOC: 1000 free pages per month (recommended).
- Demo hosted on HuggingFace
- Demo hosted on ModelScope without installation.
Note that the computing resources of the demo are limited, so please avoid abusing them.
For different use cases, we provide distinct methods to use our program:
1. UV install
Python installed (3.10 <= version <= 3.12)
Install our package:
pip install uv
uv tool install --python 3.12 pdf2zh
Execute translation; files are generated in the current working directory:
pdf2zh document.pdf
2. Windows exe
Download pdf2zh-version-win64.zip from the release page
Unzip and double-click pdf2zh.exe to run.
3. Graphical user interface
Python installed (3.10 <= version <= 3.12)
Install our package:
pip install pdf2zh
Start using in browser:
pdf2zh -i
If your browser does not open automatically, go to
http://localhost:7860/
See the documentation for GUI for more details.
4. Docker
Pull and run:
docker pull byaidu/pdf2zh
docker run -d -p 7860:7860 byaidu/pdf2zh
Open in browser:
http://localhost:7860/
For docker deployment on cloud service:
5. Zotero Plugin
See Zotero PDF2zh for more details.
6. Command line
Python installed (3.10 <= version <= 3.12)
Install our package:
pip install pdf2zh
Execute translation; files are generated in the current working directory:
pdf2zh document.pdf
Tip
If you're using Windows and cannot open the file after downloading, please install vc_redist.x64.exe and try again.
If you cannot access Docker Hub, please try the image on GitHub Container Registry.
docker pull ghcr.io/byaidu/pdfmathtranslate
docker run -d -p 7860:7860 ghcr.io/byaidu/pdfmathtranslate
The program needs an AI model (wybxc/DocLayout-YOLO-DocStructBench-onnx) before it can work, and some users are unable to download it due to network issues. If you have a problem downloading this model, the following environment variable provides a workaround:
set HF_ENDPOINT=https://hf-mirror.com
For PowerShell users:
$env:HF_ENDPOINT = "https://hf-mirror.com"
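The two commands above target the Windows command prompt and PowerShell. On Linux or macOS, a rough equivalent might look like the sketch below. It assumes a POSIX-style shell such as bash or zsh, which the original instructions do not cover, and the setting only lasts for the current shell session.

```shell
# Assumption: a POSIX-style shell (bash, zsh, sh) on Linux or macOS.
# Point Hugging Face model downloads at the mirror for this session only.
export HF_ENDPOINT=https://hf-mirror.com

# Print the value to confirm it is exported to child processes such as pdf2zh.
printenv HF_ENDPOINT
```

To keep the setting across sessions, the `export` line could be added to the shell's startup file (for example `~/.bashrc`), which is a choice left to the user.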
If this solution does not work for you, or if you encounter other issues, please refer to the frequently asked questions.
Execute the translation command in the command line to generate the translated document example-mono.pdf and the bilingual document example-dual.pdf in the current working directory. Google is used as the default translation service. More supported translation services can be found HERE.
In the following table, we list all advanced options for reference:
Option | Function | Example |
---|---|---|
files | Local files | pdf2zh ~/local.pdf |
links | Online files | pdf2zh http://arxiv.org/paper.pdf |
-i | Enter GUI | pdf2zh -i |
-p | Partial document translation | pdf2zh example.pdf -p 1 |
-li | Source language | pdf2zh example.pdf -li en |
-lo | Target language | pdf2zh example.pdf -lo zh |
-s | Translation service | pdf2zh example.pdf -s deepl |
-t | Multi-threads | pdf2zh example.pdf -t 1 |
-o | Output dir | pdf2zh example.pdf -o output |
-f, -c | Exceptions | pdf2zh example.pdf -f "(MS.*)" |
-cp | Compatibility Mode | pdf2zh example.pdf --compatible |
--skip-subset-fonts | Skip font subset | pdf2zh example.pdf --skip-subset-fonts |
--ignore-cache | Ignore translate cache | pdf2zh example.pdf --ignore-cache |
--share | Public link | pdf2zh -i --share |
--authorized | Authorization | pdf2zh -i --authorized users.txt [auth.html] |
--prompt | Custom Prompt | pdf2zh --prompt [prompt.txt] |
--onnx | Use a custom DocLayout-YOLO ONNX model | pdf2zh --onnx [onnx/model/path] |
--serverport | Use a custom WebUI (Gradio) server port | pdf2zh --serverport 7860 |
--dir | Batch translation | pdf2zh --dir /path/to/translate/ |
--config | Configuration file | pdf2zh --config /path/to/config/config.json |
--babeldoc | Use the experimental backend BabelDOC to translate | pdf2zh --babeldoc -s openai example.pdf |
--mcp | Enable MCP STDIO mode | pdf2zh --mcp |
--sse | Enable MCP SSE mode | pdf2zh --mcp --sse |
For detailed explanations of each option, please refer to our document about Advanced Usage.
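The table lists each option on its own. As a minimal sketch, several of them can presumably be combined in one invocation. The file name `example.pdf` and the directory `output` are placeholders, the variable name `PDF2ZH_ARGS` is made up for this illustration, and the combination itself is an assumption rather than something the table states. The snippet prints the command first and only runs it when `pdf2zh` is installed and the input file exists.

```shell
# Options are taken from the table above; the combination is an assumption.
# example.pdf and output are placeholder names.
PDF2ZH_ARGS="example.pdf -p 1 -li en -lo zh -t 1 -o output"

# Show the full command so it can be reviewed before anything runs.
echo "pdf2zh $PDF2ZH_ARGS"

# Run only when the tool is on PATH and the placeholder input file exists.
if command -v pdf2zh >/dev/null 2>&1 && [ -f example.pdf ]; then
  mkdir -p output
  # Left unquoted on purpose so sh/bash split the string into separate arguments.
  pdf2zh $PDF2ZH_ARGS
fi
```

No `-s` option is passed here, so the default translation service (Google, as noted earlier) would be used. Services such as DeepL typically need their own credentials, which are covered in the Advanced Usage document.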
For downstream applications, please refer to our document about API Details for further information about:
- Python API, how to use the program in other Python programs
- HTTP API, how to communicate with a server with the program installed
Parse layout with DocLayNet-based models, PaddleX, PaperMage, SAM2
Fix page rotation, table of contents, format of lists
Fix pixel formula in old papers
Async retry except KeyboardInterrupt
Knuth–Plass algorithm for western languages
Support non-PDF/A files
Immersive Translation sponsors monthly Pro membership redemption codes for active contributors to this project; see details at: CONTRIBUTOR_REWARD.md
New backend: BabelDOC
Document merging: PyMuPDF
Document parsing: Pdfminer.six
Document extraction: MinerU
Document preview: Gradio PDF
Multi-threaded translation: MathTranslate
Layout parsing: DocLayout-YOLO
Document standard: PDF Explained, PDF Cheat Sheets
Multilingual font: Go Noto Universal