Benchmarking PDF libraries

py-pdf/benchmarks

This benchmark is about reading pure (born-digital) PDF files: not scanned documents, and not documents to which OCR has been applied.

Benchmarking machine

Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz

Input Documents

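| # | Name | File Size | Pages |
|---|------|-----------|-------|
| 1 | 2201.00214 | 2.4MiB | 22 |
| 2 | GeoTopo-book | 5.1MiB | 117 |
| 3 | 2201.00151 | 1.5MiB | 12 |
| 4 | 1707.09725 | 7.0MiB | 134 |
| 5 | 2201.00021 | 2.6MiB | 10 |
| 6 | 2201.00037 | 2.9MiB | 33 |
| 7 | 2201.00069 | 14.7MiB | 15 |
| 8 | 2201.00178 | 2.3MiB | 16 |
| 9 | 2201.00201 | 1.3MiB | 9 |
| 10 | 1602.06541 | 2.9MiB | 16 |
| 11 | 2201.00200 | 284.8KiB | 7 |
| 12 | 2201.00022 | 1.2MiB | 14 |
| 13 | 2201.00029 | 797.6KiB | 12 |
| 14 | 1601.03642 | 1004.9KiB | 8 |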

Libraries

| Name | Last PyPI Release | License | Version | Dependencies |
|------|-------------------|---------|---------|--------------|
| pypdfium2 | 2024-12-19 | Apache-2.0 or BSD-3-Clause | 4.30.1 | PDFium (Foxit/Google) |
| pdfminer.six | 2025-05-06 | MIT/X | 20250506 | |
| pdfplumber | 2025-06-12 | MIT | 0.11.7 | pdfminer.six |
| pdfrw | 2017-09-18 | MIT | 0.4 | |
| pdftotext | - | GPL | 0.86.1 | build-essential libpoppler-cpp-dev pkg-config python3-dev |
| PyMuPDF | 2025-06-12 | GNU AFFERO GPL 3.0 / Commercial | 1.26.1 | MuPDF |
| pypdf | 2025-06-29 | BSD 3-Clause | 5.7.0 | |
| Tika | 2025-03-26 | Apache v2 | 3.1.0 | Apache Tika |

Text Extraction Speed

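| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---------|---------|---|---|---|---|---|---|---|---|---|----|----|----|----|----|
| 1 | PyMuPDF | 0.1s | 0.4s | 0.3s | 0.2s | 0.2s | 0.0s | 0.1s | 0.0s | 0.1s | 0.0s | 0.1s | 0.0s | 0.1s | 0.0s | 0.0s |
| 2 | pypdfium2 | 0.1s | 0.5s | 0.3s | 0.2s | 0.2s | 0.0s | 0.1s | 0.0s | 0.0s | 0.0s | 0.1s | 0.0s | 0.0s | 0.0s | 0.0s |
| 3 | Tika | 0.2s | 0.8s | 0.5s | 0.3s | 0.3s | 0.1s | 0.2s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.0s | 0.0s |
| 4 | pdftotext | 0.3s | 0.7s | 0.9s | 0.2s | 0.8s | 0.1s | 0.3s | 0.4s | 0.1s | 0.1s | 0.2s | 0.1s | 0.1s | 0.0s | 0.0s |
| 5 | pypdf | 3.5s | 26.2s | 6.4s | 6.8s | 3.3s | 0.9s | 1.6s | 0.6s | 0.6s | 0.5s | 0.8s | 0.6s | 0.6s | 0.5s | 0.3s |
| 6 | pdfminer.six | 5.8s | 35.1s | 16.6s | 10.2s | 5.5s | 1.5s | 2.5s | 1.1s | 1.6s | 1.1s | 2.0s | 1.5s | 1.4s | 0.7s | 0.6s |
| 7 | pdfplumber | 9.5s | 60.9s | 16.6s | 17.0s | 10.7s | 3.1s | 5.3s | 2.6s | 2.5s | 2.3s | 3.8s | 2.5s | 2.7s | 1.4s | 1.3s |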
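
The per-document timings above could be collected with a small harness along the following lines. This is only a sketch, not the benchmark's actual code: `time_extraction` and `extract_with_pypdf` are hypothetical helper names, and the sketch assumes wall-clock time for a single run per document is what gets reported.

```python
import time
from typing import Callable


def time_extraction(extract: Callable[[str], str], paths: list[str]) -> dict[str, float]:
    """Return the wall-clock seconds spent extracting text from each path."""
    timings: dict[str, float] = {}
    for path in paths:
        start = time.perf_counter()
        extract(path)  # the extracted text is discarded; only the duration matters
        timings[path] = time.perf_counter() - start
    return timings


def extract_with_pypdf(path: str) -> str:
    """Extract the text of every page with pypdf (hypothetical helper)."""
    from pypdf import PdfReader  # imported lazily so the harness works without pypdf

    reader = PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```

Calling `time_extraction(extract_with_pypdf, ["2201.00214.pdf"])` (the file path is an assumption) would return one duration per document; any other library can be plugged in by passing a different `extract` callable.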

Image Extraction Speed

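| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---------|---------|---|---|---|---|---|---|---|---|---|----|----|----|----|----|
| 1 | PyMuPDF | 0.5s | 0.3s | 0.5s | 0.0s | 1.6s | 0.4s | 0.0s | 2.9s | 0.4s | 0.4s | 0.1s | 0.0s | 0.3s | 0.2s | 0.0s |
| 2 | pypdfium2 | 1.1s | 1.2s | 1.8s | 0.0s | 3.3s | 0.9s | 0.2s | 5.1s | 0.7s | 0.6s | 0.4s | 0.0s | 0.5s | 0.2s | 0.0s |
| 3 | pypdf | 4.2s | 21.6s | 6.1s | 5.7s | 11.8s | 1.3s | 0.6s | 6.5s | 1.2s | 1.2s | 0.8s | 0.2s | 0.9s | 0.2s | 0.2s |
| 4 | pdfminer.six | 7.4s | 43.9s | 17.5s | 12.7s | 15.4s | 1.6s | 2.5s | 1.6s | 1.5s | 1.0s | 1.8s | 1.2s | 1.3s | 0.7s | 0.5s |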

Watermarking Speed

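| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---------|---------|---|---|---|---|---|---|---|---|---|----|----|----|----|----|
| 1 | pdfrw | 0.1s | 0.1s | 0.5s | 0.0s | 0.3s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.0s | 0.1s | 0.0s | 0.0s |
| 2 | PyMuPDF | 0.2s | 0.4s | 0.6s | 0.2s | 0.4s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.0s | 0.1s | 0.0s | 0.0s |
| 3 | pypdf | 0.5s | 0.6s | 2.0s | 0.4s | 1.1s | 0.2s | 0.3s | 0.3s | 0.3s | 0.2s | 0.3s | 0.1s | 0.6s | 0.1s | 0.1s |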

Watermarking File Size

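| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---------|---------|---|---|---|---|---|---|---|---|---|----|----|----|----|----|
| 1 | pypdf | 3.4MB | 2.5MB | 5.6MB | 1.6MB | 7.2MB | 2.7MB | 3.1MB | 15.4MB | 2.4MB | 1.3MB | 3.0MB | 0.3MB | 1.2MB | 0.8MB | 1.0MB |
| 2 | pdfrw | 3.5MB | 2.5MB | 5.7MB | 1.6MB | 7.3MB | 2.7MB | 3.1MB | 15.4MB | 2.4MB | 1.3MB | 3.0MB | 0.3MB | 1.2MB | 0.8MB | 1.0MB |
| 3 | PyMuPDF | 3.7MB | 2.7MB | 6.9MB | 1.7MB | 8.5MB | 2.8MB | 3.4MB | 15.5MB | 2.5MB | 1.4MB | 3.2MB | 0.3MB | 1.3MB | 0.9MB | 1.1MB |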

Text Extraction Quality

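| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---------|---------|---|---|---|---|---|---|---|---|---|----|----|----|----|----|
| 1 | pypdfium2 | 97% | 99% | 97% | 94% | 99% | 98% | 96% | 99% | 99% | 99% | 99% | 98% | 78% | 99% | 99% |
| 2 | pypdf | 96% | 99% | 95% | 93% | 98% | 99% | 96% | 97% | 99% | 99% | 99% | 99% | 78% | 100% | 99% |
| 3 | PyMuPDF | 96% | 98% | 96% | 93% | 97% | 98% | 95% | 99% | 98% | 98% | 98% | 97% | 77% | 98% | 99% |
| 4 | Tika | 95% | 99% | 98% | 92% | 97% | 98% | 96% | 93% | 97% | 98% | 93% | 98% | 73% | 98% | 96% |
| 5 | pdftotext | 91% | 96% | 93% | 91% | 94% | 92% | 96% | 96% | 96% | 97% | 83% | 94% | 77% | 96% | 79% |
| 6 | pdfminer.six | 89% | 95% | 79% | 86% | 92% | 86% | 93% | 95% | 93% | 92% | 92% | 93% | 71% | 98% | 86% |
| 7 | pdfplumber | 75% | 94% | 84% | 68% | 97% | 61% | 93% | 61% | 89% | 57% | 59% | 67% | 58% | 98% | 67% |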
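
The Average column appears to be the arithmetic mean of the fourteen per-document values, and the rows are ranked by it. That reading is an assumption, since the README does not state how the average is formed, but the rounded per-document scores reproduce it for the three rows checked in this sketch:

```python
from statistics import mean

# Per-document quality scores (percent), copied from the table above.
scores = {
    "pypdfium2": [99, 97, 94, 99, 98, 96, 99, 99, 99, 99, 98, 78, 99, 99],
    "pypdf": [99, 95, 93, 98, 99, 96, 97, 99, 99, 99, 99, 78, 100, 99],
    "pdfplumber": [94, 84, 68, 97, 61, 93, 61, 89, 57, 59, 67, 58, 98, 67],
}

# Assumed aggregation: unweighted mean over the 14 input documents.
averages = {name: mean(values) for name, values in scores.items()}

# Rank libraries from best to worst average, as the table does.
ranking = sorted(averages, key=averages.get, reverse=True)

for name in ranking:
    print(f"{name}: {averages[name]:.1f}%")
```

Because the published per-document figures are already rounded, a recomputed mean can differ slightly from one computed on the raw measurements.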
