pymupdf
Here are 109 public repositories matching this topic...
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
- Updated Mar 15, 2025 - Python
Open source Python library for converting PDF to DOCX.
- Updated Sep 23, 2024 - Python
A CLI toolset to generate table of contents for PDF files automatically.
- Updated Nov 26, 2023 - Python
Demos, examples and utilities using PyMuPDF
- Updated Jul 1, 2024 - Jupyter Notebook
(eBook and PDF translation) A multilingual eBook processing tool that supports all eBook formats. It offers online and offline translation while preserving the original layout, works with both scanned and digital PDFs, and has an elegant user interface. Billed as the world's highest-performing open-source, layout-preserving eBook translator.
- Updated Mar 16, 2025 - Python
Extract annotations (highlights and scribbles) from PDFs, EPUBs, and notebooks marked up on reMarkable tablets. Export to Markdown, PDF, PNG, or SVG.
- Updated May 26, 2024 - Python
A pure-Python PDF viewer that provides the same functionality as other well-known PDF viewers.
- Updated Jul 14, 2023 - Python
A simple implementation of a PDF-to-audio converter.
- Updated Mar 30, 2021 - Python
Multimodal LLM Application with PyMuPDF4LLM
- Updated Oct 4, 2024 - Jupyter Notebook
A collection of PDF parsing libraries, such as the AI-based docling, Claude, OpenAI, llama-vision, and unstructured-io, as well as pdfminer, PyMuPDF, and pdfplumber, for efficient snapshot, text, table, and metadata extraction.
- Updated Mar 3, 2025 - Python
pdfgui_tools is a user interface tool developed in Qt and Python that integrates with poppler-utils and PyPDF2 for PDF document management. It's a simple and user-friendly tool that includes various utilities.
- Updated Feb 5, 2024 - Python
Fills the gap left by the lack of an open-source PDF editor that can draw and add notes.
- Updated Jun 17, 2024 - Python
Useful PDF-related productivity tool.
- Updated Oct 12, 2021 - Python
Automated extraction of specific information from invoices, achieving over 95% accuracy.
- Updated Jul 14, 2023 - Python
UVA Data Science Capstone project for the Internet Archive. The project aimed to classify PDFs as research or non-research documents using image-based and text-based approaches. The image-based models used CNN transfer learning, and the text-based approach used XGBoost.
- Updated May 7, 2021 - Jupyter Notebook