# pd3f
Here are 7 public repositories matching this topic...
🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based
- Updated Oct 13, 2023 - HTML
📜 Dehyphenation of broken text (mainly German), e.g., text extracted from a PDF
- Updated Mar 8, 2022 - Python
📑 Python package to reconstruct the original continuous text from PDFs with language models
- Updated Sep 8, 2023 - Jupyter Notebook