pdf-ocr
Here are 26 public repositories matching this topic...
#1 PDF Application on GitHub that lets you edit PDFs on any device anywhere
- Updated Feb 20, 2026 - TypeScript
A Privacy First PDF Toolkit
- Updated Feb 20, 2026 - JavaScript
AI Chatbot for analyzing/extracting information from data in conversational format.
- Updated Apr 14, 2025 - Python
An out-of-the-box local Web UI for DeepSeek-OCR. Built with FastAPI + Vue.js, it supports PDF/Image uploads, progress tracking, and result visualization with bounding boxes. Easily experience the power of a top-tier OCR model.
- Updated Dec 6, 2025 - Python
Building on the existing general text recognition capabilities, this release adds handwriting OCR, layout detection, and table detection and recognition, covering printed text, handwritten text, and document structure analysis.
- Updated Jan 7, 2026 - Java
Open-source batch OCR workbench: a free, fully local alternative to ABBYY FineReader, suited to digitizing books and documents. Powered by Ollama + GLM-OCR + PP-DocLayoutV3, ~0.5s/page on RTX 4090. Three-panel editor, layout-aware, PDF/image batch processing, Markdown/Word export.
- Updated Feb 7, 2026 - JavaScript
Convert scanned PDFs into searchable text locally using Vision LLMs (olmOCR). 100% private, offline, and free. Features a modern Web UI & CLI.
- Updated Dec 23, 2025 - Python
A powerful and user-friendly tool based on OCRmyPDF, offering a seamless GUI for conversion of image-based PDFs into searchable text.
- Updated Oct 28, 2023 - Python
A simple, free tool for extracting text from scanned PDFs and images using OCR, and converting images to PDFs. It processes files locally in the browser, ensuring privacy and security while enabling users to effortlessly convert documents and images into editable text or PDF format.
- Updated Jan 15, 2025 - HTML
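LLM-based PDF OCR tool and Markdown/LaTeX article translation tool. Supports paragraph-by-paragraph translation and direct proofreading. Supports mathematical formulas. Built on large language model (LLM) APIs.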
- Updated Feb 13, 2026 - JavaScript
Client-side tool to check and fix PDF accessibility. Analyze PDFs for text layer accessibility, detect image-only pages, and rebuild selectable text layers with browser-based OCR—no server or backend required. Perfect for privacy-first and legacy environments.
- Updated Dec 10, 2025 - JavaScript
A document processing service designed to extract structured text (Markdown) from various file formats using OCR (Tesseract) and native parsers.
- Updated Jan 27, 2026 - Python
A tool to compare and merge PDFs, display the differences between them, and run OCR on them.
- Updated Jan 21, 2024 - Python
OCR-enabled PDF text extraction in Python with pypdf and Azure Document Intelligence.
- Updated Jan 31, 2026 - Python
GPicy - AI-driven image processing for your sporadic needs.
- Updated Apr 1, 2024
PDF to Markdown OCR using vision-language models with multi-GPU support
- Updated Oct 17, 2025 - Python
PDFScalpel is a forensic PDF analysis and CTF toolkit for security researchers, digital forensics analysts, and penetration testers, providing deep insight into PDF structure, encryption, malware, steganography, metadata, revisions, and document authenticity.
- Updated Feb 3, 2026 - Python