ocr-python
Here are 429 public repositories matching this topic...
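Free, open-source, offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Ships with built-in language libraries for multiple languages.
Updated May 31, 2025 - Python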
CnOCR: Awesome Chinese/English OCR Python toolkit based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation. (A Chinese/English OCR Python package based on PyTorch/MXNet.)
Updated Jun 28, 2025 - Python
Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCR engines and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.
Updated Apr 29, 2025 - Python
Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition
Updated Sep 5, 2022 - Jupyter Notebook
OCR, Archive, Index and Search: Implementation-agnostic OCR framework.
Updated Nov 3, 2023 - Jupyter Notebook
Lightweight & fast OCR models for license plate text recognition.
Updated Jul 4, 2025 - Python
A carefully designed OCR pipeline for universal bordered table recognition and reconstruction.
Updated Jan 10, 2023 - C++
Perform text detection in a variety of languages with your computer webcam using Google Tesseract OCR and OpenCV. This script achieves a real-time OCR effect via multi-threading.
Updated Jan 30, 2023 - Python
Anansi is a Python-based crawler that combines computer vision (cv2 and FFmpeg) with OCR (EasyOCR and Tesseract) to find and extract questions and correct answers from video files of popular TV game shows in the Balkan region.
Updated Sep 26, 2022 - Python
Manga OCR snipping application for desktop
Updated Jan 7, 2023 - Python
A collection of PDF parsing libraries, such as the AI-based docling, claude, openai, gemini, Meta's llama-vision, and unstructured-io, as well as pdfminer, pymupdf, pdfplumber, etc., for efficient snapshot, text, table, and metadata extraction.
Updated Jul 1, 2025 - Python
FLOSS software for Persian Optical Character Recognition
Updated Jun 19, 2024 - Jupyter Notebook
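Python3 package for Chinese/English OCR, with the paddleocr-v4 onnx model (~14MB). Inference is based on the ppocr-v4-onnx model, enabling accurate OCR prediction in milliseconds on CPU; Chinese/English OCR in general scenarios reaches open-source SOTA.
Updated Jan 22, 2025 - Python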
PDF text data extraction web app with OCR for scanned documents
Updated Jun 5, 2024 - Python
Easter2.0: IMPROVING CONVOLUTIONAL MODELS FOR HANDWRITTEN TEXT RECOGNITION
Updated Apr 25, 2023 - Jupyter Notebook
Multimodal document parser for high-quality data understanding and extraction
Updated Jun 28, 2025 - Python
OCR Tamil is a tool that detects and recognizes Tamil text in natural-scene images with high accuracy.
Updated Jul 3, 2025 - Python
Custom C++ implementation of deep-learning-based OCR
Updated Apr 18, 2024 - C++
Turn any OCR model into an online inference API endpoint 🚀 🌖
Updated Mar 21, 2025 - Python