Jupyter notebooks testing different OCR models for document parsing (Dolphin, MonkeyOCR, Marker, Nanonets, ...)


AdemBoukhris457/Documents-Parsing-Lab


The Parsing Lab: OCR & Document Intelligence

A curated collection of Jupyter notebooks for experimenting with state-of-the-art OCR, document parsing, table extraction, and chart understanding techniques. This repository enables easy benchmarking and practical usage of the latest open-source and cloud-based solutions for document image processing.


🚀 Doctra Quick Start

This section is a quick start guide for Doctra, a tool for structured document parsing without Vision Language Models (VLMs). Learn how to parse PDF documents, extract structured content (text, tables, charts, figures), and generate multiple output formats.

| Notebook | Description |
| --- | --- |
| 01_doctra_quick_start.ipynb | Quick start guide for Doctra structured document parsing |

📚 Notebooks Overview

| Notebook | Description |
| --- | --- |
| bytedance-dolphin-image-parsing.ipynb | Document page parsing with Dolphin by ByteDance |
| Llama-3.1-Nemotron-Nano-VL-8B-V1_parsing_documents.ipynb | Testing document parsing performance with Llama-3.1-Nemotron-Nano-VL-8B-V1 |
| docling-documents-parsing-and-tables-extraction.ipynb | Parsing and table extraction with Docling |
| typhoon-ocr-7b-docs-pages-parser.ipynb | Evaluating Typhoon_ocr_7b document parsing capabilities across various use cases |
| florence-2-large-ocr-documents-pages.ipynb | OCR of document pages using Florence 2 Large |
| florence-2-large-ocr-images-real-life-scenarios.ipynb | Real-life scenario OCR with Florence 2 Large |
| got-ocr2-0-docs-parsing.ipynb | Document page parsing with GOT-OCR2.0 and Gemini 2.5 Flash |
| marker-docs-parsing.ipynb | Marker-based document parsing experiments |
| mistralocr-docs-parsing.ipynb | Document parsing using MistralOCR |
| monkeyocr-docs-pages-parsing.ipynb | Document parsing with MonkeyOCR |
| nanonets-OCR-s_docs_parsing.ipynb | Advanced document parsing using Nanonets-OCR-s |
| ollama-llama3-2-vision-usage.ipynb | Using Llama3-2 Vision for document parsing |
| paddleocr-3-0-docs-parsing.ipynb | Parsing with PaddleOCR 3.0 PP-StructureV3 |
| pix2text-docs-pages-parsing.ipynb | Document parsing using Pix2Text |
| smoldocling-documents-understanding.ipynb | Document understanding with SmolDocling |
| zerox-pdf-parsing.ipynb | PDF parsing experiments with Zerox |
| qwen2-vl-2b-docs-parsing.ipynb | Document page parsing with Qwen2-VL-2B |
| OCRFlux_3B_Docs_Parsing.ipynb | Document parsing with OCRFlux-3B on Lightning AI |
| granite-docling-258m-document-parsing-review.ipynb | Evaluating IBM Granite DocLing 258M for document parsing and layout understanding |

📑📊 Tables and Charts Recognition

This section includes notebooks focused on table and chart detection, structure recognition, and extraction from documents. It covers various open-source approaches and benchmarks for understanding table and chart layouts and content.

| Notebook | Description |
| --- | --- |
| unitable-testing-for-table-structure-recognition.ipynb | Testing table detection and structure recognition with UniTable |
| deepdoctection-tables-recognition.ipynb | Evaluating Deepdoctection for table extraction across varied structures |
| gemini-2-5-pro-on-chart-and-table-extraction.ipynb | Chart/table extraction using Gemini 2.5 Pro |
| deplot-plots-to-tables-converter.ipynb | Converting Charts into Tables with DePlot |
| cohere-command-a-vision-charts-understanding.ipynb | Cohere Command A Vision for Charts Understanding |
| cohere-command-a-vision-tables-recognition.ipynb | Cohere Command A Vision for Tables Recognition |
| moondream2-charts-tables-interpretation.ipynb | Moondream2 for Charts and Tables understanding |

📑🔍 Structured Data Extraction

This section covers the structured data extraction phase, detailing methods to extract specific data from documents or images. It includes steps like OCR preprocessing, table extraction, named entity recognition (NER), and conversion to structured formats.

| Notebook | Description |
| --- | --- |
| NuExtract-2-8b-structured-data-extraction | NuExtract-2.0-8B for Structured Data Extraction |
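
As a rough illustration of the last step mentioned above (conversion to a structured format), here is a minimal sketch that turns OCR text into a JSON record. It uses plain regular expressions as a stand-in for a model such as NuExtract and is not taken from the notebook. The sample text, the field names, and the patterns are all hypothetical.

```python
import json
import re

# Hypothetical OCR output; in practice this text would come from an OCR step.
OCR_TEXT = """
INVOICE No: INV-2024-0042
Date: 2024-03-15
Total: 1,250.00 EUR
"""

# Hypothetical field patterns chosen for this toy example only.
FIELD_PATTERNS = {
    "invoice_number": r"INVOICE No:\s*(\S+)",
    "date": r"Date:\s*(\d{4}-\d{2}-\d{2})",
    "total": r"Total:\s*([\d,]+\.\d{2})",
}


def extract_fields(text, patterns=FIELD_PATTERNS):
    """Map each field name to its first regex match in text, or None."""
    record = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        record[name] = match.group(1) if match else None
    return record


# Serialize the extracted record as JSON.
print(json.dumps(extract_fields(OCR_TEXT), indent=2))
```

Fields that are not found are kept in the record with a `None` value, so downstream code sees the same keys for every document.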

📖 Project Goals

  • Benchmark different OCR/document parsing models on real documents.
  • Demonstrate table, chart, and text extraction workflows.
  • Compare open-source and commercial solutions.
  • Provide ready-to-use code snippets for rapid prototyping.

🛠️ Usage

  1. Clone the repository:

    git clone https://github.com/AdemBoukhris457/Documents-Parsing-Lab.git
  2. Install dependencies as needed for each notebook (see the first cells of each .ipynb file for requirements).

  3. Launch Jupyter Notebook or JupyterLab and open any notebook of interest.

  4. Run the cells and adapt the code for your documents.
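
Since each notebook lists its requirements in its first cells, it can help to inspect them before opening Jupyter. The sketch below is an assumption about how one might do that, not part of the repository. It relies only on the fact that an `.ipynb` file is JSON with a `cells` list, and it assumes the install commands contain the text `pip install`. The function names are illustrative.

```python
import json
from pathlib import Path


def first_code_cell(notebook_path):
    """Return the source of the first code cell in a notebook, or "" if none."""
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # The notebook format stores source as a string or a list of lines.
            return source if isinstance(source, str) else "".join(source)
    return ""


def install_lines(notebook_path):
    """Return lines of the first code cell that look like pip install commands."""
    return [
        line.strip()
        for line in first_code_cell(notebook_path).splitlines()
        if "pip install" in line
    ]


def report(root="."):
    """Print the install lines found in every notebook under root."""
    for path in sorted(Path(root).rglob("*.ipynb")):
        print(path, install_lines(path))
```

Calling `report()` from the root of the cloned repository prints one line per notebook. Notebooks that install packages in a later cell will show an empty list, so the notebook itself remains the reference.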


📌 Notes

  • Some notebooks require model weights or API keys; check the comments in each notebook for details.
  • Results, insights, and sample outputs are provided inline.
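
For notebooks that need an API key, one common pattern is to read the key from an environment variable instead of pasting it into a cell. The helper below is a minimal sketch under that assumption. The function name and the variable name in the usage line are hypothetical; each notebook documents the key it actually expects.

```python
import os


def require_api_key(env_var):
    """Return the API key stored in env_var, or raise with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running this notebook."
        )
    return key
```

A notebook cell could then call `api_key = require_api_key("EXAMPLE_OCR_API_KEY")`, where `EXAMPLE_OCR_API_KEY` is a placeholder name, and fail early if the key is missing.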

🔗 Related Resources

📂 You can find more notebooks, experiments, and datasets related to document parsing and OCR on my Kaggle profile: 👉 https://www.kaggle.com/ademboukhris/code

