AdemBoukhris457/Documents-Parsing-Lab
A curated collection of Jupyter notebooks for experimenting with state-of-the-art OCR, document parsing, table extraction, and chart understanding techniques. This repository enables easy benchmarking and practical usage of the latest open-source and cloud-based solutions for document image processing.
This section provides a quick start guide for Doctra, a tool for structured document parsing that does not rely on Vision Language Models (VLMs). Learn how to parse PDF documents, extract structured content (text, tables, charts, figures), and generate multiple output formats.
| Notebook | Description |
|---|---|
| 01_doctra_quick_start.ipynb | Quick start guide for Doctra structured document parsing |
| Notebook | Description |
|---|---|
| bytedance-dolphin-image-parsing.ipynb | Document page parsing with Dolphin by ByteDance |
| Llama-3.1-Nemotron-Nano-VL-8B-V1_parsing_documents.ipynb | Testing the performance of document parsing with Llama-3.1-Nemotron-Nano-VL-8B-V1 |
| docling-documents-parsing-and-tables-extraction.ipynb | Parsing and table extraction with Docling |
| typhoon-ocr-7b-docs-pages-parser.ipynb | Evaluating Typhoon_ocr_7b document parsing capabilities across various use cases |
| florence-2-large-ocr-documents-pages.ipynb | OCR of document pages using Florence 2 Large |
| florence-2-large-ocr-images-real-life-scenarios.ipynb | Real-life scenario OCR with Florence 2 Large |
| got-ocr2-0-docs-parsing.ipynb | Document page parsing with GOT-OCR2.0 and Gemini 2.5 Flash |
| marker-docs-parsing.ipynb | Marker-based document parsing experiments |
| mistralocr-docs-parsing.ipynb | Document parsing using MistralOCR |
| monkeyocr-docs-pages-parsing.ipynb | Document parsing with MonkeyOCR |
| nanonets-OCR-s_docs_parsing.ipynb | Advanced document parsing using Nanonets-OCR-s |
| ollama-llama3-2-vision-usage.ipynb | Using Llama3-2 Vision for document parsing |
| paddleocr-3-0-docs-parsing.ipynb | Parsing with PaddleOCR 3.0 PP-StructureV3 |
| pix2text-docs-pages-parsing.ipynb | Document parsing using Pix2Text |
| smoldocling-documents-understanding.ipynb | Document understanding with SmolDocling |
| zerox-pdf-parsing.ipynb | PDF parsing experiments with Zerox |
| qwen2-vl-2b-docs-parsing.ipynb | Document page parsing with Qwen2-VL-2B |
| OCRFlux_3B_Docs_Parsing.ipynb | Document parsing with OCRFlux-3B on Lightning AI |
| granite-docling-258m-document-parsing-review.ipynb | Evaluating IBM Granite DocLing 258M for document parsing and layout understanding |
This section includes notebooks focused on table and chart detection, structure recognition, and extraction from documents. It covers various open-source approaches and benchmarks for understanding table and chart layouts and content.
| Notebook | Description |
|---|---|
| unitable-testing-for-table-structure-recognition.ipynb | Testing table detection and structure recognition with UniTable |
| deepdoctection-tables-recognition.ipynb | Evaluating Deepdoctection for table extraction across varied structures |
| gemini-2-5-pro-on-chart-and-table-extraction.ipynb | Chart/table extraction using Gemini 2.5 Pro |
| deplot-plots-to-tables-converter.ipynb | Converting Charts into Tables with DePlot |
| cohere-command-a-vision-charts-understanding.ipynb | Cohere Command A Vision for Charts Understanding |
| cohere-command-a-vision-tables-recognition.ipynb | Cohere Command A Vision for Tables Recognition |
| moondream2-charts-tables-interpretation.ipynb | Moondream2 for Charts and Tables understanding |
This section covers the structured data extraction phase, detailing methods to extract specific data from documents or images. It includes steps like OCR preprocessing, table extraction, named entity recognition (NER), and conversion to structured formats.
| Notebook | Description |
|---|---|
| NuExtract-2-8b-structured-data-extraction | NuExtract-2.0-8B for Structured Data Extraction |
- Benchmark different OCR/document parsing models on real documents.
- Demonstrate table, chart, and text extraction workflows.
- Compare open-source and commercial solutions.
- Provide ready-to-use code snippets for rapid prototyping.
Clone the repository:
git clone https://github.com/AdemBoukhris457/Docs_Parsing_Techniques.git
Install dependencies as needed for each notebook (see the first cells of each .ipynb for requirements).
Launch Jupyter Notebook or JupyterLab and open any notebook of interest.
Run the cells and adapt the code for your documents.
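Since each notebook declares its own requirements in its first cells, it can help to list them before opening anything. The following is a minimal sketch, not part of the repository: it assumes the notebooks use the standard nbformat JSON layout and that dependencies are installed with `pip install` lines in the first code cell. The helper names `first_code_cell` and `install_lines` are hypothetical.

```python
import json
from pathlib import Path


def first_code_cell(notebook_path):
    """Return the source of the first code cell in a notebook, or "" if none."""
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            # nbformat stores the source either as one string or as a list of lines.
            return source if isinstance(source, str) else "".join(source)
    return ""


def install_lines(source):
    """Keep only the lines that look like pip install commands (an assumption)."""
    return [line.strip() for line in source.splitlines() if "pip install" in line]


if __name__ == "__main__":
    # Assumes the script is run from the root of the cloned repository.
    for path in sorted(Path(".").rglob("*.ipynb")):
        for line in install_lines(first_code_cell(path)):
            print(f"{path.name}: {line}")
```

Notebooks that install dependencies in a later cell, or by another mechanism, will not show up in this listing, so the notebook itself remains the reference.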
- Some notebooks require model weights or API keys; check the comments in each notebook for details.
- Results, insights, and sample outputs are provided inline.
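For notebooks that need an API key, one common approach is to read the key from an environment variable rather than pasting it into a cell. This is a small sketch under that assumption; the function `get_api_key` and the variable name `EXAMPLE_API_KEY` are hypothetical, and each notebook's comments state what it actually expects.

```python
import os


def get_api_key(env_var):
    """Read an API key from the environment, failing with a clear message."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running this notebook."
        )
    return key


# Hypothetical variable name; use whatever the notebook's comments ask for.
# api_key = get_api_key("EXAMPLE_API_KEY")
```

Keeping the key out of the notebook avoids committing it by accident when sharing or pushing results.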
📂 You can find more notebooks, experiments, and datasets related to document parsing and OCR on my Kaggle profile: 👉 https://www.kaggle.com/ademboukhris/code