layout-analysis
Here are 57 public repositories matching this topic...
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
- Updated Dec 16, 2025 - Python
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
- Updated Dec 17, 2025 - Python
A Unified Toolkit for Deep Learning Based Document Image Analysis
- Updated Aug 15, 2024 - Python
An Open-Source Python3 tool with SMALL models for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported.
- Updated Jul 25, 2025 - Jupyter Notebook
Read and extract text and other content from PDFs in C# (port of PDFBox)
- Updated Dec 7, 2025 - C#
YomiToku is a Python package that provides an AI-powered document image analysis engine designed specifically for the Japanese language.
- Updated Dec 17, 2025 - Python
OCR engine for all the languages
- Updated Dec 13, 2025 - Python
Document Layout Analysis resources repos for development with PdfPig.
- Updated Oct 1, 2023 - C#
A toolbox of ocr models and algorithms based on MindSpore
- Updated Jul 24, 2025 - Python
Analysis of Chinese and English layouts
- Updated Aug 6, 2025 - Python
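📝 Extracts content from document images and outputs it one-to-one to Word or Txt for further use or processing. Planned future support: PDF/image input with output in JSON, Txt, Word, and Markdown formats.
- Updated Nov 1, 2024 - Python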
YOLO models trained on DocLayNet - power your Document Intelligence with Layout Analysis
- Updated Aug 3, 2025 - Python
Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.
- Updated Oct 18, 2025 - Jupyter Notebook
An official implementation of the paper "Paragraph2Graph: A Language-independent GNN-based framework for layout analysis"
- Updated Oct 14, 2023 - Python
A Unified Toolkit for Deep Learning-Based Table Extraction
- Updated Nov 21, 2024 - Python
Trained Detectron2 object detection models for document layout analysis based on PubLayNet dataset
- Updated Apr 16, 2023 - Python
[ICDAR 2023] SelfDocSeg: A self-supervised vision-based approach towards Document Segmentation (Oral)
- Updated Oct 6, 2023 - Python
A keyboard layout that provides an elegant and balanced typing experience by its use of a thumb-alpha, emphasis on middle fingers, deprioritisation of pinkies, and arcane keys.
- Updated Nov 7, 2025
A Large Dataset of Historical Japanese Documents with Complex Layouts
- Updated Jul 22, 2022 - Jupyter Notebook