Extract tables from PDF files (port of tabula-java)

BobLd/tabula-sharp

tabula-sharp is a library for extracting tables from PDF files. It is a port of tabula-java.

Windows, Linux, Mac OS

  • Supports netstandard2.0, net462, net471, net6.0, net8.0
  • No java bindings

NuGet packages are available on the releases page and on www.nuget.org.

Differences with tabula-java

  • Uses PdfPig, and not PdfBox.
  • Coordinate system starts from the bottom left point (going up) of the page, and not from the top left point (going down).
  • The NurminenDetectionAlgorithm is replaced by SimpleNurminenDetectionAlgorithm, because the original algorithm requires an image management library.
  • Table results might differ because of the way PdfPig builds the bounding box of each Letter.
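The coordinate difference matters when reusing table areas defined for tabula-java. The sketch below shows one way to convert a top-left-origin area (top, left, width, height) into bottom-left-origin bounds. `CoordinateSketch` and `ToBottomLeftOrigin` are hypothetical names for illustration and are not part of tabula-sharp or PdfPig; the page height of 792 points assumes a US Letter page.

```csharp
using System;

public static class CoordinateSketch
{
    // Hypothetical helper (not part of tabula-sharp or PdfPig).
    // Input: an area measured from the top left corner of the page, going down.
    // Output: the same area measured from the bottom left corner, going up.
    public static (double Left, double Bottom, double Right, double Top) ToBottomLeftOrigin(
        double top, double left, double width, double height, double pageHeight)
    {
        // The x axis is unchanged; only the y axis is flipped around the page height.
        double newTop = pageHeight - top;
        double newBottom = pageHeight - (top + height);
        return (left, newBottom, left + width, newTop);
    }

    public static void Main()
    {
        // Assumed page height: 792 points (US Letter).
        var area = ToBottomLeftOrigin(top: 100, left: 50, width: 200, height: 300, pageHeight: 792);
        Console.WriteLine($"left={area.Left}, bottom={area.Bottom}, right={area.Right}, top={area.Top}");
        // prints "left=50, bottom=392, right=250, top=692"
    }
}
```

In practice the page height would be read from the PDF page itself rather than hard-coded.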

Usage

Stream mode - BasicExtractionAlgorithm

    using (PdfDocument document = PdfDocument.Open("doc.pdf", new ParsingOptions() { ClipPaths = true }))
    {
        PageArea page = ObjectExtractor.Extract(document, 1);

        // detect candidate table zones
        SimpleNurminenDetectionAlgorithm detector = new SimpleNurminenDetectionAlgorithm();
        var regions = detector.Detect(page);

        IExtractionAlgorithm ea = new BasicExtractionAlgorithm();
        IReadOnlyList<Table> tables = ea.Extract(page.GetArea(regions[0].BoundingBox)); // take first candidate area
        var table = tables[0];
        var rows = table.Rows;
    }

Lattice mode - SpreadsheetExtractionAlgorithm

    using (PdfDocument document = PdfDocument.Open("doc.pdf", new ParsingOptions() { ClipPaths = true }))
    {
        PageArea page = ObjectExtractor.Extract(document, 1);

        IExtractionAlgorithm ea = new SpreadsheetExtractionAlgorithm();
        IReadOnlyList<Table> tables = ea.Extract(page);
        var table = tables[0];
        var rows = table.Rows;
    }

Results

Stream mode - BasicExtractionAlgorithm

[Image: example]

Lattice mode - SpreadsheetExtractionAlgorithm

[Image: example]

