#data-extraction

Here are 1,525 public repositories matching this topic...

firecrawl

Scrapling

🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

  • Updated Dec 17, 2025
  • Python

Extract Keywords from sentence or Replace keywords in sentences.

  • Updated Apr 13, 2025
  • Python

A powerful Model Context Protocol (MCP) server that provides an all-in-one solution for public web access.

  • Updated Dec 16, 2025
  • JavaScript
contextgem

Converts a pdf file into a text file while keeping the layout of the original pdf. Useful to extract the content from a table in a pdf file for instance. This is a subclass of PDFTextStripper class (from the Apache PDFBox library).

  • Updated Dec 17, 2023
  • Java

A lightweight, self-hosted headless browser automation platform. Designed as an alternative to Browserless, built for speed, privacy, and scalability.

  • Updated Oct 19, 2025
  • JavaScript

Lightweight library for scraping web-sites with LLMs

  • Updated Dec 17, 2025
  • Python
vnstock

A beginner-friendly yet powerful Python toolkit for financial analysis and automation — built to make modern investing accessible to everyone

  • Updated Dec 8, 2025
  • Python
hacker-news-digest

Local-first, open-source AI assistant for your data. Unify tasks, notes, docs, photos, and bookmarks. Private, self-hosted, and extensible via APIs.

  • Updated Dec 4, 2025
  • TypeScript

🚜 Parse text and tables from PDF files.

  • Updated Nov 21, 2025
  • HTML

🤖 AI-powered web scraping editor with visual workflow builder. Build, test & deploy web scrapers using natural language. Powered by ScrapeGraphAI & LangGraph.

  • Updated Oct 26, 2025
  • Python

Pure Python, lightweight, Pillow-based solver for Amazon's text captcha.

  • Updated Dec 8, 2025
  • Python

Benchmarking PDF libraries

  • Updated Jul 2, 2025
  • Python
