#data-extraction

Here are 1,200 public repositories matching this topic...

firecrawl
Scrapling

🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

  • Updated Nov 26, 2025
  • Python

Extract Keywords from sentence or Replace keywords in sentences.

  • Updated Apr 13, 2025
  • Python
contextgem

A powerful Model Context Protocol (MCP) server that provides an all-in-one solution for public web access.

  • Updated Nov 24, 2025
  • JavaScript

Converts a PDF file into a text file while keeping the layout of the original PDF. Useful, for instance, for extracting the content of a table in a PDF file. This is a subclass of the PDFTextStripper class (from the Apache PDFBox library).

  • Updated Dec 17, 2023
  • Java

A lightweight, self-hosted headless browser automation platform. Designed as an alternative to Browserless, built for speed, privacy, and scalability.

  • Updated Oct 19, 2025
  • JavaScript

Lightweight library for scraping websites with LLMs

  • Updated Oct 15, 2025
  • Python
vnstock

A beginner-friendly yet powerful Python toolkit for financial analysis and automation — built to make modern investing accessible to everyone

  • Updated Nov 14, 2025
  • Python
hacker-news-digest

🚜 Parse text and tables from PDF files.

  • Updated Nov 21, 2025
  • HTML

Local-first, open-source AI assistant for your data. Unify tasks, notes, docs, photos, and bookmarks. Private, self-hosted, and extensible via APIs.

  • Updated Nov 13, 2025
  • TypeScript

🤖 AI-powered web scraping editor with visual workflow builder. Build, test & deploy web scrapers using natural language. Powered by ScrapeGraphAI & LangGraph.

  • Updated Oct 26, 2025
  • Python

Pure Python, lightweight, Pillow-based solver for Amazon's text captcha.

  • Updated Oct 27, 2025
  • Python

Benchmarking PDF libraries

  • Updated Jul 2, 2025
  • Python
