ScrapeGraph AI
ScrapeGraph AI is a service that provides AI-powered web scraping capabilities. It offers tools for extracting structured data, converting webpages to markdown, and processing local HTML content using natural language prompts.
Installation and Setup
Install the required package:
pip install langchain-scrapegraph
Set up your API key:
export SGAI_API_KEY="your-scrapegraph-api-key"
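Before creating any tool, it can help to confirm that the key is actually visible to the Python process. The following is a minimal sketch: the helper name `require_sgai_api_key` is hypothetical and not part of the package; only the `SGAI_API_KEY` variable name comes from the setup step above.

```python
import os


def require_sgai_api_key() -> str:
    """Return the ScrapeGraph AI key from the environment, or raise if it is unset."""
    # Hypothetical helper for illustration; not provided by langchain-scrapegraph.
    key = os.environ.get("SGAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SGAI_API_KEY is not set; export it in your shell before running."
        )
    return key
```

Calling `require_sgai_api_key()` at the start of a script fails early with a readable message instead of surfacing an authentication error on the first tool call.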
Tools
See a usage example.
There are four tools available:
from langchain_scrapegraph.tools import (
    SmartScraperTool,   # Extract structured data from websites
    SmartCrawlerTool,   # Extract data from multiple pages with crawling
    MarkdownifyTool,    # Convert webpages to markdown
    GetCreditsTool,     # Check remaining API credits
)
Each tool serves a specific purpose:
SmartScraperTool: Extract structured data from websites given a URL, a prompt, and an optional output schema.
SmartCrawlerTool: Extract data from multiple pages with advanced crawling options such as depth control, page limits, and domain restrictions.
MarkdownifyTool: Convert any webpage to clean markdown format.
GetCreditsTool: Check your remaining ScrapeGraph AI credits.
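As a rough illustration of how three of these tools might be called, the sketch below uses the standard LangChain `invoke` method. The input keys `website_url` and `user_prompt`, the empty input for `GetCreditsTool`, and the helper names are assumptions made for this example; check each tool's argument schema before relying on them. The tools are assumed to read `SGAI_API_KEY` from the environment.

```python
from typing import Any, Dict


def smartscraper_input(url: str, prompt: str) -> Dict[str, str]:
    # Assumed input keys for SmartScraperTool; verify against the tool's args schema.
    return {"website_url": url, "user_prompt": prompt}


def markdownify_input(url: str) -> Dict[str, str]:
    # Assumed input key for MarkdownifyTool.
    return {"website_url": url}


def run_examples(url: str, prompt: str) -> Dict[str, Any]:
    # Imported here so the helpers above can be used without the package installed.
    from langchain_scrapegraph.tools import (
        GetCreditsTool,
        MarkdownifyTool,
        SmartScraperTool,
    )

    # Each call below sends a live request to the ScrapeGraph AI service.
    return {
        "structured": SmartScraperTool().invoke(smartscraper_input(url, prompt)),
        "markdown": MarkdownifyTool().invoke(markdownify_input(url)),
        "credits": GetCreditsTool().invoke({}),
    }
```

Calling `run_examples("https://example.com", "Extract the page title")` requires a valid API key and network access. `SmartCrawlerTool` is left out because its crawling options are not spelled out here.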