ScrapeGraphAI/langchain-scrapegraph (Public)

ScrapeGraph client langchain integration

License

MIT license

Folders and files (38 commits)

.github
cookbook
examples
langchain_scrapegraph
tests
.gitignore
.pre-commit-config.yaml
.releaserc.yml
CHANGELOG.md
LICENSE
Makefile
README.md
pyproject.toml

🕷️🦜 langchain-scrapegraph

Supercharge your LangChain agents with AI-powered web scraping capabilities. LangChain-ScrapeGraph provides a seamless integration between LangChain and ScrapeGraph AI, enabling your agents to extract structured data from websites using natural language.

🔗 ScrapeGraph API & SDKs

If you are looking for a quick solution to integrate ScrapeGraph in your system, check out our powerful API here!

We offer SDKs in both Python and Node.js, making it easy to integrate into your projects. Check them out below:

SDK	Language	GitHub Link
Python SDK	Python	scrapegraph-py
Node.js SDK	Node.js	scrapegraph-js

📦 Installation

pip install langchain-scrapegraph

🛠️ Available Tools

📝 MarkdownifyTool

Convert any webpage into clean, formatted markdown.

from langchain_scrapegraph.tools import MarkdownifyTool

tool = MarkdownifyTool()
markdown = tool.invoke({"website_url": "https://example.com"})

print(markdown)

🔍 SmartScraperTool

Extract structured data from any webpage using natural language prompts.

from langchain_scrapegraph.tools import SmartScraperTool

# Initialize the tool (uses SGAI_API_KEY from environment)
tool = SmartScraperTool()

# Extract information using natural language
result = tool.invoke({
    "website_url": "https://www.example.com",
    "user_prompt": "Extract the main heading and first paragraph"
})

print(result)

🌐 SearchScraperTool

Search and extract structured information from the web using natural language prompts.

from langchain_scrapegraph.tools import SearchScraperTool

# Initialize the tool (uses SGAI_API_KEY from environment)
tool = SearchScraperTool()

# Search and extract information using natural language
result = tool.invoke({
    "user_prompt": "What are the key features and pricing of ChatGPT Plus?"
})

print(result)
# {
#     "product": {
#         "name": "ChatGPT Plus",
#         "description": "Premium version of ChatGPT..."
#     },
#     "features": [...],
#     "pricing": {...},
#     "reference_urls": [
#         "https://openai.com/chatgpt",
#         ...
#     ]
# }

🔍 Using Output Schemas with SearchScraperTool

You can define the structure of the output using Pydantic models:

from typing import Any, Dict, List

from pydantic import BaseModel, Field
from langchain_scrapegraph.tools import SearchScraperTool


class ProductInfo(BaseModel):
    name: str = Field(description="Product name")
    features: List[str] = Field(description="List of product features")
    pricing: Dict[str, Any] = Field(description="Pricing information")
    reference_urls: List[str] = Field(description="Source URLs for the information")


# Initialize with schema
tool = SearchScraperTool(llm_output_schema=ProductInfo)

# The output will conform to the ProductInfo schema
result = tool.invoke({
    "user_prompt": "What are the key features and pricing of ChatGPT Plus?"
})

print(result)
# {
#     "name": "ChatGPT Plus",
#     "features": [
#         "GPT-4 access",
#         "Faster response speed",
#         ...
#     ],
#     "pricing": {
#         "amount": 20,
#         "currency": "USD",
#         "period": "monthly"
#     },
#     "reference_urls": [
#         "https://openai.com/chatgpt",
#         ...
#     ]
# }
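Downstream code often wants the extracted fields and the source URLs handled separately. The sketch below assumes the tool returns a plain dict shaped like the sample output above; `split_references` is a hypothetical helper and not part of langchain-scrapegraph.

```python
from typing import Any, Dict, List, Tuple


def split_references(result: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Separate the extracted fields from the `reference_urls` list.

    Hypothetical helper: assumes `result` is a dict like the sample output,
    and tolerates a missing or empty `reference_urls` key.
    """
    # Everything except the source list counts as extracted data.
    data = {key: value for key, value in result.items() if key != "reference_urls"}
    # Keep only string entries so callers can safely treat them as URLs.
    raw_urls = result.get("reference_urls") or []
    urls = [url for url in raw_urls if isinstance(url, str)]
    return data, urls
```

Calling `data, urls = split_references(result)` leaves `data` ready for further processing while `urls` can be logged or shown as citations.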

🌟 Key Features

🐦 LangChain Integration: Seamlessly works with LangChain agents and chains
🔍 AI-Powered Extraction: Use natural language to describe what data to extract
📊 Structured Output: Get clean, structured data ready for your agents
🔄 Flexible Tools: Choose from multiple specialized scraping tools
⚡ Async Support: Built-in support for async operations
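As an illustration of the async support, LangChain tools expose `ainvoke` as the awaitable counterpart of `invoke`, so several pages can be requested concurrently. The sketch below is a minimal example under assumptions: `EchoTool` is a hypothetical stand-in so the code runs offline without an API key, and `convert_many` is an illustrative helper, not part of the library. In real use you would pass a tool such as `MarkdownifyTool()` instead.

```python
import asyncio
from typing import Any, Dict, List


class EchoTool:
    """Hypothetical stand-in for a real tool so the sketch runs offline."""

    async def ainvoke(self, tool_input: Dict[str, Any]) -> str:
        await asyncio.sleep(0)  # yield control, as a network call would
        return f"# Markdown for {tool_input['website_url']}"


async def convert_many(tool: Any, urls: List[str]) -> List[str]:
    """Issue one `ainvoke` call per URL concurrently.

    Results come back in the same order as `urls`.
    """
    calls = [tool.ainvoke({"website_url": url}) for url in urls]
    return await asyncio.gather(*calls)


if __name__ == "__main__":
    pages = asyncio.run(
        convert_many(EchoTool(), ["https://example.com", "https://example.org"])
    )
    print(pages)
```

`asyncio.gather` raises the first exception it encounters by default; pass `return_exceptions=True` if one failing URL should not abort the rest.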

💡 Use Cases

📖 Research Agents: Create agents that gather and analyze web data
📊 Data Collection: Automate structured data extraction from websites
📝 Content Processing: Convert web content into markdown for further processing
🔍 Information Extraction: Extract specific data points using natural language

🤖 Example Agent

from langchain.agents import initialize_agent, AgentType
from langchain_scrapegraph.tools import SmartScraperTool
from langchain_openai import ChatOpenAI

# Initialize tools
tools = [
    SmartScraperTool(),
]

# Create an agent
agent = initialize_agent(
    tools=tools,
    llm=ChatOpenAI(temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

# Use the agent
response = agent.run("""
    Visit example.com, make a summary of the content and extract the main heading and first paragraph
""")

⚙️ Configuration

Set your ScrapeGraph API key in your environment:

export SGAI_API_KEY="your-api-key-here"

Or set it programmatically:

import os

os.environ["SGAI_API_KEY"] = "your-api-key-here"
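Because the tools read `SGAI_API_KEY` from the environment, it can help to fail early with a clear message when the variable is missing. The following is a minimal sketch using only the standard library; `require_api_key` is a hypothetical helper and not part of langchain-scrapegraph.

```python
import os


def require_api_key(var_name: str = "SGAI_API_KEY") -> str:
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper: it only checks that the variable is set and
    non-empty; it does not verify the key against the ScrapeGraph API.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; export it or assign os.environ[{var_name!r}] "
            "before creating any ScrapeGraph tool."
        )
    return key
```

Calling `require_api_key()` once at startup, before any tool is constructed, surfaces a configuration problem immediately rather than at the first scraping call.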

📚 Documentation

💬 Support & Feedback

📧 Email: support@scrapegraphai.com
💻 GitHub Issues: Create an issue
🌟 Feature Requests: Request a feature

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

This project is built on top of:

Made with ❤️ by ScrapeGraph AI

About

ScrapeGraph client langchain integration

scrapegraphai.com

Releases: 11

Latest: v1.4.0 (Jul 15, 2025)
