
BrightDataUnlocker

Bright Data provides a powerful Web Unlocker API that allows you to access websites that might be protected by anti-bot measures, geo-restrictions, or other access limitations, making it particularly useful for AI agents requiring reliable web content extraction.

Overview

Integration details

| Class | Package | Serializable | JS support | Package latest |
| BrightDataUnlocker | langchain-brightdata | | | PyPI - Version |

Tool features

| Native async | Returns artifact | Return data | Pricing |
| | | HTML, Markdown, or screenshot of web pages | Requires Bright Data account |

Setup

The integration lives in the langchain-brightdata package.

pip install langchain-brightdata

You'll need a Bright Data API key to use this tool. You can set it as an environment variable:

import os

os.environ["BRIGHT_DATA_API_KEY"] = "your-api-key"

Or pass it directly when initializing the tool:

from langchain_brightdata import BrightDataUnlocker

unlocker_tool = BrightDataUnlocker(bright_data_api_key="your-api-key")
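
When the key comes from the environment, it can help to fail early with a clear message instead of discovering a missing key on the first request. The following is a minimal sketch using only the standard library; the helper name get_bright_data_api_key is hypothetical and not part of langchain-brightdata.

```python
import os


def get_bright_data_api_key(env_var: str = "BRIGHT_DATA_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of langchain-brightdata.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"{env_var} is not set; export it or pass bright_data_api_key explicitly."
        )
    return api_key
```

The returned value can then be passed as bright_data_api_key when constructing BrightDataUnlocker.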

Instantiation

This section shows how to instantiate the BrightDataUnlocker tool, which uses Bright Data's Web Unlocker service to access websites that may be protected by anti-bot measures, geo-restrictions, or other access limitations.

The tool accepts various parameters during instantiation:

  • bright_data_api_key (required, str): Your Bright Data API key for authentication. It can be omitted from the constructor if the BRIGHT_DATA_API_KEY environment variable is set.
  • format (optional, Literal["raw"]): Format of the response content. Default is "raw".
  • country (optional, str): Two-letter country code for geo-specific access (e.g., "us", "gb", "de", "jp"). Set this when you need to view the website as if accessing from a specific country. Default is None.
  • zone (optional, str): Bright Data zone to use for the request. The "unlocker" zone is optimized for accessing websites that might block regular requests. Default is "unlocker".
  • data_format (optional, Literal["html", "markdown", "screenshot"]): Output format for the retrieved content. Options include:
    • "html" - Returns the standard HTML content (default)
    • "markdown" - Returns content converted to markdown format
    • "screenshot" - Returns a PNG screenshot of the rendered page
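
The parameters listed above can be checked on the client side before a request is sent. The sketch below is one possible way to do that, assuming the allowed values are exactly those in the list; build_unlocker_input is a hypothetical helper, and the URL scheme check is an extra assumption rather than a documented requirement of the tool.

```python
from typing import Optional

# Values taken from the data_format options listed above.
ALLOWED_DATA_FORMATS = ("html", "markdown", "screenshot")


def build_unlocker_input(
    url: str,
    country: Optional[str] = None,
    data_format: str = "html",
    zone: str = "unlocker",
) -> dict:
    """Validate parameters and return a dict that can be passed to tool.invoke().

    Hypothetical helper for illustration; not part of langchain-brightdata.
    """
    # Assumption: only absolute http(s) URLs are meaningful here.
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"url must start with http:// or https://, got {url!r}")
    if data_format not in ALLOWED_DATA_FORMATS:
        raise ValueError(f"data_format must be one of {ALLOWED_DATA_FORMATS}")

    payload = {"url": url, "data_format": data_format, "zone": zone}
    if country is not None:
        # The docs describe country as a two-letter code such as "us" or "gb".
        if len(country) != 2 or not country.isalpha():
            raise ValueError("country must be a two-letter code such as 'us' or 'gb'")
        payload["country"] = country.lower()
    return payload
```

Catching a bad country code or data_format locally avoids spending a request on input that cannot succeed.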

Invocation

Basic Usage

from langchain_brightdata import BrightDataUnlocker

# Initialize the tool
unlocker_tool = BrightDataUnlocker(
    bright_data_api_key="your-api-key"  # Optional if set in environment variables
)

# Access a webpage
result = unlocker_tool.invoke("https://example.com")

print(result)

Advanced Usage with Parameters

from langchain_brightdata import BrightDataUnlocker

unlocker_tool = BrightDataUnlocker(
    bright_data_api_key="your-api-key",
)

# Access a webpage with specific parameters
result = unlocker_tool.invoke(
    {
        "url": "https://example.com/region-restricted-content",
        "country": "gb",  # Access as if from Great Britain
        "data_format": "html",  # Get content as HTML (use "markdown" for markdown output)
        "zone": "unlocker",  # Use the unlocker zone
    }
)

print(result)
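
Because country is set per invocation, the same page can be requested from several regions to compare what each one sees. The following is a rough sketch of that pattern; fetch_per_country is a hypothetical helper that accepts any object with an invoke method, so it works with BrightDataUnlocker but does not depend on it.

```python
from typing import Any, Dict, Iterable


def fetch_per_country(tool: Any, url: str, countries: Iterable[str]) -> Dict[str, Any]:
    """Invoke the tool once per country code and collect results or errors.

    Hypothetical helper for illustration; `tool` is any object exposing
    invoke(dict), such as a BrightDataUnlocker instance.
    """
    results: Dict[str, Any] = {}
    for country in countries:
        try:
            results[country] = tool.invoke({"url": url, "country": country})
        except Exception as exc:
            # Store the error and continue, so one failing region
            # does not discard the results from the others.
            results[country] = exc
    return results
```

A call such as fetch_per_country(unlocker_tool, "https://example.com", ["us", "gb"]) returns a dict keyed by country code, where each value is either the tool's result or the exception raised for that region.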

Customization Options

The BrightDataUnlocker tool accepts several parameters for customization:

| Parameter | Type | Description |
| url | str | The URL to access |
| format | str | Format of the response content (default: "raw") |
| country | str | Two-letter country code for geo-specific access (e.g., "us", "gb") |
| zone | str | Bright Data zone to use (default: "unlocker") |
| data_format | str | Output format: None (HTML), "markdown", or "screenshot" |

Data Format Options

The data_format parameter allows you to specify how the content should be returned:

  • None or "html" (default): Returns the standard HTML content of the page
  • "markdown": Returns the content converted to markdown format, which is useful for feeding directly to LLMs
  • "screenshot": Returns a PNG screenshot of the rendered page, useful for visual analysis
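
Since each data_format implies a different kind of file, a small helper can pick the extension and write mode when results are saved to disk. This is only a sketch: save_result is a hypothetical name, and it assumes the tool's result arrives as either str or bytes, which this page does not specify (in particular for "screenshot").

```python
from pathlib import Path
from typing import Optional, Union

# File extension per data_format value described above.
EXTENSIONS = {"html": ".html", "markdown": ".md", "screenshot": ".png"}


def save_result(
    result: Union[str, bytes],
    data_format: Optional[str],
    stem: str,
    out_dir: str = ".",
) -> Path:
    """Write a result to <out_dir>/<stem><ext>, choosing ext from data_format.

    Hypothetical helper for illustration. Assumption: `result` is str or bytes.
    """
    fmt = data_format or "html"  # None is treated as the HTML default
    if fmt not in EXTENSIONS:
        raise ValueError(f"unknown data_format: {fmt!r}")
    path = Path(out_dir) / f"{stem}{EXTENSIONS[fmt]}"
    if isinstance(result, bytes):
        path.write_bytes(result)
    else:
        path.write_text(result, encoding="utf-8")
    return path
```

If the tool returns a screenshot in some other form (for example, an encoded string), the result would need to be decoded before being passed to a helper like this.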

Use within an agent

from langchain_brightdata import BrightDataUnlocker
from langchain_google_genai import ChatGoogleGenerativeAI
from langgraph.prebuilt import create_react_agent

# Initialize the LLM
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", google_api_key="your-api-key")

# Initialize the tool
bright_data_tool = BrightDataUnlocker(bright_data_api_key="your-api-key")

# Create the agent
agent = create_react_agent(llm, [bright_data_tool])

# Input URLs or prompt
user_input = "Get the content from https://example.com/region-restricted-page - access it from GB"

# Stream the agent's output step by step
for step in agent.stream(
    {"messages": user_input},
    stream_mode="values",
):
    step["messages"][-1].pretty_print()
