Add stagehand RPC interface based on Stagehand-API (for client SDK canonicalization) #1301


Draft: pirate wants to merge 5 commits into main from stagehand-p2p


pirate (Member) commented Nov 22, 2025 (edited)

```python
"""Stagehand Python SDK

A lightweight Python client for the Stagehand browser automation framework.
Connects to a remote Stagehand server (Node.js) and executes browser automation tasks.

Dependencies:
    pip install httpx

Usage:
    from stagehand import Stagehand

    async def main():
        stagehand = Stagehand(server_url="http://localhost:3000")
        await stagehand.init()
        await stagehand.goto("https://example.com")
        result = await stagehand.act("click the login button")
        data = await stagehand.extract("extract the page title")
        await stagehand.close()
"""

import json
from typing import Any, Dict, List, Optional, Union

import httpx


class StagehandError(Exception):
    """Base exception for Stagehand errors"""
    pass


class StagehandAPIError(StagehandError):
    """API-level errors from the Stagehand server"""
    pass


class StagehandConnectionError(StagehandError):
    """Connection errors when communicating with the server"""
    pass


class Action:
    """Represents a browser action returned by observe()"""

    def __init__(self, data: Dict[str, Any]):
        self.selector = data.get("selector")
        self.description = data.get("description")
        self.backend_node_id = data.get("backendNodeId")
        self.method = data.get("method")
        self.arguments = data.get("arguments", [])
        self._raw = data

    def __repr__(self):
        return f"Action(selector={self.selector!r}, description={self.description!r})"

    def to_dict(self) -> Dict[str, Any]:
        """Convert back to dict for sending to API"""
        return self._raw


class ActResult:
    """Result from act() method"""

    def __init__(self, data: Dict[str, Any]):
        self.success = data.get("success", False)
        self.message = data.get("message", "")
        self.actions = [Action(a) for a in data.get("actions", [])]
        self._raw = data

    def __repr__(self):
        return f"ActResult(success={self.success}, message={self.message!r})"


class Stagehand:
    """
    Main Stagehand client for browser automation.

    Connects to a remote Stagehand server and provides methods for browser automation:
    - act: Execute actions on the page
    - extract: Extract data from the page
    - observe: Observe possible actions on the page
    - goto: Navigate to a URL
    """

    def __init__(
        self,
        server_url: str = "http://localhost:3000",
        verbose: int = 0,
        timeout: float = 120.0,
    ):
        """
        Initialize the Stagehand client.

        Args:
            server_url: URL of the Stagehand server (default: http://localhost:3000)
            verbose: Verbosity level 0-2 (default: 0)
            timeout: Request timeout in seconds (default: 120)
        """
        self.server_url = server_url.rstrip("/")
        self.verbose = verbose
        self.timeout = timeout
        self.session_id: Optional[str] = None
        self._client = httpx.AsyncClient(timeout=timeout)

    async def init(self, **options) -> None:
        """
        Initialize a browser session on the remote server.

        Args:
            **options: Additional options to pass to the server (e.g., model, verbose, etc.)
        """
        if self.session_id:
            raise StagehandError("Already initialized. Call close() first.")

        try:
            response = await self._client.post(
                f"{self.server_url}/v1/sessions/start",
                json=options or {},
            )
            response.raise_for_status()
            data = response.json()

            self.session_id = data.get("sessionId")
            if not self.session_id:
                raise StagehandAPIError("Server did not return a sessionId")

            if self.verbose > 0:
                print(f"✓ Initialized session: {self.session_id}")

        except httpx.HTTPError as e:
            raise StagehandConnectionError(f"Failed to connect to server: {e}") from e

    async def goto(
        self,
        url: str,
        options: Optional[Dict[str, Any]] = None,
        frame_id: Optional[str] = None,
    ) -> Any:
        """
        Navigate to a URL.

        Args:
            url: The URL to navigate to
            options: Navigation options (waitUntil, timeout, etc.)
            frame_id: Optional frame ID to navigate

        Returns:
            Navigation response
        """
        return await self._execute(
            method="navigate",
            args={
                "url": url,
                "options": options,
                "frameId": frame_id,
            }
        )

    async def act(
        self,
        instruction: Union[str, Action],
        options: Optional[Dict[str, Any]] = None,
        frame_id: Optional[str] = None,
    ) -> ActResult:
        """
        Execute an action on the page.

        Args:
            instruction: Natural language instruction or Action object
            options: Additional options (model, variables, timeout, etc.)
            frame_id: Optional frame ID to act on

        Returns:
            ActResult with success status and executed actions
        """
        input_data = instruction.to_dict() if isinstance(instruction, Action) else instruction

        result = await self._execute(
            method="act",
            args={
                "input": input_data,
                "options": options,
                "frameId": frame_id,
            }
        )

        return ActResult(result)

    async def extract(
        self,
        instruction: Optional[str] = None,
        schema: Optional[Dict[str, Any]] = None,
        options: Optional[Dict[str, Any]] = None,
        frame_id: Optional[str] = None,
    ) -> Any:
        """
        Extract data from the page.

        Args:
            instruction: Natural language instruction for what to extract
            schema: JSON schema defining the expected output structure
            options: Additional options (model, selector, timeout, etc.)
            frame_id: Optional frame ID to extract from

        Returns:
            Extracted data matching the schema (if provided) or default extraction
        """
        return await self._execute(
            method="extract",
            args={
                "instruction": instruction,
                "schema": schema,
                "options": options,
                "frameId": frame_id,
            }
        )

    async def observe(
        self,
        instruction: Optional[str] = None,
        options: Optional[Dict[str, Any]] = None,
        frame_id: Optional[str] = None,
    ) -> List[Action]:
        """
        Observe possible actions on the page.

        Args:
            instruction: Natural language instruction for what to observe
            options: Additional options (model, selector, timeout, etc.)
            frame_id: Optional frame ID to observe

        Returns:
            List of Action objects representing possible actions
        """
        result = await self._execute(
            method="observe",
            args={
                "instruction": instruction,
                "options": options,
                "frameId": frame_id,
            }
        )

        return [Action(action) for action in result]

    async def agent_execute(
        self,
        instruction: str,
        agent_config: Optional[Dict[str, Any]] = None,
        execute_options: Optional[Dict[str, Any]] = None,
        frame_id: Optional[str] = None,
    ) -> Dict[str, Any]:
        """
        Execute an agent task.

        Args:
            instruction: The task instruction for the agent
            agent_config: Agent configuration (model, systemPrompt, etc.)
            execute_options: Execution options (maxSteps, highlightCursor, etc.)
            frame_id: Optional frame ID to execute in

        Returns:
            Agent execution result
        """
        config = agent_config or {}
        # Copy so the caller's execute_options dict is not mutated
        exec_opts = dict(execute_options or {})
        exec_opts["instruction"] = instruction

        return await self._execute(
            method="agentExecute",
            args={
                "agentConfig": config,
                "executeOptions": exec_opts,
                "frameId": frame_id,
            }
        )

    async def close(self) -> None:
        """Close the session and cleanup resources."""
        if self.session_id:
            try:
                await self._client.post(
                    f"{self.server_url}/v1/sessions/{self.session_id}/end"
                )
                if self.verbose > 0:
                    print(f"✓ Closed session: {self.session_id}")
            except Exception as e:
                if self.verbose > 0:
                    print(f"Warning: Failed to close session: {e}")
            finally:
                self.session_id = None

        await self._client.aclose()

    async def _execute(self, method: str, args: Dict[str, Any]) -> Any:
        """
        Execute a method on the remote server using SSE streaming.

        Args:
            method: The method name (act, extract, observe, navigate, agentExecute)
            args: Arguments to pass to the method

        Returns:
            The result from the server
        """
        if not self.session_id:
            raise StagehandError("Not initialized. Call init() first.")

        url = f"{self.server_url}/v1/sessions/{self.session_id}/{method}"

        try:
            async with self._client.stream(
                "POST",
                url,
                json=args,
                headers={"Accept": "text/event-stream"},
            ) as response:
                if response.is_error:
                    # Read the error body while the stream is still open;
                    # it cannot be read once the context manager has closed it
                    await response.aread()
                response.raise_for_status()

                result = None

                async for line in response.aiter_lines():
                    if not line.strip() or not line.startswith("data: "):
                        continue

                    # Parse SSE data
                    data_str = line[6:]  # Remove "data: " prefix
                    try:
                        event = json.loads(data_str)
                    except json.JSONDecodeError:
                        continue

                    event_type = event.get("type")
                    event_data = event.get("data", {})

                    if event_type == "log":
                        # Handle log events
                        if self.verbose > 0:
                            category = event_data.get("category", "")
                            message = event_data.get("message", "")
                            level = event_data.get("level", 0)
                            if level <= self.verbose:
                                print(f"[{category}] {message}")

                    elif event_type == "system":
                        # System events contain the result
                        if "result" in event_data:
                            result = event_data["result"]
                        elif "error" in event_data:
                            raise StagehandAPIError(event_data["error"])

                if result is None:
                    raise StagehandAPIError("No result received from server")

                return result

        except httpx.HTTPStatusError as e:
            raise StagehandAPIError(
                f"HTTP {e.response.status_code}: {e.response.text}"
            ) from e
        except httpx.HTTPError as e:
            raise StagehandConnectionError(f"Connection error: {e}") from e

    async def __aenter__(self):
        """Context manager entry"""
        await self.init()
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        """Context manager exit"""
        await self.close()


# Example usage
if __name__ == "__main__":
    import asyncio

    async def example():
        # Create and initialize Stagehand client
        stagehand = Stagehand(
            server_url="http://localhost:3000",
            verbose=1,
        )

        try:
            await stagehand.init()

            # Navigate to a page
            print("\n=== Navigating to example.com ===")
            await stagehand.goto("https://example.com")

            # Extract data
            print("\n=== Extracting page title ===")
            data = await stagehand.extract("extract the page title")
            print(f"Extracted: {data}")

            # Observe actions
            print("\n=== Observing actions ===")
            actions = await stagehand.observe("find all links on the page")
            print(f"Found {len(actions)} actions")
            if actions:
                print(f"First action: {actions[0]}")

            # Execute an action
            print("\n=== Executing action ===")
            result = await stagehand.act("scroll to the bottom")
            print(f"Result: {result}")

        finally:
            await stagehand.close()

    # Alternative: using context manager
    async def example_with_context_manager():
        async with Stagehand(server_url="http://localhost:3000") as stagehand:
            await stagehand.goto("https://example.com")
            data = await stagehand.extract("extract the page title")
            print(data)

    # Run the example
    asyncio.run(example())
```
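One detail worth noting about the client above: `goto`, `act`, `extract`, and `observe` always include `options` and `frameId` in the request body, so unset values are serialized as JSON `null`. Whether the server accepts `null` for those fields is not stated in this PR. A minimal sketch, assuming it might not, of dropping unset keys before sending; `compact_args` is a hypothetical helper and not part of the PR:

```python
from typing import Any, Dict


def compact_args(args: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of args without the keys whose value is None.

    Hypothetical helper: falsy values such as 0, False, or {} are kept,
    since they may be meaningful to the server.
    """
    return {key: value for key, value in args.items() if value is not None}


# Example: the body goto() would send for a plain navigation
payload = compact_args(
    {"url": "https://example.com", "options": None, "frameId": None}
)
print(payload)  # {'url': 'https://example.com'}
```

If adopted, `_execute` could apply it once to `args`, which keeps the individual methods unchanged.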

After this ships we can also remove the duplicated fastify server in core/stagehand-api and use event hooks to do the cloud-only stuff. See my message in Slack: https://browserbase.slack.com/archives/C08EZ5W9TB9/p1763775335075319

OpenAPI Spec:

```yaml
openapi: 3.0.3
info:
  title: Stagehand P2P Server API
  description: |
    HTTP API for remote Stagehand browser automation. This API allows clients to
    connect to a Stagehand server and execute browser automation tasks remotely.

    All endpoints except /sessions/start require an active session ID.
    Responses are streamed using Server-Sent Events (SSE) when the
    `x-stream-response: true` header is provided.
  version: 3.0.0
  contact:
    name: Browserbase
    url: https://browserbase.com

servers:
  - url: http://localhost:3000/v1
    description: Local P2P server
  - url: https://api.stagehand.browserbase.com/v1
    description: Cloud API (for reference)

paths:
  /sessions/start:
    post:
      summary: Create a new browser session
      description: |
        Initializes a new Stagehand session with a browser instance.
        Returns a session ID that must be used for all subsequent requests.
      operationId: createSession
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/SessionConfig'
            examples:
              local:
                summary: Local browser session
                value:
                  env: LOCAL
                  verbose: 1
              browserbase:
                summary: Browserbase session
                value:
                  env: BROWSERBASE
                  apiKey: bb_api_key_123
                  projectId: proj_123
                  verbose: 1
      responses:
        '200':
          description: Session created successfully
          content:
            application/json:
              schema:
                type: object
                required:
                  - sessionId
                  - available
                properties:
                  sessionId:
                    type: string
                    format: uuid
                    description: Unique identifier for the session
                  available:
                    type: boolean
                    description: Whether the session is ready to use
        '500':
          $ref: '#/components/responses/InternalError'

  /sessions/{sessionId}/act:
    post:
      summary: Execute an action on the page
      description: |
        Performs a browser action based on natural language instruction or
        a specific action object returned by observe().
      operationId: act
      parameters:
        - $ref: '#/components/parameters/SessionId'
        - $ref: '#/components/parameters/StreamResponse'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ActRequest'
            examples:
              stringInstruction:
                summary: Natural language instruction
                value:
                  input: "click the sign in button"
              actionObject:
                summary: Execute observed action
                value:
                  input:
                    selector: "#login-btn"
                    description: "Sign in button"
                    method: "click"
                    arguments: []
      responses:
        '200':
          description: Action executed successfully
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/SSEResponse'
            application/json:
              schema:
                $ref: '#/components/schemas/ActResult'
        '400':
          $ref: '#/components/responses/BadRequest'
        '404':
          $ref: '#/components/responses/SessionNotFound'
        '500':
          $ref: '#/components/responses/InternalError'

  /sessions/{sessionId}/extract:
    post:
      summary: Extract structured data from the page
      description: |
        Extracts data from the current page using natural language instructions
        and optional JSON schema for structured output.
      operationId: extract
      parameters:
        - $ref: '#/components/parameters/SessionId'
        - $ref: '#/components/parameters/StreamResponse'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ExtractRequest'
            examples:
              simple:
                summary: Simple extraction
                value:
                  instruction: "extract the page title"
              withSchema:
                summary: Structured extraction
                value:
                  instruction: "extract all product listings"
                  schema:
                    type: object
                    properties:
                      products:
                        type: array
                        items:
                          type: object
                          properties:
                            name:
                              type: string
                            price:
                              type: string
      responses:
        '200':
          description: Data extracted successfully
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/SSEResponse'
            application/json:
              schema:
                $ref: '#/components/schemas/ExtractResult'
        '400':
          $ref: '#/components/responses/BadRequest'
        '404':
          $ref: '#/components/responses/SessionNotFound'
        '500':
          $ref: '#/components/responses/InternalError'

  /sessions/{sessionId}/observe:
    post:
      summary: Observe possible actions on the page
      description: |
        Returns a list of candidate actions that can be performed on the page,
        optionally filtered by natural language instruction.
      operationId: observe
      parameters:
        - $ref: '#/components/parameters/SessionId'
        - $ref: '#/components/parameters/StreamResponse'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ObserveRequest'
            examples:
              allActions:
                summary: Observe all actions
                value: {}
              filtered:
                summary: Observe specific actions
                value:
                  instruction: "find all buttons"
      responses:
        '200':
          description: Actions observed successfully
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/SSEResponse'
            application/json:
              schema:
                $ref: '#/components/schemas/ObserveResult'
        '400':
          $ref: '#/components/responses/BadRequest'
        '404':
          $ref: '#/components/responses/SessionNotFound'
        '500':
          $ref: '#/components/responses/InternalError'

  /sessions/{sessionId}/agentExecute:
    post:
      summary: Execute a multi-step agent task
      description: |
        Runs an autonomous agent that can perform multiple actions to
        complete a complex task.
      operationId: agentExecute
      parameters:
        - $ref: '#/components/parameters/SessionId'
        - $ref: '#/components/parameters/StreamResponse'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/AgentExecuteRequest'
            examples:
              basic:
                summary: Basic agent task
                value:
                  agentConfig:
                    model: "openai/gpt-4o"
                  executeOptions:
                    instruction: "Find and click the first product"
                    maxSteps: 10
      responses:
        '200':
          description: Agent task completed
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/SSEResponse'
            application/json:
              schema:
                $ref: '#/components/schemas/AgentResult'
        '400':
          $ref: '#/components/responses/BadRequest'
        '404':
          $ref: '#/components/responses/SessionNotFound'
        '500':
          $ref: '#/components/responses/InternalError'

  /sessions/{sessionId}/navigate:
    post:
      summary: Navigate to a URL
      description: |
        Navigates the browser to the specified URL and waits for page load.
      operationId: navigate
      parameters:
        - $ref: '#/components/parameters/SessionId'
        - $ref: '#/components/parameters/StreamResponse'
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/NavigateRequest'
            examples:
              simple:
                summary: Simple navigation
                value:
                  url: "https://example.com"
              withOptions:
                summary: Navigation with options
                value:
                  url: "https://example.com"
                  options:
                    waitUntil: "networkidle"
      responses:
        '200':
          description: Navigation completed
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/SSEResponse'
            application/json:
              schema:
                $ref: '#/components/schemas/NavigateResult'
        '400':
          $ref: '#/components/responses/BadRequest'
        '404':
          $ref: '#/components/responses/SessionNotFound'
        '500':
          $ref: '#/components/responses/InternalError'

  /sessions/{sessionId}/end:
    post:
      summary: End the session and cleanup resources
      description: |
        Closes the browser and cleans up all resources associated with the session.
      operationId: endSession
      parameters:
        - $ref: '#/components/parameters/SessionId'
      responses:
        '200':
          description: Session ended successfully
          content:
            application/json:
              schema:
                type: object
                properties:
                  success:
                    type: boolean
        '500':
          $ref: '#/components/responses/InternalError'

components:
  parameters:
    SessionId:
      name: sessionId
      in: path
      required: true
      description: The session ID returned by /sessions/start
      schema:
        type: string
        format: uuid
    StreamResponse:
      name: x-stream-response
      in: header
      description: Enable Server-Sent Events streaming for real-time logs
      schema:
        type: string
        enum: ["true", "false"]
        default: "true"

  schemas:
    SessionConfig:
      type: object
      required:
        - env
      properties:
        env:
          type: string
          enum: [LOCAL, BROWSERBASE]
          description: Environment to run the browser in
        verbose:
          type: integer
          minimum: 0
          maximum: 2
          default: 0
          description: Logging verbosity level
        model:
          type: string
          description: AI model to use for actions
          example: "openai/gpt-4o"
        apiKey:
          type: string
          description: API key for Browserbase (required when env=BROWSERBASE)
        projectId:
          type: string
          description: Project ID for Browserbase (required when env=BROWSERBASE)
        systemPrompt:
          type: string
          description: Custom system prompt for AI actions
        domSettleTimeout:
          type: integer
          description: Timeout in ms to wait for DOM to settle
        selfHeal:
          type: boolean
          description: Enable self-healing for failed actions
        localBrowserLaunchOptions:
          type: object
          description: Options for local browser launch
          properties:
            headless:
              type: boolean
              default: true

    ActRequest:
      type: object
      required:
        - input
      properties:
        input:
          oneOf:
            - type: string
              description: Natural language instruction
            - $ref: '#/components/schemas/Action'
        options:
          $ref: '#/components/schemas/ActionOptions'
        frameId:
          type: string
          description: Frame ID to act on (optional)

    Action:
      type: object
      required:
        - selector
        - description
        - method
        - arguments
      properties:
        selector:
          type: string
          description: CSS or XPath selector for the element
        description:
          type: string
          description: Human-readable description of the action
        backendNodeId:
          type: integer
          description: CDP backend node ID
        method:
          type: string
          description: Method to execute (e.g., "click", "fill")
        arguments:
          type: array
          items:
            type: string
          description: Arguments for the method

    ActionOptions:
      type: object
      properties:
        model:
          $ref: '#/components/schemas/ModelConfig'
        variables:
          type: object
          additionalProperties:
            type: string
          description: Template variables for instruction
        timeout:
          type: integer
          description: Timeout in milliseconds

    ModelConfig:
      type: object
      properties:
        provider:
          type: string
          enum: [openai, anthropic, google]
        model:
          type: string
          description: Model name
        apiKey:
          type: string
          description: API key for the model provider
        baseURL:
          type: string
          format: uri
          description: Custom base URL for API

    ExtractRequest:
      type: object
      properties:
        instruction:
          type: string
          description: Natural language instruction for extraction
        schema:
          type: object
          description: JSON Schema for structured output
          additionalProperties: true
        options:
          type: object
          properties:
            model:
              $ref: '#/components/schemas/ModelConfig'
            timeout:
              type: integer
            selector:
              type: string
              description: Extract only from elements matching this selector
        frameId:
          type: string
          description: Frame ID to extract from

    ObserveRequest:
      type: object
      properties:
        instruction:
          type: string
          description: Natural language instruction to filter actions
        options:
          type: object
          properties:
            model:
              $ref: '#/components/schemas/ModelConfig'
            timeout:
              type: integer
            selector:
              type: string
              description: Observe only elements matching this selector
        frameId:
          type: string
          description: Frame ID to observe

    AgentExecuteRequest:
      type: object
      required:
        - agentConfig
        - executeOptions
      properties:
        agentConfig:
          type: object
          properties:
            provider:
              type: string
              enum: [openai, anthropic, google]
            model:
              oneOf:
                - type: string
                - $ref: '#/components/schemas/ModelConfig'
            systemPrompt:
              type: string
            cua:
              type: boolean
              description: Enable Computer Use Agent mode
        executeOptions:
          type: object
          required:
            - instruction
          properties:
            instruction:
              type: string
              description: Task for the agent to complete
            maxSteps:
              type: integer
              default: 20
              description: Maximum number of steps the agent can take
            highlightCursor:
              type: boolean
              description: Visually highlight the cursor during actions
        frameId:
          type: string

    NavigateRequest:
      type: object
      required:
        - url
      properties:
        url:
          type: string
          format: uri
          description: URL to navigate to
        options:
          type: object
          properties:
            waitUntil:
              type: string
              enum: [load, domcontentloaded, networkidle]
              default: load
              description: When to consider navigation complete
        frameId:
          type: string

    ActResult:
      type: object
      required:
        - success
        - message
        - actions
      properties:
        success:
          type: boolean
          description: Whether the action succeeded
        message:
          type: string
          description: Result message
        actions:
          type: array
          items:
            $ref: '#/components/schemas/Action'
          description: Actions that were executed

    ExtractResult:
      oneOf:
        - type: object
          description: Default extraction result
          properties:
            extraction:
              type: string
        - type: object
          description: Structured data matching provided schema
          additionalProperties: true

    ObserveResult:
      type: array
      items:
        $ref: '#/components/schemas/Action'
      description: List of observed actions

    AgentResult:
      type: object
      properties:
        message:
          type: string
          description: Final message from the agent
        steps:
          type: array
          items:
            type: object
          description: Steps taken by the agent

    NavigateResult:
      type: object
      nullable: true
      description: Navigation response (may be null)
      properties:
        ok:
          type: boolean
        status:
          type: integer
        url:
          type: string

    SSEResponse:
      description: |
        Server-Sent Events stream. Each event is prefixed with "data: "
        and contains a JSON object with type and data fields.
      type: object
      required:
        - id
        - type
        - data
      properties:
        id:
          type: string
          format: uuid
          description: Unique event ID
        type:
          type: string
          enum: [system, log]
          description: Event type
        data:
          oneOf:
            - $ref: '#/components/schemas/SystemEvent'
            - $ref: '#/components/schemas/LogEvent'

    SystemEvent:
      type: object
      properties:
        status:
          type: string
          enum: [starting, connected, finished, error]
          description: System status
        result:
          description: Result data (present when status=finished)
        error:
          type: string
          description: Error message (present when status=error)

    LogEvent:
      type: object
      required:
        - status
        - message
      properties:
        status:
          type: string
          # OpenAPI 3.0.x has no `const` keyword; a single-value enum is the equivalent
          enum: [running]
        message:
          type: object
          required:
            - category
            - message
            - level
          properties:
            category:
              type: string
              description: Log category
            message:
              type: string
              description: Log message
            level:
              type: integer
              minimum: 0
              maximum: 2
              description: Log level (0=error, 1=info, 2=debug)

    ErrorResponse:
      type: object
      required:
        - error
      properties:
        error:
          type: string
          description: Error message
        details:
          description: Additional error details

  responses:
    BadRequest:
      description: Invalid request parameters
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
    SessionNotFound:
      description: Session ID not found or expired
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
    InternalError:
      description: Internal server error
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
```
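To make the event layout in `SSEResponse`, `SystemEvent`, and `LogEvent` concrete, here is a rough sketch of reading individual `data: ` lines according to the spec. Note that the spec nests `category`, `message`, and `level` under `data.message` for log events, while the Python client in the description reads them directly from `data`; which shape the server actually emits is not confirmed here, so this sketch simply follows the spec. The function names and the sample payloads (including the short event IDs) are made up for illustration.

```python
import json
from typing import Any, Dict, Optional

PREFIX = "data: "


def parse_sse_line(line: str) -> Optional[Dict[str, Any]]:
    """Parse one 'data: {...}' line into an event dict, or None if it is not one."""
    if not line.startswith(PREFIX):
        return None
    try:
        event = json.loads(line[len(PREFIX):])
    except json.JSONDecodeError:
        return None
    return event if isinstance(event, dict) else None


def summarize_event(event: Dict[str, Any]) -> str:
    """Describe an event using the field layout from the spec's schemas."""
    data = event.get("data") or {}
    if event.get("type") == "log":
        # Per LogEvent, the log fields live under data.message
        log = data.get("message") or {}
        return f"log[{log.get('level')}] {log.get('category')}: {log.get('message')}"
    if event.get("type") == "system":
        status = data.get("status")
        if status == "finished":
            return f"finished: {data.get('result')!r}"
        if status == "error":
            return f"error: {data.get('error')}"
        return f"system: {status}"
    return "unknown event"


# Hypothetical sample stream, shaped after the schemas above
lines = [
    'data: {"id": "1", "type": "system", "data": {"status": "starting"}}',
    'data: {"id": "2", "type": "log", "data": {"status": "running", '
    '"message": {"category": "act", "message": "clicking", "level": 1}}}',
    'data: {"id": "3", "type": "system", "data": {"status": "finished", '
    '"result": {"success": true}}}',
]
for line in lines:
    event = parse_sse_line(line)
    if event is not None:
        print(summarize_event(event))
```

Keying on `status` rather than on the presence of `result` also lets a `finished` event carry a null result, which the spec allows for `NavigateResult`.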

changeset-bot (bot) commented Nov 22, 2025 (edited)

⚠️ No Changeset found

Latest commit: 3eb42da

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types.

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR
