Learn how to use CUA (our Computer Using Agent) via the API on multiple computer environments.


openai/openai-cua-sample-app


Get started building a Computer Using Agent (CUA) with the OpenAI API.

Caution

Computer use is in preview. Because the model is still in preview and may be susceptible to exploits and inadvertent mistakes, we discourage trusting it in authenticated environments or for high-stakes tasks.

Set Up & Run

Set up a Python virtual environment and install dependencies.

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run the CLI to let CUA use a local browser window, using playwright. (Stop with CTRL+C)

python cli.py --computer local-playwright

Note

The first time you run this, if you haven't used Playwright before, you will be prompted to install dependencies. Execute the command suggested, which will depend on your OS.

Other included sample computer environments:

  • Docker (containerized desktop)
  • Browserbase (remote browser, requires account)
  • Scrapybara (remote browser or computer, requires account)
  • ...or implement your own Computer!

Overview

The computer use tool and model are available via the Responses API. At a high level, CUA will look at a screenshot of the computer interface and recommend actions. Specifically, it sends computer_call(s) with actions like click(x,y) or type(text) that you have to execute on your environment, and then expects screenshots of the outcomes.

You can learn more about this tool in the Computer use guide.
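The execute-then-screenshot cycle described above can be sketched as follows. This is a minimal illustration, not the repository's implementation. It assumes each computer_call item carries an action dict whose type names a method on the computer and whose remaining keys are that method's arguments, and that the screenshot goes back to the model as a computer_call_output item. FakeComputer and handle_computer_call are hypothetical names; check the Computer use guide for the exact item shapes.

```python
import base64


class FakeComputer:
    """Hypothetical stand-in that records actions instead of performing them."""

    def __init__(self):
        self.log = []

    def click(self, x, y, button="left"):
        self.log.append(("click", x, y, button))

    def type(self, text):
        self.log.append(("type", text))

    def screenshot(self):
        # A real environment would return a base64-encoded image of the screen.
        return base64.b64encode(b"fake-png-bytes").decode("ascii")


def handle_computer_call(item, computer):
    """Execute one computer_call and build the item to send back to the model."""
    action = dict(item["action"])      # copy so the caller's dict is untouched
    action_type = action.pop("type")   # e.g. "click" or "type"
    getattr(computer, action_type)(**action)
    return {
        "type": "computer_call_output",
        "call_id": item["call_id"],
        "output": {
            "type": "input_image",
            "image_url": "data:image/png;base64," + computer.screenshot(),
        },
    }


computer = FakeComputer()
call = {
    "type": "computer_call",
    "call_id": "call_1",
    "action": {"type": "click", "x": 24, "y": 150},
}
result = handle_computer_call(call, computer)
print(computer.log)
```

In a real loop, the returned item would be appended to the next request's input, and the cycle repeats until the model stops emitting computer calls.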

Abstractions

This repository defines two lightweight abstractions to make interacting with CUA agents more ergonomic. Everything works without them, but they provide a convenient separation of concerns.

| Abstraction | File | Description |
| --- | --- | --- |
| Computer | computers/computer.py | Defines a Computer interface for various environments (local desktop, remote browser, etc.). An implementation of Computer is responsible for executing any computer_action sent by CUA (clicks, etc.). |
| Agent | agent/agent.py | Simple, familiar agent loop: implements run_full_turn(), which just keeps calling the model until all computer actions and function calls are handled. |

CLI Usage

The CLI (cli.py) is the easiest way to get started with CUA. It accepts the following arguments:

  • --computer: The computer environment to use. See the Computer Environments section below for options. By default, the CLI will use the local-playwright environment.
  • --input: The initial input to the agent (optional: the CLI will prompt you for input if not provided)
  • --debug: Enable debug mode.
  • --show: Show images (screenshots) during the execution.
  • --start-url: Start the browsing session with a specific URL (only for browser environments). By default, the CLI will start the browsing session with https://bing.com.
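As an illustration of how the flags listed above fit together, here is a sketch using argparse from the standard library. It mirrors the documented flag names and defaults only; build_parser is a hypothetical helper, and the actual cli.py may be structured differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags documented above (illustrative only)."""
    parser = argparse.ArgumentParser(description="Run CUA against a computer environment.")
    parser.add_argument("--computer", default="local-playwright",
                        help="Computer environment to use.")
    parser.add_argument("--input", default=None,
                        help="Initial input to the agent; prompt interactively if omitted.")
    parser.add_argument("--debug", action="store_true", help="Enable debug mode.")
    parser.add_argument("--show", action="store_true",
                        help="Show screenshots during execution.")
    parser.add_argument("--start-url", default="https://bing.com",
                        help="Starting URL (browser environments only).")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(["--computer", "docker", "--show"])
print(args.computer, args.show, args.start_url)
```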

Run examples (optional)

The examples folder contains more examples of how to use CUA.

python -m examples.weather_example

For reference, the file simple_cua_loop.py implements the basics of the CUA loop.

You can run it with:

python simple_cua_loop.py

Computer Environments

CUA can work with any Computer environment that can handle the CUA actions (plus a few extra):

| Action | Example |
| --- | --- |
| click(x, y, button="left") | click(24, 150) |
| double_click(x, y) | double_click(24, 150) |
| scroll(x, y, scroll_x, scroll_y) | scroll(24, 150, 0, -100) |
| type(text) | type("Hello, World!") |
| wait(ms=1000) | wait(2000) |
| move(x, y) | move(24, 150) |
| keypress(keys) | keypress(["CTRL", "C"]) |
| drag(path) | drag([[24, 150], [100, 200]]) |

This sample app provides a set of implemented Computer examples, but feel free to add your own!
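For anyone adding their own environment, the action table above can be written down as a structural interface. The sketch below uses typing.Protocol and is derived only from that table plus a screenshot() method. The type hints, the string return type of screenshot(), and the LoggingComputer class are assumptions; the Computer definition in computers/computer.py is the authoritative one.

```python
from typing import List, Protocol, runtime_checkable


@runtime_checkable
class Computer(Protocol):
    """Hypothetical interface derived from the action table above."""

    def screenshot(self) -> str: ...
    def click(self, x: int, y: int, button: str = "left") -> None: ...
    def double_click(self, x: int, y: int) -> None: ...
    def scroll(self, x: int, y: int, scroll_x: int, scroll_y: int) -> None: ...
    def type(self, text: str) -> None: ...
    def wait(self, ms: int = 1000) -> None: ...
    def move(self, x: int, y: int) -> None: ...
    def keypress(self, keys: List[str]) -> None: ...
    def drag(self, path: List[List[int]]) -> None: ...


class LoggingComputer:
    """Toy implementation that records each action instead of performing it."""

    def __init__(self):
        self.actions = []

    def screenshot(self):
        return ""  # a real environment would return an encoded image

    def click(self, x, y, button="left"):
        self.actions.append(("click", x, y, button))

    def double_click(self, x, y):
        self.actions.append(("double_click", x, y))

    def scroll(self, x, y, scroll_x, scroll_y):
        self.actions.append(("scroll", x, y, scroll_x, scroll_y))

    def type(self, text):
        self.actions.append(("type", text))

    def wait(self, ms=1000):
        self.actions.append(("wait", ms))

    def move(self, x, y):
        self.actions.append(("move", x, y))

    def keypress(self, keys):
        self.actions.append(("keypress", tuple(keys)))

    def drag(self, path):
        self.actions.append(("drag", [tuple(point) for point in path]))


# isinstance() on a runtime_checkable Protocol only checks that the methods exist,
# not that their signatures match.
print(isinstance(LoggingComputer(), Computer))
```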

| Computer | Option | Type | Description | Requirements |
| --- | --- | --- | --- | --- |
| LocalPlaywright | local-playwright | browser | Local browser window | Playwright SDK |
| Docker | docker | linux | Docker container environment | Docker running |
| Browserbase | browserbase | browser | Remote browser environment | Browserbase API key in .env |
| ScrapybaraBrowser | scrapybara-browser | browser | Remote browser environment | Scrapybara API key in .env |
| ScrapybaraUbuntu | scrapybara-ubuntu | linux | Remote Ubuntu desktop environment | Scrapybara API key in .env |

Using the CLI, you can run the sample app with different computer environments using the options listed above:

python cli.py --show --computer <computer-option>

For example, to run the sample app with the Docker computer environment, you can run:

python cli.py --show --computer docker

Contributed Computers

| Computer | Option | Type | Description | Requirements |
| --- | --- | --- | --- | --- |
| tbd | tbd | tbd | tbd | tbd |

Note

If you've implemented a new computer, please add it to the "Contributed Computers" section of the README.md file. Clearly indicate any auth / signup requirements. See the Contributing section for more details.

Docker Setup

If you want to run the sample app with the Docker computer environment, you need to build and run a local Docker container.

Open a new shell to build and run the Docker image. The first time you do this, it may take a few minutes, but subsequent runs should be much faster. Once the logs stop, proceed to the next setup step. To stop the container, press CTRL+C on the terminal where you ran the command below.

docker build -t cua-sample-app .
docker run --rm -it --name cua-sample-app -p 5900:5900 --dns=1.1.1.3 -e DISPLAY=:99 cua-sample-app

Note

We use --dns=1.1.1.3 to restrict accessible websites to a smaller, safer set. We highly recommend you take similar safety precautions.

Warning

If you get the error below, a container with that name already exists and must be removed first.

docker: Error response from daemon: Conflict. The container name "/cua-sample-app" is already in use by container "e72fcb962b548e06a9dcdf6a99bc4b49642df2265440da7544330eb420b51d87"

Remove that container and try again:

docker rm -f cua-sample-app

Hosted environment setup

This repository contains example implementations of third-party hosted environments. To use these, you will need to set up an account with the service by following the links above and add your API key to the .env file.

Function Calling

The Agent class accepts regular function schemas in tools. By default, it returns a hard-coded value for any invocation.

However, if you pass in any tools that are also defined as methods on your Computer (in addition to the required Computer methods), calls to them are routed to your Computer to be handled. This is useful because screenshots often don't capture the search bar or back arrow, so CUA may get stuck. To work around that, you can provide back() or goto(url) functions. See examples/playwright_with_custom_functions.py for an example.
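The routing rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the Agent class's actual code: route_function_call, its default_result placeholder, and BrowserWithNavigation are hypothetical names.

```python
def route_function_call(name, arguments, computer, default_result="success"):
    """Send a function call to the computer if it implements it, else stub it."""
    method = getattr(computer, name, None)
    if callable(method):
        return method(**arguments)
    # No matching method on the computer: fall back to a hard-coded value.
    return default_result


class BrowserWithNavigation:
    """Hypothetical computer exposing extra navigation helpers."""

    def __init__(self):
        self.history = ["https://bing.com"]

    def goto(self, url):
        self.history.append(url)
        return f"navigated to {url}"

    def back(self):
        if len(self.history) > 1:
            self.history.pop()
        return f"back at {self.history[-1]}"


browser = BrowserWithNavigation()
print(route_function_call("goto", {"url": "https://example.com"}, browser))
print(route_function_call("back", {}, browser))
print(route_function_call("get_weather", {"city": "Paris"}, browser))
```

The first two calls reach the computer because it defines goto() and back(); the third has no matching method, so it receives the placeholder result.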

Risks & Safety considerations

This repository provides example implementations with basic safety measures in place.

We recommend reviewing the best practices outlined in our guide, and making sure you understand the risks involved with using this tool.

Contributing

Computers

To contribute a new computer, you'll need to implement it, test it, and submit a PR. Please follow the steps below:

1. Implement your computer

You will create or modify the following files (and only these files):

| File | Updates |
| --- | --- |
| computers/contrib/[your_computer_name].py | Add computer file. |
| computers/contrib/__init__.py | Add to imports. |
| computers/config.py | Add to config. |
| README.md | Add to README. |

Create a new file in computers/contrib/[your_computer_name].py and define your computer class. Make sure to implement the methods defined in the Computer class, using the existing implementations as a reference.

class YourComputerName:
    def __init__(self):
        pass

    def screenshot(self):
        # TODO: implement
        pass

    def click(self, x, y):
        # TODO: implement
        pass

    # ... add other methods as needed

Note

For playwright-based computers, make sure to subclass BasePlaywrightComputer in computers/shared/base_playwright.py; see computers/default/browserbase.py for an example.

Import your new computer in computers/contrib/__init__.py:

# ... existing computer imports
from .your_computer_name import YourComputerName

And add your new computer to the computers_config dictionary in computers/config.py:

# ... existing computers_config
"your_computer_name": YourComputerName,

Feel free to add your new computer to the "Contributed Computers" section of the README.md file. Clearly indicate any auth / signup requirements.

2. Test your computer

Test your new computer (with the CLI). Make sure:

  • Basic search / navigation works.
  • Any setup / teardown is handled correctly.
  • End-to-end runs succeed on a few different tasks.

Potential gotchas (see default computers for reference):

  • Scrolling, dragging, and control/command keys.
  • Resource allocation and teardown.
  • Auth / signup requirements.
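Before running full tasks through the CLI, a quick smoke test can catch missing or failing action methods, including the scroll, drag, and keypress gotchas listed above. The sketch below calls each action with sample arguments taken from the action table (with a shortened wait) and then checks that a screenshot can still be taken. It assumes screenshot() returns a base64-encoded string; smoke_test, SAMPLE_ACTIONS, and StubComputer are hypothetical names, and the stub only stands in for a real computer.

```python
import base64

# Sample arguments taken from the action table; wait is shortened to keep the check fast.
SAMPLE_ACTIONS = [
    ("click", (24, 150)),
    ("double_click", (24, 150)),
    ("scroll", (24, 150, 0, -100)),
    ("type", ("Hello, World!",)),
    ("wait", (10,)),
    ("move", (24, 150)),
    ("keypress", (["CTRL", "C"],)),
    ("drag", ([[24, 150], [100, 200]],)),
]


def smoke_test(computer):
    """Run every sample action, then confirm a screenshot can still be taken."""
    failures = []
    for name, args in SAMPLE_ACTIONS:
        try:
            getattr(computer, name)(*args)
        except Exception as exc:
            failures.append((name, repr(exc)))
    try:
        # Assumption: screenshots are returned as base64 text.
        base64.b64decode(computer.screenshot(), validate=True)
    except Exception as exc:
        failures.append(("screenshot", repr(exc)))
    return failures


class StubComputer:
    """Hypothetical computer that accepts any action and records it."""

    def __init__(self):
        self.calls = []

    def screenshot(self):
        return base64.b64encode(b"fake image bytes").decode("ascii")

    def __getattr__(self, name):
        # Only reached for attributes not defined above: treat them as actions.
        def record(*args, **kwargs):
            self.calls.append((name, args, kwargs))
        return record


print(smoke_test(StubComputer()))
```

An empty list means every action and the screenshot call completed without raising. It does not show that the actions had the intended effect, so the manual checks above still apply.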

3. Submit a PR

Your PR should clearly define the following:

  • Title: [contrib] Add computer: <your_computer_name>
  • Description:
# Add computer: <your_computer_name>

#### Affiliations

What organization / company / institution are you affiliated with?

#### Computer Description

- Computer type (e.g. browser, linux)

#### Testing Plan

- Signup steps.
- Auth steps.
- Sample queries.

Thank you for your contribution! Please follow all of the above guidelines. Failure to do so may result in your PR being rejected.
