Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).

trycua/cua


We’re hosting the Computer-Use Agents SOTA Challenge at Hack the North and online!

Track A (On-site @ UWaterloo): Reserved for participants accepted to Hack the North. 🏆 Prize: YC interview guaranteed.
Track B (Remote): Open to everyone worldwide. 🏆 Prize: Cash award.

👉 Sign up here: trycua.com/hackathon

cua ("koo-ah") is Docker for Computer-Use Agents - it enables AI agents to control full operating systems in virtual containers and deploy them locally or to the cloud.

With the Computer SDK, you can:

With the Agent SDK, you can:

  • run computer-use models with a consistent schema
  • benchmark on OSWorld-Verified, SheetBench-V2, and more with a single line of code using HUD (Notebook)
  • combine UI grounding models with any LLM using composed agents
  • use new UI agent models and UI grounding models from the Model Zoo below with just a model string (e.g., ComputerAgent(model="openai/computer-use-preview"))
  • use API or local inference by changing a prefix (e.g., openai/, openrouter/, ollama/, huggingface-local/, mlx/, etc.)
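
The prefix convention in the last bullet can be pictured as a split on the first slash of the model string. The sketch below is illustrative only: `split_prefix` and `group_by_prefix` are hypothetical helpers, not part of the cua-agent API, and it assumes the prefix is everything before the first `/`.

```python
from collections import defaultdict


def split_prefix(model: str) -> tuple[str, str]:
    """Split a model string into (prefix, remainder) at the first slash.

    Hypothetical helper for illustration; cua-agent may parse strings differently.
    """
    prefix, sep, rest = model.partition("/")
    if not sep or not prefix or not rest:
        raise ValueError(f"expected '<prefix>/<model>', got {model!r}")
    return prefix, rest


def group_by_prefix(models: list[str]) -> dict[str, list[str]]:
    """Group model strings by their provider prefix."""
    groups: dict[str, list[str]] = defaultdict(list)
    for model in models:
        prefix, rest = split_prefix(model)
        groups[prefix].append(rest)
    return dict(groups)


# Model strings taken from this README.
models = [
    "openai/computer-use-preview",
    "openrouter/z-ai/glm-4.5v",
    "huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B",
]
print(group_by_prefix(models))
```

Under this reading, moving between hosted and local inference only changes the part of the string before the first slash; the rest of the agent code stays the same.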

CUA Model Zoo 🐨

| All-in-one CUAs | UI Grounding Models | UI Planning Models |
|---|---|---|
| anthropic/claude-sonnet-4-5-20250929 | huggingface-local/xlangai/OpenCUA-{7B,32B} | any all-in-one CUA |
| openai/computer-use-preview | huggingface-local/HelloKKMe/GTA1-{7B,32B,72B} | any VLM (using liteLLM, requires tools parameter) |
| openrouter/z-ai/glm-4.5v | huggingface-local/Hcompany/Holo1.5-{3B,7B,72B} | any LLM (using liteLLM, requires moondream3+ prefix) |
| huggingface-local/OpenGVLab/InternVL3_5-{1B,2B,4B,8B,...} | any all-in-one CUA | |
| huggingface-local/ByteDance-Seed/UI-TARS-1.5-7B | | |
| moondream3+{ui planning} (supports text-only models) | | |
| omniparser+{ui planning} | | |
| {ui grounding}+{ui planning} | | |

Missing a model? Raise a feature request or contribute!
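
The Model Zoo uses two shorthands: a brace list such as `{7B,32B}` for model sizes, and a `+` joining a grounding model to a planning model. The sketch below shows one way to turn those shorthands into concrete model strings. `expand_sizes` and `compose` are hypothetical helpers written for this example, not cua-agent functions, and entries written as `...` are skipped rather than guessed.

```python
import re


def expand_sizes(pattern: str) -> list[str]:
    """Expand the first '{a,b,c}' group in a Model Zoo entry into concrete names.

    Hypothetical helper; an elided option written as '...' is dropped.
    """
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[: match.start()], pattern[match.end():]
    options = [opt.strip() for opt in match.group(1).split(",")]
    return [head + opt + tail for opt in options if opt and opt != "..."]


def compose(grounding: str, planning: str) -> str:
    """Build a composed-agent string in the '{ui grounding}+{ui planning}' form."""
    if "+" in grounding or "+" in planning:
        raise ValueError("each part must be a single model string")
    return f"{grounding}+{planning}"


for name in expand_sizes("huggingface-local/HelloKKMe/GTA1-{7B,32B,72B}"):
    print(name)

# Pair one grounding model with an all-in-one CUA as the planner.
print(compose("huggingface-local/HelloKKMe/GTA1-7B", "openai/computer-use-preview"))
```

The resulting string would be passed as the `model` argument of `ComputerAgent`, in the same way as the single-model strings shown earlier.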


Quick Start


Usage (Docs)

pip install cua-agent[all]

from agent import ComputerAgent

# `computer` is a Computer instance; see the Computer section below.
agent = ComputerAgent(
    model="anthropic/claude-3-5-sonnet-20241022",
    tools=[computer],
    max_trajectory_budget=5.0
)

messages = [{"role": "user", "content": "Take a screenshot and tell me what you see"}]

# Outside a notebook, `async for` must run inside an async function.
async for result in agent.run(messages):
    for item in result["output"]:
        if item["type"] == "message":
            print(item["content"][0]["text"])

Output format (OpenAI Agent Responses Format):

{
  "output": [
    # user input
    {
      "role": "user",
      "content": "go to trycua on gh"
    },
    # first agent turn adds the model output to the history
    {
      "summary": [
        {
          "text": "Searching Firefox for Trycua GitHub",
          "type": "summary_text"
        }
      ],
      "type": "reasoning"
    },
    {
      "action": {
        "text": "Trycua GitHub",
        "type": "type"
      },
      "call_id": "call_QI6OsYkXxl6Ww1KvyJc4LKKq",
      "status": "completed",
      "type": "computer_call"
    },
    # second agent turn adds the computer output to the history
    {
      "type": "computer_call_output",
      "call_id": "call_QI6OsYkXxl6Ww1KvyJc4LKKq",
      "output": {
        "type": "input_image",
        "image_url": "data:image/png;base64,..."
      }
    },
    # final agent turn adds the agent output text to the history
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "text": "Success! The Trycua GitHub page has been opened.",
          "type": "output_text"
        }
      ]
    }
  ],
  "usage": {
    "prompt_tokens": 150,
    "completion_tokens": 75,
    "total_tokens": 225,
    "response_cost": 0.01,
  }
}
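
As a rough illustration of how this structure can be consumed, the sketch below walks the `output` list and separates computer actions from assistant replies. `summarize_output` is a hypothetical helper, not part of the SDK. It uses `.get("type")` because the user-input item in the sample above carries a `role` but no `type` key, and the sample data is an abbreviated copy of that example.

```python
def summarize_output(response: dict) -> tuple[list[tuple[str, str]], list[str]]:
    """Return (actions, replies) extracted from an agent result dict.

    actions: (call_id, action type) for every computer_call item.
    replies: every output_text string from assistant messages.
    """
    actions: list[tuple[str, str]] = []
    replies: list[str] = []
    for item in response.get("output", []):
        kind = item.get("type")  # the user-input item has no "type" key
        if kind == "computer_call":
            actions.append((item["call_id"], item["action"]["type"]))
        elif kind == "message" and item.get("role") == "assistant":
            replies.extend(
                part["text"]
                for part in item["content"]
                if part.get("type") == "output_text"
            )
    return actions, replies


# Abbreviated copy of the sample output shown above.
sample = {
    "output": [
        {"role": "user", "content": "go to trycua on gh"},
        {
            "action": {"text": "Trycua GitHub", "type": "type"},
            "call_id": "call_QI6OsYkXxl6Ww1KvyJc4LKKq",
            "status": "completed",
            "type": "computer_call",
        },
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {
                    "text": "Success! The Trycua GitHub page has been opened.",
                    "type": "output_text",
                }
            ],
        },
    ],
    "usage": {"prompt_tokens": 150, "completion_tokens": 75, "total_tokens": 225},
}

actions, replies = summarize_output(sample)
print(actions)  # [('call_QI6OsYkXxl6Ww1KvyJc4LKKq', 'type')]
print(replies)  # ['Success! The Trycua GitHub page has been opened.']
```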

Computer (Docs)

pip install cua-computer[all]

from computer import Computer

# Outside a notebook, `async with` must run inside an async function.
async with Computer(
    os_type="linux",
    provider_type="cloud",
    name="your-container-name",
    api_key="your-api-key"
) as computer:
    # Take screenshot
    screenshot = await computer.interface.screenshot()

    # Click and type
    await computer.interface.left_click(100, 100)
    await computer.interface.type("Hello!")
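
The agent and computer snippets above both use `async` syntax at the top level, which only works in a notebook or an async REPL. A minimal sketch of how the two might be combined into one script follows. It uses only the calls shown in this README, with the same placeholder container name and API key. `collect_text` and `main` are names invented for this example, and the imports are deferred into `main` so the file loads even when the packages are not installed.

```python
import asyncio


async def collect_text(agent, messages):
    """Run the agent and return the text of every message item it emits."""
    texts = []
    async for result in agent.run(messages):
        for item in result["output"]:
            # .get() tolerates items that carry no "type" key
            if item.get("type") == "message":
                texts.append(item["content"][0]["text"])
    return texts


async def main():
    # Deferred imports: these need cua-agent and cua-computer to be installed.
    from agent import ComputerAgent
    from computer import Computer

    async with Computer(
        os_type="linux",
        provider_type="cloud",
        name="your-container-name",  # placeholder from the README
        api_key="your-api-key",      # placeholder from the README
    ) as computer:
        agent = ComputerAgent(
            model="anthropic/claude-3-5-sonnet-20241022",
            tools=[computer],
            max_trajectory_budget=5.0,
        )
        messages = [
            {"role": "user", "content": "Take a screenshot and tell me what you see"}
        ]
        for text in await collect_text(agent, messages):
            print(text)


# Uncomment once the packages are installed and the placeholders are filled in:
# asyncio.run(main())
```

Keeping the streaming loop in its own function means it can be exercised with a stand-in agent object before pointing it at a real container.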

Resources

Modules

| Module | Description | Installation |
|---|---|---|
| Lume | VM management for macOS/Linux using Apple's Virtualization.Framework | curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh \| bash |
| Lumier | Docker interface for macOS and Linux VMs | docker pull trycua/lumier:latest |
| Computer (Python) | Python interface for controlling virtual machines | pip install "cua-computer[all]" |
| Computer (TypeScript) | TypeScript interface for controlling virtual machines | npm install @trycua/computer |
| Agent | AI agent framework for automating tasks | pip install "cua-agent[all]" |
| MCP Server | MCP server for using CUA with Claude Desktop | pip install cua-mcp-server |
| SOM | Set-of-Mark library for Agent | pip install cua-som |
| Computer Server | Server component for Computer | pip install cua-computer-server |
| Core (Python) | Python core utilities | pip install cua-core |
| Core (TypeScript) | TypeScript core utilities | npm install @trycua/core |

Community

Join our Discord community to discuss ideas, get assistance, or share your demos!

License

Cua is open-sourced under the MIT License - see the LICENSE file for details.

Portions of this project, specifically components adapted from Kasm Technologies Inc., are also licensed under the MIT License. See libs/kasm/LICENSE for details.

Microsoft's OmniParser, which is used in this project, is licensed under the Creative Commons Attribution 4.0 International License (CC-BY-4.0). See the OmniParser LICENSE for details.

Third-Party Licenses and Optional Components

Some optional extras for this project depend on third-party packages that are licensed under terms different from the MIT License.

  • The optional "omni" extra (installed via pip install "cua-agent[omni]") installs the cua-som module, which includes ultralytics and is licensed under the AGPL-3.0.

When you choose to install and use such optional extras, your use, modification, and distribution of those third-party components are governed by their respective licenses (e.g., AGPL-3.0 for ultralytics).

Contributing

We welcome contributions to Cua! Please refer to our Contributing Guidelines for details.

Trademarks

Apple, macOS, and Apple Silicon are trademarks of Apple Inc.
Ubuntu and Canonical are registered trademarks of Canonical Ltd.
Microsoft is a registered trademark of Microsoft Corporation.

This project is not affiliated with, endorsed by, or sponsored by Apple Inc., Canonical Ltd., Microsoft Corporation, or Kasm Technologies.

Stargazers

Thank you to all our supporters!

