pnixnoel/cua-vlm-llm-switch

 
 


c/ua ("koo-ah") is Docker for Computer-Use Agents - it enables AI agents to control full operating systems in virtual containers and deploy them locally or to the cloud.

Demo video: vibe-photoshop.mp4

Check out more demos of the Computer-Use Agent in action:

  • MCP Server: Work with Claude Desktop and Tableau (mcp-claude-tableau.mp4)
  • AI-Gradio: Multi-app workflow with browser, VS Code and terminal (ai-gradio-clone.mp4)
  • Notebook: Fix GitHub issue in Cursor (notebook-github-cursor.mp4)

🚀 Quick Start with a Computer-Use Agent UI

Need to automate desktop tasks? Launch the Computer-Use Agent UI with a single command.

Option 1: Fully-managed install with Docker (recommended)

Docker-based guided install for quick use

macOS/Linux/Windows (via WSL):

# Requires Docker
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/scripts/playground-docker.sh)"

This script will guide you through setup using Docker containers and launch the Computer-Use Agent UI.


Option 2: Dev Container

Best for contributors and development

This repository includes a Dev Container configuration that simplifies setup to a few steps:

  1. Install the Dev Containers extension (VS Code or WindSurf)
  2. Open the repository in the Dev Container:
    • Press Ctrl+Shift+P (or ⌘+Shift+P on macOS)
    • Select Dev Containers: Clone Repository in Container Volume... and paste the repository URL: https://github.com/trycua/cua.git (if not cloned) or Dev Containers: Open Folder in Container... (if git cloned).

    Note: On WindSurf, the post install hook might not run automatically. If so, run /bin/bash .devcontainer/post-install.sh manually.

  3. Open the VS Code workspace: Once post-install.sh has finished running, open the .vscode/py.code-workspace workspace and press Open Workspace.
  4. Run the Agent UI example: Click Run Agent UI to start the Gradio UI. If prompted to install debugpy (Python Debugger) to enable remote debugging, select 'Yes' to proceed.
  5. Access the Gradio UI: The Gradio UI will be available at http://localhost:7860 and will automatically forward to your host machine.
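If the browser does not load the UI after step 5, it can help to confirm that something is actually listening on the forwarded port. The following is a minimal sketch using only the Python standard library; the helper name `is_port_open` is hypothetical and not part of this project, and the port 7860 is taken from the step above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # 7860 is the port the Gradio UI is described as using above.
    print("Agent UI reachable:", is_port_open("localhost", 7860))
```

A result of `False` only says that no TCP listener answered; it does not distinguish between the UI not having started yet and port forwarding not being set up.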

Option 3: PyPI

Direct Python package installation

# conda create -yn cua python==3.12
pip install -U "cua-computer[all]" "cua-agent[all]"
python -m agent.ui  # Start the agent UI

Or check out the Usage Guide to learn how to use our Python SDK in your own code.
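After a PyPI install, a quick sanity check is to confirm that the two top-level modules imported elsewhere in this README (`computer` and `agent`) are visible to the interpreter you are running. This is a small sketch, not an official check; `check_install` is a hypothetical helper name.

```python
import importlib.util
import sys


def check_install(modules=("computer", "agent")) -> dict:
    """Map each top-level module name to True if it is found on the import path."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])))
    for name, found in check_install().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If a module shows as missing, the most likely cause is that `pip` installed into a different environment than the one running the script.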


Supported Agent Loops

🖥️ Compatibility

For detailed compatibility information including host OS support, VM emulation capabilities, and model provider compatibility, see the Compatibility Matrix.



🐍 Usage Guide

Follow these steps to use C/ua in your own Python code. See the Developer Guide for building from source.

Step 1: Install Lume CLI

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh)"

Lume CLI manages high-performance macOS/Linux VMs with near-native speed on Apple Silicon.

Step 2: Pull the macOS CUA Image

lume pull macos-sequoia-cua:latest

The macOS CUA image contains the default Mac apps and the Computer Server for easy automation.

Step 3: Install Python SDK

pip install "cua-computer[all]" "cua-agent[all]"

Step 4: Use in Your Code

import asyncio

from computer import Computer
from agent import ComputerAgent, LLM

async def main():
    # Start a local macOS VM
    computer = Computer(os_type="macos")
    await computer.run()

    # Or with C/ua Cloud Container
    computer = Computer(
        os_type="linux",
        api_key="your_cua_api_key_here",
        name="your_container_name_here"
    )

    # Example: Direct control of a macOS VM with Computer
    computer.interface.delay = 0.1  # Wait 0.1 seconds between kb/m actions
    await computer.interface.left_click(100, 200)
    await computer.interface.type_text("Hello, world!")
    screenshot_bytes = await computer.interface.screenshot()

    # Example: Create and run an agent locally using mlx-community/UI-TARS-1.5-7B-6bit
    agent = ComputerAgent(
        computer=computer,
        loop="uitars",
        model=LLM(provider="mlxvlm", name="mlx-community/UI-TARS-1.5-7B-6bit")
    )
    async for result in agent.run("Find the trycua/cua repository on GitHub and follow the quick start guide"):
        print(result)

if __name__ == "__main__":
    asyncio.run(main())
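The `async for result in agent.run(...)` loop above runs until the agent stops yielding. One way to bound a run is to wrap the stream in a helper that stops after a fixed number of results. The sketch below is an illustration only: `collect_results` and `fake_agent_run` are hypothetical names, and the fake generator stands in for `agent.run(...)` so the example runs without a VM. The shape of the yielded items is an assumption, not the library's actual result format.

```python
import asyncio
from typing import Any, AsyncIterator, List


async def collect_results(stream: AsyncIterator[Any], max_steps: int = 20) -> List[Any]:
    """Collect up to max_steps items (max_steps >= 1) from an async iterator."""
    results: List[Any] = []
    async for item in stream:
        results.append(item)
        # Stop right after the limit is reached so no further step is requested.
        if len(results) >= max_steps:
            break
    return results


async def fake_agent_run(task: str):
    """Hypothetical stand-in for agent.run(task); yields placeholder dicts."""
    for step in range(5):
        yield {"task": task, "step": step}


if __name__ == "__main__":
    out = asyncio.run(collect_results(fake_agent_run("demo"), max_steps=3))
    print(f"collected {len(out)} results")
```

With a real agent, the call would presumably look like `await collect_results(agent.run("..."), max_steps=10)` inside `main()`.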

For ready-to-use examples, check out our Notebooks collection.

Lume CLI Reference

# Install Lume CLI and background service
curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh | bash

# List all VMs
lume ls

# Pull a VM image
lume pull macos-sequoia-cua:latest

# Create a new VM
lume create my-vm --os macos --cpu 4 --memory 8GB --disk-size 50GB

# Run a VM (creates and starts if it doesn't exist)
lume run macos-sequoia-cua:latest

# Stop a VM
lume stop macos-sequoia-cua_latest

# Delete a VM
lume delete macos-sequoia-cua_latest
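In the commands above, the image `macos-sequoia-cua:latest` is stopped and deleted under the VM name `macos-sequoia-cua_latest`. When scripting Lume from Python, that mapping and the `lume create` flags can be captured in two small helpers. This is a sketch under assumptions: the colon-to-underscore rule is inferred only from the single example above, and both function names are hypothetical. Only the flags shown in this README are used.

```python
import shlex
from typing import List


def vm_name_for_image(image: str) -> str:
    """Derive the VM name from an image reference (assumed rule: ':' becomes '_')."""
    return image.replace(":", "_")


def lume_create_args(
    name: str,
    os_type: str = "macos",
    cpu: int = 4,
    memory: str = "8GB",
    disk_size: str = "50GB",
) -> List[str]:
    """Build the argument list for `lume create` using the flags shown above."""
    return [
        "lume", "create", name,
        "--os", os_type,
        "--cpu", str(cpu),
        "--memory", memory,
        "--disk-size", disk_size,
    ]


if __name__ == "__main__":
    print(vm_name_for_image("macos-sequoia-cua:latest"))
    # Print the command rather than running it; pass the list to
    # subprocess.run(...) on a machine where Lume is installed.
    print(shlex.join(lume_create_args("my-vm")))
```

Building an argument list instead of a shell string avoids quoting problems if a VM name ever contains spaces or shell metacharacters.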

Lumier CLI Reference

For advanced container-like virtualization, check out Lumier - a Docker interface for macOS and Linux VMs.

# Install Lume CLI and background service
curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh | bash

# Run macOS in a Docker container
docker run -it --rm \
    --name lumier-vm \
    -p 8006:8006 \
    -v $(pwd)/storage:/storage \
    -v $(pwd)/shared:/shared \
    -e VM_NAME=lumier-vm \
    -e VERSION=ghcr.io/trycua/macos-sequoia-cua:latest \
    -e CPU_CORES=4 \
    -e RAM_SIZE=8192 \
    -e HOST_STORAGE_PATH=$(pwd)/storage \
    -e HOST_SHARED_PATH=$(pwd)/shared \
    trycua/lumier:latest

Resources

Modules

| Module | Description | Installation |
|--------|-------------|--------------|
| Lume | VM management for macOS/Linux using Apple's Virtualization.Framework | curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/lume/scripts/install.sh \| bash |
| Lumier | Docker interface for macOS and Linux VMs | docker pull trycua/lumier:latest |
| Computer (Python) | Python interface for controlling virtual machines | pip install "cua-computer[all]" |
| Computer (TypeScript) | TypeScript interface for controlling virtual machines | npm install @trycua/computer |
| Agent | AI agent framework for automating tasks | pip install "cua-agent[all]" |
| MCP Server | MCP server for using CUA with Claude Desktop | pip install cua-mcp-server |
| SOM | Set-of-Marks library for Agent | pip install cua-som |
| Computer Server | Server component for Computer | pip install cua-computer-server |
| Core (Python) | Python core utilities | pip install cua-core |
| Core (TypeScript) | TypeScript core utilities | npm install @trycua/core |

Computer Interface Reference

For complete examples, see computer_examples.py or computer_nb.ipynb

# Shell Actions
result = await computer.interface.run_command(cmd)  # Run shell command
# result.stdout, result.stderr, result.returncode

# Mouse Actions
await computer.interface.left_click(x, y)  # Left click at coordinates
await computer.interface.right_click(x, y)  # Right click at coordinates
await computer.interface.double_click(x, y)  # Double click at coordinates
await computer.interface.move_cursor(x, y)  # Move cursor to coordinates
await computer.interface.drag_to(x, y, duration)  # Drag to coordinates
await computer.interface.get_cursor_position()  # Get current cursor position
await computer.interface.mouse_down(x, y, button="left")  # Press and hold a mouse button
await computer.interface.mouse_up(x, y, button="left")  # Release a mouse button

# Keyboard Actions
await computer.interface.type_text("Hello")  # Type text
await computer.interface.press_key("enter")  # Press a single key
await computer.interface.hotkey("command", "c")  # Press key combination
await computer.interface.key_down("command")  # Press and hold a key
await computer.interface.key_up("command")  # Release a key

# Scrolling Actions
await computer.interface.scroll(x, y)  # Scroll the mouse wheel
await computer.interface.scroll_down(clicks)  # Scroll down
await computer.interface.scroll_up(clicks)  # Scroll up

# Screen Actions
await computer.interface.screenshot()  # Take a screenshot
await computer.interface.get_screen_size()  # Get screen dimensions

# Clipboard Actions
await computer.interface.set_clipboard(text)  # Set clipboard content
await computer.interface.copy_to_clipboard()  # Get clipboard content

# File System Operations
await computer.interface.file_exists(path)  # Check if file exists
await computer.interface.directory_exists(path)  # Check if directory exists
await computer.interface.read_text(path, encoding="utf-8")  # Read file content
await computer.interface.write_text(path, content, encoding="utf-8")  # Write file content
await computer.interface.read_bytes(path)  # Read file content as bytes
await computer.interface.write_bytes(path, content)  # Write file content as bytes
await computer.interface.delete_file(path)  # Delete file
await computer.interface.create_dir(path)  # Create directory
await computer.interface.delete_dir(path)  # Delete directory
await computer.interface.list_dir(path)  # List directory contents

# Accessibility
await computer.interface.get_accessibility_tree()  # Get accessibility tree

# Delay Configuration
# Set default delay between all actions (in seconds)
computer.interface.delay = 0.5  # 500ms delay between actions

# Or specify delay for individual actions
await computer.interface.left_click(x, y, delay=1.0)  # 1 second delay after click
await computer.interface.type_text("Hello", delay=0.2)  # 200ms delay after typing
await computer.interface.press_key("enter", delay=0.5)  # 500ms delay after key press

# Python Virtual Environment Operations
await computer.venv_install("demo_venv", ["requests", "macos-pyxa"])  # Install packages in a virtual environment
# Run a shell command in a virtual environment
await computer.venv_cmd("demo_venv", "python -c 'import requests; print(requests.get(\"https://httpbin.org/ip\").json())'")
# Run a Python function in a virtual environment and return the result / raise an exception
await computer.venv_exec("demo_venv", python_function_or_code, *args, **kwargs)

# Example: Use sandboxed functions to execute code in a C/ua Container
from computer.helpers import sandboxed

@sandboxed("demo_venv")
def greet_and_print(name):
    """Get the HTML of the current Safari tab"""
    import PyXA
    safari = PyXA.Application("Safari")
    html = safari.current_document.source()
    print(f"Hello from inside the container, {name}!")
    return {"greeted": name, "safari_html": html}

# When a @sandboxed function is called, it will execute in the container
result = await greet_and_print("C/ua")
# Result: {"greeted": "C/ua", "safari_html": "<html>...</html>"}
# stdout and stderr are also captured and printed / raised
print("Result from sandboxed function:", result)
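The individual interface calls above are usually combined into small task-level helpers. The sketch below shows one such composition (click a field, type, press Enter) and exercises it against a recording stand-in, so it runs without a VM. `RecordingInterface` and `submit_text` are hypothetical names introduced for illustration. The three method names mirror the reference above, but the stand-in does not reproduce the real interface's behavior or optional parameters such as `delay`.

```python
import asyncio


class RecordingInterface:
    """Hypothetical stand-in for computer.interface that only records calls."""

    def __init__(self):
        self.calls = []

    async def left_click(self, x, y):
        self.calls.append(("left_click", x, y))

    async def type_text(self, text):
        self.calls.append(("type_text", text))

    async def press_key(self, key):
        self.calls.append(("press_key", key))


async def submit_text(interface, x, y, text):
    """Click a field at (x, y), type the text, then press Enter."""
    await interface.left_click(x, y)
    await interface.type_text(text)
    await interface.press_key("enter")


if __name__ == "__main__":
    fake = RecordingInterface()
    asyncio.run(submit_text(fake, 100, 200, "Hello"))
    for call in fake.calls:
        print(call)
```

Because `submit_text` only depends on the three methods it calls, the same helper could presumably be passed a real `computer.interface` once a VM is running, and the recording stand-in can be reused in unit tests.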

ComputerAgent Reference

For complete examples, see agent_examples.py or agent_nb.ipynb

# Import necessary components
from agent import ComputerAgent, LLM, AgentLoop, LLMProvider

# UI-TARS-1.5 agent for local execution with MLX
ComputerAgent(loop=AgentLoop.UITARS, model=LLM(provider=LLMProvider.MLXVLM, name="mlx-community/UI-TARS-1.5-7B-6bit"))

# OpenAI Computer-Use agent using OPENAI_API_KEY
ComputerAgent(loop=AgentLoop.OPENAI, model=LLM(provider=LLMProvider.OPENAI, name="computer-use-preview"))

# Anthropic Claude agent using ANTHROPIC_API_KEY
ComputerAgent(loop=AgentLoop.ANTHROPIC, model=LLM(provider=LLMProvider.ANTHROPIC))

# OmniParser loop for UI control using Set-of-Marks (SOM) prompting and any vision LLM
ComputerAgent(loop=AgentLoop.OMNI, model=LLM(provider=LLMProvider.OLLAMA, name="gemma3:12b-it-q4_K_M"))

# OpenRouter example using OAICOMPAT provider
ComputerAgent(
    loop=AgentLoop.OMNI,
    model=LLM(
        provider=LLMProvider.OAICOMPAT,
        name="openai/gpt-4o-mini",
        provider_base_url="https://openrouter.ai/api/v1"
    ),
    api_key="your-openrouter-api-key"
)
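Since the OpenAI and Anthropic configurations above read their keys from `OPENAI_API_KEY` and `ANTHROPIC_API_KEY`, a script can pick a configuration based on which key is set and fall back to the local UI-TARS/MLX setup otherwise. The following is a sketch of that selection logic only. `pick_agent_config` is a hypothetical helper, the priority order is an arbitrary choice, and it returns the enum member names used above as plain strings so the example runs without the `agent` package installed.

```python
import os
from typing import Mapping, Tuple


def pick_agent_config(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (AgentLoop member name, LLMProvider member name) based on available keys."""
    if env.get("OPENAI_API_KEY"):
        return ("OPENAI", "OPENAI")
    if env.get("ANTHROPIC_API_KEY"):
        return ("ANTHROPIC", "ANTHROPIC")
    # No cloud key found: fall back to the local UI-TARS loop with MLX.
    return ("UITARS", "MLXVLM")


if __name__ == "__main__":
    loop_name, provider_name = pick_agent_config()
    print(f"loop=AgentLoop.{loop_name}, provider=LLMProvider.{provider_name}")
```

With the package installed, the names could be resolved with `getattr(AgentLoop, loop_name)` and `getattr(LLMProvider, provider_name)` before constructing `ComputerAgent`.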

Community

Join our Discord community to discuss ideas, get assistance, or share your demos!

License

Cua is open-sourced under the MIT License - see the LICENSE file for details.

Microsoft's OmniParser, which is used in this project, is licensed under the Creative Commons Attribution 4.0 International License (CC-BY-4.0) - see the OmniParser LICENSE file for details.

Contributing

We welcome contributions to CUA! Please refer to our Contributing Guidelines for details.

Trademarks

Apple, macOS, and Apple Silicon are trademarks of Apple Inc. Ubuntu and Canonical are registered trademarks of Canonical Ltd. Microsoft is a registered trademark of Microsoft Corporation. This project is not affiliated with, endorsed by, or sponsored by Apple Inc., Canonical Ltd., or Microsoft Corporation.

Stargazers

Thank you to all our supporters!


Contributors

  • f-trycua 💻
  • Pedro Piñera Buendía 💻
  • Amit Kumar 💻
  • Dung Duc Huynh (Kaka) 💻
  • Zayd Krunz 💻
  • Prashant Raj 💻
  • Leland Takamine 💻
  • ddupont 💻
  • Ethan Gutierrez 💻
  • Ricter Zheng 💻
  • Rahul Karajgikar 💻
  • trospix 💻
  • Evan smith 💻
