Deploy & Integrate Screenshot Parsing and Action Models
OpenAdapter simplifies deploying advanced screenshot parsing and action models on local or managed systems, supporting secure automation. Initial deployment options include AWS, with future support for Anthropic and OpenAI APIs.
✨ Key Features
Automated Cloud Deployment: Deploy models on AWS EC2, with planned Anthropic/OpenAI API support and PII/PHI scrubbing.
Configurable with .env: Simplifies setup with environment variables.
Cost-Efficiency: Deploy high-performance instances on demand, with intelligent caching and resource pause/stop features to reduce costs.
Container & API Compatibility: Supports Dockerized models like OmniParser and Set-of-Mark, with future support for Anthropic and OpenAI APIs.
CI/CD with GitHub Actions: Automated integration and deployment ensure consistent updates.
Dataset Preparation and Fine-Tuning: Collect and fine-tune your models with OpenAdapter’s tools for recording, preparing, and training models using user interaction data, such as screenshots and actions, captured directly in your application.
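As an example of the .env-based configuration mentioned above, a minimal file might look like the following. The variable names here are illustrative assumptions, not a documented schema; check your OpenAdapter version for the exact keys it reads.

```shell
# Hypothetical .env for OpenAdapter -- variable names are assumptions.
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key
AWS_REGION=us-east-1
INSTANCE_TYPE=g4dn.xlarge   # GPU instance type used in the deployment example below
```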
Prerequisites
Python 3.10+
AWS Account (support for Hugging Face, Anthropic, and OpenAI APIs planned)
OpenAdapter provides commands to deploy and manage model instances and capture user interactions for fine-tuning. You can record, train, and tune models with your custom dataset, all managed within the OpenAdapter environment.
Recording User Interactions
To capture user interactions such as screenshots and actions, use OpenAdapter’s record command:
python -m openadapter.record "doing taxes"
This command will save the actions in a database file:
Actions saved to ~/openadapter/recording.db
Preparing and Fine-Tuning the Model
Use the recorded data to prepare a dataset and fine-tune your model:
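The module paths and flags below are illustrative assumptions sketching what this step could look like, based on the record command above; they are not documented CLI options, so consult your OpenAdapter version for the actual interface.

```shell
# Hypothetical commands -- module names and flags are assumptions.
# Prepare a training dataset from the recorded actions:
python -m openadapter.prepare --input ~/openadapter/recording.db --output ~/openadapter/dataset

# Fine-tune a model on the prepared dataset:
python -m openadapter.train --dataset ~/openadapter/dataset
```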
This flow enables OpenAdapter to use custom datasets created with OpenAdapt for more accurate action detection and screenshot parsing. Adjust paths based on your local setup.
Deployment Example (OmniParser)
Deploy OmniParser using an AWS GPU instance with OpenAdapter:
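A deployment invocation could look like the following sketch; the module path and flags are assumptions modeled on the record command earlier, not confirmed options.

```shell
# Hypothetical deploy command -- flags are illustrative assumptions.
python -m openadapter.deploy start \
    --model omniparser \
    --instance-type g4dn.xlarge \
    --storage 100
```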
Estimated cost: roughly $10/day on an AWS g4dn.xlarge instance with 100 GB of storage. Intelligent caching helps reduce costs by reusing data across runs.
Modular Cloud Backends
AWS EC2: Current backend, with flexible instance options and security.
Planned: Hugging Face, Anthropic, OpenAI; future support for GCP and Azure.
Integrations
OpenAdapter works seamlessly with OpenAdapt to build datasets and automate models. It can also function as a standalone solution for deploying and managing models in automated environments.
Requirements
Core Requirements
Python 3.10 or higher
Optional Components
Install specific dependencies based on the use case:
Recording: Required for capturing user interactions.
pip install openadapter[record]
Training: Includes dependencies for preparing datasets and fine-tuning models.
pip install openadapter[train]
Deployment: Necessary for deploying models to production.
pip install openadapter[deploy]
Full Installation: Installs all dependencies for full-featured use.
pip install openadapter[full]
Note: GPU support is recommended for training and fine-tuning tasks, especially when working with large models like YOLO and BLIP2.
🛠️ Roadmap
AWS CDK Automation: Streamline Infrastructure as Code.
Container Optimization: Scale with ECS and ECR.
GPU & Intelligent Caching: Optimize cost and performance.
Secure Data Handling: Data encryption and VPC configurations.
Logging & Monitoring: Integrate with CloudWatch.
Serverless Options: AWS Lambda and Google Cloud Functions.
Cross-Cloud Flexibility: Extend to GCP, Azure, and additional model APIs.
License
MIT License.
About
Effortless Deployment and Integration for SOTA Screenshot Parsing and Action Models