multimodal-agent
Here are 10 public repositories matching this topic...
Mobile-Agent: The Powerful GUI Agent Family
- Updated Feb 20, 2026 · Python
[EMNLP-2024] Build multimodal language agents for fast prototyping and production
- Updated Mar 19, 2025 · Python
AUITestAgent is the first automatic, natural language-driven GUI testing tool for mobile apps, capable of fully automating the entire process of GUI interaction and function verification.
- Updated Jul 19, 2024
🖼️ Workshop: Build a multimodal AI agent with Haystack & GPT-4o — featuring image understanding, document retrieval, conversational memory, and human-in-the-loop safety controls
- Updated Jan 30, 2026 · Jupyter Notebook
Claude, but dockerized, goth-approved, and dangerously executable. This container gives you Claude Code in a fully isolated ritual circle – no cursed system installs required.
- Updated Feb 3, 2026 · Shell
[COLM 2024] ExoViP: Step-by-step Verification and Exploration with Exoskeleton Modules for Compositional Visual Reasoning
- Updated Oct 1, 2024 · Python
Build an end-to-end system that ingests inventory report PDFs/images, runs OCR to normalize and extract tabular data, stores the cleaned dataset, and exposes a secure, conversational agent that can answer business queries over the data (aggregation, filtering, joins, trends), returning tables, charts, and exportable results.
- Updated Dec 5, 2025 · Python
Build an end-to-end system that ingests inventory report PDFs/images, runs OCR to normalize and extract tabular data, stores the cleaned dataset, and exposes a secure, conversational agent that can answer business queries over the data (aggregation, filtering, joins, trends), returning tables, charts, and exportable results.
- Updated Dec 3, 2025 · Python
A multimodal coding assistant that analyzes images containing code problems and generates solutions in multiple programming languages.
- Updated Sep 3, 2025 · Python