#multimodal-agent

Here are 10 public repositories matching this topic...


AUITestAgent is the first automatic, natural language-driven GUI testing tool for mobile apps, capable of fully automating the entire process of GUI interaction and function verification.

  • Updated Jul 19, 2024

🖼️ Workshop: Build a multimodal AI agent with Haystack & GPT-4o — featuring image understanding, document retrieval, conversational memory, and human-in-the-loop safety controls

  • Updated Jan 30, 2026
  • Jupyter Notebook

Claude, but dockerized, goth-approved, and dangerously executable. This container gives you Claude Code in a fully isolated ritual circle, with no cursed system installs required.

  • Updated Feb 3, 2026
  • Shell

[COLM 2024] ExoViP: Step-by-step Verification and Exploration with Exoskeleton Modules for Compositional Visual Reasoning

  • Updated Oct 1, 2024
  • Python

Multi-Agent voice conversation platform

  • Updated Oct 3, 2025
  • Python

Build an end-to-end system that ingests inventory report PDFs/images, runs OCR to normalize and extract tabular data, stores the cleaned dataset, and exposes a secure, conversational agent that can answer business queries over the data (aggregation, filtering, joins, trends), returning tables, charts, and exportable results.

  • Updated Dec 5, 2025
  • Python

Build an end-to-end system that ingests inventory report PDFs/images, runs OCR to normalize and extract tabular data, stores the cleaned dataset, and exposes a secure, conversational agent that can answer business queries over the data (aggregation, filtering, joins, trends), returning tables, charts, and exportable results.

  • Updated Dec 3, 2025
  • Python

A multimodal coding assistant that analyzes images containing code problems and generates solutions in multiple programming languages.

  • Updated Sep 3, 2025
  • Python
