multi-modal
Here are 455 public repositories matching this topic...
A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone
- Updated
Feb 7, 2026 - Python
AgentScope: Agent-Oriented Programming for Building LLM Applications
- Updated
Feb 7, 2026 - Python
Open-source framework for conversational voice AI agents
- Updated
Feb 7, 2026 - Python
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance
- Updated
Sep 22, 2025 - Python
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
- Updated
Feb 7, 2026 - C++
ModelScope: bring the notion of Model-as-a-Service to life.
- Updated
Jan 24, 2026 - Python
AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. Includes AI personas, AGI functions, world-class Beam multi-model chats, text-to-image, voice, response streaming, code highlighting and execution, PDF import, presets for developers, much more. Deploy on-prem or in the cloud.
- Updated
Feb 7, 2026 - TypeScript
A state-of-the-art open visual language model | multimodal pre-trained model
- Updated
May 29, 2024 - Python
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
- Updated
Feb 6, 2026 - Python
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
- Updated
Aug 29, 2025 - Jupyter Notebook
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
- Updated
Feb 17, 2024 - Python
Open Source Routing Engine for OpenStreetMap
- Updated
Feb 7, 2026 - C++
Ecommerce Search and Discovery - marqo.ai
- Updated
Feb 7, 2026 - Python
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
- Updated
Jul 19, 2025 - Python
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
- Updated
Dec 4, 2025 - Jupyter Notebook
Chinese and English multimodal conversational language model
- Updated
Aug 23, 2024 - Python
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
- Updated
Feb 3, 2026 - Python
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
- Updated
Dec 3, 2024 - Python
Represent, send, store and search multimodal data
- Updated
Jan 13, 2026 - Python