multi-modal

Here are 455 public repositories matching this topic...

Languages: All 455, Python 310, Jupyter Notebook 49, TypeScript 12, JavaScript 10, C++ 7, MATLAB 4, HTML 3, C 2, C# 2, Go 2

OpenBMB/MiniCPM-o

A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone

multi-modal minicpm minicpm-v

Updated Feb 7, 2026
Python

agentscope-ai/agentscope

AgentScope: Agent-Oriented Programming for Building LLM Applications

agent mcp chatbot multi-agent multi-modal large-language-models llm llm-agent react-agent

Updated Feb 7, 2026
Python

TEN-framework/ten-framework

Open-source framework for conversational voice AI agents

real-time video ai voice multi-modal

Updated Feb 7, 2026
Python

OpenGVLab/InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

image-classification gpt multi-modal semantic-segmentation video-classification image-text-retrieval llm vision-language-model gpt-4v vit-6b vit-22b gpt-4o

Updated Sep 22, 2025
Python

activeloopai/deeplake

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

python data-science machine-learning ai computer-vision deep-learning tensorflow cv image-processing ml pytorch datasets multi-modal datalake mlops vector-search vector-database large-language-models llm langchain

Updated Feb 7, 2026
C++

modelscope/modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

python nlp science machine-learning deep-learning cv speech multi-modal

Updated Jan 24, 2026
Python

enricoros/big-AGI

AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. Includes AI personas, AGI functions, world-class Beam multi-model chats, text-to-image, voice, response streaming, code highlighting and execution, PDF import, presets for developers, and much more. Deploy on-prem or in the cloud.

agi multi-model gpt multi-modal ai-agents gemini-api ai-workspace ai-suite gpt-5 librechat perplexity-api openwebui anthropic-api deepseek-api xai-api nano-banana openai-responses-api sonnet-4-5

Updated Feb 7, 2026
TypeScript

zai-org/CogVLM

A state-of-the-art open visual language model | Multimodal pretrained model

pretrained-models language-model multi-modal cross-modality visual-language-models

Updated May 29, 2024
Python

datajuicer/data-juicer

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

data-science data data-visualization data-analysis data-processing multi-modal data-pipeline synthetic-data pre-training foundation-models large-language-models llm llms instruction-tuning

Updated Feb 6, 2026
Python

OFA-Sys/Chinese-CLIP

Chinese version of CLIP, which supports Chinese cross-modal retrieval and representation generation.

nlp computer-vision deep-learning transformers pytorch chinese pretrained-models multi-modal clip coreml-models contrastive-loss vision-language multi-modal-learning image-text-retrieval vision-and-language-pre-training

Updated Aug 29, 2025
Jupyter Notebook

lucidrains/DALLE-pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch

deep-learning transformers artificial-intelligence multi-modal attention-mechanism text-to-image

Updated Feb 17, 2024
Python

valhalla/valhalla

Open Source Routing Engine for OpenStreetMap

directions openstreetmap routing astar traveling-salesman dijkstra routing-engine isochrones multi-modal tiled

Updated Feb 7, 2026
C++

marqo-ai/marqo

Ecommerce Search and Discovery - marqo.ai

search-engine machine-learning ecommerce multi-modal

Updated Feb 7, 2026
Python

zjunlp/DeepKE

[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction

nlp deep-learning prompt pytorch information-extraction knowledge-graph named-entity-recognition chinese ner multi-modal kg relation-extraction lightner few-shot low-resource document-level attribute-extraction knowprompt deepke instructie

Updated Jul 19, 2025
Python

VectorSpaceLab/OmniGen

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

image image-generation multi-modal multi-task diffusion image-edit

Updated Dec 4, 2025
Jupyter Notebook

zai-org/VisualGLM-6B

Chinese and English multimodal conversational language model

gpt multi-modal chatglm-6b

Updated Aug 23, 2024
Python

open-compass/VLMEvalKit

Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

computer-vision evaluation pytorch gemini openai vqa vit gpt multi-modal clip claude openai-api gpt4 large-language-models llm chatgpt llava qwen gpt-4v

Updated Feb 3, 2026
Python

SciSharp/LLamaSharp

A C#/.NET library to run LLMs (🦙LLaMA/LLaVA) on your local device efficiently.

chatbot llama gpt multi-modal llm llava semantic-kernel llamacpp llama-cpp llama2 llama3

Updated Feb 1, 2026
C#

PKU-YuanGroup/Video-LLaVA

[EMNLP 2024 🔥] Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

multi-modal instruction-tuning large-vision-language-model

Updated Dec 3, 2024
Python

docarray/docarray

Represent, send, store and search multimodal data

elasticsearch machine-learning deep-learning protobuf pytorch data-structures nearest-neighbor-search cross-modal multi-modal semantic-search multimodal nested-data weaviate dataclass pydantic fastapi neural-search qdrant docarray

Updated Jan 13, 2026
Python
