multi-modal
Here are 363 public repositories matching this topic...
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Updated Mar 3, 2025 - Python
Database for AI. Store vectors, images, texts, videos, etc. Use with LLMs/LangChain. Store, query, version, and visualize any AI data. Stream data in real time to PyTorch/TensorFlow. https://activeloop.ai
Updated Apr 23, 2025 - Python
[CVPR 2024 Oral] InternVL Family: a pioneering open-source alternative to GPT-4o, an open-source multimodal dialogue model approaching GPT-4o performance
Updated Apr 27, 2025 - Python
ModelScope: bringing the notion of Model-as-a-Service to life.
Updated Apr 30, 2025 - Python
Start building LLM-empowered multi-agent applications more easily.
Updated Apr 30, 2025 - Python
A state-of-the-art open visual language model | multimodal pre-trained model
Updated May 29, 2024 - Python
Implementation/replication of DALL-E, OpenAI's text-to-image transformer, in PyTorch
Updated Feb 17, 2024 - Python
A Chinese version of CLIP that achieves Chinese cross-modal retrieval and representation generation.
Updated Aug 6, 2024 - Python
Unified embedding generation and search engine. Also available in the cloud at cloud.marqo.ai
Updated May 1, 2025 - Python
Open Source Routing Engine for OpenStreetMap
Updated Apr 30, 2025 - C++
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Updated Apr 30, 2025 - Python
Chinese and English multimodal conversational language model
Updated Aug 23, 2024 - Python
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
Updated Feb 20, 2025 - Jupyter Notebook
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
Updated Apr 22, 2025 - Python
[EMNLP 2024] Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Updated Dec 3, 2024 - Python
Represent, send, store, and search multimodal data
Updated Apr 24, 2025 - Python
A GPT-4V-level open-source multimodal model based on Llama3-8B
Updated Mar 3, 2025 - Python
An open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Updated Apr 30, 2025 - Python
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
Updated Feb 16, 2025 - Python