visual-language-learning
Here are 14 public repositories matching this topic...
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
- Updated Aug 12, 2024 - Python
Code and models for ICML 2024 paper, NExT-GPT: Any-to-Any Multimodal Large Language Model
- Updated May 13, 2025 - Python
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
- Updated Mar 5, 2024 - Python
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
- Updated May 26, 2025 - Python
An open-source implementation for training LLaVA-NeXT.
- Updated Oct 23, 2024 - Python
[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
- Updated Sep 11, 2024 - Python
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
- Updated Apr 14, 2024 - Python
🧘🏻♂️ KarmaVLM (相生): A family of highly efficient and powerful visual language models.
- Updated Apr 29, 2024 - Python
Multimodal Instruction Tuning for Llama 3
- Updated Apr 25, 2024 - Python
Build a simple, basic multimodal large model from scratch. 🤖
- Updated Jun 19, 2024 - Python
[ACM MMGR '24] 🔍 Shotluck Holmes: A family of small-scale LLVMs for shot-level video understanding
- Updated Oct 26, 2024 - Python
Docker image for LLaVA: Large Language and Vision Assistant
- Updated May 17, 2025 - Shell
PyTorch implementation of OpenAI's CLIP model for image classification, visual search, and visual question answering (VQA).
- Updated Sep 14, 2024 - Jupyter Notebook
Efficient Video Question Answering
- Updated Jan 19, 2023 - Python