llava-next
Here are 6 public repositories matching this topic...
An open-source implementation for training LLaVA-NeXT.
- Updated Oct 23, 2024 - Python
[CVPR'25] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness
- Updated Mar 4, 2025 - Python
A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v, etc.
- Updated Feb 25, 2025 - Python
Matryoshka Multimodal Models
- Updated Jan 22, 2025 - Python
LLaVA-NeXT-Image-Llama3-Lora, modified from https://github.com/arielnlee/LLaVA-1.6-ft
- Updated Jul 17, 2024 - Python
[AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Vision-Language Models (e.g., LLaVA-Next) under a fixed token budget.
- Updated Jan 30, 2025 - Python