large-vision-language-model

[CVPR 2024 Highlight] The first benchmark for lithic use-wear analysis leveraging SOTA vision and vision-language models (DINOv2, GPT-4V), demonstrating AI performance surpassing that of expert archaeologists.

computer-vision archeology anthropology ai4science large-vision-language-model

Updated Mar 24, 2025
Jupyter Notebook

lca0503/MergeToVLRM

Star 5

Source code of our paper "Transferring Textual Preferences to Vision-Language Understanding through Model Merging", ACL 2025

model-merging large-vision-language-model reward-modeling

Updated Apr 25, 2025
Python

lucaswychan/quant-lvlm

Star 3

Easy-to-use large vision language model pipeline for quantitative analysis

pytorch quantitative-finance multimodal-learning large-vision-language-model

Updated Apr 26, 2025
Python

amazon-science/THRONE

Star 3

Code release for THRONE, a CVPR 2024 paper on measuring object hallucinations in LVLM generated text.

benchmark hallucination hallucinations large-language-models large-language-model vision-language-model large-vision-language-model large-vision-language-models cvpr2024 hallucination-evaluation vision-language-models

Updated Aug 6, 2025
Python

pzrain/DiViCo

Star 3

Official implementation of TCSVT 2025 paper: DiViCo: Disentangled Visual Token Compression For Efficient Large Vision-Language Model

multimodal large-vision-language-model token-compression

Updated May 13, 2025
Python

ZPider0/Multimodal

Star 2

🎤 Transform speech and text with this lightweight Python toolkit for transcription, analysis, and audio conversion tasks.

agent machine-learning real-time reinforcement-learning ai deep-learning robotics reading-list multi-modality unified-model neural-search instruction-following llm large-vision-language-model multimodal-instruction-tuning multimodal-large-language-models multimodal-in-context-learning multimodal-chain-of-thought

Updated Dec 17, 2025
Jupyter Notebook

devdhananjay14/multim

Star 1

🔍 Experiment with neural networks for binary classification on multimodal data using this extensible PyTorch framework.

python computer-vision deep-learning robotics tensorflow healthcare transformer reading-list llama representation-learning emotion-detection in-context-learning large-language-models chain-of-thought visual-instruction-tuning large-vision-language-model large-vision-language-models multimodal-large-language-models

Updated Dec 17, 2025
Python

Here are 21 public repositories matching this topic...

BradyFU/Awesome-Multimodal-Large-Language-Models

PKU-YuanGroup/Video-LLaVA

InternLM/InternLM-XComposer

PKU-YuanGroup/MoE-LLaVA

yaotingwangofficial/Awesome-MCoT

jqtangust/hawk

MMStar-Benchmark/MMStar

yu-rp/apiprompting

Orlando-CS/Awesome-VLA

richard-peng-xia/CARES

Ruiyang-061X/VL-Uncertainty

SuperBruceJia/Awesome-Large-Vision-Language-Model

ADL-X/LLAVIDAL

ai4ce/LUWA

lca0503/MergeToVLRM

lucaswychan/quant-lvlm

amazon-science/THRONE

pzrain/DiViCo

ZPider0/Multimodal

devdhananjay14/multim
