RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback


Brief Introduction

This repository hosts the code, data, and model weights of RLHF-V, a novel framework that aligns Multimodal Large Language Model (MLLM) behavior through fine-grained correctional human feedback.

We collect fine-grained correctional feedback data by asking human annotators to correct the hallucinated segments in model responses; this form of feedback credits the desired behavior more precisely. Thanks to its high data efficiency, it takes only 1 hour on 8 A100 GPUs to reduce the hallucination rate of the base model by 34.8%. Specifically, we conduct experiments on Muffin, an MLLM trained on UniMM-Chat with strong image understanding and reasoning abilities.
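For illustration, each piece of correctional feedback yields a segment-level preference pair: the original response containing a hallucinated segment, and the human-corrected response. The example below is hypothetical; its field names and content are illustrative, not the released schema (see the Dataset section):

```python
# Hypothetical illustration of a fine-grained correction pair; field names and
# content are made up for clarity and do not reflect the released schema.
correction_pair = {
    "question": "What is the man in the image doing?",
    # original model response; the hallucinated segment is "holding an umbrella"
    "rejected": "The man is riding a bicycle while holding an umbrella.",
    # human-corrected response: only the hallucinated segment is rewritten
    "chosen": "The man is riding a bicycle down the street.",
}
```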

Visit our 🏠 project page and 📃 paper to explore more! And don't miss trying our interactive 🔥 demo!

🎈News

📌 Pinned

  • [2024.05.28] 📃 Our RLAIF-V paper is accessible on arXiv now!
  • [2024.05.20] 🎉 We introduce RLAIF-V, our new alignment framework that utilizes open-source models for feedback generation and reaches super GPT-4V trustworthiness. You can download the corresponding dataset now!

  • [2024.04.11] 🔥 Our data is used in MiniCPM-V 2.0, an end-side multimodal large language model that exhibits comparable trustworthiness with GPT-4V!
  • [2024.03.10] 📃 Our RLHF-V is accepted by CVPR 2024!
  • [2024.02.04] 🔥 OmniLMM-12B, which is built with RLHF-V, achieves the #1 rank among open-source models on MMHal-Bench and even outperforms GPT-4V on Object HalBench! The demo is available here!
  • [2024.01.06] 🔥 A larger, more diverse set of fine-grained human correction data is available on Hugging Face now! 🔥 The newly released data contains about 5.7k fine-grained human correction samples covering the outputs of more powerful models (Qwen-VL-Chat, InstructBLIP, etc.). We also expand the image types from everyday scenes to diverse styles and themes (WikiArt, landmarks, scene texts, etc.).
  • [2023.12.15] 🗂 We merge a new subset into our Hugging Face dataset! It contains 1,065 fine-grained human preference samples annotated on the outputs of LLaVA-13B.
  • [2023.12.04] 📃 Our paper is accessible on arXiv now. We are still working hard to improve the data diversity and amount. More high-quality data is on the way!


Dataset

We present the RLHF-V-Dataset, a human preference dataset constructed from fine-grained, segment-level human corrections. In total, we obtain 1.4k annotated samples covering a diverse set of detailed-description instructions and question-answering instructions.
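As a minimal loading sketch, the dataset can be read with the Hugging Face datasets library; the hub ID below is an assumption, so please check the dataset card for the exact ID and schema:

```python
# Minimal loading sketch (the hub ID is an assumption; check the dataset card
# on Hugging Face for the exact ID and schema).
from datasets import load_dataset

ds = load_dataset("HaoyeZhang/RLHF-V-Dataset", split="train")
print(ds)     # number of rows and column names
print(ds[0])  # one segment-level preference example
```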

RLHF-V Weights

We release the RLHF-V model weights on Hugging Face.

We also provide our SFT weights, the model checkpoint obtained after fine-tuning Muffin on the VQAv2 dataset.
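Because Muffin uses its own codebase rather than a stock transformers model class, one simple way to fetch a checkpoint is huggingface_hub.snapshot_download; {HF_REPO_ID} below is a placeholder in the style of the other placeholders in this README:

```python
# Download a released checkpoint locally. {HF_REPO_ID} is a placeholder:
# substitute the actual repository ID from the Hugging Face links above.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="{HF_REPO_ID}", local_dir="./RLHF-V_weight")
print("checkpoint files in:", local_dir)
```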

Install

  1. Install Muffin
```bash
cd RLHF-V
git clone https://github.com/thunlp/muffin
cd Muffin

# Creating conda environment
conda create -n muffin python=3.10
conda activate muffin

# Installing dependencies
pip install -e .

# Install specific version of transformers to make sure you can reproduce
# the experimental results in our papers
git clone --recursive git@github.com:huggingface/transformers.git
cd transformers
git checkout a92e0ad2e20ef4ce28410b5e05c5d63a5a304e65
pip install .
cd ..
```
  2. Prepare training environment

Install additional packages if you need to do training.

```bash
git clone --recursive https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
# Note: Uncomment the following line if you have CUDA version <= 11.4
# git checkout ad11394
MAX_JOBS=8 python setup.py install
cd ..
```
  3. Prepare evaluation environment

To run Object HalBench evaluation, you also need the following packages (a quick import check follows the list):

```
jsonlines
nltk==3.8.1
spacy==3.7.0
# Download and install "en_core_web_trf" for spacy
# The wheel version we use can be downloaded from
# https://github.com/explosion/spacy-models/releases/tag/en_core_web_trf-3.7.2
# run pip install en_core_web_trf-3.7.2-py3-none-any.whl
```
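As a quick sanity check (not part of the official scripts), the snippet below verifies that the evaluation dependencies import and that the spacy pipeline installed correctly:

```python
# Sanity check for the Object HalBench evaluation environment (not an official
# script): all three packages should import, and en_core_web_trf should load.
import jsonlines  # noqa: F401 (import check only)
import nltk
import spacy

nlp = spacy.load("en_core_web_trf")  # raises OSError if the wheel is missing
print("nltk", nltk.__version__, "| spacy", spacy.__version__)
```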

Evaluation

LLaVA Bench

Run the following script to generate, evaluate, and summarize results for LLaVA Bench:

```bash
# cd RLHF-V
bash ./script/eval/eval_muffin_llavabench.sh ./RLHF-V_weight ./results/RLHF-V {YOUR_OPENAI_API_KEY}
```

Object HalBench

  1. Prepare COCO2014 annotations

The evaluation of Object HalBench relies on the caption and segmentation annotations from the COCO2014 dataset. Please first download the COCO2014 dataset from the COCO dataset's official website. A small sanity-check sketch for the annotations appears at the end of this subsection.

```bash
mkdir coco2014
cd coco2014
wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip
unzip annotations_trainval2014.zip
```
  2. Inference, evaluation, and summarization

Please replace {YOUR_COCO2014_ANNOTATION_DIR} with the path to the COCO2014 annotation directory (e.g. ./coco2014/annotations), and replace {YOUR_OPENAI_API_KEY} with a valid OpenAI API key.

```bash
# cd RLHF-V
bash ./script/eval_muffin_objhal.sh ./RLHF-V_weight ./results/RLHF-V {YOUR_COCO2014_ANNOTATION_DIR} {YOUR_OPENAI_API_KEY}
```
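To confirm that the annotations from step 1 are in place, the optional check below loads one caption with pycocotools; using this package is an assumption for illustration, as the evaluation script may parse the JSON files directly:

```python
# Optional sanity check that the COCO2014 caption annotations unpacked
# correctly (pycocotools is an assumption; the evaluation script may read
# the JSON files directly).
from pycocotools.coco import COCO

coco = COCO("./coco2014/annotations/captions_val2014.json")
first_img = coco.getImgIds()[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=first_img))
print(anns[0]["caption"])
```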

MMHal Bench

  1. Prepare MMHal Data

Please download the MMHal evaluation data here, and save the file in eval/data.

  2. Run the following script to generate, evaluate, and summarize results for MMHal Bench:
```bash
# cd RLHF-V
bash ./script/eval_muffin_mmhal.sh ./RLHF-V_weight ./results/RLHF-V {YOUR_OPENAI_API_KEY}
```

RLHF-V Training

  1. Prepare environment

Please follow the instructions in the Install section to prepare the training environment, and make sure to upgrade to the latest code base of Muffin:

```bash
cd Muffin
git pull
pip install -e .
```
  2. Prepare model checkpoint

Please download our SFT model checkpoint and save it to Muffin/RLHF-V_SFT_weight.

  3. Training

Please make sure you have upgraded to the latest code base of Muffin. After installing the Muffin environment, you can train your model as follows. This script will automatically download our open-sourced training data from Hugging Face, generate logps with our SFT model, and run DDPO training (a sketch of the underlying objective follows the command):

```bash
cd Muffin

ref_model=./RLHF-V_SFT_weight

bash ./script/train/run_RLHFV.sh \
    ./RLHFV_checkpoints/dpo_exp \
    master \
    RLHFV \
    1.1 \
    $ref_model \
    ./RLHF-V-Dataset \
    RLHFV_SFT \
    2160 \
    360 \
    0.1 \
    False \
    True
```
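For readers who want the gist of the objective, below is a minimal sketch of the DPO-style loss that DDPO builds on, assuming precomputed sequence log probabilities from the policy and the frozen SFT reference model; RLHF-V's dense variant additionally up-weights the corrected segments, which is omitted here:

```python
# Minimal sketch of the DPO-style objective underlying DDPO training, assuming
# precomputed sequence logps from the policy and the frozen SFT reference
# model. The dense per-segment weighting of RLHF-V is omitted for brevity;
# whether beta corresponds to the 0.1 argument of run_RLHFV.sh is an assumption.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    # Implicit rewards are beta-scaled log-ratios against the reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```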

Licenses


Usage and License Notices: The data, code, and checkpoints are intended and licensed for research use only. They are also restricted to uses that follow the license agreements of LLaMA, Vicuna, and ChatGPT. The dataset is licensed under CC BY-NC 4.0 (allowing only non-commercial use), and models trained using the dataset should not be used outside of research purposes.

Acknowledgement

  • Muffin: the codebase we built upon.
  • LLaVA-RLHF: we utilize the MMHal-Bench data and evaluation code constructed by them.
  • Object Hallucination: we refer to the CHAIR evaluation code included in the repository.

Citation

If you find our model/code/data/paper helpful, please consider citing our papers 📝 and starring us ⭐️!

```
@article{yu2023rlhf,
  title={Rlhf-v: Towards trustworthy mllms via behavior alignment from fine-grained correctional human feedback},
  author={Yu, Tianyu and Yao, Yuan and Zhang, Haoye and He, Taiwen and Han, Yifeng and Cui, Ganqu and Hu, Jinyi and Liu, Zhiyuan and Zheng, Hai-Tao and Sun, Maosong and others},
  journal={arXiv preprint arXiv:2312.00849},
  year={2023}
}

@article{yu2024rlaifv,
  title={RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness},
  author={Yu, Tianyu and Zhang, Haoye and Yao, Yuan and Dang, Yunkai and Chen, Da and Lu, Xiaoman and Cui, Ganqu and He, Taiwen and Liu, Zhiyuan and Chua, Tat-Seng and Sun, Maosong},
  journal={arXiv preprint arXiv:2405.17220},
  year={2024}
}
```
