
[BIONLP@ACL 2024] XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.


Omkar Thawakar*, Abdelrahman Shaker*, Sahal Shaji Mullappilly*, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Jorma Laaksonen, and Fahad Shahbaz Khan.

*Equal Contribution

Mohamed bin Zayed University of Artificial Intelligence, UAE

YouTube

🚀 News


  • Aug-04 : Our paper has been accepted at BIONLP@ACL 2024 🔥
  • Jun-14 : Our technical report is released here. 🔥🔥
  • May-25 : Our technical report will be released very soon. Stay tuned!
  • May-19 : Our code, models, and pre-processed report summaries are released.

Online Demo

You can try our demo using the provided examples or by uploading your own X-ray here: Link-1 | Link-2 | Link-3.

About XrayGPT


  • XrayGPT aims to stimulate research on the automated analysis of chest radiographs based on a given X-ray.
  • The LLM (Vicuna) is fine-tuned on medical data (100k real conversations between patients and doctors) and ~30k radiology conversations to acquire domain-specific and relevant features.
  • We generate interactive and clean summaries (~217k) from free-text radiology reports of two datasets (MIMIC-CXR and OpenI). These summaries serve to enhance the performance of LLMs through fine-tuning the linear transformation layer on high-quality data. For more details regarding our high-quality summaries, please check Dataset Creation.
  • We align a frozen medical visual encoder (MedCLIP) with the fine-tuned LLM (Vicuna) using a simple linear transformation, as sketched below.

(Figure: XrayGPT architecture overview.)
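For intuition, here is a minimal PyTorch sketch of this alignment. It is illustrative only, not the repository's actual code: the class name and the dimension defaults (`visual_dim`, `llm_dim`) are assumptions.

```python
# Minimal sketch of the XrayGPT-style alignment (illustrative, not the repo's API):
# features from the frozen visual encoder are projected into the LLM's embedding
# space by a single trainable linear layer.
import torch
import torch.nn as nn

class VisualAligner(nn.Module):
    def __init__(self, visual_dim: int = 768, llm_dim: int = 4096):
        super().__init__()
        # The only trained component; the visual encoder and the LLM stay frozen.
        self.proj = nn.Linear(visual_dim, llm_dim)

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, visual_dim) from the frozen encoder.
        return self.proj(patch_features)  # (batch, num_patches, llm_dim)
```

Because only this projection receives gradients, the two training stages described below are inexpensive relative to full model training.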

Getting Started

Installation

1. Prepare the code and the environment

Clone the repository and create an Anaconda environment

```
git clone https://github.com/mbzuai-oryx/XrayGPT.git
cd XrayGPT
conda env create -f env.yml
conda activate xraygpt
```

OR

```
git clone https://github.com/mbzuai-oryx/XrayGPT.git
cd XrayGPT
conda create -n xraygpt python=3.9
conda activate xraygpt
pip install -r xraygpt_requirements.txt
```

Setup

1. Prepare the Datasets for training

Refer to dataset_creation for more details.

Download the preprocessed annotations: mimic & openi. The respective image folders contain the images from each dataset.

The following will be the final dataset folder structure:

```
dataset
├── mimic
|    ├── image
|    |   ├── abea5eb9-b7c32823-3a14c5ca-77868030-69c83139.jpg
|    |   ├── 427446c1-881f5cce-85191ce1-91a58ba9-0a57d3f5.jpg
|    |   .....
|    ├── filter_cap.json
├── openi
|    ├── image
|    |   ├── 1.jpg
|    |   ├── 2.jpg
|    |   .....
|    ├── filter_cap.json
...
```
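As a rough guide, each `filter_cap.json` can be read as below. The exact schema is an assumption based on the MiniGPT-4-style annotation layout (a top-level `annotations` list of `image_id`/`caption` entries), so adjust the keys if your files differ.

```python
# Hypothetical reader for filter_cap.json; assumes a MiniGPT-4-style schema:
# {"annotations": [{"image_id": "...", "caption": "..."}, ...]}
import json
from pathlib import Path

def load_pairs(split_dir):
    split_dir = Path(split_dir)
    with open(split_dir / "filter_cap.json") as f:
        annotations = json.load(f)["annotations"]
    # Pair each report summary with its image file on disk.
    return [(split_dir / "image" / f"{a['image_id']}.jpg", a["caption"])
            for a in annotations]

pairs = load_pairs("dataset/mimic")
print(len(pairs), "image-text pairs")
```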

2. Prepare the pretrained Vicuna weights

We built XrayGPT on the v1 version of Vicuna-7B. We fine-tuned Vicuna using curated radiology report samples. Download the Vicuna weights from vicuna_weights. The final weights should be in a single folder with a structure similar to the following:

```
vicuna_weights
├── config.json
├── generation_config.json
├── pytorch_model.bin.index.json
├── pytorch_model-00001-of-00003.bin
...
```

Then, set the path to the Vicuna weights in the model config file "xraygpt/configs/models/xraygpt.yaml" at Line 16.
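As an optional sanity check (not part of the official setup), the folder above should load with Hugging Face `transformers`:

```python
# Optional sanity check that the assembled Vicuna weights load correctly.
# Note: this loads the full 7B model into memory.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./vicuna_weights", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("./vicuna_weights")
print(model.config.model_type)  # expect "llama" for Vicuna-7B
```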

To fine-tune Vicuna on radiology samples, please download our curated radiology and medical_healthcare conversational samples and refer to the original Vicuna repo for fine-tuning: Vicuna_Finetune
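For reference, FastChat-style Vicuna fine-tuning consumes conversation JSON shaped roughly like the sample below. Treat the keys and the sample text as assumptions and defer to the Vicuna repo's documented data format.

```python
# One training sample in the assumed FastChat/Vicuna conversation schema
# (verify against the Vicuna repo before fine-tuning).
sample = {
    "id": "radiology-0001",
    "conversations": [
        {"from": "human", "value": "Summarize the findings of this chest X-ray report."},
        {"from": "gpt", "value": "The lungs are clear and the cardiac silhouette is within normal limits."},
    ],
}
```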

3. Download the pretrained MiniGPT-4 checkpoint

Download the pretrained MiniGPT-4 checkpoint: ckpt

4. Training of XrayGPT

A. First MIMIC pretraining stage

In the first pretraining stage, the model is trained using image-text pairs from the preprocessed MIMIC dataset.

To launch the first stage training, run the following command. In our experiments, we use 4 AMD MI250X GPUs.

```
torchrun --nproc-per-node NUM_GPU train.py --cfg-path train_configs/xraygpt_mimic_pretrain.yaml
```
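For example, with the 4 GPUs used in our experiments, set NUM_GPU to 4: `torchrun --nproc-per-node 4 train.py --cfg-path train_configs/xraygpt_mimic_pretrain.yaml`.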

B. Second OpenI finetuning stage

In the second stage, we use a small, high-quality image-text pair dataset from OpenI, preprocessed by us.

Run the following command. In our experiments, we use an AMD MI250X GPU.

```
torchrun --nproc-per-node NUM_GPU train.py --cfg-path train_configs/xraygpt_openi_finetune.yaml
```

Launching Demo on local machine

Download the pretrained XrayGPT checkpoint: link

Set this checkpoint path in "eval_configs/xraygpt_eval.yaml".

Try the Gradio demo.py on your local machine with the following:

```
python demo.py --cfg-path eval_configs/xraygpt_eval.yaml --gpu-id 0
```
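Once running, Gradio prints a local URL (typically http://127.0.0.1:7860) that you can open in a browser to upload an X-ray and chat with the model.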

Examples

(Figures: example 1, example 2, example 3, example 4.)

Acknowledgement


  • MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models. We built our model on top of MiniGPT-4.
  • MedCLIP: Contrastive Learning from Unpaired Medical Images and Texts. We used the medical-aware image encoder from MedCLIP.
  • BLIP-2: The model architecture of XrayGPT follows BLIP-2.
  • Lavis: This repository is built upon Lavis!
  • Vicuna: The fantastic language ability of Vicuna is just amazing, and it is open-source!

Citation

If you're using XrayGPT in your research or applications, please cite using this BibTeX:

```
@article{Omkar2023XrayGPT,
  title={XrayGPT: Chest Radiographs Summarization using Large Medical Vision-Language Models},
  author={Omkar Thawkar and Abdelrahman Shaker and Sahal Shaji Mullappilly and Hisham Cholakkal and Rao Muhammad Anwer and Salman Khan and Jorma Laaksonen and Fahad Shahbaz Khan},
  journal={arXiv: 2306.07971},
  year={2023}
}
```

License

This repository is licensed under CC BY-NC-SA. Please refer to the license terms here.
