AI-in-Health/Patient-Instructions

[NeurIPS 2022] Code for "Retrieve, Reason, and Refine: Generating Accurate and Faithful Discharge/Patient Instructions"

Patient Instruction Generation

Code for our paper published in NeurIPS 2022 [arXiv]:

Retrieve, Reason, and Refine: Generating Accurate and Faithful Patient Instructions
(a.k.a. Retrieval-Augmented and Knowledge-Grounded Language Models for Faithful Clinical Medicine)
Fenglin Liu, Bang Yang, Chenyu You, Xian Wu, Shen Ge, Zhangdaihong Liu, Xu Sun*, Yang Yang*, and David A. Clifton.

Updates

[22-10-25]: We release the code and data.

Clone the repo

    git clone https://github.com/AI-in-Hospitals/Patient-Instructions.git

    # clone the following repo to calculate automatic metrics
    cd Patient-Instructions
    git clone https://github.com/ruotianluo/coco-caption.git

Environment

    conda create -n pi python==3.9
    conda activate pi
    pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
    pip install transformers==4.10.0
    pip install pytorch-lightning==1.5.1
    pip install pandas rouge scipy

    # if you want to reproduce our data preparation process
    pip install scikit-learn plotly

Higher versions of torch and CUDA can also work.
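
Since higher versions may also work, it can help to see how an existing environment differs from the pinned versions before training. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `compare_with_pins` are hypothetical, and the pins are copied from the install commands above with the `+cu101` build tag dropped.

```python
from importlib import metadata

# Versions pinned in the install commands above (local build tags such as +cu101 omitted).
PINNED = {
    "torch": "1.7.1",
    "transformers": "4.10.0",
    "pytorch-lightning": "1.5.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_with_pins(pins, lookup=installed_version):
    """Map each package to a tuple (pinned, installed, matches)."""
    report = {}
    for package, pinned in pins.items():
        found = lookup(package)
        # Strip a local build tag such as "+cu101" before comparing.
        base = found.split("+")[0] if found else None
        report[package] = (pinned, found, base == pinned)
    return report


if __name__ == "__main__":
    for package, (pinned, found, ok) in compare_with_pins(PINNED).items():
        status = "ok" if ok else "differs"
        print(f"{package}: pinned {pinned}, installed {found} ({status})")
```

A "differs" line is informational only; it does not by itself mean the code will fail.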

Download the data

As we cannot redistribute the raw MIMIC-III data, we release only the pre-processed dataset used in the paper at Google Drive (data.zip, 132MB). After downloading, unzip the data and place it according to the structure below:

    Patient-Instructions/ # the root of the repo
        data
        ├── README.md
        ├── prepare_dataset.ipynb
        ├── prepare_subtasks.ipynb
        ├── diagnose-procedure-medication
        │   ├── admDxMap_mimic3.pk      # Source: D_ICD_DIAGNOSES.csv
        │   ├── admMedMap_mimic3.pk     # Source: prescriptions.csv
        │   ├── admPrMap_mimic3.pk      # Source: D_ICD_PROCEDURES.csv
        │   └── readme.txt
        ├── splits                      # Source: NOTEEVENTS.csv
        │   ├── train.csv               # obtained by data/prepare_dataset.ipynb
        │   ├── val.csv                 # obtained by data/prepare_dataset.ipynb
        │   ├── test.csv                # obtained by data/prepare_dataset.ipynb
        │   └── subtasks
        │       ├── age                 # Source: NOTEEVENTS.csv
        │       │   └── ...             # obtained by data/prepare_subtasks.ipynb
        │       ├── sex                 # Source: NOTEEVENTS.csv
        │       │   └── ...             # obtained by data/prepare_subtasks.ipynb
        │       └── diseases            # Source: NOTEEVENTS.csv, D_ICD_DIAGNOSES.csv
        │           └── ...             # obtained by data/prepare_subtasks.ipynb
        └── vocab
            ├── special_tokens_map.json # obtained by data/prepare_dataset.ipynb
            ├── tokenizer_config.json   # obtained by data/prepare_dataset.ipynb
            └── vocab.txt               # obtained by data/prepare_dataset.ipynb

We also provide instructions to reproduce our data preparation process in data/README.md.
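
After unzipping, a quick check that the files landed where the code expects them can save a failed run. The sketch below is an illustration rather than a script shipped with the repository: the function name `missing_files` is hypothetical, and the paths are copied from the directory structure above (the subtask folders are skipped because their contents are elided there).

```python
from pathlib import Path

# Paths taken from the directory structure above, relative to the repo root.
EXPECTED = [
    "data/diagnose-procedure-medication/admDxMap_mimic3.pk",
    "data/diagnose-procedure-medication/admMedMap_mimic3.pk",
    "data/diagnose-procedure-medication/admPrMap_mimic3.pk",
    "data/splits/train.csv",
    "data/splits/val.csv",
    "data/splits/test.csv",
    "data/vocab/special_tokens_map.json",
    "data/vocab/tokenizer_config.json",
    "data/vocab/vocab.txt",
]


def missing_files(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected data files are in place.")
```

Run it from the root of the repo so that the relative paths resolve correctly.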

Pretreatments

Run the following commands to prepare some necessary files:

    # Generate the adjacency matrix of all unique diagnosis, medication, and procedure codes
    # This is essential if we want to use the knowledge graph to assist PI-Writer
    python pretreatments/prepare_codes_adjacent_matrix.py

    # Get top-300 most similar admission records from the training set for each query hospital admission
    # This is essential if we want to retrieve historical PIs to assist PI-Writer
    python pretreatments/prepare_relevant_info.py

    # Use bert-base-uncased to extract sentence-level embeddings of PIs
    # We apply max pooling on the word embs of the last layer
    python pretreatments/extract_instruction_embs.py
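
To make the last step concrete, max pooling turns a sequence of word embeddings into one sentence-level vector by taking the maximum of each embedding dimension across tokens. The sketch below illustrates the idea in plain Python on toy numbers; it is not the code in extract_instruction_embs.py. The function name `masked_max_pool` is hypothetical, and skipping padding positions via an attention mask is an assumption about how such pooling is usually done.

```python
def masked_max_pool(token_embeddings, attention_mask):
    """Element-wise max over the token vectors whose mask entry is 1.

    token_embeddings: list of token vectors (one list of floats per token).
    attention_mask:   list of 0/1 flags, 1 for real tokens and 0 for padding.
    """
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("at least one unmasked token is required")
    # zip(*kept) groups the values by embedding dimension.
    return [max(dim_values) for dim_values in zip(*kept)]


# Toy example: 3 tokens with 4-dimensional embeddings; the last token is padding.
embeddings = [
    [0.1, -0.5, 0.3, 0.0],
    [0.4, -0.2, -0.1, 0.2],
    [9.0, 9.0, 9.0, 9.0],  # padding position, must be ignored
]
mask = [1, 1, 0]
sentence_embedding = masked_max_pool(embeddings, mask)
print(sentence_embedding)  # [0.4, -0.2, 0.3, 0.2]
```

With bert-base-uncased, the token vectors would come from the last hidden layer, so each sentence embedding would have 768 dimensions.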

Training

Here are some key arguments for running train.py:

gpus: specify the number of gpus;
batch_size: specify the number of samples in a batch;
accumulate_grad_batches: accumulate gradients over this many batches before each optimizer step; use it if you don't have much GPU memory;
arch: specify the architecture, can be either small (hidden size = 256) or base (hidden size = 512). See configs/archs;
setup: specify which setup to use. See options in config/setups.yaml, where we provide setups for model variants such as the Transformer-based transformer and transformer_Full and the LSTM-based lstm and lstm_Full.

Here are some examples:

    python train.py --gpus 8 --batch_size 8 --arch base --setup transformer
    python train.py --gpus 8 --batch_size 8 --arch base --setup transformer_Full
    python train.py --gpus 8 --batch_size 4 --accumulate_grad_batches 2 --arch base --setup transformer_Full
    python train.py --gpus 8 --batch_size 8 --arch small --setup lstm
    python train.py --gpus 8 --batch_size 8 --arch small --setup lstm_Full
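
The third example halves batch_size and sets accumulate_grad_batches to 2. The arithmetic below shows why this can keep the number of samples per optimizer step unchanged. It is a sketch under one assumption that the text above does not state: that batch_size is counted per GPU, as is usual in data-parallel training. The function name `effective_batch_size` is hypothetical.

```python
def effective_batch_size(gpus, batch_size, accumulate_grad_batches=1):
    """Samples contributing to one optimizer step, assuming batch_size is per GPU."""
    return gpus * batch_size * accumulate_grad_batches


# Second example command: 8 GPUs, 8 samples each, no accumulation.
print(effective_batch_size(gpus=8, batch_size=8))  # 64

# Third example command: 8 GPUs, 4 samples each, gradients accumulated over 2 batches.
print(effective_batch_size(gpus=8, batch_size=4, accumulate_grad_batches=2))  # 64
```

Under that assumption, the same rule can be used to pick a batch_size and accumulate_grad_batches pair when fewer GPUs are available.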

Evaluation

The simplest command below reports the automatic metrics (BLEU, METEOR, and ROUGE) and writes them to ./csv_results/overall.csv.

python translate.py $path_to_model

You can save the generated patient instructions by running:

    # The output file will be saved to `./inference_results/preds_and_scores.json` in this case
    python translate.py $path_to_model --save_json --save_base_path ./inference_results --save_folder "" --json_file_name preds_and_scores.json

You can evaluate the model on subtasks (see data/README.md for details) by passing the argument --subtask_type:

    python translate.py $path_to_model --subtask_type age
    python translate.py $path_to_model --subtask_type sex
    python translate.py $path_to_model --subtask_type disease
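
When all three subtasks need to be evaluated for the same checkpoint, the commands above can be generated and run in a loop. This is a minimal sketch rather than a script from the repository: the helper names `build_commands` and `run_all` are hypothetical, and only the `--subtask_type` values shown above are used.

```python
import subprocess
import sys

SUBTASK_TYPES = ["age", "sex", "disease"]  # values used in the commands above


def build_commands(path_to_model, subtask_types=SUBTASK_TYPES):
    """Return one translate.py argument list per subtask type."""
    return [
        [sys.executable, "translate.py", path_to_model, "--subtask_type", subtask]
        for subtask in subtask_types
    ]


def run_all(path_to_model):
    """Run the evaluations one after another, stopping at the first failure."""
    for command in build_commands(path_to_model):
        subprocess.run(command, check=True)
```

Calling `run_all` with the same value you would pass as $path_to_model, from the root of the repo, runs the three evaluations in sequence.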

Bugs or Questions?

If you encounter any problems when using the code, or want to report a bug, you can open an issue or email {yangbang@pku.edu.cn,fenglinliu98@pku.edu.cn}. Please try to specify the problem with details so we can help you better and quicker!

Citation

Please consider citing our paper if our code or datasets are useful to your work. Thank you sincerely!

    @inproceedings{liu2022retrieve,
      title={Retrieve, Reason, and Refine: Generating Accurate and Faithful Patient Instructions},
      author={Liu, Fenglin and Yang, Bang and You, Chenyu and Wu, Xian and Ge, Shen and Liu, Zhangdaihong and Sun, Xu and Yang, Yang and Clifton, David A.},
      booktitle={Advances in Neural Information Processing Systems},
      year={2022}
    }

Acknowledgements

We borrow some code from Shivanandroy/simpleT5.

About

Releases: v1.0.0 (latest), Jul 17, 2023
