[NAACL 2022 Findings] Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction

zjunlp/HVPNeT

Code for the NAACL 2022 (Findings) paper "Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction".

Model Architecture

The overall architecture of our hierarchical modality fusion network.

Requirements

To run the code, install the requirements:

pip install -r requirements.txt

Data Preprocess

To extract visual object images, we first use the NLTK parser to extract noun phrases from the text and then apply the visual grounding toolkit to detect objects. Detailed steps are as follows:

  1. Use the NLTK parser (or spaCy, TextBlob) to extract noun phrases from the text (see the sketch after this list).
  2. Apply the visual grounding toolkit to detect objects. Taking the twitter2015 dataset as an example, the extracted objects are stored in twitter2015_aux_images. The object images follow the naming format imgname_pred_yolo_crop_num.png, where imgname is the name of the raw image the object comes from and num is the index of the object predicted by the toolkit. (Note that in train/val/test.txt, text and raw image have a one-to-one relationship, so imgname can be used as a unique identifier for the raw images.)
  3. Establish the correspondence between the raw images and the objects. We construct a dictionary that records, for each raw image, its detected object images. Taking twitter2015/twitter2015_train_dict.pth as an example, the dictionary has the format {imgname: ['imgname_pred_yolo_crop_num0.png', 'imgname_pred_yolo_crop_num1.png', ...]}, where each key is the name of a raw image and each value is the list of its object images.
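A minimal sketch of step 1 (noun phrase extraction), assuming NLTK's default tokenizer, POS tagger, and a simple regex chunk grammar; the grammar and the example sentence are illustrative only, not necessarily the authors' exact configuration:

import nltk

# One-time resource downloads (uncomment if the resources are missing):
# nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

# Chunk grammar: optional determiner, any adjectives, one or more nouns.
chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>+}")

def extract_noun_phrases(text):
    tagged = nltk.pos_tag(nltk.word_tokenize(text))
    tree = chunker.parse(tagged)
    return [" ".join(tok for tok, _ in np.leaves())
            for np in tree.subtrees(filter=lambda t: t.label() == "NP")]

print(extract_noun_phrases("a man riding a horse on the beach"))
# -> ['a man', 'a horse', 'the beach']

The extracted phrases are then passed to the visual grounding toolkit in step 2 to obtain the object crops.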

The detected objects and the dictionary of the correspondence between the raw images and the objects are available in our data links.
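To sanity-check the downloaded files, the correspondence dictionary can be inspected directly. The sketch below assumes the .pth file holds a plain Python dict saved with torch.save and that the data is placed under data/NER_data as shown in the file structure below:

import torch

# Map from raw image name to the file names of its cropped object images.
img2objs = torch.load("data/NER_data/twitter2015/twitter2015_train_dict.pth")

imgname = next(iter(img2objs))            # pick any raw-image key
print(imgname, "->", img2objs[imgname])   # e.g. ['..._pred_yolo_crop_0.png', ...]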

Data Download

  • Twitter2015 & Twitter2017

    The text data follows the conll format. You can download the Twitter2015 data via this link and the Twitter2017 data via this link. Please place them in data/NER_data.

    You can also put them anywhere and modify the path configuration in run.py.

  • MNRE

    The MNRE dataset comes from MEGA, many thanks.

    You can download the MRE dataset with detected visual objects from Google Drive or use the following commands:

    cd data
    wget 120.27.214.45/Data/re/multimodal/data.tar.gz
    tar -xzvf data.tar.gz
    mv data RE_data

The expected structure of files is:

HMNeT
 |-- data
 |    |-- NER_data
 |    |    |-- twitter2015  # text data
 |    |    |    |-- train.txt
 |    |    |    |-- valid.txt
 |    |    |    |-- test.txt
 |    |    |    |-- twitter2015_train_dict.pth  # {imgname: [object-image]}
 |    |    |    |-- ...
 |    |    |-- twitter2015_images       # raw image data
 |    |    |-- twitter2015_aux_images   # object image data
 |    |    |-- twitter2017
 |    |    |-- twitter2017_images
 |    |    |-- twitter2017_aux_images
 |    |-- RE_data
 |    |    |-- img_org          # raw image data
 |    |    |-- img_vg           # object image data
 |    |    |-- txt              # text data
 |    |    |-- ours_rel2id.json # relation data
 |-- models   # models
 |    |-- bert_model.py
 |    |-- modeling_bert.py
 |-- modules
 |    |-- metrics.py    # metric
 |    |-- train.py      # trainer
 |-- processor
 |    |-- dataset.py    # processor, dataset
 |-- logs     # code logs
 |-- run.py   # main
 |-- run_ner_task.sh
 |-- run_re_task.sh

Train

NER Task

The data path and GPU-related configuration are in run.py. To train the NER model, run the following scripts:

bash run_twitter15.sh
bash run_twitter17.sh

RE Task

To train the RE model, run the following script:

bash run_re_task.sh

Test

NER Task

To test the NER model, use the trained model, set load_path to the checkpoint path, and run the following script:

python -u run.py \
      --dataset_name="twitter15/twitter17" \
      --bert_name="bert-base-uncased" \
      --seed=1234 \
      --only_test \
      --max_seq=80 \
      --use_prompt \
      --prompt_len=4 \
      --sample_ratio=1.0 \
      --load_path='your_ner_ckpt_path'

RE Task

To test the RE model, use the trained model, set load_path to the checkpoint path, and run the following script:

python -u run.py \
      --dataset_name="MRE" \
      --bert_name="bert-base-uncased" \
      --seed=1234 \
      --only_test \
      --max_seq=80 \
      --use_prompt \
      --prompt_len=4 \
      --sample_ratio=1.0 \
      --load_path='your_re_ckpt_path'

Acknowledgement

The acquisition of the Twitter15 and Twitter17 data refers to the code from UMT, many thanks.

The acquisition of the MNRE data for the multimodal relation extraction task refers to the code from MEGA, many thanks.

Papers for the Project & How to Cite

If you use or extend our work, please cite the paper as follows:

@inproceedings{DBLP:conf/naacl/ChenZLYDTHSC22,
  author    = {Xiang Chen and
               Ningyu Zhang and
               Lei Li and
               Yunzhi Yao and
               Shumin Deng and
               Chuanqi Tan and
               Fei Huang and
               Luo Si and
               Huajun Chen},
  editor    = {Marine Carpuat and
               Marie{-}Catherine de Marneffe and
               Iv{\'{a}}n Vladimir Meza Ru{\'{\i}}z},
  title     = {Good Visual Guidance Make {A} Better Extractor: Hierarchical Visual
               Prefix for Multimodal Entity and Relation Extraction},
  booktitle = {Findings of the Association for Computational Linguistics: {NAACL}
               2022, Seattle, WA, United States, July 10-15, 2022},
  pages     = {1607--1618},
  publisher = {Association for Computational Linguistics},
  year      = {2022},
  url       = {https://doi.org/10.18653/v1/2022.findings-naacl.121},
  doi       = {10.18653/v1/2022.findings-naacl.121},
  timestamp = {Tue, 23 Aug 2022 08:36:33 +0200},
  biburl    = {https://dblp.org/rec/conf/naacl/ChenZLYDTHSC22.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
