CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations


By Leonard Salewski, A. Sophia Koepke, Hendrik Lensch and Zeynep Akata. Published in Springer LNAI xxAI and also presented at the CVPR 2022 Workshop on Explainable AI for Computer Vision (XAI4CV). A preprint is available on arXiv.

This repository is the official implementation of CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations. It contains code to generate the CLEVR-X dataset and a PyTorch dataset implementation.

Below is an example from the CLEVR dataset extended with CLEVR-X's natural language explanation:

Image: A synthetically rendered image of a small cyan metallic cylinder, a large purple metallic sphere, a large blue matte cube, a large brown matte cylinder and a large green metallic cylinder (from front to back) on an infinite flat matte gray surface.

Question: There is a purple metallic ball; what number of cyan objects are right of it?

Answer: 1

Explanation: There is a cyan cylinder which is on the right side of the purple metallic ball.

Overview

This repository contains instructions for:

  1. CLEVR-X Dataset Download
  2. CLEVR-X Dataset Generation
  3. Model Results
  4. Contributing & License
  5. Citation

CLEVR-X Dataset Download

The generated CLEVR-X dataset is available here: CLEVR-X dataset (~1.21 GB).

The download includes two JSON files, which contain the explanations for all CLEVR train and CLEVR validation questions (CLEVR_train_explanations_v0.7.10.json and CLEVR_val_explanations_v0.7.10.json, respectively). The general layout of the JSON files follows the original CLEVR JSON files. The info key contains general information, whereas the questions key contains the dataset itself. The latter is a list of dictionaries, where each dictionary is one sample of the CLEVR-X dataset.
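
As a minimal illustration of this layout (a sketch assuming the files have been downloaded to the working directory), the structure can be inspected like this:

```python
import json

# Load the CLEVR-X explanations generated for the CLEVR train questions.
with open("CLEVR_train_explanations_v0.7.10.json") as f:
    data = json.load(f)

print(data["info"])            # general dataset information
print(len(data["questions"]))  # number of samples

# Each sample is a dictionary; inspect the fields of the first one.
print(sorted(data["questions"][0].keys()))
```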

Furthermore, we provide two Python pickle files at the same link. These contain lists of the image indices of the CLEVR-X train and CLEVR-X validation subsets (both of which are part of the CLEVR train subset).

Note that we do not provide the images of the CLEVR dataset; they can be downloaded from the original CLEVR project page.

Obtaining the CLEVR-X Splits

As stated above, the two Python pickle files (train_images_ids_v0.7.10-recut.pkl and dev_images_ids_v0.7.10-recut.pkl) contain the image indices of all CLEVR-X train explanations and all CLEVR-X validation explanations.

Train

To obtain the train samples, iterate through the samples in CLEVR_train_explanations_v0.7.10.json and use those whose image_index is in the list contained in train_images_ids_v0.7.10-recut.pkl.

Validation

To obtain the validation samples, iterate through the samples in CLEVR_train_explanations_v0.7.10.json and use those whose image_index is in the list contained in dev_images_ids_v0.7.10-recut.pkl.

Test

All samples from the CLEVR validation subset (CLEVR_val_explanations_v0.7.10.json) are used for the CLEVR-X test subset.
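
Putting the three steps above together, a minimal loading sketch (assuming all downloaded files are in the working directory) could look like this:

```python
import json
import pickle

# Image indices defining the CLEVR-X train and validation subsets.
with open("train_images_ids_v0.7.10-recut.pkl", "rb") as f:
    train_image_ids = set(pickle.load(f))
with open("dev_images_ids_v0.7.10-recut.pkl", "rb") as f:
    dev_image_ids = set(pickle.load(f))

# Both CLEVR-X train and validation are drawn from the CLEVR train explanations.
with open("CLEVR_train_explanations_v0.7.10.json") as f:
    questions = json.load(f)["questions"]

train_samples = [q for q in questions if q["image_index"] in train_image_ids]
val_samples = [q for q in questions if q["image_index"] in dev_image_ids]

# The CLEVR validation explanations form the CLEVR-X test subset.
with open("CLEVR_val_explanations_v0.7.10.json") as f:
    test_samples = json.load(f)["questions"]
```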

CLEVR-X Dataset Generation

The following sections explain how to generate the CLEVR-X dataset.

Requirements

The required libraries for generating the CLEVR-X dataset can be found in the environment.yaml file. To create an environment and install the requirements, use conda:

```bash
conda env create --file environment.yaml
```

Activate it with:

```bash
conda activate clevr_explanations
```

CLEVR Dataset Download

As CLEVR-X uses the same questions and images as CLEVR, it is necessary to download the CLEVR dataset. Follow the instructions on the CLEVR dataset website to download the original dataset (images, scene graphs, and questions & answers). The extracted files should be located in a folder called CLEVR_v1.0, referred to as $CLEVR_ROOT below. For further instructions and information about the original CLEVR code, it can also be helpful to refer to the CLEVR GitHub repository.
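
As a quick sanity check (a sketch assuming the layout implied by the paths used in the commands below), the expected subfolders can be verified like this:

```python
import os

# $CLEVR_ROOT points at the extracted CLEVR_v1.0 folder.
clevr_root = os.environ.get("CLEVR_ROOT", "CLEVR_v1.0")

# The generation commands below read from scenes/ and questions/;
# images/ holds the rendered scenes themselves.
for sub in ("images", "scenes", "questions"):
    path = os.path.join(clevr_root, sub)
    print(f"{path}: {'found' if os.path.isdir(path) else 'missing'}")
```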

Training Subset

First, change into the question_generation directory:

```bash
cd question_generation
```

To generate explanations for the CLEVR training subset run this command:

```bash
python generate_explanations.py \
    --input_scene_file $CLEVR_ROOT/scenes/CLEVR_train_scenes.json \
    --input_questions_file $CLEVR_ROOT/questions/CLEVR_train_questions.json \
    --output_explanations_file $CLEVR_ROOT/questions/CLEVR_train_explanations_v0.7.13.json \
    --seed "43" \
    --metadata_file ./metadata.json
```

This generation takes about 6 hours on an Intel(R) Xeon(R) Gold 5220 CPU @ 2.20GHz. Note that setting the --log_to_dataframe flag to true may increase the generation time significantly, but allows dumping (parts of) the dataset as an HTML table.

Validation Subset

First, change into the question_generation directory:

```bash
cd question_generation
```

To generate explanations for the CLEVR validation subset run this command:

```bash
python generate_explanations.py \
    --input_scene_file $CLEVR_ROOT/scenes/CLEVR_val_scenes.json \
    --input_questions_file $CLEVR_ROOT/questions/CLEVR_val_questions.json \
    --output_explanations_file $CLEVR_ROOT/questions/CLEVR_val_explanations_v0.7.13.json \
    --seed "43" \
    --metadata_file ./metadata.json
```

This generation takes less than 1 hour on an Intel(R) Xeon(R) Gold 5220 CPU @ 2.20GHz. Note that setting the --log_to_dataframe flag to true may increase the generation time significantly, but allows dumping (parts of) the dataset as an HTML table.

Both commands use the --input_scene_file, --input_questions_file and --metadata_file provided by the original CLEVR dataset. You can use any name for the --output_explanations_file argument, but the dataloader expects it in the format CLEVR_<split>_explanations_<version>.json.
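
For illustration, a hypothetical helper (not part of this repository) that builds names in the expected format:

```python
def explanations_filename(split: str, version: str) -> str:
    """Build a file name in the CLEVR_<split>_explanations_<version>.json format."""
    return f"CLEVR_{split}_explanations_{version}.json"

print(explanations_filename("train", "v0.7.13"))
# -> CLEVR_train_explanations_v0.7.13.json
```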

Splits

Note that the original CLEVR test set does not have publicly accessible scene graphs and functional programs. Thus, we use the CLEVR validation set as the CLEVR-X test subset. The following code generates a new split of the CLEVR training set into the CLEVR-X training and validation subsets:

```bash
cd question_generation
python dev_split.py --root $CLEVR_ROOT
```

As each image comes with ten questions, the split is performed at the image level rather than over individual dataset samples. The code stores the image indices of each split in two separate Python pickle files (named train_images_ids_v0.7.10-recut.pkl and dev_images_ids_v0.7.10-recut.pkl). We have published our files alongside the dataset download and recommend using those indices.
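
The sketch below illustrates the idea of such an image-level split. It is hypothetical: dev_split.py implements the actual split, and the 90/10 ratio and output file names here are illustrative assumptions only.

```python
import pickle
import random

# Illustrative image-level split; dev_split.py implements the real one.
image_ids = list(range(70000))  # the CLEVR train subset contains 70,000 images
random.Random(43).shuffle(image_ids)  # seed chosen arbitrarily for this sketch

cut = int(0.9 * len(image_ids))  # assumed 90/10 ratio, for illustration only
train_ids, dev_ids = image_ids[:cut], image_ids[cut:]

with open("train_images_ids_example.pkl", "wb") as f:
    pickle.dump(train_ids, f)
with open("dev_images_ids_example.pkl", "wb") as f:
    pickle.dump(dev_ids, f)
```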

Results

Different baselines and VQA-X models achieve the following performance on CLEVR-X:

| Model name          | Accuracy | BLEU | METEOR | ROUGE-L | CIDEr |
| ------------------- | -------- | ---- | ------ | ------- | ----- |
| Random Words        | 3.6%     | 0.0  | 8.4    | 11.4    | 5.9   |
| Random Explanations | 3.6%     | 10.9 | 16.6   | 35.3    | 30.4  |
| PJ-X                | 80.3%    | 78.8 | 52.5   | 85.8    | 566.8 |
| FM                  | 63.0%    | 87.4 | 58.9   | 93.4    | 639.8 |

For more information on the baselines and models, check the respective publications and our CLEVR-X publication itself.

Contributing & License

For information on the license, please see the LICENSE file.

Citation

If you use CLEVR-X in any of your works, please use the following BibTeX entry to cite it:

```bibtex
@inproceedings{salewski2022clevrx,
    title     = {CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations},
    author    = {Leonard Salewski and A. Sophia Koepke and Hendrik P. A. Lensch and Zeynep Akata},
    booktitle = {xxAI - Beyond explainable Artificial Intelligence},
    pages     = {85--104},
    year      = {2022},
    publisher = {Springer}
}
```

You can also find our work on Google Scholar and Semantic Scholar.

