# Video-guided-Machine-Translation
Starter code for the VMT task and challenge.
This repo contains the starter code for the VATEX Translation Challenge for Video-guided Machine Translation (VMT), which aims to translate a source-language description into the target language using video information as additional spatiotemporal context.

VMT is introduced in our ICCV oral paper "VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research". VATEX is a new large-scale multilingual video description dataset, which contains over 41,250 videos and 825,000 captions in both English and Chinese; half of these captions are English-Chinese translation pairs. For more details, please check the latest version of the paper: https://arxiv.org/abs/1904.03493.
- Python 3.7
- PyTorch 1.4 (1.0+)
- nltk 3.4.5
First, under the vmt/ directory, download the train/val/test JSON files:
./data/download.sh
Then download the I3D video features from here for trainval and here for test:
```shell
# set up your DIR/vatex_features for storing large video features
mkdir DIR/vatex_features
wget https://vatex-feats.s3.amazonaws.com/trainval.zip -P DIR/vatex_features
unzip DIR/vatex_features/trainval.zip
wget https://vatex-feats.s3.amazonaws.com/public_test.zip -P DIR/vatex_features
unzip DIR/vatex_features/public_test.zip
cd vmt/
ln -s DIR/vatex_features data/vatex_features
```
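Once unzipped, the archives are expected to contain one NumPy array per video holding its segment-level I3D features (this layout, and the shape used below, are assumptions; verify against the actual unzipped contents). A minimal sketch of the access pattern, round-tripping a dummy array:

```python
import numpy as np

# Hypothetical example: one .npy feature file per video, shaped
# (num_segments, feature_dim). The sizes here are illustrative only.
dummy = np.random.rand(32, 1024).astype(np.float32)
np.save("example_feat.npy", dummy)

# Loading a feature file back, as the data loader would:
feats = np.load("example_feat.npy")
print(feats.shape)  # (32, 1024)
```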
To train the baseline VMT model:
python train.py
The default hyperparameters are set in configs.yaml.
Run
python eval.py
Specify the model name in configs.yaml. The script will generate a JSON file for submission to the VMT Challenge on CodaLab.
The baseline VMT model achieves the following corpus-level BLEU scores. (The numbers here differ slightly from those in the paper due to different evaluation setups; for a fair comparison, please compare against the numbers here.)
Model | EN -> ZH | ZH -> EN
---|---|---
BLEU-4 | 31.1 | 24.6
On the evaluation server, we report cumulative corpus-level BLEU score (up to 4-gram) and each individual n-gram score for reference, shown as B-1, ..., B-4.
Model performance is evaluated by cumulative BLEU-4 score in the challenge.
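The cumulative corpus-level BLEU-4 score described above can be computed with nltk (listed in the prerequisites). A minimal sketch, with toy tokenized sentences in place of real model output:

```python
from nltk.translate.bleu_score import corpus_bleu

# references: one list of reference token lists per hypothesis;
# hypotheses: one token list per translated sentence.
references = [[["a", "man", "is", "playing", "the", "guitar"]]]
hypotheses = [["a", "man", "is", "playing", "the", "guitar"]]

# Equal 1..4-gram weights give the cumulative BLEU-4 score.
score = corpus_bleu(references, hypotheses, weights=(0.25, 0.25, 0.25, 0.25))
print(score)  # 1.0 for an exact match
```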
Please cite our paper if you use our code or dataset:
```
@InProceedings{Wang_2019_ICCV,
  author    = {Wang, Xin and Wu, Jiawei and Chen, Junkun and Li, Lei and Wang, Yuan-Fang and Wang, William Yang},
  title     = {VaTeX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research},
  booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2019}
}
```