eric-xw/Video-guided-Machine-Translation

Starter code for the VMT task and challenge


This repo contains the starter code for the VATEX Translation Challenge for Video-guided Machine Translation (VMT), which aims to translate a source-language description into the target language using video information as additional spatiotemporal context.

VMT is introduced in our ICCV oral paper "VATEX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research". VATEX is a new large-scale multilingual video description dataset, which contains over 41,250 videos and 825,000 captions in both English and Chinese; half of these captions are English-Chinese translation pairs. For more details, please check the latest version of the paper: https://arxiv.org/abs/1904.03493.

Prerequisites

  • Python 3.7
  • PyTorch 1.4 (1.0+)
  • nltk 3.4.5

Training

1. Download corpus files and the extracted video features

First, under the vmt/ directory, download the train/val/test json files:

./data/download.sh
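Once downloaded, each entry in the json files pairs a video ID with its English and Chinese captions. A minimal loading sketch (the filename and the videoID/enCap/chCap field names follow the public VATEX data release; adjust them if your download differs):

import json

# Inspect one annotation entry (filename is an assumption from the VATEX release).
with open("data/vatex_training_v1.0.json") as f:
    data = json.load(f)

entry = data[0]
print(entry["videoID"])   # YouTube ID with start/end timestamps
print(entry["enCap"][0])  # one of the English captions
print(entry["chCap"][0])  # one of the Chinese captions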

Then download the I3D video features from https://vatex-feats.s3.amazonaws.com/trainval.zip (trainval) and https://vatex-feats.s3.amazonaws.com/public_test.zip (test):

# set up your DIR/vatex_features for storing large video features
mkdir DIR/vatex_features
wget https://vatex-feats.s3.amazonaws.com/trainval.zip -P DIR/vatex_features
unzip DIR/vatex_features/trainval.zip
wget https://vatex-feats.s3.amazonaws.com/public_test.zip -P DIR/vatex_features
unzip DIR/vatex_features/public_test.zip
cd vmt/
ln -s DIR/vatex_features data/vatex_features
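After unzipping, a quick sanity check on the features; this sketch assumes each video's I3D features are stored as a per-video .npy array of shape (num_segments, 1024), as in the public VATEX feature release (the directory layout below is an assumption):

import glob
import numpy as np

# Count and inspect the extracted feature files (path layout is an assumption).
feat_files = glob.glob("data/vatex_features/trainval/*.npy")
print(len(feat_files), "feature files found")

feats = np.load(feat_files[0])
print(feats.shape)  # expected: (num_segments, 1024) segment-level I3D features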

2. Training

To train the baseline VMT model:

python train.py

The default hyperparameters are set in configs.yaml.
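If you prefer to inspect or override hyperparameters programmatically instead of editing configs.yaml by hand, a minimal PyYAML sketch (the learning_rate key is a hypothetical example, not necessarily the repo's actual schema):

import yaml

# Load the training configuration used by train.py.
with open("configs.yaml") as f:
    cfg = yaml.safe_load(f)
print(cfg)

# Example override (hypothetical key), then write the file back:
# cfg["learning_rate"] = 1e-4
# with open("configs.yaml", "w") as f:
#     yaml.safe_dump(cfg, f)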

Evaluation

Run

python eval.py

Specify the model name in configs.yaml. The script will generate a json file for submission to the VMT Challenge on CodaLab.
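The authoritative submission schema is defined by the challenge page; as an illustrative sketch only, a typical CodaLab workflow maps each test ID to its translated sentence and zips the json for upload (all field names below are assumptions, not the official format):

import json
import zipfile

# Hypothetical payload: test video/segment ID -> translated caption.
submission = {"example_video_id": "example translated sentence"}

with open("submission.json", "w") as f:
    json.dump(submission, f, ensure_ascii=False)

# CodaLab submissions are typically uploaded as a zip archive.
with zipfile.ZipFile("submission.zip", "w") as zf:
    zf.write("submission.json")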

Results

The baseline VMT model achieves the following corpus-level BLEU scores. (The numbers here differ slightly from those in the paper due to different evaluation setups; for a fair comparison, please use the numbers reported here.)

Model     EN -> ZH    ZH -> EN
BLEU-4    31.1        24.6

On the evaluation server, we report the cumulative corpus-level BLEU score (up to 4-gram) and each individual n-gram score for reference, shown as B-1, ..., B-4.

Model performance is evaluated by cumulative BLEU-4 score in the challenge.
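To approximate the server's scoring locally with the nltk prerequisite, the sketch below computes cumulative BLEU-4 and an individual n-gram score via corpus_bleu; note it uses naive whitespace-style token lists, while the official server's tokenizer (especially for Chinese) may differ:

from nltk.translate.bleu_score import corpus_bleu

# Each hypothesis is scored against a list of reference token lists.
references = [[["a", "man", "is", "playing", "the", "guitar"]]]
hypotheses = [["a", "man", "is", "playing", "guitar"]]

# Cumulative BLEU-4, the challenge's ranking metric.
bleu4 = corpus_bleu(references, hypotheses, weights=(0.25, 0.25, 0.25, 0.25))

# Individual 1-gram score, reported as B-1 on the server.
b1 = corpus_bleu(references, hypotheses, weights=(1, 0, 0, 0))
print("BLEU-4: %.3f  B-1: %.3f" % (bleu4, b1))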

Reference

Please cite our paper if you use our code or dataset:

@InProceedings{Wang_2019_ICCV,
  author    = {Wang, Xin and Wu, Jiawei and Chen, Junkun and Li, Lei and Wang, Yuan-Fang and Wang, William Yang},
  title     = {VaTeX: A Large-Scale, High-Quality Multilingual Dataset for Video-and-Language Research},
  booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
  month     = {October},
  year      = {2019}
}
