Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction


elbayadm/attn2d


This is a fork of Fairseq(-py) with implementations of the following models:

An NMT model with two-dimensional convolutions that jointly encodes the source and the target sequences.

Pervasive Attention also produces an extensive decoding grid that we leverage to efficiently train wait-k models.

See the README.
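To illustrate the joint encoding, here is a minimal stdlib-only sketch (not this repo's implementation, which uses masked 2D convolutions in PyTorch): each cell (t, s) of the grid combines the t-th target embedding with the s-th source embedding, and a stack of 2D convolutions then refines the grid.

```python
def build_joint_grid(src_embs, tgt_embs):
    """Build the 2D source-target grid at the heart of Pervasive Attention:
    cell (t, s) concatenates the t-th target embedding with the s-th source
    embedding, giving a |tgt| x |src| x (d_tgt + d_src) grid. The actual
    model applies target-masked 2D convolutions on top of this grid."""
    return [[tgt + src for src in src_embs] for tgt in tgt_embs]

# Toy 2-dimensional embeddings: 3-token source, 2-token target.
src = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
tgt = [[1.0, 1.1], [1.2, 1.3]]
grid = build_joint_grid(src, tgt)
# The grid has one row per target token and one column per source token,
# and each cell has d_tgt + d_src = 4 features.
assert len(grid) == 2 and len(grid[0]) == 3 and len(grid[0][0]) == 4
```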

Efficient Wait-k Models for Simultaneous Machine Translation

Transformer Wait-k models (Ma et al., 2019) with unidirectional encoders and with joint training of multiple wait-k paths.

See the README.
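For intuition, a wait-k policy first reads k source tokens, then alternates one write with one read until the source is exhausted. A minimal sketch of the resulting read schedule (a hypothetical helper for illustration, not this repo's API):

```python
def waitk_schedule(src_len, tgt_len, k):
    """Number of source tokens visible when writing target token t
    (1-indexed) under a wait-k policy: min(k + t - 1, src_len)."""
    return [min(k + t - 1, src_len) for t in range(1, tgt_len + 1)]

# With k=2 on a 5-token source, the decoder sees 2, 3, 4, then all 5
# source tokens as it emits 4 target tokens.
print(waitk_schedule(5, 4, 2))  # [2, 3, 4, 5]
```

Joint training of multiple wait-k paths amounts to sharing one model across several such schedules (different k) within the same decoding grid.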

Fairseq Requirements and Installation

  • PyTorch version >= 1.4.0
  • Python version >= 3.6
  • For training new models, you'll also need an NVIDIA GPU and NCCL

Installing Fairseq

```bash
git clone https://github.com/elbayadm/attn2d
cd attn2d
pip install --editable .
```

License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Citation

For Pervasive Attention, please cite:

```bibtex
@InProceedings{elbayad18conll,
  author    = "Elbayad, Maha and Besacier, Laurent and Verbeek, Jakob",
  title     = "Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction",
  booktitle = "Proceedings of the 22nd Conference on Computational Natural Language Learning",
  year      = "2018",
}
```

For our wait-k models, please cite:

```bibtex
@article{elbayad20waitk,
  title   = {Efficient Wait-k Models for Simultaneous Machine Translation},
  author  = {Elbayad, Maha and Besacier, Laurent and Verbeek, Jakob},
  journal = {arXiv preprint arXiv:2005.08595},
  year    = {2020},
}
```

For Fairseq, please cite:

```bibtex
@inproceedings{ott2019fairseq,
  title     = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author    = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year      = {2019},
}
```
