Transformers

This model was released on 2020-01-13 and added to Hugging Face Transformers on 2020-11-16.

ProphetNet

Overview

The ProphetNet model was proposed in ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training by Yu Yan, Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, and Ming Zhou on 13 Jan, 2020.

ProphetNet is an encoder-decoder model that can predict n future tokens for “ngram” language modeling instead of just the next token.
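As a rough illustration of what "predicting n future tokens" means for the training targets, the sketch below lists, for every position, the next n tokens that would be predicted from it. The function name `future_ngram_targets` and the use of `None` past the end of the sequence are assumptions made for this example; it is a toy layout of the targets, not the library's implementation.

```python
from typing import List, Optional


def future_ngram_targets(tokens: List[str], n: int = 2) -> List[List[Optional[str]]]:
    """For each position t, collect the next n tokens (None once the sequence ends).

    Hypothetical helper for illustration only; not part of the Transformers API.
    """
    targets = []
    for t in range(len(tokens)):
        window = []
        for k in range(1, n + 1):
            idx = t + k
            # Positions past the end of the sequence have no target.
            window.append(tokens[idx] if idx < len(tokens) else None)
        targets.append(window)
    return targets


tokens = ["the", "cat", "sat", "down"]
for tok, tgt in zip(tokens, future_ngram_targets(tokens, n=2)):
    print(tok, "->", tgt)
```

With `n=1` the helper reduces to ordinary next-token targets, which is the case the sentence above contrasts against.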

The abstract from the paper is the following:

In this paper, we present a new sequence-to-sequence pretraining model called ProphetNet, which introduces a novel self-supervised objective named future n-gram prediction and the proposed n-stream self-attention mechanism. Instead of the optimization of one-step ahead prediction in traditional sequence-to-sequence model, the ProphetNet is optimized by n-step ahead prediction which predicts the next n tokens simultaneously based on previous context tokens at each time step. The future n-gram prediction explicitly encourages the model to plan for the future tokens and prevent overfitting on strong local correlations. We pre-train ProphetNet using a base scale dataset (16GB) and a large scale dataset (160GB) respectively. Then we conduct experiments on CNN/DailyMail, Gigaword, and SQuAD 1.1 benchmarks for abstractive summarization and question generation tasks. Experimental results show that ProphetNet achieves new state-of-the-art results on all these datasets compared to the models using the same scale pretraining corpus.

The authors’ code can be found here.

Usage tips

ProphetNet is a model with absolute position embeddings, so it’s usually advised to pad the inputs on the right rather than the left.
The model architecture is based on the original Transformer, but replaces the “standard” self-attention mechanism in the decoder with a main self-attention mechanism and an n-stream (predict) self-attention mechanism.
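The padding advice can be illustrated with a small, library-free sketch. It pads a batch of token id lists on either side and reports the absolute position at which each sequence's first real token lands. The helper names and the padding id `0` are assumptions chosen for the example and do not reflect the tokenizer's actual values.

```python
from typing import List, Tuple

PAD_ID = 0  # hypothetical padding id, chosen only for this example


def pad_batch(
    batch: List[List[int]], side: str = "right", pad_id: int = PAD_ID
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad every sequence to the longest length; return (input_ids, attention_mask)."""
    if side not in ("right", "left"):
        raise ValueError("side must be 'right' or 'left'")
    max_len = max(len(seq) for seq in batch)
    input_ids, attention_mask = [], []
    for seq in batch:
        pad = [pad_id] * (max_len - len(seq))
        real_mask, pad_mask = [1] * len(seq), [0] * len(pad)
        if side == "right":
            input_ids.append(list(seq) + pad)
            attention_mask.append(real_mask + pad_mask)
        else:
            input_ids.append(pad + list(seq))
            attention_mask.append(pad_mask + real_mask)
    return input_ids, attention_mask


def first_real_position(mask: List[int]) -> int:
    """Absolute position index of the first non-padding token."""
    return mask.index(1)


batch = [[5, 6, 7], [8]]
for side in ("right", "left"):
    ids, mask = pad_batch(batch, side=side)
    starts = [first_real_position(m) for m in mask]
    print(side, ids, "first real token at positions", starts)
```

With right padding every sequence starts at position 0, so each real token sees the same absolute position it would have seen unpadded. With left padding the start position depends on how much padding the batch needed, which is one reason right padding is the usual recommendation for models with absolute position embeddings.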

Resources

ProphetNetConfig

class transformers.ProphetNetConfig

ProphetNetTokenizer

class transformers.ProphetNetTokenizer

build_inputs_with_special_tokens

convert_tokens_to_string

get_special_tokens_mask

ProphetNet specific outputs

class transformers.models.prophetnet.modeling_prophetnet.ProphetNetSeq2SeqLMOutput

class transformers.models.prophetnet.modeling_prophetnet.ProphetNetSeq2SeqModelOutput

class transformers.models.prophetnet.modeling_prophetnet.ProphetNetDecoderModelOutput

class transformers.models.prophetnet.modeling_prophetnet.ProphetNetDecoderLMOutput

ProphetNetModel

class transformers.ProphetNetModel

forward

ProphetNetEncoder

class transformers.ProphetNetEncoder

forward

ProphetNetDecoder

class transformers.ProphetNetDecoder

forward

ProphetNetForConditionalGeneration

class transformers.ProphetNetForConditionalGeneration

forward

ProphetNetForCausalLM

class transformers.ProphetNetForCausalLM

forward