Authors: Jiaxu Zhao¹; Li Huang¹; Ruixuan Sun²; Liao Bing³ and Hong Qu¹
Affiliations: ¹University of Electronic Science and Technology of China, Chengdu 610054, China; ²Yelp Inc., U.S.A.; ³Chengdu Dajiangtong Technology Co., Ltd, China
Keyword(s): Neural Machine Translation, Exposure Bias, GAN.
Abstract: In recent years, Neural Machine Translation (NMT) has achieved great success, but two important problems cannot be ignored. One is the exposure bias caused by the mismatch between training and inference strategies; the other is that the NMT model generates the best candidate word for the current step, which may nevertheless be a poor element of the sentence as a whole. The popular methods for these two problems are Scheduled Sampling and Generative Adversarial Networks (GANs), respectively, and both have achieved some success. In this paper, we propose a more precise approach called "similarity selection", combined with a new GAN structure called twin-GAN, to solve the above two problems. The twin-GAN contains two generators and two discriminators. One generator uses "similarity selection", and the other decodes in the same way as at inference time (simulating the inference process). One discriminator guides the generators at the sentence level, and the other forces the two generators toward similar distributions. Moreover, we conducted extensive experiments on the IWSLT 2014 German→English (De→En) and WMT'17 Chinese→English (Zh→En) tasks, and the results show improved performance over several strong baseline models based on recurrent architectures.