Disclosure of Invention
The embodiment of the application provides a method, a device, a terminal and a storage medium for training a scoring model, and solves the problem that a scoring model cannot be trained when no reference score corresponding to a sample translation is available. The technical scheme is as follows:
in a first aspect, an embodiment of the present application provides a method for training a scoring model, where the method includes:
acquiring a sample original text, a first sample translation and at least one second sample translation, wherein the semantics of the first sample translation are the same as the semantics of the sample original text, and the semantics of the second sample translation are different from the semantics of the first sample translation;
inputting the sample original text and the first sample translation into a scoring model to obtain a first sample score corresponding to the first sample translation, and inputting the sample original text and each second sample translation into the scoring model to obtain a second sample score corresponding to each second sample translation;
determining loss information based on the first sample score and at least one second sample score;
based on the loss information, a scoring model is adjusted.
Optionally, before the obtaining the sample original text, the first sample translation and the at least one second sample translation, the method further includes:
acquiring a first sample text vector corresponding to the first sample translation and a Gaussian noise vector;
adding the first sample text vector and the Gaussian noise vector to obtain a first sample text vector after noise addition;
and inputting the first sample text vector after noise addition and the first sample text vector into a pre-trained denoising self-encoder to obtain the second sample translation.
Optionally, before the obtaining the sample original text, the first sample translation and the at least one second sample translation, the method further includes:
acquiring a first sample text vector corresponding to the first sample translation;
randomly corrupting the first sample translation to obtain a corrupted first sample translation;
determining a second sample text vector corresponding to the corrupted first sample translation;
and inputting the first sample text vector and the second sample text vector into a pre-trained denoising self-encoder to obtain a second sample translation.
Optionally, the determining loss information based on the first sample score and at least one second sample score includes:
determining the loss information based on the first sample score, the at least one second sample score, and a first preset formula;
the first preset formula is L = Σ_{x∈D} −(p_x × log(W_x × h(x)) + (1 − p_x) × log(1 − W_x × h(x)));
Wherein L is the loss information, D is a sample translation set composed of the first sample translation and the at least one second sample translation, x is any sample translation in the sample translation set D, h(x) is a score corresponding to the sample translation x, W_x is a predetermined coefficient, p_x is a predetermined constant, and the value of p_x lies in the range (0, 1).
Optionally, the determining loss information based on the first sample score and at least one second sample score includes:
determining the loss information based on the first sample score, the at least one second sample score, and a second preset formula;
the second preset formula is L = Σ_{x∈D} max(0, margin − (h(s) − h(x)));
Wherein L is the loss information, D is a sample translation set composed of the first sample translation and the at least one second sample translation, s is the first sample translation, h(s) is a first sample score corresponding to the first sample translation, x is any sample translation in the sample translation set D, h (x) is a score corresponding to the sample translation x, and margin is a preset constant.
Optionally, the method further includes:
and inputting the target original text and the target translation into a pre-trained scoring model to obtain a target score corresponding to the target translation.
Optionally, the scoring model includes a text preprocessing module, a feature extraction module, and a scoring module;
inputting the sample original text and the first sample translation into a scoring model to obtain a first sample score corresponding to the first sample translation, wherein the method comprises the following steps:
inputting the sample original text and the first sample translation into a text preprocessing module to obtain a sample character sequence;
inputting the sample character sequence into a feature extraction module to obtain sample feature information;
and inputting the sample characteristic information into a scoring module to obtain a first sample score corresponding to the first sample translation.
In a second aspect, an embodiment of the present application provides an apparatus for training a scoring model, where the apparatus includes:
a first obtaining module configured to obtain a sample original, a first sample translation, and at least one second sample translation, wherein the semantics of the first sample translation and the semantics of the sample original are the same, and the semantics of the second sample translation and the semantics of the first sample translation are different;
the input module is configured to input the sample original text and the first sample translation into a scoring model to obtain a first sample score corresponding to the first sample translation, and input the sample original text and each second sample translation into the scoring model to obtain a second sample score corresponding to each second sample translation;
a determination module configured to determine loss information based on the first sample score and at least one second sample score;
an adjustment module configured to adjust a scoring model based on the loss information.
Optionally, the apparatus further includes a second obtaining module, where the second obtaining module is configured to:
acquiring a first sample text vector corresponding to the first sample translation and a Gaussian noise vector;
adding the first sample text vector and the Gaussian noise vector to obtain a first sample text vector after noise addition;
and inputting the first sample text vector after noise addition and the first sample text vector into a pre-trained denoising self-encoder to obtain the second sample translation.
Optionally, the apparatus further includes a third obtaining module, where the third obtaining module is configured to:
acquiring a first sample text vector corresponding to the first sample translation;
randomly corrupting the first sample translation to obtain a corrupted first sample translation;
determining a second sample text vector corresponding to the corrupted first sample translation;
and inputting the first sample text vector and the second sample text vector into a pre-trained denoising self-encoder to obtain a second sample translation.
Optionally, the determining module is configured to:
determining the loss information based on the first sample score, the at least one second sample score, and a first preset formula;
the first preset formula is L = Σ_{x∈D} −(p_x × log(W_x × h(x)) + (1 − p_x) × log(1 − W_x × h(x)));
Wherein L is the loss information, D is a sample translation set composed of the first sample translation and the at least one second sample translation, x is any sample translation in the sample translation set D, h(x) is a score corresponding to the sample translation x, W_x is a predetermined coefficient, p_x is a predetermined constant, and the value of p_x lies in the range (0, 1).
Optionally, the determining module is configured to:
determining the loss information based on the first sample score, the at least one second sample score, and a second preset formula;
the second preset formula is L = Σ_{x∈D} max(0, margin − (h(s) − h(x)));
Wherein L is the loss information, D is a sample translation set composed of the first sample translation and the at least one second sample translation, s is the first sample translation, h(s) is a first sample score corresponding to the first sample translation, x is any sample translation in the sample translation set D, h (x) is a score corresponding to the sample translation x, and margin is a preset constant.
Optionally, the apparatus further comprises a usage module configured to:
and inputting the target original text and the target translation into a pre-trained scoring model to obtain a target score corresponding to the target translation.
Optionally, the scoring model includes a text preprocessing module, a feature extraction module, and a scoring module;
the input module configured to:
inputting the sample original text and the first sample translation into a text preprocessing module to obtain a sample character sequence;
inputting the sample character sequence into a feature extraction module to obtain sample feature information;
and inputting the sample characteristic information into a scoring module to obtain a first sample score corresponding to the first sample translation.
In a third aspect, an embodiment of the present application provides a terminal, where the terminal includes a processor and a memory, where the memory stores at least one program code, and the at least one program code is loaded and executed by the processor to implement the method for training a scoring model described above.
In a fourth aspect, the present application provides a computer-readable storage medium, in which at least one program code is stored, and the at least one program code is loaded and executed by a processor to implement the above method for training a scoring model.
In a fifth aspect, the present application provides a computer program product or a computer program, where the computer program product or the computer program includes a computer program code, the computer program code is stored in a computer readable storage medium, a processor of a computer device reads the computer program code from the computer readable storage medium, and the processor executes the computer program code, so that the computer device executes the above method for training a scoring model.
In the embodiment of the application, a first sample score corresponding to a first sample translation whose semantics are the same as those of the sample original text is obtained, together with a second sample score corresponding to a second sample translation whose semantics differ from those of the sample original text. Loss information is determined based on the first sample score and the second sample score, and the scoring model is adjusted based on the loss information. Therefore, no reference score for the sample translation needs to be obtained, which solves the problem in the prior art that a scoring model cannot be trained without a reference score corresponding to a sample translation.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of an implementation environment of a method for training a scoring model according to an embodiment of the present application. As shown in fig. 1, the method may be implemented by the terminal 101 or the server 102.
The terminal 101 may include components such as a processor, memory, and the like. The processor, which may be a Central Processing Unit (CPU), may be configured to obtain a sample original text, a first sample translation, and at least one second sample translation, input the sample original text and the first sample translation into a scoring model to obtain a first sample score corresponding to the first sample translation, input the sample original text and each second sample translation into the scoring model to obtain a second sample score corresponding to each second sample translation, determine loss information based on the first sample score and the at least one second sample score, adjust the scoring model based on the loss information, and the like. The memory may be a RAM (Random Access Memory), a Flash (Flash Memory), etc., and may be configured to store the sample original text, the first sample translation, the at least one second sample translation, and the like. The terminal 101 may also include a transceiver, image detection components, a screen, audio output components, audio input components, and the like. The audio output component may be a sound box, an earphone, etc. The audio input component may be a microphone or the like.
The server 102 may include components such as a processor, memory, and the like. The processor, which may be a Central Processing Unit (CPU), may be configured to obtain a sample original text, a first sample translation, and at least one second sample translation, input the sample original text and the first sample translation into a scoring model to obtain a first sample score corresponding to the first sample translation, input the sample original text and each second sample translation into the scoring model to obtain a second sample score corresponding to each second sample translation, determine loss information based on the first sample score and the at least one second sample score, adjust the scoring model based on the loss information, and the like. The memory may be a RAM (Random Access Memory), a Flash (Flash Memory), etc., and may be configured to store the sample original text, the first sample translation, the at least one second sample translation, and the like.
Fig. 2 is a flowchart of a method for training a scoring model according to an embodiment of the present disclosure. Referring to fig. 2, the embodiment includes:
step 201, obtaining a sample original text, a first sample translation and at least one second sample translation.
The semantics of the first sample translation are the same as the semantics of the sample original text, and the semantics of the second sample translation are different from the semantics of the first sample translation; that is, the semantics of the second sample translation are different from the semantics of the sample original text. The sample original text, the first sample translation, the second sample translation 1, and the second sample translation 2 may be as shown in Table 1 below:
TABLE 1
Optionally, the second sample translation is a translation whose semantics differ from those of the first sample translation, but if the deviation between the second sample translation and the first sample translation is too large, the trained scoring model may not be effective. Therefore, the second sample translation in the embodiment of the present application deviates only slightly from the first sample translation. The embodiments of the present application provide various methods for obtaining a second sample translation with a small deviation from the first sample translation, which are as follows:
in the first method, a first sample text vector corresponding to a first sample translation and a Gaussian noise vector are obtained. The first sample text vector and the Gaussian noise vector are added to obtain the noise-added first sample text vector. The noise-added first sample text vector and the first sample text vector are input into a pre-trained denoising self-encoder to obtain a second sample translation.
A text vector is the vector form corresponding to a text, and the first sample text vector has the same number of dimensions as the Gaussian noise vector. The Gaussian noise vector is a noise vector randomly sampled from a multidimensional Gaussian distribution with a mean of 0 and a variance of 1; the method for generating the noise vector is the prior art and is not described in detail in the embodiments of the present application.
In implementation, the vector form corresponding to the first sample translation is obtained through a word embedding algorithm, so that the first sample text vector is obtained, and a Gaussian noise vector is randomly generated through the prior art. The first sample text vector and the Gaussian noise vector are added to obtain the noise-added first sample text vector. The noise-added first sample text vector and the first sample text vector are input into the pre-trained denoising self-encoder to obtain the second sample translation.
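For illustration only, the noise-addition step can be sketched in a few lines of PyTorch; the embedding size, vocabulary size and token ids below are hypothetical, and the denoising autoencoder itself is omitted:

```python
import torch

def add_gaussian_noise(sample_text_vector: torch.Tensor) -> torch.Tensor:
    """Add an element-wise Gaussian noise vector (mean 0, variance 1)
    to a text vector of the same dimensionality."""
    gaussian_noise_vector = torch.randn_like(sample_text_vector)
    return sample_text_vector + gaussian_noise_vector

# Hypothetical usage: embed the first sample translation, then add noise.
embedding = torch.nn.Embedding(num_embeddings=30000, embedding_dim=512)
token_ids = torch.tensor([11, 42, 7, 99, 3])  # placeholder token ids
first_sample_text_vector = embedding(token_ids)
noised_vector = add_gaussian_noise(first_sample_text_vector)
# noised_vector and first_sample_text_vector are then both fed to the
# pre-trained denoising autoencoder to obtain the second sample translation.
```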
The training process for the denoising autoencoder is as follows: a sample text is acquired, and the vector form corresponding to the sample text is obtained to acquire the sample text vector corresponding to the sample text. The sample text vector and a randomly generated Gaussian noise vector are added to obtain a noise-added sample text vector. The noise-added sample text vector and the sample text vector are input into the denoising autoencoder to obtain a predicted text. Loss information is obtained based on the predicted text and the sample text, and the parameters of the denoising autoencoder are adjusted based on the loss information to obtain a parameter-adjusted denoising autoencoder. The parameter-adjusted denoising autoencoder is then trained and adjusted with other sample texts until it converges, yielding the pre-trained denoising autoencoder.
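A minimal sketch of one such parameter-adjustment step, assuming the denoising autoencoder is a PyTorch module that maps a (noised vector, clean vector) pair to per-token vocabulary logits; all names here are hypothetical stand-ins, not the application's implementation:

```python
import torch
import torch.nn.functional as F

def denoiser_training_step(denoiser, optimizer, sample_text_vector, target_token_ids):
    """One training step: noise the sample text vector, predict the text,
    compare the prediction with the sample text, and adjust parameters."""
    noised_vector = sample_text_vector + torch.randn_like(sample_text_vector)
    logits = denoiser(noised_vector, sample_text_vector)  # (seq_len, vocab_size)
    loss = F.cross_entropy(logits, target_token_ids)      # predicted vs. sample text
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# The step is repeated over other sample texts until the loss converges,
# yielding the pre-trained denoising autoencoder.
```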
The specific structure of the denoising self-encoder is shown in fig. 3A. A first sample text vector corresponding to the first sample translation and a randomly generated Gaussian noise vector are obtained, and the two are added to obtain a noise-containing first sample text vector. The noise-containing first sample text vector and the first sample text vector are input into the denoising self-encoder to obtain the second sample translation.
After the noise-containing first sample text vector is input into the denoising self-encoder, the denoising self-encoder performs linear mapping on it to obtain a linearly mapped first vector. The first vector is input into a multi-head self-attention layer to obtain a second vector. A residual connection is applied between the first vector and the second vector, and the residually connected vector is normalized to obtain a third vector. The third vector is input into a feed-forward layer to obtain a fourth vector. A residual connection is applied between the third vector and the fourth vector, and the residually connected vector is normalized to obtain a fifth vector. Meanwhile, after the first sample text vector is input into the denoising self-encoder, the denoising self-encoder performs linear mapping on it to obtain a linearly mapped sixth vector. The sixth vector is input into the masked multi-head self-attention layer to obtain a seventh vector. A residual connection is applied between the sixth vector and the seventh vector, and the residually connected vector is normalized to obtain an eighth vector. The eighth vector and the fifth vector are input into the multi-head mutual attention layer to obtain a ninth vector. A residual connection is applied between the eighth vector and the ninth vector, and the residually connected vector is normalized to obtain a tenth vector. The tenth vector is input into a feed-forward layer to obtain an eleventh vector. A residual connection is applied between the tenth vector and the eleventh vector, and the residually connected vector is normalized to obtain a twelfth vector. The twelfth vector is input into the linear layer, and Softmax processing is performed to obtain the second sample translation.
The denoising self-encoder has a single-encoder, single-decoder architecture with only one multi-head mutual attention layer, which implements the information interaction between the encoder and the decoder. The multi-head self-attention layer, the masked multi-head self-attention layer, the multi-head mutual attention layer, and the feed-forward layers are all neural networks.
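The encoder half of this flow maps directly onto a standard Transformer block; the following sketch mirrors the first-to-fifth-vector sequence described above, with layer sizes that are illustrative assumptions rather than values from the application:

```python
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    """Self-attention -> residual + layer normalization -> feed-forward
    -> residual + layer normalization."""

    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.ReLU(), nn.Linear(4 * dim, dim)
        )
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, first_vector: torch.Tensor) -> torch.Tensor:
        second_vector, _ = self.self_attn(first_vector, first_vector, first_vector)
        third_vector = self.norm1(first_vector + second_vector)  # residual + norm
        fourth_vector = self.ffn(third_vector)                   # feed-forward layer
        return self.norm2(third_vector + fourth_vector)          # fifth vector
```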
In actual use, if noise is added to the first sample translation directly based on rules, the obtained second sample translation has obvious characteristics such as grammatical errors, stiff sentence patterns and poor diversity; these characteristics are easily captured by a neural network and are not conducive to training the scoring model. A second sample translation constructed by a translation model, in turn, often has a fixed syntactic pattern and cannot be guaranteed to contain semantic errors, which is likewise not conducive to training the scoring model.
In the embodiment of the present application, in order to construct a second sample translation whose semantics deviate only slightly from those of the first sample translation, the noise-added first sample text vector is obtained first, and then the noise-added first sample text vector and the first sample text vector are input into the pre-trained denoising self-encoder to obtain the second sample translation. Although the pre-trained denoising autoencoder is intended to correct the semantics of the noise-added first sample translation, in actual use it cannot do so completely; that is, it can only remove part of the noise in the noise-added first sample text vector, and the second sample translation is then obtained from a vector that still contains part of the noise. The retained noise gives the semantics of the second sample translation a small deviation from the semantics of the first sample translation. The scoring model is trained on the first sample translation and a second sample translation with a small semantic deviation from it, so the scoring model can capture finer-grained features during training, and the training effect is better.
In a second method, a first sample text vector corresponding to the first sample translation is obtained. The first sample translation is randomly corrupted to obtain a corrupted first sample translation. A second sample text vector corresponding to the corrupted first sample translation is determined. The first sample text vector and the second sample text vector are input into a pre-trained denoising self-encoder to obtain a second sample translation.
In practice, the first sample translation is randomly corrupted to obtain the corrupted first sample translation. A second sample text vector corresponding to the corrupted first sample translation is acquired. The first sample text vector and the second sample text vector are input into the pre-trained denoising self-encoder to obtain the second sample translation.
Random corruption of text includes random masking, random replacement, random deletion, and random insertion. Random masking masks some words with a [MASK] token at random; random replacement replaces some words with other random words; random deletion deletes some words in the text at random; and random insertion inserts random words at random positions. The methods for randomly corrupting text are the prior art and are not described again in the embodiments of the present application. For example, the results of randomly corrupting "I am Chinese, I love China." can be as shown in Table 2, and a sketch of these operations follows the table.
TABLE 2
| Source sentence | I am Chinese, I love China. |
| Random masking | I am [MASK], I [MASK] China. |
| Random replacement | I am Chinese, residual love China. |
| Random deletion | I am, I love China. |
| Random insertion | I am Chinese, I monitoring love China. |
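The four corruption operations in Table 2 can be sketched as follows; the corruption probability and the exact [MASK] token are assumptions chosen for illustration:

```python
import random

MASK = "[MASK]"

def random_mask(tokens, p=0.15):
    """Replace each token with [MASK] with probability p."""
    return [MASK if random.random() < p else t for t in tokens]

def random_replace(tokens, vocab, p=0.15):
    """Substitute a random vocabulary word for each token with probability p."""
    return [random.choice(vocab) if random.random() < p else t for t in tokens]

def random_delete(tokens, p=0.15):
    """Drop each token with probability p."""
    return [t for t in tokens if random.random() >= p]

def random_insert(tokens, vocab, n=1):
    """Insert n random vocabulary words at random positions."""
    out = list(tokens)
    for _ in range(n):
        out.insert(random.randrange(len(out) + 1), random.choice(vocab))
    return out

# Example: corrupt the source sentence from Table 2.
tokens = "I am Chinese , I love China .".split()
print(random_mask(tokens))
```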
The third method both adds a Gaussian noise vector to the first sample text vector corresponding to the first sample translation and randomly corrupts the first sample translation. The specific steps are as follows: the first sample translation is randomly corrupted to obtain the randomly corrupted first sample translation. A sample text vector corresponding to the randomly corrupted first sample translation and a randomly generated Gaussian noise vector are acquired, and the two vectors are added to obtain an added vector. The added vector and the first sample text vector are then input into a pre-trained denoising autoencoder to obtain a second sample translation. The denoising self-encoder used in this method is the same as the denoising self-encoder used in the first method and the second method.
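Taking the third method as an example, the combination of random corruption and Gaussian noise can be sketched as below; random_mask and random_replace are the hypothetical helpers from the earlier sketch, and embed and denoiser stand in for the word-embedding step and the pre-trained denoising autoencoder:

```python
import torch

def make_second_sample_translation(first_sample_tokens, vocab, embed, denoiser):
    """Third method in sketch form: randomly corrupt the first sample
    translation, embed it, add a Gaussian noise vector, and feed the
    result together with the clean first sample text vector into the
    pre-trained denoising autoencoder."""
    corrupted_tokens = random_replace(random_mask(first_sample_tokens), vocab)
    corrupted_vector = embed(corrupted_tokens)
    added_vector = corrupted_vector + torch.randn_like(corrupted_vector)
    first_sample_text_vector = embed(first_sample_tokens)
    return denoiser(added_vector, first_sample_text_vector)
```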
In a fourth method, a third sample text vector corresponding to the sample original text, a first sample text vector corresponding to the first sample translation, a first Gaussian noise vector and a second Gaussian noise vector are obtained. The third sample text vector and the first Gaussian noise vector are added to obtain a noise-added third sample text vector. The first sample text vector and the second Gaussian noise vector are added to obtain a noise-added first sample text vector. The noise-added first sample text vector, the noise-added third sample text vector and the first sample text vector are input into a pre-trained denoising self-encoder to obtain a second sample translation. The first Gaussian noise vector and the second Gaussian noise vector may be the same vector or different vectors.
The training process of this denoising autoencoder is as follows: a sample original text and a first sample translation corresponding to the sample original text are acquired, and the noise-added first sample text vector, the noise-added third sample text vector and the first sample text vector are input into the denoising self-encoder to obtain a predicted text. The predicted text and the first sample translation are input into a loss function to obtain loss information, and the denoising autoencoder is adjusted based on the loss information to obtain an adjusted denoising autoencoder. The adjusted denoising autoencoder is then continuously adjusted with other sample original texts and the first sample translations corresponding to them, and when the adjusted denoising autoencoder converges, the pre-trained denoising autoencoder is obtained.
The structure of this denoising autoencoder differs from that of the denoising autoencoder used in the first three methods: it has a dual-encoder, single-decoder architecture, with the specific composition shown in fig. 4. A third sample text vector corresponding to the sample original text is obtained through a word embedding algorithm. The third sample text vector and the randomly generated first Gaussian noise vector are added to obtain the noise-added third sample text vector. Linear mapping is performed on the noise-added third sample text vector to obtain a thirteenth vector, and the thirteenth vector is input into the multi-head self-attention layer to obtain a fourteenth vector. A residual connection is applied between the fourteenth vector and the thirteenth vector, and the residually connected vector is normalized to obtain a fifteenth vector. The fifteenth vector is input into the feed-forward layer to obtain a sixteenth vector. A residual connection is applied between the fifteenth vector and the sixteenth vector, and the residually connected vector is normalized to obtain a seventeenth vector. Similarly, the first sample text vector corresponding to the first sample translation is obtained through a word embedding algorithm. The first sample text vector and the randomly generated second Gaussian noise vector are added to obtain the noise-added first sample text vector. Linear mapping is performed on the noise-added first sample text vector to obtain an eighteenth vector, and the eighteenth vector is input into the multi-head self-attention layer to obtain a nineteenth vector. A residual connection is applied between the eighteenth vector and the nineteenth vector, and the residually connected vector is normalized to obtain a twentieth vector. The twentieth vector is input into a feed-forward layer to obtain a twenty-first vector. A residual connection is applied between the twentieth vector and the twenty-first vector, and the residually connected vector is normalized to obtain a twenty-second vector. Similarly, the first sample text vector is linearly mapped to obtain a twenty-third vector, and the twenty-third vector is input into the masked multi-head self-attention layer to obtain a twenty-fourth vector. A residual connection is applied between the twenty-third vector and the twenty-fourth vector, and the residually connected vector is normalized to obtain a twenty-fifth vector. The twenty-fifth vector and the seventeenth vector are input into the multi-head mutual attention layer to obtain a twenty-sixth vector. A residual connection is applied between the twenty-sixth vector and the twenty-fifth vector, and the residually connected vector is normalized to obtain a twenty-seventh vector. The twenty-seventh vector and the twenty-second vector are input into a multi-head mutual attention layer to obtain a twenty-eighth vector. A residual connection is applied between the twenty-eighth vector and the twenty-seventh vector, and the residually connected vector is normalized to obtain a twenty-ninth vector.
The twenty-ninth vector is input into a feed-forward layer to obtain a thirtieth vector. A residual connection is applied between the twenty-ninth vector and the thirtieth vector, and the residually connected vector is normalized to obtain a thirty-first vector. The thirty-first vector is input into the linear layer, and Softmax processing is performed to obtain the second sample translation.
This denoising self-encoder comprises two multi-head mutual attention layers, which respectively exchange information with the two encoders to complete decoding. The multi-head self-attention layer, the feed-forward layer, the masked multi-head self-attention layer and the multi-head mutual attention layer in the denoising self-encoder are all neural networks. The multi-head self-attention layer projects the feature vectors input by the encoder through a plurality of linear transformations to obtain query, key and value triplets, then computes the attention weight between the query and the key, and multiplies the attention weight by the value to obtain the feature representation of the encoder input. The multi-head mutual attention layer projects the feature vectors input by the encoder and the decoder through a plurality of linear transformations to obtain query, key and value triplets, then computes the attention weight between the query and the key, and multiplies the attention weight by the value to obtain the feature representation after the encoder and decoder inputs interact. The feed-forward (fully connected) layer maps the input feature representation twice, increasing its representational capacity. The residual connection adds the input vector to the output, thereby avoiding the gradient vanishing problem. Layer normalization normalizes the neurons of the same layer to the same distribution, which keeps training stable.
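For reference, the attention computation described here (compute attention weights between query and key, then apply them to the values) reduces to the following single-head sketch; the 1/√dim scaling factor is the conventional choice and is an assumption here:

```python
import torch

def attention_core(query: torch.Tensor, key: torch.Tensor, value: torch.Tensor):
    """Compute the attention weight between query and key, then multiply
    it by value to obtain the interacted feature representation.
    Shapes: (seq_len, dim)."""
    dim = query.size(-1)
    scores = query @ key.transpose(-2, -1) / dim ** 0.5
    weights = torch.softmax(scores, dim=-1)  # attention weight per position
    return weights @ value
```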
In a fifth method, the sample original text is randomly corrupted to obtain the randomly corrupted sample original text, and the first sample translation is randomly corrupted to obtain the randomly corrupted first sample translation. A fourth sample text vector corresponding to the corrupted sample original text and a second sample text vector corresponding to the corrupted first sample translation are obtained. The fourth sample text vector, the second sample text vector and the first sample text vector are input into a pre-trained denoising self-encoder to obtain a second sample translation.
In a sixth method, the first sample translation is randomly corrupted to obtain the randomly corrupted first sample translation. A sample text vector corresponding to the randomly corrupted first sample translation and a randomly generated third Gaussian noise vector are acquired, and the two vectors are added to obtain a first vector. The sample original text is randomly corrupted to obtain the randomly corrupted sample original text. A sample text vector corresponding to the randomly corrupted sample original text and a randomly generated fourth Gaussian noise vector are acquired, and the two vectors are added to obtain a second vector. The first vector, the second vector and the first sample text vector are input into a pre-trained denoising self-encoder to obtain a second sample translation.
The third Gaussian noise vector and the fourth Gaussian noise vector may be the same vector or different vectors.
In a seventh method, one of the sample original text and the first sample translation is randomly corrupted, and a text vector corresponding to the randomly corrupted text is obtained. For the other of the two, its sample text vector and a randomly generated Gaussian noise vector are acquired and added to obtain an added vector. These two vectors and the first sample text vector are input into the denoising self-encoder to obtain a second sample translation.
It should be noted that the training process and structure of the denoising autoencoder related to the fourth method, the fifth method, the sixth method, and the seventh method are the same, and are not described herein again.
Step 202, inputting the sample original text and the first sample translation into the scoring model to obtain a first sample score corresponding to the first sample translation, and inputting the sample original text and each second sample translation into the scoring model to obtain a second sample score corresponding to each second sample translation.
Optionally, the scoring model in this embodiment of the present application includes a text preprocessing module, a feature extraction module, and a scoring module. Inputting the sample original text and the first sample translation into a scoring model, and obtaining a first sample score corresponding to the first sample translation specifically comprises the following steps: and inputting the sample original text and the first sample translation into a text preprocessing module to obtain a sample character sequence. And inputting the sample character sequence into a feature extraction module to obtain sample feature information. And inputting the sample characteristic information into a scoring module to obtain a first sample score corresponding to the first sample translation.
The text preprocessing module is an algorithm model mainly used for performing text preprocessing on the input sample original text and first sample translation to obtain a preprocessed sample original text and a preprocessed first sample translation, and splicing the two to obtain a sample character sequence.
Text preprocessing comprises word segmentation, sub-word segmentation, special character processing and truncation. Word segmentation separates punctuation from the text; sub-word segmentation further splits single words according to the frequency of occurrence of their consecutive letters; special character processing deletes non-printing characters and transcribes escape characters; and truncation cuts the input sequence according to the upper limit of the sentence length the model can process.
It should be noted that, because text preprocessing includes sub-word segmentation, the sample original text and the first sample translation may each be split into a plurality of sub-words. For example, after the sample original text "I eat apple." and the first sample translation "I drink an apple." pass through the BERT preprocessing flow, the preprocessed sample original text is "[CLS] I eat apple." and the preprocessed first sample translation is "[SEP] I drink an app ##le. [SEP]". The two are spliced to obtain "[CLS] I eat apple. [SEP] I drink an app ##le. [SEP]".
In the above sequence, the word "apple" is split into two parts, "app" and "##le". This process helps to reduce the size of the vocabulary and the computational overhead.
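Since the application names BERT's preprocessing flow, a standard BERT tokenizer is a plausible (but not confirmed) way to reproduce this splicing:

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

sample_original_text = "I eat apple."
first_sample_translation = "I drink an apple."

# Encoding the pair yields "[CLS] original [SEP] translation [SEP]",
# with rare words split into sub-words such as "app" and "##le".
encoded = tokenizer(sample_original_text, first_sample_translation)
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))
```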
The feature extraction module is a neural network model mainly used for extracting features from the text vectors to obtain sample feature information. Its specific processing is as follows: each character in the character sequence is first converted into a text vector. The text vectors are then fed to an encoder in the feature extraction module. Each Transformer layer in the encoder encodes the text vectors into feature information layer by layer, so that the feature information corresponding to each word incorporates the context information of that word.
For example, although the word "bank" is included in both "I am fixing on the bank" and "I went to bank to sink money," the "bank" in the two sentences has different meanings. The "bank" in the first sentence should be translated as "bank" and the "bank" in the second sentence should be translated as "bank". The feature extraction module can distinguish different meanings of two words according to the context information, so that different expression vectors are given to the same word.
The scoring module is also a neural network model; its structure is a single fully connected layer, which maps the feature information to a real-valued continuous number, i.e., a score, used as the quality evaluation result for the sample original text and the first sample translation.
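A sketch of such a scoring module; the 768-dimensional feature vector (BERT-base hidden size) is an assumption:

```python
import torch
import torch.nn as nn

class ScoringModule(nn.Module):
    """One fully connected layer mapping the sample feature information
    to a single real-valued continuous score."""

    def __init__(self, feature_dim: int = 768):
        super().__init__()
        self.fc = nn.Linear(feature_dim, 1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, feature_dim) from the feature extraction module
        return self.fc(features).squeeze(-1)  # (batch,) quality scores
```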
Step 203, determining loss information based on the first sample score and the at least one second sample score.
The second sample translation is obtained by adding noise to the first sample translation, so the second sample score corresponding to the second sample translation should be lower than the first sample score corresponding to the first sample translation. Loss information can therefore be obtained from the comparison between the first sample score and the second sample score. Based on this principle, the embodiment of the present application provides two forms of contrastive training: contrastive classification, as shown in FIG. 5A, and contrastive ranking, as shown in FIG. 5B. The goal of both loss functions is to make the first sample score higher than the second sample score. The two methods are described below.
The first method determines loss information based on a first sample score, at least one second sample score, and a first predetermined formula.
The first preset formula is L = Σ_{x∈D} −(p_x × log(W_x × h(x)) + (1 − p_x) × log(1 − W_x × h(x)));
Wherein L is loss information, D is a sample translation set consisting of a first sample translation and at least one second sample translation, x is any sample translation in the sample translation set D, h(x) is a score corresponding to the sample translation x, W_x is a predetermined coefficient, p_x is a predetermined constant, and the value of p_x lies in the range (0, 1).
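A direct transcription of the first preset formula, assuming W_x × h(x) lies in (0, 1) so the logarithms are defined; the numeric values below are placeholders:

```python
import torch

def contrastive_classification_loss(scores, p, w):
    """L = sum over x in D of -(p_x*log(W_x*h(x)) + (1-p_x)*log(1-W_x*h(x))).
    scores: h(x) for every translation in D; p: the constants p_x;
    w: the preset coefficients W_x."""
    q = w * scores
    return -(p * torch.log(q) + (1 - p) * torch.log(1 - q)).sum()

# Placeholder usage: the first sample score followed by two second sample scores.
scores = torch.tensor([0.92, 0.40, 0.35])
p = torch.tensor([0.9, 0.1, 0.1])  # p_x in (0, 1)
w = torch.tensor([1.0, 1.0, 1.0])
print(contrastive_classification_loss(scores, p, w))
```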
In FIG. 5A, S is the sample original text, T0 is the first sample translation, T1'~Tn' are the second sample translations, l0 is the first sample score, and l1'~ln' are the second sample scores. The sample original text S and the first sample translation T0 are input into the scoring model to obtain the first sample score l0. The sample original text S and the second sample translation T1' are input into the scoring model to obtain the second sample score l1'. In the same way, the sample original text S and the second sample translation Tn' are input into the scoring model to obtain the second sample score ln'. The first preset formula is then used to perform contrastive classification on l0 and l1'~ln'.
One or more scoring models may be used in fig. 5A, but when there are multiple scoring models, the parameters of each scoring model are shared, so only one scoring model is actually obtained by training. This parameter sharing improves the efficiency of neural network training and reduces the storage footprint of the scoring model.
In a second method, loss information is determined based on the first sample score, at least one second sample score, and a second preset formula.
The second preset formula is L = Σ_{x∈D} max(0, margin − (h(s) − h(x)));
Wherein, L is loss information, D is a sample translation set composed of a first sample translation and at least one second sample translation, s is the first sample translation, h(s) is a first sample score corresponding to the first sample translation, x is any sample translation in the sample translation set D, h (x) is a score corresponding to the sample translation x, and margin is a preset constant for enlarging a difference value between the first sample score and the second sample score.
As shown in fig. 5B, S is the sample original text, T is the first sample translation, T' is the second sample translation, l is the first sample score, and l' is the second sample score; the scoring models in fig. 5B are all the same model, or models that share parameters. The sample original text S and the first sample translation T are input into the scoring model to obtain the first sample score l. The sample original text S and the second sample translation T' are input into the scoring model to obtain the second sample score l'. The margin loss of l and l' is then calculated with the second preset formula.
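A sketch of the contrastive ranking side, using a hinge (margin) form consistent with the definitions above; the hinge shape and the margin value are assumptions:

```python
import torch

def contrastive_ranking_loss(first_score, second_scores, margin=0.1):
    """Penalize every second sample score that comes within `margin`
    of the first sample score, pushing h(s) above each h(x)."""
    gaps = margin - (first_score - second_scores)
    return torch.clamp(gaps, min=0.0).sum()

# Placeholder usage: l = 0.9 for the first sample translation, two scores l'.
print(contrastive_ranking_loss(torch.tensor(0.9), torch.tensor([0.50, 0.85])))
```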
And step 204, adjusting the scoring model based on the loss information.
In implementation, parameters in the scoring model are adjusted based on the loss information to obtain an adjusted scoring model. And training and adjusting the scoring model based on other sample original texts, the corresponding first sample translations and the corresponding second sample translations.
After the loss information is obtained, gradient back-propagation and parameter updating are performed on the scoring model using the deep-learning back-propagation algorithm. In one training pass, the parameters of the feature extraction module and the scoring module in the scoring model are both updated, using the same learning rate. Meanwhile, during training, the scoring model is validated once every preset number of training iterations. The validation process is similar to the training process in the prior art: the prediction score output by the scoring model is compared with a benchmark score to obtain loss information, the scoring model is then trained and adjusted based on this loss information, and at the same time the precision of the scoring model is calculated based on the prediction score and the benchmark score. When the calculated precision no longer improves, the trained scoring model is obtained.
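The update step itself is ordinary deep-learning back-propagation; a sketch with one optimizer over all parameters, so the feature extraction module and the scoring module share the same learning rate (the learning rate value is an assumption):

```python
import torch

def adjust_scoring_model(scoring_model, optimizer, loss):
    """Gradient back-propagation and parameter update based on the loss
    information from step 203."""
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# One optimizer covering every parameter updates the feature extraction
# module and the scoring module with the same learning rate, e.g.:
# optimizer = torch.optim.Adam(scoring_model.parameters(), lr=1e-5)
```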
The loss information here is calculated based on the prediction score, the benchmark score, and a third formula;
Wherein L_sent is the loss information, h(s) is the prediction score output by the scoring model, hter_s is the benchmark score, W_s is a preset coefficient, and sigmoid(x) is a mapping function that maps x into the numerical range from 0 to 1.
Therefore, when the scoring model is trained and adjusted, the scoring model can be trained based on the prediction score and the reference score, so that the output result of the trained scoring model is more accurate.
In an embodiment of the present application, a sample original, a first sample translation, and at least one second sample translation are obtained, where a semantic meaning of the first sample translation is the same as a semantic meaning corresponding to the sample original, and a semantic meaning of the second sample translation is different from a semantic meaning of the first sample translation. Inputting the sample original text and the first sample translation into a scoring model to obtain a first sample score corresponding to the first sample translation, and inputting the sample original text and each second sample translation into the scoring model to obtain a second sample score corresponding to each second sample translation; determining loss information based on the first sample score and at least one second sample score; based on the loss information, a scoring model is adjusted. Therefore, the scoring model can be trained on the premise of not depending on the reference score.
In the related art, before training a scoring model, a professional translator or native speaker is required to evaluate the sample original text and its sample translation, score the sample translation on multiple different aspects such as accuracy and fluency, and then combine the multiple evaluation scores of the sample translation to obtain a final benchmark score, as shown in Table 3:
TABLE 3
| Sample original text | I am a Chinese. | I eat apples. |
| Sample translation | I am Chinese. | I drink an apple. |
| Manual evaluation result 1 | 1.0 | 0.2 |
| Manual evaluation result 2 | 0.9 | 0.4 |
| Manual evaluation result 3 | 1.0 | 0.35 |
| Final manual evaluation result | 0.9667 | 0.3167 |
In table 3, the manual evaluation results of the sample translation "I am Chinese." are 1.0, 0.9, and 1.0, and the final manual evaluation result is the average of the three, 0.9667, which serves as the benchmark score. The manual evaluation results of the sample translation "I drink an apple." are 0.2, 0.4, and 0.35, and the final manual evaluation result is the average of the three, 0.3167, which serves as the benchmark score.
The process of manual evaluation is time-consuming and labor-intensive, and a large number of professional translators must participate before objective evaluation scores can be obtained. Moreover, because different languages, different fields and different machine translation systems have different error distributions, an existing scoring model cannot be used directly when evaluating quality for a specific language, field or machine translation system; the scoring model must be trained and adjusted again for that specific language, field and machine translation system, which also wastes time and labor.
In the actual use of the scoring model, the target original text and the target translation are input into the pre-trained scoring model to obtain a target score corresponding to the target translation.
Fig. 6 is a schematic structural diagram of an apparatus for training a scoring model according to an embodiment of the present application, and referring to fig. 6, the apparatus includes:
a first obtaining module 610 configured to obtain a sample original text, a first sample translation, and at least one second sample translation, wherein the semantics of the first sample translation and the semantics of the sample original text are the same, and the semantics of the second sample translation and the semantics of the first sample translation are different;
the input module 620 is configured to input the sample original text and the first sample translation into a scoring model to obtain a first sample score corresponding to the first sample translation, and input the sample original text and each second sample translation into the scoring model to obtain a second sample score corresponding to each second sample translation;
a determining module 630 configured to determine loss information based on the first sample score and at least one second sample score;
an adjustment module 640 configured to adjust a scoring model based on the loss information.
Optionally, the apparatus further includes a second obtaining module, where the second obtaining module is configured to:
acquiring a first sample text vector corresponding to the first sample translation and a Gaussian noise vector;
adding the first sample text vector and the Gaussian noise vector to obtain a first sample text vector after noise addition;
and inputting the first sample text vector after noise addition and the first sample text vector into a pre-trained denoising self-encoder to obtain the second sample translation.
Optionally, the apparatus further includes a third obtaining module, where the third obtaining module is configured to:
acquiring a first sample text vector corresponding to the first sample translation;
randomly corrupting the first sample translation to obtain a corrupted first sample translation;
determining a second sample text vector corresponding to the corrupted first sample translation;
and inputting the first sample text vector and the second sample text vector into a pre-trained denoising self-encoder to obtain a second sample translation.
Optionally, the determining module 630 is configured to:
determining the loss information based on the first sample score, the at least one second sample score, and a first preset formula;
the first preset formula is L = Σ_{x∈D} −(p_x × log(W_x × h(x)) + (1 − p_x) × log(1 − W_x × h(x)));
Wherein L is the loss information, D is a sample translation set composed of the first sample translation and the at least one second sample translation, x is any sample translation in the sample translation set D, h(x) is a score corresponding to the sample translation x, W_x is a predetermined coefficient, p_x is a predetermined constant, and the value of p_x lies in the range (0, 1).
Optionally, the determining module 630 is configured to:
determining the loss information based on the first sample score, the at least one second sample score, and a second preset formula;
the second preset formula is L = Σ_{x∈D} max(0, margin − (h(s) − h(x)));
Wherein L is the loss information, D is a sample translation set composed of the first sample translation and the at least one second sample translation, s is the first sample translation, h(s) is a first sample score corresponding to the first sample translation, x is any sample translation in the sample translation set D, h (x) is a score corresponding to the sample translation x, and margin is a preset constant.
Optionally, the apparatus further comprises a usage module configured to:
and inputting the target original text and the target translation into a pre-trained scoring model to obtain a target score corresponding to the target translation.
Optionally, the scoring model includes a text preprocessing module, a feature extraction module, and a scoring module;
the input module 620 is configured to:
inputting the sample original text and the first sample translation into a text preprocessing module to obtain a sample character sequence;
inputting the sample character sequence into a feature extraction module to obtain sample feature information;
and inputting the sample characteristic information into a scoring module to obtain a first sample score corresponding to the first sample translation.
It should be noted that: in the device for training a score model according to the above embodiment, when the score model is trained, only the division of the functional modules is exemplified, and in practical applications, the function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the device for training the scoring model and the method for training the scoring model provided by the above embodiments belong to the same concept, and the specific implementation process thereof is described in the method embodiments, and is not described herein again.
Fig. 7 shows a block diagram of a terminal 700 according to an exemplary embodiment of the present application. The terminal 700 may be: a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 700 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and so on.
In general, the terminal 700 includes: a processor 701 and a memory 702.
The processor 701 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on. The processor 701 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, also called a Central Processing Unit (CPU), and the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 701 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed by the display screen. In some embodiments, the processor 701 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 702 may include one or more computer-readable storage media, which may be non-transitory. The memory 702 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in the memory 702 is used to store at least one program code for execution by the processor 701 to implement the method of training a scoring model provided by the method embodiments herein.
In some embodiments, the terminal 700 may further optionally include: a peripheral interface 703 and at least one peripheral. The processor 701, the memory 702, and the peripheral interface 703 may be connected by buses or signal lines. Various peripheral devices may be connected to the peripheral interface 703 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 704, a display screen 705, a camera assembly 706, an audio circuit 707, a positioning component 708, and a power source 709.
The peripheral interface 703 may be used to connect at least one I/O (Input/Output) related peripheral to the processor 701 and the memory 702. In some embodiments, the processor 701, the memory 702, and the peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 701, the memory 702, and the peripheral interface 703 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The Radio Frequency circuit 704 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 704 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 704 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 704 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 704 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 704 may also include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 705 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 705 is a touch display screen, the display screen 705 also has the ability to capture touch signals on or above its surface. A touch signal may be input to the processor 701 as a control signal for processing. In this case, the display screen 705 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 705, disposed on the front panel of the terminal 700; in other embodiments, there may be at least two display screens 705, disposed on different surfaces of the terminal 700 or in a folded design; in still other embodiments, the display screen 705 may be a flexible display disposed on a curved surface or a folded surface of the terminal 700. The display screen 705 may even be arranged in a non-rectangular irregular shape, that is, an irregularly shaped screen. The display screen 705 may be made of an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), or other materials.
The camera assembly 706 is used to capture images or video. Optionally, the camera assembly 706 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal, and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting, VR (Virtual Reality) shooting, or other fused shooting functions. In some embodiments, the camera assembly 706 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.
The audio circuit 707 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electrical signals, and inputting the electrical signals to the processor 701 for processing, or to the radio frequency circuit 704 to realize voice communication. For stereo sound collection or noise reduction, a plurality of microphones may be disposed at different portions of the terminal 700. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electrical signals from the processor 701 or the radio frequency circuit 704 into sound waves. The speaker may be a conventional thin-film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can convert an electrical signal not only into sound waves audible to humans, but also into sound waves inaudible to humans for purposes such as distance measurement. In some embodiments, the audio circuit 707 may also include a headphone jack.
The positioning component 708 is used to locate the current geographic location of the terminal 700 for navigation or LBS (Location Based Service). The positioning component 708 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
The power supply 709 is used to supply power to the various components of the terminal 700. The power supply 709 may be alternating current, direct current, disposable batteries, or rechargeable batteries. When the power supply 709 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast-charging technology.
In some embodiments, the terminal 700 further includes one or more sensors 710. The one or more sensors 710 include, but are not limited to: an acceleration sensor 711, a gyro sensor 712, a pressure sensor 713, a fingerprint sensor 714, an optical sensor 715, and a proximity sensor 716.
The acceleration sensor 711 can detect the magnitude of acceleration on the three coordinate axes of a coordinate system established based on the terminal 700. For example, the acceleration sensor 711 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 701 may control the display screen 705 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 711. The acceleration sensor 711 may also be used to collect motion data of a game or of the user.
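Purely as an illustrative sketch, and not part of the embodiments above, the landscape/portrait decision can be reduced to comparing the gravity components along the two screen axes; the axis convention and the Python helper below are assumptions made for illustration only.

    def choose_orientation(gx: float, gy: float) -> str:
        # Gravity mostly along the device's long (y) axis suggests the user
        # holds it upright; gravity mostly along the short (x) axis suggests
        # the device is on its side. The axis naming is an assumption.
        return "portrait" if abs(gy) >= abs(gx) else "landscape"

    # Device held upright: roughly 9.8 m/s^2 of gravity along -y.
    print(choose_orientation(0.3, -9.7))  # -> portrait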
The gyro sensor 712 may detect the body direction and rotation angle of the terminal 700, and may cooperate with the acceleration sensor 711 to collect the user's 3D motion of the terminal 700. Based on the data collected by the gyro sensor 712, the processor 701 may implement the following functions: motion sensing (such as changing the UI according to a tilting operation by the user), image stabilization during shooting, game control, and inertial navigation.
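One common way to fuse gyroscope rates with accelerometer gravity readings for this kind of motion sensing is a complementary filter; the single-axis Python sketch below is illustrative only, and the fusion coefficient alpha is an assumed value, not taken from the application.

    import math

    def fuse_tilt(prev_angle, gyro_rate, ax, az, dt, alpha=0.98):
        # Short-term: integrate the gyroscope's angular rate (rad/s).
        gyro_angle = prev_angle + gyro_rate * dt
        # Long-term: the tilt implied by gravity corrects gyro drift.
        accel_angle = math.atan2(ax, az)
        return alpha * gyro_angle + (1 - alpha) * accel_angle

    # One 10 ms step: a slight rotation, with gravity almost along z.
    print(fuse_tilt(0.0, 0.1, 0.5, 9.8, 0.01))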
The pressure sensor 713 may be disposed on a side frame of the terminal 700 and/or on a lower layer of the display screen 705. When the pressure sensor 713 is disposed on a side frame of the terminal 700, it can detect the user's grip signal on the terminal 700, and the processor 701 performs left/right hand recognition or a shortcut operation according to the grip signal collected by the pressure sensor 713. When the pressure sensor 713 is disposed on the lower layer of the display screen 705, the processor 701 controls the operable controls on the UI according to the user's pressure operation on the display screen 705. The operable controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 714 is used to collect the user's fingerprint; the processor 701 identifies the user's identity according to the fingerprint collected by the fingerprint sensor 714, or the fingerprint sensor 714 identifies the user's identity according to the collected fingerprint. When the user's identity is identified as a trusted identity, the processor 701 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 714 may be disposed on the front, back, or side of the terminal 700. When a physical button or a vendor logo is provided on the terminal 700, the fingerprint sensor 714 may be integrated with the physical button or the vendor logo.
The optical sensor 715 is used to collect the ambient light intensity. In one embodiment, the processor 701 may control the display brightness of the display screen 705 based on the ambient light intensity collected by the optical sensor 715. Specifically, when the ambient light intensity is high, the display brightness of the display screen 705 is increased; when the ambient light intensity is low, the display brightness of the display screen 705 is decreased. In another embodiment, the processor 701 may also dynamically adjust the shooting parameters of the camera assembly 706 based on the ambient light intensity collected by the optical sensor 715.
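As a minimal sketch of the brightness adjustment just described (the lux ceiling, the brightness floor, and the 0-255 scale are assumed values for illustration, not taken from the application):

    def display_brightness(lux: float, max_lux: float = 1000.0) -> int:
        # Clamp the ambient illuminance into [0, 1] of the assumed range,
        # then map it to a 0-255 backlight level with a readable floor.
        level = min(max(lux / max_lux, 0.0), 1.0)
        return round(30 + level * (255 - 30))

    print(display_brightness(800.0))  # bright room -> high backlight
    print(display_brightness(5.0))    # dim room    -> near the floor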
The proximity sensor 716, also referred to as a distance sensor, is typically disposed on the front panel of the terminal 700. The proximity sensor 716 is used to collect the distance between the user and the front surface of the terminal 700. In one embodiment, when the proximity sensor 716 detects that the distance between the user and the front surface of the terminal 700 gradually decreases, the processor 701 controls the display screen 705 to switch from the screen-on state to the screen-off state; when the proximity sensor 716 detects that the distance between the user and the front surface of the terminal 700 gradually increases, the processor 701 controls the display screen 705 to switch from the screen-off state to the screen-on state.
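This switching can be sketched as threshold hysteresis on the measured distance; the two thresholds below are assumptions chosen for illustration, and using a pair of them (rather than one cut-off) keeps the screen from flickering when the distance hovers near a single value.

    def next_screen_state(state: str, distance_cm: float,
                          off_below: float = 3.0, on_above: float = 5.0) -> str:
        # Screen goes off only once the user is close, and back on only
        # once the user has clearly moved away again (hysteresis).
        if state == "on" and distance_cm < off_below:
            return "off"   # user approaching the front panel
        if state == "off" and distance_cm > on_above:
            return "on"    # user moving away again
        return state

    print(next_screen_state("on", 2.0))   # -> off
    print(next_screen_state("off", 8.0))  # -> on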
Those skilled in the art will appreciate that the structure shown in fig. 7 does not constitute a limitation of the terminal 700, and the terminal may include more or fewer components than those shown, combine some components, or adopt a different component arrangement.
The computer device provided by the embodiments of the present application may also be provided as a server. Fig. 8 is a schematic structural diagram of a server according to an embodiment of the present application. The server 800 may vary greatly due to differences in configuration or performance, and may include one or more processors (CPUs) 801 and one or more memories 802, where the memory 802 stores at least one program code that is loaded and executed by the processor 801 to implement the method for training a scoring model provided by the above method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input interface for obtaining input, and the server may further include other components for implementing device functions, which are not described herein again.
In an exemplary embodiment, a computer-readable storage medium is also provided, such as a memory including program code, where the program code is executable by a processor in a terminal or a server to perform the method of training a scoring model in the above embodiments. For example, the computer-readable storage medium may be a ROM (read-only memory), a RAM (random access memory), a CD-ROM (compact disc read-only memory), a magnetic tape, a floppy disk, an optical data storage device, or the like.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or by program code instructing relevant hardware. The program code may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like.
The above description is merely an exemplary embodiment of the present application and is not intended to limit the present application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the protection scope of the present application.