CN118113864A

Movatterモバイル変換

Info

Publication number: CN118113864A
Application number: CN202211526406.4A
Authority: CN
Inventors: 樊鹏
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2022-11-30
Filing date: 2022-11-30
Publication date: 2024-05-31

Abstract

The application provides a text emotion classification method, a text emotion classification device, electronic equipment and a storage medium; the method comprises the following steps: acquiring an original sample and an emotion tag marked for the original sample; modifying the original sample based on a keyword modification strategy to obtain an countermeasure sample; carrying out fusion processing on single feature vectors corresponding to the original samples and the countermeasure samples respectively to obtain a text multi-feature fusion vector, wherein the text multi-feature fusion vector is used for constructing prior distribution; generating a generation sample conforming to the target emotion label based on the potential variable and the target emotion label obtained from prior distribution sampling; training a text emotion classification model based on an original sample carrying emotion labels and a generated sample carrying target emotion labels, wherein the trained text emotion classification model is used for determining emotion classification results of texts to be classified. According to the application, the accuracy of the text emotion classification result can be improved.

Description

Translated fromChinese

文本情感分类方法、装置、电子设备及存储介质Text sentiment classification method, device, electronic device and storage medium

技术领域Technical Field

本申请涉及互联网技术领域，尤其涉及一种文本情感分类方法、装置、电子设备及存储介质。The present application relates to the field of Internet technology, and in particular to a text sentiment classification method, device, electronic device and storage medium.

背景技术Background technique

在社交网络中，常常需要确定用户发表的文本表达(例如评论数据)的情感类别，例如针对某一种上市的产品(例如网络会议客户端)，用户会对该产品发表评论，上述评论可以是文本格式，为了能够从上述评论中确定用户是否对该产品满意，可以确定用户发表的评论的情感类别，若该评论对应的情感类别是正向的(即积极类别)，则可以认为用户对该产品较为满意；若该评论对应的情感类别是负向的(即消极类别)，则可以认为用户对该产品不满意。In social networks, it is often necessary to determine the sentiment category of text expressions (such as comment data) posted by users. For example, for a certain product on the market (such as a web conferencing client), users will comment on the product. The above comments can be in text format. In order to determine whether the user is satisfied with the product from the above comments, the sentiment category of the comment posted by the user can be determined. If the sentiment category corresponding to the comment is positive (i.e., positive category), it can be considered that the user is relatively satisfied with the product; if the sentiment category corresponding to the comment is negative (i.e., negative category), it can be considered that the user is dissatisfied with the product.

相关技术中，通常是基于人工经验确定数据规则，例如产品运营人员基于业务经验，设定人工识别的规则，例如文本中带有“很好”、“不错”等字眼，则认为属于正向情感的概率更高；此外，相关技术还提供了基于非深度学习的数据挖掘方法，通过构建多维特征和模型训练的方法，预测当前用户身份属于不同标签值的概率。In related technologies, data rules are usually determined based on manual experience. For example, product operators set manual recognition rules based on business experience. For example, if a text contains words such as "very good" or "good", it is considered to have a higher probability of positive sentiment. In addition, related technologies also provide data mining methods based on non-deep learning, which predict the probability of the current user identity belonging to different label values by constructing multi-dimensional features and model training methods.

然而，相关技术提供的基于人工经验确定数据规则的方法，不仅使用规则数量非常有限，而且无法捕捉规则之间交互的高维特征信息，导致无法确定每个规则的最优参数；此外，由于在“产品使用评论正负向情感预测”场景，由于用户的行为特征较为复杂，特征信息在数据表征上很难显式表达，因此，相关技术提供的基于非深度学习的数据挖掘方法，不适合“产品使用评论正负向情感预测”场景的用户的特征信息的学习，特别是在强调数据安全的业务场景下，往往容易形成数据孤岛，导致分类效果不佳。However, the method of determining data rules based on artificial experience provided by related technologies not only uses a very limited number of rules, but also cannot capture the high-dimensional feature information of the interactions between rules, resulting in the inability to determine the optimal parameters of each rule; in addition, in the "positive and negative sentiment prediction of product usage reviews" scenario, due to the complex behavioral characteristics of users, the feature information is difficult to express explicitly in data representation. Therefore, the data mining method based on non-deep learning provided by related technologies is not suitable for learning the feature information of users in the "positive and negative sentiment prediction of product usage reviews" scenario, especially in business scenarios that emphasize data security, it is often easy to form data islands, resulting in poor classification results.

发明内容Summary of the invention

本申请实施例提供一种文本情感分类方法、装置、电子设备、计算机可读存储介质及计算机程序产品，能够提高文本情感分类结果的准确率。The embodiments of the present application provide a text sentiment classification method, device, electronic device, computer-readable storage medium and computer program product, which can improve the accuracy of text sentiment classification results.

本申请实施例的技术方案是这样实现的：The technical solution of the embodiment of the present application is implemented as follows:

本申请实施例提供一种文本情感分类方法，包括：The present application embodiment provides a text sentiment classification method, including:

获取原始样本、以及针对所述原始样本标注的情感标签；Obtaining original samples and sentiment labels annotated for the original samples;

基于关键词修改策略对所述原始样本进行修改，得到对抗样本；Modifying the original sample based on a keyword modification strategy to obtain an adversarial sample;

对所述原始样本和所述对抗样本分别对应的单特征向量进行融合处理，得到文本多特征融合向量，其中，所述文本多特征融合向量用于构建先验分布；The single feature vectors corresponding to the original sample and the adversarial sample are respectively fused to obtain a text multi-feature fusion vector, wherein the text multi-feature fusion vector is used to construct a prior distribution;

基于从所述先验分布采样得到的潜在变量和目标情感标签，生成符合所述目标情感标签的生成样本；Based on the latent variables sampled from the prior distribution and the target emotion label, generating a generated sample that conforms to the target emotion label;

基于携带所述情感标签的所述原始样本、以及携带所述目标情感标签的所述生成样本，对文本情感分类模型进行训练，其中，训练后的所述文本情感分类模型用于确定待分类文本的情感分类结果。Based on the original samples carrying the emotion label and the generated samples carrying the target emotion label, a text emotion classification model is trained, wherein the trained text emotion classification model is used to determine the emotion classification result of the text to be classified.

本申请实施例提供一种文本情感分类装置，包括：The present application embodiment provides a text sentiment classification device, comprising:

获取模块，用于获取原始样本、以及针对所述原始样本标注的情感标签；An acquisition module, used to acquire original samples and sentiment labels annotated for the original samples;

修改模块，用于基于关键词修改策略对所述原始样本进行修改，得到对抗样本；A modification module, used to modify the original sample based on a keyword modification strategy to obtain an adversarial sample;

融合模块，用于对所述原始样本和所述对抗样本分别对应的单特征向量进行融合处理，得到文本多特征融合向量，其中，所述文本多特征融合向量用于构建先验分布；A fusion module is used to fuse the single feature vectors corresponding to the original sample and the adversarial sample to obtain a text multi-feature fusion vector, wherein the text multi-feature fusion vector is used to construct a prior distribution;

生成模块，用于基于从所述先验分布采样得到的潜在变量和目标情感标签，生成符合所述目标情感标签的生成样本；A generation module, used to generate a generated sample that conforms to the target emotion label based on the latent variables sampled from the prior distribution and the target emotion label;

训练模块，用于基于携带所述情感标签的所述原始样本、以及携带所述目标情感标签的所述生成样本，对文本情感分类模型进行训练，其中，训练后的所述文本情感分类模型用于确定待分类文本的情感分类结果。A training module is used to train a text sentiment classification model based on the original samples carrying the sentiment label and the generated samples carrying the target sentiment label, wherein the trained text sentiment classification model is used to determine the sentiment classification result of the text to be classified.

本申请实施例提供一种电子设备，包括：An embodiment of the present application provides an electronic device, including:

存储器，用于存储可执行指令；A memory for storing executable instructions;

处理器，用于执行所述存储器中存储的可执行指令时，实现本申请实施例提供的文本情感分类方法。The processor is used to implement the text sentiment classification method provided in the embodiment of the present application when executing the executable instructions stored in the memory.

本申请实施例提供一种计算机可读存储介质，存储有计算机可执行指令，用于被处理器执行时，实现本申请实施例提供的文本情感分类方法。An embodiment of the present application provides a computer-readable storage medium storing computer-executable instructions for implementing the text sentiment classification method provided in the embodiment of the present application when executed by a processor.

本申请实施例提供一种计算机程序产品，包括计算机程序或计算机可执行指令，用于被处理器执行时，实现本申请实施例提供的文本情感分类方法。An embodiment of the present application provides a computer program product, including a computer program or computer executable instructions, which are used to implement the text sentiment classification method provided in the embodiment of the present application when executed by a processor.

本申请实施例具有以下有益效果：The embodiments of the present application have the following beneficial effects:

一方面通过基于关键词修改策略对原始样本进行修改的方式来得到对抗样本，可以解决如何在原始样本中确定需要修改的词语的位置、以及词语的修改尽量不影响文本情感分类模型的预测的问题；另一方面，通过将原始样本的单特征向量和对抗样本的单特征向量进行融合，即将单特征向量扩展为多特征融合向量，从而可以更好地学习样本的特征信息，从而提高后续训练得到的文本情感分类模型针对待分类文本的情感分类结果的准确率。On the one hand, by modifying the original sample based on the keyword modification strategy to obtain the adversarial sample, we can solve the problem of how to determine the position of the words that need to be modified in the original sample, and how to modify the words as little as possible to minimize the impact on the prediction of the text sentiment classification model; on the other hand, by fusing the single feature vector of the original sample and the single feature vector of the adversarial sample, that is, expanding the single feature vector into a multi-feature fusion vector, we can better learn the feature information of the sample, thereby improving the accuracy of the sentiment classification results of the text sentiment classification model obtained by subsequent training for the text to be classified.

附图说明BRIEF DESCRIPTION OF THE DRAWINGS

图1是本申请实施例提供的文本情感分类系统100的架构示意图；FIG1 is a schematic diagram of the architecture of a text sentiment classification system 100 provided in an embodiment of the present application;

图2是本申请实施例提供的服务器200的结构示意图；FIG2 is a schematic diagram of the structure of a server 200 provided in an embodiment of the present application;

图3是本申请实施例提供的文本情感分类方法的流程示意图；FIG3 is a flow chart of a text sentiment classification method provided in an embodiment of the present application;

图4是本申请实施例提供的文本情感分类方法的流程示意图；FIG4 is a flow chart of a text sentiment classification method provided in an embodiment of the present application;

图5是本申请实施例提供的文本情感分类方法的流程示意图；FIG5 is a flow chart of a text sentiment classification method provided in an embodiment of the present application;

图6A至图6C是本申请实施例提供的文本情感分类方法的流程示意图；6A to 6C are schematic diagrams of a flow chart of a text sentiment classification method provided in an embodiment of the present application;

图7是本申请实施例提供的针对评论数据进行预处理的流程示意图；FIG7 is a schematic diagram of a process for preprocessing comment data provided in an embodiment of the present application;

图8A和图8B是本申请实施例提供的产品反馈界面示意图；8A and 8B are schematic diagrams of product feedback interfaces provided in embodiments of the present application;

图9是本申请实施例提供的对标注的评论数据进行随机分组的流程示意图；FIG9 is a schematic diagram of a process for randomly grouping annotated comment data provided by an embodiment of the present application;

图10是本申请实施例提供的文本对抗样本的生成流程示意图；FIG10 is a schematic diagram of a flow chart of a text adversarial sample generation process provided in an embodiment of the present application;

图11是本申请实施例提供的文本词向量特征表示流程示意图；FIG11 is a schematic diagram of a text word vector feature representation process provided in an embodiment of the present application;

图12是本申请实施例提供的BERT训练流程示意图；FIG12 is a schematic diagram of a BERT training process provided in an embodiment of the present application;

图13是本申请实施例提供的文本字向量特征表示流程示意图；FIG13 is a schematic diagram of a text word vector feature representation process provided in an embodiment of the present application;

图14是本申请实施例提供的Word2vec训练流程示意图；FIG14 is a schematic diagram of the Word2vec training process provided in an embodiment of the present application;

图15是本申请实施例提供的ACGAN模型的结构示意图；FIG15 is a schematic diagram of the structure of the ACGAN model provided in an embodiment of the present application;

图16是本申请实施例提供的不同方案的效果对比示意图。FIG. 16 is a schematic diagram comparing the effects of different solutions provided in the embodiments of the present application.

具体实施方式Detailed ways

为了使本申请的目的、技术方案和优点更加清楚，下面将结合附图对本申请作进一步地详细描述，所描述的实施例不应视为对本申请的限制，本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其它实施例，都属于本申请保护的范围。In order to make the purpose, technical solutions and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings. The described embodiments should not be regarded as limiting the present application. All other embodiments obtained by ordinary technicians in the field without making creative work are within the scope of protection of this application.

在以下的描述中，涉及到“一些实施例”，其描述了所有可能实施例的子集，但是可以理解,“一些实施例”可以是所有可能实施例的相同子集或不同子集，并且可以在不冲突的情况下相互结合。In the following description, reference is made to “some embodiments”, which describe a subset of all possible embodiments, but it can be understood that “some embodiments” may be the same subset or different subsets of all possible embodiments and may be combined with each other without conflict.

可以理解的是，在本申请实施例中，涉及到用户信息等相关的数据(例如用户针对产品的评论数据)，当本申请实施例运用到具体产品或技术中时，需要获得用户许可或者同意，且相关数据的收集、使用和处理需要遵守相关国家和地区的相关法律法规和标准。It is understandable that in the embodiments of the present application, when user information and other related data (such as user review data on products) are involved, when the embodiments of the present application are applied to specific products or technologies, user permission or consent is required, and the collection, use and processing of relevant data need to comply with relevant laws, regulations and standards of relevant countries and regions.

在以下的描述中，所涉及的术语“第一\第二\...”仅仅是是区别类似的对象，不代表针对对象的特定排序，可以理解地，“第一\第二\...”在允许的情况下可以互换特定的顺序或先后次序，以使这里描述的本申请实施例能够以除了在这里图示或描述的以外的顺序实施。In the following description, the terms "first\second\..." are merely used to distinguish similar objects and do not represent a specific ordering of the objects. It can be understood that "first\second\..." can be interchanged with a specific order or sequence where permitted, so that the embodiments of the present application described herein can be implemented in an order other than that illustrated or described herein.

除非另有定义，本文所使用的所有的技术和科学术语与属于本申请的技术领域的技术人员通常理解的含义相同。本文中所使用的术语只是为了描述本申请实施例的目的，不是旨在限制本申请。Unless otherwise defined, all technical and scientific terms used herein have the same meaning as those commonly understood by those skilled in the art to which this application belongs. The terms used herein are only for the purpose of describing the embodiments of this application and are not intended to limit this application.

对本申请实施例进行进一步详细说明之前，对本申请实施例中涉及的名词和术语进行说明，本申请实施例中涉及的名词和术语适用于如下的解释。Before further describing the embodiments of the present application in detail, the nouns and terms involved in the embodiments of the present application are explained. The nouns and terms involved in the embodiments of the present application are subject to the following interpretations.

1)对抗样本(Adversarial Examples)：指在数据集中通过故意添加细微的干扰所形成的输入样本，导致模型以高置信度给出一个错误的输出。即模型的输出结果容易被添加了微小扰动的输入影响，这些添加了故意生成的微小扰动的输入就称为对抗样本。1) Adversarial Examples: refers to input samples formed by intentionally adding slight disturbances to the data set, causing the model to give an erroneous output with high confidence. That is, the output of the model is easily affected by the input with slight perturbations added. These inputs with intentionally generated slight perturbations are called adversarial examples.

2)先验分布(Prior Distribution)：是概率分布的一种，与“后验分布”相对，与试验结果无关，或与随机抽样无关，反映在进行统计试验之前根据其他有关参数θ的知识而得到的分布。2) Prior Distribution: It is a type of probability distribution, which is opposite to the "posterior distribution". It is independent of the test results or random sampling. It reflects the distribution obtained based on the knowledge of other relevant parameters θ before conducting the statistical test.

3)KL距离：是Kullback-Leibler差异的简称，也叫做相对熵(Relative Entropy)，它衡量的是相同事件空间里的两个概率分布的差异情况，并不是一种距离度量方式，其物理意义是：在相同事件空间里，概率分布P(x)对应的每个事件，若用概率分布Q(x)编码时，平均每个事件编码长度增加了多少比特。3) KL distance: It is the abbreviation of Kullback-Leibler difference, also called relative entropy. It measures the difference between two probability distributions in the same event space. It is not a distance measurement method. Its physical meaning is: in the same event space, for each event corresponding to the probability distribution P(x), if it is encoded with the probability distribution Q(x), how many bits are added to the average encoding length of each event.

4)变分自编码器(VAE，Variational Auto-Encoder)：利用编码器对输入数据进行编码后得到潜在变量z，再利用解码器对潜在变量进行解码得到输出，最终通过优化输入与输出的差异更新模型参数得以训练。4) Variational Auto-Encoder (VAE): The encoder is used to encode the input data to obtain the latent variable z, and then the decoder is used to decode the latent variable to obtain the output. Finally, the model parameters are updated by optimizing the difference between the input and the output to achieve training.

5)生成对抗网络(GAN，Generative Adversarial Networks)：是一种深度学习模型，是近年来复杂分布上无监督学习最具前景的方法之一，模型通过框架中至少两个模块：生成模型(Generative Model)和判别模型(Discriminative Model)的互相博弈学习产生相当好的输出。5) Generative Adversarial Networks (GAN): It is a deep learning model and one of the most promising methods for unsupervised learning on complex distributions in recent years. The model produces fairly good output through the mutual game learning of at least two modules in the framework: the generative model and the discriminative model.

6)注意力机制(Attention Mechanism)：源于对人类视觉的研究，在认知科学中，由于信息处理的瓶颈，人类会选择性地关注所有信息的一部分，同时忽略其他可见的信息，上述机制通常被称为注意力机制。注意力机制主要有两个方面：决定需要关注输入的哪部分，分配有限的信息处理资源给重要的部分。6) Attention Mechanism: Originated from the study of human vision, in cognitive science, due to the bottleneck of information processing, humans will selectively focus on part of all information and ignore other visible information. The above mechanism is usually called the attention mechanism. The attention mechanism has two main aspects: deciding which part of the input needs to be paid attention to and allocating limited information processing resources to the important part.

7)文本：由字符、标点符号进行合理组合而生成的可被人类理解的一种非结构化数据。7) Text: An unstructured data that is understandable to humans and is generated by a reasonable combination of characters and punctuation marks.

8)关键词修改策略：在英文文本中，单词是由一个个字母组成的，并且基于学习的分类算法通过字典来表示有限个单词，但是由26个字母所组合而成的单词数量却几乎是无限的。因此，在英文语境下的文本对抗样本生成算法中，最常使用的修改策略都可以概括为，把一个在字典中存在的正确拼写的单词进行细微修改后，得到一个尚未在字典中出现过的单词。这些常见的修改策略包括：随机变换单词中的相邻字母；随机替换单词中的某个字母；随机在单词中插入字母或字符；随机在单词中删除除首字母和尾字母以外的字母等。8) Keyword modification strategy: In English text, words are composed of letters, and the learning-based classification algorithm uses a dictionary to represent a limited number of words, but the number of words composed of 26 letters is almost infinite. Therefore, in the text adversarial sample generation algorithm in the English context, the most commonly used modification strategy can be summarized as making a slight modification to a correctly spelled word in the dictionary to obtain a word that has not appeared in the dictionary. These common modification strategies include: randomly changing adjacent letters in a word; randomly replacing a letter in a word; randomly inserting letters or characters in a word; randomly deleting letters except the first and last letters in a word, etc.

此外，由于英文和中文的天然差异，以上修改策略在中文文本中并不适用，仅使用针对英文数据所提出的修改策略，如对汉字进行简单以及缺乏精心设计的替换或删除等操作可能并不会改变预测结果，反而会影响文本所表达的原始语义。换句话说，用于中文语境的修改策略需要进行更多样化的尝试和有针对性的选择。在本申请实施例中，希望在生成对抗样本时仅对原始样本添加细微的扰动，以保证在视觉或形态上对于人类来说是相似的。基于上述分析，本申请实施例可以采用三种针对中文文本的关键词修改策略，包括汉字交换、字符插入和汉字拆分替换，使得生成的对抗样本达到愚弄文本分类模型的目的。In addition, due to the natural differences between English and Chinese, the above modification strategies are not applicable to Chinese texts. Only using the modification strategies proposed for English data, such as simple and uncarefully designed replacement or deletion of Chinese characters, may not change the prediction results, but may affect the original semantics expressed by the text. In other words, modification strategies for Chinese contexts require more diverse attempts and targeted selections. In an embodiment of the present application, it is hoped that only slight disturbances are added to the original sample when generating adversarial samples to ensure that they are visually or morphologically similar to humans. Based on the above analysis, an embodiment of the present application can adopt three keyword modification strategies for Chinese text, including Chinese character exchange, character insertion, and Chinese character splitting and replacement, so that the generated adversarial samples can achieve the purpose of fooling the text classification model.

本申请实施例提供一种文本情感分类处理方法、装置、电子设备、计算机可读存储介质及计算机程序产品，能够提高文本情感分类结果的准确率。下面说明本申请实施例提供的电子设备的示例性应用，本申请实施例提供的电子设备可以实施为终端设备，或者实施为服务器，也可以由终端设备和服务器协同实施。下面以由服务器单独实施本申请实施例提供的文本情感分类方法为例进行说明。The embodiments of the present application provide a text sentiment classification processing method, device, electronic device, computer-readable storage medium and computer program product, which can improve the accuracy of text sentiment classification results. The following describes an exemplary application of the electronic device provided by the embodiment of the present application. The electronic device provided by the embodiment of the present application can be implemented as a terminal device, or as a server, or can be implemented by a terminal device and a server in collaboration. The following is an example of a text sentiment classification method provided by the embodiment of the present application being implemented by a server alone.

参见图1，图1是本申请实施例提供的文本情感分类系统100的架构示意图，为实现支撑提高文本情感分类结果的准确率的应用，如图1所示，文本情感分类系统100包括：服务器200、网络300、终端设备400、以及数据库500，其中，网络300可以是局域网，或者广域网，又或者是二者的组合，终端设备400是用户(例如产品运营人员)关联的终端设备，在终端设备400上运行有客户端410，客户端410可以是用于对待分类文本进行情感分类的客户端。Refer to Figure 1, which is a schematic diagram of the architecture of a text sentiment classification system 100 provided in an embodiment of the present application. In order to implement an application that supports improving the accuracy of text sentiment classification results, as shown in Figure 1, the text sentiment classification system 100 includes: a server 200, a network 300, a terminal device 400, and a database 500, wherein the network 300 can be a local area network, or a wide area network, or a combination of the two, the terminal device 400 is a terminal device associated with a user (such as a product operator), and a client 410 runs on the terminal device 400, and the client 410 can be a client for performing sentiment classification on the text to be classified.

在一些实施例中，服务器200可以首先从数据库500中获取原始样本(例如原始文本样本)、以及针对原始样本标注的情感标签，其中，情感标签用于表征原始样本的情感极性(即原始样本所表达的情感是正向情感还是负向情感)，接着，服务器200可以基于关键词修改策略对原始样本进行修改，得到对抗样本(即文本对抗样本)，并对原始样本和对抗样本分别对应的单特征向量进行融合处理，得到文本多特征融合向量，其中，文本多特征融合向量可以用于构建先验分布，随后服务器200可以对先验分布进行随机采样，得到潜在变量，并基于采样得到的潜在变量和目标情感标签，生成符合目标情感标签的生成样本(即可以通过潜在变量和当前所需要结合的情感标签生成多样化的情感文本)，最后服务器200可以基于携带情感标签的原始样本、以及携带目标情感标签的生成样本，对文本情感分类模型进行训练。In some embodiments, the server 200 may first obtain the original sample (e.g., the original text sample) and the emotion label annotated for the original sample from the database 500, wherein the emotion label is used to characterize the emotion polarity of the original sample (i.e., whether the emotion expressed by the original sample is positive emotion or negative emotion). Then, the server 200 may modify the original sample based on the keyword modification strategy to obtain an adversarial sample (i.e., a text adversarial sample), and fuse the single feature vectors corresponding to the original sample and the adversarial sample respectively to obtain a text multi-feature fusion vector, wherein the text multi-feature fusion vector may be used to construct a prior distribution. Subsequently, the server 200 may randomly sample the prior distribution to obtain latent variables, and based on the sampled latent variables and the target emotion label, generate a generated sample that meets the target emotion label (i.e., a diverse emotional text may be generated by combining the latent variables and the emotion label currently required). Finally, the server 200 may train the text emotion classification model based on the original sample carrying the emotion label and the generated sample carrying the target emotion label.

在另一些实施例中，承接上述示例，服务器200在接收到终端设备400通过网络200发送的待分类文本(例如用户针对产品的评论数据)时，可以调用训练后的文本情感分类模型对待分类文本进行预测，得到待分类文本的情感分类结果，随后服务器200可以通过网络300将针对待分类文本的情感分类结果返回给终端设备400，以使终端设备400调用客户端410的人机交互界面进行呈现，如此，产品运营人员可以根据服务器200发送的情感分类结果了解用户对于该产品的态度，进而有针对性地进行改进。In other embodiments, following the above example, when the server 200 receives the text to be classified (such as user comment data on a product) sent by the terminal device 400 through the network 200, the server 200 can call the trained text sentiment classification model to predict the text to be classified, and obtain the sentiment classification result of the text to be classified. Subsequently, the server 200 can return the sentiment classification result of the text to be classified to the terminal device 400 through the network 300, so that the terminal device 400 calls the human-computer interaction interface of the client 410 for presentation. In this way, the product operator can understand the user's attitude towards the product based on the sentiment classification result sent by the server 200, and then make targeted improvements.

在一些实施例中，本申请实施例还可以借助于云技术(Cloud Technology)实现，云技术是指在广域网或局域网内将硬件、软件、网络等系列资源统一起来，实现数据的计算、储存、处理和共享的一种托管技术。In some embodiments, the embodiments of the present application can also be implemented with the aid of cloud technology. Cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software, and networks within a wide area network or a local area network to achieve data calculation, storage, processing, and sharing.

云技术是基于云计算商业模式应用的网络技术、信息技术、整合技术、管理平台技术、以及应用技术等的总称，可以组成资源池，按需所用，灵活便利。云计算技术将变成重要支撑。技术网络系统的后台服务需要大量的计算、存储资源。Cloud technology is a general term for network technology, information technology, integration technology, management platform technology, and application technology based on the cloud computing business model. It can form a resource pool that can be used on demand and is flexible and convenient. Cloud computing technology will become an important support. The background services of the technical network system require a large amount of computing and storage resources.

示例的，图1中的服务器200可以是独立的物理服务器，也可以是多个物理服务器构成的服务器集群或者分布式系统，还可以是提供云服务、云数据库、云计算、云函数、云存储、网络服务、云通信、中间件服务、域名服务、安全服务、内容分发网络(CDN，ContentDelivery Network)、以及大数据和人工智能平台等基础云计算服务的云服务器。终端设备400可以是智能手机、平板电脑、笔记本电脑、台式计算机、智能音箱、智能手表、车载终端等，但并不局限于此。终端设备400以及服务器200可以通过有线或无线通信方式进行直接或间接地连接，本申请实施例中不做限制。For example, the server 200 in FIG1 can be an independent physical server, or a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content distribution networks (CDN, Content Delivery Network), and big data and artificial intelligence platforms. The terminal device 400 can be a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, a car terminal, etc., but is not limited to this. The terminal device 400 and the server 200 can be directly or indirectly connected by wired or wireless communication, which is not limited in the embodiments of the present application.

下面继续对本申请实施例提供的电子设备的结构进行说明。以电子设备为服务器为例，参见图2，图2是本申请实施例提供的服务器200的结构示意图，图2所示的服务器200包括：至少一个处理器210、存储器240、至少一个网络接口220。服务器200中的各个组件通过总线系统230耦合在一起。可理解，总线系统230用于实现这些组件之间的连接通信。总线系统230除包括数据总线之外，还包括电源总线、控制总线和状态信号总线。但是为了清楚说明起见，在图2中将各种总线都标为总线系统230。The structure of the electronic device provided by the embodiment of the present application will be described below. Taking the electronic device as a server as an example, referring to FIG. 2, FIG. 2 is a schematic diagram of the structure of the server 200 provided by the embodiment of the present application, and the server 200 shown in FIG. 2 includes: at least one processor 210, a memory 240, and at least one network interface 220. The various components in the server 200 are coupled together by a bus system 230. It is understandable that the bus system 230 is used to realize the connection communication between these components. In addition to the data bus, the bus system 230 also includes a power bus, a control bus, and a status signal bus. However, for the sake of clarity, various buses are marked as bus systems 230 in FIG.

处理器210可以是一种集成电路芯片，具有信号的处理能力，例如通用处理器、数字信号处理器(DSP，Digital Signal Processor)，或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等，其中，通用处理器可以是微处理器或者任何常规的处理器等。The processor 210 can be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor, a digital signal processor (DSP), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc., where the general-purpose processor can be a microprocessor or any conventional processor, etc.

存储器240可以是可移除的，不可移除的或其组合。示例性的硬件设备包括固态存储器，硬盘驱动器，光盘驱动器等。存储器240可选地包括在物理位置上远离处理器210的一个或多个存储设备。The memory 240 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid-state memory, hard disk drives, optical disk drives, etc. The memory 240 may optionally include one or more storage devices that are physically remote from the processor 210.

存储器240包括易失性存储器或非易失性存储器，也可包括易失性和非易失性存储器两者。非易失性存储器可以是只读存储器(ROM，Read Only Memory)，易失性存储器可以是随机存取存储器(RAM，Random Access Memory)。本申请实施例描述的存储器240旨在包括任意适合类型的存储器。The memory 240 includes a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memories. The non-volatile memory may be a read-only memory (ROM), and the volatile memory may be a random access memory (RAM). The memory 240 described in the embodiments of the present application is intended to include any suitable type of memory.

在一些实施例中，存储器240能够存储数据以支持各种操作，这些数据的示例包括程序、模块和数据结构或者其子集或超集，下面示例性说明。In some embodiments, memory 240 can store data to support various operations, examples of which include programs, modules, and data structures, or a subset or superset thereof, as exemplarily described below.

操作系统241，包括用于处理各种基本系统服务和执行硬件相关任务的系统程序，例如框架层、核心库层、驱动层等，用于实现各种基础业务以及处理基于硬件的任务；Operating system 241, including system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks;

网络通信模块242，用于经由一个或多个(有线或无线)网络接口520到达其他计算设备，示例性的网络接口520包括：蓝牙、无线相容性认证(WiFi)、和通用串行总线(USB，Universal Serial Bus)等；A network communication module 242, for reaching other computing devices via one or more (wired or wireless) network interfaces 520, exemplary network interfaces 520 include: Bluetooth, Wireless Compatibility Authentication (WiFi), and Universal Serial Bus (USB);

在一些实施例中，本申请实施例提供的装置可以采用软件方式实现，图2示出了存储在存储器240中的文本情感分类装置243，其可以是程序和插件等形式的软件，包括以下软件模块：获取模块2431、修改模块2432、融合模块2433、生成模块2434、训练模块2435、确定模块2436、排序模块2437、卷积模块2438、池化模块2439、嵌入模块24310、提取模块24311和加权模块24312，这些模块是逻辑上的，因此根据所实现的功能可以进行任意的组合或进一步拆分。需要指出的是，在图2中为了方便表达，一次性示出了上述所有模块，但是不应视为在文本情感分类装置243排除了可以只包括获取模块2431、修改模块2432、融合模块2433、生成模块2434和训练模块2435的实施，将在下文中说明各个模块的功能。In some embodiments, the device provided in the embodiment of the present application can be implemented in software. FIG. 2 shows a text sentiment classification device 243 stored in a memory 240, which can be software in the form of a program and a plug-in, including the following software modules: an acquisition module 2431, a modification module 2432, a fusion module 2433, a generation module 2434, a training module 2435, a determination module 2436, a sorting module 2437, a convolution module 2438, a pooling module 2439, an embedding module 24310, an extraction module 24311, and a weighting module 24312. These modules are logical, so they can be arbitrarily combined or further split according to the functions implemented. It should be pointed out that in FIG. 2, for the convenience of expression, all the above modules are shown at one time, but it should not be regarded as excluding the implementation of the text sentiment classification device 243 that can only include the acquisition module 2431, the modification module 2432, the fusion module 2433, the generation module 2434, and the training module 2435. The functions of each module will be explained below.

下面将结合本申请实施例提供的服务器的示例性应用和实施，对本申请实施例提供的文本情感分类方法进行具体说明。The text sentiment classification method provided in the embodiment of the present application will be specifically described below in combination with the exemplary application and implementation of the server provided in the embodiment of the present application.

参见图3，图3是本申请实施例提供的文本情感分类方法的流程示意图，将结合图3示出的步骤进行说明。Refer to Figure 3, which is a flow chart of the text sentiment classification method provided in an embodiment of the present application, which will be explained in conjunction with the steps shown in Figure 3.

在步骤101中，获取原始样本、以及针对原始样本标注的情感标签。In step 101, original samples and sentiment labels annotated for the original samples are obtained.

这里，情感标签可以用于表征原始样本的情感极性，例如以原始样本为用户针对产品(例如网络会议客户端)的评论数据为例，用户对于产品的评论数据对应的情感一般可以分为正向情感和负向情感，当评论数据所表达的情感为正向情感时(例如评述数据包括不错、真好等字眼)，对应的情感标签可以为“1”；当评论数据所表达的情感为负向情感时(例如评论数据包括真差劲、不好用等字眼)，对应的情感标签可以为“0”。Here, the sentiment label can be used to characterize the sentiment polarity of the original sample. For example, taking the original sample as user comment data on a product (such as a web conferencing client), the sentiment corresponding to the user's comment data on the product can generally be divided into positive sentiment and negative sentiment. When the sentiment expressed by the comment data is positive (for example, the comment data includes words such as "not bad" and "really good"), the corresponding sentiment label can be "1"; when the sentiment expressed by the comment data is negative (for example, the comment data includes words such as "really bad" and "not easy to use"), the corresponding sentiment label can be "0".

在一些实施例中，可以通过以下方式获取原始样本、以及针对原始样本标注的情感标签：获取多个评论数据(例如可以是网页文本、新闻、报告等)；对所获取的多个评论数据进行预处理(例如包括分词、清洗、标准化等处理)，得到预处理后的多个评论数据；对预处理后的多个评论数据进行随机抽样，将被抽中的评论数据确定为原始样本(即原始文本样本)；基于人工经验对原始样本标注用于表征原始样本的情感极性的情感标签，例如对于图8A示出的评论数据801和评论数据802，由于评论数据801和评论数据802包含“很不错”、“效果很好”等积极情绪的字眼，则可以添加表征正向情感的情感标签，例如可以为评论数据801和评论数据802添加标签“1”；而对于图8B示出的评论数据803和评论数据804，由于评论数据803和评论数据804包含“听不太清”、“确实听不到”等消极情绪的字眼，则可以添加表征负向情感的情感标签，例如可以为评论数据803和评论数据804添加标签“0”。In some embodiments, the original sample and the sentiment label annotated for the original sample can be obtained in the following manner: obtaining a plurality of comment data (for example, web page text, news, reports, etc.); preprocessing the obtained plurality of comment data (for example, including word segmentation, cleaning, standardization, etc.) to obtain a plurality of preprocessed comment data; randomly sampling the plurality of preprocessed comment data, and determining the sampled comment data as the original sample (i.e., the original text sample); annotating the original sample with a sentiment label for characterizing the sentiment polarity of the original sample based on manual experience, for example, for the comment data 8 shown in FIG. 8A , 01 and comment data 802, since comment data 801 and comment data 802 contain words of positive emotions such as "very good" and "very effective", emotional labels representing positive emotions can be added, for example, label "1" can be added to comment data 801 and comment data 802; and for comment data 803 and comment data 804 shown in FIG8B, since comment data 803 and comment data 804 contain words of negative emotions such as "not very clear" and "really can't hear", emotional labels representing negative emotions can be added, for example, label "0" can be added to comment data 803 and comment data 804.

在步骤102中，基于关键词修改策略对原始样本进行修改，得到对抗样本。In step 102, the original sample is modified based on the keyword modification strategy to obtain an adversarial sample.

在一些实施例中，参见图4，图4是本申请实施例提供的文本情感分类方法的流程示意图，如图4所示，图3示出的步骤102可以通过图4示出的步骤1021至步骤1024实现，将结合图4示出的步骤进行说明。In some embodiments, referring to FIG. 4 , FIG. 4 is a flow chart of a text sentiment classification method provided in an embodiment of the present application. As shown in FIG. 4 , step 102 shown in FIG. 3 can be implemented through steps 1021 to 1024 shown in FIG. 4 , which will be explained in conjunction with the steps shown in FIG. 4 .

在步骤1021中，将原始样本切分为多个子句。In step 1021, the original sample is divided into multiple clauses.

在一些实施例中，可以对原始样本进行分词预处理，并利用一些特殊的标点符号将原始样本切分为多个子句，其中，标点符号的类型可以包括：逗号(，)、句号(。)、问号(？)、感叹号(！)，例如以原始样本为“因为白天要工作，培训只能晚上干。在会议中断开系统声音时，确实听不到会议中人的声音了”为例，可以将其切分为如下多个子句：“因为白天要工作”(子句1)、“培训只能晚上干”(子句2)、“在会议中断开系统声音时”(子句3)、“确实听不到会议中人的声音了”(子句4)。In some embodiments, the original sample can be pre-processed with word segmentation, and some special punctuation marks can be used to divide the original sample into multiple clauses, where the types of punctuation marks can include: comma (,), period (.), question mark (?), exclamation mark (!). For example, taking the original sample "Because I have to work during the day, the training can only be done at night. When the system sound is disconnected during the meeting, the voices of the people in the meeting can indeed not be heard" as an example, it can be divided into the following multiple clauses: "Because I have to work during the day" (Clause 1), "The training can only be done at night" (Clause 2), "When the system sound is disconnected during the meeting" (Clause 3), "The voices of the people in the meeting can indeed not be heard" (Clause 4).

在步骤1022中，遍历多个子句，将满足以下条件的子句添加到候选关键子句集中：从原始样本中移除子句后的情感分类结果与情感标签不同。In step 1022, multiple clauses are traversed, and clauses that meet the following conditions are added to the candidate key clause set: the sentiment classification result after removing the clause from the original sample is different from the sentiment label.

在一些实施例中，在将原始样本切分为多个子句之后，可以从原始样本中依次删除每一个子句，并将删除后的结果输入文本情感分类模型得到预测结果，当预测结果与原始的情感标签不同时(例如假设原始样本的情感标签为1，表征原始样本所表达的情感为正向情感，而基于删除后的结果预测得到的情感标签为2，表征删除后的结果所表达的情感为负向情感)，则将该删除的子句作为候选关键子句集中的一部分。In some embodiments, after the original sample is divided into multiple clauses, each clause can be deleted from the original sample in turn, and the deleted results can be input into the text sentiment classification model to obtain a prediction result. When the prediction result is different from the original sentiment label (for example, assuming that the sentiment label of the original sample is 1, indicating that the sentiment expressed by the original sample is positive, and the sentiment label predicted based on the deleted result is 2, indicating that the sentiment expressed by the deleted result is negative), the deleted clause is used as part of the candidate key clause set.

在步骤1023中，将候选关键子句集中的子句拆分为多个词语，并确定多个词语中的关键词。In step 1023, the clauses in the candidate key clause set are split into multiple words, and keywords in the multiple words are determined.

在一些实施例中，可以通过以下方式确定多个词语中的关键词：对拆分得到的多个词语进行词性标注，并删除属于无意义词性的词语，其中，无意义词性包括以下至少之一：介词、代词、数词、冠词；将多个词语中剩余的词语，确定为多个词语中的关键词。In some embodiments, keywords among multiple words can be determined in the following manner: perform part-of-speech tagging on the multiple words obtained by splitting, and delete words with meaningless parts of speech, where the meaningless parts of speech include at least one of the following: prepositions, pronouns, numerals, and articles; and determine the remaining words among the multiple words as keywords among the multiple words.

在另一些实施例中，在从拆分得到的多个词语中确定出关键词之后，还可以执行以下处理：确定每个关键词在候选关键子句集中的贡献度；按照贡献度从大到小的顺序对多个关键词进行排序，得到排序结果。In other embodiments, after determining the keywords from the multiple words obtained by splitting, the following processing can also be performed: determining the contribution of each keyword in the candidate key clause set; sorting the multiple keywords in descending order of contribution to obtain a sorting result.

示例的，在将候选关键子句集中的子句拆分成词语组成的形式之后，可以对候选关键子句集中的词语进行词性标注，并删除属于无意义词性的词语，随后计算候选关键子句集中剩余的词语(即关键词)的贡献度得分(例如可以根据关键词在候选关键子句集中出现的次数和位置，确定贡献度得分，出现的次数越多或者出现的位置越靠前，则对应的贡献度得分就越高)，最后将多个关键词按照贡献度得分进行降序排列，词语的排名越靠前则意味着其对分类结果的贡献越大。For example, after splitting the clauses in the candidate key clause set into a form composed of words, the words in the candidate key clause set can be tagged with parts of speech, and words with meaningless parts of speech can be deleted. Then, the contribution scores of the remaining words (i.e., keywords) in the candidate key clause set can be calculated (for example, the contribution score can be determined based on the number of times and position of the keyword in the candidate key clause set. The more times the keyword appears or the closer it appears, the higher the corresponding contribution score). Finally, multiple keywords are arranged in descending order according to their contribution scores. The higher the ranking of the word, the greater its contribution to the classification result.

在步骤1024中，按照预先设定的关键词修改策略对多个关键词进行修改，得到对抗样本。In step 1024, multiple keywords are modified according to a pre-set keyword modification strategy to obtain adversarial samples.

在一些实施例中，承接上述示例，可以通过以下方式实现上述的步骤1024：对排序结果中的前N个关键词逐个进行修改和替换，并同时预测修改后的文本的情感分类结果，其中，N为大于或等于1的正整数；当修改后的文本的情感分类结果与情感标签不同时，将修改后的文本确定为对抗样本。In some embodiments, following the above example, the above step 1024 can be implemented in the following manner: modify and replace the first N keywords in the sorting result one by one, and simultaneously predict the sentiment classification result of the modified text, where N is a positive integer greater than or equal to 1; when the sentiment classification result of the modified text is different from the sentiment label, the modified text is determined as an adversarial sample.

示例的，可以根据关键词修改策略对前N个关键词逐个进行修改和替换，并同时预测修改后的文本的情感标签，一旦情感标签发生变化，则可以认为对抗样本生成成功。下面以原始样本为中文文本为例，对关键词修改策略包括的汉字交换、字符插入和汉字拆分替换进行具体说明。For example, the first N keywords can be modified and replaced one by one according to the keyword modification strategy, and the sentiment label of the modified text can be predicted at the same time. Once the sentiment label changes, it can be considered that the adversarial sample is successfully generated. The following takes the original sample as Chinese text as an example to specifically explain the keyword modification strategy, including Chinese character exchange, character insertion, and Chinese character splitting and replacement.

示例的，汉字交换是指随机交换词语中两个汉字的位置，例如假设原始词语为“不错”，则经过汉字交换之后可以得到“错不”。For example, Chinese character exchange refers to randomly exchanging the positions of two Chinese characters in a word. For example, assuming the original word is "不错", then after Chinese character exchange, it can be obtained as "错不".

示例的，字符插入是指在词语中随机插入不常见的特殊符号，为了实现对抗样本生成过程中的自动化，本申请实施例建立了一个扰动符号集合，其中包括一些不具有实际意义且不会影响文本原始语义信息的不常见符号，例如“⊥”、“Θ”等，例如假设原始词语为“不错”，则经过字符插入之后可以得到“不⊥错”。For example, character insertion refers to randomly inserting uncommon special symbols into words. In order to achieve automation in the adversarial sample generation process, the embodiment of the present application establishes a perturbation symbol set, which includes some uncommon symbols that have no practical meaning and do not affect the original semantic information of the text, such as "⊥", "Θ", etc. For example, assuming that the original word is "not bad", after character insertion, "not ⊥ bad" can be obtained.

示例的，汉字拆分替换是指拆分词语中左右结构的汉字，并利用同音字替换其他结构的汉字。在中文汉字中，具有多部件的汉字往往能够拆分为两个或两个以上的偏旁部首或者其他汉字，因此，本申请实施例可以首先构建一个候选词库，里面包含所有具有左右构形的汉字以及将其拆分后的变体，在生成对抗样本时，对关键词中的每一个汉字与候选词库中的被替换汉字进行比对筛选，并用拆分后的变体替换原始文本。For example, Chinese character splitting and replacement refers to splitting the Chinese characters of the left and right structures in the words, and replacing the Chinese characters of other structures with homophones. In Chinese characters, Chinese characters with multiple components can often be split into two or more radicals or other Chinese characters. Therefore, the embodiment of the present application can first construct a candidate word library, which contains all Chinese characters with left and right structures and their variants after splitting. When generating adversarial samples, each Chinese character in the keyword is compared and screened with the replaced Chinese characters in the candidate word library, and the original text is replaced with the split variant.

在步骤103中，对原始样本和对抗样本分别对应的单特征向量进行融合处理，得到文本多特征融合向量。In step 103, the single feature vectors corresponding to the original sample and the adversarial sample are fused to obtain a text multi-feature fusion vector.

在一些实施例中，参见图5，图5是本申请实施例提供的文本情感分类方法的流程示意图，如图5所示，在执行图3示出的步骤103之前，还可以执行图5示出的步骤106至步骤108，将结合图5示出的步骤进行说明。In some embodiments, referring to FIG. 5 , FIG. 5 is a flow chart of a text sentiment classification method provided in an embodiment of the present application. As shown in FIG. 5 , before executing step 103 shown in FIG. 3 , steps 106 to 108 shown in FIG. 5 may also be executed, which will be explained in conjunction with the steps shown in FIG. 5 .

在步骤106中，确定原始样本和对抗样本分别对应的加权后的特征。In step 106, weighted features corresponding to the original sample and the adversarial sample are determined.

在一些实施例中，可以通过以下方式确定原始样本和对抗样本分别对应的加权后的特征：基于原始样本和对抗样本，分别确定词语级文本和字符级文本的逆文档频率特征向量；分别对原始样本和对抗样本进行词嵌入处理，对应得到原始样本的词向量、以及对抗样本的词向量；分别对原始样本和对抗样本进行字嵌入处理，对应得到原始样本的字向量、以及对抗样本的子向量；基于注意力机制提取词向量的语义特征和字向量的语义特征；分别将词向量的语义特征、以及字向量的语义特征与对应的逆文档特征向量进行加权(即将词向量的语义特征与词语级文本的逆文档频率特征向量进行加权、以及将字向量的语义特征与字符级文本的逆文档频率特征向量进行加权)，对应得到原始样本加权后的特征、以及对抗样本加权后的特征。In some embodiments, the weighted features corresponding to the original sample and the adversarial sample can be determined in the following ways: based on the original sample and the adversarial sample, the inverse document frequency feature vectors of the word-level text and the character-level text are determined respectively; the original sample and the adversarial sample are word-embedded respectively to obtain the word vector of the original sample and the word vector of the adversarial sample; the original sample and the adversarial sample are word-embedded respectively to obtain the word vector of the original sample and the sub-vector of the adversarial sample; the semantic features of the word vector and the semantic features of the word vector are extracted based on the attention mechanism; the semantic features of the word vector and the semantic features of the word vector are weighted with the corresponding inverse document feature vector (that is, the semantic features of the word vector are weighted with the inverse document frequency feature vector of the word-level text, and the semantic features of the word vector are weighted with the inverse document frequency feature vector of the character-level text), and the weighted features of the original sample and the weighted features of the adversarial sample are obtained accordingly.

示例的，可以通过基于转换器模型的双向编码表示(BERT，BidirectionalEncoder Rep resentations from Transformers)对原始样本和对抗样本进行词嵌入处理，对应得到原始样本的词向量、以及对抗样本的词向量；而字向量的获取可以通过Word2vec模型实现，该模型包含了连续词袋(CBOW，Continuous Bag Of Word)模型和连续跳字(Skip-Gram，Conti nuous Skip Gram)模型，其中，CBOW模型是一个三层神经网络，包含了输入层、隐藏层和输出层，最终将词独热(One-Hot)编码的模式转换成低维固定的连续向量。在得到词向量和字向量之后，可以将两者分别输入到带有注意力机制的深度双向门控循环单元(DBGR U，Deep Bi-directional Gate Recurrent Unit)中提取各时刻下的语义特征，最后可以分别将词向量和字向量的语义特征与对应的TF-IDF特征向量进行权重计算，得到加权后的特征。For example, the original sample and the adversarial sample can be word-embedded by using the bidirectional encoder representation (BERT) based on the transformer model, and the word vector of the original sample and the word vector of the adversarial sample can be obtained accordingly; and the word vector can be obtained by the Word2vec model, which includes the continuous bag of words (CBOW) model and the continuous skip word (Skip-Gram) model, wherein the CBOW model is a three-layer neural network, including an input layer, a hidden layer, and an output layer, and finally converts the one-hot encoding mode of the word into a low-dimensional fixed continuous vector. After obtaining the word vector and the word vector, the two can be respectively input into the deep bidirectional gate recurrent unit (DBGRU) with an attention mechanism to extract the semantic features at each moment, and finally the semantic features of the word vector and the word vector can be weighted with the corresponding TF-IDF feature vector to obtain the weighted features.

在步骤107中，分别对原始样本和对抗样本加权后的特征进行卷积处理，对应得到原始样本的特征图谱、以及对抗样本的特征图谱。In step 107, convolution processing is performed on the weighted features of the original sample and the adversarial sample, respectively, to obtain a feature map of the original sample and a feature map of the adversarial sample.

在一些实施例中，在得到原始样本和对抗样本分别对应的加权后的特征之后，可以分别采用不同窗口大小的卷积核进行局部特征提取，对应得到原始样本的特征图谱、以及对抗样本的特征图谱，随后可以将特征图谱输入至池化层进行数据压缩计算。In some embodiments, after obtaining the weighted features corresponding to the original sample and the adversarial sample, respectively, convolution kernels with different window sizes can be used for local feature extraction to obtain the feature map of the original sample and the feature map of the adversarial sample, and then the feature map can be input into the pooling layer for data compression calculation.

在步骤108中，分别对原始样本和对抗样本的特征图谱进行池化处理，对应得到原始样本的单特征向量、以及对抗样本的单特征向量。In step 108, the feature maps of the original sample and the adversarial sample are pooled to obtain a single feature vector of the original sample and a single feature vector of the adversarial sample.

在一些实施例中，在分别对原始样本和对抗样本加权后的特征进行卷积处理，对应得到原始样本的特征图谱、以及对抗样本的特征图谱之后，由于文本的长度不固定，当长度规模比较大时，卷积计算完成后依旧会得到高维的特征图谱，因此需要对其进行数据压缩的操作，例如可以分别对原始样本和对抗样本的特征图谱进行最大池化处理，对应得到原始样本的单特征向量、以及对抗样本的单特征向量。在处理后所有的特征图谱之后，可以在合并层对原始样本的单特征向量与对抗样本的单特征向量进行融合处理，得到文本多特征向量，其中，文本多特征向量可以用于构建先验分布。In some embodiments, after performing convolution processing on the weighted features of the original sample and the adversarial sample, respectively, and obtaining the feature map of the original sample and the feature map of the adversarial sample, since the length of the text is not fixed, when the length scale is relatively large, a high-dimensional feature map will still be obtained after the convolution calculation is completed, so it is necessary to perform data compression operations on them. For example, the feature maps of the original sample and the adversarial sample can be subjected to maximum pooling processing, respectively, to obtain a single feature vector of the original sample and a single feature vector of the adversarial sample. After processing all the feature maps, the single feature vector of the original sample and the single feature vector of the adversarial sample can be fused at the merging layer to obtain a text multi-feature vector, wherein the text multi-feature vector can be used to construct a prior distribution.

在步骤104中，基于从先验分布采样得到的潜在变量和目标情感标签，生成符合目标情感标签的生成样本。In step 104, based on the latent variables sampled from the prior distribution and the target sentiment label, a generated sample that conforms to the target sentiment label is generated.

在一些实施例中，可以通过以下方式实现步骤104：将从先验分布采样得到的潜在变量与目标情感标签输入解码器，以使解码器基于潜在向量和目标情感标签对应的标签向量，生成符合目标情感标签的生成样本(即带有指定情感倾向的文本)。In some embodiments, step 104 can be implemented in the following manner: the latent variables sampled from the prior distribution and the target emotion label are input into the decoder, so that the decoder generates a generated sample (i.e., text with a specified emotion tendency) that conforms to the target emotion label based on the latent vector and the label vector corresponding to the target emotion label.

在步骤105中，基于携带情感标签的原始样本、以及携带目标情感标签的生成样本，对文本情感分类模型进行训练。In step 105, a text sentiment classification model is trained based on the original samples carrying the sentiment labels and the generated samples carrying the target sentiment labels.

这里，训练后的文本情感分类模型用于确定待分类文本的情感分类结果。Here, the trained text sentiment classification model is used to determine the sentiment classification result of the text to be classified.

在一些实施例中，文本情感分类模型可以是生成对抗网络模型，其中，生成对抗网络模型包括生成模型和判别模型，生成模型用于生成上述的生成样本，则可以通过以下方式实现上述的步骤105：将携带情感标签的原始样本、以及携带目标情感标签的生成样本输入至判别模型，以使判别模型对原始样本和生成样本进行判定；将判别模型输出的判定结果反馈给生成模型，以作为生成模型得到的奖励值；基于奖励值对生成模型和判别模型的参数进行更新，其中，更新后的判别模型可以用于确定待分类文本的情感分类结果。In some embodiments, the text sentiment classification model can be a generative adversarial network model, wherein the generative adversarial network model includes a generative model and a discriminant model, and the generative model is used to generate the above-mentioned generated samples. The above-mentioned step 105 can be implemented in the following manner: the original samples carrying the sentiment labels and the generated samples carrying the target sentiment labels are input into the discriminant model, so that the discriminant model makes a judgment on the original samples and the generated samples; the judgment result output by the discriminant model is fed back to the generative model as a reward value obtained by the generative model; the parameters of the generative model and the discriminant model are updated based on the reward value, wherein the updated discriminant model can be used to determine the sentiment classification result of the text to be classified.

也就是说，本申请实施例通过将集成学习思想融入到VAE、GAN两种模型中，不仅能够使其情感分类性能达到更高的层次，而且可以生成带有指定情感倾向的文本，通过将VAE和GAN有效地结合在一起，GAN中的生成模型(下文中简称为G)可以通过VAE的数据特征挖掘以及潜在空间表示的优异性能进行情感文本的生成，而GAN中的判别模型(下文中简称为D)可以进行数据真伪性以及情感类别的判定。此外，当训练集中同时包含人工标注的原始样本以及机器标注的生成样本时，可以使情感分类效果得到进一步的提升，同时也能够让G生成情感类别更加准确的文本。That is to say, the embodiment of the present application integrates the idea of ensemble learning into the two models of VAE and GAN, which can not only make their sentiment classification performance reach a higher level, but also generate text with specified sentiment tendency. By effectively combining VAE and GAN, the generative model in GAN (hereinafter referred to as G) can generate sentiment text through VAE data feature mining and the excellent performance of latent space representation, while the discriminative model in GAN (hereinafter referred to as D) can determine the authenticity of data and sentiment category. In addition, when the training set contains both manually annotated original samples and machine-annotated generated samples, the sentiment classification effect can be further improved, and G can also generate text with more accurate sentiment categories.

示例的，可以通过以下方式实现上述的基于奖励值对生成模型和判别模型的参数进行更新：基于奖励值分别构建生成模型和判别模型的损失函数；基于损失函数，采用梯度下降的方式，对生成模型和判别模型的参数进行更新，如此，最终整体模型训练完成后，能够有效地识别出待分类文本的情感分类结果，情感分类性能得到进一步改善。For example, the above-mentioned updating of the parameters of the generation model and the discriminant model based on the reward value can be achieved in the following ways: based on the reward value, the loss functions of the generation model and the discriminant model are respectively constructed; based on the loss function, the parameters of the generation model and the discriminant model are updated by gradient descent. In this way, after the overall model training is completed, the sentiment classification results of the text to be classified can be effectively identified, and the sentiment classification performance is further improved.

在另一些实施例中，VAE和GAN也可以将两者的解码器与G进行重合使用，一边通过VAE的变分推断理论拟合潜在变量分布和输出重构数据，一边通过GAN的对抗训练加强生成文本的质量。此外，由于GAN标准框架是需要对G的完整输出输入至D中进行判别，而当VAE结合了GAN之后，会不断生成离散的文本序列，但D计算得出的损失梯度无法对生成的离散文本序列进行梯度更新，或者说对离散文本的微调更新可能无法导致生成有意义的文本，从而产生训练过程的模式崩溃现象。因此，本申请实施例可以将生成序列的过程看作一连串的动作，每步动作都是挑选一个词，每步的状态为已挑选组成的文本序列前缀，而最后一步动作完成后即可得到整个文本序列数据，之后D对其文本进行判定评分，将评价结果反馈给G，而这个结果便是G得到的奖励值，随后可以通过梯度下降的方式对G和D的参数进行优化，模型训练完成后，可以分别得到用于生成指定倾向情感类别的文本的文本生成模型(即G)以及用于确定文本的情感分类结果的文本情感分类模型(即D)。In other embodiments, VAE and GAN can also overlap the decoders of both with G, fitting the latent variable distribution and outputting the reconstructed data through the variational inference theory of VAE, while enhancing the quality of the generated text through the adversarial training of GAN. In addition, since the GAN standard framework requires the complete output of G to be input into D for judgment, when VAE is combined with GAN, it will continuously generate discrete text sequences, but the loss gradient calculated by D cannot perform gradient updates on the generated discrete text sequences, or the fine-tuning update of the discrete text may not lead to the generation of meaningful text, resulting in a mode collapse phenomenon in the training process. Therefore, the embodiment of the present application can regard the process of generating a sequence as a series of actions, each step of which is to select a word, and the state of each step is the prefix of the selected text sequence. After the last step is completed, the entire text sequence data can be obtained. After that, D judges and scores the text and feeds back the evaluation result to G, and this result is the reward value obtained by G. Subsequently, the parameters of G and D can be optimized by gradient descent. After the model training is completed, a text generation model (i.e., G) for generating text with a specified tendency emotion category and a text sentiment classification model (i.e., D) for determining the sentiment classification result of the text can be obtained respectively.

本申请实施例提供的文本情感分类方法，一方面通过基于关键词修改策略对原始样本进行修改的方式来得到对抗样本，可以解决如何在原始样本中确定需要修改的词语的位置、以及词语的修改尽量不影响文本情感分类模型的预测的问题；另一方面，通过将原始样本的单特征向量和对抗样本的单特征向量进行融合，即将单特征向量扩展为多特征融合向量，从而可以更好地学习样本的特征信息，从而提高后续训练得到的文本情感分类模型针对待分类文本的情感分类结果的准确率。The text sentiment classification method provided in the embodiment of the present application, on the one hand, obtains adversarial samples by modifying the original samples based on the keyword modification strategy, which can solve the problem of how to determine the position of the words that need to be modified in the original samples, and how to modify the words as little as possible to avoid affecting the prediction of the text sentiment classification model; on the other hand, by fusing the single feature vector of the original sample and the single feature vector of the adversarial sample, that is, expanding the single feature vector into a multi-feature fusion vector, it is possible to better learn the feature information of the sample, thereby improving the accuracy of the sentiment classification results of the text sentiment classification model obtained by subsequent training for the text to be classified.

下面，以针对产品(例如网络会议客户端)的评论数据为例，说明本申请实施例在一个实际的应用场景中的示例性应用。Below, taking review data for a product (such as a web conference client) as an example, an exemplary application of the embodiment of the present application in an actual application scenario is described.

本申请实施例提供一种文本情感分类方法，一方面提出一种基于多种关键词语修改策略的中文文本对抗样本的生成方案，该方案根据中文文本结构和语言特性设计出汉字拆分替换的关键词语修改策略，同时将汉字交换和字符插入两种策略应用于中文对抗样本，以解决“如何在原始文本中确定需要修改的词语的位置”和“词语的修改尽量不影响模型的预测”的技术问题；另一方面提出深度双向门控单元的变分自编码器(DBGRU-VAE，DeepBidirection al Gated Recurrent Uint-Variational Auto Encoder)算法，该算法的核心作用是将单特征卷积扩展为多特征卷积，同时在卷积层前使DBGRU的输出向量与词频-逆向文件频率(TF-ID F，Term Frequency-Inverse Document Frequency)特征向量、以及情感注意力融合特征向量进行乘积计算赋予额外权重，并且引入注意力机制使其能够有关注性地生成文本和提取文本特征，最终让模型能通过对抗训练带来高性能。The embodiment of the present application provides a text sentiment classification method. On the one hand, a scheme for generating Chinese text adversarial samples based on multiple keyword modification strategies is proposed. The scheme designs a keyword modification strategy of Chinese character splitting and replacement according to the structure and language characteristics of Chinese text, and applies two strategies, Chinese character exchange and character insertion, to Chinese adversarial samples to solve the technical problems of "how to determine the position of the words to be modified in the original text" and "how to modify the words as little as possible to avoid affecting the prediction of the model". On the other hand, a Deep Bidirectional Gated Recurrent Uint-Variational Auto Encoder (DBGRU-VAE) algorithm is proposed. The core function of the algorithm is to expand the single-feature convolution into multi-feature convolution. At the same time, before the convolution layer, the output vector of DBGRU is multiplied with the term frequency-inverse document frequency (TF-ID F) feature vector and the emotion attention fusion feature vector to give additional weights, and the attention mechanism is introduced to enable it to generate text and extract text features in a focused manner, so that the model can finally achieve high performance through adversarial training.

下面对本申请实施例提供的文本情感分类方法进行具体说明。The text sentiment classification method provided in the embodiment of the present application is described in detail below.

本申请实施例提供的文本情感分类方法可以划分为以下三个过程：样本准备、中文文本对抗样本生成、以及DBGRU-VAE算法，下面对这三个过程分别进行具体说明。The text sentiment classification method provided in the embodiment of the present application can be divided into the following three processes: sample preparation, Chinese text adversarial sample generation, and DBGRU-VAE algorithm. These three processes are described in detail below.

在一些实施例中，参见图6A，图6A是本申请实施例提供的文本情感分类方法的流程示意图，将结合图6A示出的步骤201至步骤203对样本准备过程进行具体说明。In some embodiments, referring to FIG. 6A , FIG. 6A is a flowchart of a text sentiment classification method provided in an embodiment of the present application, and the sample preparation process will be specifically described in conjunction with steps 201 to 203 shown in FIG. 6A .

在步骤201中，对所有的评论数据进行随机抽样，基于人工经验对抽样得到的评论数据进行正负向情感标注。In step 201, all comment data are randomly sampled, and positive and negative sentiment labels are performed on the sampled comment data based on manual experience.

在一些实施例中，参见图7，图7是本申请实施例提供的针对评论数据进行预处理的流程示意图，如图7所示，在对所有的评论数据进行随机抽样之前，还可以首先对评论数据进行预处理，其中，评论数据，即原始文本(raw data)可以包括网页文本、新闻、报告等，预处理的过程包括分词(Segmentation)、清洗(Cleaning)，例如包括去除无用的标签、特殊符号、以及停用词等、标准化(Normalization)，包括Stemming和Lemmazation，其中，Stemming类算法是根据语言学规则对单词进行标准化，它转化的结果并不一定是一个真实存在的词，Lemmazation是一类更为严格的算法，它的转化结果一定是一个真实存在的单词、特征提取(Feature Extraction)和建模(Modeling)。In some embodiments, referring to FIG. 7 , FIG. 7 is a flow chart of preprocessing of comment data provided in an embodiment of the present application. As shown in FIG. 7 , before randomly sampling all comment data, the comment data may be preprocessed first, wherein the comment data, i.e., raw text (raw data) may include web page text, news, reports, etc. The preprocessing process includes segmentation (Segmentation), cleaning (Cleaning), for example, including removing useless tags, special symbols, and stop words, etc., normalization (Normalization), including Stemming and Lemmazation, wherein the Stemming algorithm standardizes words according to linguistic rules, and its conversion result is not necessarily a real word. Lemmazation is a more rigorous algorithm, and its conversion result must be a real word, feature extraction (Feature Extraction) and modeling (Modeling).

在另一些实施例中，在对所有的评论数据进行随机抽样之后，可以基于人工经验对抽样得到的样本评论数据进行正负向情感标注，例如对于图8A示出的评论数据801和评论数据802，由于评述数据801中包括“很不错”、“一级棒”、评论数据802中包括“效果很好”等褒义的词语，因此可以将图8A示出的评论数据801和评论数据802人工标记为“正向”，而由于图8B示出的评论数据803中包括“声音的问题”、评论数据804中包括“听不到会议中人的声音了”等贬义的词语，因此可以将图8B示出的评论数据803和评论数据804人工标记为“反向”。In other embodiments, after randomly sampling all the comment data, the sampled comment data can be labeled with positive or negative sentiments based on manual experience. For example, for comment data 801 and comment data 802 shown in FIG8A, since comment data 801 includes commendatory words such as "very good" and "excellent", and comment data 802 includes "very good effect", comment data 801 and comment data 802 shown in FIG8A can be manually marked as "positive", and since comment data 803 shown in FIG8B includes derogatory words such as "sound problem" and comment data 804 includes derogatory words such as "I can't hear the voices of people in the meeting", comment data 803 and comment data 804 shown in FIG8B can be manually marked as "negative".

在步骤202中，对标注的评论数据进行随机分组，分为训练集、验证集、测试集。In step 202, the annotated review data are randomly grouped into a training set, a validation set, and a test set.

在一些实施例中，参见图9，图9是本申请实施例提供的对标注的评论数据进行随机分组的流程示意图，如图9所示，可以将所有标注的评论数据汇集成一个数据集，接着可以根据特征工程将数据集中的标注的评论数据进行随机分组，分为测试集、训练集、以及验证集，其中，训练集中的数据用于对机器学习模型进行训练；测试集中的数据用于对训练后的机器学习模型进行预测，得到预测结果；验证集中的数据用于对机器学习模型进行评估，得到评估结果。In some embodiments, referring to FIG9 , FIG9 is a schematic diagram of a process for randomly grouping annotated comment data provided in an embodiment of the present application. As shown in FIG9 , all annotated comment data can be collected into a data set, and then the annotated comment data in the data set can be randomly grouped according to feature engineering into a test set, a training set, and a validation set, wherein the data in the training set is used to train the machine learning model; the data in the test set is used to predict the trained machine learning model to obtain prediction results; and the data in the validation set is used to evaluate the machine learning model to obtain evaluation results.

在步骤203中，对训练集和验证集中的评论数据进行字符分割、分词和去停用词。In step 203, character segmentation, word segmentation and stop word removal are performed on the review data in the training set and the validation set.

在一些实施例中，在对标注的评论数据进行随机分组，得到训练集、验证集和测试集之后，可以对训练集和验证集中的评论数据进行字符分割、分词、以及去停用词等处理。In some embodiments, after the annotated comment data are randomly grouped to obtain a training set, a validation set, and a test set, the comment data in the training set and the validation set may be processed by character segmentation, word segmentation, and stop word removal.

至此，样本准备过程介绍结束。This concludes the introduction to the sample preparation process.

下面继续对中文文本对抗样本的生成过程进行说明。该部分首先利用句子独立性设计关键词语贡献值计算方法，来有效定位重要词语位置。同时，还根据中文文本结构和语言特性设计出汉字拆分替换的关键词语修改策略，以及将汉字交换和字符插入两种修改策略应用于中文对抗样本，下面将结合图6B进行具体说明。Next, we will continue to explain the generation process of Chinese text adversarial samples. This section first uses sentence independence to design a key word contribution value calculation method to effectively locate the position of important words. At the same time, we also design a key word modification strategy of Chinese character splitting and replacement based on the structure and language characteristics of Chinese text, and apply the two modification strategies of Chinese character exchange and character insertion to Chinese adversarial samples, which will be specifically explained in conjunction with Figure 6B.

在一些实施例中，参见图6B，图6B是本申请实施例提供的文本情感分类方法的流程示意图，将结合图6B示出的步骤204至步骤209对中文文本对抗样本的生成过程进行具体说明。In some embodiments, referring to FIG. 6B , FIG. 6B is a flow chart of a text sentiment classification method provided in an embodiment of the present application, and the generation process of Chinese text adversarial samples will be specifically described in combination with steps 204 to 209 shown in FIG. 6B .

在步骤204中，对原始输入的文本进行分词预处理，并切分为子句。In step 204, the original input text is pre-processed for word segmentation and segmented into clauses.

在步骤205中，从原始文本中依次删除每一个子句，并将删除后的结果送入分类模型得到预测结果，如果预测结果与原始标签不同，则将该删除的子句作为候选关键子句集中的一部分。In step 205, each clause is deleted from the original text in turn, and the deleted result is sent to the classification model to obtain a prediction result. If the prediction result is different from the original label, the deleted clause is used as a part of the candidate key clause set.

在步骤206中，对候选关键子句集中的词语进行词性标注，并删除属于无意义词性的词语，其中，无意义的词性包括介词、代词、数词、冠词。In step 206, the words in the candidate key clause set are tagged with parts of speech, and words with meaningless parts of speech are deleted, wherein the meaningless parts of speech include prepositions, pronouns, numerals, and articles.

在步骤207中，计算候选关键子句集中的所有关键词语的贡献度。In step 207, the contribution of all key words in the candidate key clause set is calculated.

在步骤208中，将关键词语按照其贡献度得分进行降序排列，其中，词语的排名越靠前则意味着对分类结果的贡献越大。In step 208, the key words are arranged in descending order according to their contribution scores, wherein the higher the ranking of a word is, the greater its contribution to the classification result is.

在步骤209中，根据关键词修改策略对前N个关键词语逐个进行修改和替换，并同时预测修改后的文本的标签，一旦标签发现改变，则认为文本对抗样本生成成功。In step 209, the first N key words are modified and replaced one by one according to the keyword modification strategy, and the label of the modified text is predicted at the same time. Once the label is found to be changed, it is considered that the text adversarial sample is generated successfully.

示例的，本申请实施例提供的针对中文语境的文本对抗样本生成算法，主要包括净化操作、关键词语定位和关键词语修改三个步骤，该算法的具体实现如下：For example, the text adversarial sample generation algorithm for the Chinese context provided in the embodiment of the present application mainly includes three steps: purification operation, keyword positioning and keyword modification. The specific implementation of the algorithm is as follows:

其中，s表示原始文本(例如上述的评述数据)；y表示s的真实分类标签(例如上述针对评论数据人工标记的标签)；F(·)表示被攻击分类器；T(·)表示对关键词语所使用的修改策略(即关键词修改策略)，可以是汉字交换，字符插入或者汉字拆分替换中的一种或多种；表示被允许的最大词语修改量；/>为最终得到的文本对抗样本；/>表示从原始关键词语转换到对抗样本的改动次数。下面将继续结合图10对文本对抗样本的生成过程进行说明。如图10所示，中文文本对抗样本的生成过程如下：Wherein, s represents the original text (such as the above-mentioned review data); y represents the real classification label of s (such as the above-mentioned label manually marked for the review data); F(·) represents the attacked classifier; T(·) represents the modification strategy used for the key words (i.e., the key word modification strategy), which can be one or more of Chinese character exchange, character insertion, or Chinese character splitting and replacement; Indicates the maximum allowed word modification amount; /> This is the final text adversarial sample; /> Indicates the number of changes from the original keyword to the adversarial sample. The following will continue to explain the generation process of text adversarial samples in conjunction with Figure 10. As shown in Figure 10, the generation process of Chinese text adversarial samples is as follows:

(1)对原始输入文本进行分词预处理，并利用{，。？！}将原始输入文本分割成子句；(1) Perform word segmentation preprocessing on the original input text and use {,.?!} to split the original input text into clauses;

(2)从原始文本中依次删除每一个子句，并将删除后的结果送入分类模型F得到预测结果y_i，如果预测结果y_i与原始序列标签y不同，则将该删除的子句作为候选关键子句集中的一部分；(2) Delete each clause from the original text in turn, and send the deleted result to the classification model F to obtain the predicted result y_i . If the predicted result y_i is different from the original sequence label y, the deleted clause is taken as part of the candidate key clause set;

(3)对候选关键句中的词语进行词性标注，并删除属于无意义词性的词语，其中，无意义词性包括介词、代词、数词和冠词；(3) Tagging the words in the candidate key sentences and deleting the words with meaningless parts of speech, where meaningless parts of speech include prepositions, pronouns, numerals, and articles;

(4)计算候选关键子句集中的所有关键词语的贡献度得分C_F(x，y)；(4) Calculate the contribution scores_CF (x,y) of all key words in the candidate key clause set;

(5)将关键词语按照贡献度得分C_F(x，y)进行降序排列，词语的排名越靠前则意味着其对分类结果的贡献越大；(5) Arrange the keywords in descending order according to their contribution scores_CF (x, y). The higher the ranking of a word, the greater its contribution to the classification result.

(6)根据关键词修改策略对前个关键词语逐个进行修改和替换，并同时预测修改后的文本的标签，一旦标签发生变化，则认为文本对抗样本生成成功。(6) Modify the strategy based on keywords The key words are modified and replaced one by one, and the labels of the modified text are predicted at the same time. Once the labels change, it is considered that the text adversarial sample is generated successfully.

至此，中文文本对抗样本的生成过程介绍结束。This concludes the introduction to the generation process of Chinese text adversarial samples.

下面继续对本申请实施例提供的DBGRU-VAE算法进行说明，该部分算法的核心作用是：将单特征卷积扩展为多特征卷积，同时在卷积层前使DBGRU的输出向量与TF-IDF特征向量、以及情感注意力融合特征向量进行乘积计算赋予额外权限，并且引入注意力机制使其能够有关注性地生成文本和提取文本特征，最终让模型能通过对抗训练带来高性能。下面将结合图6C进行具体说明。The following is a description of the DBGRU-VAE algorithm provided in the embodiment of the present application. The core function of this part of the algorithm is to expand the single-feature convolution into a multi-feature convolution, and at the same time, before the convolution layer, the output vector of DBGRU is multiplied with the TF-IDF feature vector and the emotional attention fusion feature vector to grant additional permissions, and the introduction of the attention mechanism enables it to generate text and extract text features with attention, and finally enables the model to achieve high performance through adversarial training. This will be specifically described in conjunction with Figure 6C.

在一些实施例中，参见图6C，图6C是本申请实施例提供的文本情感分类方法的流程示意图，将结合图6C示出的步骤210至步骤222对本申请实施例提供的DBGRU-VAE算法进行具体说明。In some embodiments, referring to FIG. 6C , FIG. 6C is a flow chart of a text sentiment classification method provided in an embodiment of the present application. The DBGRU-VAE algorithm provided in an embodiment of the present application will be specifically described in conjunction with steps 210 to 222 shown in FIG. 6C .

在步骤210中，基于原始样本和对抗样本，计算词语级文本和字符级文本的TF-IDF特征向量。In step 210, TF-IDF feature vectors of word-level text and character-level text are calculated based on the original sample and the adversarial sample.

在一些实施例中，TF-IDF是一种用来评估文本中的字词对文本重要程度的统计方法，其中，TF表示关键字词在文本中出现的频率大小，具体的计算公式如下：In some embodiments, TF-IDF is a statistical method used to evaluate the importance of words in a text, where TF represents the frequency of a keyword in a text. The specific calculation formula is as follows:

其中，n_i,j表示关键字词在文本中出现的次数，而IDF表示逆文档频率，具体公式如下：Among them, n_i,j represents the number of times the keyword appears in the text, and IDF represents the inverse document frequency. The specific formula is as follows:

其中，|D|表示文本总数，|{j:t_i∈d_j}|表示包含关键字词t_i的文本数。将预处理后的文本分别计算其字符级和词语级的TF×IDF权重向量，以此作为文本的关键特征表示。Among them, |D| represents the total number of texts, and |{j:t_i ∈d_j }| represents the number of texts containing keyword t_i . The TF×IDF weight vectors of the preprocessed texts are calculated at the character level and word level, respectively, and used as the key feature representation of the text.

在步骤211中，利用BERT生成词向量。In step 211, word vectors are generated using BERT.

在一些实施例中，如图11所示，可以采用基于转换器模型的双向编码表示(BERT，Bi directional Encoder Representations from Transformers)来生成词向量，例如可以采用BERT对文本语料数据进行模型训练，得到每个词的固定维度词向量表示，接着可以将多个词向量汇总成词向量合集D，随后可以对词向量合集D中的多个词向量进行连接，得到特征矩阵V，其中，BERT的训练过程可以参考图12。In some embodiments, as shown in FIG11 , a bidirectional encoder representation based on a transformer model (BERT, Bi directional Encoder Representations from Transformers) can be used to generate word vectors. For example, BERT can be used to perform model training on text corpus data to obtain a fixed-dimensional word vector representation for each word. Multiple word vectors can then be aggregated into a word vector collection D. Subsequently, multiple word vectors in the word vector collection D can be connected to obtain a feature matrix V. The training process of BERT can refer to FIG12 .

在步骤212中，利用Word2vec生成字向量。In step 212, word vectors are generated using Word2vec.

在一些实施例中，如图13所示，字向量表示可以采用Word2vec中的Skip-Gram模型，例如可以采用Word2vec的Skip-Gram模型对文本语料数据进行模型训练，得到每个字的固定维度字向量表示，接着可以将多个字向量汇总成字向量合集D，随后可以对字向量合集D中的多个字向量进行连接，得到特征矩阵U，其中，Word2vec的训练过程可以参考图14。In some embodiments, as shown in FIG13 , the word vector representation may adopt the Skip-Gram model in Word2vec. For example, the Skip-Gram model of Word2vec may be used to perform model training on text corpus data to obtain a fixed-dimensional word vector representation for each word. Multiple word vectors may then be aggregated into a word vector collection D. Subsequently, multiple word vectors in the word vector collection D may be connected to obtain a feature matrix U. The training process of Word2vec may refer to FIG14 .

在步骤213中，将两者分别输入到带有注意力机制的DBGRU中提取各时刻下的语义特征。In step 213, both are input into DBGRU with attention mechanism to extract semantic features at each moment.

在一些实施例中，在基于步骤211得到文本的词向量表示、以及基于步骤212得到字向量表示之后，可以将两者输入至DBGRU中，设文本输入X＝{x₁，x₂，…，x_n}，其中，x_k表示文本中通过词嵌入或字嵌入处理后的第k个特征向量，则首层正向的第k个隐藏层状态为：In some embodiments, after obtaining the word vector representation of the text based on step 211 and the character vector representation based on step 212, both can be input into the DBGRU. Let the text input X = {x₁ , x₂ , ..., x_n }, where x_k represents the kth feature vector in the text after word embedding or character embedding processing, then the kth hidden layer state in the first layer forward direction is:

对于GRU网络内部的处理有：The internal processing of the GRU network is:

其中，分别表示重置门、更新门的权重参数，/>和/>分别表示重置门、更新门的偏置参数，同理反向的第k个隐藏层状态为：in, Respectively represent the weight parameters of the reset gate and update gate,/> and/> They represent the bias parameters of the reset gate and update gate respectively. Similarly, the reverse k-th hidden layer state is:

随后，第l层反向的第k个隐藏层状态为：Then, the state of the kth hidden layer after the lth layer is reversed is:

其正向的第k个隐藏层状态为：Its forward kth hidden layer state is:

最后连结两个方向的隐藏状态和/>得到最终隐藏状态/>即：Finally, connect the hidden states in both directions and/> Get the final hidden state /> Right now:

将最终隐藏状态输入至输出层，输出层计算输出O_k，即：The final hidden state Input to the output layer, the output layer calculates the output O_k , that is:

其中，W_k表示输出层的权重参数，b_k表示输出层的偏置参数。本申请实施例还可以将特征向量输入至MFCNN中进一步提取特征，因此需要对双向隐藏层状态利用注意力机制分别计算其注意力权重参数，即：Wherein, W_k represents the weight parameter of the output layer, and b_k represents the bias parameter of the output layer. In the embodiment of the present application, the feature vector can also be input into MFCNN to further extract features, so it is necessary to use the attention mechanism to calculate the attention weight parameters of the bidirectional hidden layer state, that is:

其中，W和V分别表示权重参数，b表示偏置参数，e_k表示语义特征向量，e_j表示情感标签向量，α_k表示语义特征对文本情感类别的权重。Among them, W and V represent weight parameters respectively, b represents bias parameter, e_k represents semantic feature vector, e_j represents sentiment label vector, and α_k represents the weight of semantic feature to text sentiment category.

在步骤214中，分别将训练集的词向量和字向量的语义特征与对应的TF-IDF特征向量进行权重计算。In step 214, weights are calculated for the semantic features of the word vectors and character vectors of the training set and the corresponding TF-IDF feature vectors.

在步骤215中，将计算后的特征分别输入到卷积层进行卷积计算，分别输出特征图谱。In step 215, the calculated features are respectively input into the convolutional layers for convolution calculation, and the feature maps are respectively output.

在一些实施例中，在得到DBGRU输出的不同文本特征向量表示后，可以首先利用其计算完成的TF-IDF特征向量和情感注意力融合特征向量分别进行权重补充，得到加权特征向量，再分别采用不同窗口大小的卷积核对加权特征向量进行局部特征提取。示例的，可以采用h×d格式的卷积核进行卷积计算，其中，h表示卷积核在特征向量矩阵中滑动的窗口高度，d表示文本特征向量矩阵的维度。为了获取多样化的特征信息，本申请实施例可以采用三种不同滑动窗口高度的卷积核，并将卷积核的神经元数据量设置为128个。假设有文本D，其特征向量为{x₁，x₂，…，x_n}，其中，x_i表示经过处理后的词向量或者字向量，而使用h×d格式的卷积核对文本D进行滑动划分后，文本D将被划分成：In some embodiments, after obtaining the different text feature vector representations output by DBGRU, the calculated TF-IDF feature vector and the emotional attention fusion feature vector can be first used to perform weight supplementation to obtain weighted feature vectors, and then convolution kernels of different window sizes can be used to extract local features from the weighted feature vectors. For example, a convolution kernel in h×d format can be used for convolution calculation, where h represents the window height of the convolution kernel sliding in the feature vector matrix, and d represents the dimension of the text feature vector matrix. In order to obtain diversified feature information, the embodiment of the present application can use three convolution kernels with different sliding window heights, and set the neuron data volume of the convolution kernel to 128. Assume that there is a text D, whose feature vector is {x₁ , x₂ , ..., x_n }, where_xi represents the processed word vector or character vector, and after sliding division of the text D using the convolution kernel in h×d format, the text D will be divided into:

D_i:i+h-1＝{x_i,x_i+1,...,x_i+h-1} (15)D_i:i+h-1 ={_xi ,xi₊₁ ,...,xi_+h-1 } (15)

对划分后的文本D进行卷积计算操作，即：The convolution operation is performed on the divided text D, namely:

c_i＝f(WD_i:i+h-1+b) (16)_ci = f(WD_i:i+h-1 +b) (16)

其中，c_i表示经过卷积计算后的第i个特征，W表示权重参数，b表示偏置参数，f表示非线性映射函数，当整体文本特征向量完成卷积计算后，会得到n-h+1个特征值，则特征图谱C可以表示为：Among them,_ci represents the i-th feature after convolution calculation, W represents the weight parameter, b represents the bias parameter, and f represents the nonlinear mapping function. When the overall text feature vector completes the convolution calculation, n-h+1 feature values will be obtained, and the feature map C can be expressed as:

C＝[c₁,c₂,...,c_n-h+1] (17)C＝[c₁ ,c₂ ,...,_cn-h+1 ] (17)

可以依此方式分别提取出不同滑动窗口高度下的文本多元特征，最终将提取出的多元特征输入至池化层进行数据压缩计算。In this way, the multivariate features of the text at different sliding window heights can be extracted respectively, and finally the extracted multivariate features are input into the pooling layer for data compression calculation.

在步骤216中，将特征图谱输入到池化层进行数据压缩处理，得到单特征向量。In step 216, the feature map is input into the pooling layer for data compression processing to obtain a single feature vector.

在一些实施例中，在进行局部特征提取后，由于文本的长度不固定，当长度规模比较大时，卷积计算完成后依旧会得到高维的特征图谱。因此需要对特征图谱进行数据压缩的操作，本申请实施例采取最大值池化的方式进行压缩计算，具体的计算公式为：In some embodiments, after local feature extraction, since the length of the text is not fixed, when the length scale is relatively large, a high-dimensional feature map will still be obtained after the convolution calculation is completed. Therefore, it is necessary to perform data compression on the feature map. The embodiment of the present application adopts the maximum value pooling method for compression calculation. The specific calculation formula is:

其中，k表示具体要选取最大值的数量，l表示文本的长度，h表示卷积核滑动窗口的高度，s表示卷积核窗口的滑动步长，因此最终池化后的特征向量为：Among them, k represents the number of specific maximum values to be selected, l represents the length of the text, h represents the height of the convolution kernel sliding window, and s represents the sliding step size of the convolution kernel window. Therefore, the final feature vector after pooling is:

在处理完所有的特征图谱之后，可以将特征向量进行合并融合操作，等待进行后续的网络训练。After processing all feature maps, the feature vectors can be merged and fused, waiting for subsequent network training.

在步骤217中，在合并层将两种单特征向量进行融合处理，得到文本多特征融合向量。In step 217, the two single feature vectors are fused at the merging layer to obtain a text multi-feature fusion vector.

在步骤218中，通过先验分布和近似后验分布的KL距离得到损失函数L_prior。In step 218, the loss function L_prior is obtained by the KL distance between the prior distribution and the approximate posterior distribution.

在一些实施例中，可以通过先验分布和近似后验分布的KL距离(是Kullback-Leibler差异的简称，也叫做相对熵，它衡量的是相同事件空间里的两个概率分布的差异情况)得到损失函数L_prior，其中，KL距离的计算公式如下：In some embodiments, the loss function L_prior can be obtained by the KL distance (short for Kullback-Leibler difference, also called relative entropy, which measures the difference between two probability distributions in the same event space) between the prior distribution and the approximate posterior distribution, where the calculation formula of the KL distance is as follows:

其中，p(x)表示真实标签的空间分布，q(x)表示模型输出的预测标签的空间分布。Among them, p(x) represents the spatial distribution of true labels, and q(x) represents the spatial distribution of predicted labels output by the model.

在步骤219中，将从先验分布采样获得的近似潜在变量和情感标签同时输入到解码器中得到生成数据。In step 219, the approximate latent variables and the sentiment labels obtained by sampling from the prior distribution are simultaneously input into the decoder to obtain generated data.

在一些实施例中，由于真实情况下的后验分布很难直接计算出，因此需要利用变分推理的思想，通过假设一个已知的先验分布随机变量Q_θ(z|x)来近似其后验分布，即：In some embodiments, since the posterior distribution in the real case It is difficult to calculate directly, so it is necessary to use the idea of variational inference to approximate its posterior distribution by assuming a known prior distribution random variable Q_θ (z|x), that is:

Q_θ(z|x)＝N(z；μ(x),σ(x)²) (21)Q_θ (z|x)＝N(z；μ(x),σ(x)² ) (21)

其中，μ(x)表示均值，σ(x)表示标准差，而根据编码器输出的隐藏层状态H对其每个时刻取最大值处理得到H_x后，可以利用神经网络计算出其均值和对数方差，即：Among them, μ(x) represents the mean, σ(x) represents the standard deviation, and after taking the maximum value of the hidden layer state H output by the encoder at each moment to obtain H_x , the neural network can be used to calculate its mean and logarithmic variance, that is:

μ＝W_μH_x+b_μ (22)μ＝W_μ H_x +b_μ (22)

logσ²＝W_σH_x+b_σ (23)logσ² =W_σ H_x +b_σ (23)

其中，W_μ和W_σ分别表示均值和对数方差的权重参数，b_μ和b_σ分别表示均值和对数方差的偏置参数。Among them, W_μ and W_σ represent the weight parameters of the mean and logarithmic variance, respectively, and b_μ and b_σ represent the bias parameters of the mean and logarithmic variance, respectively.

有了近似的后验分布后，如果直接对其进行采样的话，模型本身将无法通过反向传播优化梯度的方式进行训练，因此还需要采用重参数化的技巧对其进行转换，即从N(0,1)中采样任意一个随机变量ε，则其近似的后验分布潜在变量Z可以表示为：After having an approximate posterior distribution, if we directly sample it, the model itself will not be able to be trained by back-propagation optimization of the gradient. Therefore, we need to use the reparameterization technique to transform it, that is, sampling any random variable ε from N(0,1), then its approximate posterior distribution latent variable Z can be expressed as:

Z＝ε×σ+μ (24)Z＝ε×σ+μ (24)

采用了BiLSTM作为解码器，其输出第k个时刻隐藏层状态s_k为：BiLSTM is used as the decoder, and its output of the hidden layer state_sk at the kth moment is:

s_k＝BiLSTM(Z,s_k-1,y_k-1,c_k,C) (25)s_k =BiLSTM(Z,s_k-1 ,y_k-1 ,c_k ,C) (25)

其中，Z表示潜在变量，s_k-1表示前一时刻的双向隐藏层状态，y_k-1表示前一时刻的输出，C表示当前目标生成所需要结合的情感标签向量，c_k表示当前时刻由编码器的隐藏状态结合注意力机制生成的语义编码向量，即：Among them, Z represents the latent variable, s_k-1 represents the bidirectional hidden layer state at the previous moment, y_k-1 represents the output at the previous moment, C represents the emotion label vector required to generate the current target, and c_k represents the semantic encoding vector generated by the hidden state of the encoder combined with the attention mechanism at the current moment, that is:

e_k＝tanh(Wh_k+b) (28)e_k =tanh(Wh_k +b) (28)

最终将隐藏层状态s_k通过一层神经网络和Softmax即可获得P_θ(z|x)的输出，在模型训练完成后通过潜在变量Z中进行采样便可生成多样化的情感文本。Finally, the hidden layer state_{sk is} passed through a layer of neural network and Softmax to obtain the output of P_θ (z|x). After the model training is completed, diverse emotional texts can be generated by sampling the latent variable Z.

在步骤220中，将带标签的生成数据和真实数据预处理后输入到判别模型中，分别得到反馈奖励值的生成模型的损失函数。In step 220, the generated data and real data with labels are preprocessed and input into the discriminant model to obtain the loss function of the generation model of the feedback reward value.

在一些实施例中，如图15所示，本申请实施例提供的辅助分类器生成对抗网络(ACG AN，Auxiliary Classifier GAN)并没有直接将条件信息输入至判别模型D中，而是在对带有标签信息的真实文本和生成数据判定真伪性的同时，还需要借助一个辅助分类器同时预测出条件的类别信息。In some embodiments, as shown in FIG. 15 , the auxiliary classifier generative adversarial network (ACG AN) provided in the embodiment of the present application does not directly input the conditional information into the discriminant model D. Instead, while determining the authenticity of the real text and generated data with label information, it also needs to use an auxiliary classifier to simultaneously predict the category information of the condition.

在步骤221中，通过梯度更新模型的参数。In step 221, the parameters of the model are updated via the gradient.

在一些实施例中，整体网络中ACGAN模块D的优化目标函数可以分为两个部分，即：In some embodiments, the optimization objective function of the ACGAN module D in the overall network can be divided into two parts, namely:

生成模型G的优化目标函数也可以分为两个部分，即：The optimization objective function of the generative model G can also be divided into two parts, namely:

L_cls(G)＝E_{Z～Pz，C～Pc}[L_D(C|G(Z，C))] (32)L_cls (G) = E_{Z～Pz, C～Pc} [L_D (C|G(Z,C))] (32)

对于VAE模块而言，其损失函数即为先验分布的正则化项与期望的对数似然之差，则：For the VAE module, its loss function is the difference between the regularization term of the prior distribution and the expected log-likelihood, then:

其中，L表示样本数。Where L represents the number of samples.

在步骤222中，模型训练完成后，输出所有评论数据属于正向评论的概率。In step 222, after the model training is completed, the probability of all comment data belonging to positive comments is output.

在一些实施例中，当模型训练完成后，可以调用训练后的模型对测试集中的评述数据进行情感分类，输出测试集中的所有评述数据属于正向评论的概率。In some embodiments, after the model training is completed, the trained model can be called to perform sentiment classification on the review data in the test set, and the probability that all the review data in the test set are positive reviews can be output.

需要说明的是，本申请实施例提供的文本情感分类方法不仅仅能够对产品的评述数据进行正负情感的预测，还可以应用于其他场景，例如只需要调整输入其他场景的样本，就能够识别出对应场景的情感类别，例如“歌词情感分类场景”等。此外，本申请实施例提供的技术方案也可以提高产品的精细化运营。It should be noted that the text sentiment classification method provided in the embodiment of the present application can not only predict the positive and negative sentiment of the product review data, but can also be applied to other scenarios. For example, only by adjusting the input samples of other scenarios, the sentiment category of the corresponding scenario can be identified, such as "lyrics sentiment classification scenario". In addition, the technical solution provided in the embodiment of the present application can also improve the refined operation of the product.

下面继续结合图16对本申请实施例提供的文本情感分类方法的有益效果进行进一步的说明。The beneficial effects of the text sentiment classification method provided in the embodiment of the present application will be further explained below in conjunction with Figure 16.

示例的，参见图16，图16是本申请实施例提供的不同方案的效果对比示意图，如图16所示，从广告点击率来看，本申请实施例提供的方案相比于其他技术方案，平均提高了182.7％，此外，从广告转化率来看，本申请实施例提供的方案相较于其他方案，平均提高了178.41％。For example, see Figure 16, which is a schematic diagram comparing the effects of different solutions provided in the embodiments of the present application. As shown in Figure 16, from the perspective of advertisement click-through rate, the solution provided in the embodiments of the present application is improved by an average of 182.7% compared with other technical solutions. In addition, from the perspective of advertisement conversion rate, the solution provided in the embodiments of the present application is improved by an average of 178.41% compared with other solutions.

下面继续说明本申请实施例提供的文本情感分类装置243的实施为软件模块的示例性结构，在一些实施例中，如图2所示，存储在存储器240的文本情感分类装置243中的软件模块可以包括：获取模块2431、修改模块2432、融合模块2433、生成模块2434和训练模块2435。The following continues to describe an exemplary structure of the text sentiment classification device 243 provided in an embodiment of the present application implemented as a software module. In some embodiments, as shown in Figure 2, the software modules stored in the text sentiment classification device 243 in the memory 240 may include: an acquisition module 2431, a modification module 2432, a fusion module 2433, a generation module 2434 and a training module 2435.

获取模块2431，用于获取原始样本、以及针对原始样本标注的情感标签；修改模块2432，用于基于关键词修改策略对原始样本进行修改，得到对抗样本；融合模块2433，用于对原始样本和对抗样本分别对应的单特征向量进行融合处理，得到文本多特征融合向量，其中，文本多特征融合向量用于构建先验分布；生成模块2434，用于基于从先验分布采样得到的潜在变量和目标情感标签，生成符合目标情感标签的生成样本；训练模块2435，用于基于携带情感标签的原始样本、以及携带目标情感标签的生成样本，对文本情感分类模型进行训练，其中，训练后的文本情感分类模型用于确定待分类文本的情感分类结果。An acquisition module 2431 is used to acquire original samples and sentiment labels annotated for the original samples; a modification module 2432 is used to modify the original samples based on a keyword modification strategy to obtain adversarial samples; a fusion module 2433 is used to fuse the single feature vectors corresponding to the original samples and the adversarial samples to obtain a text multi-feature fusion vector, wherein the text multi-feature fusion vector is used to construct a prior distribution; a generation module 2434 is used to generate generated samples that conform to the target sentiment label based on the latent variables and target sentiment labels sampled from the prior distribution; a training module 2435 is used to train a text sentiment classification model based on the original samples carrying the sentiment labels and the generated samples carrying the target sentiment labels, wherein the trained text sentiment classification model is used to determine the sentiment classification result of the text to be classified.

在一些实施例中，修改模块2432，还用于将原始样本切分为多个子句；遍历多个子句，将满足以下条件的子句添加到候选关键子句集中：从原始样本中移除子句后的情感分类结果与情感标签不同；将候选关键子句集中的子句拆分成多个词语，并确定多个词语中的关键词；按照预先设定的关键词修改策略对多个关键词进行修改，得到对抗样本。In some embodiments, the modification module 2432 is also used to divide the original sample into multiple clauses; traverse multiple clauses and add clauses that meet the following conditions to the candidate key clause set: the sentiment classification result after removing the clause from the original sample is different from the sentiment label; split the clauses in the candidate key clause set into multiple words, and determine the keywords in the multiple words; modify the multiple keywords according to a pre-set keyword modification strategy to obtain adversarial samples.

在一些实施例中，修改模块2432，还用于对多个词语进行词性标注，并删除属于无意义词性的词语，其中，无意义词性包括以下至少之一：介词、代词、数词、冠词；文本情感分类装置243还包括确定模块2436，用于将多个词语中剩余的词语，确定为多个词语中的关键词；In some embodiments, the modification module 2432 is further used to perform part-of-speech tagging on the multiple words and delete words belonging to meaningless parts of speech, wherein the meaningless parts of speech include at least one of the following: prepositions, pronouns, numerals, and articles; the text sentiment classification device 243 also includes a determination module 2436, which is used to determine the remaining words in the multiple words as keywords in the multiple words;

在一些实施例中，确定模块2436，还用于确定每个关键词在候选关键子句集中的贡献度；文本情感分类装置243还包括排序模块2437，用于按照贡献度从大到小的顺序对多个关键词进行排序，得到排序结果。In some embodiments, the determination module 2436 is also used to determine the contribution of each keyword in the candidate key clause set; the text sentiment classification device 243 also includes a sorting module 2437, which is used to sort multiple keywords in order of contribution from large to small to obtain a sorting result.

在一些实施例中，修改模块2432，还用于对排序结果中的前N个关键词逐个进行修改和替换，并同时预测修改后的文本的情感分类结果，其中，N为大于或等于1的正整数；确定模块2436，还用于当修改后的文本的情感分类结果与情感标签不同时，将修改后的文本确定为对抗样本。In some embodiments, the modification module 2432 is also used to modify and replace the first N keywords in the sorting results one by one, and simultaneously predict the sentiment classification result of the modified text, where N is a positive integer greater than or equal to 1; the determination module 2436 is also used to determine the modified text as an adversarial sample when the sentiment classification result of the modified text is different from the sentiment label.

在一些实施例中，确定模块2436，还用于确定原始样本和对抗样本分别对应的加权后的特征；文本情感分类装置243还包括卷积模块2438和池化模块2439，其中，卷积模块2438，用于分别对原始样本和对抗样本加权后的特征进行卷积处理，对应得到原始样本的特征图谱、以及对抗样本的特征图谱；池化模块2439，用于分别对原始样本和对抗样本的特征图谱进行池化处理，对应得到原始样本的单特征向量、以及对抗样本的单特征向量。In some embodiments, the determination module 2436 is further used to determine the weighted features corresponding to the original sample and the adversarial sample respectively; the text sentiment classification device 243 also includes a convolution module 2438 and a pooling module 2439, wherein the convolution module 2438 is used to perform convolution processing on the weighted features of the original sample and the adversarial sample respectively, and obtain the feature map of the original sample and the feature map of the adversarial sample; the pooling module 2439 is used to perform pooling processing on the feature maps of the original sample and the adversarial sample respectively, and obtain the single feature vector of the original sample and the single feature vector of the adversarial sample.

在一些实施例中，确定模块2436，还用于基于原始样本和对抗样本，分别确定词语级文本和字符级文本的逆文档频率特征向量；文本情感分类装置243还包括嵌入模块24310、提取模块24311和加权模块24312，其中，嵌入模块24310，用于分别对原始样本和对抗样本进行词嵌入处理，对应得到原始样本的词向量、以及对抗样本的词向量；以及用于分别对原始样本和对抗样本进行字嵌入处理，对应得到原始样本的字向量、以及对抗样本的字向量；提取模块24311，用于基于注意力机制分别提取词向量的语义特征和字向量的语义特征；加权模块24312，用于分别将词向量的语义特征、以及字向量的语义特征与对应的逆文档频率特征向量进行加权，对应得到原始样本加权后的特征、以及对抗样本加权后的特征。In some embodiments, the determination module 2436 is further used to determine the inverse document frequency feature vectors of the word-level text and the character-level text respectively based on the original sample and the adversarial sample; the text sentiment classification device 243 also includes an embedding module 24310, an extraction module 24311 and a weighting module 24312, wherein the embedding module 24310 is used to perform word embedding processing on the original sample and the adversarial sample respectively, and obtain the word vector of the original sample and the word vector of the adversarial sample; and to perform word embedding processing on the original sample and the adversarial sample respectively, and obtain the word vector of the original sample and the word vector of the adversarial sample; the extraction module 24311 is used to extract the semantic features of the word vector and the semantic features of the word vector respectively based on the attention mechanism; the weighting module 24312 is used to weight the semantic features of the word vector and the semantic features of the word vector with the corresponding inverse document frequency feature vector respectively, and obtain the weighted features of the original sample and the weighted features of the adversarial sample.

在一些实施例中，生成模块2434，还用于将从先验分布采样得到的潜在变量和目标情感标签输入解码器，以使解码器基于潜在变量和目标情感标签对应的标签向量，生成符合目标情感标签的生成样本。In some embodiments, the generation module 2434 is also used to input the latent variables and target emotion labels sampled from the prior distribution into the decoder, so that the decoder generates generated samples that conform to the target emotion label based on the label vectors corresponding to the latent variables and the target emotion label.

在一些实施例中，文本情感分类模型为生成对抗网络模型，生成对抗网络模型包括生成模型和判别模型，生成模型用于生成生成样本；训练模块2435，还用于将携带情感标签的原始样本、以及携带目标情感标签的生成样本输入至判别模型，以使判别模型对原始样本和生成样本进行判定；将判别模型输出的判定结果反馈给生成模型，以作为生成模型得到的奖励值；基于奖励值对生成模型和判别模型的参数进行更新，其中，更新后的判别模型用于确定待分类文本的情感分类结果。In some embodiments, the text sentiment classification model is a generative adversarial network model, which includes a generative model and a discriminant model. The generative model is used to generate generated samples; the training module 2435 is also used to input the original samples carrying sentiment labels and the generated samples carrying target sentiment labels into the discriminant model so that the discriminant model can make judgments on the original samples and the generated samples; the judgment result output by the discriminant model is fed back to the generative model as a reward value obtained by the generative model; the parameters of the generative model and the discriminant model are updated based on the reward value, wherein the updated discriminant model is used to determine the sentiment classification result of the text to be classified.

在一些实施例中，训练模块2435，还用于基于奖励值分别构建生成模型和判别模型的损失函数；基于损失函数，采用梯度下降的方式，对生成模型和判别模型的参数进行更新。In some embodiments, the training module 2435 is further used to construct loss functions of the generative model and the discriminative model respectively based on the reward value; based on the loss function, the parameters of the generative model and the discriminative model are updated by using a gradient descent method.

在一些实施例中，获取模块2431，还用于获取多个评述数据；对多个评论数据进行预处理，得到预处理后的多个评论数据；对预处理后的多个评论数据进行随机抽样，将被抽中的评论数据确定为原始样本；基于人工经验对原始样本标注用于表征原始样本的情感极性的情感标签。In some embodiments, the acquisition module 2431 is also used to obtain multiple comment data; preprocess the multiple comment data to obtain multiple preprocessed comment data; randomly sample the multiple preprocessed comment data, and determine the sampled comment data as the original sample; and annotate the original sample with an emotional label based on artificial experience to characterize the emotional polarity of the original sample.

需要说明的是，本申请实施例装置的描述，与上述方法实施例的描述是类似的，具有同方法实施例相似的有益效果，因此不做赘述。对于本申请实施例提供的文本情感分类装置中未尽的技术细节，可以根据图3至图5任一附图的说明而理解。It should be noted that the description of the device of the embodiment of the present application is similar to the description of the above-mentioned method embodiment, and has similar beneficial effects as the method embodiment, so it will not be repeated. The unfinished technical details of the text sentiment classification device provided in the embodiment of the present application can be understood according to the description of any of the accompanying drawings of Figures 3 to 5.

本申请实施例提供了一种计算机程序产品，该计算机程序产品包括计算机程序或计算机可执行指令，该计算机程序或计算机可执行指令存储在计算机可读存储介质中。计算机设备的处理器从计算机可读存储介质读取该计算机可执行指令，处理器执行该计算机可执行指令，使得该计算机设备执行本申请实施例上述的文本情感分类方法。The embodiment of the present application provides a computer program product, which includes a computer program or a computer executable instruction, and the computer program or the computer executable instruction is stored in a computer readable storage medium. The processor of the computer device reads the computer executable instruction from the computer readable storage medium, and the processor executes the computer executable instruction, so that the computer device executes the text sentiment classification method described in the embodiment of the present application.

本申请实施例提供一种存储有计算机可执行指令的计算机可读存储介质，其中存储有计算机可执行指令，当计算机可执行指令被处理器执行时，将引起处理器执行本申请实施例提供的文本情感分类方法，例如，如图3至图5任一附图示出的文本情感分类方法。An embodiment of the present application provides a computer-readable storage medium storing computer-executable instructions, wherein computer-executable instructions are stored. When the computer-executable instructions are executed by a processor, the processor will execute the text sentiment classification method provided by the embodiment of the present application, for example, the text sentiment classification method shown in any of Figures 3 to 5.

在一些实施例中，计算机可读存储介质可以是FRAM、ROM、PROM、EPROM、EEP ROM、闪存、磁表面存储器、光盘、或CD-ROM等存储器；也可以是包括上述存储器之一或任意组合的各种设备。In some embodiments, the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEP ROM, flash memory, magnetic surface storage, optical disk, or CD-ROM; or it may be various devices including one or any combination of the above memories.

在一些实施例中，可执行指令可以采用程序、软件、软件模块、脚本或代码的形式，按任意形式的编程语言(包括编译或解释语言，或者声明性或过程性语言)来编写，并且其可按任意形式部署，包括被部署为独立的程序或者被部署为模块、组件、子例程或者适合在计算环境中使用的其它单元。In some embodiments, executable instructions may be in the form of a program, software, software module, script or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine or other unit suitable for use in a computing environment.

作为示例，可执行指令可被部署为在一个电子设备上执行，或者在位于一个地点的多个电子设备上执行，又或者，在分布在多个地点且通过通信网络互连的多个电子设备上执行。As an example, the executable instructions may be deployed to be executed on one electronic device, or on multiple electronic devices located at one site, or on multiple electronic devices distributed at multiple sites and interconnected by a communication network.

以上所述，仅为本申请的实施例而已，并非用于限定本申请的保护范围。凡在本申请的精神和范围之内所作的任何修改、等同替换和改进等，均包含在本申请的保护范围之内。The above is only an embodiment of the present application and is not intended to limit the protection scope of the present application. Any modifications, equivalent substitutions and improvements made within the spirit and scope of the present application are included in the protection scope of the present application.