CN116414979A - Sample stance detection method and device based on hierarchical contrastive learning - Google Patents

Sample stance detection method and device based on hierarchical contrastive learning

Info

Publication number
CN116414979A
Authority
CN
China
Prior art keywords
text
sample
topic
feature
stance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310191175.4A
Other languages
Chinese (zh)
Other versions
CN116414979B (en)
Inventor
周斌
赵学臣
涂宏魁
李爱平
江荣
王晔
田磊
邹家英
谢锋
汪海洋
张中
伍泓舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology
Priority to CN202310191175.4A
Publication of CN116414979A
Application granted
Publication of CN116414979B
Status: Active
Anticipated expiration


Abstract

Translated from Chinese

The present invention provides a sample stance detection method and device based on hierarchical contrastive learning. The method comprises the following steps: 1) Data collection: collect social network text data from social media, where the data include the topic text under discussion and user comment texts on that topic. 2) Construct a data set from the collected data: the source-topic target data set with known stance labels serves as the training set, and the destination-topic target data set without stance labels serves as the test set. 3) Build a sample stance detection model that outputs the predicted probability of the stance a comment text holds toward the topic text. 4) Train the sample stance detection model on the training set until it converges, obtaining the stance prediction model. 5) Feed the topic text and comment text to be predicted into the trained stance prediction model, and output the probability of the stance the comment text holds toward the topic text.

Figure 202310191175

Description

Translated from Chinese
Sample stance detection method and device based on hierarchical contrastive learning

Technical Field

The present invention relates to the technical field of data mining and analysis, and in particular to a sample stance detection method and device based on hierarchical contrastive learning.

Background Art

Text stance detection, also known as stance classification or stance recognition, refers to automatically judging, from text published by a user, the user's stance toward a pre-specified target. Text stance detection and text sentiment analysis are both important research directions in text opinion mining. Unlike sentiment analysis, stance detection must discriminate among the more complexly expressed stances of "support, oppose, or neutral" toward a given target, rather than the positive or negative sentiment polarity toward a specified object.

In recent years, research on stance detection has received extensive attention from academia and industry, with applications in many fields such as business intelligence and public opinion analysis. It can also support many related studies: its results can be extended to research areas such as dialogue systems, personalized recommendation, and social sensing, and it therefore has significant academic and practical value.

Traditional target-specific stance detection mostly addresses a single target: given a single text (a tweet, Weibo post, news article, debate text, etc.) and a target, the task is to determine whether the text's attitude toward that target is supportive, opposing, or neutral; that is, it assumes that the training set and the test set contain data about the same target. In practice, however, it is infeasible to collect training data for every target topic, there is always a large amount of data about unseen targets, and obtaining high-quality labels for a new target topic is often expensive. It is therefore essential to study sample stance detection that adapts to unknown targets.

Summary of the Invention

The technical problem to be solved by the present invention is to provide a sample stance detection method and device based on hierarchical contrastive learning that, by collecting social network text data, obtains the probability of the stance a comment text holds toward a topic text.

The technical solution adopted by the present invention to solve the above technical problem is a sample stance detection method based on hierarchical contrastive learning, comprising the following steps:

1) Data collection: collect social network text data from social media; the data include the topic text under discussion and user comment texts on that topic.

2) Construct a data set from the collected data: the source-topic target data set with known stance labels serves as the training set, and the destination-topic target data set without stance labels serves as the test set;

3) Build a sample stance detection model that outputs the predicted probability of the stance a comment text holds toward the topic text;

4) Train the sample stance detection model on the training set until it converges, obtaining the stance prediction model;

5) Feed the topic text and comment text to be predicted into the trained stance prediction model, and output the probability of the stance the comment text holds toward the topic text.

Preferably, in step 2, each topic text, comment text, and stance label constitutes a training sample, and each topic text and comment text whose topic does not overlap with any training-sample topic constitutes a test sample; all texts are organized in this way to construct the training set and the test set;

Preferably, the source-topic target data set with known stance labels is $\mathcal{D}^s=\{(x_i^s, y_i^s)\}_{i=1}^{N_s}$, i.e., the training set, and the destination-topic target data set without stance labels is $\mathcal{D}^d=\{x_j^d\}_{j=1}^{N_d}$, i.e., the test set, where $y_i^s$ is the stance label of a labeled sample about the source topic target $t^s$, and $N_s$ and $N_d$ are the numbers of samples in the source-topic and destination-topic target data, respectively. Each sentence $x_i^s$ about the source topic target $t^s$ in the source-topic target data set $\mathcal{D}^s$ is used to train the sample stance detection model, so that the model generalizes to the destination-topic target data set $\mathcal{D}^d$ and predicts the stance of a sentence $x_j^d$ about the destination topic target $t^d$, thereby outputting the predicted probability of the stance the comment text holds toward the topic text.

Preferably, in step 3, the sample stance detection model comprises: a text global semantic feature extraction module, an aspect-level feature extraction module, an attribute-level feature extraction module, a multi-dimensional semantic feature fusion module, and a stance detection module;

Text global semantic feature extraction module: takes the concatenation of the topic text and the comment text as input, and outputs the topic-specific global semantic features of the comment text;

Aspect-level feature extraction module: takes the obtained global semantic features as input, and outputs multiple aspect-level features representing the constituents of the text semantics;

Attribute-level feature extraction module: takes the obtained aspect-level features as input, and outputs the corresponding attribute-level features within each aspect-level feature;

Multi-dimensional semantic feature fusion module: takes the concatenation of the obtained aspect-level features as input, and outputs a fused feature consistent with the global semantic feature distribution;

Stance detection module: takes the obtained fused feature as input, and outputs the predicted probability of the stance the user comment text holds toward the specific topic text.

Preferably, in the text global semantic feature extraction module, each sample is constructed in the format "[CLS]t[SEP]r[SEP]" and fed to the encoder module, obtaining the $d_m$-dimensional vector $z\in\mathbb{R}^{d_m}$ of the hidden layer at the [CLS] token as the feature representation of sentence r with respect to the specific target t, as well as the feature matrix $Z\in\mathbb{R}^{n\times d_m}$ of all words in sentence r at the last hidden layer:

$z, Z = f_\theta(u) = \mathrm{BERT}([\mathrm{CLS}]\,t\,[\mathrm{SEP}]\,x)$

In a training batch, the feature representations of all samples can be defined as $\{Z^{(i)}\}_{i=1}^{N_b}$, where $N_b$ is the size of the training batch.

Preferably, the aspect-level feature extraction module extracts semantic expression features at different aspect levels from the global semantic feature Z, with each aspect treated as a feature group. The aspect-level feature extraction module is composed of K feature experts; each feature expert is defined as a one-dimensional convolution with $d_m$-dimensional input channels, $d_g$-dimensional output channels, and a convolution kernel of size w, so that the k-th expert outputs a vector $v_k = f_k(Z) = \mathrm{Conv}_k(Z, w)$.

Preferably, the aspect-level feature extraction module adopts supervised inter-group contrastive learning: any two samples from the same group are treated as a positive pair, and samples from different groups are treated as negatives of each other. The inter-group contrastive loss of each data batch in the aspect-level feature extraction module is defined as:

$$\mathcal{L}_a = \frac{1}{N_{ap}}\sum_{i=1}^{N_{ap}} \mathcal{L}_a^i$$

$$\mathcal{L}_a^i = \frac{-1}{N_b - 1}\sum_{j=1}^{N_{ap}} 1_{[i\neq j]}\,1_{[g_i = g_j]}\log\frac{\exp(\hat{v}_i\cdot\hat{v}_j/\tau_a)}{\sum_{k=1}^{N_{ap}} 1_{[i\neq k]}\exp(\hat{v}_i\cdot\hat{v}_k/\tau_a)}$$

A projection head maps the aspect-level feature $v_i$ to $\hat{v}_i$ for computing the contrastive loss; $\mathcal{L}_a^i$ is the inter-group loss of the i-th aspect-level sample; $1_{[i\neq j]}\in\{0,1\}$ is an indicator function that equals 1 if and only if $i\neq j$; and $\tau_a$ is the temperature parameter of the inter-group contrastive loss, which controls the strength of the penalty on hard samples in contrastive learning.

Preferably, in the attribute-level feature extraction module, attribute-level feature learning is formulated as a self-supervised representation learning problem;

The expert feature mapping function $\mathrm{Conv}_k(Z, w)$ maps the global semantic features of sample $u_i$ into the corresponding aspect-level feature space, yielding the positive pair $\{v_{ki}, v_{ki}'\}$, while the other samples in the k-th group serve as negative samples;

The intra-group contrastive loss of each data batch in the attribute-level feature extraction module is defined as:

$$\mathcal{L}_e = \frac{1}{N_{ap}}\sum_{i=1}^{N_{ap}} \mathcal{L}_e^i$$

$$\mathcal{L}_e^i = -\log\frac{\exp\!\big(\mathrm{sim}(v_{ki}, v_{ki}')/\tau_e\big)}{\sum_{j=1}^{N_b} 1_{[i\neq j]}\exp\!\big(\mathrm{sim}(v_{ki}, v_{kj}')/\tau_e\big)}$$

where $\mathcal{L}_e^i$ is the contrastive loss of the i-th aspect-level sample, $\tau_e$ denotes the intra-group temperature parameter, and $1_{[i\neq j]}\in\{0,1\}$ is an indicator function that equals 1 if and only if $i\neq j$.

Preferably, in the multi-dimensional semantic feature fusion module, the fused feature is expressed as:

$$h_i = v_{1i}\oplus v_{2i}\oplus\cdots\oplus v_{Ki}$$

$$\hat{H}_i = \mathrm{FFN}(W_f h_i + b_f)$$

where $\oplus$ denotes the concatenation operation, $h_i$ is the fused feature of sample i obtained by concatenating its K aspect-level features, $\mathrm{FFN}(\cdot)$ is a feed-forward neural network, $\hat{H}_i$ is the feature after multi-dimensional semantic fusion, and $W_f$ and $b_f$ are learnable parameters.

Further, the distribution consistency between the original global semantic features and the fused semantic features is maintained by optimizing the KL divergence between them:

$$p_i = \mathrm{softmax}(z_i),\qquad q_i = \mathrm{softmax}(\hat{H}_i)$$

$$\mathcal{L}_g = \frac{1}{N_b}\sum_{i=1}^{N_b}\mathrm{KL}\big(p_i \,\|\, q_i\big)$$

Preferably, in the stance detection module, a fully connected layer with softmax normalization is used to predict the probability distribution of the stance prediction:

$$\hat{y}_i = \mathrm{softmax}(W_p \hat{H}_i + b_p)$$

where $\hat{y}_i\in\mathbb{R}^{d_p}$ is the stance probability distribution predicted for the input sample $x_i$, $d_p$ is the dimension of the stance label, and $W_p$ and $b_p$ are learnable parameters.

Preferably, in step 4, the model is trained with the supervised stance classification loss $\mathcal{L}_{cls}$, the self-supervised inter-group contrastive learning loss $\mathcal{L}_a$, the self-supervised intra-group contrastive learning loss $\mathcal{L}_e$, and the global semantics preservation loss $\mathcal{L}_g$; the overall objective $\mathcal{L}$ can be formulated as the sum of the four losses:

$$\mathcal{L} = \mathcal{L}_{cls} + \alpha\mathcal{L}_a + \beta\mathcal{L}_e + \gamma\mathcal{L}_g + \lambda\|\Theta\|_2^2$$

where α, β, and γ are tunable hyperparameters, Θ denotes all trainable parameters in the model, and λ denotes the L2 regularization coefficient.

A computer device comprises a memory and a processor; the memory stores a computer program, and when the processor executes the computer program, the above sample stance detection method based on hierarchical contrastive learning is implemented.

A computer-readable storage medium has a program stored thereon, characterized in that when the program is executed by a processor, the above sample stance detection method based on hierarchical contrastive learning is implemented.

The beneficial effects of the present invention are as follows. The sample stance detection method based on hierarchical contrastive learning collects data from social media, constructs a data set from the collected data, builds a sample stance detection model that outputs the predicted probability of the stance a comment text holds toward a topic text, and trains the model on the training set until it converges to obtain the stance prediction model; the topic text and comment text to be predicted are then fed into the trained stance prediction model, which outputs the probability of the stance the comment text holds toward the topic text. Stance detection can be performed even when the stance-target data distribution is imbalanced. The sample stance detection method based on hierarchical contrastive learning provided by the present invention can be applied to online public opinion event analysis and data mining, in particular to the analysis and supervision of user stances on semantically concentrated topics; it can also be used for enterprise network information monitoring, predicting product feedback of interest to an enterprise in order to improve product lines and related services.

Description of Drawings

Fig. 1 is a schematic diagram of the steps of the sample stance detection method based on hierarchical contrastive learning in an embodiment of the present invention;

Fig. 2 is a schematic flow diagram of the sample stance detection method based on hierarchical contrastive learning in an embodiment of the present invention;

Fig. 3 is an internal structure diagram of a computer device in an embodiment.

Detailed Description of Embodiments

The implementation of the present invention is described in detail below with reference to the accompanying drawings and embodiments, so that the process by which the present invention applies technical means to solve technical problems and achieve technical effects can be fully understood and carried out. It should be noted that, as long as no conflict arises, the embodiments of the present invention and the features of the embodiments may be combined with one another, and the resulting technical solutions all fall within the protection scope of the present invention.

In addition, the steps shown in the flowcharts of the drawings may be executed in a computer system such as a set of computer-executable instructions, and although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from that given here.

Referring to Fig. 1 and Fig. 2, a sample stance detection method based on hierarchical contrastive learning of the present invention comprises the following steps:

1) Data collection: collect social network text data from social media; the data include the topic text under discussion and user comment texts on that topic.

2) Construct a data set from the collected data: the source-topic target data set with known stance labels serves as the training set, and the destination-topic target data set without stance labels serves as the test set;

3) Build a sample stance detection model that outputs the predicted probability of the stance a comment text holds toward the topic text;

4) Train the sample stance detection model on the training set until it converges, obtaining the stance prediction model;

5) Feed the topic text and comment text to be predicted into the trained stance prediction model, and output the probability of the stance the comment text holds toward the topic text.

Specifically, in an embodiment of the present invention, in step 2, each topic text, comment text, and stance label constitutes a training sample, and each topic text and comment text whose topic does not overlap with any training-sample topic constitutes a test sample; all texts are organized in this way to construct the training set and the test set.
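
To illustrate this organization of samples, the following minimal Python sketch (field names and topic identifiers are hypothetical, not from the patent) splits collected records into a labeled training set over source topics and an unlabeled test set over destination topics:

```python
# Hypothetical sketch of the data organization described above.
# Field names (topic, comment, label) are illustrative assumptions.
from typing import Dict, List

def build_datasets(records: List[Dict], source_topics: set, dest_topics: set):
    """Split collected (topic, comment, label) records into a labeled training
    set (source topics) and an unlabeled test set (destination topics whose
    topic text never appears in the training set)."""
    train_set, test_set = [], []
    for rec in records:
        if rec["topic"] in source_topics and rec.get("label") is not None:
            train_set.append((rec["topic"], rec["comment"], rec["label"]))
        elif rec["topic"] in dest_topics and rec["topic"] not in source_topics:
            test_set.append((rec["topic"], rec["comment"]))  # no stance label
    return train_set, test_set

# Example usage with toy records
records = [
    {"topic": "topic A", "comment": "I fully agree with this.", "label": "favor"},
    {"topic": "topic B", "comment": "Not sure this is a good idea."},
]
train_set, test_set = build_datasets(records, {"topic A"}, {"topic B"})
```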

Specifically, in an embodiment of the present invention, the source-topic target data set with known stance labels is $\mathcal{D}^s=\{(x_i^s, y_i^s)\}_{i=1}^{N_s}$, i.e., the training set, and the destination-topic target data set without stance labels is $\mathcal{D}^d=\{x_j^d\}_{j=1}^{N_d}$, i.e., the test set, where $y_i^s$ is the stance label of a labeled sample about the source topic target $t^s$, and $N_s$ and $N_d$ are the numbers of samples in the source-topic and destination-topic target data, respectively. Each sentence $x_i^s$ about the source topic target $t^s$ in $\mathcal{D}^s$ is used to train the sample stance detection model, so that the model generalizes to the destination-topic target data set $\mathcal{D}^d$ and predicts the stance of a sentence $x_j^d$ about the destination topic target $t^d$, thereby outputting the predicted probability of the stance the comment text holds toward the topic text.

Specifically, in an embodiment of the present invention, in step 3, the sample stance detection model comprises: a text global semantic feature extraction module, an aspect-level feature extraction module, an attribute-level feature extraction module, a multi-dimensional semantic feature fusion module, and a stance detection module;

Text global semantic feature extraction module: takes the concatenation of the topic text and the comment text as input, and outputs the topic-specific global semantic features of the comment text;

Aspect-level feature extraction module: takes the obtained global semantic features as input, and outputs multiple aspect-level features representing the constituents of the text semantics;

Attribute-level feature extraction module: takes the obtained aspect-level features as input, and outputs the corresponding attribute-level features within each aspect-level feature;

Multi-dimensional semantic feature fusion module: takes the concatenation of the obtained aspect-level features as input, and outputs a fused feature consistent with the global semantic feature distribution;

Stance detection module: takes the obtained fused feature as input, and outputs the predicted probability of the stance the user comment text holds toward the specific topic text.

Specifically, in an embodiment of the present invention, the text global semantic feature extraction module consists mainly of a BERT layer; the concatenation of the topic text and the comment text is input to the global semantic feature extraction module, which outputs the topic-specific global semantic features of the comment text.

This part focuses on the high-level semantics of the text and learns, for a sentence $r=\{w_1, w_2, \dots, w_n\}$ composed of n words, a stance feature representation with respect to the specific topic target t. Each sample is constructed in the format "[CLS]t[SEP]r[SEP]" and fed to the encoder module, obtaining the $d_m$-dimensional vector $z\in\mathbb{R}^{d_m}$ of the hidden layer at the [CLS] token as the feature representation of sentence r with respect to the specific target t, as well as the feature matrix $Z\in\mathbb{R}^{n\times d_m}$ of all words in sentence r at the last hidden layer:

$z, Z = f_\theta(u) = \mathrm{BERT}([\mathrm{CLS}]\,t\,[\mathrm{SEP}]\,x)$

In a training batch, the feature representations of all samples can be defined as $\{Z^{(i)}\}_{i=1}^{N_b}$, where $N_b$ is the size of the training batch. In this embodiment, $N_b$ is set to 32 and $d_m$ is set to 768.
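
A minimal sketch of this encoding step is given below, assuming the Hugging Face transformers library and the bert-base-chinese checkpoint (both are assumptions; the patent only specifies a BERT encoder with $N_b = 32$ and $d_m = 768$):

```python
# Sketch of the global semantic feature extraction, assuming Hugging Face
# transformers; the checkpoint name is an illustrative assumption.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
encoder = BertModel.from_pretrained("bert-base-chinese")  # d_m = 768

def encode(topic: str, comment: str):
    # Passing the pair to the tokenizer yields "[CLS] t [SEP] r [SEP]".
    inputs = tokenizer(topic, comment, return_tensors="pt", truncation=True)
    outputs = encoder(**inputs)
    Z = outputs.last_hidden_state          # (1, seq_len, 768): all-token features
    z = outputs.last_hidden_state[:, 0]    # (1, 768): [CLS] feature for (t, r)
    return z, Z

z, Z = encode("some topic text", "a user comment about the topic")
```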

Specifically, in an embodiment of the present invention, the aspect-level feature extraction module maps the global semantic features into different feature distribution spaces to separate multiple aspect-level features. Here, K nonlinear feature embedding functions $v_k = f_k(Z)$ are learned to project the global semantic features into $k\in(1,\dots,K)$ different $d_g$-dimensional latent spaces.

Different aspect-level semantic expression features are extracted from the global semantic feature Z by the aspect-level feature extraction module, with each aspect treated as a feature group. The aspect-level feature extraction module is composed of K feature experts; each feature expert is defined as a one-dimensional convolution with $d_m$-dimensional input channels, $d_g$-dimensional output channels, and a convolution kernel of size w, so that the k-th expert outputs a vector $v_k = f_k(Z) = \mathrm{Conv}_k(Z, w)$. Given the input data $\{Z^{(i)}\}_{i=1}^{N_b}$ of a training batch, after processing by the aspect-level feature extraction module the global semantic features of each sample are separated into K groups, each containing $N_b$ aspect-level samples, i.e., $G_k=\{(v_{ki}, g_{ki})\}_{i=1}^{N_b}$, where $v_{ki}$ is the i-th sample in the k-th group and $g_{ki}$ is the label of the aspect-level sample $v_{ki}$ (here its group index is used as the label, i.e., $g_{ki}=k$); the K groups $\{G_k\}_{k=1}^{K}$ therefore contain $N_{ap}=K\times N_b$ aspect-level samples in total. In this embodiment, K is set to 4, w is set to 3, and $d_g$ is set to 283.
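
The following sketch illustrates the K convolutional feature experts under the stated settings (K = 4, w = 3, $d_m$ = 768, $d_g$ = 283); the max-pooling over token positions used to reduce each expert's sequence output to a single vector is an assumption, since the patent does not state this step explicitly:

```python
import torch
import torch.nn as nn

class AspectExperts(nn.Module):
    """K one-dimensional convolution 'experts' that project the token-level
    global features Z (N_b, n, d_m) into K aspect-level vectors of size d_g."""
    def __init__(self, d_m=768, d_g=283, K=4, w=3):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Conv1d(d_m, d_g, kernel_size=w, padding=w // 2) for _ in range(K)
        )

    def forward(self, Z):                      # Z: (N_b, n, d_m)
        Z = Z.transpose(1, 2)                  # (N_b, d_m, n) for Conv1d
        # Max-pool over token positions (an assumption) to get one vector per expert.
        return [expert(Z).max(dim=-1).values for expert in self.experts]

experts = AspectExperts()
Z = torch.randn(32, 40, 768)                   # N_b = 32 samples, 40 tokens
aspect_feats = experts(Z)                      # list of 4 tensors, each (32, 283)
```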

Secondly, in order to decouple the different aspect-level features as much as possible, supervised inter-group contrastive learning is adopted to achieve high cohesion within groups and low coupling between groups. Here, any two samples from the same group are treated as a positive pair, and samples from different groups are treated as negatives of each other. The inter-group contrastive loss of each data batch in the aspect-level feature extraction module is therefore defined as:

$$\mathcal{L}_a = \frac{1}{N_{ap}}\sum_{i=1}^{N_{ap}} \mathcal{L}_a^i$$

$$\mathcal{L}_a^i = \frac{-1}{N_b - 1}\sum_{j=1}^{N_{ap}} 1_{[i\neq j]}\,1_{[g_i = g_j]}\log\frac{\exp(\hat{v}_i\cdot\hat{v}_j/\tau_a)}{\sum_{k=1}^{N_{ap}} 1_{[i\neq k]}\exp(\hat{v}_i\cdot\hat{v}_k/\tau_a)}$$

where a projection head (i.e., a one-layer MLP) maps the aspect-level feature $v_i$ to $\hat{v}_i$ for computing the contrastive loss; $\mathcal{L}_a^i$ is the inter-group loss of the i-th aspect-level sample; $1_{[i\neq j]}\in\{0,1\}$ is an indicator function that equals 1 if and only if $i\neq j$; and $\tau_a$ is the temperature parameter of the inter-group contrastive loss, which controls the strength of the penalty on hard samples in contrastive learning. In this embodiment, $\tau_a$ is set to 0.5.
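
The exact loss appears only as an image in the original publication; the sketch below therefore assumes a standard supervised-contrastive (SupCon-style) formulation with the group index as the label, matching the description of positives, negatives, the indicator function, and the temperature $\tau_a$ = 0.5:

```python
import torch
import torch.nn.functional as F

def inter_group_contrastive_loss(v_hat, group_ids, tau_a=0.5):
    """v_hat: (N_ap, d) projected aspect-level features (N_ap = K * N_b);
    group_ids: (N_ap,) expert/group index of each sample.
    Same-group pairs are positives, different-group samples are negatives."""
    v_hat = F.normalize(v_hat, dim=-1)
    sim = v_hat @ v_hat.t() / tau_a                        # (N_ap, N_ap)
    n = v_hat.size(0)
    not_self = ~torch.eye(n, dtype=torch.bool)             # indicator 1[i != j]
    pos_mask = (group_ids[:, None] == group_ids[None, :]) & not_self
    # log-softmax over all non-self pairs, averaged over each anchor's positives
    denom = torch.logsumexp(sim.masked_fill(~not_self, float("-inf")), dim=1, keepdim=True)
    log_prob = sim - denom
    loss_i = -(log_prob * pos_mask).sum(dim=1) / pos_mask.sum(dim=1).clamp(min=1)
    return loss_i.mean()

# Toy usage: 4 groups of 32 projected features each
loss_a = inter_group_contrastive_loss(torch.randn(128, 283),
                                      torch.arange(4).repeat_interleave(32))
```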

Specifically, in an embodiment of the present invention, the attribute-level feature extraction module focuses on the multiple attribute features that may be contained within a given aspect; different attribute features express different stance attitudes toward the target. Attribute-level feature learning is formulated as a self-supervised representation learning problem.

Dropout is used as the minimal form of data augmentation of the representation vectors, so that an augmented view forms a positive pair with the original semantic sample. Defining $Z = f_\theta(u, m)$ as the encoder with dropout mask m, the sample $u_i$ is fed into the encoder twice with different dropout masks m, obtaining two feature representations of the global semantics $\{Z^{(i)}, Z'^{(i)}\}$. Further, the expert feature mapping function $\mathrm{Conv}_k(Z, w)$ maps the global semantic features of sample $u_i$ into the corresponding aspect-level feature space, yielding the positive pair $\{v_{ki}, v_{ki}'\}$, while the other samples in the k-th group serve as negative samples. After this data augmentation, the data batch used for contrastive learning has size $2N_b$. Therefore, the intra-group contrastive loss of each data batch in the attribute-level feature extraction module is defined as:

$$\mathcal{L}_e = \frac{1}{N_{ap}}\sum_{i=1}^{N_{ap}} \mathcal{L}_e^i$$

$$\mathcal{L}_e^i = -\log\frac{\exp\!\big(\mathrm{sim}(v_{ki}, v_{ki}')/\tau_e\big)}{\sum_{j=1}^{N_b} 1_{[i\neq j]}\exp\!\big(\mathrm{sim}(v_{ki}, v_{kj}')/\tau_e\big)}$$

where $\mathcal{L}_e^i$ is the contrastive loss of the i-th aspect-level sample, $\tau_e$ denotes the intra-group temperature parameter, and $1_{[i\neq j]}\in\{0,1\}$ is an indicator function that equals 1 if and only if $i\neq j$. In this embodiment, $\tau_e$ is set to 0.5 and $N_b$ is set to 32.
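
As above, the equations are only available as images, so the sketch below assumes a SimCSE-style formulation: the two dropout views of a sample form the positive pair, and the remaining samples of the same expert group act as negatives, with $\tau_e$ = 0.5:

```python
import torch
import torch.nn.functional as F

def intra_group_contrastive_loss(v, v_prime, tau_e=0.5):
    """v, v_prime: (N_b, d_g) aspect-level features of the same expert group,
    obtained from two forward passes with different dropout masks.
    v[i] and v_prime[i] are positives; other samples in the group are negatives."""
    v = F.normalize(v, dim=-1)
    v_prime = F.normalize(v_prime, dim=-1)
    sim = v @ v_prime.t() / tau_e                 # (N_b, N_b) similarity matrix
    targets = torch.arange(v.size(0))             # positive of i sits on the diagonal
    return F.cross_entropy(sim, targets)

# Toy usage: two dropout views of the same 32 samples in one expert group
loss_e = intra_group_contrastive_loss(torch.randn(32, 283), torch.randn(32, 283))
```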

Specifically, in an embodiment of the present invention, the multi-dimensional semantic feature fusion module focuses on the effectiveness of the learned multi-aspect features. The multi-aspect features are concatenated, and the concatenated features are fused and dimension-reduced by a feed-forward neural network; finally, the multi-aspect fused feature is expressed as:

$$h_i = v_{1i}\oplus v_{2i}\oplus\cdots\oplus v_{Ki}$$

$$\hat{H}_i = \mathrm{FFN}(W_f h_i + b_f)$$

where $\oplus$ denotes the concatenation operation, $h_i$ is the fused feature of sample i obtained by concatenating its K aspect-level features, $\mathrm{FFN}(\cdot)$ is a feed-forward neural network, $\hat{H}_i$ is the feature after multi-dimensional semantic fusion, and $W_f$ and $b_f$ are learnable parameters.
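
A minimal sketch of the fusion step follows; the hidden size and activation of the feed-forward network are assumptions, since the patent only specifies concatenation followed by an FFN for fusion and dimension reduction:

```python
import torch
import torch.nn as nn

class MultiAspectFusion(nn.Module):
    """Concatenate K aspect-level features and fuse/reduce them with an FFN."""
    def __init__(self, d_g=283, K=4, d_m=768):
        super().__init__()
        self.ffn = nn.Sequential(
            nn.Linear(K * d_g, d_m),   # W_f h_i + b_f
            nn.ReLU(),
            nn.Linear(d_m, d_m),
        )

    def forward(self, aspect_feats):               # list of K tensors (N_b, d_g)
        h = torch.cat(aspect_feats, dim=-1)        # (N_b, K * d_g)
        return self.ffn(h)                         # fused feature H_hat: (N_b, d_m)

fusion = MultiAspectFusion()
H_hat = fusion([torch.randn(32, 283) for _ in range(4)])   # (32, 768)
```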

Further, the distribution consistency between the original global semantic features and the fused semantic features is maintained by optimizing the KL divergence between them:

$$p_i = \mathrm{softmax}(z_i),\qquad q_i = \mathrm{softmax}(\hat{H}_i)$$

$$\mathcal{L}_g = \frac{1}{N_b}\sum_{i=1}^{N_b}\mathrm{KL}\big(p_i \,\|\, q_i\big)$$
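
A sketch of this global-semantics preservation term is shown below; turning both features into distributions with a softmax before the KL divergence is an assumption, as the exact normalization is not recoverable from the original images:

```python
import torch
import torch.nn.functional as F

def global_semantic_kl(z, H_hat):
    """KL divergence between the distribution induced by the original global
    feature z and that of the fused feature H_hat (both of shape (N_b, d_m))."""
    p = F.log_softmax(H_hat, dim=-1)     # fused features as log-probabilities
    q = F.softmax(z, dim=-1)             # original global features as target
    return F.kl_div(p, q, reduction="batchmean")

loss_g = global_semantic_kl(torch.randn(32, 768), torch.randn(32, 768))
```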

Specifically, in an embodiment of the present invention, the obtained multi-aspect fused feature is input to the stance detection module, which outputs the predicted probability of the stance the user comment text holds toward the specific topic text.

A fully connected layer with softmax normalization is used to predict the probability distribution of the stance prediction:

$$\hat{y}_i = \mathrm{softmax}(W_p \hat{H}_i + b_p)$$

where $\hat{y}_i\in\mathbb{R}^{d_p}$ is the stance probability distribution predicted for the input sample $x_i$, $d_p$ is the dimension of the stance label, and $W_p$ and $b_p$ are learnable parameters. In this embodiment, $d_p$ is 3, indicating that stances are divided into three types: support, neutral, and oppose.

Finally, the classifier is trained with the cross-entropy loss between the predicted label $\hat{y}_i$ of a sample and its true label $y_i$:

$$\mathcal{L}_{cls} = -\frac{1}{N_b}\sum_{i=1}^{N_b}\sum_{c=1}^{d_p} y_{i,c}\log \hat{y}_{i,c}$$
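
The following sketch combines the softmax classification head and its cross-entropy training loss, with $d_p$ = 3 stance classes as stated in this embodiment:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StanceHead(nn.Module):
    """Fully connected layer with softmax normalization over d_p = 3 stances."""
    def __init__(self, d_m=768, d_p=3):
        super().__init__()
        self.fc = nn.Linear(d_m, d_p)      # W_p H_hat + b_p

    def forward(self, H_hat):              # (N_b, d_m) fused features
        return F.softmax(self.fc(H_hat), dim=-1)   # predicted stance probabilities

head = StanceHead()
H_hat = torch.randn(32, 768)
y_true = torch.randint(0, 3, (32,))        # 0 = support, 1 = neutral, 2 = oppose
probs = head(H_hat)
loss_cls = F.nll_loss(torch.log(probs + 1e-12), y_true)   # cross-entropy on softmax output
```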

Specifically, in an embodiment of the present invention, step 4: the stance detection model is trained on the training set until it converges, obtaining the trained stance detection prediction model.

Specifically, in an embodiment of the present invention, the predicted probability output by the stance prediction model is compared with the true label, and the log-likelihood loss function is optimized by gradient descent; the model is trained by jointly optimizing the supervised stance classification loss $\mathcal{L}_{cls}$, the self-supervised inter-group contrastive learning loss $\mathcal{L}_a$, the self-supervised intra-group contrastive learning loss $\mathcal{L}_e$, and the global semantics preservation loss $\mathcal{L}_g$. The overall objective $\mathcal{L}$ can be formulated as the sum of the four losses:

$$\mathcal{L} = \mathcal{L}_{cls} + \alpha\mathcal{L}_a + \beta\mathcal{L}_e + \gamma\mathcal{L}_g + \lambda\|\Theta\|_2^2$$

where α, β, and γ are tunable hyperparameters, Θ denotes all trainable parameters in the model, and λ denotes the L2 regularization coefficient. The model parameters for stance detection are trained with the backpropagation algorithm, and the stance detection model is iterated on the training set until it converges, yielding the trained stance detection model. In this embodiment, α is set to 0.2, β is set to 0.3, γ is set to 0.1, and λ is set to 1e-5.
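
A sketch of the overall objective with the stated hyperparameters (α = 0.2, β = 0.3, γ = 0.1, λ = 1e-5) is given below; the stand-in model and scalar losses are illustrative only:

```python
import torch
import torch.nn as nn

def total_loss(loss_cls, loss_a, loss_e, loss_g, model,
               alpha=0.2, beta=0.3, gamma=0.1, lam=1e-5):
    """Overall objective: stance classification loss plus the inter-group and
    intra-group contrastive losses, the global-semantics preservation loss,
    and an L2 penalty on all trainable parameters Theta."""
    l2 = sum(p.pow(2).sum() for p in model.parameters())
    return loss_cls + alpha * loss_a + beta * loss_e + gamma * loss_g + lam * l2

# Toy usage with a stand-in model and scalar losses
model = nn.Linear(768, 3)
loss = total_loss(torch.tensor(1.0), torch.tensor(0.5),
                  torch.tensor(0.4), torch.tensor(0.2), model)
loss.backward()
```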

Specifically, in an embodiment of the present invention, step 5: the topic text and user comment text to be predicted are assembled into samples and input into the trained stance detection model, which outputs the probability of the stance the user comment text holds toward the relevant topic. In this embodiment, the class with the largest probability is finally selected as the stance attitude of the user comment text toward the relevant topic.

For the sample stance detection task, some existing methods attempt to generalize the model to unknown targets through attention mechanisms or by introducing external knowledge. However, such direct transfer from known topics to unknown topics often has limited predictive effect, because topic-specific features may remain. Adversarial learning methods use a discriminator to guide the model toward learning target-independent features, but when the stance-target data distribution is imbalanced this leads to degraded prediction performance. Moreover, these methods mostly learn "coarse-grained" high-level features in the global feature space of the text and tend to ignore many of the "detailed processes" that influence stance expression; this introduces considerable noise and harms transfer to the unknown target domain. Therefore, modeling the multi-aspect basic underlying feature distributions of text semantic expression facilitates fine-grained alignment from the known target domain to the unknown target domain and suits the sample stance detection task. Based on this, a stance detection model that uses hierarchical contrastive learning for multi-dimensional semantic feature representation is proposed; the model learns fine-grained, multi-aspect basic underlying features of text semantic expression to improve the accuracy of sample stance classification on unknown targets. Specifically, first, multiple experts are introduced as coarse-grained feature extractors, and inter-group supervised contrastive learning is used to capture multi-dimensional aspect-level features in text semantic expression; second, an intra-group self-supervised contrastive learning algorithm is designed to learn multiple distinguishable attribute-level features within each group; finally, the original global semantic features are enhanced with the fine-grained multi-aspect features to achieve cross-domain stance classification capability.

Such an architecture has two advantages:

(1) More efficient knowledge transfer capability. The present invention designs a new hierarchical contrastive learning scheme that mines the features of textual stance expression through inter-group and intra-group contrastive learning and then contrasts them with the features of the unknown target domain, serving as a bridge for cross-target knowledge transfer. In experiments on public data sets, it achieves a 4.3% improvement in macro-F1 over the latest cross-target stance detection method PT-HCL. Table 1 shows the experimental results of the sample stance detection task comparing the present invention with other methods; MSFR denotes the present invention. In each row of Table 1, the bold value is the best result and the underlined value is the second-best result.

Table 1 (comparative results; reproduced as an image in the original publication).

(2) More accurate modeling of user stance features. Feature enhancement based on multi-aspect features takes the "detailed process" of textual stance expression into account, improving the quality of the feature representation so that the model can better handle prediction tasks such as zero-shot and cross-target settings. Ablation experiments on real data verify a +4.5% performance difference between using no feature enhancement at all and the fusion of the two kinds of features in this embodiment, confirming that the multi-aspect feature enhancement of this embodiment yields a stronger feature representation capability. Table 2 shows the ablation analysis results comparing the present invention with other methods; MSFR denotes the present invention, and w/o denotes that a given loss function is not used.

Table 2 (ablation results; reproduced as an image in the original publication).

Compared with the prior art, this embodiment has the following advantages:

1. The present invention models the multi-aspect basic underlying features of text semantic expression and learns transferable knowledge features shared across different topics. It extends the traditional deep learning approach to stance detection by analyzing the underlying composition of text semantics and capturing features common to the underlying level of social network text.

2. For textual stance expression, the present invention designs a global-semantics preservation mechanism that guarantees the effectiveness of the learned multi-aspect underlying features and uses the multi-aspect features to enhance the global semantic features, thereby predicting the stance of the user comment text.

This embodiment exploits the underlying features of text semantic expression and offers more reliable prediction performance on the sample stance detection problem; therefore, for different Internet topics, relatively general base parameters of the deep learning model can be obtained through training, better solving problems within the semantic domain, such as social issues and online public opinion issues.

The method provided in this embodiment can be applied to online public opinion event analysis and data mining, in particular to the analysis and supervision of user stances on semantically concentrated topics; it can also be used for enterprise network information monitoring, predicting product feedback of interest to an enterprise in order to improve product lines and related services.

In an embodiment of the present invention, a computer device is also provided, comprising a memory and a processor; the memory stores a computer program, and when the processor executes the computer program, the sample stance detection method based on hierarchical contrastive learning described above is implemented.

The computer device may be a terminal, and its internal structure may be as shown in Fig. 3. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected via a bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running them. The network interface of the computer device is used to communicate with external terminals through a network connection. When the computer program is executed by the processor, a zero-shot stance detection method based on hierarchical contrastive learning is implemented. The display screen of the computer device may be a liquid crystal display or an electronic ink display; the input device of the computer device may be a touch layer covering the display screen, keys, a trackball, or a touchpad provided on the housing of the computer device, or an external keyboard, touchpad, or mouse.

The memory may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and the like. The memory is used to store the program, and the processor executes the program after receiving an execution instruction.

The processor may be an integrated circuit chip with signal processing capability. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The methods, steps, and logic block diagrams disclosed in the embodiments of the present application may be implemented or executed by such a processor.

Those skilled in the art can understand that the structure shown in Fig. 3 is merely a block diagram of the partial structure related to the solution of the present application and does not limit the computer device to which the solution of the present application is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.

In an embodiment of the present invention, there is also provided a computer-readable storage medium on which a program is stored, characterized in that when the program is executed by a processor, the sample stance detection method based on hierarchical contrastive learning described above is implemented.

Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a computer device, or a computer program product. Accordingly, the embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.

The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of methods, computer devices, or computer program products according to the embodiments of the present invention. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce an apparatus for implementing the functions specified in the flowcharts and/or block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing terminal device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instruction means implementing the functions specified in the flowcharts.

The application of the zero-shot stance detection method based on hierarchical contrastive learning, the computer device, and the computer-readable storage medium provided by the present invention has been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help understand the method of the present invention and its core idea. Meanwhile, for those of ordinary skill in the art, changes may be made to the specific implementations and the scope of application in accordance with the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (13)

Translated from Chinese
1. A sample stance detection method based on hierarchical contrastive learning, characterized by comprising the following steps:
1) data collection: collecting social network text data from social network media, the social network text data comprising topic texts under discussion and users' comment texts on those topics;
2) constructing a data set on the basis of the collected data, wherein a source-topic target data set with known stance labels serves as a training set, and a destination-topic target data set without stance labels serves as a test set;
3) building a sample stance detection model that outputs the predicted probability of the stance held by a comment text towards a topic text;
4) training the sample stance detection model on the training set until the sample stance detection model converges, thereby obtaining a stance prediction model;
5) inputting the topic text and comment text to be predicted into the trained stance prediction model, and outputting the probability of the stance held by the comment text towards the topic text.
2. The sample stance detection method based on hierarchical contrastive learning according to claim 1, characterized in that: in step 2, each topic text, comment text and stance label constitute one training sample, and each topic text and comment text whose topic does not overlap with the topics of the training samples constitute one test sample; all texts are organized in this way to construct the training set and the test set.
3. The sample stance detection method based on hierarchical contrastive learning according to claim 1, characterized in that: the source-topic target data set with known stance labels is [Figure QLYQS_3], i.e., the training set; the destination-topic target data set without stance labels is [Figure QLYQS_6], i.e., the test set; where [Figure QLYQS_9] are the stance labels of the labeled samples in the source-topic target [Figure QLYQS_2], and Ns and Nd are the numbers of samples of the source-topic target data and of the destination-topic target data, respectively; every sentence [Figure QLYQS_10] about the source-topic target [Figure QLYQS_7] in the source-topic target data set [Figure QLYQS_5] is used to train the sample stance detection model, so that the sample stance detection model generalizes to the destination-topic target data set [Figure QLYQS_1] and predicts the stance of a sentence [Figure QLYQS_8] about the destination-topic target [Figure QLYQS_4], thereby outputting the predicted probability of the stance held by the comment text towards the topic text.
4. The sample stance detection method based on hierarchical contrastive learning according to claim 1, characterized in that: in step 3, the sample stance detection model comprises: a text global semantic feature extraction module, an aspect-level feature extraction module, an attribute-level feature extraction module, a multi-dimensional semantic feature fusion module, and a stance detection module;
the text global semantic feature extraction module takes the concatenation of the topic text and the comment text as input, and outputs the global semantic feature of the comment text with respect to the specific topic;
the aspect-level feature extraction module takes the obtained global semantic feature as input, and outputs a plurality of aspect-level features that constitute the semantics of the text;
the attribute-level feature extraction module takes the obtained aspect-level features as input, and outputs the attribute-level features corresponding to each aspect-level feature;
the multi-dimensional semantic feature fusion module takes the concatenation of the obtained aspect-level features as input, and outputs a fused feature of the global semantic feature distribution;
the stance detection module takes the obtained fused feature as input, and outputs the predicted probability of the stance held by the user comment text towards the specific topic text.
5. The sample stance detection method based on hierarchical contrastive learning according to claim 4, characterized in that: in the text global semantic feature extraction module, each sample is constructed in the format "[CLS]t[SEP]r[SEP]" and fed to the encoder module to obtain the dm-dimensional hidden-layer vector of the [CLS] token [Figure QLYQS_11] as the feature representation of sentence r with respect to the specific target t, together with the feature matrix [Figure QLYQS_12] of all words of sentence r in the last hidden layer:
z, Z = fθ(u) = BERT([CLS]t[SEP]x)
In a training batch, the feature representations of all samples can be defined as [Figure QLYQS_13], where Nb is the size of the training batch.
6. The sample stance detection method based on hierarchical contrastive learning according to claim 4, characterized in that: the aspect-level feature extraction module extracts semantic expression features of different aspect levels from the global semantic feature Z, each aspect serving as one feature group; the aspect-level feature extraction module is composed of K feature experts, each feature expert being defined as a one-dimensional convolution with dm-dimensional input channels and d-dimensional output channels and a convolution kernel of size w; the k-th expert then outputs a vector υk = fk(Z) = Convk(Z, w).
7. The sample stance detection method based on hierarchical contrastive learning according to claim 6, characterized in that: the aspect-level feature extraction module adopts supervised inter-group contrastive learning, in which any two samples of the same group are regarded as a positive pair and samples from different groups are mutually negative examples; the inter-group contrastive loss of each data batch in the aspect-level feature extraction module is defined as:
[Figure QLYQS_14]
[Figure QLYQS_15]
a projection head is used to map the aspect-level feature υi to [Figure QLYQS_16] for computing the contrastive loss; [Figure QLYQS_17] is the inter-group loss corresponding to the i-th aspect-level sample; 1[i≠j] ∈ {0,1} is an indicator function that equals 1 if and only if i≠j; τa denotes the temperature parameter of the inter-group contrastive loss, which controls the strength of the penalty imposed on hard samples during contrastive learning.
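The exact inter-group loss of claim 7 is only available as figure images; the sketch below therefore uses the standard supervised contrastive (SupCon-style) formulation that matches the claim's textual description, with group membership playing the role of the label, a projection head, the 1[i≠j] mask, and the temperature τa. Function and variable names are assumptions.

```python
import torch
import torch.nn.functional as F

def inter_group_contrastive_loss(features, group_ids, proj, tau_a=0.1):
    """Supervised inter-group contrastive loss in the spirit of claim 7.

    features : (N, d) aspect-level features v_i within a batch
    group_ids: (N,) index of the feature group (aspect) each row belongs to
    proj     : projection head mapping v_i into the space used for the loss
    """
    h = F.normalize(proj(features), dim=-1)             # projected, L2-normalised
    sim = h @ h.t() / tau_a                              # pairwise similarities
    logits_mask = ~torch.eye(len(h), dtype=torch.bool)   # the 1[i != j] indicator
    pos_mask = (group_ids.unsqueeze(0) == group_ids.unsqueeze(1)) & logits_mask

    # log-softmax over all other samples, averaged over each anchor's positives
    log_prob = sim - torch.logsumexp(sim.masked_fill(~logits_mask, float("-inf")),
                                     dim=1, keepdim=True)
    pos_count = pos_mask.sum(dim=1).clamp(min=1)
    loss_i = -(log_prob * pos_mask).sum(dim=1) / pos_count
    return loss_i.mean()
```

Here proj can be a small nn.Linear or a two-layer MLP; anchors whose group has no other member in the batch simply contribute zero to the loss.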
8. The sample stance detection method based on hierarchical contrastive learning according to claim 4, characterized in that: in the attribute-level feature extraction module, attribute-level feature learning is formulated as a self-supervised representation learning problem;
the global semantic feature of a sample ui is mapped into the corresponding aspect-level feature space through the expert feature mapping function Convk(Z, w), yielding a positive pair {υki, υki′}, while the other samples in the same k-th group serve as negative samples;
the intra-group contrastive loss of each data batch in the attribute-level feature extraction module is defined as:
[Figure QLYQS_18]
[Figure QLYQS_19]
where [Figure QLYQS_20] is the contrastive loss of the i-th aspect-level sample, τe denotes the intra-group temperature parameter, and 1[i≠j] ∈ {0,1} is an indicator function that equals 1 if and only if i≠j.
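Claim 8 only states that a positive pair {υki, υki′} exists for the same sample within group k and that the other samples of the group act as negatives; how the second view is produced is not specified in the text. The sketch below therefore assumes two stochastic forward passes (e.g. with dropout) through the same expert and uses a plain InfoNCE form with temperature τe.

```python
import torch
import torch.nn.functional as F

def intra_group_contrastive_loss(v, v_prime, tau_e=0.1):
    """Self-supervised intra-group loss in the spirit of claim 8 (InfoNCE form).

    v, v_prime : (N, d) two views {v_ki, v_ki'} of the same samples inside group k;
                 the way the second view is obtained is an assumption of this sketch.
    The other samples of the same group serve as negatives.
    """
    a = F.normalize(v, dim=-1)
    b = F.normalize(v_prime, dim=-1)
    logits = a @ b.t() / tau_e                       # (N, N): diagonal entries are positives
    targets = torch.arange(len(a), device=a.device)  # each row's positive is its own index
    return F.cross_entropy(logits, targets)
```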
9. The sample stance detection method based on hierarchical contrastive learning according to claim 4, characterized in that: in the multi-dimensional semantic feature fusion module, the fused feature is expressed as:
[Figure QLYQS_21]
[Figure QLYQS_22]
where [Figure QLYQS_23] denotes the concatenation operation, [Figure QLYQS_24] is the fused feature of sample i obtained by concatenating the features of the K aspects, FFN(·) is a feed-forward neural network, [Figure QLYQS_25] is the feature after multi-dimensional semantic fusion, and [Figure QLYQS_26] and [Figure QLYQS_27] are learnable parameters;
further, the distribution consistency between the original global semantic feature and the fused semantic feature is maintained by optimizing the KL divergence between the two:
[Figure QLYQS_28]
[Figure QLYQS_29]
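A sketch of the multi-dimensional semantic fusion of claim 9 follows: the K aspect-level features are concatenated, passed through a feed-forward network FFN(·), and a KL-divergence term keeps the fused feature distribution consistent with the original global semantic feature. The hidden sizes, the single-hidden-layer FFN, and the softmax used to turn features into distributions are assumptions; the exact formulas are only given as figure images.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticFusion(nn.Module):
    """Claim 9 sketch: concatenate K aspect-level features and fuse them with an FFN."""
    def __init__(self, d=128, K=4, d_m=768):
        super().__init__()
        self.ffn = nn.Sequential(nn.Linear(K * d, d_m), nn.ReLU(), nn.Linear(d_m, d_m))

    def forward(self, aspect_feats):
        cat = torch.cat(aspect_feats, dim=-1)   # concatenation of the K aspect features
        return self.ffn(cat)                     # fused multi-dimensional semantic feature

def global_consistency_loss(z_global, z_fused):
    """KL divergence between the distributions induced by the original global
    feature and the fused feature (global semantic preservation term)."""
    p = F.log_softmax(z_fused, dim=-1)
    q = F.softmax(z_global, dim=-1)
    return F.kl_div(p, q, reduction="batchmean")
```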
10. The sample stance detection method based on hierarchical contrastive learning according to claim 4, characterized in that: in the stance detection module, a fully connected layer with softmax normalization is used to predict the probability distribution of the stance prediction:
[Figure QLYQS_30]
where [Figure QLYQS_31] is the stance probability distribution predicted for the input sample xi, dp is the dimension of the stance label, and [Figure QLYQS_32] and [Figure QLYQS_33] are learnable parameters.
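Claim 10 describes a single fully connected layer with softmax normalization over dp stance classes; a minimal sketch follows, where dp = 3 (e.g. favor / against / neutral) is an assumption.

```python
import torch.nn as nn

class StanceHead(nn.Module):
    """Claim 10 sketch: fully connected layer with softmax normalisation that maps
    the fused feature to a d_p-dimensional stance probability distribution."""
    def __init__(self, d_m=768, d_p=3):
        super().__init__()
        self.fc = nn.Linear(d_m, d_p)          # W and b are the learnable parameters
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, fused):
        return self.softmax(self.fc(fused))    # predicted stance probabilities
```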
11. The sample stance detection method based on hierarchical contrastive learning according to claim 1, characterized in that: in step 4, the model is trained with a supervised stance classification loss [Figure QLYQS_34], a self-supervised inter-group contrastive learning loss [Figure QLYQS_35], a self-supervised intra-group contrastive learning loss [Figure QLYQS_36], and a global semantic preservation loss [Figure QLYQS_37]; the overall objective [Figure QLYQS_38] can be formulated as the sum of the four losses:
[Figure QLYQS_39]
where α, β and γ are adjustable hyperparameters, Θ denotes all trainable parameters in the model, and λ denotes the L2 regularization coefficient.
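The overall objective of claim 11 is the sum of the four losses plus L2 regularization of all trainable parameters Θ; since the closed form is only given as a figure, the sketch below assumes that α, β, and γ weight the three auxiliary terms, which matches the claim's list of adjustable hyperparameters.

```python
def total_loss(loss_stance, loss_inter, loss_intra, loss_kl, params,
               alpha=1.0, beta=1.0, gamma=1.0, lam=1e-5):
    """Claim 11 sketch: weighted sum of the four losses plus L2 regularisation.
    params is an iterable of the model's trainable tensors (Theta)."""
    l2 = sum((p ** 2).sum() for p in params)
    return loss_stance + alpha * loss_inter + beta * loss_intra + gamma * loss_kl + lam * l2
```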
12. A computer device, comprising a memory and a processor, wherein the memory stores a computer program, and the processor, when executing the computer program, implements the sample stance detection method based on hierarchical contrastive learning according to any one of claims 1-11.
13. A computer-readable storage medium on which a program is stored, characterized in that: when the program is executed by a processor, the sample stance detection method based on hierarchical contrastive learning according to any one of claims 1-11 is implemented.
