CN114722833A - Semantic classification method and device - Google Patents

Semantic classification method and device

Info

Publication number
CN114722833A
Authority
CN
China
Prior art keywords
classified
text data
semantic
unit
semantic classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210412719.0A
Other languages
Chinese (zh)
Other versions
CN114722833B (en)
Inventor
冯铃
王鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202210412719.0A
Publication of CN114722833A
Application granted
Publication of CN114722833B
Status: Active
Anticipated expiration


Abstract

Translated from Chinese

The present invention provides a semantic classification method and device. The method includes: acquiring text data to be classified; inputting the text data into a semantic classification model, performing semantic understanding on it based on its word embeddings and dependency tree, classifying based on the semantic understanding result, and obtaining the classification result output by the model. The semantic classification model is obtained by training on text samples and their corresponding category labels; each category label is predetermined from its text sample and corresponds to it one-to-one. By analyzing the text content through word embeddings and capturing the deep semantic information of the text to be classified through the dependency tree, the method improves the accuracy of semantic understanding and hence the accuracy of semantic classification.

Figure 202210412719

Description

Translated from Chinese
A semantic classification method and device

Technical Field

The present invention relates to the field of computer technology, and in particular to a semantic classification method and device.

Background

With the explosive growth of information, manual data labeling has become time-consuming and low in quality, and is affected by the annotator's subjective bias. It is therefore of practical significance to label data automatically by machine: handing repetitive and tedious text-labeling tasks to computers effectively overcomes these problems, and the resulting labels are consistent and of high quality. At present, for rich language content, existing methods classify text data by building a LIWC dictionary of category-related words, i.e., dictionary-based text classification.

However, owing to factors such as the Internet, the meanings, usage, and sentence patterns of more and more words change rapidly, and dictionary-based text classification cannot adapt to such a fast-changing language environment. Unless the dictionary is continuously improved, dictionary-based classification methods struggle to deliver satisfactory performance, resulting in low accuracy for language-content classification.

Summary of the Invention

The present invention provides a semantic classification method to overcome the low accuracy of language-content classification in the prior art and to improve the accuracy of language-content classification.

In a first aspect, the present invention provides a semantic classification method, comprising:

acquiring text data to be classified;

inputting the text data to be classified into a semantic classification model, performing semantic understanding on the text data based on its word embeddings and dependency tree, classifying based on the semantic understanding result, and obtaining the classification result output by the semantic classification model;

wherein the semantic classification model is obtained by training on text samples and the category labels corresponding to the text samples, each category label being predetermined from its text sample and corresponding to it one-to-one.

Optionally, the semantic classification model includes an encoding module and a relation module;

the inputting of the text data to be classified into the semantic classification model, the semantic understanding based on its word embeddings and dependency tree, and the classification based on the semantic understanding result include:

inputting the text data to be classified into the encoding module to obtain the to-be-classified vector output by the encoding module;

inputting the to-be-classified vector into the relation module to obtain the classification result output by the relation module.

Optionally, the encoding module includes a BERT unit, a dependency tree unit, a construction unit, a dependency graph unit, and an attention unit;

the inputting of the text data to be classified into the encoding module to obtain the to-be-classified vector output by the encoding module includes:

inputting the text data to be classified into the BERT unit to obtain the word embedding vector set output by the BERT unit;

inputting the text data to be classified into the dependency tree unit to obtain the dependency tree output by the dependency tree unit;

inputting the dependency tree and the word embedding vector set into the construction unit to obtain the dependency graph output by the construction unit;

inputting the dependency graph into the dependency graph unit to obtain the first to-be-classified matrix output by the dependency graph unit;

inputting the first to-be-classified matrix into the attention unit to obtain the first to-be-classified vector output by the attention unit.

Optionally, the inputting of the dependency tree and the word embedding vector set into the construction unit to obtain the dependency graph output by the construction unit includes:

determining the nodes of the dependency graph based on the word embedding vector set;

determining the adjacency relation between the nodes based on the dependency tree and an adjacency matrix;

the adjacency matrix being:

$$A_{i,j} = \begin{cases} 1, & \text{if } T(W_i, W_j) \\ 0, & \text{otherwise} \end{cases}$$

where A_{i,j} denotes the connection relation between the i-th node and the j-th node, i and j are positive integers greater than or equal to 1, and T(W_i, W_j) denotes that a relation exists between the i-th node W_i and the j-th node W_j.

Optionally, the dependency graph unit includes at least one hidden layer; the output formula of each hidden layer is as follows:

$$H^{I} = \mathrm{ReLU}\left(\hat{A} H^{I-1} W^{I} + b^{I}\right)$$

$$\hat{A} = D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$$

where H^I denotes the hidden representation of the I-th hidden layer, ReLU denotes the ReLU activation function, Â denotes the normalized adjacency matrix, A denotes the adjacency matrix, D denotes the degree matrix of A, and W^I and b^I denote training parameters.

Optionally, the encoding module further includes a picture unit and a concatenation unit;

the method further comprising:

acquiring picture data to be classified corresponding to the text data to be classified;

inputting the picture data to be classified into the picture unit to obtain the second to-be-classified matrix output by the picture unit;

concatenating the first to-be-classified matrix and the second to-be-classified matrix to obtain a third to-be-classified matrix;

inputting the third to-be-classified matrix into the attention unit to obtain the second to-be-classified vector output by the attention unit.
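A minimal sketch of the concatenation step above (shapes are hypothetical; the claim does not fix the concatenation axis, so row-wise stacking of the two feature matrices is assumed here):

```python
def concat_matrices(first, second):
    """Stack the first and second to-be-classified matrices row-wise,
    producing the third to-be-classified matrix fed to the attention unit.
    Assumes both matrices have the same number of columns."""
    assert len(first[0]) == len(second[0]), "column dimensions must match"
    return first + second  # list-of-rows concatenation

text_matrix = [[0.1, 0.2], [0.3, 0.4]]   # hypothetical first (text) matrix
image_matrix = [[0.5, 0.6]]              # hypothetical second (image) matrix
third = concat_matrices(text_matrix, image_matrix)
print(len(third), len(third[0]))  # 3 2
```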

In a second aspect, the present invention further provides a semantic classification device, comprising:

an acquisition module for acquiring text data to be classified;

a classification module for inputting the text data to be classified into a semantic classification model, performing semantic understanding on the text data based on its word embeddings and dependency tree, classifying based on the semantic understanding result, and obtaining the classification result output by the semantic classification model;

wherein the semantic classification model is obtained by training on text samples and the category labels corresponding to the text samples, each category label being predetermined from its text sample and corresponding to it one-to-one.

In a third aspect, the present invention further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the semantic classification method of the first aspect when executing the program.

In a fourth aspect, the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, the computer program implementing the semantic classification method of the first aspect when executed by a processor.

In a fifth aspect, the present invention further provides a computer program product including a computer program, the computer program implementing the semantic classification method of the first aspect when executed by a processor.

The semantic classification method and device provided by the present invention perform semantic understanding on the text data to be classified based on its word embeddings and dependency tree and classify based on the semantic understanding result, improving the accuracy of semantic understanding and hence the accuracy of semantic classification.

Brief Description of the Drawings

To explain the technical solutions of the present invention or the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below illustrate some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from them without creative effort.

Fig. 1 is a first schematic flowchart of the semantic classification method provided by an embodiment of the present invention;

Fig. 2 is a second schematic flowchart of the semantic classification method provided by an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of the semantic classification device provided by an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of the electronic device provided by an embodiment of the present invention.

Detailed Description

To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

The technical terms involved in the present invention are introduced below:

Meta-learning: meta-learning aims for a model to acquire the ability to "learn to learn", so that it can quickly learn new tasks on the basis of existing "knowledge"; its intent is to design, from a small number of training instances, models that can quickly learn new skills or adapt to new environments. A meta-learning system is trained on a large number of tasks and evaluated on its ability to learn new tasks.

Artificial Intelligence (AI): AI refers to the theories, methods, technologies, and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results. Basic AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, computer vision, speech processing, natural language processing, and machine learning/deep learning.

Attention: attention is a very common but easily overlooked phenomenon. For example, when a bird flies across the sky, human attention tends to follow the bird, and the sky naturally becomes background information in the human visual system. The basic idea of the attention mechanism in computer vision is to let the system learn to focus on what is of interest, ignoring background information and attending to the key information.

Loss function: also called the cost function or objective function, the loss function measures the degree to which the model's prediction f(x) disagrees with the ground truth Y, and is usually expressed as a non-negative real-valued function L(Y, f(x)). In general, the smaller the value of the loss function (i.e., the loss value), the better the model fits and the stronger its predictive ability on new data. The loss function is the "baton" of model training in deep learning: it guides the learning of model parameters by back-propagating the error between the predicted and true sample labels. When the loss value gradually decreases (converges), model training can be considered complete.
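The definition above can be made concrete with a small example. The sketch below is an illustration only (cross-entropy is one common choice of L(Y, f(x)); the document does not specify which loss the model uses): it shows that a prediction that fits the true label better produces a smaller loss value.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Toy loss function L(Y, f(x)): cross-entropy between a one-hot
    ground-truth vector and predicted class probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

# The prediction that puts more probability on the true class (index 1)
# yields the smaller loss value.
good = cross_entropy([0, 1, 0], [0.1, 0.8, 0.1])
bad = cross_entropy([0, 1, 0], [0.6, 0.2, 0.2])
print(good < bad)  # True
```

During training, this value would be back-propagated to update the model parameters; training stops once the loss stops decreasing (converges).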

Semantic classification can be applied to social-media-based stress category detection. In today's society, people suffer from all kinds of stress. Excessive stress that is not relieved in time can cause various physiological and psychological problems and endanger human health. Understanding the stressors that cause stress is therefore important for helping people take effective measures to cope with it. Benefiting from the informative language content on social media, existing methods detect stress under each category by building a stressor dictionary of LIWC and stressor-related categories of words, i.e., stress category detection.

However, as Internet language becomes popular on social platforms, more and more word usage and sentence patterns have changed, and dictionary-based stress category detection cannot keep up with the current social media environment. Owing to the limits on how fast and how much a dictionary can be updated, dictionary-based methods struggle to provide satisfactory performance, resulting in low classification accuracy for social-media-based stress categories.

The semantic classification method proposed in the embodiments of the present invention can be applied to social-media-based stress category classification to improve classification accuracy.

The semantic classification method provided by the embodiments of the present invention is described below with reference to Fig. 1 and Fig. 2.

Fig. 1 is a first schematic flowchart of the semantic classification method provided by an embodiment of the present invention. As shown in Fig. 1, an embodiment of the present invention provides a semantic classification method, including:

Step 110: acquire text data to be classified.

Exemplarily, in a social-media-based stress category classification scenario, the text data to be classified may be social media content, such as blog posts or shared status updates published by users, e.g., the post "School starts tomorrow and my homework is not finished yet." published by user "123" on a social platform. It should be understood that the text data to be classified may be crawled from the Internet; the embodiments of the present invention do not limit its source.

Step 120: input the text data to be classified into a semantic classification model, perform semantic understanding on the text data based on its word embeddings and dependency tree, classify based on the semantic understanding result, and obtain the classification result output by the semantic classification model;

wherein the semantic classification model is obtained by training on text samples and the category labels corresponding to the text samples, each category label being predetermined from its text sample and corresponding to it one-to-one.

Specifically, a dependency tree (dependency relation tree) uses semantic edges to represent the grammatical information of a sentence and describes the meaning of a word through the semantic frame the word bears.

The semantic classification method provided by the embodiments of the present invention converts text data, which cannot be computed on directly, into vector form through word embeddings, making the text content amenable to computation and analysis; combined with the dependency tree's ability to cut across the surface syntactic structure of a sentence, it captures the deep semantic information of the text to be classified, improving the accuracy of semantic understanding and hence the accuracy of semantic classification.

Possible implementations of the above steps in specific embodiments are further described below.

Optionally, the semantic classification model includes an encoding module and a relation module;

the inputting of the text data to be classified into the semantic classification model, the semantic understanding based on its word embeddings and dependency tree, and the classification based on the semantic understanding result include:

Step 121: input the text data to be classified into the encoding module, perform semantic understanding on the text data based on its word embeddings and dependency tree, and obtain the to-be-classified vector output by the encoding module;

Step 123: input the to-be-classified vector into the relation module to obtain the classification result output by the relation module.

Specifically, the encoding module is used to obtain a category-related representation of the text data to be classified, and the relation module is used to determine which category the text data belongs to. The category vector representations in the relation module may be obtained through training, or from a knowledge base, etc.; the embodiments of the present invention do not limit their source. The distance between the to-be-classified vector and each category vector representation in the relation module is computed and used as the matching score. The distance may be a cosine distance, a Euclidean distance, etc.; the specific computation is the same as in the prior art and is not repeated here.
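As a minimal sketch of this matching step (the category labels and vectors below are hypothetical, and cosine similarity stands in for the distance-based matching score; the document leaves the exact metric open):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two vectors; a higher value means a closer match."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def classify(vector, category_vectors):
    """Score the to-be-classified vector against every category vector
    representation and return the best-matching category label."""
    scores = {label: cosine_similarity(vector, rep)
              for label, rep in category_vectors.items()}
    return max(scores, key=scores.get)

# Hypothetical 3-dimensional category representations, for illustration only.
category_vectors = {"work": [0.9, 0.1, 0.0], "school": [0.1, 0.9, 0.1]}
print(classify([0.2, 0.8, 0.0], category_vectors))  # school
```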

Optionally, the encoding module includes a BERT unit, a dependency tree unit, a construction unit, a dependency graph unit, and an attention unit;

the inputting of the text data to be classified into the encoding module to obtain the to-be-classified vector output by the encoding module includes:

Step 1211: input the text data to be classified into the BERT unit and obtain the word embedding vector set output by the BERT unit.

It should be understood that each word in the text data corresponds to one word embedding vector.

Specifically, the BERT unit is a pre-trained BERT model. BERT models provide powerful context-sensitive representations and can be used for various target tasks. Therefore, the embodiments of the present invention use a pre-trained BERT to obtain the word embedding vector set corresponding to the text data to be classified. The word embedding vector set reflects the semantic representation of the text data, and the semantic representations of the words support effective understanding of the semantics of the whole sentence.

Step 1212: input the text data to be classified into the dependency tree unit and obtain the dependency tree output by the dependency tree unit.

Specifically, the dependency tree unit may be built on a dependency parsing tool such as spaCy; the embodiments of the present invention do not limit the tool used to generate the dependency tree.

Step 1213: input the dependency tree and the word embedding vector set into the construction unit and obtain the dependency graph output by the construction unit.

Specifically, the dependency graph is an undirected graph that can represent the semantic meaning of the text to be classified. It consists of nodes and edges: the nodes are word embedding vectors, and the edges are the relations in the dependency tree. Edge weights may take values based on the dependencies between nodes. Exemplarily, an edge value of zero between two nodes indicates that no dependency exists between them; a non-zero positive edge value indicates that a dependency exists, and the larger the edge value, the stronger the dependency between the two nodes.
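A minimal sketch of such a weighted undirected graph, assuming the nodes are indexed in word order and the weighted edges come from an upstream dependency parse (the edges below are hypothetical):

```python
def build_dependency_graph(num_nodes, weighted_edges):
    """Represent the dependency graph as a symmetric weight matrix:
    a zero entry means no dependency between two words; a positive entry
    means a dependency, and a larger value means a stronger one."""
    graph = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j, weight in weighted_edges:
        graph[i][j] = weight
        graph[j][i] = weight  # undirected graph: edges are symmetric
    return graph

g = build_dependency_graph(3, [(0, 1, 1.0), (1, 2, 0.5)])
print(g[0][1], g[1][2], g[0][2])  # 1.0 0.5 0.0
```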

Optionally, the inputting of the dependency tree and the word embedding vector set into the construction unit to obtain the dependency graph output by the construction unit includes:

Step 12131: determine the nodes of the dependency graph based on the word embedding vector set.

Specifically, the nodes of the dependency graph correspond one-to-one to the word embedding vectors, i.e., one node corresponds to one word. For example, node 1 corresponds to "tomorrow", node 2 to "school starts", node 3 to "homework", node 4 to "not", and node 5 to "finished".

Step 12132: determine the adjacency relation between the nodes based on the dependency tree and the adjacency matrix.

The adjacency matrix is computed as follows:

$$A_{i,j} = \begin{cases} 1, & \text{if } T(W_i, W_j) \\ 0, & \text{otherwise} \end{cases}$$

where A_{i,j} denotes the connection relation between the i-th node and the j-th node, A ∈ R^{Y×Y}, i.e., A is a Y×Y node matrix; i and j are positive integers greater than or equal to 1; T(W_i, W_j) denotes that a relation exists between the i-th node W_i and the j-th node W_j; A_{i,j} = 1 indicates that an edge exists between the i-th node and the j-th node, and A_{i,j} = 0 indicates that no edge exists between them. It should be understood that the connection relation A_{i,j} between the i-th node and the j-th node can be determined from the dependency tree.

Table 1 is a schematic adjacency matrix provided by an embodiment of the present invention. As shown in Table 1, the adjacency matrix corresponding to the example in step 12131 is:

Table 1. Schematic adjacency matrix

Node  1  2  3  4  5
1     1  1  0  0  0
2     1  1  0  0  0
3     0  0  1  1  1
4     0  0  1  1  1
5     0  0  1  1  1
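The matrix in Table 1 can be reconstructed from the example's related word pairs. In the sketch below, node indices are 0-based, the related pairs mirror the example (nodes 1-2 form one cluster, nodes 3-5 the other), and each node is additionally connected to itself, as on Table 1's diagonal:

```python
def adjacency_matrix(num_nodes, related_pairs):
    """A[i][j] = 1 when a dependency-tree relation T(Wi, Wj) exists
    (including each node's self-connection), otherwise 0."""
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        A[i][i] = 1
    for i, j in related_pairs:
        A[i][j] = A[j][i] = 1
    return A

A = adjacency_matrix(5, [(0, 1), (2, 3), (2, 4), (3, 4)])
expected = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
print(A == expected)  # True
```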

Step 1214: input the dependency graph into the dependency graph unit and obtain the first to-be-classified matrix output by the dependency graph unit.

Specifically, the dependency graph unit may be a dependency graph convolutional network, which uses a graph convolutional network to learn syntactic dependency information. The dependency graph convolutional network takes as input the dependency graph and the word embedding vector corresponding to each of its nodes, and each node in layer I is updated according to the hidden representations of its neighborhood:

$$H^{I} = \mathrm{ReLU}\left(\hat{A} H^{I-1} W^{I} + b^{I}\right)$$

$$\hat{A} = D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$$

where H^I denotes the hidden representation of the I-th hidden layer, H^I ∈ R^{Y×d}, i.e., H^I is a Y×d matrix each of whose rows corresponds to the representation of one node; ReLU denotes the ReLU activation function; Â denotes the normalized adjacency matrix; A denotes the adjacency matrix; D denotes the degree matrix of A; W^I is a training parameter of layer I, W^I ∈ R^{d×d}, a d×d matrix; and b^I is a training (bias) parameter of layer I.

应理解，HI是经过依赖图卷积网络后得到的依赖信息加强后的表示。归一化后的邻接矩阵能够在表达句法结构的同时降低句法结构的复杂度，从而降低时间复杂度。激活函数可以采用ReLU函数，也可以采用其他激活函数。It should be understood that HI is the dependency-enhanced representation obtained after the dependency graph convolutional network. The normalized adjacency matrix can express the syntactic structure while reducing its complexity, thereby reducing the time complexity. The activation function may be the ReLU function, or another activation function may be used.
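A minimal sketch of one dependency-GCN layer following the formulas above. The source only states that D is the degree of A, so the symmetric normalization D^(-1/2) A D^(-1/2) used here is an assumption, and all shapes and values are illustrative:

```python
import numpy as np

def gcn_layer(H, A, W, b):
    """One dependency-GCN layer: H' = ReLU(A_hat @ H @ W + b).

    A_hat is the degree-normalized adjacency matrix; symmetric
    normalization D^{-1/2} A D^{-1/2} is an assumed choice."""
    D = A.sum(axis=1)                        # node degrees (self-loops keep D > 0)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(D))
    A_hat = D_inv_sqrt @ A @ D_inv_sqrt      # normalized adjacency
    return np.maximum(A_hat @ H @ W + b, 0.0)  # ReLU activation

rng = np.random.default_rng(0)
Y, d = 5, 8                                  # Y nodes, d-dim word embeddings
A = np.eye(Y)
A[0, 1] = A[1, 0] = 1                        # toy adjacency with self-loops
H = rng.normal(size=(Y, d))                  # word-embedding input H^0
W = rng.normal(size=(d, d))
b = np.zeros(d)
H1 = gcn_layer(H, A, W, b)                   # dependency-enhanced representation
print(H1.shape)
```

Stacking several such layers lets each node aggregate information from increasingly distant neighbors in the dependency tree.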

本发明实施例提供的语义分类方法，通过将依赖树和词嵌入相结合，实现将句法结构和单词含义相结合，使得语义分类模型能够从句法结构和单词含义两方面理解整个待分类文本数据的语义，提高了对待分类文本数据的语义理解的正确率。The semantic classification method provided by the embodiment of the present invention combines the dependency tree with word embeddings, thereby combining syntactic structure with word meaning, so that the semantic classification model can understand the semantics of the entire text data to be classified from both the syntactic structure and the word meanings, improving the accuracy of semantic understanding of the text data to be classified.

步骤1215,将所述第一待分类矩阵输入至所述注意力单元,获得所述注意力单元输出的第一待分类向量。Step 1215: Input the first matrix to be classified into the attention unit, and obtain the first vector to be classified output by the attention unit.

具体地,注意力单元是基于注意力机制构建的,以基于社交媒体的压力类别分类应用场景为例,注意力单元可以使语义分类模型学会关注压力相关信息。Specifically, the attention unit is constructed based on the attention mechanism. Taking the application scenario of stress category classification based on social media as an example, the attention unit can make the semantic classification model learn to pay attention to stress-related information.

可选地,注意力单元可以基于注意力向量公式和所述待分类矩阵获得待分类向量。Optionally, the attention unit may obtain the to-be-classified vector based on the attention vector formula and the to-be-classified matrix.

所述注意力向量公式为:The attention vector formula is:

v=Softmax(PWp+bp);v=Softmax(PWp +bp );

F=vTP;F=vTP ;

其中，v表示注意力向量，v∈Rt×1，即v为t×1维的矩阵，Softmax表示softmax函数，P表示待分类矩阵，Wp表示训练参数，Wp∈Rd×1，即Wp为d×1维的矩阵，bp表示训练参数，F为待分类向量，用于表示待分类文本数据，F∈R1×d，即F为1×d维的矩阵，vT表示v的转置。应理解，在输入数据仅为待分类文本数据的情况下，P为第一待分类矩阵，F为第一待分类向量。Here, v denotes the attention vector, v∈Rt×1, i.e., a t×1 matrix; Softmax denotes the softmax function; P denotes the matrix to be classified; Wp denotes a training parameter, Wp∈Rd×1, i.e., a d×1 matrix; bp denotes a training parameter; F is the vector to be classified, used to represent the text data to be classified, F∈R1×d, i.e., a 1×d matrix; and vT denotes the transpose of v. It should be understood that when the input data is only text data to be classified, P is the first matrix to be classified and F is the first vector to be classified.
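The attention-vector formulas above can be sketched as follows. Shapes and random values are illustrative, and applying the softmax over the t rows of P is an assumption consistent with v∈Rt×1:

```python
import numpy as np

def softmax(z, axis=0):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(P, Wp, bp):
    """v = Softmax(P @ Wp + bp) in R^{t x 1}; F = v^T @ P in R^{1 x d}."""
    v = softmax(P @ Wp + bp, axis=0)  # attention weights over the t rows of P
    F = v.T @ P                       # weighted sum: one vector for the whole input
    return v, F

rng = np.random.default_rng(0)
t, d = 6, 8
P = rng.normal(size=(t, d))           # matrix to be classified
Wp = rng.normal(size=(d, 1))
bp = 0.0
v, F = attention_pool(P, Wp, bp)
print(v.shape, F.shape)
```

The weights in v sum to 1, so F is a convex combination of the rows of P, letting the model emphasize stress-related rows.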

可选地,所述编码模块还包括图片单元和连接单元;Optionally, the encoding module further includes a picture unit and a connection unit;

所述方法还包括:The method also includes:

步骤111,获取与所述待分类文本数据对应的待分类图片数据;Step 111, obtaining image data to be classified corresponding to the text data to be classified;

示例性地，如在基于社交媒体的压力类别检测场景中，用户可能会在社交媒体上进行配文，如发表文本“今天不开心”，并配就医诊疗图片，可以表示用户由于身体健康问题而产生压力。由于图片数据可以提供丰富的内容信息，因此可以用于语言内容分类。应理解，语义分类模型可以基于文本样本以及所述文本样本对应的类别标签、图片样本以及所述图片样本对应的类别标签和组合样本以及所述组合样本对应的类别标签进行训练后得到的。组合样本中包括文本样本和与文本样本对应的图片样本。Illustratively, in a social-media-based stress category detection scenario, a user may post text on social media, such as "I'm not happy today", together with a picture of a medical visit, which may indicate that the user is under stress due to physical health problems. Since picture data can provide rich content information, it can be used for language content classification. It should be understood that the semantic classification model may be obtained after training based on text samples and their corresponding category labels, picture samples and their corresponding category labels, and combined samples and their corresponding category labels. A combined sample includes a text sample and the picture sample corresponding to that text sample.

步骤1216,将所述待分类图片数据输入至所述图片单元,获得所述图片单元输出的第二待分类矩阵;Step 1216, input the picture data to be classified into the picture unit, and obtain the second matrix to be classified outputted by the picture unit;

具体地，图片单元用于对图片数据进行编码，获得第二待分类矩阵，第二待分类矩阵用于表示图片含义。可选地，图片单元可以包括ResNet。ResNet可以有效地从社交媒体数据中获取图像特征，因此可以基于预训练好的34层ResNet提取图片特征。最终第二待分类矩阵为X={x1,x2,…,xδ}，其中，δ表示图片数据的数量（如待分类文本数据中配图3张，则δ=3），xi表示第i张图片的向量表示，1≤i≤δ；X∈Rδ×d，即第二待分类矩阵为δ×d维的矩阵。Specifically, the picture unit is used to encode the picture data to obtain the second matrix to be classified, which represents the meaning of the pictures. Optionally, the picture unit may include a ResNet. Since ResNet can effectively extract image features from social media data, picture features may be extracted with a pre-trained 34-layer ResNet. The resulting second matrix to be classified is X={x1, x2, …, xδ}, where δ denotes the number of pictures (e.g., if the text data to be classified is accompanied by 3 pictures, δ=3), xi denotes the vector representation of the i-th picture, 1≤i≤δ, and X∈Rδ×d, i.e., the second matrix to be classified is a δ×d matrix.

步骤1217,将所述第一待分类矩阵和所述第二待分类矩阵拼接,获得第三待分类矩阵;Step 1217, splicing the first matrix to be classified and the second matrix to be classified to obtain a third matrix to be classified;

以简化后的第一待分类矩阵和第二待分类矩阵做一简单的示例:Take the simplified first matrix to be classified and the second matrix to be classified as a simple example:

Figure BDA0003604417390000121
Figure BDA0003604417390000121

Figure BDA0003604417390000122
Figure BDA0003604417390000122

Figure BDA0003604417390000123
Figure BDA0003604417390000123

其中，H为待分类文本数据的文本表示，即第一待分类矩阵，X为待分类文本数据的图片表示，即第二待分类矩阵，con表示拼接函数，P表示第三待分类矩阵，P∈Rt×d，即P为t×d维的矩阵，其中t=Y+δ。Here, H is the text representation of the text data to be classified, i.e., the first matrix to be classified; X is the picture representation of the text data to be classified, i.e., the second matrix to be classified; con denotes the concatenation (splicing) function; and P denotes the third matrix to be classified, P∈Rt×d, i.e., a t×d matrix, where t=Y+δ.
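The splice con(H, X) above amounts to a row-wise concatenation, which can be sketched as follows (Y, δ and d are illustrative values):

```python
import numpy as np

# H: text representation (Y x d); X: picture representation (delta x d)
Y, delta, d = 4, 3, 8
H = np.ones((Y, d))                  # stand-in for the first matrix to be classified
X = np.zeros((delta, d))             # stand-in for the second matrix to be classified
P = np.concatenate([H, X], axis=0)   # third matrix to be classified: t = Y + delta rows
print(P.shape)
```

Because both modalities share the embedding dimension d, the attention unit can weigh text rows and picture rows of P uniformly.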

步骤1218,将所述第三待分类矩阵输入至所述注意力单元,获得所述注意力单元输出的第二待分类向量。Step 1218: Input the third matrix to be classified into the attention unit to obtain a second vector to be classified output by the attention unit.

应理解,注意力单元如步骤1215所述,此处不再赘述。注意力单元基于注意力机制融合待分类文本数据和待分类图像数据两种数据的模态,可以使语义分类模型学会关注压力相关信息。It should be understood that the attention unit is as described in step 1215, and details are not repeated here. The attention unit integrates the modalities of the text data to be classified and the image data to be classified based on the attention mechanism, so that the semantic classification model can learn to pay attention to the pressure-related information.

注意力单元生成第二待分类向量的过程如步骤1215所述,此处不再赘述。应理解,在输入为第三分类矩阵的情况下,注意力向量公式中P为第三待分类矩阵,F为第二待分类向量。The process of generating the second to-be-classified vector by the attention unit is as described in step 1215, which will not be repeated here. It should be understood that in the case where the input is the third classification matrix, in the attention vector formula, P is the third matrix to be classified, and F is the second vector to be classified.

应理解,本发明实施例中的待分类向量包括第一待分类向量和第二待分类向量其中任一。It should be understood that the to-be-classified vector in this embodiment of the present invention includes any one of a first to-be-classified vector and a second to-be-classified vector.

参考图2，图2是本发明实施例提供的语义分类方法的流程示意图之二。本发明实施例提供的语义分类模型可以是基于训练集和测试集进行元学习后得到的，所述训练集包括训练样本以及对应的分类标签，所述测试集包括测试样本以及对应的分类标签。所述支持集包括至少一个类别，每个类别包括至少一个支持样本以及每个所述支持样本对应的分类标签。Referring to FIG. 2, FIG. 2 is a second schematic flowchart of the semantic classification method provided by an embodiment of the present invention. The semantic classification model provided by the embodiment of the present invention may be obtained by performing meta-learning based on a training set and a test set, where the training set includes training samples and corresponding classification labels, and the test set includes test samples and corresponding classification labels. The support set includes at least one category, and each category includes at least one support sample and a classification label corresponding to each support sample.

所述训练集和测试集都可以包括以下任一样本类型或其组合：Both the training set and the test set may include any one of, or a combination of, the following sample types:

文本样本以及所述文本样本对应的类别标签;A text sample and a category label corresponding to the text sample;

图片样本以及所述图片样本对应的类别标签;A picture sample and a category label corresponding to the picture sample;

组合样本以及所述组合样本对应的类别标签;a combined sample and a category label corresponding to the combined sample;

其中,组合样本中包括文本样本和与文本样本对应的图片样本。The combined samples include text samples and image samples corresponding to the text samples.

在所述语义分类模型是基于训练集和测试集进行元学习后得到的情况下,语义分类方法还可以包括:When the semantic classification model is obtained after meta-learning based on the training set and the test set, the semantic classification method may further include:

获取支持集;get support set;

将所述支持集和所述待分类文本数据输入至语义分类模型,获得所述语义分类模型输出的分类结果。Inputting the support set and the text data to be classified into a semantic classification model to obtain a classification result output by the semantic classification model.

支持集包括至少一个类别,每个类别包括至少一个支持样本以及每个所述支持样本对应的分类标签。支持集可以是从数据库中获得的,也可以是人工标注的,本发明实施例对支持集的来源不作限定。The support set includes at least one category, and each category includes at least one support sample and a classification label corresponding to each of the support samples. The support set may be obtained from a database or manually marked, and the source of the support set is not limited in this embodiment of the present invention.

通过元学习，使得在语义分类模型的训练过程中，经过大量的训练，每次训练都遇到的是不同的任务，这个任务里存在以前的任务中没有见到过的样本。语义分类模型每次都要学习一个新的任务，经过大量的训练，语义分类模型能够具有对新任务的处理能力。这个新的任务可以是小样本任务，支持集中包括经过标记的少量样本，这些样本都是语义分类模型以前训练中所没见过的，在经过元学习后，语义分类模型能够在这几个小样本的帮助下，分类识别未标记的待分类文本数据。这种学习方式，使得语义分类模型在经历多次的不同任务训练之后，能够更好地处理任务之间的不同，忽略特定任务的特征。从而实现在遇到小样本分类新任务的时候，基于少量标注样本实现准确分类。Through meta-learning, during the training of the semantic classification model, a large number of training rounds are performed, and each round encounters a different task that contains samples never seen in previous tasks. The semantic classification model must learn a new task every time, and after extensive training it acquires the ability to handle new tasks. Such a new task may be a few-shot task: the support set includes a small number of labeled samples that the model has never seen in previous training, and after meta-learning, with the help of these few samples, the model can classify and recognize unlabeled text data to be classified. This learning paradigm enables the semantic classification model, after being trained on many different tasks, to better handle the differences between tasks while ignoring task-specific features, so that when it encounters a new few-shot classification task, it can classify accurately based on only a few labeled samples.

本发明实施例提供的语义分类方法，通过元学习使得语义分类模型能够具有“学习如何学习”的能力，使得语义分类模型在频繁出现的样本上训练后，不需要调整，只需要少数不频繁出现的样本的数据，就可以识别出待分类数据（包括待分类文本数据和/或待分类图片数据）中不频繁出现的类别。The semantic classification method provided by the embodiment of the present invention endows the semantic classification model with the ability of "learning how to learn" through meta-learning, so that after the model is trained on frequently occurring samples, it can, without any adjustment and with only a small amount of data for infrequently occurring samples, recognize the infrequently occurring categories in the data to be classified (including the text data to be classified and/or the picture data to be classified).

在所述语义分类模型是基于训练集和测试集进行元学习后得到的情况下,语义分类模型还可以包括:归纳模块;When the semantic classification model is obtained after meta-learning based on the training set and the test set, the semantic classification model may further include: an induction module;

所述语义分类方法还可以包括:The semantic classification method may further include:

步骤122，将所述支持集输入至所述归纳模块，获得所述归纳模块输出的类别向量集，所述类别向量集包括至少一个类别向量，每个所述类别向量与所述分类标签一一对应。Step 122: Input the support set to the induction module to obtain the category vector set output by the induction module, where the category vector set includes at least one category vector, and each category vector corresponds one-to-one to a classification label.

可选地,所述归纳模块包括向量单元和赋权单元;Optionally, the induction module includes a vector unit and a weighting unit;

所述将所述支持集输入至所述归纳模块,获得所述归纳模块输出的类别向量集,包括:The inputting the support set to the induction module to obtain the category vector set output by the induction module includes:

步骤1221，将所述支持集输入至所述向量单元，获得所述向量单元输出的专家向量集，所述专家向量集中包括至少一个专家向量，每个所述专家向量与所述支持集中的支持样本一一对应；Step 1221: Input the support set to the vector unit to obtain the expert vector set output by the vector unit, where the expert vector set includes at least one expert vector, and each expert vector corresponds one-to-one to a support sample in the support set;

具体地，支持集中可以包括N个类别的支持样本，每个类别可以包括K个支持样本，由支持样本转换为专家向量的过程可以参照上述步骤1211至步骤1218，本发明实施例对支持样本转换为专家向量的过程不作限定。Specifically, the support set may include support samples of N categories, and each category may include K support samples. For the process of converting a support sample into an expert vector, reference may be made to steps 1211 to 1218 above; this embodiment of the present invention does not limit the process of converting support samples into expert vectors.

步骤1222，将所述专家向量集输入至赋权单元，对所述专家向量集基于预设赋权公式进行赋权操作，获得所述赋权单元输出的类别向量集。Step 1222: Input the expert vector set to the weighting unit, perform a weighting operation on the expert vector set based on a preset weighting formula, and obtain the category vector set output by the weighting unit.

具体地，归纳模块用于从支持集中生成每个类别对应的类别表示。赋权单元是基于混合专家机制设计的，用于基于专家向量集生成每个类别对应的类别表示。赋权单元对每个类别中的每个专家向量进行赋权，使得多个不同权重的不同专家向量能够组合起来表示对应的类别，从而使语义分类模型能够利用每个专家向量（即支持样本）的关键信息而忽略噪声。即，将支持集中的每个支持样本都视为专家，每个专家都旨在学习与类别相关的信息，所有专家共享相同的参数。Specifically, the induction module is used to generate the category representation corresponding to each category from the support set. The weighting unit is designed based on a mixture-of-experts mechanism and generates the category representation for each category from the expert vector set. The weighting unit weights each expert vector in each category, so that multiple expert vectors with different weights can be combined to represent the corresponding category, enabling the semantic classification model to use the key information of each expert vector (i.e., support sample) while ignoring noise. That is, each support sample in the support set is regarded as an expert, each expert aims to learn category-related information, and all experts share the same parameters.

可选地，所述预设赋权公式可以通过数据建模算法-最优赋权法确定，还可以通过组合赋权法或指标赋权法等确定，本发明实施例对预设赋权公式的具体形式不作限定。Optionally, the preset weighting formula may be determined by a data-modeling optimal weighting method, or by a combination weighting method, an index weighting method, or the like; this embodiment of the present invention does not limit the specific form of the preset weighting formula.

可选地,所述预设赋权公式包括初始值公式、门值公式和类别表示函数;Optionally, the preset weighting formula includes an initial value formula, a gate value formula and a category representation function;

所述初始值公式为:The initial value formula is:

ηi=Relu(eiWg+bg);ηi =Relu(ei Wg +bg );

所述门值公式为:The gate value formula is:

Figure BDA0003604417390000151
Figure BDA0003604417390000151

所述类别表示函数为:The category representation function is:

Figure BDA0003604417390000161
Figure BDA0003604417390000161

其中，ηi表示第i个专家样本对应的初始门值，Relu表示ReLU激活函数，ei表示第i个专家样本对应的专家向量，Wg表示训练参数，bg表示训练参数，gi表示第i个专家样本对应的最终门值，exp表示指数函数，K表示一个类别中包含K个样本，cj表示第j个类别的类别向量，i为大于等于1的正整数，j为大于等于1的正整数，K为大于等于1的正整数。Here, ηi denotes the initial gate value corresponding to the i-th expert sample, Relu denotes the ReLU activation function, ei denotes the expert vector corresponding to the i-th expert sample, Wg and bg denote training parameters, gi denotes the final gate value corresponding to the i-th expert sample, exp denotes the exponential function, K denotes the number of samples contained in one category, and cj denotes the category vector of the j-th category; i and j are positive integers greater than or equal to 1, and K is a positive integer greater than or equal to 1.
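A minimal sketch of the gating computation for one category. The category representation function itself appears only as an image in the source, so the weighted sum cj = Σi gi·ei used here is an assumption, as is Wg∈Rd×1 (so that each ηi is a scalar):

```python
import numpy as np

def class_vector(E, Wg, bg):
    """Mixture-of-experts induction for one category.

    E: K x d expert vectors (one per support sample).
    eta_i = ReLU(e_i @ Wg + bg)   -- initial gate value (scalar per expert)
    g_i   = exp(eta_i) / sum_k exp(eta_k)  -- final gate value
    c     = sum_i g_i * e_i       -- assumed weighted class representation"""
    eta = np.maximum(E @ Wg + bg, 0.0).ravel()  # K initial gate values
    g = np.exp(eta) / np.exp(eta).sum()         # K final gate values, sum to 1
    return g @ E                                # d-dimensional category vector

rng = np.random.default_rng(0)
K, d = 5, 8
E = rng.normal(size=(K, d))   # expert vectors of one category's K support samples
Wg = rng.normal(size=(d, 1))
bg = 0.0
c = class_vector(E, Wg, bg)
print(c.shape)
```

Because the gate values sum to 1, noisy support samples receive small weights while informative ones dominate the category vector.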

具体地,门值是对专家向量的非线性变换,门值计算用于语义分类模型学习衡量专家信息的重要性。应理解,图2中门值单元用于通过门值公式计算门值,协调单元用于计算类别表示。Specifically, the gate value is a nonlinear transformation of the expert vector, and the gate value calculation is used for the semantic classification model to learn to measure the importance of expert information. It should be understood that the gate value unit in FIG. 2 is used to calculate the gate value through the gate value formula, and the coordination unit is used to calculate the category representation.

本发明实施例提供的语义分类方法,通过对支持样本的表示(专家向量)进行赋权,使语义分类模型能够利用每个专家向量的关键信息而忽略噪声,提高了类别表示的准确性。The semantic classification method provided by the embodiment of the present invention enables the semantic classification model to utilize the key information of each expert vector while ignoring noise by weighting the representations (expert vectors) of the support samples, thereby improving the accuracy of class representation.

在所述语义分类模型是基于训练集和测试集进行元学习后得到的情况下,语义分类方法还可以包括:When the semantic classification model is obtained after meta-learning based on the training set and the test set, the semantic classification method may further include:

步骤1231,将所述待分类向量和所述类别向量集输入至所述关系模块,获得所述关系模块输出的分类结果,包括:Step 1231: Input the vector to be classified and the category vector set into the relationship module, and obtain the classification result output by the relationship module, including:

将所述待分类向量和所述类别向量集输入至所述关系模块,基于余弦相似度确定与所述待分类向量最相似的类别向量,获得所述关系模块输出的分类结果。The to-be-classified vector and the category vector set are input to the relationship module, the most similar category vector to the to-be-classified vector is determined based on the cosine similarity, and the classification result output by the relationship module is obtained.

具体地,关系模块用于衡量查询向量(即待分类向量)与类别(即类别向量集中各个类别向量)的相关性。余弦相似度在衡量两个向量之间的关系时显示出了巨大的优越性。因此,本发明实施例使用余弦相似度公式计算待分类向量和所述类别向量的相似度:Specifically, the relationship module is used to measure the correlation between the query vector (ie, the vector to be classified) and the category (ie, each category vector in the category vector set). Cosine similarity shows great superiority in measuring the relationship between two vectors. Therefore, the embodiment of the present invention uses the cosine similarity formula to calculate the similarity between the to-be-classified vector and the category vector:

余弦相似度公式为:The cosine similarity formula is:

Figure BDA0003604417390000162
Figure BDA0003604417390000162

其中，q表示待分类向量，ci表示第i个类别的类别向量，i为正整数，j为正整数，

Figure BDA0003604417390000171

表示范围在-1到1的关系分数，分数越接近1表示两个向量越相似。Here, q denotes the vector to be classified, ci denotes the category vector of the i-th category, and i and j are positive integers;

Figure BDA0003604417390000171

denotes the relation score, whose range is -1 to 1; the closer the score is to 1, the more similar the two vectors are.

将待分类文本数据分类为最大关系分数对应的类别。Classify the text data to be classified into the category corresponding to the largest relationship score.
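A minimal sketch of the relation module: cosine scores are computed between the query vector q and each category vector, and the text is assigned to the highest-scoring category (all values are illustrative):

```python
import numpy as np

def classify(q, C):
    """Relation module: cosine similarity between the query vector q and
    each category vector in C; return the index of the best category and
    all relation scores (each in [-1, 1])."""
    scores = [float(q @ c) / (np.linalg.norm(q) * np.linalg.norm(c)) for c in C]
    return int(np.argmax(scores)), scores

q = np.array([1.0, 0.0])                       # vector to be classified
C = [np.array([0.9, 0.1]),                     # category vectors from the induction module
     np.array([-1.0, 0.0]),
     np.array([0.0, 1.0])]
label, scores = classify(q, C)
print(label)  # category 0 is most similar to q
```

In the few-shot setting, C comes from the support set via the induction module, so new categories need no retraining.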

本发明实施例提供的语义分类方法，通过将待分类向量与类别向量进行一一对比，由于待分类向量能够全面地表示待分类文本数据的语义含义，通过赋权提高了类别向量对类别表示的准确性，因此提高了社交媒体压力类别分类的准确性。In the semantic classification method provided by the embodiment of the present invention, the vector to be classified is compared with each category vector one by one. Since the vector to be classified can comprehensively represent the semantic meaning of the text data to be classified, and the weighting improves the accuracy with which the category vectors represent their categories, the accuracy of social media stress category classification is improved.

可选地，所述语义分类模型通过如下步骤训练获得：Optionally, the semantic classification model is obtained by training through the following steps:

步骤210,基于所述训练集、损失函数和预设元训练任务数对所述语义分类模型进行元训练;Step 210, meta-training the semantic classification model based on the training set, the loss function and the preset number of meta-training tasks;

步骤220,基于所述测试集和预设元测试任务数对所述语义分类模型进行元测试;Step 220, meta-testing the semantic classification model based on the test set and the preset number of meta-test tasks;

交替执行上述步骤,直至预设迭代次数用尽或所述语义分类模型输出的元测试结果达到预设准确率。The above steps are performed alternately until the preset number of iterations is exhausted or the meta-test result output by the semantic classification model reaches the preset accuracy rate.

具体地，表2是本发明实施例提供的元学习算法。Specifically, Table 2 shows the meta-learning algorithm provided by an embodiment of the present invention.

表2.元学习算法Table 2. Meta-learning algorithm

Figure BDA0003604417390000172
Figure BDA0003604417390000172

Figure BDA0003604417390000181
Figure BDA0003604417390000181

元训练：为了有效地利用训练集，我们采用元训练策略。在每个训练episode中，都会构建一个元任务来计算梯度并更新我们的模型。如表2所示，元任务是通过从训练集类别中随机抽样N个类别形成的，然后为每个类别选择K个标记样本作为支持集S，并从这些类别的剩余样本中选择一部分作为查询集Q。元训练过程旨在学习如何学习有意义的特征，以最小化查询集Q上的损失。Meta-training: To utilize the training set effectively, we adopt a meta-training strategy. In each training episode, a meta-task is constructed to compute gradients and update our model. As shown in Table 2, a meta-task is formed by randomly sampling N categories from the training-set categories; K labeled samples are then selected for each category as the support set S, and a portion of the remaining samples of those categories is selected as the query set Q. The meta-training process aims to learn how to learn meaningful features that minimize the loss on the query set Q.

这种元训练机制允许语义分类模型学习不同元任务的通用部分,例如如何提取重要特征和比较样本相似性,而忘记元任务中的任务特定部分。因此,该模型在面对新的元任务时仍然可以有效地工作。This meta-training mechanism allows semantic classification models to learn common parts of different meta-tasks, such as how to extract important features and compare sample similarities, while forgetting the task-specific parts of meta-tasks. Therefore, the model can still work effectively when faced with new meta-tasks.

元测试：在完成元训练后，应用相同的episode构建机制来测试语义分类模型是否可以直接应用于新的类别。为了创建一个测试episode，首先从测试集中随机抽取N个新类别，然后从这N个类别中采样测试用支持集和测试用查询集。语义分类模型的输出结果可以表示所有测试episode的查询集上的平均性能。Meta-testing: After meta-training is completed, the same episode construction mechanism is applied to test whether the semantic classification model can be applied directly to new categories. To create a test episode, N new categories are first randomly sampled from the test set, and then a test support set and a test query set are sampled from these N categories. The output of the semantic classification model can represent the average performance over the query sets of all test episodes.
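A minimal sketch of the N-way K-shot episode construction described for meta-training and meta-testing. The data layout, category names and sample counts are illustrative:

```python
import random

def build_episode(data_by_class, N, K, q_per_class):
    """Sample one N-way K-shot episode: N categories, K support samples per
    category, and q_per_class of the remaining samples as the query set."""
    classes = random.sample(sorted(data_by_class), N)   # N categories at random
    support, query = [], []
    for c in classes:
        samples = random.sample(data_by_class[c], K + q_per_class)
        support += [(s, c) for s in samples[:K]]        # K labeled support samples
        query += [(s, c) for s in samples[K:]]          # held-out query samples
    return support, query

random.seed(0)
# Illustrative dataset: 4 stress categories, 10 samples each
data = {c: [f"{c}_{i}" for i in range(10)]
        for c in ["stressA", "stressB", "stressC", "stressD"]}
support, query = build_episode(data, N=2, K=3, q_per_class=2)
print(len(support), len(query))
```

Each training episode yields one such (support, query) pair; the model computes category vectors from the support set and is optimized on the query-set loss.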

在元测试后语义分类模型的输出结果正确率未达到预设准确率的情况下，再次对语义分类模型进行元训练，并对再次经过元训练的语义分类模型进行元测试，直至预设迭代次数用尽或语义分类模型的输出结果正确率达到预设准确率后，停止对语义分类模型的训练。If the accuracy of the output of the semantic classification model after meta-testing does not reach the preset accuracy, the semantic classification model is meta-trained again, and the re-meta-trained model is meta-tested again, until the preset number of iterations is exhausted or the accuracy of the model's output reaches the preset accuracy, at which point training of the semantic classification model stops.

可选地,在元训练阶段,我们使用均方误差来优化我们的模型。Optionally, during the meta-training phase, we use mean squared error to optimize our model.

本发明实施例提供的语义分类方法，通过元学习使得语义分类模型能够具有“学习如何学习”的能力，使得语义分类模型能够在小样本数据的基础上仍然保证分类准确率，如在基于社交媒体的压力类别分类场景中，语义分类模型在频繁出现的压力类别上训练后，不需要调整，只需要少数不频繁出现的压力类别的数据就可以直接识别那些不频繁出现的压力类别。The semantic classification method provided by the embodiment of the present invention endows the semantic classification model with the ability of "learning how to learn" through meta-learning, so that the model can maintain classification accuracy even with small-sample data. For example, in a social-media-based stress category classification scenario, after the semantic classification model is trained on frequently occurring stress categories, it needs no adjustment and can directly recognize infrequently occurring stress categories using only a small amount of data for those categories.

下面对本发明提供的语义分类装置进行描述,下文描述的语义分类装置与上文描述的语义分类方法可相互对应参照。The semantic classification device provided by the present invention is described below, and the semantic classification device described below and the semantic classification method described above can be referred to each other correspondingly.

图3是本发明实施例提供的语义分类装置的结构示意图,如图3所示,本发明实施例提供一种语义分类装置,包括:FIG. 3 is a schematic structural diagram of a semantic classification device provided by an embodiment of the present invention. As shown in FIG. 3 , an embodiment of the present invention provides a semantic classification device, including:

获取模块310，用于获取待分类文本数据；an obtaining module 310, configured to obtain text data to be classified;

分类模块320，用于将所述待分类文本数据输入至语义分类模型，基于所述待分类文本数据的词嵌入和依赖树对所述待分类文本数据进行语义理解，并基于语义理解结果进行分类，获得所述语义分类模型输出的分类结果；a classification module 320, configured to input the text data to be classified into a semantic classification model, perform semantic understanding on the text data to be classified based on the word embedding and dependency tree of the text data to be classified, and perform classification based on the semantic understanding result to obtain the classification result output by the semantic classification model;

其中，所述语义分类模型是基于文本样本以及所述文本样本对应的类别标签进行训练后得到的，每个所述类别标签是根据所述文本样本预先确定的，并与所述文本样本一一对应。Here, the semantic classification model is obtained after training based on text samples and category labels corresponding to the text samples, and each category label is predetermined according to a text sample and corresponds one-to-one to the text samples.

在此需要说明的是，本发明实施例提供的上述装置，能够实现上述方法实施例所实现的所有方法步骤，且能够达到相同的技术效果，在此不再对本实施例中与方法实施例相同的部分及有益效果进行具体赘述。It should be noted here that the above apparatus provided by the embodiment of the present invention can implement all the method steps implemented by the above method embodiments and can achieve the same technical effects; the parts of this embodiment identical to the method embodiments and their beneficial effects are not described in detail again here.

图4示例了一种电子设备的实体结构示意图，如图4所示，该电子设备可以包括：处理器（processor）410、通信接口（Communications Interface）420、存储器（memory）430和通信总线440，其中，处理器410，通信接口420，存储器430通过通信总线440完成相互间的通信。处理器410可以调用存储器430中的逻辑指令，以执行语义分类方法，包括：获取待分类文本数据；将所述待分类文本数据输入至语义分类模型，基于所述待分类文本数据的词嵌入和依赖树对所述待分类文本数据进行语义理解，并基于语义理解结果进行分类，获得所述语义分类模型输出的分类结果；其中，所述语义分类模型是基于文本样本以及所述文本样本对应的类别标签进行训练后得到的，每个所述类别标签是根据所述文本样本预先确定的，并与所述文本样本一一对应。FIG. 4 illustrates a schematic diagram of the physical structure of an electronic device. As shown in FIG. 4, the electronic device may include: a processor 410, a communications interface 420, a memory 430 and a communication bus 440, where the processor 410, the communications interface 420 and the memory 430 communicate with each other through the communication bus 440. The processor 410 may invoke logic instructions in the memory 430 to execute the semantic classification method, including: obtaining text data to be classified; inputting the text data to be classified into a semantic classification model, performing semantic understanding on the text data to be classified based on the word embedding and dependency tree of the text data to be classified, and performing classification based on the semantic understanding result to obtain the classification result output by the semantic classification model; where the semantic classification model is obtained after training based on text samples and category labels corresponding to the text samples, and each category label is predetermined according to a text sample and corresponds one-to-one to the text samples.

此外，上述的存储器430中的逻辑指令可以通过软件功能单元的形式实现并作为独立的产品销售或使用时，可以存储在一个计算机可读取存储介质中。基于这样的理解，本发明的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来，该计算机软件产品存储在一个存储介质中，包括若干指令用以使得一台计算机设备（可以是个人计算机，服务器，或者网络设备等）执行本发明各个实施例所述方法的全部或部分步骤。而前述的存储介质包括：U盘、移动硬盘、只读存储器（ROM，Read-Only Memory）、随机存取存储器（RAM，Random Access Memory）、磁碟或者光盘等各种可以存储程序代码的介质。In addition, the above logic instructions in the memory 430 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

另一方面，本发明还提供一种计算机程序产品，所述计算机程序产品包括计算机程序，计算机程序可存储在非暂态计算机可读存储介质上，所述计算机程序被处理器执行时，计算机能够执行上述各方法所提供的语义分类方法，该方法包括：获取待分类文本数据；将所述待分类文本数据输入至语义分类模型，基于所述待分类文本数据的词嵌入和依赖树对所述待分类文本数据进行语义理解，并基于语义理解结果进行分类，获得所述语义分类模型输出的分类结果；其中，所述语义分类模型是基于文本样本以及所述文本样本对应的类别标签进行训练后得到的，每个所述类别标签是根据所述文本样本预先确定的，并与所述文本样本一一对应。In another aspect, the present invention further provides a computer program product. The computer program product includes a computer program, which may be stored on a non-transitory computer-readable storage medium. When the computer program is executed by a processor, the computer can execute the semantic classification method provided by each of the above methods, the method including: obtaining text data to be classified; inputting the text data to be classified into a semantic classification model, performing semantic understanding on the text data to be classified based on the word embedding and dependency tree of the text data to be classified, and performing classification based on the semantic understanding result to obtain the classification result output by the semantic classification model; where the semantic classification model is obtained after training based on text samples and category labels corresponding to the text samples, and each category label is predetermined according to a text sample and corresponds one-to-one to the text samples.

又一方面，本发明还提供一种非暂态计算机可读存储介质，其上存储有计算机程序，该计算机程序被处理器执行时实现以执行上述各方法提供的语义分类方法，该方法包括：获取待分类文本数据；将所述待分类文本数据输入至语义分类模型，基于所述待分类文本数据的词嵌入和依赖树对所述待分类文本数据进行语义理解，并基于语义理解结果进行分类，获得所述语义分类模型输出的分类结果；其中，所述语义分类模型是基于文本样本以及所述文本样本对应的类别标签进行训练后得到的，每个所述类别标签是根据所述文本样本预先确定的，并与所述文本样本一一对应。In yet another aspect, the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the semantic classification method provided by each of the above methods, the method including: obtaining text data to be classified; inputting the text data to be classified into a semantic classification model, performing semantic understanding on the text data to be classified based on the word embedding and dependency tree of the text data to be classified, and performing classification based on the semantic understanding result to obtain the classification result output by the semantic classification model; where the semantic classification model is obtained after training based on text samples and category labels corresponding to the text samples, and each category label is predetermined according to a text sample and corresponds one-to-one to the text samples.

The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without creative effort.

From the description of the above embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus a necessary general-purpose hardware platform, and certainly can also be implemented by hardware. Based on this understanding, the above technical solutions, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the various embodiments or in some parts of the embodiments.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be equivalently replaced; and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A semantic classification method, characterized by comprising:
acquiring text data to be classified;
inputting the text data to be classified into a semantic classification model, performing semantic understanding on the text data to be classified based on word embeddings and a dependency tree of the text data to be classified, and classifying based on the semantic understanding result, to obtain a classification result output by the semantic classification model;
wherein the semantic classification model is obtained by training on text samples and category labels corresponding to the text samples, and each category label is predetermined according to a text sample and corresponds to the text samples one-to-one.

2. The semantic classification method according to claim 1, wherein the semantic classification model comprises an encoding module and a relation module;
the inputting the text data to be classified into the semantic classification model, performing semantic understanding on the text data to be classified based on the word embeddings and the dependency tree of the text data to be classified, classifying based on the semantic understanding result, and obtaining the classification result output by the semantic classification model comprises:
inputting the text data to be classified into the encoding module, performing semantic understanding on the text data to be classified based on the word embeddings and the dependency tree of the text data to be classified, and obtaining a to-be-classified vector output by the encoding module;
inputting the to-be-classified vector into the relation module to obtain the classification result output by the relation module.

3. The semantic classification method according to claim 2, wherein the encoding module comprises a BERT unit, a dependency tree unit, a construction unit, a dependency graph unit, and an attention unit;
the inputting the text data to be classified into the encoding module, performing semantic understanding on the text data to be classified based on the word embeddings and the dependency tree of the text data to be classified, and obtaining the to-be-classified vector output by the encoding module comprises:
inputting the text data to be classified into the BERT unit to obtain a word embedding vector set output by the BERT unit;
inputting the text data to be classified into the dependency tree unit to obtain a dependency tree output by the dependency tree unit;
inputting the dependency tree and the word embedding vector set into the construction unit to obtain a dependency graph output by the construction unit;
inputting the dependency graph into the dependency graph unit to obtain a first to-be-classified matrix output by the dependency graph unit;
inputting the first to-be-classified matrix into the attention unit to obtain a first to-be-classified vector output by the attention unit.

4. The semantic classification method according to claim 3, wherein the inputting the dependency tree and the word embedding vector set into the construction unit to obtain the dependency graph output by the construction unit comprises:
determining nodes of the dependency graph based on the word embedding vector set;
determining the adjacency relation between the nodes based on the dependency tree and an adjacency matrix;
the adjacency matrix being:

A_{i,j} = 1, if T(W_i, W_j); A_{i,j} = 0, otherwise

where A_{i,j} denotes the connection relation between the i-th node and the j-th node, i is a positive integer greater than or equal to 1, j is a positive integer greater than or equal to 1, and T(W_i, W_j) denotes that a dependency relation exists between the i-th node W_i and the j-th node W_j.

5. The semantic classification method according to claim 4, wherein the dependency graph unit comprises at least one hidden layer, and the output of each hidden layer is given by:

H^I = ReLU( Â · H^(I-1) · W^I + b^I )
Â = D^(-1/2) · A · D^(-1/2)

where H^I denotes the hidden representation of the I-th hidden layer, ReLU denotes the ReLU activation function, Â denotes the normalized adjacency matrix, A denotes the adjacency matrix, D denotes the degree matrix of A, and W^I and b^I denote training parameters.

6. The semantic classification method according to claim 3, wherein the encoding module further comprises a picture unit and a connection unit, and the method further comprises:
acquiring picture data to be classified corresponding to the text data to be classified;
inputting the picture data to be classified into the picture unit to obtain a second to-be-classified matrix output by the picture unit;
concatenating the first to-be-classified matrix and the second to-be-classified matrix to obtain a third to-be-classified matrix;
inputting the third to-be-classified matrix into the attention unit to obtain a second to-be-classified vector output by the attention unit.

7. A semantic classification apparatus, characterized by comprising:
an acquisition module configured to acquire text data to be classified;
a classification module configured to input the text data to be classified into a semantic classification model, perform semantic understanding on the text data to be classified based on word embeddings and a dependency tree of the text data to be classified, classify based on the semantic understanding result, and obtain a classification result output by the semantic classification model;
wherein the semantic classification model is obtained by training on text samples and category labels corresponding to the text samples, and each category label is predetermined according to a text sample and corresponds to the text samples one-to-one.

8. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the semantic classification method according to any one of claims 1 to 6.

9. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the semantic classification method according to any one of claims 1 to 6.

10. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the semantic classification method according to any one of claims 1 to 6.
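The hidden-layer recurrence recited in claim 5 can be illustrated numerically. The sketch below runs one graph-convolution step in plain Python; the 3-node adjacency matrix, features, and identity weights are invented toy data, and the symmetric normalization Â = D^(-1/2)·A·D^(-1/2) is the standard GCN form assumed here, not necessarily the patent's exact variant.

```python
import math

def matmul(X, Y):
    # Plain-Python matrix product for small demonstration matrices.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def gcn_layer(A, H, W, b):
    """One hidden layer in the form of claim 5:
    H_out = ReLU(A_hat . H . W + b), with A_hat = D^(-1/2) A D^(-1/2)."""
    n = len(A)
    deg = [sum(row) for row in A]            # node degrees in A
    A_hat = [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]              # symmetric degree normalization
    Z = matmul(matmul(A_hat, H), W)
    return [[max(0.0, Z[i][j] + b[j])       # add bias, apply ReLU
             for j in range(len(b))] for i in range(n)]

# Toy 3-node dependency graph (a chain with self-loops) and 2-dim features.
A = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
H = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
W = [[1.0, 0.0],
     [0.0, 1.0]]   # identity weights keep the arithmetic easy to follow
b = [0.0, 0.0]

H1 = gcn_layer(A, H, W, b)
```

With identity weights and a zero bias, each output row is just the degree-weighted average of a node's neighborhood features, which makes the effect of the normalization directly visible.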
CN202210412719.0A (priority and filing date: 2022-04-19) · A semantic classification method and device · Active · granted as CN114722833B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210412719.0A | 2022-04-19 | 2022-04-19 | CN114722833B (en): A semantic classification method and device

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202210412719.0A | 2022-04-19 | 2022-04-19 | CN114722833B (en): A semantic classification method and device

Publications (2)

Publication Number | Publication Date
CN114722833A | 2022-07-08
CN114722833B (en) | 2024-07-23

Family

ID=82243009

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202210412719.0A (Active, granted as CN114722833B (en)) | A semantic classification method and device | 2022-04-19 | 2022-04-19

Country Status (1)

Country | Link
CN (1) | CN114722833B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN116127079A (en)* | 2023-04-20 | 2023-05-16 | 中电科大数据研究院有限公司 | Text classification method
CN117332789A (en)* | 2023-12-01 | 2024-01-02 | 诺比侃人工智能科技(成都)股份有限公司 | Semantic analysis method and system for dialogue scene
CN118861269A (en)* | 2024-09-27 | 2024-10-29 | 贵阳康养职业大学 | A health data screening method and device based on AI

Citations (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111274405A (en)* | 2020-02-26 | 2020-06-12 | 北京工业大学 | A text classification method based on GCN
WO2021042503A1 (en)* | 2019-09-06 | 2021-03-11 | 平安科技(深圳)有限公司 | Information classification extraction method, apparatus, computer device and storage medium
US20210248425A1 (en)* | 2020-02-12 | 2021-08-12 | Nec Laboratories America, Inc. | Reinforced text representation learning
CN113254637A (en)* | 2021-05-07 | 2021-08-13 | 山东师范大学 | Grammar-fused aspect-level text emotion classification method and system
CN113641820A (en)* | 2021-08-10 | 2021-11-12 | 福州大学 | Visual angle level text emotion classification method and system based on graph convolution neural network
CN113869034A (en)* | 2021-09-29 | 2021-12-31 | 重庆理工大学 | Aspect emotion classification method based on reinforced dependency graph
CN114218922A (en)* | 2021-12-17 | 2022-03-22 | 重庆理工大学 | An Aspect Sentiment Analysis Method Based on Two-Channel Graph Convolutional Networks

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
WO2021042503A1 (en)* | 2019-09-06 | 2021-03-11 | 平安科技(深圳)有限公司 | Information classification extraction method, apparatus, computer device and storage medium
US20210248425A1 (en)* | 2020-02-12 | 2021-08-12 | Nec Laboratories America, Inc. | Reinforced text representation learning
CN111274405A (en)* | 2020-02-26 | 2020-06-12 | 北京工业大学 | A text classification method based on GCN
CN113254637A (en)* | 2021-05-07 | 2021-08-13 | 山东师范大学 | Grammar-fused aspect-level text emotion classification method and system
CN113641820A (en)* | 2021-08-10 | 2021-11-12 | 福州大学 | Visual angle level text emotion classification method and system based on graph convolution neural network
CN113869034A (en)* | 2021-09-29 | 2021-12-31 | 重庆理工大学 | Aspect emotion classification method based on reinforced dependency graph
CN114218922A (en)* | 2021-12-17 | 2022-03-22 | 重庆理工大学 | An Aspect Sentiment Analysis Method Based on Two-Channel Graph Convolutional Networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FLOOD SUNG et al.: "Learning to compare: Relation network for few-shot learning", arXiv:1711.06025 [cs.CV], 11 June 2018*
WANG Haitao; SONG Wen; WANG Hui: "A text classification method based on a hybrid LSTM and CNN model" (一种基于LSTM和CNN混合模型的文本分类方法), Journal of Chinese Computer Systems (小型微型计算机系统), no. 06, 29 May 2020*

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN116127079A (en)* | 2023-04-20 | 2023-05-16 | 中电科大数据研究院有限公司 | Text classification method
CN116127079B (en)* | 2023-04-20 | 2023-06-20 | 中电科大数据研究院有限公司 | A Text Classification Method
WO2024216804A1 (en)* | 2023-04-20 | 2024-10-24 | 中电科大数据研究院有限公司 | Text classification method
CN117332789A (en)* | 2023-12-01 | 2024-01-02 | 诺比侃人工智能科技(成都)股份有限公司 | Semantic analysis method and system for dialogue scene
CN118861269A (en)* | 2024-09-27 | 2024-10-29 | 贵阳康养职业大学 | A health data screening method and device based on AI

Also Published As

Publication number | Publication date
CN114722833B (en) | 2024-07-23

Similar Documents

Publication | Title
CN107229610B (en) | A kind of emotional data analysis method and device
US11176323B2 (en) | Natural language processing using an ontology-based concept embedding model
US10606946B2 (en) | Learning word embedding using morphological knowledge
CN110442718B (en) | Statement processing method and device, server and storage medium
CN107943784B (en) | Generative Adversarial Network-Based Relation Extraction Method
CN106557563B (en) | Method and device for query sentence recommendation based on artificial intelligence
CN112560912A (en) | Method and device for training classification model, electronic equipment and storage medium
US11443209B2 (en) | Method and system for unlabeled data selection using failed case analysis
CN110457708B (en) | Vocabulary mining method and device based on artificial intelligence, server and storage medium
CN114595327B (en) | Data enhancement method and device, electronic equipment and storage medium
CN114722833A (en) | Semantic classification method and device
CN109325229B (en) | Method for calculating text similarity by utilizing semantic information
CN108280061A (en) | Text handling method based on ambiguity entity word and device
CN106502985A (en) | A kind of neural network modeling approach and device for generating title
CN112100401B (en) | Knowledge graph construction method, device, equipment and storage medium for science and technology services
CN111581954A (en) | A method and device for text event extraction based on grammatical dependency information
CN111274790A (en) | Text-level event embedding method and device based on syntactic dependency graph
CN113780418B (en) | Data screening method, system, equipment and storage medium
CN112925904A (en) | Lightweight text classification method based on Tucker decomposition
JP6291443B2 (en) | Connection relationship estimation apparatus, method, and program
CN112214595A (en) | Category determination method, device, equipment and medium
CN114297399A (en) | Knowledge graph generation method, system, storage medium and electronic device
CN114328894A (en) | Document processing method, document processing device, electronic equipment and medium
CN117033961A (en) | Multi-mode image-text classification method for context awareness
CN113568914B (en) | A data processing method, device, equipment and storage medium

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
