CN116450781A


Info

Publication number: CN116450781A
Application number: CN202210001960.4A
Authority: CN
Inventors: 段德峰; 谢军; 马维晶; 刘虹; 王璐; 黄丽云; 马建军; 张珍; 张松蕾; 胡建村
Original assignee: China Mobile Communications Group Co Ltd; China Mobile Information Technology Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; China Mobile Information Technology Co Ltd
Priority date: 2022-01-04
Filing date: 2022-01-04
Publication date: 2023-07-18

Abstract

The invention provides a question-and-answer processing method and device, and relates to the technical field of computers. The method comprises: determining the question-and-answer mode of a target user statement according to a Task scene preset rule and a dialogue state detection rule; when the question-and-answer mode does not belong to the Task scene mode and is a non-multi-round dialogue mode, inputting the target user statement into a question-answer fusion model and determining the matching result between the target user statement and the question-answer fusion model; and when the question-and-answer mode belongs to the Task scene mode or is a multi-round dialogue mode, inputting the target user statement into an enhanced intent recognition algorithm and determining the recognition result of the target user statement. By applying to each question-and-answer mode the processing approach suited to its scenario, and by optimizing the recognition algorithms, the method and device process multiple question-and-answer scenarios efficiently.

Description

Translated from Chinese

Question-and-answer processing method and device

Technical field

The present invention relates to the field of computer technology, and in particular to a question-and-answer processing method and device.

Background

Existing question-and-answer processing generally applies algorithms such as classification, clustering, or similarity matching to recognize the intent of the question in a user statement, and then retrieves a reply from a knowledge corpus accumulated by experts.

Existing question-and-answer processing methods are mainly used for everyday knowledge retrieval and knowledge question-and-answer scenarios. Their processing logic is simple, and they support only a single corpus and a single scenario. They offer little support for settings in which QA question answering, knowledge-graph question answering, task-based question answering, and basic chit-chat question answering coexist, and so struggle with complex, comprehensive scenarios. They also lack the computing capability and efficiency needed for complex questions and multi-round dialogue.

A method that overcomes the limited functionality, single-scenario support, and low efficiency of existing question-and-answer processing is therefore of great significance.

Summary of the invention

The present invention provides a question-and-answer processing method and device to solve the technical problem that prior-art question-and-answer processing cannot be both efficient and accurate across multiple different scenarios.

In a first aspect, the present invention provides a question-and-answer processing method, including:

determining the question-and-answer mode of a target user statement according to a Task scene preset rule and a dialogue state detection rule;

when the question-and-answer mode does not belong to the Task scene mode and is a non-multi-round dialogue mode, inputting the target user statement into a question-answer fusion model, and determining the matching result between the target user statement and the question-answer fusion model;

when the question-and-answer mode belongs to the Task scene mode or is a multi-round dialogue mode, inputting the target user statement into an enhanced intent recognition algorithm, and determining the recognition result of the target user statement;

wherein the question-answer fusion model is obtained by fusing a knowledge-graph KBQA question-answering algorithm with a question-answer-pair QA question-answering algorithm.

In one embodiment, determining the question-and-answer mode of the target user statement according to the Task scene preset rule and the dialogue state detection rule includes:

determining the Task scene preset rule according to a preset keyword dictionary and preset regular expressions;

determining the dialogue state detection rule according to the multi-round dialogue management policies built into the Rasa natural language framework;

matching the target user statement against the Task scene preset rule to determine the scene mode within the question-and-answer mode of the target user statement, and matching the target user statement against the dialogue state detection rule to determine the dialogue mode within the question-and-answer mode of the target user statement.

In one embodiment, inputting the target user statement into the enhanced intent recognition algorithm and determining the recognition result of the target user statement includes:

obtaining training samples and performing data augmentation on them;

training a BERT language representation model on the augmented training samples to obtain a trained BERT model;

inputting the target user statement into the trained BERT model to obtain the sentence vector of the target user statement;

obtaining a sparse vector from the target user statement through manually defined feature engineering;

concatenating the sentence vector and the sparse vector, and inputting the concatenated vector into a fully connected layer and a softmax layer to determine the recognition result of the target user statement.

In one embodiment, performing data augmentation on the training samples includes:

marking the word slots in the training samples that are irrelevant to intent classification;

generating training negative samples from manually determined seed negative samples and the word slots that are irrelevant to intent classification;

segmenting the training samples into words, and replacing the unmarked word slots obtained after segmentation with synonyms;

oversampling the synonym-replaced samples to balance the number of samples across categories, obtaining balanced samples;

using the training negative samples and the balanced samples as the augmented training samples.
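The augmentation steps above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: whitespace splitting stands in for Chinese word segmentation, the synonym table and the `{slot}` template format for seed negatives are hypothetical, and keeping the original samples alongside their synonym variants before oversampling is a choice made for this sketch.

```python
import random
from collections import Counter

# Hypothetical synonym table; a real system would load a domain thesaurus.
SYNONYMS = {"start": ["launch", "boot"], "check": ["query", "inspect"]}


def replace_synonyms(tokens, irrelevant_slots, rng):
    """Swap unmarked tokens for a synonym; marked slots are left untouched."""
    out = []
    for tok in tokens:
        if tok not in irrelevant_slots and tok in SYNONYMS:
            out.append(rng.choice(SYNONYMS[tok]))
        else:
            out.append(tok)
    return out


def oversample(samples, rng):
    """Duplicate random minority-class samples until all classes are equal in size."""
    by_label = {}
    for text, label in samples:
        by_label.setdefault(label, []).append((text, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


def augment(samples, irrelevant_slots, seed_negatives, rng):
    """Return training negatives plus class-balanced, synonym-augmented samples."""
    # Negatives: fill each seed template with each slot marked as intent-irrelevant.
    negatives = [(seed.format(slot=slot), "negative")
                 for seed in seed_negatives
                 for slot in sorted(irrelevant_slots)]
    # Synonym variants (whitespace split is a stand-in for word segmentation).
    variants = [(" ".join(replace_synonyms(text.split(), irrelevant_slots, rng)), label)
                for text, label in samples]
    balanced = oversample(samples + variants, rng)
    return negatives + balanced
```

Marking slots such as host names as irrelevant keeps them out of synonym replacement, so the augmented samples still vary only in the words that carry the intent.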

In one embodiment, inputting the target user statement into the question-answer fusion model and determining the matching result between the target user statement and the question-answer fusion model includes:

inputting the target user statement into the KBQA question-answering algorithm and the QA question-answering algorithm in parallel;

determining the KBQA matching result of the target user statement with the KBQA question-answering algorithm, and determining the QA matching result of the target user statement with the QA question-answering algorithm;

according to a preset priority, taking whichever of the KBQA matching result and the QA matching result has the higher priority as the matching result of the question-answer fusion model;

wherein matching in the QA question-answering algorithm is determined by cosine similarity.
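A rough sketch of this fusion step is given below. Everything concrete in it is an assumption made for illustration: the tiny in-memory graph and question-answer pairs, the bag-of-words vectors used for cosine similarity, the 0.8 threshold, and the choice to rank KBQA above QA (the text only says the priority is preset).

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical knowledge sources for the sketch.
GRAPH = {("linux123", "status"): "running"}
QA_PAIRS = {"how do i reset my password": "Use the self-service portal."}
PRIORITY = ("kbqa", "qa")  # assumed order; the preset priority is not specified


def cosine_similarity(a, b):
    """Cosine similarity between bag-of-words vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def kbqa_answer(statement):
    """Toy graph lookup: succeed when an entity and an attribute both appear."""
    tokens = statement.lower().split()
    for (entity, attribute), value in GRAPH.items():
        if entity in tokens and attribute in tokens:
            return f"{entity} {attribute}: {value}"
    return None


def qa_answer(statement, threshold=0.8):
    """Return the answer of the most similar stored question, if similar enough."""
    best = max(QA_PAIRS, key=lambda q: cosine_similarity(statement, q))
    if cosine_similarity(statement, best) >= threshold:
        return QA_PAIRS[best]
    return None


def fused_answer(statement):
    """Run both algorithms in parallel and keep the highest-priority match."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {"kbqa": pool.submit(kbqa_answer, statement),
                   "qa": pool.submit(qa_answer, statement)}
        results = {name: future.result() for name, future in futures.items()}
    for name in PRIORITY:
        if results[name] is not None:
            return name, results[name]
    return None  # neither algorithm matched
```

Returning `None` when neither branch matches leaves room for the caller to fall back to another recognizer.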

In one embodiment, when the target user statement does not match the question-answer fusion model,

the target user statement is input into the enhanced intent recognition algorithm to determine the recognition result of the target user statement;

when the enhanced intent recognition algorithm cannot recognize the target user statement, an answer is output according to a default answer strategy.
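The fallback order in this embodiment can be illustrated with a short sketch. The two callables and the default reply text are hypothetical placeholders; each callable is assumed to return `None` when it cannot handle the statement.

```python
def answer_with_fallback(statement, fusion_model, intent_recognizer,
                         default_reply="Sorry, I did not understand that."):
    """Try the fusion model, then intent recognition, then a default answer."""
    match = fusion_model(statement)
    if match is not None:
        return match
    intent = intent_recognizer(statement)
    if intent is not None:
        return intent
    # Neither component produced a result: apply the default answer strategy.
    return default_reply
```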

In one embodiment, when the enhanced intent recognition algorithm cannot recognize the target user statement,

the recognition rate of the enhanced intent recognition algorithm is adjusted according to a preset correction rule, and an answer is output.

In a second aspect, the present invention further provides a question-and-answer processing device, including:

a question-and-answer mode determination module, configured to determine the question-and-answer mode of a target user statement according to a Task scene preset rule and a dialogue state detection rule;

a question-answer fusion model matching module, configured to, when the question-and-answer mode does not belong to the Task scene mode and is a non-multi-round dialogue mode, input the target user statement into a question-answer fusion model and determine the matching result between the target user statement and the question-answer fusion model;

an enhanced intent recognition algorithm matching module, configured to, when the question-and-answer mode belongs to the Task scene mode or is a multi-round dialogue mode, input the target user statement into an enhanced intent recognition algorithm and determine the recognition result of the target user statement.

In a third aspect, the present invention further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of any of the question-and-answer processing methods described above.

In a fourth aspect, the present invention further provides a computer program product, including a computer program which, when executed by a processor, implements the steps of any of the question-and-answer processing methods described above.

The question-and-answer processing method, device, electronic device, and storage medium provided by the present invention judge the question-and-answer mode of the target user statement and apply to each mode the processing approach suited to its scenario, thereby processing multiple question-and-answer scenarios efficiently. In addition, knowledge-based questions are handled by a fusion model built on the KBQA and QA question-answering algorithms, and Task scene questions are handled by the enhanced intent recognition algorithm; this optimizes the conventional algorithms and improves processing efficiency and accuracy.

Description of the drawings

To illustrate the technical solutions of the present invention or of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic flow chart of the question-and-answer processing method provided by the present invention;

FIG. 2 is a flow chart of Rasa information processing provided by the present invention;

FIG. 3 is a structural diagram of the enhanced intent recognition algorithm provided by the present invention;

FIG. 4 is a flow chart of sample data augmentation provided by the present invention;

FIG. 5 is a flow chart of the fusion rules provided by the present invention;

FIG. 6 is a schematic flow chart of an application of the question-and-answer processing method provided by the present invention;

FIG. 7 is a schematic structural diagram of the question-and-answer processing device provided by the present invention;

FIG. 8 is a schematic structural diagram of the electronic device provided by the present invention.

Detailed description

To make the purpose, technical solutions, and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

FIG. 1 is a schematic flow chart of the question-and-answer processing method provided by the present invention. Referring to FIG. 1, the method may include:

S110. Determine the question-and-answer mode of a target user statement according to a Task scene preset rule and a dialogue state detection rule.

S120. When the question-and-answer mode does not belong to the Task scene mode and is a non-multi-round dialogue mode, input the target user statement into a question-answer fusion model, and determine the matching result between the target user statement and the question-answer fusion model.

S130. When the question-and-answer mode belongs to the Task scene mode or is a multi-round dialogue mode, input the target user statement into an enhanced intent recognition algorithm, and determine the recognition result of the target user statement.
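The branching between S120 and S130 reduces to a single condition on the two flags produced in S110. The sketch below shows only that decision; the `QAMode` type and the returned branch names are hypothetical and introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAMode:
    """Result of S110: the scene mode and the dialogue mode of a statement."""
    is_task_scene: bool
    is_multi_turn: bool


def route(mode: QAMode) -> str:
    """Pick the processing branch for a statement (S120 vs. S130)."""
    if mode.is_task_scene or mode.is_multi_turn:
        return "intent_recognition"  # S130
    return "fusion_model"            # S120
```

Only a statement that is neither a Task scene statement nor part of a multi-round dialogue reaches the fusion model.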

The question-and-answer processing method provided by the present invention may be executed by an electronic device, or by a component, integrated circuit, or chip in an electronic device. The electronic device may be mobile or non-mobile. For example, a mobile electronic device may be a mobile phone, tablet computer, notebook computer, palmtop computer, in-vehicle electronic device, wearable device, ultra-mobile personal computer (UMPC), netbook, or personal digital assistant (PDA); a non-mobile electronic device may be a server, network attached storage (NAS), personal computer (PC), television (TV), teller machine, or self-service machine. The present invention places no specific limitation on this.

The technical solution of the present invention is described in detail below, taking a computer executing the question-and-answer processing method as an example.

It should be noted that question answering generally falls into four types: Task scene question answering, basic chit-chat question answering, knowledge-graph KBQA question answering, and QA question answering. Task scene question answering performs a specific operation in response to task information, such as executing a command to turn on a light. Basic chit-chat question answering integrates basic question-and-answer functions for conversational small talk. Knowledge-graph KBQA and QA question answering are both knowledge-based. KBQA answers questions by querying the data in a graph database through the knowledge graph: for example, for "query the status of linux123", the entity (linux123) and its category are recognized first, and a graph query then produces the result. QA question answering relies on a large set of predefined question-answer pairs and determines the result by matching against them.

In step S110, the predefined Task scene rules are used to determine whether the question-and-answer mode of the target user statement is a Task scene, and the dialogue state detection rule is used to determine whether it is a multi-round dialogue mode.

Whether a statement belongs to a Task scene is determined by the predefined Task scene rules, which are implemented through a pre-built Task scene expert library. Common task-related content, such as "execute ** command", "create ** task", and "start ** service", is entered into the expert library, and the target user statement is compared against the library's entries to decide whether it carries task information.

Optionally, whether the target user statement is in a multi-round dialogue mode can be determined with the help of the Rasa framework. Rasa has several built-in multi-round dialogue management policies, such as the commonly used "form" policy, which make multi-round dialogues easy to design. The Rasa framework is used to determine whether the target user statement is part of a multi-round dialogue.

It can be understood that judging the question-and-answer mode of the target user statement provides a preliminary screening of the statement; processing the statement according to the screening result improves the efficiency of question-and-answer recognition.

In step S120, when step S110 has determined that the question-and-answer mode of the target user statement does not belong to the Task scene mode and is a non-multi-round dialogue mode, the target user statement is input into the question-answer fusion model, and the matching result between the target user statement and the question-answer fusion model is determined.

When the target user statement does not belong to the Task scene mode and is not part of a multi-round dialogue, it is very likely a knowledge-based question, so it can be input into the question-answer fusion model for knowledge-based question-and-answer detection.

In step S130, when the question-and-answer mode is determined to be the Task scene mode or a multi-round dialogue mode, the target user statement is input directly into the enhanced intent recognition algorithm, and the recognition result of the target user statement is determined.

It can be understood that, when the question-and-answer mode is the Task scene mode or a multi-round dialogue mode, the target user statement is input into the constructed data-augmented intent recognition algorithm to determine its recognition result. The data-augmented intent recognition algorithm improves recognition accuracy.

The KBQA (Knowledge Base Question Answering) algorithm performs semantic understanding and parsing of the input statement and then queries and reasons over a knowledge base to obtain the answer. A knowledge base is a special kind of database used for knowledge management; it supports the collection, organization, and extraction of knowledge in a given domain. The knowledge it holds is the set of domain knowledge needed to solve problems, including basic facts, rules, and other related information. The QA question-answering algorithm performs similarity retrieval over the entries in the knowledge base and returns the answer to the question that meets the matching threshold.

Optionally, the KBQA question-answering algorithm and the QA question-answering algorithm can be fused according to a preset fusion rule to obtain the question-answer fusion model. The target user statement is input into both the KBQA algorithm and the QA algorithm in the fusion model, and the output of whichever algorithm matches better is taken as the output of the question-answer fusion model.

The question-and-answer processing method provided by the present invention judges the question-and-answer mode of the target user statement and applies to each mode the processing approach suited to its scenario, thereby processing multiple question-and-answer scenarios efficiently. In addition, knowledge-based questions are handled by a fusion model built on the KBQA and QA question-answering algorithms, and Task scene questions are handled by the enhanced intent recognition algorithm; this optimizes the conventional algorithms and improves processing efficiency and accuracy.

In one embodiment, determining the question-and-answer mode of the target user statement according to the Task scene preset rule and the dialogue state detection rule includes: determining the Task scene preset rule according to a preset keyword dictionary and preset regular expressions; determining the dialogue state detection rule according to the multi-round dialogue management policies built into the Rasa natural language framework; matching the target user statement against the Task scene preset rule to determine the scene mode within the question-and-answer mode of the target user statement; and matching the target user statement against the dialogue state detection rule to determine the dialogue mode within the question-and-answer mode of the target user statement.

Specifically, the Task scene preset rule is determined from a preset keyword dictionary and preset regular expressions and is used to match Task scene statements. The input target user statement is matched against the Task scene preset rule to decide whether it is a Task scene question-and-answer statement.
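A Task scene rule of this kind could look roughly like the sketch below. The keyword set and the three patterns are hypothetical examples; the patterns are English renderings of the "execute ** command", "create ** task", and "start ** service" templates, and a real expert library would hold the domain's own entries.

```python
import re

# Hypothetical keyword dictionary of task verbs.
TASK_KEYWORDS = {"reboot", "deploy"}

# Hypothetical patterns modelled on the "execute ** command" style templates.
TASK_PATTERNS = [
    re.compile(r"\b(?:execute|run)\b.*\bcommand\b", re.IGNORECASE),
    re.compile(r"\bcreate\b.*\btask\b", re.IGNORECASE),
    re.compile(r"\bstart\b.*\bservice\b", re.IGNORECASE),
]


def is_task_scene(statement: str) -> bool:
    """Return True when the statement hits the keyword dictionary or a pattern."""
    tokens = set(re.findall(r"\w+", statement.lower()))
    if tokens & TASK_KEYWORDS:
        return True
    return any(pattern.search(statement) for pattern in TASK_PATTERNS)
```

Because both checks are cheap string operations, this screening can run before any model is invoked.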

Next, the method also detects whether the current target user statement is in a multi-round dialogue state. This detection can be performed with the Rasa natural language framework. Rasa has several built-in multi-round dialogue management policies, such as the commonly used "form" policy, which make multi-round dialogues easy to design. The Rasa framework can therefore be used to determine whether the current exchange is in a multi-round dialogue mode.

Specifically, Rasa's information processing flow is shown in FIG. 2. The target user statement is first received and sent to the Interpreter parsing module, which converts it into a dictionary containing the original text, the intent, the entities, and so on. The dictionary is then passed to the Tracker module, which records the dialogue state and tracks the progress of the dialogue. The Policy module receives the Tracker's current state and selects a suitable Action module to execute based on that state. The Action sends information back to the Tracker so that the current state is recorded, and also returns the output.
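The Interpreter, Tracker, Policy, and Action flow can be mimicked with a toy pipeline. The sketch below does not use Rasa's actual API; every class and function in it is a hypothetical stand-in that only mirrors the data flow described above, with an "active form" flag serving as the multi-round dialogue indicator.

```python
from dataclasses import dataclass, field
from typing import Optional


def interpret(text):
    """Stand-in Interpreter: turn raw text into a dict of text, intent, entities."""
    intent = "restart_service" if "restart" in text.lower() else "unknown"
    entities = [w for w in text.split() if any(c.isdigit() for c in w)]
    return {"text": text, "intent": intent, "entities": entities}


@dataclass
class Tracker:
    """Stand-in Tracker: records events and whether a form is still open."""
    history: list = field(default_factory=list)
    active_form: Optional[str] = None

    def update(self, event):
        self.history.append(event)

    def in_multi_turn(self):
        return self.active_form is not None


def action_ask_host(tracker):
    tracker.active_form = "restart_form"  # open a form: dialogue is now multi-round
    return "Which host should be restarted?"


def action_finish_form(tracker):
    entities = tracker.history[-1]["entities"]
    if not entities:
        return "Which host should be restarted?"
    tracker.active_form = None  # slot filled: close the form
    return f"Restarting {entities[0]}."


def action_default(tracker):
    return "Sorry, I did not understand that."


def policy(tracker):
    """Stand-in Policy: choose an action from the tracker's current state."""
    if tracker.in_multi_turn():
        return action_finish_form
    if tracker.history[-1]["intent"] == "restart_service":
        return action_ask_host
    return action_default


def handle(text, tracker):
    tracker.update(interpret(text))
    action = policy(tracker)
    reply = action(tracker)
    tracker.update({"action": action.__name__, "reply": reply})  # record the action
    return reply
```

In this toy version, checking `tracker.in_multi_turn()` before routing a new statement plays the role of the dialogue state detection rule.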

The target user statement is matched against the Rasa-based dialogue state detection rule to determine whether its question-and-answer mode is a multi-round dialogue mode.

It can be understood that this layered detection approach, which judges the question-and-answer mode first and then recognizes the intent of the target user statement, speeds up intent recognition while still covering multiple scenarios.

The question-and-answer processing method provided by the present invention determines the Task scene preset rule and the dialogue state detection rule, uses them to judge the question-and-answer mode of the target user statement, and applies to each mode the processing approach suited to its scenario, thereby processing multiple question-and-answer scenarios efficiently.

在一个实施例中，将所述目标用户语句输入增强型意图识别算法，确定所述目标用户语句的识别结果，包括：获取训练样本，对所述训练样本进行数据增强；根据数据增强后的训练样本，训练语言表征模型bert模型，得到训练后的bert模型；将所述目标用户语句输入所述训练后的bert模型，得到所述目标用户语句的句向量；将所述目标用户语句通过人工设定的特征工程获得稀疏向量；拼接所述句向量和所述稀疏向量，并将拼接后得到的向量输入全连接层与softmax层，确定所述目标用户语句的识别结果。In one embodiment, inputting the target user sentence into an enhanced intention recognition algorithm, and determining the recognition result of the target user sentence includes: obtaining training samples, and performing data enhancement on the training samples; Sample, training language representation model bert model, obtains the bert model after training; The bert model after described target user sentence input described training, obtains the sentence vector of described target user sentence; Described target user sentence is artificially set The sparse vector is obtained through predetermined feature engineering; the sentence vector and the sparse vector are spliced, and the spliced vector is input into the fully connected layer and the softmax layer to determine the recognition result of the target user sentence.

本发明提出一种增强型意图识别算法，根据增强型意图识别算法对目标用户语句的意图进行识别，确定目标用户语句的识别结果，算法的结构图如图3本发明提供的增强型意图识别算法结构图所示。The present invention proposes an enhanced intention recognition algorithm. According to the enhanced intention recognition algorithm, the intention of the target user's sentence is recognized, and the recognition result of the target user's sentence is determined. The structure diagram of the algorithm is shown in Figure 3. The enhanced intention recognition algorithm provided by the present invention shown in the structure diagram.

In terms of network structure, the enhanced intent recognition algorithm splits the input into two branches. The left branch inputs the target user sentence into the trained language representation model BERT to obtain the sentence vector of the target user sentence; the BERT model is trained on the augmented training samples. The right branch obtains a sparse vector from the target user sentence through manually designed feature engineering. The feature engineering can select features that help improve intent recognition, as shown in Table 1:

Table 1. Feature engineering settings

Feature1 | Feature2 | Feature3 | Feature4 | ...
Matches the ** regex | Matches the IP regex | Contains a word from a given dictionary | Starts with ** | ...
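The right-branch feature engineering can be illustrated with a minimal sketch. The concrete patterns, the dictionary, and the function name `sparse_features` are assumptions made for illustration only; the patent elides the actual rules behind the "**" placeholders.

```python
import re
from typing import List

# Hypothetical feature rules in the spirit of Table 1. The patterns and the
# dictionary below are illustrative assumptions, not taken from the patent.
CREATE_GROUP_PATTERN = re.compile(r"^create .+ group$")
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
CLUSTER_WORDS = {"hadoop", "spark", "cluster"}


def sparse_features(sentence: str) -> List[int]:
    """Return a 0/1 vector with one entry per hand-written feature."""
    text = sentence.lower().strip()
    tokens = set(text.split())
    return [
        int(bool(CREATE_GROUP_PATTERN.search(text))),  # Feature1: matches a given regex
        int(bool(IP_PATTERN.search(text))),            # Feature2: matches the IP regex
        int(bool(tokens & CLUSTER_WORDS)),             # Feature3: contains a dictionary word
        int(text.startswith("how")),                   # Feature4: starts with a given prefix
    ]


print(sparse_features("How to restart the hadoop node 10.0.0.1"))
```

Each position of the resulting vector is an independent yes/no signal, which is why the vector is sparse for most sentences.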

The sentence vector from the left branch and the sparse vector from the right branch are concatenated. The right-hand vector can be understood as adding auxiliary classification features to the vector output by BERT on the left. The concatenated vector is then fed into a fully connected layer and a softmax layer, which completes the network structure of the enhanced intent recognition model.

The target user sentence is input into the constructed enhanced intent recognition model to obtain the corresponding sentence vector and sparse vector; the two vectors are concatenated, and the concatenated vector is fed into the fully connected layer and the softmax layer to determine the recognition result of the target user sentence. In an actual production environment, the enhanced intent recognition algorithm noticeably improved intent recognition for user sentences that were previously hard to recognize.
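The concatenation, fully connected layer, and softmax layer can be sketched in plain Python. This is a toy illustration under stated assumptions: the three-element `sentence_vec` stands in for a real BERT sentence vector, and the weights are made-up numbers rather than trained parameters.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(sentence_vec, sparse_vec, weights, bias):
    """Concatenate both branches, apply one fully connected layer, then softmax."""
    features = list(sentence_vec) + list(sparse_vec)  # concatenation of the two branches
    logits = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Stand-in values: a 3-dim "sentence vector" and a 2-dim sparse vector.
sentence_vec = [0.2, -0.1, 0.4]
sparse_vec = [1, 0]
weights = [
    [0.5, 0.0, 0.5, 2.0, 0.0],   # intent 0 leans heavily on the first sparse feature
    [0.1, 0.3, -0.2, 0.0, 1.0],  # intent 1 leans on the second sparse feature
]
bias = [0.0, 0.0]
probs = classify(sentence_vec, sparse_vec, weights, bias)
print(probs)
```

In this toy setup the first sparse feature fires, so the first intent receives most of the probability mass, which mirrors how the hand-written features are meant to assist the dense vector.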

By improving on a conventional intent recognition algorithm to construct the enhanced intent recognition algorithm, the question-and-answer processing method provided by the present invention improves both the efficiency and the accuracy of intent recognition for the target user sentence.

In one embodiment, performing data augmentation on the training samples includes: marking the word slots in the training samples that are irrelevant to intent classification; generating training negative samples from manually determined seed negative samples and the word slots irrelevant to intent classification; segmenting the training samples into words and replacing the unmarked word slots among the resulting tokens with synonyms; oversampling the samples after synonym replacement to balance the number of samples across categories, yielding balanced samples; and using the training negative samples and the balanced samples as the augmented training samples.

Optionally, the steps of data augmentation on the training samples may be as shown in FIG. 4, the sample data augmentation flowchart provided by the present invention. The steps may include data marking, negative sample generation, data augmentation, and oversampling.

Data marking: when the original intent training samples are prepared, the word slots irrelevant to intent classification are marked. These word slots do not bear on intent classification, but they vary frequently and so interfere with training the intent recognition model. For example, in "创建[集群001]的群组" ("create a group for [cluster 001]"), "集群001" is marked with square brackets so that it can be identified automatically during negative sample generation in the next step.

Negative sample generation: negative samples fall into two parts, seed negative samples and automatically generated negative samples. Seed negative samples are specially designed samples compiled manually when the training samples are written; they are easily confused with the positive intent categories. Automatically generated negative samples are the word slots marked as irrelevant to intent classification in the data marking step, which are automatically added to the negative sample category. In this way, when the enhanced intent recognition algorithm is trained, the model can learn word weights in the samples accurately, lowering the weight of word slots in the positive categories while raising the weight of keywords.
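The marking and negative sample generation steps can be sketched as follows. The square-bracket convention comes from the example in the text; the function name `build_negative_samples` and the sample sentences are hypothetical.

```python
import re

# Word slots are marked with square brackets, as in "create a group for [cluster 001]".
SLOT_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def build_negative_samples(training_samples, seed_negatives):
    """Combine hand-written seed negatives with the slot values marked in the samples."""
    negatives = list(seed_negatives)
    for sample in training_samples:
        for slot_value in SLOT_PATTERN.findall(sample):
            if slot_value not in negatives:  # keep each negative only once
                negatives.append(slot_value)
    return negatives


samples = [
    "create a group for [cluster 001]",
    "delete the group of [cluster 002]",
    "create a group for [cluster 001]",
]
print(build_negative_samples(samples, ["group creation fee"]))
```

Adding the bare slot values to the negative category is what pushes the model to give them little weight when they appear inside positive samples.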

Data augmentation: categories with few samples are expanded through synonym replacement. In the implementation, synonyms are substituted in the samples after word segmentation; the word slots marked in the data marking step are not replaced. For example, the original target user input "如何启动hadoop" ("how to start hadoop") is segmented into "如何", "启动", and "hadoop", and synonym replacement yields "怎么", "开启", and "hadoop".
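A minimal sketch of synonym replacement on an already segmented sentence follows, reusing the example tokens from the text. The synonym table and the `protected` parameter (standing in for the slots marked during data marking) are assumptions; word segmentation itself is taken as given.

```python
from itertools import product

# Hypothetical synonym table built from the example in the text.
SYNONYMS = {"如何": ["怎么"], "启动": ["开启"]}


def augment(tokens, protected=frozenset()):
    """Return every variant obtained by swapping unprotected tokens for synonyms.

    The original sentence is included as the first variant.
    """
    choices = []
    for token in tokens:
        if token in protected:
            choices.append([token])  # marked word slots are never replaced
        else:
            choices.append([token] + SYNONYMS.get(token, []))
    return ["".join(combo) for combo in product(*choices)]


print(augment(["如何", "启动", "hadoop"]))
```

With one synonym for each of two tokens, a single sample expands to four variants, which is how small categories grow.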

Oversampling: after synonym-based augmentation the samples are balanced in most cases, but for some categories the words have too few synonyms, so the category still has relatively few samples after augmentation. Oversampling can be used to increase the number of samples in these categories.
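The text does not say which oversampling method is used. One simple option, shown here purely as an assumption, is random duplication of minority-class samples until every category matches the largest one.

```python
import random
from collections import defaultdict


def oversample(samples, seed=0):
    """Duplicate random samples of smaller classes until all classes are the same size.

    `samples` is a list of (text, label) pairs.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append(text)
    target = max(len(texts) for texts in by_label.values())
    balanced = []
    for label, texts in by_label.items():
        extra = [rng.choice(texts) for _ in range(target - len(texts))]
        balanced.extend((text, label) for text in texts + extra)
    return balanced


data = [("a", "x"), ("b", "x"), ("c", "x"), ("d", "y")]
balanced = oversample(data)
print(len(balanced))
```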

By augmenting the training samples and using them to train the enhanced intent recognition model, which then handles Task scene questions and answers, the question-and-answer processing method provided by the present invention improves the accuracy of the enhanced intent recognition algorithm.

In one embodiment, inputting the target user sentence into the question-answer fusion model and determining the matching result of the target user sentence against the question-answer fusion model includes: inputting the target user sentence into the KBQA question-answering algorithm and the QA question-answering algorithm in parallel; determining a KBQA matching result of the target user sentence according to the KBQA question-answering algorithm, and a QA matching result according to the QA question-answering algorithm; and, according to a preset priority, taking whichever of the KBQA matching result and the QA matching result has the higher priority as the matching result of the question-answer fusion model; wherein matching in the QA question-answering algorithm is determined based on cosine similarity.

Optionally, the KBQA question-answering algorithm and the QA question-answering algorithm both retrieve knowledge stored in a knowledge base: each performs a similarity search over the stored knowledge and returns the answers that meet a threshold.

The KBQA question-answering algorithm is based on semantic parsing: it performs named entity recognition and relation extraction on the input target user sentence, and then looks up the resulting entities and relations in a graph database to retrieve the answer.

The QA question-answering algorithm uses ES (ElasticSearch, similarity search) to retrieve similar questions. In addition to the inverted index, ES can match similar vectors using TF-IDF-based vectorization together with cosine similarity.

The similarity algorithm uses cosine similarity, computed as follows:

$$\cos\theta = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}$$

where cosθ is the cosine similarity value, a is the vector of the input target user sentence, and b is the vector of each candidate sentence stored in the library.

In actual deployment, vectors a and b can first be normalized to unit length, so that the denominator in the formula equals 1 and only the numerator needs to be computed. The closer the cosine value is to 1, the more similar the two vectors are, and the higher the similarity between the target user sentence and the stored candidate sentence. After receiving the user input, ES first finds several candidate questions through the inverted index, then computes a similarity value for each using the cosine similarity formula, and finally sorts them in descending order to find the most similar match.
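The normalize-then-dot-product shortcut and the descending sort can be sketched in a few lines. This is a standalone illustration, not ElasticSearch code; the candidate texts and vectors are made up, and in practice the candidate vectors would be normalized once when they are stored.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rank_candidates(query_vec, candidates):
    """Return (similarity, text) pairs sorted from most to least similar.

    With unit vectors, cosine similarity reduces to the dot product.
    """
    q = normalize(query_vec)
    scored = [(dot(q, normalize(vec)), text) for text, vec in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


candidates = [
    ("how to stop hadoop", [0.0, 1.0]),
    ("how to start hadoop", [2.0, 0.0]),
    ("how to restart hadoop", [1.0, 1.0]),
]
ranking = rank_candidates([1.0, 0.0], candidates)
print(ranking[0])
```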

Optionally, Python multithreading can be used to input the target user sentence into the KBQA algorithm and the QA algorithm in parallel, so that both produce their results at the same time. In an actual production environment, KBQA and QA queries each take about 300 ms to 400 ms. Executed serially, the two queries take about 700 ms to 800 ms in total, whereas the parallel query takes only about 400 ms, which substantially shortens the question-answer response time.
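One way to issue the two queries concurrently is the standard-library `concurrent.futures.ThreadPoolExecutor`; the patent only says "Python multithreading", so this choice is an assumption. The functions `query_kbqa` and `query_qa` are hypothetical stand-ins that sleep briefly to simulate network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def query_kbqa(sentence):
    """Hypothetical stand-in for the graph-database lookup."""
    time.sleep(0.05)  # simulated I/O latency
    return {"source": "kbqa", "answer": "answer from the graph", "score": 0.80}


def query_qa(sentence):
    """Hypothetical stand-in for the ElasticSearch similar-question lookup."""
    time.sleep(0.05)  # simulated I/O latency
    return {"source": "qa", "answer": "answer from a QA pair", "score": 0.70}


def query_both(sentence):
    """Run both lookups on worker threads and wait for both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        kbqa_future = pool.submit(query_kbqa, sentence)
        qa_future = pool.submit(query_qa, sentence)
        return kbqa_future.result(), qa_future.result()


kbqa_result, qa_result = query_both("how to start hadoop")
print(kbqa_result["source"], qa_result["source"])
```

Threads suit this case because both calls spend their time waiting on I/O, so the total latency approaches that of the slower query rather than the sum of the two.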

After the parallel queries return, fusion rules for the question-answer fusion model are needed to pick the better-matching answer, as shown in FIG. 5, the fusion rule flowchart provided by the present invention. A priority for answer selection can be set; the ranking criteria can be determined from the relative performance of the two algorithms (KBQA and QA) and from the quality and size of the knowledge graph and question-answer pair data. Because the KBQA algorithm goes through the two major steps of named entity recognition and relation extraction, its overall question-answer matching is better than that of the QA algorithm. When both algorithms return a result, the knowledge graph (KBQA) result is preferred as the answer. However, when the QA algorithm matches a result with very high similarity, the QA result should be preferred, because the answers in question-answer pairs are generally compiled manually and such a match is generally more reasonable than the result returned by the KBQA algorithm.
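The priority rule described above can be written as a small function. The result format and the override threshold of 0.95 are assumptions; the text only says "very high similarity" without giving a number.

```python
def fuse(kbqa_result, qa_result, qa_override=0.95):
    """Pick one answer from the two algorithms.

    Each result is either None (no match) or a dict with "answer" and "score".
    `qa_override` is a hypothetical threshold for a very-high-similarity QA match.
    """
    if qa_result is not None and qa_result["score"] >= qa_override:
        return qa_result      # a near-exact QA match beats the graph answer
    if kbqa_result is not None:
        return kbqa_result    # otherwise the graph answer has priority
    return qa_result          # may be None: nothing matched at all


kbqa = {"answer": "graph answer", "score": 0.80}
print(fuse(kbqa, {"answer": "qa answer", "score": 0.70})["answer"])
print(fuse(kbqa, {"answer": "qa answer", "score": 0.99})["answer"])
```

A `None` return value corresponds to the case where the sentence does not match the fusion model and is handed on to intent recognition.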

The question-and-answer processing method provided by the present invention fuses the KBQA question-answering algorithm with the QA question-answering algorithm to obtain the question-answer fusion model, which enables high-performance processing of knowledge-based questions and answers.

In one embodiment, when the target user sentence does not match the question-answer fusion model, the target user sentence is input into the enhanced intent recognition algorithm to determine the recognition result of the target user sentence; when the enhanced intent recognition algorithm cannot recognize the target user sentence, an answer is output according to a default answer strategy.

Specifically, when it is determined that the target user sentence does not match the question-answer fusion model, the target user sentence is input into the enhanced intent recognition algorithm to further determine its recognition result. If the enhanced intent recognition algorithm cannot recognize the target user sentence, an answer can be output according to the default answer strategy.

It can be understood that a target user sentence that does not match the question-answer fusion model is very likely basic chit-chat, so the enhanced intent recognition algorithm can be used to recognize its intent.

Optionally, for example, if the user types a string of random characters such as "asdfgh", the FallbackPolicy of the rasa framework can return a preset default answer, such as "I don't understand what you mean yet; please put it another way".
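The idea behind a default answer strategy can be shown with a generic confidence threshold. This sketch does not use the rasa API; the function name, the intent names, and the threshold of 0.6 are all illustrative assumptions.

```python
DEFAULT_ANSWER = "I don't understand what you mean yet; please put it another way"


def answer_or_fallback(intent_probs, responses, threshold=0.6):
    """Return the response for the top intent, or the default answer if unsure.

    `intent_probs` maps intent names to probabilities; `responses` maps intent
    names to reply texts. `threshold` is a hypothetical confidence cut-off.
    """
    intent, confidence = max(intent_probs.items(), key=lambda item: item[1])
    if confidence < threshold or intent not in responses:
        return DEFAULT_ANSWER
    return responses[intent]


responses = {"greet": "Hello!", "create_group": "Creating the group."}
print(answer_or_fallback({"greet": 0.34, "create_group": 0.33, "other": 0.33}, responses))
```

A random string such as "asdfgh" tends to produce a flat probability distribution, so no intent clears the threshold and the default answer is returned.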

The question-and-answer processing method provided by the present invention determines the question-answer mode of the target user sentence and applies the processing approach corresponding to that specific mode, thereby handling multiple question-answer scenarios efficiently. When the enhanced intent recognition algorithm cannot recognize the target user sentence, an answer is output according to the default answer strategy, so that a response can still be configured for cases where the user's intent cannot be recognized.

In one embodiment, when the enhanced intent recognition algorithm cannot recognize the target user sentence, the recognition rate of the enhanced intent recognition algorithm is adjusted according to preset correction rules, and an answer is output.

Optionally, when it is determined that the enhanced intent recognition algorithm cannot recognize the target user sentence, the recognition rate of the algorithm is corrected based on the preset correction rules, recognition is performed again with the corrected recognition rate, and the recognized answer is output. The correction is meant to handle edge-case sentences such as "创建执行重启hadoop集群作业群组" ("create a group for the restart-hadoop-cluster job"). Judged directly, the model sometimes assigns such a sentence a low recognition rate, even though it is a standard "创建**群组" ("create ** group") question; correcting the recognition rate allows the re-recognized result to be output.
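The text does not specify how the correction rules are expressed. One plausible reading, sketched here as an assumption, is a list of patterns that raise the confidence of a matching intent to a floor value when the sentence fits a standard template.

```python
import re

# Hypothetical correction rules: (template pattern, intent name, confidence floor).
CORRECTION_RULES = [
    (re.compile(r"create .+ group"), "create_group", 0.95),
]


def correct_confidence(sentence, intent, confidence):
    """Raise the confidence when the sentence fits the template of the predicted intent."""
    text = sentence.lower().strip()
    for pattern, rule_intent, floor in CORRECTION_RULES:
        if intent == rule_intent and pattern.fullmatch(text):
            return max(confidence, floor)
    return confidence


print(correct_confidence("Create execute restart hadoop cluster job group", "create_group", 0.40))
```

The rule only lifts a confidence that the model already assigned to the matching intent; it never changes which intent was predicted.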

By adjusting the recognition rate of the enhanced intent recognition algorithm according to the preset correction rules, the question-and-answer processing method provided by the present invention can still output an answer when the enhanced intent recognition algorithm cannot recognize the target user sentence, achieving comprehensive coverage in recognizing target user sentences.

The technical solution provided by the present invention is described below, taking FIG. 6, a flow diagram of an application of the question-and-answer processing method provided by the present invention, as an example:

The processing architecture for the target user sentence integrates detection of four question-answer scenarios: the Task scene, KBQA question answering, QA question answering, and chit-chat. The overall architecture follows a layered, modular design and can be divided into four parts, part1 to part4.

part1 defines two functions for Task scene questions and answers: Task scene detection and dialogue state detection. The target user sentence is matched against the Task scene preset rules and the dialogue state detection rules to judge whether it falls under the Task scene rules or whether the current dialogue is in the middle of a multi-round dialogue.

part2 contains the question-answer fusion model composed of knowledge graph KBQA question answering and QA question answering. When the question-answer mode does not belong to the Task scene mode and is a non-multi-round dialogue mode, the target user sentence is passed to both components in parallel for KBQA and QA matching.

part3 handles chit-chat sentences and those Task scene sentences that failed to satisfy the preset rules in part1. It does so by embedding the enhanced intent recognition algorithm, which recognizes the target user sentences left unmatched in part2. A default answer strategy is also configured here, so that when the user's intent ultimately cannot be recognized, the default answer strategy completes the response.

part4 receives the target user sentences that satisfy the part1 conditions, that is, the question-answer mode belongs to the Task scene mode or is a multi-round dialogue mode. Intent recognition is performed with the same enhanced intent recognition model as in part3. In addition, an accuracy correction step is added, so that every sentence passing through this module subsequently meets the set threshold.
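The four parts can be tied together in a short routing sketch. All callables passed in (`is_task_scene`, `fusion_model`, `intent_model`) are hypothetical placeholders for the components described above, and the `correct` flag is an assumed way of switching the accuracy correction on for part4.

```python
DEFAULT_ANSWER = "I don't understand what you mean yet; please put it another way"


def route(sentence, in_multi_round, is_task_scene, fusion_model, intent_model):
    """Return (part name, result) following the part1 to part4 flow."""
    # part1: Task scene detection and dialogue state detection.
    if is_task_scene(sentence) or in_multi_round:
        # part4: intent recognition with accuracy correction enabled.
        return "part4", intent_model(sentence, correct=True)
    # part2: KBQA + QA fusion model; None means no match.
    match = fusion_model(sentence)
    if match is not None:
        return "part2", match
    # part3: intent recognition for chit-chat, then the default answer.
    result = intent_model(sentence, correct=False)
    return "part3", result if result is not None else DEFAULT_ANSWER


# Toy stand-ins for the real components.
is_task = lambda s: s.startswith("create")
fusion = lambda s: "42" if "answer" in s else None
intent = lambda s, correct: "intent:greet" if s == "hello" else None

print(route("what is the answer", False, is_task, fusion, intent))
```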

The present invention also provides a question-and-answer processing device; the device and the question-and-answer processing method described above may be referred to in correspondence with each other.

FIG. 7 is a schematic structural diagram of the question-and-answer processing device provided by the present invention. As shown in FIG. 7, the device includes:

a question-answer mode determination module 710, configured to determine the question-answer mode of the target user sentence according to the Task scene preset rules and the dialogue state detection rules;

a question-answer fusion model matching module 720, configured to, when the question-answer mode does not belong to the Task scene mode and is a non-multi-round dialogue mode, input the target user sentence into the question-answer fusion model and determine the matching result of the target user sentence against the question-answer fusion model;

an enhanced intent recognition algorithm matching module 730, configured to, when the question-answer mode belongs to the Task scene mode or is a multi-round dialogue mode, input the target user sentence into the enhanced intent recognition algorithm and determine the recognition result of the target user sentence.

The question-and-answer processing device provided by the present invention determines the question-answer mode of the target user sentence and applies the processing approach corresponding to that specific mode, thereby handling multiple question-answer scenarios efficiently. In addition, knowledge-based questions and answers are handled by the fusion model based on the KBQA and QA question-answering algorithms, and Task scene questions and answers are handled by the enhanced intent recognition algorithm, which optimizes the conventional algorithms and improves processing efficiency and accuracy.

In one embodiment, the question-answer mode determination module 710 is specifically configured for:

determining the question-answer mode of the target user sentence according to the Task scene preset rules and the dialogue state detection rules, including:

In one embodiment, the question-answer fusion model matching module 720 is specifically configured for:

inputting the target user sentence into the enhanced intent recognition algorithm and determining the recognition result of the target user sentence, including:

In one embodiment, the question-answer fusion model matching module 720 is further specifically configured for:

performing data augmentation on the training samples, including:

inputting the target user sentence into the question-answer fusion model and determining the matching result of the target user sentence against the question-answer fusion model, including:

In one embodiment, the enhanced intent recognition algorithm matching module 730 is specifically configured for:

when the target user sentence does not match the question-answer fusion model,

In one embodiment, the enhanced intent recognition algorithm matching module 730 is further specifically configured for:

when the enhanced intent recognition algorithm cannot recognize the target user sentence,

The present invention also provides an electronic device. As shown in FIG. 8, the electronic device may include a processor 810, a communication interface 820, a memory 830, and a communication bus 840, where the processor 810, the communication interface 820, and the memory 830 communicate with one another through the communication bus 840. The processor 810 may invoke logic instructions in the memory 830 to execute the steps of the question-and-answer processing method, for example including:

In addition, the logic instructions in the memory 830 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

In another aspect, the present invention also provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium, and the computer program includes program instructions. When the program instructions are executed by a computer, the computer can execute the steps of the question-and-answer processing method provided by the above method embodiments, for example including:

In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the steps of the question-and-answer processing method provided by the above method embodiments, for example including:

The device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware. Based on this understanding, the above technical solution in essence, or the part of it that contributes over the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.