Technical Field
The present invention relates to the field of artificial intelligence, and in particular to a knowledge question answering method based on a domain knowledge graph, a corresponding apparatus, and an electronic device.
Background
At present, in financial service scenarios (for example, financial market trading), pre-trade research, data analysis, and trading decisions must be carried out manually. It is often impossible to quickly extract useful information from massive amounts of data and to integrate and analyze it in a timely manner, which makes decisions difficult and inaccurate and in turn harms business efficiency and growth.
With the release of generative models (for example, ChatGPT, short for Chat Generative Pre-trained Transformer, a chatbot program), natural language understanding and generation capabilities have improved significantly, and the quality of intelligent knowledge question answering and reasoning services has improved considerably. However, current generative models still suffer from unreliable logical reasoning and low factuality of generated results, and they cannot provide professional, accurate answers to domain-specific questions (for example, in the financial domain).
In the related art, intelligent knowledge question answering typically uses one of the following approaches: (1) intelligent question answering and reasoning based on a knowledge graph; (2) question-answer pair matching; and (3) intelligent question answering based on a generative model, where:
(1) Intelligent question answering and reasoning based on a knowledge graph uses knowledge graph technology to build a general-purpose or domain-specific knowledge graph and thereby implement knowledge-reasoning question answering applications.
Figure 1 is a schematic diagram of an optional knowledge-graph-based intelligent question answering and reasoning scheme according to the related art. As shown in Figure 1, the scheme includes a question analysis module, a question answering module, and an answer generation module. The question analysis module covers question classification and NLP (Natural Language Processing) techniques: the input question is first classified, and NLP techniques are then used for keyword extraction, semantic analysis, and similar processing. The question answering module covers pattern matching and knowledge question answering: it semantically interprets and parses the data passed from the question analysis module, then queries and reasons over the knowledge base to derive answers. The answer generation module scores the candidate answers according to the data passed from the question analysis module and selects the best answer.
(2) Question-answer pair matching relies on a question-answer library and matches answers by computing semantic similarity.
(3) Intelligent question answering based on a generative model uses a pre-trained model to perform intent recognition and semantic analysis on the context, the user question, and other information, and then generates an answer.
Figure 2 is a schematic diagram of an optional generative-model-based intelligent question answering scheme according to the related art. As shown in Figure 2, the model includes intent analysis, semantic analysis, and answer generation modules. The question is input into the model, intent analysis and semantic analysis are performed on it, and the answer generation module then produces the answer.
However, the intelligent knowledge question answering approaches in the related art have the following problems. (1) Model-based question answering and reasoning yields unreliable results with low controllability. On the one hand, the model is pre-trained mainly on self-collected, self-labeled data; if the training samples are imbalanced, the model may exhibit bias and fairness problems. On the other hand, in domain-specific scenarios the pre-training data lacks sufficient professional samples (or contains very few samples for some domains), and the information the model collects from the internet may itself be non-factual. As a result, the content the model generates has low reliability and cannot give truly professional, dependable answers to domain questions. Labeling new domain-specific professional samples and injecting them into the model for pre-training incurs high labor and computing costs and still cannot guarantee that the model achieves the expected results in every sub-scenario of the domain. (2) Knowledge-graph-based question answering and reasoning is hard to re-architect and not easy to modify or adjust for new data or scenarios; it also has weak reasoning ability and high graph construction cost.
No effective solution to the above problems has yet been proposed.
Summary of the Invention
Embodiments of the present invention provide a knowledge question answering method based on a domain knowledge graph, a corresponding apparatus, and an electronic device, so as to at least solve the technical problem in the related art that knowledge question answering and reasoning over questions has low accuracy.
According to one aspect of the embodiments of the present invention, a knowledge question answering method based on a domain knowledge graph is provided, including: receiving a target question and processing the target question to obtain question prompt information; extracting a target entity set or a target relationship set based on the target question, where the target entity set includes multiple target entities and the target relationship set includes multiple target relationships; when the target entity set is extracted, retrieving, based on the target entity set, triple information matching the target entities from a preset domain knowledge graph, or, when the target relationship set is extracted, retrieving, based on the target relationship set, triple information matching the target relationships from the preset domain knowledge graph, to obtain a triple information set; and constructing input knowledge information based on the triple information set and the question prompt information, inputting the input knowledge information into a preset reasoning model, and outputting a target answer to the target question.
Optionally, the step of processing the target question to obtain the question prompt information includes: constructing a question prompt template, where the question prompt template includes a question instruction; and, based on the question prompt template, adding the question instruction to the target question to generate the question prompt information.
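The template step can be illustrated with a minimal Python sketch. The instruction wording, the `Question:` label, and the function name `build_question_prompt` are assumptions made for illustration; the disclosure does not specify them.

```python
# Hypothetical instruction text; the disclosure does not give the wording.
QUESTION_INSTRUCTION = "Answer the following question using domain knowledge."


def build_question_prompt(target_question: str,
                          instruction: str = QUESTION_INSTRUCTION) -> str:
    """Add the question instruction to the target question."""
    return f"{instruction}\nQuestion: {target_question.strip()}"


if __name__ == "__main__":
    print(build_question_prompt("Who issues BondX?"))
```

Keeping the instruction as a parameter lets different question types use different instructions without changing the function.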
Optionally, the step of extracting the target entity set or the target relationship set based on the target question includes: performing word segmentation on the target question to obtain multiple segmented words; analyzing the segmented words to determine the word type of each; when a word type is a first preset type, determining the segmented word with that word type to be a target entity, or, when a word type is a second preset type, determining the segmented word with that word type to be a target relationship, to obtain the target relationship set; when none of the word types is the first preset type or the second preset type, determining context information of the target question and, based on the context information, supplementing the target entity corresponding to the target question; and generating the target entity set based on all the target entities.
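A toy Python sketch of this extraction step follows. The lexicon, the type labels `ENTITY` and `RELATION` (standing in for the first and second preset types), and the regex tokenizer (standing in for a real word segmenter) are all illustrative assumptions, not part of the disclosure.

```python
import re

# Hypothetical lexicon mapping words to word types (illustrative only).
LEXICON = {
    "acme": "ENTITY", "bondx": "ENTITY",
    "issues": "RELATION", "holds": "RELATION",
}


def _tokens(text):
    # Stand-in for word segmentation; a real system would use a segmenter.
    return re.findall(r"\w+", text.lower())


def extract(question, context=()):
    """Return (target_entities, target_relations) for a question."""
    tokens = _tokens(question)
    entities = [t for t in tokens if LEXICON.get(t) == "ENTITY"]
    relations = [t for t in tokens if LEXICON.get(t) == "RELATION"]
    if not entities and not relations:
        # No word has either preset type: supplement entities from context.
        for turn in context:
            entities += [t for t in _tokens(turn) if LEXICON.get(t) == "ENTITY"]
    return entities, relations
```

Here `context` is assumed to be a sequence of earlier conversation turns; the disclosure only says that context information is used to supplement the target entity.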
Optionally, the triple information includes a subject entity, an object entity, and an entity relationship, and the step of retrieving, when the target entity set is extracted, triple information matching the target entities from the preset domain knowledge graph based on the target entity set to obtain the triple information set includes: determining a retrieval hop-count threshold and an initial retrieval hop count; searching the preset domain knowledge graph for a knowledge graph entity matching the target entity, where the knowledge graph entity is a subject entity or an object entity; when a first knowledge graph entity matching the target entity is found, updating the initial retrieval hop count to obtain a current retrieval hop count; when the current retrieval hop count is less than the retrieval hop-count threshold, determining, based on the entity relationship, a second knowledge graph entity associated with the first knowledge graph entity; updating the current retrieval hop count and continuing to determine, based on the entity relationship, a third knowledge graph entity associated with the second knowledge graph entity, until the current retrieval hop count is greater than or equal to the retrieval hop-count threshold, thereby obtaining a knowledge graph entity set; and determining, based on the knowledge graph entity set, the triple information to which each knowledge graph entity belongs, to obtain the triple information set.
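The hop-limited retrieval can be sketched in Python as a breadth-first expansion over an in-memory list of triples. This is one possible reading under stated assumptions: the graph is a plain list of `(subject, relation, object)` tuples, matching is exact string equality, the hop count is incremented by one per expansion, and the function name and default threshold are hypothetical.

```python
def retrieve_triples(triples, target_entities, hop_threshold=2):
    """Collect triples reachable from the target entities within a hop limit.

    triples: list of (subject, relation, object) tuples.
    """
    # Entities in the graph that match a target entity (as subject or object).
    frontier = {e for e in target_entities
                if any(e in (s, o) for s, _, o in triples)}
    collected = set(frontier)
    hops = 0                      # initial retrieval hop count
    if frontier:
        hops += 1                 # a matching entity was found: update the count
    while frontier and hops < hop_threshold:
        neighbours = set()
        for s, _, o in triples:
            if s in frontier:
                neighbours.add(o)
            if o in frontier:
                neighbours.add(s)
        frontier = neighbours - collected   # only newly reached entities
        collected |= frontier
        hops += 1
    # Every triple to which a collected entity belongs.
    return [t for t in triples if t[0] in collected or t[2] in collected]
```

A larger threshold pulls in more distant facts at the cost of a longer prompt, so the threshold acts as a knob between recall and input size.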
Optionally, the step of retrieving, when the target relationship set is extracted, triple information matching the target relationships from the preset domain knowledge graph based on the target relationship set to obtain the triple information set includes: searching the preset domain knowledge graph for an entity relationship matching the target relationship until the search succeeds or the number of searches reaches a preset retrieval threshold, to obtain a retrieval result; when the search succeeds, determining, based on the retrieval result, the target entity relationship matching the target relationship, to obtain a target entity relationship set; and determining, based on the target entity relationship set, the triple information to which each target entity relationship belongs, to obtain the triple information set.
Optionally, after retrieving, based on the target entity set, the triple information matching the target entities from the preset domain knowledge graph to obtain the triple information set, the method further includes: concatenating the subject entity, the object entity, and the entity relationship in each piece of triple information to obtain an answer text, where each piece of triple information has a relevance value indicating its association with the target entity; and sorting all the answer texts based on the relevance values to obtain an answer text set.
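As a rough Python illustration of this step, the sketch below joins each triple into a sentence-like string and orders the strings by relevance value. The subject-relation-object word order, the space separator, and the descending sort direction are assumptions; the disclosure only states that the texts are sorted based on the relevance value.

```python
def triples_to_answer_texts(scored_triples):
    """Turn scored triples into a ranked list of answer texts.

    scored_triples: list of ((subject, relation, object), relevance) pairs.
    """
    # Assumed ordering: most relevant first.
    ranked = sorted(scored_triples, key=lambda item: item[1], reverse=True)
    return [" ".join(triple) for triple, _ in ranked]
```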
Optionally, the step of constructing the input knowledge information based on the triple information set and the question prompt information includes: constructing an answer prompt template, where the answer prompt template includes an answer instruction; based on the answer prompt template, adding the answer instruction to each answer text in the answer text set to generate an answer prompt information set; and concatenating the question prompt information and the answer prompt information set to obtain the input knowledge information.
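A minimal Python sketch of the concatenation follows. The instruction text `Known fact:`, the newline separator, and the choice to place the question prompt before the answer prompts are illustrative assumptions that the disclosure does not fix.

```python
# Hypothetical answer instruction; the disclosure does not give the wording.
ANSWER_INSTRUCTION = "Known fact:"


def build_input_knowledge(question_prompt, answer_texts,
                          instruction=ANSWER_INSTRUCTION):
    """Prefix each answer text with the instruction, then join with the question."""
    answer_prompts = [f"{instruction} {text}" for text in answer_texts]
    # Assumed order: question prompt first, then the ranked answer prompts.
    return "\n".join([question_prompt, *answer_prompts])
```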
Optionally, the step of inputting the input knowledge information into the preset reasoning model and outputting the target answer to the target question includes: analyzing the question prompt information with the preset reasoning model to obtain an answer set, where the preset reasoning model is a reasoning model pre-trained on a training data set that includes a historical question set and a historical answer corresponding to each historical question in the historical question set; representing the input knowledge information as a preset condition and, based on the preset condition, determining a conditional probability value for each answer in the answer set; and determining the answer with the maximum conditional probability value to be the target answer.
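The selection rule amounts to choosing the answer a that maximizes P(a | input knowledge). The Python sketch below shows only that selection logic. The scorer `overlap_score` is a toy stand-in for the preset reasoning model, and normalizing scores with a softmax is an assumption; the disclosure does not state how the conditional probabilities are computed.

```python
import math


def overlap_score(answer, knowledge):
    """Toy stand-in for a model log-score: shared lowercase tokens."""
    return len(set(answer.lower().split()) & set(knowledge.lower().split()))


def select_target_answer(candidates, input_knowledge, log_score=overlap_score):
    """Return (best_answer, probability) given the input knowledge as condition."""
    scores = [log_score(a, input_knowledge) for a in candidates]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]      # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(candidates)), key=probs.__getitem__)
    return candidates[best], probs[best]
```

In a real system `log_score` would be replaced by the model's own log-likelihood of the answer given the prompt.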
According to another aspect of the embodiments of the present invention, a knowledge question answering apparatus based on a domain knowledge graph is further provided, including: a receiving unit, configured to receive a target question and process the target question to obtain question prompt information; an extraction unit, configured to extract a target entity set or a target relationship set based on the target question, where the target entity set includes multiple target entities and the target relationship set includes multiple target relationships; a retrieval unit, configured to, when the target entity set is extracted, retrieve, based on the target entity set, triple information matching the target entities from a preset domain knowledge graph, or, when the target relationship set is extracted, retrieve, based on the target relationship set, triple information matching the target relationships from the preset domain knowledge graph, to obtain a triple information set; and a construction unit, configured to construct input knowledge information based on the triple information set and the question prompt information, input the input knowledge information into a preset reasoning model, and output a target answer to the target question.
Optionally, the receiving unit includes: a first construction module, configured to construct a question prompt template, where the question prompt template includes a question instruction; and a first generation module, configured to add, based on the question prompt template, the question instruction to the target question to generate the question prompt information.
Optionally, the extraction unit includes: a first processing module, configured to perform word segmentation on the target question to obtain multiple segmented words; a first analysis module, configured to analyze the segmented words to determine the word type of each; a first determination module, configured to, when a word type is a first preset type, determine the segmented word with that word type to be a target entity, or, when a word type is a second preset type, determine the segmented word with that word type to be a target relationship, to obtain the target relationship set; a second determination module, configured to, when none of the word types is the first preset type or the second preset type, determine context information of the target question and, based on the context information, supplement the target entity corresponding to the target question; and a second generation module, configured to generate the target entity set based on all the target entities.
Optionally, the triple information includes a subject entity, an object entity, and an entity relationship, and the retrieval unit includes: a third determination module, configured to determine a retrieval hop-count threshold and an initial retrieval hop count; a first retrieval module, configured to search the preset domain knowledge graph for a knowledge graph entity matching the target entity, where the knowledge graph entity is a subject entity or an object entity; a first update module, configured to, when a first knowledge graph entity matching the target entity is found, update the initial retrieval hop count to obtain a current retrieval hop count; a fourth determination module, configured to, when the current retrieval hop count is less than the retrieval hop-count threshold, determine, based on the entity relationship, a second knowledge graph entity associated with the first knowledge graph entity; a second update module, configured to update the current retrieval hop count and continue to determine, based on the entity relationship, a third knowledge graph entity associated with the second knowledge graph entity, until the current retrieval hop count is greater than or equal to the retrieval hop-count threshold, thereby obtaining a knowledge graph entity set; and a fifth determination module, configured to determine, based on the knowledge graph entity set, the triple information to which each knowledge graph entity belongs, to obtain the triple information set.
Optionally, the retrieval unit further includes: a second retrieval module, configured to search the preset domain knowledge graph for an entity relationship matching the target relationship until the search succeeds or the number of searches reaches a preset retrieval threshold, to obtain a retrieval result; a sixth determination module, configured to, when the search succeeds, determine, based on the retrieval result, the target entity relationship matching the target relationship, to obtain a target entity relationship set; and a seventh determination module, configured to determine, based on the target entity relationship set, the triple information to which each target entity relationship belongs, to obtain the triple information set.
Optionally, the knowledge question answering apparatus further includes: a first concatenation module, configured to, after the triple information matching the target entities has been retrieved from the preset domain knowledge graph based on the target entity set to obtain the triple information set, concatenate the subject entity, the object entity, and the entity relationship in each piece of triple information to obtain an answer text, where each piece of triple information has a relevance value indicating its association with the target entity; and a first sorting module, configured to sort all the answer texts based on the relevance values to obtain an answer text set.
Optionally, the construction unit includes: a second construction module, configured to construct an answer prompt template, where the answer prompt template includes an answer instruction; a third generation module, configured to add, based on the answer prompt template, the answer instruction to each answer text in the answer text set to generate an answer prompt information set; and a first splicing module, configured to concatenate the question prompt information and the answer prompt information set to obtain the input knowledge information.
Optionally, the construction unit further includes: a second analysis module, configured to analyze the question prompt information with the preset reasoning model to obtain an answer set, where the preset reasoning model is a reasoning model pre-trained on a training data set that includes a historical question set and a historical answer corresponding to each historical question in the historical question set; an eighth determination module, configured to represent the input knowledge information as a preset condition and, based on the preset condition, determine a conditional probability value for each answer in the answer set; and a ninth determination module, configured to determine the answer with the maximum conditional probability value to be the target answer.
According to another aspect of the embodiments of the present invention, a computer-readable storage medium is further provided. The computer-readable storage medium includes a stored computer program, where, when the computer program runs, the device on which the computer-readable storage medium resides is controlled to execute any one of the above knowledge question answering methods based on a domain knowledge graph.
According to another aspect of the embodiments of the present invention, an electronic device is further provided, including one or more processors and a memory, the memory being configured to store one or more programs, where, when the one or more programs are executed by the one or more processors, the one or more processors implement any one of the above knowledge question answering methods based on a domain knowledge graph.
In the present disclosure, a target question is received and processed to obtain question prompt information; a target entity set or a target relationship set is extracted based on the target question; when the target entity set is extracted, triple information matching the target entities is retrieved from the preset domain knowledge graph based on the target entity set, or, when the target relationship set is extracted, triple information matching the target relationships is retrieved from the preset domain knowledge graph based on the target relationship set, to obtain a triple information set; input knowledge information is constructed based on the triple information set and the question prompt information; and the input knowledge information is input into the preset reasoning model, which outputs the target answer to the target question. In other words, the received target question is first processed to obtain the question prompt information, and the target entity set or target relationship set is extracted from the target question, so that the matching triple information can be retrieved from the preset domain knowledge graph according to the target entity set or the target relationship set. The input knowledge information is then constructed from the resulting triple information set and the question prompt information and is input into the preset reasoning model to obtain the target answer to the target question. By incorporating the preset domain knowledge graph, the question can be turned into input knowledge information, and processing that input knowledge information with the preset reasoning model reduces bias in the model's reasoning and improves the accuracy of knowledge question answering and reasoning over questions, thereby solving the technical problem in the related art that such reasoning has low accuracy.
Brief Description of the Drawings
The drawings described here provide a further understanding of the present invention and form a part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Figure 1 is a schematic diagram of an optional knowledge-graph-based intelligent question answering and reasoning scheme according to the related art;
Figure 2 is a schematic diagram of an optional generative-model-based intelligent question answering scheme according to the related art;
Figure 3 is a flowchart of an optional knowledge question answering method based on a domain knowledge graph according to an embodiment of the present invention;
Figure 4 is a schematic diagram of an optional knowledge question answering and reasoning process based on a domain knowledge graph according to an embodiment of the present invention;
Figure 5 is a schematic diagram of an optional knowledge question answering apparatus based on a domain knowledge graph according to an embodiment of the present invention;
Figure 6 is a hardware structure block diagram of an electronic device (or mobile device) for the knowledge question answering method based on a domain knowledge graph according to an embodiment of the present invention.
Detailed Description
To help those skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
需要说明的是,本发明的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的本发明的实施例能够以除了在这里图示或描述的那些以外的顺序实施。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。It should be noted that the terms "first", "second", etc. in the description and claims of the present invention and the above-mentioned drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances so that the embodiments of the invention described herein are capable of being practiced in sequences other than those illustrated or described herein. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusions, e.g., a process, method, system, product, or apparatus that encompasses a series of steps or units and need not be limited to those explicitly listed. Those steps or elements may instead include other steps or elements not expressly listed or inherent to the process, method, product or apparatus.
To help those skilled in the art understand the present invention, some of the terms used in the embodiments are explained below:
Knowledge graph: a method that uses a graph model to describe knowledge and to model the relationships between entities; it consists of nodes and edges.
Language model: a model that produces a probability distribution over word sequences, that is, it assigns a text a probability indicating how likely that text is to occur.
It should be noted that the knowledge question answering method and device based on a domain knowledge graph in the present disclosure may be used for knowledge question answering based on a domain knowledge graph in the field of artificial intelligence, and may equally be used for that purpose in any other field. The present disclosure does not limit the field of application of the method and device.
It should be noted that the relevant information (including but not limited to user equipment information and user personal information) and data (including but not limited to data used for analysis, stored data, and displayed data) involved in the present disclosure are all information and data authorized by the user or fully authorized by all parties. The collection, use, and processing of such data must comply with the relevant laws, regulations, and standards of the relevant countries and regions, and corresponding operation entry points are provided so that the user can choose to grant or refuse authorization. For example, an interface is provided between this system and the relevant users or institutions: before relevant information is obtained, an acquisition request is sent to the user or institution through the interface, and the information is obtained only after consent is received from that user or institution.
The following embodiments of the present invention can be applied to various systems, applications, and devices that perform knowledge question answering based on a domain knowledge graph. The present invention proposes a knowledge question answering reasoning method based on a domain knowledge graph, which can address the low reliability of intelligent reasoning question answering results in the related art, as well as the bias and fairness problems in those results.
By using domain knowledge graph data assets, the present invention can make model reasoning results more controllable and reliable, reduce non-factual errors, improve the model's intelligent question answering reasoning ability, and make the model more applicable to scenarios such as intelligent customer service and virtual assistants. In addition, the present invention does not require the model to be retrained for each scenario type, so it can be applied flexibly to knowledge question answering reasoning in a variety of scenarios.
The present invention is described in detail below with reference to the embodiments.
Embodiment 1
According to an embodiment of the present invention, an embodiment of a knowledge question answering method based on a domain knowledge graph is provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
Figure 3 is a flowchart of an optional knowledge question answering method based on a domain knowledge graph according to an embodiment of the present invention. As shown in Figure 3, the method includes the following steps:
Step S301: receive a target question and process it to obtain question prompt information.
Step S302: based on the target question, extract a target entity set or a target relationship set, where the target entity set includes multiple target entities and the target relationship set includes multiple target relationships.
Step S303: when a target entity set is extracted, retrieve triple information matching the target entities from a preset domain knowledge graph based on the target entity set; or, when a target relationship set is extracted, retrieve triple information matching the target relationships from the preset domain knowledge graph based on the target relationship set; in either case, obtain a triple information set.
Step S304: construct input knowledge information based on the triple information set and the question prompt information, input the input knowledge information into a preset reasoning model, and output the target answer to the target question.
Through the above steps, a target question is received and processed to obtain question prompt information; a target entity set or a target relationship set is extracted based on the target question; when a target entity set is extracted, triple information matching the target entities is retrieved from the preset domain knowledge graph based on the target entity set, or, when a target relationship set is extracted, triple information matching the target relationships is retrieved from the preset domain knowledge graph based on the target relationship set, yielding a triple information set; input knowledge information is constructed based on the triple information set and the question prompt information and is input into the preset reasoning model, which outputs the target answer to the target question. In the embodiments of the present invention, the received target question may first be processed to obtain question prompt information, and the target entity set or target relationship set in the target question may be extracted, so that matching triple information can be retrieved from the preset domain knowledge graph according to that set. Input knowledge information is then constructed from the resulting triple information set and the question prompt information and input into the preset reasoning model to obtain the target answer. By incorporating the preset domain knowledge graph, the question is turned into input knowledge information, and processing that information with the preset reasoning model can reduce bias in the model's reasoning and improve the accuracy of knowledge question answering reasoning, thereby solving the technical problem in the related art that such reasoning has low accuracy.
The embodiments of the present invention are described in detail below with reference to each of the above steps.
Step S301: receive a target question and process it to obtain question prompt information.
Optionally, the step of processing the target question to obtain question prompt information includes: constructing a question prompt template, where the question prompt template includes a question instruction; and, based on the question prompt template, adding the question instruction to the target question to generate the question prompt information.
In the embodiments of the present invention, the target question that requires knowledge question answering is first received and then processed to obtain question prompt information. Specifically, a question prompt template is constructed that includes a question instruction (for example, "Please answer the following question"), and the question instruction is then added to the target question according to the template to generate the question prompt information. For example, if the target question x is "Who is the author of a certain book", the question prompt information x' generated from the template is "Please answer the following question: Who is the author of a certain book".
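The template step can be illustrated with a minimal Python sketch. The function name `build_question_prompt` and the constant `QUESTION_INSTRUCTION` are hypothetical names chosen for illustration; the instruction wording is the example given in the text, not a fixed part of the method.

```python
# Hypothetical instruction text; the description gives it only as an example.
QUESTION_INSTRUCTION = "Please answer the following question"


def build_question_prompt(question: str, instruction: str = QUESTION_INSTRUCTION) -> str:
    """Place the question instruction in front of the target question x to form x'."""
    return f"{instruction}: {question.strip()}"


if __name__ == "__main__":
    x = "Who is the author of a certain book"
    print(build_question_prompt(x))
```

Keeping the instruction as a parameter lets different scenarios supply their own template without changing the function.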
Step S302: based on the target question, extract a target entity set or a target relationship set, where the target entity set includes multiple target entities and the target relationship set includes multiple target relationships.
Optionally, the step of extracting the target entity set or the target relationship set based on the target question includes: performing word segmentation on the target question to obtain multiple segmented words; analyzing the segmented words to determine the word type of each; when a word type is a first preset type, determining the segmented word of that type as a target entity, or, when a word type is a second preset type, determining the segmented word of that type as a target relationship, to obtain the target relationship set; when no word type is the first preset type or the second preset type, determining context information of the target question and, based on the context information, supplementing the target entity corresponding to the target question; and generating the target entity set based on all target entities.
In the embodiments of the present invention, the content of the question or sentence (that is, the target question) may be extracted by entity linking or by a natural language model. The extracted content is an entity or a relationship: it may be a subject entity (a noun in the subject position), an object entity (a noun in the object position), or a relationship between entities (for example, a spousal relationship or a master-slave relationship). If no entity or relationship is present, a knowledge completion method may be used to supplement the entities in the question. In other words, a target entity set containing multiple target entities, or a target relationship set containing multiple target relationships, is extracted based on the target question. Specifically, the target question is first segmented into multiple words, and each segmented word is analyzed to determine its word type (for example, noun, verb, or adjective). If the word type is the first preset type (for example, a noun denoting the name of an object), the segmented word is determined as a target entity; if the word type is the second preset type (for example, a noun denoting a relationship), the segmented word is determined as a target relationship. If no word type is the first or second preset type, the context information of the target question is determined first, the target entity corresponding to the target question is supplemented based on the context information, and the target entity set is then generated from all target entities.
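The extraction logic above can be sketched as follows. This is a simplified illustration under stated assumptions: the two lexicons `ENTITY_LEXICON` and `RELATION_LEXICON` are hypothetical stand-ins for the first and second preset word types, segmentation is reduced to a regular expression, and the context fallback simply scans earlier text for known entities. A real system would use entity linking or a language model, as the description notes.

```python
import re
from typing import List, Tuple

# Hypothetical lexicons standing in for the first and second preset word types.
ENTITY_LEXICON = {"bookx", "banky"}
RELATION_LEXICON = {"author", "translator", "subsidiary"}


def tokenize(text: str) -> List[str]:
    """Very rough word segmentation: lowercase alphanumeric runs."""
    return re.findall(r"\w+", text.lower())


def word_type(token: str) -> str:
    """Classify a segmented word as 'entity', 'relation', or 'other'."""
    if token in ENTITY_LEXICON:
        return "entity"
    if token in RELATION_LEXICON:
        return "relation"
    return "other"


def extract(question: str, context: str = "") -> Tuple[List[str], List[str]]:
    """Return (target entities, target relationships) for a question."""
    entities: List[str] = []
    relations: List[str] = []
    for token in tokenize(question):
        kind = word_type(token)
        if kind == "entity" and token not in entities:
            entities.append(token)
        elif kind == "relation" and token not in relations:
            relations.append(token)
    if not entities and not relations:
        # Nothing of either preset type: supplement entities from the context.
        for token in tokenize(context):
            if word_type(token) == "entity" and token not in entities:
                entities.append(token)
    return entities, relations


if __name__ == "__main__":
    print(extract("Who is the author of BookX?"))
    print(extract("Who wrote it?", context="We were discussing BookX earlier."))
```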
Step S303: when a target entity set is extracted, retrieve triple information matching the target entities from the preset domain knowledge graph based on the target entity set; or, when a target relationship set is extracted, retrieve triple information matching the target relationships from the preset domain knowledge graph based on the target relationship set; in either case, obtain a triple information set.
Optionally, the triple information includes a subject entity, an object entity, and an entity relationship. When a target entity set is extracted, the step of retrieving triple information matching the target entities from the preset domain knowledge graph based on the target entity set to obtain the triple information set includes: determining a retrieval hop threshold and an initial retrieval hop count; retrieving, from the preset domain knowledge graph, knowledge graph entities matching a target entity, where a knowledge graph entity is a subject entity or an object entity; when a first knowledge graph entity matching the target entity is retrieved, updating the initial retrieval hop count to obtain a current retrieval hop count; when the current retrieval hop count is less than the retrieval hop threshold, determining, based on the entity relationships, a second knowledge graph entity associated with the first knowledge graph entity; updating the current retrieval hop count and continuing to determine, based on the entity relationships, a third knowledge graph entity associated with the second knowledge graph entity, until the current retrieval hop count is greater than or equal to the retrieval hop threshold, to obtain a knowledge graph entity set; and, based on the knowledge graph entity set, determining the triple information to which each knowledge graph entity belongs, to obtain the triple information set.
In the embodiments of the present invention, triples (subject entity, object entity, relationship) related to the entities extracted from the question may be retrieved from the domain knowledge graph, which is a knowledge graph built in advance from domain knowledge (for example, a financial domain knowledge graph). That is, triple information matching the target entities is retrieved from the preset domain knowledge graph based on the target entity set to obtain the triple information set. The triples retrieved from the knowledge graph serve as facts relevant to the input question. More than one triple may be retrieved, and each triple includes a subject entity, an object entity, and an entity relationship.
In the embodiments of the present invention, the size of the retrieval space affects the number of triples retrieved. The number of hops to retrieve outward from the question may therefore be set according to the task complexity of the question answering scenario. Because the retrieved triples may be irrelevant to the target question or too numerous, a symmetric knowledge retriever or an asymmetric retriever may be used for retrieval.
In the embodiments of the present invention, the retrieval hop threshold (which can be set as needed, for example, 2) and the initial retrieval hop count (for example, 0) are determined first. Knowledge graph entities matching the target entity are then retrieved from the preset domain knowledge graph, where a knowledge graph entity is a subject entity or an object entity. If a first knowledge graph entity matching the target entity is retrieved, the initial retrieval hop count is updated (that is, incremented by 1) to obtain the current retrieval hop count. If the current retrieval hop count is less than the retrieval hop threshold, a second knowledge graph entity associated with the first knowledge graph entity is determined from the corresponding entity relationship, and the current retrieval hop count is updated again. The process continues, determining a third knowledge graph entity associated with the second knowledge graph entity from the corresponding entity relationship, until the current retrieval hop count is greater than or equal to the retrieval hop threshold, which yields the knowledge graph entity set. The triple information to which each knowledge graph entity in the set belongs is then determined, giving the triple information set. The triple information K can be expressed as $K = \{(s_i, r_i, o_i)\}_{i=1}^{N}$, where $s_i$ denotes the subject entity, $r_i$ denotes the object entity, $o_i$ denotes the entity relationship, and $N$ denotes the number of triples.
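The hop-limited retrieval can be sketched as a breadth-first expansion. The sketch below is one possible reading of the procedure, not the definitive implementation: the toy graph `KG`, the `Triple` type, and the function `retrieve_by_entities` are all hypothetical, and the hop value returned with each triple (the smaller hop count of its two endpoints) is an added convenience for later ranking.

```python
import math
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


# Hypothetical toy domain knowledge graph.
KG = [
    Triple("BookX", "author", "A"),
    Triple("A", "employer", "UniversityU"),
    Triple("UniversityU", "located_in", "CityC"),
    Triple("BookY", "author", "D"),
]


def retrieve_by_entities(
    kg: List[Triple], target_entities: Iterable[str], hop_threshold: int = 2
) -> List[Tuple[Triple, int]]:
    """Collect entities within the hop threshold, then the triples they belong to."""
    hops = 0  # initial retrieval hop count
    frontier: Set[str] = {
        e for e in target_entities if any(e in (t.subject, t.obj) for t in kg)
    }
    if not frontier:
        return []
    hops += 1  # a first matching knowledge graph entity was found
    reached_at: Dict[str, int] = {e: hops for e in frontier}
    while hops < hop_threshold and frontier:
        neighbours: Set[str] = set()
        for t in kg:
            if t.subject in frontier and t.obj not in reached_at:
                neighbours.add(t.obj)
            if t.obj in frontier and t.subject not in reached_at:
                neighbours.add(t.subject)
        hops += 1
        for e in neighbours:
            reached_at[e] = hops
        frontier = neighbours
    result: List[Tuple[Triple, int]] = []
    for t in kg:
        hop = min(reached_at.get(t.subject, math.inf), reached_at.get(t.obj, math.inf))
        if hop != math.inf:
            result.append((t, int(hop)))
    return result


if __name__ == "__main__":
    for triple, hop in retrieve_by_entities(KG, ["BookX"], hop_threshold=2):
        print(hop, triple)
```

With a threshold of 2 the entity set contains the matched entity and its direct neighbours, so the threshold directly controls how large the retrieved fact set can grow.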
Optionally, when a target relationship set is extracted, the step of retrieving triple information matching the target relationships from the preset domain knowledge graph based on the target relationship set to obtain the triple information set includes: retrieving, from the preset domain knowledge graph, entity relationships matching a target relationship until retrieval succeeds or the number of retrieval attempts reaches a preset retrieval threshold, to obtain a retrieval result; when retrieval succeeds, determining, based on the retrieval result, the target entity relationships matching the target relationship, to obtain a target entity relationship set; and, based on the target entity relationship set, determining the triple information to which each target entity relationship belongs, to obtain the triple information set.
In the embodiments of the present invention, if a target relationship set is extracted from the target question, entity relationships matching the target relationship are retrieved from the preset domain knowledge graph. If retrieval succeeds, retrieval stops and the triple information to which the retrieved relationships belong is obtained: the target entity relationships matching the target relationship are determined from the retrieval result to give the target entity relationship set, and the triple information to which each target entity relationship belongs is determined to give the triple information set. If retrieval has still not succeeded when the number of retrieval attempts reaches the preset retrieval threshold (which can be set as needed, for example, 3), retrieval stops.
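A minimal sketch of the relationship-based retrieval follows. The description does not say what distinguishes one retrieval attempt from the next, so this sketch assumes each attempt tries a different surface form of the relationship taken from a hypothetical alias table `RELATION_ALIASES`; the names `Triple`, `KG`, and `retrieve_by_relation` are likewise illustrative.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


# Hypothetical toy graph and alias table (assumptions for illustration only).
KG = [
    Triple("BookX", "author", "A"),
    Triple("BookY", "author", "D"),
    Triple("A", "employer", "UniversityU"),
]
RELATION_ALIASES: Dict[str, List[str]] = {
    "writer": ["writer", "written_by", "author"],
}


def retrieve_by_relation(
    kg: List[Triple], target_relation: str, max_attempts: int = 3
) -> List[Triple]:
    """Try candidate relation names until one matches or the attempt limit is hit."""
    candidates = RELATION_ALIASES.get(target_relation, [target_relation])
    for attempt, name in enumerate(candidates, start=1):
        if attempt > max_attempts:
            break  # preset retrieval threshold reached without success
        matches = [t for t in kg if t.relation == name]
        if matches:
            return matches  # retrieval succeeded: stop and return the triples
    return []


if __name__ == "__main__":
    print(retrieve_by_relation(KG, "writer"))
    print(retrieve_by_relation(KG, "writer", max_attempts=2))
```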
Optionally, after triple information matching the target entities is retrieved from the preset domain knowledge graph based on the target entity set to obtain the triple information set, the method further includes: concatenating the subject entity, the object entity, and the entity relationship in each piece of triple information to obtain an answer text, where each piece of triple information has an association value indicating its association with the target entity; and sorting all answer texts by association value to obtain an answer text set.
In the embodiments of the present invention, the input to the reasoning model is text, so the question-related triples retrieved from the domain knowledge graph need to be converted into a variable-length text sequence. Specifically, the subject entity, entity relationship, and object entity of each triple may be concatenated linearly to generate knowledge text (that is, the answer text). In this embodiment, the association value between a piece of triple information and the target entity may be determined from the number of retrieval hops taken to reach that triple, and all answer texts are then sorted by association value to obtain the sorted answer text set.
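The verbalization and ranking step might look like the sketch below. It assumes that a smaller hop count means a stronger association (the description ties the association value to the hop count but does not fix the direction), and the function name `to_answer_texts` is hypothetical.

```python
from typing import List, Tuple

# A triple here is (subject entity, entity relationship, object entity).
Triple = Tuple[str, str, str]


def to_answer_texts(scored_triples: List[Tuple[Triple, int]]) -> List[str]:
    """Linearly join each triple into text, ordered by hop count (fewest first)."""
    # sorted() is stable, so triples with equal hop counts keep their input order.
    ranked = sorted(scored_triples, key=lambda pair: pair[1])
    return [" ".join(triple) for triple, _hop in ranked]


if __name__ == "__main__":
    retrieved = [
        (("A", "employer", "UniversityU"), 2),
        (("BookX", "author", "A"), 1),
    ]
    print(to_answer_texts(retrieved))
```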
Step S304: construct input knowledge information based on the triple information set and the question prompt information, input the input knowledge information into the preset reasoning model, and output the target answer to the target question.
In the embodiments of the present invention, the answer text set k is converted into an answer prompt information set k', and the answer prompt information set is prepended to the question prompt information to obtain the input knowledge information. The input knowledge information is then input into the preset reasoning model, which generates the answer and returns the final question answering result (that is, the target answer).
Optionally, the step of constructing the input knowledge information based on the triple information set and the question prompt information includes: constructing an answer prompt template, where the answer prompt template includes an answer instruction; based on the answer prompt template, adding the answer instruction to each answer text in the answer text set to generate the answer prompt information set; and concatenating the question prompt information and the answer prompt information set to obtain the input knowledge information.
In the embodiments of the present invention, an answer prompt template is constructed first, which includes an answer instruction (for example, "The answer to this question may refer to the following"). The answer instruction is then added to the answer texts in the answer text set according to the template to generate the answer prompt information set. For example, if the answer text set k is "the author of a certain book is A, the author of a certain book is A+B, the translator of a certain book is C", the answer prompt information set k' generated from the template is "The answer to this question may refer to the following: the author of a certain book is A, the author of a certain book is A+B, the translator of a certain book is C". The question prompt information and the answer prompt information set are then concatenated to obtain the input knowledge information [x′,k′], where [] denotes concatenation.
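A short sketch of the prompt assembly is given below. The function names and the newline separator are assumptions made for illustration. The concatenation order follows the notation [x′,k′], with the question prompt first; swap the arguments if the knowledge should precede the question.

```python
from typing import List

# Hypothetical instruction text; the description gives it only as an example.
ANSWER_INSTRUCTION = "The answer to this question may refer to the following"


def build_answer_prompt(answer_texts: List[str], instruction: str = ANSWER_INSTRUCTION) -> str:
    """Turn the answer text set k into the answer prompt information k'."""
    return f"{instruction}: " + ", ".join(answer_texts)


def build_input_knowledge(question_prompt: str, answer_prompt: str) -> str:
    """Concatenate x' and k' into the input knowledge information [x', k']."""
    return f"{question_prompt}\n{answer_prompt}"


if __name__ == "__main__":
    x_prime = "Please answer the following question: Who is the author of a certain book"
    k = [
        "the author of a certain book is A",
        "the author of a certain book is A+B",
        "the translator of a certain book is C",
    ]
    print(build_input_knowledge(x_prime, build_answer_prompt(k)))
```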
Optionally, the step of inputting the input knowledge information into the preset reasoning model and outputting the target answer to the target question includes: analyzing the question prompt information with the preset reasoning model to obtain an answer set, where the preset reasoning model is a reasoning model trained in advance on a training data set, and the training data set includes a historical question set and a historical answer corresponding to each historical question in the set; treating the input knowledge information as a preset condition and, based on the preset condition, determining the conditional probability value of each answer in the answer set; and determining the answer with the maximum conditional probability value as the target answer.
In the embodiments of the present invention, the preset reasoning model is a reasoning model trained in advance on a training data set (including a historical question set and a historical answer corresponding to each historical question in the set). The reasoning model may be any of various algorithmic models, for example, a model that uses a deep neural network to learn complex relationships between questions and answers, or a model that learns statistical relationships between questions and answers by analyzing large amounts of data; no limitation is imposed here.
In the embodiments of the present invention, after the input knowledge information [x′,k′] is obtained, it is injected into the preset reasoning model, which analyzes the question prompt information x′ to obtain an answer set. The preset reasoning model treats the input knowledge information as the preset condition and, based on that condition, determines the conditional probability value of each answer in the answer set, that is, P(y|[x′,k′]), where y denotes an answer in the answer set. Finally, the answer with the maximum conditional probability value is determined as the target answer and output.
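The selection rule amounts to choosing the answer y that maximizes P(y|[x′,k′]). The sketch below shows only that argmax step. The scoring function `toy_conditional_probs` is a deliberately crude stand-in (normalized token counts), not the trained reasoning model; in practice the probabilities would come from the preset reasoning model.

```python
from typing import Callable, List, Tuple


def toy_conditional_probs(candidates: List[str], knowledge_input: str) -> List[float]:
    """Stand-in for P(y | [x', k']): add-one token counts, normalized to sum to 1."""
    tokens = knowledge_input.lower().split()
    raw = [1 + tokens.count(c.lower()) for c in candidates]
    total = sum(raw)
    return [r / total for r in raw]


def select_answer(
    candidates: List[str],
    knowledge_input: str,
    prob_fn: Callable[[List[str], str], List[float]] = toy_conditional_probs,
) -> Tuple[str, float]:
    """Return the candidate with the maximum conditional probability."""
    probs = prob_fn(candidates, knowledge_input)
    best = max(range(len(candidates)), key=probs.__getitem__)
    return candidates[best], probs[best]


if __name__ == "__main__":
    prompt = (
        "Please answer the following question: who wrote BookX\n"
        "The answer to this question may refer to the following: BookX author Alice"
    )
    print(select_answer(["Dana", "Alice"], prompt))
```

Passing the scorer as `prob_fn` keeps the selection rule independent of whichever model supplies the probabilities.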
A detailed description is given below with reference to another optional specific implementation.
Figure 4 is a schematic diagram of an optional knowledge question answering reasoning flow based on a domain knowledge graph according to an embodiment of the present invention. As shown in Figure 4, the flow is as follows:
(1) Obtain the question and add an instruction to it according to a template to obtain the question prompt;
(2) Perform entity extraction on the question to obtain entity and relationship elements, then perform knowledge retrieval in the domain knowledge graph based on those elements to obtain question-related triples;
(3) Verbalize the question-related triples to generate knowledge text;
(4) Add an instruction to the knowledge text according to a template to obtain the knowledge prompt (facts), then prepend the knowledge prompt (facts) to the question prompt, fusing the two into the final knowledge prompt;
(5) Inject the prompt into the reasoning model, which analyzes it and generates the answer to the question.
The embodiments of the present invention propose a method that generates knowledge prompts from a domain knowledge graph to strengthen a model's reasoning over specialized knowledge. The method can address the poor reliability of model-generated results in the related art, as well as possible bias and fairness problems. By making full use of the factual strengths of existing domain knowledge graph data assets, it can iteratively enhance the model's reasoning ability, and it offers stronger scenario adaptability, greater flexibility, and low cost. In addition, this embodiment generates factual answers from the reasoning model conditioned on factual knowledge, which helps prevent the model from producing logically confused answers. The model's parameters remain unchanged, and no fine-tuning is needed when knowledge is updated, so for application domains where knowledge changes quickly and frequently the approach is more flexible and cheaper to apply.
A detailed description is given below in conjunction with another embodiment.
Embodiment 2
The knowledge question-and-answer device based on a domain knowledge graph provided in this embodiment includes multiple implementation units, each corresponding to one of the implementation steps in Embodiment 1 above.
Figure 5 is a schematic diagram of an optional knowledge question-and-answer device based on a domain knowledge graph according to an embodiment of the present invention. As shown in Figure 5, the device may include a receiving unit 50, an extraction unit 51, a retrieval unit 52, and a construction unit 53, wherein:
The receiving unit 50 is configured to receive a target question and process it to obtain question prompt information;
The extraction unit 51 is configured to extract a target entity set or a target relationship set based on the target question, where the target entity set includes multiple target entities and the target relationship set includes multiple target relationships;
The retrieval unit 52 is configured to, when a target entity set has been extracted, retrieve triple information matching the target entities from a preset domain knowledge graph based on the target entity set, or, when a target relationship set has been extracted, retrieve triple information matching the target relationships from the preset domain knowledge graph based on the target relationship set, obtaining a triple information set;
The construction unit 53 is configured to construct input knowledge information based on the triple information set and the question prompt information, input the input knowledge information into a preset reasoning model, and output the target answer to the target question.
The knowledge question-and-answer device described above can receive a target question through the receiving unit 50 and process it to obtain question prompt information; extract a target entity set or a target relationship set from the target question through the extraction unit 51; through the retrieval unit 52, when a target entity set has been extracted, retrieve triple information matching the target entities from the preset domain knowledge graph based on the target entity set, or, when a target relationship set has been extracted, retrieve triple information matching the target relationships from the preset domain knowledge graph based on the target relationship set, obtaining a triple information set; and, through the construction unit 53, construct input knowledge information from the triple information set and the question prompt information, input it into the preset reasoning model, and output the target answer to the target question. In this embodiment of the present invention, the received target question is first processed to obtain question prompt information, and the target entity set or target relationship set in the question is extracted. Triple information matching the target entities or the target relationships is then retrieved from the preset domain knowledge graph, input knowledge information is constructed from the resulting triple information set and the question prompt information, and the input knowledge information is fed into the preset reasoning model to obtain the target answer. By drawing on the preset domain knowledge graph, the question can be built into input knowledge information that the preset reasoning model then processes, which can reduce bias during model reasoning and improve the accuracy of knowledge question-and-answer reasoning, thereby solving the technical problem in the related art that such reasoning has low accuracy.
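The way the four units hand data to one another can be sketched as a small Python class. This is only an illustrative wiring under stated assumptions: the class name `KnowledgeQADevice`, the callable signatures, and the stub implementations are hypothetical, and the stub for unit 53 merely formats a string instead of calling a reasoning model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (subject entity, relationship, object entity)


@dataclass
class KnowledgeQADevice:
    receive: Callable[[str], str]                  # unit 50: question -> question prompt
    extract: Callable[[str], List[str]]            # unit 51: question -> entities or relationships
    retrieve: Callable[[List[str]], List[Triple]]  # unit 52: terms -> triple set
    construct: Callable[[List[Triple], str], str]  # unit 53: triples + prompt -> answer

    def answer(self, question: str) -> str:
        prompt = self.receive(question)
        terms = self.extract(question)
        triples = self.retrieve(terms)
        return self.construct(triples, prompt)


# Stub units over a one-triple toy graph, for demonstration only.
graph = [("Bond A", "issuer", "Bank B")]
device = KnowledgeQADevice(
    receive=lambda q: f"Question: {q}",
    extract=lambda q: [e for e in ("Bond A",) if e in q],
    retrieve=lambda terms: [t for t in graph if t[0] in terms or t[2] in terms],
    construct=lambda triples, prompt: f"{prompt} | facts={len(triples)}",
)
result = device.answer("Who is the issuer of Bond A?")
```

Keeping each unit behind a plain callable mirrors the statement that the unit division is a logical one: any unit can be swapped without touching the others.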
Optionally, the receiving unit includes: a first construction module, configured to construct a question prompt template, where the question prompt template includes a question instruction; and a first generation module, configured to add the question instruction to the target question based on the question prompt template, generating the question prompt information.
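A question prompt template of this kind might look like the following Python sketch. The template layout and the default instruction text are assumptions for illustration; the specification only requires that the template carries a question instruction.

```python
# Hypothetical template: an instruction line followed by the question itself.
QUESTION_TEMPLATE = "{instruction}\nQuestion: {question}"


def build_question_prompt(
    question: str,
    instruction: str = "Answer the following financial question.",
) -> str:
    """Add the question instruction to the target question via the template."""
    return QUESTION_TEMPLATE.format(instruction=instruction, question=question.strip())
```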
Optionally, the extraction unit includes: a first processing module, configured to perform word segmentation on the target question to obtain multiple segmented words; a first analysis module, configured to analyze the segmented words and determine the word type of each; a first determination module, configured to determine the segmented word indicated by a word type as a target entity when the word type is a first preset type, or to determine the segmented word indicated by a word type as a target relationship when the word type is a second preset type, obtaining the target relationship set; a second determination module, configured to, when none of the word types is the first preset type or the second preset type, determine context information of the target question and supplement the target entity corresponding to the target question based on the context information; and a second generation module, configured to generate the target entity set based on all target entities.
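The extraction logic can be sketched in Python as below. Several simplifications are assumed: whitespace tokenization stands in for real word segmentation, a hand-written `lexicon` stands in for word-type analysis, and the context information is represented as a list of entities carried over from earlier turns. None of these choices come from the specification.

```python
from typing import Dict, List, Tuple

ENTITY, RELATION = "entity", "relation"  # stand-ins for the first / second preset types


def extract(
    question: str,
    lexicon: Dict[str, str],
    context_entities: List[str],
) -> Tuple[List[str], List[str]]:
    """Return (target entities, target relationships) for a question."""
    tokens = question.lower().replace("?", " ").split()
    entities = [t for t in tokens if lexicon.get(t) == ENTITY]
    relations = [t for t in tokens if lexicon.get(t) == RELATION]
    if not entities and not relations:
        # No token has either preset type: fall back to entities from context.
        entities = list(context_entities)
    return entities, relations
```

For example, a follow-up question such as "What about its rating?" contains no known entity or relationship, so the entity from the previous turn is supplied instead.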
Optionally, the triple information includes a subject entity, an object entity, and an entity relationship, and the retrieval unit includes: a third determination module, configured to determine a retrieval hop count threshold and an initial retrieval hop count; a first retrieval module, configured to retrieve, from the preset domain knowledge graph, knowledge graph entities matching the target entity, where a knowledge graph entity is a subject entity or an object entity; a first update module, configured to update the initial retrieval hop count when a first knowledge graph entity matching the target entity is retrieved, obtaining a current retrieval hop count; a fourth determination module, configured to determine, based on the entity relationship, a second knowledge graph entity associated with the first knowledge graph entity when the current retrieval hop count is less than the retrieval hop count threshold; a second update module, configured to update the current retrieval hop count and continue to determine, based on the entity relationship, a third knowledge graph entity associated with the second knowledge graph entity, until the current retrieval hop count is greater than or equal to the retrieval hop count threshold, obtaining a knowledge graph entity set; and a fifth determination module, configured to determine, based on the knowledge graph entity set, the triple information to which each knowledge graph entity belongs, obtaining the triple information set.
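The hop-limited retrieval can be sketched in Python over an in-memory list of triples. This is one possible reading under stated assumptions: the direct match is counted as the first hop, relationships are followed in both directions, and the final triple set contains every triple that touches a collected entity. The function name and the list-based graph are hypothetical.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject entity, relationship, object entity)


def retrieve_by_entity(graph: List[Triple], target: str, max_hops: int) -> List[Triple]:
    hops = 0  # initial retrieval hop count
    if not any(target in (s, o) for s, _, o in graph):
        return []  # no knowledge graph entity matches the target entity
    found: Set[str] = {target}
    frontier: Set[str] = {target}
    hops += 1  # the direct match counts as the first hop (assumption)
    while hops < max_hops and frontier:
        nxt: Set[str] = set()
        for s, _, o in graph:
            if s in frontier and o not in found:
                nxt.add(o)
            if o in frontier and s not in found:
                nxt.add(s)
        found |= nxt
        frontier = nxt
        hops += 1
    # Every triple that any collected entity belongs to.
    return [t for t in graph if t[0] in found or t[2] in found]
```

A small threshold keeps the prompt short, while a larger one pulls in more distant facts at the cost of more noise.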
Optionally, the retrieval unit further includes: a second retrieval module, configured to retrieve, from the preset domain knowledge graph, entity relationships matching the target relationship until retrieval succeeds or the number of retrievals reaches a preset retrieval threshold, obtaining a retrieval result; a sixth determination module, configured to, when retrieval succeeds, determine the target entity relationships matching the target relationship based on the retrieval result, obtaining a target entity relationship set; and a seventh determination module, configured to determine, based on the target entity relationship set, the triple information to which each target entity relationship belongs, obtaining the triple information set.
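Relationship-based retrieval with a bounded number of attempts might be sketched as follows. The specification does not say what changes between attempts, so this sketch assumes each attempt tries a different candidate name for the target relationship (for instance an alias); the `candidates` list and the function name are hypothetical.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject entity, relationship, object entity)


def retrieve_by_relation(
    graph: List[Triple],
    candidates: List[str],
    max_attempts: int,
) -> List[Triple]:
    """Try candidate relationship names until one matches or the limit is reached."""
    for attempt, name in enumerate(candidates):
        if attempt >= max_attempts:
            break  # preset retrieval threshold reached
        hits = [t for t in graph if t[1] == name]
        if hits:
            return hits  # retrieval succeeded
    return []
```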
Optionally, the knowledge question-and-answer device further includes: a first connection module, configured to, after the triple information matching the target entity has been retrieved from the preset domain knowledge graph based on the target entity set and the triple information set has been obtained, connect the subject entity, the object entity, and the entity relationship in the triple information to obtain an answer text, where each piece of triple information has an association value indicating its association with the target entity; and a first sorting module, configured to sort all answer texts based on the association values, obtaining an answer text set.
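Connecting and ranking the triples can be sketched in a few lines of Python. The sketch assumes that the elements are joined in subject, relationship, object order with spaces and that a higher association value means a closer link to the target entity and therefore sorts first; the specification states neither the join order nor the sort direction.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject entity, relationship, object entity)


def to_answer_texts(scored_triples: List[Tuple[Triple, float]]) -> List[str]:
    """Join each triple into a text and order the texts by association value."""
    ranked = sorted(scored_triples, key=lambda item: item[1], reverse=True)
    return [" ".join(triple) for triple, _ in ranked]
```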
Optionally, the construction unit includes: a second construction module, configured to construct an answer prompt template, where the answer prompt template includes an answer instruction; a third generation module, configured to add the answer instruction to each answer text in the answer text set based on the answer prompt template, generating an answer prompt information set; and a first splicing module, configured to splice the question prompt information and the answer prompt information set, obtaining the input knowledge information.
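Building and splicing the prompts can be sketched as below. The template text, the default instruction "Known fact:", and the choice to place the answer prompts before the question prompt (following the earlier description of prepending knowledge prompts to the question prompt) are assumptions for illustration.

```python
from typing import List

ANSWER_TEMPLATE = "{instruction} {text}"  # hypothetical answer prompt template


def build_input_knowledge(
    question_prompt: str,
    answer_texts: List[str],
    instruction: str = "Known fact:",
) -> str:
    """Add the answer instruction to each answer text, then splice with the question prompt."""
    answer_prompts = [
        ANSWER_TEMPLATE.format(instruction=instruction, text=t) for t in answer_texts
    ]
    return "\n".join(answer_prompts + [question_prompt])
```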
Optionally, the construction unit further includes: a second analysis module, configured to analyze the question prompt information with the preset reasoning model to obtain an answer set, where the preset reasoning model is a reasoning model pre-trained on a training data set, and the training data set includes a historical question set and a historical answer corresponding to each historical question in the set; an eighth determination module, configured to treat the input knowledge information as a preset condition and determine, based on the preset condition, a conditional probability value for each answer in the answer set; and a ninth determination module, configured to determine the answer indicated by the maximum conditional probability value as the target answer.
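Selecting the target answer amounts to taking the candidate with the highest probability conditioned on the input knowledge. The Python sketch below assumes the conditional probability is supplied as a scoring function; `toy_cond_prob` is a crude token-overlap stand-in used only to make the example runnable, not the scoring a real reasoning model would perform.

```python
from typing import Callable, List


def select_answer(
    candidates: List[str],
    knowledge: str,
    cond_prob: Callable[[str, str], float],
) -> str:
    """Return the candidate answer with the highest score given the knowledge."""
    return max(candidates, key=lambda a: cond_prob(a, knowledge))


def toy_cond_prob(answer: str, knowledge: str) -> float:
    """Illustrative stand-in: fraction of answer tokens that appear in the knowledge."""
    known = set(knowledge.split())
    tokens = answer.split()
    return sum(t in known for t in tokens) / len(tokens) if tokens else 0.0
```

In a real system the scoring function would come from the reasoning model itself, for example from the likelihood it assigns to each candidate given the spliced input knowledge.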
The knowledge question-and-answer device described above may further include a processor and a memory. The receiving unit 50, extraction unit 51, retrieval unit 52, construction unit 53, and so on are all stored in the memory as program units, and the processor executes these program units stored in the memory to implement the corresponding functions.
The processor contains a kernel, which retrieves the corresponding program unit from the memory. One or more kernels may be provided. By adjusting kernel parameters, input knowledge information is constructed based on the triple information set and the question prompt information, the input knowledge information is input into the preset reasoning model, and the target answer to the target question is output.
The memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.
This application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps: receiving a target question and processing it to obtain question prompt information; extracting a target entity set or a target relationship set based on the target question; when a target entity set has been extracted, retrieving triple information matching the target entities from a preset domain knowledge graph based on the target entity set, or, when a target relationship set has been extracted, retrieving triple information matching the target relationships from the preset domain knowledge graph based on the target relationship set, obtaining a triple information set; and constructing input knowledge information based on the triple information set and the question prompt information, inputting the input knowledge information into a preset reasoning model, and outputting the target answer to the target question.
According to another aspect of the embodiments of the present invention, a computer-readable storage medium is also provided. The computer-readable storage medium includes a stored computer program, and when the computer program runs, the device on which the computer-readable storage medium resides is controlled to execute the knowledge question-and-answer method based on a domain knowledge graph described above.
According to another aspect of the embodiments of the present invention, an electronic device is also provided, including one or more processors and a memory. The memory is used to store one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors implement the knowledge question-and-answer method based on a domain knowledge graph described above.
Figure 6 is a hardware structure block diagram of an electronic device (or mobile device) for the knowledge question-and-answer method based on a domain knowledge graph according to an embodiment of the present invention. As shown in Figure 6, the electronic device may include one or more processors 602 (shown as 602a, 602b, ..., 602n in Figure 6; a processor 602 may include, but is not limited to, a processing device such as a microcontroller (MCU) or a programmable logic device (FPGA)) and a memory 604 for storing data. In addition, it may include a display, an input/output interface (I/O interface), a universal serial bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a keyboard, a power supply, and/or a camera. Persons of ordinary skill in the art will understand that the structure shown in Figure 6 is only illustrative and does not limit the structure of the electronic device. For example, the electronic device may include more or fewer components than shown in Figure 6, or have a configuration different from that shown in Figure 6.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the relevant descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are only illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that persons of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311049695.8A / CN117076688A (en) | 2023-08-18 | 2023-08-18 | Knowledge question and answer method based on domain knowledge graph and its device and electronic equipment |
| Publication Number | Publication Date |
|---|---|
| CN117076688A (en) | 2023-11-17 |
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202311049695.8A (Pending) / CN117076688A (en) | Knowledge question and answer method based on domain knowledge graph and its device and electronic equipment | 2023-08-18 | 2023-08-18 |
| Country | Link |
|---|---|
| CN (1) | CN117076688A (en) |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20190059084A (en)* | 2017-11-22 | 2019-05-30 | 한국전자통신연구원 | Natural language question-answering system and learning method |
| CN112527997A (en)* | 2020-12-18 | 2021-03-19 | 中国南方电网有限责任公司 | Intelligent question-answering method and system based on power grid field scheduling scene knowledge graph |
| CN115080710A (en)* | 2022-03-01 | 2022-09-20 | 达而观信息科技(上海)有限公司 | Intelligent question-answering system adaptive to knowledge graphs in different fields and construction method thereof |
| CN116561264A (en)* | 2023-02-07 | 2023-08-08 | 南京博雅区块链研究院有限公司 | Knowledge graph-based intelligent question-answering system construction method |
| Title |
|---|
| 乔振浩;车万翔;刘挺;: "基于问题生成的知识图谱问答方法", 智能计算机与应用, no. 05, 1 May 2020 (2020-05-01)* |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN117436531A (en)* | 2023-12-21 | 2024-01-23 | 安徽大学 | Question answering system and method based on rice pest knowledge graph |
| CN117493582A (en)* | 2023-12-29 | 2024-02-02 | 珠海格力电器股份有限公司 | Model result output method and device, electronic equipment and storage medium |
| CN117493582B (en)* | 2023-12-29 | 2024-04-05 | 珠海格力电器股份有限公司 | Model result output method and device, electronic equipment and storage medium |
| CN117634617A (en)* | 2024-01-25 | 2024-03-01 | 清华大学 | Knowledge-intensive reasoning question and answer method, device, electronic device and storage medium |
| CN117634617B (en)* | 2024-01-25 | 2024-05-17 | 清华大学 | Knowledge-intensive reasoning question answering method, device, electronic device and storage medium |
| CN118069817A (en)* | 2024-04-18 | 2024-05-24 | 国家超级计算天津中心 | Knowledge graph-based generation type question-answering method, device and storage medium |
| CN118656458A (en)* | 2024-05-28 | 2024-09-17 | 深圳市云鲸视觉科技有限公司 | A virtual welcoming robot and robot interaction method thereof |
| CN119513232A (en)* | 2025-01-20 | 2025-02-25 | 湖南师范大学 | Chinese medical question-answering method and device based on knowledge graph and large language model |
| Publication | Publication Date | Title |
|---|---|---|
| CN111241237B (en) | Intelligent question-answer data processing method and device based on operation and maintenance service | |
| CN117076688A (en) | Knowledge question and answer method based on domain knowledge graph and its device and electronic equipment | |
| CN116561538A (en) | Question-answer scoring method, question-answer scoring device, electronic equipment and storage medium | |
| CN110263324A (en) | Text handling method, model training method and device | |
| CN108804677A (en) | In conjunction with the deep learning question classification method and system of multi-layer attention mechanism | |
| CN112101042B (en) | Text emotion recognition method, device, terminal equipment and storage medium | |
| WO2023124837A1 (en) | Inquiry processing method and apparatus, device, and storage medium | |
| CN114186076B (en) | Knowledge graph construction method, device, equipment and computer-readable storage medium | |
| CN105868179A (en) | An intelligent question answering method and device | |
| CN111126067B (en) | Entity relationship extraction method and device | |
| CN113886535B (en) | Knowledge graph-based question and answer method and device, storage medium and electronic equipment | |
| WO2025123841A1 (en) | Knowledge graph construction method and apparatus, and storage medium and electronic device | |
| CN108614814A (en) | A kind of abstracting method of evaluation information, device and equipment | |
| CN113516094A (en) | A system and method for matching review experts for documents | |
| CN117852523A (en) | A cross-domain small sample relationship extraction method and device for learning discriminative semantics and multi-view context | |
| CN117556004A (en) | A knowledge question and answer method, device and storage medium based on food engineering | |
| CN115131058B (en) | Account identification method, device, equipment and storage medium | |
| CN113392220A (en) | Knowledge graph generation method and device, computer equipment and storage medium | |
| CN114328820B (en) | Information search method and related equipment | |
| CN118606574A (en) | Knowledge answering method, system, electronic device and storage medium based on large model | |
| CN117689027A (en) | Prompt text generation method and device, electronic equipment and storage medium | |
| CN116975315A (en) | Text matching method, device, computer equipment and storage medium | |
| CN114676237A (en) | Sentence similarity determination method, device, computer equipment and storage medium | |
| CN114281974A (en) | Processing method and device, storage medium and electronic device for querying question and answer library | |
| CN112686052A (en) | Test question recommendation method, test question training method, electronic equipment and storage device |
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||