CN112612899A - Knowledge graph construction method and device, storage medium and electronic equipment - Google Patents


Publication number
CN112612899A
Authority
CN
China
Prior art keywords
standard
relationship
knowledge graph
entity
entities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011329767.0A
Other languages
Chinese (zh)
Other versions
CN112612899B (en)
Inventor
宋卿
张鹏洲
张弛
陈国伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Communication University of China
Original Assignee
Communication University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Communication University of China
Priority to CN202011329767.0A
Publication of CN112612899A
Application granted
Publication of CN112612899B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses a method and a device for constructing a knowledge graph, and a computer-readable storage medium. The method constructs an initial structured knowledge graph from a structured database, extracts reference entities and reference relations from unstructured text of the target field according to the standard entities and standard relations of the initial structured knowledge graph, and fuses the reference entities and reference relations extracted from the unstructured text with the standard entities and standard relations of the initial structured knowledge graph to construct the knowledge graph of the target field. The initial structured knowledge graph built from structured knowledge is of high quality and high accuracy; supplementing it with unstructured text for targeted expansion and completion makes it possible to construct a complete, high-quality knowledge graph of the target field.

Figure 202011329767

Description

Knowledge graph construction method and device, storage medium and electronic equipment
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for constructing a knowledge graph, a computer-readable storage medium, and an electronic device.
Background
The knowledge graph is an important cornerstone of artificial intelligence. By storing the knowledge systems, concepts and content of human society in a computer as basic entity-relationship-entity triples, it forms a semantic network and maps human knowledge to data that a computer can process. The knowledge graph also supports knowledge association, knowledge reasoning and knowledge learning, and helps the computer understand human language more accurately.
Since the concept was proposed, the knowledge graph has been a hot research topic in both academia and industry. Knowledge graphs are now widely applied in scenarios such as semantic search, content recommendation and intelligent customer service. Building a high-quality open-domain knowledge graph with comprehensive knowledge requires a great deal of labor, material resources and time, and is unnecessary for many industries. An industry, and especially an enterprise within it, has already accumulated a large amount of data while building and using its information and internet systems. The focus should therefore be on how to use that data to construct a complete, high-quality vertical-domain knowledge graph for a specific industry efficiently, quickly and accurately.
Most traditional vertical-domain knowledge graphs are built from structured databases. A structured database makes construction quick and convenient, but because the data in it is incomplete, the resulting vertical-domain knowledge graph is not complete enough and is of low quality.
Disclosure of Invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art. To this end, a first object of the invention is to provide a method for constructing a knowledge graph in which an initial structured knowledge graph of high quality and high accuracy is built from structured knowledge and is then expanded and completed in a targeted manner using unstructured text, so that a complete and high-quality knowledge graph of the target field can be constructed.
The second purpose of the invention is to provide a knowledge graph constructing device.
A third object of the invention is to propose a computer-readable storage medium.
A fourth object of the invention is to propose an electronic device.
In order to achieve the above object, an embodiment of a first aspect of the present invention provides a method for constructing a knowledge graph, including:
acquiring a structured database of a target field, and constructing an initial structured knowledge graph according to the structured database of the target field, wherein the initial structured knowledge graph comprises standard entities and standard relations;
extracting a reference entity and a reference relation in the unstructured text of the target field according to the standard entity and the standard relation;
and performing knowledge fusion on the reference entities and the reference relations extracted from the unstructured text and the standard entities and the standard relations in the initial structured graph to form a knowledge graph of the target field.
According to the method for constructing the knowledge graph, the initial structured knowledge graph is constructed according to the structured database of the target field, the reference entity and the reference relation in the unstructured text of the target field are extracted according to the standard entity and the standard relation of the initial structured knowledge graph, and the reference entity and the reference relation extracted from the unstructured text and the standard entity and the standard relation in the initial structured knowledge graph are subjected to knowledge fusion to form the knowledge graph of the target field. The initial structured knowledge graph constructed by the structured knowledge is a high-quality and high-accuracy knowledge graph, and the initial structured knowledge graph is subjected to targeted expansion and completion by the aid of the unstructured text on the basis, so that a complete and high-quality knowledge graph in the target field can be constructed.
According to an embodiment of the present invention, the extracting, according to the standard entity and the standard relationship, a corresponding reference entity and a corresponding reference relationship in the unstructured text of the field includes:
marking the corresponding entities and relations in the unstructured text according to the standard entities and the standard relations;
performing extractive text summarization on the marked unstructured text according to the standard entity and the standard relation so as to screen out a sentence set associated with the standard entity and the standard relation;
and performing entity recognition on the sentence sets associated with the standard entities and the standard relations, and performing entity relation extraction according to recognition results to obtain reference entities and reference relations in the unstructured text.
According to an embodiment of the present invention, the extracting the entity relationship according to the recognition result includes:
and on the basis of the entity recognition performed on the sentence set associated with the standard entity and the standard relation, carrying out entity relation extraction by means of relation classification and syntactic analysis.
According to an embodiment of the present invention, the constructing an initial structured knowledge-graph from the structured database of the target domain comprises:
constructing a domain ontology, wherein the domain ontology comprises important concepts, concept relations and axioms in the target domain;
mapping the structured knowledge in the structured database to the domain ontology to obtain entity nodes and relationship nodes;
and carrying out knowledge fusion on the entity nodes and the relation nodes obtained from different structured databases to obtain the standard entities and the standard relations, and forming the initial structured knowledge graph according to the standard entities and the standard relations.
According to one embodiment of the invention, the knowledge fusion of the reference entities and reference relations extracted from the unstructured text and the standard entities and standard relations in the initial structured graph comprises:
verifying the reference entity and the reference relationship and the standard entity and the standard relationship according to the axiom to judge whether the reference entity and the standard entity, the reference relationship and the standard relationship meet the axiom;
and when the reference entity and the standard entity, and the reference relationship and the standard relationship meet the axiom, performing knowledge fusion on the reference entity and the standard entity, and the reference relationship and the standard relationship.
According to an embodiment of the present invention, the knowledge fusion of the reference entities and reference relations extracted from the unstructured text and the standard entities and standard relations in the initial structured graph further includes:
setting a relation confidence value according to the original source of the reference relation and the frequency of the reference relation;
and performing knowledge fusion on the reference relation and the standard relation according to the relation confidence value.
According to an embodiment of the present invention, fusing the reference relationship and the standard relationship according to the relationship confidence value includes:
and when the reference relationship conflicts with the standard relationship, if the relationship confidence value of the reference relationship is smaller than a preset threshold value, deleting the reference relationship.
In order to achieve the above object, a second embodiment of the present invention provides an apparatus for constructing a knowledge graph, including:
the acquisition module is used for acquiring a structured database of the target field;
a construction module for constructing an initial structured knowledge graph according to the structured database of the target domain, the initial structured knowledge graph comprising standard entities and standard relationships;
the extraction module is used for extracting a reference entity and a reference relation in the unstructured text of the target field according to the standard entity and the standard relation;
and the fusion module is used for carrying out knowledge fusion on the reference entities and the reference relations extracted from the unstructured text and the standard entities and the standard relations in the initial structured knowledge graph so as to form the knowledge graph of the target field.
According to the device for constructing the knowledge graph, the acquisition module is used for acquiring the structured database of the target field, the construction module is used for constructing the initial structured knowledge graph according to the structured database of the target field, the initial structured knowledge graph comprises the standard entity and the standard relation, the extraction module is used for extracting the reference entity and the reference relation in the unstructured text of the target field according to the standard entity and the standard relation, and the fusion module is used for carrying out knowledge fusion on the reference entity and the reference relation extracted from the unstructured text and the standard entity and the standard relation in the initial structured knowledge graph so as to form the knowledge graph of the target field. The initial structured knowledge graph constructed by the structured knowledge is a high-quality and high-accuracy knowledge graph, and the initial structured knowledge graph is subjected to targeted expansion and completion by the aid of the unstructured text on the basis, so that a complete and high-quality knowledge graph in the target field can be constructed.
In order to achieve the above object, a third embodiment of the present invention provides a computer-readable storage medium, on which a knowledge-graph constructing program is stored, which, when executed by a processor, implements the aforementioned knowledge-graph constructing method.
According to the computer-readable storage medium of the embodiment of the invention, by the above method for constructing the knowledge graph, the initial structured knowledge graph is constructed by using the structured knowledge, and the knowledge graph is a high-quality and high-accuracy knowledge graph, and on the basis, the initial structured knowledge graph is subjected to targeted expansion and completion by using the unstructured text, so that a complete and high-quality knowledge graph of the target field can be constructed.
In order to achieve the above object, a fourth aspect of the present invention provides an electronic device, including a memory, a processor, and a knowledge graph constructing program stored in the memory and executable on the processor, where the processor implements the knowledge graph constructing method when executing the knowledge graph constructing program.
According to the electronic equipment provided by the embodiment of the invention, the knowledge graph is constructed by adopting the structured knowledge through the construction method of the knowledge graph, the knowledge graph is a high-quality and high-accuracy knowledge graph, and the initial structured knowledge graph is subjected to targeted expansion and completion by using the unstructured text on the basis, so that a complete and high-quality knowledge graph in the target field can be constructed.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
FIG. 1 is a flow diagram of a method of construction of a knowledge-graph according to one embodiment of the invention;
FIG. 2 is a flow diagram of a method of construction of a knowledge-graph according to yet another embodiment of the invention;
FIG. 3 is a flow diagram of a method of construction of a knowledge-graph according to yet another embodiment of the invention;
FIG. 4 is a block diagram of an apparatus for constructing a knowledge-graph according to one embodiment of the present invention;
FIG. 5 is a block diagram of an apparatus for constructing a knowledge-graph according to another embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
A method, an apparatus, a computer-readable storage medium, and an electronic device for constructing a knowledge graph according to embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a method for constructing a knowledge-graph according to an embodiment of the present invention, and referring to fig. 1, the method for constructing a knowledge-graph includes the following steps:
step S100, acquiring a structured database of the target field, and constructing an initial structured knowledge graph according to the structured database of the target field, wherein the initial structured knowledge graph comprises standard entities and standard relations.
Specifically, for a specific target field, a large number of structured databases have accumulated during the construction and use of information and internet systems, and the initial structured knowledge graph is constructed from these accumulated structured databases.
In one embodiment, referring to FIG. 2, constructing an initial structured knowledge-graph from a structured database of target domains comprises:
and step S110, constructing a domain ontology, wherein the domain ontology comprises important concepts, concept relations and axioms in the target domain.
Specifically, when an initial structured knowledge graph is constructed from a structured database of the target field, a domain ontology can be constructed first. The target field is determined, and then the important concepts, concept relations, axioms and other elements of the target field are identified. This is done mainly by automatically extracting the relational database table structures of existing systems in the target field, with domain experts assisting in revision and confirmation.
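A domain ontology of this kind could be held in a small data structure. The following is a minimal sketch only: the class names `RelationAxiom` and `DomainOntology`, the broadcasting-style example concepts, and the choice to express each axiom as a subject-type/object-type constraint are all assumptions made for illustration and are not specified by the patent.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RelationAxiom:
    """Hypothetical axiom: a relation may only link one subject type to one object type."""
    relation: str
    subject_type: str
    object_type: str


@dataclass
class DomainOntology:
    concepts: set = field(default_factory=set)
    axioms: dict = field(default_factory=dict)

    def add_relation(self, relation, subject_type, object_type):
        # Register both concept types and the axiom constraining the relation.
        self.concepts.update({subject_type, object_type})
        self.axioms[relation] = RelationAxiom(relation, subject_type, object_type)


# Illustrative concepts and relations (assumed, not from the patent).
ontology = DomainOntology()
ontology.add_relation("hosted_by", "Program", "Person")
ontology.add_relation("broadcast_on", "Program", "Channel")
```

In practice the concepts and relations would come from the extracted table structures and be revised by domain experts, as the paragraph above describes.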
Step S120, mapping the structured knowledge in the structured database to the domain ontology to obtain entity nodes and relationship nodes.
Specifically, structured knowledge migration can be performed after the domain ontology is built. When structured knowledge migration is carried out, because a large amount of domain knowledge exists in a database or a document in the form of structured data, the structured knowledge in the database or the document can be extracted by an automatic program on the basis of the domain ontology determined in the previous step so as to carry out the knowledge migration.
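The migration step can be pictured as turning database rows into triples. The sketch below assumes that rows are available as dictionaries and that a column-to-relation mapping has been derived from the domain ontology; the function name `rows_to_triples`, the mapping and the sample rows are hypothetical.

```python
# Hypothetical column-to-relation mapping derived from the domain ontology.
COLUMN_TO_RELATION = {"host": "hosted_by", "channel": "broadcast_on"}


def rows_to_triples(rows, key_column, column_to_relation):
    """Turn each non-empty mapped column of a row into a (subject, relation, object) triple."""
    triples = []
    for row in rows:
        subject = row[key_column]
        for column, relation in column_to_relation.items():
            value = row.get(column)
            if value:  # skip missing or empty cells
                triples.append((subject, relation, value))
    return triples


# Illustrative rows standing in for a relational table.
rows = [
    {"title": "Evening News", "host": "Li Wei", "channel": "Channel 1"},
    {"title": "Science Hour", "host": "", "channel": "Channel 2"},
]
triples = rows_to_triples(rows, "title", COLUMN_TO_RELATION)
```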
And step S130, performing knowledge fusion on the entity nodes and the relation nodes acquired from the different structured databases to acquire standard entities and standard relations, and forming an initial structured knowledge graph according to the standard entities and the standard relations.
Specifically, knowledge fusion may be performed after the structured knowledge migration. Because the structured knowledge comes from different databases and documents, the same entities and relations from different databases need entity alignment, entity disambiguation, and merging of relations and attributes, so that the multi-source structured knowledge is imported uniformly into the initial structured knowledge graph. This yields entity-relationship-entity triples converted from the structured knowledge, and the entities and relations converted from the structured knowledge serve as the standard entities and standard relations.
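As a rough illustration of entity alignment and attribute merging across sources, the sketch below aligns records by a normalized name and merges their attributes. The normalization rule (lowercasing and collapsing whitespace) and the first-value-wins merge policy are assumptions; the patent does not prescribe a particular alignment technique.

```python
from collections import defaultdict


def normalize(name):
    """Crude alignment key: lowercase and collapse whitespace (an assumption)."""
    return " ".join(name.lower().split())


def merge_entities(sources):
    """Merge attribute dicts of records whose normalized names match."""
    merged = defaultdict(dict)
    for records in sources:
        for record in records:
            key = normalize(record["name"])
            for attr, value in record.items():
                # Keep the first value seen for each attribute; later sources only fill gaps.
                merged[key].setdefault(attr, value)
    return dict(merged)


# Two illustrative databases describing the same entity with different spellings.
db_a = [{"name": "Evening News", "channel": "Channel 1"}]
db_b = [{"name": "evening  news", "host": "Li Wei"}]
entities = merge_entities([db_a, db_b])
```

A real system would likely need stronger disambiguation than string normalization, for example when two different entities share a name.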
And S200, extracting a reference entity and a reference relation in the unstructured text of the target field according to the standard entity and the standard relation.
For example, a large amount of unstructured text of the target field can be acquired from search engines by web crawling. Because the standard entities and standard relations obtained from structured knowledge are of high accuracy and high quality, the reference entities and reference relations in the unstructured text can be extracted according to them in order to build a high-quality knowledge graph of the target field.
In one embodiment, as shown in fig. 3, extracting corresponding reference entities and reference relationships in an unstructured text of a domain according to the standard entities and the standard relationships includes:
and step S210, marking the corresponding entities and relations in the unstructured text according to the standard entities and the standard relations.
Specifically, the entities and relations in the unstructured text that correspond to the standard entities and standard relations are marked one by one with special symbols. This process can be called reverse labeling and can be completed automatically by machine. Only the entities or relations that actually appear in the text are labeled.
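Reverse labeling can be sketched as a dictionary lookup over the text. In the example below the marker symbols `[[` and `]]`, the function name `reverse_label` and the sample sentence are assumptions; the patent only states that special symbols are used.

```python
import re


def reverse_label(text, standard_entities, open_tag="[[", close_tag="]]"):
    """Wrap every occurrence of a known standard entity in marker symbols."""
    # Longest names first so a longer entity name wins over a shorter prefix of it.
    names = sorted(standard_entities, key=len, reverse=True)
    if not names:
        return text
    pattern = re.compile("|".join(re.escape(n) for n in names))
    return pattern.sub(lambda m: f"{open_tag}{m.group(0)}{close_tag}", text)


labeled = reverse_label(
    "Li Wei hosts Evening News on Channel 1.",
    {"Li Wei", "Evening News", "Channel 1"},
)
```

Entities that do not occur in the text are simply never matched, which mirrors the rule that only entities or relations that appear are labeled.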
Step S220, performing extractive text summarization on the marked unstructured text according to the standard entity and the standard relation so as to screen out a sentence set associated with the standard entity and the standard relation.
Not every sentence in an article needs relation extraction, since many relations are not knowledge of the target field. The target sentences for automatic relation extraction therefore need to be screened so that the extracted relations do not include unrelated entities and relations, which keeps the domain knowledge graph specialized and free of irrelevant knowledge. However, if sentences are selected only by the entity and relation labels from step S210, much new content that is absent from the structured data is missed, and the domain knowledge graph becomes less complete and comprehensive. Extractive text summarization is therefore performed on the unstructured text. In general, extractive text summarization finds the sentences that carry the most information by counting the word frequency of keywords. In this embodiment, to make the domain knowledge graph more specialized, the weight of sentences associated with the standard entities and standard relations is also increased during extractive summarization, and the sentences with higher relevance are screened from the many sentences of the unstructured text to serve as the target sentence set for entity recognition and entity relation extraction.
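One way to picture this weighted sentence screening is a word-frequency score multiplied by a boost for each standard term a sentence mentions. The scoring formula, the `boost` and `top_k` parameters and the sample text below are assumptions chosen for illustration, not values from the patent.

```python
import re
from collections import Counter


def select_sentences(text, standard_terms, top_k=2, boost=2.0):
    """Score sentences by average word frequency, boosted per standard term they mention."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"\w+", text.lower()))
    scored = []
    for index, sentence in enumerate(sentences):
        words = re.findall(r"\w+", sentence.lower())
        base = sum(freq[w] for w in words) / max(len(words), 1)
        hits = sum(term.lower() in sentence.lower() for term in standard_terms)
        scored.append((base * (1 + boost * hits), index, sentence))
    # Keep the top_k highest-scoring sentences, returned in document order.
    best = sorted(scored, reverse=True)[:top_k]
    return [sentence for _, _, sentence in sorted(best, key=lambda item: item[1])]


text = (
    "Evening News is hosted by Li Wei. "
    "The weather was mild that day. "
    "Li Wei joined Channel 1 in 2010."
)
selected = select_sentences(text, {"Evening News", "Li Wei", "Channel 1"})
```

Because the boost is multiplicative, a sentence with no standard terms can still be selected when its frequency score is high, which leaves room for new content that is absent from the structured data.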
And step S230, performing entity recognition on the sentence set associated with the standard entity and the standard relation, and performing entity relation extraction according to the recognition result to obtain a reference entity and a reference relation in the unstructured text.
Specifically, when the entity recognition is performed on the sentence sets associated with the standard entities and the standard relationships, both the entities that are consistent with the standard entities and the entities related to the standard entities are recognized.
Further, extracting the entity relationship according to the recognition result comprises: on the basis of the entity recognition performed on the sentence set associated with the standard entity and the standard relation, carrying out entity relation extraction by means of relation classification and syntactic analysis. Specifically, the two methods can be applied at the same time, using the recognized entities to extract the semantic relations between two or more entities from the text. Applying relation classification and syntactic analysis together makes the extracted relations between entities more complete, which helps in constructing a complete, high-quality knowledge graph.
And step S300, carrying out knowledge fusion on the reference entities and the reference relations extracted from the unstructured text and the standard entities and the standard relations in the initial structured knowledge graph to form a knowledge graph of the target field.
Specifically, the entities and relations extracted from the unstructured text in the previous step that differ from the standard entities and standard relations are added to the initial structured knowledge graph, while the entities and relations that are similar to the standard entities and standard relations undergo entity alignment, entity disambiguation, merging of relations and attributes, and the like. Step S200 and step S300 iterate with each other: step S200 may use the entities and relations obtained in step S300 to extract further information from the unstructured text, and step S300 may use the new entities and relations extracted in step S200 to expand and complete the initial structured knowledge graph in a targeted, controllable way.
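The mutual iteration of steps S200 and S300 resembles a bootstrapping loop that stops when a round adds nothing new. The sketch below is a toy model under stated assumptions: the graph is a set of triples, and the `extract` and `fuse` stand-ins, the `max_rounds` limit and the sample triples are hypothetical placeholders for the real extraction and fusion steps.

```python
def iterate_until_stable(graph, extract, fuse, max_rounds=5):
    """Alternate extraction (S200) and fusion (S300) until no new triples appear."""
    for _ in range(max_rounds):
        candidates = extract(graph)
        updated = fuse(graph, candidates)
        if updated == graph:  # fixed point: nothing new was added
            break
        graph = updated
    return graph


# Toy stand-in (assumption): candidate triples that the text is taken to contain.
TEXT_TRIPLES = {
    ("Evening News", "hosted_by", "Li Wei"),
    ("Li Wei", "works_for", "Channel 1"),
    ("Unrelated Show", "hosted_by", "Someone"),
}


def extract(graph):
    # Only return triples whose subject is already an entity in the graph.
    known = {s for s, _, _ in graph} | {o for _, _, o in graph}
    return {t for t in TEXT_TRIPLES if t[0] in known}


def fuse(graph, candidates):
    # Simplest possible fusion: set union, with no conflict handling.
    return graph | candidates


seed = frozenset({("Evening News", "broadcast_on", "Channel 1")})
final = iterate_until_stable(seed, extract, fuse)
```

Here the second round can find the `works_for` triple only because the first round added `Li Wei` to the graph, while the unrelated show is never reached, which illustrates the targeted, controllable expansion described above.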
The method for constructing the knowledge graph comprises the steps of obtaining a structured database of a target field, constructing an initial structured knowledge graph according to the structured database, extracting a reference entity and a reference relation in an unstructured text of the target field according to a standard entity and a standard relation of the initial structured knowledge graph, and carrying out knowledge fusion on the reference entity and the reference relation extracted from the unstructured text and the standard entity and the standard relation in the initial structured knowledge graph to construct the knowledge graph of the target field. The initial structured knowledge graph constructed by the structured knowledge is a high-quality and high-accuracy knowledge graph, and the initial structured knowledge graph is subjected to targeted expansion and completion by the aid of the unstructured text on the basis, so that a complete and high-quality knowledge graph in the target field can be constructed.
In one embodiment, step S300 includes: verifying the reference entity and the reference relationship and the standard entity and the standard relationship according to the axiom to judge whether the reference entity and the standard entity, the reference relationship and the standard relationship meet the axiom; and when the reference entity and the standard entity, the reference relationship and the standard relationship meet the axiom, fusing the reference entity and the standard entity, and the reference relationship and the standard relationship.
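Axiom verification could take many forms. One simple reading, assumed here purely for illustration, treats each axiom as a constraint on the entity types that a relation may connect; the names `satisfies_axioms`, `AXIOMS` and `TYPES` and the example data are hypothetical.

```python
def satisfies_axioms(triple, entity_types, axioms):
    """Check a (subject, relation, object) triple against type-constraint axioms."""
    subject, relation, obj = triple
    if relation not in axioms:
        return False  # unknown relation: reject rather than guess
    subject_type, object_type = axioms[relation]
    return (
        entity_types.get(subject) == subject_type
        and entity_types.get(obj) == object_type
    )


# Assumed axiom: a program is hosted by a person.
AXIOMS = {"hosted_by": ("Program", "Person")}
TYPES = {"Evening News": "Program", "Li Wei": "Person", "Channel 1": "Channel"}

ok = satisfies_axioms(("Evening News", "hosted_by", "Li Wei"), TYPES, AXIOMS)
bad = satisfies_axioms(("Evening News", "hosted_by", "Channel 1"), TYPES, AXIOMS)
```

Only triples for which the check passes would proceed to fusion; the second example fails because a channel is not a person.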
Further, in one embodiment, step S300 further includes: setting a relation confidence value according to the original source of the reference relation and the frequency of the reference relation; and fusing the reference relation and the standard relation according to the relation confidence value.
Specifically, the more reliable the original source of the reference relation (for example, an authoritative organization, text published by official media, or a standard reference work) and the more frequently the reference relation occurs, the higher the corresponding relation confidence value. In this embodiment, the relation confidence value may be a number between 0 and 1, with a larger number indicating higher confidence. Fusing the reference relationship and the standard relationship according to the relation confidence value comprises: when the reference relation conflicts with the standard relation, deleting the reference relation if its relation confidence value is smaller than a preset threshold. The preset threshold may be chosen as required and may, for example, be set to 0.6. When the reference relation and the standard relation conflict and cannot be merged, the relation confidence value serves as the reference: if the relation confidence value of the reference relation is less than 0.6, its confidence is low and the reference relation is deleted; if it is greater than or equal to 0.6, its confidence is high, the reference relation may be added to the initial structured knowledge graph, and the standard relation that conflicts with it may be deleted. By verifying the standard and reference entities and the standard and reference relations against the axioms on the one hand, and setting a relation confidence value for each reference relation and using it to assist knowledge fusion on the other, the method can ensure the accuracy of the resulting knowledge graph.
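The threshold rule can be sketched as follows. The patent states only that the confidence value lies between 0 and 1 and depends on source reliability and frequency, so the reliability table, the saturating frequency term and the product formula below are assumptions; only the example threshold of 0.6 comes from the text.

```python
import math

# Hypothetical source-reliability scores; the patent gives no concrete values.
SOURCE_RELIABILITY = {"official": 0.9, "news": 0.7, "forum": 0.3}


def relation_confidence(source, occurrences):
    """Combine source reliability and frequency into a value in [0, 1] (assumed formula)."""
    reliability = SOURCE_RELIABILITY.get(source, 0.1)
    frequency = 1.0 - math.exp(-occurrences / 3.0)  # saturates as occurrences grow
    return reliability * frequency


def resolve_conflict(standard, reference, confidence, threshold=0.6):
    """Keep the standard relation unless the reference relation reaches the threshold."""
    return reference if confidence >= threshold else standard


# Two conflicting relations about the same program (illustrative data).
standard = ("Evening News", "hosted_by", "Zhang Min")
reference = ("Evening News", "hosted_by", "Li Wei")

kept_official = resolve_conflict(standard, reference, relation_confidence("official", 10))
kept_forum = resolve_conflict(standard, reference, relation_confidence("forum", 10))
```

With these assumed scores, a relation seen ten times in official sources replaces the conflicting standard relation, while the same relation seen ten times on a forum is discarded.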
In summary, according to the method for constructing a knowledge graph of the embodiment of the present invention, an initial structured knowledge graph is constructed according to a structured database, a reference entity and a reference relationship in an unstructured text of a target field are extracted according to a standard entity and a standard relationship of the initial structured knowledge graph, and the reference entity and the reference relationship extracted in the unstructured text and the standard entity and the standard relationship in the initial structured knowledge graph are subjected to knowledge fusion to construct the knowledge graph of the target field. The initial structured knowledge graph constructed by the structured knowledge is a high-quality and high-accuracy knowledge graph, and is supplemented with unstructured texts to perform targeted expansion and completion on the initial structured knowledge graph, so that a complete and high-quality target domain knowledge graph can be constructed.
Referring to fig. 4, another embodiment of the present application provides a knowledge-graph constructing apparatus, including:
the obtainingmodule 100 is configured to obtain a structured database in a target field.
Aconstruction module 200, configured to construct an initial structured knowledge graph according to the structured database of the target domain, where the initial structured knowledge graph includes standard entities and standard relationships.
And theextraction module 300 is configured to extract the reference entities and the reference relationships in the unstructured text of the target field according to the standard entities and the standard relationships.
And afusion module 400, configured to perform knowledge fusion on the reference entities and reference relationships extracted from the unstructured text and the standard entities and standard relationships in the initial structured knowledge graph to form a knowledge graph of the target domain.
In one embodiment, referring to fig. 5, the extraction module 300 includes a labeling unit 310, a processing unit 320, and a recognition and extraction unit 330. The labeling unit 310 is configured to label the corresponding entities and relationships in the unstructured text according to the standard entities and the standard relationships. The processing unit 320 is configured to perform extractive text summarization on the labeled unstructured text according to the standard entities and the standard relationships, so as to select the set of sentences associated with the standard entities and the standard relationships. The recognition and extraction unit 330 is configured to perform entity recognition on the set of sentences associated with the standard entities and the standard relationships, and to perform entity relationship extraction according to the recognition result, so as to obtain the reference entities and reference relationships in the unstructured text.
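The three units of the extraction module can be sketched as three small functions. This is only an illustration under loud assumptions: substring matching stands in for labeling, a keyword filter stands in for extractive summarization, and a capitalisation heuristic stands in for a trained entity recognizer and for the relation classification and syntactic analysis named in the claims. All function names and the sample text are hypothetical.

```python
import re
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]
# Assumption: a run of capitalised words is treated as a candidate entity.
CAPITALISED = re.compile(r"[A-Z][A-Za-z]*(?: [A-Z][A-Za-z]*)*")


def label(sentence: str, entities: Iterable[str], relationships: Iterable[str]) -> Dict:
    """Labeling unit: record which standard entities and relationships occur."""
    return {
        "text": sentence,
        "entities": sorted(e for e in entities if e in sentence),
        "relationships": sorted(r for r in relationships if r in sentence),
    }


def select_sentences(text: str, entities: Iterable[str], relationships: Iterable[str]) -> List[Dict]:
    """Processing unit: keep only sentences that mention at least one standard
    entity and one standard relationship (a crude extractive filter)."""
    entities, relationships = list(entities), list(relationships)
    sentences = [s.strip() for s in re.split(r"[.!?]", text) if s.strip()]
    labeled = [label(s, entities, relationships) for s in sentences]
    return [item for item in labeled if item["entities"] and item["relationships"]]


def extract_triples(selected: Iterable[Dict]) -> List[Triple]:
    """Recognition and extraction unit: take the nearest candidate entity on
    each side of a relationship mention as head and tail."""
    triples: List[Triple] = []
    for item in selected:
        for relationship in item["relationships"]:
            left, _, right = item["text"].partition(relationship)
            heads = CAPITALISED.findall(left)
            tails = CAPITALISED.findall(right)
            if heads and tails:
                triples.append((heads[-1], relationship, tails[0]))
    return triples


text = ("Alice Zhang works for Acme Media. The weather was pleasant. "
        "Bob Li works for Acme Media.")
selected = select_sentences(text, {"Acme Media"}, {"works for"})
triples = extract_triples(selected)
```

Filtering before recognition means the more expensive recognition step only runs on sentences that already relate to the initial graph, which is the point of placing the summarization step in the middle.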
In one embodiment, the construction module 200 includes an ontology unit 210 and a mapping unit 220. The ontology unit 210 is configured to construct a domain ontology, which includes the important concepts, concept relationships, and axioms of the target domain. The mapping unit 220 is configured to map the structured knowledge in the structured database to the domain ontology to obtain entity nodes and relationship nodes. The fusion module 400 includes a fusion subunit 310, which is configured to perform knowledge fusion on the entity nodes and relationship nodes obtained from different structured databases to obtain the standard entities and standard relationships, and to form the initial structured knowledge graph according to the standard entities and the standard relationships.
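A minimal sketch of the ontology, mapping, and fusion steps described above follows. The ontology contents, the database names `hr_db` and `press_db`, their column names, and the name-normalization rule used to merge duplicate nodes are all assumptions made for illustration; the source does not specify how nodes from different databases are matched.

```python
from typing import Dict, Iterable, Mapping, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical domain ontology: concepts plus one relationship with its
# permitted head and tail concept.
ONTOLOGY = {
    "concepts": {"Person", "Organization"},
    "relationships": {"works_for": ("Person", "Organization")},
}

# Hypothetical mapping from database columns to ontology concepts.
MAPPINGS = {
    "hr_db": {"relationship": "works_for",
              "head": ("employee", "Person"),
              "tail": ("employer", "Organization")},
    "press_db": {"relationship": "works_for",
                 "head": ("reporter", "Person"),
                 "tail": ("outlet", "Organization")},
}


def normalize(name: str) -> str:
    """Assumed matching rule: collapse whitespace and title-case the name."""
    return " ".join(name.split()).title()


def map_rows(db_name: str, rows: Iterable[Mapping[str, str]]):
    """Mapping unit: turn rows into typed entity nodes and relationship nodes."""
    mapping = MAPPINGS[db_name]
    relationship = mapping["relationship"]
    (head_col, head_type), (tail_col, tail_type) = mapping["head"], mapping["tail"]
    if ONTOLOGY["relationships"][relationship] != (head_type, tail_type):
        raise ValueError(f"mapping for {db_name} contradicts the ontology")
    nodes: Dict[str, str] = {}
    edges: Set[Triple] = set()
    for row in rows:
        head, tail = normalize(row[head_col]), normalize(row[tail_col])
        nodes[head], nodes[tail] = head_type, tail_type
        edges.add((head, relationship, tail))
    return nodes, edges


def build_initial_graph(databases: Mapping[str, Iterable[Mapping[str, str]]]):
    """Fuse nodes from all databases; identical normalized names are merged."""
    entities: Dict[str, str] = {}
    relationships: Set[Triple] = set()
    for db_name, rows in databases.items():
        nodes, edges = map_rows(db_name, rows)
        entities.update(nodes)
        relationships |= edges
    return entities, relationships


databases = {
    "hr_db": [{"employee": "alice zhang", "employer": "Acme Media"}],
    "press_db": [{"reporter": "Alice  Zhang", "outlet": "ACME MEDIA"},
                 {"reporter": "Bob Li", "outlet": "Acme Media"}],
}
standard_entities, standard_relationships = build_initial_graph(databases)
```

Here the two spellings of the same person and organization collapse into single standard entities, so the initial graph holds three entities and two relationships rather than one set per database.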
In one embodiment, the fusion module 400 further includes a first verification unit 320 and a second verification unit 330. The first verification unit 320 is configured to verify the reference entities and reference relationships against the standard entities and standard relationships according to the axioms, so as to determine whether the reference entities, the standard entities, the reference relationships, and the standard relationships satisfy the axioms. When they satisfy the axioms, the fusion subunit 310 performs knowledge fusion on the reference entities and the standard entities, and on the reference relationships and the standard relationships. The second verification unit 330 is configured to set a relationship confidence value according to the original source of a reference relationship and the frequency with which the reference relationship occurs. The fusion subunit 310 is configured to perform knowledge fusion on the reference relationships and the standard relationships according to the relationship confidence value: when a reference relationship conflicts with a standard relationship and the relationship confidence value of the reference relationship is smaller than a preset threshold, the reference relationship is deleted.
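The two verification steps can be sketched as follows. The source states only that the confidence value depends on the original source and the occurrence frequency, and that a conflicting reference relationship below a preset threshold is deleted. Everything else here is an assumption: the source weights, the confidence formula, the threshold of 0.6, the definition of a conflict as "same head and relationship, different tail", and the choice to keep conflicting relationships that reach the threshold.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

SOURCE_WEIGHT = {"official_report": 0.9, "forum_post": 0.4}  # hypothetical
THRESHOLD = 0.6  # hypothetical preset threshold


def confidence(source: str, frequency: int) -> float:
    """Assumed formula: source weight, discounted for rarely seen relationships."""
    return SOURCE_WEIGHT.get(source, 0.1) * (1 - 0.5 ** frequency)


def satisfies_axioms(triple: Triple, entity_types: Dict[str, str],
                     axioms: Dict[str, Tuple[str, str]]) -> bool:
    """First verification: the head and tail types must match the axiom
    registered for the relationship (a domain and range check)."""
    head, relationship, tail = triple
    if relationship not in axioms:
        return False
    domain, range_ = axioms[relationship]
    return entity_types.get(head) == domain and entity_types.get(tail) == range_


def fuse(standard: Set[Triple],
         references: Iterable[Tuple[Triple, str, int]],
         entity_types: Dict[str, str],
         axioms: Dict[str, Tuple[str, str]],
         threshold: float = THRESHOLD) -> Set[Triple]:
    fused = set(standard)
    for triple, source, frequency in references:
        if not satisfies_axioms(triple, entity_types, axioms):
            continue  # fails the axioms, so it is not fused
        head, relationship, tail = triple
        conflict = any(h == head and r == relationship and t != tail
                       for h, r, t in standard)
        if conflict and confidence(source, frequency) < threshold:
            continue  # second verification: delete the weak conflicting relationship
        fused.add(triple)
    return fused


axioms = {"works_for": ("Person", "Organization")}
entity_types = {"Alice Zhang": "Person", "Bob Li": "Person",
                "Acme Media": "Organization", "Beta News": "Organization"}
standard = {("Alice Zhang", "works_for", "Acme Media")}
references = [
    (("Bob Li", "works_for", "Acme Media"), "forum_post", 1),     # new, no conflict
    (("Alice Zhang", "works_for", "Beta News"), "forum_post", 2),  # weak conflict
    (("Acme Media", "works_for", "Bob Li"), "official_report", 5), # violates axiom
]
fused = fuse(standard, references, entity_types, axioms)
```

Only the first reference survives in this example: the second conflicts with a standard relationship and falls below the threshold, and the third has its head and tail types reversed.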
It should be noted that, for a description of the apparatus for constructing a knowledge graph, reference may be made to the description of the method for constructing a knowledge graph in the present application; details are not repeated here.
In the apparatus for constructing a knowledge graph of this embodiment of the present invention, the initial structured knowledge graph built from structured knowledge is of high quality and accuracy, and the unstructured text is used to expand and complete it in a targeted manner, so that a complete, high-quality knowledge graph of the target domain can be constructed.
In addition, another embodiment of the present application provides a computer-readable storage medium on which a knowledge graph construction program is stored. When executed by a processor, the knowledge graph construction program implements the aforementioned knowledge graph construction method. For a description of the operation of the knowledge graph construction program, reference may be made to the description of the knowledge graph construction method in the present application; details are not repeated here.
With the computer-readable storage medium of this embodiment of the present invention, a complete, high-quality knowledge graph of the target domain can be constructed by means of the knowledge graph construction method.
In addition, another embodiment of the present application provides an electronic device, which includes a memory, a processor, and a knowledge graph construction program that is stored in the memory and executable on the processor. The processor implements the aforementioned knowledge graph construction method when executing the knowledge graph construction program; details are not repeated here.
With the electronic device of this embodiment of the present invention, a complete, high-quality knowledge graph of the target domain can be constructed by means of the knowledge graph construction method.
It should be noted that the logic and/or steps represented in the flowcharts or otherwise described herein, such as an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the present invention, unless otherwise expressly stated or limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can, for example, be fixedly connected, detachably connected, or integrally formed; can be mechanically or electrically connected; they may be directly connected or indirectly connected through intervening media, or they may be connected internally or in any other suitable relationship, unless expressly stated otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

Translated from Chinese
1. A method for constructing a knowledge graph, characterized by comprising:
obtaining a structured database of a target domain, and constructing an initial structured knowledge graph according to the structured database of the target domain, wherein the initial structured knowledge graph includes standard entities and standard relationships;
extracting reference entities and reference relationships from unstructured text of the target domain according to the standard entities and the standard relationships; and
performing knowledge fusion on the reference entities and reference relationships extracted from the unstructured text and the standard entities and standard relationships in the initial structured graph, to form a knowledge graph of the target domain.

2. The method for constructing a knowledge graph according to claim 1, characterized in that extracting the corresponding reference entities and reference relationships from the unstructured text of the domain according to the standard entities and the standard relationships comprises:
labeling the corresponding entities and relationships in the unstructured text according to the standard entities and the standard relationships;
performing extractive text summarization on the labeled unstructured text according to the standard entities and the standard relationships, to select a set of sentences associated with the standard entities and the standard relationships; and
performing entity recognition on the set of sentences associated with the standard entities and the standard relationships, and performing entity relationship extraction according to the recognition result, to obtain the reference entities and reference relationships in the unstructured text.

3. The method for constructing a knowledge graph according to claim 2, characterized in that performing entity relationship extraction according to the recognition result comprises:
on the basis of the entity recognition performed on the set of sentences associated with the standard entities and the standard relationships, performing entity relationship extraction by means of relation classification and syntactic analysis.

4. The method for constructing a knowledge graph according to claim 1, characterized in that constructing the initial structured knowledge graph according to the structured database of the target domain comprises:
constructing a domain ontology, wherein the domain ontology includes important concepts, concept relationships, and axioms of the target domain;
mapping the structured knowledge in the structured database to the domain ontology to obtain entity nodes and relationship nodes; and
performing knowledge fusion on the entity nodes and the relationship nodes obtained from different structured databases to obtain the standard entities and the standard relationships, and forming the initial structured knowledge graph according to the standard entities and the standard relationships.

5. The method for constructing a knowledge graph according to claim 4, characterized in that performing knowledge fusion on the reference entities and reference relationships extracted from the unstructured text and the standard entities and standard relationships in the initial structured graph comprises:
verifying the reference entities and the reference relationships against the standard entities and the standard relationships according to the axioms, to determine whether the reference entities and the standard entities, and the reference relationships and the standard relationships, satisfy the axioms; and
when the reference entities and the standard entities, and the reference relationships and the standard relationships, satisfy the axioms, performing knowledge fusion on the reference entities and the standard entities, and on the reference relationships and the standard relationships.

6. The method for constructing a knowledge graph according to claim 5, characterized in that performing knowledge fusion on the reference entities and reference relationships extracted from the unstructured text and the standard entities and standard relationships in the initial structured graph further comprises:
setting a relationship confidence value according to the original source of the reference relationship and the frequency with which the reference relationship occurs; and
performing knowledge fusion on the reference relationship and the standard relationship according to the relationship confidence value.

7. The method for constructing a knowledge graph according to claim 6, characterized in that fusing the reference relationship and the standard relationship according to the relationship confidence value comprises:
when the reference relationship conflicts with the standard relationship, deleting the reference relationship if the relationship confidence value of the reference relationship is less than a preset threshold.

8. An apparatus for constructing a knowledge graph, characterized by comprising:
an obtaining module, configured to obtain a structured database of a target domain;
a construction module, configured to construct an initial structured knowledge graph according to the structured database of the target domain, wherein the initial structured knowledge graph includes standard entities and standard relationships;
an extraction module, configured to extract reference entities and reference relationships from unstructured text of the target domain according to the standard entities and the standard relationships; and
a fusion module, configured to perform knowledge fusion on the reference entities and reference relationships extracted from the unstructured text and the standard entities and standard relationships in the initial structured knowledge graph, to form a knowledge graph of the target domain.

9. A computer-readable storage medium, characterized in that a knowledge graph construction program is stored thereon, and the knowledge graph construction program, when executed by a processor, implements the method for constructing a knowledge graph according to any one of claims 1-7.

10. An electronic device, characterized by comprising a memory, a processor, and a knowledge graph construction program stored in the memory and executable on the processor, wherein the processor, when executing the knowledge graph construction program, implements the method for constructing a knowledge graph according to any one of claims 1-7.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202011329767.0A, CN112612899B (en) | 2020-11-24 | 2020-11-24 | Knowledge graph construction method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202011329767.0A, CN112612899B (en) | 2020-11-24 | 2020-11-24 | Knowledge graph construction method and device, storage medium and electronic equipment

Publications (2)

Publication Number | Publication Date
CN112612899A (en) | 2021-04-06
CN112612899B (en) | 2024-06-18

Family

ID=75225859

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202011329767.0A (Active, CN112612899B (en)) | Knowledge graph construction method and device, storage medium and electronic equipment | 2020-11-24 | 2020-11-24

Country Status (1)

Country | Link
CN (1) | CN112612899B (en)


Also Published As

Publication number | Publication date
CN112612899B (en) | 2024-06-18


Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
