CN111598239B - Method and device for extracting process system of article based on graph neural network - Google Patents

Method and device for extracting process system of article based on graph neural network

Info

Publication number
CN111598239B
Authority
CN
China
Prior art keywords
title
level
node
article
level title
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010727219.7A
Other languages
Chinese (zh)
Other versions
CN111598239A (en)
Inventor
宋永生
王楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wenling Technology Beijing Co ltd
Original Assignee
Jiangsu United Industrial Ltd By Share Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu United Industrial Ltd By Share Ltd
Priority to CN202010727219.7A
Publication of CN111598239A
Application granted
Publication of CN111598239B
Anticipated expiration

Abstract

The invention provides a method and a device for extracting the process system of an article based on a graph neural network, relating to the technical field of artificial intelligence. The hierarchical structure of the titles of a first article is identified by analyzing the format information of the first article, and each title is judged as to whether it is a behavior word describing a first process. When a first-level title is a behavior word describing the first process, a time vector between the first-level title and a second-level title in the same lower layer is established, a belonging vector from the lower-layer titles to the upper-layer title is established, and a first title network graph is built from the time vectors and the belonging vectors. Unsupervised graph neural network learning is then performed on the first title network graph together with a large number of second title network graphs built from second articles, yielding the first process system and the step sequence of the first process system, thereby achieving the technical effect of maximizing the accuracy of the result of iterative graph neural network learning on article title hierarchies.

Description

Method and device for extracting process system of article based on graph neural network
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a device for extracting a process system of an article based on a graph neural network.
Background
The basis of machine intelligence is the cognitive architecture of computers, which includes two broad categories. One is static conceptual systems, such as classification systems based on attribute characteristics, structural systems based on physical connection, and relationship systems based on logical relations. The other is dynamic event (process) systems: a process that occurs in a particular spatio-temporal context is an event. The identification and extraction of process systems is therefore an indispensable step for a computer to acquire machine intelligence, is the basis on which a computer judges historical events and predicts future events, and is an important direction of current machine intelligence research.
Identifying the layout and hierarchy of article titles is a mature technology in the industry, because commonly used document formats (such as Word, PDF and HTML) carry format information, and authors use title numbering, font rendering, paragraph indentation, alignment and the like to highlight the hierarchy of titles and paragraphs. A computer can therefore obtain rich information with which to identify the hierarchy of article titles. The identified title hierarchy itself reflects the relationship between a process and its steps: a title node is a step of the upper-layer title it points to, and the process name of the lower-layer titles that point to it, so when constructing the belonging vectors (edges) of a title network graph it is sufficient to rely on the hierarchical information of the article title structure. When determining how many steps a process includes and in what order, however, the information about the process and its attached steps provided by the title structure of a single article is often incomplete: even if two steps look "adjacent" in relative time within one article, other steps may in fact lie hidden between them. Traditional mathematical statistics requires similarity aggregation over a large number of article title structures, irreversibility and consistency checks on the addition and removal of sequence elements in a step sequence, and so on.
However, the applicant of the present invention finds that the prior art has at least the following technical problems:
existing mathematical statistics can only compute over steps that actually appear and has no capacity to infer unknown steps; and when step information for the same process conflicts across different articles, consistency verification causes a loss of accuracy in the final result.
Disclosure of Invention
The embodiment of the invention provides a method and a device for extracting the process system of an article based on a graph neural network, solving the technical problems in the prior art that mathematical statistics can only be performed over steps that actually appear, that there is no capacity to infer unknown steps, and that when step information for the same process conflicts across different articles, consistency verification causes a loss of accuracy in the final result. The method thereby achieves the technical effects of continuous iterative learning based on a graph neural network, a certain capacity to mine hidden steps, and maximized accuracy of the result of the iterative graph neural network learning.
In view of the above problems, the present application provides a method and an apparatus for extracting the process system of an article based on a graph neural network.
In a first aspect, the present invention provides a method for extracting the process system of an article based on a graph neural network, the method comprising: obtaining first article format information of a first article; identifying a title hierarchy of the first article according to the first article format information to obtain a first-level title, wherein the first-level title comprises a first paragraph corresponding to the first-level title; judging whether the first-level title is a behavior word describing a first process; when the first-level title is a behavior word describing the first process, determining an upper-layer title of the first-level title and a lower-layer title of the layer where the first-level title is located; obtaining a second-level title describing the first process among the lower-layer titles, wherein the second-level title contains a second paragraph corresponding to the second-level title; establishing a belonging vector according to the upper-layer title and the lower-layer title, and identifying the first paragraph and the second paragraph according to time to establish a time vector of the first-level title and the second-level title; establishing a first title network graph according to the first-level title, the second-level title, the upper-layer title, the belonging vector and the time vector; obtaining a plurality of second articles, and correspondingly establishing a plurality of second title network graphs according to the plurality of second articles, wherein the article names of the second articles and the first article are synonyms; and inputting the first title network graph and the plurality of second title network graphs into a graph neural network for deep learning to obtain a first process system and a step sequence of the first process system.
Preferably, the first article format information includes a first article text format, a first article font format, and a first article paragraph format.
Preferably, establishing the belonging vector according to the upper-layer title and the lower-layer title includes:
determining an upper-layer node according to the upper-layer title; determining the lower-layer title according to the first-level title and the second-level title; determining a lower-layer node according to the lower-layer title; and obtaining the belonging vector in which the lower-layer node points to the upper-layer node according to the lower-layer node and the upper-layer node.
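As a minimal sketch of this construction (the function name and data layout are illustrative assumptions, not the patent's), each lower-layer step node receives a directed belonging edge pointing at its upper-layer process-name node:

```python
# Sketch: derive "belonging" edges (lower-layer node -> upper-layer node)
# from a title hierarchy. All names here are illustrative assumptions.

def belonging_edges(hierarchy):
    """hierarchy maps an upper-layer title to its lower-layer titles.
    Each lower-layer node gets a directed edge to its upper-layer node."""
    edges = []
    for upper, lowers in hierarchy.items():
        for lower in lowers:
            edges.append((lower, upper))  # step title -> process-name title
    return edges

edges = belonging_edges({"litigation": ["prosecution", "trial"]})
```

Here each step of a hypothetical "litigation" process points at the process it belongs to.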
Preferably, identifying the first paragraph and the second paragraph according to time to establish the time vector of the first-level title and the second-level title includes:
obtaining a first-level title node of the first-level title; obtaining a second-level title node of the second-level title; obtaining a first time quantum according to the first paragraph corresponding to the first-level title; obtaining a second time quantum according to the second paragraph corresponding to the second-level title; judging the time order of the first time quantum and the second time quantum; when the first time quantum precedes the second time quantum, judging whether the first-level title node and the second-level title node are adjacent nodes; and when the first-level title node and the second-level title node are adjacent nodes, obtaining the time vector pointing from the first-level title node to the second-level title node.
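A hedged sketch of this time-vector rule follows. Extracting a four-digit year from each paragraph as the "time quantum" is an assumed simplification, since the text does not fix a concrete time representation; temporally adjacent sibling titles are then connected earlier-to-later:

```python
import re

# Sketch: build time edges between sibling step titles. The year regex
# stands in for the patent's unspecified "time quantum" extraction.

def time_edges(steps):
    """steps: list of (title, paragraph) pairs sharing one belonging
    vector. Returns earlier->later edges between adjacent titles."""
    timed = []
    for title, para in steps:
        m = re.search(r"\b(\d{4})\b", para)
        if m:  # only titles whose paragraph carries a time quantum
            timed.append((int(m.group(1)), title))
    timed.sort()  # order siblings by their time quantum
    # adjacent in time -> edge from the earlier node to the later node
    return [(a, b) for (_, a), (_, b) in zip(timed, timed[1:])]

edges = time_edges([
    ("trial", "The trial opened in 2021."),
    ("prosecution", "The complaint was filed in 2020."),
])
```

A real implementation would need a richer temporal tagger than a year regex, but the edge-drawing logic is the same.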
Preferably, the method further comprises:
inputting the first title network graph and the second title network graphs into the graph neural network for training to obtain a plurality of first title state functions h_v, where the first title state function is denoted h_v = f(X_v, X_co[v], h_ne[v], X_ne[v]), in which: h_v is the vectorized representation of node v and is used to judge whether node v describes the first process; f(*) is a local transfer function shared by all nodes, which updates the node state according to the input neighborhood information; X_v is the feature representation of node v; X_co[v] is the feature representation of the edges connected to node v, i.e. of the belonging vector and the time vector; h_ne[v] is the state of the neighboring nodes of v; and X_ne[v] is the feature representation of the neighbors of node v;
aggregating the plurality of first title state functions h_v into a first title state function set H, where H is expressed as H = F(H, X), F(*) is the set of local transfer functions, and X is the set of node features;
iteratively learning the first title state function set H over time to obtain the iterative function H^(t+1), expressed as H^(t+1) = F(H^t, X), where H^(t+1) is the first title state function set at time t+1 and H^t is the first title state function set at time t;
when H^(t+1) = H^t, computing the iterative function H^(t+1) to obtain the first process system and the step sequence of the first process system.
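The scheme above, iterating H^(t+1) = F(H^t, X) until the state set stops changing, is the classic fixed-point formulation of graph neural networks. The toy sketch below illustrates it; the concrete transfer function (node feature plus a damped neighbor average) and summation readout are illustrative assumptions chosen so the iteration is a contraction, not the patent's own functions:

```python
# Toy sketch of the fixed-point iteration H^{t+1} = F(H^t, X) and the
# readout O = G(H, X). The damped-average f and summation g are assumed.

def iterate_states(features, neighbors, damping=0.5, tol=1e-9):
    """Iterate h_v = X_v + damping * mean(h of neighbors) until the
    state set stops changing, i.e. H^{t+1} == H^t (within tol)."""
    h = {v: 0.0 for v in features}                      # H^0
    while True:
        h_next = {}
        for v in features:
            ne = neighbors.get(v, [])
            avg = sum(h[u] for u in ne) / len(ne) if ne else 0.0
            h_next[v] = features[v] + damping * avg     # local f
        if max(abs(h_next[v] - h[v]) for v in h) < tol:  # fixed point
            return h_next
        h = h_next

def readout(h, features):
    # local output o_v = g(h_v, X_v); the sum is an assumed choice of g
    return {v: h[v] + features[v] for v in h}

states = iterate_states({"a": 1.0, "b": 2.0}, {"a": ["b"], "b": ["a"]})
outputs = readout(states, {"a": 1.0, "b": 2.0})
```

Because the damping factor is below 1, F is a contraction and the iteration converges to a unique fixed point, which is what makes the stopping condition H^(t+1) = H^t meaningful.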
Preferably, the method further comprises:
determining, according to the plurality of first title state functions h_v, the nodes v that describe the first process as a plurality of first steps O_v, where O_v is expressed as O_v = g(h_v, X_v) and g(*) is a local output function;
aggregating the plurality of first steps O_v to obtain a first step set O of the first process system, where O is expressed as O = G(H, X) and G(*) is the set of local output functions.
In a second aspect, the present invention provides an apparatus for extracting the process system of an article based on a graph neural network, the apparatus comprising:
a first obtaining unit, configured to obtain first article format information of a first article;
a second obtaining unit, configured to identify a title hierarchy of the first article according to the first article format information to obtain a first-level title, where the first-level title includes a first paragraph corresponding to the first-level title;
the first judging unit is used for judging whether the first-level title is a behavior word describing a first process;
a first determining unit, configured to determine, when the first-level title is a behavior word describing the first process, an upper-layer title of the first-level title and a lower-layer title where the first-level title is located;
a third obtaining unit, configured to obtain a second-level title that describes the first process in the lower-level title, where the second-level title includes a second paragraph corresponding to the second-level title;
a first constructing unit, configured to establish a belonging vector according to the upper-layer title and the lower-layer title, and to identify the first paragraph and the second paragraph according to time to establish a time vector of the first-level title and the second-level title;
a second constructing unit, configured to establish a first title network graph according to the first-level title, the second-level title, the upper-layer title, the belonging vector and the time vector;
a third constructing unit, configured to obtain a plurality of second articles and correspondingly establish a plurality of second title network graphs according to the plurality of second articles, where the article names of the second articles and the first article are synonyms;
a fourth obtaining unit, configured to input the first title network graph and the plurality of second title network graphs into a graph neural network for deep learning, so as to obtain a first process system and a step sequence of the first process system.
Preferably, the first article format information includes a first article text format, a first article font format, and a first article paragraph format.
Preferably, establishing the belonging vector by the first constructing unit according to the upper-layer title and the lower-layer title includes:
a second determining unit, configured to determine an upper-layer node according to the upper-layer title;
a third determining unit, configured to determine the lower-layer title according to the first-level title and the second-level title;
a fourth determining unit, configured to determine a lower-layer node according to the lower-layer title;
a fifth obtaining unit, configured to obtain, according to the lower-layer node and the upper-layer node, the belonging vector in which the lower-layer node points to the upper-layer node.
Preferably, establishing, in the first constructing unit, the time vector of the first-level title and the second-level title by identifying the first paragraph and the second paragraph according to time includes:
a sixth obtaining unit configured to obtain a first-level title node of the first-level title;
a seventh obtaining unit configured to obtain a second-level title node of the second-level title;
an eighth obtaining unit, configured to obtain a first time quantum according to the first paragraph corresponding to the first-level title;
a ninth obtaining unit, configured to obtain a second time quantum according to the second paragraph corresponding to the second-level title;
a second judging unit, configured to judge the time order of the first time quantum and the second time quantum;
a third judging unit, configured to judge whether the first-level title node and the second-level title node are adjacent nodes when the first time quantum precedes the second time quantum;
a tenth obtaining unit, configured to obtain the time vector pointing from the first-level title node to the second-level title node when the first-level title node and the second-level title node are adjacent nodes.
Preferably, the apparatus further comprises:
a tenth obtaining unit, configured to input the first title network graph and the second title network graphs into the graph neural network for training to obtain a plurality of first title state functions h_v, where the first title state function is denoted h_v = f(X_v, X_co[v], h_ne[v], X_ne[v]), in which: h_v is the vectorized representation of node v and is used to judge whether node v describes the first process; f(*) is a local transfer function shared by all nodes, which updates the node state according to the input neighborhood information; X_v is the feature representation of node v; X_co[v] is the feature representation of the edges connected to node v, i.e. of the belonging vector and the time vector; h_ne[v] is the state of the neighboring nodes of v; and X_ne[v] is the feature representation of the neighbors of node v;
an eleventh obtaining unit, configured to aggregate the plurality of first title state functions h_v into a first title state function set H, where H is expressed as H = F(H, X), F(*) is the set of local transfer functions, and X is the set of node features;
a twelfth obtaining unit, configured to iteratively learn the first title state function set H over time to obtain the iterative function H^(t+1), expressed as H^(t+1) = F(H^t, X), where H^(t+1) is the first title state function set at time t+1 and H^t is the first title state function set at time t;
a thirteenth obtaining unit, configured to compute the iterative function H^(t+1) when H^(t+1) = H^t, so as to obtain the first process system and the step sequence of the first process system.
Preferably, the apparatus further comprises:
a fifth determining unit, configured to determine, according to the plurality of first title state functions h_v, the nodes v that describe the first process as a plurality of first steps O_v, where O_v is expressed as O_v = g(h_v, X_v) and g(*) is a local output function;
a fourteenth obtaining unit, configured to aggregate the plurality of first steps O_v to obtain a first step set O of the first process system, where O is expressed as O = G(H, X) and G(*) is the set of local output functions.
In a third aspect, the present invention provides an apparatus for extracting the process system of an article based on a graph neural network, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of any one of the above methods when executing the program.
In a fourth aspect, the invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of any of the methods described above.
One or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
the method and the device for extracting the process system of the article based on the graph neural network provided by the embodiment of the invention are characterized in that first article format information of a first article is obtained; identifying a title hierarchy of the first article according to the first article format information to obtain a first-level title, wherein the first-level title comprises a first paragraph corresponding to the first-level title; judging whether the first-level title is a behavior word describing a first process; when the first-level title is a behavior word describing the first process, determining an upper-layer title of the first-level title and a lower-layer title where the first-level title is located; obtaining a second-level title describing the first process in the lower-level title, wherein the second-level title contains a second paragraph corresponding to the second-level title; establishing a vector according to the upper layer title and the lower layer title, and identifying the first paragraph and the second paragraph according to time to establish a time vector of the first-level title and the second-level title; establishing a first title network graph according to the first-level title, the second-level title, the upper-layer title, the affiliated vector and the time vector; obtaining a plurality of second articles, and correspondingly establishing a plurality of second title network graphs according to the plurality of second articles, wherein the article names of the second articles and the first articles belong to synonyms; the first headline network graph and the plurality of second headline network graphs are input into a graph neural network for deep learning to obtain the step sequence of the first process system and the first process system, so that the technical problems that in the prior art, mathematical statistics can only be performed on the steps which appear, the capacity of unknown steps is 
not deduced, and when step information of the same process reflected by different articles conflicts, consistency verification can cause the loss of the accuracy of the final result are solved, continuous iterative learning based on the graph neural network is achieved, certain capacity of excavating hidden steps is achieved, and the technical effect of maximizing the accuracy of the result of the iterative learning of the graph neural network is ensured.
The foregoing description is only an overview of the technical solutions of the present invention; the embodiments of the invention are described below so that the technical means of the invention can be more clearly understood and the above and other objects, features and advantages of the invention become more readily apparent.
Drawings
FIG. 1 is a flowchart illustrating a method for extracting the process system of an article based on a graph neural network according to an embodiment of the present invention;
FIG. 2 is a block diagram of an apparatus for extracting the process system of an article based on a graph neural network according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of another apparatus for extracting the process system of an article based on a graph neural network according to an embodiment of the present invention.
Description of reference numerals: a first obtaining unit 11, a second obtaining unit 12, a first judging unit 13, a first determining unit 14, a third obtaining unit 15, a first constructing unit 16, a second constructing unit 17, a third constructing unit 18, a fourth obtaining unit 19, a bus 300, a receiver 301, a processor 302, a transmitter 303, a memory 304, and a bus interface 306.
Detailed Description
The embodiment of the invention provides a method and a device for extracting the process system of an article based on a graph neural network, which are used to solve the technical problems in the prior art that mathematical statistics can only be performed over steps that actually appear, that there is no capacity to infer unknown steps, and that consistency verification causes a loss of accuracy in the final result when step information for the same process conflicts across different articles.
The general idea of the technical scheme provided by the invention is as follows: obtain first article format information of a first article; identify a title hierarchy of the first article according to the first article format information to obtain a first-level title, wherein the first-level title comprises a first paragraph corresponding to the first-level title; judge whether the first-level title is a behavior word describing a first process; when the first-level title is a behavior word describing the first process, determine an upper-layer title of the first-level title and a lower-layer title of the layer where the first-level title is located; obtain a second-level title describing the first process among the lower-layer titles, wherein the second-level title contains a second paragraph corresponding to the second-level title; establish a belonging vector according to the upper-layer title and the lower-layer title, and identify the first paragraph and the second paragraph according to time to establish a time vector of the first-level title and the second-level title; establish a first title network graph according to the first-level title, the second-level title, the upper-layer title, the belonging vector and the time vector; obtain a plurality of second articles and correspondingly establish a plurality of second title network graphs, wherein the article names of the second articles and the first article are synonyms; and input the first title network graph and the plurality of second title network graphs into a graph neural network for deep learning to obtain a first process system and the step sequence of the first process system. Continuous iterative learning based on the graph neural network is thereby achieved, a certain capacity to mine hidden steps is obtained, and the technical effect of maximizing the accuracy of the result of the iterative graph neural network learning is ensured.
The technical solutions of the present invention are described in detail below with reference to the drawings and specific embodiments. It should be understood that the specific features in the embodiments and examples of the present invention explain, rather than limit, the technical solutions of the present application, and that the technical features in the embodiments and examples of the present application may be combined with one another in the absence of conflict.
The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean that A exists alone, that A and B exist simultaneously, or that B exists alone. In addition, the character "/" herein generally indicates that the former and latter associated objects are in an "or" relationship.
Example one
Fig. 1 is a flowchart illustrating a method for extracting the process system of an article based on a graph neural network according to an embodiment of the present invention. As shown in fig. 1, an embodiment of the present invention provides a method for extracting the process system of an article based on a graph neural network, where the method includes:
step 110: first article format information of a first article is obtained.
Step 120: and identifying the title hierarchy of the first article according to the first article format information to obtain a first-level title, wherein the first-level title comprises a first paragraph corresponding to the first-level title.
Further, the first article format information includes a first article text format, a first article font format, and a first article paragraph format.
Specifically, the first article text format, the first article font format and the first article paragraph format of the first article are analyzed, such as the title font, the title font size, the paragraph indentation and the alignment. According to the first article text format, first article font format, first article paragraph format and the like in the first article format information, the title hierarchy of the first article is identified, and the level of each title, namely the first-level title, is obtained, where the levels include a first level, a second level, a third level and so on. "First-level title" is a collective term for each title in the hierarchy identified in the first article. The first-level title comprises a corresponding first paragraph, where the first paragraph is the specific text content describing or further expanding the first-level title and belongs to the content attached to the first-level title.
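As a hedged illustration of identifying the title hierarchy from format information, the sketch below infers title levels from font size alone (larger font means a higher level); this is an assumed simplification, since real articles also signal hierarchy through numbering, indentation and alignment, and all names are hypothetical:

```python
# Sketch: infer title levels from format information. Using font size
# alone is an assumed simplification of the format cues in the text.

def title_levels(titles):
    """titles: list of (text, font_size). Larger font -> higher level,
    with level 1 as the top. Returns a {text: level} mapping."""
    sizes = sorted({size for _, size in titles}, reverse=True)
    rank = {size: i + 1 for i, size in enumerate(sizes)}  # size -> level
    return {text: rank[size] for text, size in titles}

levels = title_levels([("Litigation", 16), ("Prosecution", 12), ("Trial", 12)])
```

A production system would combine several format signals and break ties with numbering patterns, but the ranking idea is the same.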
Step 130: and judging whether the first-level title is a behavior word describing a first process.
Step 140: and when the first-level title is a behavior word describing the first process, determining an upper-layer title of the first-level title and a lower-layer title where the first-level title is located.
Step 150: obtaining a second level title describing the first process in the lower level title, wherein the second level title includes a second paragraph corresponding to the second level title.
Specifically, process identification is performed on each title in the title hierarchy identified in the first article, that is, it is determined which process each first-level title describes. When a first-level title contains a behavior word describing a first process, the upper-layer title of the first-level title and the lower-layer titles of the layer where the first-level title is located are determined, where the name of the first process is the upper-layer title of the layer where the first-level title is located, and the lower-layer titles are the titles of that layer. The same process identification is performed on all titles of the layer where the first-level title is located, and all second-level titles describing the first process among the lower-layer titles are obtained, where the second-level titles and the first-level title belong to the same layer. The second-level title comprises a corresponding second paragraph, where the second paragraph is the specific text content describing or further expanding the second-level title and belongs to the content attached to the second-level title. For example, if a title describing "prosecution" and a title describing "trial" both have the same upper-layer title "litigation", then the process is named "litigation", and both "prosecution" and "trial" are steps of the litigation process.
Step 160: and establishing a belonging vector according to the upper layer title and the lower layer title, and identifying the first paragraph and the second paragraph according to time to establish a time vector of the first-level title and the second-level title.
Further, establishing the belonging vector according to the upper-layer title and the lower-layer title includes: determining an upper-layer node according to the upper-layer title; determining the lower-layer title according to the first-level title and the second-level title; determining a lower-layer node according to the lower-layer title; and obtaining the belonging vector in which the lower-layer node points to the upper-layer node according to the lower-layer node and the upper-layer node. Further, identifying the first paragraph and the second paragraph according to time to establish the time vector of the first-level title and the second-level title includes: obtaining a first-level title node of the first-level title; obtaining a second-level title node of the second-level title; obtaining a first time quantum according to the first paragraph corresponding to the first-level title; obtaining a second time quantum according to the second paragraph corresponding to the second-level title; judging the time order of the first time quantum and the second time quantum; when the first time quantum precedes the second time quantum, judging whether the first-level title node and the second-level title node are adjacent nodes; and when the first-level title node and the second-level title node are adjacent nodes, obtaining the time vector pointing from the first-level title node to the second-level title node.
Specifically, each title in the article is taken as a node of a title network graph; the upper-layer title is determined to be an upper-layer node and the lower-layer titles are determined to be lower-layer nodes, wherein the lower-layer titles comprise the first-level title and the second-level titles. The belonging vector pointing from each lower-layer node to the upper-layer node is obtained according to the lower-layer node and the upper-layer node, that is, an edge is drawn from each step title in the lower layer to the first-process-name title in the upper layer as the belonging vector. A first-level title node of the first-level title and a second-level title node of the second-level title are obtained; a first time amount is obtained according to the first paragraph corresponding to the first-level title, and a second time amount is obtained according to the second paragraph corresponding to the second-level title. The time sequence of the first time amount and the second time amount is judged, and when the first time amount is before the second time amount, it is judged whether the first-level title node and the second-level title node are adjacent nodes, that is, whether the next step after the first-level title node is the second-level title node. When the first-level title node and the second-level title node are adjacent nodes, a time vector pointing from the first-level title node to the second-level title node is obtained. That is, a time amount is found in the text paragraph corresponding to each step title that shares the same belonging vector, adjacent nodes are found among the titles containing time amounts, and an edge connecting the two adjacent nodes is drawn in first-to-last order as the time vector.
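The edge construction just described can be sketched as below. The time amounts are passed in directly rather than extracted from paragraph text, which is a simplifying assumption; the function and field names are illustrative, not the patent's:

```python
def build_edges(process_name, steps_with_times):
    """steps_with_times: (step_title, time_amount) pairs, one per step
    heading under the process-name heading (time extraction is elided).
    Returns (belonging_edges, time_edges): belonging edges point from each
    step node up to the process-name node; time edges link adjacent steps
    in first-to-last time order."""
    belonging = [(title, process_name) for title, _ in steps_with_times]
    ordered = sorted(steps_with_times, key=lambda pair: pair[1])  # by time
    time_edges = [(a[0], b[0]) for a, b in zip(ordered, ordered[1:])]
    return belonging, time_edges

b, t = build_edges("litigation", [("court", 2), ("prosecution", 1)])
print(b)  # [('court', 'litigation'), ('prosecution', 'litigation')]
print(t)  # [('prosecution', 'court')]
```

Note that even though "court" appears first in the input, the time edge points from "prosecution" to "court" because its time amount is earlier.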
Step 170: establishing a first title network graph according to the first-level title, the second-level title, the upper-layer title, the belonging vector and the time vector.
Specifically, the first-level title, the second-level titles and the upper-layer title are taken as nodes, and the belonging vectors and time vectors are taken as edges, to establish the first title network graph. That is, a first-level title and the neighboring second-level titles linked to it by belonging vectors and time vectors form nodes of the first title network graph, and all steps included under each first process are linked in this manner to form the first title network graph of the first process.
Step 180: obtaining a plurality of second articles, and correspondingly establishing a plurality of second title network graphs according to the plurality of second articles, wherein the article names of the second articles and the first article are synonyms.
Step 190: inputting the first title network graph and the plurality of second title network graphs into a graph neural network for deep learning to obtain a first process system and a sequence of steps of the first process system.
Specifically, in any single article, the titles describing the steps of a process under a process-name title do not necessarily cover all steps of the process; they may be incomplete. To obtain the complete set of steps of the process, all steps of the same process that occur across a large number of articles need to be analyzed. To do this, the present embodiment combines the title hierarchies of a large number of articles into a data set whose elementary units are individual "process names and their subordinate steps", although each unit may contain steps that are incomplete and out of order. Thus, a large number of second articles are obtained, the article names of the second articles and the first article being synonyms, i.e. the second articles and the first article belong to the same type of article. A plurality of second title network graphs are correspondingly established from the plurality of second articles, and the first title network graph and the second title network graphs are input into a graph neural network for deep learning, adding new nodes from the second title network graphs to the first title network graph or adjusting the positions of existing nodes. Through continuous iterative learning, when the gradient function of the nodes between the second title network graphs and the first title network graph tends to zero, a first process system with extremely high integrity and the sequence of steps of the first process system are obtained. In the continuously iterated learning process, the graph neural network can deduce the definition (label) of a core node from the information of the surrounding nodes and edges, so it has a certain capacity for mining hidden nodes (steps), and thereby obtains a process system with high integrity and consistency.
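Stripped of the learned state functions, the merge loop described above can be pictured as a union of node and edge sets that stops when a pass over the second graphs adds nothing new, corresponding to the gradient tending to zero. The (nodes, edges) set representation is an illustrative assumption:

```python
def merge_title_graphs(first, seconds):
    """first: the first title network graph as a (nodes, edges) pair of sets.
    seconds: a list of second title network graphs in the same form.
    New nodes and edges found in the second graphs are added to the first
    graph; iteration stops when a full pass changes nothing (the fixed
    point at which the graph no longer grows)."""
    nodes, edges = set(first[0]), set(first[1])
    changed = True
    while changed:
        changed = False
        for n2, e2 in seconds:
            if not (n2 <= nodes and e2 <= edges):  # something new to add
                nodes |= n2
                edges |= e2
                changed = True
    return nodes, edges

nodes, edges = merge_title_graphs(
    ({"prosecution"}, set()),
    [({"prosecution", "court"}, {("prosecution", "court")})])
print(sorted(nodes))  # ['court', 'prosecution']
print(sorted(edges))  # [('prosecution', 'court')]
```

A hidden step ("court") absent from the first article is recovered here from a second article's graph, which is the point of analyzing many articles at once.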
Further, the method further comprises: inputting the first title network graph and the second title network graphs into the graph neural network for training to obtain a plurality of first title state functions h_v, wherein the first title state function h_v is denoted h_v = f(X_v, X_co[v], h_ne[v], X_ne[v]), where h_v is the vectorized representation of a node v and is used to judge whether the node v describes the first process; f(*) is the local transition function, shared by all nodes, which updates the node state according to the input domain information; X_v is the feature representation of the node v; X_co[v] is the feature representation of the edges connected to the node v, i.e. of the belonging vectors and the time vectors; h_ne[v] is the state of the neighboring nodes; X_ne[v] is the feature representation of the neighbors of the node v; aggregating the plurality of first title state functions h_v to obtain a first title state function set H, denoted H = F(H, X), where F(*) is the set of local transition functions and X is the feature set of the node v; iteratively learning the first title state function set H along time to obtain an iterative function H^(t+1), denoted H^(t+1) = F(H^t, X), where H^(t+1) is the first title state function set at time t+1 and H^t is the first title state function set at time t; and when H^(t+1) = H^t, computing the iterative function H^(t+1) to obtain the first process system and a sequence of steps of the first process system.
Further, the method further comprises: according to the plurality of first title state functions hvDetermining the node v as describing a plurality of first steps O in the first processvWherein the first step OvIs represented by Ov= g(hv,Xv) Wherein g (#) is a local output function; subjecting the plurality of first steps toProcedure OvPerforming aggregation to obtain a first step aggregation O of the first process system, wherein the first step aggregation O is represented by O = G (H, X), and G (×) is a local output function aggregation.
Specifically, the first title network graph and the second title network graphs are input into the graph neural network for training to obtain a plurality of first title state functions h_v, where h_v = f(X_v, X_co[v], h_ne[v], X_ne[v]); the first title state function converts the nodes of the graph, composed of the first-level titles, into a numerical representation of their state. The second title state functions of the plurality of second title network graphs and the first title state functions are collected to obtain the first title state function set H, i.e. the totality of all nodes in the first title network graph and the second title network graphs. The first title state function set H is iteratively learned along time, sorting all nodes in H according to time order, to obtain the iterative function H^(t+1). Concretely, a first time amount is found in the first paragraph under a first-level title and compared with the time amounts of adjacent nodes; if the time order is correct, the position is not adjusted, and if it is not, the position of the first-level title node in the graph is adjusted until the time order is correct. When H^(t+1) = H^t, the iterative function H^(t+1) is computed and the first process system and the sequence of steps of the first process system are obtained. That is, there is a relationship between adjacent states of the node set: each learning pass of the graph neural network is one iteration, and each iteration adds a new node to the graph, or adjusts the position of an existing node in the graph, or both. (H^(t+1) - H^t) can be used as the gradient function, and the objective of the iterative learning is to make this gradient function approach zero.
When no new node can be added and no existing node position needs adjusting, the gradient function is zero; that is, no matter how many further articles and titles are added, the number of processes the graph neural network can find no longer changes and the sequence of steps of each found process no longer changes, i.e. H^(t+1) = H^t, and a first process system with extremely high integrity and the sequence of steps of the first process system are obtained. In the process of iteratively learning the first title network graph and the second title network graphs, the graph neural network uses the plurality of first title state functions h_v to determine that the node v describes a plurality of first steps O_v in the first process, and aggregates the plurality of first steps O_v to obtain the first step set O of the first process system; that is, all first steps in the second title network graphs and the first steps in the first title network graph are aggregated to determine the complete set of first steps.
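As a toy illustration of iterating the state update H^(t+1) = F(H^t, X) to the fixed point H^(t+1) = H^t described above, the sketch below uses a simple averaging transition function. That contractive choice of f is an assumption made here so the iteration provably converges; the embodiment only requires a shared local transition function:

```python
def iterate_states(neighbors, features, tol=1e-9, max_iter=10_000):
    """neighbors: {node: [adjacent nodes]}; features: {node: float} (X).
    Repeats h_v <- f(X_v, h_ne[v]) for every node until the update no
    longer changes the state set, i.e. H^(t+1) ~= H^t."""
    h = {v: 0.0 for v in neighbors}  # H^0
    for _ in range(max_iter):
        new_h = {
            v: 0.5 * features[v]
               + 0.5 * sum(h[u] for u in neighbors[v]) / max(len(neighbors[v]), 1)
            for v in neighbors
        }
        if max(abs(new_h[v] - h[v]) for v in neighbors) < tol:
            return new_h  # fixed point reached: H^(t+1) = H^t
        h = new_h
    return h

h = iterate_states({"prosecution": ["court"], "court": ["prosecution"]},
                   {"prosecution": 1.0, "court": 1.0})
print(round(h["prosecution"], 6))  # 1.0
```

With both node features equal to 1.0, the fixed point solves h = 0.5 + 0.5h, so every state converges to 1.0, mirroring how the iteration stops once further passes change nothing.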
Example two
Based on the same inventive concept as the method for extracting the process system of an article based on a graph neural network in the foregoing embodiment, the present invention further provides an apparatus for extracting the process system of an article based on a graph neural network; as shown in fig. 2, the apparatus includes:
a first obtaining unit 11, where the first obtaining unit 11 is configured to obtain first article format information of a first article;

a second obtaining unit 12, where the second obtaining unit 12 is configured to identify a title hierarchy of the first article according to the first article format information to obtain a first-level title, where the first-level title includes a first paragraph corresponding to the first-level title;

a first judging unit 13, where the first judging unit 13 is configured to judge whether the first-level title is a behavior word describing a first process;

a first determining unit 14, where the first determining unit 14 is configured to determine an upper-layer title of the first-level title and a lower-layer title where the first-level title is located when the first-level title is a behavior word describing the first process;

a third obtaining unit 15, where the third obtaining unit 15 is configured to obtain a second-level title describing the first process in the lower-layer title, where the second-level title includes a second paragraph corresponding to the second-level title;

a first constructing unit 16, where the first constructing unit 16 is configured to establish a belonging vector according to the upper-layer title and the lower-layer title, and to identify the first paragraph and the second paragraph according to time to establish a time vector of the first-level title and the second-level title;

a second constructing unit 17, where the second constructing unit 17 is configured to establish a first title network graph according to the first-level title, the second-level title, the upper-layer title, the belonging vector and the time vector;

a third constructing unit 18, where the third constructing unit 18 is configured to obtain a plurality of second articles and correspondingly establish a plurality of second title network graphs according to the plurality of second articles, where the article names of the second articles and the first article are synonyms;

a fourth obtaining unit 19, where the fourth obtaining unit 19 is configured to input the first title network graph and the plurality of second title network graphs into a graph neural network for deep learning, so as to obtain a first process system and a sequence of steps of the first process system.
Further, the first article format information includes a first article text format, a first article font format, and a first article paragraph format.
Further, the establishing, by the first constructing unit, of the belonging vector according to the upper-layer title and the lower-layer title includes:
a second determining unit configured to determine an upper node according to the upper header;
a third determining unit configured to determine the lower-layer title according to the first-level title and the second-level title;
a fourth determining unit configured to determine a lower node according to the lower title;
a fifth obtaining unit, configured to obtain, according to the lower node and the upper node, the belonging vector of the lower node pointing to the upper node.
Further, the identifying, by the first constructing unit, of the first paragraph and the second paragraph according to time to establish the time vector of the first-level title and the second-level title includes:
a sixth obtaining unit configured to obtain a first-level title node of the first-level title;
a seventh obtaining unit configured to obtain a second-level title node of the second-level title;
an eighth obtaining unit, configured to obtain a first amount of time according to the first paragraph corresponding to the first level title;
a ninth obtaining unit, configured to obtain a second amount of time according to the second paragraph corresponding to the second level title;
a second judging unit, configured to judge a time sequence of the first time amount and the second time amount;
a third judging unit, configured to judge whether the first-level title node and the second-level title node are adjacent nodes when the first amount of time is before the second amount of time;
a tenth obtaining unit configured to obtain the time vector pointing from the first-level title node to the second-level title node when the first-level title node and the second-level title node are adjacent nodes.
Further, the apparatus further comprises:
a first training unit, configured to input the first title network graph and the second title network graphs into the graph neural network for training to obtain a plurality of first title state functions h_v, where h_v = f(X_v, X_co[v], h_ne[v], X_ne[v]); the first title state function h_v is the vectorized representation of a node v and is used to judge whether the node v describes the first process; f(*) is the local transition function, shared by all nodes, which updates the node state according to the input domain information; X_v is the feature representation of the node v; X_co[v] is the feature representation of the edges connected to the node v, i.e. of the belonging vector and the time vector; h_ne[v] is the state of the neighboring nodes; X_ne[v] is the feature representation of the neighbors of the node v;
an eleventh obtaining unit, configured to aggregate the plurality of first title state functions h_v to obtain a first title state function set H, where H = F(H, X), F(*) is the set of local transition functions, and X is the feature set of the node v;
a twelfth obtaining unit, configured to iteratively learn the first title state function set H along time to obtain an iterative function H^(t+1), where H^(t+1) = F(H^t, X), H^(t+1) is the first title state function set at time t+1, and H^t is the first title state function set at time t;
a thirteenth obtaining unit, configured to compute the iterative function H^(t+1) when H^(t+1) = H^t, so as to obtain the first process system and a sequence of steps of the first process system.
Further, the apparatus further comprises:
a fifth determining unit, configured to determine, according to the plurality of first title state functions h_v, that the node v describes a plurality of first steps O_v in the first process, where O_v = g(h_v, X_v) and g(*) is the local output function;
a fourteenth obtaining unit, configured to aggregate the plurality of first steps O_v to obtain a first step set O of the first process system, where O = G(H, X) and G(*) is the set of local output functions.
Various changes and specific examples of the method for extracting a process system of an article based on a graph neural network in the first embodiment of fig. 1 are also applicable to the apparatus for extracting a process system of an article based on a graph neural network in the present embodiment, and through the foregoing detailed description of the method for extracting a process system of an article based on a graph neural network, those skilled in the art can clearly know an implementation method of the apparatus for extracting a process system of an article based on a graph neural network in the present embodiment, so for the brevity of the description, detailed descriptions are not further provided here.
Example three
Based on the same inventive concept as the method for extracting the process system of an article based on a graph neural network in the foregoing embodiments, the present invention further provides an apparatus for extracting the process system of an article based on a graph neural network, as shown in fig. 3, including a memory 304, a processor 302, and a computer program stored on the memory 304 and operable on the processor 302, where the processor 302 executes the program to implement the steps of any one of the methods for extracting the process system of an article based on a graph neural network.
In fig. 3, a bus architecture (represented by bus 300) is shown; bus 300 may include any number of interconnected buses and bridges, and links together various circuits including one or more processors, represented by processor 302, and memory, represented by memory 304. The bus 300 may also link together various other circuits such as peripherals, voltage regulators and power management circuits, which are well known in the art and therefore not described further herein. A bus interface 306 provides an interface between the bus 300 and the receiver 301 and transmitter 303. The receiver 301 and the transmitter 303 may be the same element, i.e. a transceiver, providing a means for communicating with various other apparatus over a transmission medium. The processor 302 is responsible for managing the bus 300 and general processing, and the memory 304 may be used for storing data used by the processor 302 in performing operations.
Example four
Based on the same inventive concept as the method for extracting the process system of an article based on a graph neural network in the foregoing embodiments, the present invention also provides a computer-readable storage medium on which a computer program is stored, which when executed by a processor implements the following steps: obtaining first article format information of a first article; identifying a title hierarchy of the first article according to the first article format information to obtain a first-level title, wherein the first-level title comprises a first paragraph corresponding to the first-level title; judging whether the first-level title is a behavior word describing a first process; when the first-level title is a behavior word describing the first process, determining an upper-layer title of the first-level title and a lower-layer title where the first-level title is located; obtaining a second-level title describing the first process in the lower-layer title, wherein the second-level title contains a second paragraph corresponding to the second-level title; establishing a belonging vector according to the upper-layer title and the lower-layer title, and identifying the first paragraph and the second paragraph according to time to establish a time vector of the first-level title and the second-level title; establishing a first title network graph according to the first-level title, the second-level title, the upper-layer title, the belonging vector and the time vector; obtaining a plurality of second articles, and correspondingly establishing a plurality of second title network graphs according to the plurality of second articles, wherein the article names of the second articles and the first article are synonyms; and inputting the first title network graph and the plurality of second title network graphs into a graph neural network for deep learning to obtain a first process system and a sequence of steps of the first process system.
In a specific implementation, when the program is executed by a processor, any method step in the first embodiment may be further implemented.
One or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
The method and the device for extracting the process system of an article based on a graph neural network provided by the embodiments of the present invention obtain first article format information of a first article; identify a title hierarchy of the first article according to the first article format information to obtain a first-level title, wherein the first-level title comprises a first paragraph corresponding to the first-level title; judge whether the first-level title is a behavior word describing a first process; when the first-level title is a behavior word describing the first process, determine an upper-layer title of the first-level title and a lower-layer title where the first-level title is located; obtain a second-level title describing the first process in the lower-layer title, wherein the second-level title contains a second paragraph corresponding to the second-level title; establish a belonging vector according to the upper-layer title and the lower-layer title, and identify the first paragraph and the second paragraph according to time to establish a time vector of the first-level title and the second-level title; establish a first title network graph according to the first-level title, the second-level title, the upper-layer title, the belonging vector and the time vector; obtain a plurality of second articles, and correspondingly establish a plurality of second title network graphs according to the plurality of second articles, wherein the article names of the second articles and the first article are synonyms; and input the first title network graph and the plurality of second title network graphs into a graph neural network for deep learning to obtain the first process system and the sequence of steps of the first process system. This solves the technical problems in the prior art that mathematical statistics can only be performed on steps that actually appear, that unknown steps cannot be inferred, and that consistency verification causes a loss of accuracy in the final result when step information of the same process conflicts across different articles. Continuous iterative learning based on the graph neural network thus provides a certain capacity for mining hidden steps, ensuring the technical effect of maximizing the accuracy of the result of the iterative learning of the graph neural network.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (9)

CN202010727219.7A | priority/filing date 2020-07-27 | Method and device for extracting process system of article based on graph neural network | Active | CN111598239B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202010727219.7A | 2020-07-27 | 2020-07-27 | Method and device for extracting process system of article based on graph neural network


Publications (2)

Publication Number | Publication Date
CN111598239A (en) | 2020-08-28
CN111598239B (en, grant) | 2020-11-06

Family

ID=72183075





Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
TR01 | Transfer of patent right

Effective date of registration: 2022-05-13
Address after: Room 408, unit 2, building 15, courtyard 16, Yingcai North Third Street, future science city, Changping District, Beijing 102200
Patentee after: Wenling Technology (Beijing) Co.,Ltd.
Address before: Room 1502, Tongfu building, 501 Zhongshan South Road, Qinhuai District, Nanjing, Jiangsu 210006
Patentee before: Jiangsu United Industrial Limited by Share Ltd.
