

Technical Field
The invention relates to a data alignment method for an electric power business system based on a graph convolutional neural network, and belongs to the technical field of electric power business data processing.
Background Art
The deepening reform of the power industry requires power enterprises to further improve their information construction and thereby achieve a greater degree of information interconnection and resource sharing, so that power enterprises can effectively manage data resources, fully exploit the value of those resources, reduce costs and increase benefits, and further expand the scope of the industry.
However, when power business systems are in use, different business departments may record inconsistent information about the same equipment. As time goes on, power network data becomes increasingly complex, and data is frequently copied and modified as it propagates across different business systems, reducing its reliability. The information that represents the same device in different business systems must then be revised to guarantee data reliability and consistency.
Summary of the Invention
To overcome the above problems, the present invention provides a data alignment method for an electric power business system based on a graph convolutional neural network. Through a trained entity alignment model, the method associates equivalent entities of different business systems, predicts which records in multiple power business systems point to the same real-world object, and thereby ensures the reliability of data from different data sources.
The technical solution of the present invention is as follows:
A data alignment method for a power business system based on a graph convolutional neural network, comprising:
acquiring equipment ledger information data of two power business systems to be aligned, and cleaning and preprocessing the equipment ledger information data;
separately obtaining the entities in each power business system and the relations between the entities, constructing a knowledge graph from the entities and their relations, and obtaining pre-aligned entity pairs between the two knowledge graphs, wherein the entities are the nodes of the knowledge graph and the relations between the entities are its edges;
inputting the knowledge graphs into a graph self-attention convolutional neural network for training, and using the entity pairs as alignment seeds, i.e., as the supervision information during training of the graph self-attention convolutional neural network;
obtaining an embedding vector representation of each node through the graph self-attention convolutional neural network, computing the similarity between the nodes of the two knowledge graphs, and taking the two most similar nodes as aligned nodes;
rewriting the attribute data of the entities of the power business systems to be aligned according to the aligned nodes.
Further, cleaning and preprocessing the equipment ledger information data specifically comprises removing damaged records from the equipment ledger information data and converting the remaining data into CSV format.
Further, constructing a knowledge graph from the entities and the relations between the entities and obtaining pre-aligned entity pairs between the two knowledge graphs specifically comprises:
extracting the nodes of the knowledge graph, specifically using the attributes that distinguish entities in the equipment ledger information data as the nodes of the knowledge graph, and using the text information fields of the entities as the text features of the nodes;
extracting the semantic information of the text features with a language model to obtain the embedding vector representation of each node;
constructing the edges between the nodes according to the relations between the entities;
obtaining the two knowledge graphs Gk to be aligned:
Gk = {Ek, Rk, Tk};
where k = 1 or 2, and Ek, Rk and Tk are, respectively, the set of nodes, the set of entity relations and the set of triples <e1, r, e2> of the knowledge graph, with e1, e2 ∈ Ek, r ∈ Rk, and r being the relation between entity e1 and entity e2;
pre-aligning the entities according to attributes that have unique values, so as to obtain entity pairs; the entity pair set S is:
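In the standard set notation assumed here, the pre-aligned pair set is:
S = {(x, y) | x ∈ E1, y ∈ E2, x and y record the same physical device};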
where x and y are entity nodes of the knowledge graphs G1 and G2, respectively.
Further, the language model is the LaBSE model.
Further, inputting the knowledge graphs into the graph self-attention convolutional neural network for training, and using the entity pairs as alignment seeds, i.e., as the supervision information during training, specifically comprises:
S1, dividing the entity pair set S into a training set and a validation set at a preset ratio, the test set being the unaligned nodes;
using a single-layer graph attention convolutional neural network to aggregate the neighbor information of each node in the knowledge graph, specifically:
S2, randomly sampling 20 neighbor nodes of a target node as the neighbor set Ni, the neighbor nodes including the target node itself, and updating the embedding vector representation of the target node from the embedding vector representations of the neighbor nodes in Ni; the aggregation formula is as follows:
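Writing hi for the embedding vector of node i (notation assumed here), a standard single-layer graph-attention update consistent with this description is:
hi ← σ( Σj∈Ni aij·W·hj );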
where W is a trainable weight matrix that maps the node embedding vectors to higher-level features, and σ is the Sigmoid nonlinear activation function, computed as follows:
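The Sigmoid function takes its usual form:
σ(x) = 1 / (1 + e^(−x));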
aij is the importance of node j to node i, computed as follows:
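In the standard graph-attention form assumed here (∥ denotes vector concatenation), aij is a softmax over the sampled neighbor set:
aij = exp( LeakyReLU( a[W·hi ∥ W·hj] ) ) / Σk∈Ni exp( LeakyReLU( a[W·hi ∥ W·hk] ) );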
where a is a linear layer that converts a vector into a scalar, and LeakyReLU is a nonlinear activation function, expressed as follows:
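LeakyReLU takes its usual piecewise form, with p the slope applied to negative inputs:
LeakyReLU(x) = x for x ≥ 0, and LeakyReLU(x) = p·x for x < 0;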
where p is the slope coefficient;
S3, training the graph self-attention convolutional neural network on the basis of the entity pairs, updating the network parameters by gradient descent, and adopting Bayesian personalized ranking as the objective function of the supervised learning, expressed as follows:
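With sim(·,·) denoting a similarity between node embeddings (the exact form is assumed here), a Bayesian-personalized-ranking objective consistent with this description is:
L = Σ(x,y,y−) −ln σ( sim(x, y) − sim(x, y−) );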
where (x, y, y−) is a training triple constructed for node x of one knowledge graph, y is the node of the other knowledge graph pre-aligned with node x, and y− is any node randomly sampled other than x and y;
S4, performing unsupervised training based on a graph-structure multi-view enhancement method, specifically:
perturbing the encoder network parameters θ to obtain perturbed network parameters θ′, and feeding the same knowledge graph node into the network and into the perturbed network to obtain two view representations h and h′ of the node, expressed as follows:
h = f(N; θ), h′ = f(N; θ′);
the encoder is perturbed as follows:
θ′l = θl + η·Δθl;
where θl and θ′l are the parameters of the l-th layer of the graph self-attention convolutional neural network and of the perturbed graph self-attention convolutional neural network, respectively, η is an adjustable perturbation-strength hyperparameter, and Δθl is a perturbation term drawn from a zero-mean Gaussian distribution;
taking InfoNCE as the target optimization function to pull the original representation and the perturbed representation of the same node closer together and to push the perturbed representations of other nodes further away, expressed as follows:
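Written in its usual form with the symbols defined below (notation assumed), the InfoNCE objective is:
L = −(1/N)·Σi=1..N ln[ exp( sim(hi, hi′)/τ ) / Σj=1..N exp( sim(hi, hj′)/τ ) ];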
where N is the number of nodes in a training batch, sim is the cosine similarity, and τ is an adjustable parameter;
S5, repeating steps S2 to S5 until the value of the objective function converges or a preset number of training iterations is reached.
Further, the entity pair set S is divided into the training set and the validation set at a ratio of 4:1.
Further, the unaligned nodes of whichever of the two knowledge graphs has fewer nodes are used as the test set.
Further, obtaining the embedding vector representation of each node through the graph self-attention convolutional neural network, computing the similarity between the nodes of the two knowledge graphs, and taking the two most similar nodes as aligned nodes specifically comprises:
computing the embedding vector representation of each node of the knowledge graphs through the graph self-attention convolutional neural network;
computing the similarity between the node representations of the two knowledge graphs according to the Euclidean norm:
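A distance-based similarity consistent with this description (assumed form), with a smaller distance indicating a higher similarity, is:
d(x, y) = ‖hx − hy‖2;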
taking the two most similar nodes as aligned nodes.
Further, the method also comprises collecting performance statistics on the labeled training-set and validation-set nodes, and manually checking the unlabeled test-set nodes to judge whether the alignment is valid, specifically:
computing Hits@1 and Hits@10 for the labeled training-set and validation-set nodes:
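Hits@k takes its usual form:
Hits@k = (1/|S|)·Σi=1..|S| I( ranki ≤ k );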
where S is the set of aligned node pairs, |S| is the number of aligned node pairs, ranki is the link-prediction rank of the i-th aligned node pair, and I is an indicator function, I = 1 if its argument is true and I = 0 otherwise;
manually checking the unlabeled test-set nodes to judge whether the alignment is valid.
The present invention has the following beneficial effects:
Through a trained entity alignment model, the present invention associates equivalent entities of different business systems, predicts which records in multiple power business systems point to the same real-world object, and ensures the reliability of data from different data sources. By combining a graph self-attention convolutional neural network, supervised learning and perturbed-view contrastive learning, the present invention aligns the nodes of two knowledge graphs better than the prior art.
Brief Description of the Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of the training process of the graph self-attention convolutional neural network according to an embodiment of the present invention.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
Referring to FIGS. 1-2, a data alignment method for a power business system based on a graph convolutional neural network comprises:
acquiring equipment ledger information data of two power business systems to be aligned, and cleaning and preprocessing the equipment ledger information data;
separately obtaining the entities in each power business system and the relations between the entities, constructing a knowledge graph from the entities and their relations, and obtaining pre-aligned entity pairs between the two knowledge graphs, wherein the entities are the nodes of the knowledge graph and the relations between the entities are its edges;
inputting the knowledge graphs into a graph self-attention convolutional neural network for training, and using the entity pairs as alignment seeds, i.e., as the supervision information during training of the graph self-attention convolutional neural network;
obtaining an embedding vector representation of each node through the graph self-attention convolutional neural network, computing the similarity between the nodes of the two knowledge graphs, and taking the two most similar nodes as aligned nodes;
rewriting the attribute data of the entities of the power business systems to be aligned according to the aligned nodes.
In a specific embodiment, cleaning and preprocessing the equipment ledger information data specifically comprises removing damaged records from the equipment ledger information data and converting the remaining data into CSV format.
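A minimal pandas sketch of this cleaning step, assuming the ledger is exported as CSV and that "damaged" records are rows missing key identifying fields or exact duplicates (file and column names are illustrative, not the actual ledger schema):

```python
import pandas as pd

def clean_ledger(in_path: str, out_path: str, key_columns: list[str]) -> pd.DataFrame:
    """Drop damaged rows (missing key fields or fully duplicated) and write the result as CSV."""
    df = pd.read_csv(in_path)              # raw equipment-ledger export
    df = df.dropna(subset=key_columns)     # discard records missing identifying fields
    df = df.drop_duplicates()              # discard exact duplicate records
    df.to_csv(out_path, index=False, encoding="utf-8")
    return df

# Illustrative usage with placeholder column names:
# clean_ledger("system_a_ledger.csv", "system_a_clean.csv", ["device_id", "device_name"])
```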
In one embodiment of the present invention, constructing a knowledge graph from the entities and the relations between the entities and obtaining pre-aligned entity pairs between the two knowledge graphs specifically comprises:
extracting the nodes of the knowledge graph, specifically using the attributes that distinguish entities in the equipment ledger information data as the nodes of the knowledge graph, and using the text information fields of the entities as the text features of the nodes;
extracting the semantic information of the text features with a language model to obtain the embedding vector representation of each node;
constructing the edges between the nodes according to the relations between the entities;
obtaining the two knowledge graphs Gk to be aligned:
Gk = {Ek, Rk, Tk};
where k = 1 or 2, and Ek, Rk and Tk are, respectively, the set of nodes, the set of entity relations and the set of triples <e1, r, e2> of the knowledge graph, with e1, e2 ∈ Ek, r ∈ Rk, and r being the relation between entity e1 and entity e2;
pre-aligning the entities according to attributes that have unique values, so as to obtain entity pairs; the entity pair set S is as defined above, x and y being the entity nodes of the knowledge graphs G1 and G2, respectively.
In a specific embodiment, the language model is the LaBSE model.
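A minimal sketch of extracting node text embeddings with LaBSE through the sentence-transformers package; the example texts and the choice of normalized embeddings are illustrative assumptions:

```python
from sentence_transformers import SentenceTransformer

# LaBSE produces language-agnostic sentence embeddings, so Chinese and mixed-language
# ledger text from different business systems land in the same vector space.
model = SentenceTransformer("sentence-transformers/LaBSE")

node_texts = [
    "110kV main transformer #1, Station A",   # illustrative ledger text fields
    "Feeder breaker 213, Station A",
]
node_embeddings = model.encode(node_texts, normalize_embeddings=True)
print(node_embeddings.shape)  # (num_nodes, 768)
```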
In one embodiment of the present invention, inputting the knowledge graphs into the graph self-attention convolutional neural network for training, and using the entity pairs as alignment seeds, i.e., as the supervision information during training, specifically comprises:
S1, dividing the entity pair set S into a training set and a validation set at a preset ratio, the test set being the unaligned nodes;
using a single-layer graph attention convolutional neural network to aggregate the neighbor information of each node in the knowledge graph, specifically:
S2, randomly sampling 20 neighbor nodes of a target node as the neighbor set Ni, the neighbor nodes including the target node itself, and updating the embedding vector representation of the target node from the embedding vector representations of the neighbor nodes in Ni, following the aggregation formula given above;
where W is the trainable weight matrix that maps the node embedding vectors to higher-level features, and σ is the Sigmoid nonlinear activation function defined above;
aij is the importance of node j to node i, computed as defined above;
a is a linear layer that converts a vector into a scalar, and LeakyReLU is the nonlinear activation function given above,
where p is the slope coefficient; a code sketch of this aggregation step is given below.
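A minimal PyTorch sketch of the single-layer attention aggregation of step S2, assuming node embeddings are stored in one tensor per graph and neighbor lists come from the knowledge-graph edges; class, function and tensor names are illustrative:

```python
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)    # trainable weight matrix W
        self.a = nn.Linear(2 * dim, 1, bias=False)  # linear layer a: vector -> scalar

    def forward(self, h: torch.Tensor, neighbor_ids: torch.Tensor) -> torch.Tensor:
        # h: (num_nodes, dim) node embeddings; neighbor_ids: (num_nodes, k) sampled neighbors.
        Wh = self.W(h)                                  # W * h_j for every node
        Wh_nb = Wh[neighbor_ids]                        # (num_nodes, k, dim)
        Wh_self = Wh.unsqueeze(1).expand_as(Wh_nb)      # repeat W * h_i along the neighbor axis
        scores = self.a(torch.cat([Wh_self, Wh_nb], dim=-1)).squeeze(-1)
        attn = torch.softmax(F.leaky_relu(scores), dim=-1)              # a_ij over sampled neighbors
        return torch.sigmoid((attn.unsqueeze(-1) * Wh_nb).sum(dim=1))   # sigma(sum_j a_ij * W * h_j)

def sample_neighbors(adjacency: dict, num_nodes: int, k: int = 20) -> torch.Tensor:
    """Randomly sample k neighbors per node (with replacement); the node itself is always included."""
    rows = []
    for i in range(num_nodes):
        pool = adjacency.get(i, []) + [i]
        rows.append([i] + random.choices(pool, k=k - 1))
    return torch.tensor(rows, dtype=torch.long)
```

Sigmoid is used as the outer activation because the description names σ as the Sigmoid function, and the sampling size of 20 matches step S2; both are the stated choices rather than typical GAT defaults.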
S3, training the graph self-attention convolutional neural network on the basis of the entity pairs, updating the network parameters by gradient descent, and adopting Bayesian personalized ranking, as defined above, as the objective function of the supervised learning;
where (x, y, y−) is a training triple constructed for node x of one knowledge graph, y is the node of the other knowledge graph pre-aligned with node x, and y− is any node randomly sampled other than x and y (a code sketch of this loss follows);
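A sketch of the Bayesian-personalized-ranking style alignment loss of step S3, assuming cosine similarity between the embeddings of a pre-aligned pair (x, y) and a randomly sampled negative y⁻; the function name is illustrative:

```python
import torch
import torch.nn.functional as F

def bpr_alignment_loss(h_x: torch.Tensor, h_y: torch.Tensor, h_y_neg: torch.Tensor) -> torch.Tensor:
    # h_x, h_y, h_y_neg: (batch, dim) embeddings of the anchor, its aligned node and a negative node.
    pos = F.cosine_similarity(h_x, h_y, dim=-1)        # similarity of the pre-aligned pair
    neg = F.cosine_similarity(h_x, h_y_neg, dim=-1)    # similarity of the sampled negative pair
    # Encourage the aligned pair to score higher than the negative pair.
    return -torch.log(torch.sigmoid(pos - neg)).mean()
```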
S4, performing unsupervised training based on a graph-structure multi-view enhancement method, specifically:
perturbing the encoder network parameters θ to obtain perturbed network parameters θ′, and feeding the same knowledge graph node into the network and into the perturbed network to obtain two view representations h and h′ of the node, expressed as follows:
h = f(N; θ), h′ = f(N; θ′);
the encoder is perturbed as follows:
θ′l = θl + η·Δθl;
where θl and θ′l are the parameters of the l-th layer of the graph self-attention convolutional neural network and of the perturbed graph self-attention convolutional neural network, respectively, η is an adjustable perturbation-strength hyperparameter, and Δθl is a perturbation term drawn from a zero-mean Gaussian distribution;
taking InfoNCE, as defined above, as the target optimization function to pull the original representation and the perturbed representation of the same node closer together and to push the perturbed representations of other nodes further away;
where N is the number of nodes in a training batch, sim is the cosine similarity, and τ is an adjustable parameter.
By perturbing the parameters of the graph self-attention convolutional neural network, a perturbed graph self-attention convolutional neural network is obtained; each node is passed through the original network and the perturbed network to output its original-view and perturbed-view representations, respectively, and by pulling the original-view and perturbed-view representations of the same node closer together while pushing them away from the perturbed-view representations of other nodes, a more robust graph self-attention convolutional neural network is obtained (a code sketch of this step follows);
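A sketch of the parameter-perturbation views and the InfoNCE objective of step S4, assuming the encoder is a torch nn.Module, the perturbation is zero-mean Gaussian noise scaled by η, and the other nodes of the batch serve as negatives; names are illustrative:

```python
import copy
import torch
import torch.nn.functional as F

def perturbed_copy(encoder: torch.nn.Module, eta: float = 0.1) -> torch.nn.Module:
    """theta'_l = theta_l + eta * delta, with delta drawn from a zero-mean Gaussian."""
    noisy = copy.deepcopy(encoder)
    with torch.no_grad():
        for p in noisy.parameters():
            p.add_(eta * torch.randn_like(p))
    return noisy

def info_nce(h: torch.Tensor, h_prime: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    # h, h_prime: (N, dim) original-view and perturbed-view embeddings of the same N nodes.
    h = F.normalize(h, dim=-1)
    h_prime = F.normalize(h_prime, dim=-1)
    logits = h @ h_prime.t() / tau                  # cosine similarities between all view pairs
    targets = torch.arange(h.size(0), device=h.device)
    # Diagonal entries are the positives (same node, two views); all other entries are negatives.
    return F.cross_entropy(logits, targets)
```

Using copy.deepcopy keeps the original encoder untouched, so the gradient is typically taken through the unperturbed view only; this is one common choice, not something the description mandates.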
S5, repeating steps S2 to S5 until the value of the objective function converges or a preset number of training iterations is reached.
In a specific embodiment, the entity pair set S is divided into the training set and the validation set at a ratio of 4:1.
In one embodiment of the present invention, the unaligned nodes of whichever of the two knowledge graphs has fewer nodes are used as the test set.
Because the knowledge graph with more nodes retains a large number of residual unaligned nodes that have no true counterpart in the smaller knowledge graph, the unaligned nodes of the smaller knowledge graph are used as the test set.
In one embodiment of the present invention, obtaining the embedding vector representation of each node through the graph self-attention convolutional neural network, computing the similarity between the nodes of the two knowledge graphs, and taking the two most similar nodes as aligned nodes specifically comprises:
computing the embedding vector representation of each node of the knowledge graphs through the graph self-attention convolutional neural network;
computing the similarity between the node representations of the two knowledge graphs according to the Euclidean norm, as defined above;
taking the two most similar nodes as aligned nodes (a code sketch of this matching step follows).
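A sketch of this matching step, assuming the trained encoder yields one embedding matrix per knowledge graph and that, for each node of G1, the G2 node at the smallest Euclidean distance is taken as its alignment; names are illustrative:

```python
import torch

def align_by_distance(h1: torch.Tensor, h2: torch.Tensor) -> list:
    # h1: (n1, dim) embeddings of graph G1; h2: (n2, dim) embeddings of graph G2.
    dist = torch.cdist(h1, h2, p=2)          # pairwise Euclidean distances
    nearest = dist.argmin(dim=1)             # for each G1 node, the closest G2 node
    return [(i, int(j)) for i, j in enumerate(nearest)]
```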
In one embodiment of the present invention, the method further comprises collecting performance statistics on the labeled training-set and validation-set nodes, and manually checking the unlabeled test-set nodes to judge whether the alignment is valid, specifically:
computing Hits@1 and Hits@10, as defined above, for the labeled training-set and validation-set nodes;
where S is the set of aligned node pairs, |S| is the number of aligned node pairs, ranki is the link-prediction rank of the i-th aligned node pair, and I is an indicator function, I = 1 if its argument is true and I = 0 otherwise;
manually checking the unlabeled test-set nodes to judge whether the alignment is valid (a code sketch of the Hits@k computation follows).
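A sketch of the Hits@1 / Hits@10 statistics on the labeled pairs, assuming the rank of a pair is the 1-based position of the true counterpart in the distance-sorted candidate list; names are illustrative:

```python
import torch

def hits_at_k(h1: torch.Tensor, h2: torch.Tensor, pairs: list, k: int) -> float:
    # pairs: labeled aligned pairs (index in G1, index in G2) from the training/validation sets.
    dist = torch.cdist(h1, h2, p=2)
    hits = 0
    for i, j in pairs:
        rank = int((dist[i] < dist[i, j]).sum().item()) + 1   # 1-based link-prediction rank
        hits += int(rank <= k)
    return hits / len(pairs)

# e.g. hits_at_k(h1, h2, labeled_pairs, 1) and hits_at_k(h1, h2, labeled_pairs, 10)
```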
Referring to Table 1, the method of the present invention is compared with an entity alignment technique based on a relation-aware dual-graph convolutional network, showing that the combination of a graph self-attention convolutional neural network, supervised learning and perturbed-view contrastive learning in the method of the present invention outperforms the prior art and aligns the nodes of the two knowledge graphs well.
Table 1
The above description is merely an embodiment of the present invention and is not intended to limit the patent scope of the present invention; any equivalent structure made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present invention.