CN115935941A


Info

Publication number: CN115935941A
Application number: CN202211606845.6A
Authority: CN
Inventors: 蔡宇翔; 蒋鑫; 付婷; 倪文书; 王川丰; 杨启帆
Original assignee: State Grid Fujian Electric Power Co Ltd; Information and Telecommunication Branch of State Grid Fujian Electric Power Co Ltd
Current assignee: State Grid Fujian Electric Power Co Ltd; Information and Telecommunication Branch of State Grid Fujian Electric Power Co Ltd
Priority date: 2022-12-14
Filing date: 2022-12-14
Publication date: 2023-04-07
Anticipated expiration: 2042-12-14
Also published as: CN115935941B

Abstract

Translated from Chinese

The invention relates to a data alignment method for power business systems based on a graph convolutional neural network, comprising: cleaning and preprocessing equipment ledger information data; constructing knowledge graphs from the entities and the relationships between them, and obtaining pre-aligned entity pairs between the knowledge graphs; inputting the knowledge graphs into a graph self-attention convolutional neural network for training, with the entity pairs serving as alignment seeds that provide the supervision information for the network; obtaining the embedding vector representation of each node through the graph self-attention convolutional neural network, calculating the similarity of the nodes across the knowledge graphs, and taking the two most similar nodes as aligned nodes; and rewriting the attribute data of the entities of the power business systems to be aligned according to the aligned nodes. Using the trained entity alignment model, the invention associates equivalent entities in different business systems, predicts which data in multiple power business systems refer to the same real-world object, and thereby safeguards the reliability of data from different data sources.

Description

A data alignment method for power business systems based on a graph convolutional neural network

Technical Field

The invention relates to a data alignment method for power business systems based on a graph convolutional neural network, and belongs to the technical field of power business data processing.

Background Art

The deepening reform of the power industry requires power companies to further improve their informatization so as to achieve greater information interconnection and resource sharing. This allows power companies to manage their data resources effectively, fully exploit the value of those resources, reduce costs and increase profits, and thereby further expand the industry space.

However, when power business systems are in use, different business departments may record inconsistent data for the same equipment. Over time, power network data becomes increasingly complex, and data is frequently copied and modified as it propagates between power business systems, which reduces the reliability of the data in the different business systems. The information that represents the same device in different business systems then needs to be corrected to ensure the reliability and consistency of the data.

Summary of the Invention

To overcome the above problems, the present invention provides a data alignment method for power business systems based on a graph convolutional neural network. Using a trained entity alignment model, the method associates equivalent entities in different business systems, predicts which data in multiple power business systems refer to the same real-world object, and safeguards the reliability of data from different data sources.

The technical solution of the present invention is as follows:

A data alignment method for power business systems based on a graph convolutional neural network, comprising:

acquiring the equipment ledger information data of two power business systems to be aligned, and cleaning and preprocessing the equipment ledger information data;

obtaining the entities in each power business system and the relationships between the entities, constructing a knowledge graph for each system according to the entities and the relationships between them, and obtaining pre-aligned entity pairs between the two knowledge graphs, wherein the entities are the nodes of the knowledge graph and the relationships between the entities are the edges of the knowledge graph;

inputting the knowledge graphs into a graph self-attention convolutional neural network for training, with the entity pairs serving as alignment seeds and as supervision information during the training of the graph self-attention convolutional neural network;

obtaining the embedding vector representation of each node through the graph self-attention convolutional neural network, calculating the similarity of the nodes between the knowledge graphs, and taking the two most similar nodes as aligned nodes;

rewriting the attribute data of the entities of the power business systems to be aligned according to the aligned nodes.

Further, cleaning and preprocessing the equipment ledger information data specifically comprises removing damaged data from the equipment ledger information data and converting the remaining data into CSV format.
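The cleaning step can be illustrated with a minimal sketch. The text does not define what counts as damaged data, so the sketch assumes a record is damaged when a required field is missing or blank; the function name `clean_ledger` and the field names `device_id`, `name` and `substation` are hypothetical.

```python
import csv
import io

def clean_ledger(rows, required_fields):
    """Drop damaged records and return the remaining ones as CSV text."""
    kept = []
    for row in rows:
        # Assumption: a record is damaged if any required field is missing or blank.
        if all(str(row.get(f) or "").strip() for f in required_fields):
            kept.append({f: str(row[f]).strip() for f in required_fields})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=required_fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(kept)
    return buffer.getvalue()

# Hypothetical ledger records from one business system.
raw_records = [
    {"device_id": "T-001", "name": "Main transformer 1", "substation": "North"},
    {"device_id": "", "name": "Breaker 7", "substation": "North"},  # damaged: no ID
    {"device_id": "T-002", "name": None, "substation": "South"},    # damaged: no name
    {"device_id": "B-010", "name": "Breaker 10", "substation": "South"},
]
csv_text = clean_ledger(raw_records, ["device_id", "name", "substation"])
```

In practice the damage criterion would depend on each business system's schema; the blank-field check only stands in for that rule.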

Further, constructing the knowledge graphs according to the entities and the relationships between the entities, and obtaining the pre-aligned entity pairs between the two knowledge graphs, specifically comprises:

extracting the nodes of the knowledge graph, specifically, taking the attributes in the equipment ledger information data that are used to distinguish entities as the nodes of the knowledge graph, and taking the text information fields of the entities as the text features of the nodes;

extracting the semantic information of the text features through a language model to obtain the embedding vector representations of the nodes;

constructing the edges between the nodes according to the relationships between the entities;

obtaining the two knowledge graphs G_k to be aligned:

G_k = {E_k, R_k, T_k};

where k = 1 or 2, and E_k, R_k and T_k are respectively the set of nodes, the set of entity relationships and the set of triples <e₁, r, e₂> in the knowledge graph, with e₁, e₂ ∈ E, r ∈ R, and r being the relationship between entity e₁ and entity e₂;

pre-aligning the entities according to those attributes of the entities that have unique values to obtain the entity pairs, the entity pair set S being:

where x and y are entity nodes of the knowledge graphs G₁ and G₂, respectively.
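A minimal sketch of graph construction and pre-alignment follows. It assumes each system's ledger is a list of records, that one field identifies the entity, and that a shared attribute with unique values (here a hypothetical `serial` field) links records of the two systems; under that reading, S is the set of pairs (x, y) with x in G₁ and y in G₂ that carry the same unique value. All function and field names are illustrative.

```python
def build_graph(records, key_field, relation_fields):
    """Return (entities, relations, triples), i.e. E_k, R_k, T_k for one system."""
    entities, relations, triples = set(), set(), set()
    for rec in records:
        head = rec[key_field]
        entities.add(head)
        for rel in relation_fields:
            tail = rec.get(rel)
            if tail:
                entities.add(tail)
                relations.add(rel)
                triples.add((head, rel, tail))
    return entities, relations, triples

def pre_align(records_1, records_2, key_1, key_2, unique_attr):
    """Pair entities of the two systems whose unique attribute values match."""
    index = {}
    for rec in records_2:
        value = rec.get(unique_attr)
        if value:
            index[value] = rec[key_2]
    pairs = set()
    for rec in records_1:
        value = rec.get(unique_attr)
        if value in index:
            pairs.add((rec[key_1], index[value]))
    return pairs

# Hypothetical records from the two systems to be aligned.
system_1 = [
    {"id": "A1", "serial": "SN-100", "located_in": "Substation North"},
    {"id": "A2", "serial": "SN-200", "located_in": "Substation North"},
]
system_2 = [
    {"code": "X9", "serial": "SN-200", "located_in": "North Station"},
    {"code": "X7", "serial": "SN-300", "located_in": "North Station"},
]
E1, R1, T1 = build_graph(system_1, "id", ["located_in"])
seeds = pre_align(system_1, system_2, "id", "code", "serial")
```

Only `A2` and `X9` share a serial number here, so they form the single seed pair; the remaining nodes are left for the trained model to align.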

Further, the language model is the LaBSE model.

Further, inputting the knowledge graphs into the graph self-attention convolutional neural network for training, with the entity pairs serving as alignment seeds and as supervision information during the training of the graph self-attention convolutional neural network, specifically comprises:

S1. Divide the entity pair set S into a training set and a validation set at a preset ratio; the test set consists of the unaligned nodes.

A single-layer graph attention convolutional neural network is used to aggregate the neighbor information of each node in the knowledge graph, specifically:

S2. Randomly sample 20 neighbor nodes of the target node as the neighbor set N_i, where the neighbor set includes the target node itself, and use the embedding vector representations of the neighbor nodes in N_i to update the embedding vector representation of the target node. The aggregation formula is as follows:

where W is a trainable weight matrix that maps the embedding vector representations of the nodes to higher-level features, and σ is the nonlinear activation function Sigmoid, calculated as follows:

a_ij is the importance of node j to node i, calculated as follows:

where a is a linear layer that converts a vector into a scalar value, and LeakyReLU is a nonlinear activation function, expressed as follows:

where p is a coefficient;
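The formulas that step S2 refers to do not appear in this text. Read against the definitions given (W, σ, a_ij, the linear layer a, LeakyReLU and p), they are consistent with a standard graph attention layer. The block below is a reconstruction under that assumption, not the patent's verbatim equations; the symbols h_j and h_i' for the input and updated embeddings, the softmax normalization over N_i, and the concatenation operator ∥ are assumed notation.

```latex
\begin{aligned}
h_i' &= \sigma\Big(\sum_{j \in N_i} a_{ij}\, W h_j\Big),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}}, \\
a_{ij} &= \frac{\exp\big(\mathrm{LeakyReLU}\big(a\,[\,W h_i \,\|\, W h_j\,]\big)\big)}
               {\sum_{k \in N_i} \exp\big(\mathrm{LeakyReLU}\big(a\,[\,W h_i \,\|\, W h_k\,]\big)\big)}, \\
\mathrm{LeakyReLU}(z) &=
\begin{cases}
z, & z > 0, \\
p\,z, & z \le 0.
\end{cases}
\end{aligned}
```

Because N_i contains node i itself, the term with j = i lets the target node keep part of its own representation during aggregation.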

S3. Train the graph self-attention convolutional neural network based on the entities, updating the network parameters by gradient descent, with Bayesian personalized ranking as the objective function of supervised learning. The expression is as follows:

where (x, y, y^-) is the training triplet constructed for node x of one knowledge graph, y is the node in the other knowledge graph that is pre-aligned with node x, and y^- is any randomly sampled node other than x and y;
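The expression for the objective is not reproduced in this text. As a hedged illustration, the sketch below uses the usual Bayesian personalized ranking form, the negative log-sigmoid of the score difference between the positive pair (x, y) and the negative pair (x, y^-). The score is assumed to be the negative Euclidean distance between embeddings, chosen to match the Euclidean-norm similarity used later; the patent may define the score differently. The names `bpr_loss` and `emb` are hypothetical.

```python
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def bpr_loss(emb, triplets):
    """Mean BPR loss over (x, y, y_neg) triplets of node names."""
    total = 0.0
    for x, y, y_neg in triplets:
        # Assumed score: negative distance, so closer nodes score higher.
        margin = euclidean(emb[x], emb[y_neg]) - euclidean(emb[x], emb[y])
        # -log(sigmoid(margin)) rewritten as log(1 + exp(-margin)).
        total += math.log1p(math.exp(-margin))
    return total / len(triplets)

# Toy embeddings: "y" lies close to "x", "n" lies far away.
emb = {"x": [0.0, 0.0], "y": [0.1, 0.0], "n": [3.0, 4.0]}
good = bpr_loss(emb, [("x", "y", "n")])  # true partner is the closer node
bad = bpr_loss(emb, [("x", "n", "y")])   # true partner is the farther node
```

The loss is small when the pre-aligned partner is closer to x than the sampled negative and grows when the order is reversed, which is the behavior gradient descent exploits.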

S4. Perform unsupervised training based on a graph-structure multi-view augmentation method, specifically:

The encoder network parameters θ are perturbed to obtain the perturbed network parameters θ′, and the same knowledge graph node is input into the network and the perturbed network to obtain two view representations h and h′ of the node, expressed as follows:

h = f(N; θ), h′ = f(N; θ′);

The encoder is perturbed as follows:

θ′_l = θ_l + η·Δθ_l;

where θ_l and θ′_l are the parameters of the l-th layer of the graph self-attention convolutional neural network and of the perturbed graph self-attention convolutional neural network, respectively, η is an adjustable hyperparameter that controls the perturbation strength, and Δθ_l is a perturbation term drawn from a Gaussian distribution with zero mean and a given variance;
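The perturbation rule θ′_l = θ_l + η·Δθ_l can be sketched as follows. The variance of the Gaussian noise is not recoverable from this text, so the sketch exposes it as a hypothetical `sigma` argument (a standard deviation); each layer's parameters are represented as a flat list of floats purely for illustration.

```python
import random

def perturb_parameters(theta, eta, sigma, rng):
    """Return theta' with theta'_l = theta_l + eta * delta_l for every layer l."""
    perturbed = []
    for layer in theta:
        # delta_l: zero-mean Gaussian noise, one sample per parameter.
        # sigma is an assumed value; the source does not give the variance here.
        perturbed.append([w + eta * rng.gauss(0.0, sigma) for w in layer])
    return perturbed

theta = [[0.5, -1.0, 2.0], [0.1, 0.2]]  # two toy layers
theta_same = perturb_parameters(theta, eta=0.0, sigma=1.0, rng=random.Random(0))
theta_noisy = perturb_parameters(theta, eta=0.1, sigma=1.0, rng=random.Random(0))
```

With η = 0 the perturbed encoder coincides with the original one, so η directly controls how far the second view can drift from the first.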

InfoNCE is used as the optimization objective to pull the original representation and the perturbed representation of the same node closer together and to push the original representation away from the perturbed representations of other nodes, expressed as follows:

where N is the number of nodes in a training batch, sim is the cosine similarity, and τ is an adjustable parameter;
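The InfoNCE expression itself is missing from this text. The sketch below implements the commonly used form, in which each node's original view is contrasted against the perturbed views of all nodes in the batch, with cosine similarity scaled by τ. Including the positive pair in the denominator is an assumption; the patent's formula may sum only over the other nodes. The function names are hypothetical.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(h, h_pert, tau):
    """Mean InfoNCE loss over a batch; h[n] and h_pert[n] are two views of node n."""
    n_nodes = len(h)
    total = 0.0
    for n in range(n_nodes):
        logits = [cosine(h[n], h_pert[m]) / tau for m in range(n_nodes)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(z - top) for z in logits))
        total += log_denominator - logits[n]
    return total / n_nodes

h = [[1.0, 0.0], [0.0, 1.0]]
matched = info_nce(h, [[0.9, 0.1], [0.1, 0.9]], tau=0.5)  # views agree per node
swapped = info_nce(h, [[0.1, 0.9], [0.9, 0.1]], tau=0.5)  # views are mixed up
```

The loss is lower when each node's perturbed view stays closest to its own original view, which is what the training step rewards.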

S5. Repeat steps S2 to S5 until the value of the objective function converges or the preset number of training iterations is reached.

Further, the entity pair set S is divided into the training set and the validation set at a ratio of 4:1.
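A 4:1 split of the seed pairs can be sketched as below. The shuffling and the fixed random seed are assumptions added so that the example is reproducible; the text only fixes the ratio.

```python
import random

def split_seed_pairs(pairs, train_parts=4, valid_parts=1, seed=0):
    """Shuffle the seed pairs and split them train:validation = 4:1 by default."""
    ordered = sorted(pairs)  # fixed starting order so the shuffle is reproducible
    random.Random(seed).shuffle(ordered)
    cut = len(ordered) * train_parts // (train_parts + valid_parts)
    return ordered[:cut], ordered[cut:]

# Ten hypothetical pre-aligned pairs (a_i in graph 1, b_i in graph 2).
seed_pairs = {(f"a{i}", f"b{i}") for i in range(10)}
train_pairs, valid_pairs = split_seed_pairs(seed_pairs)
```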

Further, the unaligned nodes of whichever of the two knowledge graphs has fewer nodes are used as the test set.

Further, obtaining the embedding vector representation of each node through the graph self-attention convolutional neural network, calculating the similarity of the nodes between the knowledge graphs, and taking the two most similar nodes as aligned nodes, specifically comprises:

calculating the embedding vector representation of each node of the knowledge graphs through the graph self-attention convolutional neural network;

calculating the similarity between the node representations of the two knowledge graphs according to the Euclidean norm:

taking the two most similar nodes as the aligned nodes.
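The similarity formula is not reproduced in this text. A minimal sketch, assuming that the similarity of two nodes is measured by the Euclidean distance between their embeddings (smaller means more similar) and that every node of the first graph is matched to its nearest node in the second graph, is given below. The function name and the toy embeddings are hypothetical.

```python
import math

def align_nodes(emb_1, emb_2):
    """Map each node of graph 1 to its nearest node of graph 2 (Euclidean distance)."""
    result = {}
    for x, hx in emb_1.items():
        result[x] = min(emb_2, key=lambda y: math.dist(hx, emb_2[y]))
    return result

emb_1 = {"A1": [0.0, 0.0], "A2": [1.0, 1.0]}
emb_2 = {"X7": [0.9, 1.2], "X9": [0.1, -0.1], "X3": [5.0, 5.0]}
alignment = align_nodes(emb_1, emb_2)
```

This greedy matching allows two nodes of the first graph to select the same partner; whether the method enforces one-to-one matching is not stated in the text.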

Further, the method also comprises computing performance statistics for the labeled training set and validation set nodes, and manually checking the unlabeled test set nodes to determine whether the alignments are valid, specifically:

calculating Hits@1 and Hits@10 for the labeled training set and validation set nodes:

where S is the set of aligned node pairs, |S| is the number of aligned node pairs, rank_i is the link prediction rank of the i-th aligned node pair, and I is the indicator function, with I = 1 if its argument is true and I = 0 otherwise;

manually checking the unlabeled test set nodes to determine whether the alignments are valid.
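The Hits@k formula is missing from this text. Given the definitions of S, rank_i and I, it is presumably the standard Hits@k = (1/|S|) Σ I(rank_i ≤ k), the fraction of seed pairs whose true partner is ranked within the top k candidates. The sketch below computes it under that assumption, with hypothetical ranks.

```python
def hits_at_k(ranks, k):
    """Fraction of aligned pairs whose true partner is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical ranks of the true partner for five labeled pairs (1 = best).
ranks = [1, 3, 12, 1, 7]
hits_1 = hits_at_k(ranks, 1)
hits_10 = hits_at_k(ranks, 10)
```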

The present invention has the following beneficial effects:

Using a trained entity alignment model, the present invention associates equivalent entities in different business systems, predicts which data in multiple power business systems refer to the same real-world object, and safeguards the reliability of data from different data sources. By combining a graph self-attention convolutional neural network, supervised learning and perturbed-view contrastive learning, the present invention aligns the nodes of two knowledge graphs better than the prior art.

Brief Description of the Drawings

FIG. 1 is a flow chart of the method of the present invention.

FIG. 2 is a schematic diagram of the training process of the graph self-attention convolutional neural network in an embodiment of the present invention.

Detailed Description

The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.

Referring to FIGS. 1 and 2, a data alignment method for power business systems based on a graph convolutional neural network comprises the following.

In a specific embodiment, cleaning and preprocessing the equipment ledger information data specifically comprises removing damaged data from the equipment ledger information data and converting the remaining data into CSV format.

In one embodiment of the present invention, the knowledge graphs are constructed according to the entities and the relationships between the entities, and the pre-aligned entity pairs between the two knowledge graphs are obtained, specifically:

G_k = {E_k, R_k, T_k};

In a specific embodiment, the language model is the LaBSE model.

In one embodiment of the present invention, the knowledge graphs are input into the graph self-attention convolutional neural network for training, with the entity pairs serving as alignment seeds and as supervision information during the training of the graph self-attention convolutional neural network, specifically:

Steps S1 to S5 are carried out as described above.

A perturbed graph self-attention convolutional neural network is obtained by perturbing the parameters of the graph self-attention convolutional neural network. For each node, the original network outputs the original-view representation and the perturbed network outputs the perturbed-view representation. Pulling the original-view representation of a node closer to its own perturbed-view representation, and pushing it away from the perturbed-view representations of other nodes, yields a more robust graph self-attention convolutional neural network.

In a specific embodiment, the entity pair set S is divided into the training set and the validation set at a ratio of 4:1.

In one embodiment of the present invention, the unaligned nodes of whichever of the two knowledge graphs has fewer nodes are used as the test set.

The knowledge graph with more nodes has a large number of remaining unaligned nodes that have no true counterpart in the knowledge graph with fewer nodes, so the unaligned nodes of the knowledge graph with fewer nodes are used as the test set.

In one embodiment of the present invention, the embedding vector representation of each node is obtained through the graph self-attention convolutional neural network, the similarity of the nodes between the knowledge graphs is calculated, and the two most similar nodes are taken as aligned nodes, as described above.

In one embodiment of the present invention, the method also comprises computing performance statistics for the labeled training set and validation set nodes and manually checking the unlabeled test set nodes to determine whether the alignments are valid, as described above.

Referring to Table 1, the method of the present invention is compared with RDGCN, an entity alignment technique based on a relation-aware dual-graph convolutional network. The method of the present invention, which combines a graph self-attention convolutional neural network, supervised learning and perturbed-view contrastive learning, reaches 0.711 Hits@1 and 0.872 Hits@10 against 0.45 and 0.65 for RDGCN, and therefore aligns the nodes of the two knowledge graphs better than the existing technique.

Table 1

Technology                        Hits@1   Hits@10
RDGCN                             0.45     0.65
Method of the present invention   0.711    0.872

The above are merely embodiments of the present invention and do not limit its patent scope. Any equivalent structure made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present invention.

Claims

1. A data alignment method for power business systems based on a graph convolutional neural network, characterized by comprising the following steps:

acquiring equipment ledger information data of two power business systems to be aligned, and cleaning and preprocessing the equipment ledger information data;

obtaining the entities in each power business system and the relationships between the entities, constructing a knowledge graph for each system according to the entities and the relationships between the entities, and obtaining pre-aligned entity pairs between the two knowledge graphs, wherein the entities are the nodes of the knowledge graph and the relationships between the entities are the edges of the knowledge graph;

inputting the knowledge graphs into a graph self-attention convolutional neural network for training, with the entity pairs serving as alignment seeds and as supervision information during the training of the graph self-attention convolutional neural network;

obtaining an embedding vector representation of each node through the graph self-attention convolutional neural network, calculating the similarity of the nodes between the knowledge graphs, and taking the two most similar nodes as aligned nodes;

and rewriting the attribute data of the entities of the power business systems to be aligned according to the aligned nodes.

2. The method according to claim 1, wherein cleaning and preprocessing the equipment ledger information data specifically comprises removing damaged data from the equipment ledger information data and converting the remaining data into CSV format.

3. The method according to claim 1, wherein constructing the knowledge graphs according to the entities and the relationships between the entities, and obtaining the pre-aligned entity pairs between the two knowledge graphs, specifically comprises:

extracting the nodes of the knowledge graph, specifically, taking the attributes in the equipment ledger information data that are used to distinguish entities as the nodes of the knowledge graph, and taking the text information fields of the entities as the text features of the nodes;

extracting the semantic information of the text features through a language model to obtain the embedding vector representations of the nodes;

constructing the edges between the nodes according to the relationships between the entities;

obtaining the two knowledge graphs G_k to be aligned:

G_k = {E_k, R_k, T_k};

wherein k = 1 or 2, and E_k, R_k and T_k are respectively the set of nodes, the set of entity relationships and the set of triples <e₁, r, e₂> in the knowledge graph, with e₁, e₂ ∈ E, r ∈ R, and r being the relationship between entity e₁ and entity e₂;

and pre-aligning the entities according to those attributes of the entities that have unique values to obtain the entity pairs, the entity pair set S being:

wherein x and y are entity nodes of the knowledge graphs G₁ and G₂, respectively.

4. The method according to claim 3, wherein the language model is the LaBSE model.

5. The method according to claim 4, wherein inputting the knowledge graphs into the graph self-attention convolutional neural network for training, with the entity pairs serving as alignment seeds and as supervision information during the training of the graph self-attention convolutional neural network, specifically comprises:

S1, dividing the entity pair set S into a training set and a validation set at a preset ratio, the test set consisting of the unaligned nodes;

using a single-layer graph attention convolutional neural network to aggregate the neighbor information of each node in the knowledge graph, specifically:

S2, randomly sampling 20 neighbor nodes of the target node as the neighbor set N_i, the neighbor set including the target node itself, and using the embedding vector representations of the neighbor nodes in N_i to update the embedding vector representation of the target node, the aggregation formula being as follows:

wherein W is a trainable weight matrix that maps the embedding vector representations of the nodes to higher-level features, and σ is the nonlinear activation function Sigmoid, calculated as follows:

a_ij is the importance of node j to node i, calculated as follows:

wherein a is a linear layer that converts a vector into a scalar value, and LeakyReLU is a nonlinear activation function, expressed as follows:

wherein p is an adjustable coefficient;

S3, training the graph self-attention convolutional neural network based on the entities, updating the network parameters by gradient descent, and adopting Bayesian personalized ranking as the objective function of supervised learning, the expression being as follows:

wherein (x, y, y^-) is the training triplet constructed for node x of one knowledge graph, y is the node in the other knowledge graph that is pre-aligned with node x, and y^- is any randomly sampled node other than x and y;

S4, performing unsupervised training based on a graph-structure multi-view augmentation method, specifically:

perturbing the encoder network parameters θ to obtain the perturbed network parameters θ′, and obtaining the node representations h and h′ of the same node x of the knowledge graph under the network and the perturbed network, expressed as follows:

h = f(x; θ), h′ = f(x; θ′);

the encoder being perturbed as follows:

θ′_l = θ_l + η·Δθ_l;

wherein θ_l and θ′_l are the parameters of the l-th layer of the graph self-attention convolutional neural network and of the perturbed graph self-attention convolutional neural network, respectively, η is an adjustable hyperparameter that controls the perturbation strength, and Δθ_l is a perturbation term drawn from a Gaussian distribution with zero mean and a given variance;

using InfoNCE as the optimization objective to pull the original representation h_n and the perturbed representation h′_n of the same node n closer together and to push the original representation away from the perturbed representations of other nodes, expressed as follows:

wherein N is the number of nodes in a training batch, n′ is a node other than node n, sim is the cosine similarity, and τ is an adjustable parameter;

and S5, repeating steps S2 to S5 until the value of the objective function converges or the preset number of training iterations is reached.

6. The method according to claim 5, wherein the entity pair set S is divided into the training set and the validation set at a ratio of 4:1.

7. The method according to claim 5, wherein the unaligned nodes of whichever of the two knowledge graphs has fewer nodes are used as the test set.

8. The method according to claim 5, wherein obtaining the embedding vector representation of each node through the graph self-attention convolutional neural network, calculating the similarity of the nodes between the knowledge graphs, and taking the two most similar nodes as aligned nodes, specifically comprises:

calculating the embedding vector representation of each node of the knowledge graphs through the graph self-attention convolutional neural network;

calculating the similarity between the node representations of the two knowledge graphs according to the Euclidean norm:

and taking the two most similar nodes as the aligned nodes.

9. The method according to claim 5, further comprising computing performance statistics for the labeled training set and validation set nodes, and manually checking the unlabeled test set nodes to determine whether the alignments are valid, specifically:

calculating Hits@1 and Hits@10 for the labeled training set and validation set nodes:

wherein S is the set of aligned node pairs, |S| is the number of aligned node pairs, rank_i is the link prediction rank of the i-th aligned node pair, and I is the indicator function, with I = 1 if its argument is true and I = 0 otherwise;

and manually checking the unlabeled test set nodes to determine whether the alignments are valid.