CN113051353B

Movatterモバイル変換

Info

Publication number: CN113051353B
Application number: CN202110244072.0A
Authority: CN
Inventors: 陆佳炜; 朱昊天; 王小定; 郑嘉弘; 张元鸣; 徐俊; 肖刚
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2021-03-05
Filing date: 2021-03-05
Publication date: 2024-05-10
Anticipated expiration: 2041-03-05
Also published as: CN113051353A

Abstract

Translated fromChinese

一种基于注意力机制的知识图谱路径可达性分析方法，括以下步骤：第一步、从知识库中构建目标三元组，并获得该三元组中头实体h和尾实体t之间所有的路径关系；第二步、进行关系编码；第三步、进行实体类型编码；第四步、重复第二步和第三步计算所有路径模式组合成的全局路径模式计算头实体h、直接关系r和尾实体t组成的三元组的能量函数，计算直接关系r能否连接头实体和尾实体的概率，将能量函数和能否链接的概率相乘，以此判断这个三元组是否成立。本发明提高了对实体和关系的利用率，注意力机制提升了概率计算结果的精确性，提升了三元组表示成的向量的准确度，增加了预测实体间是否能相互连接的结果的准确度。A knowledge graph path accessibility analysis method based on attention mechanism includes the following steps: the first step is to construct a target triple from the knowledge base and obtain all path relations between the head entity h and the tail entity t in the triple; the second step is to encode the relationship; the third step is to encode the entity type; the fourth step is to repeat the second and third steps to calculate the global path pattern composed of all path patterns The energy function of the triple consisting of the head entity h, the direct relationship r and the tail entity t is calculated, the probability of whether the direct relationship r can connect the head entity and the tail entity is calculated, and the energy function and the probability of whether the connection can be made are multiplied to determine whether the triple is established. The present invention improves the utilization rate of entities and relationships, and the attention mechanism improves the accuracy of the probability calculation results, improves the accuracy of the vector represented by the triple, and increases the accuracy of the result of predicting whether the entities can be connected to each other.

Description

Translated fromChinese

一种基于注意力机制的知识图谱路径可达性预测方法A knowledge graph path reachability prediction method based on attention mechanism

技术领域Technical Field

本方法涉及一种基于注意力机制的知识图谱路径可达性分析方法。This method involves a knowledge graph path reachability analysis method based on attention mechanism.

背景技术Background technique

知识库将人类知识组织成结构化的知识系统，它描述现实世界中实体(entity)间的关系(relation)。人们花费大量精力构建了各种结构化的知识库，如语言知识库WordNet、世界知识库Freebase等。知识库是推动人工智能学科发展和支撑智能信息服务应用(如智能搜索、智能问答、个性化推荐等)的重要基础技术。为了改进信息服务质量，国内外互联网公司(特别是搜索引擎公司)纷纷推出知识库产品，如谷歌知识图谱、微软BingSatori、百度知心以及搜狗知立方等。著名的IBM Watson问答系统和苹果Siri语音助理的背后，知识库也扮演着重要角色。知识库的兴起拉开了智能信息检索从字符串匹配跃迁至智能理解的序幕。Knowledge base organizes human knowledge into a structured knowledge system, which describes the relationship between entities in the real world. People have spent a lot of effort to build various structured knowledge bases, such as the language knowledge base WordNet and the world knowledge base Freebase. Knowledge base is an important basic technology to promote the development of artificial intelligence and support intelligent information service applications (such as intelligent search, intelligent question and answer, personalized recommendation, etc.). In order to improve the quality of information services, domestic and foreign Internet companies (especially search engine companies) have launched knowledge base products, such as Google Knowledge Graph, Microsoft BingSatori, Baidu Zhixin, and Sogou Zhilifang. Knowledge base also plays an important role behind the famous IBM Watson question and answer system and Apple Siri voice assistant. The rise of knowledge base has opened the prelude to the transition of intelligent information retrieval from string matching to intelligent understanding.

知识图谱由Google公司于2012年6月正式提出，是一种基于图的数据结构。知识图谱是一种结构化的语义知识库，以图的形式来展现现实世界中各个实体及其相互之间的关系，并用形式化的方式来进行描述。知识图谱的基本组成单元的通用表示形式是实体、“实体-关系-实体”三元组，以及实体的“属性-值”对。知识图谱以“实体-关系-实体”或“实体-属性-属性值”的三元组表达形式存储，这些数据将构成可观的实体关系网络，即知识的“图谱”。The knowledge graph was formally proposed by Google in June 2012. It is a graph-based data structure. The knowledge graph is a structured semantic knowledge base that presents entities and their relationships in the real world in the form of a graph and describes them in a formalized way. The general representation of the basic components of the knowledge graph is the entity, the "entity-relationship-entity" triple, and the entity's "attribute-value" pair. The knowledge graph is stored in the form of a triple expression of "entity-relationship-entity" or "entity-attribute-attribute value". These data will constitute a considerable entity relationship network, that is, the "graph" of knowledge.

表示学习的目标是，通过机器学习将研究对象的语义信息表示为稠密低维实值向量。知识表示学习是面向知识库中实体和关系的表示学习，通过将实体或关系投影到低维向量空间，能够实现对实体和关系的语义信息的表示，可以高效地计算实体、关系及其之间的复杂语义关联。知识库的表示学习旨在将实体和关系嵌入到一个低维空间中。大多数现有的方法在表示学习中只考虑直接关系，而PtransE提出了一种基于路径的表示学习模型，它将关系路径作为表示学习实体之间的转换。但是它仅仅依赖于关系，并且直接使用特定实体信息，在推演多步关系时仍然存在一定局限性。The goal of representation learning is to represent the semantic information of the research object as a dense low-dimensional real-valued vector through machine learning. Knowledge representation learning is the representation learning for entities and relations in the knowledge base. By projecting entities or relations into a low-dimensional vector space, the semantic information of entities and relations can be represented, and the complex semantic associations between entities, relations and their relationships can be efficiently calculated. The representation learning of the knowledge base aims to embed entities and relations into a low-dimensional space. Most existing methods only consider direct relations in representation learning, while PtransE proposes a path-based representation learning model that uses relational paths as transformations between representation learning entities. However, it only relies on relations and directly uses specific entity information, and still has certain limitations when deducing multi-step relations.

长短期记忆网络(Long Short-Term Memory，LSTM)，最早由Hochreiter、Schmidhuber于1997年提出，该模型由于能更好地发现长期依赖关系而被广泛用于处理时间序列信息。LSTM可以看作为特殊的RNN，其主要为解决长序列训练过程中的梯度消失及梯度爆炸问题，能够在更长的时间序列上依然表现优异。Long Short-Term Memory (LSTM) was first proposed by Hochreiter and Schmidhuber in 1997. This model is widely used to process time series information because it can better discover long-term dependencies. LSTM can be seen as a special RNN. It is mainly used to solve the gradient vanishing and gradient exploding problems in the long sequence training process, and can still perform well on longer time series.

发明内容Summary of the invention

为了克服现有技术的不足，本发明提出了一种基于注意力机制的知识图谱路径可达性分析方法，对知识库中三元组的关系和实体使用LSTM分别进行关系编码和实体类型编码，获得相应的向量输出，利用这些向量来计算头实体和尾实体是否可以通过关系相互链接的概率，通过将三元组的能量函数与预测的可以相互链接的概率相乘的结果来判定这个三元组是否成立，从而预测知识图谱中实体间的连接关系；提高了对实体和关系的利用率，注意力机制提升了概率计算结果的精确性，提升了三元组表示成的向量的准确度，增加了预测实体间是否能相互连接的结果的准确度。In order to overcome the shortcomings of the prior art, the present invention proposes a knowledge graph path reachability analysis method based on an attention mechanism, which uses LSTM to perform relationship encoding and entity type encoding on the relationships and entities of triples in the knowledge base, respectively, to obtain corresponding vector outputs, and uses these vectors to calculate the probability of whether the head entity and the tail entity can be linked to each other through relationships, and determines whether the triple is established by multiplying the energy function of the triple with the predicted probability that they can be linked to each other, thereby predicting the connection relationship between entities in the knowledge graph; the utilization rate of entities and relationships is improved, the attention mechanism improves the accuracy of the probability calculation results, improves the accuracy of the vectors represented by the triples, and increases the accuracy of the results of predicting whether the entities can be connected to each other.

为了解决上述技术问题本发明提供如下的技术方案：In order to solve the above technical problems, the present invention provides the following technical solutions:

一种基于注意力机制的知识图谱路径可达性分析方法，所述方法包括以下步骤：A knowledge graph path accessibility analysis method based on an attention mechanism, the method comprising the following steps:

第一步、从知识库中构建目标三元组，并获得该三元组中头实体h和尾实体t之间所有的路径关系；The first step is to construct the target triple from the knowledge base and obtain all the path relationships between the head entity h and the tail entity t in the triple;

第二步、进行关系编码，结合Word2vec将目标三元组中的直接关系和所有路径关系表示成向量，并将路径关系表示成的向量输入LSTM进行顺序编码，过程如下：The second step is to perform relationship encoding. Combined with Word2vec, the direct relationship and all path relationships in the target triple are represented as vectors, and the vectors represented by the path relationship are input into LSTM for sequential encoding. The process is as follows:

2.1、将头实体h和尾实体t之间的关系路径上的关系使用Word2vec转化为向量；初始化一个HashMap，用于存放头实体h和尾实体t之间的关系表示成的向量集合；2.1. Use Word2vec to convert the relationship between the head entity h and the tail entity t into a vector. Initialize a HashMap to store the vector set representing the relationship between the head entity h and the tail entity t.

2.2、将HashMap中的向量顺序输入LSTM，用LSTM的最后一个状态作为关系路径的向量表示，用v_π(p)表示；2.2. Input the vectors in HashMap into LSTM sequentially, and use the last state of LSTM as the vector representation of the relationship path, denoted by v_π (p);

2.3、将头实体h和尾实体t之间的直接关系r使用Word2vec转化为向量，并将该向量记为再将/>输入LSTM，用LSTM的最后一个状态作为直接关系r的向量表示，用v_π(r)表示；2.3. Convert the direct relationship r between the head entity h and the tail entity t into a vector using Word2vec, and record the vector as Then/> Input LSTM and use the last state of LSTM as the vector representation of the direct relationship r, denoted by v_π (r);

第三步、进行实体类型编码，结合Word2vec将目标三元组中的所有实体类型表示成向量，并将实体类型表示成的向量输入LSTM进行顺序编码；过程如下：The third step is to encode the entity type. Combined with Word2vec, all entity types in the target triple are represented as vectors, and the vectors represented by the entity type are input into LSTM for sequential encoding. The process is as follows:

3.1、初始化一个HashMap，命名为entitymap，用于存放第二步中选择的关系路径上的头实体h和尾实体t之间的实体的类型层次集合；3.1. Initialize a HashMap named entitymap to store the type hierarchy set of entities between the head entity h and the tail entity t on the relationship path selected in the second step;

3.2、获取选定关系路径上的头实体h和尾实体t之间的所有实体(包括头实体h和尾实体t)的实体类型；3.2. Obtain the entity types of all entities between the head entity h and the tail entity t on the selected relationship path (including the head entity h and the tail entity t);

3.3、用Word2vec把实体的实体类型转化为向量，并计算实体的类型上下文向量，将实体的类型上下文向量按顺序输入LSTM，并用LSTM的最后一个隐藏状态作为这条路径的实体类型编码的向量表示，用v_ε(p)表示；3.3. Use Word2vec to convert the entity type of the entity into a vector, and calculate the entity type context vector. Input the entity type context vector into LSTM in sequence, and use the last hidden state of LSTM as the vector representation of the entity type encoding of this path, denoted by v_ε (p);

第四步、重复第二步和第三步计算所有路径模式组合成的全局路径模式计算头实体h、直接关系r和尾实体t组成的三元组的能量函数，计算直接关系r能否连接头实体和尾实体的概率，将能量函数和能否链接的概率相乘，以此判断这个三元组是否成立；过程如下：Step 4: Repeat steps 2 and 3 to calculate the global path pattern composed of all path patterns. Calculate the energy function of the triple consisting of the head entity h, the direct relationship r and the tail entity t, calculate the probability that the direct relationship r can connect the head entity and the tail entity, and multiply the energy function and the probability of whether the connection can be made to determine whether the triple is established; the process is as follows:

4.1、将每一条关系路径上的v_π(p)和v_ε(p)连接在一起，组成路径模式v_ρ(p)即v_ρ(p)＝[v_π(p)；v_ε(p)]，最终获得所有路径模式的集合S＝{v_ρ(p₁),v_ρ(p₂),……,v_ρ(p_N)}；4.1. Connect_vπ (p) and_vε (p) on each relationship path to form a path pattern_vρ (p), that is,_vρ (p)＝[_vπ (p);_vε (p)], and finally obtain the set of all path patterns S＝{_vρ (_p1 ),_vρ (_p2 ),……,_vρ (_pN )};

4.2、使用软注意力机制(Soft Attention)将所有路径模式组合成全局路径模式4.2. Use Soft Attention to combine all path patterns into a global path pattern

4.3、计算计算头实体h、直接关系r和尾实体t组成的三元组的能量函数，其中，h用头实体的类型上下文向量表示，t用尾实体t的类型上下文向量/>表示，r用2.3的结果v_π(r)表示，计算公式如下：4.3. Calculate the energy function of the triple consisting of the head entity h, the direct relationship r and the tail entity t, where h is the type context vector of the head entity Indicates that t uses the type context vector of the tail entity t/> , r is represented by the result of 2.3 v_π (r), and the calculation formula is as follows:

4.4、计算头实体和尾实体可以通过直接关系r连接的概率P(r|h,t)，计算公式如下，其中σ就是sigmoid函数，f_pred是一个前馈网络，就是4.2的全局路径模式：4.4. Calculate the probability P(r|h,t) that the head entity and the tail entity can be connected through a direct relationship r. The calculation formula is as follows, where σ is the sigmoid function,_fpred is a feedforward network, This is the global path mode of 4.2:

4.5、计算头实体、尾实体和所有关系路径组成的整个三元组的能量函数，计算公式如下，其中exp(x)＝e^x，E(h,r,t)是4.3的能量函数，P(r|h,t)是4.4的结果，由于能量函数越接近于0越好，而概率越接近于1越好，因此将能量函数的结果的负数输入exp函数来保持整个能量函数的单调性：4.5. Calculate the energy function of the entire triple consisting of the head entity, the tail entity, and all relationship paths. The calculation formula is as follows, where exp(x) = e^x , E(h, r, t) is the energy function of 4.3, and P(r|h, t) is the result of 4.4. Since the closer the energy function is to 0, the better, and the closer the probability is to 1, the better, the negative number of the result of the energy function is input into the exp function to maintain the monotonicity of the entire energy function:

G(h，r，t)＝exp(-E(h，r，t))*P(r|h，t)G(h,r,t)=exp(-E(h,r,t))*P(r|h,t)

4.6、判断G(h,r,t)的值是否接近于1，若值越接近于1，说明这个直接关系r可以连接头实体h和尾实体t，这个三元组就成立；否则，不成立。4.6. Determine whether the value of G(h,r,t) is close to 1. If the value is closer to 1, it means that the direct relationship r can connect the head entity h and the tail entity t, and the triple is established; otherwise, it is not established.

进一步，所述4.2的过程如下：Further, the process of 4.2 is as follows:

4.2.1、遍历集合S，计算第i条路径模式的关系模式与向量u的相似程度e_i，计算公式如下，其中f_att,path是一个前馈网络，用于计算第i条路径模式的注意力值，向量u是一个可训练的关系依赖向量，来自于数据训练结果，用于表示试图预测的关系r：4.2.1. Traverse the set S and calculate the similarity e_{i between} the relational pattern of the i-th path pattern and the vector u. The calculation formula is as follows, where f_att,path is a feedforward network used to calculate the attention value of the i-th path pattern, and vector u is a trainable relational dependency vector derived from the data training results, used to represent the relationship r that is being predicted:

e_i＝f_att，path(v_ρ(p_i)，u)e_i =f_att,path (v_ρ (p_i )，u)

4.2.2、计算第i条路径模式的注意力值，表示这条路径模式的重要程度，计算公式如下：4.2.2. Calculate the attention value of the i-th path pattern to indicate the importance of this path pattern. The calculation formula is as follows:

4.2.3、计算全局路径模式，计算公式如下：4.2.3. Calculate the global path mode. The calculation formula is as follows:

再进一步，所述第一步的过程如下：Furthermore, the process of the first step is as follows:

1.1、在知识库中获取实体集E和关系集R，从中构建一个三元组S＝{(h,r,t)|h,t∈E∧r∈R}，r是实体h和t之间的直接关系，h是头实体，t是尾实体；1.1. Obtain entity set E and relationship set R in the knowledge base, and construct a triple S = {(h, r, t)|h, t∈E∧r∈R} from them, where r is the direct relationship between entities h and t, h is the head entity, and t is the tail entity;

1.2、获取h和t之间所有的关系路径集合P＝{p₁,p₂,…p_N}，其中，p_i表示路径集合P中第i条路径，N表示关系路径的数量，h和t之间第i条路径表示为P_i＝＜h,r_i1,r_i2,…,r_iM,t＞，M表示这条关系路径上关系的数量。1.2. Obtain the set of all relationship paths between h and t P = {p₁ ,p₂ ,…p_N }, where p_i represents the i-th path in the path set P, N represents the number of relationship paths, and the i-th path between h and t is expressed as P_i = <h, r_i1 ,r_i2 ,…,r_iM ,t>, and M represents the number of relationships on this relationship path.

所述2.1的过程如下：The process of 2.1 is as follows:

2.1.1、初始化一个HashMap，用于存放数组；2.1.1. Initialize a HashMap to store the array;

2.1.2、初始化一个ArrayList，命名为relationList，用于存放路径上的关系；2.1.2. Initialize an ArrayList named relationList to store the relationships on the path;

2.1.3、取头实体h和尾实体t之间的第i条关系路径，0＜i≤N，将所有这条路径上的关系按照顺序存入relationList；2.1.3. Take the i-th relationship path between the head entity h and the tail entity t, 0＜i≤N, and store all the relationships on this path in relationList in order;

2.1.4、初始化一个数组ArrayVec，其长度为relationList的长度，用于存放关系转化成的向量；2.1.4. Initialize an array ArrayVec, whose length is the length of relationList, to store the vectors converted from the relations;

2.1.5、遍历relationList，得到第j个关系，0＜j≤relationList.length，使用Word2vec将关系转化为向量vec_j；2.1.5. Traverse relationList, get the jth relation, 0＜j≤relationList.length, and use Word2vec to convert the relation into a vector vec_j ;

2.1.6、将向量vec_j存入ArrayVec[j-1]；2.1.6. Store vector vec_j into ArrayVec[j-1];

2.1.7、判断relationList是否遍历完成，若否，返回2.1.5，否则进行2.1.8；2.1.7. Determine whether the traversal of relationList is complete. If not, return to 2.1.5. Otherwise, proceed to 2.1.8.

2.1.8、将数组ArrayVec存入HashMap，HashMap的key值为i，value为数组ArrayVec；2.1.8. Store the array ArrayVec into HashMap, where the key value of HashMap is i and the value is the array ArrayVec;

2.1.9、判断关系路径是否已经全部存入HashMap，若否，返回2.1.3，否则进行2.2。2.1.9. Determine whether all relationship paths have been stored in the HashMap. If not, return to 2.1.3, otherwise proceed to 2.2.

更进一步，所述2.2中，LSTM由细胞状态和“门”结构，细胞状态相当于信息传输的路径，让信息能在序列链中传递下去，LSTM有三种门结构：遗忘门、输入门和输出门，遗忘门的功能是决定应丢弃或保留哪些信息，来自前一个隐藏状态的信息和当前输入的信息同时传递到sigmoid函数中去，输出值介于0和1之间，越接近0意味着越应该丢弃，越接近1意味着越应该保留；输入门用于更新细胞状态，首先将前一层隐藏状态的信息和当前输入的信息传递到sigmoid函数中去，将值调整到0-1之间来决定要更新哪些信息，0表示不重要，1表示重要；输出门用来确定下一个隐藏状态的值，隐藏状态包含了先前输入的信息，首先将前一个隐藏状态和当前输入传递到sigmoid函数中，然后将新得到的细胞状态传递给tanh函数；Furthermore, in 2.2, LSTM consists of a cell state and a "gate" structure. The cell state is equivalent to the path of information transmission, allowing information to be passed down in the sequence chain. LSTM has three gate structures: forget gate, input gate and output gate. The function of the forget gate is to decide which information should be discarded or retained. The information from the previous hidden state and the current input information are simultaneously passed to the sigmoid function, and the output value is between 0 and 1. The closer to 0, the more it should be discarded, and the closer to 1, the more it should be retained; the input gate is used to update the cell state. First, the information of the previous hidden state and the current input information are passed to the sigmoid function, and the value is adjusted to between 0 and 1 to decide which information to update. 0 means unimportant and 1 means important; the output gate is used to determine the value of the next hidden state. The hidden state contains the information of the previous input. First, the previous hidden state and the current input are passed to the sigmoid function, and then the newly obtained cell state is passed to the tanh function.

其中sigmoid函数常被用做神经网络的激活函数，其公式如下：The sigmoid function is often used as the activation function of the neural network, and its formula is as follows:

其中tanh函数是一个双曲正切函数，值域为(-1,1)，其公式如下：The tanh function is a hyperbolic tangent function with a range of (-1,1) and its formula is as follows:

所述2.2的过程如下：The process of 2.2 is as follows:

2.2.1、遍历HashMap，取key为k的value值，0＜k≤N；2.2.1. Traverse the HashMap and take the value of key k, 0＜k≤N;

2.2.2、遍历value当中的数组，取第x个向量vec_x，其中x的取值范围为0＜x≤HashMap.get(k).length；2.2.2. Traverse the array in value and take the x-th vector vec_x , where the value range of x is 0＜x≤HashMap.get(k).length;

2.2.3、设置LSTM初始细胞状态c₀＝0，初始隐藏状态h₀＝0；2.2.3. Set the LSTM initial cell state c₀ = 0 and the initial hidden state h₀ = 0;

2.2.4、将先前细胞状态c_x-1、先前隐藏状态h_x-1和向量vec_x输入LSTM，输出当前细胞状态c_x和当前隐藏状态h_x，其中，先前细胞状态和先前隐藏状态即为上一层LSTM的当前细胞状态和当前隐藏状态；继续重复执行2.2.2直至数组内的元素遍历完毕；2.2.4. Input the previous cell state c_x-1 , the previous hidden state h_x-1 and the vector vec_x into LSTM, and output the current cell state c_x and the current hidden state h_x , where the previous cell state and the previous hidden state are the current cell state and the current hidden state of the previous layer of LSTM; continue to repeat 2.2.2 until all elements in the array are traversed;

2.2.5、判断HashMap是否遍历完成，若否，重复步骤2.2.1，否则进入步骤2.2.6；2.2.5. Determine whether the HashMap traversal is complete. If not, repeat step 2.2.1, otherwise proceed to step 2.2.6;

2.2.6、获得LSTM的当前隐藏状态，并将它作为这条关系路径的向量表示，即v_π(p)；2.2.6. Get the current hidden state of LSTM and use it as the vector representation of this relationship path, i.e., v_π (p);

2.2.7、从步骤2.1.3重新开始，直到计算出头实体h和尾实体t之间所有关系路径的向量表示；2.2.7. Start again from step 2.1.3 until the vector representation of all relationship paths between the head entity h and the tail entity t is calculated;

2.3、将头实体h和尾实体t之间的直接关系r使用Word2vec转化为向量，并将该向量记为再将/>输入LSTM，用LSTM的最后一个状态作为直接关系r的向量表示，用v_π(r)表示。2.3. Convert the direct relationship r between the head entity h and the tail entity t into a vector using Word2vec, and record the vector as Then/> Input LSTM and use the last state of LSTM as the vector representation of the direct relationship r, denoted by v_π (r).

所述2.2.4的过程如下：The process of 2.2.4 is as follows:

2.2.4.1、计算遗忘门：将先前隐藏状态h_x-1和向量vec_x拼接成一个向量vech，经过一层全连接层计算后，再将vech传入sigmoid函数，获得遗忘门的输出，f_x；2.2.4.1. Calculate the forget gate: concatenate the previous hidden state h_x-1 and the vector vec_x into a vector vech. After a fully connected layer is calculated, vech is passed into the sigmoid function to obtain the output of the forget gate, f_x ;

2.2.4.2、计算输入门：将vech分别传入sigmoid函数和tanh函数，vech传入sigmoid函数后再经过一层全连接层计算，获得输入门输出i_x，vech传入tanh函数后的输出值为候选向量ā_x，将sigmoid函数和tanh函数的输出值相乘；2.2.4.2. Calculate the input gate: pass vech into the sigmoid function and tanh function respectively. After passing vech into the sigmoid function, it passes through a fully connected layer to obtain the input gate output i_x . The output value of vech after passing it into the tanh function is the candidate vector ā_x . Multiply the output values of the sigmoid function and the tanh function.

2.2.4.3、计算细胞状态：将先前细胞状态与遗忘门输出值相乘，再加上sigmoid函数和tanh函数的输出值相乘的结果，当前细胞状态计算公式如下：2.2.4.3. Calculate the cell state: multiply the previous cell state by the output value of the forget gate, and then add the result of multiplying the output values of the sigmoid function and the tanh function. The current cell state calculation formula is as follows:

c_x＝f_x·c_x-1+i_x·ā_xc_x = f_x · c_x-1 + i_x · ā_x

2.2.4.4、计算输出门：将vech经过一层全连接层计算后输入sigmoid函数，得到o_x，再将当前细胞状态c_x输入tanh函数后的结果与o_x相乘，得到当前隐藏状态h_x。2.2.4.4. Calculate the output gate: After vech is calculated by a fully connected layer, it is input into the sigmoid function to obtain o_x . Then the current cell state c_x is input into the tanh function and the result is multiplied by o_x to obtain the current hidden state h_x .

所述2.3的计算过程如下：The calculation process of 2.3 is as follows:

2.3.1、设置LSTM初始细胞状态c₀＝0，初始隐藏状态h₀＝0；2.3.1、Set the LSTM initial cell state c₀ = 0 and the initial hidden state h₀ = 0;

2.3.2、计算遗忘门：将先前隐藏状态h₀和向量r拼接成一个向量rh，经过一层全连接层计算后，再将rh传入sigmoid函数，获得遗忘门的输出，f₁；2.3.2. Calculate the forget gate: concatenate the previous hidden state h₀ and the vector r into a vector rh. After a fully connected layer is calculated, rh is passed into the sigmoid function to obtain the output of the forget gate, f₁ ;

2.3.3、计算输入门：将rh分别传入sigmoid函数和tanh函数，rh传入sigmoid函数后再经过一层全连接层计算，获得输入门输出i₁，vech传入tanh函数后的输出值为候选向量ā₁，将sigmoid函数和tanh函数的输出值相乘；2.3.3. Calculate the input gate: pass rh into the sigmoid function and tanh function respectively. After rh is passed into the sigmoid function, it is calculated through a fully connected layer to obtain the input gate output i₁ . The output value of vech after passing into the tanh function is the candidate vector ā₁ . Multiply the output values of the sigmoid function and the tanh function.

2.3.4、计算细胞状态：将先前细胞状态与遗忘门输出值相乘，再加上sigmoid函数和tanh函数的输出值相乘的结果，当前细胞状态计算公式如下：2.3.4. Calculate the cell state: multiply the previous cell state by the output value of the forget gate, and then add the result of multiplying the output values of the sigmoid function and the tanh function. The current cell state calculation formula is as follows:

c₁＝f₁·c₀+i₁·ā₁c₁ =f₁ ·c₀ +i₁ ·ā₁

2.3.5、计算输出门：将rh经过一层全连接层计算后输入sigmoid函数，得到o₁，再将当前细胞状态c₁输入tanh函数后的结果与o₁相乘，得到当前隐藏状态，h₁；2.3.5. Calculate the output gate: After rh is calculated by a fully connected layer, it is input into the sigmoid function to obtain o₁ . Then, the result of inputting the current cell state c₁ into the tanh function is multiplied by o₁ to obtain the current hidden state, h₁ ;

2.3.6、获得当前隐藏状态，表示为v_π(r)。2.3.6. Get the current hidden state, denoted as v_π (r).

所述3.3的步骤如下：The steps of 3.3 are as follows:

3.3.1、初始化一个ArrayList，将选定关系路径上的所有实体，包括头实体h和尾实体t按顺序存入ArrayList；3.3.1. Initialize an ArrayList and store all entities on the selected relationship path, including the head entity h and the tail entity t, into the ArrayList in order;

3.3.2、遍历ArrayList，给每一个实体设置一个类型层次结构的集合L，实体e_t表示ArrayList中第t个实体的表示，0＜t≤N，N表示ArrayList的长度，其中，e₀＝h，e_{ArrayList.length-1}＝t，L_t表示第t个实体的类型层次结构集合，L_t＝{l_t,1,…,l_t,C},l_t,1表示最具体的类型，l_t,C表示实体e_t高度为C的类型层次结构中最抽象的类型；3.3.2. Traverse ArrayList and set a type hierarchy set L for each entity. Entity e_t represents the representation of the t-th entity in ArrayList, 0＜t≤N, N represents the length of ArrayList, where e₀ ＝h, e_{ArrayList.length-1} ＝t, L_t represents the type hierarchy set of the t-th entity, L_t ＝{l_t,1 ,…,l_t,C }, l_t,1 represents the most specific type, and l_t,C represents the most abstract type in the type hierarchy with height C of entity e_t ;

3.3.3、将每个实体的层次类型转化为向量，得到v_t＝{v_t(l_t,1),…,v_t(l_t,C)}；3.3.3. Convert the hierarchical type of each entity into a vector and obtain v_t ={v_t (l_t,1 ),…,v_t (l_t,C )};

3.3.4、判断是否遍历完成，若否，返回3.3.3，否则，进入3.3.5；3.3.4. Determine whether the traversal is complete. If not, return to 3.3.3. Otherwise, proceed to 3.3.5.

3.3.5、计算实体e_t选择的类型向量，并计算实体e_t的类型上下文向量。3.3.5. Calculate the type vector selected by entity e_t and calculate the type context vector of entity e_t .

所述3.3.5计算过程如下：The calculation process of 3.3.5 is as follows:

3.3.5.1、设置初始细胞状态c₀＝f_init,c(v_π(p))，初始隐藏状态h₀＝f_init,h(v_π(p))，其中，f_init,c，f_init,h是两个独立的前馈网络；3.3.5.1. Set the initial cell state c₀ = f_init,c (v_π (p)) and the initial hidden state h₀ = f_init,h (v_π (p)), where f_init,c and f_init,h are two independent feedforward networks;

3.3.5.2、计算上下文向量计算公式为：/>3.3.5.2. Calculate the context vector The calculation formula is:/>

3.3.5.3、遍历ArrayList，计算实体e_t选择第m层抽象结构后的向量表示，计算公式如下，其中f_att,type是一个前馈网络：3.3.5.3. Traverse the ArrayList and calculate the vector representation of the entity e_t after selecting the mth layer of abstract structure. The calculation formula is as follows, where f_att,type is a feedforward network:

3.3.5.4、计算实体e_t选择第m层结构类型的权重α_t,m，这个权重表示m是正确的抽象级别的概率，计算公式如下，其中exp(x)＝e^x：3.3.5.4. Calculate the weight α_t,m of the entity e_t selecting the mth level structure type. This weight represents the probability that m is the correct level of abstraction. The calculation formula is as follows, where exp(x) = e^x :

3.3.5.5、计算实体e_t的类型上下文向量表示其计算公式如下：3.3.5.5. Compute the type context vector representation of entity e_t The calculation formula is as follows:

3.3.5.6、将先前隐藏状态和先前细胞状态以及，类型向下文向量输入LSTM，计算当前隐藏状态和当前细胞状态，计算过程如下：3.3.5.6. Input the previous hidden state and the previous cell state and the type vector to the LSTM to calculate the current hidden state and the current cell state. The calculation process is as follows:

3.3.5.6.1、计算遗忘门：将先前隐藏状态h_t-1和向量拼接成一个向量/>经过一层全连接层计算后，再传入sigmoid函数，获得遗忘门的输出，f_x；3.3.5.6.1. Calculate the forget gate: The previous hidden state h_t-1 and the vector Concatenate into a vector /> After a fully connected layer is calculated, it is passed into the sigmoid function to obtain the output of the forget gate, f_x ;

3.3.5.6.2、计算输入门：将分别传入sigmoid函数和tanh函数，/>传入sigmoid函数后再经过一层全连接层计算，获得输入门输出i_t，/>传入tanh函数后的输出值为候选向量/>将sigmoid函数和tanh函数的输出值相乘；3.3.5.6.2. Calculate the input gate: Pass in the sigmoid function and tanh function respectively, /> After passing the sigmoid function, it passes through a fully connected layer to obtain the input gate output_it , /> The output value after passing into the tanh function is the candidate vector/> Multiply the output values of the sigmoid function and the tanh function;

3.3.5.6.3、计算细胞状态：将先前细胞状态与遗忘门输出值相乘，再加上sigmoid函数和tanh函数的输出值相乘的结果，当前细胞状态计算公式如下：3.3.5.6.3. Calculate the cell state: multiply the previous cell state by the output value of the forget gate, and then add the result of multiplying the output values of the sigmoid function and the tanh function. The current cell state is calculated as follows:

3.3.5.6.4、计算输出门：将经过一层全连接层计算后输入sigmoid函数，得到o_t，再将当前细胞状态c_t输入tanh函数后的结果与o_t相乘，得到当前隐藏状态h_t；3.3.5.6.4. Calculate the output gate: After a fully connected layer is calculated, the sigmoid function is input to obtain o_t , and then the result of the current cell state c_t inputting the tanh function is multiplied by o_t to obtain the current hidden state h_t ;

3.3.5.7、判断是否遍历完成，若否，返回3.3.5.3，否则，进入步骤3.3.5.83.3.5.7. Determine whether the traversal is complete. If not, return to step 3.3.5.3. Otherwise, proceed to step 3.3.5.8.

3.3.5.8、获得LSTM的当前隐藏状态，并用v_ε(p)表示；3.3.5.8. Get the current hidden state of LSTM and represent it as v_ε (p);

3.3.5.9、获取头实体和尾实体的类型上下文向量，分别记为和/>3.3.5.9. Obtain the type context vectors of the head entity and the tail entity, denoted as and/>

本发明的有益效果主要表现在：在对知识图谱进行链接预测时，可以使用本方法对三元组进行关系编码和实体类型编码，同时引入注意力机制，将注意力机制与编码结果相结合来计算三元组成立的概率，本方法很好地利用了三元组中实体与关系的属性以及他们之间的联系，提高了对实体和关系的利用率，而注意力机制提升了概率计算结果的精确性，本方法提升了三元组表示成的向量的准确度，增加了预测实体间是否能相互连接的结果的准确度。The beneficial effects of the present invention are mainly manifested in: when performing link prediction on a knowledge graph, the present method can be used to perform relationship encoding and entity type encoding on triples, and at the same time, an attention mechanism is introduced, and the attention mechanism is combined with the encoding result to calculate the probability that the triple is established. The present method makes good use of the attributes of the entities and relationships in the triples and the connections between them, thereby improving the utilization rate of the entities and relationships, while the attention mechanism improves the accuracy of the probability calculation results. The present method improves the accuracy of the vectors represented by the triples, and increases the accuracy of the results of predicting whether the entities can be connected to each other.

具体实施方式Detailed ways

下面对本发明做进一步说明。The present invention is further described below.

一种基于注意力机制的知识图谱路径可达性分析方法，包括以下步骤：A knowledge graph path accessibility analysis method based on attention mechanism includes the following steps:

第一步、从知识库中构建目标三元组，并获得该三元组中头实体和尾实体之间所有的路径关系；过程如下：The first step is to construct the target triple from the knowledge base and obtain all the path relationships between the head entity and the tail entity in the triple; the process is as follows:

1.1、在知识库中获取实体集E和关系集R，从中构建一个三元组S＝{(h,r,t)|h,t∈E∧r∈R}，r是实体h和t之间的直接关系，h是头实体，t是尾实体。1.1. Get the entity set E and the relationship set R in the knowledge base, and construct a triple S = {(h, r, t)|h, t∈E∧r∈R} from them, where r is the direct relationship between entities h and t, h is the head entity, and t is the tail entity.

1.2、获取h和t之间所有的关系路径集合P＝{p₁,p₂,…p_N}，其中，p_i表示路径集合P中第i条路径，N表示关系路径的数量，h和t之间第i条路径表示为P_i＝＜h,r_i1,r_i2,…,r_iM,t＞，M表示这条关系路径上关系的数量；1.2. Obtain a set of all relationship paths between h and t P = {p₁ ,p₂ ,…p_N }, where p_i represents the ith path in the path set P, N represents the number of relationship paths, and the ith path between h and t is represented by P_i = <h, r_i1 ,r_i2 ,…,r_iM ,t>, and M represents the number of relationships on this relationship path;

第二步、进行关系编码，结合Word2vec将目标三元组中的直接关系和所有路径关系表示成向量，并将路径关系表示成的向量输入LSTM进行顺序编码；The second step is to perform relationship encoding. Combined with Word2vec, the direct relationship and all path relationships in the target triple are represented as vectors, and the vectors represented by the path relationship are input into LSTM for sequential encoding.

所述第二步中，Word2vec模型由Mikolov等人于2013年提出，该模型将文本中的内容词汇通过转换处理，化简为空间向量，词向量的数值受上下文的影响，蕴含了词与词之间相互的关联性，过程如下：In the second step, the Word2vec model was proposed by Mikolov et al. in 2013. The model converts the content vocabulary in the text into a space vector. The value of the word vector is affected by the context and contains the correlation between words. The process is as follows:

2.1.9、判断关系路径是否已经全部存入HashMap，若否，返回2.1.3，否则进行2.2；2.1.9. Determine whether all relationship paths have been stored in HashMap. If not, return to 2.1.3, otherwise proceed to 2.2;

所述2.2中，LSTM主要由细胞状态和“门”结构，细胞状态相当于信息传输的路径，让信息能在序列链中传递下去。LSTM有三种门结构：遗忘门、输入门和输出门，遗忘门的功能是决定应丢弃或保留哪些信息，来自前一个隐藏状态的信息和当前输入的信息同时传递到sigmoid函数中去，输出值介于0和1之间，越接近0意味着越应该丢弃，越接近1意味着越应该保留；输入门用于更新细胞状态，首先将前一层隐藏状态的信息和当前输入的信息传递到sigmoid函数中去，将值调整到0-1之间来决定要更新哪些信息，0表示不重要，1表示重要；输出门用来确定下一个隐藏状态的值，隐藏状态包含了先前输入的信息，首先将前一个隐藏状态和当前输入传递到sigmoid函数中，然后将新得到的细胞状态传递给tanh函数；In 2.2, LSTM is mainly composed of cell state and "gate" structure. Cell state is equivalent to the path of information transmission, allowing information to be passed on in the sequence chain. LSTM has three gate structures: forget gate, input gate and output gate. The function of the forget gate is to decide which information should be discarded or retained. The information from the previous hidden state and the current input information are passed to the sigmoid function at the same time. The output value is between 0 and 1. The closer to 0 means the more it should be discarded, and the closer to 1 means the more it should be retained; the input gate is used to update the cell state. First, the information of the previous hidden state and the current input information are passed to the sigmoid function, and the value is adjusted to between 0-1 to decide which information to update. 0 means unimportant and 1 means important; the output gate is used to determine the value of the next hidden state. The hidden state contains the information of the previous input. First, the previous hidden state and the current input are passed to the sigmoid function, and then the newly obtained cell state is passed to the tanh function;

再进一步，所述2.2的过程如下：Furthermore, the process of 2.2 is as follows:

2.2.4、将先前细胞状态c_x-1、先前隐藏状态h_x-1和向量vec_x输入LSTM，输出当前细胞状态c_x和当前隐藏状态h_x，其中，先前细胞状态和先前隐藏状态即为上一层LSTM的当前细胞状态和当前隐藏状态；继续重复执行2.2.2直至数组内的元素遍历完毕，总计4步如下：2.2.4. Input the previous cell state c_x-1 , the previous hidden state h_x-1 and the vector vec_x into LSTM, and output the current cell state c_x and the current hidden state h_x . The previous cell state and the previous hidden state are the current cell state and the current hidden state of the previous layer of LSTM. Continue to repeat 2.2.2 until the elements in the array are traversed. There are 4 steps in total as follows:

c_x＝f_x·c_x-1+i_x·ā_xc_x = f_x · c_x-1 + i_x · ā_x

2.2.4.4、计算输出门：将vech经过一层全连接层计算后输入sigmoid函数，得到o_x，再将当前细胞状态c_x输入tanh函数后的结果与o_x相乘，得到当前隐藏状态h_x；2.2.4.4. Calculate the output gate: After vech is calculated by a fully connected layer, it is input into the sigmoid function to obtain o_x . Then, the result of inputting the current cell state c_x into the tanh function is multiplied by o_x to obtain the current hidden state h_x .

所述2.3的计算过程如下：The calculation process of 2.3 is as follows:

c₁＝f₁·c₀+i₁·ā₁c₁ =f₁ ·c₀ +i₁ ·ā₁

2.3.5、计算输出门：将rh经过一层全连接层计算后输入sigmoid函数，得到o₁，再将当前细胞状态c₁输入tanh函数后的结果与o₁相乘，得到当前隐藏状态，h₁。2.3.5. Calculate the output gate: After rh is calculated by a fully connected layer, it is input into the sigmoid function to obtain o₁ . Then, the current cell state c₁ is input into the tanh function and multiplied by the result with o₁ to obtain the current hidden state, h₁ .

所述3.3的步骤如下：The steps of 3.3 are as follows:

3.3.2、遍历ArrayList，给每一个实体设置一个类型层次结构的集合L，实体e_t表示ArrayList中第t个实体的表示，0＜t≤N，N表示ArrayList的长度。其中，e₀＝h，e_{ArrayList.length-1}＝t，L_t表示第t个实体的类型层次结构集合，L_t＝{l_t,1,…,l_t,C},l_t,1表示最具体的类型，l_t,C表示实体e_t高度为C的类型层次结构中最抽象的类型；3.3.2. Traverse ArrayList and set a type hierarchy set L for each entity. Entity e_t represents the representation of the t-th entity in ArrayList, 0＜t≤N, N represents the length of ArrayList. Among them, e₀ ＝h, e_{ArrayList.length-1} ＝t, L_t represents the type hierarchy set of the t-th entity, L_t ＝{l_t,1 ,…,l_t,C }, l_t,1 represents the most specific type, and l_t,C represents the most abstract type in the type hierarchy of entity e_t with a height of C;

3.3.5、计算实体e_t选择的类型向量，并计算实体e_t的类型上下文向量，计算过程如下：3.3.5. Calculate the type vector selected by entity e_t and the type context vector of entity e_t . The calculation process is as follows:

3.3.5.6.2、计算输入门：将分别传入sigmoid函数和tanh函数，/>传入sigmoid函数后再经过一层全连接层计算，获得输入门输出i_t，/>传入tanh函数后的输出值为候选向量/>将sigmoid函数和tanh函数的输出值相乘。3.3.5.6.2. Calculate the input gate: Pass in the sigmoid function and tanh function respectively, /> After passing the sigmoid function, it passes through a fully connected layer to obtain the input gate output_it , /> The output value after passing into the tanh function is the candidate vector/> Multiply the output of the sigmoid function and the output of the tanh function.

3.3.5.8、获得LSTM的当前隐藏状态，并用v_ε(p)表示。3.3.5.8. Get the current hidden state of the LSTM and denote it by v_ε (p).

4.2、使用软注意力机制(Soft Attention)将所有路径模式组合成全局路径模式计算过程如下：4.2. Use Soft Attention to combine all path patterns into a global path pattern The calculation process is as follows:

所述4.2中，软注意力机制是一种特殊结构，用来自动学习和计算输入数据对输出数据的贡献大小。是在选择信息的时候，不是从N个信息中只选择1个，而是计算N个输入信息的加权平均，再输入到神经网络中计算；过程如下：In 4.2, the soft attention mechanism is a special structure used to automatically learn and calculate the contribution of input data to output data. When selecting information, it does not select only one from N information, but calculates the weighted average of N input information and then inputs it into the neural network for calculation; the process is as follows:

4.2.1、遍历集合S，计算第i条路径模式的关系模式与向量u的相似程度e_i，计算公式如下，其中f_att,path是一个前馈网络，用于计算第i条路径模式的注意力值，向量u是一个可训练的关系依赖向量，来自于数据训练结果，用于表示试图预测的关系r：4.2.1. Traverse the set S and calculate the similarity e_{i between} the relational pattern of the i-th path pattern and the vector u. The calculation formula is as follows, where_fatt,path is a feedforward network used to calculate the attention value of the i-th path pattern, and the vector u is a trainable relational dependency vector derived from the data training results, used to represent the relation r that is being predicted:

e_i＝f_att，path(v_ρ(p_i)，u)e_i =f_att,path (v_ρ (p_i )，u)

G(h，r，t)＝exp(-E(h，r，t))*P(r|h，t)G(h,r,t)=exp(-E(h,r,t))*P(r|h,t)

本说明书的实施例所述的内容仅仅是对发明构思的实现形式的列举，仅作说明用途。本发明的保护范围不应当被视为仅限于本实施例所陈述的具体形式，本发明的保护范围也及于本领域的普通技术人员根据本发明构思所能想到的等同技术手段。The contents described in the embodiments of this specification are merely enumerations of implementation forms of the inventive concept and are for illustrative purposes only. The protection scope of the present invention should not be considered to be limited to the specific forms described in this embodiment, and the protection scope of the present invention also extends to equivalent technical means that can be thought of by ordinary technicians in this field based on the inventive concept.