Detailed Description
To make the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present specification will be clearly and completely described below with reference to specific embodiments of the present specification and the corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments obtained by one of ordinary skill in the art from the present disclosure without creative effort are intended to fall within the scope of the present disclosure.
To facilitate an understanding of the embodiments of the present specification, several terms are explained below.
Heterogeneous graph (Heterogeneous Graph): a structure describing a set of objects, some of which are "related" in some sense. The objects correspond to mathematical abstractions called nodes, and each related pair of nodes is connected by an edge. More than one node type and more than one edge type may be present in a heterogeneous graph.
Small and micro enterprise graph (SME graph): a data structure based on a graph, wherein the nodes represent small and micro enterprises, and the edges represent relations between the small and micro enterprises, such as a control relation, a shareholding relation, an upstream-downstream relation, a transfer relation, and the like.
Funds shortage (Financially constrained): an enterprise state in which the enterprise has less funds than are required to maintain its normal production. Banks need to locate the small and micro enterprises that are short of funds from among a vast number of small and micro enterprises in order to better aid the development of small and micro enterprises.
Embedding (Embedding): a layer commonly used in deep learning network models, mainly for producing dense vector representations of sparse features. It not only solves the length problem of one-hot vectors, but can also characterize the similarity between features.
Attention mechanism (Attention mechanism): stems from the study of human vision. In cognitive science, due to bottlenecks in information processing, humans selectively focus on a portion of all available information while ignoring the rest. This is often referred to as an attention mechanism. The core of the attention mechanism in deep learning is to focus, out of all the information available to the network, on the information that is important to the task.
Graph convolution network (Graph convolution network): a type of neural network capable of processing graph-structured data; it updates the embedding of a node by aggregating or passing information from the neighborhood of the target node, so that downstream tasks can then be carried out.
Graph attention network (Graph attention network): an improvement of the graph convolution network; in the process of aggregating neighbor information, the graph attention network realizes weighted aggregation by learning weights for the neighbors. It is therefore relatively robust to noisy neighbors, and the attention mechanism also gives the model some interpretability.
Hyperplane (Hyperplane): a subspace of dimension n-1 in an n-dimensional linear space. It divides the linear space into two disjoint parts, so that two vectors that were difficult to distinguish before the division can be distinguished after it.
Relational translation (Relational translation): a method of determining whether a relationship between two nodes holds on a heterogeneous graph. The method treats each relationship in the heterogeneous graph as a translation vector between entities: for each fact relationship triplet, the entities and the relationship are first represented in the same vector space, and the relationship vector is then treated as a translation between the head entity vector and the tail entity vector.
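As an illustration, the translation idea can be written compactly as follows. This is a minimal sketch of the standard distance-based formulation used by translation-based embedding methods; the scoring actually used by the embodiments is given later as formula (9):

```latex
% For a fact relationship triplet (h, r, t), the relation vector r is
% treated as a translation from the head entity vector h to the tail
% entity vector t:
\[
  h + r \approx t
\]
% so that the plausibility of the triplet can be scored by the distance
\[
  f(h, r, t) = \lVert h + r - t \rVert_2 ,
\]
% where a smaller score indicates a more plausible relationship.
```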
It should be appreciated that in the embodiments of the present specification, a variety of heterogeneous relationships exist between enterprises, such as a partnership relationship, a parent company relationship, a supplier relationship, a shareholder relationship, a shareholding relationship, a common-owner relationship, and so on. For ease of understanding, the following description uses the parent company relationship and the supplier relationship as examples.
FIG. 1 is a schematic illustration of a scenario of enterprise relationships according to an embodiment of the present disclosure. As shown in FIG. 1, for the current enterprise A there are two heterogeneous enterprise relationships: a parent company relationship and a supplier relationship. Enterprise B is the parent company of enterprise A and provides funds for enterprise A to develop; enterprise A is the supplier of enterprise C, that is, enterprise A is the upstream enterprise of enterprise C, and the downstream enterprise C needs to provide funds to the upstream enterprise A as payment. As shown in FIG. 1, a parent company is in a one-to-many relationship with its subsidiaries, and one parent company may have multiple subsidiaries that need to be funded; upstream and downstream companies are in a many-to-many relationship: one enterprise may have multiple suppliers and may simultaneously serve as a supplier for multiple enterprises, i.e., one enterprise may have multiple downstream enterprises and multiple upstream enterprises at the same time.
In such complex heterogeneous enterprise relationships, the fund flow relationships corresponding to different enterprise relationships also differ. In addition, under different enterprise relationships, the funding requirements of neighboring nodes exert different degrees of influence on the current node.
In order to take into account the influence of these different enterprise relationships on enterprise financing requirements, the embodiments of the present specification provide a method for training an identification model for enterprises needing financing. FIG. 2 is a flow chart of a method for identifying an enterprise needing financing according to an embodiment of the present disclosure. As shown in FIG. 2, the method may include:
Step S202, generating map information of a target enterprise group.
Any one node in the map information corresponds to one enterprise in the target enterprise group, and any one edge in the map information corresponds to the relationship between the two enterprises corresponding to the two nodes connected by that edge.
It should be understood that the process of generating enterprise graph information belongs to the prior art, and for a specific implementation reference may be made to the relevant prior art, which the embodiments of the present disclosure do not repeat here. In the present specification, a Small and Medium-sized Enterprise (SME) group is taken as an example, and the map information of the target enterprise group is an SME graph, which may be expressed as G = (V, ε, Δ, H, R). Wherein,
V represents the set of nodes, i.e., the SMEs.
ε represents the set of heterogeneous relationship types between SMEs (i.e., the set of edge types in the SME graph).
Δ represents the set of relationship triples. For each relationship triplet (h, r, t) ∈ Δ, h represents the head node of the edge, r represents the relationship type of the edge, and t represents the tail node of the edge.
H represents the original node feature matrix of all nodes in the SME graph.
R represents the relation feature matrix of the relationship types between the nodes in the SME graph.
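For illustration only, the SME graph G = (V, ε, Δ, H, R) described above could be organized in memory as sketched below; all names, shapes, and the toy data are assumptions for exposition, not part of the embodiments:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SMEGraph:
    """Illustrative container for G = (V, epsilon, Delta, H, R)."""
    num_nodes: int             # |V|: one node per small/micro enterprise
    relation_types: list       # epsilon: heterogeneous relation types
    triples: list              # Delta: (head node h, relation id r, tail node t)
    node_features: np.ndarray  # H: (|V|, d_node) original node feature matrix
    relation_features: np.ndarray  # R: (|epsilon|, d_rel) relation feature matrix

# A toy instance mirroring FIG. 1: B is the parent company of A, A supplies C.
g = SMEGraph(
    num_nodes=3,                          # node ids: 0 = A, 1 = B, 2 = C
    relation_types=["parent_company", "supplier"],
    triples=[(1, 0, 0), (0, 1, 2)],       # (B, parent_company, A), (A, supplier, C)
    node_features=np.random.randn(3, 16),
    relation_features=np.random.randn(2, 16),
)
```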
It should be appreciated that in the embodiments of the present description, a variety of relationships may exist between enterprises, such as a partnership relationship, a parent company relationship, a supplier relationship, a shareholder relationship, a shareholding relationship, a common-owner relationship, and so forth. Of course, it should be understood that the embodiments of the present description are primarily concerned with the different fund flow relationships induced by these various business relationships. For example, if company A is the parent company of company B, then when company B has a funds requirement there is a fund flow relationship in which company A transfers funds to company B; for another example, if company B is a supplier of company C, then when company B provides goods to company C there is a fund flow relationship in which company C needs to pay company B for the goods.
Of course, it should be understood that, for the embodiments of the present specification, step S202 is optional, and the map information may also be obtained directly, for example read from a pre-stored database or obtained from a third party, which the embodiments of the present specification do not limit.
Step S204, the original feature vectors of the nodes representing the enterprises and the original feature vectors of the edges representing the enterprise relations are input into a target graph convolution network, and neighborhood structure learning is performed based on the entity relation combination operator to obtain final embedded (Embedding) vectors of the nodes and the edges.
The entity relation combination operator is used for converging the embedded vectors of the neighbor nodes of the target node and the corresponding connecting edges. Specifically, the entity relation combining operator is used for converging the embedded vector of the neighbor node of the target node and the embedded vector of the connecting edge of the target node and the neighbor node so as to obtain the embedded vector of the neighbor node after the embedded vector of the neighbor node is transferred to the target node.
It should be understood that in the embodiment of the present specification, the final embedded vector of the node and the edge refers to the embedded vector of the node and the edge output by the last convolution layer of the target graph convolution network.
It should be understood that in the embodiment of the present disclosure, neighborhood structure learning is performed on the map information of the target enterprise group through the target graph convolution network. When neighborhood structure learning is performed based on the entity-relation combination operator to obtain the final embedded vectors of the nodes and edges, the embedded vectors of the neighbor nodes and connecting edges of each target node are aggregated layer by layer based on the entity-relation combination operator to obtain the embedded vector of each neighbor node after it has been transferred to the target node, and the resulting embedded vectors of the nodes and edges are passed layer by layer to the next convolution layer; that is, this layer-by-layer aggregation yields the embedded vectors output by each convolution layer of the target graph convolution network for each node and edge. In the embodiment of the present specification, the number of convolution layers L of the target graph convolution network is a hyper-parameter and is configurable. Different values of L can be selected as required so as to obtain an optimal identification model for enterprises needing financing; for example, L may take values such as 2, 3, 5, 8, 10, or 20, each yielding a corresponding model. However, the training cost and the prediction accuracy of the trained model differ for different values of L, so a balance between training cost and prediction accuracy needs to be considered when configuring L.
It should be appreciated that in the graph convolution network, each convolution layer has a corresponding embedded-vector input and embedded-vector output. In the embodiment of the present specification, the inputs to the l-th convolution layer of the target graph convolution network are the embedding vectors h of the node set and the embedding vectors r of the edge (relationship) set. These can be expressed by the following vectors:
The node embedding input to the l-th convolution layer is H^l = {h_1^l, h_2^l, …, h_{|V|}^l}, where |V| represents the number of nodes in the node set V, and h_i^l represents the input embedding vector of node i at the l-th convolution layer of the graph convolution network. The node embedding input to the (l+1)-th convolution layer is the node embedding output by the l-th convolution layer.
Similarly, the edge (relation) embedding input to the l-th convolution layer is R^l = {r_1^l, r_2^l, …, r_{|ε|}^l}, where |ε| represents the number of relation types in the relation set ε, and r_i^l represents the input embedding vector of edge (relation) i at the l-th convolution layer. The edge embedding input to the (l+1)-th convolution layer is the edge embedding output by the l-th convolution layer.
In particular, in the layer 1 convolutional layer, its inputs are the original feature vectors of the nodes and the original feature vectors of the edges.
In the embodiment of the present disclosure, in order to distinguish the importance of neighbor nodes with different relationship types, an entity-relation combination operator φ may be designed to aggregate the embedded vectors from neighbor nodes and edges by performing a composition operation on them. For each node u, the embedded-vector inputs from its one-hop neighbors are first aggregated to compute the embedded-vector output of the node. According to the entity-relation combination operator, for each neighbor node v of node u, the embedded vector of v and the embedded vector of the edge connecting u and v determine a transferred embedded vector, i.e., the embedding of v after it has been transferred to node u. The transferred embedded vectors of all neighbors of node u are then linearly transformed by a transformation matrix W^l, summed, and passed through a preset function f() for nonlinear transformation, yielding the embedded-vector output corresponding to node u. It should be appreciated that the preset function f() may be any of a variety of activation functions, which the present embodiments do not limit.
Optionally, the neighborhood structure learning process may include:
Determining an embedded vector of the target neighbor node in the target convolution layer after the embedded vector of the target neighbor node is transferred to the target node according to the entity relation combination operator, the embedded vector input of the target neighbor node of the target node in the target convolution layer and the embedded vector input of the connecting edge of the target node and the target neighbor node in the target convolution layer;
Determining the output of the embedded vector corresponding to the target node in the target convolution layer according to the first linear transformation matrix corresponding to the target convolution layer and the embedded vector of each neighbor node of the target node in the target convolution layer after the embedded vector of each neighbor node is transferred to the target node; the first linear transformation matrix corresponding to the target convolution layer is used for carrying out linear transformation processing on the node embedded vector input by the target convolution layer;
and determining the embedded vector of each node and each edge output by the last convolution layer of the target graph convolution network as the final embedded vector of each node and each edge.
Optionally, determining, according to the first linear transformation matrix corresponding to the target convolutional layer and the embedded vectors of the neighboring nodes of the target node in the target convolutional layer after the embedded vectors of the neighboring nodes are transferred to the target node, the embedded vector output corresponding to the target node in the target convolutional layer may include:
According to a first linear transformation matrix corresponding to the target convolution layer, performing linear transformation on the embedded vectors of all neighbor nodes of the target convolution layer after the embedded vectors of all neighbor nodes of the target node are transferred to the target node, and obtaining first transformation parameters corresponding to the target neighbor nodes;
And accumulating the first transformation parameters corresponding to each neighbor node of the target node, and then carrying out preset nonlinear transformation processing to obtain the embedded vector output corresponding to the target node in the target convolution layer.
Optionally, the embedded-vector output of node u in the l-th layer (i.e., the embedded-vector input of node u in the (l+1)-th layer) is represented as shown in formula (1):

h_u^{l+1} = f( Σ_{v∈N(u)} W^l φ(h_v^l, r_{T(u,v)}^l) )    (1)

where N(u) represents the one-hop neighbor set of node u; T(u, v) represents the relationship type between node u and its neighbor node v; r_{T(u,v)}^l represents the embedded-vector input, at the l-th convolution layer, of the relation between node u and its neighbor node v; h_v^l represents the embedded-vector input of neighbor node v at the l-th convolution layer; φ(h_v^l, r_{T(u,v)}^l) is the entity-relation composition operator, which combines the embedding vector of neighbor node v with the embedding vector of the relation T(u, v) to obtain the embedded vector of neighbor node v after it has been transferred to node u; W^l represents the node embedding transformation matrix of the l-th convolution layer in the graph convolution network, which performs linear transformation processing on the node embedded vectors input to the l-th convolution layer; and the function f() is an activation function used to perform the preset nonlinear transformation on its argument.
In addition, since the update of the embedded vector of node u in formula (1) transforms the original vector space, the relation vectors input to the (l+1)-th layer must also undergo a corresponding transformation. The process of producing the embedded-vector output of each edge at each convolution layer may include:
And performing linear transformation processing on the embedded vector input of the target edge corresponding to the target convolution layer according to a second linear transformation matrix corresponding to the target convolution layer, and performing the preset nonlinear transformation on the parameters subjected to the linear transformation processing to obtain the embedded vector output of the target edge corresponding to the target convolution layer, wherein the second linear transformation matrix corresponding to the target convolution layer is used for performing linear transformation processing on the embedded vector of the edge input by the target convolution layer.
Optionally, the output of the i-th relation embedding in the l-th convolution layer of the target graph convolution network, r_i^{l+1}, is obtained from the corresponding input r_i^l as shown in the following formula (2):

r_i^{l+1} = f( W_rel^l r_i^l )    (2)

where W_rel^l represents the relation transformation matrix of the l-th convolution layer in the target graph convolution network. That is, the relation embedding input r_i^l of the l-th convolution layer is combined with the relation transformation matrix W_rel^l and the activation function f() to obtain the relation embedding output r_i^{l+1} of the l-th convolution layer. It should be understood that the activation function f() of formula (1) for computing node embeddings and the activation function f() of formula (2) for computing edge embeddings may be the same or different, but the same activation function is generally chosen.
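A minimal NumPy sketch of one convolution layer implementing formulas (1) and (2) is given below. The additive form of the composition operator φ, the dictionary structures for the neighborhood, and all shapes are illustrative assumptions (formula (10) later instantiates φ with a hyperplane-aware form):

```python
import numpy as np

def phi(h_v, r_uv):
    """Entity-relation composition operator; an additive form is assumed here.
    Transfers neighbor v's embedding to node u along their relation."""
    return h_v + r_uv

def gcn_layer(H, R, neighbors, rel_type, W_node, W_rel, f=np.tanh):
    """One convolution layer implementing formulas (1) and (2).
    H: (num_nodes, d_in) node embeddings input to layer l
    R: (num_rels, d_in) relation embeddings input to layer l
    neighbors: dict u -> list of one-hop neighbor ids, i.e. N(u)
    rel_type:  dict (u, v) -> relation id T(u, v)
    W_node, W_rel: (d_out, d_in) transformation matrices W^l and W_rel^l
    """
    num_nodes, d_out = H.shape[0], W_node.shape[0]
    H_out = np.zeros((num_nodes, d_out))
    for u in range(num_nodes):
        agg = np.zeros(d_out)
        for v in neighbors.get(u, []):
            h_hat = phi(H[v], R[rel_type[(u, v)]])  # transferred embedding of v
            agg += W_node @ h_hat                   # formula (1): sum over N(u)
        H_out[u] = f(agg)                           # nonlinearity f()
    R_out = f(R @ W_rel.T)                          # formula (2): r^{l+1} = f(W_rel^l r^l)
    return H_out, R_out
```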
Through the above formulas (1) and (2), the target graph convolution network can model the multi-type heterogeneity of the edges (relationships) in the graph of the target enterprise group while keeping the spatial complexity of the relation-type modeling at O(|ε| d_l), i.e., linear in the feature dimension, where |ε| represents the number of relation types in the graph of the target enterprise group and d_l represents the feature dimension of the layer-l embedding vectors.
Of course, it should be appreciated that if the embedding vectors of the neighbor nodes and of the edges are aggregated by means of the entity-relation composition operator alone, certain drawbacks may remain: for example, all neighbors are treated equally, and the aggregation cannot focus on the most relevant neighbors while handling neighborhoods of variable size.
Preferably, to solve the above problem, a self-attention mechanism may be introduced to compute the embedded vector of each node in the graph by attending over its neighbor nodes. To stabilize the learning process of the self-attention mechanism, the present embodiment extends the mechanism to a multi-head self-attention mechanism. In this case, neighborhood structure learning is performed based on the entity-relation combination operator together with the multi-head self-attention mechanism to obtain the final embedded vectors of the nodes and edges. It should be understood that when neighborhood structure learning is performed based on the entity-relation combination operator and the multi-head self-attention mechanism, the embedded vectors of the neighbor nodes and edges of each node can be aggregated layer by layer to obtain the embedded vectors output by each convolution layer of the target graph convolution network, and thus the final embedded vectors of the nodes and edges output by the last convolution layer.
Specifically, the embedded vector convergence process of the node introducing the multi-head self-attention mechanism may include:
Determining an embedded vector of the target neighbor node in the target convolution layer after the embedded vector of the target neighbor node is transferred to the target node according to the entity relation combination operator, the embedded vector input of the target neighbor node of the target node in the target convolution layer and the embedded vector input of the connecting edge of the target node and the target neighbor node in the target convolution layer;
According to a third linear transformation matrix corresponding to the target convolution layer, a target sub-dimension vector (i.e., attention head) of the embedded-vector input of the target node in the target convolution layer, and the embedded vector of the target neighbor node of the target node in the target convolution layer after it has been transferred to the target node, determining the self-attention coefficient of the target neighbor node of the target node corresponding to the target sub-dimension vector and the target convolution layer, and carrying out normalization processing to obtain a normalized self-attention coefficient, wherein the third linear transformation matrix corresponding to the target convolution layer is used for carrying out linear transformation processing on the node embedded vectors input to the target convolution layer;
Determining the aggregation feature of the target node corresponding to the target convolution layer and the target sub-dimension vector according to the normalized self-attention coefficients of the neighbor nodes of the target node corresponding to the target sub-dimension vector and the target convolution layer, the third linear transformation matrix corresponding to the target convolution layer, and the embedded vectors of the neighbor nodes of the target node in the target convolution layer after they have been transferred to the target node;
Merging the aggregation features corresponding to the target sub-dimension vectors of the target node in the target convolution layer to obtain the embedded-vector output of the target node in the target convolution layer;
and determining the embedded vector of each node and each edge output by the last convolution layer of the target graph convolution network as the final embedded vector of each node and each edge.
FIG. 3 is a schematic diagram of an embedding vector convergence process of nodes based on an entity-relationship combining operator and a multi-headed self-attention mechanism in an embodiment of the present specification. The embedding vector aggregation process of the node of the embodiment of the present specification is described below with reference to fig. 3.
In FIG. 3, h1, h2, h3, h4, h5, h6 and h7 represent the original feature vectors of nodes 1-7, respectively; r1 is the relation of node 2 to node 4, representing an inflow of funds from node 4 to node 2; r2 is the relation of node 2 to node 5, representing an inflow of funds from node 5 to node 2; r3 is the relation of node 2 to node 6, representing an outflow of funds from node 2 to node 6; r4 is the relation of node 2 to node 7, representing an outflow of funds from node 2 to node 7; α12 represents the attention of node 1 to node 2, α13 represents the attention of node 1 to node 3, r5 represents the relation between node 2 and node 1, and r6 represents the relation between node 3 and node 1. To represent the attention corresponding to the different sub-dimension vectors in the multi-head attention mechanism, multi-head marks corresponding to the sub-dimension vectors may further be added for distinction. Concat/avg represents the vector concatenation or averaging operation performed on the aggregated vector h1 to obtain the corresponding embedded-vector output h1'. FIG. 3 shows the embedded-vector update mechanism of node 1 under one layer of the neural network: α represents attention, h represents an embedding vector, and the three lines on each attention link (one solid and two dotted) represent three attention heads. Of course, the schematic diagram of FIG. 3 is for reference only, and does not constitute any limitation on the aggregation process of the embodiments of the present specification.
It should be understood that in the embodiment of the present disclosure, the number of heads K in the multi-head self-attention mechanism is also a hyper-parameter and is configurable. In the embodiment of the specification, several values of K can be tried in order to obtain an optimal identification model for enterprises needing financing. For each node u, in the k-th head, a shared single-layer feed-forward attention network a computes the self-attention coefficient of each neighbor node v of node u, as shown in formula (3):

e_{uv}^{l,k} = a( W^{l,k} h_u^l , W^{l,k} ĥ_v^l )    (3)

where W^{l,k} represents the linear transformation matrix shared by all nodes in attention head k of the l-th convolution layer, and ĥ_v^l represents the embedded vector of neighbor node v after it has been transferred to node u.
Furthermore, embodiments of the present description may choose a single-layer feed-forward neural network a parameterized by a weight vector a ∈ R^{2 d_l}, applying the LeakyReLU function for nonlinear processing, where d_l represents the dimensionality of the embedding vectors of the l-th convolution layer. In this case, formula (3) may be expressed as formula (4):

e_{uv}^{l,k} = LeakyReLU( a^T [ W^{l,k} h_u^l ‖ W^{l,k} ĥ_v^l ] )    (4)

where ‖ denotes the vector concatenation operation, which combines the two transformed vectors into a single vector.
The self-attention coefficients calculated in formulas (3) and (4) reveal the importance of the embedding vector of neighbor node v to the embedding vector of node u. In order to make the self-attention coefficients of different neighbors easy to compare, the self-attention coefficients of node u over all of its neighbors can be normalized using, for example, the softmax function, as shown in formula (5):

α_{uv}^{l,k} = softmax_v( e_{uv}^{l,k} ) = exp( e_{uv}^{l,k} ) / Σ_{v'∈N(u)} exp( e_{uv'}^{l,k} )    (5)
After the normalized self-attention coefficients are obtained, the aggregation for each of the K heads of node u may be computed, and the K independent head outputs are then combined to obtain the embedded-vector output of node u. Specifically, determining the aggregation feature of the target node corresponding to the target convolution layer and the target sub-dimension vector, according to the self-attention coefficients of the neighbor nodes of the target node corresponding to the target sub-dimension vector and the target convolution layer, the third linear transformation matrix corresponding to the target convolution layer, and the embedded vectors of the neighbor nodes of the target node in the target convolution layer after they have been transferred to the target node, may include:
According to the self-attention coefficient of the target neighbor node of the target node corresponding to the target sub-dimension vector and the target convolution layer, the third linear transformation matrix corresponding to the target convolution layer, and the embedded vector of the target neighbor node in the target convolution layer after it has been transferred to the target node, carrying out a multiplication operation to obtain the aggregation feature value corresponding to the target neighbor node of the target node;
accumulating the aggregation feature values corresponding to the neighbor nodes of the target node, and then carrying out the preset nonlinear transformation processing, to obtain the aggregation feature of the target node corresponding to the target convolution layer and the target sub-dimension vector.
The aggregation feature of node u in convolution layer l and head k may be implemented as shown in formula (6):

g_u^{l,k} = f( Σ_{v∈N(u)} α_{uv}^{l,k} W^{l,k} ĥ_v^l )    (6)
Of course, it should be understood that in the embodiments of this description, this merging operation may be implemented in multiple ways.
Optionally, as an embodiment, performing a merging operation on the aggregation features corresponding to the target sub-dimension vectors of the target node in the target convolution layer to obtain the embedded-vector output of the target node in the target convolution layer may include:
carrying out a vector concatenation operation on the aggregation features corresponding to the target sub-dimension vectors of the target node in the target convolution layer to obtain the embedded-vector output of the target node in the target convolution layer.
At this time, formula (1) may be modified as shown in formula (7):

h_u^{l+1} = ‖_{k=1}^{K} f( Σ_{v∈N(u)} α_{uv}^{l,k} W^{l,k} ĥ_v^l )    (7)

where ‖ represents the vector concatenation of the K head outputs, forming one feature vector whose dimensionality is K times that of a single head.
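The multi-head aggregation of formulas (3) to (7) may be sketched as follows; the LeakyReLU slope, the additive form of the operator φ, and the per-head shapes are illustrative assumptions:

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def attention_layer(H, R, neighbors, rel_type, W_heads, a_heads, f=np.tanh):
    """Multi-head aggregation per formulas (3)-(7); assumes every node
    has at least one neighbor.
    W_heads: K matrices W^{l,k}, each (d_head, d_in)
    a_heads: K attention weight vectors, each (2 * d_head,)
    """
    num_nodes, K = H.shape[0], len(W_heads)
    outputs = []
    for u in range(num_nodes):
        head_feats = []
        for k in range(K):
            W, a = W_heads[k], a_heads[k]
            Wu = W @ H[u]
            # transferred neighbor embeddings phi(h_v, r_{T(u,v)}) (additive form)
            h_hats = [H[v] + R[rel_type[(u, v)]] for v in neighbors[u]]
            # formulas (3)-(4): unnormalized self-attention coefficients e_uv
            e = np.array([leaky_relu(a @ np.concatenate([Wu, W @ h])) for h in h_hats])
            # formula (5): softmax normalization over N(u)
            alpha = np.exp(e - e.max())
            alpha /= alpha.sum()
            # formula (6): attention-weighted aggregation for head k
            head_feats.append(f(sum(al * (W @ h) for al, h in zip(alpha, h_hats))))
        # formula (7): concatenate the K head outputs
        outputs.append(np.concatenate(head_feats))
    return np.stack(outputs)
```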
Optionally, as an embodiment, performing a merging operation on the aggregation features corresponding to the target sub-dimension vectors of the target node in the target convolution layer to obtain the embedded-vector output of the target node in the target convolution layer may include:
When the target convolution layer is not the last layer of the target graph convolution network, carrying out a vector concatenation operation on the aggregation features corresponding to the target sub-dimension vectors of the target node in the target convolution layer to obtain the embedded-vector output of the target node in the target convolution layer;
and when the target convolution layer is the last layer of the target graph convolution network, carrying out an averaging operation or a weighted averaging operation on the aggregation features corresponding to the target sub-dimension vectors of the target node in the target convolution layer to obtain the embedded-vector output of the target node in the target convolution layer.
That is, at the final convolution layer, various other processing schemes may be used in place of the vector concatenation operation when outputting the node embeddings. For example, the concatenation may be replaced with an averaging operation, as shown in formula (8):

h_u^{L+1} = f( (1/K) Σ_{k=1}^{K} Σ_{v∈N(u)} α_{uv}^{L,k} W^{L,k} ĥ_v^L )    (8)

Of course, the above averaging operation may be replaced by other operations, such as a weighted average, a sum of squares, and the like. The embodiments of the present specification are not limited in this regard.
It should be appreciated that the multi-head self-attention mechanism of the present specification is relatively efficient, because the computation over the neighbors of a node can be parallelized. In addition, the multi-head self-attention mechanism is suited to inductive learning problems and can be generalized to scenarios in which the number of nodes is not known in advance.
Through the entity-relation combination operator and the multi-head self-attention mechanism, the embedded vectors of all nodes and edges output by the last convolution layer can be obtained.
Step S206, constructing a true relationship triplet and a fake relationship triplet based on the final embedded vector of each node and each edge, and determining the score of each relationship triplet.
The relation triples can comprise embedded vectors of head nodes, embedded vectors of tail nodes and embedded vectors of connecting edges of the head nodes and the tail nodes.
It should be understood that after the neighborhood structure learning process of the foregoing step S204, the final embedded-vector outputs of the nodes and edges can be obtained from the target graph convolution network. The set of true relationship triples Δ is then obtained from the final output embeddings of the head node and tail node of each edge together with the final output embedding of the edge itself. For each relationship triplet (h, r, t) ∈ Δ, h represents the head node of the edge, r represents the relationship type of the edge, and t represents the tail node of the edge.
In addition, in the present description embodiment, counterfeit relationship triples may also be generated. In a counterfeit relationship triplet, at least one of the embedded vector of the head node, the embedded vector of the tail node, and the embedded vector of the connecting edge of the head and tail nodes is forged. The present description embodiment introduces the score of a relationship triplet as the metric used to distinguish true relationship triples from counterfeit ones.
Determining the score of the relationship triplet may include:
Determining an aggregate vector of the relation triplet according to the embedded vector of the head node in the relation triplet, the embedded vector of the tail node and the embedded vector of the connecting edge of the head node and the tail node;
a score for the relationship triplet is determined based on the aggregate vector of the relationship triplet.
It should be appreciated that when the graph of the target enterprise group is learned through neighborhood structure learning alone, the accuracy of the obtained node embeddings and relationship triples may deviate, because the multi-structural heterogeneity of the relations in the graph (e.g., one-to-one, one-to-many, many-to-one, and many-to-many) is ignored. For example, if r is a many-to-one structural mapping, i.e., (h_i, r, t) ∈ Δ for all i ∈ {0, 1, …, m}, we would obtain the same final output representation for all the head nodes, i.e., h_0 = h_1 = … = h_m, which is unreasonable in real scenarios. Similarly, if r is a one-to-many structural mapping, i.e., (h, r, t_i) ∈ Δ for all i ∈ {0, 1, …, m}, we would obtain t_0 = t_1 = … = t_m, which is also unreasonable.
In order to solve the problem caused by this multi-structural heterogeneity, a conversion mechanism can be introduced on the relational hyperplane to distinguish the embedded vectors of the nodes. In this case, the head-node embedded vector and the tail-node embedded vector of a relationship triplet are projected onto the hyperplane of the connecting-edge embedded vector in order to determine the score of the relationship triplet.
Optionally, as an embodiment, determining the aggregate vector of the relation triplet according to the embedded vector of the head node, the embedded vector of the tail node, the embedded vector of the connecting edge of the head node and the tail node in the relation triplet includes:
Acquiring a first projection vector of a head node embedded vector of a target relation triplet projected to a hyperplane of a connecting edge embedded vector of the target relation triplet and a second projection vector of a tail node embedded vector of the target relation triplet projected to a hyperplane of a connecting edge embedded vector of the target relation triplet;
summing the first projection vector, the embedded vector of the connecting edge of the target relationship triplet, and the second projection vector in sequence to obtain a vector sum;
and taking the vector sum as the aggregate vector of the target relationship triplet.
Of course, it should be understood that embodiments of the present disclosure may also determine the aggregate vector of a relationship triplet by other means, for example, summing the embedded vector of the head node, the embedded vector of the tail node, and the embedded vector of the connecting edge of the head node and the tail node in the relationship triplet in order to obtain the aggregate vector of the relationship triplet. The present specification embodiments are not limited in this regard.
After the aggregate vector of a relationship triplet is obtained, the score of the relationship triplet may be determined based on its aggregate vector. Here too, embodiments of the present description may determine the score in a variety of ways. Alternatively, the two-norm (L2 norm) of the aggregate vector of the relationship triplet may be taken as the score of the target relationship triplet, or the aggregate vector may be transformed according to weighting values corresponding to each of its dimensions to obtain the score of the relationship triplet, and so on.
FIG. 4 is a schematic illustration of the projection of a node's embedded vector onto relational hyperplanes in an embodiment of the present description. As shown in FIG. 4, the embedded vector h2 of node 2 can be distinguished by projecting it onto the different relational hyperplanes r1 and r2. Under each relationship type r, the embedded vector h_s of a node may be projected onto the hyperplane using the vector w_r to obtain the projected embedding h_{s⊥} = h_s − (w_r^T h_s) w_r. At this time, the score of each relationship triplet (s, r, o) = (s, T(s, o), o) may be represented by the following formula (9):

score(s, r, o) = ‖ h_{s⊥} + r − h_{o⊥} ‖_2    (9)

where w_r represents the unit norm vector of the hyperplane corresponding to relation r, and a smaller score indicates a more plausible triplet.
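A short sketch of the hyperplane projection and the score of formula (9); the toy vectors are assumptions for illustration:

```python
import numpy as np

def project(h, w_r):
    """Project a node embedding onto the hyperplane with unit normal w_r."""
    return h - (w_r @ h) * w_r

def triple_score(z_s, r, z_o, w_r):
    """Formula (9): translation distance between projected head and tail;
    a lower score indicates a more plausible relationship triplet."""
    return np.linalg.norm(project(z_s, w_r) + r - project(z_o, w_r))

rng = np.random.default_rng(0)
z_head, r_vec, z_tail = rng.normal(size=(3, 16))
w_r = rng.normal(size=16)
w_r /= np.linalg.norm(w_r)   # keep the hyperplane norm vector unit length
print(triple_score(z_head, r_vec, z_tail, w_r))
```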
That is, when scoring a relationship triplet, the conversion mechanism on the relational hyperplane allows each node to have a distinguishable embedded vector under different relationship types, which avoids the situation in which the embedded vectors of the nodes collapse to be identical. Using this scoring function to treat funds-inflow and funds-outflow relationship triples separately, the entity-relation combination operator φ of the l-th convolution layer in the neighborhood structure learning process can be instantiated as shown in formula (10):

φ(h_v^l, r^l) = (h_v^l − (w_r^{l,T} h_v^l) w_r^l) + r^l, for triples in Δin(u);
φ(h_v^l, r^l) = (h_v^l − (w_r^{l,T} h_v^l) w_r^l) − r^l, for triples in Δout(u)    (10)

where Δin(u) and Δout(u) represent the funds-inflow and funds-outflow relationship triples of node u, respectively, and w_r^l represents the projection vector of the hyperplane of relationship type r in the l-th convolution layer of the graph convolution network.
In addition, each relational hyperplane projection vector also undergoes a matrix transformation during the neighborhood structure learning of each layer, as shown in formula (11):

w_r^{l+1} = W_w^l w_r^l    (11)

where W_w^l represents the relational hyperplane transformation matrix of the l-th convolution layer in the graph convolution network.
Step S208, adjusting parameters of the target graph convolutional network based on the direction of minimizing the loss function and retraining the target graph convolutional network until the loss function meets a convergence condition.
Wherein the loss function is related to the fractional differences of the true relationship triplet and the false relationship triplet.
It should be appreciated that the embodiments of the present description may perform connection structure learning after obtaining the embedded vectors of the nodes and edges through the L convolution layers of the neighborhood structure learning process. Since the score of formula (9) measures a translation distance, a true relationship triplet should be predicted to have a lower score and a counterfeit relationship triplet a higher one. To maximize the distinction between true and counterfeit relationship triples, the following margin-based loss function may be used:

L = Σ_{(s,r,o)∈Δ} Σ_{(s',r',o')∈Δ'} max( 0, score(s, r, o) + γ − score(s', r', o') )    (12)

where Δ and Δ' represent the sets of true and counterfeit relationship triples, respectively; the scores are computed from the final embedded vectors z_i output by the target graph convolution network for each node i; and γ is the margin separating the positive and negative relationship triples.
When minimizing the loss function, in order to ensure that the embedded vector r^{L+1} of each relationship type r output by the target graph convolution network is adjusted into its relational hyperplane, the following constraints can be considered:

‖w_r‖_2 = 1,  |w_r^T r^{L+1}| / ‖r^{L+1}‖_2 ≤ ε    (13)

where ε is a small tolerance that enforces approximate orthogonality between the relation embedding and the hyperplane norm vector. Imposing these constraints as soft penalties, formula (12) can be rewritten as:

L = Σ_{(s,r,o)∈Δ} Σ_{(s',r',o')∈Δ'} max( 0, score(s,r,o) + γ − score(s',r',o') ) + C Σ_r max( 0, (w_r^T r^{L+1})² / ‖r^{L+1}‖_2² − ε² )    (14)

where C is a hyper-parameter weighting the soft constraints.
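The objective of formulas (12) to (14) may be sketched as follows; the soft-constraint weight C and tolerance ε are illustrative assumptions:

```python
import numpy as np

def margin_loss(pos_scores, neg_scores, gamma):
    """Formula (12): hinge loss separating true (Delta) from counterfeit
    (Delta') triples; both arguments are 1-D arrays of triple scores."""
    return np.maximum(0.0, pos_scores[:, None] + gamma - neg_scores[None, :]).sum()

def orthogonality_penalty(w_r, r, eps=1e-3):
    """Formula (13) as a soft penalty (used in formula (14)): keeps the
    relation embedding r approximately inside its hyperplane w_r."""
    return max(0.0, (w_r @ r) ** 2 / (r @ r) - eps ** 2)

def total_loss(pos_scores, neg_scores, gamma, hyperplanes, relations, C=0.1):
    """Formula (14): margin loss plus weighted soft constraints."""
    penalty = sum(orthogonality_penalty(w, r) for w, r in zip(hyperplanes, relations))
    return margin_loss(pos_scores, neg_scores, gamma) + C * penalty
```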
In the illustrated embodiment, based on the above formula (14), the loss function may be minimized in a variety of ways, for example using stochastic gradient descent (SGD) or another gradient descent algorithm.
Based on the idea of minimizing the loss function, the node, relation, and hyperplane transformation matrices W^{l,k}, W_rel^l and W_w^l of each convolution layer l in the target graph convolution network, as well as the attention weight vector a and the hyperplane norm vector w_r of each relation r, can be adjusted, and the target graph convolution network is then retrained until the loss function meets the convergence condition.
Step S210, training an enterprise identification model to be financing based on the final embedded vector of the target graph convolutional network after training of the nodes corresponding to the enterprise with the financing demand labels.
After the trained target graph convolution network is obtained, the embedded vector output by its last convolution layer for the node corresponding to each enterprise can be obtained, so that the identification model for enterprises needing financing is trained based on the final embedded vector of each node and its financing demand label. The specific training process may refer to the prior art and is briefly described below.
At this time, the enterprise set with the financing demand labels (which may include two labels requiring financing and not requiring financing) may be taken out and divided into a training set and a prediction set, then the enterprise identification model requiring financing is trained according to the training set and the corresponding financing demand labels, and the prediction accuracy of the enterprise identification model requiring financing is determined according to the prediction set and the corresponding financing demand labels, so as to determine whether to perform the next training.
Of course, it should be understood that in embodiments of the present description, a variety of machine learning algorithms may be employed to train the identification model for enterprises needing financing, such as the XGBoost (eXtreme Gradient Boosting) algorithm, the gradient boosting decision tree (Gradient Boosting Decision Tree, GBDT) algorithm, and so on; the embodiments of the present description are not limited in this regard.
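For example, once the final node embeddings of the labeled enterprises are available, a downstream classifier may be fitted roughly as follows; the file names and XGBoost hyperparameters shown are hypothetical:

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

# Final node embeddings of the labeled enterprises (from the trained target
# graph convolution network) and their financing demand labels
# (1 = needs financing, 0 = does not). File names are hypothetical.
embeddings = np.load("labeled_node_embeddings.npy")
labels = np.load("financing_demand_labels.npy")

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.2, stratify=labels, random_state=42)

model = XGBClassifier(n_estimators=200, max_depth=6, learning_rate=0.1)
model.fit(X_train, y_train)
print("prediction accuracy:", model.score(X_test, y_test))
```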
Step S212, based on the trained enterprise identification model to be financing and the final embedded vector of the target graph convolution network after training of the nodes corresponding to the target enterprise without the financing demand label, whether the target enterprise is the enterprise to be financing is identified.
After training of the enterprise identification model to be financing is completed, the enterprise which does not have the financing demand label can be identified based on the enterprise identification model to be financing, so as to screen out the enterprise to be financing.
According to the embodiments of the present specification, neighborhood structure learning and neighborhood connection learning are performed on the map information of the target enterprise group, and the identification model for enterprises needing financing is then trained according to the embedded vectors of the enterprise nodes obtained after learning and their financing demand labels. Whether a target enterprise is an enterprise needing financing is then predicted according to the trained identification model and the embedded vector of the target enterprise node. This can greatly improve the accuracy of locating enterprises that need financing, so that financing product delivery institutions can deliver financing products to such enterprises efficiently.
The embodiment of the present disclosure further provides a training method for the identification model of the enterprise to be financing, which may include the steps shown in step S202-step S210 in the above embodiment, which is not described herein again.
Fig. 5 is a schematic structural diagram of an apparatus for identifying a financing enterprise according to an embodiment of the present disclosure, referring to fig. 5, the apparatus may specifically include:
The neighborhood structure learning module 510 inputs the original feature vector representing the nodes of the enterprise and the original feature vector representing the edges of the enterprise relationship into the target graph convolution network, and performs neighborhood structure learning based on the entity relationship combination operator to obtain final embedded vectors of each node and each edge, wherein the entity relationship combination operator is used for converging the embedded vectors of the neighboring nodes of the target node and the corresponding connecting edges;
The relationship triplet generation module 520 constructs a true relationship triplet and a fake relationship triplet based on the final embedded vector of each node and edge;
a relationship triplet score determination module 530 that determines a score for each relationship triplet;
A graph convolution network parameter adjustment module 540 that adjusts parameters of the target graph convolution network based on a direction that minimizes a loss function and retrains the target graph convolution network until the loss function meets a convergence condition, wherein the loss function is related to the score difference between true relationship triples and counterfeit relationship triples;
The identification model training module 550 is used for training an enterprise identification model to be financing based on the final embedded vector of the target graph convolutional network after training of the nodes corresponding to the enterprise with the financing demand label;
The prediction module 560 identifies whether the target enterprise is an enterprise to be financing based on the trained enterprise to be financing identification model and the final embedded vector of the target graph convolutional network for the node corresponding to the target enterprise without the financing demand label after training.
The device for identifying a financing enterprise according to the embodiment of the present disclosure may further execute the method of the embodiment shown in fig. 2, and implement the function of the corresponding module in the corresponding step of fig. 2, and the specific implementation may refer to the embodiment shown in fig. 2, which is not repeated.
In addition, it should be noted that, among the respective components of the apparatus of the present specification, the components thereof are logically divided according to functions to be realized, but the present specification is not limited thereto, and the respective components may be re-divided or combined as necessary.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure, and referring to fig. 6, the electronic device includes a processor, an internal bus, a network interface, a memory, and a nonvolatile memory, and may include hardware required by other services. The processor reads the corresponding computer program from the nonvolatile memory to the memory and then runs the computer program to form the enterprise identification device to be financing on a logic level. Of course, other implementations, such as logic devices or combinations of hardware and software, are not excluded from the present description, that is, the execution subject of the following processing flows is not limited to each logic unit, but may be hardware or logic devices.
The network interface, processor and memory may be interconnected by a bus system. The bus may be an ISA (Industry Standard Architecture ) bus, a PCI (PERIPHERAL COMPONENT INTERCONNECT, peripheral component interconnect standard) bus, or EISA (Extended Industry Standard Architecture ) bus, among others. The buses may be classified as address buses, data buses, control buses, etc. For ease of illustration, only one bi-directional arrow is shown in FIG. 6, but not only one bus or type of bus.
The memory is used for storing programs. In particular, the program may include program code including computer-operating instructions. The memory may include read only memory and random access memory and provide instructions and data to the processor. The Memory may comprise a Random-Access Memory (RAM) or may further comprise a non-volatile Memory (non-volatile Memory), such as at least 1 disk Memory.
The processor is used for executing the program stored in the memory and specifically executing:
Inputting original feature vectors representing nodes of enterprises and original feature vectors representing edges of enterprise relations into a target graph convolution network, and carrying out neighborhood structure learning based on entity relation combination operators to obtain final embedded vectors of each node and each edge, wherein the entity relation combination operators are used for converging the embedded vectors of the adjacent nodes of the target nodes and the corresponding connecting edges;
Constructing a true relationship triplet and a fake relationship triplet based on the final embedded vector of each node and each edge, and determining the score of each relationship triplet;
adjusting parameters of the target graph convolutional network based on a direction of minimizing a loss function and retraining the target graph convolutional network until the loss function meets a convergence condition, wherein the loss function is related to a fractional difference value of a true relationship triplet and a fake relationship triplet;
training an enterprise identification model to be financing based on the final embedded vector of the target graph convolutional network after training of the nodes corresponding to the enterprise with the financing demand labels;
And identifying whether the target enterprise is the enterprise to be financing based on the trained enterprise identification model to be financing and the final embedded vector of the target graph convolutional network after training of the nodes corresponding to the target enterprise without the financing demand label.
The method performed by the apparatus for identifying enterprises needing financing disclosed in the embodiment of FIG. 2 of the present specification may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The various methods, steps and logic blocks disclosed in the embodiments of this specification may be implemented or performed. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present specification may be embodied directly in a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or register. The storage medium is located in the memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
Based on the same inventive concept, the embodiments of the present disclosure further provide a computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to perform the method for identifying enterprises needing financing or the method for training the identification model for enterprises needing financing provided by the embodiment corresponding to FIG. 2.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random Access Memory (RAM) and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.