Detailed Description
In order to make the objects, technical solutions and advantages of the present disclosure clearer, the technical solutions of the present disclosure will be clearly and completely described below with reference to specific embodiments of the present disclosure and the accompanying drawings. It is to be understood that the described embodiments are only some embodiments of the present disclosure, not all of them. All other embodiments obtained by a person skilled in the art, based on the embodiments in this description and without any inventive effort, belong to the protection scope of this document.
To facilitate understanding of the embodiments of the present specification, some noun explanations are introduced below.
Heterogeneous graph (heterogeneous graph): a structure describing a set of objects, some of which are "related" in some sense. The objects correspond to mathematical abstractions called nodes, and each related pair of nodes is called an edge. A heterogeneous graph may contain more than one type of node and edge.
Small and medium-sized enterprise (SME) graph: a graph-based data structure in which the nodes represent small and micro enterprises and the edges represent relationships among the small and medium-sized enterprises, such as controlling-stake relationships, joint-funding relationships, upstream-downstream relationships, transfer relationships, and the like.
Capital shortage (financial constraint): an enterprise status in which the amount of funds an enterprise owns is less than the amount required to maintain its normal production. Banks need to locate, among a massive number of small enterprises, those that are short of funds, so as to better assist the development of small enterprises.
Embedding (Embedding): a layer commonly found in deep learning network models, mainly used to produce vector representations of sparse features. It solves the length problem of one-hot vectors and can also represent the similarity between features.
Attention mechanism (Attention mechanism): stems from the study of human vision. In cognitive science, owing to bottlenecks in information processing, humans selectively focus on a portion of all available information while ignoring the rest. This mechanism is commonly referred to as an attention mechanism. In deep learning, the core of the attention mechanism is to shift from attending to all information in the network to attending to the key information.
Graph convolutional network (Graph convolutional network): a graph convolutional neural network is a neural network capable of processing graph-structured data. It updates the embeddings of nodes by aggregating or propagating information from the neighborhood of a target node, so as to perform downstream tasks.
Graph attention network (Graph attention network): an improvement of the graph convolutional network. In the process of aggregating neighbor information, the graph attention network achieves weighted aggregation of neighbors by learning the weights of the neighbors. It is therefore more robust to noisy neighbors, and the attention mechanism also gives the model some interpretability.
Hyperplane (Hyperplane): refers to a subspace with dimension n-1 in an n-dimensional linear space. It can partition a linear space into two disjoint parts, so that two vectors that are difficult to distinguish before partitioning are distinguished after partitioning.
Relationship translation (Relational translation): a method for calculating whether a relation holds between two nodes on a heterogeneous graph. The relationships in the heterogeneous graph are regarded as translation vectors between entities; that is, for each factual relationship triple, the entities and the relationship are first represented in the same vector space, and the relationship vector is then regarded as a translation between the head-entity vector and the tail-entity vector.
It should be understood that in embodiments of the present specification, a variety of heterogeneous relationships exist between enterprises, such as a partnership, a parent company relationship, a supplier relationship, a shareholding relationship, a share guarantee relationship, a common owner relationship, and the like. For ease of understanding, the parent company relationship and the supplier relationship are used as illustrations below.
Fig. 1 is a schematic diagram of an enterprise relationship scenario in an embodiment of the present specification. As shown in Fig. 1, enterprise A currently has two heterogeneous enterprise relationships: a parent company relationship and a supplier relationship. Enterprise B is the parent company of enterprise A and can provide funds for enterprise A to develop; enterprise A is a supplier of enterprise C, i.e., an upstream enterprise of enterprise C, and the downstream enterprise C needs to provide funds to the upstream enterprise A as payment for goods. As shown in Fig. 1, a parent company and its subsidiaries are in a one-to-many relationship: one parent company may have multiple subsidiaries that need to be provided with funds. Upstream and downstream companies are in a many-to-many relationship: one enterprise may have multiple suppliers and may simultaneously serve as the supplier of multiple enterprises; that is, one enterprise may have multiple downstream enterprises and multiple upstream enterprises at the same time.
In such a complex enterprise heterogeneous relationship, the fund flow relationship corresponding to different enterprise relationships is also different. In addition, in different enterprise relations, the influence factors of the fund demands of the neighbor nodes on the current node are different.
In order to take account of the influence of various enterprise relationships on enterprise financing requirements, the embodiment of the specification provides a training method for an enterprise identification model needing financing. Fig. 2 is a flowchart of an identification method for an enterprise requiring financing according to an embodiment of the present disclosure. As shown in fig. 2, the method may include:
Step S202, generating map information of the target enterprise group.
Any node in the map information corresponds to one enterprise in the target enterprise group, and any edge in the map information corresponds to the relationship between two enterprises corresponding to the two nodes connected by the edge.
It should be understood that the process of generating enterprise map information belongs to the prior art, and for its specific implementation reference may be made to the relevant content in the prior art, which the embodiments of this specification do not repeat here. In this specification, a Small and Medium-sized Enterprise (SME) group is taken as an example; in this case, the map information of the target enterprise group is an SME graph, which can be represented as G = (V, ε, Δ, H, R), wherein:
V represents the set of SME nodes;
ε represents the set of heterogeneous relationship types between SMEs (i.e., the set of edge types in the SME graph);
Δ represents the set of relationship triples; for each relationship triple (h, r, t) ∈ Δ, h denotes the head node of the edge, r denotes the relationship type of the edge, and t denotes the tail node of the edge;
H represents the original node feature matrix of all nodes in the SME graph;
R represents the relationship feature matrix of the relationship types between nodes in the SME graph.
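To make the data structure concrete, the following is a minimal Python sketch of an SME graph G = (V, ε, Δ, H, R) as defined above. The node count, relation types, feature dimensions, and random features are illustrative assumptions, not part of the embodiments.

```python
import numpy as np

num_nodes, num_rel_types, feat_dim = 5, 3, 16

V = list(range(num_nodes))                             # node set: one id per SME
epsilon = ["parent_company", "supplier", "guarantor"]  # heterogeneous relation types
# Delta: relationship triples (head node, relation-type index, tail node)
Delta = [
    (0, 0, 1),  # enterprise 0 is the parent company of enterprise 1
    (1, 1, 2),  # enterprise 1 is a supplier of enterprise 2
    (3, 2, 4),  # enterprise 3 guarantees enterprise 4
]
H = np.random.randn(num_nodes, feat_dim)      # original node feature matrix
R = np.random.randn(num_rel_types, feat_dim)  # relation feature matrix
```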
It should be understood that in embodiments of the present description, various relationships may exist between enterprises, such as a partnership, a parent company relationship, a supplier relationship, a shareholding relationship, a share guarantee relationship, a common owner relationship, and so on. Of course, it should be understood that the embodiments of this specification primarily consider the different fund flow relationships induced by these business relationships. For example, company A is the parent company of company B; when company B has a fund demand, there is a fund flow relationship in which company A transfers funds to company B. For another example, company B is a supplier of company C; when company B provides goods to company C, there is a fund flow relationship in which company C pays company B for the goods.
Of course, it should be understood that, for the embodiment of this specification, step S202 is optional, and the map information may also be obtained directly, for example, the map information may be read directly from a pre-stored database, or obtained from a third party, or the like, which is not limited in this embodiment of the specification.
Step S204, inputting the original feature vectors of the nodes representing enterprises and the original feature vectors of the edges representing enterprise relationships in the map information of the target enterprise group into a target graph convolution network, and performing neighborhood structure learning based on an entity relationship combination operator to obtain the final embedding (Embedding) vectors of each node and edge.
The entity relation combination operator is used for converging the neighbor nodes of the target node and the embedded vectors of the corresponding connecting edges. Specifically, the entity relationship combination operator is used for converging the embedded vectors of the neighboring nodes of the target node and the embedded vectors of the connecting edges of the target node and the neighboring nodes to obtain the embedded vectors after the embedded vectors of the neighboring nodes are transmitted to the target node.
It should be understood that, in the embodiment of the present specification, the final embedded vector of the node and the edge refers to the embedded vector of the node and the edge output by the last convolutional layer of the target graph convolutional network.
It should be understood that, in the embodiment of the present specification, neighborhood structure learning is performed on the map information of the target enterprise group through the target graph convolution network. When performing neighborhood structure learning based on the entity relationship combination operator to obtain the final embedded vectors of each node and edge, the embedded vectors of the neighbor nodes and connecting edges of each target node may be aggregated layer by layer based on the entity relationship combination operator, to obtain the embedded vector after each neighbor node's embedded vector is passed to the target node, and the embedded vectors of the edges are transformed layer by layer and passed to the next convolutional layer. That is, the embedded vectors of the neighbor nodes and edges of each node are aggregated layer by layer based on the entity relationship combination operator, to obtain the embedded vectors output by each convolutional layer of the target graph convolutional network for each node and edge. In the embodiment of the specification, the number L of convolutional layers of the target graph convolutional network is a configurable hyperparameter. Different L values can be selected according to requirements to obtain the optimal enterprise identification model needing financing; for example, L may take 2, 3, 5, 8, 10, 20, and so on, and each L value yields a corresponding trained model. However, the training cost and the prediction accuracy of the trained model differ across L values, so a balance between training cost and prediction accuracy needs to be considered when configuring L.
It should be understood that in a graph convolutional network, each convolutional layer has a corresponding embedded-vector input and embedded-vector output. In the embodiment of the present specification, the input of the l-th convolutional layer of the target graph convolutional network is a set of node embedding vectors and a set of edge (relation) embedding vectors:

$$H^{(l)} = \big\{h_i^{(l)}\big\}_{i=1}^{|V|}, \qquad R^{(l)} = \big\{r_j^{(l)}\big\}_{j=1}^{|\varepsilon|}$$

where |V| denotes the number of nodes in the node set V, and |ε| denotes the number of relations in the relation set ε; $h_i^{(l)}$ denotes the input embedding vector of node i at the l-th convolutional layer of the graph convolutional network, and $r_j^{(l)}$ denotes the input embedding vector of the j-th edge (relation) at the l-th convolutional layer. The node and edge embedding vectors input to the (l+1)-th convolutional layer are the node and edge embedding vectors output by the l-th convolutional layer. In particular, the inputs of the first convolutional layer are the original feature vectors of the nodes and the original feature vectors of the edges.
In the embodiment of the specification, in order to distinguish the importance of neighbor nodes of different relationship types, an entity-relationship combination operator is designed to aggregate the embedded vectors from neighboring nodes and edges by performing a composition operation on them. For each node u, the embedded-vector inputs from its one-hop neighbors are first aggregated to compute the embedded-vector output of the node. According to the entity relationship combination operator, for each neighbor node v of node u, the embedded vector of v and the embedded vector of the edge connecting u and v determine a transferred embedded vector, i.e., the embedded vector of neighbor node v after being passed to node u. Then, after the transferred embedded vectors from all neighbors of node u are linearly transformed by a transformation matrix, a nonlinear transformation is applied through a preset function f(·) to obtain the embedded-vector output corresponding to node u. It should be understood that the preset function f(·) may be any of various activation functions, which is not limited in this embodiment of the present specification.
Optionally, the neighborhood structure learning process may include:
determining an embedded vector of a target neighbor node in the target convolutional layer after the embedded vector is transmitted to the target node according to the entity relationship combination operator, the embedded vector input of the target neighbor node of the target node in the target convolutional layer and the embedded vector input of the connecting edge of the target node and the target neighbor node in the target convolutional layer;
Determining the embedded vector output of the target node corresponding to the target convolutional layer according to the first linear transformation matrix corresponding to the target convolutional layer and the embedded vector after the embedded vector of each neighbor node of the target node in the target convolutional layer is transmitted to the target node; the first linear transformation matrix corresponding to the target convolutional layer is used for carrying out linear transformation processing on a node embedding vector input by the target convolutional layer;
and determining the embedded vectors of each node and each edge output by the last convolutional layer of the target graph convolutional network as final embedded vectors of each node and each edge.
Optionally, determining an embedded vector output of the target node corresponding to the target convolutional layer according to the first linear transformation matrix corresponding to the target convolutional layer and the embedded vector of each neighboring node of the target node in the target convolutional layer after being transferred to the target node, may include:
according to a first linear transformation matrix corresponding to the target convolutional layer, carrying out linear transformation on an embedded vector of each neighbor node of the target node in the target convolutional layer after the embedded vector is transmitted to the target node, and obtaining a first transformation parameter corresponding to the target neighbor node;
and accumulating the first transformation parameters corresponding to each neighbor node of the target node and then performing preset nonlinear transformation processing to obtain the embedded vector output corresponding to the target node in the target convolution layer.
Alternatively, the embedded-vector output of node u at layer l (i.e., the embedded-vector input of node u at layer l+1) may be expressed as shown in formula (1):

$$h_u^{(l+1)} = f\Big(\sum_{v \in N(u)} W^{(l)}\, \phi\big(h_v^{(l)},\, r_{T(u,v)}^{(l)}\big)\Big) \tag{1}$$

where N(u) denotes the one-hop neighbor set of node u, and T(u, v) denotes the relationship type between node u and its neighbor node v; $r_{T(u,v)}^{(l)}$ denotes the embedded-vector input, at the l-th convolutional layer, of the relation between node u and its neighbor node v; $h_v^{(l)}$ denotes the embedded-vector input of neighbor node v at the l-th convolutional layer; $\phi\big(h_v^{(l)}, r_{T(u,v)}^{(l)}\big)$ denotes the embedded vector obtained, according to the entity-relation composition operator φ, after the embedded vector of neighbor node v is passed to node u; $W^{(l)}$ denotes the node embedded-vector transformation matrix of the l-th convolutional layer of the graph convolutional network, used for linearly transforming the node embedded vectors input at that layer; and the function f(·) is an activation function used to apply a preset nonlinear transformation to its argument.
In addition, since the update of the embedded vector of node u in formula (1) transforms the original vector space, the relation-vector input of the (l+1)-th layer must undergo a corresponding transformation. The embedded-vector output process of each edge at each convolutional layer may include: according to a second linear transformation matrix corresponding to the target convolutional layer, performing linear transformation on the embedded-vector input corresponding to the target edge at the target convolutional layer, and applying a preset nonlinear transformation to the linearly transformed result, to obtain the embedded-vector output corresponding to the target edge at the target convolutional layer, wherein the second linear transformation matrix corresponding to the target convolutional layer is used for linearly transforming the edge embedded vectors input at that layer.
Optionally, the embedded-vector output $r_i^{(l+1)}$ of the i-th relation at the l-th convolutional layer of the target graph convolutional network, and the corresponding embedded-vector input $r_i^{(l)}$, may be related as shown in formula (2):

$$r_i^{(l+1)} = f\big(W_{rel}^{(l)}\, r_i^{(l)}\big) \tag{2}$$

where $W_{rel}^{(l)}$ denotes the relation transformation matrix of the l-th convolutional layer of the target graph convolutional network. From the embedded-vector input $r_i^{(l)}$ of the i-th relation at the l-th convolutional layer, combined with the relation transformation matrix $W_{rel}^{(l)}$ and the activation function f(·), the relation embedded-vector output of the l-th convolutional layer of the target graph convolutional network is obtained.
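For illustration, the following is a minimal numpy sketch of one convolutional layer implementing formulas (1) and (2). The translation-style composition operator phi, the tanh activation, and the undirected treatment of edges are simplifying assumptions of this sketch; the embodiments instantiate φ differently in formula (10) below.

```python
import numpy as np

def f(x):
    # Activation function; tanh is an arbitrary illustrative choice.
    return np.tanh(x)

def phi(h_v, r):
    # Entity-relation composition operator (placeholder: simple translation).
    return h_v + r

def conv_layer(H, R, Delta, W_node, W_rel):
    """One layer: formula (1) for node embeddings, formula (2) for relations.
    H: (|V|, d) node embeddings; R: (|eps|, d) relation embeddings;
    Delta: triples (u, r, v) connecting head u to tail v via relation r."""
    H_out = np.zeros_like(H)
    for (u, r, v) in Delta:
        # Aggregate the transferred neighbor embeddings; edges are treated
        # as undirected here for simplicity.
        H_out[u] += W_node @ phi(H[v], R[r])
        H_out[v] += W_node @ phi(H[u], R[r])
    H_out = f(H_out)          # formula (1): sum over neighbors, then nonlinearity
    R_out = f(R @ W_rel.T)    # formula (2): linear transform plus activation
    return H_out, R_out
```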
It should be understood that the activation function f(·) in formula (1) for computing node embedded vectors and the activation function f(·) in formula (2) for computing edge embedded vectors may be the same or different, but the same activation function is generally used.
With the above formulas (1) and (2), the target graph convolutional network can model the multi-type heterogeneity of the edges (relations) in the map of the target enterprise group while keeping the spatial complexity of relation-type modeling at $O(|\varepsilon|\, d_l)$, i.e., linear in the feature dimension, where |ε| denotes the number of relation types in the map of the target enterprise group and $d_l$ denotes the feature dimension of the embedding vectors at the l-th layer.
It will of course be appreciated that if the embedded vectors of neighbor nodes and edges are aggregated by the entity-relation composition operation alone, certain drawbacks may arise: for example, all neighbors are treated uniformly, and the aggregation cannot focus on the most relevant neighbors.
Preferably, in order to solve the above problem, a self-attention mechanism may be introduced to calculate an embedding vector of each node in the graph by paying attention to neighboring nodes. In order to stabilize the learning process of the self-attention mechanism, the embodiment of the specification expands the mechanism and adopts a multi-head self-attention mechanism. At this time, neighborhood structure learning can be performed based on the entity relationship combination operator and the multi-head self-attention mechanism to obtain final embedded vectors of each node and edge. It should be understood that, when the neighborhood structure learning is performed based on the entity relationship combination operator and the multi-head self-attention mechanism, the embedded vectors of the neighbor nodes of each node and the embedded vectors of each edge can be aggregated layer by layer to obtain the embedded vectors output by each convolutional layer in the target graph convolutional network of each node and edge, so as to obtain the final embedded vectors of each node and edge output by the last convolutional layer.
Specifically, the embedded vector aggregation process of the node introducing the multi-head self-attention mechanism may include:
determining an embedded vector of a target neighbor node in the target convolutional layer after the embedded vector is transmitted to the target node according to the entity relationship combination operator, the embedded vector input of the target neighbor node of the target node in the target convolutional layer and the embedded vector input of the connecting edge of the target node and the target neighbor node in the target convolutional layer;
determining, according to a third linear transformation matrix corresponding to the target convolutional layer, the embedded-vector input of the target node at the target convolutional layer, and the embedded vector of a target neighbor node after being passed to the target node at the target convolutional layer, the self-attention coefficient of the target neighbor node of the target node under a target attention head at the target convolutional layer, and normalizing it to obtain a normalized self-attention coefficient, wherein the third linear transformation matrix corresponding to the target convolutional layer is used for linearly transforming the node embedded vectors input at that layer;
determining the aggregation feature of the target node corresponding to the target convolutional layer and the target attention head according to the normalized self-attention coefficients of the neighbor nodes of the target node under the target attention head at the target convolutional layer, the third linear transformation matrix corresponding to the target convolutional layer, and the embedded vectors of the neighbor nodes after being passed to the target node at the target convolutional layer;
merging the aggregation features corresponding to the attention heads of the target node at the target convolutional layer to obtain the embedded-vector output of the target node at the target convolutional layer;
and determining the embedded vector of each node and edge output by the last convolutional layer of the target graph convolutional network as the final embedded vector of each node and edge.
FIG. 3 is a schematic diagram of an embedding vector aggregation process of a node based on an entity relationship combination operator and a multi-head self-attention mechanism according to an embodiment of the present specification. The embedded vector aggregation process of the node according to the embodiment of the present disclosure is described below with reference to fig. 3.
In FIG. 3, h1, h2, h3, h4, h5, h6 and h7 represent the original feature vectors of nodes 1-7, respectively. r1 is the relationship between node 2 and node 4, indicating a flow of funds from node 4 to node 2; r2 is the relationship between node 2 and node 5, indicating a flow of funds from node 5 to node 2; r3 is the relationship between node 2 and node 6, indicating a flow of funds from node 2 to node 6; r4 is the relationship between node 2 and node 7, indicating a flow of funds from node 2 to node 7. α12 denotes the attention coefficient of node 2 with respect to node 1, and α13 the attention coefficient of node 3 with respect to node 1; r5 represents the relationship between node 2 and node 1, and r6 the relationship between node 3 and node 1. To indicate the attention corresponding to different heads in the multi-head attention mechanism, head-specific marks may further be added for distinction. Concat/avg denotes the vector concatenation or averaging operation performed on the aggregated vector h1 to obtain the corresponding embedded-vector output h1'. FIG. 3 shows the embedded-vector update mechanism of node 1 in one layer of the network, where α denotes attention and h denotes an embedding vector; the three lines on each attention link (one solid, two dashed) indicate three attention heads. Of course, the schematic diagram of FIG. 3 is for reference only and does not constitute any limitation on the aggregation process of the embodiments of the present specification.
It should be understood that, in the embodiment of the present specification, the number of heads K in the multi-head self-attention mechanism is also a configurable hyperparameter, and multiple K values can be tried to obtain the optimal enterprise identification model needing financing. For each node u, in the k-th head, a shared single-layer feedforward attention network a may be used to calculate the self-attention coefficient of each neighbor node v of node u, as shown in formula (3):

$$e_{uv}^{(l),k} = a\Big(W^{(l),k} h_u^{(l)},\; W^{(l),k}\,\phi\big(h_v^{(l)}, r_{T(u,v)}^{(l)}\big)\Big) \tag{3}$$

where $W^{(l),k}$ denotes the linear transformation matrix shared by all nodes in attention head k of the l-th convolutional layer. Furthermore, the embodiment of the present specification may implement a as a single-layer feedforward neural network parameterized by a weight vector $\mathbf{a} \in \mathbb{R}^{2 d_l}$ followed by the LeakyReLU nonlinearity, where $d_l$ denotes the dimension of the embedding vectors at the l-th convolutional layer. Formula (3) can then be expressed as:

$$e_{uv}^{(l),k} = \mathrm{LeakyReLU}\Big(\mathbf{a}^{\top}\big[\,W^{(l),k} h_u^{(l)} \,\|\, W^{(l),k}\,\phi\big(h_v^{(l)}, r_{T(u,v)}^{(l)}\big)\big]\Big) \tag{4}$$

where ‖ denotes the vector concatenation operation that merges the two transformed vectors into one vector.
The self-attention coefficients calculated in formulas (3) and (4) reveal the importance of the embedded vector of neighbor node v to the embedded vector of node u. To make the self-attention coefficients of different neighbors comparable, the self-attention coefficients of node u over all its neighbor nodes can be normalized using a function such as softmax, as shown in formula (5):

$$\alpha_{uv}^{(l),k} = \mathrm{softmax}_v\big(e_{uv}^{(l),k}\big) = \frac{\exp\big(e_{uv}^{(l),k}\big)}{\sum_{v' \in N(u)} \exp\big(e_{uv'}^{(l),k}\big)} \tag{5}$$
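The following is a minimal numpy sketch of formulas (3)-(5) for a single node u and a single attention head k; the shapes and the explicit loop are illustrative assumptions of this sketch.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def attention_coeffs(h_u, transferred_neighbors, W_k, a_k):
    """transferred_neighbors: list of composed vectors phi(h_v, r), one per
    neighbor v of u; W_k: the (d, d) matrix shared within head k; a_k: the
    weight vector of the single-layer feedforward network, length 2*d."""
    scores = []
    for h_tv in transferred_neighbors:
        concat = np.concatenate([W_k @ h_u, W_k @ h_tv])  # formula (4): [.. || ..]
        scores.append(leaky_relu(a_k @ concat))
    scores = np.asarray(scores)
    exp = np.exp(scores - scores.max())  # formula (5): softmax over u's neighbors
    return exp / exp.sum()
```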
after obtaining the normalized attention coefficients, the K-th head self-attention coefficient of each node u may be calculated, and then K independent self-attention coefficients are concatenated to obtain the embedded vector output for each node u. Specifically, determining the aggregation characteristics of the target node corresponding to the target convolutional layer and the target multidimensional vector according to the self-attention coefficient of each neighbor node of the target node corresponding to the target multidimensional vector and the target convolutional layer, the third linear transformation matrix corresponding to the target convolutional layer, and the embedded vector of each neighbor node of the target node in the target convolutional layer after being transferred to the target node may include:
According to the target fractal dimension vector of the target node, the self-attention coefficient corresponding to the target convolution layer, the third linear transformation matrix corresponding to the target convolution layer and the embedded vector of the target neighbor node in the target convolution layer after being transmitted to the target node, multiplication is carried out to obtain the aggregation characteristic value corresponding to the target neighbor node of the target node;
and accumulating the aggregation characteristic values corresponding to the neighbor nodes of the target node and then performing preset nonlinear transformation processing to obtain the aggregation characteristics corresponding to the target node in the target convolution layer and the target fractal vector.
The aggregation feature of node u corresponding to convolutional layer l and attention head k may be implemented as shown in formula (6):

$$h_u^{(l),k} = f\Big(\sum_{v \in N(u)} \alpha_{uv}^{(l),k}\, W^{(l),k}\,\phi\big(h_v^{(l)}, r_{T(u,v)}^{(l)}\big)\Big) \tag{6}$$
of course, it should be understood that in the embodiments of the present specification, there may be a plurality of operation modes for the merge operation.
Optionally, as an embodiment, the merging the aggregation features corresponding to the target multidimensional vectors of the target node in the target convolutional layer to obtain an embedded vector output of the target node in the target convolutional layer may include:
and carrying out vector merging operation on the aggregation characteristics corresponding to the target dimension vectors of the target nodes in the target convolutional layer to obtain the embedded vector output of the target nodes in the target convolutional layer.
In this case, formula (1) may be rewritten as shown in formula (7):

$$h_u^{(l+1)} = \Big\Vert_{k=1}^{K} f\Big(\sum_{v \in N(u)} \alpha_{uv}^{(l),k}\, W^{(l),k}\,\phi\big(h_v^{(l)}, r_{T(u,v)}^{(l)}\big)\Big) \tag{7}$$

where ‖ denotes the vector concatenation operation that joins the outputs of the K attention heads into a single feature vector.
Optionally, as an embodiment, merging the aggregation features corresponding to the attention heads of the target node at the target convolutional layer to obtain the embedded-vector output of the target node at the target convolutional layer may include:
when the target convolutional layer is not the last convolutional layer of the target graph convolutional network, performing a vector concatenation operation on the aggregation features corresponding to the attention heads of the target node at the target convolutional layer, to obtain the embedded-vector output of the target node at the target convolutional layer;
and when the target convolutional layer is the last convolutional layer of the target graph convolutional network, performing an averaging operation or a weighted-averaging operation on the aggregation features corresponding to the attention heads of the target node at the target convolutional layer, to obtain the embedded-vector output of the target node at the target convolutional layer.
That is, when the last convolutional layer outputs node embedded vectors, processing other than vector concatenation can be adopted. For example, the concatenation may be replaced by averaging over the heads, as shown in formula (8):

$$h_u^{(L+1)} = f\Big(\frac{1}{K}\sum_{k=1}^{K}\sum_{v \in N(u)} \alpha_{uv}^{(L),k}\, W^{(L),k}\,\phi\big(h_v^{(L)}, r_{T(u,v)}^{(L)}\big)\Big) \tag{8}$$

Of course, the averaging operation may also be replaced by other operations, such as weighted averaging, sum of squares, and the like; the embodiments of the present specification are not limited thereto.
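The per-head aggregation of formula (6) and the two merge variants of formulas (7) and (8) can be sketched as follows; all inputs are assumed to be plain numpy arrays, and the tanh activation is an illustrative choice.

```python
import numpy as np

def aggregate_heads(alpha_per_head, neighbors_per_head, W_per_head,
                    last_layer=False, f=np.tanh):
    """alpha_per_head[k][i]: normalized attention of neighbor i in head k;
    neighbors_per_head[k][i]: transferred embedding phi(h_v, r) of that
    neighbor; W_per_head[k]: the shared transformation matrix of head k."""
    head_outputs = []
    for alpha, nbrs, W_k in zip(alpha_per_head, neighbors_per_head, W_per_head):
        agg = sum(a * (W_k @ h) for a, h in zip(alpha, nbrs))  # formula (6)
        head_outputs.append(f(agg))
    if last_layer:
        return np.mean(head_outputs, axis=0)  # formula (8): average the heads
    return np.concatenate(head_outputs)       # formula (7): concatenate the heads
```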
It should be appreciated that the multi-head self-attention mechanism of the embodiments of the present specification is highly efficient because the processing of a node's neighbors can be parallelized. In addition, the multi-head self-attention mechanism is suitable for inductive learning problems and can generalize to scenarios where the number of nodes is not known in advance.
Through the entity relationship combination operator and the multi-head self-attention mechanism described above, the embedded vectors of all nodes and edges output by the last convolutional layer are obtained.
Step S206, constructing true relationship triples and forged relationship triples based on the final embedded vectors of the nodes and edges, and determining the score of each relationship triple.
The relationship triples may include the embedded vectors of the head nodes, the embedded vectors of the tail nodes, and the embedded vectors of the connecting edges of the head and tail nodes.
It should be understood that after the neighborhood structure learning process of step S204, the final embedded-vector output of each node and each edge can be obtained from the target graph convolutional network. The set of true relationship triples Δ is obtained from the finally output embedded vectors of the head and tail nodes of each edge and the finally output embedded vector of each edge. For each relationship triple (h, r, t) ∈ Δ, h denotes the head node of the edge, r denotes the relationship type of the edge, and t denotes the tail node of the edge.
In addition, in the embodiment of the present specification, a forged relationship triple may also be generated. At least one of the embedded vector of the head node, the embedded vector of the tail node and the embedded vector of the connecting edge of the head node and the tail node in the forged relationship triple is forged. The embodiment of the present specification introduces an index, which is the score of a relationship triple, to distinguish a true relationship triple from a forged relationship triple.
Determining scores for relationship triplets may include:
determining an aggregation vector of the relation triple according to the embedded vector of the head node, the embedded vector of the tail node and the embedded vectors of the connecting edges of the head node and the tail node in the relation triple;
determining a score for the relationship triplet based on the aggregated vector of relationship triplets.
It should be understood that, when the map of the target enterprise group is read out through neighborhood structure learning alone, the multiple structural heterogeneity of the relations in the map, such as one-to-one, one-to-many, many-to-one and many-to-many, is ignored, so the accuracy of the obtained node embedded vectors and relationship triples may deviate. For example, if r is a many-to-one relation, i.e., there are triples $(h_0, r, t), (h_1, r, t), \ldots, (h_m, r, t) \in \Delta$, then all the head nodes will receive the same final output representation, i.e., $h_0 = h_1 = \cdots = h_m$, which is not reasonable in real scenarios. Similarly, if r is a one-to-many relation, then $t_0 = t_1 = \cdots = t_m$ is obtained, which is also not reasonable.
In order to solve the problem caused by multi-structure heterogeneity, a conversion mechanism can be introduced on the relation hyperplane to distinguish the embedded vectors of the nodes. At this time, the head node embedding vector and the tail node embedding vector of the relational triplet are projected to the hyperplane of the connecting edge embedding vector to determine the score of the relational triplet.
Optionally, as an embodiment, determining an aggregation vector of the relationship triplet according to the embedded vector of the head node, the embedded vector of the tail node, and the embedded vectors of the connecting edges of the head node and the tail node in the relationship triplet includes:
acquiring a first projection vector of a head node embedding vector of a target relation triple projected to a hyperplane of a connecting edge embedding vector of the target relation triple and a second projection vector of a tail node embedding vector of the target relation triple projected to a hyperplane of the connecting edge embedding vector of the target relation triple;
summing the first projection vector, the connecting-edge embedded vector of the target relationship triple, and the second projection vector in sequence;
and taking the resulting vector sum as the aggregation vector of the relationship triple.
Of course, it should be understood that the embodiments of the present specification may also determine the aggregation vector of the relational triple by other ways, for example, sequentially summing the embedded vector of the head node, the embedded vector of the tail node, and the embedded vectors of the connecting edges of the head node and the tail node in the relational triple to obtain the aggregation vector of the relational triple. The embodiments of the present disclosure are not limited thereto.
After the aggregation vector of the relationship triple is obtained, the score of the relationship triple can be determined based on the aggregation vector of the relationship triple. Similarly, embodiments of the present description may determine scores for relationship triples in a variety of ways. Optionally, a vector two-norm of an aggregation vector of relationship triples may be obtained as a score of the target relationship triples. Or, the aggregation vector is transformed according to the weighted value corresponding to each dimension of the aggregation vector to obtain the score of the relation triple, and the like.
Fig. 4 is a schematic diagram of projecting node embedded vectors onto relation hyperplanes in an embodiment of the present description. As shown in Fig. 4, the embedded vector of node 2 can be distinguished by projecting it onto the different relation hyperplanes r1 and r2. Under each relationship type r, the unit normal vector $w_r$ of the hyperplane corresponding to r may be used to project the embedded vector $h_s$ of each node onto the hyperplane:

$$h_{s\perp} = h_s - \big(w_r^{\top} h_s\big)\, w_r$$

At this point, the score of each relationship triple $(s, r, o) = (s, T(s, o), o)$ can be expressed as shown in formula (9):

$$f(s, r, o) = \big\Vert h_{s\perp} + r - h_{o\perp} \big\Vert_2 \tag{9}$$

where $w_r$ denotes the normal vector of the hyperplane corresponding to relation r.
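A minimal numpy sketch of the hyperplane projection and the triple scoring of formula (9); w_r is assumed here to be a unit normal vector.

```python
import numpy as np

def project(h, w_r):
    # Project a node embedding onto the hyperplane with unit normal w_r.
    return h - (w_r @ h) * w_r

def triple_score(h_s, r, h_o, w_r):
    """Formula (9): relational translation on the hyperplane of relation r.
    A small score indicates a plausible (true-looking) triple."""
    return np.linalg.norm(project(h_s, w_r) + r - project(h_o, w_r))
```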
That is, when scoring a relationship triple, the conversion mechanism on the relation hyperplane gives each node a distinguishable embedded vector under different relationship types, which avoids the situation where the embedded vectors of nodes collapse to the same value. By using the scoring function to treat fund-inflow relationship triples and fund-outflow relationship triples separately, the entity relationship combination operator φ of the l-th convolutional layer in the neighborhood structure learning process can be instantiated as shown in formula (10):

$$\phi\big(h_v^{(l)}, r^{(l)}\big) = \begin{cases} h_{v\perp}^{(l)} + r^{(l)}, & (v, r, o) \in \Delta_{in}(o) \\ h_{v\perp}^{(l)} - r^{(l)}, & (o, r, v) \in \Delta_{out}(o) \end{cases} \tag{10}$$

where $\Delta_{in}(o)$ and $\Delta_{out}(o)$ denote the fund-inflow relationship triples and fund-outflow relationship triples of node o, respectively, $h_{v\perp}^{(l)}$ is the projection of $h_v^{(l)}$ onto the relation hyperplane, and $w_r^{(l)}$ denotes the projection (normal) vector of the hyperplane of relationship type r at the l-th convolutional layer of the graph convolutional network.

In addition, in the neighborhood structure learning of each layer l, each relation-hyperplane projection vector is itself subjected to a matrix transformation, as shown in formula (11):

$$w_r^{(l+1)} = f\big(W_w^{(l)}\, w_r^{(l)}\big) \tag{11}$$

where $W_w^{(l)}$ denotes the relation-hyperplane transformation matrix of the l-th convolutional layer of the graph convolutional network.
Step S208, adjusting parameters of the target graph convolution network based on the direction of minimizing the loss function and retraining the target graph convolution network until the loss function satisfies the convergence condition.
Wherein the loss function is associated with a fractional difference of a true relationship triplet and a false relationship triplet.
It should be appreciated that the embodiments of the present specification may perform connection structure learning after obtaining the embedded vectors of nodes and edges through the L convolutional layers of the neighborhood structure learning process. In order to maximize the distinction between true relationship triples and forged relationship triples, the following margin-based loss function may be used, as shown in formula (12):

$$\mathcal{L} = \sum_{(h,r,t) \in \Delta} \sum_{(h',r',t') \in \Delta'} \max\big(0,\; f(h, r, t) + \gamma - f(h', r', t')\big) \tag{12}$$

where Δ and Δ' denote the sets of true and forged relationship triples, respectively, f(·) is the triple scoring function of formula (9) evaluated on the final embedded vectors $z_i$ output by the target graph convolutional network for each node i, and γ is the margin separating the positive and negative relationship triples.
When minimizing the above loss function, in order to ensure that the embedded vector $r_r^{(L+1)}$ of each relationship type r output by the target graph convolutional network is adjusted to lie within its relation hyperplane, the following constraints may be considered, as shown in formula (13):

$$\Vert w_r \Vert_2 = 1, \qquad \frac{\big(w_r^{\top} r_r^{(L+1)}\big)^2}{\big\Vert r_r^{(L+1)} \big\Vert_2^2} \le \epsilon^2 \tag{13}$$

where ε is an error tolerance ensuring orthogonality. With these constraints, formula (12) can be rewritten as:

$$\mathcal{L} = \sum_{(h,r,t) \in \Delta} \sum_{(h',r',t') \in \Delta'} \max\big(0, f(h,r,t) + \gamma - f(h',r',t')\big) + C \sum_{r} \Big[\frac{\big(w_r^{\top} r_r^{(L+1)}\big)^2}{\big\Vert r_r^{(L+1)} \big\Vert_2^2} - \epsilon^2\Big]_+ \tag{14}$$

where C is a hyperparameter weighting the soft constraint.
In the illustrated embodiment, the loss function can be minimized in a variety of ways based on formula (14), for example by using stochastic gradient descent (SGD) or other gradient descent algorithms.
Based on the idea of minimizing the above loss function, the linear transformation matrices $W^{(l),k}$, $W_{rel}^{(l)}$ and $W_w^{(l)}$ of any convolutional layer l of the target graph convolutional network, the weight vector a, and the hyperplane normal vector $w_r$ of each relation r can be adjusted, and the target graph convolutional network is then retrained until the loss function satisfies the convergence condition.
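The margin-based loss of formula (12) can be sketched as follows, reusing the triple_score() function above; the exhaustive pairing of true and forged triples is an illustrative simplification (in practice, negative sampling is typical).

```python
def margin_loss(true_scores, fake_scores, gamma=1.0):
    """Formula (12): margin-based ranking loss over pairs of true and forged
    relationship-triple scores (a lower score means a more plausible triple)."""
    loss = 0.0
    for s_true in true_scores:
        for s_fake in fake_scores:
            # Push each forged triple's score above the true triple's score
            # by at least the margin gamma.
            loss += max(0.0, s_true + gamma - s_fake)
    return loss
```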
Step S210, training the enterprise identification model needing financing based on the final embedded vectors, output by the trained target graph convolutional network, of the nodes corresponding to enterprises having financing demand labels.
After the finally trained target graph convolutional network is obtained, the embedded vector output at its last convolutional layer for the node corresponding to each enterprise can be obtained, so that the enterprise identification model needing financing is trained based on the final embedded vector of each node and the financing demand labels. The specific training process may refer to the prior art and is briefly introduced below.
At this time, the enterprise set with financing demand labels (which may include two labels: having financing demand and having no financing demand) is taken out and divided into a training set and a prediction set; the enterprise identification model needing financing is then trained according to the training set and the corresponding financing demand labels, and the prediction accuracy of the model is determined according to the prediction set and the corresponding financing demand labels, so as to decide whether to perform the next round of training.
Of course, it should be understood that, in this embodiment of the present disclosure, a variety of artificial intelligence algorithms may be used to train the enterprise identification model needing financing, for example, the XGBoost (eXtreme Gradient Boosting) algorithm, the Gradient Boosting Decision Tree (GBDT) algorithm, and the like, which is not limited in this embodiment of the present disclosure.
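As a hedged sketch of step S210, the final node embedded vectors can be fed to any off-the-shelf binary classifier; the example below assumes the XGBoost scikit-learn wrapper is available and uses placeholder embeddings and labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Placeholder inputs: final node embeddings from the trained graph network
# and financing-demand labels (1 = has financing demand, 0 = none).
final_embeddings = np.random.randn(200, 64)
labels = np.random.randint(0, 2, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    final_embeddings, labels, test_size=0.2, random_state=0)

model = XGBClassifier(n_estimators=100, max_depth=4)
model.fit(X_train, y_train)
print("prediction accuracy:", model.score(X_test, y_test))
```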
Step S212, identifying whether a target enterprise is an enterprise needing financing based on the trained enterprise identification model needing financing and the final embedded vector, output by the trained target graph convolutional network, of the node corresponding to the target enterprise without a financing demand label.
After the training of the enterprise identification model needing financing is completed, the enterprises without the financing demand labels can be identified based on the enterprise identification model needing financing, so that the enterprises needing financing can be screened out.
In the embodiment of the specification, neighborhood structure learning and connection structure learning are performed according to the map information of the target enterprise group, and the enterprise identification model needing financing is then trained according to the embedded vectors of the enterprise nodes obtained after learning and their financing demand labels, so that whether a target enterprise is an enterprise needing financing is predicted according to the trained model and the embedded vector of the target enterprise node. The positioning accuracy for enterprises needing financing can thereby be greatly improved, so that a financing product delivery mechanism can deliver financing products to enterprises needing financing efficiently.
The embodiment of the present specification further provides a training method for an enterprise identification model needing financing, which may include steps shown in step S202 to step S210 in the foregoing embodiment, and the embodiment of the present specification is not described herein again.
Fig. 5 is a schematic structural diagram of an enterprise identification device to be financed according to an embodiment of the present disclosure, and referring to fig. 5, the device may specifically include:
the neighborhoodstructure learning module 510 inputs the original feature vectors of the nodes representing the enterprises and the original feature vectors of the edges representing the enterprise relations in the graph information of the target enterprise group into a target graph convolution network, and performs neighborhood structure learning based on an entity relation combination operator to obtain final embedded vectors of all the nodes and edges, wherein the entity relation combination operator is used for converging the adjacent nodes of the target nodes and the embedded vectors of the corresponding connecting edges;
a relationship triple generation module 520, configured to construct true relationship triples and forged relationship triples based on the final embedded vectors of the nodes and edges;
a relationship triple score determining module 530, configured to determine the score of each relationship triple;
a graph convolution network parameter adjustment module 540, configured to adjust parameters of the target graph convolution network based on the direction of minimizing a loss function and retrain the target graph convolution network until the loss function satisfies a convergence condition, where the loss function is related to the difference between the scores of true and forged relationship triples;
an identification model training module 550, configured to train the enterprise identification model needing financing based on the final embedded vectors, output by the trained target graph convolutional network, of the nodes corresponding to enterprises having financing demand labels;
and a prediction module 560, configured to identify whether a target enterprise is an enterprise needing financing based on the trained enterprise identification model needing financing and the final embedded vector, output by the trained target graph convolutional network, of the node corresponding to the target enterprise without a financing demand label.
The enterprise identification device needing financing in the embodiment of the present specification may also execute the method in the embodiment shown in fig. 2, and implement the functions of the corresponding modules in the corresponding steps in fig. 2, and the specific implementation may refer to the embodiment shown in fig. 2, and is not described again.
In addition, it should be noted that, in the respective components of the apparatus of the present specification, the components therein are logically divided according to the functions to be implemented thereof, but the present specification is not limited thereto, and the respective components may be newly divided or combined as necessary.
Fig. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure, and referring to fig. 6, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, and may also include hardware required by other services. The processor reads the corresponding computer program from the nonvolatile memory into the memory and then runs the computer program to form the enterprise identification device needing financing on the logic level. Of course, besides the software implementation, the present specification does not exclude other implementations, such as logic devices or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may be hardware or logic devices.
The network interface, the processor and the memory may be interconnected by a bus system. The bus may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 6, but that does not indicate only one bus or one type of bus.
The memory is used for storing programs. In particular, the program may include program code comprising computer operating instructions. The memory may include both read-only memory and random access memory and provides instructions and data to the processor. The Memory may include a Random-Access Memory (RAM) and may also include a non-volatile Memory (non-volatile Memory), such as at least 1 disk Memory.
The processor is used for executing the program stored in the memory and specifically executing the following steps:
inputting original characteristic vectors representing nodes of enterprises and original characteristic vectors representing edges of enterprise relations in map information of a target enterprise group into a target map convolution network, and performing neighborhood structure learning based on an entity relation combination operator to obtain final embedded vectors of all the nodes and the edges, wherein the entity relation combination operator is used for converging neighbor nodes of the target nodes and embedded vectors of corresponding connecting edges;
constructing a true relation triple and a forged relation triple based on the final embedded vectors of the nodes and the edges and determining the fraction of each relation triple;
adjusting parameters of the target graph convolution network based on the direction of the minimized loss function and retraining the target graph convolution network until the loss function meets a convergence condition, wherein the loss function is related to the fraction difference value of a true relation triple and a fake relation triple;
Training an enterprise identification model needing financing based on a final embedded vector of a target graph convolutional network after training of a node corresponding to an enterprise with a financing demand label;
and identifying whether the target enterprise is the enterprise needing financing or not based on the trained enterprise needing financing identification model and the final embedded vector of the target graph convolutional network of the nodes corresponding to the target enterprise without the financing requirement labels after training.
The method executed by the financing-requiring enterprise identification apparatus according to the embodiment shown in fig. 2 of the present specification can be applied to or implemented by a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software. The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of this specification may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present specification may be embodied directly in a hardware decoding processor, or in a combination of hardware and software modules in the decoding processor. The software module may be located in ram, flash memory, rom, prom, or eprom, registers, etc. storage media as is well known in the art. The storage medium is located in a memory, and a processor reads information in the memory and combines hardware thereof to complete the steps of the method.
Based on the same invention creation, the present specification further provides a computer readable storage medium storing one or more programs, which when executed by an electronic device including a plurality of application programs, cause the electronic device to execute the financing-required enterprise identification method or the training method of the financing-required enterprise identification model provided by the embodiment corresponding to fig. 2.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the system embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and reference may be made to the partial description of the method embodiment for relevant points.
The foregoing description of specific embodiments has been presented for purposes of illustration and description. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The description has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the description. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement the information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The above description is only an example of the present disclosure, and is not intended to limit the present disclosure. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.