the relationships in the different industry maps can be given different weights according to actual conditions, such as: if the probability of two persons who have worked together in the criminal group to do a crime again is relatively high, the weight of the number of times of the crime together with the entity is given to be higher; if the probability of two persons working against a crime group who are often connected is relatively high, the weight given to the relationship between the entities as the number of calls is higher. The weight can be obtained through historical data calculation or expert summary experience in related fields;

and (4-3) forming a new network by using the community divided in the step (4-2) as a node. The weight of the edge between the new nodes is the sum of the original weights between the two new nodes (communities), the community nodes have self-connection edges, and the weight of the community nodes is 2 times of the sum of the weights of the edges connected between all the nodes in the communities. And then iterating using the method of step (4-3) for the new network being constructed. Stopping iteration when the network is not changed any more, namely the maximum modularity appears;

Δ Q changes brought by any node moving to other communities:

Δ Q ═ modularity of the community after node i removed the original community-modularity of the community where node i was originally located) + (modularity of community j after node i moved to community j-original modularity of community j).

Compared with the prior art, the technical scheme of the invention has the following beneficial effects:

aiming at the relative dispersion of the service databases of various departments of the public security at present and the identification of criminal teams mainly by the modes of investigation and investigation, meticulous reconnaissance, data investigation and the like of workers, the technical scheme of the invention can effectively decompose social network communities and reduce the team investigation range, thereby improving the working efficiency, reducing the working intensity and achieving the aim of assisting decision.

Drawings

FIG. 1 is a social network diagram listed in the present invention;

fig. 2 is a flow chart of a knowledge-graph based hybrid group discovery method in an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

As shown in fig. 1-2, the present invention provides a specific embodiment of a knowledge-graph-based hybrid partnership discovery method, which is described herein using a sample (knowledge-graph-based discovery of a community in which a suspect is located); the system comprises a data acquisition module [1], a data sorting module [2], a knowledge graph building module [3] and a community dividing module [4 ];

the system diagram is as shown in fig. 2, and a data acquisition module [1] is used for cleaning and converting mastered relevant data of each social relation of a criminal suspect according to the characteristics of the existing business database of each department of public security;

the data sorting module [2] is used for sorting out attributes and relations required by the entities in the map and naming the attributes and relations in a unified manner;

the knowledge graph building module [3] is used for building an entity relationship network;

the community division module [4] performs community division on the entity relationship network by using a louvain algorithm;

specifically, the data acquisition module [1] is responsible for centralizing data stored in business databases of all departments of a public security organization, such as social basic information, case information and other data;

the data sorting module [2] sorts the entity data set, the relationship types among the entities and the entity attribute types; like the social network in fig. 1, the entities are persons (a, b, c, d), the attributes may be basic information of the persons, i.e., birth date, graduation, native place, profession, occupation, etc., and the relationships include the number of calls, the number of co-morbidities, the number of co-dwellings, etc.

A knowledge graph module [3] is constructed, namely an entity relation network is constructed, and the relation can be extracted from unstructured data or directly extracted from structured data;

The community dividing module [4] extracts a sub-network with a certain suspected person as a center and a path of n from the knowledge graph of the knowledge graph building module [3], divides the sub-network into communities by a louvain algorithm, namely, the compactness of one community is measured by the modularity, and if a node is added into a certain community, the modularity of the community is increased to the maximum extent, the node should belong to the community. If the modularity of the community is not increased after other communities are added, the community is left in the current community. The value of the path n is generally 2 according to a six-degree separation theory (the depth of any network does not exceed 4 layers) and case experience (the depth of the connection between any two persons in a criminal partnership is mostly within a two-layer relation), and the method is as follows:

Σ in: the sum of the weights of all edges inside community C;

ki: the sum of the weights of all edges that occur at node i;

m: the sum of the weights of all edges in the network;

the relationships in the different industry maps can be given different weights according to actual conditions, such as: if the probability of two persons who have worked together in the criminal group to do a crime again is relatively high, the weight of the number of times of the crime with the entity is given to be a higher value. If the probability of two persons working against a crime group who are often connected is relatively high, the weight given to the relationship between the entities as the number of calls is higher. The weight can be obtained through historical data calculation or expert summary experience in related fields;

Δ Q changes brought by any node moving to other communities:

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

The principle and embodiments of the present invention have been described herein by way of specific examples, which are provided only to help understand the method and the core idea of the present invention, and the above is only a preferred embodiment of the present invention, and it should be noted that there are objectively infinite specific structures due to the limited character expressions, and it will be apparent to those skilled in the art that a plurality of modifications, decorations or changes can be made without departing from the principle of the present invention, and the above technical features can also be combined in a suitable manner; such modifications, variations, combinations, or adaptations of the invention using its spirit and scope, as defined by the claims, may be directed to other uses and embodiments.

Claims

1. A mixed group discovery method based on knowledge graph is characterized by comprising a data acquisition module (1), a data sorting module (2), a knowledge graph building module (3) and a community dividing module (4);

the data acquisition module (1) is used for cleaning and converting the mastered relevant data of each social relation of the criminal suspects according to the characteristics of the existing business database of each department of public security;

2. The knowledge graph-based hybrid group discovery method as claimed in claim 1, wherein the data acquisition module (1) is responsible for integrating data stored in business databases of various departments of the public security organization together, wherein the data includes social basic information, case information and other data;

the data sorting module (2) sorts the entity data set, the relationship types among the entities and the entity attribute types;

the extraction from the unstructured data can be obtained by means of named entity identification, case type classification and the like, and the structured data can be directly obtained from a database;

constructing an effective knowledge graph, namely a complex network, from the triple-packet data set, and initializing parameters, wherein the parameters are determined according to actual application; a complex network can be abstracted into a graph consisting of a point set and an edge set; each edge has corresponding points, namely each edge in the 'edge set' has a pair of points in the 'point set' corresponding to the point;

the community dividing module (4) extracts a sub-network with a suspected person as a center and a path of n from the knowledge graph of the knowledge graph building module (3), divides the sub-network into communities by a louvain algorithm, namely, the compactness of one community is measured by the modularity, and if a node is added into a community, the modularity of the community is increased to the maximum extent, the node belongs to the community; if the modularity is not increased after the other communities are added, the community is left in the current community; and the path n is valued according to a six-degree separation theory and case experience.

3. A knowledge graph-based hybrid group discovery method as claimed in claim 2, wherein the community partitioning module (4) is embodied as follows:

step (4-1), assuming that N nodes exist in the network, and distributing a community to each node, namely, how many nodes exist in the initial stage;

step (4-2), for each node i in the network, considering all neighbor nodes j of the node i, calculating the incremental change of the modularity when the node i moves from the community where the node i is located to the community where the neighbor nodes j are located, and moving the node i to the community where the node j which increases the modularity to the maximum and is not negative is located; if all the calculated gains are not positive numbers, the node is still in the original community; the process is repeated for all nodes and applied in sequence until no node moves, the first process stops, i.e. no movement of any node results in an increase in modularity;

Σ in: the sum of the weights of all edges inside community C;

ki: the sum of the weights of all edges that occur at node i;

m: the sum of the weights of all edges in the network;

the relations in the different industry maps can be weighted according to actual conditions, and if the probability of two persons who are frequently contacted in the criminal group is higher, the weight of the relation between the entities, namely the number of calls, is higher; the weight can be obtained through historical data calculation or expert summary experience in related fields;

step (4-3), the community divided in the step (4-2) is used as a node to form a new network; the weight of the edge between the new nodes is the sum of the original weights between the two new nodes, the community nodes have self-connection edges, and the weight of the community nodes is 2 times of the sum of the weights of the edges connected between all the nodes in the community; then, iterating the constructed new network by using the method in the step (4-2); stopping iteration when the network is not changed any more, namely the maximum modularity appears;

Δ Q changes brought by any node moving to other communities: