Background
The complex electromagnetic environment often contains multi-dimensional and multi-modal data such as various platform targets, electronic targets, coordination relations, electromagnetic parameters and the like. The graph is a widely used data structure, and is very suitable for describing the unstructured inter-related data.
We refer to such a data set with electromagnetic information, with complex information of multiple dimensions, as an electronic target sequence. The electronic target sequence details the combined information of the platform and the device in a specific scene, and refers to the comprehensive characterization of all targets and relationships in one scene. The sequence comprises all platform targets, electronic targets and various parameters of targets in a scene, including various types of electronic equipment such as radars, radio stations and the like, and various dimensional relations among all nodes such as command relations, association relations, communication relations, dynamic changes of the sequence and the like.
After modeling and characterizing the electronic target sequence by using the graph structure, researching and designing what method is used for analyzing and predicting the electronic target sequence, effectively utilizing the information of the electronic target sequence, and giving reference and warning to a user, thus the method becomes a problem to be analyzed and discussed.
Electronic target sequence analysis focuses on the systematic operation of the electronic target sequence, and main tasks include:
the association recognition, namely judging which electronic target sequence belongs to the known electronic target sequence in the real-time scene, so as to know the tag (such as task, slave and the like) of the real-time electronic target sequence;
Analyzing and predicting, namely judging whether an undetected object or composition relation possibly exists in an electronic object sequence of the real-time scene, judging whether an abnormally-occurring object or relation exists, and predicting the possible electronic object and sequence structure.
Since knowledge maps or simple directed graphs are generally used for modeling the electronic target sequence, the algorithm used is generally the most prominent method in the analysis of the graph data structure, namely matching of the graph data.
The accurate graph matching is the most widely applied graph matching technology at present, and plays an important role in the problems of high accuracy requirements on matching results, such as social network inquiry, biological data analysis, social security analysis and the like. From the perspective of algorithm design, exact graph matching techniques are divided into index-free matching techniques and index-based matching techniques. The index-free matching technology mainly adopts a search strategy, and all nodes in the data graph are matched in sequence by analyzing the node attribute and the node neighbor structure, so that the index-free matching technology is suitable for accurate matching of small-scale data graphs. The Ullmann algorithm is the earliest known index-free matching method, and the following representative algorithms comprise VF2, graphQL, GADDI, spath and the like, and the algorithms mainly improve search strategies by adding pruning, merging and other auxiliary information on the basis of the Ullmann algorithm.
However, such graph matching techniques, due to algorithm limitations, tend to simply database based on known information and materials, followed by associative recognition of the electronic target sequence based on the database. In general, an exact graph matching algorithm is used to forcedly match a real-time electronic target sequence to known information, and then according to the known information, a node or a side relationship possibly existing in real time is presumed, and the use scene is very limited and is easily influenced by the integrity of the known information. Secondly, like the Ullmann algorithm, the precise graph matching algorithm focuses more on the structure of the graph, ignores parameters carried by the nodes, and cannot utilize multi-mode and multi-structure data in good scenes.
Meanwhile, in a practical scenario, the structure of an electronic target sequence is often not constant, and as time goes by, the platform target coordinates, the device target parameters and the communication command relationship between targets may change, that is, the electronic target sequence with time sequence characteristics. Current research is constrained by analysis and speculation of the static electronic target sequence itself, and lack of correlation for such data analysis and prediction.
As Graphic Neural Networks (GNNs) are proposed, machine learning analysis of non-euclidean spatial data has also produced many results, and graphic convolution networks, graphic attention networks, and the like have found many applications in the fields of chemistry, traffic, and the like. The GNN can define convolution operation on the graph, aggregate the multidimensional characteristic parameters of the nodes, and utilize the relation among the nodes to carry out graph classification, node classification and edge prediction on the whole graph structure, so that the GNN has wide application prospect on the data which has multidimensional characteristics and can use the non-Euclidean space represented by the graph structure and is an electronic target sequence.
Detailed Description
The flow chart of the implementation of the invention is shown in fig. 1, and the specific steps are as follows:
s1. modeling step a) real-time data modeling
The algorithm of the system operates on the graph data structure, so the system firstly acquires real-time electronic target sequence data and performs graph data structure characterization on the real-time electronic target sequence data.
In the embodiment, the node is used for representing a platform target existing in real time, the attribute of the node is the parameter of the platform target and the electronic equipment target, and the side relationship is used for representing the relationship of communication, command, association, subordinate and the like between the platform target and the electronic equipment target.
Among the nodes, the parameters of the platform node, the radar node and the radio station node fully use the parameters which can be reported by the acquisition equipment and then serve as the attributes of the target node for storage. In an embodiment, the data parameter structure acquired in the real-time scenario is required to be the same as the known parameter structure.
In the side relationship, the command relationship refers to the relationship of one platform to command the other platform between two platform targets in a real-time scene, the communication relationship refers to the relationship of mutual communication between the two platform targets, the association relationship refers to a certain cooperative coordination implicit between the platforms but not other relationship of command or communication, and the subordinate relationship refers to the mounting relationship of radar nodes and communication nodes and platform nodes. The command relationship and the association relationship can be obtained through a frequent item set algorithm, and the communication relationship and the subordinate relationship can be obtained through a detection device. In an embodiment, these four relationships are defined as real-time known data.
For example, a sequence of electronic targets characterized by a graph data structure is shown in FIG. 2, wherein all platform targets contain parameters of "altitude", "heading", "speed", "latitude and longitude", "type", "name", "model".
B) Real-time data encoding
After the graph data structure representation is carried out on the real-time data, as the characteristics of part of nodes and parameters are in a character string format, the data needs to be encoded before training and real-time identification, so that the data can be used as the input of a neural network.
Because the character string parameters of each individual are often independent of each other and are not of various types, the invention adopts a simple hash algorithm-BKDRHash algorithm to encode the character string type parameters, and the algorithm python code is as follows:
For example, the character string "AN/SPS" is encoded to obtain the value 1009179012, which can be input into the following neural network to participate in training or real-time prediction.
In addition, the three parameters of the platform type, the node type and the edge type are fixed and few, so that the configuration items are written by using fixed codes, such as 0 for command relationship, 4 for radar node and the like, and the consistency of the whole system is ensured. And the task label of the whole sequence is used as the label of the classification task of the association identification, belongs to the subsequent output content, and is encoded according to the sequence of the task number.
S2, association identification
A) Formation identification
Modeling the real-time data to obtain a real-time electronic target sequence represented by the graph data structure, and then firstly performing formation identification, namely relating the real-time electronic target sequence to formation in a known database so as to facilitate subsequent analysis and calculation by using formation information.
When the names and model characteristics of platform nodes, radar nodes and radio nodes in the real-time electronic target sequence are known, a direct node matching method is used for searching a database to obtain all related formations, and a formation number is output.
When an unknown node exists or a node is detected but the node name/model cannot be known, performing formation identification by using an accurate graph matching algorithm-Ullmann algorithm.
The Ullmann algorithm is a classical algorithm for the subgraph isomorphism problem. The following is the step of determining whether graph Q is a subgraph of graph G using Ullmann's algorithm:
1. all nodes and edges in query graph Q and target graph G are labeled. The labels of nodes and edges are a non-negative integer representing their type. For example, if a node is a "road bed" node, it may be marked 1.
2. An alternative set S (v) is selected for each node v in the query graph Q, which contains all nodes marked as type v in the target graph G. If v has no alternative nodes, the algorithm may be stopped because query graph Q has no possibility to find isomorphic subgraphs in target graph G.
3. For each edge e in the query graph Q, the intersection of the candidate nodes of the two nodes to which it is connected is calculated, resulting in a candidate set C (e).
4. For each node v in the query graph Q, a recursive search is performed for each node w in the candidate set S (v) in the order of the nodes in the query graph Q. For each selected node w, if there is a node where one edge connects v and w in the target graph G for all nodes adjacent to v in the query graph Q, then w is added to the candidate set S (v) and the search for the next node is continued. If any alternative node cannot be found, then it is necessary to trace back and select other nodes.
5. If all nodes in query graph Q can be matched to nodes in target graph G, then query graph Q is illustrated as a sub-graph of target graph G.
The temporal complexity of the Ullmann algorithm is O (n+|q|), where n is the number of nodes in the target graph G and |q| is the number of nodes in the query graph Q.
Finally, the serial number and information of the associated formation are also output.
B) Task identification
And after the real-time electronic target sequence is encoded, inputting the real-time electronic target sequence into an associated recognition network, and performing a full-graph classification task on the real-time electronic target sequence by using a graph convolution neural network GCN to obtain a task tag of the real-time electronic target sequence.
The core idea of GCN is to learn a function map f (), on each layer graph convolution, by which node vi can aggregate its own features xi with its neighbor features xj,j∈N(vi) to generate a new representation of node vi.
A typical GCN network architecture for classifying graph nodes is shown in fig. 3. It comprises an input layer, two hidden layers and an output layer, wherein each hidden layer uses a layer of graph convolution and a layer ReLu of activation layers to perform feature computation on the graph. The principle of each GCN hidden layer is as follows:
wherein, the input of the first layer network is Hl, A is the adjacent matrix added with the self-connection, D is the degree matrix, Wl is the parameter to be trained, and sigma is the corresponding activation function, namely the final form of GCN.
In embodiments, the use of GCN to supervised classification of graph data is primarily involved. The network model uses two layers of graph roll layering and one layer of linear classifier layer, and the model structure is shown in fig. 4.
After the coded electronic target sequence diagram data structure enters the association identification network, the operation flow is shown in fig. 5. The association recognition network comprises a two-layer graph neural network convolution layer and a linear classifier. The coded electronic target sequence firstly passes through two layers of graph neural network convolution layers, the graph volume layer of each layer is converged with the characteristics of neighbor nodes and the characteristics of adjacent edges to the current node to generate new graph data structure characteristics of the current node, finally, the graph data structure containing the latest parameters is input into a layer of linear classifier, the corresponding task label probability is output, and the largest task label is selected as the final output.
S3 hidden node and relationship speculation
A) Hidden node speculation
Through formation identification and task identification in the step 2, a formation number and a task label associated with a real-time electronic target sequence can be obtained, the formation number is used as a first-level index, the task label is used as a second-level index, and the associated and identified corresponding electronic target sequence of the known database is read based on the graph database, so that a breadth-first search algorithm is used for presuming a hidden target node.
Breadth-first search algorithms (also known as breadth-first searches) are among the simplest graph search algorithms, which are prototypes of many important graph algorithms. The Dijkstra single source shortest path algorithm and Prim minimum spanning tree algorithm both use ideas similar to breadth-first search.
The invention uses a breadth-first search algorithm to output real-time platform targets or electronic device targets which are possibly hidden in the real-time electronic target sequence and not found by the detection device.
B) Hidden edge relationship speculation
The coded real-time electronic target sequence is input into a training-completed edge prediction network, hidden target relations of the real-time electronic target sequence are estimated by using a graph self-encoder GAE, and undetected target relations possibly existing in the real-time electronic target sequence are output.
As shown in fig. 6, the main purpose of the GAE is to obtain the appropriate embeddings embedding to represent the nodes in the graph so that they can be used in other tasks, and the GAE can reconstruct reconstructed to embedding of the nodes in the graph through the structure of the encoder-decoder to support the next task.
In an embodiment, GAE is used as an edge prediction task, a Graph convolution neural network is used as an encoder, an inner product is used as a decoder, and the edge prediction network is shown in fig. 7, and includes an encoder including two Graph Conv layers, a random addition negative link module ADD NEGATIVE LINKS, a decoder, and a classifier.
The process flow of the edge prediction network is shown in fig. 8, after inputting the real-time electronic target sequence after input encoding, the encoder in the model creates node embedding Node Embedding to output the Graph embedding feature through Graph Conv with two convolution layers, meanwhile, a negative link ADD NEGATIVE LINKS is randomly added on the original Graph to enable the link prediction task to be changed into a positive link of the original edge and a negative link of the newly added edge to increment the electronic target sequence, the Graph embedding feature and the incremented electronic target sequence are input into the decoder, the decoder uses the node embedding to conduct link prediction (binary classification) on all edges (including the negative link), calculates a dot product Node pair multiplication of a node embedding from a pair of nodes on each edge, and then aggregates the value AGGREGATE EMBED DIM of the whole embedding dimension to output new Graph data structure features to the classifier, and the two classifier creates a value representing the edge existence probability on each edge through the Sigmoid function to output binary classification result. Through the binary classification result, command relationships, communication relationships and association relationships possibly existing between platform targets in the real-time electronic target sequence can be output.
S4, predicting time sequence structure
Inputting the coded real-time electronic target sequence into a time sequence structure prediction network for completing training, and predicting the electronic target sequence structure and parameters of the next beat according to the set time step by using a time-graph convolution network T-GCN.
The T-GCN is a time graph rolling network, and GCN and a gating recursion unit GRU are used in a framework in a unified mode, wherein the former is used for graph node parameters and topological structures, and the latter is used for learning time characteristics.
A structure of the T-GCN is shown in FIG. 9. The method comprises an input layer, a airspace layer, a time domain layer and a prediction layer, wherein in each time step, a network convolves graph data through GCN, calculates to obtain characteristic vectors of the graph, then inputs the characteristic vectors into a gating recursion unit GRU, calculates and outputs the characteristic vectors of the graph of the next time step based on memory information of the past time step, and transmits information of the current time step.
In the embodiment, the reporting period of the acquisition equipment is required to be less than or equal to 1 minute, and the parameters and the topological structure of the real-time electronic target sequence are changed according to the minute level or the hour level. Thus, the interval time between each time sequence electronic target sequence is set to be 5 minutes, namely the sampling period is set to be 5 minutes, and the sequence time stamp interval is set to be 300. Meanwhile, the network step size is set to 1, that is, the input of each time-series electronic target sequence, the predicted structure of the electronic target sequence is 5 minutes later. Finally, the number of network hidden layer units, i.e. the hidden layer length, is set to 100.
The processing flow of the time sequence structure prediction network is shown in fig. 10, after the encoded real-time electronic target sequence is input, firstly, the trained graph convolution GCN unit is used for calculating the electronic target sequence graph structure data to generate new graph embedded features embedding, then embedding is transmitted into a gating recursive unit GRU unit, the GRU unit is based on the memory features of the history information, the current graph embedded features embedding are combined, the electronic target sequence features after five minutes are calculated, and finally the predicted electronic target sequence structure and parameters are output.
System initialization
Embodiments the system requires an initialization of the system before operation, and is briefly described as not belonging to the core of the present invention, which includes:
a) The formation known database is initialized, the formation information is input into the known information database, formation identification is convenient to carry out, the formation structure is consistent with that of fig. 2, and the formation information is static information, so that the information is contained with less part of parameters than an electronic target sequence.
B) Initializing sequence known information, namely inputting the sequence known information with task labels and time information into a database, associating the sequence with formation, and conveniently inquiring the sequence as a clustering index;
c) Training the neural network, namely training the neural network related in the steps S2, S3 and S4 on the basis of the constructed data set to generate a model parameter list and storing the model parameter list.
The embodiment is based on GNN, firstly, carrying out graph data structure characterization on a real-time electronic target sequence, then carrying out classification task on the real-time electronic target sequence based on GCN, carrying out association identification to obtain task labels of the real-time electronic target sequence, carrying out node prediction based on the labels, simultaneously carrying out node embedding calculation on the real-time electronic target sequence by using GAE to infer a possibly hidden inter-target edge relation, and finally carrying out prediction on the sequence structure of the next step of the real-time electronic target sequence by using T-GCN. This has many advantages as a flow.
Firstly, in the association identification of the electronic target sequence, the graph convolution neural network is used, so that multidimensional parameters of each target node in the scene are better and more fully utilized, the association identification result is more reliable, the whole task label of the electronic target sequence is output, and the user can conveniently refer to and judge according to the information.
Secondly, in the association identification, formation identification is carried out at the same time, and the association index is combined with the task label output in the first advantage, so that the efficiency of the system in searching data and the follow-up hidden target node presumption task is higher.
Thirdly, on hidden side relationship presumption, a graph embedded network is used, the characteristics of the topology and node parameters of the real-time electronic target sequence are effectively utilized, so that the relationship presumption is not limited by known data, the universality is improved, and more reference values are given to users.
Fourth, in the structure prediction of the electronic target sequence, the time domain graph convolutional neural network is used, so that possible future structural changes of the real-time electronic target sequence can be predicted according to past data, instead of simply and statically analyzing the current target and data, and certain reference and warning effects can be given to users.