CN119996079B

Info

Publication number: CN119996079B
Application number: CN202510450817.7A
Authority: CN
Inventors: 郑轶; 张冠阳; 郭超; 周维强; 全梅菡; 蒋志强; 姚杰
Original assignee: Bozhi Safety Technology Co ltd
Current assignee: Bozhi Safety Technology Co ltd
Priority date: 2025-04-11
Filing date: 2025-04-11
Publication date: 2025-06-13
Anticipated expiration: 2045-04-11
Also published as: CN119996079A

Abstract

The invention provides a method and system for constructing large-scale network node scenes based on a network target range, and relates to the technical field of network security. First, target network node templates are screened by matching vulnerability chain propagation coefficients against an attack and defense level node distribution diagram. Then, the network topology structure is optimized using a graph-theoretic clustering algorithm, and connection relations between nodes are constructed based on service affinity. Finally, a node running environment is constructed according to a resource utilization index, network services are deployed, a vulnerability program is implanted, and anomaly detection and repair are performed, producing a large-scale network node scene that satisfies the constraints of the resource optimization scheme. The invention can automatically construct a large-scale network target range with a complex topology structure and a realistic vulnerability environment, improving the efficiency and realism of network security attack and defense exercises.

Description

Large-scale network node scene construction method and system based on network target range

Technical Field

The invention relates to a network security technology, in particular to a large-scale network node scene construction method and system based on a network target range.

Background

The network target range is an important platform for network security research, and the realism and scale of its network node scenes directly affect the training effect of attack and defense exercises. Network node scene construction in existing network target ranges relies mainly on manual configuration, which struggles to meet the demands of large-scale scene construction and gives insufficient consideration to vulnerability correlation and service correlation among nodes.

As network target ranges continue to grow, the number of network nodes increases exponentially, and traditional manual configuration suffers from heavy workload and low efficiency. Meanwhile, without quantitative analysis of vulnerability propagation characteristics and service affinity among nodes, the realism of a constructed scene is difficult to guarantee, which degrades the effect of network attack and defense exercises.

Existing network node scene construction methods focus mainly on implementing the functions of individual nodes. Because they give insufficient consideration to the relevance among nodes and to resource utilization efficiency, they struggle to achieve adaptive optimization of the network topology structure and reasonable allocation of resources. A method is therefore needed that can automatically construct large-scale network node scenes while guaranteeing scene realism and resource utilization efficiency.

Disclosure of Invention

The embodiments of the invention provide a method and system for constructing large-scale network node scenes based on a network target range, which can solve the above problems in the prior art.

In a first aspect of an embodiment of the present invention,

A method for constructing a large-scale network node scene based on a network target range is provided, comprising the following steps:

Calculating the vulnerability association degree among nodes in the network node scene construction requirement, dividing the network nodes into a plurality of attack and defense levels, generating an attack and defense level node distribution diagram, extracting candidate network node templates from a network node template library, calculating vulnerability chain propagation coefficients, matching the vulnerability chain propagation coefficients against the attack and defense level node distribution diagram, and screening out a target network node template;

Calculating a node distribution density threshold using a graph-theoretic clustering algorithm, hierarchically optimizing the network topology type in the network node scene construction requirement to obtain an initial network topology structure, generating network node instances from the target network node template by attribute mapping, constructing connection relations among the network node instances based on inter-node communication protocol information, extracting communication features among the network node instances, calculating service affinity, and optimizing the initial network topology structure based on the service affinity to generate a target network topology structure;

Generating a resource utilization index according to the target network topology structure, constructing a node running environment using a resource optimization scheme derived from the predicted resource utilization index, deploying network services, stress testing the network services to generate a fault feature sequence, selecting a matching vulnerability program from a preset vulnerability program library based on the fault feature sequence, implanting the vulnerability program into the node running environment, monitoring state information in the node running environment, performing anomaly detection and repair according to the fault feature sequence, and generating a large-scale network node scene that satisfies the constraints of the resource optimization scheme.

In an alternative embodiment of the present invention,

Calculating the vulnerability association degree among nodes in the network node scene construction requirement, dividing the network node into a plurality of attack and defense levels, generating an attack and defense level node distribution diagram, extracting candidate network node templates from a network node template library, calculating vulnerability chain propagation coefficients, matching the vulnerability chain propagation coefficients with the attack and defense level node distribution diagram, and screening out target network node templates, wherein the steps of:

Constructing a node association graph according to the vulnerability association degree among nodes, partitioning the node association graph with a community discovery algorithm, generating an attack and defense level node distribution diagram, introducing exploit chains, calculating the triggering probability of adjacent vulnerabilities and the attack chain length attenuation coefficient, matching against the vulnerability intensity value at the corresponding position in the attack and defense level node distribution diagram, and selecting the candidate network node template with the minimum matching degree value as the target network node template, which specifically comprises:

Obtaining vulnerability attribute information of network nodes to generate node vulnerability feature vectors, calculating distance attenuation function values based on inter-node communication hops, performing tensor operation on the node vulnerability feature vectors and the distance attenuation function values to obtain a node vulnerability combination matrix, and calculating vulnerability association degrees among nodes in network node scene construction requirements based on the node vulnerability combination matrix;

constructing a node association graph by taking the vulnerability association degree as an edge weight, performing iterative division on the node association graph by adopting a community discovery algorithm, dividing a network node into a plurality of attack and defense levels, and calculating a level vulnerability intensity value for each attack and defense level;

Generating an attack-defense level node distribution diagram based on the level position relation of the attack-defense level and the level vulnerability intensity value, wherein nodes represent the attack-defense level, and connecting edges between the nodes represent the vulnerability intensity relation between the levels;

Extracting candidate network node templates from a preset network node template library, establishing connection relations among vulnerabilities according to the system state requirements, system state changes and attack entry types in the vulnerability features of the candidate network node templates, using the continuity of system state changes as the screening condition for exploit chains, and combining the cascade triggering probability of the attack chain with the length attenuation coefficient to calculate the vulnerability chain propagation coefficient;

Matching the vulnerability chain propagation coefficient against the vulnerability intensity value at the corresponding position in the attack and defense level node distribution diagram to obtain a propagation coefficient deviation value, calculating the change in level vulnerability intensity that results from deploying the candidate network node template, taking the weighted sum of the propagation coefficient deviation value and the level vulnerability intensity change as the matching degree value, and selecting the candidate network node template with the minimum matching degree value as the target network node template.
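The association and selection steps of this embodiment can be illustrated with a minimal sketch. The text does not give closed-form expressions, so the following are assumptions made only for illustration: the vulnerability association degree is taken as the cosine similarity of two node vulnerability feature vectors damped by `exp(-alpha * hops)`, the deviation and the intensity change enter the weighted sum as absolute values, and the weights `w_dev` and `w_change`, the template names and all numeric values are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def association_matrix(features, hops, alpha=0.5):
    """Assumed form: feature similarity damped by exp(-alpha * hop count)."""
    n = len(features)
    return [
        [
            0.0 if i == j
            else cosine(features[i], features[j]) * math.exp(-alpha * hops[i][j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def matching_value(coeff, level_intensity, intensity_change,
                   w_dev=0.7, w_change=0.3):
    """Weighted sum of the propagation coefficient deviation and the
    level vulnerability intensity change (weights are hypothetical)."""
    return w_dev * abs(coeff - level_intensity) + w_change * abs(intensity_change)


def select_template(candidates, level_intensity):
    """Return the name of the candidate with the smallest matching value."""
    best = min(
        candidates,
        key=lambda c: matching_value(c["coeff"], level_intensity, c["change"]),
    )
    return best["name"]


# Hypothetical inputs: three nodes and two candidate templates.
features = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
hops = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
assoc = association_matrix(features, hops)

candidates = [
    {"name": "web-server", "coeff": 0.42, "change": 0.05},
    {"name": "db-server", "coeff": 0.10, "change": 0.01},
]
best = select_template(candidates, level_intensity=0.40)
```

In this sketch the resulting `assoc` matrix would serve as the edge weights of the node association graph, and the template whose propagation coefficient lies closest to the level intensity is chosen.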

In an alternative embodiment of the present invention,

Establishing connection relations among vulnerabilities according to the system state requirements, system state changes and attack entry types in the vulnerability features, using the continuity of system state changes as the screening condition for exploit chains, and combining the cascade triggering probability of the attack chain with the length attenuation coefficient to calculate the vulnerability chain propagation coefficient comprises the following steps:

Extracting vulnerability features from the candidate network node templates and constructing vulnerability feature vectors, wherein the vulnerability features comprise the system state required to trigger a vulnerability, the system state change caused by exploiting the vulnerability, and the attack entry type used to exploit the vulnerability;

Performing feature matching on the system state requirement, the system state change and the attack entry type in the vulnerability feature vectors, and establishing a connection relation between two vulnerabilities when the system state change of the first vulnerability satisfies the system state requirement of the second vulnerability and the attack entry type of the second vulnerability is among the attack entries introduced by the first vulnerability;

Constructing a vulnerability dependency graph based on the connection relations, identifying exploit chain paths in the vulnerability dependency graph by depth-first search, and screening the exploit chain paths according to the continuity of system state changes in the vulnerability feature vectors to generate exploit chains;

Calculating the connection distance between adjacent nodes based on the number of nodes in the exploit chain, calculating the path length of the exploit chain from the connection distances, and substituting the path length into an exponential decay function to obtain the attack chain length attenuation coefficient of the exploit chain;

Multiplying together the triggering probabilities of adjacent vulnerabilities in the exploit chain, in the order in which the vulnerabilities are connected, to obtain the cascade triggering probability of the exploit chain, and multiplying the cascade triggering probability by the attack chain length attenuation coefficient to obtain the vulnerability chain propagation coefficient;

Constructing an attack path graph based on the vulnerability feature vectors and the exploit chains, mapping each exploit chain to an attack path, counting the number of successful runs of each attack path over multiple simulations, using the success counts to optimize the calculation parameters of the triggering probability and the attack chain length attenuation coefficient, and recalculating the vulnerability chain propagation coefficient with the optimized parameters to obtain the final vulnerability chain propagation coefficient.
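The chain construction and the propagation coefficient described in this embodiment can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the record fields (`requires`, `grants`, `entry`, `opens`, `p`), the example vulnerabilities, the unit connection distance per hop and the decay rate `decay` are all hypothetical. The simulation-based parameter tuning of the last step is not shown.

```python
import math

# Hypothetical vulnerability records: required system state, resulting
# system state, attack entry type needed, entry types opened by the
# exploit, and triggering probability.
VULNS = {
    "V1": {"requires": "none", "grants": "user", "entry": "remote",
           "opens": {"local"}, "p": 0.8},
    "V2": {"requires": "user", "grants": "root", "entry": "local",
           "opens": {"local"}, "p": 0.5},
    "V3": {"requires": "root", "grants": "root", "entry": "remote",
           "opens": set(), "p": 0.9},
}


def can_follow(first, second):
    """Edge rule from the text: the state change of the first vulnerability
    meets the state requirement of the second, and the second's entry type
    is among the entries introduced by the first."""
    return (first["grants"] == second["requires"]
            and second["entry"] in first["opens"])


def build_graph(vulns):
    """Vulnerability dependency graph as an adjacency list."""
    return {a: [b for b in vulns if b != a and can_follow(vulns[a], vulns[b])]
            for a in vulns}


def chains(graph, start, path=None):
    """Depth-first enumeration of simple paths with at least two vulnerabilities."""
    path = (path or []) + [start]
    if len(path) > 1:
        yield path
    for nxt in graph[start]:
        if nxt not in path:
            yield from chains(graph, nxt, path)


def propagation_coefficient(chain, vulns, decay=0.3):
    """Cascade triggering probability times an exponential length decay.
    Path length is assumed to be the number of hops (unit distance each)."""
    cascade = math.prod(vulns[v]["p"] for v in chain)
    length = len(chain) - 1
    return cascade * math.exp(-decay * length)


graph = build_graph(VULNS)
all_chains = [c for v in graph for c in chains(graph, v)]
coeff = propagation_coefficient(all_chains[0], VULNS)
```

Because every edge already requires that the state change of one vulnerability satisfies the requirement of the next, each enumerated path has continuous state changes in this sketch; a fuller version might apply a separate continuity filter.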

In an alternative embodiment of the present invention,

Calculating a node distribution density threshold value by adopting a graph theory clustering algorithm, carrying out hierarchical optimization on network topology types in network node scene construction requirements to obtain an initial network topology structure, generating a network node instance by attribute mapping on a target network node template, constructing a connection relationship between the network node instances based on inter-node communication protocol information, extracting communication characteristics between the network node instances, calculating service affinity, optimizing the initial network topology structure based on the service affinity, and generating the target network topology structure comprises the following steps:

Constructing a distance matrix according to Euclidean distances among nodes, calculating local density and relative density of the nodes by adopting a graph theory clustering algorithm based on the distance matrix, setting a density threshold value according to network node scene construction requirements, carrying out normalization processing on the local density and the relative density of the nodes to obtain a node comprehensive density value, carrying out hierarchical division on the nodes according to network topology layering requirements, determining topological connection relation of the nodes in each layer according to the comprehensive density value, and obtaining an initial network topology structure meeting hierarchical constraint through iterative optimization;

Extracting hardware configuration information and software configuration information from a target network node template, constructing a static attribute feature set and a dynamic attribute feature set, mapping the static attribute feature set into hardware parameters of a node instance based on a feature mapping rule, mapping the dynamic attribute feature set into software parameters of the node instance, and generating a network node instance;

Designing a protocol feature tensor, encoding the communication protocol information among node instances into a multidimensional feature tensor, extracting latent factors of the protocol features by tensor decomposition, constructing a protocol similarity measurement model based on the latent factors, calculating a protocol compatibility matrix among the node instances, and determining the connection relations among the node instances according to a compatibility threshold;

Extracting service flow time series features among the network node instances using a sliding time window, analyzing the service flow time series features with a long short-term memory (LSTM) network to obtain a service flow prediction result, constructing a service dependency graph through service call chain analysis, extracting node association features from the service dependency graph with a graph attention network, computing node resource utilization features from resource monitoring data, and performing multidimensional feature fusion on the service flow prediction result, the node association features and the resource utilization features to obtain the service affinity;

Taking the service affinity as the state space and changes to the connection relations between network node instances as the action space, constructing a reward function that accounts for the node processing capacity constraint, the link bandwidth constraint and the end-to-end delay constraint, iteratively optimizing the connection policy with a deep reinforcement learning method based on value function estimation and a policy gradient algorithm, outputting the optimal topology connection scheme when the reward function converges, and optimizing the initial network topology structure with the optimal topology connection scheme to generate the target network topology structure.
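The density-based first step of this embodiment can be illustrated with a short sketch. The text does not name a specific clustering algorithm, so a density-peaks style calculation is swapped in here as an assumption: local density is the number of neighbours within a cutoff, the second quantity is interpreted as the distance to the nearest node of higher density, and the comprehensive density value is the mean of the two after min-max normalization. The coordinates, the cutoff and the 0.5 threshold are hypothetical.

```python
import math


def distance_matrix(points):
    """Pairwise Euclidean distances between node coordinates."""
    return [[math.dist(p, q) for q in points] for p in points]


def local_density(dist, cutoff):
    """Number of other nodes closer than the cutoff distance."""
    return [sum(1 for j, d in enumerate(row) if j != i and d < cutoff)
            for i, row in enumerate(dist)]


def relative_distance(dist, rho):
    """Distance to the nearest node of higher local density; nodes with
    no denser neighbour get their largest distance instead."""
    out = []
    for i, row in enumerate(dist):
        higher = [row[j] for j in range(len(row)) if rho[j] > rho[i]]
        out.append(min(higher) if higher else max(row))
    return out


def normalize(values):
    """Min-max normalization to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def comprehensive_density(points, cutoff):
    """Assumed combination: mean of normalized density and relative distance."""
    dist = distance_matrix(points)
    rho = local_density(dist, cutoff)
    delta = relative_distance(dist, rho)
    return [0.5 * (r + d) for r, d in zip(normalize(rho), normalize(delta))]


# Hypothetical node coordinates: three close nodes and one outlier.
points = [(0, 0), (0, 1), (1, 0), (10, 10)]
scores = comprehensive_density(points, cutoff=1.5)
core_nodes = [i for i, s in enumerate(scores) if s >= 0.5]
```

Nodes whose score clears the threshold would be candidates for the upper layer of the hierarchy, with the remaining nodes attached beneath them.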

In an alternative embodiment of the present invention,

Iteratively optimizing the connection policy with a deep reinforcement learning method based on value function estimation and a policy gradient algorithm, and outputting the optimal topology connection scheme when the reward function converges, comprises the following steps:

Constructing a dual-network architecture consisting of a value function network and a policy network, using the temporal difference error to guide prioritized sampling from the experience pool, optimizing parameters under a policy trust region constraint, adjusting the improvement direction of the policy network with an advantage function, introducing the state visitation distribution to guide training of the value function network, and, when the optimization converges, outputting the action sequence with the highest probability as the optimal topology connection scheme, which specifically comprises:

Acquiring the service affinity matrix and the connection state matrix of the initial network topology structure, and constructing a dual-network architecture consisting of a value function network and a policy network, wherein the value function network extracts features from the service affinity matrix and the connection state matrix to obtain a state feature vector and outputs a state value;

Calculating a temporal difference error from the state value and a reward value computed from the node processing capacity constraint, the link bandwidth constraint and the end-to-end delay constraint, storing the state feature vector, the node connection change action probability, the reward value and the state feature vector of the next state in the experience pool as state transition information, assigning a sampling priority to the state transition information in the experience pool based on the temporal difference error, drawing training samples according to the sampling priority, and updating the parameters of the value function network with the training samples;

Constructing a policy objective function based on the node connection change action probability and the state transition information, calculating the policy gradient with the state value as the baseline function, and updating the parameters of the policy network with the policy gradient and the state transition information under the policy trust region constraint;

Using the temporal difference error as the advantage function, adjusting the improvement direction of the policy network based on the advantage function, generating a state visitation distribution from the node connection change action probability output by the policy network, and using the state visitation distribution to train the value function network;

Monitoring the trends of the reward value, the node connection change action probability output by the policy network and the state value output by the value function network, and, when the fluctuation range of these trends is smaller than a convergence threshold, extracting the action sequence with the highest probability among the node connection change action probabilities output by the policy network as the optimal topology connection scheme.
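Three pieces of this embodiment lend themselves to a compact illustration: the temporal difference error, priority-weighted sampling from the experience pool, and the convergence check. The sketch below assumes the common one-step form of the error and a proportional priority of the form `(|error| + eps) ** alpha`; the discount `gamma`, the exponents, the window size and the sample transitions are hypothetical, and the networks themselves are not modelled.

```python
import random


def td_error(reward, value, next_value, gamma=0.99):
    """One-step temporal difference error: r + gamma * V(s') - V(s)."""
    return reward + gamma * next_value - value


def sampling_probabilities(td_errors, alpha=0.6, eps=1e-3):
    """Proportional prioritization: larger |error| means sampled more often."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_batch(pool, td_errors, batch_size, rng):
    """Draw transitions from the experience pool according to their priorities."""
    probs = sampling_probabilities(td_errors)
    return rng.choices(pool, weights=probs, k=batch_size)


def has_converged(history, window=5, threshold=0.01):
    """True when the last `window` values fluctuate by less than the threshold."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return max(recent) - min(recent) < threshold


# Hypothetical experience pool of three transitions and their errors.
pool = ["t0", "t1", "t2"]
errors = [0.5, -2.0, 0.1]
probs = sampling_probabilities(errors)
batch = sample_batch(pool, errors, batch_size=2, rng=random.Random(0))
```

The same `has_converged` check could be applied separately to the reward, the action probabilities and the state values, stopping only when all three are stable.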

In an alternative embodiment of the present invention,

Generating a resource utilization index according to a target network topological structure, constructing a node running environment by a resource optimization scheme obtained by predicting the resource utilization index, deploying network services, performing stress test on the network services to generate a fault feature sequence, selecting a matched vulnerability program from a preset vulnerability program library based on the fault feature sequence, implanting the vulnerability program into the node running environment, monitoring state information in the node running environment, performing anomaly detection and repair according to the fault feature sequence, and generating a large-scale network node scene meeting the constraint of the resource optimization scheme, wherein the method comprises the following steps of:

Constructing a multidimensional resource state space, mapping the CPU state vector, memory state vector, network state vector and disk state vector of each network node in the target network topology structure into the multidimensional resource state space, adversarially training on the state vectors with an adversarial variational autoencoder to obtain a resource state distribution, calculating a resource state entropy from the resource state distribution, constructing the loss function of a resource prediction model, training a recurrent neural network with the loss function to obtain a resource utilization prediction model, inputting the historical resource state distribution into the resource utilization prediction model to generate future resource utilization indexes, and setting resource allocation parameters based on the resource utilization indexes to generate a resource optimization scheme;

Calculating a system stability matrix based on the resource state entropy, performing singular value decomposition on the stability matrix to obtain a characteristic mode sequence, identifying system state mutation points from the characteristic mode sequence, taking each system state mutation point as a fault injection moment, constructing the node running environment at the fault injection moment according to the resource optimization scheme, and deploying network services;

Stress testing the network services to acquire system state data, mapping the system state data onto a feature manifold with a complex variable kernel function, calculating geodesics on the feature manifold to obtain a fault propagation path, constructing a fault feature sequence from curvature features calculated along the fault propagation path, performing manifold matching between the fault feature sequence and the vulnerability features in a preset vulnerability program library, and implanting into the node running environment the vulnerability program that has the highest matching degree and satisfies the constraints of the resource optimization scheme;

Analyzing the feature manifold with persistent homology to obtain the fault persistence, constructing a repair threshold based on the coupling relation between the resource state entropy and the fault persistence, performing anomaly detection and repair on the node running environment according to the repair threshold, and, when the resource utilization index of the repaired node running environment satisfies the constraints of the resource optimization scheme, outputting the current node running environment as the large-scale network node scene.
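The resource state entropy used throughout this embodiment is not given in closed form. A minimal sketch, assuming it is the Shannon entropy of a discretized resource state distribution, is shown below; the equal-width binning, the bin count and the sample utilization values are hypothetical, and the autoencoder that produces the distribution is not modelled.

```python
import math
from collections import Counter


def resource_state_entropy(samples, bins=4):
    """Shannon entropy (in bits) of utilization samples in [0, 1],
    bucketed into equal-width bins (an assumed discretization)."""
    counts = Counter(min(int(s * bins), bins - 1) for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical CPU utilization traces for two nodes.
steady = [0.50, 0.52, 0.55, 0.51]    # every sample falls in the same bin
volatile = [0.10, 0.30, 0.60, 0.90]  # one sample in each of the four bins

steady_entropy = resource_state_entropy(steady)
volatile_entropy = resource_state_entropy(volatile)
```

Under this assumption a node with a stable load has entropy near zero, while a node whose load spreads evenly across all bins reaches the maximum of `log2(bins)` bits.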

In an alternative embodiment of the present invention,

Analyzing the feature manifold with persistent homology to obtain the fault persistence, and constructing a repair threshold based on the coupling relation between the resource state entropy and the fault persistence, comprises the following steps:

Constructing a nested sequence of simplicial complexes on the feature manifold, extracting the persistence information of fault features using maps between homology groups, obtaining a corrected fault persistence through two-stage feature weight optimization, establishing a coupling matrix of the fault persistence and the resource state entropy, and performing eigendecomposition on the coupling matrix to obtain the repair threshold, which specifically comprises:

Constructing a distance metric space on the feature manifold, calculating the distance between each pair of points in the distance metric space, constructing simplicial complexes from the distance values, generating a nested sequence of simplicial complexes by gradually increasing the distance value, calculating the maps between the homology groups of adjacent simplicial complexes in the nested sequence to obtain a topological feature sequence, extracting the birth time and death time of each feature in the topological feature sequence, calculating each feature's duration from its birth time and death time, and computing a weighted sum of the feature durations to obtain the fault persistence;

Assigning weighting coefficients to the fault persistence according to the structure of the simplicial complexes, taking the product of each weighting coefficient and the fault persistence as a dimension feature value, performing eigendecomposition on the dimension feature values to obtain feature weights, and summing the products of the feature weights and the dimension feature values to obtain the corrected fault persistence;

Calculating the association coefficient between the corrected fault persistence and the resource state entropy, filling the association coefficient into the diagonal matrix formed by the corrected fault persistence and the resource state entropy to obtain the coupling matrix, performing eigendecomposition on the coupling matrix to obtain an eigenvector, multiplying the components of the eigenvector by the corrected fault persistence and the resource state entropy respectively and summing the products to obtain a threshold judgment value, and optimizing the threshold judgment value against labeled repair decision data to obtain the repair threshold.
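The birth and death bookkeeping behind the fault persistence can be sketched for the simplest case. The example below is restricted, as an assumption, to zero-dimensional features (connected components) of a distance-threshold filtration, computed with a union-find structure; higher-dimensional features, the two-stage weight correction and the coupling matrix are not covered. The sample points and the uniform default weights are hypothetical.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Birth/death pairs of connected components as the distance threshold
    grows. Each component is born at 0 and dies at the edge length that
    merges it into another; the last surviving component is omitted."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    pairs = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            pairs.append((0.0, length))
    return pairs


def fault_persistence(pairs, weights=None):
    """Weighted sum of feature durations (death time minus birth time)."""
    durations = [death - birth for birth, death in pairs]
    if not durations:
        return 0.0
    weights = weights or [1.0 / len(durations)] * len(durations)
    return sum(w * d for w, d in zip(weights, durations))


# Hypothetical points on the feature manifold.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
pairs = h0_persistence(points)
persistence = fault_persistence(pairs)
```

Long-lived components correspond to groups of states that stay separated over a wide range of distance thresholds, which is the sense in which a fault feature is treated as persistent here.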

In a second aspect of an embodiment of the present invention,

Provided is a large-scale network node scene construction system based on a network target range, comprising:

The first unit is used for calculating the vulnerability association degree among nodes in the network node scene construction requirement, dividing the network nodes into a plurality of attack and defense levels, generating an attack and defense level node distribution diagram, extracting candidate network node templates from a network node template library, calculating vulnerability chain propagation coefficients, matching the vulnerability chain propagation coefficients against the attack and defense level node distribution diagram, and screening out a target network node template;

The second unit is used for calculating a node distribution density threshold using a graph-theoretic clustering algorithm, hierarchically optimizing the network topology type in the network node scene construction requirement to obtain an initial network topology structure, generating network node instances from the target network node template by attribute mapping, constructing connection relations among the network node instances based on inter-node communication protocol information, extracting communication features among the network node instances, calculating service affinity, and optimizing the initial network topology structure based on the service affinity to generate a target network topology structure;

The third unit is used for generating a resource utilization index according to the target network topology structure, constructing a node running environment using a resource optimization scheme derived from the predicted resource utilization index, deploying network services, stress testing the network services to generate a fault feature sequence, selecting a matching vulnerability program from a preset vulnerability program library based on the fault feature sequence, implanting the vulnerability program into the node running environment, monitoring state information in the node running environment, performing anomaly detection and repair according to the fault feature sequence, and generating a large-scale network node scene that satisfies the constraints of the resource optimization scheme.

In a third aspect of an embodiment of the present invention,

There is provided an electronic device including:

A processor;

A memory for storing processor-executable instructions;

wherein the processor is configured to invoke the instructions stored in the memory to perform the method described previously.

In a fourth aspect of an embodiment of the present invention,

There is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method as described above.

In this embodiment, constructing the attack and defense level node distribution diagram accurately characterizes the vulnerability association relations among network nodes and improves the ability of the network target range to simulate complex network attack and defense environments. Calculating the vulnerability chain propagation coefficient and optimizing the network topology structure with the graph-theoretic clustering algorithm makes vulnerability propagation paths more plausible, so that network attack and defense exercises are more realistic and targeted. Optimizing the initial network topology structure through service affinity analysis brings the communication relations among network nodes closer to actual service scenarios and improves the fidelity of the simulated network. Meanwhile, predicting resource utilization and optimizing the node running environment accordingly enables dynamic construction of large-scale network nodes while keeping the computing resources of the network target range efficiently utilized, improving the scalability and applicability of the network target range. Generating a fault feature sequence through stress testing, and matching and implanting vulnerability programs based on that sequence, effectively simulates the impact of network attacks on service operation and improves the accuracy of vulnerability reproduction. In addition, the anomaly detection and repair mechanism allows the method to respond quickly to abnormal states in the network environment and improves the stability and security of network services, yielding a more realistic and intelligent large-scale network target range environment.

Drawings

FIG. 1 is a flow chart of a method for constructing a large-scale network node scene based on a network target range according to an embodiment of the invention;

FIG. 2 is a graph comparing the prediction accuracy of exploit chain success rates in an embodiment of the invention;

FIG. 3 is a graph comparing the link load distribution before and after optimization according to an embodiment of the invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The technical scheme of the invention is described in detail below by specific examples. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.

Fig. 1 is a flow chart of a method for constructing a scene of a large-scale network node based on a network target range according to an embodiment of the present invention, as shown in fig. 1, the method includes:

S101, calculating the vulnerability association degree among nodes in the network node scene construction requirement, dividing the network node into a plurality of attack and defense levels, generating an attack and defense level node distribution map, extracting candidate network node templates from a network node template library, calculating vulnerability chain propagation coefficients, matching the vulnerability chain propagation coefficients with the attack and defense level node distribution map, and screening out target network node templates;

S102, calculating a node distribution density threshold value by adopting a graph theory clustering algorithm, carrying out hierarchical optimization on network topology types in network node scene construction requirements to obtain an initial network topology structure, generating network node examples by attribute mapping on a target network node template, constructing connection relations among the network node examples based on inter-node communication protocol information, extracting communication features among the network node examples, calculating service affinity, optimizing the initial network topology structure based on the service affinity, and generating a target network topology structure;

S103, generating a resource utilization index according to a target network topological structure, constructing a node running environment through a resource optimization scheme obtained by predicting the resource utilization index, deploying network services, performing pressure test on the network services to generate a fault feature sequence, selecting a matched vulnerability program from a preset vulnerability program library based on the fault feature sequence, implanting the vulnerability program into the node running environment, monitoring state information in the node running environment, performing anomaly detection and repair according to the fault feature sequence, and generating a large-scale network node scene meeting the constraint of the resource optimization scheme.

In an optional embodiment, calculating a vulnerability association degree between nodes in a network node scene construction requirement, dividing a network node into a plurality of attack and defense levels, generating an attack and defense level node distribution diagram, extracting candidate network node templates from a network node template library, calculating a vulnerability chain propagation coefficient, matching the vulnerability chain propagation coefficient with the attack and defense level node distribution diagram, and screening out a target network node template comprises:

Illustratively, vulnerability attribute information, such as vulnerability type, CVE number, CVSS score, affected software version, etc., of all network nodes in the scene to be built is obtained. The vulnerability attribute information is converted into a numeric feature vector, for example, the vulnerability type is represented by using one-hot coding, the vulnerability severity is represented by using the numerical value of the CVSS score, and the like. Assuming that two nodes are provided, the vulnerability feature vector of the node A is [1,0,7.5] which indicates that a type 1 vulnerability exists, no type 2 vulnerability exists, the CVSS score is 7.5, the vulnerability feature vector of the node B is [0,1,9.0] which indicates that no type 1 vulnerability exists, the type 2 vulnerability exists, and the CVSS score is 9.0.

The distance attenuation function value between the nodes is then calculated. This value measures the influence of the communication distance between nodes on the vulnerability association degree. For example, a decay function based on the hop count may be used: the greater the number of hops, the smaller the function value, that is, the stronger the attenuation. Assuming that the number of communication hops between node A and node B is 2 and the decay function is 1/hop count, the decay value is 0.5.

The vulnerability feature vectors of the nodes are combined with the distance attenuation function value to obtain a node vulnerability combination matrix. The matrix reflects the strength of vulnerability associations between nodes.

For example, the vulnerability combination matrix element values of node A and node B are [1 × 0.5, 0 × 0.5, 7.5 × 0.5, 0 × 0.5, 1 × 0.5, 9.0 × 0.5] = [0.5, 0, 3.75, 0, 0.5, 4.5].

And calculating the vulnerability association degree between the nodes based on the node vulnerability combination matrix. For example, the combined matrix element values may be weighted and summed to obtain a vulnerability association of 8.75 for node a and node B.
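
The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and because the text does not state the weights used in the weighted sum, the sketch defaults to unit weights. With unit weights the example elements sum to 9.25, so the 8.75 quoted above presumably reflects a non-uniform weighting that is not specified.

```python
def hop_decay(hops: int) -> float:
    """Distance attenuation as 1 / hop count (the example decay function)."""
    if hops < 1:
        raise ValueError("hop count must be at least 1")
    return 1.0 / hops


def combination_elements(vec_a, vec_b, decay):
    """Scale the concatenated vulnerability feature vectors by the decay value."""
    return [x * decay for x in list(vec_a) + list(vec_b)]


def association_degree(elements, weights=None):
    """Weighted sum of the combination elements (unit weights are an assumption)."""
    if weights is None:
        weights = [1.0] * len(elements)
    return sum(w * x for w, x in zip(weights, elements))


node_a = [1, 0, 7.5]   # type 1 present, type 2 absent, CVSS 7.5
node_b = [0, 1, 9.0]   # type 1 absent, type 2 present, CVSS 9.0

decay = hop_decay(2)                                    # 0.5
elements = combination_elements(node_a, node_b, decay)  # [0.5, 0.0, 3.75, 0.0, 0.5, 4.5]
degree = association_degree(elements)                   # 9.25 with unit weights
```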

And constructing a node association graph by taking the calculated vulnerability association degree as an edge weight. Nodes in the graph represent network nodes, edges represent vulnerability association relations among the nodes, and edge weights represent association strength.

A community discovery algorithm is used to iteratively partition the node association graph and divide the network nodes into a plurality of attack and defense levels. For example, community discovery may be performed using the Louvain algorithm. Assume that the network nodes are divided into three levels: a core layer, a service layer and a boundary layer.

A level vulnerability intensity value is calculated for each attack and defense level. For example, the vulnerability CVSS scores of all nodes within a level may be averaged to give the level vulnerability intensity value. Assume that the vulnerability intensity value of the core layer is 8.0, that of the service layer is 7.0, and that of the boundary layer is 6.0.

And generating an attack and defense level node distribution diagram based on the level position relation of the attack and defense level and the level vulnerability intensity value. In the graph, nodes represent attack and defense levels, and connecting lines among the nodes represent vulnerability strength relations among the levels. For example, the core layer is connected to the service layer, and the value on the connection is 1.0, which is the difference between 8.0 and 7.0, indicating the potential threat level of the core layer to the service layer.
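
As a rough sketch of the two steps above, the following Python averages per-node CVSS scores into level intensity values and then labels each connection between adjacent levels with the intensity difference. The per-node scores and the function names are hypothetical, chosen only so that the averages reproduce the 8.0, 7.0 and 6.0 of the example.

```python
from statistics import mean


def level_intensities(levels):
    """Average the CVSS scores of the nodes in each attack and defense level."""
    return {name: mean(scores) for name, scores in levels.items()}


def distribution_edges(order, intensities):
    """Connect adjacent levels; the edge value is the intensity difference."""
    return [(upper, lower, intensities[upper] - intensities[lower])
            for upper, lower in zip(order, order[1:])]


# Hypothetical per-node CVSS scores for each level.
levels = {
    "core": [8.5, 7.5],
    "service": [7.0, 7.0],
    "boundary": [6.5, 5.5],
}

intensities = level_intensities(levels)
edges = distribution_edges(["core", "service", "boundary"], intensities)
```

Here `edges` holds one entry per pair of adjacent levels, for example the core-to-service connection with value 1.0.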

And extracting candidate network node templates from a preset network node template library. For example, the template library includes different types of node templates such as Web servers, database servers, mail servers, and the like.

Based on the vulnerability characteristics in the candidate network node templates, connection relations among vulnerabilities are established through the system state requirements, state changes and attack entry types in the vulnerability characteristics. For example, if the result of exploiting one vulnerability is the triggering condition for another vulnerability, the two vulnerabilities may be connected to form an attack chain. The continuity of system state change is introduced as a screening condition for the exploit chain; for example, only vulnerabilities with continuous state changes can form a valid attack chain. The cascade triggering probability of the attack chain is combined with the length attenuation coefficient to obtain the vulnerability chain propagation coefficient. For example, for an attack chain of three vulnerabilities whose triggering probabilities are 0.8, 0.9 and 0.7 and whose length attenuation coefficient is 0.5, the vulnerability chain propagation coefficient is 0.8 × 0.9 × 0.7 × 0.5 = 0.252.

The vulnerability chain propagation coefficient is matched against the vulnerability intensity value at the corresponding position in the attack and defense level node distribution map to obtain a propagation coefficient deviation value. For example, the vulnerability chain propagation coefficient of the Web server template is compared with the boundary layer vulnerability intensity value of 6.0 to obtain a deviation value. The change in level vulnerability intensity after deployment is then calculated for each candidate network node template. For example, after a Web server template is deployed, the boundary layer vulnerability intensity value may change from 6.0 to 7.0, a change of 1.0. The weighted sum of the propagation coefficient deviation value and the level vulnerability intensity change is taken as the matching degree value, and the candidate network node template with the smallest matching degree value is selected as the target network node template.
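
The template selection rule can be illustrated as follows. The text does not say how a propagation coefficient (between 0 and 1) is compared with an intensity value on the CVSS scale, nor which weights are used, so this sketch makes two loudly labelled assumptions: the intensity is divided by 10 before the comparison, and both weights are 0.5. The candidate values are hypothetical.

```python
def matching_degree(coefficient, level_intensity, intensity_after,
                    w_deviation=0.5, w_change=0.5, scale=10.0):
    """Weighted sum of the coefficient deviation and the intensity change.

    Assumptions (not from the source): intensity is normalised by `scale`
    before comparison, and the two weights default to 0.5 each.
    """
    deviation = abs(coefficient - level_intensity / scale)
    change = abs(intensity_after - level_intensity)
    return w_deviation * deviation + w_change * change


def select_template(candidates, level_intensity):
    """Return the candidate name with the smallest matching degree value."""
    return min(
        candidates,
        key=lambda name: matching_degree(candidates[name]["coefficient"],
                                         level_intensity,
                                         candidates[name]["intensity_after"]),
    )


# Hypothetical candidates evaluated against a boundary layer intensity of 6.0.
candidates = {
    "web_server": {"coefficient": 0.576, "intensity_after": 7.0},
    "database_server": {"coefficient": 0.252, "intensity_after": 6.2},
}
best = select_template(candidates, level_intensity=6.0)
```

Under these assumed numbers the database server template wins because it perturbs the level intensity far less, even though its coefficient deviates more.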

In the embodiment, the attack and defense level node distribution map is constructed by accurately calculating the vulnerability association degree among the network nodes, so that the attack and defense structure of the network target range is more real and hierarchical. The node association graph is divided by using a community discovery algorithm, so that the rationality of an attack and defense level is ensured, the accuracy of attack path analysis is improved, and the propagation characteristics of the vulnerability in the network can be effectively described by calculating the triggering probability and attenuation coefficient of the vulnerability exploitation chain. By combining the distance attenuation function of the node vulnerability feature vector and the communication hop count, the vulnerability association degree is accurately measured, and the tensor calculation method is adopted to improve the calculation efficiency, so that the network scene construction is more efficient and intelligent. The continuity of system state change is introduced as a screening condition of the vulnerability exploitation chain, so that the feasibility of an attack path can be more accurately evaluated, and the selected network node template can truly reflect the dynamic process of vulnerability propagation. By matching the vulnerability chain propagation coefficient with the vulnerability intensity value of the attack and defense level, the optimal adaptation of the screened target network node template on the vulnerability propagation influence is ensured, so that the simulation precision of the network target range is improved. 
By adopting a weighted matching mechanism of propagation coefficient deviation values and hierarchical vulnerability strength variation, the deployment strategy of network nodes can be optimized in a self-adaptive mode, the rationality of vulnerability simulation is improved, and finally generated network target fields can reflect the propagation characteristics of attack chains in a real network environment more accurately.

In an optional implementation manner, establishing a connection relation among vulnerabilities through system state requirements, state changes and attack entry types in vulnerability characteristics, introducing continuity of the system state changes as screening conditions of vulnerability exploitation chains, combining cascade triggering probability of the attack chains with length attenuation coefficients, and calculating to obtain vulnerability chain propagation coefficients comprises the following steps:

Illustratively, vulnerability characteristics are extracted from candidate network node templates, and vulnerability characteristic vectors are constructed. The vulnerability characteristics comprise vulnerability-triggered system state requirements, system state changes after vulnerability exploitation and attack entry types of the vulnerability exploitation. For example, a feature vector of a vulnerability may be expressed as a system state requirement of "Web service on", a system state change of "acquire WebShell rights", and an attack entry type of "remote code execution".

And performing feature matching according to the vulnerability feature vectors, and establishing a connection relationship between vulnerabilities. When the system state change of the first vulnerability meets the system state requirement of the second vulnerability and the attack entry type of the second vulnerability belongs to the attack entry introduced by the first vulnerability, a connection relationship is established between the two vulnerabilities. For example, the system state of the vulnerability a changes to "acquire WebShell permission", the system state requirement of the vulnerability B is "WebShell permission", the attack entry type of the vulnerability B is "local command execution", and the vulnerability a introduces the attack entry of "local command execution", so that a connection relationship between the vulnerability a and the vulnerability B can be established.

And constructing a vulnerability dependency graph based on the connection relation among the vulnerabilities. The vulnerability dependency graph is a directed graph, nodes represent vulnerabilities, and edges represent connection relations among vulnerabilities. And identifying the exploit chain path in the vulnerability dependency graph by adopting a depth-first search method. For example, in the vulnerability dependency graph, if there is a path from vulnerability A to vulnerability B to vulnerability C, then this path is an exploit chain path.

The exploit chain paths are screened according to the continuity of system state change to generate exploit chains. Continuity of system state change means that the system state changes of adjacent vulnerabilities in an exploit chain must follow a logical order. For example, if the system state change of vulnerability A is "acquire WebShell permission" and the system state change of vulnerability B is "escalate to administrator privileges", the system state changes of the two vulnerabilities are continuous. Conversely, if the system state change of vulnerability A is "acquire WebShell permission" and the system state change of vulnerability B is "denial of service", the system state changes of the two vulnerabilities are discontinuous, and the exploit chain path is filtered out.
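
The dependency graph, depth-first path search and continuity screening described above might be sketched as below. The `Vuln` record, its `opens` field and the `CONTINUOUS` lookup table are hypothetical constructs for illustration; the source does not specify how "logical order" is encoded, so a simple table of allowed state-change pairs is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vuln:
    name: str
    requires: str      # system state requirement
    grants: str        # system state change after exploitation
    entry: str         # attack entry type used by this vulnerability
    opens: frozenset   # attack entry types it introduces (hypothetical field)


# Hypothetical table of state-change pairs treated as logically continuous.
CONTINUOUS = {("webshell", "admin")}


def build_graph(vulns):
    """Directed edge a -> b when a's state change meets b's requirement
    and b's entry type is one that a introduces."""
    graph = {v.name: [] for v in vulns}
    for a in vulns:
        for b in vulns:
            if a is not b and a.grants == b.requires and b.entry in a.opens:
                graph[a.name].append(b.name)
    return graph


def chain_paths(graph):
    """Enumerate every simple path with at least two vulnerabilities via DFS."""
    paths = []

    def dfs(path):
        if len(path) > 1:
            paths.append(path)
        for nxt in graph[path[-1]]:
            if nxt not in path:
                dfs(path + [nxt])

    for start in graph:
        dfs([start])
    return paths


def screen(paths, by_name):
    """Keep only paths whose adjacent state changes are continuous."""
    return [p for p in paths
            if all((by_name[u].grants, by_name[v].grants) in CONTINUOUS
                   for u, v in zip(p, p[1:]))]


vulns = [
    Vuln("A", "web_service_on", "webshell", "remote_code_execution",
         frozenset({"local_command_execution"})),
    Vuln("B", "webshell", "admin", "local_command_execution", frozenset()),
    Vuln("C", "webshell", "dos", "local_command_execution", frozenset()),
]
by_name = {v.name: v for v in vulns}
graph = build_graph(vulns)
all_paths = chain_paths(graph)
chains = screen(all_paths, by_name)
```

In this toy data both A→B and A→C are paths in the dependency graph, but only A→B survives screening because a WebShell followed by denial of service is not listed as continuous.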

And calculating the attack chain length attenuation coefficient of the exploit chain. And calculating the connection distance between adjacent nodes based on the node number of the exploit chain. The connection distance can be evaluated according to factors such as difficulty of the vulnerability exploitation, required time and the like. For example, if the exploitation difficulty between vulnerability a and vulnerability B is high, their connection distance is large. And calculating the path length of the exploit chain according to the connection distance. The path length is the sum of all connection distances. Substituting the path length into an exponential decay function to obtain an attack chain length decay coefficient. The longer the path length, the smaller the attenuation coefficient. For example, a path length of 3 may have an attenuation coefficient of 0.8 and a path length of 5 may have an attenuation coefficient of 0.5.

The cascade triggering probability of the exploit chain is then calculated. The triggering probabilities of the vulnerabilities in the exploit chain are multiplied together in their connection order to obtain the cascade triggering probability of the exploit chain. For example, if the triggering probability of vulnerability A is 0.8 and that of vulnerability B is 0.9, the cascade triggering probability of the exploit chain composed of vulnerability A and vulnerability B is 0.8 × 0.9 = 0.72.

And multiplying the cascade triggering probability by the attack chain length attenuation coefficient to obtain the vulnerability chain propagation coefficient. For example, if the cascade trigger probability is 0.72 and the attack chain length attenuation coefficient is 0.8, the vulnerability chain propagation coefficient is 0.72×0.8=0.576.
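
A minimal sketch of the coefficient calculation follows. The exponential form `exp(-rate * length)` and the `rate` parameter are assumptions, since the text names an exponential decay function without giving its parameters; the rate here is simply fitted so that a path length of 3 yields 0.8. Note that this particular fit gives roughly 0.69 at length 5 rather than the 0.5 quoted above, so the quoted pairs should be read as illustrative rather than as points on one curve.

```python
import math


def path_length(connection_distances):
    """Path length is the sum of the connection distances along the chain."""
    return sum(connection_distances)


def length_decay(length, rate):
    """Exponential attack chain length attenuation (assumed form)."""
    return math.exp(-rate * length)


def cascade_probability(trigger_probabilities):
    """Product of the triggering probabilities in chain order."""
    return math.prod(trigger_probabilities)


def propagation_coefficient(trigger_probabilities, decay):
    """Cascade triggering probability multiplied by the length attenuation."""
    return cascade_probability(trigger_probabilities) * decay


# Hypothetical rate fitted so that a path length of 3 attenuates to 0.8.
rate = -math.log(0.8) / 3
decay = length_decay(path_length([1, 2]), rate)

two_step = propagation_coefficient([0.8, 0.9], 0.8)         # 0.72 * 0.8
three_step = propagation_coefficient([0.8, 0.9, 0.7], 0.5)  # 0.504 * 0.5
```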

An attack path graph is constructed based on the vulnerability feature vectors and the exploit chains, and each exploit chain is mapped to an attack path. The number of successful attacks along each attack path is counted over multiple simulations and used to optimize the calculation parameters of the triggering probability and the attack chain length attenuation coefficient. The vulnerability chain propagation coefficient is then recalculated with the optimized parameters to obtain the final vulnerability chain propagation coefficient.

FIG. 2 compares the prediction accuracy of the exploit chain success rate in the embodiment of the invention, showing the difference between the success rates predicted by different methods and the actually observed values. The data show that the average deviation between the predicted and observed values of the present technical scheme is only 4.2%, whereas the average deviations of the CVSS attack chain evaluation method, the multi-step attack graph analysis method and the traditional Bayesian network method are 12.8%, 9.6% and 15.3%, respectively. In the medium-complexity scenario in particular, the predicted value of the present technical scheme is 68.5% against an observed value of 65.2%, a deviation of only 3.3%, while the predicted values of the other methods are 55.6%, 76.4% and 48.9%, respectively, with markedly larger deviations. The technical scheme thus significantly improves the prediction accuracy of the exploit chain success rate through repeated simulation statistics and parameter optimization. Notably, the CVSS and Bayesian network methods typically underestimate the exploit chain success rate, while the multi-step attack graph analysis method tends to overestimate it, and these deviations can lead to misallocation of security resources. By introducing the continuity of system state change as a screening condition and optimizing parameters with multiple simulation statistics, the technical scheme effectively overcomes these deviations and provides a more reliable decision basis for security defense.

In this embodiment, by matching the system status requirement, status change and attack entry type, the configuration of the exploit chain is ensured to conform to the logic of the actual attack path, so that the propagation path of the vulnerability is more accurate. And identifying the exploit chain in the vulnerability dependency graph by using a depth-first search method, introducing system state change continuity as a screening condition, effectively eliminating unreasonable attack paths, and improving the reliability of vulnerability propagation analysis. By calculating the length attenuation coefficient of the attack chain and combining the continuous multiplication operation of the vulnerability triggering probability, the vulnerability chain propagation capability is accurately estimated, and misjudgment caused by ignoring the attack path attenuation effect in the traditional method is avoided. In addition, the successful times of attack paths are counted through multiple simulation, and the calculation parameters of the trigger probability and the attenuation coefficient are optimized, so that the vulnerability propagation model is closer to the real network attack environment, and the accuracy of vulnerability propagation risk assessment is improved. The method can effectively screen the exploit chain with the most attack value, optimize vulnerability propagation path analysis, promote scientificity of network security assessment and defense strategy formulation, provide more real attack chain simulation for attack and defense exercise in a network target range, and improve pertinence and practicability of security research and defense system.

In an alternative embodiment, a graph theory clustering algorithm is adopted to calculate a node distribution density threshold value and perform hierarchical optimization on network topology types in network node scene construction requirements to obtain an initial network topology structure, a target network node template is mapped through attributes to generate network node examples, connection relations among the network node examples are constructed based on inter-node communication protocol information, communication features among the network node examples are extracted, service affinities are calculated, the initial network topology structure is optimized based on the service affinities, and the generation of the target network topology structure comprises the following steps:

Illustratively, node distribution density calculation and initial topology construction are performed first. And constructing a distance matrix by calculating Euclidean distance between nodes in the network, wherein the distance value is calculated by physical position coordinates of the nodes. Taking a network of 100 nodes as an example, each node has three-dimensional coordinate values, and the distances between them are calculated by traversing all pairs of nodes. For each node, counting the number of nodes in the neighborhood range as a local density value, and simultaneously calculating the minimum distance between the nodes and other high-density nodes as a relative density value. For example, setting the neighborhood radius to be 10 units of distance, and the local density of a node to be 15 indicates that there are 15 adjacent nodes in the range.
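
The two per-node quantities described above (a neighbor count within a radius, and the minimum distance to any denser node) can be computed as in the sketch below. The function names are hypothetical, and the handling of the densest nodes, which have no denser neighbor, is an assumption: they are assigned their largest distance to any node, a common convention in density-peak style clustering.

```python
import math


def distance_matrix(points):
    """Pairwise Euclidean distances between node coordinates."""
    return [[math.dist(p, q) for q in points] for p in points]


def local_density(dist, radius):
    """Number of other nodes within the neighborhood radius of each node."""
    return [sum(1 for j, d in enumerate(row) if j != i and d <= radius)
            for i, row in enumerate(dist)]


def min_distance_to_denser(dist, density):
    """Minimum distance from each node to any node of higher local density.

    Assumption: nodes with no denser node get their maximum distance instead.
    """
    result = []
    for i, row in enumerate(dist):
        denser = [row[j] for j in range(len(row)) if density[j] > density[i]]
        result.append(min(denser) if denser else max(row))
    return result


# Four hypothetical nodes with three-dimensional coordinates.
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (20, 0, 0)]
dist = distance_matrix(points)
density = local_density(dist, radius=10)
delta = min_distance_to_denser(dist, density)
```

Nodes that combine high local density with a large distance to denser nodes are natural candidates for cluster centers, which is one way a density threshold could then be applied.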

Node template information is then extracted and instances are generated. The node template comprises hardware parameters such as the number of CPU cores, memory capacity and storage space, and software parameters such as the operating system type and running environment configuration. For example, a compute node template may contain a hardware configuration of a 16-core CPU, 64 GB of memory and 512 GB of storage space, and a software configuration of a Linux operating system and a Docker container environment. These parameters are instantiated via feature mapping rules to generate specific node instances.

The characteristics of the communication protocols among the nodes are then analyzed. Protocol information such as TCP/IP and HTTP is encoded into feature vectors whose dimensions include the protocol type, port number and message format. The main features are extracted through feature decomposition, and the protocol compatibility among nodes is calculated. For example, if both nodes support the HTTP protocol and their port configurations match, their protocol compatibility is high.

The topology is then optimized based on traffic flow characteristics. A sliding time window of 5 minutes is used to count the traffic flow among the nodes, and the future flow trend is predicted in combination with historical data. Meanwhile, the service calling relationships are analyzed and a dependency graph is constructed. For example, if there are on average 1000 requests per second between node A and node B and there is a direct service invocation dependency between them, their service affinity is high.

Finally, the connection relations are optimized through deep reinforcement learning. The state space includes the service affinity matrix of the nodes, and the action space includes operations that add or delete connections between nodes. The reward function takes into account constraints such as processing capacity, bandwidth and time delay. An optimal connection strategy is obtained through iterative training, and the final network topology structure is generated.

In this embodiment, graph theory clustering and multi-level optimization realize the automatic construction of the network topology structure and improve the efficiency and accuracy of network planning. The layering method based on node density analysis makes the network structure more reasonable and avoids unbalanced node distribution. Feature mapping and protocol compatibility analysis ensure that connections between node instances are feasible and improve the interoperability and reliability of the network. The dynamic attribute mapping mechanism gives the network better scalability and adaptability. The optimization method based on service affinity and deep reinforcement learning realizes dynamic adjustment of the network topology structure and improves network performance and resource utilization. Multi-dimensional feature fusion and the consideration of constraint conditions ensure the feasibility and practicability of the optimization result.

In an alternative embodiment, iteratively optimizing the connection strategy based on the value function estimation and the strategy gradient algorithm using the deep reinforcement learning method, outputting an optimal topology connection scheme when the reward function converges comprises:

Illustratively, a dual-network architecture is first constructed. The value function network adopts a three-layer fully connected neural network structure; the input layer receives the service affinity matrix and the connection state matrix, and features are extracted through a convolution operation to obtain the state feature vector. Taking a network of 5 nodes as an example, the service affinity matrix is a 5 × 5 symmetric matrix, and in the connection state matrix 1 indicates that a connection exists between two nodes and 0 indicates that none exists. The hidden layer contains 128 neurons and uses the ReLU activation function, and the output layer outputs a scalar state value. The policy network also adopts a three-layer structure: it takes the shared state feature vector as input, extracts the association features among nodes through a graph attention layer, and its output layer uses a Softmax function to obtain the probabilities of node connection change actions.

Experience replay training is then performed. The temporal-difference (TD) error is calculated from the current state value and a reward value that takes into account constraints such as node CPU utilization below 80%, link bandwidth utilization below 60% and end-to-end delay below 100 ms. The transition information, consisting of the state feature vector, the action probability, the reward value and the next state feature vector, is stored in an experience pool with a capacity of 10000. Sampling weights are assigned to the samples in the experience pool according to the TD error: the larger the TD error, the higher the weight. In each training step, 256 samples are drawn from the experience pool according to these weights to update the value function network parameters.
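
The replay step can be pictured with the following sketch. Several details are assumptions because the text does not fix them: the reward is modelled as a base value minus a unit penalty per violated constraint, the TD error uses a discount factor `gamma`, old transitions are evicted first-in first-out, and sampling is with replacement. The class and function names are hypothetical.

```python
import random


def reward(cpu_util, bandwidth_util, delay_ms, base=1.0, penalty=1.0):
    """Base reward minus a penalty per violated constraint (assumed shape).

    Constraints from the text: CPU < 80%, bandwidth < 60%, delay < 100 ms.
    """
    violations = sum([cpu_util >= 0.80, bandwidth_util >= 0.60, delay_ms >= 100])
    return base - penalty * violations


def td_error(r, value, next_value, gamma=0.99):
    """One-step temporal-difference error; gamma is an assumed discount."""
    return r + gamma * next_value - value


class ReplayPool:
    """Fixed-capacity pool sampled with weights proportional to |TD error|."""

    def __init__(self, capacity=10000):
        self.capacity = capacity
        self.items = []
        self.weights = []

    def add(self, transition, error):
        if len(self.items) >= self.capacity:
            # Evict the oldest transition (FIFO eviction is an assumption).
            self.items.pop(0)
            self.weights.pop(0)
        self.items.append(transition)
        self.weights.append(abs(error) + 1e-6)  # keep every weight positive

    def sample(self, k=256, rng=random):
        # Weighted sampling with replacement.
        return rng.choices(self.items, weights=self.weights, k=k)


pool = ReplayPool(capacity=3)
for step in range(5):
    pool.add(("state", step), error=float(step))
batch = pool.sample(k=256)
```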

The policy network is then optimized. A policy objective function is constructed based on the action probabilities and the state transition information, and the policy improvement direction is calculated with the state value as the baseline function. Under the constraint that the Kullback-Leibler (KL) divergence between the new and old policies is smaller than 0.01, the Adam optimizer is used to update the policy network parameters at a learning rate of 0.001. Meanwhile, the TD error is used as the advantage function to adjust the improvement direction of the policy network, and the action probabilities output by the policy network are used to generate the state visitation distribution that guides the training of the value function network.
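
The KL constraint can be checked on a pair of action probability distributions as sketched below. The direction of the divergence (old policy relative to new) and the function names are assumptions; the text only states that the divergence between the new and old policies must stay below 0.01. The sketch also assumes every probability in the new policy is strictly positive, as a Softmax output would be.

```python
import math


def kl_divergence(p_old, p_new):
    """KL(p_old || p_new) for two discrete action distributions."""
    return sum(p * math.log(p / q) for p, q in zip(p_old, p_new) if p > 0)


def within_trust_region(p_old, p_new, limit=0.01):
    """True when a policy update respects the KL limit from the text."""
    return kl_divergence(p_old, p_new) < limit


old_policy = [0.5, 0.5]
small_step = within_trust_region(old_policy, [0.52, 0.48])
large_step = within_trust_region(old_policy, [0.9, 0.1])
```

A training loop could use such a check to reject or shrink an update whose divergence exceeds the limit.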

Finally, convergence is judged. Every 1000 training steps, the rates of change of the average reward value, the action probability and the state value over the last 100 steps are calculated. When all three rates of change are less than 1%, convergence is declared, and the action sequence with the highest output probability of the policy network is extracted as the optimal scheme.
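
One possible reading of the convergence test is sketched below. The text does not define "rate of change", so this sketch assumes it is the relative difference between the mean over the last 100 steps and the mean over the 100 steps before that; the function names are hypothetical.

```python
from statistics import mean


def change_rate(history, window=100):
    """Relative change between the last window and the one before it
    (an assumed definition of the rate of change)."""
    if len(history) < 2 * window:
        return float("inf")  # not enough data to judge
    recent = mean(history[-window:])
    previous = mean(history[-2 * window:-window])
    if previous == 0:
        return 0.0 if recent == 0 else float("inf")
    return abs(recent - previous) / abs(previous)


def has_converged(rewards, action_probs, state_values, threshold=0.01):
    """Converged when all three tracked quantities change by less than 1%."""
    return all(change_rate(h) < threshold
               for h in (rewards, action_probs, state_values))


flat = [1.0] * 200
rising = [float(i) for i in range(1, 201)]
stable = has_converged(flat, flat, flat)
unstable = has_converged(rising, flat, flat)
```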

FIG. 3 compares the link load distribution before and after network topology optimization according to an embodiment of the present invention. As the distribution curves show, the link load distribution before optimization (dotted line) is U-shaped, high at both ends and low in the middle, which indicates that a large number of low-load and high-load links coexist in the network and that resource utilization is unbalanced. Specifically, before optimization about 15% of the links have a load rate below 20% (idle state), while about 12% of the links have a load rate above 80% (congested state), and 5% of the links even exceed 95%. After optimization by the technical scheme (solid line), the link load distribution is a more concentrated bell shape, clustered in the intermediate load interval (40%-70%). After optimization, the proportion of low-load links (< 20%) falls to 8%, the proportion of high-load links (> 80%) falls to 7%, and no link has a load rate above 90%. In the ideal load interval of 40%-60% in particular, the number of links increases from 17 before optimization to 24 after optimization, accounting for 48% of the total number of links. This marked improvement in load balancing shows that the technical scheme can effectively identify congestion points and idle resources in the network and, by intelligently adjusting the topology connection structure, achieve a more reasonable distribution of network resources, thereby improving overall network performance and user experience.

In this embodiment, the dual-network architecture of a value function network and a policy network solves the complex network topology optimization problem end to end and avoids the limitation of traditional heuristic algorithms, which require optimization rules to be designed manually. The training mode that combines prioritized experience replay based on the TD error with a policy trust-region constraint improves sample utilization efficiency, ensures stable policy improvement and accelerates algorithm convergence. The bidirectional guidance mechanism of the advantage function and the state visitation distribution realizes the collaborative optimization of value function estimation and policy improvement, improves algorithm performance and ensures the reliability of the output topology scheme.

In an alternative embodiment, generating a resource utilization index according to a target network topology structure, constructing a node operation environment by a resource optimization scheme obtained by predicting the resource utilization index, deploying network services, performing a pressure test on the network services to generate a fault feature sequence, selecting a matched vulnerability program from a preset vulnerability program library based on the fault feature sequence, implanting the vulnerability program into the node operation environment, monitoring state information in the node operation environment, performing anomaly detection and repair according to the fault feature sequence, and generating a large-scale network node scene meeting the constraint of the resource optimization scheme, wherein the method comprises the following steps:

A method for constructing a large-scale network node scene can simulate a real network environment and is used for testing and verifying the stability and reliability of network services. The method is characterized in that resource allocation is optimized according to a target network topological structure and a resource utilization rate prediction result, and faults are simulated and repaired on nodes.

Illustratively, first, target network topology information is collected, including the number of nodes, connection relationships, and hardware configuration of each node, such as the number of CPU cores, memory size, network bandwidth, disk capacity, and the like. Then, state information such as CPU utilization rate, memory utilization rate, network traffic and disk IO of the network node is collected, and the information is converted into multidimensional vector representations such as CPU state vector, memory state vector, network state vector and disk state vector.

Next, an adversarial variational autoencoder is trained on these state vectors. The adversarial variational autoencoder comprises two parts: an encoder that compresses the high-dimensional state vector into a low-dimensional representation, and a decoder that attempts to reconstruct the original state vector from the low-dimensional representation. Through adversarial training, the encoder learns to extract the key features of the state vector and the probability distribution of the resource states.

Based on the learned resource state distribution, a resource state entropy is calculated, which reflects the degree of disorder of the resource state. The higher the entropy value, the more unstable the resource state. A loss function for the resource prediction model is constructed using the resource state entropy, and a recurrent neural network, such as a long short-term memory (LSTM) network, is trained with this loss function to predict resource utilization over a future period. Inputting the historical resource state distribution into the trained recurrent neural network yields the future resource utilization indexes of each node, such as CPU utilization and memory utilization, for that period.
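A minimal sketch of an entropy calculation of this kind is given below. It assumes the resource state distribution is approximated by a histogram of utilization samples rather than by the learned autoencoder distribution described above; the function name `resource_state_entropy` and the bin count are hypothetical.

```python
import math
from collections import Counter


def resource_state_entropy(samples, bins=10):
    """Shannon entropy (in bits) of utilization samples in [0, 1].

    Samples are discretized into `bins` equal-width intervals; the empirical
    bin frequencies stand in for the learned resource state distribution.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    counts = Counter(min(int(s * bins), bins - 1) for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# A node pinned at one utilization level has zero entropy;
# a node spread evenly over four bins has two bits of entropy.
steady = resource_state_entropy([0.5, 0.5, 0.5, 0.5])
spread = resource_state_entropy([0.05, 0.15, 0.25, 0.35])
```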

A resource optimization scheme is then formulated according to the predicted resource utilization index, for example by determining the CPU allocation limit and memory allocation limit of each node. Next, a system stability matrix is calculated based on the resource state entropy; it reflects the stability of the system in different resource states. Singular value decomposition is performed on the stability matrix to obtain a characteristic modal sequence, which reflects how the system state changes over time. By analyzing the characteristic modal sequence, system state mutation points are identified, that is, the moments at which system stability changes significantly. These mutation points are used as fault injection moments.
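The final step, picking mutation points out of a modal sequence, might look like the sketch below. It assumes the characteristic modal sequence has already been obtained (the singular value decomposition itself is not shown) and uses a simple step-size rule, mean plus `k` standard deviations, as a stand-in criterion; the rule, the parameter `k`, and the function name are assumptions, not the patent's stated method.

```python
import statistics


def mutation_points(modal_sequence, k=2.0):
    """Return indices t where |x[t] - x[t-1]| exceeds mean + k * stdev of all step sizes."""
    steps = [abs(b - a) for a, b in zip(modal_sequence, modal_sequence[1:])]
    if len(steps) < 2:
        return []
    cutoff = statistics.mean(steps) + k * statistics.pstdev(steps)
    return [i + 1 for i, s in enumerate(steps) if s > cutoff]


# Hypothetical modal sequence with a single abrupt jump at index 5.
injection_times = mutation_points([1.0] * 5 + [5.0] * 5)
```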

At the determined fault injection moment, the node operation environment is constructed according to the resource optimization scheme, and network services are deployed on the nodes. The deployed network services are then stress tested, for example by simulating a large number of users accessing the service simultaneously, and system state data such as CPU usage, memory usage, and network latency are collected. The collected system state data are mapped onto a high-dimensional feature manifold using a complex variable kernel function, and geodesics on the manifold are calculated to obtain the fault propagation path. Curvature features are calculated along the fault propagation path, and a fault feature sequence is constructed that describes how the features change during fault propagation.

The generated fault feature sequence is manifold-matched against the vulnerability features in a preset vulnerability program library; the vulnerability program that has the highest matching degree and also satisfies the constraint of the resource optimization scheme is selected and implanted into the node operation environment. The preset vulnerability program library contains various vulnerability programs, such as buffer overflow vulnerabilities and SQL injection vulnerabilities, each with a corresponding feature description.
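The selection logic, highest match subject to the resource constraints, can be sketched as follows. Plain cosine similarity is swapped in here for the manifold matching described above, purely to keep the example short; the library entries, their `cpu` and `mem` cost fields, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def select_vulnerability(fault_features, library, cpu_limit, mem_limit):
    """Pick the best-matching entry whose resource cost fits the limits, or None."""
    feasible = [e for e in library if e["cpu"] <= cpu_limit and e["mem"] <= mem_limit]
    if not feasible:
        return None
    return max(feasible, key=lambda e: cosine(fault_features, e["features"]))


# Hypothetical library: the closest match (buffer_overflow) exceeds the CPU limit,
# so the next-best feasible entry is chosen instead.
library = [
    {"name": "buffer_overflow", "features": [1.0, 0.0], "cpu": 0.9, "mem": 0.2},
    {"name": "sql_injection", "features": [0.8, 0.6], "cpu": 0.3, "mem": 0.3},
    {"name": "other", "features": [0.0, 1.0], "cpu": 0.1, "mem": 0.1},
]
chosen = select_vulnerability([1.0, 0.1], library, cpu_limit=0.8, mem_limit=0.7)
```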

The feature manifold is analyzed using persistent homology to obtain the fault persistence, an index that reflects how long a fault lasts. A repair threshold is constructed based on the coupling relation between the resource state entropy and the fault persistence. When the system state entropy and the fault persistence exceed the repair threshold, the system is considered to need repair. Anomaly detection and repair, such as restarting services or applying patches, are performed on the node operation environment according to the repair threshold. When the resource utilization index of the repaired node operation environment meets the constraint of the resource optimization scheme, the current node operation environment is output as the large-scale network node scene.

For example, in a network comprising 10 nodes, a resource state vector is constructed by collecting the CPU usage and memory usage of the nodes. After the adversarial variational autoencoder is trained, the resource state distribution is obtained. The resource state entropy is calculated from this distribution, and a recurrent neural network is trained to predict resource utilization for the next 24 hours. According to the prediction result, the CPU allocation limit of each node is set to 80% and the memory allocation limit to 70%. During simulated fault injection, a buffer overflow vulnerability program is selected and implanted into node 3. After anomaly detection and repair, the CPU utilization and memory utilization of node 3 return to normal levels and satisfy the constraint of the resource optimization scheme.

In this embodiment, the CPU, memory, network, and disk resource states of the network nodes are accurately characterized by constructing a multidimensional resource state space, and training with the adversarial variational autoencoder makes the resource state distribution more realistic, which improves the accuracy of resource utilization prediction. Future resource utilization indexes are generated from the prediction model and used to optimize the resource allocation scheme, improving the overall resource management efficiency of the system. Calculating the resource state entropy and combining it with singular value decomposition allows system state mutation points to be identified and the fault injection moments to be determined accurately, which makes the fault simulation more realistic. Mapping system state data onto a feature manifold with a complex variable kernel function and calculating geodesics to analyze the fault propagation path makes fault feature extraction more accurate. With fault feature manifold matching, the vulnerability program that best fits the resource optimization constraint can be selected from the preset vulnerability program library, which improves the pertinence and effectiveness of vulnerability testing. Fault persistence is calculated using persistent homology analysis, and a repair threshold is constructed in combination with the resource state entropy, which enables intelligent anomaly detection and repair of the node operation environment and ensures that the repaired environment meets the constraint of the resource optimization scheme.
Finally, the scheme can construct a large-scale network node scene meeting the resource optimization requirement, improve the authenticity, stability and efficiency of network simulation and attack and defense testing, and provide powerful support for network security research and infrastructure optimization.

In an alternative embodiment, analyzing the feature manifold using persistent homology to obtain the fault persistence, and constructing the repair threshold based on the coupling relation between the resource state entropy and the fault persistence, comprises:

Illustratively, first, a nested sequence of simplicial complexes is constructed on the feature manifold. The feature manifold is a topological representation of the system state: a geometric structure in a high-dimensional space in which each point represents the resource state of the system at a certain moment, including CPU load, memory occupation, network bandwidth usage, disk IO state, and the like. The manifold captures continuous changes in the system state and is suitable for analyzing fault propagation modes. A simplicial complex is a topological structure made up of multiple simplices (points, line segments, triangles, and their higher-dimensional generalizations) and is used to represent the topological connections between data points. Constructing a simplicial complex on the manifold first requires defining a distance metric space for computing the similarity between data points. The distance can be calculated using the Euclidean distance, the Mahalanobis distance, cosine similarity, or similar methods, so that the difference between resource state vectors can be measured.

After the distance metric space is constructed, the distance between each pair of points is calculated. The distance is used to judge the similarity between two data points (that is, resource states); when the distance is smaller than a set threshold, a topological connection is established between the two data points to form a simplex. As the distance threshold gradually increases, the structure of the simplicial complex changes, forming a nested sequence, that is, the topological structure at different scales. The nested sequence can be constructed using the Vietoris-Rips complex (VR complex) or similar methods, which connect data points progressively at different scales and generate the hierarchical change of the topological structure.
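The nesting can be seen directly on the 1-skeleton (the edges) of a Vietoris-Rips complex, as in the sketch below. It assumes Euclidean distance and uses the usual convention that an edge is included when the distance is at most the threshold; higher-dimensional simplices are omitted, and the function names and sample points are hypothetical.

```python
import math
from itertools import combinations


def rips_edges(points, threshold):
    """Edges (i, j), i < j, between points whose Euclidean distance is at most `threshold`."""
    return {
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(points), 2)
        if math.dist(p, q) <= threshold
    }


def nested_sequence(points, thresholds):
    """Edge sets for increasing thresholds; each set contains the previous one."""
    return [rips_edges(points, t) for t in sorted(thresholds)]


# Three hypothetical resource states on a line at positions 0, 1, and 3.
sequence = nested_sequence([(0, 0), (1, 0), (3, 0)], [1, 2, 3])
```

In practice a dedicated topological data analysis library would normally be used to build the full complex; the sketch only shows why raising the threshold yields a nested sequence.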

After the nested sequence is constructed, the maps between homology groups are used to analyze how topological features change across scales. Homology groups are topological invariants that characterize the connected components, loops, cavities, and similar structures in the topology of the data. For example, at small scales different resource states may be independent of each other, but as the scale increases certain states gradually merge, and this change is reflected by the homology group maps. The homology group maps are used to analyze the changes in topological features between adjacent simplicial complexes and to generate a topological feature sequence, which contains the appearance (birth) time and disappearance (death) time of each topological feature.

From the appearance time and the disappearance time, the feature duration can be calculated, that is, the range of scales over which a given topological feature exists. Features of longer duration typically represent a stable topological structure, such as long-standing resource state patterns of the system, while features of shorter duration may represent transient anomalies, such as sudden CPU load surges or network congestion. The durations of the topological features are therefore calculated and combined in a weighted sum to obtain the preliminary fault persistence, which measures the stability and influence range of system faults.
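For connected components (dimension 0), the appearance and disappearance times can be computed with a union-find pass over the edges sorted by length, as sketched below. Loops and cavities are not covered, the weights are left to the caller, and the names `h0_persistence` and `fault_persistence` are hypothetical.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Death scales of connected components in a Vietoris-Rips filtration.

    Every component is born at scale 0. When an edge of length d merges two
    components, one of them dies at d. The last surviving component never
    dies and is omitted from the result.
    """
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d)
    return deaths


def fault_persistence(deaths, weights=None):
    """Weighted sum of feature durations (death minus birth, with birth = 0)."""
    weights = weights or [1.0] * len(deaths)
    return sum(w * d for w, d in zip(weights, deaths))


deaths = h0_persistence([(0, 0), (1, 0), (3, 0)])
preliminary = fault_persistence(deaths)
```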

To refine the fault persistence, it is weighted using the structural features of the simplicial complex, which include topological connectivity, local density, dimension information, and the like. The local connectivity of each simplex is calculated to determine its importance in the overall topological structure, and a weighting coefficient is assigned to each simplex accordingly. The weighting coefficients are multiplied by the fault persistence values to obtain a feature value for each dimension. Feature decomposition is then performed on these dimension feature values, the feature weights of the different dimensions are calculated, and the corrected fault persistence is obtained, which improves the accuracy and robustness of fault analysis.

And then, calculating the coupling relation between the corrected fault duration and the resource state entropy, and further constructing a coupling matrix. The resource state entropy is used for measuring the balance degree of system resources, and is based on the concept of information entropy, and the stability of the system is evaluated by calculating the distribution condition of the resource states of a CPU, a memory, a network, a disk and the like. Higher entropy values indicate more uniform resource usage and more stable systems, while lower entropy values indicate some resources are abnormal, such as too high CPU occupancy or bottlenecks in a network port.

To establish the relationship between the fault persistence and the resource state entropy, the correlation coefficient between the two is calculated first, which measures the degree of coupling between the two variables. The correlation coefficient is then filled into the diagonal matrix formed by the corrected fault persistence and the resource state entropy to generate the coupling matrix. The coupling matrix describes the global correlation between the system resource states and the fault persistence.

After the coupling matrix construction is completed, feature decomposition is required to extract key feature vectors. These feature vectors are used to identify the primary factors that affect system stability, for example, certain resource status patterns may lead to certain types of failures. Based on the feature vectors, a threshold decision value may be calculated, measuring whether the system is in an acceptable operating state. The specific calculation method is to multiply and sum the feature vector with the corrected fault duration and the resource state entropy respectively so as to obtain a global threshold index.
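One possible reading of the coupling-matrix steps above is sketched here for the two-variable case. It assumes the coupling matrix is the symmetric 2x2 matrix with the corrected fault persistence and the resource state entropy on the diagonal and their correlation coefficient off the diagonal, and that only the principal eigenvector is used for the threshold index; both are assumptions, as are all function names.

```python
import math


def coupling_matrix(persistence, entropy, corr):
    """Symmetric 2x2 matrix: diagonal holds the two quantities, off-diagonal their correlation."""
    return [[persistence, corr], [corr, entropy]]


def principal_eigenvector(m):
    """Unit eigenvector for the largest eigenvalue of a symmetric 2x2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    lam = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b * b)
    if b == 0:
        return (1.0, 0.0) if a >= d else (0.0, 1.0)
    # (A - lam * I) v = 0 gives v proportional to (b, lam - a).
    vx, vy = b, lam - a
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm)


def threshold_index(persistence, entropy, corr):
    """Multiply the eigenvector components by the two quantities and sum.

    Eigenvectors are only defined up to sign; this sketch keeps the sign
    produced by the closed form above.
    """
    v = principal_eigenvector(coupling_matrix(persistence, entropy, corr))
    return v[0] * persistence + v[1] * entropy


# Hypothetical values: persistence 2, entropy 2, correlation 1.
index = threshold_index(2.0, 2.0, 1.0)
```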

To optimize the threshold decision result, adjustments are made using historical repair decision data. The historical repair decision data include past system faults, their influence ranges, the repair measures taken, and the effectiveness of those measures. The parameters of the threshold calculation model can be adjusted through machine learning methods, such as gradient boosting decision trees or Bayesian optimization, so that they better match actual system operating conditions. The optimized repair threshold judges the fault recovery capability of the system in different resource states more accurately and is used to guide the execution of the automatic repair strategy.

Table 1 compares the comprehensive performance indexes of different fault analysis methods according to embodiments of the present invention, setting the present technical scheme against other methods on multiple performance indexes. The technical scheme achieves a fault detection accuracy of 94.6%, which is significantly higher than the other methods, while maintaining the lowest false alarm rate (3.2%) and the lowest missed alarm rate (2.2%). In terms of processing time, although the traditional statistical method is faster (0.57 ms/sample), the present scheme (0.98 ms/sample) has a significant advantage over the deep learning method (2.86 ms/sample) and classical persistent homology analysis (1.74 ms/sample). In terms of resource consumption, the technical scheme (245 MB) is far lower than the deep learning method (876 MB) and saves about 40% of resources compared with classical persistent homology analysis (412 MB).

Table 1: Comparison of comprehensive performance indexes of different fault analysis methods

In this embodiment, constructing the feature manifold and analyzing the system state with simplicial complexes accurately captures the topological relations among different resource states, which improves the accuracy of fault detection. Analyzing topological feature changes with homology group maps lets the fault persistence calculation reflect the fault influence range of the system at different scales, which improves anomaly detection capability. Calculating the resource state entropy and performing coupling analysis between it and the fault persistence gives a better measure of the overall stability of the system, enabling fine-grained resource management and fault prediction. The repair threshold is optimized through feature decomposition and dynamically adjusted using historical repair decision data, so that it better fits the actual application scenario and improves the effectiveness and adaptability of the automatic repair strategy.

In a second aspect of an embodiment of the present invention,

There is provided a large-scale network node scene construction system based on a network target range, the system comprising:

In a third aspect of an embodiment of the present invention,

There is provided an electronic device including:

A processor;

A memory for storing processor-executable instructions;

In a fourth aspect of an embodiment of the present invention,

The present invention may be a method, apparatus, system, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for performing various aspects of the present invention.

It should be noted that the above embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that the technical solution described in the above embodiments may be modified or some or all of the technical features may be equivalently replaced, and these modifications or substitutions do not make the essence of the corresponding technical solution deviate from the scope of the technical solution of the embodiments of the present invention.

Claims

Translated from Chinese

1. A large-scale network node scene construction method based on a network target range, characterized by comprising:

calculating the vulnerability correlation degree between nodes in the network node scene construction requirements, dividing the network nodes into multiple attack-defense levels, generating an attack-defense level node distribution map, extracting candidate network node templates from a network node template library and calculating a vulnerability chain propagation coefficient, matching the vulnerability chain propagation coefficient with the attack-defense level node distribution map, and screening out a target network node template;

using a graph-theoretic clustering algorithm to calculate a node distribution density threshold and hierarchically optimizing the network topology type in the network node scene construction requirements to obtain an initial network topology structure, generating network node instances from the target network node template through attribute mapping, constructing connection relationships between the network node instances based on inter-node communication protocol information, extracting communication features between the network node instances and calculating a service affinity, and optimizing the initial network topology structure based on the service affinity to generate a target network topology structure;

generating a resource utilization index according to the target network topology structure, constructing a node operation environment and deploying network services according to the resource optimization scheme obtained by predicting the resource utilization index, performing a stress test on the network services to generate a fault feature sequence, selecting a matching vulnerability program from a preset vulnerability program library based on the fault feature sequence and implanting it into the node operation environment, monitoring state information in the node operation environment and performing anomaly detection and repair according to the fault feature sequence, and generating a large-scale network node scene that meets the constraint of the resource optimization scheme;

wherein calculating the vulnerability correlation degree between nodes in the network node scene construction requirements, dividing the network nodes into multiple attack-defense levels, generating the attack-defense level node distribution map, extracting candidate network node templates from the network node template library and calculating the vulnerability chain propagation coefficient, matching the vulnerability chain propagation coefficient with the attack-defense level node distribution map, and screening out the target network node template comprises:

constructing a node correlation graph according to the vulnerability correlation degree between nodes, partitioning the node correlation graph based on a community detection algorithm to generate the attack-defense level node distribution map, introducing vulnerability exploit chains and calculating the trigger probability of adjacent vulnerabilities and the attack chain length attenuation coefficient, matching these against the vulnerability strength values at the corresponding positions in the attack-defense level node distribution map, and selecting the candidate network node template with the minimum matching degree value as the target network node template, specifically comprising:

obtaining vulnerability attribute information of the network nodes to generate node vulnerability feature vectors, calculating distance attenuation function values based on the number of communication hops between nodes, performing a tensor operation on the node vulnerability feature vectors and the distance attenuation function values to obtain a node vulnerability combination matrix, and calculating the vulnerability correlation degree between nodes in the network node scene construction requirements based on the node vulnerability combination matrix;

constructing the node correlation graph with the vulnerability correlation degree as the edge weight, iteratively partitioning the node correlation graph using the community detection algorithm, dividing the network nodes into multiple attack-defense levels, and calculating a level vulnerability strength value for each attack-defense level;

generating the attack-defense level node distribution map based on the positional relationships of the attack-defense levels and the level vulnerability strength values, wherein the nodes represent attack-defense levels and the edges between nodes represent the vulnerability strength relationships between levels;

extracting candidate network node templates from the preset network node template library and, based on the vulnerability features in the candidate network node templates, establishing connection relationships between vulnerabilities through the system state requirements, state changes, and attack entry types in the vulnerability features, introducing the continuity of system state changes as a screening condition for vulnerability exploit chains, and combining the cascade trigger probability of the attack chain with the length attenuation coefficient to calculate the vulnerability chain propagation coefficient;

matching the vulnerability chain propagation coefficient against the vulnerability strength value at the corresponding position in the attack-defense level node distribution map to obtain a propagation coefficient deviation value, calculating the change in level vulnerability strength after deployment based on the candidate network node template, determining the weighted sum of the propagation coefficient deviation value and the change in level vulnerability strength as the matching degree value, and selecting the candidate network node template with the minimum matching degree value as the target network node template;

wherein establishing connection relationships between vulnerabilities through the system state requirements, state changes, and attack entry types in the vulnerability features, introducing the continuity of system state changes as a screening condition for vulnerability exploit chains, and combining the cascade trigger probability of the attack chain with the length attenuation coefficient to calculate the vulnerability chain propagation coefficient comprises:

extracting vulnerability features from the candidate network node templates, the vulnerability features including the system state requirements for triggering the vulnerability, the system state changes after the vulnerability is exploited, and the attack entry type of the vulnerability exploitation, and constructing vulnerability feature vectors;

performing feature matching according to the system state requirements, system state changes, and attack entry types in the vulnerability feature vectors, and establishing a connection relationship between two vulnerabilities when the system state change of the first vulnerability satisfies the system state requirement of the second vulnerability and the attack entry type of the second vulnerability belongs to the attack entries introduced by the first vulnerability;

constructing a vulnerability dependency graph based on the connection relationships, identifying vulnerability exploit chain paths in the vulnerability dependency graph using a depth-first search method, and screening the vulnerability exploit chain paths according to the continuity of system state changes in the vulnerability feature vectors to generate vulnerability exploit chains;

calculating the connection distance between adjacent nodes based on the number of nodes in the vulnerability exploit chain, calculating the path length of the vulnerability exploit chain according to the connection distances, and substituting the path length into an exponential decay function to obtain the attack chain length attenuation coefficient of the vulnerability exploit chain;

for the adjacent vulnerabilities in the vulnerability exploit chain, multiplying the trigger probabilities of the adjacent vulnerabilities in sequence, following the connection order of the vulnerabilities in the chain, to obtain the cascade trigger probability of the vulnerability exploit chain, and multiplying the cascade trigger probability by the attack chain length attenuation coefficient to obtain the vulnerability chain propagation coefficient;

constructing an attack path graph based on the vulnerability feature vectors and the vulnerability exploit chains, mapping the vulnerability exploit chains to attack paths, counting the number of successes of each attack path over multiple simulations, using the success counts to optimize the calculation parameters of the trigger probability and the attack chain length attenuation coefficient, and recalculating the vulnerability chain propagation coefficient with the optimized parameters to obtain the final vulnerability chain propagation coefficient.

2. The method according to claim 1, characterized in that using the graph-theoretic clustering algorithm to calculate the node distribution density threshold and hierarchically optimizing the network topology type in the network node scene construction requirements to obtain the initial network topology structure, generating network node instances from the target network node template through attribute mapping, constructing connection relationships between the network node instances based on inter-node communication protocol information, extracting communication features between the network node instances and calculating the service affinity, and optimizing the initial network topology structure based on the service affinity to generate the target network topology structure comprises:

constructing a distance matrix according to the Euclidean distances between nodes, calculating the local density and relative density of the nodes from the distance matrix using the graph-theoretic clustering algorithm, setting a density threshold in combination with the network node scene construction requirements, normalizing the local density and relative density of the nodes to obtain a comprehensive density value for each node, dividing the nodes into layers according to the network topology layering requirements, determining the topological connection relationships of the nodes in each layer according to the comprehensive density values, and obtaining through iterative optimization an initial network topology structure that satisfies the layer constraints;

extracting hardware configuration information and software configuration information from the target network node template, constructing a static attribute feature set and a dynamic attribute feature set, mapping the static attribute feature set to the hardware parameters of a node instance and the dynamic attribute feature set to the software parameters of the node instance based on feature mapping rules, and generating the network node instances;

designing a protocol feature tensor, encoding the communication protocol information between node instances as a multi-dimensional feature tensor, extracting latent factors of the protocol features by tensor decomposition, constructing a protocol similarity measurement model based on the latent factors, calculating a protocol compatibility matrix between node instances, and determining the connection relationships between node instances according to a compatibility threshold;

extracting time-series features of the service traffic between network node instances using a sliding time window, analyzing the time-series features with a long short-term memory network to obtain a service traffic prediction result, constructing a service dependency graph through service call chain analysis, extracting node association features from the service dependency graph with a graph attention network, deriving node resource utilization features from resource monitoring data, and fusing the service traffic prediction result, the node association features and the resource utilization features across multiple dimensions to calculate the service affinity;

taking the service affinity as the state space and changes to the connection relationships between network node instances as the action space, constructing a reward function that accounts for node processing capacity constraints, link bandwidth constraints and end-to-end delay constraints, iteratively optimizing a connection policy using a deep reinforcement learning method based on value function estimation and a policy gradient algorithm, outputting an optimal topology connection scheme when the reward function converges, and optimizing the initial network topology structure with the optimal topology connection scheme to generate the target network topology structure.
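The claim names the three constraints that the reward function considers but does not give its form. The following is a minimal sketch of one possible shape, assuming a reward equal to the service affinity of a link minus a weighted sum of relative constraint violations. The `LinkState` fields, the `penalty_weight` parameter and the linear penalty are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class LinkState:
    """Hypothetical per-link quantities used only for this illustration."""
    affinity: float         # service affinity between the two endpoints, in [0, 1]
    node_load: float        # load on the busier endpoint as a fraction of its capacity
    bandwidth_used: float   # bandwidth demanded on the link (Mbit/s)
    bandwidth_cap: float    # bandwidth available on the link (Mbit/s)
    delay_ms: float         # estimated end-to-end delay through the link
    delay_budget_ms: float  # maximum tolerated end-to-end delay


def constraint_penalty(value: float, limit: float) -> float:
    """Relative amount by which value exceeds limit (0 when within the limit)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return max(0.0, (value - limit) / limit)


def link_reward(s: LinkState, penalty_weight: float = 2.0) -> float:
    """Affinity minus weighted violations of the capacity, bandwidth and delay limits."""
    penalty = (
        constraint_penalty(s.node_load, 1.0)
        + constraint_penalty(s.bandwidth_used, s.bandwidth_cap)
        + constraint_penalty(s.delay_ms, s.delay_budget_ms)
    )
    return s.affinity - penalty_weight * penalty
```

Under this sketch a link that respects all three limits is rewarded with its affinity alone, while any violated limit lowers the reward in proportion to the size of the violation, which steers the learned policy away from infeasible connections.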

3. The method according to claim 2, characterized in that iteratively optimizing the connection policy using a deep reinforcement learning method based on value function estimation and a policy gradient algorithm, and outputting the optimal topology connection scheme when the reward function converges, comprises:

constructing a dual-network architecture consisting of a value function network and a policy network, using the temporal-difference error to guide prioritized sampling from an experience pool and optimizing parameters under a policy trust-region constraint, while using an advantage function to adjust the improvement direction of the policy network and introducing a state visitation distribution to guide training of the value function network, and outputting the sequence with the highest action probability as the optimal topology connection scheme when the optimization converges, specifically comprising:

obtaining a service affinity matrix and a connection state matrix of the initial network topology structure, and constructing the dual-network architecture of the value function network and the policy network, wherein the value function network performs feature extraction on the service affinity matrix and the connection state matrix to obtain a state feature vector and outputs a state value, and the policy network extracts inter-node association features from the state feature vector and outputs node connection change action probabilities;

calculating a temporal-difference error based on the state value and a reward value computed from the node processing capacity constraints, link bandwidth constraints and end-to-end delay constraints, storing the state feature vector, the node connection change action probabilities, the reward value and the state feature vector of the next state in the experience pool as state transition information, assigning a sampling priority to the state transition information in the experience pool based on the temporal-difference error, obtaining training samples according to the sampling priority, and updating the parameters of the value function network with the training samples;

constructing a policy objective function based on the node connection change action probabilities and the state transition information, calculating a policy gradient with the state value as a baseline function, and updating the parameters of the policy network with the policy gradient and the state transition information under the policy trust-region constraint;

using the temporal-difference error as the advantage function, adjusting the improvement direction of the policy network based on the advantage function, generating a state visitation distribution from the node connection change action probabilities output by the policy network, and using the state visitation distribution to train the value function network;

monitoring the trends of the reward value, the node connection change action probabilities output by the policy network and the state value output by the value function network, and, when the fluctuation amplitude of these trends falls below a convergence threshold, extracting the action sequence with the highest probability among the node connection change action probabilities output by the policy network as the optimal topology connection scheme.
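The temporal-difference error and the priority-based sampling of the experience pool can be illustrated with a minimal sketch. It assumes the common one-step TD error and proportional prioritization, in which a transition is drawn with probability proportional to a power of the absolute TD error. The discount `gamma`, the exponent `alpha` and the offset `eps` are illustrative assumptions; the claim does not specify them.

```python
import random


def td_error(reward, value, next_value, gamma=0.99, done=False):
    """One-step temporal-difference error: r + gamma * V(s') - V(s)."""
    target = reward + (0.0 if done else gamma * next_value)
    return target - value


def sampling_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional priorities: p_i = (|delta_i| + eps)^alpha, normalized to sum to 1."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_batch(transitions, td_errors, batch_size, rng=None):
    """Draw a training batch (with replacement) according to the sampling priorities."""
    rng = rng or random.Random()
    probs = sampling_probabilities(td_errors)
    return rng.choices(transitions, weights=probs, k=batch_size)
```

The small offset `eps` keeps transitions with zero error from never being replayed. Since the claim also uses the temporal-difference error as the advantage function, the same `td_error` value could weight the policy gradient in a fuller implementation.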

4. The method according to claim 1, characterized in that generating a resource utilization index according to the target network topology structure, constructing a node operating environment and deploying network services using a resource optimization scheme obtained by predicting the resource utilization index, stress-testing the network services to generate a fault feature sequence, selecting a matching vulnerability program from a preset vulnerability program library based on the fault feature sequence and implanting it into the node operating environment, monitoring state information in the node operating environment and performing anomaly detection and repair according to the fault feature sequence, and generating a large-scale network node scenario that satisfies the constraints of the resource optimization scheme comprises:

constructing a multi-dimensional resource state space, mapping the CPU state vector, memory state vector, network state vector and disk state vector of each network node in the target network topology structure into the multi-dimensional resource state space, adversarially training an adversarial variational autoencoder on the state vectors to obtain a resource state distribution, calculating a resource state entropy based on the resource state distribution, constructing a loss function for a resource prediction model, training a recurrent neural network with the loss function to obtain a resource utilization prediction model, inputting the historical resource state distribution into the resource utilization prediction model to generate a future resource utilization index, and setting resource allocation parameters based on the resource utilization index to generate the resource optimization scheme;

calculating a system stability matrix based on the resource state entropy, performing singular value decomposition on the stability matrix to obtain a characteristic mode sequence, identifying abrupt change points of the system state based on the characteristic mode sequence, taking the abrupt change points as fault injection moments, and constructing the node operating environment and deploying the network services at the fault injection moments according to the resource optimization scheme;

stress-testing the network services to collect system state data, mapping the system state data onto a feature manifold using a complex-variable kernel function, computing geodesics on the feature manifold to obtain a fault propagation path, calculating curvature features of the fault propagation path to construct the fault feature sequence, performing manifold matching between the fault feature sequence and the vulnerability features in the preset vulnerability program library, and selecting the vulnerability program that has the highest matching degree and satisfies the constraints of the resource optimization scheme for implantation into the node operating environment;

analyzing the feature manifold using a persistent homology method to obtain a fault duration, constructing a repair threshold based on the coupling relationship between the resource state entropy and the fault duration, performing anomaly detection and repair on the node operating environment according to the repair threshold, and, when the resource utilization index of the repaired node operating environment satisfies the constraints of the resource optimization scheme, outputting the current node operating environment as the large-scale network node scenario.
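The resource state entropy that appears throughout this claim can be illustrated with a simplified sketch. The claim obtains the resource state distribution from an adversarial variational autoencoder; the example below swaps in a plain histogram over utilization samples as a stand-in and computes the Shannon entropy of that histogram. The function name, the bin count and the assumption that samples lie in [0, 1] are all illustrative.

```python
import math
from collections import Counter


def resource_state_entropy(samples, bins=10):
    """Shannon entropy (in bits) of utilization samples in [0, 1].

    A histogram stands in for the learned resource state distribution;
    this is an assumption made for illustration only.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    # Assign each sample to a bin, clamping to the valid bin range.
    counts = Counter(max(0, min(int(x * bins), bins - 1)) for x in samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

A node whose utilization never leaves one bin has zero entropy, while a node whose utilization is spread evenly over many bins has high entropy, which is the sense in which entropy can serve as an input to a stability measure.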

5. The method according to claim 4, characterized in that analyzing the feature manifold using a persistent homology method to obtain the fault duration, and constructing the repair threshold based on the coupling relationship between the resource state entropy and the fault duration, comprises:

constructing a nested sequence of simplicial complexes on the feature manifold, extracting duration information of the fault features using homology group mappings, obtaining a corrected fault duration through two-level feature weight optimization, establishing a coupling matrix between the fault duration and the resource state entropy, and performing eigendecomposition on the coupling matrix to obtain the repair threshold, specifically comprising:

constructing a distance metric space on the feature manifold, calculating distance values between point pairs based on the distance metric space, constructing simplicial complexes from the distance values, generating a nested sequence of simplicial complexes by gradually increasing the distance value, calculating the homology group mappings between adjacent simplicial complexes in the nested sequence to obtain a topological feature sequence, extracting the appearance time and disappearance time of each feature in the topological feature sequence, calculating feature durations from the appearance and disappearance times, and computing a weighted sum of the feature durations to obtain the fault duration;

assigning weighting coefficients to the fault duration according to the structure of the simplicial complexes, taking the products of the weighting coefficients and the fault duration as dimensional feature values, performing eigendecomposition on the dimensional feature values to obtain feature weights, and summing the products of the feature weights and the dimensional feature values to obtain the corrected fault duration;

calculating a correlation coefficient between the corrected fault duration and the resource state entropy, filling the correlation coefficient into the diagonal matrix formed by the corrected fault duration and the resource state entropy to obtain the coupling matrix, performing eigendecomposition on the coupling matrix to obtain eigenvectors, multiplying the eigenvectors by the corrected fault duration and the resource state entropy respectively and summing the results to obtain a threshold decision value, and optimizing against labeled repair decision data using the threshold decision value to obtain the repair threshold.
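The first step of this claim (a nested sequence of complexes built by growing a distance value, appearance and disappearance times of features, and a weighted sum of feature durations) can be sketched for the simplest case. The example below restricts itself to 0-dimensional features, that is, connected components of a Vietoris-Rips filtration over a point cloud, and tracks them with a union-find structure. The restriction to dimension 0, the Euclidean metric and the uniform default weights are assumptions made for illustration; the claim does not fix them.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """(appearance, disappearance) pairs of connected components as the
    distance value grows; the one component that never disappears is omitted."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges enter the complex in order of increasing pairwise distance.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    pairs = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            # Every point appears at distance 0; one component disappears at d.
            pairs.append((0.0, d))
    return pairs


def fault_duration(pairs, weights=None):
    """Weighted sum of feature durations (uniform weights by default)."""
    durations = [death - birth for birth, death in pairs]
    if weights is None:
        weights = [1.0 / len(durations)] * len(durations) if durations else []
    return sum(w * d for w, d in zip(weights, durations))
```

For three collinear points at positions 0, 1 and 5, two components merge at distances 1 and 4, so the uniform-weight fault duration is 2.5. Higher-dimensional features such as loops would require a full persistent homology library rather than this union-find shortcut.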

6. A large-scale network node scenario construction system based on a network target range, for implementing the method according to any one of claims 1 to 5, characterized by comprising:

a first unit, configured to calculate vulnerability correlation degrees between nodes in the network node scenario construction requirement, divide the network nodes into multiple attack-defense levels, generate an attack-defense level node distribution map, extract candidate network node templates from a network node template library and calculate vulnerability chain propagation coefficients, match the vulnerability chain propagation coefficients against the attack-defense level node distribution map, and select the target network node template;

a second unit, configured to calculate a node distribution density threshold using a graph-theoretic clustering algorithm and hierarchically optimize the network topology type in the network node scenario construction requirement to obtain an initial network topology structure, generate network node instances from the target network node template through attribute mapping, construct connection relationships between the network node instances based on inter-node communication protocol information, extract communication features between the network node instances and calculate a service affinity, and optimize the initial network topology structure based on the service affinity to generate a target network topology structure;

a third unit, configured to generate a resource utilization index according to the target network topology structure, construct a node operating environment and deploy network services using a resource optimization scheme obtained by predicting the resource utilization index, stress-test the network services to generate a fault feature sequence, select a matching vulnerability program from a preset vulnerability program library based on the fault feature sequence and implant it into the node operating environment, monitor state information in the node operating environment and perform anomaly detection and repair according to the fault feature sequence, and generate a large-scale network node scenario that satisfies the constraints of the resource optimization scheme.

7. An electronic device, characterized by comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to invoke the instructions stored in the memory to perform the method according to any one of claims 1 to 5.

8. A computer-readable storage medium having computer program instructions stored thereon, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 5.