Attack tracing method based on threat intelligence and ATT&CK
Technical Field
The invention relates to the field of computer network security, and in particular to an attack tracing method based on threat intelligence and ATT&CK.
Background
A threat is a potential factor that can cause damage to a particular target system or asset. As attack techniques diversify, attack methods grow more complex, and applications multiply, threats carry a large number of uncertain factors. The Advanced Persistent Threat (APT), characterized by sophisticated techniques, long attack duration, and severe damage, has become a form of network attack that seriously threatens the data security of governments and enterprises.
In real-world scenarios, existing defense frameworks cannot discover a threat promptly and accurately before an APT attack causes serious economic loss. More importantly, in the post-incident review stage, the attack techniques are so complex and the attacker injects so much dirty data as interference that it is difficult to locate the vulnerability precisely and reconstruct the detailed attack flow. In addition, the log data often runs to hundreds of gigabytes and is cluttered with redundant entries, which makes the tracing process difficult. In general, the prior art has the following problems:
First, in practice, IPS-based false alarms in a real network are numerous every day, and for huge log files the traditional method still relies on IP-Time-Process matching and similar approaches for log processing and attack-behavior screening during tracing, which is inefficient and consumes a great deal of time and labor.
Second, threat intelligence is not always complete and often contains information about only one or a few attackers. It cannot closely link the intent of an attack across its life cycle with the attack flow, and the disordered attacker information makes tracing very difficult.
Third, an attacker may perform a large amount of probing before executing the attack, and may even inject dirty data during the attack to obscure its real purpose, which further increases the difficulty of tracing.
To address these problems, the invention first introduces a labeling technique: a mapping between a label system and threat levels is established, logs are preprocessed using threat intelligence, data that may indicate a threat is cleaned and screened out, and specific labels are assigned to log files according to their application scenarios and characteristics, which facilitates classification, analysis, and management.
Because threat intelligence in practice often cannot link attack flows closely, the invention also incorporates the ATT&CK (Adversarial Tactics, Techniques, and Common Knowledge) framework, which takes the attacker's perspective, to assist attack tracing and localization: starting from the key information already available, the tactics, techniques, and procedures an adversary is likely to use are inferred. Targeted log filtering and scanning is then performed and contextual association information is added, thereby enriching attack semantics and completing the attacker's path.
In view of this, the invention provides an attack tracing method based on threat intelligence and ATT&CK.
Disclosure of Invention
To address the problems of difficult log data classification, low processing efficiency, difficult localization of security incidents, difficult reconstruction of attacker paths, and incomplete threat intelligence for attack events in conventional network security attack tracing, an attack tracing method based on threat intelligence and ATT&CK is provided. The method is implemented using a label module, a threat intelligence library module, a data preprocessing module, and an ATT&CK processing module. The threat intelligence library module comprises an external threat acquisition submodule, an internal threat acquisition submodule, an internal threat aggregation analysis submodule, and an internal threat intelligence library. The ATT&CK processing module comprises an ATT&CK framework analysis submodule, a data processing submodule, and a feedback submodule. The label module is connected to the data preprocessing module; the data preprocessing module is connected to both the ATT&CK processing module and the threat intelligence library module; and the ATT&CK processing module is connected to the threat intelligence library module. The external threat acquisition submodule is connected to the internal threat intelligence library, the internal threat acquisition submodule is connected to the internal threat aggregation analysis submodule, and the internal threat aggregation analysis submodule is connected to the internal threat intelligence library. The ATT&CK framework analysis submodule is connected to the data processing submodule, and the data processing submodule is connected to the feedback submodule.
The method comprises the following specific steps:
Based on the application scenario, a mapping between the label system and the processing rules is first established, a weight coefficient is preset for each attack-behavior attribute, and data in the log files is identified using the corresponding data features of the label system. According to the nature of the threat, data in the log files is identified by features exposed by the attack behavior, such as IP address, MD5 value, hash value, and domain name.
Internal security threats are collected, and an internal threat intelligence library is established and refined in combination with threat intelligence sources shared by external platforms.
Using the external threat acquisition submodule, an open-source threat intelligence collection and query framework is established that automatically collects threat intelligence from various open resources. Web crawler technology and API interfaces simplify the external collection flow and enable rapid collection and organization of threat intelligence. The acquired data is stored in the internal threat intelligence library, using a relational database. The web crawlers use Python libraries such as BeautifulSoup, Requests, and Scrapy to parse and crawl web page content from targets such as Twitter, Tor, darknet forums, security portal 360, FreeBuf, FireEye, McAfee, and the like. The API interfaces are used to collect threat intelligence from open-source threat intelligence sharing platforms and vendors, including AlienVault OTX, GreyNoise, Hunter, MalShare, and the like, all of which provide fixed API interfaces for retrieving the collected intelligence.
Using the internal threat acquisition submodule, data sources are collected. These include illegal access, unauthorized access, identity authentication, and irregular operations monitored by traditional security devices such as firewalls, IDS, and IPS, as well as data obtained through sandbox execution, honeypots, DPI, DFI, malicious code detection, and the like.
Using the internal threat aggregation analysis submodule, the context, scenario indicators, attack indicators, and attack impact related to a given threat are discovered under the specific trends and situations of the overall environment; security event information established in the course of responding to the threat is collected and analyzed; and the analysis result is finally stored in the internal threat intelligence library according to a threat intelligence format standard.
After the internal threat intelligence library is established, the files requiring tracing detection are preprocessed by the data preprocessing module. Their log files undergo preliminary filtering, and suspected threat information is screened out of the log files using multi-dimensional information such as IP address, MD5 value, hash value, domain name, and attack characteristics. During preprocessing, the label module marks the data in the log files with the corresponding threat weight labels according to the preset mapping between the label system and the processing rules, and the features exposed by the attack behavior are marked by matching the label system rules with regular expressions.
The data output by the data preprocessing module is sent to the ATT&CK processing module. Using the ATT&CK processing module together with the ATT&CK matrix, the attacker's tactics, techniques, and procedures (TTP) are analyzed from the attacker's perspective to identify the technical points that need priority handling, which improves tracing accuracy.
The ATT&CK framework analysis submodule in the ATT&CK processing module performs analysis with ATT&CK and constructs an attacker path model in combination with the local network topology. The submodule contains all the methods an attacker may use and the routes an attacker may take, and the attacker's attack path model is simulated and built on that basis.
Based on the model generated by the analysis submodule, the data processing submodule in the ATT&CK processing module loads the data output by the data preprocessing module as a new data source and processes the log data in priority tiers according to the label information. The data processing submodule then continues to supplement the intelligence on the attack behavior, specifically as follows: first, for attack behaviors that have been uncovered relatively completely, the attacker path is remodeled on the basis of the conventional threat intelligence, and the context of the attack behaviors is enriched according to their progressive relationship; second, for intelligence that is still incomplete and attack behaviors that have not yet been uncovered, gaps in the obtained intelligence are filled according to the model generated by the analysis submodule: the links missing from the attack intelligence are identified, and log information is collected and filtered again using the attack characteristics, attack areas, and other information associated with the missing links.
The feedback submodule in the ATT&CK processing module performs the following operations: first, after collecting and organizing the attack path information completed by the data processing submodule, it writes the information into the local threat intelligence library in a specified format; second, it generates a visualized attack tracing report that is closely tied to the attacker's context.
The invention has the beneficial effects that:
1. The attack tracing method based on threat intelligence and ATT&CK disclosed by the invention improves on traditional log data cleaning and on tracing that relies on threat intelligence alone. Rules are preset through the label module, threat weights are judged from multi-dimensional information and labels are applied accordingly, and the label rules are optimized, which improves information retrieval efficiency and facilitates later sorting, classification, and modeling of the information.
2. By establishing a local threat intelligence library and combining threat intelligence with the label system, the method on the one hand avoids meaningless processing of the attacker's excessive repeated access, probing, dirty-data injection, and similar operations after detection, for example when an attacker forges information such as IP addresses, which would otherwise produce an excessive number of marks and consume system resources; on the other hand, because multiple attack behaviors with similar threat levels tend to share a common origin or similar attack paths, the method can greatly reduce the workload.
3. The method combines ATT&CK with threat intelligence. On the one hand, this addresses the isolated, incomplete, and disordered nature of threat intelligence; on the other hand, the threat intelligence is refined by establishing a local threat intelligence library, and local ATT&CK modeling and analysis accurately associates the behaviors before and after an attack, which in turn improves detection and discovery, so that a closed-loop, self-learning, and self-updating framework is achieved.
Drawings
FIG. 1 is the overall framework of a system implementing the present invention.
FIG. 2 is a schematic diagram of the components of the threat intelligence library module.
Detailed Description
For a better understanding of the present disclosure, an example is given here.
An attack tracing method based on threat intelligence and ATT & CK is implemented as follows:
As shown in FIG. 1, the method is implemented using a label module, a threat intelligence library module, a data preprocessing module, and an ATT&CK processing module. The threat intelligence library module comprises an external threat acquisition submodule, an internal threat acquisition submodule, an internal threat aggregation analysis submodule, and an internal threat intelligence library. The ATT&CK processing module comprises an ATT&CK framework analysis submodule, a data processing submodule, and a feedback submodule. The label module is connected to the data preprocessing module; the data preprocessing module is connected to both the ATT&CK processing module and the threat intelligence library module; and the ATT&CK processing module is connected to the threat intelligence library module. The external threat acquisition submodule is connected to the internal threat intelligence library, the internal threat acquisition submodule is connected to the internal threat aggregation analysis submodule, and the internal threat aggregation analysis submodule is connected to the internal threat intelligence library. The ATT&CK framework analysis submodule is connected to the data processing submodule, and the data processing submodule is connected to the feedback submodule.
Based on the application scenario, a mapping between the label system and the processing rules is first established, a weight coefficient is preset for each attack-behavior attribute, and data in the log files is identified using the corresponding data features of the label system, so that the processing rules are standardized and generalized. The traditional way of building a label index marks information using a single IP address as the identifier, which produces an excessive number of marks and increases the workload. The method instead exploits the structure of threat intelligence: according to the nature of the threat, it identifies features exposed by the attack behavior, such as IP address, MD5 value, hash value, and domain name. The advantage of this identification mode is that multiple attack behaviors with similar threat levels are highly likely to share a common origin or similar attack paths, so the workload can be greatly reduced. The identification rule also avoids the excessive marking and resource consumption that arise when an attacker forges information such as IP addresses. The label system threat weights are shown in Table 1.
Table 1. Label system threat weights
| Threat weight | Threat intelligence features |
| 1 | Indicators (IP, hash, domain, etc.) associated with high-risk behaviors such as C&C and privilege escalation |
| 2 | Indicators (IP, hash, domain, etc.) associated with high-risk behaviors such as command execution and ARP |
| 3 | Indicators (IP, hash, domain, etc.) associated with dangerous behaviors such as backdoor traffic and WAF bypass |
| 4 | Indicators (IP, hash, domain, etc.) associated with low-risk behaviors such as probing, scanning, and collection |
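The weight assignment in Table 1 can be sketched as a simple lookup. This is a minimal illustration and not the patented implementation: the behavior category names and the function `threat_weight` are hypothetical, and the sketch assumes that a numerically smaller weight means a more severe threat, as the table's ordering suggests.

```python
# Hypothetical encoding of Table 1 (assumption: smaller weight = more severe).
WEIGHT_BY_BEHAVIOR = {
    "c2": 1,
    "privilege_escalation": 1,
    "command_execution": 2,
    "arp": 2,
    "backdoor_traffic": 3,
    "waf_bypass": 3,
    "probing": 4,
    "scanning": 4,
    "collection": 4,
}


def threat_weight(behaviors):
    """Return the most severe (smallest) weight among the behaviors tied to
    an indicator, or None if none of the behaviors is recognized."""
    weights = [WEIGHT_BY_BEHAVIOR[b] for b in behaviors if b in WEIGHT_BY_BEHAVIOR]
    return min(weights) if weights else None


# An indicator seen both scanning and acting as C&C is labeled with weight 1.
label = threat_weight(["scanning", "c2"])
```

Taking the minimum means that an indicator involved in several behaviors is labeled by its most dangerous one, which is one possible policy among several.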
As shown in FIG. 2, the threat intelligence library module comprises an external threat acquisition submodule, an internal threat acquisition submodule, an internal threat aggregation analysis submodule, and an internal threat intelligence library.
Internal security threats are collected, and a local threat intelligence library is established and refined in combination with threat intelligence sources shared by external platforms, so that attack data can be analyzed and traced quickly and efficiently. Once the local threat intelligence library is established, threat intelligence can greatly reduce the work of log retrieval. The collection and organization of internal security threats must be combined with the threat intelligence sources shared by external platforms in order to complete the most basic local threat intelligence library.
Using the external threat acquisition submodule, an open-source threat intelligence collection and query framework is established that automatically collects threat intelligence from various open resources. Web crawler technology and API interfaces simplify the external collection flow and enable rapid collection and organization of threat intelligence. The web crawlers use Python libraries such as BeautifulSoup, Requests, and Scrapy to parse and crawl web page content from targets such as Twitter, Tor, darknet forums, security portal 360, FreeBuf, FireEye, McAfee, and the like. The API interfaces are used to collect threat intelligence from open-source threat intelligence sharing platforms and vendors, including AlienVault OTX, GreyNoise, Hunter, MalShare, and the like, all of which provide fixed API interfaces for retrieving the collected intelligence.
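The collection flow can be sketched as a merge over pluggable sources. The example below is only an assumption-laden outline: `collect_indicators`, the record layout (`type`, `value`), and the two stand-in feeds are hypothetical, and no real platform API is called. In practice each fetcher would wrap a crawler or a platform's API client.

```python
from typing import Callable, Dict, Iterable, List


def collect_indicators(sources: Dict[str, Callable[[], Iterable[dict]]]) -> List[dict]:
    """Pull records from every source and merge them, de-duplicating on
    (type, normalized value) and remembering which sources reported each."""
    merged = {}
    for name, fetch in sources.items():
        for record in fetch():
            key = (record["type"], record["value"].strip().lower())
            entry = merged.setdefault(
                key, {"type": key[0], "value": key[1], "sources": []}
            )
            if name not in entry["sources"]:
                entry["sources"].append(name)
    return list(merged.values())


# Stand-in fetchers (hypothetical data, documentation-range addresses).
def fake_feed_a():
    return [
        {"type": "ip", "value": "198.51.100.7"},
        {"type": "domain", "value": "Bad.Example"},
    ]


def fake_feed_b():
    return [{"type": "ip", "value": "198.51.100.7"}]


indicators = collect_indicators({"feed_a": fake_feed_a, "feed_b": fake_feed_b})
```

Injecting the fetchers keeps the merge logic independent of any particular crawler or API, so a new source can be added without touching the de-duplication step.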
In the method, the data collected by the external threat acquisition submodule all follows the standardized, structured format defined by the international threat intelligence standard STIX and is stored in the internal threat intelligence library. Because the collected data represents a series of linked attack behaviors, it is stored in a relational database such as MySQL, Msql, or the like.
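One way to picture relational storage of linked attack behaviors is a pair of tables, one for indicators and one for the relations between them. The sketch below uses Python's built-in `sqlite3` purely as a stand-in for the relational databases named above; the table and column names are hypothetical and not taken from the source.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a production relational store.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE indicator (
        id     INTEGER PRIMARY KEY,
        type   TEXT NOT NULL,
        value  TEXT NOT NULL,
        weight INTEGER,
        UNIQUE (type, value)
    );
    CREATE TABLE relation (
        src_id INTEGER NOT NULL REFERENCES indicator(id),
        dst_id INTEGER NOT NULL REFERENCES indicator(id),
        kind   TEXT NOT NULL
    );
    """
)

insert = "INSERT INTO indicator (type, value, weight) VALUES (?, ?, ?)"
domain_id = conn.execute(insert, ("domain", "bad.example", 1)).lastrowid
ip_id = conn.execute(insert, ("ip", "198.51.100.7", 1)).lastrowid
conn.execute(
    "INSERT INTO relation (src_id, dst_id, kind) VALUES (?, ?, ?)",
    (domain_id, ip_id, "resolves-to"),
)

# Follow the link between two indicators with a join.
rows = conn.execute(
    """
    SELECT a.value, r.kind, b.value
    FROM relation r
    JOIN indicator a ON a.id = r.src_id
    JOIN indicator b ON b.id = r.dst_id
    """
).fetchall()
```

Keeping relations in their own table lets a later step walk from one indicator to the behaviors linked with it, which is the property the description relies on.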
As shown in FIG. 2, the internal threat acquisition submodule collects data sources. On the one hand, these include illegal access, unauthorized access, identity authentication, irregular operations, and the like monitored by conventional security devices such as firewalls, IDS, and IPS. On the other hand, they include sandbox execution, honeypots, DPI, DFI, malicious code detection, and the like. In all cases the data is gathered from the results of traditional detection equipment or from native middleware logs.
Using the internal threat aggregation analysis submodule, the context, scenario indicators, attack indicators, and attack impact related to a given threat are discovered under the specific trends and situations of the overall environment; security event information established in the course of responding to the threat is collected and analyzed; and the analysis result is finally stored in the internal threat intelligence library according to a threat intelligence format standard. The internal threat aggregation analysis submodule organizes its output according to the STIX structured threat intelligence format.
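As an illustration of the structured format, the following sketch builds a single indicator object by hand using only the standard library. It assumes STIX 2.1 (the source does not state a version), and the helper name `make_stix_indicator` and the sample values are hypothetical; a production system would more likely use a dedicated STIX library and validate its output.

```python
import json
import uuid
from datetime import datetime, timezone


def make_stix_indicator(ip: str, name: str) -> dict:
    """Build a minimal STIX 2.1-style indicator for an IPv4 address
    (hand-assembled sketch, not validated against the specification)."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "name": name,
        "pattern": f"[ipv4-addr:value = '{ip}']",
        "pattern_type": "stix",
        "valid_from": now,
    }


indicator = make_stix_indicator("198.51.100.7", "Suspected C&C address")
serialized = json.dumps(indicator, indent=2)
```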
After the internal threat intelligence library is established, the data preprocessing module preprocesses the files requiring tracing detection. Their log files undergo preliminary filtering, and suspected threat information is screened out of the log files using multi-dimensional information such as IP address, MD5 value, hash value, domain name, and attack characteristics. This greatly reduces the volume of log information carrying potential threats, and the analyzed and filtered log information has a high true positive rate (TPR). At the same time, during preprocessing the label module marks the data in the log files with the corresponding threat weight labels according to the preset mapping between the label system and the processing rules, and the features exposed by the attack behavior are marked by matching the label system rules with regular expressions. Classification by threat weight facilitates later tracing, identification, and final analysis and summarization.
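The filtering and labeling step can be sketched with regular expressions and a lookup against the local intelligence. Everything named here is an assumption for illustration: the `INTEL` dictionary, its sample indicators and weights, the log line format, and the `preprocess` function are hypothetical, and only IP addresses and MD5 values are matched.

```python
import re

IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
MD5_RE = re.compile(r"\b[a-fA-F0-9]{32}\b")

# Hypothetical local intelligence: indicator value -> threat weight (Table 1).
INTEL = {
    "198.51.100.7": 1,
    "203.0.113.9": 4,
    "44d88612fea8a8f36de82e1278abb02f": 2,
}


def preprocess(lines):
    """Keep only log lines that contain a known indicator and label each
    kept line with the most severe (smallest) matching weight."""
    labeled = []
    for line in lines:
        candidates = IP_RE.findall(line) + [m.lower() for m in MD5_RE.findall(line)]
        weights = [INTEL[c] for c in candidates if c in INTEL]
        if weights:
            labeled.append({"weight": min(weights), "line": line})
    return labeled


logs = [
    "2021-03-01 10:00:01 GET /index.html from 192.0.2.10",
    "2021-03-01 10:00:05 connect 198.51.100.7:443 pid=812",
    "2021-03-01 10:01:12 scan from 203.0.113.9 ports=1-1024",
]
labeled = preprocess(logs)
```

Lines with no known indicator are dropped, which is the reduction in log volume the description refers to; the label carried by each kept line drives the later priority-ordered processing.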
The data output by the data preprocessing module is sent to the ATT&CK processing module. Before the ATT&CK module processes it, the information contained in the threat intelligence itself is not necessarily complete; often only the attack signatures of a certain organization in a certain time period are recorded. Processing and extracting data with threat intelligence alone yields isolated results that cannot be well correlated with each attack behavior. Using the ATT&CK processing module together with the ATT&CK matrix, the attacker's tactics, techniques, and procedures (TTP) are analyzed from the attacker's perspective to identify the technical points that need priority handling, which improves tracing accuracy.
MITRE ATT&CK is a globally accessible knowledge base of adversary tactics and techniques based on real-world observations. The ATT&CK knowledge base is used as a foundation for developing specific threat models and methodologies in the private sector, in government, and in the cybersecurity product and service community. ATT&CK is open and available to any person or organization for use at no charge.
The ATT&CK framework analysis submodule in the ATT&CK processing module performs analysis with ATT&CK and constructs an attacker path model in combination with the local network topology. The submodule contains all the methods an attacker may use and the routes an attacker may take, and the attacker's attack path model is simulated and built on that basis. From the attacker's perspective, the attack path runs through Initial Access, Execution, Persistence, Privilege Escalation, Defense Evasion, Credential Access, Discovery, Lateral Movement, Collection, Command and Control, and Exfiltration.
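A very reduced form of such a path model is an ordered list of stages against which observed events are arranged. The sketch below uses the standard ATT&CK Enterprise tactic names for the stages the description lists; the event fields (`time`, `tactic`, `host`) and the function `build_path` are hypothetical, and real attacks may revisit stages, which this simple ordering ignores.

```python
# Tactic stages from the attacker's perspective, in the order described.
TACTIC_ORDER = [
    "Initial Access", "Execution", "Persistence", "Privilege Escalation",
    "Defense Evasion", "Credential Access", "Discovery", "Lateral Movement",
    "Collection", "Command and Control", "Exfiltration",
]
STAGE = {name: index for index, name in enumerate(TACTIC_ORDER)}


def build_path(events):
    """Order observed events by tactic stage, then by timestamp, to form a
    candidate attacker path (hypothetical simplification)."""
    return sorted(events, key=lambda e: (STAGE[e["tactic"]], e["time"]))


events = [
    {"time": "10:07", "tactic": "Discovery", "host": "ws-12"},
    {"time": "10:00", "tactic": "Initial Access", "host": "ws-12"},
    {"time": "10:30", "tactic": "Lateral Movement", "host": "db-01"},
]
path = [event["tactic"] for event in build_path(events)]
```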
The attacker's attack path model is processed and built using NLP methods from AI. Threat intelligence includes Chinese and English text, video, audio, bit streams, and other information in complex formats. Using NLP methods in combination with ATT&CK, text classification, part-of-speech tagging, syntactic analysis, information retrieval, information extraction, question answering, and machine translation are performed, and the model is built step by step. Finally, the data output by the data preprocessing module is loaded into the data processing submodule as a new data source. After reading the labels, the data processing submodule processes the data in priority tiers according to the label values. Attacks with the same threat level tend to share similar attack characteristics, so grouping by threat level makes it easier to classify and summarize similar attack behaviors. Threat intelligence is evidence-based knowledge, including context, mechanisms, indicators, implications, and actionable advice, about a threat or hazard that an asset faces, which can be used to inform the decisions of the asset's owner on how to respond to or handle that threat or hazard. Any data is threat intelligence if it represents behavior with threat implications. Analyzing originally isolated internal data to extract the data that carries threat meaning yields threat intelligence.
Based on the model generated by the analysis submodule, the data processing submodule in the ATT&CK processing module loads the data output by the data preprocessing module as a new data source and processes the log data in priority tiers according to the label information. Attacks with the same threat level tend to share the same attack characteristics, which helps simplify the tracing process. The data processing submodule then continues to supplement the intelligence on the attack behavior, specifically as follows: first, for attack behaviors that have been uncovered relatively completely, the attacker path is remodeled on the basis of the conventional threat intelligence, and the context of the attack behaviors is enriched according to their progressive relationship; second, for intelligence that is still incomplete and attack behaviors that have not yet been uncovered, gaps in the obtained intelligence are filled according to the model generated by the analysis submodule: the links missing from the attack intelligence are identified, and log information is collected and filtered again using the attack characteristics, attack areas, and other information associated with the missing links.
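The two operations described here, priority-ordered processing and gap filling, can be sketched as follows. The functions `prioritize` and `missing_links`, the sample `MODEL`, and the assumption that a smaller weight is processed first are all hypothetical illustrations rather than the method's actual implementation.

```python
def prioritize(records):
    """Order labeled records so the most severe weights (smallest numbers)
    are processed first; ties keep their original order."""
    return sorted(records, key=lambda r: r["weight"])


def missing_links(observed, model):
    """Return the model stages that lie between the first and last observed
    stage but have no supporting evidence yet."""
    seen = [i for i, stage in enumerate(model) if stage in observed]
    if not seen:
        return []
    return [
        model[i] for i in range(seen[0], seen[-1] + 1) if model[i] not in observed
    ]


# Hypothetical path model for one incident and the stages with evidence so far.
MODEL = [
    "Initial Access", "Execution", "Persistence",
    "Privilege Escalation", "Lateral Movement",
]
observed = {"Initial Access", "Privilege Escalation", "Lateral Movement"}

queue = prioritize([{"weight": 3, "id": "a"}, {"weight": 1, "id": "b"}])
gaps = missing_links(observed, MODEL)
```

The stages returned by `missing_links` are the ones for which logs would be collected and filtered again, narrowed by the attack characteristics and areas associated with each gap.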
The feedback submodule in the ATT&CK processing module performs the following operations: first, after collecting and organizing the attack path information completed by the data processing submodule, it writes the information into the local threat intelligence library in a specified format; second, it generates a visualized attack tracing report that is closely tied to the attacker's context. Compared with traditional tracing, the tracing result is more closely related to the attacker's behavior, and the attack behavior is captured more completely.
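The two feedback operations can be outlined as a serialization step and a report step. This is a sketch under stated assumptions: the record layout, the functions `to_library_record` and `render_report`, and the sample path are hypothetical, JSON stands in for the unspecified library format, and a plain-text summary stands in for the visualized report.

```python
import json


def to_library_record(path):
    """Serialize a completed attack path for the local threat intelligence
    library (JSON is an assumed format, not specified by the source)."""
    return json.dumps({"type": "attack-path", "steps": path}, sort_keys=True)


def render_report(path):
    """Render the path as a numbered, human-readable summary."""
    lines = ["Attack tracing report"]
    for step, event in enumerate(path, start=1):
        lines.append(
            f"{step}. {event['tactic']}: {event['detail']} (weight {event['weight']})"
        )
    return "\n".join(lines)


path = [
    {"tactic": "Initial Access", "detail": "phishing attachment opened on ws-12", "weight": 2},
    {"tactic": "Command and Control", "detail": "beacon to 198.51.100.7", "weight": 1},
]
record = to_library_record(path)
report = render_report(path)
```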
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.