CN115632832A


Info

Publication number: CN115632832A
Application number: CN202211215456.0A
Authority: CN
Inventors: 张东升; 宋海山
Original assignee: Wenzhou Jiarun Technology Development Co ltd
Current assignee: Shanghai Baoyun Network Information Service Co ltd
Priority date: 2022-09-30
Filing date: 2022-09-30
Publication date: 2023-01-20
Anticipated expiration: 2042-09-30
Also published as: CN115632832B

Abstract

The invention relates to artificial intelligence technology and discloses a big data attack processing method applied to a cloud service, comprising the following steps: screening interception data from the received data of the cloud service according to an interception time domain, and extracting traffic data, program data and content data from the interception data; detecting an attack event from the traffic data; detecting malicious code from the program data; performing event mining on the content data to obtain a behavior event, performing big data association analysis on the attack event, the malicious code and the behavior event to obtain association attributes, and extracting attack features from the association attributes; updating the interception time domain according to the attack features, returning to the step of screening interception data from the received data of the cloud service according to the interception time domain to obtain standard attack features, and filtering the received data according to the standard attack features. The invention further provides a big data attack processing system applied to the cloud service. The invention can improve the efficiency of processing attacks on the cloud service.

Description

Big data attack processing method and system applied to cloud service

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a big data attack processing method and system applied to cloud service.

Background

With the development of computer technology and the arrival of the big data era, demand for cloud services keeps growing. To improve the security of cloud services and guarantee their working efficiency, a cloud service data processing system needs to be built to handle data attacks on the cloud services.

Most existing cloud service attack processing technologies detect a single kind of attack data through traffic analysis and then perform targeted interception. For example, based on session anomaly detection for the cloud service, the sending IP of the anomalous data is retrieved, and directional interception is then carried out according to that sending IP.

Disclosure of Invention

The invention provides a big data attack processing method and system applied to cloud services, and mainly aims to solve the problem of low efficiency in attack processing of the cloud services.

In order to achieve the above object, the big data attack processing method applied to the cloud service provided by the present invention includes:

screening interception data from received data of the cloud service according to the interception time domain, and extracting flow data, program data and content data from the interception data;

extracting flow characteristics from the flow data, and detecting an attack event from the flow data according to the flow characteristics;

extracting code features from the program data, and detecting malicious codes from the program data according to the code features;

performing event mining on the content data to obtain a behavior event, performing big data association analysis on the attack event, the malicious code and the behavior event to obtain an association attribute, and extracting attack features from the association attribute;

updating the interception time domain according to the attack characteristics, returning to the step of intercepting the received data of the cloud service according to the interception time domain to obtain standard attack characteristics, and filtering the received data according to the standard attack characteristics.

Optionally, the extracting traffic data, program data, and content data from the intercepted data includes:

carrying out flow monitoring on the intercepted data to obtain flow data;

screening program data from the intercepted data according to the data type;

and performing behavior tracking on the intercepted data to obtain content data.

Optionally, the screening program data from the intercepted data according to the data type includes:

splitting the intercepted data into a plurality of data file packets;

selecting the data file packages one by one as target file packages, and taking the suffix names of the target file packages as target suffix names;

judging whether the target suffix name is matched with a program suffix name in a preset program suffix library or not;

when the target suffix name is not matched with the program suffix name, returning to the step of selecting the data file packages one by one as the target file package;

and when the target suffix name matches the program suffix name, adding the target file package to the program data.

Optionally, the extracting a flow feature from the flow data includes:

dividing the flow data into a plurality of data streams, and tracing the data streams one by one to obtain a communication address set;

carrying out address verification on the addresses in the communication address set one by one to obtain the number of forged addresses, and calculating the speed increase of the forged addresses according to the number of the forged addresses;

selecting the data streams one by one as target data streams, taking the flow of the target data streams as target flow, and extracting a paired data proportion from the target data streams;

calculating the flow rate increase of the target data stream according to the target flow rate and the paired data proportion by using the following flow rate increase formula:

wherein G is the traffic acceleration, n is the target traffic, D is the pairwise data ratio, and T is the transmission time corresponding to the target data stream;

extracting the flow packet number, the flow bit number and the flow life cycle from the target data flow;

and integrating all the flow acceleration rates, the flow packet number, the flow bit number, the flow life cycle and the fake address acceleration rate into flow characteristics.

Optionally, the detecting an attack event from the traffic data according to the traffic characteristics includes:

selecting data streams in the flow data one by one as target data streams, and taking flow characteristics corresponding to the target data streams as target flow characteristics;

acquiring an average flow characteristic corresponding to the target flow characteristic from a preset flow characteristic library;

calculating a characteristic deviation value of the target flow characteristic by using a characteristic deviation algorithm as follows:

wherein $V$ is the characteristic deviation value; $\alpha$, $\beta$, $\gamma$, $\delta$ and $\varepsilon$ are preset traffic characteristic weights; $G$ is the traffic acceleration rate in the target traffic characteristic; $D$ is the fake address acceleration rate in the target traffic characteristic; $p$ is the traffic packet number in the target traffic characteristic and $\bar{p}$ is the average traffic packet number in the average traffic characteristic; $b$ is the traffic bit number in the target traffic characteristic and $\bar{b}$ is the average traffic bit number in the average traffic characteristic; $l$ is the traffic life cycle in the target traffic characteristic and $\bar{l}$ is the average traffic life cycle in the average traffic characteristic;

judging whether the characteristic deviation value is larger than a preset deviation threshold value or not;

when the characteristic deviation value is smaller than or equal to the deviation threshold value, returning to the step of selecting the data streams in the flow data one by one as target data streams;

and when the characteristic deviation value is larger than the deviation threshold value, taking the target data stream as an abnormal data stream, and taking an event corresponding to the target data stream as an attack event.

Optionally, the extracting the code feature from the program data includes:

dividing the program data into a plurality of program packages;

selecting the program packages one by one as target program packages, performing byte code conversion on the target program packages to obtain target program byte codes, and performing byte feature extraction on the target program byte codes by using a preset byte sliding window to obtain byte entropy;

performing character code conversion on the target program packet to obtain target program character codes, and performing character feature extraction on the target program character codes by using a preset character sliding window to obtain character entropy;

extracting an executable file header from the target program package, and performing protocol feature extraction on the executable file header to obtain a protocol array;

extracting numerical value information of the executable file header to obtain a compiled numerical value;

and taking the byte entropy, the character entropy, the protocol array and the compiling numerical value as code characteristics of the target program package.

Optionally, the detecting malicious code from the program data according to the code characteristics includes:

selecting program packages in the program data one by one as target program packages, and performing attribute detection on code features corresponding to the target program packages to obtain program attributes;

judging whether the program attribute is a malicious attribute;

when the program attribute is a malicious attribute, taking the target program package as malicious code;

when the program attribute is not a malicious attribute, extracting development attribute features from the remarks of developers of the target program package, extracting name attribute features from the package name of the target program package, and extracting program attribute features from the program attribute;

calculating the suspicion degree of the target program package according to the development attribute feature, the name attribute feature and the program attribute feature by using the following suspicion degree formula:

wherein a is the suspicion degree, ω is a preset suspicion degree confrontation coefficient, arccos is the inverse cosine function, P is the program attribute feature, K is the development attribute feature, and N is the name attribute feature;

judging whether the suspicious degree is larger than a preset suspicious threshold value or not;

when the suspicious degree is larger than the suspicious threshold value, taking the target program package as a malicious code;

and when the suspicious degree is smaller than or equal to the suspicious threshold value, returning to the step of selecting the program packages in the program data one by one as target program packages.

In order to solve the above problem, the present invention further provides a big data attack processing system applied to a cloud service, where the system includes:

the data classification module is used for screening interception data from the received data of the cloud service according to an interception time domain, and extracting flow data, program data and content data from the interception data;

the attack event module is used for extracting flow characteristics from the flow data and detecting an attack event from the flow data according to the flow characteristics;

the malicious code module is used for extracting code characteristics from the program data and detecting malicious codes from the program data according to the code characteristics;

the association analysis module is used for carrying out event mining on the content data to obtain a behavior event, carrying out big data association analysis on the attack event, the malicious code and the behavior event to obtain an association attribute, and extracting attack features from the association attribute;

and the attack processing module is used for updating the interception time domain according to the attack characteristics, returning to the step of intercepting the received data of the cloud service according to the interception time domain to obtain standard attack characteristics, and filtering the received data according to the standard attack characteristics.

According to the embodiment of the invention, the traffic data, the program data and the content data are extracted from the intercepted data, so that the intercepted data can be checked for security from three directions; hidden attack data can then be found from the relevance among the three detection results, which improves the identification rate of attack data. By extracting the traffic characteristics from the traffic data and detecting attack events from the traffic data according to those characteristics, attack events are detected from abnormal fluctuations of the traffic data and data attacks are preliminarily screened. By extracting the code features from the program data, the step of executing the program data is omitted, which saves time in detecting malicious code.

Detecting malicious code from the program data by means of attribute detection and suspicion degree judgment improves both the identification rate and the detection precision of malicious code detection. Performing big data association analysis on the attack events, the malicious code and the behavior events to obtain the association attributes, and extracting attack features from the association attributes, allows the traffic, the user behavior and the behavior events of malicious programs to be analyzed in association, which speeds up big data processing and improves the efficiency of attack processing. Updating the interception time domain according to the attack features to obtain the standard attack features allows the duration of the detection period to be adjusted according to the preliminary detection result, which ensures the comprehensiveness and applicability of attack detection and further improves the efficiency of attack processing. Therefore, the big data attack processing method and system applied to the cloud service can solve the problem of low efficiency when processing attacks on the cloud service.

Drawings

Fig. 1 is a schematic flowchart of a big data attack processing method applied to a cloud service according to an embodiment of the present invention;

Fig. 2 is a schematic flowchart of extracting traffic characteristics according to an embodiment of the present invention;

Fig. 3 is a schematic flowchart of big data association analysis according to an embodiment of the present invention;

Fig. 4 is a system architecture diagram of a big data attack processing system applied to a cloud service according to an embodiment of the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The embodiment of the application provides a big data attack processing method applied to cloud services. The execution subject of the big data attack processing method applied to the cloud service includes but is not limited to at least one of electronic devices such as a server and a terminal which can be configured to execute the method provided by the embodiment of the application. In other words, the big data attack processing method applied to the cloud service may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes but is not limited to: a single server, a server cluster, a cloud server or a cloud server cluster, and the like. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a web service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), and a big data and artificial intelligence platform.

Fig. 1 is a schematic flow diagram of a big data attack processing method applied to a cloud service according to an embodiment of the present invention. In this embodiment, the big data attack processing method applied to the cloud service includes:

S1, screening interception data from received data of the cloud service according to an interception time domain, and extracting traffic data, program data and content data from the interception data.

In the embodiment of the present invention, the intercepting time domain refers to the length of the intercepting time.

In detail, the cloud service refers to a cloud computing service, that is, a cloud computing product that can be provided and used as a service, and includes a cloud host, a cloud space, cloud development, cloud testing, a comprehensive product, and the like.

In the embodiment of the present invention, the screening of the intercepted data from the received data of the cloud service according to the intercepted time domain means that data corresponding to the time length of the intercepted time domain is intercepted from the received data as the intercepted data.
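The screening by interception time domain described above can be pictured as selecting the received records whose timestamps fall inside a window of the given length. The following is a minimal sketch under stated assumptions: the `Record` type, the `screen_intercepted` function and the use of seconds as the time unit are hypothetical and are not specified by the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One piece of received data (hypothetical structure)."""
    timestamp: float  # arrival time in seconds (assumed unit)
    payload: bytes


def screen_intercepted(received, window_start, window_seconds):
    """Keep the records that arrived in [window_start, window_start + window_seconds)."""
    window_end = window_start + window_seconds
    return [r for r in received if window_start <= r.timestamp < window_end]


received = [Record(0.5, b"a"), Record(3.0, b"b"), Record(12.0, b"c")]
intercepted = screen_intercepted(received, window_start=0.0, window_seconds=10.0)
```

With a 10-second interception time domain, only the first two records are kept as intercepted data.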

In detail, the traffic data refers to information of communication data generated when the cloud service performs network communication in the intercepted data, and includes data such as traffic size, IP address of communication, communication duration, and the like.

Specifically, the program data refers to information related to a program in the intercepted data, such as data of an installation package, application codes and the like of software.

In detail, the content data refers to the network structure data and the content carried in the intercepted data, for example pictures, videos and webpage information.

In this embodiment of the present invention, the extracting traffic data, program data, and content data from the intercepted data includes:

monitoring the traffic of the intercepted data to obtain the traffic data;

screening the program data from the intercepted data according to the data type;

and performing behavior tracking on the intercepted data to obtain the content data.

In detail, the intercepted data may be monitored in real time by using traffic monitoring software such as the Multi Router Traffic Grapher (MRTG) or a Sniffer (Sniffer port), so as to obtain the traffic data.

In this embodiment of the present invention, the screening program data from the intercepted data according to the data type includes:

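splitting the intercepted data into a plurality of data file packages;

selecting the data file packages one by one as a target file package, and taking the suffix name of the target file package as a target suffix name;

judging whether the target suffix name matches a program suffix name in a preset program suffix library;

when the target suffix name does not match any program suffix name, returning to the step of selecting the data file packages one by one as the target file package;

and when the target suffix name matches a program suffix name, adding the target file package to the program data.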

Specifically, the program suffix library is a character set containing various program suffix names, such as java, exe and apk.
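The suffix-based screening of program data can be sketched as follows. This is an illustration only: the function name `screen_program_data`, the case-insensitive comparison and the three-entry library are assumptions, and a real preset library would contain many more suffix names.

```python
from pathlib import PurePosixPath

# Suffix names mentioned in the text; a real preset library would be larger.
PROGRAM_SUFFIXES = {"java", "exe", "apk"}


def screen_program_data(file_names, suffix_library=PROGRAM_SUFFIXES):
    """Return the file packages whose suffix name appears in the suffix library."""
    program_data = []
    for name in file_names:
        # Target suffix name, without the leading dot, compared case-insensitively.
        suffix = PurePosixPath(name).suffix.lstrip(".").lower()
        if suffix in suffix_library:
            program_data.append(name)
    return program_data


packets = ["setup.EXE", "photo.png", "app.apk", "README"]
program_data = screen_program_data(packets)
```

File packages without a matching suffix, such as `photo.png` and `README`, are skipped and the loop moves on to the next package, mirroring the "return to the selecting step" branch.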

Specifically, the behavior tracking refers to extracting a behavior log from the intercepted data, so as to obtain the content data.

In the embodiment of the invention, the traffic data, the program data and the content data are extracted from the intercepted data, so that the intercepted data can be subjected to security detection from three directions, the hidden attack data can be conveniently found according to the relevance among the security detection results in the three directions, and the identification rate of the attack data is improved.

S2, extracting flow characteristics from the flow data, and detecting an attack event from the flow data according to the flow characteristics.

In the embodiment of the present invention, referring to fig. 2, the extracting the traffic characteristics from the traffic data includes:

S21, dividing the traffic data into a plurality of data streams, and tracing the data streams one by one to obtain a communication address set;

S22, carrying out address verification on the addresses in the communication address set one by one to obtain the number of forged addresses, and calculating the fake address acceleration rate according to the number of forged addresses;

S23, selecting the data streams one by one as a target data stream, taking the traffic of the target data stream as the target traffic, and extracting a paired data proportion from the target data stream;

S24, calculating the traffic acceleration rate of the target data stream according to the target traffic and the paired data proportion by using the traffic acceleration formula, in which G is the traffic acceleration rate, n is the target traffic, D is the paired data proportion, and T is the transmission time corresponding to the target data stream;

S25, extracting the traffic packet number, the traffic bit number and the traffic life cycle from the target data stream;

S26, integrating all the traffic acceleration rates, the traffic packet number, the traffic bit number, the traffic life cycle and the fake address acceleration rate into the traffic characteristics.

In detail, the data stream may be subjected to data tracing through methods such as ping communication or communication logs, so as to obtain a communication address set.

Specifically, calculating the fake address acceleration rate according to the number of forged addresses means dividing the number of forged addresses by the transmission time corresponding to the target data stream.
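The division described above can be sketched in a few lines. The embodiment does not specify how address verification is performed, so the `looks_forged` check below (unparsable, unspecified or multicast source addresses count as forged) is purely an illustrative assumption, as are the function names.

```python
import ipaddress


def looks_forged(address: str) -> bool:
    """Hypothetical address verification: flag addresses that cannot be a real unicast source."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return True  # not a syntactically valid IP address
    return ip.is_unspecified or ip.is_multicast


def forged_address_rate(addresses, transmission_seconds, is_forged=looks_forged):
    """Number of forged addresses divided by the transmission time of the data stream."""
    if transmission_seconds <= 0:
        raise ValueError("transmission time must be positive")
    forged_count = sum(1 for address in addresses if is_forged(address))
    return forged_count / transmission_seconds


addresses = ["8.8.8.8", "0.0.0.0", "999.1.1.1", "224.0.0.1"]
rate = forged_address_rate(addresses, transmission_seconds=2.0)
```

Here three of the four addresses fail the assumed verification, so the rate is three forged addresses over two seconds.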

In detail, the number of traffic packets refers to the number of data packets of the target data stream.

In the embodiment of the invention, the traffic acceleration rate of the target data stream is calculated according to the target traffic and the paired data proportion by using the traffic acceleration formula, so that repeatedly received data caused by communication failures can be excluded and the traffic trend can be reflected more accurately.

In detail, the detecting an attack event from the traffic data according to the traffic characteristics includes:

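selecting the data streams in the traffic data one by one as a target data stream, and taking the traffic characteristics corresponding to the target data stream as target traffic characteristics;

acquiring the average traffic characteristics corresponding to the target traffic characteristics from a preset traffic characteristic library;

calculating a characteristic deviation value of the target traffic characteristics by using the characteristic deviation algorithm, which compares the traffic acceleration rate, the fake address acceleration rate, the traffic packet number, the traffic bit number and the traffic life cycle in the target traffic characteristics with the average traffic packet number, the average traffic bit number and the average traffic life cycle in the average traffic characteristics, weighted by the preset traffic characteristic weights;

judging whether the characteristic deviation value is larger than a preset deviation threshold;

when the characteristic deviation value is smaller than or equal to the deviation threshold, returning to the step of selecting the data streams in the traffic data one by one as the target data stream;

and when the characteristic deviation value is larger than the deviation threshold, taking the target data stream as an abnormal data stream, and taking the event corresponding to the target data stream as an attack event.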

In detail, the traffic characteristic library contains the average traffic characteristics of data streams similar to the target data stream, obtained by big data query.

In detail, the traffic characteristic weights are obtained in advance by training on a plurality of abnormal traffic test experiments.

In the embodiment of the invention, the characteristic deviation value of the target traffic characteristic is calculated by using the characteristic deviation algorithm, so that abnormal traffic can be detected from multiple dimensions and the detection rate of attack events is improved.
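The exact characteristic deviation formula is not reproduced in this text, so the sketch below substitutes an assumed form: a weighted sum of the two acceleration rates and of the relative gaps between the packet number, bit number and life cycle and their library averages. It is meant only to illustrate the threshold decision, not the formula of the embodiment; all names, weights and thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrafficFeatures:
    growth: float         # G, traffic acceleration rate
    forged_growth: float  # D, fake address acceleration rate
    packets: float        # p, traffic packet number
    bits: float           # b, traffic bit number
    lifetime: float       # l, traffic life cycle


def relative_gap(value, mean):
    """Relative distance of a value from its library average."""
    return abs(value - mean) / mean


def deviation_value(target, average, weights):
    """ASSUMED deviation form: weighted sum of rates and relative gaps."""
    alpha, beta, gamma, delta, epsilon = weights
    return (alpha * target.growth
            + beta * target.forged_growth
            + gamma * relative_gap(target.packets, average.packets)
            + delta * relative_gap(target.bits, average.bits)
            + epsilon * relative_gap(target.lifetime, average.lifetime))


def is_attack(target, average, weights, threshold):
    """A data stream is abnormal when its deviation exceeds the threshold."""
    return deviation_value(target, average, weights) > threshold


average = TrafficFeatures(0.0, 0.0, 100.0, 8000.0, 10.0)
normal = TrafficFeatures(0.1, 0.0, 110.0, 8400.0, 10.0)
burst = TrafficFeatures(2.0, 1.5, 400.0, 32000.0, 1.0)
weights = (1.0, 1.0, 1.0, 1.0, 1.0)
```

Under these assumed weights and a threshold of 1.0, the `normal` stream stays below the threshold while the `burst` stream is treated as an attack event.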

In the embodiment of the invention, the flow characteristics are extracted from the flow data, the attack event is detected from the flow data according to the flow characteristics, the attack event can be detected according to the abnormal fluctuation of the flow data, and the data attack can be preliminarily screened.

S3, extracting code features from the program data, and detecting malicious code from the program data according to the code features.

In this embodiment of the present invention, the extracting code features from the program data includes:

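dividing the program data into a plurality of program packages;

selecting the program packages one by one as a target program package, performing byte code conversion on the target program package to obtain target program byte codes, and performing byte feature extraction on the target program byte codes by using a preset byte sliding window to obtain a byte entropy;

performing character code conversion on the target program package to obtain target program character codes, and performing character feature extraction on the target program character codes by using a preset character sliding window to obtain a character entropy;

extracting an executable file header from the target program package, and performing protocol feature extraction on the executable file header to obtain a protocol array;

extracting numerical information of the executable file header to obtain a compiled numerical value;

and taking the byte entropy, the character entropy, the protocol array and the compiled numerical value as the code features of the target program package.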

Specifically, the character code conversion refers to converting the target package into a file in an ASCII code format.
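The byte entropy and character entropy obtained with a preset sliding window can be illustrated with Shannon entropy computed over successive windows. This is a sketch under assumptions: the embodiment does not state which entropy measure, window size or step it uses, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy, in bits per symbol, of one window of bytes."""
    total = len(chunk)
    counts = Counter(chunk)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def windowed_entropy(data: bytes, window: int, step: int) -> list:
    """Entropy of each sliding window of `window` bytes, advanced by `step` bytes."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [shannon_entropy(data[i:i + window])
            for i in range(0, len(data) - window + 1, step)]


# Byte entropy works on the raw bytes; character entropy can reuse the same
# routine on the ASCII form of the package.
byte_entropy = windowed_entropy(b"aaaa" + b"abcd", window=4, step=4)
char_entropy = windowed_entropy("hello world!".encode("ascii"), window=6, step=6)
```

A window of identical bytes has zero entropy, while a window of four distinct bytes reaches two bits per symbol; packed or encrypted regions of a program tend toward the high end of this scale.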

In detail, the executable file header refers to a Portable Executable (PE) file header.

Specifically, performing protocol feature extraction on the executable file header to obtain the protocol array refers to hashing the file names and function names in the Import Address Table (IAT) referenced by the executable file header, and forming the protocol array from the hashing results.
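One way to turn hashed import names into a fixed-length array is feature hashing, sketched below. The bucket count, the use of SHA-256 and the `protocol_array` function are assumptions chosen for illustration; the text only states that file names and function names are hashed and that the array is formed from the results. The import pairs are supplied directly here rather than parsed from a real file.

```python
import hashlib


def protocol_array(imports, size=16):
    """Hash (file name, function name) pairs into a fixed-length count array."""
    buckets = [0] * size
    for dll_name, function_name in imports:
        key = f"{dll_name.lower()}:{function_name}".encode("utf-8")
        digest = hashlib.sha256(key).digest()
        # Use the first four bytes of the digest to pick a bucket.
        index = int.from_bytes(digest[:4], "big") % size
        buckets[index] += 1
    return buckets


imports = [
    ("KERNEL32.dll", "CreateFileW"),
    ("KERNEL32.dll", "WriteFile"),
    ("WS2_32.dll", "connect"),
]
array = protocol_array(imports)
```

Because a cryptographic hash is used instead of Python's built-in `hash`, the same imports always map to the same buckets across runs, which keeps the feature comparable between program packages.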

In detail, extracting the numerical information of the executable file header to obtain the compiled numerical value may consist of extracting numerical information such as the compilation time from the executable file header to form information such as a timestamp.
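As an illustration of reading the compilation time, the sketch below pulls the `TimeDateStamp` field out of a PE file header using the standard layout: the 4-byte offset to the PE signature sits at 0x3C, and the timestamp follows the signature, the machine field and the section count. The helper name `pe_compile_timestamp` and the synthetic header are hypothetical; real files should be parsed with more validation.

```python
import struct


def pe_compile_timestamp(data: bytes) -> int:
    """Return the COFF TimeDateStamp (seconds since the Unix epoch) of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # After the 4-byte signature: Machine (2 bytes), NumberOfSections (2 bytes),
    # then TimeDateStamp (4 bytes).
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return timestamp


# Build a minimal synthetic header for demonstration purposes only.
header = bytearray(0x80 + 24)
header[0:2] = b"MZ"
struct.pack_into("<I", header, 0x3C, 0x80)
header[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<HHI", header, 0x84, 0x8664, 1, 1664496000)
sample = bytes(header)
compiled_value = pe_compile_timestamp(sample)
```

The extracted integer can then be used directly as the compiled numerical value, or converted to a calendar date if the detector works with dates.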

In detail, the detecting malicious code from the program data according to the code characteristics includes:

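selecting the program packages in the program data one by one as a target program package, and performing attribute detection on the code features corresponding to the target program package to obtain a program attribute;

judging whether the program attribute is a malicious attribute;

when the program attribute is a malicious attribute, taking the target program package as malicious code;

when the program attribute is not a malicious attribute, extracting development attribute features from the developer remarks of the target program package, extracting name attribute features from the package name of the target program package, and extracting program attribute features from the program attribute;

calculating the suspicion degree of the target program package according to the development attribute feature, the name attribute feature and the program attribute feature by using the suspicion degree formula;

judging whether the suspicion degree is larger than a preset suspicion threshold;

when the suspicion degree is larger than the suspicion threshold, taking the target program package as malicious code;

and when the suspicion degree is smaller than or equal to the suspicion threshold, returning to the step of selecting the program packages in the program data one by one as the target program package.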

Specifically, a pre-trained binary convolutional neural network may be used to perform attribute detection on the code features corresponding to the target program package, so as to obtain the program attribute.

In detail, the extracting of the development attribute features from the developer notes of the target package means extracting keywords from the developer notes, and performing word vector conversion on the keywords to obtain the development attribute features.

Specifically, the method for extracting the name attribute features from the package name of the target package is consistent with the method for extracting the development attribute features from the remarks of the developer of the target package, and is not described herein again.

Specifically, the method for extracting the program attribute features from the program attributes is consistent with the method for extracting the development attribute features from the developer notes of the target package, and is not described herein again.
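The two-stage decision for malicious code (attribute detection first, suspicion degree second) can be sketched as a simple loop. The classifier and the suspicion degree function are passed in as placeholders, because the trained network and the suspicion degree formula are not reproduced in this text; the dictionary fields and the threshold are hypothetical.

```python
def detect_malicious(packages, classify, suspicion, threshold):
    """Return the packages flagged by attribute detection or by the suspicion degree."""
    malicious = []
    for package in packages:
        if classify(package):
            # Stage 1: the program attribute is malicious.
            malicious.append(package)
        elif suspicion(package) > threshold:
            # Stage 2: not flagged outright, but the suspicion degree is too high.
            malicious.append(package)
        # Otherwise move on to the next program package.
    return malicious


packages = [
    {"name": "a.exe", "flagged": True, "score": 0.1},
    {"name": "b.apk", "flagged": False, "score": 0.9},
    {"name": "c.java", "flagged": False, "score": 0.2},
]
found = detect_malicious(
    packages,
    classify=lambda p: p["flagged"],   # stand-in for the trained classifier
    suspicion=lambda p: p["score"],    # stand-in for the suspicion degree formula
    threshold=0.5,
)
```

The second stage only runs for packages that pass attribute detection, so the more expensive feature extraction from developer remarks and package names is skipped for packages already known to be malicious.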

In the embodiment of the invention, the step of executing the program data is omitted by extracting the code features from the program data, the time for detecting the malicious codes is saved, the identification rate of the malicious code detection can be improved by detecting the malicious codes from the program data by utilizing the attribute detection and the suspicion degree judgment, and the detection precision of the malicious codes is improved.

S4, carrying out event mining on the content data to obtain a behavior event, carrying out big data association analysis on the attack event, the malicious code and the behavior event to obtain an association attribute, and extracting attack features from the association attribute.

In this embodiment of the present invention, the performing event mining on the content data to obtain a behavior event includes:

performing data cleaning on the content data to obtain standard content data;

and extracting a modification event and an abnormal event from the standard content data, and converging the modification event and the abnormal event into a behavior event.

In detail, the modification events and abnormal events may be extracted from the standard content data by querying the API call history of the system.

In detail, referring to fig. 3, the performing big data association analysis on the attack event, the malicious code, and the behavior event to obtain an association attribute includes:

S41, adding a timestamp to the attack event to obtain an attack event incremental curve, adding a timestamp to the malicious code to obtain a malicious code incremental curve, and adding a timestamp to the behavior event to obtain a behavior event incremental curve;

S42, segmenting the attack event incremental curve, the malicious code incremental curve and the behavior event incremental curve by using a preset time window to obtain a plurality of time domain incremental segments;

S43, selecting the time domain incremental segments one by one as a target time domain segment, and calculating the event correlation degree among the attack events, the malicious code and the behavior events in the target time domain segment by using the following correlation degree algorithm:

wherein $C$ is the event correlation degree, $m$ is the total duration of the preset time window, $\theta$ is a preset correlation degree confrontation coefficient, $i$ denotes the $i$-th moment in the preset time window, $x_i$ is the value of the attack event incremental curve at the $i$-th moment in the preset time window, $\bar{x}$ is the average value of the attack event incremental curve over the preset time window, $y_i$ is the value of the malicious code incremental curve at the $i$-th moment in the preset time window, $\bar{y}$ is the average value of the malicious code incremental curve over the preset time window, $z_i$ is the value of the behavior event incremental curve at the $i$-th moment in the preset time window, and $\bar{z}$ is the average value of the behavior event incremental curve over the preset time window;

and S44, taking the target time domain segment with the event correlation degree larger than a preset correlation threshold value as a correlation time domain segment, and extracting correlation attributes from the correlation time domain segment.
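Steps S41 and S42 can be sketched as binning timestamped events into an incremental curve and then cutting that curve into fixed-length time windows. The bin width, the window length and the function names below are assumptions made for illustration; the embodiment does not fix them.

```python
def incremental_curve(timestamps, start, end, step=1.0):
    """Count the events that fall into each [t, t + step) bin between start and end."""
    n_bins = int((end - start) // step)
    curve = [0] * n_bins
    for t in timestamps:
        index = int((t - start) // step)
        if 0 <= index < n_bins:
            curve[index] += 1
    return curve


def segment(curve, window):
    """Split a curve into consecutive, non-overlapping segments of `window` bins."""
    return [curve[i:i + window] for i in range(0, len(curve) - window + 1, window)]


attack_times = [0.2, 0.7, 1.1, 3.5, 3.6, 3.9]
attack_curve = incremental_curve(attack_times, start=0, end=6, step=1.0)
segments = segment(attack_curve, window=3)
```

The same two functions would be applied to the malicious code and behavior event timestamps, so that the three curves are cut at identical window boundaries and can be compared segment by segment.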

In the embodiment of the invention, the correlation degree algorithm is used to calculate the event correlation degree among the attack events, the malicious code and the behavior events in the target time domain segment, so that the relationship between the activity levels of the attack events, the malicious code and the behavior events can be characterized.
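The correlation degree algorithm itself is not reproduced in this text. Since its symbols are curve values and their window averages, the sketch below substitutes an assumed stand-in: the mean of the three pairwise Pearson correlations of the curves within one window, scaled by a coefficient `theta`. It should be read as one plausible way to score co-movement of the three curves, not as the formula of the embodiment.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def event_correlation(x, y, z, theta=1.0):
    """ASSUMED stand-in: scaled mean of the pairwise correlations of x, y and z."""
    return theta * (pearson(x, y) + pearson(y, z) + pearson(x, z)) / 3


attack = [1, 2, 3, 4]
malware = [2, 4, 6, 8]
behavior = [1, 3, 5, 7]
correlated = event_correlation(attack, malware, behavior)
unrelated = event_correlation(attack, malware, [5, 5, 5, 5])
```

Segments whose score exceeds the preset correlation threshold would then be kept as correlation time domain segments, as in step S44.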

Specifically, extracting the association attributes from the association time domain segment refers to extracting an attack intensity attribute from the attack event corresponding to the association time domain segment, extracting an attack type attribute from the malicious code corresponding to the association time domain segment, and extracting an attack behavior attribute from the behavior event corresponding to the association time domain segment.

In detail, the method for extracting the correlation attribute from the correlation time domain segment is consistent with the method for extracting the development attribute feature from the developer remark of the target package in step S3, and is not described herein again.

In detail, the extracting of the attack features from the associated attributes refers to performing time domain feature analysis on the attack intensity attributes, the attack type attributes and the attack behavior attributes of the associated attributes to obtain the attack features.

Specifically, performing time domain feature analysis on the attack intensity attribute, the attack type attribute and the attack behavior attribute of the association attributes to obtain the attack features comprises: predicting, from the attack type attribute and the attack behavior attribute, the predicted attack intensity attribute corresponding to the current interception time domain; taking the difference between the attack intensity attribute and the predicted attack intensity attribute as the attack time domain feature; extracting an attack type feature from the attack type attribute; extracting an attack behavior feature from the attack behavior attribute; and converging the attack time domain feature, the attack type feature and the attack behavior feature into the attack features.

In the embodiment of the invention, the association attribute is obtained by performing big data association analysis on the attack event, the malicious code and the behavior event, and the attack characteristic is extracted from the association attribute, so that the association analysis can be effectively performed on the flow, the user behavior and the behavior event among malicious programs, the speed of big data processing is increased, and the efficiency of attack processing is also improved.

S5, updating the interception time domain according to the attack characteristics, returning to the step of intercepting the received data of the cloud service according to the interception time domain to obtain standard attack characteristics, and filtering the received data according to the standard attack characteristics.

In the embodiment of the present invention, the updating the interception time domain according to the attack feature means adjusting the duration of the interception time domain according to the attack time domain feature in the attack feature.
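One possible way to adjust the duration is sketched below. The text only states that the duration is adjusted according to the attack time domain feature; the direction of the adjustment (shortening the window when the observed intensity exceeds the prediction), the `gain` factor and the clamping bounds are all assumptions made for illustration.

```python
def update_interception_duration(
    duration_s: float,
    time_domain_feature: float,
    gain: float = 0.1,
    min_s: float = 1.0,
    max_s: float = 3600.0,
) -> float:
    """Return a new interception duration in seconds.

    Assumed rule: a positive feature (attack stronger than predicted) shortens
    the window so that detection runs more often; a negative feature lengthens it.
    """
    scale = 1.0 + gain * abs(time_domain_feature)
    if time_domain_feature > 0:
        updated = duration_s / scale
    else:
        updated = duration_s * scale
    # Keep the duration inside hypothetical operating bounds.
    return max(min_s, min(max_s, updated))
```

For example, with the default `gain`, a 60 second window and a feature value of 5.0 give a shortened window of about 40 seconds.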

In detail, the filtering the received data according to the standard attack features refers to adding the standard attack features into a preset security analysis feature library and intercepting data corresponding to the standard attack features.
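The filtering step can be sketched as a signature lookup. The sketch assumes that each received record carries a set of feature labels under a `"features"` key and that a record is intercepted when it shares at least one label with the library; neither the record layout nor the matching rule is specified in the text, and the class and function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


class FeatureLibrary:
    """Hypothetical stand-in for the preset security analysis feature library."""

    def __init__(self, preset: Iterable[str] = ()) -> None:
        self._signatures = set(preset)

    def add(self, standard_features: Iterable[str]) -> None:
        # Add the standard attack features obtained from the detection loop.
        self._signatures.update(standard_features)

    def matches(self, record_features: Iterable[str]) -> bool:
        # Assumed rule: any shared feature label marks the record as an attack.
        return not self._signatures.isdisjoint(record_features)


def filter_received(
    records: List[Dict], library: FeatureLibrary
) -> Tuple[List[Dict], List[Dict]]:
    """Split records into (passed, intercepted) according to the library."""
    passed, intercepted = [], []
    for record in records:
        if library.matches(record["features"]):
            intercepted.append(record)
        else:
            passed.append(record)
    return passed, intercepted
```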

In the embodiment of the invention, the interception time domain is updated according to the attack characteristics to obtain the standard attack characteristics, so that the detection period duration can be adjusted according to the preliminary detection result, which ensures the comprehensiveness and applicability of the attack detection and improves the attack processing efficiency.

According to the embodiment of the invention, the traffic data, the program data and the content data are extracted from the intercepted data, so that the intercepted data can be checked for security from three directions; hidden attack data can then be found from the relevance among the three sets of detection results, which improves the identification rate of attack data. By extracting the traffic features from the traffic data and detecting the attack event from the traffic data according to the traffic features, data attacks can be preliminarily screened according to abnormal fluctuations of the traffic data. By extracting the code features from the program data, the step of executing the program data is omitted, which saves time in detecting malicious code.

Judging whether malicious code is detected from the program data by using both attribute detection and the suspicious degree improves the identification rate and the precision of malicious code detection. Performing big data association analysis on the attack events, the malicious code and the behavior events to obtain the association attributes, and extracting the attack features from the association attributes, allows association analysis to be effectively performed across the traffic, the user behavior and the malicious programs, which speeds up big data processing and improves the attack processing efficiency. Updating the interception time domain according to the attack features to obtain the standard attack features allows the detection period duration to be adjusted according to the preliminary detection result, which ensures the comprehensiveness and applicability of the attack detection and improves the attack processing efficiency. Therefore, the big data attack processing method applied to the cloud service can solve the problem of low efficiency when attack processing is carried out on the cloud service.

The big data attack processing system 100 applied to the cloud service can be installed in an electronic device. According to the implemented functions, the big data attack processing system 100 applied to the cloud service may include a data classification module 101, an attack event module 102, a malicious code module 103, an association analysis module 104, and an attack processing module 105. The module of the present invention, which may also be referred to as a unit, refers to a series of computer program segments that can be executed by a processor of an electronic device and that can perform a fixed function, and that are stored in a memory of the electronic device.

In the present embodiment, the functions of the respective modules/units are as follows:

the data classification module 101 is configured to screen interception data from received data of a cloud service according to an interception time domain, and extract traffic data, program data, and content data from the interception data;

the attack event module 102 is configured to extract traffic characteristics from the traffic data, and detect an attack event from the traffic data according to the traffic characteristics;

the malicious code module 103 is configured to extract code features from the program data, and detect a malicious code from the program data according to the code features;

the association analysis module 104 is configured to perform event mining on the content data to obtain a behavior event, perform big data association analysis on the attack event, the malicious code, and the behavior event to obtain an association attribute, and extract an attack feature from the association attribute;

the attack processing module 105 is configured to update the interception time domain according to the attack characteristics, return to the step of intercepting the received data of the cloud service according to the interception time domain to obtain standard attack characteristics, and filter the received data according to the standard attack characteristics.

In detail, when used, each module in the big data attack processing system 100 applied to cloud services in the embodiment of the present invention adopts the same technical means as the big data attack processing method applied to cloud services described in fig. 1 to fig. 3, and can produce the same technical effect, which is not described herein again.
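The way the five modules hand data to one another can be sketched as a small pipeline. The sketch shows only the wiring between modules 101 to 105; the class name, the callable signatures and the single-pass `run` method are assumptions, and the detection logic of each module is left to the injected functions.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class AttackProcessingSystem:
    # Each field stands for one module; the signatures are hypothetical.
    classify: Callable[[Any, float], Tuple[Any, Any, Any]]  # module 101
    detect_attack_events: Callable[[Any], Any]              # module 102
    detect_malicious_code: Callable[[Any], Any]             # module 103
    analyze_association: Callable[[Any, Any, Any], Any]     # module 104
    process_attack: Callable[[Any, Any, float], Any]        # module 105

    def run(self, received: Any, interception_window: float) -> Any:
        """Run one pass of the pipeline over the received data."""
        traffic, program, content = self.classify(received, interception_window)
        events = self.detect_attack_events(traffic)
        malicious = self.detect_malicious_code(program)
        features = self.analyze_association(content, events, malicious)
        # Module 105 updates the interception time domain and filters the data.
        return self.process_attack(received, features, interception_window)
```

Injecting the modules as callables mirrors the statement that the module division is only one logical functional division: any module can be replaced or merged without changing the others.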

In the embodiments provided in the present invention, it should be understood that the disclosed method and system can be implemented in other ways. For example, the above-described system embodiments are merely illustrative: the division of the modules is only one logical functional division, and other divisions may be adopted in practice.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.

It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.

The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.

The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Artificial intelligence (AI) is a theory, method, technology and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain the best result.

Furthermore, it will be obvious that the term "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.

Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention and not to limit the same, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. A big data attack processing method applied to cloud services is characterized by comprising the following steps:

S1: screening interception data from the received data of the cloud service according to an interception time domain, and extracting traffic data, program data and content data from the interception data;

S2: extracting traffic characteristics from the traffic data, and detecting an attack event from the traffic data according to the traffic characteristics;

S3: extracting code features from the program data, and detecting malicious codes from the program data according to the code features;

S4: performing event mining on the content data to obtain a behavior event, performing big data association analysis on the attack event, the malicious code and the behavior event to obtain an association attribute, and extracting attack features from the association attribute, wherein the big data association analysis on the attack event, the malicious code and the behavior event to obtain the association attribute comprises:

S41: adding a timestamp to the attack event to obtain an attack event incremental curve, adding a timestamp to the malicious code to obtain a malicious code incremental curve, and adding a timestamp to the behavior event to obtain a behavior event incremental curve;

S42: segmenting the attack event incremental curve, the malicious code incremental curve and the behavior event incremental curve by using a preset time window to obtain a plurality of time domain incremental segments;

S43: selecting the time domain increment segments one by one as target time domain segments, and calculating the event correlation degrees among the attack events, the malicious codes and the behavior events in the target time domain segments by utilizing the following correlation degree algorithm:

wherein C is the event correlation degree, m is the total duration of the preset time window, θ is a preset correlation degree confrontation coefficient, i denotes the ith moment in the preset time window, x_i is the value of the attack event incremental curve corresponding to the ith moment in the preset time window,

is the average value of the attack event incremental curve corresponding to the preset time window, y_i is the value of the malicious code incremental curve corresponding to the ith moment in the preset time window,

S44: taking the target time domain segment with the event correlation degree larger than a preset correlation threshold value as a correlation time domain segment, and extracting correlation attributes from the correlation time domain segment;

S5: updating the interception time domain according to the attack characteristics, returning to the step of intercepting the received data of the cloud service according to the interception time domain to obtain standard attack characteristics, and filtering the received data according to the standard attack characteristics.

2. The big data attack processing method applied to the cloud service, as claimed in claim 1, wherein the extracting of the traffic data, the program data, and the content data from the intercepted data includes:

monitoring the flow of the intercepted data to obtain flow data;

screening program data from the intercepted data according to the data type;

3. The big data attack processing method applied to the cloud service, according to claim 2, wherein the screening of the program data from the intercepted data according to the data type includes:

splitting the intercepted data into a plurality of data file packets;

and when the target suffix name is matched with the program suffix name, adding the target file packet into program data.

4. The big data attack processing method applied to the cloud service, according to claim 1, wherein the extracting of the traffic features from the traffic data includes:

calculating the flow rate increase of the target data flow according to the target flow rate and the paired data proportion by using the following flow rate increase formula:

5. The big data attack processing method applied to the cloud service, according to the traffic characteristics, wherein the detecting the attack event from the traffic data according to the traffic characteristics comprises:

is the average traffic lifetime in the average traffic signature;

6. The big data attack processing method applied to the cloud service, as claimed in claim 1, wherein the extracting of the code features from the program data comprises:

dividing the program data into a plurality of program packages;

selecting the program packages one by one as target program packages, performing byte code conversion on the target program packages to obtain target program byte codes, and performing byte feature extraction on the target program byte codes by using a preset byte sliding window to obtain byte entropies;

performing character code conversion on the target program package to obtain a target program character code, and performing character feature extraction on the target program character code by using a preset character sliding window to obtain a character entropy;

7. The big data attack processing method applied to the cloud service, according to the code characteristics, wherein the detecting the malicious code from the program data comprises:

judging whether the program attribute is a malicious attribute;

when the program attribute is not a malicious attribute, extracting development attribute features from the remarks of the developers of the target program package, extracting name attribute features from the package name of the target program package, and extracting program attribute features from the program attribute;

wherein, a refers to the doubtful degree, ω is a preset doubtful degree confrontation coefficient, arccos is an inverse cosine function, P refers to the program attribute feature, K refers to the development attribute feature, and N refers to the name attribute feature;

8. A big data attack processing system applied to cloud services, characterized in that the system comprises:

the association analysis module is configured to perform event mining on the content data to obtain a behavior event, perform big data association analysis on the attack event, the malicious code, and the behavior event to obtain an association attribute, and extract an attack feature from the association attribute, where the performing big data association analysis on the attack event, the malicious code, and the behavior event to obtain the association attribute includes: adding a timestamp to the attack event to obtain an attack event incremental curve, adding a timestamp to the malicious code to obtain a malicious code incremental curve, and adding a timestamp to the behavior event to obtain a behavior event incremental curve; segmenting the attack event incremental curve, the malicious code incremental curve and the behavior event incremental curve by using a preset time window to obtain a plurality of time domain incremental segments; selecting the time domain increment segments one by one as target time domain segments, and calculating the event correlation degrees among the attack events, the malicious codes and the behavior events in the target time domain segments by utilizing a correlation degree algorithm as follows:

is the average value of the attack event incremental curve corresponding to the preset time window, y_i is the value of the malicious code incremental curve corresponding to the ith moment in the preset time window,

is the average value of the behavior event incremental curve corresponding to the preset time window; taking the target time domain segment with the event correlation degree larger than a preset correlation threshold value as a correlation time domain segment, and extracting correlation attributes from the correlation time domain segment;