CN118337526A - Method for generating anti-attack sample - Google Patents

Method for generating anti-attack sample

Info

Publication number
CN118337526A
Authority
CN
China
Prior art keywords
sample
data
improved
generator
wasserstein gan
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410740771.8A
Other languages
Chinese (zh)
Other versions
CN118337526B (en)
Inventor
徐大伟
吕月
赵剑
李念峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changchun University
Original Assignee
Changchun University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changchun University
Priority to CN202410740771.8A
Publication of CN118337526A
Application granted
Publication of CN118337526B
Active
Anticipated expiration

Abstract

A method for generating adversarial attack samples, belonging to the technical field of network security and in particular to network intrusion detection. A self-designed improved Wasserstein GAN network learns the characteristics of benign traffic and uses them to disguise malicious traffic, producing adversarial attack samples that can be used to better test and upgrade machine-learning-based intrusion detection systems. The improved network introduces a multi-generator architecture into the Wasserstein GAN network and adds a distortion rate to the generator loss function. The method applies to a wide range of scenarios: it can generate adversarial network traffic without detailed knowledge of the target model, so the generated adversarial attack samples match realistic attack scenarios and can be used to improve the performance of the intrusion detection system more accurately.

Description

Translated from Chinese
A method for generating adversarial attack samples

Technical Field

The present invention belongs to the technical field of network security, and in particular relates to network intrusion detection.

Background

An intrusion detection system (IDS) is an active defense measure designed to detect and respond to suspicious or malicious activity in networks and systems. Intrusion detection systems fall into two main categories: signature-based and model-based. As attack techniques continue to evolve, traditional signature-based systems increasingly struggle to meet more complex detection needs. In recent years, machine learning (ML), especially deep learning, has performed well on classification tasks and has therefore been widely applied to intrusion detection. Consequently, the prior art has begun to focus on ML-based IDSs, which respond to new attacks more flexibly and intelligently.

Although ML models play a vital role in IDSs, their reliability is flawed, which leads to high false-positive rates and to attacks going undetected even when they occur. In addition, attackers can exploit the vulnerabilities that machine learning models exhibit when processing input data, which has given rise to a new kind of network traffic attack: the adversarial attack. An adversarial attack specifically targets a machine learning model and aims to deceive it into misclassifying through small but carefully designed modifications to the input data. Such attacks can cause serious security problems even when the input appears harmless.

Adversarial attacks can be divided into black-box and white-box attacks according to the attacker's knowledge of, and visibility into, the target network. In a white-box attack, the attacker has detailed internal information about the target system, including its source code, structure, and algorithms. This lets the attacker understand how the system works and attack it more effectively. The threat from white-box attacks is growing: with the rapid development of deep learning and artificial intelligence, they have become more common and more serious. By analyzing a model's internal structure, attackers can find and exploit its weak points, which threatens security in every field. However, although some classic white-box methods have proven effective against neural networks and have shown that they can circumvent defense mechanisms, limitations on implementing them in practice mean that some may not be feasible in certain situations. In a black-box attack, the attacker has no internal information about the target system and can only infer how it works by observing its inputs and outputs. This kind of attack occupies an important position among modern network security threats. The most advanced black-box methods today include malicious traffic generation based on generative adversarial networks (GANs), transfer learning methods, and embedded adversarial sample generation methods.

Disguising malicious traffic with a GAN to evade an IDS works by exploiting the competitive training between the GAN's generator and discriminator to produce network traffic that resembles normal traffic but serves a malicious purpose. Continuous adversarial training between the two allows the generator to produce highly realistic fake samples. Because a GAN is trained end to end, its network structure and training strategy can also be adjusted flexibly for different attack scenarios and target systems, which makes it highly adaptable. Traditional IDSs therefore find it difficult to accurately pick out this malicious traffic from large volumes of traffic.

GAN-generated malicious traffic is currently attracting considerable attention. Retraining a machine-learning-based intrusion detection system on malicious traffic disguised by an improved GAN enables the system to recognize more diverse attack traffic, giving it better generalization, robustness, and performance. Practitioners already widely simulate adversarial attack samples to generate malicious traffic for testing and upgrading detection networks. A key challenge remains, however: mode collapse during the simulation of adversarial attack samples, which significantly affects the diversity of the samples the generator produces. A GAN generates data through a game between generator and discriminator, and that game is realized mainly through the loss function, so the choice of loss function directly affects the diversity and quality of the generated data. Therefore, despite significant progress in generating adversarial attack samples with GANs, more complex and diverse model architectures still need to be studied and explored to improve the quality and fidelity of the generated samples.

Summary of the Invention

To solve the above technical problems, the present invention provides a method for generating adversarial attack samples. A self-designed GAN network learns the characteristics of benign traffic and uses them to disguise malicious traffic, generating adversarial attack samples that are then used to better test and upgrade machine-learning-based intrusion detection systems. The method comprises the following steps:

S1. Obtain a traffic sample set, preprocess it, and divide the processed data into a normal traffic sample set and a malicious traffic sample set;

S2. Input the data in the normal traffic sample set into the improved Wasserstein GAN network, and use the improved Wasserstein GAN network to learn the characteristics of the normal traffic samples;

The improved Wasserstein GAN network introduces a multi-generator structure into the Wasserstein GAN network and adds a distortion rate to the generator loss function;

S3. Input the data in the malicious traffic sample set into the improved Wasserstein GAN network by category. The data produced by disguising the malicious traffic samples with the normal traffic characteristics learned by the improved Wasserstein GAN network are the adversarial attack samples;

S4. Input the generated adversarial attack samples into the discriminator of the improved Wasserstein GAN network. If a sample is judged to be a normal traffic sample, the generated adversarial attack sample meets the requirements.

The beneficial effects of the method of the present invention are:

(1) The method is general and applicable to a wide range of scenarios. It can generate adversarial network traffic without knowing the details of the target model, so the generated adversarial attack samples match realistic attack scenarios and can improve the performance of the intrusion detection system more accurately.

(2) The method is also efficient and consumes few resources. It generates adversarial samples by integrating multiple generators, which effectively improves the quality of the generated data and, to some extent, mitigates the mode collapse problem of Wasserstein GAN (WGAN). The loss function is also adjusted in a targeted way, helping the model measure the difference between generated and real data more accurately and helping the generator tune the characteristics of the generated data more finely, which improves both generator training and sample quality. This in turn reduces the time complexity of training on adversarial attack samples, saves time and space, and improves accuracy while lowering cost.

Brief Description of the Drawings

Figure 1 is a diagram of the Wasserstein GAN network structure;

Figure 2 is a diagram of the improved Wasserstein GAN network structure;

Figure 3 is a flowchart of adversarial attack sample generation according to the present invention.

Detailed Description

The technical solution of the present invention is described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.

Embodiment 1

The method of the present invention comprises the following steps:

S1. Obtain a traffic sample set, preprocess it, and divide the processed data into a normal traffic sample set and a malicious traffic sample set;

S2. Input the data in the normal traffic sample set into the improved Wasserstein GAN network, and use the improved Wasserstein GAN network to learn the characteristics of the normal traffic samples;

The improved Wasserstein GAN network introduces a multi-generator structure into the Wasserstein GAN network and adds a distortion rate to the generator loss function;

S3. Input the data in the malicious traffic sample set into the improved Wasserstein GAN network by category. The data produced by disguising the malicious traffic samples with the normal traffic characteristics learned by the improved Wasserstein GAN network are the adversarial attack samples;

S4. Input the generated adversarial attack samples into the discriminator of the improved Wasserstein GAN network. If a sample is judged to be a normal traffic sample, the generated adversarial attack sample meets the requirements.

Embodiment 2

This embodiment further limits Embodiment 1.

The preprocessing in step S1 is as follows:

S11. Perform data cleaning, feature extraction, and data normalization on the traffic sample set;

Data cleaning: delete the samples in the sample set that contain NaN or Infinity values;
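The cleaning rule can be illustrated with a minimal sketch using only the Python standard library. The function name `drop_non_finite_rows` and the toy records are hypothetical, and the sketch assumes each record stores its label in the last position, as the description states further on.

```python
import math


def drop_non_finite_rows(rows):
    """Keep only records whose feature values are all finite (no NaN, no Infinity)."""
    cleaned = []
    for row in rows:
        features = row[:-1]  # assumption: the last field is the label
        # float() also parses the strings "NaN" and "Infinity" found in CSV exports
        if all(math.isfinite(float(value)) for value in features):
            cleaned.append(row)
    return cleaned


# Hypothetical toy records: two features followed by a label.
rows = [
    [0.5, 10.0, "BENIGN"],
    [float("nan"), 3.0, "DoS"],
    [1.2, float("inf"), "Bot"],
    [2.0, 7.5, "DoS"],
]
cleaned = drop_non_finite_rows(rows)
```

Only the first and last toy records survive, because the other two contain a NaN and an infinite value respectively.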

Feature extraction: mutual information is used to compute the correlation between each feature and the sample label. The features are then ranked by correlation and weakly correlated features are deleted; specifically, the top 70 features in the ranking are retained and the rest are removed.

The correlation is calculated as follows:

$$I(X;Y) = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$$

where X and Y are defined as follows:

Assume the data set after cleaning is D, containing n records: x = (x1, x2, ..., xn) ∈ D;

Each record contains m values: the first m-1 are feature values and the m-th is the label of that record;

Taking the correlation between the first feature and the sample label as an example, let the first feature of all n records form a sample set T, and let the labels of all n records form a sample set L;

When the formula is applied, X stands for the sample set T and Y stands for the sample set L;

x denotes a feature value in X, and y denotes a sample label in Y;

p(x, y) is the joint probability distribution of x and y, p(x) is the marginal probability distribution of x, and p(y) is the marginal probability distribution of y;
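As a rough illustration of this ranking step, the sketch below estimates mutual information from empirical frequencies and keeps the k highest-scoring features. It assumes discrete (or already binned) feature values; continuous traffic features would need binning or a different estimator, which the source does not specify. The names `mutual_information` and `top_k_features` are hypothetical, and k would be 70 under the rule stated in the text.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical I(X;Y) = sum over (x, y) of p(x,y) * log(p(x,y) / (p(x) * p(y)))."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    total = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        total += p_xy * math.log(p_xy / (p_x * p_y))
    return total


def top_k_features(rows, k):
    """Rank feature columns by mutual information with the label (last field)."""
    labels = [row[-1] for row in rows]
    n_features = len(rows[0]) - 1
    scored = [
        (mutual_information([row[j] for row in rows], labels), j)
        for j in range(n_features)
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))  # highest score first
    return [j for _, j in scored[:k]]


# Hypothetical toy data: feature 0 tracks the label exactly, feature 1 is independent of it.
rows = [[0, 0, "a"], [0, 1, "a"], [1, 0, "b"], [1, 1, "b"]]
selected = top_k_features(rows, k=1)
```

In the toy data, feature 0 scores log 2 and feature 1 scores 0, so feature 0 is the one retained.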

Data normalization: min-max normalization is applied to the sample set, which helps the model train and converge. The formula is:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x'$ is the normalized value of the feature, $x_{\min}$ is the minimum value of that feature in the data, and $x_{\max}$ is the maximum value of that feature in the data.
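A short sketch of min-max normalization for a single feature column follows. The function name is hypothetical, and mapping a constant column to zeros is an assumption made here to avoid division by zero; the source does not say how that case is handled.

```python
def min_max_normalize(column):
    """Scale one feature column to [0, 1] using (x - min) / (max - min)."""
    lowest, highest = min(column), max(column)
    if highest == lowest:
        # Assumption: a constant column carries no information, so map it to 0.0
        return [0.0 for _ in column]
    return [(value - lowest) / (highest - lowest) for value in column]


normalized = min_max_normalize([2.0, 4.0, 6.0])  # [0.0, 0.5, 1.0]
```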

S12. Analyze the meaning of each feature and determine which features must ultimately be retained.

The criterion for retaining features is that a record must keep its original label attribute while still being able to evade the machine-learning-based intrusion detection system. In particular, the source IP address, destination IP address, and port number in the TCP/IP protocol must not be modified.

Taking the CICIDS2017 dataset as an example, the features listed in Table 1 are retained and must not be disguised or modified:

Table 1:

In step S1, the processed data is divided into a normal traffic sample set and a malicious traffic sample set as follows:

Each record has multiple features, the last of which is the category the record belongs to, i.e. the traffic label. The data is divided into a normal traffic sample set and a malicious traffic sample set according to this label. Common datasets are CICIDS2017 and NSL-KDD, whose partitioning is shown in Table 2 and Table 3 respectively:

Table 2:

Table 3:
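The label-based split can be sketched as below. The function name and the toy records are hypothetical. The default benign label string `"BENIGN"` is an assumption based on the label CICIDS2017 uses; for NSL-KDD or any other dataset it would need to be changed. Malicious records are grouped by category here because step S3 feeds them to the network category by category.

```python
def split_by_label(rows, benign_label="BENIGN"):
    """Split records into a benign list and a dict of malicious records per category."""
    benign = []
    malicious_by_category = {}
    for row in rows:
        label = row[-1]  # the last field is the traffic label
        if label == benign_label:
            benign.append(row)
        else:
            malicious_by_category.setdefault(label, []).append(row)
    return benign, malicious_by_category


# Hypothetical toy records: two features followed by a label.
rows = [
    [0.1, 0.2, "BENIGN"],
    [0.9, 0.8, "DoS"],
    [0.7, 0.3, "Bot"],
    [0.6, 0.9, "DoS"],
]
benign, malicious = split_by_label(rows)
```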

Embodiment 3

This embodiment further limits Embodiment 1. The improved Wasserstein GAN network improves the original Wasserstein GAN network in the following ways:

Multi-generator structure: the improved Wasserstein GAN network takes the Wasserstein GAN network as its base model and introduces a multi-generator structure. The structure of the Wasserstein GAN network model is shown in Figure 1, where XBENIGN denotes normal traffic sample data, XMAL denotes malicious traffic sample data, G denotes the generator, and D denotes the discriminator. In the Wasserstein GAN network, data is generated by a single generator, and the generated data and the original data are fed to the discriminator for identification. After many iterative updates, the goal is reached when the discriminator can no longer tell whether its input is real data or generator output;

The improved Wasserstein GAN network introduces a multi-generator structure, as shown in Figure 2, where G1~n denotes the n generators and Bot, DOS, and so on denote the categories of the malicious traffic sample data. The structure has n generators, where n is the number of categories in the input data, so that each category has its own independent generator. Using a separate generator per category effectively improves the quality of the generated data and, to some extent, mitigates the mode collapse problem of the Wasserstein GAN network.
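The routing idea behind the multi-generator structure, one independent generator per malicious category, can be sketched as follows. This is an illustrative stand-in, not the network described in the patent: `TinyGenerator` is a hypothetical single linear layer written with the standard library, the noise input and the residual perturbation are assumptions, and a real implementation would use a neural network framework.

```python
import random


class TinyGenerator:
    """Hypothetical stand-in for a neural generator: one linear layer over [features + noise]."""

    def __init__(self, n_features, noise_dim, seed):
        self.rng = random.Random(seed)
        self.noise_dim = noise_dim
        self.weights = [
            [self.rng.uniform(-0.1, 0.1) for _ in range(n_features + noise_dim)]
            for _ in range(n_features)
        ]

    def generate(self, features):
        noise = [self.rng.gauss(0.0, 1.0) for _ in range(self.noise_dim)]
        inputs = list(features) + noise
        deltas = [sum(w * v for w, v in zip(row, inputs)) for row in self.weights]
        # Assumption: perturb the original features and clamp to [0, 1],
        # since the features were min-max normalized during preprocessing.
        return [min(1.0, max(0.0, f + d)) for f, d in zip(features, deltas)]


def build_generators(categories, n_features, noise_dim=4):
    """Create one independent generator per malicious traffic category."""
    return {
        category: TinyGenerator(n_features, noise_dim, seed=index)
        for index, category in enumerate(sorted(categories))
    }


generators = build_generators(["Bot", "DoS"], n_features=3)
disguised = generators["Bot"].generate([0.2, 0.5, 0.9])
```

Keeping the generators in a dictionary keyed by category makes the "each category uses its own generator" rule explicit: a Bot record can only ever reach the Bot generator.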

Loss function improvement: in the Wasserstein GAN network, the Wasserstein distance is minimized by constraining the discriminator's parameters so that it is Lipschitz continuous and then optimizing a loss function with clipped gradients. On top of the generator loss function of the Wasserstein GAN network, a distortion rate is added to measure the distance between real data and generated data. This helps the model measure the difference between generated and real data more accurately and helps the generator tune the characteristics of the generated data more finely, which improves both generator training and sample quality. The loss functions of the generator and the discriminator of the improved Wasserstein GAN network are given by two formulas:

[The formulas for $L_G$ and $L_D$ were images in the source and are not preserved in this text.]

where $L_G$ is the generator loss function, $L_D$ is the discriminator loss function, $P_r$ is the real data distribution given by the dataset, $P_g$ is the generator distribution of the improved Wasserstein GAN network, $N$ is the total number of samples, $i$ indexes the $i$-th feature of a record, $\mathbb{E}_{x \sim P_g}$ is the expectation over samples drawn from the generated distribution $P_g$, $\mathbb{E}_{x \sim P_r}$ is the expectation over samples drawn from the real data distribution $P_r$, and $D(x)$ is the discriminator output, which indicates whether the data is real or fake: 0 for real and 1 for fake.
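Because the exact loss formulas are not available in this text, the following is only a sketch under stated assumptions. The sign convention follows the description (lower discriminator output means "real"). The distortion rate is assumed here to be the mean absolute difference between original and generated feature values, averaged over the N samples, and the weight `lam` and clipping bound `c` are hypothetical parameters. None of these forms is confirmed by the source.

```python
def mean(values):
    return sum(values) / len(values)


def critic_loss(scores_real, scores_fake):
    """Assumed form: minimizing pushes real scores toward 0 and generated scores higher."""
    return mean(scores_real) - mean(scores_fake)


def distortion_rate(originals, generated):
    """Assumed form: mean absolute per-feature difference, averaged over all samples."""
    per_sample = [
        mean([abs(o - g) for o, g in zip(original, fake)])
        for original, fake in zip(originals, generated)
    ]
    return mean(per_sample)


def generator_loss(scores_fake, originals, generated, lam=1.0):
    """Assumed form: Wasserstein generator term plus a weighted distortion penalty."""
    return mean(scores_fake) + lam * distortion_rate(originals, generated)


def clip_weights(weights, c=0.01):
    """Clip parameters to [-c, c], the usual WGAN way of encouraging a Lipschitz critic."""
    return [max(-c, min(c, w)) for w in weights]


d_loss = critic_loss([0.1, 0.3], [0.8, 1.0])
g_loss = generator_loss(
    [0.8, 1.0],
    originals=[[0.0, 1.0], [0.0, 0.0]],
    generated=[[0.5, 1.0], [0.0, 0.0]],
    lam=2.0,
)
```

Under these assumptions the distortion term discourages the generator from drifting far from the original malicious record, which matches the stated goal of measuring the distance between real and generated data.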

Embodiment 4

This embodiment further limits Embodiment 1. Step S3 is as follows:

S31. Input the data in the malicious traffic sample set into the improved Wasserstein GAN network by category;

S32. Use the generator of the improved Wasserstein GAN network to generate adversarial samples, and replace the corresponding parts of the generated adversarial samples with the features retained in step S12;

Traffic data is special in that many of its feature values determine the category of a flow, so changing features arbitrarily would cause the flow to lose its original attributes;

Retaining some traffic features therefore preserves the original label attribute of the traffic and lets the experiment proceed normally. Changing feature data arbitrarily may lead to the following results:

The traffic data becomes invalid and no longer complies with communication protocols such as TCP;

The traffic data loses its original label and the maliciousness of the original data, which confounds the experiment and degrades its results.
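The replacement described in S32 amounts to copying the protected feature positions from the original malicious record back into the generated one. A minimal sketch follows; the function name and the index set are hypothetical, since the actual retained features are those listed in Table 1, which is not available in this text.

```python
def restore_protected(original, generated, protected_indices):
    """Overwrite protected positions of a generated sample with the original values."""
    result = list(generated)  # copy so the generator output is left untouched
    for index in protected_indices:
        result[index] = original[index]
    return result


original = [0.10, 0.80, 0.30, 0.55]
generated = [0.42, 0.17, 0.29, 0.90]
# Hypothetical: positions 0 and 3 hold features that must not be modified.
adversarial = restore_protected(original, generated, protected_indices=[0, 3])
```

Here `adversarial` keeps the generator's values at positions 1 and 2 and the original values at positions 0 and 3.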

Embodiment 5

This embodiment further limits Embodiment 1. Step S4 is as follows:

S41. Input the generated adversarial samples into the discriminator, which classifies them against the normal traffic sample set to confirm whether the generated adversarial samples pass the discriminator;

S42. Use the losses produced while the generator generates adversarial samples and while the discriminator judges them to update the generator and the discriminator through feedback;

S43. Iterate the above operations many times to improve the performance of the improved Wasserstein GAN network.

The complete flow of adversarial attack sample generation is shown in Figure 3. First, a traffic sample set is obtained and preprocessed, and the processed data is divided into a normal traffic sample set XBENIGN and a malicious traffic sample set XMAL. The normal traffic samples are input into the improved Wasserstein GAN network, which learns their characteristics. The malicious traffic samples are input into the improved Wasserstein GAN network by category; in the figure, Bot, DOS, ..., DDOS, and XSS denote different categories and G1~n denotes the n generators. The data produced by disguising the malicious traffic samples with the learned normal traffic characteristics are the adversarial attack samples. These are input into the discriminator, which classifies them against normal traffic samples. An output of 0 means the adversarial attack sample is judged real, i.e. the discriminator takes it for a normal traffic sample; an output of 1 means it is judged fake, i.e. the discriminator takes it for a malicious traffic sample.
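The final check can be summarized as the share of adversarial samples that the discriminator takes for normal traffic. The sketch below assumes the discriminator returns a score between 0 (real) and 1 (fake) and uses a hypothetical decision threshold of 0.5; neither the threshold nor the name `evasion_rate` comes from the source.

```python
def evasion_rate(discriminator_scores, threshold=0.5):
    """Fraction of adversarial samples scored as real (score below the threshold)."""
    passed = sum(1 for score in discriminator_scores if score < threshold)
    return passed / len(discriminator_scores)


# Hypothetical discriminator outputs for four adversarial samples.
rate = evasion_rate([0.1, 0.9, 0.4, 0.6])  # 0.5
```

Tracking this rate across the iterations of S41 to S43 gives one simple way to see whether generator updates are making the samples harder to distinguish from normal traffic.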

Claims (6)

CN202410740771.8A | 2024-06-11 | 2024-06-11 | Method for generating anti-attack sample | Active | CN118337526B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202410740771.8A (CN118337526B (en)) | 2024-06-11 | 2024-06-11 | Method for generating anti-attack sample

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202410740771.8A (CN118337526B (en)) | 2024-06-11 | 2024-06-11 | Method for generating anti-attack sample

Publications (2)

Publication Number | Publication Date
CN118337526A (en) | 2024-07-12
CN118337526B (en) | 2024-09-13

Family

ID=91780002

Family Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202410740771.8A (Active, CN118337526B (en)) | 2024-06-11 | 2024-06-11 | Method for generating anti-attack sample

Country Status (1)

Country | Link
CN (1) | CN118337526B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN119892442A (en) * | 2025-01-06 | 2025-04-25 | Inner Mongolia University of Technology (内蒙古工业大学) | Method, system and computer equipment for generating antagonistic flow example

Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113158390A (en) * | 2021-04-29 | 2021-07-23 | Beijing University of Posts and Telecommunications (北京邮电大学) | Network attack traffic generation method for generating countermeasure network based on auxiliary classification
US20210319113A1 (en) * | 2019-01-07 | 2021-10-14 | Zhejiang University | Method for generating malicious samples against industrial control system based on adversarial learning
CN114866341A (en) * | 2022-06-17 | 2022-08-05 | Harbin Institute of Technology (哈尔滨工业大学) | Vulnerability amplification type backdoor attack security assessment method for network intrusion detection system
CN116707992A (en) * | 2023-07-12 | 2023-09-05 | Zhejiang University of Technology (浙江工业大学) | Malicious traffic avoidance detection method based on generation countermeasure network

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN119892442A (en) * | 2025-01-06 | 2025-04-25 | Inner Mongolia University of Technology (内蒙古工业大学) | Method, system and computer equipment for generating antagonistic flow example
CN119892442B (en) * | 2025-01-06 | 2025-06-20 | Inner Mongolia University of Technology (内蒙古工业大学) | Method, system and computer equipment for generating antagonistic flow example

Also Published As

Publication number | Publication date
CN118337526B (en) | 2024-09-13

Legal Events

Code | Title | Description
PB01 | Publication | Publication
SE01 | Entry into force of request for substantive examination | Entry into force of request for substantive examination
GR01 | Patent grant | Patent grant
