CN116132111A - Attack identification method and device based on mouse track data in network traffic - Google Patents

Attack identification method and device based on mouse track data in network traffic

Info

Publication number
CN116132111A
Authority
CN
China
Prior art keywords
track data
mouse
mouse track
features
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211635593.XA
Other languages
Chinese (zh)
Other versions
CN116132111B (en)
Inventor
高霞
余燕
解建华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongtong Uniform Chuangfa Science And Technology Co ltd
Original Assignee
Zhongtong Uniform Chuangfa Science And Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongtong Uniform Chuangfa Science And Technology Co ltd
Priority to CN202211635593.XA
Publication of CN116132111A
Application granted
Publication of CN116132111B
Active
Anticipated expiration

Abstract

Embodiments of the present disclosure provide an attack identification method and device based on mouse track data in network traffic. The method includes: collecting and parsing mouse track data in network traffic; preprocessing the mouse track data; analyzing and extracting features of the preprocessed mouse track data, and standardizing the features; generating a training sample set from the standardized features, training a model on it, and producing a logistic regression classification model for identifying automated attacks; and using the logistic regression classification model to identify mouse track data in web pages. In this way, automated attacks can be identified effectively without degrading the user experience.

Description

Attack identification method and device based on mouse track data in network traffic

Technical field

The present disclosure relates to the technical field of information security, and in particular to an attack identification method and device based on mouse track data in network traffic.

Background

As Internet technology has flourished, the techniques of the underground "black and gray" industry have also been continuously refined and have matured. The Internet carries a growing volume of automated attack traffic, which may come from automated attacks or web crawlers run by this black and gray industry chain. Once such an attack succeeds, it can cause immeasurable losses to users or enterprises.

Traditional methods of identifying automated attacks rely mainly on verification codes (CAPTCHAs): a user who fills in the verification code correctly is considered a real person; otherwise the request is treated as an automated attack and blocked. On the one hand, this approach noticeably degrades the normal user experience, especially for elderly users who are less familiar with the Internet and who often cannot use a function because they fail to fill in the verification code correctly within the allotted time. On the other hand, human code-solving services and machine learning methods now help automated tools fill in verification codes correctly, so the effectiveness of verification codes is gradually declining.

Therefore, it is critical to identify automated attacks effectively without affecting the user experience.

Summary of the invention

The present disclosure provides an attack identification method and device based on mouse track data in network traffic, which can effectively identify automated attacks.

According to a first aspect of the present disclosure, an attack identification method based on mouse track data in network traffic is provided. The method includes:

collecting and parsing mouse track data in network traffic, where the mouse track data includes mouse track data generated by automated attacks and mouse track data generated by users;

preprocessing the mouse track data; analyzing and extracting features of the preprocessed mouse track data, and standardizing the features;

generating a training sample set from the standardized features, training a model on it, and producing a logistic regression classification model for identifying automated attacks;

using the logistic regression classification model to identify mouse track data in web pages.

In some implementations of the first aspect, preprocessing the mouse track data includes:

filtering out and deleting abnormal data from the mouse track data;

integrating the filtered mouse track data into tracks;

drawing a track image from the result of track integration.

In some implementations of the first aspect, analyzing and extracting features of the preprocessed mouse track data and standardizing the features includes:

analyzing the differences in the track images between mouse track data generated by automated attacks and mouse track data generated by users;

extracting the distinguishing features and representing them as a vector;

standardizing the features with the z-score algorithm.

In some implementations of the first aspect, generating a training sample set from the standardized features, training a model on it, and producing a logistic regression classification model for identifying automated attacks includes:

based on the standardized features, taking the preprocessed mouse track data as samples;

using the sample type of each sample as its label, and generating the training sample set from each sample and its corresponding label;

determining an evaluation metric for the model, training the model on the training sample set, and producing a logistic regression classification model for identifying automated attacks.

In some implementations of the first aspect, the sample types include:

mouse track data generated by automated attacks and mouse track data generated by users.

In some implementations of the first aspect, using the logistic regression classification model to identify mouse track data in web pages includes:

preprocessing the mouse track data in the web page;

analyzing and extracting features of the preprocessed data, and standardizing the features;

feeding the standardized features into the logistic regression classification model for identification.

According to a second aspect of the present disclosure, an attack identification device based on mouse track data in network traffic is provided. The device includes:

a collection module, which collects mouse track data in network traffic;

a parsing module, which parses the mouse track data in network traffic;

a preprocessing module, which preprocesses the mouse track data;

a feature extraction module, which analyzes and extracts features of the preprocessed mouse track data and standardizes the features;

a model training module, which trains a model on the standardized features;

a model generation module, which generates a logistic regression classification model for identifying automated attacks.

According to a third aspect of the present disclosure, an electronic device is provided. The electronic device includes: at least one processor; and a memory communicatively connected to the at least one processor. The memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method described above.

According to a fourth aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, where the computer instructions cause a computer to perform the method described above.

In the present disclosure, mouse tracks generated by automated attacks and mouse tracks generated by normal users are used as samples to train a logistic regression classification model capable of identifying automated attacks. The method makes it harder for automated attacks to evade detection and can effectively protect the interests of other users and enterprises.

It should be understood that the content described in this Summary is not intended to identify key or essential features of the embodiments of the present disclosure, nor to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.

Description of drawings

The above and other features, advantages, and aspects of the embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. The drawings are provided for a better understanding of the solution and do not limit the present disclosure. In the drawings, the same or similar reference numerals denote the same or similar elements, where:

FIG. 1 shows a flowchart of an attack identification method based on mouse track data in network traffic provided by an embodiment of the present disclosure;

FIG. 2 shows an overall flow diagram of an attack identification method based on mouse track data in network traffic provided by an embodiment of the present disclosure;

FIG. 3 shows a data preprocessing flowchart of an attack identification method based on mouse track data in network traffic provided by an embodiment of the present disclosure;

FIG. 4 shows a feature extraction flowchart of an attack identification method based on mouse track data in network traffic provided by an embodiment of the present disclosure;

FIG. 5 to FIG. 7 show exemplary mouse track images generated by automated attacks, provided by an embodiment of the present disclosure;

FIG. 8 shows an exemplary mouse track image generated by a user, provided by an embodiment of the present disclosure;

FIG. 9 shows a mouse track image generated from mouse data provided by an embodiment of the present disclosure;

FIG. 10 shows a block diagram of an attack identification device based on mouse track data in network traffic provided by an embodiment of the present disclosure;

FIG. 11 shows a block diagram of an exemplary electronic device capable of implementing embodiments of the present disclosure.

Detailed description

To make the objectives, technical solutions, and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present disclosure without creative effort fall within the scope of protection of the present disclosure.

In addition, the term "and/or" in this document merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean: A alone, both A and B, or B alone. The character "/" in this document generally indicates an "or" relationship between the objects before and after it.

To address the problems described in the Background, embodiments of the present disclosure provide an attack identification method and device based on mouse track data in network traffic, which can effectively identify automated attacks.

Specifically, refer to FIG. 2, which shows an overall flow diagram of an attack identification method based on mouse track data in network traffic provided by an embodiment of the present disclosure. As shown in FIG. 2, mouse track data in network traffic is collected and parsed; the mouse track data is preprocessed; features of the preprocessed mouse track data are analyzed, extracted, and standardized; a training sample set is generated from the standardized features and used to train a logistic regression classification model for identifying automated attacks; and the trained logistic regression classification model is used to identify mouse track data in web pages, determine whether it comes from an automated attack, and output the result of that determination.

In this way, automated attacks become harder to disguise, and they can be identified effectively without affecting the user experience.

The attack identification method based on mouse track data in network traffic provided by the embodiments of the present disclosure is described in detail below through specific embodiments, with reference to the accompanying drawings.

FIG. 1 shows a flowchart of an attack identification method based on mouse track data in network traffic provided by an embodiment of the present disclosure. As shown in FIG. 1, the attack identification method 100 based on mouse track data in network traffic may include the following steps:

S110: collect and parse mouse track data in network traffic.

The mouse track data includes mouse track data generated by automated attacks and mouse track data generated by users.

Specifically, the following information may be collected segment by segment:

track_id, ip, and the number of each track segment, where:

track_id uniquely identifies a user's track; a user's track may be stitched together from multiple track segments.

ip: the ip, the accessed path, and log_time are recorded to obtain the IP information.

Each track segment is time-series data made up of a varying number of track points, and the number of each segment can be recorded according to log_time.

For example, a track point may be collected every 15 ms. Each track point records the mouse event (including single-click press, single-click release, double-click press, double-click release, move, and so on), the x-axis coordinate of the mouse on the screen, the y-axis coordinate of the mouse on the screen, the time difference between the event and the first point, and other similar mouse information.

S120: preprocess the mouse track data; analyze and extract features of the preprocessed mouse track data, and standardize the features.

The preprocessing of mouse track data is described in detail below with reference to FIG. 3, which shows a data preprocessing flowchart of an attack identification method based on mouse track data in network traffic provided by an embodiment of the present disclosure. As shown in FIG. 3, preprocessing the mouse track data may include:

Filtering out and deleting abnormal data from the mouse track data.

Specifically, under normal conditions a mouse rarely lands exactly on the point [0,0] of the screen, so track points with coordinates [0,0] are mostly abnormal data and are deleted.
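The filtering rule above can be sketched in a few lines of Python. This is only an illustration: the tuple layout `(event, x, y, dt_ms)` follows the field order of the sample record later in the description, and the function name `drop_origin_points` and the sample values are hypothetical.

```python
from typing import List, Tuple

# Assumed layout of one track point: (event, x, y, dt_ms).
TrackPoint = Tuple[int, int, int, int]


def drop_origin_points(points: List[TrackPoint]) -> List[TrackPoint]:
    """Remove points whose screen coordinates are exactly (0, 0)."""
    return [p for p in points if not (p[1] == 0 and p[2] == 0)]


# Made-up sample: only the first point sits exactly on the origin.
raw = [(0, 0, 0, 10), (0, 301, 302, 25), (0, 0, 17, 40)]
cleaned = drop_origin_points(raw)
```

A point with only one zero coordinate, such as `(0, 17)`, is kept, since the text singles out the exact point [0,0].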

Integrating the filtered mouse track data into tracks.

Specifically, the multiple track segments of each track_id are integrated into one complete track. The track must be split when any of the following situations is encountered during integration:

When the mouse moves into or out of the window: when the time interval between two track points exceeds a certain duration and the distance between them is large, the mouse is most likely moving into or out of the window. A point whose x-axis and y-axis coordinate differences both exceed a preset range is treated as a move-in/move-out track point, and the track is split into two tracks at that point.

For example, if the time interval between two points exceeds 500 ms and the x-axis and y-axis coordinate differences both exceed 200, the point is treated as a move-in/move-out track point, and the track is split into two tracks at that point.

When the time difference between two track segments is greater than 5 min, the track is split into two tracks.

None of the mouse activity after the last click event of a complete track is submitted to the server, so that part of the track needs no further analysis and is deleted.
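As a rough sketch of these splitting rules, the Python below starts a new track at any point that follows a suspicious gap and trims everything after the last click. Several details are assumptions rather than part of the description: the 5-minute rule is applied to the gap between consecutive points, the new track begins at the later point, a track with no click is left unchanged, and the click event codes passed to `trim_after_last_click` are hypothetical because the text does not list them.

```python
from typing import List, Set, Tuple

TrackPoint = Tuple[int, int, int, int]  # assumed layout: (event, x, y, dt_ms)

GAP_MS = 500                   # example time threshold from the description
JUMP_PX = 200                  # example per-axis distance threshold
LONG_GAP_MS = 5 * 60 * 1000    # 5 minutes


def split_track(points: List[TrackPoint]) -> List[List[TrackPoint]]:
    """Split one integrated track wherever a window move or long pause occurs."""
    segments: List[List[TrackPoint]] = []
    current: List[TrackPoint] = []
    for p in points:
        if current:
            prev = current[-1]
            gap = p[3] - prev[3]
            window_move = (
                gap > GAP_MS
                and abs(p[1] - prev[1]) > JUMP_PX
                and abs(p[2] - prev[2]) > JUMP_PX
            )
            if window_move or gap > LONG_GAP_MS:
                segments.append(current)
                current = []
        current.append(p)
    if current:
        segments.append(current)
    return segments


def trim_after_last_click(points: List[TrackPoint],
                          click_events: Set[int]) -> List[TrackPoint]:
    """Drop everything recorded after the last click event."""
    last = max((i for i, p in enumerate(points) if p[0] in click_events),
               default=None)
    return points if last is None else points[: last + 1]


track = [(0, 100, 100, 0), (0, 110, 105, 15), (0, 400, 420, 700), (0, 405, 425, 715)]
segments = split_track(track)
trimmed = trim_after_last_click([(0, 1, 1, 0), (1, 2, 2, 15), (0, 3, 3, 30)], {1})
```

In the sample, the third point arrives 685 ms after the second and is more than 200 units away on both axes, so it opens a second track.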

A track image is drawn from the result of track integration; the data is tiled and the track image is output.

Specifically, each integrated track is drawn as an image of the same size, as shown in FIG. 5 to FIG. 8. FIG. 5 to FIG. 7 show exemplary mouse track images generated by automated attacks, provided by an embodiment of the present disclosure; FIG. 8 shows an exemplary mouse track image generated by a user, provided by an embodiment of the present disclosure.

In some embodiments, analyzing and extracting features of the preprocessed mouse track data and standardizing the features includes:

Analyzing the differences in the track images between mouse track data generated by automated attacks and mouse track data generated by users.

Specifically, as shown in FIG. 5 to FIG. 7, automated attacks mostly use tool-generated tracks to access web pages; the generated tracks are simple and consist largely of straight lines and polylines. As shown in FIG. 8, tracks generated from a user's mouse data contain more curves, and tracks differ considerably from person to person. The images can be used to analyze the differences between automated attack tracks and user tracks, and the discriminative features that are extracted are represented as a vector.

Extracting the distinguishing features and representing them as a vector.

Extracting the distinguishing features and representing them as a vector is described in detail below with reference to FIG. 4, which shows a feature extraction flowchart of an attack identification method based on mouse track data in network traffic provided by an embodiment of the present disclosure. As shown in FIG. 4, feature extraction may include:

Calculating entropy.

Specifically, as shown in FIG. 5, when the mouse coordinates jitter in a sawtooth pattern while the mouse moves, the proportion of jitter in the whole track is computed as entropy.

Definition of jitter: if the mouse suddenly moves in the opposite direction or stops advancing, and returns to the original direction within a few track points, the jitter count is incremented by 1. Jitter is counted separately for the x-axis and the y-axis.

Definition of the jitter ratio: jitter count / (2 * number of track points).

Note that the calculation is performed only when a track segment exceeds a certain number of track points; when the track is too short, the value defaults to 0.
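One possible reading of the jitter definition is sketched below. The description does not say how many points "a few track points" is or how long a track must be before the feature is computed, so `window=3` and `min_points=10` are placeholder assumptions, as are the function names and the sample track.

```python
from typing import List, Sequence, Tuple

TrackPoint = Tuple[int, int, int, int]  # assumed layout: (event, x, y, dt_ms)


def _sign(v: int) -> int:
    return (v > 0) - (v < 0)


def count_jitter(coords: Sequence[int], window: int = 3) -> int:
    """Count reversals or stalls that return to the original direction
    within `window` steps, along a single axis."""
    steps = [_sign(b - a) for a, b in zip(coords, coords[1:])]
    count = 0
    i = 1
    while i < len(steps):
        direction = steps[i - 1]
        if direction != 0 and steps[i] != direction:
            stop = min(i + 1 + window, len(steps))
            resume = next((j for j in range(i + 1, stop)
                           if steps[j] == direction), None)
            if resume is not None:
                count += 1
                i = resume + 1
                continue
        i += 1
    return count


def entropy(points: List[TrackPoint], min_points: int = 10, window: int = 3) -> float:
    """Jitter count over both axes divided by (2 * number of track points)."""
    if len(points) < min_points:
        return 0.0  # track too short: default to 0, as stated in the text
    xs = [p[1] for p in points]
    ys = [p[2] for p in points]
    jitters = count_jitter(xs, window) + count_jitter(ys, window)
    return jitters / (2 * len(points))


# Made-up track: x reverses briefly twice, y increases steadily.
xs = [0, 1, 2, 1, 2, 3, 4, 3, 4, 5, 6, 7]
sample = [(0, 100 + x, 100 + i, 15 * i) for i, x in enumerate(xs)]
value = entropy(sample)
```

With two jitters on the x-axis, none on the y-axis, and 12 points, the sample evaluates to 2 / 24.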

Calculating diffLastAngleRatio.

Specifically, as shown in FIG. 6, if the track turns many times but moves in a straight line after each turn, the ratio of the number of turns to the whole track is computed as diffLastAngleRatio.

Definition of an angle: two track points whose Manhattan distance is greater than 10 form a vector, and the angle between that vector and the x-axis is computed.

Definition of a turn: when the current angle differs from the previous angle, the turn count is incremented by 1.

Definition of the turn ratio: 100 * number of turns / total number of angles.
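The turn ratio can be sketched as follows. Two choices here are assumptions that the description leaves open: vectors are formed by walking from an anchor point until the Manhattan distance exceeds 10 and then moving the anchor, and angles are rounded to whole degrees before comparison so that floating-point noise does not register as a turn.

```python
import math
from typing import List, Tuple

TrackPoint = Tuple[int, int, int, int]  # assumed layout: (event, x, y, dt_ms)


def diff_last_angle_ratio(points: List[TrackPoint], min_dist: int = 10) -> float:
    """Return 100 * (number of turns) / (number of angles)."""
    if not points:
        return 0.0
    angles = []
    ax, ay = points[0][1], points[0][2]
    for _, x, y, _ in points[1:]:
        if abs(x - ax) + abs(y - ay) > min_dist:
            # Angle of the vector relative to the x-axis, in whole degrees.
            angles.append(round(math.degrees(math.atan2(y - ay, x - ax))))
            ax, ay = x, y
    if not angles:
        return 0.0
    turns = sum(1 for prev, cur in zip(angles, angles[1:]) if cur != prev)
    return 100 * turns / len(angles)


# Made-up L-shaped track: two steps to the right, then two steps along y.
l_shape = [(0, 100, 100, 0), (0, 120, 100, 15), (0, 140, 100, 30),
           (0, 140, 120, 45), (0, 140, 140, 60)]
ratio = diff_last_angle_ratio(l_shape)
```

The sample yields the angles 0, 0, 90, 90, which is one turn out of four angles.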

Calculating timeStampSmallerCount.

Specifically, timeStampSmallerCount is the number of times, over the whole track, that a track point's timestamp is smaller than the timestamp of the previous track point.
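This count reduces to a single pass over consecutive pairs. The sketch below assumes the timestamp is the fourth field of each point; the function name and sample values are illustrative.

```python
from typing import List, Tuple

TrackPoint = Tuple[int, int, int, int]  # assumed layout: (event, x, y, dt_ms)


def timestamp_smaller_count(points: List[TrackPoint]) -> int:
    """Count points whose timestamp is earlier than the previous point's."""
    return sum(1 for prev, cur in zip(points, points[1:]) if cur[3] < prev[3])


# Made-up timestamps 0, 15, 10, 30, 29: time goes backwards twice.
out_of_order = [(0, 100 + i, 100, t) for i, t in enumerate([0, 15, 10, 30, 29])]
backwards = timestamp_smaller_count(out_of_order)
```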

Calculating moveSkipCount.

Specifically, as shown in FIG. 7, when several curves in the track are connected by a straight line, there are no track points on that line; the number of jump points during mouse movement is counted as moveSkipCount.

Definition of a jump point: for mouse move events, the current movement distance suddenly becomes much larger than the previous movement distance. The condition is: if last_dis < 5, set last_dis = 5; a jump point is counted when last_dis*100 < cur_dis.
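The jump-point condition can be written out as below. The names `last_dis` and `cur_dis` come from the description; the use of Euclidean distance and of event code 0 for mouse moves (taken from the Table 1 explanation later in the text) are assumptions.

```python
import math
from typing import List, Tuple

TrackPoint = Tuple[int, int, int, int]  # assumed layout: (event, x, y, dt_ms)

MOVE_EVENT = 0  # assumed code for "mouse move"


def move_skip_count(points: List[TrackPoint]) -> int:
    """Count moves that are more than 100x longer than the previous move."""
    moves = [p for p in points if p[0] == MOVE_EVENT]
    dists = [math.hypot(b[1] - a[1], b[2] - a[2]) for a, b in zip(moves, moves[1:])]
    count = 0
    for last_dis, cur_dis in zip(dists, dists[1:]):
        last_dis = max(last_dis, 5)  # "if last_dis < 5, set last_dis = 5"
        if last_dis * 100 < cur_dis:
            count += 1
    return count


# Made-up track: small steps of 2, then one step of 796, then a small step.
jumpy = [(0, x, 100, 15 * i) for i, x in enumerate([100, 102, 104, 900, 902])]
skips = move_skip_count(jumpy)
```

Clamping `last_dis` to 5 means a move must exceed 500 units before it can count as a jump after a very small step.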

Finally, the track represented as a feature vector is output.

Standardizing the features with the z-score algorithm.

Specifically, the z-score algorithm is used to standardize the tracks represented as feature vectors.
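A minimal z-score sketch follows: each feature column is shifted by its mean and divided by its standard deviation. The description does not say whether the population or sample standard deviation is used, so the population form is an assumption, as is mapping zero-variance columns to 0. The first row reuses the example vector [8,14,0,0] from the description; the other two rows are made up.

```python
from statistics import fmean, pstdev
from typing import List, Sequence, Tuple


def fit_zscore(rows: Sequence[Sequence[float]]) -> Tuple[List[float], List[float]]:
    """Compute the per-feature mean and population standard deviation."""
    cols = list(zip(*rows))
    return [fmean(c) for c in cols], [pstdev(c) for c in cols]


def apply_zscore(row: Sequence[float], means: Sequence[float],
                 stds: Sequence[float]) -> List[float]:
    """Standardize one feature vector; constant features map to 0."""
    return [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]


features = [[8, 14, 0, 0], [2, 10, 0, 0], [5, 12, 0, 0]]
means, stds = fit_zscore(features)
standardized = [apply_zscore(r, means, stds) for r in features]
```

Keeping `means` and `stds` from the training data matters later, because new tracks have to be standardized with the same statistics before they are scored.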

S130: generate a training sample set from the standardized features, train a model on it, and produce a logistic regression classification model for identifying automated attacks.

In some embodiments, generating a training sample set from the standardized features, training a model on it, and producing a logistic regression classification model for identifying automated attacks includes:

based on the standardized features, taking the preprocessed mouse track data as samples;

using the sample type of each sample as its label, and generating the training sample set from each sample and its corresponding label;

determining an evaluation metric for the model, training the model on the training sample set, and producing a logistic regression classification model for identifying automated attacks.

The sample types include:

mouse track data generated by automated attacks and mouse track data generated by users.

In some embodiments, the preprocessed mouse track data may be divided into a training set, a validation set, and a test set in a 7:1:2 ratio. The training set is used to train the model, the validation set is used to check how well training is going, and the test set is used to evaluate the final generalization ability of the model. If the generalization ability meets expectations, training ends. If it does not, the cause must be analyzed, and the model is retrained, for example by adjusting the model's hyperparameters, collecting more complete data, or adjusting the model, until its generalization ability meets expectations.
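For illustration, a logistic regression classifier can be trained with plain batch gradient descent, as in the sketch below. The description names neither an optimizer nor an evaluation metric, so gradient descent, the learning rate, the epoch count, and accuracy are all assumptions, and the four standardized feature vectors are made up. Label 1 stands for an automated attack and 0 for a user, as in the description.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(X: Sequence[Sequence[float]], y: Sequence[int],
                 lr: float = 0.1, epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(x: Sequence[float], w: Sequence[float], b: float,
            threshold: float = 0.5) -> int:
    """Return 1 (automated attack) or 0 (user)."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(score) >= threshold else 0


# Made-up standardized feature vectors and labels.
X = [[1.2, 1.1, 0.9, 1.0], [0.8, 1.3, 1.1, 0.7],
     [-1.0, -0.9, -1.1, -0.8], [-1.1, -1.2, -0.7, -1.0]]
y = [1, 1, 0, 0]
w, b = train_logreg(X, y)
accuracy = sum(predict(x, w, b) == t for x, t in zip(X, y)) / len(y)
```

In practice the accuracy would be measured on the validation and test sets rather than on the training data; the toy data here only shows that the loop converges on a separable example.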

S140: use the logistic regression classification model to identify mouse track data in web pages.

Specifically, new mouse track data is collected from the web page and preprocessed, features are extracted from the preprocessed data and standardized, and the standardized features are fed into the trained logistic regression classification model for identification.

In some embodiments, 1 denotes a track generated by an automated tool, and 0 denotes a track produced by a real person using a mouse.

In some embodiments, using the logistic regression classification model to identify mouse track data in web pages includes:

preprocessing the mouse track data in the web page;

analyzing and extracting features of the preprocessed data, and standardizing the features;

feeding the standardized features into the logistic regression classification model for identification.
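At identification time, a new feature vector is standardized and then scored by the trained model. The sketch below assumes that the means and standard deviations saved from training are reused for standardization; every parameter value shown is hypothetical and not taken from the description.

```python
import math
from typing import Sequence, Tuple


def classify(features: Sequence[float], means: Sequence[float],
             stds: Sequence[float], w: Sequence[float], b: float,
             threshold: float = 0.5) -> Tuple[int, float]:
    """Standardize with training statistics, then apply logistic regression.

    Returns (label, probability); label 1 = automated tool, 0 = real person.
    """
    z = [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(features, means, stds)]
    score = sum(wj * zj for wj, zj in zip(w, z)) + b
    prob = 1.0 / (1.0 + math.exp(-score))
    return (1 if prob >= threshold else 0), prob


# Hypothetical training statistics and model parameters.
means, stds = [5.0, 12.0, 1.0, 1.0], [2.0, 2.0, 1.0, 1.0]
w, b = [1.0, 1.0, 1.0, 1.0], 0.0

label_a, prob_a = classify([9, 16, 3, 2], means, stds, w, b)
label_b, prob_b = classify([3, 10, 0, 0], means, stds, w, b)
```

The 0.5 threshold is also an assumption; a deployment could raise it to reduce the chance of flagging real users.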

下面结合图2和图9对本公开另一具体实施例进行详细说明。Another specific embodiment of the present disclosure will be described in detail below with reference to FIG. 2 and FIG. 9 .

图2示出了本公开实施例提供的一种基于网络流量中鼠标轨迹数据的攻击识别方法的总流程示意图;图9示出了根据本公开实施例提供的鼠标数据生成的鼠标轨迹图。FIG. 2 shows a schematic flowchart of an attack identification method based on mouse track data in network traffic provided by an embodiment of the present disclosure; FIG. 9 shows a mouse track diagram generated from mouse data provided by an embodiment of the present disclosure.

如图2所示,采集页面的鼠标轨迹数据,其中,采集到的鼠标轨迹数据可以如表1所示:As shown in Figure 2, the mouse track data of the page is collected, wherein the collected mouse track data can be shown in Table 1:

表1Table 1

Figure BDA0004007100980000121
Figure BDA0004007100980000121

如表1所示,2020年9月15日凌晨01秒时,用户id为104823581047924在ip为60.182.71.6的机器上,访问页面paht1时产生了一小段鼠标轨迹,其track_id为8KNhXkuj9nWoNSYS5bjP6ADNgIB。As shown in Table 1, at 01 seconds in the morning on September 15, 2020, the user id 104823581047924 on the machine with ip 60.182.71.6, generated a small mouse track when accessing the page paht1, and its track_id was 8KNhXkuj9nWoNSYS5bjP6ADNgIB.

上述表1的第1列记录了相关的鼠标事件,其中,3表示移入窗口,0表示鼠标移动,4表示移出窗口;第2列记录了鼠标移动过程中的x轴坐标;第3列记录了鼠标移动过程中的y轴坐标;第4列记录了当前事件发生时与第一个轨迹点发生时的时间差。The first column of the above table 1 records the relevant mouse events, among which, 3 means moving into the window, 0 means moving the mouse, and 4 means moving out of the window; the second column records the x-axis coordinates during the mouse movement; the third column records The y-axis coordinate during the mouse movement; the fourth column records the time difference between when the current event occurs and when the first track point occurs.

如图2所示,采集鼠标轨迹数据后对其进行预处理。As shown in Figure 2, after the mouse track data is collected, it is preprocessed.

具体地,将track_id为8KNhXkuj9nWoNSYS5bjP6ADNgIB的所有轨迹段整合成完整的轨迹,如下所示:Specifically, all track segments whose track_id is 8KNhXkuj9nWoNSYS5bjP6ADNgIB are integrated into a complete track, as follows:

{"track_id":"8KNhXkuj9nWoNSYS5bjP6ADNgIB","item":[[3,301,302,6750],[0,301,302,6750],[0,318,319,6769],[0,336,307,6939],[0,355,307,7042],[0,373,307,7063],[0,392,307,7079],[4,412,307,7096],…]}。{"track_id":"8KNhXkuj9nWoNSYS5bjP6ADNgIB","item":[[3,301,302,6750],[0,301,302,6750],[0,318,319,6769],[0,336,307,6939],[0,355,307,7042 ],[0,373,307,7063], [0,392,307,7079],[4,412,307,7096],...]}.

A track diagram is drawn from the integrated data; the resulting diagram is shown in FIG. 9.

As shown in FIG. 2, features are extracted from the drawn track diagram and represented as a vector, and all features are standardized with the z-score algorithm.

Computing the features yields the following values:

entropy=8

diffLastAngleRatio=14

timeStampSmallerCount=0

moveSkipCount=0

Represented as a vector, the four features are [8,14,0,0]; this vector is then standardized with the z-score algorithm.
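A z-score is computed per feature across many samples, so a single vector such as [8,14,0,0] can only be standardized against the mean and standard deviation of a reference set. The sketch below illustrates this under stated assumptions: the function names are hypothetical, the population standard deviation is an arbitrary choice, and every row other than [8,14,0,0] is invented purely for illustration.

```python
import statistics


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation; a constant column gets 1.0 so that
    # the division below never fails.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    scaled = [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]
    return scaled, means, stds


def standardize_one(vector, means, stds):
    """Standardize a new vector with statistics from the reference set."""
    return [(value - m) / s for value, m, s in zip(vector, means, stds)]


# First row is the example from the text; the other rows are made up.
rows = [[8, 14, 0, 0], [2, 3, 1, 4], [5, 7, 2, 2]]
scaled, means, stds = zscore_columns(rows)
print(standardize_one([8, 14, 0, 0], means, stds))
```

Keeping `means` and `stds` from the training data and reusing them at prediction time ensures that new tracks are scaled the same way as the samples the model was trained on.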

As shown in FIG. 2, once the features have been extracted and processed, vectors of user tracks are labeled 0 and vectors of automated-attack tracks are labeled 1. All feature vectors are divided by stratified sampling, in a 7:1:2 ratio, into a training set, a validation set, and a test set. The training set and validation set are used to train the logistic regression classification model; the test set is used to test the trained model and evaluate its generalization ability. If the generalization ability meets expectations, training ends. If it does not, the cause is analyzed and the model is retrained, for example by tuning the model hyperparameters, collecting more complete data, or adjusting the model, until the generalization ability meets expectations.
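The split-and-train procedure can be sketched in plain Python as below. Everything here is an assumption made for illustration: the synthetic Gaussian data, the function names, batch gradient descent as the optimizer, accuracy as the evaluation metric, and the use of the validation set to choose a learning rate. The disclosure does not specify any of these details.

```python
import math
import random
from collections import defaultdict


def stratified_split(samples, labels, ratios=(0.7, 0.1, 0.2), seed=0):
    """Split (sample, label) pairs 7:1:2 while keeping the label mix."""
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)
    rng = random.Random(seed)
    train, val, test = [], [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * ratios[0])
        n_val = round(len(indices) * ratios[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return tuple([(samples[i], labels[i]) for i in part]
                 for part in (train, val, test))


def predict(x, weights, bias):
    """Probability that x is an attack track (numerically stable sigmoid)."""
    z = sum(w * v for w, v in zip(weights, x)) + bias
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(data, learning_rate=0.1, epochs=500):
    """Fit logistic regression with batch gradient descent."""
    n_features = len(data[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in data:
            error = predict(x, weights, bias) - y
            grad_w = [g + error * v for g, v in zip(grad_w, x)]
            grad_b += error
        weights = [w - learning_rate * g / len(data)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(data)
    return weights, bias


def accuracy(data, weights, bias):
    hits = sum((predict(x, weights, bias) > 0.5) == bool(y) for x, y in data)
    return hits / len(data)


# Synthetic stand-in for standardized feature vectors: label 0 = user,
# label 1 = automated attack.
rng = random.Random(42)
samples = [[rng.gauss(-1.0, 0.5) for _ in range(4)] for _ in range(50)]
samples += [[rng.gauss(1.0, 0.5) for _ in range(4)] for _ in range(50)]
labels = [0] * 50 + [1] * 50

train_set, val_set, test_set = stratified_split(samples, labels)
# Use the validation set to pick a hyperparameter, then test once.
candidates = [train(train_set, learning_rate=lr) for lr in (0.01, 0.1, 1.0)]
weights, bias = max(candidates, key=lambda model: accuracy(val_set, *model))
print(accuracy(test_set, weights, bias))
```

Splitting each label group separately is what makes the sampling stratified: the proportion of user and attack tracks is the same in all three sets, which matters when one class is much rarer than the other.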

Once the generalization ability meets expectations, training ends and the required logistic regression classification model is generated. The features of the mouse track shown in Table 1 are then standardized and input into that model.

If the probability predicted by the model is greater than 0.5, the track is predicted to be an attack track produced by an automated tool. If the predicted probability is less than or equal to 0.5, the track is predicted to be a normal track produced by a user.

If the prediction is an attack track from an automated tool, the user's access requests can be intercepted and access denied, for example by adding the user's ID and IP to a blacklist, so as to protect the interests of other users and of the enterprise.
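The decision rule and the blacklist response can be sketched as follows. The model parameters, user IDs, IP addresses, and function names are hypothetical and chosen only to exercise both branches; the blacklist is modeled as two in-memory sets, which is an assumption rather than something the disclosure specifies.

```python
import math


def attack_probability(features, weights, bias):
    """Logistic regression output for one standardized feature vector."""
    z = sum(w * v for w, v in zip(weights, features)) + bias
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def screen_request(user_id, ip, features, weights, bias,
                   blocked_ids, blocked_ips):
    """Return True and blacklist the caller if the track looks automated.

    A probability of exactly 0.5 counts as a normal user track, matching
    the "less than or equal to 0.5" rule in the text.
    """
    is_attack = attack_probability(features, weights, bias) > 0.5
    if is_attack:
        blocked_ids.add(user_id)
        blocked_ips.add(ip)
    return is_attack


# Hypothetical model parameters and requests, for illustration only.
weights, bias = [1.0, 0.5, 0.0, 0.0], 0.0
blocked_ids, blocked_ips = set(), set()
flagged = screen_request("user-a", "203.0.113.7", [2.0, 1.0, 0.0, 0.0],
                         weights, bias, blocked_ids, blocked_ips)
passed = screen_request("user-b", "203.0.113.8", [-2.0, -1.0, 0.0, 0.0],
                        weights, bias, blocked_ids, blocked_ips)
print(flagged, passed, sorted(blocked_ids))
```

Later requests would be checked against `blocked_ids` and `blocked_ips` before any further processing.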

According to the embodiments of the present disclosure, the following technical effects are achieved:

A logistic regression classification model capable of identifying automated attacks is trained on samples of mouse tracks generated by automated attacks and mouse tracks generated by normal users. The model identifies automated attacks effectively without affecting user experience, and thus protects the interests of other users and of the enterprise.

It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the present disclosure is not limited by the described order of actions, because according to the present disclosure some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all optional embodiments, and that the actions and modules involved are not necessarily required by the present disclosure.

The above describes the method embodiments. The solution of the present disclosure is further described below through device embodiments.

FIG. 10 shows a block diagram of an attack identification device based on mouse track data in network traffic according to an embodiment of the present disclosure. As shown in FIG. 10, the device 1000 includes:

A collection module, which collects mouse track data in network traffic.

A parsing module, which parses the mouse track data in the network traffic.

A preprocessing module, which preprocesses the mouse track data.

A feature extraction module, which analyzes and extracts features of the preprocessed mouse track data and standardizes the features.

A model training module, which performs model training on the standardized features.

A model generation module, which generates a logistic regression classification model for identifying automated attacks.

In some embodiments, the device 1000 may further include:

A screening module, which screens out and deletes abnormal data in the mouse track data.

A track integration module, which integrates the screened mouse track data into tracks.

A track diagram drawing module, which draws a track diagram from the result of the track integration.

It can be understood that each module/unit in the attack identification device 1000 based on mouse track data in network traffic shown in FIG. 10 has the function of implementing the corresponding step of the attack identification method 100 based on mouse track data in network traffic provided by the embodiments of the present disclosure, and can achieve the corresponding technical effect. For brevity, details are not repeated here.

According to the embodiments of the present disclosure, the present disclosure further provides an electronic device and a readable storage medium.

FIG. 11 shows an exemplary block diagram of an electronic device that can be used to implement the embodiments of the present disclosure. The electronic device 1100 is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present disclosure described and/or claimed herein.

The electronic device 1100 includes a computing unit 1101, which can perform various appropriate actions and processing according to a computer program stored in the ROM 1102 or a computer program loaded from the storage unit 1108 into the RAM 1103. The RAM 1103 can also store various programs and data required for the operation of the electronic device 1100. The computing unit 1101, the ROM 1102, and the RAM 1103 are connected to one another through a bus 1104. An I/O interface 1105 is also connected to the bus 1104.

Multiple components of the electronic device 1100 are connected to the I/O interface 1105, including: an input unit 1106, such as a keyboard or a mouse; an output unit 1107, such as various types of displays and speakers; a storage unit 1108, such as a magnetic disk or an optical disc; and a communication unit 1109, such as a network card, a modem, or a wireless communication transceiver. The communication unit 1109 allows the electronic device 1100 to exchange information/data with other devices over a computer network such as the Internet and/or various telecommunication networks.

The computing unit 1101 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 1101 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, or microcontroller. The computing unit 1101 performs the methods and processing described above, such as the method 100. For example, in some embodiments, the method 100 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 1108. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 1100 via the ROM 1102 and/or the communication unit 1109. When the computer program is loaded into the RAM 1103 and executed by the computing unit 1101, one or more steps of the method 100 described above can be performed. Alternatively, in other embodiments, the computing unit 1101 may be configured to perform the method 100 in any other appropriate manner (for example, by means of firmware).

Various implementations of the systems and techniques described above can be realized in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that when the program code is executed by the processor or controller, the functions/operations specified in the flowcharts and/or block diagrams are implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in combination with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide interaction with a user, the systems and techniques described here can be implemented on a computer that has a display device for displaying information to the user, and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input, or tactile input).

The systems and techniques described here can be implemented in a computing system that includes back-end components (for example, as a data server), or a computing system that includes middleware components (for example, an application server), or a computing system that includes front-end components (for example, a user computer with a graphical user interface or a web browser through which the user can interact with implementations of the systems and techniques described here), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.

A computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.

It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, as long as the result desired by the technical solution of the present disclosure can be achieved; no limitation is imposed here.

The specific implementations described above do not limit the scope of protection of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.

Claims (9)

Translated from Chinese

1. An attack identification method based on mouse track data in network traffic, characterized in that the method comprises:
collecting and parsing mouse track data in network traffic, wherein the mouse track data comprises mouse track data generated by automated attacks and mouse track data generated by users;
preprocessing the mouse track data;
analyzing and extracting features of the preprocessed mouse track data, and standardizing the features;
generating a training sample set from the standardized features and performing model training, to generate a logistic regression classification model for identifying automated attacks;
using the logistic regression classification model to identify mouse track data in a web page.

2. The method according to claim 1, characterized in that preprocessing the mouse track data comprises:
screening out and deleting abnormal data in the mouse track data;
integrating the screened mouse track data into tracks;
drawing a track diagram from the result of the track integration.

3. The method according to claim 2, characterized in that analyzing and extracting the features of the preprocessed mouse track data and standardizing the features comprises:
analyzing, in the track diagram, the differences between the mouse track data generated by automated attacks and the mouse track data generated by users;
extracting the distinguishing features and representing the features as vectors;
standardizing the features using the z-score algorithm.

4. The method according to claim 1, characterized in that generating a training sample set from the standardized features and performing model training, to generate a logistic regression classification model for identifying automated attacks, comprises:
on the basis of the standardized features, taking the preprocessed mouse track data as samples;
taking the sample type corresponding to each sample as the sample label, and generating a training sample set from each sample and its corresponding label;
determining evaluation metrics for the model, training the model with the training sample set, and generating the logistic regression classification model for identifying automated attacks.

5. The method according to claim 4, characterized in that the sample types comprise:
the mouse track data generated by automated attacks and the mouse track data generated by users.

6. The method according to claim 1, characterized in that using the logistic regression classification model to identify mouse track data in a web page comprises:
preprocessing the mouse track data in the web page;
analyzing and extracting features of the preprocessed data, and standardizing the features;
inputting the standardized features into the logistic regression classification model for identification.

7. An attack identification device based on mouse track data in network traffic, characterized by comprising:
a collection module, which collects mouse track data in network traffic;
a parsing module, which parses the mouse track data in the network traffic;
a preprocessing module, which preprocesses the mouse track data;
a feature extraction module, which analyzes and extracts features of the preprocessed mouse track data and standardizes the features;
a model training module, which performs model training on the standardized features;
a model generation module, which generates a logistic regression classification model for identifying automated attacks.

8. An electronic device, characterized by comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method according to any one of claims 1-6.

9. A non-transitory computer-readable storage medium storing computer instructions, characterized in that the computer instructions are used to cause the computer to perform the method according to any one of claims 1-6.
CN202211635593.XA | 2022-12-19 | 2022-12-19 | Attack identification method and device based on mouse track data in network traffic | Active | CN116132111B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202211635593.XA (CN116132111B (en)) | 2022-12-19 | 2022-12-19 | Attack identification method and device based on mouse track data in network traffic

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202211635593.XA (CN116132111B (en)) | 2022-12-19 | 2022-12-19 | Attack identification method and device based on mouse track data in network traffic

Publications (2)

Publication Number | Publication Date
CN116132111A (en) | 2023-05-16
CN116132111B (en) | 2025-08-19

Family

ID=86309165

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202211635593.XA (Active, CN116132111B (en)) | Attack identification method and device based on mouse track data in network traffic | 2022-12-19 | 2022-12-19

Country Status (1)

Country | Link
CN (1) | CN116132111B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP2008204118A (en)* | 2007-02-20 | 2008-09-04 | Shinano Kenshi Co Ltd | Personal identification information acquisition method and personal identification information acquisition system
US20090027343A1 (en)* | 2007-07-27 | 2009-01-29 | Samsung Electronics Co., Ltd. | Trajectory-estimation apparatus and method based on pen-type optical mouse
US8606725B1 (en)* | 2008-10-29 | 2013-12-10 | Emory University | Automatic client-side user-behavior analysis for inferring user intent
CN106155298A (en)* | 2015-04-21 | 2016-11-23 | 阿里巴巴集团控股有限公司 | Man-machine recognition methods and device, the acquisition method of behavior characteristics data and device
CN106446128A (en)* | 2016-09-20 | 2017-02-22 | 刘志军 | Tracing method and device of webpage access tracks
WO2017197549A1 (en)* | 2016-05-16 | 2017-11-23 | 深圳维盛半导体科技有限公司 | Dpi automatic regulation mouse and method
CN107609590A (en)* | 2017-09-12 | 2018-01-19 | 山东师范大学 | A kind of multiple dimensioned mouse track feature extracting method, device and system
CN107766852A (en)* | 2017-12-06 | 2018-03-06 | 电子科技大学 | A kind of man-machine mouse track detection method based on convolutional neural networks
US20180132940A1 (en)* | 2016-11-14 | 2018-05-17 | Intai Technology Corp. | Method and system for verifying panoramic images of implants
CN110879881A (en)* | 2019-11-15 | 2020-03-13 | 重庆邮电大学 | Mouse Trajectory Recognition Method Based on Feature Group Hierarchy and Semi-Supervised Random Forest
CN111209573A (en)* | 2018-11-21 | 2020-05-29 | 中标软件有限公司 | A security perception method for access request based on mouse displacement trajectory
CN112634401A (en)* | 2020-12-28 | 2021-04-09 | 深圳市优必选科技股份有限公司 | Plane trajectory drawing method, device, equipment and storage medium
CN112650402A (en)* | 2020-12-25 | 2021-04-13 | 广州市博大电子设备有限公司 | Control method of mouse and application thereof
CN115146160A (en)* | 2022-06-30 | 2022-10-04 | 广州华多网络科技有限公司 | Machine behavior detection method, device, equipment and medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
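MENG XIA: "QLens: Visual Analytics of Multi-step Problem-solving Behaviors for Improving Question Design", IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 13 October 2020 (2020-10-13), pages 870 - 880*
Zhang Zhiteng (张志腾); Liu Linlan (刘琳岚): "Mouse trajectory recognition method and research based on gradient boosting decision trees" (基于梯度提升决策树的鼠标轨迹识别方法与研究), 信息通信, no. 09, 15 September 2018 (2018-09-15), pages 22 - 24*
Chen Zhe (陈喆): "Mouse trajectory recognition technology based on BP neural networks" (基于BP神经网络的鼠标轨迹识别技术), 电脑知识与技术, 8 January 2013 (2013-01-08), pages 130 - 132*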

Also Published As

Publication number | Publication date
CN116132111B (en) | 2025-08-19

Similar Documents

Publication | Title
CN112347244B (en) | Yellow-based and gambling-based website detection method based on mixed feature analysis
US12118770B2 (en) | Image recognition method and apparatus, electronic device and readable storage medium
CN112148881B (en) | Methods and devices for outputting information
US11823494B2 (en) | Human behavior recognition method, device, and storage medium
CN113239807B (en) | Methods and devices for training bill recognition models and bill recognition
CN113051911B (en) | Methods, devices, equipment, media and program products for extracting sensitive words
CN112559747A (en) | Event classification processing method and device, electronic equipment and storage medium
CN114187448A (en) | Document image recognition method and apparatus, electronic device, computer readable medium
CN112800919A (en) | Method, device and equipment for detecting target type video and storage medium
CN115619245A (en) | Portrait construction and classification method and system based on data dimension reduction method
CN114090601A (en) | A data screening method, apparatus, device and storage medium
CN119670072A (en) | Database abnormal behavior detection method, device, electronic device and storage medium
CN116246287A (en) | Target object recognition method, training method, device and storage medium
CN111798356A (en) | An abnormal pattern recognition method of rail transit passenger flow based on big data
CN116108844A (en) | Risk information identification method, apparatus, device and storage medium
CN113642495B (en) | Training method, apparatus, and program product for evaluating model for time series nomination
CN114638359A (en) | Method and device for removing neural network backdoor and image recognition
CN119629636A (en) | Spam call identification method, device, computer equipment and storage medium
CN114155589A (en) | An image processing method, apparatus, device and storage medium
CN113360688A (en) | Information base construction method, device and system
CN111597423A (en) | Performance evaluation method and device of interpretable method of text classification model
CN107133644B (en) | Digital Library Content Analysis System and Method
CN117576648A (en) | Automatic driving scene mining method and device, electronic equipment and storage medium
CN116132111A (en) | Attack identification method and device based on mouse track data in network traffic
CN116662589A (en) | Image matching method, device, electronic equipment and storage medium

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
