CN112905671A - Time series exception handling method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112905671A
Authority
CN
China
Prior art keywords
data
time series
abnormal
feature
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110313319.XA
Other languages
Chinese (zh)
Inventor
张文池
王泓琳
陈哲康
周波
王勇
刘大鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
Beijing Bishi Technology Co ltd
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Bishi Technology Co ltd, National Computer Network and Information Security Management Center
Priority to CN202110313319.XA
Publication of CN112905671A
Legal status: Pending

Abstract

Translated from Chinese

The present invention provides a time series exception handling method and device, an electronic device, and a computer-readable storage medium. The method comprises the steps of: acquiring time series data, training on the time series data, and constructing a model; detecting, according to the model, whether abnormal data exist in time series data obtained in real time and, if so, recommending part of the abnormal data; judging whether the recommended abnormal data are reasonable, and feeding back the judgment result; and optimizing the model according to the judgment result, then continuing to detect real-time time series data. The method has no obvious bias toward particular data, can adapt to indicators with scenario-specific semantics, can meet operation and maintenance requirements outside the traditional Internet field, is more scalable and more broadly applicable, and can give a specific cause for each abnormal result it reports.

Figure 202110313319

Description

Time series exception handling method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for processing a time series exception, an electronic device, and a computer-readable storage medium.
Background
Modern software enterprises often rely on a large number of application services deployed on extensive infrastructure, including physical machines, virtual machines, and containers. To ensure the reliability of these services and systems, operation and maintenance personnel need to monitor and check the operating condition of the infrastructure. In routine operation and maintenance work, an engineer typically monitors and collects various performance metrics for the infrastructure. For example, a machine usually has metrics such as memory utilization, CPU utilization, and disk utilization. In actual operation, faults caused by external attacks, aging disk media, sustained overload, and the like severely challenge the availability of the machine, and these monitoring metrics then reflect the abnormality. Anomaly detection on time series metrics is therefore very important: it helps the operation and maintenance team find faults as early as possible and shortens the time from fault occurrence to troubleshooting.
Anomaly detection on time series metrics has also received wide attention in academia, and many algorithms have been proposed in recent years. Limited by algorithm effectiveness and detection performance, however, these methods still cannot meet the requirements of practical deployment. Because the number of metrics to be monitored in operation and maintenance work is extremely large, manual labeling is impractical; supervised anomaly detection is therefore hard to put into practice, and unsupervised learning must be adopted. In addition, time series anomaly detection scenarios differ, the objects served and the loads of the services vary greatly, and the trends and characteristics of the metrics are sometimes strongly tied to the business. An anomaly detection method therefore needs to be able to collect feedback from operation and maintenance experts efficiently in order to acquire their knowledge.
Table 1 below compares the most advanced unsupervised time series anomaly detection approaches in academia at present. Most of the algorithms adopt deep learning models, which require huge computing resources for training, leave computing performance to be improved, and cannot directly apply user feedback to optimizing the deep learning framework. Traditional unsupervised statistical learning methods need a large amount of manual parameter tuning and give uneven results. The algorithms also show obvious bias toward particular data: each performs well on a specific data type but is not universal, and it is difficult to explain the specific cause of the abnormal results they report.
Characteristic             | Regression statistical learning | Traditional unsupervised learning | Unsupervised deep generative model
High capacity              | Poor                            | Fair                              | Excellent
No parameter tuning needed | Poor                            | Excellent                         | Fair
No labels needed           | Excellent                       | Excellent                         | Excellent
Fast detection             | Excellent                       | Fair                              | Fair
Low training resources     | Excellent                       | Fair                              | Poor
Short training time        | Excellent                       | Fair                              | Poor
Can be manually adjusted   | Poor                            | Fair                              | Poor
TABLE 1
Disclosure of Invention
The present invention is directed to solve at least one of the problems in the background art and provides a time series exception handling method, a time series exception handling apparatus, an electronic device, and a computer-readable storage medium.
In order to achieve the above object, the present invention provides a time series exception handling method, comprising the following steps:
acquiring time series data, training on the time series data, and constructing a model;
detecting, according to the model, whether abnormal data exist in the time series data obtained in real time, and if so, recommending part of the abnormal data;
judging whether the recommended part of the abnormal data is reasonable, and then feeding back a judgment result;
and optimizing the model according to the judgment result, and then continuing to detect the real-time time series data.
According to one aspect of the invention, acquiring time series data comprises acquiring regular small-scale time series data and irregular large-scale time series data, clustering all time series data when acquiring irregular large-scale time series data, and then training various types of time series data to construct a model.
According to one aspect of the invention, the clustering process is to capture the correlation among the time sequence data to be trained through DBSCAN, and cluster the data with approximate shape and consistent periodicity.
According to an aspect of the present invention, in the clustering process, in calculating the approximation degree of the time-series data, the distance between the time-series data is calculated using DTW.
According to one aspect of the invention, according to the type of the time sequence data, feature data capable of representing the corresponding type of the time sequence data is selected for training, and a model is constructed.
According to one aspect of the invention, RRCF is adopted to select all the feature data for training, all the feature data are iterated to obtain a plurality of decision trees, the decision trees form a decision forest, and then whether abnormal data exist in the real-time sequence data is determined through voting of the decision forest.
According to one aspect of the invention, when constructing a decision tree, the RRCF selects a segmentation dimension along which to split the feature data, and the probability that the RRCF selects feature $i$ is
Figure BDA0002990156980000031
$g_i = \max_{x \in S}\,(x_j - x_{j-1})$, where $i$ is the feature; $p_i$ is the probability that feature $i$ is selected, with a value between 0 and 1; $l_i$ is the difference between the maximum and minimum values of feature $i$ in the feature set computed from the training sample set; $g_i$ is the maximum difference between two adjacent feature values after the values of feature $i$ computed from the training sample set are sorted by size; $\sum_j g_j$ is the sum of $g_j$ over every feature dimension $j$; and $\sum_j l_j$ is the sum of $l_j$ over every feature dimension $j$.
According to one aspect of the invention, the RRCF equally divides the feature data in the segmentation dimension into $N$ intervals $[l_0, h_0], [l_1, h_1], \ldots, [l_{N-1}, h_{N-1}]$ and calculates the density of each interval, $d_i = \mathrm{Count}(p,\ p \in [l_i, h_i])$, wherein the probability that each interval is selected is
Figure BDA0002990156980000032
Finally, a cut point $X_i \sim \mathrm{Uniform}[l_i, h_i]$ is selected at random from the selected interval, where $l_0$ is the minimum value and $h_{N-1}$ the maximum value of the feature in that dimension over the training set, and the difference between them is divided by $N$ to obtain the $N$ equal intervals.
According to one aspect of the present invention, when abnormal data exist, the anomaly score CoDisp of the abnormal data is calculated using the dividing point. To calculate the anomaly score CoDisp, the ratio $\mathrm{CoDisp}_{Node}$ between the amounts of abnormal data contained in the sibling subtree and the parent subtree of the dividing point is calculated, the largest ratio $\mathrm{CoDisp}_{Node}$ is selected, and the anomaly score of abnormal data $x_i$ is
Figure BDA0002990156980000041
According to one aspect of the invention, recommending part of the abnormal data means selecting a plurality of the most abnormal segments in the abnormal data and recommending them after labels for the segments are obtained; or
recommending part of the abnormal data means selecting a plurality of the most uncertain segments in the abnormal data and recommending them after labels for the segments are obtained; or
recommending part of the abnormal data means dividing the abnormal data into a plurality of groups according to anomaly score, taking a plurality of segments from each group, and recommending them after labels for the segments are obtained.
According to one aspect of the invention, after the model obtains the abnormal data of $n$ labeled segments, the abnormal data and the $M$ decision trees in the model's decision forest jointly form an anomaly score matrix $\mathrm{CoDisp\_M}[x_i][tree_j]$. For each abnormal data item $x_i$, if the fed-back judgment result is a true positive, the weight of decision tree $tree_j$ is updated as $tw_j = tw_j + \delta \times \mathrm{CoDisp\_M}[x_i][tree_j]$, and decision trees with higher weights are selected according to the fed-back judgment results so as to optimize the model.
In order to achieve the above object, the present invention further provides a time-series exception handling apparatus, including:
the data processing module is used for acquiring time series data, training the time series data and constructing a model;
the abnormal data detection recommending module detects whether abnormal data exist in the time sequence data obtained in real time according to the model, and if the abnormal data exist, part of the abnormal data are recommended;
the abnormal data judgment feedback module judges whether the part of abnormal data is reasonable or not and then feeds back a judgment result;
and the model optimization module optimizes the model according to the feedback judgment result and then continuously detects the real-time sequence data.
According to an aspect of the invention, further comprising:
and the data classification processing module is used for acquiring irregular large-scale time sequence data, clustering all the time sequence data, training various time sequence data and constructing a model.
In order to achieve the above object, the present invention further provides an electronic device, which includes a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the above time-series exception handling method.
To achieve the above object, the present invention further provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements the above time-series exception handling method.
According to one scheme of the invention, as the number of time sequences to be monitored in a production environment is extremely large, each production unit can generate dozens or even hundreds of monitoring index data, the index data need to be monitored completely, if the time sequences are trained respectively in a targeted manner, the number of models and consumed resources are extremely large, and the existing operation and maintenance resources are difficult to support. Therefore, before the targeted training stage of the index data, the data are clustered, so that the detection processing time can be greatly reduced, and the abnormity can be quickly and accurately processed.
According to one scheme of the invention, a characteristic data selection stage is provided, and more appropriate characteristic data are extracted in a targeted manner according to the statistical information and characteristics of indexes, so that the accuracy of the model is improved.
According to one scheme of the invention, the most abnormal 30 segments are selected, and the labels of the abnormal segments are acquired, so that the explicit abnormality can be further confirmed, and the false positive rate can be reduced.
According to one scheme of the invention, 30 most uncertain segments (namely around the vicinity of an abnormality judgment threshold) are selected, and the labels can help the model to clearly classify boundaries, so that the identification accuracy of fuzzy abnormalities is improved.
According to one aspect of the invention, the abnormal data is divided into 10 groups according to the abnormal scores, each group obtains at most 3 segments, and the labels can capture attitudes of the judgment feedback module on different abnormal judgment conditions, so as to help the model determine the optimal threshold value selection range.
According to one scheme of the invention, the invention provides an unsupervised, white-box and accurate time series exception handling method which is matched with active learning and can actively and efficiently collect feedback information. On the basis of a traditional unsupervised learning frame, an active learning stage is introduced, abnormality is actively recommended to a judgment feedback part (such as a judgment feedback module or operation and maintenance personnel) and feedback is acquired, so that a model is corrected, and the accuracy is improved. The method reserves the advantages of the traditional unsupervised learning in the aspects of parameter adjustment and marking, designs the application strategy of marking feedback in a targeted manner, and further optimizes the recall rate, the detection speed and the capacity of the model.
According to one scheme of the invention, the processing method has no obvious bias on data, can adapt to indexes with specific scene semantics, can meet the operation and maintenance requirements in the field of non-traditional Internet, has higher expandability and universality, and can give specific abnormal reasons for the given abnormal result.
According to one aspect of the present invention, the invention can accurately detect and interpret anomalies. It was tested on 1 public data set and 2 time series data of a commercial bank's actual production environment, ultimately reaching F1-scores of 0.81 and 0.89 on the two data sets. Compared with traditional unsupervised exception handling methods, the best F1-score improved by 0.19 to 0.5 on the two data sets, and detection time was shortened by 58%.
Drawings
FIG. 1 schematically shows a flow diagram of a method for time series exception handling according to one embodiment of the present invention;
FIG. 2 schematically shows similar metric curves collected from the same switch;
FIGS. 3-5 schematically show diagrams of three different schemes for actively recommending abnormal segments;
fig. 6 schematically shows a functional configuration diagram of a time-series abnormality processing apparatus according to an embodiment of the present invention.
Detailed Description
The content of the invention will now be discussed with reference to exemplary embodiments. It is to be understood that the embodiments discussed are merely intended to enable one of ordinary skill in the art to better understand and thus implement the teachings of the present invention, and do not imply any limitations on the scope of the invention.
As used herein, the term "include" and its variants are to be read as open-ended terms meaning "including, but not limited to". The term "based on" is to be read as "based, at least in part, on". The terms "one embodiment" and "an embodiment" are to be read as "at least one embodiment".
In view of the above-described drawbacks of the prior art in the background art, the present invention provides a batch task time monitoring method, which can predict the time of a batch task and detect an abnormality of the batch task, and update a task model or generate an alarm according to the prediction and detection results.
FIG. 1 schematically shows a flow diagram of a method for time series exception handling according to one embodiment of the present invention. As shown in fig. 1, a time-series exception handling method according to an embodiment of the present invention includes the following steps:
a. acquiring time sequence data, training the time sequence data, and constructing a model;
b. detecting whether abnormal data exist in the time sequence data obtained in real time according to the model, and if so, recommending part of the abnormal data;
c. judging whether the recommended part of abnormal data is reasonable or not, and then feeding back a judgment result;
d. and optimizing the model according to the judgment result, and then continuously detecting the real-time sequence data.
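Steps a to d can be sketched as a small loop. This is only an illustration under stated assumptions: a mean/standard-deviation score stands in for the real model, `run_feedback_loop` and `judge` are hypothetical names, and the rule "raise the threshold after a rejected alarm" is an assumed placeholder for the optimization step, not the method of the invention.

```python
from typing import Callable, List, Sequence


def run_feedback_loop(
    train: Sequence[float],
    stream: Sequence[float],
    judge: Callable[[int, float], bool],
    threshold: float = 3.0,
) -> List[int]:
    """Run steps a-d once over `stream`; return indices judged truly abnormal."""
    # a. "Train" a stand-in model: the mean and standard deviation of the data.
    mean = sum(train) / len(train)
    std = (sum((v - mean) ** 2 for v in train) / len(train)) ** 0.5 or 1.0
    confirmed: List[int] = []
    for t, value in enumerate(stream):
        # b. Detect: score the new point and skip it if it looks normal.
        score = abs(value - mean) / std
        if score <= threshold:
            continue
        # c. Recommend the point to the judge and collect the feedback.
        if judge(t, value):
            confirmed.append(t)
        else:
            # d. Optimize (assumed rule): a rejected alarm raises the threshold.
            threshold = max(threshold, score)
    return confirmed
```

In this sketch the judge may be an automated module or a person; only points that exceed the current threshold are ever shown to it.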
In practice, the time series data may be represented by $x = \{x_1, x_2, \ldots, x_N\}$, where $N$ is the length of the data $x$ and the data point $x_t$ at any time $t$ is a specific data value. Time series may be collected from many sources, such as networks, transaction links, request logs, and the like. Time series from the same source are more likely to have similar characteristics.
Because the number of time sequences to be monitored in a production environment is extremely large, each production unit can generate dozens or even hundreds of monitoring index data, the index data need to be monitored completely, if the time sequences are trained respectively in a targeted manner, the number of models and consumed resources are extremely large, and the existing operation and maintenance resources are difficult to support. Therefore, before the targeted training stage of the index data, the data are clustered, so that the detection processing time can be greatly reduced, and the abnormity can be quickly and accurately processed.
Specifically, according to an embodiment of the present invention, in the clustering stage of step a, the algorithm uses DBSCAN to capture the correlation among the time series metrics to be trained, and clusters metrics with similar shapes and consistent periodicity. When calculating the similarity between metrics, the distance between them is calculated using DTW (Dynamic Time Warping). DBSCAN does not require predefined category information, and the clustering precision can be controlled by adjusting the clustering radius, so DBSCAN is well suited to metric clustering scenarios.
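The clustering stage can be illustrated with a minimal pure-Python sketch: a textbook DTW distance feeding a textbook DBSCAN over the resulting distance matrix. The function names, the absolute-difference point cost, and the parameters `eps` (the clustering radius) and `min_samples` are assumptions chosen for illustration; the source does not specify them.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with |x - y| as the point cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def dbscan(dist, eps, min_samples):
    """DBSCAN on a precomputed distance matrix; label -1 marks noise."""
    n = len(dist)
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for p in range(n):
        if labels[p] is not None:
            continue
        neighbors = [q for q in range(n) if dist[p][q] <= eps]
        if len(neighbors) < min_samples:
            labels[p] = -1
            continue
        cluster += 1
        labels[p] = cluster
        queue = [q for q in neighbors if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == -1:
                labels[q] = cluster  # former noise becomes a border point
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = [r for r in range(n) if dist[q][r] <= eps]
            if len(q_neighbors) >= min_samples:
                queue.extend(q_neighbors)  # q is a core point: keep expanding
    return labels


def cluster_series(series, eps, min_samples=2):
    """Cluster time series by shape using DTW distances."""
    dist = [[dtw_distance(a, b) for b in series] for a in series]
    return dbscan(dist, eps, min_samples)
```

Because DTW aligns series that are shifted in time, two curves with the same shape but a small lag end up in the same cluster, while a series with a different level is left as noise and would be trained on its own.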
FIG. 2 schematically shows similar metric curves collected from the same switch. As shown in FIG. 2, two network traffic curves for different ports of the same switch exhibit substantially the same trend and scale. In an actual production environment, data of the same type under the same monitoring unit also cluster in this way. Exploiting this property greatly reduces the number of models generated in the model training stage, reduces the resources consumed, and improves the cost-effectiveness of the operation and maintenance tool. In some scenarios, however, the amount of data is small and the accuracy requirement is high, so training on each series individually is more cost-effective than pre-clustering; the clustering stage is therefore an optional step.
As can be seen from the above, in the present invention, acquiring time series data includes acquiring regular small-scale time series data and acquiring irregular large-scale time series data. Regular small-scale time series data only need to be trained on directly, whereas irregular large-scale time series data must first be clustered, after which each class of time series data is trained on and a model is constructed.
According to one embodiment of the invention, feature data capable of representing the corresponding type of time series data are selected for training according to the type of the time series data, and a model is constructed. Different time series data have different characteristics. For example, percentage-type sequence data tend to stay level and show short dips or spikes on failure; service-related transaction sequence data often show periodic peaks and valleys, with a small amount of fluctuation on failure; and infrastructure sequence data such as swap space may rise slowly over time. The invention therefore provides a feature data selection stage in which more suitable feature data are extracted in a targeted manner according to the statistical information and characteristics of the metrics, thereby improving model accuracy. The specific extraction rules are shown in the following table:
Figure BDA0002990156980000091
TABLE 2
In this embodiment, table 2 contains simple and effective feature data that can cover the different features of most curves, and is easy to calculate and performs well.
According to one embodiment of the invention, RRCF is adopted to select all feature data for training, all feature data are iterated to obtain a plurality of decision trees, the decision trees form a decision forest, and then whether abnormal data exist in real-time sequence data is determined through voting of the decision forest.
When a decision tree is constructed, the RRCF selects the segmentation dimension along which to split the feature data, and the probability that the RRCF selects feature $i$ as the segmentation dimension is
Figure BDA0002990156980000092
$g_i = \max_{x \in S}\,(x_j - x_{j-1})$, where $i$ is the feature; $p_i$ is the probability that feature $i$ is selected, with a value between 0 and 1; $l_i$ is the difference between the maximum and minimum values of feature $i$ in the feature set computed from the training sample set; $g_i$ is the maximum difference between two adjacent feature values after the values of feature $i$ computed from the training sample set are sorted by size; $\sum_j g_j$ is the sum of $g_j$ over every feature dimension $j$; and $\sum_j l_j$ is the sum of $l_j$ over every feature dimension $j$.
Specifically, the basic unsupervised anomaly detection algorithm selected by the invention is RRCF (Robust Random Cut Forest). Its detection performance is better than that of other unsupervised anomaly detection algorithms, but its accuracy still falls somewhat short of what practical deployment requires. RRCF trains on all training-sample feature data in batches; each batch of feature data goes through multiple rounds of iteration to produce a decision tree, and all the decision trees finally form a decision forest that decides by voting whether data are abnormal. When constructing a decision tree, a feature to split on must be selected from the multiple dimensions of the feature data. RRCF holds that splitting on a dimension whose data cover a larger range separates the samples better, that is, the probability that feature $i$ is selected is
Figure BDA0002990156980000101
$l_i = \max_{x \in S} x_i - \min_{x \in S} x_i$, where $p_i$ is the probability that feature $i$ is selected, $l_i$ is the difference between the maximum and minimum values of feature $i$, $S$ is the training sample set, and $x_i$ is the value of feature $i$ calculated for one sample in $S$. This, however, does not take into account the distribution of the data within each dimension. According to an embodiment of the invention, when a decision tree is constructed and the dimension to split on is selected, the largest gap between adjacent data values is used as an influencing factor in addition to the range covered by the data in that dimension; that is, in the present invention the probability that feature $i$ is selected is
Figure BDA0002990156980000102
where $g_i = \max_{x \in S}\,(x_j - x_{j-1})$. The larger the maximum gap in the data distribution of a dimension, the higher the degree of discrimination provided by splitting at that gap, so the segmentation dimension is selected more effectively and model accuracy improves.
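The two quantities used in dimension selection, the range $l_i$ and the maximum adjacent gap $g_i$, can be computed per feature as in the minimal sketch below. The function names are hypothetical. The baseline probability $l_i / \sum_j l_j$ is the selection rule of the standard RRCF algorithm; the modified probability of the invention appears only in the equation image and is deliberately not reproduced here.

```python
def range_and_max_gap(values):
    """Return (l, g): the value range and the largest gap between sorted neighbors."""
    s = sorted(values)
    l = s[-1] - s[0]
    g = max((b - a for a, b in zip(s, s[1:])), default=0)
    return l, g


def baseline_rrcf_probabilities(samples):
    """Standard RRCF rule: pick dimension i with probability l_i / sum_j l_j."""
    columns = list(zip(*samples))  # one tuple of values per feature dimension
    ranges = [range_and_max_gap(col)[0] for col in columns]
    total = sum(ranges)
    return [r / total for r in ranges]


samples = [[0, 0], [1, 10], [2, 11], [10, 12]]
stats = [range_and_max_gap(col) for col in zip(*samples)]
probabilities = baseline_rrcf_probabilities(samples)
```

For the toy samples, the first feature has a smaller range than the second but still contains a large gap, which is the kind of information the range alone does not capture.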
Further, when a decision tree is constructed, after each iteration determines a segmentation dimension, a suitable boundary point must be selected on the data of that dimension, and the left and right subtrees are divided at that point. The original RRCF equally divides the data of the dimension and then selects a dividing point at random, without considering the distribution of the data in that dimension. According to one embodiment of the invention, the RRCF equally divides the feature data in the segmentation dimension into $N$ intervals $[l_0, h_0], [l_1, h_1], \ldots, [l_{N-1}, h_{N-1}]$ and calculates the density of each interval, $d_i = \mathrm{Count}(p,\ p \in [l_i, h_i])$, wherein the probability that each interval is selected is
Figure BDA0002990156980000103
Finally, a cut point $X_i \sim \mathrm{Uniform}[l_i, h_i]$ is selected at random from the selected interval, where $l_0$ is the minimum value and $h_{N-1}$ the maximum value of the feature in that dimension over the training set, and the difference between them is divided by $N$ to obtain the $N$ equal intervals. For example, the left and right endpoints of the $i$-th interval are $l_i$ and $h_i$. This selection strategy identifies the sparse part of the segmentation dimension more accurately and thus improves discrimination. In the present embodiment, $d_i$ is the density of the interval, that is, the number of samples within its range; since the intervals have the same width, the more samples an interval contains, the greater its density. Count denotes counting and $p$ denotes each sample falling in the interval, so that the number of samples within $[l_i, h_i]$ is counted. $\mathrm{Uniform}[l_i, h_i]$ denotes the uniform distribution over the interval $[l_i, h_i]$, and $X_i$ is a cut point drawn at random from it.
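The interval construction and cut-point draw can be sketched as follows. The interval-selection probability of the invention is given only in the equation image, so the weight `1 / (1 + d_i)` used here is purely an assumption that favors sparse intervals; the function names are also hypothetical. Bins are half-open for simplicity, which differs from the closed intervals in the text only for samples lying exactly on a boundary.

```python
import random


def interval_densities(values, n_intervals):
    """Split [min, max] into equal intervals and count the samples in each."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("feature has zero range and cannot be split")
    width = (hi - lo) / n_intervals
    bounds = [(lo + i * width, lo + (i + 1) * width) for i in range(n_intervals)]
    counts = [0] * n_intervals
    for v in values:
        # Half-open bins; the maximum value is clamped into the last interval.
        idx = min(int((v - lo) / width), n_intervals - 1)
        counts[idx] += 1
    return bounds, counts


def choose_cut_point(values, n_intervals, rng):
    """Pick an interval, then draw the cut point uniformly inside it."""
    bounds, counts = interval_densities(values, n_intervals)
    # ASSUMPTION: sparse intervals get larger weight; not the patent's formula.
    weights = [1.0 / (1 + d) for d in counts]
    i = rng.choices(range(n_intervals), weights=weights)[0]
    low, high = bounds[i]
    return rng.uniform(low, high)  # X_i ~ Uniform[l_i, h_i]
```

Passing a seeded `random.Random` instance as `rng` makes the draw reproducible when comparing weighting schemes.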
Further, when abnormal data exist, the anomaly score CoDisp of the abnormal data is calculated using the dividing point (a specific node). To calculate the anomaly score CoDisp, the ratio $\mathrm{CoDisp}_{Node}$ between the amounts of abnormal data contained in the sibling subtree and the parent subtree of the dividing point is calculated; the higher the ratio, the more outlying the abnormal data. Since the calculation for each abnormal data item involves a plurality of feature data, the model moves upward step by step from the initial node, and after repeated iterations the largest ratio $\mathrm{CoDisp}_{Node}$ is selected. The anomaly score of abnormal data $x_i$ is
Figure BDA0002990156980000111
The anomaly score $\mathrm{CoDisp}_{x_i}$ is the degree of abnormality calculated for sample $x_i$. First, $x_i$ falls into a leaf of the decision tree; the algorithm then searches upward from the leaf until it finds a branch node Node for which the sample size of the subtree it represents is far smaller than that of its sibling subtree. The final CoDisp of sample $x_i$ is the average, over every tree in the whole forest, of the CoDisp of the Node corresponding to the sample. In the present embodiment, the largest ratio $\mathrm{CoDisp}_{Node}$ is selected in view of the depth at which the node is located: deeper nodes in the tree are more normal. Finding this demarcation point of the sample, at which the subtree containing $x_i$ is isolated from a large number of other samples, is therefore more representative.
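The score can be illustrated with the collusive displacement of the standard RRCF algorithm: for each ancestor on the path of a sample, take the ratio of the sibling subtree size to the size of the subtree containing the sample, keep the maximum, and average over the trees. This is a sketch of that standard definition, which may differ in detail from the formula in the equation image. Trees are represented as nested tuples with string leaves, an assumed toy encoding.

```python
def size(node):
    """Number of leaves under a node; a non-tuple node is a single leaf."""
    return size(node[0]) + size(node[1]) if isinstance(node, tuple) else 1


def contains(node, leaf):
    if isinstance(node, tuple):
        return contains(node[0], leaf) or contains(node[1], leaf)
    return node == leaf


def codisp_in_tree(tree, leaf):
    """Max over the leaf's path of |sibling subtree| / |subtree holding the leaf|."""
    if not contains(tree, leaf):
        raise ValueError("leaf not found in tree")
    best = 0.0
    node = tree
    # Walking down from the root visits the same nodes as walking up from the leaf.
    while isinstance(node, tuple):
        left, right = node
        mine, sibling = (left, right) if contains(left, leaf) else (right, left)
        best = max(best, size(sibling) / size(mine))
        node = mine
    return best


def codisp(forest, leaf):
    """Average the per-tree score over every tree in the forest."""
    return sum(codisp_in_tree(tree, leaf) for tree in forest) / len(forest)


forest = [
    ((("a", "b"), ("c", "d")), "x"),   # "x" is cut off at the root
    (("x", "a"), (("b", "c"), "d")),   # "x" sits one level deeper
]
```

In the toy forest, `"x"` is separated from the bulk of the samples near the root and receives a higher score than `"a"`, which matches the intuition that deeper nodes are more normal.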
Further, in step b, recommending part of the abnormal data means selecting a plurality of the most abnormal segments in the abnormal data and recommending them after labels for the segments are obtained; or
recommending part of the abnormal data means selecting a plurality of the most uncertain segments in the abnormal data and recommending them after labels for the segments are obtained; or
recommending part of the abnormal data means dividing the abnormal data into a plurality of groups according to anomaly score, taking a plurality of segments from each group, and recommending them after labels for the segments are obtained.
FIGS. 3-5 schematically show diagrams of three different schemes for actively recommending abnormal segments. As shown in FIG. 3, according to an embodiment of the present invention, scheme A selects the 30 most abnormal segments; the labels of these segments further confirm explicit anomalies and reduce the false positive rate.
According to another embodiment of the invention, as shown in fig. 4, the scheme B selects the most uncertain 30 segments (i.e., around the anomaly determination threshold), and these labels can help the model to clearly classify the boundary, thereby improving the identification accuracy of the fuzzy anomaly.
As shown in fig. 5, according to the third embodiment of the present invention, the solution C divides the abnormal data into 10 groups according to the abnormal score, each group obtains at most 3 segments, and these labels can capture, for example, attitudes of the judgment feedback module on different abnormal judgment conditions, thereby helping the model determine the optimal threshold selection range.
In experiments disclosing data sets, the F1-score for protocol a was higher than the other two protocols, but each of the other two protocols possessed specific applicable scenarios.
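By way of non-limiting illustration, the three recommendation schemes can be sketched as selection functions over a list of per-segment anomaly scores. The text fixes only the counts (30 segments; 10 groups with at most 3 segments each). The function names, the equal-width grouping in scheme C and the choice of the highest-scoring segments within each group are assumptions made for this sketch.

```python
def most_anomalous(scores, k=30):
    """Scheme A: indices of the k segments with the highest anomaly score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def most_uncertain(scores, threshold, k=30):
    """Scheme B: indices of the k segments whose score lies closest to the threshold."""
    order = sorted(range(len(scores)), key=lambda i: abs(scores[i] - threshold))
    return order[:k]


def stratified(scores, groups=10, per_group=3):
    """Scheme C: split the score range into groups, take at most per_group from each.

    Assumption: groups are equal-width score bands and the highest-scoring
    segments of each band are taken; the text does not specify either choice.
    """
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / groups or 1.0  # avoid a zero width when all scores are equal
    buckets = [[] for _ in range(groups)]
    for i, s in enumerate(scores):
        b = min(int((s - lo) / width), groups - 1)
        buckets[b].append(i)
    picked = []
    for bucket in buckets:
        bucket.sort(key=lambda i: scores[i], reverse=True)
        picked.extend(bucket[:per_group])
    return picked


# Toy scores for six segments; the threshold value is illustrative only.
scores = [1.0, 9.0, 5.0, 4.0, 8.0, 2.0]
pick_a = most_anomalous(scores, k=2)
pick_b = most_uncertain(scores, threshold=5.0, k=2)
pick_c = stratified(scores, groups=2, per_group=1)
```

Scheme A concentrates labels on clear anomalies, scheme B on the decision boundary, and scheme C spreads them over the whole score range, which matches the respective purposes stated above.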
Furthermore, the invention improves the processing efficiency of the model in the online detection stage through several techniques and gives the model the ability to adjust dynamically according to user feedback. In the online detection stage, only extreme abnormal values are selected as automatic feedback data for dynamically adjusting the RRCF model, which reduces the model update frequency and improves detection performance. According to an embodiment of the invention, after the model obtains the abnormal data of n labeled segments, these data and the M decision trees in the model's decision forest jointly form an anomaly score matrix CoDisp_M[x_i][tree_j]. For each abnormal datum x_i, if the user labels it a true positive, the weight of tree_j is updated as tw_j = tw_j + δ × CoDisp_M[x_i][tree_j]. This feedback-driven self-correction helps the model screen out higher-quality decision trees, which then carry more weight in later anomaly determination; selecting the decision trees with higher weight optimizes the model and improves the detection result.
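By way of non-limiting illustration, the weight update tw_j = tw_j + δ × CoDisp_M[x_i][tree_j] can be sketched as below. The function name, the list-of-lists layout of the score matrix and the value of δ are assumptions; the text specifies only the true-positive case, so other labels leave the weights unchanged in this sketch.

```python
def update_tree_weights(weights, codisp_m, true_positive, delta=0.1):
    """Return new per-tree weights after one round of labeled feedback.

    weights[j]        current weight tw_j of tree j
    codisp_m[i][j]    anomaly score of labeled sample x_i in tree j
    true_positive[i]  True if the feedback confirmed x_i as a real anomaly
    """
    updated = list(weights)
    for i, row in enumerate(codisp_m):
        if not true_positive[i]:
            continue  # only the true-positive update is described in the text
        for j, score in enumerate(row):
            updated[j] += delta * score
    return updated


# Two labeled samples, two trees; the second sample was rejected by the reviewer.
new_weights = update_tree_weights(
    weights=[1.0, 1.0],
    codisp_m=[[2.0, 4.0], [10.0, 0.0]],
    true_positive=[True, False],
    delta=0.5,
)
```

A tree that scored a confirmed anomaly highly gains more weight than one that scored it low, which is how higher-quality trees come to dominate later determinations.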
Furthermore, the present invention provides a time-series exception handling apparatus for implementing the time-series exception handling method, as shown in fig. 6, the apparatus including:
the data processing module is used for acquiring time series data, training the time series data and constructing a model;
the abnormal data detection recommendation module detects, according to the model, whether abnormal data exist in the time series data obtained in real time and, if so, recommends part of the abnormal data;
the abnormal data judgment feedback module judges whether the recommended part of the abnormal data is reasonable and then feeds back a judgment result;
and the model optimization module optimizes the model according to the fed-back judgment result and then continues to detect real-time time series data.
According to an embodiment of the present invention, the apparatus further comprises:
a data classification processing module, configured to acquire irregular large-scale time series data, cluster all the time series data, and then train on each class of time series data to construct a model.
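By way of non-limiting illustration, the interplay of the modules (train, detect and recommend, judge, optimize, detect again) can be sketched as four plain functions. This is a deliberately simplified stand-in: a mean-plus-k-standard-deviations threshold replaces the RRCF model, and the way rejected recommendations raise the threshold is an assumption made only to show the feedback loop. All names and parameters are hypothetical.

```python
from statistics import mean, pstdev


def build_model(history, k=3.0):
    """Data processing module (stand-in): 'training' fits a simple threshold."""
    return {"threshold": mean(history) + k * pstdev(history)}


def detect_and_recommend(model, window, top_n=2):
    """Detection/recommendation module: flag points above the threshold, recommend the top_n."""
    flagged = [i for i, v in enumerate(window) if v > model["threshold"]]
    flagged.sort(key=lambda i: window[i], reverse=True)
    return flagged[:top_n]


def collect_feedback(recommended, judge):
    """Judgment feedback module: judge(i) returns True if anomaly i is reasonable."""
    return {i: judge(i) for i in recommended}


def optimize(model, window, feedback, step=0.5):
    """Model optimization module (stand-in): lift the threshold above rejected points."""
    rejected = [window[i] for i, ok in feedback.items() if not ok]
    if not rejected:
        return model
    return {"threshold": max(model["threshold"], max(rejected) + step)}


history = [10, 11, 9, 10, 10, 11, 9, 10]
window = [10, 13, 30, 11]

model = build_model(history)
recommended = detect_and_recommend(model, window)
feedback = collect_feedback(recommended, judge=lambda i: window[i] > 20)
model = optimize(model, window, feedback)
after_feedback = detect_and_recommend(model, window)
```

After one round of feedback the borderline point is no longer reported while the clear anomaly still is, which is the behavior the four modules are meant to produce together.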
In the invention, the data processing module acquires time series data, which includes acquiring regular small-scale time series data and acquiring irregular large-scale time series data; when irregular large-scale time series data are acquired, all the time series data are clustered, and each class of time series data is then trained on to construct a model.
The clustering process captures, through DBSCAN, the association relationships among the time series data to be trained and clusters data that are similar in shape and consistent in periodicity.
In the clustering process, when the similarity of time series data is calculated, the distance between time series is calculated using Dynamic Time Warping (DTW).
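By way of non-limiting illustration, the DTW distance used as the similarity measure can be sketched with the classic dynamic-programming recurrence. The absolute difference as local cost and the absence of a warping window are assumptions; the text names only the algorithm. The function names are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix costs nothing
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def distance_matrix(series):
    """Symmetric matrix of pairwise DTW distances."""
    size = len(series)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            matrix[i][j] = matrix[j][i] = dtw_distance(series[i], series[j])
    return matrix


series = [[1, 2, 3], [1, 2, 2, 3], [0, 0]]
dist = distance_matrix(series)
```

The first two toy series have the same shape at different speeds, so their DTW distance is zero even though their lengths differ. A matrix of this kind can be handed to any DBSCAN implementation that accepts precomputed distances, for example scikit-learn's `DBSCAN(metric="precomputed")`; the neighborhood radius and minimum cluster size are tuning parameters that the text does not specify.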
The data classification processing module selects, according to the type of the time series data, feature data that can represent time series data of that type, and uses them for training to construct a model.
According to one embodiment of the invention, the abnormal data detection recommendation module uses robust random cut forest (RRCF) to train on all the feature data; the feature data are iterated to obtain a plurality of decision trees, the decision trees form a decision forest, and the decision forest then votes to determine whether abnormal data exist in the real-time time series data.
In this embodiment, when constructing a decision tree, the RRCF selects a segmentation dimension along which to split the feature data; the probability that the RRCF selects the feature data in a segmentation dimension is
Figure BDA0002990156980000131
g_i = max_{x∈S}(x_j − x_{j−1}), where i denotes a feature; p_i denotes the probability that feature i is selected, with a value between 0 and 1; l_i denotes the difference between the maximum and minimum values of feature i in the feature set calculated from the training sample set; g_i denotes the largest difference between two adjacent feature values after the values of feature i in that feature set are sorted by size; Σg_j denotes the sum of the g_j calculated for each feature dimension j; and Σl_j denotes the sum of the l_j calculated for each feature dimension j.
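By way of non-limiting illustration, the two per-feature quantities defined above, the range l_i and the largest adjacent gap g_i, together with their normalized shares l_i/Σl_j and g_i/Σg_j, can be computed as below. How these shares are combined into p_i is given only in the figure and is not reproduced here; the function names and toy data are hypothetical.

```python
def feature_range(values):
    """l_i: difference between the largest and smallest value of one feature."""
    return max(values) - min(values)


def max_adjacent_gap(values):
    """g_i: largest difference between neighbouring values after sorting."""
    ordered = sorted(values)
    return max((b - a for a, b in zip(ordered, ordered[1:])), default=0.0)


def shares(per_feature):
    """Normalize per-feature quantities so that they sum to 1."""
    total = sum(per_feature)
    if total == 0:
        return [1.0 / len(per_feature)] * len(per_feature)
    return [v / total for v in per_feature]


# Two feature dimensions; the first contains one far-away value.
columns = [[1.0, 2.0, 3.0, 11.0], [0.0, 1.0, 2.0, 3.0]]
l_vals = [feature_range(c) for c in columns]
g_vals = [max_adjacent_gap(c) for c in columns]
l_share = shares(l_vals)
g_share = shares(g_vals)
```

In the toy data the first dimension has both the wider range and the larger gap, so it receives the larger share under either quantity.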
In this embodiment, the RRCF divides the feature data in the segmentation dimension equally into N intervals [l_0, h_0, l_1, h_1, ..., l_{N−1}, h_{N−1}] and calculates the density of each interval, d_i = Count(p, p ∈ [l_i, h_i]), where the probability that each interval is selected is
Figure BDA0002990156980000141
Finally, a cut point X_i ~ Uniform[l_i, h_i] is chosen at random from the selected interval. Here l_0 denotes the minimum value and h_{N−1} the maximum value of the feature in the feature dimension, computed over the training set; their difference is divided by N to give N equal intervals.
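By way of non-limiting illustration, the equal division into N intervals, the density count d_i and the uniform draw of the cut point can be sketched as below. The interval selection probability is given only in the figure, so the sketch takes the interval weights as an argument and the example passes equal placeholder weights. Counting a boundary point once, in the upper interval, is also an assumption, as are all function names.

```python
import random


def equal_intervals(values, n):
    """Split [min, max] of one feature dimension into n equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n
    edges = [lo + k * width for k in range(n)] + [hi]  # last edge is exactly the maximum
    return list(zip(edges, edges[1:]))


def interval_densities(values, intervals):
    """d_i: number of points falling in interval i (each point counted once)."""
    counts = [0] * len(intervals)
    last = len(intervals) - 1
    for v in values:
        for k, (lo, hi) in enumerate(intervals):
            if lo <= v < hi or (k == last and v == hi):
                counts[k] += 1
                break
    return counts


def sample_cut_point(intervals, weights, rng):
    """Pick an interval according to weights, then draw the cut uniformly inside it."""
    k = rng.choices(range(len(intervals)), weights=weights, k=1)[0]
    lo, hi = intervals[k]
    return rng.uniform(lo, hi)


values = [0.0, 1.0, 2.0, 3.0, 10.0]
intervals = equal_intervals(values, 5)
dens = interval_densities(values, intervals)

# Placeholder weights only; the actual probability depends on d_i as defined in the figure.
placeholder_weights = [1.0] * len(intervals)
cut = sample_cut_point(intervals, placeholder_weights, random.Random(0))
```

The densities show where the data are sparse (here the middle intervals are empty), which is the information the interval selection step relies on.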
When abnormal data exist, the anomaly score CoDisp of the abnormal data is calculated using the cut point; to calculate CoDisp, the ratio CoDisp_Node of the quantities of abnormal data contained in the sibling subtree and the parent subtree of the cut point is computed, and the largest ratio CoDisp_Node is selected. The anomaly score of abnormal data x_i is
Figure BDA0002990156980000142
In the invention, the abnormal data detection recommendation module recommends part of the abnormal data by selecting the plurality of segments presumed to be most abnormal among the abnormal data, obtaining labels for those segments, and then recommending them; or
by selecting the plurality of most uncertain segments among the abnormal data, obtaining labels for those segments, and then recommending them; or
by dividing the abnormal data into a plurality of groups according to the anomaly score, taking a plurality of segments from each group, obtaining labels for those segments, and then recommending them.
According to an embodiment of the present invention, after the model obtains the abnormal data of n labeled segments, these data and the M decision trees in the model's decision forest jointly form an anomaly score matrix CoDisp_M[x_i][tree_j]. For each abnormal datum x_i, if the fed-back judgment result is a true positive, the weight of decision tree tree_j becomes tw_j = tw_j + δ × CoDisp_M[x_i][tree_j]; decision trees with higher weight are selected according to the fed-back judgment results, thereby optimizing the model.
To achieve the above object, the present invention also provides an electronic device comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the above time-series exception handling method.
To achieve the above object, the present invention further provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the above time-series exception handling method.
According to the above scheme, the invention provides an unsupervised, white-box and accurate time series exception handling method that works together with active learning and can collect feedback information actively and efficiently. On top of the traditional unsupervised learning framework, an active learning stage is introduced: anomalies are actively recommended to a judgment feedback party (such as a judgment feedback module or operation and maintenance personnel) and feedback is collected, so that the model is corrected and its accuracy improved. The method retains the advantages of traditional unsupervised learning in parameter tuning and labeling, designs a targeted strategy for applying labeled feedback, and further optimizes the recall rate, detection speed and capacity of the model.
Moreover, the processing method has no obvious bias toward particular data, can adapt to indicators with scenario-specific semantics, can meet operation and maintenance requirements outside the traditional Internet field, has higher scalability and universality, and can give specific causes for the anomalies it reports.
Moreover, the invention can accurately detect and interpret anomalies. It was tested on one public data set and on time series data from the actual production environments of two commercial banks, ultimately reaching F1-scores of 0.81 and 0.89 on the two data sets. Compared with traditional unsupervised exception handling methods, the best F1-score is improved by 0.19 to 0.5 on the two data sets, and the detection time is shortened by 58%.
Those of ordinary skill in the art will appreciate that the modules and algorithm steps described in connection with the embodiments disclosed herein can be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and devices may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical division, and in actual implementation, there may be other divisions, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or modules, and may be in an electrical, mechanical or other form.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, each functional module in the embodiments of the present invention may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the time-series exception handling method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.
It should be understood that the order of execution of the steps in the summary of the invention and the embodiments of the present invention does not absolutely imply any order of execution, and the order of execution of the steps should be determined by their functions and inherent logic, and should not be construed as limiting the process of the embodiments of the present invention.

Claims (15)

Translated from Chinese
1. A time series exception handling method, characterized by comprising the following steps:
acquiring time series data, training on the time series data, and constructing a model;
detecting, according to the model, whether abnormal data exist in time series data obtained in real time and, if so, recommending part of the abnormal data;
judging whether the recommended part of the abnormal data is reasonable, and then feeding back a judgment result;
optimizing the model according to the judgment result, and then continuing to detect real-time time series data.

2. The time series exception handling method according to claim 1, characterized in that acquiring time series data comprises acquiring regular small-scale time series data and acquiring irregular large-scale time series data; when irregular large-scale time series data are acquired, all the time series data are clustered, and each class of time series data is then trained on to construct a model.

3. The time series exception handling method according to claim 2, characterized in that the clustering captures, through DBSCAN, the association relationships among the time series data to be trained and clusters data that are similar in shape and consistent in periodicity.

4. The time series exception handling method according to claim 3, characterized in that, during the clustering, when the similarity of time series data is calculated, the distance between time series data is calculated using a dynamic time warping algorithm.

5. The time series exception handling method according to claim 4, characterized in that, according to the type of the time series data, feature data capable of representing time series data of the corresponding type are selected for training to construct a model.

6. The time series exception handling method according to claim 5, characterized in that robust random cut forest (RRCF) is used to select all the feature data for training; all the feature data are iterated to obtain a plurality of decision trees, the plurality of decision trees form a decision forest, and the decision forest then votes to determine whether abnormal data exist in the real-time time series data.

7. The time series exception handling method according to claim 6, characterized in that, when constructing the decision tree, the RRCF selects a segmentation dimension along which to split the feature data, and in the segmentation dimension the probability that the RRCF selects the feature data is
Figure FDA0002990156970000021
g_i = max_{x∈S}(x_j − x_{j−1}); where i denotes a feature; p_i denotes the probability that feature i is selected, with a value between 0 and 1; l_i denotes the difference between the maximum and minimum values of feature i in the feature set calculated from the training sample set; g_i denotes the largest difference between two adjacent feature values after the values of feature i in that feature set are sorted by size; Σg_j denotes the sum of the g_j calculated for each feature dimension j; and Σl_j denotes the sum of the l_j calculated for each feature dimension j.

8. The time series exception handling method according to claim 7, characterized in that the RRCF divides the feature data in the segmentation dimension equally into N intervals [l_0, h_0, l_1, h_1, ..., l_{N−1}, h_{N−1}] and calculates the density of each interval, d_i = Count(p, p ∈ [l_i, h_i]), where the probability that each interval is selected is
Figure FDA0002990156970000022
and finally a cut point X_i ~ Uniform[l_i, h_i] is chosen at random from the selected interval; where l_0 denotes the minimum value and h_{N−1} the maximum value of the feature in the feature dimension, computed over the training set, and their difference is divided by N to give N equal intervals.

9. The time series exception handling method according to claim 8, characterized in that, when the abnormal data exist, the anomaly score CoDisp of the abnormal data is calculated using the cut point; when the anomaly score CoDisp is calculated, the ratio CoDisp_Node of the quantities of abnormal data contained in the sibling subtree and the parent subtree of the cut point is computed and the largest ratio CoDisp_Node is selected; the anomaly score of abnormal data x_i is
Figure FDA0002990156970000023
T ∈ forest.

10. The time series exception handling method according to claim 9, characterized in that recommending part of the abnormal data is selecting the plurality of segments presumed to be most abnormal among the abnormal data; or
recommending part of the abnormal data is selecting the plurality of most uncertain segments among the abnormal data, obtaining labels for the plurality of segments, and then recommending them; or
recommending part of the abnormal data is dividing the abnormal data into a plurality of groups according to the anomaly score, taking a plurality of segments from each group, obtaining labels for the segments, and then recommending them.

11. The time series exception handling method according to claim 10, characterized in that, after the model obtains the abnormal data of n labeled segments, these data and the m decision trees in the model's decision forest jointly form an anomaly score matrix CoDisp_M[x_i][tree_j]; for each abnormal datum x_i, if the fed-back judgment result is a true positive, the weight of decision tree tree_j is tw_j = tw_j + δ × CoDisp_M[x_i][tree_j]; decision trees with higher weight are selected according to the fed-back judgment results, thereby optimizing the model.

12. A time series exception handling apparatus, characterized by comprising:
a data processing module, configured to acquire time series data, train on the time series data, and construct a model;
an abnormal data detection recommendation module, which detects, according to the model, whether abnormal data exist in time series data obtained in real time and, if so, recommends part of the abnormal data;
an abnormal data judgment feedback module, which judges whether the part of the abnormal data is reasonable and then feeds back a judgment result;
a model optimization module, which optimizes the model according to the fed-back judgment result and then continues to detect real-time time series data.

13. The time series exception handling apparatus according to claim 12, characterized by further comprising:
a data classification processing module, configured to acquire irregular large-scale time series data, cluster all the time series data, and then train on each class of time series data to construct a model.

14. An electronic device, characterized by comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the time series exception handling method according to any one of claims 1 to 11.

15. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the time series exception handling method according to any one of claims 1 to 11.
CN202110313319.XA | 2021-03-24 | 2021-03-24 | Time series exception handling method and device, electronic equipment and storage medium | Pending | CN112905671A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202110313319.XA (CN112905671A (en)) | 2021-03-24 | 2021-03-24 | Time series exception handling method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202110313319.XA (CN112905671A (en)) | 2021-03-24 | 2021-03-24 | Time series exception handling method and device, electronic equipment and storage medium

Publications (1)

Publication Number | Publication Date
CN112905671A (en) | 2021-06-04

Family

ID=76106631

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202110313319.XA (Pending; CN112905671A (en)) | Time series exception handling method and device, electronic equipment and storage medium | 2021-03-24 | 2021-03-24

Country Status (1)

Country | Link
CN (1) | CN112905671A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN113656271A (en)*2021-08-102021-11-16上海浦东发展银行股份有限公司Method, device and equipment for processing user abnormal behaviors and storage medium
CN113656271B (en)*2021-08-102024-06-07上海浦东发展银行股份有限公司Method, device, equipment and storage medium for processing abnormal behaviors of user
CN114066173A (en)*2021-10-262022-02-18福建正孚软件有限公司Capital flow behavior analysis method and storage medium
CN115146174A (en)*2022-07-262022-10-04北京永信至诚科技股份有限公司Key clue recommendation method and system based on multi-dimensional weight model
CN116467666A (en)*2023-04-282023-07-21浙江大学Graph anomaly detection method and system based on integrated learning and active learning
WO2024235326A1 (en)*2023-05-172024-11-21广东恒翼能科技股份有限公司Multivariate time series-based real-time anomaly detection method for lithium battery formation and capacity-grading processes, and electronic device and storage medium

Similar Documents

| Publication | Publication Date | Title |
| CN112905671A (en) | | Time series exception handling method and device, electronic equipment and storage medium |
| CN108415789B (en) | | Node fault prediction system and method for large-scale hybrid heterogeneous storage system |
| CN113572625B (en) | | Fault early warning method, early warning device, equipment and computer medium |
| CN111738308A (en) | | Dynamic threshold detection method of monitoring indicators based on clustering and semi-supervised learning |
| CN113254255B (en) | | A cloud platform log analysis method, system, device and medium |
| Zhang et al. | | Predict failures in production lines: A two-stage approach with clustering and supervised learning |
| CN111325410B (en) | | Universal fault early warning system based on sample distribution and early warning method thereof |
| CN113918367B (en) | | A large-scale system log anomaly detection method based on attention mechanism |
| CN109544399B (en) | | Power transmission equipment state evaluation method and device based on multi-source heterogeneous data |
| CN110083507B (en) | | Method and device for classifying key performance indicators |
| Wang et al. | | Practical and white-box anomaly detection through unsupervised and active learning |
| CN115242457B (en) | | A method, device, electronic device and storage medium for detecting log data |
| CN117527622B (en) | | Data processing method and system of network switch |
| CN113125903A (en) | | Line loss anomaly detection method, device, equipment and computer-readable storage medium |
| CN115587543A (en) | | Tool Remaining Life Prediction Method and System Based on Federated Learning and LSTM |
| EP4033421A1 (en) | | Method and system for predicting a failure of a monitored entity |
| CN116483602A (en) | | Abnormality detection method, abnormality detection device and computer storage medium |
| CN112363891A (en) | | Exception reason obtaining method based on fine-grained event and KPIs analysis |
| CN110753049B (en) | | Safety situation sensing system based on industrial control network flow |
| CN112039907A (en) | | Automatic testing method and system based on Internet of things terminal evaluation platform |
| CN118982347B (en) | | IT operation and maintenance service management method based on big data |
| Aziz et al. | | Cluster Analysis-Based Approach Features Selection on Machine Learning for Detecting Intrusion. |
| CN111475380A (en) | | Log analysis method and device |
| CN110837953A (en) | | Automatic abnormal entity positioning analysis method |
| CN118297640B (en) | | Product marketing management system and method based on big data |

Legal Events

| Date | Code | Title | Description |
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | TA01 | Transfer of patent application right | Effective date of registration: 20210810; Address after: 100029 Beijing city Chaoyang District Yumin Road No. 3; Applicant after: NATIONAL COMPUTER NETWORK AND INFORMATION SECURITY MANAGEMENT CENTER; Address before: 100083 4th floor, block a, Dongsheng building, No. 8, Zhongguancun East Road, Haidian District, Beijing; Applicant before: Beijing Bishi Technology Co.,Ltd.; Applicant before: NATIONAL COMPUTER NETWORK AND INFORMATION SECURITY MANAGEMENT CENTER |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20210604 |
