CN108446184A

Info

Publication number: CN108446184A
Application number: CN201810155161.6A
Authority: CN
Inventors: 张银霞; 付铁山
Original assignee: Beijing Tianyuan Creative Technology Ltd
Current assignee: Beijing Tianyuan Creative Technology Ltd
Priority date: 2018-02-23
Filing date: 2018-02-23
Publication date: 2018-08-24
Anticipated expiration: 2038-02-23
Also published as: CN108446184B

Abstract

The present invention provides a method and system for analyzing the root cause of faults. The method includes: sorting the data in a data set by time attribute and splitting the data set according to a preset time window to obtain several sub-datasets; obtaining the frequent itemsets and association rules in the data set according to the Apriori algorithm, each frequent itemset containing a certain number of data items with a strong association; and sorting the data in a frequent itemset by time attribute and matching the data in order, earliest first, against the accompanying-alarm cause data pre-stored in an alarm cause database. If a match succeeds, the matched data is removed from the data set and the comparison continues with the next data item; the earliest data that fails to match is taken as the root cause of the data that is last in time order in the frequent itemset. The invention is suitable for full-dimension monitoring of IT systems, relieves the pressure on operation and maintenance personnel, and lowers the skill requirements placed on them.

Description

Method and system for analyzing the root cause of faults

Technical field

The present invention relates to the field of data mining, and more particularly to a method and system for analyzing the root cause of faults.

Background art

With the trend toward informatization and digitization, IT systems are becoming more complex and IT architectures increasingly sophisticated. An existing IT architecture includes at least hardware, virtual machines, containers, middleware, databases, applications, and business. Once a business function fails, a large or even massive amount of alarm/event information may be reported, so that maintenance personnel faced with this volume of alarm/event data cannot locate the fault accurately. Fault localization also places high demands on the skills of maintenance personnel and requires staff with extensive operation, maintenance, and development experience.

Root cause analysis based on association rule mining is an important method for fault diagnosis and localization in IT systems. Most existing fault analysis relies on threshold judgment: a threshold or threshold range is set on a KPI, and an alarm is reported when the threshold is crossed. This kind of judgment is not well suited to analyzing the performance problems of an IT system. When user response slows down because of the network or other reasons, a large number of response-time threshold-crossing alarms are reported. Because these are not faults that prevent users from using the IT system, most users tolerate them, find that the system soon returns to normal, and neither pay attention nor complain. Once a customer does complain, however, the performance alarms are very numerous and difficult to analyze.

Summary of the invention

The present invention provides a method and system for analyzing the root cause of faults that overcome the above problems or at least partially solve them.

According to one aspect of the present invention, a method for analyzing the root cause of faults is provided, comprising:

S1: sorting the data in a data set by time attribute, and splitting the data set according to a preset time window to obtain several sub-datasets;

S2: obtaining the frequent itemsets and association rules in the data set according to the Apriori algorithm, each frequent itemset containing a certain number of data items with a strong association;

S3: sorting the data in a frequent itemset by time attribute, and matching the data in order, earliest first, against the accompanying-alarm cause data pre-stored in an alarm cause database; if a match succeeds, removing that data and continuing with the next item; and finally taking the earliest data that fails to match as the root cause of the data that is last in time order in the frequent itemset;

wherein the data set contains the alarm data of each domain in the IT system within a preset time range, the error data in the logs, and the abnormal performance data in the performance data set.

Preferably, before step S1 the method further comprises:

obtaining, by APM probes, the alarm data and performance data of each domain in the IT system within the preset time range, and obtaining the error data of the log data in the IT system by a log collection method;

screening out the abnormal performance data in the performance data using a mean and a multiple of the variance;

forming the data set from the alarm data of each domain in the IT system within the preset time range, the error data in the logs, and the abnormal performance data in the performance data.

Preferably, the step of screening out the abnormal performance data in the performance data using a mean and a multiple of the variance specifically comprises:

sorting the performance data in the performance data set according to a preset rule, and taking the median of the performance data as the mean;

calculating the variance of the performance data, filtering out the performance data whose values lie within the range from the mean to 3 times the variance, and taking the remaining performance data as the abnormal performance data.

Preferably, step S2 specifically comprises:

counting the support of all data in the data set and sorting from high to low to obtain the candidate 1-itemsets, then removing the data whose support is below the minimum support from the candidate 1-itemsets to obtain the frequent 1-itemsets;

applying the level-wise search of the Apriori algorithm until the frequent m-itemsets are obtained, subject to the following conditions: the set of frequent m-itemsets is not empty and their (m-1)-subsets are frequent, m does not exceed the number of data items in the sub-dataset that contains the most data, and the set of (m+1)-itemsets is empty;

listing all items of the frequent m-itemsets and generating association rules according to the Apriori algorithm.

Preferably, after step S3 the method further comprises: displaying the association rules and the root cause.

Preferably, the IT system includes one or more of the following domains: business, network, application, database, external interface, container, virtual machine, and physical storage.

According to another aspect of the present invention, a system for analyzing the root cause of faults is also provided, comprising:

a splitting module, configured to sort the data in a data set by time attribute and split the data set according to a preset time window to obtain several sub-datasets;

an association module, configured to obtain the frequent itemsets and association rules in the data set according to the Apriori algorithm, each frequent itemset containing a certain number of data items with a strong association;

a root cause search module, configured to sort the data in a frequent itemset by time attribute and match the data in order, earliest first, against the accompanying-alarm cause data pre-stored in an alarm cause database; if a match succeeds, remove that data from the data set and continue with the next item; and finally take the earliest data that fails to match as the root cause of the data that is last in time order in the frequent itemset.

Preferably, the system further comprises a data set acquisition module, which specifically comprises:

a collection unit, configured to obtain, by APM probes, the alarm data and performance data of each domain in the IT system within the preset time range, and to obtain the error data of the log data in the IT system by a log collection method;

a screening unit, configured to screen out the abnormal performance data in the performance data using a mean and a multiple of the variance;

an aggregation unit, configured to form the data set from the alarm data of each domain in the IT system within the preset time range, the error data in the logs, and the abnormal performance data in the performance data.

Preferably, the screening unit is specifically configured to:

sort the performance data in the performance data set according to a preset rule and take the median as the mean;

Preferably, the association module is specifically configured to:

The method and system for analyzing the root cause of faults proposed by the present invention divide the fault data within a certain period into multiple sub-datasets by time window, find the frequent itemsets with a strong association using the Apriori algorithm, and match the earliest data in such a frequent itemset against a preset alarm cause database; if the match succeeds, that data is taken as the root cause. The invention is suitable for full-dimension monitoring of IT systems, relieves the pressure on operation and maintenance personnel, and lowers the skill requirements placed on them.

Brief description of the drawings

Fig. 1 is a flowchart of the method for analyzing the root cause of faults according to an embodiment of the present invention;

Fig. 2 is a functional block diagram of the system for analyzing the root cause of faults according to an embodiment of the present invention.

Detailed description

The specific implementation of the present invention is described in further detail below with reference to the drawings and embodiments. The following embodiments are intended to illustrate the present invention, not to limit its scope.

To overcome the above problems of the prior art, an embodiment of the present invention provides a method for analyzing the root cause of faults. The design idea is as follows. Every alarm is ultimately caused by some root fault, and a root fault causes other alarms, called accompanying alarms, to be generated along with it, producing an alarm storm. It is therefore necessary to find the earliest fault data in time order and, using association rule mining, to obtain groups of associated fault data that satisfy the minimum support as frequent itemsets. The earliest fault data in a frequent itemset is compared with the preset alarm cause database. If the comparison succeeds, the fault data is root cause data; if it does not, the next-earliest fault data is compared with the alarm cause database, and so on until a match succeeds. In practical tests, the method of this embodiment found valuable alarm association rules and root causes quickly and accurately, providing decision support for system maintenance personnel.

Specifically, Fig. 1 shows a flowchart of the method for analyzing the root cause of faults according to an embodiment of the present invention. As shown in the figure, the method comprises:

101: sorting the data in a data set by time attribute, and splitting the data set according to a preset time window to obtain several sub-datasets; the data set contains the alarm data of each domain in the IT system within a preset time range, the error data in the logs, and the abnormal performance data in the performance data set.

It should be noted that the method first collects the fault data within a certain time range. This includes the alarm data of each domain (in the IT field an error is a mistake, for example a network error is raised when the network is disconnected; an alarm is an unhandled error, so an alarm always means that an error has occurred that the IT system has not handled), the error data in the logs, and the abnormal performance data in the performance data set. After data cleaning, these fault data are stored in a database to form the data set used for subsequent problem tracing and root cause localization. From the perspective of linear versus nonlinear data, performance data is linear, whereas alarm data and error data are nonlinear.

As is well known to those skilled in the art, log data has levels, such as info, debug, and error, and the embodiment determines the log category by level. Debug-level data is the lowest level of log data and is generally not output during normal operation. Info-level log data reports the current state of the system to the end user, so the information output at this level should be meaningful to the end user; that is, the end user should be able to understand what it means. From a certain point of view, Info output can be regarded as part of the software product (just like the text on its interactive interfaces). Error-level data is the error data: some repair work can be carried out, but there is no guarantee that the system will continue to work normally. At some later stage the current problem may lead to an unrecoverable error (such as a crash), or the system may keep working until it stops without any serious problem appearing.
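As a minimal sketch of level-based classification, the snippet below keeps only error-level records. The line format (`<timestamp> <LEVEL> <message>`) and the name `extract_error_records` are assumptions made for illustration; the source does not specify a log format or a collection tool.

```python
from typing import Iterable, List, Tuple

def extract_error_records(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, message) pairs for error-level log lines.

    Assumed (hypothetical) line format: "<timestamp> <LEVEL> <message>".
    """
    errors = []
    for line in lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 3:
            continue  # skip lines that do not fit the assumed format
        timestamp, level, message = parts
        if level.upper() == "ERROR":
            errors.append((timestamp, message))
    return errors

# Hypothetical sample lines, one per log level mentioned in the text.
sample_log = [
    "2018-02-23T10:00:01 DEBUG cache warmed",
    "2018-02-23T10:00:05 INFO service started",
    "2018-02-23T10:03:12 ERROR database connection refused",
]
error_records = extract_error_records(sample_log)
```

Only the third line survives the filter, so it alone would enter the data set as error data.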

In a computer system, every piece of data has a time attribute, which indicates, for example, the start time and end time of the data. Sorting the data in the data set by start time gives the order in which the data (i.e., the faults) occurred. Splitting the data set by time window then assigns the data to the sub-datasets corresponding to the different time windows. For example, to analyze the 120 minutes of data from 10:00 to 12:00 on a given day with a window granularity of 10 minutes, the data is split into 12 groups, each containing some abnormal performance data and alarm/error data. The idea of this embodiment is that when a performance problem occurs, many problems may break out at the same moment, but most of them concentrate on a few key problems, which can be found by probability analysis over the split sets. The data mining method of this embodiment helps find the data with a relatively high probability of occurrence within a large amount of data.
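The sorting and windowing step can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: events are modelled as `(timestamp, label)` pairs, and the function name `split_by_window` and the sample events are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def split_by_window(events, start, window):
    """Group (timestamp, label) events into consecutive time windows.

    Returns a dict mapping window index -> labels in time order.
    Events are assumed to lie at or after `start`.
    """
    groups = defaultdict(list)
    for ts, label in sorted(events):        # sort by time attribute
        index = (ts - start) // window      # integer index of the window
        groups[index].append(label)
    return dict(groups)

start = datetime(2018, 2, 23, 10, 0)
window = timedelta(minutes=10)
events = [
    (datetime(2018, 2, 23, 10, 3), "CPU usage high"),
    (datetime(2018, 2, 23, 10, 1), "memory usage high"),
    (datetime(2018, 2, 23, 10, 15), "response time high"),
    (datetime(2018, 2, 23, 11, 59), "database alarm"),
]
sub_datasets = split_by_window(events, start, window)
window_count = timedelta(minutes=120) // window  # 12 windows for 10:00 to 12:00
```

Each resulting group plays the role of one transaction in the later association analysis.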

102: obtaining the frequent itemsets and association rules in the data set according to the Apriori algorithm, each frequent itemset containing a certain number of data items with a strong association;

It should be noted that the Apriori algorithm is a representative algorithm for association rule mining and is used to mine the frequent itemsets of Boolean association rules. A frequent itemset, as the name suggests, is a set of data that occurs frequently in the data set. The embodiment uses the Apriori algorithm to mine the association rules in the data set and obtain sets of data with a strong association, where a strong association satisfies a strong rule, that is, both the minimum support and the minimum confidence. The design idea of the embodiment is that data with a strong association have a high probability of being related as a root alarm and its accompanying alarms.

103: sorting the data in a frequent itemset by time attribute, and matching the data in order, earliest first, against the accompanying-alarm cause data pre-stored in the alarm cause database; if a match succeeds, removing that data and continuing with the next item; and finally taking the earliest data that fails to match as the root cause of the data that is last in time order in the frequent itemset.

It should be noted that a frequent itemset is a set of several data items with a strong association, for example the frequent itemset (memory usage high, CPU usage high). Sorting the data in the frequent itemset by time attribute shows that high memory usage occurred before high CPU usage, so high memory usage is taken as the possible root cause and matched against the accompanying-alarm cause data pre-stored in the alarm cause database. The alarm cause database is an operation and maintenance management database created by the embodiment; it stores alarm data and the related accompanying alarms, and it can be updated by operation and maintenance personnel as needed. In this embodiment the data are sorted in chronological order, so the first item is the earliest in time. If that item matches the accompanying-alarm cause data, it is removed from the frequent itemset and the next-earliest item is matched, and so on until an item that cannot be matched is found.
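Step 103 as stated above can be sketched as a single pass over the time-ordered items. The function name `find_root_cause`, the representation of the alarm cause database as a plain set of labels, and the sample data are all assumptions for illustration.

```python
def find_root_cause(frequent_itemset, first_seen, accompanying_alarms):
    """Return the earliest item that does not match a known accompanying alarm.

    frequent_itemset: iterable of item labels
    first_seen: dict mapping label -> occurrence time (any sortable value)
    accompanying_alarms: set of labels pre-stored as accompanying alarms
    """
    for item in sorted(frequent_itemset, key=lambda label: first_seen[label]):
        if item in accompanying_alarms:
            continue  # matched: drop it and move on to the next item
        return item   # first unmatched item in time order
    return None       # every item matched, so no root cause was identified

# Hypothetical frequent itemset with hypothetical occurrence times.
itemset = {"CPU usage high", "memory usage high", "network jitter alarm"}
first_seen = {
    "network jitter alarm": 1,
    "memory usage high": 2,
    "CPU usage high": 3,
}
known_accompanying = {"network jitter alarm"}
root_cause = find_root_cause(itemset, first_seen, known_accompanying)
```

Here the earliest item is a known accompanying alarm and is skipped, so the next item in time order, `memory usage high`, is reported.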

On the basis of the above embodiment, the method further comprises a process of obtaining the data set before step 101. Specifically, the process comprises:

001: obtaining, by APM probes, the alarm data and performance data of each domain in the IT system within the preset time range, and obtaining the error data of the log data in the IT system by a log collection method.

Table 1 shows the domains of the IT system in the embodiment and the performance data to be collected for each.

Table 1: Domains of the IT system and their performance data

002: obtaining the performance data set of the IT system within the preset time range, and screening out the abnormal performance data in the performance data using a mean and a multiple of the variance.

It should be noted that performance data is linear: under normal conditions it is stable and tends toward a straight line, whereas when an anomaly occurs it shows an uneven curve. The embodiment therefore screens abnormal performance data using the median and a multiple of the variance.

003: forming the data set from the alarm data of each domain in the IT system within the preset time range, the error data in the logs, and the abnormal performance data in the performance data.

On the basis of the above embodiment, the step of screening out the abnormal performance data in the performance data using a mean and a multiple of the variance specifically comprises:

sorting the performance data in the performance data set according to a preset rule (for example, from largest to smallest) and taking the median as the mean;

It should be noted that the embodiment uses the median as the mean because the median is more robust to outliers. For example, for the values 1, 2, 3, 4, 100 the arithmetic mean is 22 but the median is 3, so the median is clearly the more reasonable baseline for finding outliers.

calculating the variance of the performance data, filtering out the performance data whose values lie within the range from the mean to 3 times the variance, and taking the remaining performance data as the abnormal performance data.
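A hedged sketch of this screening step is shown below. It centers on the median and, following the wording of the text, uses 3 times the variance as the default band. The `spread` parameter is an addition for illustration: the standard deviation is a common alternative in practice, and the example shows how much the choice matters. The sample values are hypothetical.

```python
import statistics

def screen_anomalies(values, k=3.0, spread=statistics.pvariance):
    """Return the values farther than k * spread(values) from the median.

    spread defaults to the population variance, as the text states;
    passing statistics.pstdev gives the more common k-sigma band.
    """
    center = statistics.median(values)
    limit = k * spread(values)
    return [v for v in values if abs(v - center) > limit]

# Hypothetical response-time samples with one obvious spike.
samples = [10, 11, 9, 10, 12, 10, 11, 9, 10, 95]
by_variance = screen_anomalies(samples)
by_stdev = screen_anomalies(samples, spread=statistics.pstdev)
```

With these samples the variance band is so wide that nothing is flagged, while the standard-deviation band flags only the spike at 95. This suggests the multiplier and the spread measure should be tuned to the scale of each metric.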

On the basis of the above embodiment, step 102 specifically comprises:

listing all items of the frequent m-itemsets and generating association rules according to the Apriori algorithm.

It should be noted that abnormal performance data are themselves discrete, nonlinear data points. In general, the data in most applications are generated by one or more programs that reflect system functions. When an underlying application does not run in its normal mode, abnormal performance data are produced, and finding them quickly and efficiently is very valuable. In an IT system, problems occur in chains: a root cause occurs first and does not heal by itself, so other problems occur along with it and form an alarm storm. Therefore, when performing association analysis on abnormal performance data, the embodiment mines association rules as follows:

Find all frequent itemsets, a frequent itemset being a set whose frequency of occurrence is not less than the minimum support; then generate strong association rules from the frequent itemsets. These strong association rules must satisfy both the minimum support and the minimum confidence.

Specifically, calculate the probability that memory usage and CPU usage both exceed their preset thresholds, that is, the number of times the two problems occur together in a data set divided by the total number of abnormal performance data records in the data set. For example: Support({memory high} --> {CPU usage high}) = number of times memory high and CPU usage high occur together / number of data records = 3/5 = 60%.

Finding strong association rules

It should be noted that the previous step picked out the data with a higher probability of occurrence from the massive performance data and abnormal performance data; the next step is to find strong association rules by probability analysis. The embodiment uses conditional probability analysis, for example calculating the probability that CPU usage is also high given that memory is high, and conversely that CPU usage is low when memory usage is low. For example: Confidence({memory high} --> {CPU usage high}) = number of times memory high and CPU usage high occur together / number of times memory high occurs = 3/3 = 100%; Confidence({CPU usage high} --> {memory high}) = number of times memory high and CPU usage high occur together / number of times CPU usage high occurs = 3/4 = 75%.
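The support and confidence figures in the example can be reproduced with a few lines of code. The five records below are a hypothetical data set constructed only so that the counts match the numbers quoted in the text (3 joint occurrences, 3 occurrences of memory high, 4 of CPU usage high, 5 records); the function names are likewise illustrative.

```python
def support_count(transactions, itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(transactions, itemset):
    """Fraction of transactions that contain itemset."""
    return support_count(transactions, itemset) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated probability of consequent given antecedent."""
    joint = support_count(transactions, antecedent | consequent)
    return joint / support_count(transactions, antecedent)

# Hypothetical records chosen to match the counts in the example.
records = [
    {"memory high", "CPU usage high"},
    {"memory high", "CPU usage high"},
    {"memory high", "CPU usage high"},
    {"CPU usage high"},
    {"response time high"},
]
mem, cpu = {"memory high"}, {"CPU usage high"}
s = support(records, mem | cpu)            # 3/5
c_mem_to_cpu = confidence(records, mem, cpu)  # 3/3
c_cpu_to_mem = confidence(records, cpu, mem)  # 3/4
```

Note that support is symmetric in the two itemsets while confidence is not, which is why the two rule directions receive different scores.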

To better understand the Apriori algorithm used in the embodiment, its basic concepts are explained first:

1. Itemsets and k-itemsets

Let I = {i1, i2, i3, ..., id} be the set of all items (i.e., data) in the data set, and let T = {t1, t2, t3, ..., tN} be the set of all transactions (i.e., time windows); the itemset contained in each transaction ti is a subset of I. In association analysis, a collection of zero or more items is called an itemset. An itemset containing k items is called a k-itemset. The empty set is an itemset that contains no items. For example, in the example of the present invention, {CPU usage high, response time high, memory usage high} is a 3-itemset. Table 2 shows a data set of the embodiment, where TID1 denotes the subset corresponding to the first time window. As shown in Table 2, TID1 contains two items: CPU high and response time high.

Table 2: Data set table

2. Support count

An important property of an itemset is its support count, that is, the number of transactions that contain the itemset. Mathematically, the support count σ(X) of an itemset X can be expressed as σ(X) = |{ti | X ⊆ ti, ti ∈ T}|, where |·| denotes the number of elements in a set. In the embodiment described by Table 2, the support count of the itemset {latency high, memory usage high, response time high} is 2, because only transactions 3 and 4 contain all three items.

3. Association rules

An association rule is an implication expression of the form X → Y, where X and Y are disjoint itemsets, i.e., X ∩ Y = ∅. The strength of an association rule can be measured by its support and its confidence. Support determines how often the rule applies to a given data set, and confidence determines how frequently Y appears in transactions that contain X.

The formal definitions of support (s) and confidence (c) are as follows:

s(X → Y) = σ(X ∪ Y) / N

c(X → Y) = σ(X ∪ Y) / σ(X)

where σ(X ∪ Y) is the support count of X ∪ Y, N is the total number of transactions, and σ(X) is the support count of X.

Example

In the embodiment described by Table 2, consider the rule {response time high, memory usage high} → {latency high}. Since the support count of the itemset {response time high, memory usage high, latency high} is 2 and the total number of transactions is 5, the support of the rule is 2/5 = 0.4. The confidence of the rule is the support count of the itemset {response time high, memory usage high, latency high} divided by the support count of the itemset {response time high, memory usage high}. Since 3 transactions contain both response time high and memory usage high, the confidence of the rule is 2/3 ≈ 0.67.

Association rule discovery

Given a set of transactions T, association rule discovery means finding all rules whose support is greater than or equal to minsup (the minimum support) and whose confidence is greater than or equal to minconf (the minimum confidence), where minsup and minconf are the corresponding support and confidence thresholds.

Association rule mining is a two-step process:

(1) Frequent itemset generation: the goal is to find all itemsets that satisfy the minimum support threshold (that is, itemsets that occur at least as often as the predefined minimum support count); these itemsets are called frequent itemsets.

(2) Rule generation: the goal is to extract all high-confidence rules from the frequent itemsets found in the previous step; these rules are called strong rules (they must satisfy both the minimum support and the minimum confidence).

In essence, the Apriori algorithm finds frequent itemsets using candidates. Apriori is one of the most influential algorithms for mining the frequent itemsets of Boolean association rules. Its name reflects the fact that the algorithm uses prior knowledge of frequent itemset properties, as will be seen below. Apriori uses an iterative method called level-wise search, in which k-itemsets are used to explore (k+1)-itemsets. First, the set of frequent 1-itemsets is found and denoted L1. L1 is used to find L2, the set of frequent 2-itemsets, L2 is used to find L3, and so on until no more frequent k-itemsets can be found. Finding each Lk requires one scan of the database.

The Apriori property: all nonempty subsets of a frequent itemset must also be frequent. The property is based on the following observation. By definition, if an itemset I does not satisfy the minimum support threshold s, then I is not frequent, i.e., P(I) < s. If an item A is added to I, the resulting itemset (i.e., I ∪ A) cannot occur more frequently than I. Therefore I ∪ A is not frequent either, i.e., P(I ∪ A) < s. This property belongs to a special category called antimonotone, meaning that if a set fails a test, all of its supersets fail the same test. It is called antimonotone because the property is monotone in the sense of failing the test.

For the Apriori algorithm, if a set is a frequent itemset, all of its subsets are frequent itemsets. For example, suppose the set {memory high, CPU usage high} is a frequent itemset, i.e., memory high and CPU usage high appear together in a record at least min_support times. Then the subsets {memory high} and {CPU usage high} must each appear at least min_support times, so they are frequent itemsets as well. Conversely, if a set is not a frequent itemset, none of its supersets is a frequent itemset. For example, suppose the set {memory high} is not a frequent itemset, i.e., memory high appears fewer than min_support times. Then any superset such as {memory high, CPU usage high} must also appear fewer than min_support times, so the superset is not a frequent itemset either.
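The property can be checked numerically on a small data set. The records and the threshold below are hypothetical, and the helper names are illustrative; the point is only that a superset can never be counted more often than any of its subsets.

```python
from itertools import combinations

def support_count(transactions, itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)

def subsets_are_frequent(transactions, itemset, min_count):
    """Check that every nonempty proper subset reaches min_count."""
    items = sorted(itemset)
    return all(
        support_count(transactions, sub) >= min_count
        for size in range(1, len(items))
        for sub in combinations(items, size)
    )

# Hypothetical records and minimum support count.
records = [
    {"memory high", "CPU usage high"},
    {"memory high", "CPU usage high"},
    {"memory high", "CPU usage high"},
    {"CPU usage high"},
    {"response time high"},
]
pair = ("memory high", "CPU usage high")
pair_count = support_count(records, pair)
pair_subsets_ok = subsets_are_frequent(records, pair, min_count=2)

# Downward direction: an infrequent item makes every superset infrequent.
rare_count = support_count(records, ("response time high",))
rare_superset_count = support_count(records, ("response time high", "CPU usage high"))
```

The pair occurs 3 times, so both single items occur at least 3 times; `response time high` occurs once, so no superset of it can occur more than once.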

The key to the Apriori algorithm is how to find Lk from Lk-1, which consists of the following two steps:

Join step: to find Lk, a set of candidate k-itemsets is generated by joining Lk-1 with itself. This set of candidates is denoted Ck. Let l1 and l2 be itemsets in Lk-1. The notation li[j] denotes the j-th item of li (for example, l1[k-2] is the second-to-last item of l1). For convenience, assume that the items in a transaction or itemset are sorted in lexicographic order. Perform the join Lk-1 ⋈ Lk-1, where elements of Lk-1 are joinable if their first (k-2) items are the same; that is, elements l1 and l2 of Lk-1 are joinable if (l1[1] = l2[1]) ∧ (l1[2] = l2[2]) ∧ ... ∧ (l1[k-2] = l2[k-2]) ∧ (l1[k-1] < l2[k-1]). The condition l1[k-1] < l2[k-1] simply ensures that no duplicates are generated. The itemset resulting from joining l1 and l2 is l1[1] l1[2] … l1[k-1] l2[k-1].

Prune step: Ck is a superset of Lk; that is, its members may or may not be frequent, but all frequent k-itemsets are included in Ck. A scan of the database determines the count of each candidate in Ck and thereby determines Lk (by definition, all candidates whose count is not less than the minimum support count are frequent and therefore belong to Lk). However, Ck may be very large, which makes the computation heavy. To compress Ck, the Apriori property can be used as follows: no non-frequent (k-1)-itemset can be a subset of a frequent k-itemset. Therefore, if any (k-1)-subset of a candidate k-itemset is not in Lk-1, the candidate cannot be frequent and can be deleted from Ck. This subset test can be completed quickly using a hash tree of all frequent itemsets.
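The join and prune steps can be sketched together in one function. This is an illustrative sketch: itemsets are represented as sorted tuples, a Python `set` stands in for the hash tree mentioned above, and the name `apriori_gen` and the sample input are assumptions rather than part of the source.

```python
from itertools import combinations

def apriori_gen(prev_frequent):
    """Generate candidate k-itemsets from the frequent (k-1)-itemsets.

    prev_frequent: iterable of sorted tuples, all of the same length k-1.
    """
    prev = sorted(prev_frequent)
    prev_set = set(prev)
    candidates = []
    for i, a in enumerate(prev):
        for b in prev[i + 1:]:
            # Join step: first k-2 items equal, last item of a < last item of b.
            if a[:-1] == b[:-1] and a[-1] < b[-1]:
                candidate = a + (b[-1],)
                # Prune step: every (k-1)-subset must itself be frequent.
                if all(sub in prev_set
                       for sub in combinations(candidate, len(a))):
                    candidates.append(candidate)
    return candidates

# Hypothetical frequent 2-itemsets.
l2 = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]
c3 = apriori_gen(l2)
```

The join produces `("A", "B", "C")` and `("B", "C", "D")`; the second is pruned because its subset `("C", "D")` is not in the input, so only the first remains to be counted against the database.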

Generating association rules from frequent itemsets

Once the frequent itemsets have been found from the transactions in database D, generating strong association rules from them is straightforward (strong association rules satisfy both the minimum support and the minimum confidence). Confidence can be computed with the following formula, in which the conditional probability is expressed in terms of itemset support counts: confidence(A → B) = P(B | A) = support(A ∪ B) / support(A), where support(A ∪ B) is the support count of A ∪ B and support(A) is the support count of A. Based on this formula, association rules can be generated as follows:

F1: for each frequent itemset l, generate all nonempty subsets of l.

F2: for each nonempty subset s of l, if support(l)/support(s) ≥ min_conf, output the rule s → (l − s), where min_conf is the minimum confidence threshold. Since the rules are generated from frequent itemsets, each rule automatically satisfies the minimum support. The frequent itemsets and their supports are stored in hash tables in advance so that they can be accessed quickly.
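Steps F1 and F2 can be sketched as follows. The dictionary of support counts plays the role of the hash table; the function name `generate_rules` is hypothetical, and the counts reuse the memory/CPU figures quoted earlier in the text. The sketch enumerates proper nonempty subsets only, since taking s = l would leave an empty consequent.

```python
from itertools import combinations

def generate_rules(support_counts, min_conf):
    """Return (antecedent, consequent, confidence) for each strong rule.

    support_counts: dict mapping frozenset -> support count, containing
    every frequent itemset together with all of its nonempty subsets.
    """
    rules = []
    for itemset, count in support_counts.items():
        if len(itemset) < 2:
            continue  # a rule needs a nonempty antecedent and consequent
        for size in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), size):
                s = frozenset(subset)
                conf = count / support_counts[s]
                if conf >= min_conf:
                    rules.append((s, itemset - s, conf))
    return rules

counts = {
    frozenset({"memory high"}): 3,
    frozenset({"CPU usage high"}): 4,
    frozenset({"memory high", "CPU usage high"}): 3,
}
strong_rules = generate_rules(counts, min_conf=0.8)
```

With a minimum confidence of 0.8, only {memory high} → {CPU usage high} (confidence 1.0) is kept; the reverse rule has confidence 0.75 and is dropped.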

The Apriori algorithm is illustrated below with an example. In this example the data set has 9 time windows, i.e., 9 sub-datasets, so |D| = 9. Sub-dataset T1 contains data I1, I2, and I5; T2 contains I2 and I4; T3 contains I2 and I3; T4 contains I1, I2, and I4; T5 contains I1 and I3; T6 contains I2 and I3; T7 contains I1 and I3; T8 contains I1, I2, I3, and I5; and T9 contains I1, I2, and I3.

I. Mining frequent itemsets

1. In the first iteration of the algorithm, each item is a member of the set C1 of candidate 1-itemsets. The algorithm simply scans all transactions and counts the number of occurrences of each item.

2. Assume the minimum support count is 2 (that is, min_sup = 2/9 ≈ 22%). The set L1 of frequent 1-itemsets can then be determined; it consists of the candidate 1-itemsets that satisfy the minimum support.

3. To discover the set L2 of frequent 2-itemsets, the algorithm joins L1 with itself (L1 × L1) to generate the set C2 of candidate 2-itemsets.

4. The transactions in D are scanned, and the support count of each candidate in C2 is calculated.

5. The set L2 of frequent 2-itemsets is determined; it consists of the candidate 2-itemsets in C2 that satisfy the minimum support.

6. The generation of the set C3 of candidate 3-itemsets is detailed in the figure. First, let C3 = L2 × L2 = {{I1, I2, I3}, {I1, I2, I5}, {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5}, {I2, I4, I5}}. According to the Apriori property, all subsets of a frequent itemset must be frequent, so it can be determined that the last 4 candidates cannot be frequent. They are therefore deleted from C3, so that their counts need not be obtained when D is subsequently scanned to determine L3. Note that, because the Apriori algorithm uses a level-wise search, given a k-itemset it is only necessary to check whether its (k-1)-subsets are frequent.

[Process of generating C3 by joining L2 with L2]

1. Join: C3 = L2 × L2 = {{I1, I2}, {I1, I3}, {I1, I5}, {I2, I3}, {I2, I4}, {I2, I5}} × {{I1, I2}, {I1, I3}, {I1, I5}, {I2, I3}, {I2, I4}, {I2, I5}} = {{I1, I2, I3}, {I1, I2, I5}, {I1, I3, I5}, {I2, I3, I4}, {I2, I3, I5}, {I2, I4, I5}}

2. Prune using the Apriori property: all subsets of a frequent itemset must be frequent.

The 2-item subsets of {I1, I2, I3} are {I1, I2}, {I1, I3} and {I2, I3}. All 2-item subsets of {I1, I2, I3} are elements of L2. Therefore, {I1, I2, I3} is retained in C3.

The 2-item subsets of {I1, I2, I5} are {I1, I2}, {I1, I5} and {I2, I5}. All 2-item subsets of {I1, I2, I5} are elements of L2. Therefore, {I1, I2, I5} is retained in C3.

The 2-item subsets of {I1, I3, I5} are {I1, I3}, {I1, I5} and {I3, I5}. {I3, I5} is not an element of L2 and is therefore not frequent. Thus, {I1, I3, I5} is deleted from C3.

The 2-item subsets of {I2, I3, I4} are {I2, I3}, {I2, I4} and {I3, I4}. {I3, I4} is not an element of L2 and is therefore not frequent. Thus, {I2, I3, I4} is deleted from C3.

The 2-item subsets of {I2, I3, I5} are {I2, I3}, {I2, I5} and {I3, I5}. {I3, I5} is not an element of L2 and is therefore not frequent. Thus, {I2, I3, I5} is deleted from C3.

The 2-item subsets of {I2, I4, I5} are {I2, I4}, {I2, I5} and {I4, I5}. {I4, I5} is not an element of L2 and is therefore not frequent. Thus, {I2, I4, I5} is deleted from C3.

3. After pruning, C3 = {{I1, I2, I3}, {I1, I2, I5}}.

7. The transactions in D are scanned to determine L3, which consists of the candidate 3-itemsets in C3 that satisfy the minimum support.

8. The algorithm uses L3 × L3 to generate the set C4 of candidate 4-itemsets. Although the join produces {{I1, I2, I3, I5}}, this itemset is pruned because its subset {I1, I3, I5} is not frequent. Thus C4 = ∅, the algorithm terminates, and all frequent itemsets have been found.
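The worked example can be reproduced with a compact level-wise search. This is a sketch under stated assumptions: the function name `apriori` is hypothetical, each sub-dataset is modeled as a Python set, and the join is simplified to "union of two frequent (k-1)-itemsets that has exactly k items", which after pruning yields the same candidates as the join described in the text.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Level-wise search; returns {itemset: support count} for all frequent itemsets."""
    def count(cands):
        # One scan of the data set per level: count the transactions containing each candidate.
        return {c: sum(1 for t in transactions if c <= t) for c in cands}

    items = {frozenset([i]) for t in transactions for i in t}
    level = {c: n for c, n in count(items).items() if n >= min_count}
    frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join: unions of two frequent (k-1)-itemsets that contain exactly k items.
        cands = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        cands = {c for c in cands
                 if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = {c: n for c, n in count(cands).items() if n >= min_count}
        frequent.update(level)
        k += 1
    return frequent


# The 9 sub-datasets T1..T9 of the example, with minimum support count 2.
D = [
    {"I1", "I2", "I5"}, {"I2", "I4"}, {"I2", "I3"},
    {"I1", "I2", "I4"}, {"I1", "I3"}, {"I2", "I3"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
]
result = apriori(D, min_count=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

On this data the search stops at k = 4, where the only joined candidate {I1, I2, I3, I5} is pruned, matching step 8 above.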

On the basis of the above embodiments, the IT system includes one or more of the following domains: business, network, application, database, external interface, container, virtual machine and physical storage.

On the basis of the above embodiments, the method further includes, after step 103: displaying the association rules and the root cause. It should be noted that displaying the association rules and the root cause provides decision support for operation and maintenance personnel.

Fig. 2 shows a functional block diagram of the system for analyzing the root cause of a fault according to an embodiment of the present invention. As shown, the system includes:

Cutting module 201, configured to sort the data in the data set according to the time attribute of each piece of data and to cut the data set according to a preset time window to obtain several groups of sub-datasets; the data set contains the alarm data of each domain in the IT system within a preset time range, the error data in the logs, and the abnormal performance data in the performance data.

It should be noted that the cutting module of this system may first collect the fault data within a certain time range, including the alarm data of each domain (in the IT field, an error is a mistake, for example a network error is reported when the network is disconnected; an alarm is an unhandled error, so an alarm certainly means that an error has occurred, but one that the IT system has not handled), the error data in the logs, and the abnormal performance data in the performance data. After data cleaning, the fault data are stored in a database and constitute the data set, which is used for subsequent problem tracing and root cause localization. From the linear/nonlinear point of view, performance data are linear data, whereas alarm data and error data are nonlinear data.

In a computer system, every piece of data has time attributes, that is, attributes indicating the start time, end time, and so on of the data. Sorting the data in the data set by start time gives the order in which the data (that is, the faults) occurred. The data set is then cut according to the time window, so that the data in the data set are assigned to the sub-datasets corresponding to different time windows. For example, to analyze the 120 minutes of data from 10:00 to 12:00 on a given day with a window granularity of 10 minutes, the data are split into 12 groups, each containing several pieces of abnormal performance data and alarm/error data. The idea behind the embodiment of the present invention is that when a performance problem occurs, many problems may break out at the same moment, but most of them stem from a few key problems, and those key problems can be found by performing probability analysis on the resulting sets. The data mining system of the embodiment of the present invention helps to find the data with a relatively high probability of occurrence in a large amount of data.
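The sorting and window cutting described above can be sketched as follows. The record shape `(start_time, item_id)`, the function name `split_into_windows` and the sample item names are assumptions for illustration; the text does not specify how the data are stored.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def split_into_windows(records, start, window):
    """Sort records by start time and group them into fixed-width time windows.

    records: iterable of (start_time, item_id) pairs (hypothetical record shape).
    Returns a list of sets, one per non-empty window, in time order.
    """
    buckets = defaultdict(set)
    for ts, item in sorted(records, key=lambda r: r[0]):
        # Integer index of the window this record falls into.
        index = (ts - start) // window
        buckets[index].add(item)
    return [buckets[i] for i in sorted(buckets)]


# Example: data between 10:00 and 12:00 cut with a 10-minute window.
t0 = datetime(2018, 2, 23, 10, 0)
records = [
    (t0 + timedelta(minutes=3), "cpu_high"),
    (t0 + timedelta(minutes=1), "db_timeout"),
    (t0 + timedelta(minutes=14), "db_timeout"),
    (t0 + timedelta(minutes=119), "net_down"),
]
windows = split_into_windows(records, t0, timedelta(minutes=10))
print(windows)
```

Each resulting set plays the role of one sub-dataset (one transaction) in the subsequent frequent itemset mining; windows with no data are simply omitted in this sketch.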

Association module 202, configured to obtain the frequent itemsets and association rules in the data set according to the Apriori algorithm, where a frequent itemset contains a certain number of data having a strong association relationship.

Root cause search module 203, configured to sort the data in a frequent itemset according to their time attributes, and to match the earliest-ranked data in turn against the accompanying alarm cause data pre-stored in the alarm cause database; if the match succeeds, the data is removed and matching continues with the next item; finally, the earliest-ranked data that fails to match is taken as the root cause of the data that is last in time in the frequent itemset.
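One possible reading of the root cause search is sketched below. It assumes that "matching" means exact membership of an item in a pre-stored set of accompanying alarm causes, and that each item in the frequent itemset carries a single start time; the function name `find_root_cause` and the sample item names are hypothetical.

```python
def find_root_cause(itemset_times, accompanying_alarms):
    """Return (root_cause, last_item) for one frequent itemset.

    itemset_times: dict mapping each item of the frequent itemset to its start time.
    accompanying_alarms: set of items pre-stored as accompanying alarm causes
                         (assumed representation of the alarm cause database).
    root_cause is None when every item matches an accompanying alarm cause.
    """
    if not itemset_times:
        return None, None
    # Sort the items of the frequent itemset by time, earliest first.
    ordered = sorted(itemset_times, key=itemset_times.get)
    last_item = ordered[-1]
    for item in ordered:
        if item in accompanying_alarms:
            continue  # match succeeded: remove the item and compare the next one
        return item, last_item  # earliest item that fails to match
    return None, last_item


times = {"ping_loss": 0, "disk_full": 1, "db_slow": 2, "order_fail": 3}
print(find_root_cause(times, {"ping_loss"}))
```

In this example "ping_loss" is skipped because it is a known accompanying alarm, so "disk_full" is reported as the root cause of the last item, "order_fail".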

On the basis of the above embodiments, the system of the embodiment of the present invention further includes a data set acquisition module, which specifically includes:

On the basis of the various embodiments described above, the screening unit is specifically configured to:

On the basis of the various embodiments described above, the association module is specifically configured to:

The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the above technical solution in essence, or the part of it that contributes over the prior art, can be embodied in the form of a software product. The computer software product can be stored in a computer-readable storage medium, such as a ROM/RAM, magnetic disk or optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods of the embodiments or of certain parts of the embodiments.

Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for analyzing the root cause of a fault, characterized by comprising:

S1, sorting the data in a data set according to the time attribute of each piece of data, and cutting the data set according to a preset time window to obtain several groups of sub-datasets;

S2, obtaining the frequent itemsets and association rules in the data set according to the Apriori algorithm, a frequent itemset containing a certain number of data having a strong association relationship;

S3, sorting the data in a frequent itemset according to their time attributes, and matching the earliest-ranked data in turn against the accompanying alarm cause data pre-stored in an alarm cause database; if the match succeeds, removing the data and continuing to match the next item; and finally taking the earliest-ranked data that fails to match as the root cause of the data that is last in time in the frequent itemset;

wherein the data set contains the alarm data of each domain in an IT system within a preset time range, the error data in the logs, and the abnormal performance data in the performance data.

2. The method according to claim 1, characterized in that, before step S1, the method further comprises:

obtaining, by means of APM probes, the alarm data and performance data of each domain in the IT system within the preset time range, and obtaining, by means of a log collection end, the error data in the log data of the IT system;

forming the data set from the alarm data of each domain in the IT system within the preset time range, the error data in the logs, and the abnormal performance data in the performance data.

3. The method according to claim 2, characterized in that the step of screening out the abnormal performance data in the performance data using the mean and a multiple of the variance specifically comprises:

calculating the variance of the performance data, filtering out the performance data whose values lie within the range from the mean to 3 times the variance, and taking the remaining performance data as the abnormal performance data.

4. The method according to claim 1, characterized in that step S2 specifically comprises:

counting the support of all data in the data set and sorting the data by support from high to low to obtain the candidate 1-itemsets, and removing from the candidate 1-itemsets the data whose support is below the minimum support to obtain the frequent 1-itemsets;

applying the level-wise search of the Apriori algorithm until the frequent m-itemsets are obtained, the frequent m-itemsets satisfying the conditions that: the set of frequent m-itemsets is not empty and their (m-1)-subsets are frequent, m is not greater than the number of data in the sub-dataset containing the most data, and the set of (m+1)-itemsets is empty;

5. The method according to claim 1, characterized in that, after step S3, the method further comprises: displaying the association rules and the root cause.

6. The method according to claim 1, characterized in that the IT system includes one or more of the following domains: business, network, application, database, external interface, container, virtual machine and physical storage.

7. A system for analyzing the root cause of a fault, characterized by comprising:

a cutting module, configured to sort the data in a data set according to the time attribute of each piece of data, and to cut the data set according to a preset time window to obtain several groups of sub-datasets;

an association module, configured to obtain the frequent itemsets and association rules in the data set according to the Apriori algorithm, a frequent itemset containing a certain number of data having a strong association relationship;

a root cause search module, configured to sort the data in a frequent itemset according to their time attributes, and to match the earliest-ranked data in turn against the accompanying alarm cause data pre-stored in an alarm cause database; if the match succeeds, to remove the data and continue to match the next item; and finally to take the earliest-ranked data that fails to match as the root cause of the data that is last in time in the frequent itemset;

8. The system according to claim 7, characterized by further comprising a data set acquisition module, the data set acquisition module specifically comprising:

a collection unit, configured to obtain, by means of APM probes, the alarm data and performance data of each domain in the IT system within the preset time range, and to obtain, by means of a log collection end, the error data in the log data of the IT system;

a convergence unit, configured to form the data set from the alarm data of each domain in the IT system within the preset time range, the error data in the logs, and the abnormal performance data in the performance data.

9. The system according to claim 8, characterized in that the screening unit is specifically configured to:

10. The system according to claim 7, characterized in that the association module is specifically configured to:

apply the level-wise search of the Apriori algorithm until the frequent m-itemsets are obtained, the frequent m-itemsets satisfying the conditions that: the set of frequent m-itemsets is not empty and their (m-1)-subsets are frequent, m is not greater than the number of data in the sub-dataset containing the most data, and the set of (m+1)-itemsets is empty;