Summary of the Invention
To overcome the technical problems in the related art, the present invention provides a method and a device for IT fault root cause analysis using temporal correlation, so that after a fault occurs its cause can be analyzed and the fault eliminated promptly.
In a first aspect, embodiments of the present invention provide a method for IT fault root cause analysis using temporal correlation, a feasible technical solution of which is as follows:
A method for IT fault root cause analysis using temporal correlation, the method comprising:
Obtaining system logs;
Extracting key fields of the system logs, and counting the key fields to obtain time series data of the system logs;
Automatically extracting correlated features of the time series data based on quantitative hypothesis testing;
When an IT fault occurs, testing the correlated features of the time series data by Granger causality, wherein the magnitude of the causality value between the correlated features of the time series data serves as the basis for evaluating the cause of the IT fault.
With reference to the first aspect, in a possible implementation of the first aspect, extracting the key fields of the system logs and counting the key fields to obtain the time series data of the system logs comprises:
Extracting the key fields of the system logs;
Counting key indicator parameters of the system logs to obtain the time series data of the system logs;
Wherein the key indicator parameters comprise one of, or a combination of two or more of, access count, permission change, and error message.
With reference to the first aspect, in a possible implementation of the first aspect, extracting the key fields of the system logs and counting the key fields to obtain the time series data of the system logs further comprises:
Parameterizing the key fields;
Building a parameter relationship graph for the parameterized key fields obtained from the system logs;
Testing the correlated features of the time series data by Granger causality comprises: testing the parameterized key fields by Granger causality.
With reference to the first aspect, in a possible implementation of the first aspect, using the magnitude of the causality value between the correlated features of the time series data as the basis for evaluating the cause of the IT fault comprises:
Testing the parameterized key fields by Granger causality to obtain the causality values of the parameterized key fields;
Building a quantitative causality graph of the IT fault according to the causality values.
With reference to the first aspect, in a possible implementation of the first aspect, using the magnitude of the causality value between the correlated features of the time series data as the basis for evaluating the cause of the IT fault further comprises:
Determining the maximum path in the quantitative causality graph as the IT fault propagation path.
In a second aspect, embodiments of the present invention further provide a device for IT fault root cause analysis using temporal correlation, a feasible technical solution of which is as follows:
The device comprises:
An acquisition module, for obtaining system logs;
A statistics module, for extracting key fields of the system logs and counting the key fields to obtain time series data of the system logs;
An automatic extraction module, for automatically extracting correlated features of the time series data based on quantitative hypothesis testing;
A fault determination module, for testing the correlated features of the time series data by Granger causality when an IT fault occurs, wherein the magnitude of the causality value between the correlated features of the time series data serves as the basis for evaluating the cause of the IT fault.
In the above device, the statistics module comprises:
An extraction submodule, for extracting the key fields of the system logs;
A statistics submodule, for counting key indicator parameters of the system logs to obtain the time series data of the system logs;
Wherein the key indicator parameters comprise one of, or a combination of two or more of, access count, permission change, and error message.
In the above device, the statistics module further comprises:
A parameterization setting module, for parameterizing the key fields;
A parameter graph building module, for building a parameter relationship graph for the parameterized key fields obtained from the system logs;
The fault determination module is further configured to test the parameterized key fields by Granger causality.
In the above device, the fault determination module is further configured to:
Test the parameterized key fields by Granger causality to obtain the causality values of the parameterized key fields;
Build a quantitative causality graph of the IT fault according to the causality values.
In the above device, the fault determination module further comprises:
A path determination submodule, for determining the maximum path in the quantitative causality graph as the IT fault propagation path.
The present invention extracts the key fields of the system logs and counts them to obtain time series data of the system logs. After the Granger causality graph is built, the cause of the fault is determined by computing the causality values between the correlated features of the time series data in the graph, and new parameters can be added to the Granger causality graph continuously. The fault root cause analysis process is thus completed automatically by machine learning, which helps the user find the cause of a fault quickly, reduces the fault diagnosis time (Mean Time To Diagnose, MTTD), and restores the system to normal as soon as possible.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present invention.
Detailed Description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Before the exemplary embodiments are discussed in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the steps as a sequential process, many of the steps can be carried out in parallel, concurrently, or simultaneously. In addition, the order of the steps can be rearranged. A process may be terminated when its operations are completed, but it may also have additional steps not included in the drawings. A process may correspond to a method, function, procedure, subroutine, subprogram, and so on.
The present invention relates to a method for IT fault root cause analysis using temporal correlation and to a corresponding device. It is mainly applied in scenarios where an enterprise IT system must eliminate network faults promptly after they occur. Its basic idea is: extract the key fields of the system logs as time series data and automatically extract correlated features from the time series data; when the IT system fails, test the temporal correlated features using Granger causality; take the path with the largest causality value in the graph as the fault propagation path and use it as the basis for a solution to the fault, or match it against a fault knowledge database to find the best fault solution. The cause of the fault can thus be found quickly, the fault diagnosis time (MTTD) reduced, and the system restored to normal as soon as possible.
This embodiment is applicable to rapid troubleshooting on an IT terminal that has a machine learning module. The method can be performed by the machine learning module, which can be implemented in software and/or hardware and can be applied, for example, in a log application such as Rizhiyi. Fig. 1 is a schematic flowchart of the method provided by embodiment one of the present invention; the method specifically comprises the following steps:
In step 110, system logs are obtained;
The system logs include the system log, application log, security log, and so on generated while the operating system of the device is running. In one feasible embodiment, they can be obtained by entering eventvwr.msc in the Run dialog of the operating system to open the system's Event Viewer.
In step 120, the key fields of the system logs are extracted, and the key fields are counted to obtain the time series data of the system logs;
A key field of the system logs may be information representing a particular type.
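Step 120 can be illustrated with a minimal sketch. The log line layout (an ISO timestamp followed by a key field), the field names, the 60-second bucket width, and the function name `count_key_fields` are all assumptions made for illustration; the source does not specify a log format or a counting interval.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log lines: "<ISO timestamp> <key field> <free text>"
SAMPLE_LOG = [
    "2024-01-01T10:00:05 ACCESS user=alice",
    "2024-01-01T10:00:40 ERROR disk full",
    "2024-01-01T10:01:10 ACCESS user=bob",
    "2024-01-01T10:01:15 ACCESS user=carol",
    "2024-01-01T10:02:30 PERMISSION_CHANGE user=bob",
]


def count_key_fields(lines, bucket_seconds=60):
    """Return {key_field: [count per time bucket]} for the given log lines.

    Buckets are contiguous and start at the earliest timestamp seen.
    """
    events = []
    for line in lines:
        stamp, field = line.split(" ", 2)[:2]
        events.append((datetime.fromisoformat(stamp), field))
    if not events:
        return {}
    start = min(t for t, _ in events)
    counts = Counter()
    for t, field in events:
        bucket = int((t - start).total_seconds() // bucket_seconds)
        counts[(field, bucket)] += 1
    n_buckets = max(b for _, b in counts) + 1
    fields = sorted({f for f, _ in counts})
    # A Counter returns 0 for buckets in which a field never occurred.
    return {f: [counts[(f, b)] for b in range(n_buckets)] for f in fields}


series = count_key_fields(SAMPLE_LOG)
```

Each key field becomes one equally spaced count series, which is the form the later correlation and causality steps operate on.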
In step 130, the correlated features of the time series data are automatically extracted based on quantitative hypothesis testing;
In one implementation scenario of an exemplary embodiment of the present invention, feature extraction can be carried out by computing the correlation between each feature and the response variable, by training a pre-selected model that can score the features, by feature selection through deep learning, or in similar ways. The correlated features of the time series data may then be autocorrelation parameters, partial correlation parameters, lag period parameters, and so on.
In the quantitative hypothesis test, a percentage (%) is generally used as the test level. Correlated features above the test level are extracted, while features below the test level are treated as irrelevant and filtered out.
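One way to read this filtering step is sketched below for autocorrelation features. It is an illustration under stated assumptions, not the method of the invention: the function names, the test level `alpha=0.05`, and the large-sample bound z/sqrt(n) for the autocorrelation of an uncorrelated series are all choices made here, since the source gives neither a concrete level nor a concrete test statistic.

```python
from statistics import NormalDist, fmean


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    mean = fmean(series)
    denom = sum((v - mean) ** 2 for v in series)
    if denom == 0:
        return 0.0  # a constant series carries no correlation information
    num = sum(
        (series[t] - mean) * (series[t - lag] - mean)
        for t in range(lag, len(series))
    )
    return num / denom


def significant_lags(series, max_lag, alpha):
    """Keep the lags whose autocorrelation rejects 'no correlation' at level alpha.

    Assumption: under the null hypothesis the sample autocorrelation is
    approximately normal with standard deviation 1 / sqrt(n).
    """
    bound = NormalDist().inv_cdf(1 - alpha / 2) / len(series) ** 0.5
    kept = {}
    for lag in range(1, max_lag + 1):
        r = autocorrelation(series, lag)
        if abs(r) > bound:
            kept[lag] = r
    return kept


# Hypothetical count series that alternates between quiet and busy intervals.
periodic = [0, 10] * 20
features = significant_lags(periodic, max_lag=3, alpha=0.05)
```

Lags that pass the test are kept as correlated features; the rest are filtered out as irrelevant, mirroring the extract-or-filter rule described above.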
In step 140, when an IT fault occurs, the correlated features of the time series data are tested by Granger causality, wherein the magnitude of the causality value between the correlated features of the time series data serves as the basis for evaluating the cause of the IT fault.
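The Granger causality test in step 140 can be sketched as a comparison of two least-squares regressions: one predicting a series from its own past, and one that also uses the past of a second series. The sketch below uses only the standard library. Treating the resulting F statistic as the "causality value" is an assumption made here, as are the function names, the lag of 1, and the synthetic data; the source does not state how the value is computed.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(rows, targets):
    """Residual sum of squares of the least-squares fit of targets on rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum(
        (y - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, y in zip(rows, targets)
    )


def granger_f(x, y, lag=1):
    """F statistic for 'the past of x helps predict y beyond the past of y'."""
    times = range(lag, len(y))
    targets = [y[t] for t in times]
    # Restricted model: intercept plus lagged values of y only.
    own = [[1.0] + [y[t - k] for k in range(1, lag + 1)] for t in times]
    # Unrestricted model: additionally the lagged values of x.
    full = [row + [x[t - k] for k in range(1, lag + 1)] for row, t in zip(own, times)]
    rss_r, rss_u = rss(own, targets), rss(full, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic example: y follows x with a delay of one step, plus small noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 0.1) for t in range(1, 200)]
f_xy = granger_f(x, y)  # x -> y
f_yx = granger_f(y, x)  # y -> x
```

In this synthetic case the statistic for the direction x to y is far larger than for the reverse direction, which is the asymmetry the method relies on when ranking candidate causes. A production implementation would more likely call an established statistics library than hand-rolled regression.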
In another implementation scenario of an exemplary embodiment of the present invention, as shown in Fig. 2, extracting the key fields of the system logs and counting the key fields to obtain the time series data of the system logs further comprises:
In step 121, the key fields are parameterized;
In step 122, a parameter relationship graph is built for the parameterized key fields obtained from the system logs;
Testing the correlated features of the time series data by Granger causality comprises step 123: testing the parameterized key fields by Granger causality.
In another implementation scenario of an exemplary embodiment of the present invention, as shown in Fig. 3, using the magnitude of the causality value between the correlated features of the time series data as the basis for evaluating the cause of the IT fault comprises:
In step 131, the parameterized key fields are tested by Granger causality to obtain the causality values of the parameterized key fields;
When matching the key fields of the system logs against the fault knowledge database fails, the new fault phenomenon, new fault cause, and new fault solution of the new fault are added to the fault knowledge database according to the solution the user feeds back for the new fault. That is, through machine learning, the key fields extracted from the system logs are taken as key indicator parameters, a parameter relationship graph is built for the key indicator parameters of the system logs, and step 140 is then carried out, wherein the key indicator parameters comprise one of, or a combination of two or more of, access count, permission change, and error message.
In step 132, the quantitative causality graph of the IT fault is built according to the causality values.
In one implementation scenario of an exemplary embodiment of the present invention, the key indicator parameters can be tested pairwise with the Granger causality test, and the quantitative causality graph of the fault is built from the computed causality values. The path with the largest causality value is regarded as the fault propagation path, and this fault propagation path is also the cause of the IT fault.
In another implementation scenario of an exemplary embodiment of the present invention, using the magnitude of the causality value between the correlated features of the time series data as the basis for evaluating the cause of the IT fault further comprises:
Determining the maximum path in the quantitative causality graph as the IT fault propagation path.
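Selecting the maximum path from the quantitative causality graph can be sketched as follows. The source does not define "maximum path" precisely, so this sketch assumes it means the simple directed path whose edge causality values have the largest sum. The function name, parameter names, and edge weights are hypothetical, and the exhaustive search is only practical for the small graphs formed by a handful of key indicator parameters.

```python
def max_weight_path(edges):
    """Return (total, path) for the heaviest simple path in a directed graph.

    `edges` maps (cause, effect) pairs to a non-negative causality value.
    """
    graph = {}
    for (src, dst), weight in edges.items():
        graph.setdefault(src, []).append((dst, weight))
        graph.setdefault(dst, [])

    best = (0.0, [])

    def extend(node, path, total):
        nonlocal best
        if len(path) > 1 and total > best[0]:
            best = (total, list(path))
        for nxt, weight in graph[node]:
            if nxt not in path:  # keep the path simple (no repeated nodes)
                path.append(nxt)
                extend(nxt, path, total + weight)
                path.pop()

    for start in graph:
        extend(start, [start], 0.0)
    return best


# Hypothetical pairwise causality values between key indicator parameters.
causality = {
    ("access_count", "error_message"): 7.5,
    ("permission_change", "access_count"): 4.0,
    ("permission_change", "error_message"): 2.0,
    ("error_message", "access_count"): 0.5,
}
total, path = max_weight_path(causality)
```

With these illustrative values, the selected path runs from permission change through access count to error message, so the permission change would be examined first as the root cause.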
The method of the present invention extracts the key fields of the system logs and counts them to obtain time series data of the system logs. After the Granger causality graph is built, the cause of the fault is determined by computing the causality values between the correlated features of the time series data in the graph, and new parameters can be added to the Granger causality graph continuously. The fault root cause analysis process is thus completed automatically by machine learning, which helps the user find the cause of a fault quickly, reduces the fault diagnosis time (Mean Time To Diagnose, MTTD), and restores the system to normal as soon as possible.
Fig. 4 is a schematic structural diagram of a device for IT fault root cause analysis using temporal correlation provided by embodiment five of the present invention. The device can be implemented in software and/or hardware, is generally integrated in a machine learning module, and can be realized by carrying out the method for IT fault root cause analysis using temporal correlation. As shown in the figure, this embodiment, building on the above embodiments, provides a device for IT fault root cause analysis using temporal correlation, which mainly comprises an acquisition module 410, a statistics module 420, an automatic extraction module 430, and a fault determination module 440.
The acquisition module 410 is for obtaining system logs;
The statistics module 420 is for extracting key fields of the system logs and counting the key fields to obtain time series data of the system logs;
The automatic extraction module 430 is for automatically extracting correlated features of the time series data based on quantitative hypothesis testing;
The fault determination module 440 is for testing the correlated features of the time series data by Granger causality when an IT fault occurs, wherein the magnitude of the causality value between the correlated features of the time series data serves as the basis for evaluating the cause of the IT fault.
In another implementation scenario of an exemplary embodiment of the present invention, as shown in Fig. 5, the statistics module 420 comprises:
An extraction submodule 421, for extracting the key fields of the system logs;
A statistics submodule 422, for counting key indicator parameters of the system logs to obtain the time series data of the system logs;
Wherein the key indicator parameters comprise one of, or a combination of two or more of, access count, permission change, and error message.
In another implementation scenario of an exemplary embodiment of the present invention, the statistics module further comprises:
A parameterization setting module, for parameterizing the key fields;
A parameter graph building module, for building a parameter relationship graph for the parameterized key fields obtained from the system logs;
The fault determination module is further configured to test the parameterized key fields by Granger causality.
In the above device, the fault determination module 440 is further configured to:
Test the parameterized key fields by Granger causality to obtain the causality values of the parameterized key fields;
Build a quantitative causality graph of the IT fault according to the causality values.
In the above device, the fault determination module 440 further comprises:
A path determination submodule, for determining the maximum path in the quantitative causality graph as the IT fault propagation path.
The device for IT fault root cause analysis using temporal correlation provided in the above embodiment can execute the method for IT fault root cause analysis using temporal correlation provided in any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to executing that method. For technical details not described in detail in the above embodiment, refer to the method for IT fault root cause analysis using temporal correlation provided in any embodiment of the present invention.
It will be appreciated that the present invention also extends to computer programs adapted to put the invention into practice, particularly computer programs on or in a carrier. The program may be in the form of source code, object code, a code intermediate between source and object code such as a partially compiled form, or in any other form suitable for use in implementing the method according to the present invention. It will also be noted that such a program may have many different architectural designs. For example, program code implementing the functionality of the method or system according to the present invention may be subdivided into one or more subroutines.
Many different ways of distributing the functionality among these subroutines will be apparent to the skilled person. The subroutines may be stored together in one executable file to form a self-contained program. Such an executable file may comprise computer-executable instructions, for example processor instructions and/or interpreter instructions (e.g. Java interpreter instructions). Alternatively, one or more or all of the subroutines may be stored in at least one external library file and linked with a main program either statically or dynamically, for example at run time. The main program contains at least one call to at least one of the subroutines. The subroutines may also comprise function calls to each other. An embodiment relating to a computer program product comprises computer-executable instructions corresponding to each processing step of at least one of the methods set forth. These instructions may be subdivided into subroutines and/or stored in one or more files that may be linked statically or dynamically.
Another embodiment relating to a computer program product comprises computer-executable instructions corresponding to each means of at least one of the systems and/or products set forth. These instructions may be subdivided into subroutines and/or stored in one or more files that may be linked statically or dynamically.
The carrier of a computer program may be any entity or device capable of carrying the program. For example, the carrier may include a storage medium, such as a ROM (for example a CD-ROM or a semiconductor ROM) or a magnetic recording medium (for example a floppy disk or a hard disk). Further, the carrier may be a transmissible carrier such as an electrical or optical signal, which may be conveyed via electrical or optical cable, by radio, or by other means. When the program is embodied in such a signal, the carrier may be constituted by such a cable or other device or means. Alternatively, the carrier may be an integrated circuit in which the program is embedded, the integrated circuit being adapted to perform, or to be used in the performance of, the relevant method.
It should be noted that the above-mentioned embodiments illustrate rather than limit the present invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. Use of the verb "comprise" and its conjugations does not exclude the presence of elements or steps other than those stated in a claim. The article "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
If desired, the different functions discussed herein may be performed in a different order and/or concurrently with one another. In addition, if desired, one or more of the functions described above may be optional or may be combined.
If desired, the steps discussed above are not limited to the order of execution in each embodiment; different steps may be performed in a different order and/or concurrently with one another. In addition, in other embodiments, one or more of the steps described above may be optional or may be combined.
Although various aspects of the present invention are set out in the independent claims, other aspects of the present invention comprise combinations of features from the described embodiments and/or the dependent claims with the features of the independent claims, and not only the combinations explicitly set out in the claims.
It should be noted here that although example embodiments of the present invention have been described above, these descriptions should not be understood in a limiting sense. On the contrary, several changes and modifications can be made without departing from the scope of the present invention as defined in the appended claims.
Those skilled in the art will appreciate that each module in the device of the embodiments of the present invention can be implemented with a general-purpose computing device, and that the modules can be concentrated on a single computing device or on a network of computing devices. The device in the embodiments of the present invention corresponds to the method in the preceding embodiments; it can be implemented by executable program code or by a combination of integrated circuits, so the present invention is not limited to specific hardware, software, or a combination thereof.
Those skilled in the art will appreciate that each module in the device of the embodiments of the present invention can be implemented with a general-purpose mobile terminal, and that the modules can be concentrated on a single mobile terminal or on a combination of devices formed by mobile terminals. The device in the embodiments of the present invention corresponds to the method in the preceding embodiments; it can be implemented by writing executable program code or by a combination of integrated circuits, so the present invention is not limited to specific hardware, software, or a combination thereof.
Note that the above are only exemplary embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the scope of protection of the present invention. It is neither necessary nor possible to enumerate all embodiments here. Therefore, although the present invention has been described in further detail through the above embodiments, it is not limited to them and may include other equivalent embodiments without departing from its concept; any obvious change or variation derived within the spirit and principles of the present invention still falls within the scope of protection of the claims of the present invention.