Background
Interconnected devices generate large volumes of log data every day. This data contains important information such as the running states of hardware and applications, error information, and user behavior. However, the log formats produced by different devices are generally not interoperable, so information across logs cannot be unified and the data cannot be used effectively. Logs therefore need to be parsed: logs in different formats are converted into structured data to facilitate statistics and analysis.
Parsed log content helps enterprises and organizations better understand the running condition of their hardware and applications. Specific logs can be processed uniformly on the basis of standardized content, so that users can quickly perceive the key content of the logs, which improves the reliability, performance, and security of systems and equipment.
The current log parsing and processing scheme mainly comprises the following steps:
A log parsing file is written by combining the description file of the device log format with the content/format fields of the log; the file contains the correspondence between the original fields of the log and the standard fields defined by the log platform. The parsing file is bound to the IP of the log transmission source. Logs are matched to the parsing file of the corresponding log format in the order in which they are received, and the original information in each log is parsed into the standard fields of the log platform. The parsed log content is then analyzed and extracted to generate alarms or perform other processing.
This scheme has the advantages that the reliability of the log data source is guaranteed, the access and parsing of every log stays within the plan, and the parsing logic is relatively simple. However, it has the following problems:
(1) Each time a device is connected, its log format and parsing rules must be known in advance so that the corresponding parsing file can be configured; otherwise, the logs cannot be parsed;
(2) If a large volume of logs is sent within a certain period, the parsing rate cannot keep up with the sending rate and logs accumulate, which affects the overall parsing and log-receiving speed;
(3) Logs can be analyzed and extracted only after all of them have been parsed; because parsing follows the receiving order, abnormal information cannot be discovered at the first opportunity;
(4) The data in the parsing process is tightly coupled logically, so an error in any one link can cause the operation of the whole system to deviate.
Disclosure of Invention
In order to solve the technical problems of poor parsing universality, low parsing speed, and low efficiency in current log parsing, the invention provides a log parsing method and system. The method comprises the following steps:
S1, creating a parsing rule base, wherein the parsing rule base stores a comparison table between log content and parsing files;
S2, setting log preprocessing rules, and generating a preprocessing rule lexicon from the original content of logs;
S3, obtaining a log to be parsed, and processing it with the preprocessing rule lexicon to obtain the type of the log to be parsed;
S4, dispatching logs to be parsed of different types to a plurality of independent channels, and parsing the logs with the parsing files;
S5, correcting and updating the preprocessing rule lexicon by using the parsed content.
A log parsing system, comprising:
a parsing rule construction module, configured to create a parsing rule base, wherein the parsing rule base stores a comparison table between log content and parsing files;
a preprocessing rule setting module, configured to set log preprocessing rules and generate a preprocessing rule lexicon from the original content of logs;
a log classification module, configured to obtain a log to be parsed and process it with the preprocessing rule lexicon to obtain the type of the log to be parsed;
a log parsing module, configured to dispatch logs to be parsed of different types to a plurality of independent channels and parse the logs with the parsing files;
and an optimization updating module, configured to correct and update the preprocessing rule lexicon by using the parsed content.
The beneficial effects provided by the invention are as follows:
1. The log information of various devices contains event information. Based on the transmitted log information, security events of the system can be monitored and security threats can be found and handled in time, which avoids larger incidents caused by key logs being submerged in massive log information, while all types of logs can still be parsed.
2. Because the running information of the equipment is recorded and the equipment logs are stored and parsed, faults can be located rapidly when the equipment fails, and the performance and response time of the system can be optimized in daily operation.
3. Data analysis: by analyzing large amounts of log data, the patterns and trends in the current data can be identified, analysis efficiency is improved, and the construction level of the network system and of the security management capability can be grasped.
4. Optimized resource utilization: the running condition of the equipment reveals how system resources are used, so that resource allocation and use can be optimized and the utilization rate of the system improved.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be further described with reference to the accompanying drawings.
Referring to FIG. 1, FIG. 1 is a schematic flow chart of the method of the present invention.
The invention provides a log parsing method and system. The method comprises the following steps:
S1, creating a parsing rule base, wherein the parsing rule base stores a comparison table between log content and parsing files;
It should be noted that step S1 specifically includes:
Acquiring historical parsing files, and summarizing the specific values of each class of parsing file to obtain key parameters;
For a log sent by a log source IP that is not bound to a parsing file, matching the original content of the log against the key parameters; if the match succeeds, the log corresponds to that parsing file, and the correspondence is recorded in the comparison table.
As an embodiment, the invention generalizes the specific values of the parsing files into key parameters. When a log is sent by a log source IP that is not bound to a parsing file, the original content of the log is matched against the key parameters of the files, and once a match is found, the corresponding parsing file is used for parsing.
Specifically, a specific value is the way a certain value in the original log is expressed in the parsing file. For example, the alarm level in the original log is represented by level, and after parsing with the parsing file, level is converted into the alarm level.
The level field in the original log may take the values 1, 2, 3, and 4, which need to be mapped in the parsing file to the alarm levels of the platform.
Thus, level 4 is a set of keywords: when the original log contains this content, the log is parsed by the parsing file corresponding to the keywords.
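The lookup described in step S1 can be sketched as follows. This is only a minimal illustration under stated assumptions: the parsing-file names, the second keyword set, and the alarm level names in LEVEL_TO_ALARM are hypothetical and do not come from the invention; only the idea of matching raw content against key parameters and mapping level values 1 to 4 is taken from the text above.

```python
from typing import Optional

# Hypothetical comparison table: parsing-file name -> key parameters
# (keywords summarized from the specific values of that parsing file).
COMPARISON_TABLE: dict[str, set[str]] = {
    "vendor_a.parse": {"level 1", "level 2", "level 3", "level 4"},
    "vendor_b.parse": {"severity=warn", "severity=crit"},
}

# Hypothetical mapping held inside one parsing file:
# raw `level` value -> alarm level used by the platform.
LEVEL_TO_ALARM = {"1": "low", "2": "medium", "3": "high", "4": "critical"}


def find_parsing_file(raw_log: str) -> Optional[str]:
    """Return the first parsing file whose key parameters occur in the raw log."""
    for parsing_file, keywords in COMPARISON_TABLE.items():
        if any(keyword in raw_log for keyword in keywords):
            return parsing_file
    return None  # no match: the log cannot be parsed
```

Substring matching is used here for brevity; a real rule base might tokenize the log first so that, for example, "level 4" does not also match "level 40".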
S2, setting log preprocessing rules, and generating a preprocessing rule lexicon from the original content of logs;
It should be noted that, in step S2, the key portions of the original content of the log are extracted as the preprocessing rule lexicon.
As one embodiment, the invention extracts and summarizes the content in the original log information that is relatively critical for parsing as the preprocessing rule lexicon, and then generates preprocessing rules based on AND, OR, and NOT relations between the lexicon content and the log source.
The specific procedure for forming the preprocessing rules is as follows:
1. First, keywords are added for the preprocessing rules to form the lexicon. The added keywords are content that may appear in the original information of a log, and it can be set under which rules each keyword is allowed to be referenced; the conditions comprise that the log source IP
2. Creating a preprocessing rule: a rule is created based on the content of the keyword lexicon and the incidental information of log transmission. The user can select several kinds of information and combine them with AND, OR, and NOT, as explained below:
Referring to Tables 1 and 2, Table 1 lists the incidental information of log transmission, and Table 2 explains the information combination modes.
Table 1 Incidental information of log transmission
Table 2 Description of information combination modes
3. Setting an execution range/condition for the preprocessing rule: the execution range of a rule is set by combining the above incidental information of log transmission, and the rule is executed when the incidental information of a received log meets the execution condition set for that rule.
4. Setting the execution mode applied when a log matches the preprocessing rule. There are two modes: (1) the log is marked as an alarm log or other special event and enters the parsing link; (2) the log is directly discarded and not parsed.
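The four steps above (keywords, logical combination, execution range, execution mode) can be pictured with a small data structure. The sketch below is an assumption-laden illustration, not the rule format of the invention: the class and field names are hypothetical, and the source IP is used as the only piece of incidental information because the contents of Tables 1 and 2 are not available here.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    MARK_ALARM = "mark_alarm"  # mode 1: mark the log and send it on to parsing
    DISCARD = "discard"        # mode 2: drop the log without parsing


@dataclass
class PreprocessRule:
    all_of: set[str] = field(default_factory=set)      # AND: every keyword must occur
    any_of: set[str] = field(default_factory=set)      # OR: at least one keyword must occur
    none_of: set[str] = field(default_factory=set)     # NOT: no keyword may occur
    source_ips: set[str] = field(default_factory=set)  # execution range; empty means any source
    action: Action = Action.MARK_ALARM

    def in_range(self, source_ip: str) -> bool:
        """True if the rule should be executed for logs from this source."""
        return not self.source_ips or source_ip in self.source_ips

    def matches(self, raw_log: str) -> bool:
        """Evaluate the AND / OR / NOT keyword combination on the raw log."""
        if not all(k in raw_log for k in self.all_of):
            return False
        if self.any_of and not any(k in raw_log for k in self.any_of):
            return False
        return not any(k in raw_log for k in self.none_of)
```

A rule such as PreprocessRule(any_of={"level3", "level4"}, source_ips={"192.168.184.145"}) would then mark matching logs from that source as alarm logs, while logs from other sources skip the rule entirely.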
S3, obtaining a log to be parsed, and processing it with the preprocessing rule lexicon to obtain the type of the log to be parsed;
It should be noted that the different types of logs to be parsed include normal logs and alarm logs. Before a log is parsed, the preprocessing rules are executed on its original content. According to the processing mode set by the matching rule, the log is either discarded without parsing, or marked as an alarm log and sent to the alarm parsing link, which is a separate channel from that of normal logs.
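A minimal sketch of the classification in step S3 is given below, assuming the lexicon is a simple mapping from keyword to execution mode. The keyword "heartbeat" and the names LogType and classify are hypothetical; level3 and level4 are borrowed from the embodiment described later in this document.

```python
from enum import Enum


class LogType(Enum):
    NORMAL = "normal"        # goes to the normal parsing channel
    ALARM = "alarm"          # goes to the separate alarm parsing channel
    DISCARDED = "discarded"  # dropped, never parsed


# Hypothetical lexicon: keyword -> type assigned when the keyword is found.
LEXICON = {
    "level3": LogType.ALARM,
    "level4": LogType.ALARM,
    "heartbeat": LogType.DISCARDED,
}


def classify(raw_log: str, lexicon: dict[str, LogType] = LEXICON) -> LogType:
    """Return the type of a raw log before any parsing takes place."""
    for keyword, log_type in lexicon.items():
        if keyword in raw_log:
            return log_type
    return LogType.NORMAL  # no keyword hit: treat as a normal log
```

Because classification only inspects the raw text, it can run before the comparatively expensive parsing step, which is what allows alarm logs to be separated early.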
S4, dispatching logs to be parsed of different types to a plurality of independent channels, and parsing the logs with the parsing files;
It should be noted that, for a normally parsed log, after parsing is completed the parsed content is analyzed and key information is extracted, and alarm generation or other processing is performed based on the analysis rules.
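One way to picture the independent channels of step S4 is a separate queue and worker thread per log type, so that a backlog of normal logs does not delay alarm logs. This is a sketch under that assumption; the text does not specify threads or queues, and parse is a hypothetical stand-in for parsing with a parsing file.

```python
import queue
import threading


def parse(raw_log: str) -> dict:
    """Hypothetical stand-in for parsing a log with its parsing file."""
    return {"raw": raw_log}


def worker(channel: queue.Queue, results: list, label: str) -> None:
    """Consume one channel until the sentinel None arrives."""
    while True:
        raw_log = channel.get()
        if raw_log is None:  # sentinel: this channel is finished
            break
        results.append((label, parse(raw_log)))


def run_pipelines(classified_logs) -> list:
    """classified_logs: iterable of (log_type, raw_log), log_type in {'normal', 'alarm'}."""
    channels = {"normal": queue.Queue(), "alarm": queue.Queue()}
    results: list = []
    threads = [
        threading.Thread(target=worker, args=(channel, results, name))
        for name, channel in channels.items()
    ]
    for thread in threads:
        thread.start()
    for log_type, raw_log in classified_logs:
        channels[log_type].put(raw_log)  # each type has its own independent queue
    for channel in channels.values():
        channel.put(None)
    for thread in threads:
        thread.join()
    return results
```

Within each channel the receiving order is preserved, while the relative order between the two channels is not fixed.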
S5, correcting and updating the preprocessing rule lexicon by using the parsed content.
It should be noted that, when a normal log is parsed, key information is extracted from the parsed content, and this key information serves as the alarm judgment criterion when alarm logs are parsed.
Based on the content generated by parsing a normal log, if the content is finally identified as alarm content, the value in the original log of the field content that triggered the rule is made into a preprocessing keyword and added to the preprocessing rule lexicon.
Conversely, for a log that was marked as an alarm by preprocessing, if after parsing it does not match the analysis rules, the corresponding preprocessing keyword is removed from the preprocessing rule lexicon.
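The two-way correction of step S5 can be summarized in a few lines. The function name and its boolean parameters are hypothetical; the logic follows the description above: a keyword is added when a normally routed log turns out to be an alarm, and removed when a log marked as an alarm turns out not to be one.

```python
def update_lexicon(
    lexicon: set[str],
    keyword: str,
    was_marked_alarm: bool,
    is_alarm_after_parsing: bool,
) -> set[str]:
    """Return a corrected copy of the preprocessing rule lexicon."""
    updated = set(lexicon)
    if not was_marked_alarm and is_alarm_after_parsing:
        # Missed by preprocessing but confirmed as an alarm: learn the keyword.
        updated.add(keyword)
    elif was_marked_alarm and not is_alarm_after_parsing:
        # Flagged by preprocessing but not confirmed: forget the keyword.
        updated.discard(keyword)
    return updated
```

Returning a copy rather than mutating the input keeps the update easy to test and to apply atomically; an in-place update would work equally well.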
As an embodiment, referring to FIG. 2, FIG. 2 shows the detailed process of handling a new log.
1. When the log platform receives a new log, the log is stored, and different processing paths are then triggered based on the incidental attributes of the transmitted log and its original content, so that through step-by-step parsing the content of the log is obtained more quickly and different types of logs are processed accordingly.
2. The specific log parsing process is as follows. After a log is received, whether a parsing file is bound is determined from the IP address of the log sending source. If one is bound, the log is parsed with the corresponding parsing file. If none is bound, the original log content is matched against the specific values of all parsing files; if a match succeeds, the log is parsed with the parsing file corresponding to those specific values, and if no parsing file matches, the log is not parsed.
Once the log is associated with a parsing file, it enters the preprocessing link. Whether the log needs preprocessing is screened according to the execution range set by the rules. If not, the log directly enters the normal log parsing channel and is parsed in time order. If preprocessing is needed but the log does not meet any preprocessing rule, it also enters the normal parsing channel. If it does meet a rule, it is handled according to the execution mode of that rule: (1) the log is directly discarded without parsing; or (2) the log is marked as an alarm log and enters a parsing link separate from that of normal logs.
When parsing through the normal channel is completed, the parsed log must also be matched against the alarm rules for parsed logs. If a rule is met, an alarm log is generated, and at the same time the original, pre-parsing content of the log information that matched the alarm rule is filled into the preprocessing keyword lexicon.
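The routing decisions for a new log can be sketched as a single function. All names here are hypothetical, and the preprocessing stage is abstracted into two callables (in_scope for the execution range, rule_action for the rule outcome) so that the example stays self-contained.

```python
from typing import Callable, Optional


def route_log(
    source_ip: str,
    raw_log: str,
    ip_bindings: dict[str, str],
    specific_values: dict[str, list[str]],
    in_scope: Callable[[str], bool],
    rule_action: Callable[[str], Optional[str]],
) -> tuple[str, Optional[str]]:
    """Return (channel, parsing_file) for a newly received log."""
    # 1. Is a parsing file bound to the sending IP?
    parsing_file = ip_bindings.get(source_ip)
    # 2. Otherwise fall back to matching the specific values of all parsing files.
    if parsing_file is None:
        parsing_file = next(
            (name for name, values in specific_values.items()
             if any(value in raw_log for value in values)),
            None,
        )
    if parsing_file is None:
        return ("unparsed", None)
    # 3. Preprocessing applies only inside the execution range of the rules.
    if not in_scope(source_ip):
        return ("normal", parsing_file)
    # 4. rule_action returns None (no rule met), "discard", or "alarm".
    action = rule_action(raw_log)
    if action == "discard":
        return ("discarded", parsing_file)
    if action == "alarm":
        return ("alarm", parsing_file)
    return ("normal", parsing_file)
```

The returned channel name would then select the queue or worker that handles the log; the post-parsing alarm matching and lexicon update are outside the scope of this sketch.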
Finally, the processing of the overall method is illustrated by one embodiment.
For example, a normal log is: level5 Fri Jul 21 2023 09:32:04 192.168.184.145 Kelai network full-flow security analysis system-server, trigger time: 2023-07-21 09:32:04, link name: Honeycomb Innovation Valley, alarm name: a certain data flow characteristic value, alarm level: high, alarm type: information collection, trigger condition: HTTP OPTIONS method, alarm type: data flow characteristic value alarm, source IP address: 31.208.190.177, source port: 25074, target IP address: 192.168.201.211, target port: 4413, accumulated trigger count: 1;
An alarm log is: level3 Fri Jul 21 2023 09:32:04 192.168.184.145 Kelai network full-flow security analysis system-server, trigger time: 2023-07-21 09:32:04, link name: Honeycomb Innovation Valley, alarm name: a certain data flow characteristic value, alarm level: high, alarm type: information collection, trigger condition: HTTP OPTIONS method, alarm type: data flow characteristic value alarm, source IP address: 31.208.190.177, source port: 25074, target IP address: 192.168.201.211, target port: 4413, accumulated trigger count: 1;
The specific value of the parsing file is: Kelai network full-flow security analysis system
The preprocessing rule lexicon is: level3 and level4; the preprocessing rule is: enter the alarm log parsing channel
The parsed content is: log level: high; log reception time: 2023/12/02 20:13:25; field 1: ......................
The preprocessing rule lexicon updated after parsing is: level3, level4, and level5.
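To make the embodiment more concrete, the header of the normal log can be split into structured fields as sketched below. The header layout (level token, timestamp, source IP, device name) is inferred from the example log with its spacing normalized, and the regular expression, field names, and function name are hypothetical; they do not represent an actual parsing file.

```python
import re
from datetime import datetime

# Assumed header layout: "<level> <Day Mon DD YYYY HH:MM:SS> <source IP> <device name>"
HEADER = re.compile(
    r"^(?P<level>level\d+) "
    r"(?P<time>\w{3} \w{3} \d{1,2} \d{4} \d{2}:\d{2}:\d{2}) "
    r"(?P<source_ip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?P<device>.+)$"
)


def parse_header(raw_log: str) -> dict:
    """Split a raw log header into level, time, source_ip, and device fields."""
    match = HEADER.match(raw_log)
    if match is None:
        raise ValueError("log does not match the assumed header layout")
    fields = match.groupdict()
    # %a and %b expect English day and month names under the default C locale.
    fields["time"] = datetime.strptime(fields["time"], "%a %b %d %Y %H:%M:%S")
    return fields
```

Applied to the header of the normal log, this yields level5 as the level token, which is the value that the embodiment adds to the lexicon once the parsed content is identified as an alarm.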
The foregoing description of the preferred embodiments is not intended to limit the invention; any modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within its scope of protection.