Disclosure of Invention
The application aims to provide a classified storage method and device for logs, a computer device, and a computer-readable storage medium, which are used for improving the efficiency of storing logs into a database, so that the solution is suitable for scenarios, such as a gateway, with a large volume of logs to be stored.
One aspect of the embodiment of the application provides a classified storage method of logs, which comprises the following steps:
classifying the collected logs according to key fields in the logs;
and respectively sending the classified logs of different types to corresponding log processing programs for parallel processing, and storing the logs into a database.
Optionally, the logs of different types specifically include: at least one of a service type log, an alarm type log, and a kernel type log; and
The database comprises at least one of the following data tables: a service log data table for storing logs of the service type, an alarm log data table for storing logs of the alarm type, and a kernel log data table for storing logs of the kernel type.
Optionally, when the log is a service type log, the log is sent to a service log processing program for processing and is stored into the service log data table; the service type logs are divided into a plurality of sub-categories, and there are a plurality of service log data tables, each corresponding to one sub-category of the service type logs;
The method for processing the log by the service log processing program comprises the following steps: screening out logs with correct formats by using a regular matching technology, and inserting the screened logs into a pre-constructed ring buffer queue;
reading logs from the ring buffer queue one by one;
identifying the sub-category of each currently read log, and caching the log in the warehousing queue corresponding to that sub-category according to the identified sub-category;
and for each sub-category, when the number of logs in the warehousing queue corresponding to the sub-category reaches a set number threshold, inserting the logs in the warehousing queue into the service log data table corresponding to the sub-category in the database in batches, and emptying the warehousing queue.
Optionally, the screening out of logs with correct formats by using a regular matching technology and the inserting of the screened logs into a pre-constructed ring buffer queue specifically include:
obtaining a log data block of a set size through a block parser, checking the message format of the obtained log data block, discarding non-compliant log information in the log data block, and splitting the log data block into log messages, each log message being one line, the minimum unit of information;
distributing the split log messages, line by line, to a plurality of line parsers running in parallel through the block parser, and carrying out further rule checks by the line parsers; and
checking, through the line parsers, whether the information in a plurality of mandatory fields of each line of input log message is compliant, discarding non-compliant log messages, and inserting compliant log messages into the ring buffer queue.
Optionally, the checking, by the line parser, of whether the information in a plurality of mandatory fields of each line of input log message is compliant specifically includes:
dividing the log messages input within a period of time into sub-categories through the line parser; and, for the batched log messages of each sub-category, using the SQL statement block corresponding to that sub-category to perform a batch rule check on the log messages of the sub-category.
Optionally, when the log is an alarm type log, the log is sent to an alarm log processing program for processing and is stored into the alarm log data table; the method for processing the log by the alarm log processing program comprises the following steps:
the alarm log processing program compares the message digest of the currently input log with the message digests of the logs stored in a first hash table; if no matching digest is found, the currently input log is stored into the first hash table and a first cache queue; otherwise:
the state identification bits of the two logs with the same message digest are further compared; if the state identification bits differ, the currently input log is stored into the first cache queue, and the state identification bit of the corresponding log in the first hash table is updated according to the state identification bit of the currently input log;
if the state identification bits of the two logs are the same, the timestamps of the two logs are further compared; if the difference between the two timestamps is larger than a set value, the currently input log is stored into the first cache queue, and the timestamp of the corresponding log in the first hash table is updated according to the timestamp of the currently input log; and
the logs in the first cache queue are stored into the alarm log data table in the database.
Optionally, when the log is a kernel type log, the log is sent to a kernel log processing program for processing and is stored into the kernel log data table; the method for processing the log by the kernel log processing program comprises the following steps:
comparing the protocol information of the currently input log with the protocol information of the logs stored in a second hash table; if no matching protocol information is found, storing the currently input log into the second hash table and a second cache queue; otherwise:
further comparing the state identification bits of the two logs with the same protocol information; if the state identification bits differ, storing the currently input log into the second cache queue, and updating the state identification bit of the corresponding log in the second hash table according to the state identification bit of the currently input log;
if the state identification bits of the two logs are the same, further comparing the timestamps of the two logs; if the difference between the two timestamps is larger than the set value, storing the currently input log into the second cache queue, and updating the timestamp of the corresponding log in the second hash table according to the timestamp of the currently input log; and
storing the logs in the second cache queue into the kernel log data table in the database.
An aspect of an embodiment of the present application further provides a classification storage device for logs, including:
a log collecting module, used for collecting logs and classifying the collected logs according to key fields in the logs; and
a plurality of log processing modules, respectively used for processing the logs of the corresponding types and storing the processed logs into the database.
An aspect of the embodiments of the present application further provides a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the classified storage method of logs described above when executing the computer program.
An aspect of the embodiments of the present application further provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the classified storage method of logs described above.
The classified storage method and device for logs, the computer device, and the computer-readable storage medium provided by the embodiments of the application classify the collected logs according to key fields in the logs, send the classified logs of different types to the corresponding log processing programs respectively for parallel processing, and store the logs into a database. Because the logs of different types are sent to the corresponding log processing programs for parallel processing, the efficiency of storing logs into the database is improved, making the solution suitable for scenarios, such as a gateway, with a large volume of logs to be stored. In addition, because the logs of different types are warehoused separately, the readability and operability of the whole log system are improved.
More preferably, when screening out the logs with correct formats, a block parser and line parsers are used to screen the logs. Block parsing is fast and can quickly obtain the information source for preliminary screening, while line parsing has fine granularity and can be bound to multiple cores for computation, so that the efficiency of log screening is improved as a whole, which further improves the efficiency of log warehousing.
More preferably, the block parser is a single producer thread and the line parsers are a plurality of consumer threads. This design exploits the characteristics of a multi-core CPU system to distribute the parsing computation across the cores, thereby improving parsing efficiency and, in turn, the efficiency of log screening and warehousing.
More preferably, the line parser can parse, check, and cache logs of the same sub-category accumulated over a period of time in a centralized manner, avoiding switching SQL statements back and forth, so that the efficiency of the line parser in screening logs is greatly improved, which also improves the efficiency of log warehousing.
In addition, because the logs of different sub-categories are stored in the database in separate tables, with logs of the same sub-category stored in the same data table, the readability and operability of the whole log system are improved.
Preferably, when an alarm/kernel log is processed, alarm/kernel logs of the same type can be merged in a state edge-triggered manner by comparing message digests/protocol information and state identification bits; and by comparing timestamps, redundant logs repeatedly reported within a period of time can also be merged. After the redundant logs are merged, the number of logs to be warehoused is greatly reduced, thereby improving the warehousing efficiency of the logs.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present application more apparent, the present application will be described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of the application.
It should be noted that the descriptions of "first," "second," etc. in the embodiments of the present application are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, but only on the premise that the combination can be realized by those skilled in the art; when the combined technical solutions contradict each other or cannot be realized, the combination should be considered not to exist and not to fall within the scope of protection claimed in the present application.
In the description of the present application, it should be understood that the numerical labels before the steps do not indicate the order in which the steps are performed; they are merely used to facilitate the description of the present application and to distinguish the steps from one another, and thus should not be construed as limiting the present application.
The inventors of the application have found that different types of logs generally require different warehousing processing, while logs of the same type often require approximately the same warehousing operations. Therefore, in the technical solution of the application, the collected logs are classified according to key fields in the logs, and the classified logs of different types are respectively sent to corresponding log processing programs for parallel processing and then stored into a database. Because the logs of different types are sent to the corresponding log processing programs for parallel processing, the efficiency of storing logs into the database is improved, making the solution suitable for scenarios, such as a gateway, with a large volume of logs to be stored. In addition, because the logs of different types are warehoused separately, the readability and operability of the whole log system are improved.
Fig. 1 shows a schematic diagram of the classified storage device for logs provided by the application deployed in a security isolation gatekeeper.
In a security isolation gatekeeper, the file service, database service, and proxy service all produce logs. These logs are sent to the log collector syslog-ng in the classified storage device for logs by means such as pipe files and the local network.
Three general types of logs are typically present in the security isolation gatekeeper: service type logs (service logs for short), alarm type logs (alarm logs for short), and kernel type logs (kernel logs for short). The service logs and alarm logs are generated independently by each major service module of the security isolation gatekeeper, and their format is based on the standard syslog format with optimized field division of the data. That is, the data sources of the alarm logs are the same as those of the service logs; both are generated by the file service, database service, and proxy service.
The kernel logs are generated by the kernel system. They generally record system start-up, shutdown, abnormal process information, and the like, and corresponding logs are also generated when the gatekeeper is under network attack. These logs are likewise sent to the log collector syslog-ng in the classified storage device for collection.
In the classified storage device for logs, the collected logs are accurately classified according to their fields and then distributed to different log processing programs, and are stored into different data tables in the database, thereby realizing high-concurrency processing logic for the logs.
The log processing programs corresponding to the service type logs (service logs for short), the alarm type logs (alarm logs for short), and the kernel type logs (kernel logs for short) are the service log processing program, the alarm log processing program, and the kernel log processing program, respectively.
Correspondingly, the database may also comprise at least one of the following data tables: a service log data table for storing logs of the service type, an alarm log data table for storing logs of the alarm type, and a kernel log data table for storing logs of the kernel type.
The classified storage device of the log provided by the application can comprise: a log collection module and a plurality of log processing modules;
The log collecting module may include the log collector syslog-ng, which is used to collect logs and classify the collected logs according to key fields in the logs. The key field may be a level field or a module field. For example, the log collection module may identify the level field in a log: if the level field of the log is a common level, the log is classified as a service type log (service log for short); if the level field of the log is an alarm level, the log is classified as an alarm type log (alarm log for short); and if the level field of the log is a kernel level, the log is classified as a kernel type log (kernel log for short).
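By way of illustration only (this sketch is not part of the application's claimed solution), the classification and dispatch performed by the log collection module might look like the following Go sketch; the `level=` field encoding, the level values, and the channel-based hand-off are assumptions introduced here:

```go
package logstore

import "strings"

// LogType is the coarse classification derived from the log's level field.
type LogType int

const (
	ServiceLog LogType = iota // common level -> service type log
	AlarmLog                  // alarm level  -> alarm type log
	KernelLog                 // kernel level -> kernel type log
)

// classify inspects the (hypothetical) level field of a raw log line and
// returns the log type; the real key field and its values depend on the
// gatekeeper's own log format.
func classify(raw string) LogType {
	switch {
	case strings.Contains(raw, "level=alarm"):
		return AlarmLog
	case strings.Contains(raw, "level=kernel"):
		return KernelLog
	default:
		return ServiceLog // common level and anything unrecognised
	}
}

// dispatch forwards each collected log to the queue of its corresponding
// log processing program, so the three programs can work in parallel.
func dispatch(raw string, svc, alm, krn chan<- string) {
	switch classify(raw) {
	case AlarmLog:
		alm <- raw
	case KernelLog:
		krn <- raw
	default:
		svc <- raw
	}
}
```

Each channel here merely stands in for the input queue of one log processing program.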
The plurality of log processing modules are respectively used for processing the logs of the corresponding types and storing the processed logs into the database.
The internal structure of the classified storage device for logs provided by the application can be shown in fig. 2, and the classified storage device comprises the following modules: a log collection module 201, a service log processing module 202, an alarm log processing module 203, and a kernel log processing module 204.
The service log processing module 202 corresponds to a log of service types (service log for short), and is configured to process the service log;
the alarm log processing module 203 corresponds to a log of alarm types (alarm log for short), and is configured to process the alarm log;
the kernel log processing module 204 corresponds to a kernel type log (simply referred to as kernel log), and is configured to process the kernel log.
Various embodiments are provided below, which may be used to implement the scheme of categorized storage of logs described above.
Example 1
Fig. 3 schematically shows a flow chart of a method for processing logs by a service log processing program according to a first embodiment of the application.
As shown in fig. 3, a method for processing a service log by a service log processing program according to a first embodiment of the present application may include the following steps:
Step S301: screening out logs with correct formats from the service type logs by using a regular matching technology, and inserting the screened logs into the ring buffer queue.
Specifically, the service log processing program mainly processes system logs, management logs, service logs, and tracking logs. In this step, the service log processing program may first use a regular matching technique to screen out dirty logs, for example logs whose key field content is empty or whose field format is incorrect, and truncate over-long log content so that it is easy to store. The logs with correct formats are inserted into a pre-constructed ring buffer queue; when the ring buffer queue is full, subsequent logs are discarded.
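For illustration, a minimal Go sketch of this screening step follows; the pipe-separated `time|level|module|msg` record layout, the regular expression, and the 2048-byte truncation limit are assumptions, not values taken from the application:

```go
package logstore

import "regexp"

// recordPattern is a hypothetical format check: four non-empty,
// pipe-separated fields (time|level|module|msg). A real gatekeeper log
// format would use its own regular expression here.
var recordPattern = regexp.MustCompile(`^[^|]+\|[^|]+\|[^|]+\|.+$`)

const maxLogLen = 2048 // assumed truncation limit for over-long logs

// screen discards "dirty" logs (empty key fields, wrong field format) and
// truncates over-long log content so that the surviving logs are easy to
// store; the logs it returns are the candidates for the ring buffer queue.
func screen(lines []string) []string {
	out := make([]string, 0, len(lines))
	for _, l := range lines {
		if !recordPattern.MatchString(l) {
			continue // dirty log: discard
		}
		if len(l) > maxLogLen {
			l = l[:maxLogLen] // truncate ultra-long log content
		}
		out = append(out, l)
	}
	return out
}
```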
Preferably, in this step, a log data block of a set size, for example 512KB, 1MB, or 10MB, may be obtained by a block parser, and a syslog message format check is performed on the obtained data block; after the non-compliant log information in the data block is discarded, the block is split into log messages whose minimum unit of information is one line.
The block parser distributes the split log messages, line by line, to a plurality of line parsers running in parallel, and the line parsers perform further rule checks: each line parser checks whether the information in a plurality of mandatory fields of the input log message line is compliant, discards non-compliant log messages, and inserts compliant log messages into the ring buffer queue.
The block parser and line parser are each implemented with a syntax interpreter and are functionally distinguished as coarse-grained and fine-grained detection. Their advantage is that block parsing is fast and can quickly obtain the information source for preliminary screening, while line parsing has fine granularity and can be bound to multiple cores for computation, so that log screening efficiency is improved as a whole, which in turn improves log warehousing efficiency.
A memory queue sits between the block parser and the line parsers and is mainly used for passing information between threads; the block parser is the producer thread, and the line parsers are a plurality of consumer threads. This design exploits the characteristics of a multi-core CPU system to distribute the parsing computation across the cores, thereby improving parsing efficiency and, in turn, the efficiency of log screening and warehousing. The memory queue supports a single-producer/multi-consumer multithreading mode, and the several consumer threads obtain the information output by the producer thread in a balanced manner. At the initial loading stage of the program, the number of line parser threads can be configured using a thread pool.
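The single-producer/multi-consumer hand-off between the block parser and the line parsers can be sketched as follows, with a buffered Go channel standing in for the memory queue; the worker count, the check function, and the line-splitting rule are placeholders:

```go
package logstore

import (
	"strings"
	"sync"
)

// runParsers wires one block-parser producer to nWorkers line-parser
// consumers; the buffered channel plays the role of the memory queue
// described above, and checkLine stands in for the mandatory-field check.
func runParsers(blocks [][]byte, nWorkers int, checkLine func(string) bool, ring chan<- string) {
	lines := make(chan string, 1024) // memory queue between block and line parsers

	// Producer: the block parser splits each data block into single lines.
	go func() {
		for _, b := range blocks {
			for _, l := range splitLines(b) {
				lines <- l
			}
		}
		close(lines)
	}()

	// Consumers: several line parsers run in parallel, ideally one per core.
	var wg sync.WaitGroup
	for i := 0; i < nWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for l := range lines {
				if checkLine(l) { // fine-grained rule check on mandatory fields
					ring <- l // compliant log goes on to the ring buffer queue
				}
			}
		}()
	}
	wg.Wait()
}

// splitLines is a stand-in for the block parser's coarse-grained splitting.
func splitLines(b []byte) []string {
	return strings.Split(strings.TrimRight(string(b), "\n"), "\n")
}
```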
In general, service type logs may be divided into a plurality of sub-categories; for example, the service logs may be distinguished into Web module logs, flow tracking logs, file tracking logs, database tracking logs, and general service logs.
As a more preferable implementation, the line parser divides the log messages input within a period of time into sub-categories. The line parser maintains, for each sub-category of log, an SQL (Structured Query Language) statement block used to parse and check that sub-category; for the batched log messages of each sub-category, the line parser uses the SQL statement block corresponding to the sub-category to perform a batch rule check. Because the line parser can parse, check, and cache logs of the same sub-category accumulated over a period of time in a centralized manner, avoiding switching SQL statements back and forth, the efficiency of the line parser in screening logs is greatly improved, which also improves the efficiency of log warehousing. A sketch of this batching idea is given below.
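In the sketch (the sub-category names and the prefix-based detection rule are invented for illustration), the line parser collects the messages of one time window, groups them by sub-category, and then runs one rule-check pass per group, so each sub-category's SQL statement block is prepared and applied once per batch rather than being switched message by message:

```go
package logstore

import "strings"

// subCategory decides which service-log sub-category a message belongs to;
// the prefix-based rule is purely illustrative.
func subCategory(msg string) string {
	switch {
	case strings.HasPrefix(msg, "WEB"):
		return "web_module"
	case strings.HasPrefix(msg, "FLOW"):
		return "flow_tracking"
	case strings.HasPrefix(msg, "FILE"):
		return "file_tracking"
	case strings.HasPrefix(msg, "DB"):
		return "db_tracking"
	default:
		return "general_service"
	}
}

// checkInBatches groups the messages collected in one time window by
// sub-category and applies that sub-category's rule check to the whole
// group at once, so the per-category SQL statement block only has to be
// selected once per batch instead of once per message.
func checkInBatches(window []string, checkBatch func(cat string, msgs []string) []string) []string {
	groups := map[string][]string{}
	for _, m := range window {
		c := subCategory(m)
		groups[c] = append(groups[c], m)
	}
	var passed []string
	for cat, msgs := range groups {
		passed = append(passed, checkBatch(cat, msgs)...)
	}
	return passed
}
```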
Step S302: the logs are read from the ring buffer queue one by one.
The service log processing program also reads logs from the ring buffer queue one by one; the logs read out are then warehoused.
The ring buffer queue is used for communication among multiple processes, and a multi-producer/multi-reader model can be realized through shared memory. For example, in addition to being read during warehousing, logs may also be read from the ring buffer queue for log snapshots, that is, the real-time log information in the ring buffer queue is synchronized to each log snapshot. The function of a log snapshot is to display the latest log information in real time; its advantage is that the snapshot is read-only, so the operation of fetching logs from a single snapshot does not block log warehousing. The log snapshot can be viewed by a background administrator through the CLI and can also be used for WebUI log audit.
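A small in-process stand-in for the ring buffer queue is sketched below; the real queue lives in shared memory and supports multiple producer and reader processes, which this single-process sketch does not attempt to reproduce, while the drop-on-full behaviour and the read-only snapshot copy mirror the description above:

```go
package logstore

import "sync"

// ringQueue is a bounded FIFO that drops new logs when it is full,
// mirroring the behaviour described for the ring buffer queue; the real
// queue is a shared-memory structure usable by several processes.
type ringQueue struct {
	mu   sync.Mutex
	buf  []string
	size int
}

func newRingQueue(size int) *ringQueue {
	return &ringQueue{size: size}
}

// Push inserts a log and reports false (dropping the log) when full.
func (q *ringQueue) Push(log string) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.buf) >= q.size {
		return false // queue full: subsequent logs are discarded
	}
	q.buf = append(q.buf, log)
	return true
}

// Pop reads logs one by one for warehousing.
func (q *ringQueue) Pop() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.buf) == 0 {
		return "", false
	}
	log := q.buf[0]
	q.buf = q.buf[1:]
	return log, true
}

// Snapshot returns a read-only copy of the current contents, usable by the
// CLI or WebUI for display without blocking warehousing.
func (q *ringQueue) Snapshot() []string {
	q.mu.Lock()
	defer q.mu.Unlock()
	return append([]string(nil), q.buf...)
}
```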
Step S303: for each currently read log, identifying the sub-category of the log and, according to the identified sub-category, caching the log in the warehousing queue corresponding to that sub-category.
Step S304: for each sub-category, when the number of logs in the warehousing queue corresponding to the sub-category reaches a set number threshold, inserting the logs in the warehousing queue into the service log data table corresponding to the sub-category in the database in batches, and emptying the warehousing queue.
Specifically, there are multiple service log data tables in the database, one for each sub-category of the service log. In this step, for each sub-category, when the service log processing program determines that the number of logs in the warehousing queue corresponding to the sub-category has reached the set number threshold, it inserts those logs into the service log data table corresponding to the sub-category in the database in a batch and empties the warehousing queue.
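A hedged sketch of the threshold-triggered batch insert is shown below, using Go's database/sql package; the table name, the single `msg` column, and the threshold of 100 are assumptions made for illustration:

```go
package logstore

import (
	"database/sql"
	"fmt"
	"strings"
)

const batchSize = 100 // assumed "set number threshold" that triggers a flush

// flushIfFull inserts the queued logs of one sub-category into that
// sub-category's data table with a single multi-row INSERT and then empties
// the warehousing queue; table and column names are illustrative only.
func flushIfFull(db *sql.DB, table string, queue *[]string) error {
	if len(*queue) < batchSize {
		return nil // threshold not reached yet, keep accumulating
	}
	placeholders := make([]string, len(*queue))
	args := make([]interface{}, len(*queue))
	for i, msg := range *queue {
		placeholders[i] = "(?)"
		args[i] = msg
	}
	query := fmt.Sprintf("INSERT INTO %s (msg) VALUES %s",
		table, strings.Join(placeholders, ","))
	if _, err := db.Exec(query, args...); err != nil {
		return err
	}
	*queue = (*queue)[:0] // empty the warehousing queue after the batch insert
	return nil
}
```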
Because the logs of different sub-categories are stored in the database in separate tables, with logs of the same sub-category stored in the same data table, the readability and operability of the whole log system are improved.
Corresponding to the above method for processing service logs, the service log processing module 202 provided in the first embodiment of the present application is specifically configured to: screen out, by using a regular matching technology, the logs with correct formats from the service type logs, and insert the screened logs into a pre-constructed ring buffer queue; read logs from the ring buffer queue one by one; identify the sub-category of each currently read log and cache the log in the warehousing queue corresponding to that sub-category; and, for each sub-category, when the number of logs in the warehousing queue corresponding to the sub-category reaches the set number threshold, insert the logs in the warehousing queue into the service log data table corresponding to the sub-category in the database in batches and empty the warehousing queue.
The internal structure of the service log processing module 202 provided in the first embodiment of the present application is shown in fig. 4 and may include the following units: a log screening unit 401 and a log warehousing unit 402.
The log screening unit 401 is configured to screen out, by using a regular matching technique, logs with correct formats from the service type logs, and insert the screened logs into the pre-constructed ring buffer queue; preferably, the log screening unit 401 may include the block parser and the line parsers described above.
The log warehousing unit 402 is configured to read logs from the ring buffer queue one by one; identify the sub-category of each currently read log and cache the log in the warehousing queue corresponding to that sub-category; and, for each sub-category, when the number of logs in the warehousing queue corresponding to the sub-category reaches the set number threshold, insert the logs in the warehousing queue into the service log data table corresponding to the sub-category in the database in batches and empty the warehousing queue.
In the technical solution of the first embodiment of the application, a block parser and line parsers are used to screen the logs. Block parsing is fast and can quickly obtain the information source for preliminary screening, while line parsing has fine granularity and can be bound to multiple cores for computation, so that the efficiency of log screening is improved as a whole, which further improves the efficiency of log warehousing.
More preferably, the block parser is a single producer thread and the line parsers are a plurality of consumer threads. This design exploits the characteristics of a multi-core CPU system to distribute the parsing computation across the cores, thereby improving parsing efficiency and, in turn, the efficiency of log screening and warehousing.
More preferably, the line parser can parse, check, and cache logs of the same sub-category accumulated over a period of time in a centralized manner, avoiding switching SQL statements back and forth, so that the efficiency of the line parser in screening logs is greatly improved, which also improves the efficiency of log warehousing.
In addition, because the logs of different sub-categories are stored in the database in separate tables, with logs of the same sub-category stored in the same data table, the readability and operability of the whole log system are improved.
Example 2
The second embodiment of the application describes a scheme for processing the alarm log by the alarm log processing program.
Typically, alarm logs are divided into eight sub-types: virus alarms, attack alarms, hardware anomalies, system anomalies, resource anomalies, configuration changes, log alarms, and policy alarms. Each sub-type of alarm log is individually identified by a type identification bit, and changes of the alarm state are identified through a state identification bit. For example, for a CPU alarm log the state identification bit is high/mid/low, and for a network card alarm log the state identification bit is fault/recovered.
Fig. 5 schematically shows a flowchart of a method for processing logs by an alarm log processing program according to a second embodiment of the present application.
As shown in fig. 5, the method for processing an alarm log by the alarm log processing program according to the second embodiment of the present application may include the following steps:
Step S501: screening out logs with correct formats from the alarm type logs by using a regular matching technology.
In this step, the method used by the alarm log processing program to screen out dirty logs and keep the alarm logs with correct formats may be the same as the screening method of step S301 in fig. 3, and is not described again here.
Because the alarm logs work together with an outbound notification program to promptly push alarm content to users, for example via SMS or e-mail, many alarm logs with the same content may be generated within a period of time. Therefore, in order to improve the warehousing efficiency of the logs, after this step the screened logs with correct formats are merged by the following steps, which greatly reduces the number of logs that need to be warehoused.
Step S502: the alarm log processing program compares the message digest of the currently input log with the message digests of the logs stored in the first hash table; if no matching digest is found, the following step S503 is executed to store the currently input log into the first hash table and the first cache queue; otherwise, the following step S504 is performed.
Specifically, the msg (message) field of the currently input, correctly formatted alarm log can be hashed to extract a fixed-length message digest, which serves as a unique identifier of the alarm log. The digest functions both as index information and as a compressed representation of the log.
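For example, a fixed-length digest of the msg field could be produced as follows; the choice of SHA-256 is an assumption, since the application only requires a fixed-length digest usable as an index:

```go
package logstore

import (
	"crypto/sha256"
	"encoding/hex"
)

// msgDigest hashes the msg field of an alarm log into a fixed-length
// digest that identifies the alarm content and can serve both as an index
// into the hash table and as a compressed representation of the log.
func msgDigest(msg string) string {
	sum := sha256.Sum256([]byte(msg))
	return hex.EncodeToString(sum[:])
}
```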
The message digest of the currently input, correctly formatted alarm log is compared with the message digests of the logs stored in the first hash table; if no log in the hash table has the same message digest as that of the currently input alarm log, the following step S503 is executed to store the currently input log into the first hash table and the first cache queue; otherwise, the following step S504 is performed.
Step S503: the currently input log is stored into the first hash table and the first cache queue.
In this step, the currently input log is stored into the first hash table, and is also stored into the first cache queue so that it will be warehoused.
Step S504: the state identification bits of the two logs with the same message digest are further compared; if the state identification bits differ, step S505 is executed to store the currently input log into the first cache queue and update the state identification bit of the corresponding log in the first hash table according to the state identification bit of the currently input log; if the state identification bits are the same, step S506 is performed.
Specifically, if a log with the same message digest as the currently input alarm log exists in the hash table, this step continues by comparing the state identification bits of the two logs: the state identification bit is extracted from the msg field of the currently input alarm log and compared with the state identification bit of the log that has the same message digest, to determine whether the state has changed, for example from low to high or from fault to recovered. If it has changed, the following step S505 is executed to store the currently input log into the first cache queue and update the state identification bit of the log with the same message digest in the first hash table according to the state identification bit of the currently input log.
Step S505: storing the currently input log into the first cache queue, and updating the state identification bit of the corresponding log in the first hash table according to the state identification bit of the currently input log.
In this step, the currently input log is stored into the first cache queue so that it will be warehoused, and the state identification bit of the log in the first hash table whose message digest is the same as that of the currently input log is updated according to the state identification bit of the currently input log.
Step S506: the timestamps of the two logs are further compared; if the difference between the two timestamps is larger than the set value, step S507 is executed to store the currently input log into the first cache queue and update the timestamp of the corresponding log in the first hash table according to the timestamp of the currently input log; otherwise, step S508 is performed.
Specifically, if the hash table contains a log whose message digest is the same as that of the currently input alarm log and whose state identification bit is unchanged, the timestamp of that log is compared with the timestamp of the currently input alarm log; if the difference between the two timestamps is larger than the set value, that is, the interval between the two insertion times exceeds the timer duration, step S507 is executed to store the currently input log into the first cache queue and update the timestamp of the corresponding log in the first hash table according to the timestamp of the currently input log.
Step S507: storing the currently input log into the first cache queue, and updating the timestamp of the corresponding log in the first hash table according to the timestamp of the currently input log.
In this step, the currently input log is stored into the first cache queue so that it will be warehoused, and the timestamp of the log in the first hash table that has the same message digest and an unchanged state identification bit is updated according to the timestamp of the currently input log.
Step S508: the currently input log is discarded.
This step carries out the merging of repeated logs. Through the judgments of the preceding steps, it has been determined here that the first hash table already stores a log whose message digest and state identification bit are the same as those of the currently input log and whose timestamp differs by no more than the set value; the currently input log is therefore discarded as a duplicate, avoiding repeated warehousing operations for multiple logs with the same content.
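Putting steps S502 to S508 together, a hedged sketch of this merge decision is given below; the entry layout, the 60-second timer duration, and the refreshing of the timestamp on a state change are assumptions made for illustration:

```go
package logstore

import "time"

// alarmEntry is what the first hash table stores for each message digest.
type alarmEntry struct {
	state string    // state identification bit, e.g. "low", "high", "fault"
	stamp time.Time // timestamp of the last warehoused occurrence
}

const timerDuration = 60 * time.Second // assumed set value for the timestamp difference

// shouldWarehouse applies the digest / state-bit / timestamp comparison and
// reports whether the currently input alarm log should go to the first
// cache queue (true) or be merged away as a duplicate (false); it also
// keeps the hash table entry up to date.
func shouldWarehouse(table map[string]alarmEntry, digest, state string, stamp time.Time) bool {
	old, ok := table[digest]
	if !ok { // step S503: digest not seen before
		table[digest] = alarmEntry{state: state, stamp: stamp}
		return true
	}
	if old.state != state { // step S505: state edge triggered
		table[digest] = alarmEntry{state: state, stamp: stamp} // timestamp refresh here is an assumption
		return true
	}
	if stamp.Sub(old.stamp) > timerDuration { // step S507: re-reported after the timer expired
		table[digest] = alarmEntry{state: state, stamp: stamp}
		return true
	}
	return false // step S508: duplicate within the window, discard
}
```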
Step S509: and storing the log in the first cache queue into an alarm log data table in the database.
While the database insertion is being performed, the logs in the first cache queue can also be put into the ring buffer queue and displayed as a snapshot, on the same principle as for the service logs.
In this way, by comparing message digests and state identification bits, alarm logs of the same type can be merged in a state edge-triggered manner; and by comparing timestamps, redundant logs repeatedly reported within a period of time can also be merged. After the redundant logs are merged, the number of logs to be warehoused is greatly reduced, thereby improving the warehousing efficiency of the logs.
Corresponding to the above method for processing alarm logs, the alarm log processing module 203 provided in the second embodiment of the present application is specifically configured to: compare the message digest of the currently input log with the message digests of the logs stored in the first hash table; if no matching digest is found, store the log into the first hash table and the first cache queue; otherwise: further compare the state identification bits of the two logs with the same message digest; if the state identification bits differ, store the log into the first cache queue and update the state identification bit of the corresponding log in the first hash table according to the state identification bit of the currently input log; if the state identification bits of the two logs are the same, further compare the timestamps of the two logs; if the difference between the two timestamps is larger than the set value, store the log into the first cache queue and update the timestamp of the corresponding log in the first hash table according to the timestamp of the currently input log; and store the logs in the first cache queue into the alarm log data table in the database.
In the technical solution of the second embodiment of the application, by comparing message digests and state identification bits, alarm logs of the same type can be merged in a state edge-triggered manner; and by comparing timestamps, redundant logs repeatedly reported within a period of time can also be merged. After the redundant logs are merged, the number of logs to be warehoused is greatly reduced, thereby improving the warehousing efficiency of the logs.
In addition, the alarm type logs are stored in the alarm log data table in the database, separate from the data tables used for the service logs and kernel logs, which improves the readability and operability of the whole log system.
Example 3
Fig. 6 schematically illustrates a flow chart of a method for a kernel log processing program to process logs according to the third embodiment of the present application.
The processing of kernel logs is similar to that of alarm logs. As shown in fig. 6, the method for processing kernel logs by the kernel log processing program in the third embodiment of the present application may include the following steps:
Step S601: screening out logs with correct formats from the kernel type logs by using a regular matching technology.
In this step, the method used by the kernel log processing program to screen out dirty logs and keep the kernel logs with correct formats may be the same as the screening method of step S301 in fig. 3, and is not described again here.
The screened kernel logs with correct formats are then processed according to the following steps:
Step S602: the kernel log processing program compares the protocol information of the currently input, correctly formatted log with the protocol information of the logs stored in the second hash table; if no matching protocol information is found, the following step S603 is executed to store the currently input log into the second hash table and the second cache queue; otherwise, the following step S604 is performed.
Step S603: storing the currently input log into the second hash table and the second cache queue.
In this step, the currently input log is stored into the second hash table, and is also stored into the second cache queue so that it will be warehoused.
Step S604: the state identification bits of the two logs with the same protocol information are further compared; if the state identification bits differ, step S605 is executed to store the currently input log into the second cache queue and update the state identification bit of the corresponding log in the second hash table according to the state identification bit of the currently input log; if the state identification bits are the same, step S606 is performed.
Step S605: storing the currently input log into the second cache queue, and updating the state identification bit of the corresponding log in the second hash table according to the state identification bit of the currently input log.
In this step, the currently input log is stored into the second cache queue so that it will be warehoused, and the state identification bit of the log in the second hash table whose protocol information is the same as that of the currently input log is updated according to the state identification bit of the currently input log.
Step S606: the timestamps of the two logs are further compared; if the difference between the two timestamps is larger than the set value, step S607 is executed to store the currently input log into the second cache queue and update the timestamp of the corresponding log in the second hash table according to the timestamp of the currently input log; otherwise, step S608 is executed.
Step S607: storing the currently input log into the second cache queue, and updating the timestamp of the corresponding log in the second hash table according to the timestamp of the currently input log.
Step S608: the currently input log is discarded.
This step carries out the merging of repeated logs. Through the judgments of the preceding steps, it has been determined here that the second hash table already stores a log whose protocol information and state identification bit are the same as those of the currently input log and whose timestamp differs by no more than the set value; the currently input log is therefore discarded as a duplicate, avoiding repeated warehousing operations for multiple logs with the same content.
Step S609: storing the logs in the second cache queue into the kernel log data table in the database.
While the database insertion is being performed, the logs in the second cache queue can also be put into the ring buffer queue and displayed as a snapshot, on the same principle as for the service logs.
In this way, by comparing protocol information and state identification bits, kernel logs of the same type can be merged in a state edge-triggered manner; and by comparing timestamps, redundant logs repeatedly reported within a period of time can also be merged. After the redundant logs are merged, the number of logs to be warehoused is greatly reduced, thereby improving the warehousing efficiency of the logs.
Corresponding to the above method for processing kernel logs, the kernel log processing module 204 in the third embodiment of the present application is specifically configured to: compare the protocol information of the currently input log with the protocol information of the logs stored in the second hash table; if no matching protocol information is found, store the log into the second hash table and the second cache queue; otherwise: further compare the state identification bits of the two logs with the same protocol information; if the state identification bits differ, store the log into the second cache queue and update the state identification bit of the corresponding log in the second hash table according to the state identification bit of the currently input log; if the state identification bits of the two logs are the same, further compare the timestamps of the two logs; if the difference between the two timestamps is larger than the set value, store the log into the second cache queue and update the timestamp of the corresponding log in the second hash table according to the timestamp of the currently input log; and store the logs in the second cache queue into the kernel log data table in the database.
In the technical solution of the third embodiment of the application, by comparing protocol information and state identification bits, kernel logs of the same type can be merged in a state edge-triggered manner; and by comparing timestamps, redundant logs repeatedly reported within a period of time can also be merged. After the redundant logs are merged, the number of logs to be warehoused is greatly reduced, thereby improving the warehousing efficiency of the logs.
In addition, the kernel type logs are stored in the kernel log data table in the database, separate from the data tables used for the service logs and alarm logs, which improves the readability and operability of the whole log system.
Example 4
Fig. 7 schematically illustrates a hardware architecture diagram of a computer device 1000 adapted to implement the classified storage method of logs according to the fourth embodiment of the present application. In an exemplary embodiment of the present application, the computer device 1000 may be a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions. For example, it may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, a tower server or a cabinet server (including a stand-alone server or a server cluster composed of a plurality of servers), a gateway, or the like. As shown in fig. 7, the computer device 1000 at least includes, but is not limited to, a memory 1010, a processor 1020, and a network interface 1030, which may be communicatively connected to one another through a system bus. Wherein:
The memory 1010 includes at least one type of computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, or an optical disk. In some embodiments, the memory 1010 may be an internal storage module of the computer device 1000, such as a hard disk or memory of the computer device 1000. In other embodiments, the memory 1010 may also be an external storage device of the computer device 1000, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the computer device 1000. Of course, the memory 1010 may also include both an internal storage module of the computer device 1000 and an external storage device. In this embodiment, the memory 1010 is typically used to store the operating system installed on the computer device 1000 and various types of application software, such as the program code of the classified storage method of logs. In addition, the memory 1010 can also be used to temporarily store various types of data that have been output or are to be output.
The processor 1020 may, in some embodiments, be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 1020 is generally used to control the overall operation of the computer device 1000, for example performing control and processing related to data interaction or communication of the computer device 1000. In this embodiment, the processor 1020 is used to run the program code stored in the memory 1010 or to process data.
The network interface 1030 may include a wireless network interface or a wired network interface, and is typically used to establish communication links between the computer device 1000 and other computer devices. For example, the network interface 1030 is used to connect the computer device 1000 to an external terminal through a network and to establish a data transmission channel and a communication link between the computer device 1000 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, the Global System for Mobile communications (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, or Wi-Fi.
It should be noted that fig. 7 only shows a computer device having components 1010-1030, but it should be understood that not all of the illustrated components are required to be implemented and that more or fewer components may be implemented instead.
In this embodiment, the classified storage method of logs stored in the memory 1010 may also be divided into one or more program modules and executed by one or more processors (the processor 1020 in this embodiment) to implement the embodiments of the present application.
Example 5
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the classified storage method of logs in the embodiments.
In this embodiment, the computer-readable storage medium includes a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the computer-readable storage medium may be an internal storage unit of a computer device, such as a hard disk or memory of the computer device. In other embodiments, the computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the computer device. Of course, the computer-readable storage medium may also include both an internal storage unit of a computer device and an external storage device. In this embodiment, the computer-readable storage medium is typically used to store the operating system installed on the computer device and various types of application software, such as the program code of the classified storage method of logs in the embodiments. Furthermore, the computer-readable storage medium may also be used to temporarily store various types of data that have been output or are to be output.
It will be apparent to those skilled in the art that the modules or steps of the embodiments of the application described above may be implemented by a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network formed by a plurality of computing devices. Alternatively, they may be implemented in program code executable by computing devices, so that they may be stored in a storage device and executed by the computing devices; in some cases, the steps shown or described may be performed in an order different from that given here. Alternatively, they may be separately fabricated as individual integrated circuit modules, or a plurality of the modules or steps may be fabricated as a single integrated circuit module. Thus, the embodiments of the application are not limited to any specific combination of hardware and software.
The foregoing description covers only the preferred embodiments of the present application and is not intended to limit the scope of the application; any equivalent structure or equivalent process transformation derived from the present disclosure, whether employed directly or indirectly in other related technical fields, is likewise included within the scope of protection of the present application.