Disclosure of Invention
Embodiments of the invention provide a container platform-based log aggregation method and apparatus, which address the high operation, maintenance, and debugging costs incurred in the prior art when a cluster is implemented on a container platform.
An embodiment of the invention provides a container platform-based log aggregation method, applied to a cluster implemented on a container platform, wherein the cluster comprises at least one node, and the method comprises the following steps:
collecting log data of each node, compressing and packaging the log data, adding identification information, and sending the resulting compressed log data packets to each log processing unit;
each log processing unit processing, according to the identification information, the compressed log data packets corresponding to its own processing level to obtain log data to be aggregated;
a log aggregation unit aggregating, according to the positioning identifier in the log data to be aggregated, the log data to be aggregated that comes from different log processing units and belongs to the same application, to obtain aggregated log data;
and classifying and storing the aggregated log data for retrieval and access.
Wherein collecting the log data of each node comprises:
collecting application-level log data and system-level log data of each node;
and, when the node is a management node, also collecting event-level log data of the management node.
Wherein compressing and packaging the log data, adding identification information, and sending the result to each log processing unit comprises:
compressing and packaging the log data, adding identification information comprising a collection timestamp, a node name, and the level to which the log data belongs, and then sending the result to each log processing unit.
Wherein the processing levels of the log processing units comprise an event level, an application level, and a system level;
correspondingly, each log processing unit processing, according to the identification information, the compressed log data packets corresponding to its own processing level to obtain the log data to be aggregated comprises:
each log processing unit decompressing and parsing, according to the level recorded in the identification information, the compressed log data packets corresponding to its own processing level to obtain log data to be aggregated that carries the positioning identifier, wherein the positioning identifier comprises an application name and a Pod name.
Aggregating the log data to be aggregated that comes from different log processing units and belongs to the same application to obtain aggregated log data comprises:
aggregating the log data to be aggregated that comes from different log processing units and belongs to the same application to obtain aggregated log data carrying an aggregation identifier in the format namespaceid_resourceid_level.
An embodiment of the invention further provides a container platform-based log aggregation apparatus, applied to a cluster implemented on a container platform, wherein the cluster comprises at least one node, and the apparatus comprises: an acquisition unit, a plurality of log processing units, a log aggregation unit, and a storage unit arranged in each node; wherein
the acquisition unit is configured to collect log data of each node, compress and package the log data, add identification information, and send the resulting compressed log data packets to each log processing unit;
each log processing unit is configured to process, according to the identification information, the compressed log data packets corresponding to its own processing level to obtain log data to be aggregated;
the log aggregation unit is configured to aggregate, according to the positioning identifier in the log data to be aggregated, the log data to be aggregated that comes from different log processing units and belongs to the same application, to obtain aggregated log data;
and the storage unit is configured to classify and store the aggregated log data for retrieval and access.
The acquisition unit being configured to collect log data of each node comprises:
collecting application-level log data and system-level log data of each node;
and, when the node is a management node, also collecting event-level log data of the management node.
The acquisition unit being configured to compress and package the log data, add identification information, and send the result to each log processing unit comprises:
compressing and packaging the log data, adding identification information comprising a collection timestamp, a node name, and the level to which the log data belongs, and then sending the result to each log processing unit.
Wherein the processing levels of the log processing units comprise an event level, an application level, and a system level;
correspondingly, the log processing unit being configured to process, according to the identification information, the compressed log data packets corresponding to its own processing level to obtain the log data to be aggregated comprises:
each log processing unit decompressing and parsing, according to the level recorded in the identification information, the compressed log data packets corresponding to its own processing level to obtain log data to be aggregated that carries the positioning identifier, wherein the positioning identifier comprises an application name and a Pod name.
The log aggregation unit being configured to aggregate the log data to be aggregated that comes from different log processing units and belongs to the same application to obtain aggregated log data comprises:
aggregating the log data to be aggregated that comes from different log processing units and belongs to the same application to obtain aggregated log data carrying an aggregation identifier in the format namespaceid_resourceid_level.
The invention has the following beneficial effects:
According to the container platform-based log aggregation method and apparatus provided by the embodiments of the invention, log data of each node is collected, compressed and packaged, tagged with identification information, and sent to each log processing unit; each log processing unit processes, according to the identification information, the compressed log data packets corresponding to its own processing level to obtain log data to be aggregated; the log aggregation unit aggregates, according to the positioning identifier in the log data to be aggregated, the log data to be aggregated that comes from different log processing units and belongs to the same application, to obtain aggregated log data; and the aggregated log data is classified and stored for retrieval and access. Log data of the containers on each node in the cluster can thus be collected efficiently, aggregated by application, and stored for subsequent retrieval, which greatly reduces the difficulty of deploying, debugging, operating, and maintaining the container platform, lowers its operation and maintenance cost, and improves its usability and extensibility.
Detailed Description
The invention aims to solve the following problem in the prior art: when standard Kubernetes is used for containerized applications, log data are scattered yet interrelated, so that operation and maintenance personnel spend considerable effort looking up log data at different levels, for different components, on different machines, and the cost is high. An embodiment of the invention provides a container platform-based log aggregation method in which the scattered log data are classified, grouped, and structured using Storm stream computing, and the structured log data are then classified and aggregated using Kafka message queues and stored in a database for access by an upper-layer RESTful API. The flow of the method is shown in Fig. 1. The method is applied to a cluster implemented on a container platform, the cluster comprises at least one node, and the cluster in the embodiment of the invention may be a Kubernetes cluster. The method comprises the following steps:
Step 101: collecting log data of each node, compressing and packaging the log data, adding identification information, and sending the resulting compressed log data packets to each log processing unit;
Step 102: each log processing unit processes, according to the identification information, the compressed log data packets corresponding to its own processing level to obtain log data to be aggregated;
Step 103: the log aggregation unit aggregates, according to the positioning identifier in the log data to be aggregated, the log data to be aggregated that comes from different log processing units and belongs to the same application, to obtain aggregated log data;
Step 104: classifying and storing the aggregated log data for retrieval and access.
In step 101, logs must be collected from every node in the cluster. Collecting the log data of each node comprises:
collecting application-level log data and system-level log data of each node;
when the node is a management node, event-level log data of the management node must be collected in addition to the application-level and system-level log data. Specifically, event-level log data can be collected using the Kubernetes management API and is mainly used to trace how Kubernetes manages and orchestrates Pods. Application-level log data comprises the state data of the containers in a Pod and the logs output by the programs inside the containers; the container state data is collected using the Kubernetes management API, and the program output logs are collected using the Docker management API. System-level log data mainly comprises the running log data of the Docker and Kubelet services, including service state data and system output logs; the state data is collected through system management tools, and the system logs are collected through the journalctl tool.
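The collection rule above can be illustrated with a minimal Python sketch. It is not the implementation described in this specification: the level names, the function names, and the systemd unit name used in the usage note are assumptions made for illustration. The sketch only decides which levels a node collects, builds a journalctl command line for one service, and parses journalctl's JSON output; it does not call the Kubernetes or Docker APIs.

```python
import json

# Hypothetical level names; the text names the three levels but not their encoding.
APPLICATION, SYSTEM, EVENT = "application", "system", "event"


def levels_for_node(is_management_node: bool) -> list[str]:
    """Every node collects application- and system-level logs;
    a management node additionally collects event-level logs."""
    levels = [APPLICATION, SYSTEM]
    if is_management_node:
        levels.append(EVENT)
    return levels


def journalctl_command(unit: str, since: str) -> list[str]:
    """Build a journalctl invocation that prints one JSON object per log entry."""
    return ["journalctl", "--unit", unit, "--since", since,
            "--output", "json", "--no-pager"]


def parse_journal_lines(raw: str) -> list[dict]:
    """Parse journalctl JSON output, keeping the message and the unit name."""
    entries = []
    for line in raw.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        entries.append({"message": record.get("MESSAGE"),
                        "unit": record.get("_SYSTEMD_UNIT")})
    return entries
```

In use, the command list would be passed to a process runner such as subprocess.run on the node, for example for an assumed unit name of kubelet.service, and the captured standard output handed to parse_journal_lines.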
Wherein compressing and packaging the log data, adding identification information, and sending the result to each log processing unit comprises:
compressing and packaging the log data, adding identification information comprising a collection timestamp, a node name, and the level to which the log data belongs, and then sending the result to each log processing unit.
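As a hedged illustration of the packaging step, the following Python sketch compresses a batch of log lines with gzip and wraps them in an envelope carrying the three pieces of identification information named above. The envelope layout, the field names, and the choice of gzip are assumptions; the specification does not prescribe a compression format or wire format.

```python
import gzip
import time


def pack_logs(lines, node_name, level, collected_at=None):
    """Compress log lines and attach identification information.

    The dict layout and key names are hypothetical, chosen for illustration.
    """
    payload = gzip.compress("\n".join(lines).encode("utf-8"))
    return {
        "timestamp": time.time() if collected_at is None else collected_at,
        "node": node_name,   # name of the node the logs were collected on
        "level": level,      # event, application, or system
        "payload": payload,  # gzip-compressed log lines
    }


def unpack_logs(packet):
    """Recover the original log lines from a packet built by pack_logs."""
    return gzip.decompress(packet["payload"]).decode("utf-8").split("\n")
```

Keeping the identification information outside the compressed payload lets a receiver decide whether a packet concerns it before paying the cost of decompression.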
The processing levels of the log processing units correspond to the levels of the log data and comprise an event level, an application level, and a system level;
correspondingly, each log processing unit processing, according to the identification information, the compressed log data packets corresponding to its own processing level to obtain the log data to be aggregated comprises:
each log processing unit decompresses and parses, according to the level recorded in the identification information, the compressed log data packets corresponding to its own processing level to obtain log data to be aggregated that carries the positioning identifier, wherein the positioning identifier comprises an application name and a Pod name; a Pod is a virtual environment in which containers run and which has an isolated network environment. Specifically, each log processing unit receives all compressed log data packets without distinction and then, according to its preset processing logic, filters out the packets that do not belong to its own processing level.
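The receive-everything-then-filter behaviour can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the Storm-based processing described in this specification: the packet layout, the class name LogProcessingUnit, and the convention that each log line is a JSON object with app, pod, and message fields are all hypothetical.

```python
import gzip
import json


class LogProcessingUnit:
    """Handles packets of exactly one processing level (hypothetical class)."""

    def __init__(self, level):
        self.level = level

    def process(self, packet):
        """Return records to be aggregated, or [] when the packet
        belongs to a different processing level."""
        if packet["level"] != self.level:
            return []  # filtered out: not this unit's level
        text = gzip.decompress(packet["payload"]).decode("utf-8")
        records = []
        for line in text.splitlines():
            entry = json.loads(line)
            records.append({
                # Positioning identifier: application name and Pod name.
                "app": entry["app"],
                "pod": entry["pod"],
                "message": entry["message"],
                "node": packet["node"],
                "timestamp": packet["timestamp"],
            })
        return records
```

Because every unit sees every packet, adding a new processing level only requires adding a unit with a new level value; the sender does not need to know which units exist.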
Aggregating the log data to be aggregated that comes from different log processing units and belongs to the same application to obtain aggregated log data comprises:
the log data to be aggregated that comes from different log processing units and belongs to the same application is aggregated to obtain aggregated log data carrying an aggregation identifier in the format namespaceid_resourceid_level, wherein: namespaceid is the namespace identifier, which identifies the namespace in which the service instance that generated the log runs; resourceid is the application identifier, which identifies the application to which the service instance that generated the log belongs; and level is the log level, which corresponds to the level of the generated log and is used to control the display granularity of the log.
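A minimal Python sketch of this grouping step is given below. It builds the three-part, underscore-joined aggregation identifier (namespace, application, level) and groups records under it. The record field names and the example level values in the usage note are assumptions; the specification does not list the permitted level values.

```python
from collections import defaultdict


def aggregation_id(namespace_id, resource_id, level):
    """Join the three parts with underscores to form the aggregation identifier."""
    return f"{namespace_id}_{resource_id}_{level}"


def aggregate(records):
    """Group records from any number of processing units by aggregation identifier.

    Each record is assumed (hypothetically) to be a dict with the keys
    'namespace', 'app', 'level', and 'message'.
    """
    groups = defaultdict(list)
    for record in records:
        key = aggregation_id(record["namespace"], record["app"], record["level"])
        groups[key].append(record["message"])
    return dict(groups)
```

For example, two records for application web in namespace prod at level error would both land under the key prod_web_error, regardless of which processing unit produced them.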
Further, the aggregated log data is sent, according to the aggregation identifier, to the corresponding topic of the Kafka service in the cluster. An aggregated log data storage service and a Kafka service run on the Kubernetes cluster; the Kafka service is responsible for holding all aggregated log data from the log aggregation unit, and Kafka ensures high performance and high reliability for the aggregated log data in large-scale clusters.
The log data storage service classifies the data of the different topics in Kafka according to the topic information, the aggregation identifier of the aggregated log data, and other information, and finally stores the aggregated log data in a database system for retrieval and access by external systems or other APIs.
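The classify-and-store step can be sketched without a running Kafka cluster. The Python sketch below is an assumption-laden illustration: it uses SQLite from the standard library as a stand-in for the unspecified database system, takes the aggregation identifier as the classification key, and assumes that neither the namespace part nor the level part of the identifier contains an underscore. The table layout and function names are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the stand-in database and create the (hypothetical) logs table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS logs "
        "(namespace TEXT, resource TEXT, level TEXT, message TEXT)"
    )
    return conn


def split_aggregation_id(agg_id):
    """Split an underscore-joined identifier into namespace, resource, level.

    The first part is the namespace and the last part is the level, so an
    underscore inside the application name is tolerated.
    """
    namespace, _, rest = agg_id.partition("_")
    resource, _, level = rest.rpartition("_")
    return namespace, resource, level


def store(conn, agg_id, messages):
    """Classify messages by the parts of their aggregation identifier and store them."""
    namespace, resource, level = split_aggregation_id(agg_id)
    conn.executemany(
        "INSERT INTO logs VALUES (?, ?, ?, ?)",
        [(namespace, resource, level, message) for message in messages],
    )
    conn.commit()


def query(conn, namespace, resource):
    """Return (level, message) rows for one application, in insertion order."""
    cursor = conn.execute(
        "SELECT level, message FROM logs "
        "WHERE namespace = ? AND resource = ? ORDER BY rowid",
        (namespace, resource),
    )
    return cursor.fetchall()
```

An upper-layer API handler could call query with a namespace and an application name to return all stored logs for that application.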
Based on the same inventive concept, an embodiment of the invention provides a container platform-based log aggregation apparatus, which may be applied to a cluster implemented on a container platform, wherein the cluster comprises at least one node. The structure of the apparatus is shown in Fig. 2 and comprises: an acquisition unit 21, a plurality of log processing units 22, a log aggregation unit 23, and a storage unit 24 arranged in each node; wherein
the acquisition unit 21 is configured to collect log data of each node, compress and package the log data, add identification information, and send the resulting compressed log data packets to each log processing unit;
the log processing unit 22 is configured to process, according to the identification information, the compressed log data packets corresponding to its own processing level to obtain log data to be aggregated;
the log aggregation unit 23 is configured to aggregate, according to the positioning identifier in the log data to be aggregated, the log data to be aggregated that comes from different log processing units and belongs to the same application, to obtain aggregated log data;
the storage unit 24 is configured to classify and store the aggregated log data for retrieval and access.
The acquisition unit 21 being configured to collect log data of each node comprises:
collecting application-level log data and system-level log data of each node;
and, when the node is a management node, also collecting event-level log data of the management node.
The acquisition unit 21 being configured to compress and package the log data, add identification information, and send the result to each log processing unit comprises:
compressing and packaging the log data, adding identification information comprising a collection timestamp, a node name, and the level to which the log data belongs, and then sending the result to each log processing unit.
Wherein the processing levels of the log processing units comprise an event level, an application level, and a system level;
correspondingly, the log processing unit 22 being configured to process, according to the identification information, the compressed log data packets corresponding to its own processing level to obtain the log data to be aggregated comprises:
each log processing unit 22 decompresses and parses, according to the level recorded in the identification information, the compressed log data packets corresponding to its own processing level to obtain log data to be aggregated that carries the positioning identifier, wherein the positioning identifier comprises an application name and a Pod name.
The log aggregation unit 23 being configured to aggregate the log data to be aggregated that comes from different log processing units and belongs to the same application to obtain aggregated log data comprises:
aggregating the log data to be aggregated that comes from different log processing units 22 and belongs to the same application to obtain aggregated log data carrying an aggregation identifier in the format namespaceid_resourceid_level.
It should be understood that the implementation principle and process of the container platform-based log aggregation apparatus provided in the embodiment of the invention are similar to those of the method embodiment shown in Fig. 1 and described above, and are not repeated here.
According to the container platform-based log aggregation method and apparatus provided by the embodiments of the invention, log data of each node is collected, compressed and packaged, tagged with identification information, and sent to each log processing unit; each log processing unit processes, according to the identification information, the compressed log data packets corresponding to its own processing level to obtain log data to be aggregated; the log aggregation unit aggregates, according to the positioning identifier in the log data to be aggregated, the log data to be aggregated that comes from different log processing units and belongs to the same application, to obtain aggregated log data; and the aggregated log data is classified and stored for retrieval and access. Log data of the containers on each node in the cluster can thus be collected efficiently, aggregated by application, and stored for subsequent retrieval, which greatly reduces the difficulty of deploying, debugging, operating, and maintaining the container platform, lowers its operation and maintenance cost, and improves its usability and extensibility.
Those of ordinary skill in the art will understand that: the figures are merely schematic representations of one embodiment, and the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be understood by reference to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus and system embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding parts of the description of the method embodiments. The apparatus and system embodiments described above are merely illustrative: units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. A person of ordinary skill in the art can understand and implement this without inventive effort.
In addition, some of the flows described in the above embodiments and the drawings include a plurality of operations presented in a specific order, but it should be clearly understood that these operations may be executed out of the order presented here or in parallel. The sequence numbers of the operations, such as 101, 102, and 103, merely distinguish the different operations and do not themselves represent any execution order. The flows may also include more or fewer operations, and those operations may be performed sequentially or in parallel. It should be noted that terms such as "first" and "second" in this document are used to distinguish different messages, devices, modules, and the like; they do not represent a sequential order, nor do they require that the "first" and "second" items be of different types.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While alternative embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following appended claims be interpreted as including alternative embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made in the embodiments of the present invention without departing from the spirit or scope of the embodiments of the invention. Thus, if such modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to encompass such modifications and variations.