CN113760878A

Info

Publication number: CN113760878A
Application number: CN202110884184.2A
Authority: CN
Inventors: 王威; 李春龙; 焦方忠
Original assignee: Inspur Software Group Co Ltd
Current assignee: Inspur Software Group Co Ltd
Priority date: 2021-08-03
Filing date: 2021-08-03
Publication date: 2021-12-07

Abstract

The invention discloses a microservice architecture log analysis method and system based on a domestic CPU and operating system, belonging to the technical field of microservice architecture. The method collects logs through filebeat, gathering database slow-query logs, error logs and third-party service logs, and, in combination with the log system, automatically deploys and starts each filebeat process. Log Streams is used as the stream processing service, and Kafka Streams is used for ETL stream processing to filter and clean useless logs. A deep analysis algorithm applies different algorithms to the different problems identified from the logs in order to analyze them and generate solutions. Automatic fault-tolerant processing is supported. The invention takes full account of performance and compatibility issues in a fully domestic environment, assists effectively in troubleshooting and handling problems in microservice architecture systems, and has good generality, portability and extensibility.

Description

Micro-service architecture log analysis method and system based on domestic CPU and operating system

Technical Field

The invention relates to the technical field of micro-service architecture, in particular to a micro-service architecture log analysis method and system based on a domestic CPU and an operating system.

Background

With strong national support, domestically developed hardware with independent intellectual property rights has advanced rapidly. In recent years in particular, many basic hardware and software products with independent intellectual property rights have emerged in China. High-end general-purpose chips with independent intellectual property rights, such as Loongson, Phytium and others, are developing vigorously, and their technical level has reached the world-advanced level for comparable products.

Software systems built on a microservice architecture in a fully domestic environment have been deployed in many fields.

In a production environment, logs play an important role: they are needed for troubleshooting exceptions, for performance optimization and for investigating business problems. However, a microservice architecture usually comprises many services, and each service stores only its own local logs. When logs are needed to assist troubleshooting, it is difficult to find the node on which the relevant logs reside and to find the valuable logs related to the problem. Given the additional performance and compatibility issues of a domestic environment, troubleshooting a microservice architecture system is difficult.

Disclosure of Invention

The technical task of the invention is to provide a microservice architecture log analysis method and system based on a domestic CPU and operating system that take full account of performance and compatibility issues in a fully domestic environment, assist effectively in troubleshooting and handling problems in microservice architecture systems, and have good generality, portability and extensibility.

The technical scheme adopted by the invention for solving the technical problems is as follows:

a log analysis method of a micro-service architecture based on a domestic CPU and an operating system is characterized in that logs are collected through filebeat, slow query logs, error logs and third-party service logs of a database are collected, and each filebeat process is automatically issued and started in combination with a log system;

adopting Log Streams as stream processing service, and adopting Kafka Streams as ETL stream processing, filtering and cleaning useless logs;

the logs are stored in ElasticSearch, which is deployed as a cluster; the ElasticSearch cluster nodes comprise cluster management nodes, distribution nodes, and nodes responsible for data storage, query and import;

a deep analysis algorithm analyzes the different problems identified from the logs with different algorithms and generates solutions, automatically generating the correct sql statement for the database currently in use;

and supporting automatic fault-tolerant processing, calling a corresponding script to automatically perform fault-tolerant processing, and supporting secondary expansion of an algorithm.

The method runs on a variety of domestic CPUs and operating systems such as NeoKylin, Deepin and Puhua, and is compatible with domestic software and hardware environments;

the method is compatible with and adapted to domestic databases such as Shentong, Jincang and Dameng in a fully domestic environment, resolves problems caused by differences in sql syntax, and can generate the correct sql statements for handling some common database problems according to the database currently in use.

Further, considering performance in the domestic environment, filebeat is tuned as follows:

optimizing the filebeat.yml configuration file, improving the performance of filebeat writing to ES by adjusting the configuration parameters of the input section and the output.elasticsearch section;

optimizing at the source code level, further improving performance by reducing the unnecessary fields that filebeat adds to each log.
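As a rough illustration of the configuration-level tuning, the sketch below builds a Python dictionary that mirrors the layout of a filebeat.yml file. The option names (`harvester_buffer_size`, `scan_frequency`, `queue.mem`, `worker`, `bulk_max_size`, `drop_fields`) are standard filebeat settings, but every value, path and host is an assumption chosen for illustration and is not taken from the patent. The `drop_fields` processor is only a configuration-level stand-in for the source-code change that the text describes.

```python
def build_filebeat_tuning(log_paths, es_hosts):
    """Return a dict mirroring a tuned filebeat.yml (all values are illustrative)."""
    return {
        "filebeat.inputs": [{
            "type": "log",
            "paths": list(log_paths),
            "harvester_buffer_size": 65536,  # read buffer per harvester, in bytes
            "scan_frequency": "10s",         # how often new files are looked for
        }],
        "queue.mem": {
            "events": 8192,                  # events buffered in memory
            "flush.min_events": 2048,        # events gathered before a flush
        },
        "output.elasticsearch": {
            "hosts": list(es_hosts),
            "worker": 2,                     # concurrent workers per ES host
            "bulk_max_size": 2048,           # events per bulk request
        },
        "processors": [
            # Drop metadata that filebeat attaches to every event.
            {"drop_fields": {"fields": ["agent", "ecs", "host"],
                             "ignore_missing": True}},
        ],
    }


config = build_filebeat_tuning(["/var/log/app/*.log"], ["http://localhost:9200"])
```

The dictionary would still need to be serialized with a YAML library before filebeat could read it; suitable values depend on the hardware and log volume and have to be measured.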

Preferably, useless logs are filtered and cleaned along multiple dimensions, namely log level, time point, time period and business-type weight index, so that useless logs are removed and different requirements are met.

Preferably, the rules for dynamic filtering and cleaning, configured through the user interface, are as follows:

1) logs are collected according to the interface configuration; by default, all error-level logs are collected;

2) during stream processing, a window is opened centered on the time point of each error, extending a configurable N time points before and after it, and non-error-level logs inside the window are collected; by default only info-level logs are collected;

3) business sql is counted in real time according to business requirements; for example, during peak periods the query frequency of similar business sql within one hour is counted, giving the DBA a basis for optimizing the database, for example by creating indexes for the queried sql;

4) during peak periods, logs are dynamically filtered and cleaned by business type according to the weight index, log level index, maximum log quantity index and time period index of each service, and the time window is dynamically shrunk for different time periods.
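Rules 1) and 2) can be sketched in a few lines. The example below is a simplified, in-memory stand-in for the stream-processing window, not the Kafka Streams implementation itself. It assumes each log entry is a dictionary with hypothetical `time` and `level` keys, and it interprets the configurable N as a number of seconds.

```python
from datetime import datetime, timedelta


def select_logs(entries, window_seconds=30):
    """Keep every ERROR entry plus INFO entries within the window around any error."""
    window = timedelta(seconds=window_seconds)
    error_times = [e["time"] for e in entries if e["level"] == "ERROR"]
    kept = []
    for entry in entries:
        if entry["level"] == "ERROR":
            kept.append(entry)  # rule 1: all error-level logs are kept
        elif entry["level"] == "INFO" and any(
            abs(entry["time"] - t) <= window for t in error_times
        ):
            kept.append(entry)  # rule 2: info logs near an error are kept
    return kept


t0 = datetime(2021, 8, 3, 12, 0, 0)
sample = [
    {"time": t0 - timedelta(seconds=120), "level": "INFO", "msg": "startup"},
    {"time": t0 - timedelta(seconds=10), "level": "INFO", "msg": "request received"},
    {"time": t0, "level": "ERROR", "msg": "insert failed"},
    {"time": t0 + timedelta(seconds=5), "level": "DEBUG", "msg": "retrying"},
    {"time": t0 + timedelta(seconds=20), "level": "INFO", "msg": "request aborted"},
]
selected = select_logs(sample, window_seconds=30)
```

With a 30-second window, the startup entry (120 seconds before the error) and the debug entry are discarded, while the two info entries close to the error are retained as context.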

Further, log collection is configured through the user interface; the configuration comprises the service name, log level, keywords and time point;

after log collection is finished, the logs are spliced horizontally into a complete link log according to method name, timestamp and service calling sequence, so that the calling sequence of the services can be seen at a glance and problems are easier to troubleshoot.
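The splicing step can be pictured as a sort over entries gathered from several services. The sketch below is one possible reading: the keys `service`, `method`, `timestamp`, `call_seq` and `message` are hypothetical names, and using the calling sequence only to break timestamp ties is an assumption, since the text does not say how the three criteria are combined.

```python
def splice_link_log(entries):
    """Order entries from several services into a single call-chain listing."""
    ordered = sorted(entries, key=lambda e: (e["timestamp"], e["call_seq"]))
    return [
        "{} [{}] {}: {}".format(e["timestamp"], e["service"], e["method"], e["message"])
        for e in ordered
    ]


chain = splice_link_log([
    {"service": "order", "method": "create", "timestamp": 1002, "call_seq": 2,
     "message": "calling stock service"},
    {"service": "gateway", "method": "route", "timestamp": 1001, "call_seq": 1,
     "message": "request accepted"},
    {"service": "stock", "method": "reserve", "timestamp": 1002, "call_seq": 3,
     "message": "reservation failed"},
])
```

Here the gateway entry comes first, followed by the order and stock entries, which share a timestamp and are therefore ordered by their position in the calling sequence.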

Preferably, the deep analysis algorithm collects, classifies and organizes the common problems accumulated in day-to-day projects together with their solutions, and for each problem forms a complete solution consisting of a corresponding algorithm written in Java and a shell script.

For example, if the log reports that a field is too long, the program automatically extracts the table name and field length from the log and reports which fields are most likely to be too long.

Preferably, automatic fault-tolerant processing is implemented in shell script; problems that can be handled automatically are recorded in an automatic fault-tolerance library, and a problem carrying the automatic fault-tolerance mark in the log system is handled automatically when the user clicks it.
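One simple way to model the automatic fault-tolerance library is a lookup table from a diagnosed problem type to the script that handles it. In the sketch below the problem-type keys and script paths are hypothetical, and the function only builds the command line; it does not execute anything.

```python
# Hypothetical automatic fault-tolerance library: problem type -> shell script.
FAULT_TOLERANCE_LIBRARY = {
    "port_occupied": "scripts/free_port_and_restart.sh",
    "service_stopped": "scripts/restart_service.sh",
}


def build_fault_tolerance_command(problem_type, *args):
    """Return the shell command for an auto-fixable problem, or None."""
    script = FAULT_TOLERANCE_LIBRARY.get(problem_type)
    if script is None:
        return None  # not in the library, so no automatic fault-tolerance mark
    return ["sh", script] + [str(a) for a in args]


command = build_fault_tolerance_command("port_occupied", 8080)
```

A problem whose type is absent from the table yields `None`, which corresponds to an entry shown without the automatic fault-tolerance mark. Adding a new key and script is the kind of secondary extension the text refers to.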

Preferably, the method is implemented as follows:

1) log collection: in a distributed scenario, a log collection module is added to each application microservice and collects the various types of logs on that microservice;

2) log filtering and cleaning: dynamic filtering and cleaning is realized through interface configuration;

3) log storage: the management microservice of the log system initializes ElasticSearch in advance, creating the storage structure and specifying the types and related attributes of its fields;

4) log display: the interface of the log system is a web page, with AngularJS at the front end and Spring Boot at the back end;

5) intelligent log analysis and problem handling: intelligent log analysis, automatic fault-tolerant processing and performance optimization functions are integrated; common problems, the corresponding algorithms and shell scripts are collected to form a complete solution; secondary extension of the solution is supported.

The invention also claims a micro-service architecture log analysis system based on a domestic CPU and an operating system, which comprises a log collection module, a log cleaning and filtering module and a log processing module, wherein the log processing comprises log display search, log statistics, log export, intelligent analysis and automatic fault-tolerant processing;

the system realizes the micro-service architecture log analysis method based on the domestic CPU and the operating system.

The invention also claims a micro-service architecture log analysis device based on a domestic CPU and an operating system, which comprises: at least one memory and at least one processor;

the at least one memory to store a machine readable program;

the at least one processor is used for calling the machine readable program and executing the micro service architecture log analysis method based on the domestic CPU and the operating system.

Compared with the prior art, the micro-service architecture log analysis method and system based on the domestic CPU and the operating system have the following beneficial effects:

the method and system enable log-based troubleshooting of microservice architecture systems in a fully domestic environment; for a given problem, the error reports of the services can be displayed linked together horizontally, which makes the problem easier to solve;

in addition to troubleshooting, performance optimization of the system is supported, which is a useful aid for optimizing performance in a fully domestic environment;

the intelligent algorithm analysis is supported, and a solution to the problem can be further provided according to the log, so that the problem solving efficiency is further improved;

the automatic fault-tolerant function of simple problems is supported, and a large amount of workload is reduced;

each common problem is handled by an independent algorithm and secondary extension is supported; as the technology, experience and solutions mature, the processing capability of the method and system matures as well, so that they become more useful, manual intervention is reduced, and the workload is reduced further.

Drawings

FIG. 1 is a functional flowchart of a method for parsing a micro service architecture log based on a domestic CPU and an operating system according to an embodiment of the present invention;

FIG. 2 is a core architecture diagram of a micro-service architecture log parsing method based on a domestic CPU and an operating system according to an embodiment of the present invention.

Detailed Description

The present invention will be further described with reference to the following specific examples.

A microservice architecture log analysis method based on a domestic CPU and operating system collects logs through filebeat; in addition to business service logs, it collects database slow-query logs and error logs and third-party service logs such as those of nginx; each filebeat process is automatically deployed and started in combination with the log system; considering performance in the domestic environment, filebeat is tuned at both the configuration level and the source code level.

Log Streams is adopted as the stream processing service, and Kafka Streams is adopted for ETL stream processing to filter and clean useless logs; logs are filtered along multiple dimensions (log level, time point, time period and business-type weight index) to eliminate useless logs and meet different requirements.

Log collection is configured through the user interface; the configuration comprises the service name, log level, keywords and time point;

the deep analysis algorithm collects the common problems accumulated in day-to-day projects together with their solutions, and for each problem forms a complete solution consisting of a corresponding algorithm written in Java and a shell script. For example, if the log reports that a field is too long, the program automatically extracts the table name and field length from the log, looks up the table creation statement, and reports the probability that each field is the one that is too long.
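The overlong-field example can be approximated by comparing the values of the failed row against the lengths declared in the table creation statement. The sketch below is a simplified stand-in for the patent's Java algorithm. It assumes that only `char(n)` and `varchar(n)` columns matter, that lengths are counted in characters rather than bytes, and that the failed row is already available as a dictionary; none of these details are specified in the text. Instead of a probability, it ranks the offending fields by how far they exceed their declared length.

```python
import re

# Matches "<column> char(n)" and "<column> varchar(n)" in a CREATE TABLE statement.
COLUMN_RE = re.compile(r"(\w+)\s+(?:var)?char\s*\(\s*(\d+)\s*\)", re.IGNORECASE)


def column_limits(create_table_sql):
    """Map column name to declared character length."""
    return {name.lower(): int(length)
            for name, length in COLUMN_RE.findall(create_table_sql)}


def overlong_fields(create_table_sql, row):
    """Return (column, declared, actual) tuples, largest overflow first."""
    limits = column_limits(create_table_sql)
    hits = []
    for column, value in row.items():
        declared = limits.get(column.lower())
        actual = len(str(value))
        if declared is not None and actual > declared:
            hits.append((column, declared, actual))
    return sorted(hits, key=lambda h: h[2] - h[1], reverse=True)


ddl = "create table t_user (id int, name varchar(8), remark varchar(20))"
suspects = overlong_fields(ddl, {"id": 1, "name": "a" * 12, "remark": "b" * 50})
```

For this input, `remark` is listed before `name` because it overflows its declared length by 30 characters rather than 4.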

Automatic fault-tolerant processing is supported: the corresponding script is called to perform the fault-tolerant processing automatically, and secondary extension of the algorithm is supported;

automatic fault-tolerant processing is implemented mainly in shell script; problems that can be handled automatically are normally recorded in an automatic fault-tolerance library, and when a problem carries the automatic fault-tolerance mark in the log system, it can be handled automatically with a click;

for example, a service fails to start and reports that its port is occupied. Based on the occupied port number in the log, the automatic fault-tolerance mechanism uses netstat -tunlp | grep <port number> in a shell script to find the process id bound to that port, then uses ll /proc/<process id>/cwd to find the location and name of the process, and asks through the interface whether to kill the process. If the answer is yes, it kills the process and then calls the start script, completing the automatic fault-tolerant handling of the problem. This simple example illustrates the processing mechanism; other problems are handled in much the same way. At present common simple problems are supported, secondary extension is supported, and as the algorithms and scripts mature, increasingly complex problems can be handled.
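The lookup step of the port example can be illustrated by parsing the text that `netstat -tunlp` prints. The sketch below covers only that step and is not the patent's shell script: it works on captured output, assumes the usual Linux column layout (local address in the fourth column, `PID/Program name` last), and leaves the confirmation, kill and restart steps to the script.

```python
def find_pid_for_port(netstat_output, port):
    """Return (pid, program) bound to `port` in `netstat -tunlp` text, or None."""
    for line in netstat_output.splitlines():
        cols = line.split()
        if len(cols) < 6 or not cols[0].startswith(("tcp", "udp")):
            continue  # skip headers and unrelated lines
        if cols[3].rsplit(":", 1)[-1] != str(port):
            continue  # the local address uses a different port
        for token in cols[5:]:
            pid, sep, program = token.partition("/")
            if sep and pid.isdigit():
                return int(pid), program
    return None


SAMPLE_OUTPUT = """Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      901/sshd
tcp6       0      0 :::8080                 :::*                    LISTEN      1234/java
"""
owner = find_pid_for_port(SAMPLE_OUTPUT, 8080)
```

The sample output is made up for illustration. When netstat is run without sufficient privileges, the last column is `-` and the function returns `None` for that port.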

The method takes full account of the syntax differences between databases such as Shentong, Jincang and Dameng in a fully domestic environment, and can automatically generate the corresponding sql statements according to the database used in the current environment; the deep intelligent analysis algorithms for common problems support secondary extension.

The method supports basic log-driven automatic fault tolerance: for some common simple problems, such as a missing database field, a missing database table, an occupied port or a stopped service, the problem is analyzed by the intelligent analysis algorithm and then handled automatically by the corresponding script.

The method provided by the embodiment is compatible and adaptive to various domestic databases such as Shentong, Jincang, Dameng and the like in the national environment, can solve the problems caused by sql grammar difference, and can generate correct sql statement processing methods for some common database problems according to the currently used database.

The implementation flow of the method is as follows:

1. Log collection

In a distributed scenario, a log collection module is added to each application microservice and is responsible for collecting the various logs on that microservice. The log file collection end uses filebeat, which operations staff configure through a back-end management interface. Each machine runs one filebeat; the mapping between filebeat logs and topics can be one-to-one or many-to-one, and different strategies are configured according to the daily log volume.

2. Log cleaning and filtering

Log filtering and cleaning uses the Log Streams processing service, which introduces a filter to retain only valuable log data, thereby reducing the resource cost of the log service; Kafka Streams is used for the ETL stream processing.

The rules for dynamic filtering and cleaning, configured through the user interface, are as follows:

1) logs are collected according to the interface configuration; by default, all error-level logs are collected;

2) during stream processing, a window is opened centered on the time point of each error, extending a configurable N time points before and after it, and non-error-level logs inside the window are collected; by default only info-level logs are collected;

3) business sql is counted in real time according to business requirements; for example, during peak periods the query frequency of similar business sql within one hour is counted, which gives the DBA a basis for optimizing the database, for example by creating indexes for the queried sql;

4) during peak periods, logs are dynamically filtered and cleaned by business type according to the weight index, log level index, maximum log quantity index and time period index of each service; the time window is dynamically shrunk for different time periods.
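The hourly sql statistics in rule 3) can be approximated offline as follows. This is a minimal sketch rather than the Kafka Streams job itself. Treating statements as "similar" when they differ only in literal values is an assumption, as is the `(timestamp, sql)` record shape.

```python
import re
from collections import Counter
from datetime import datetime, timedelta


def normalize_sql(sql):
    """Collapse literals so statements differing only in parameters compare equal."""
    sql = re.sub(r"'[^']*'", "?", sql)            # string literals
    sql = re.sub(r"\b\d+(?:\.\d+)?\b", "?", sql)  # numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()


def sql_frequency(records, now, window=timedelta(hours=1)):
    """Count normalized statements whose timestamp lies within the last `window`."""
    counts = Counter(
        normalize_sql(sql) for ts, sql in records if now - window <= ts <= now
    )
    return counts.most_common()


now = datetime(2021, 8, 3, 12, 0, 0)
stats = sql_frequency([
    (now - timedelta(minutes=5), "SELECT * FROM orders WHERE id = 42"),
    (now - timedelta(minutes=10), "select * from orders  where id = 7"),
    (now - timedelta(minutes=20), "SELECT name FROM users WHERE name = 'li'"),
    (now - timedelta(hours=2), "SELECT * FROM orders WHERE id = 1"),
], now)
```

The two `orders` lookups inside the window are counted as the same statement, and the one from two hours earlier is ignored. A frequently repeated statement filtering on `id` is the kind of evidence a DBA could use when deciding where to add an index.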

3. Log storage

The logs are stored in ElasticSearch, which is deployed as a cluster. The ElasticSearch cluster nodes are divided into three classes: master node, client node and data node.

The master node is the management node of the cluster; its main function is to maintain metadata and manage the state of each node in the cluster;

the client node acts as a distribution node and is responsible for distributing received requests to the data nodes;

the data node is responsible for storing, querying and importing data.

ElasticSearch storage structure definition: the management microservice of the log system initializes ElasticSearch in advance, creating the storage structure and specifying the types and related attributes of its fields.
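The initialization step amounts to creating an index with explicit field types. The sketch below builds the request body that such a step might send to ElasticSearch. The field names, shard counts and index layout are assumptions chosen for illustration, and the typeless `mappings.properties` form applies to ElasticSearch 7 and later.

```python
def build_log_index_body(shards=3, replicas=1):
    """Return an index-creation body with explicit field types (illustrative)."""
    return {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": replicas,
        },
        "mappings": {
            "properties": {
                "service": {"type": "keyword"},   # exact-match filter on service name
                "level": {"type": "keyword"},     # error, info, ...
                "method": {"type": "keyword"},
                "timestamp": {"type": "date"},
                "message": {"type": "text"},      # full-text searchable log line
            }
        },
    }


body = build_log_index_body()
```

Declaring `service`, `level` and `method` as `keyword` keeps them available for exact filtering and aggregation, while `message` stays analyzed for keyword search.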

4. Log display

The interface of the log system is a web page, with AngularJS at the front end and Spring Boot at the back end. Display is split out as a dedicated service, which integrates statistical queries over logs, log export, microservice log chain tracing and similar functions in the web interface.

5. Intelligent log analysis and problem processing

The log system integrates the functions of log intelligent deep analysis, automatic fault-tolerant processing and performance optimization, collects a common problem library, a corresponding algorithm and a shell script language to form a whole set of solution, and supports secondary expansion of the solution.

The embodiment of the invention also provides a micro-service architecture log analysis system based on a domestic CPU and an operating system, which comprises a log collection module, a log cleaning and filtering module and a log processing module, wherein the log processing comprises log display search, log statistics, log export, intelligent analysis and automatic fault-tolerant processing;

the system realizes the micro-service architecture log analysis method based on the domestic CPU and the operating system in the embodiment.

Log collection uses the lightweight filebeat; considering performance in the domestic environment, collection efficiency is further improved by optimization at both the configuration level and the source code level.

First, the log collection configuration, comprising service name, log level, keywords and time point, is set through the interface; error-level logs are collected by default. After collection, the logs are spliced horizontally into a complete link log according to method name, timestamp and service calling sequence, so that the calling sequence of the services can be seen at a glance and problems are easier to troubleshoot. Besides troubleshooting, the method supports extracting sql for optimization from the logs and, for a specific service, counts business sql in real time according to business requirements; for example, during peak periods it counts the query frequency of similar business sql within one hour. This gives the DBA a basis for optimizing the database, for example by creating indexes for the queried sql.

The system supports intelligent algorithmic analysis of common problems and can analyze the logs in depth and generate solutions. For example, for problems such as a missing database table or an overlong field, the intelligent analysis algorithm can determine from the log which table or field is involved, or roughly which fields exceed their defined length. Meanwhile, in view of the syntax differences between databases in a fully domestic environment, such as Shentong, Jincang and Dameng, the intelligent algorithm can automatically generate the correct sql statement for the database currently in use. For example, the statement for modifying a field length in the Shentong database is:

alter table <table name> modify <field name> <type>(<length>),

and the sql statement in the Jincang database is:

alter table <table name> alter <field name> type <field type>(<length>).
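Dialect-specific generation of this kind can be sketched as a template lookup keyed by database name. The two templates below follow the Shentong and Jincang statement forms quoted in the text. The dictionary keys, function name and error handling are hypothetical, and no template is given for Dameng because the text does not quote one.

```python
# Statement templates for changing a field length, keyed by database name.
ALTER_LENGTH_TEMPLATES = {
    "shentong": "alter table {table} modify {column} {type}({length})",
    "jincang": "alter table {table} alter {column} type {type}({length})",
}


def alter_length_sql(database, table, column, col_type, length):
    """Render the length-change statement for the database currently in use."""
    template = ALTER_LENGTH_TEMPLATES.get(database.lower())
    if template is None:
        raise ValueError("no template for database: {}".format(database))
    return template.format(table=table, column=column,
                           type=col_type, length=int(length))


shentong_sql = alter_length_sql("Shentong", "t_user", "name", "varchar", 64)
jincang_sql = alter_length_sql("Jincang", "t_user", "name", "varchar", 64)
```

The identifiers are inserted without quoting or validation, so the sketch assumes that table and column names come from a trusted source such as the table creation statement. Supporting another database means adding one more template.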

Different problems identified from the logs require different algorithms to analyze them and generate solutions; at present only everyday simple problems are handled, and secondary extension of the algorithms is supported.

For simple problems, the method can support an automatic fault-tolerant function, and calls a corresponding script to automatically carry out fault-tolerant processing.

The embodiment of the invention also provides a micro-service architecture log analysis device based on a domestic CPU and an operating system, which comprises: at least one memory and at least one processor;

the at least one memory to store a machine readable program;

the at least one processor is configured to invoke the machine-readable program to execute the method for parsing a micro service architecture log based on a domestic CPU and an operating system in the foregoing embodiment.

The present invention can be easily implemented by those skilled in the art from the above detailed description. It should be understood, however, that the intention is not to limit the invention to the particular embodiments described. On the basis of the disclosed embodiments, a person skilled in the art can combine different technical features at will, thereby implementing different technical solutions.