Disclosure of Invention
The invention addresses the mismatch between market demand and what current technical development achieves, and provides a monitoring system and method based on Zabbix and Prometheus.
In the method, a unified message converter converts Zabbix alarm messages into a message format that Prometheus can recognize. The Alertmanager component then sends the early warning information along two paths: to the early warning receiving media through a unified early warning message distributor, and to the MySQL database through an early warning data collector. The messages of the two monitoring systems are unified by standardizing the early warning levels and the message grouping rules, and are filed in the database by the early warning data collector for unified management, analysis and display.
Zabbix is the older, established network monitoring system.
The unified message converter is used for receiving messages from Zabbix.
Prometheus is the newer network monitoring system.
The Alertmanager component is the early warning management system within the Prometheus ecosystem.
The early warning data collector is used for receiving the messages of the Alertmanager.
The MySQL database is used for storing the data of the early warning data collector.
The unified early warning message distributor is used for receiving the messages of the Alertmanager.
The early warning information management platform is a web interface for displaying the early warning information.
The invention at least comprises the following beneficial effects:
By adopting a Prometheus monitoring system and a unified message converter, the service-layer monitoring of the original single Zabbix monitoring system is combined into one complete monitoring system. This solves the problem that a single monitoring system is not enough to cover all monitoring needs, and also the problems that two separate monitoring systems are difficult to manage and cannot share a common standard. In addition, the inability to trace historical early warnings is another critical problem of Prometheus at present, so technical means for tracing early warnings are a necessary technical feature.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention.
Detailed Description
The invention is a complete monitoring system that runs on a plurality of servers or in a plurality of containers.
On the basis of the original Zabbix monitoring system, an early warning plug-in is built that sends the alarm messages of Zabbix to the unified message converter.
The early warning levels of all monitoring items are configured inside the Zabbix system so as to meet the overall level division standard.
The early warning levels are divided into five tiers: notification, warning, fault, emergency fault and other.
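The five-tier division can be sketched as a small lookup table. This is only an illustration: the names `AlertLevel`, `ZABBIX_TO_UNIFIED` and `unify_level`, the numeric ordering, and the particular mapping from the built-in Zabbix severity names are assumptions, not part of the described system.

```python
from enum import IntEnum


class AlertLevel(IntEnum):
    """The five unified early warning levels (numeric order is an assumption)."""
    OTHER = 0
    NOTIFICATION = 1
    WARNING = 2
    FAULT = 3
    EMERGENCY_FAULT = 4


# Hypothetical mapping from Zabbix trigger severity names to unified levels.
ZABBIX_TO_UNIFIED = {
    "Not classified": AlertLevel.OTHER,
    "Information": AlertLevel.NOTIFICATION,
    "Warning": AlertLevel.WARNING,
    "Average": AlertLevel.FAULT,
    "High": AlertLevel.FAULT,
    "Disaster": AlertLevel.EMERGENCY_FAULT,
}


def unify_level(zabbix_severity: str) -> AlertLevel:
    """Return the unified level; unknown severity names fall back to OTHER."""
    return ZABBIX_TO_UNIFIED.get(zabbix_severity, AlertLevel.OTHER)
```

Keeping the mapping in one table means the level standard can be adjusted without touching the monitoring items themselves.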
An early warning strategy is configured inside the Zabbix system so that all early warning messages are sent to the self-made early warning plug-in.
The early warning plug-in forwards the Zabbix early warning messages to the unified message converter in their original format.
The unified message converter is the connector between Zabbix and Alertmanager.
The Zabbix early warning plug-in sends a custom JSON template message.
The template message needs to contain an identifier that indicates whether the message is an early warning or a recovery.
The message bodies are listed as follows:
The message needs to contain an identifier that indicates whether it is an early warning or a recovery, here "status"; when a recovery message is sent, the content of "status" is "resolved", which makes the judgment easy for the unified message converter.
The unified message converter converts the message into an alarm message format that Prometheus can recognize. Considering the grouping strategy of Prometheus, "alertname" is fixed as the grouping basis. Next, according to the early warning level division standard, "severity" is converted into the unified standard level, and the other basic information is converted into Prometheus labels.
Special attention is needed for "status": when "status" indicates a recovery, the endsAt item of Prometheus must be set to the current time in RFC 3339 format, because Prometheus judges whether the early warning has recovered through endsAt.
During the above processing, additional information such as grouping and message type must be injected into the message according to the overall standard specification of the system; if special service needs exist, other identifying content can be added as required.
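The conversion described above can be sketched as follows. This is a minimal illustration, not the converter itself: the input field names (`alertname`, `severity`, `host`, `message`), the `SEVERITY_TO_LEVEL` table, the injected `source` label and the function name are all assumptions. Only the `labels` / `annotations` / `endsAt` layout follows the alert format that Alertmanager accepts.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical table: Zabbix severity name -> unified early warning level.
SEVERITY_TO_LEVEL = {
    "Information": "notification",
    "Warning": "warning",
    "Average": "fault",
    "High": "fault",
    "Disaster": "emergency_fault",
}


def to_alertmanager_alert(msg: dict, now: Optional[datetime] = None) -> dict:
    """Convert one Zabbix plug-in message into an Alertmanager-style alert."""
    now = now or datetime.now(timezone.utc)
    alert = {
        "labels": {
            # "alertname" is the fixed grouping basis.
            "alertname": msg["alertname"],
            # Unknown severities fall into the "other" level.
            "severity": SEVERITY_TO_LEVEL.get(msg.get("severity", ""), "other"),
            "instance": msg.get("host", ""),
            # Injected message-type label (the label name is an assumption).
            "source": "zabbix",
        },
        "annotations": {"description": msg.get("message", "")},
    }
    if msg.get("status") == "resolved":
        # A recovery is signalled by setting endsAt to the current time (RFC 3339).
        alert["endsAt"] = now.isoformat(timespec="seconds")
    return alert


# Example with a hypothetical recovery message from the plug-in.
zabbix_msg = {"status": "resolved", "alertname": "HighCpuLoad",
              "severity": "High", "host": "web01",
              "message": "CPU load back to normal"}
alert = to_alertmanager_alert(zabbix_msg)
```

A message whose "status" is not "resolved" is passed on without endsAt, so it is treated as an active early warning.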
The message is then sent to the API (application programming interface) of Alertmanager, the early warning management system of Prometheus, for unified processing. From there the message is processed according to the Alertmanager policies.
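As a rough sketch of this step, the converted alerts can be posted as a JSON array to the Alertmanager v2 endpoint `/api/v2/alerts`. The base URL (Alertmanager's default port 9093 on localhost) and the function names are assumptions for illustration; the deployment described here may use a different address or client.

```python
import json
import urllib.request


def build_alert_request(alerts: list,
                        base_url: str = "http://localhost:9093") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a JSON array of alerts."""
    body = json.dumps(alerts).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v2/alerts",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_alerts(alerts: list, base_url: str = "http://localhost:9093") -> int:
    """Send the alerts and return the HTTP status code of the response."""
    request = build_alert_request(alerts, base_url)
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

Separating request construction from sending keeps the payload easy to inspect before anything reaches Alertmanager.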
According to the overall standard specification of the system, alarm strategies that meet the standard are formulated, covering the early warning level and grouping by category; if special service needs exist, other identifying content is added as required.
After an early warning is triggered, it is sent to the Alertmanager for unified processing.
Inside the Alertmanager component, a unique identifier (UUID) is added to each new early warning so that later parts of the system can judge whether an early warning is a repeat; the UUID is obtained by performing a hash operation on the Labels and startsAt attributes of the early warning.
For this purpose the source code of the Alertmanager is modified so that a unique identification code is added to each early warning.
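The identifier derivation can be illustrated as below. The text does not name the hash function or the serialization, and the real change is made inside the Alertmanager source code, so this Python sketch only shows the idea: SHA-256, the canonical JSON encoding and the name `alert_uid` are assumptions.

```python
import hashlib
import json


def alert_uid(labels: dict, starts_at: str) -> str:
    """Derive a stable identifier from an alert's labels and startsAt value."""
    # Sorting the keys makes the result independent of label order.
    canonical = json.dumps(
        {"labels": labels, "startsAt": starts_at},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because the same labels and start time always give the same value, a repeated notification for one early warning can be recognized, while a later occurrence with a new startsAt gets a new identifier.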
The processed early warning messages are sent to the early warning data collector and the unified early warning message distributor through the webhook (an external call interface) of Alertmanager.
The early warning data collector receives the information from the Alertmanager, analyzes it, splits it into database entries that each carry a single unique identification code, and inserts them into the MySQL database; if the database already contains an entry, it is not processed again.
When an early warning message with the same unique identification code is received again and is in the recovery state, the database entry is updated and marked as recovered.
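The collector logic above can be sketched as follows. To keep the example self-contained it uses SQLite from the Python standard library as a stand-in for MySQL and stores only a few columns; the table name, the column names and the `uid` field carrying the unique identification code are assumptions. The top-level `alerts` list with per-alert `status`, `labels` and `startsAt` follows the Alertmanager webhook payload.

```python
import sqlite3

# SQLite stands in for MySQL here; only a subset of the stored fields is shown.
SCHEMA = """
CREATE TABLE IF NOT EXISTS early_warning (
    uid       TEXT PRIMARY KEY,
    alertname TEXT NOT NULL,
    level     TEXT,
    starts_at TEXT,
    state     TEXT NOT NULL
)
"""


def collect(conn: sqlite3.Connection, payload: dict) -> None:
    """Split a webhook payload into one row per early warning."""
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        uid = alert["uid"]  # hypothetical field added by the modified Alertmanager
        status = alert.get("status", "firing")
        # Insert only if the identifier is new (MySQL equivalent: INSERT IGNORE).
        conn.execute(
            "INSERT OR IGNORE INTO early_warning "
            "(uid, alertname, level, starts_at, state) VALUES (?, ?, ?, ?, ?)",
            (uid, labels.get("alertname", ""), labels.get("severity"),
             alert.get("startsAt"), status),
        )
        if status == "resolved":
            # A repeated message in the recovery state marks the entry as recovered.
            conn.execute(
                "UPDATE early_warning SET state = 'resolved' WHERE uid = ?", (uid,)
            )
    conn.commit()
```

Using the identification code as the primary key lets the database itself reject duplicates, so repeated notifications do not create extra rows.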
The MySQL database stores the early warning data from the early warning data collector.
The stored data fields include: early warning ID, early warning name, host name, IP address, early warning information, product type, early warning level, early warning time, early warning state, early warning label, early warning unique identification code, claim state, claimant, claim time, closing time and completion time.
The unified early warning message distributor receives the message from the Alertmanager and sends it by e-mail, DingTalk, short message (SMS) and telephone using custom templates, according to the warning level and the grouping information.
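One way to picture the distribution step is a routing table keyed by level and group. Everything specific in this sketch is an assumption: which channels belong to which level, the recipient names, the `group` label, the template wording and the function name. It only plans the deliveries and does not call any real e-mail, DingTalk, SMS or telephone service.

```python
from string import Template
from typing import List, Tuple

# Hypothetical routing: unified level -> channels to notify.
CHANNELS_BY_LEVEL = {
    "notification": ["email"],
    "warning": ["email", "dingtalk"],
    "fault": ["email", "dingtalk", "sms"],
    "emergency_fault": ["email", "dingtalk", "sms", "phone"],
}

# Hypothetical routing: grouping label -> recipients.
RECIPIENTS_BY_GROUP = {"database": "dba-team", "network": "net-team"}

# Per-channel templates; channels without an entry use the default one.
TEMPLATES = {"sms": Template("[$level] $alertname on $instance")}
DEFAULT_TEMPLATE = Template("[$level] $alertname on $instance: $description")


def plan_dispatch(alert: dict) -> List[Tuple[str, str, str]]:
    """Return (channel, recipients, text) for every delivery of one alert."""
    labels = alert.get("labels", {})
    level = labels.get("severity", "other")
    fields = {
        "level": level,
        "alertname": labels.get("alertname", ""),
        "instance": labels.get("instance", ""),
        "description": alert.get("annotations", {}).get("description", ""),
    }
    recipients = RECIPIENTS_BY_GROUP.get(labels.get("group", ""), "on-call")
    plan = []
    for channel in CHANNELS_BY_LEVEL.get(level, ["email"]):
        template = TEMPLATES.get(channel, DEFAULT_TEMPLATE)
        plan.append((channel, recipients, template.safe_substitute(fields)))
    return plan
```

Levels that are missing from the table fall back to e-mail only, and groups that are missing fall back to a generic recipient.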
The early warning information management platform UI is a web-based early warning information management system. It allows technicians to log in with their own accounts, claim early warning information, and, after the early warning has been handled, record the handling process and result in the form of comments; it also supports searching and filtering historical early warning information.
The implementation is divided into a front end and a back end: the front end is developed with the mainstream front-end framework Vue, and the back end is written with the Python web framework Flask.