CN113704065A - Monitoring method, device, equipment and computer storage medium - Google Patents


Publication number
CN113704065A
Authority
CN
China
Prior art keywords
alarm
data
indicator
rule
configuration file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111014468.2A
Other languages
Chinese (zh)
Inventor
董安伦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Puhui Enterprise Management Co Ltd
Original Assignee
Ping An Puhui Enterprise Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Puhui Enterprise Management Co Ltd
Priority to CN202111014468.2A
Publication of CN113704065A

Abstract

Translated from Chinese

The present application relates to the field of software monitoring and provides a monitoring method, apparatus, device, and storage medium. The method includes: acquiring a configuration file, where the configuration file is generated according to configuration parameters input by a user on a user interface and includes an index acquisition rule and an alarm rule; acquiring index data related to the monitored object based on the index acquisition rule; determining, based on the alarm rule, whether an alarm event occurs according to the index data; if it is determined that an alarm event occurs, pushing the alarm information related to the alarm event to an alarm component, so as to obtain an alarm notification produced by the alarm component merging the alarm information; and sending the alarm notification according to the alarm sending mode in the alarm rule. Generating the configuration file from preset configuration parameters improves monitoring efficiency. The present application also relates to artificial intelligence technology, and the monitoring method can be applied to cloud servers that provide big data and artificial intelligence platform cloud computing services.

Figure 202111014468

Description

Monitoring method, device, equipment and computer storage medium
Technical Field
The present application relates to the field of software monitoring, and in particular, to a monitoring method, apparatus, device, and computer storage medium.
Background
An open-source monitoring and alerting system built on a time series database (TSDB), such as Prometheus, periodically pulls the state of a monitored program over HTTP; the port, path, and collection interval of the target program are specified in a configuration file. Because such a system is minimally intrusive to the monitored service and performs well when recording purely numeric time series, it is suitable both for monitoring hardware indexes, such as those of a server, and for monitoring highly dynamic service-oriented architectures. However, Prometheus and its components are managed through YAML configuration files, which imposes a high learning cost on ordinary users. Connecting index data also incurs additional access cost, and the alarm style is limited, so advanced customization requirements cannot be met.
Disclosure of Invention
The present application mainly aims to provide a monitoring method, device, apparatus, and computer-readable storage medium, which aim to reduce the learning cost of operation and maintenance personnel and improve the monitoring efficiency.
In a first aspect, the present application provides a monitoring method, including the steps of:
acquiring a configuration file, wherein the configuration file is generated according to configuration parameters input by a user on a user interface and comprises an index acquisition rule and an alarm rule;
acquiring index data related to the monitored object based on the index acquisition rule;
judging whether an alarm event occurs according to the index data based on the alarm rule;
if an alarm event is judged to occur, pushing alarm information related to the alarm event to an alarm component, so as to obtain an alarm notification produced by the alarm component merging the alarm information;
and sending the alarm notification according to an alarm sending mode in the alarm rule.
In a second aspect, the present application further provides a monitoring device, comprising:
a configuration file acquisition module, configured to acquire a configuration file, wherein the configuration file is generated according to configuration parameters input by a user on a user interface and comprises an index acquisition rule and an alarm rule;
an index data acquisition module, configured to acquire index data related to the monitored object based on the index acquisition rule;
an alarm event judging module, configured to judge, based on the alarm rule, whether an alarm event occurs according to the index data;
an alarm information pushing module, configured to push, if an alarm event is judged to occur, alarm information related to the alarm event to an alarm component, so as to obtain an alarm notification produced by the alarm component merging the alarm information;
and an alarm notification sending module, configured to send the alarm notification according to the alarm sending mode in the alarm rule.
In a third aspect, the present application also provides a computer device comprising a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the monitoring method as described above.
In a fourth aspect, the present application further provides a computer-readable storage medium having a computer program stored thereon, where the computer program, when executed by a processor, implements the steps of the monitoring method as described above.
The present application provides a monitoring method, a monitoring apparatus, a device, and a computer-readable storage medium. A configuration file is acquired, where the configuration file is generated according to configuration parameters input by a user on a user interface and comprises an index acquisition rule and an alarm rule; index data related to the monitored object is acquired based on the index acquisition rule; whether an alarm event occurs is judged according to the index data based on the alarm rule; if an alarm event is judged to occur, alarm information related to the alarm event is pushed to an alarm component, so as to obtain an alarm notification produced by the alarm component merging the alarm information; and the alarm notification is sent according to the alarm sending mode in the alarm rule. In other words, the configuration file is generated from preset configuration parameters, and the generated configuration file is then used to acquire index data and to judge whether an alarm event occurs. Operation and maintenance personnel only need to perform simple configuration on the user interface to generate the configuration file, which saves the time and learning cost of writing the configuration file by hand and improves monitoring efficiency.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a monitoring method according to an embodiment of the present application;
Fig. 2 is a usage scenario diagram of a monitoring method according to an embodiment of the present application;
Fig. 3 is a schematic block diagram of a monitoring apparatus according to an embodiment of the present application;
Fig. 4 is a block diagram illustrating the structure of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
The embodiments of the present application provide a monitoring method, a monitoring apparatus, a computer device, and a computer-readable storage medium. The monitoring method can be applied to a terminal device, which can be an electronic device such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, or a wearable device. The method can also be applied to a server, which can be an independent server or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a Content Delivery Network (CDN), and big data and artificial intelligence platforms.
Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of a monitoring method according to an embodiment of the present application. For example, the monitoring method may be implemented as a plug-in based on Prometheus; it may also be based on other monitoring systems, which is not limited herein.
As shown in fig. 1, the monitoring method includes steps S101 to S105.
Step S101, obtaining a configuration file, wherein the configuration file is generated according to configuration parameters input by a user on a user interface, and comprises an index obtaining rule and an alarm rule;
Illustratively, the preset configuration parameters are determined by the configuration operations that operation and maintenance personnel perform on the user interface. Because the configuration file is generated from the parameters set on the user interface, the time needed to write the configuration file and the associated learning cost are reduced, which improves monitoring efficiency.
Specifically, the preset configuration parameters are used for indicating an acquisition path, an acquisition cycle, an alarm threshold, an alarm information sending mode, and the like of the index data.
For example, the configuration file may be generated according to the preset configuration parameters and a preset configuration file framework.
For example, a user may fill in a threshold value of relevant data in a user interface, set the acquisition period to 2 seconds, and select an alarm level, and then may generate the following corresponding instructions in a configuration file:
groups:
  - name: default
    rules:
      - alert: Test_Alert
        expr: PACES_EMALL_DB021679 < 0
        for: 2s
        labels:
          severity: warning
The instructions in the configuration file are determined according to the configuration parameters input by the user on the user interface.
Similarly, other parts of the configuration file, such as the index obtaining rule, may also be generated according to the configuration parameters, and are not described herein again.
Illustratively, the generated configuration file includes an index obtaining rule and an alarm rule. Specifically, the index acquisition rule includes an acquisition path, an acquisition period, and the like for acquiring the index data, where the acquisition path is used to indicate an acquisition entry of the index data, and the acquisition period is used to determine a period for acquiring the index data; the alarm rule comprises an alarm threshold value of the index data, an alarm information sending mode and the like.
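The generation of the alarm-rule portion of the configuration file from user-interface parameters can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `AlertParams` class, its field names, and the `render_alert_rules` function are hypothetical, and the "preset configuration file framework" is assumed to be a fixed text skeleton whose slots are filled with the user's values.

```python
from dataclasses import dataclass


@dataclass
class AlertParams:
    """Values a user might enter on the user interface (hypothetical field names)."""
    alert_name: str
    metric: str
    operator: str      # comparison operator, e.g. "<" or ">"
    threshold: float
    duration: str      # how long the condition must hold, e.g. "2s"
    severity: str


def render_alert_rules(params: AlertParams, group: str = "default") -> str:
    """Fill a fixed rule-file skeleton (assumed framework) with the user's parameters."""
    lines = [
        "groups:",
        f"  - name: {group}",
        "    rules:",
        f"      - alert: {params.alert_name}",
        f"        expr: {params.metric} {params.operator} {params.threshold:g}",
        f"        for: {params.duration}",
        "        labels:",
        f"          severity: {params.severity}",
    ]
    return "\n".join(lines) + "\n"


# Reproduce the example rule from the description.
params = AlertParams("Test_Alert", "PACES_EMALL_DB021679", "<", 0, "2s", "warning")
config_text = render_alert_rules(params)
print(config_text)
```

The index acquisition rule could presumably be rendered the same way from the acquisition path and acquisition period, using a second skeleton.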
As shown in fig. 2, fig. 2 is a usage scenario diagram provided in an embodiment of the present application, and the configuration parameters may be determined by configuration operations of an operation and maintenance person at a terminal device.
Because the configuration file is generated from the preset configuration parameters, operation is simpler: operation and maintenance personnel only need to perform simple configuration on the user interface to generate the configuration file, which saves the time and learning cost of writing it by hand and improves monitoring efficiency.
Step S102, acquiring index data related to the monitored object based on the index acquisition rule.
In some embodiments, step S102 includes steps S1021 to S1022:
Step S1021, acquiring the service data of the monitored object according to the acquisition path and the acquisition period in the index acquisition rule.
Illustratively, the acquisition path in the index acquisition rule specifies an acquisition entry for capturing the service data, and the service data is periodically and actively captured through the acquisition entry. Specifically, the period for collecting the service data is determined based on the collection period in the index acquisition rule. Illustratively, the service data may be data traffic information, data storage information, CPU occupation information of the server, and the like.
For example, the monitoring platform may concurrently execute a multi-thread task, and acquire a plurality of service data of the monitoring object, for example, simultaneously acquire data traffic information and data storage information.
Because the service data of the monitored object is pulled through a preset acquisition entry, the monitored object does not need to be aware of the monitoring platform. This reduces the coupling between the monitoring platform and the monitored object and makes the monitoring system more stable.
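Periodic, concurrent pulling of several kinds of service data can be sketched as below. This is an illustrative sketch under stated assumptions: the source names, the `collect_once` and `collect_periodically` functions, and the stand-in fetchers are hypothetical; in practice each fetcher would read from the acquisition entry named in the index acquisition rule.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def collect_once(sources: Dict[str, Callable[[], float]]) -> Dict[str, float]:
    """Pull every source concurrently and return {source name: value}."""
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {name: pool.submit(fetch) for name, fetch in sources.items()}
        return {name: future.result() for name, future in futures.items()}


def collect_periodically(
    sources: Dict[str, Callable[[], float]], period_s: float, rounds: int
) -> List[Dict[str, float]]:
    """Run `rounds` collection cycles, sleeping `period_s` seconds between cycles."""
    samples = []
    for i in range(rounds):
        samples.append(collect_once(sources))
        if i < rounds - 1:
            time.sleep(period_s)
    return samples


# Stand-in fetchers (hypothetical); real ones would query the acquisition entry.
sources = {
    "data_traffic": lambda: 123.0,
    "data_storage": lambda: 456.0,
}
samples = collect_periodically(sources, period_s=0.01, rounds=2)
print(samples)
```

The monitored object only exposes data; all scheduling lives in the collector, which is what keeps the two loosely coupled.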
Step S1022, performing preset indexing processing on the service data to obtain index data related to the monitoring object.
Illustratively, the index data includes the service data together with an index name (metric name) and an index tag (label name).
In some embodiments, the performing a preset indexing process on the service data to obtain index data related to the monitored object includes: determining an index name and an index label of the business data according to the configuration file and the source of the business data; and adding the index name and the index label to the service data to obtain index data. It can be understood that the preset processing on the service data may also be to determine the index tag first and then determine the index name, which is not limited herein.
Illustratively, the index name and the index tag are configured in advance: operation and maintenance personnel assign an index name and an index tag to each data source from which service data is collected, and when the collected service data undergoes the preset indexing processing, the pre-configured index name and index tag are looked up according to the source of the service data. The index name identifies the index data, so that the location of an abnormality can be identified simply and clearly when an alarm message is sent. The index tag is used to group alarm information. When services in a cloud environment are tightly coupled, many alarm events can occur at the same time; grouping avoids a sudden flood of alarm notifications that would prevent operation and maintenance personnel from locating the problem quickly.
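The indexing step can be sketched as a lookup from the data source to a pre-configured name and tag set. This is a minimal sketch: the `IndexData` class, the `INDEX_CONFIG` mapping, and the source identifiers and tag keys shown are hypothetical examples, not names taken from the application.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class IndexData:
    name: str                 # index name (metric name)
    labels: Dict[str, str]    # index tags (label names and values)
    value: float              # the collected service data


# Hypothetical pre-configured mapping: data source -> (index name, index tags).
INDEX_CONFIG: Dict[str, Tuple[str, Dict[str, str]]] = {
    "server-01/cpu": ("cpu_usage_percent", {"group": "hardware", "host": "server-01"}),
    "emall/db": ("db_storage_bytes", {"group": "database", "service": "emall"}),
}


def to_index_data(source: str, value: float) -> IndexData:
    """Attach the pre-configured index name and tags for `source` to a raw value."""
    name, labels = INDEX_CONFIG[source]
    return IndexData(name=name, labels=dict(labels), value=value)


sample = to_index_data("server-01/cpu", 83.5)
print(sample)
```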
Step S103, judging whether an alarm event occurs according to the index data based on the alarm rule.
In some embodiments, the alarm rule is used to determine the alarm threshold, which comprises a data threshold.
In some embodiments, the determining whether an alarm event occurs according to the index data based on the alarm rule includes: judging whether the numerical value of the service data in the index data is larger than the data threshold value corresponding to the index data in the alarm rule; and if the numerical value of the service data in the index data is larger than the data threshold, judging that an alarm event occurs.
For example, taking the index data as CPU occupation information as an example, if the data threshold is set to 80%, when the value of the service data in the index data indicating the CPU occupation information is greater than 80%, it is determined that an alarm event occurs.
In some embodiments, the alert threshold further comprises a time threshold.
In some embodiments, the determining whether an alarm event occurs according to the index data based on the alarm rule includes: judging whether the value of the service data in the index data is greater than the data threshold corresponding to the index data in the alarm rule; if so, judging whether the duration for which the value has remained greater than the data threshold exceeds the time threshold corresponding to the index data in the alarm rule; and if that duration exceeds the time threshold, judging that an alarm event occurs.
For example, taking CPU occupation information as the index data, if the data threshold is set to 80% and the time threshold is set to 15 seconds, an alarm event is judged to occur when the value of the service data indicating CPU occupation is greater than 80% and has remained greater than 80% for more than 15 seconds.
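The two judgment variants above (data threshold only, and data threshold plus time threshold) can be sketched with a small stateful evaluator. This is an illustrative sketch; the `ThresholdAlarm` class and its method names are hypothetical, and timestamps are passed in explicitly so the logic is easy to follow.

```python
from typing import Optional


class ThresholdAlarm:
    """Judges an alarm event from a data threshold and an optional time threshold."""

    def __init__(self, data_threshold: float, time_threshold_s: Optional[float] = None):
        self.data_threshold = data_threshold
        self.time_threshold_s = time_threshold_s
        self._exceeded_since: Optional[float] = None  # when the value first went above

    def observe(self, value: float, timestamp_s: float) -> bool:
        """Return True if this sample means an alarm event has occurred."""
        if value <= self.data_threshold:
            self._exceeded_since = None  # condition cleared; restart the clock
            return False
        if self.time_threshold_s is None:
            return True  # data-threshold-only variant fires immediately
        if self._exceeded_since is None:
            self._exceeded_since = timestamp_s
        return timestamp_s - self._exceeded_since > self.time_threshold_s


# CPU example from the text: 80% data threshold, 15-second time threshold.
alarm = ThresholdAlarm(data_threshold=80.0, time_threshold_s=15.0)
readings = [(0, 85.0), (10, 90.0), (16, 88.0), (20, 70.0), (25, 95.0)]
results = [alarm.observe(value, t) for t, value in readings]
print(results)
```

Only the sample at 16 seconds triggers an alarm event here, because it is the first point at which the value has stayed above 80% for more than 15 seconds; the dip at 20 seconds resets the duration.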
Step S104, if the alarm event is judged to occur, the alarm information related to the alarm event is pushed to an alarm component so as to obtain an alarm notice obtained by combining the alarm information by the alarm component.
Referring to fig. 2, the alarm component can obtain alarm information related to an alarm event, and combine the alarm information to obtain an alarm notification. The alarm component may be configured on a server to which the monitoring method of the embodiment of the present application is applied, or may also be configured on other servers or terminal devices.
In some embodiments, the alarm component merges the alarm information after receiving it. Specifically, according to the type of alarm information indicated by the index tags, alarm information of the same type is placed in the same group and alarm information of different types is placed in different groups, and a corresponding alarm notification is generated for each group. When services in a cloud environment are tightly coupled, many alarm events can occur at the same time; merging avoids a sudden flood of alarm notifications that would prevent operation and maintenance personnel from locating the problem quickly.
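Merging alarm information by index tag can be sketched as a group-by followed by one notification per group. This is a minimal sketch: the dictionary layout of an alarm, the `group` tag key, the notification text format, and the `merge_alarms` function are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def merge_alarms(alarms: List[dict], group_by: str = "group") -> List[str]:
    """Group alarm information by one index tag and emit one notification per group."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for alarm in alarms:
        grouped[alarm["labels"].get(group_by, "ungrouped")].append(alarm)

    notifications = []
    for group, items in sorted(grouped.items()):
        names = ", ".join(sorted({item["name"] for item in items}))
        notifications.append(f"[{group}] {len(items)} alarm(s): {names}")
    return notifications


alarms = [
    {"name": "cpu_usage_percent", "labels": {"group": "hardware"}},
    {"name": "memory_usage_percent", "labels": {"group": "hardware"}},
    {"name": "db_storage_bytes", "labels": {"group": "database"}},
]
notifications = merge_alarms(alarms)
for line in notifications:
    print(line)
```

Three alarm events become two notifications, one per tag group, and each notification still carries the index names needed to locate the problem.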
Illustratively, the alert component may be an alert manager.
Step S105, sending the alarm notification according to the alarm sending mode in the alarm rule.
Illustratively, the alarm notification is sent to the alarm notification receiver in the alarm rule by the alarm sending mode in the alarm rule, so that the alarm event can be processed in time.
Illustratively, the alarm sending mode includes at least one of WeChat, email, Slack (a message integration tool), and Webhook (a message interface), but is not limited thereto.
Illustratively, the alarm notification includes an index name of index data of the alarm event, so that the operation and maintenance personnel can locate the alarm event according to the index name.
Specifically, the alarm sending method determines a receiver of the alarm notification, and taking sending the alarm notification by mail as an example, the receiver may be a mailbox address of an operation and maintenance worker, so that when an alarm event occurs, the alarm component sends the alarm notification to the mailbox address of the operation and maintenance worker.
In some embodiments, after the alarm notification is sent, if no feedback is received indicating that the receiver has handled the alarm event, the alarm notification may be sent again at a preset sending interval, or it may be muted. For example, the importance of the alarm notification may be determined according to the group to which it belongs and the associated number, and whether to resend or mute the notification may be decided according to that importance.
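The resend-or-mute decision can be sketched as follows. This sketch rests on several assumptions that the text does not specify: the "number" is interpreted as the count of alarms merged into the notification, importance is computed as a per-group weight multiplied by that count, and the weights and the cutoff value are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Notification:
    group: str
    alarm_count: int            # assumed meaning of "number": alarms merged into it
    acknowledged: bool = False  # True once the receiver reports handling the event


# Hypothetical per-group weights; unknown groups default to 1.
GROUP_WEIGHT = {"database": 3, "hardware": 2}


def importance(notification: Notification) -> int:
    """Assumed scoring rule: group weight times number of merged alarms."""
    return GROUP_WEIGHT.get(notification.group, 1) * notification.alarm_count


def follow_up(notification: Notification, resend_at_or_above: int = 4) -> str:
    """Decide what to do once the preset sending interval has elapsed."""
    if notification.acknowledged:
        return "done"
    return "resend" if importance(notification) >= resend_at_or_above else "mute"


print(follow_up(Notification("database", 2)))
print(follow_up(Notification("hardware", 1)))
print(follow_up(Notification("database", 5, acknowledged=True)))
```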
In some embodiments, the monitoring method further comprises:
generating a data baseline by performing a preset operation on the index data;
determining a predicted value of the index data according to the data baseline;
and determining the data threshold value and the time threshold value according to the index data predicted value.
Illustratively, the data baseline is obtained by performing a preset operation on the index data. Specifically, the data baseline is a smooth curve of the index data over time, and a baseline predicted value can be determined from it to predict the index data. The preset operation on the index data includes calculating at least one of the standard deviation, the average value, the maximum value, and the minimum value of the index data.
For example, the index data at a future time, that is, the baseline predicted value, can be determined from the points on the data baseline. Taking CPU occupation information as an example, the baseline predicted value of the CPU occupation information at one or more moments in a future period can be determined from the data baseline generated by performing the preset operation on the CPU occupation information.
Because the index data can be predicted through the baseline predicted value, the alarm threshold of the index data can be adjusted dynamically. Taking the data threshold as an example, a value greater than the baseline predicted value at a given moment may be chosen as the data threshold for that moment; the time threshold may be determined on a similar principle, which is not described again here.
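One possible way to derive a dynamic data threshold from a baseline is sketched below. The specific choices are assumptions, since the text leaves the preset operation open: the baseline is a trailing moving average, the predicted value is taken naively as the last baseline point, and the threshold is set `k` standard deviations above the prediction. The function names and the parameters `window` and `k` are hypothetical; the time threshold is not covered by this sketch.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def moving_average(values: List[float], window: int) -> List[float]:
    """Smooth the series with a trailing moving average (the assumed baseline)."""
    return [mean(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]


def dynamic_threshold(values: List[float], window: int = 3, k: float = 2.0) -> Tuple[float, float]:
    """Return (predicted value, data threshold) for the next sample."""
    baseline = moving_average(values, window)
    predicted = baseline[-1]              # naive prediction: carry the baseline forward
    spread = pstdev(values[-window:])     # standard deviation of the recent samples
    return predicted, predicted + k * spread


cpu = [40.0, 42.0, 44.0, 46.0, 48.0]
baseline = moving_average(cpu, window=3)
predicted, threshold = dynamic_threshold(cpu)
print(baseline, predicted, round(threshold, 2))
```

With this rule the threshold tracks the recent level of the index data instead of staying fixed, so a gradual, expected rise does not trigger alarms while a sharp deviation from the baseline still does.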
In some embodiments, the monitoring method further comprises:
responding to the modification operation of the configuration parameters to obtain modified configuration parameters;
performing increment judgment on the modified configuration parameters;
and if the result of the increment judgment indicates that the configuration file needs to be updated, updating the configuration file according to the modified configuration parameters.
Illustratively, in response to an operation of modifying configuration parameters at the user interface, the monitoring platform may perform an incremental determination on the configuration parameters, determine changed configuration parameters, and determine whether the configuration file needs to be updated based on the changed configuration parameters.
If the result of the increment judgment is that the configuration file needs to be updated, the configuration file is updated according to the changed configuration parameters. For example, if the acquisition path in the configuration parameters changes, the part of the configuration file that determines the acquisition path is updated. In this way, modifications made to the configuration parameters on the user interface are reflected in the configuration file in real time, which reduces the time and learning cost that operation and maintenance personnel would otherwise spend writing the configuration file and improves the user experience.
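The increment judgment can be sketched as a comparison between the previous and the modified parameters, followed by an update limited to the changed entries. This is an illustrative sketch: the parameter keys, the flat dictionary used to stand in for the configuration file, and the function names are hypothetical.

```python
from typing import Any, Dict


def changed_parameters(old: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the parameters whose values differ from the previous version."""
    return {key: value for key, value in new.items() if old.get(key) != value}


def apply_update(config: Dict[str, Any], old: Dict[str, Any], new: Dict[str, Any]) -> bool:
    """Update the affected parts of `config`; return False if nothing changed."""
    delta = changed_parameters(old, new)
    if not delta:
        return False  # increment judgment: no update needed
    config.update(delta)
    return True


old_params = {"acquisition_path": "/metrics", "acquisition_period": "2s", "threshold": 80}
new_params = {"acquisition_path": "/stats", "acquisition_period": "2s", "threshold": 80}

config = dict(old_params)  # stand-in for the generated configuration file
updated = apply_update(config, old_params, new_params)
print(updated, config)
```

Only the acquisition path entry is rewritten in this example; the unchanged period and threshold entries are left untouched.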
In some embodiments, the monitoring method further comprises:
and sending the index data to a man-machine interaction subsystem for display.
Illustratively, the human-computer interaction subsystem can acquire the index data acquired by the monitoring platform and display the index data according to a preset mode so as to enhance the readability of the index data and enable the monitoring process to be more intuitive.
In some embodiments, the human-computer interaction subsystem is a Grafana visualization panel. The Grafana visualization panel can acquire and display the index data, such as data flow information, data storage information, CPU occupation information of a server and the like, so that the readability of the data is enhanced, and the operation and maintenance are more visual.
According to the monitoring method provided by the embodiments of the present application, the configuration file is generated from the preset configuration parameters, and the generated configuration file is used to acquire index data and to judge whether an alarm event occurs. Operation and maintenance personnel only need to perform simple configuration on the user interface to generate the configuration file, which saves the time and learning cost of writing the configuration file by hand and improves monitoring efficiency.
Referring to fig. 3 in conjunction with the foregoing embodiment, fig. 3 is a schematic diagram of a monitoring device according to an embodiment of the present application, where the monitoring device may be configured in a server or a terminal for executing the foregoing monitoring method.
As shown in Fig. 3, the monitoring apparatus includes: a configuration file obtaining module 110, an index data obtaining module 120, an alarm event judging module 130, an alarm information pushing module 140, and an alarm notification sending module 150.
The configuration file obtaining module 110 is configured to obtain a configuration file, where the configuration file is generated according to configuration parameters input by a user on a user interface and includes an index obtaining rule and an alarm rule;
the index data obtaining module 120 is configured to obtain, based on the index obtaining rule, index data related to the monitored object;
the alarm event judging module 130 is configured to judge whether an alarm event occurs according to the index data based on the alarm rule;
the alarm information pushing module 140 is configured to, if it is judged that an alarm event occurs, push alarm information related to the alarm event to an alarm component, so as to obtain an alarm notification produced by the alarm component merging the alarm information;
and the alarm notification sending module 150 is configured to send the alarm notification according to the alarm sending mode in the alarm rule.
Illustratively, the index data obtaining module 120 includes a service data obtaining submodule and an index data generation submodule.
The service data obtaining submodule is configured to obtain the service data of the monitored object according to the acquisition path and the acquisition period in the index obtaining rule.
The index data generation submodule is configured to perform preset indexing processing on the service data to obtain index data related to the monitored object.
Illustratively, the index data generation submodule includes an index name determination submodule, an index tag determination submodule, and an index name tag addition submodule.
The index name determination submodule is configured to determine an index name and an index tag of the service data according to the configuration file and the source of the service data;
and the index name tag adding submodule is used for adding the index name and the index tag to the service data to obtain index data.
Illustratively, the alarm event judging module 130 includes a first data threshold judgment submodule and a first alarm event judgment submodule.
The first data threshold judgment submodule is configured to judge whether the value of the service data in the index data is greater than the data threshold corresponding to the index data in the alarm rule.
The first alarm event judgment submodule is configured to judge that an alarm event occurs if the value of the service data in the index data is greater than the data threshold.
Illustratively, the alarm event judging module 130 includes a second data threshold judgment submodule, a time threshold judgment submodule, and a second alarm event judgment submodule.
A second data threshold judgment submodule, configured to judge whether a numerical value of service data in the index data is greater than a data threshold corresponding to the index data in the alarm rule;
a time threshold judgment submodule, configured to, if the value of the service data in the index data is greater than the data threshold, judge whether a duration that the value of the service data in the index data is greater than the data threshold is greater than a time threshold corresponding to the index data in the alarm rule;
and the second alarm event judgment submodule is used for judging that an alarm event occurs if the duration time that the numerical value of the service data in the index data is greater than the data threshold is greater than the time threshold corresponding to the index data in the alarm rule.
Illustratively, the monitoring device further comprises a data baseline generation module, an index data prediction module and an alarm threshold determination module.
The data baseline generation module is configured to generate a data baseline by performing a preset operation on the index data, where the preset operation includes at least one of: standard deviation operation, average value operation, maximum value operation and minimum value operation;
the index data prediction module is used for determining the index data predicted value according to the data baseline;
and the alarm threshold determination module is used for determining the data threshold and the time threshold according to the index data predicted value.
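A possible shape for the baseline computation is sketched below. The text names the statistics (standard deviation, average, maximum, minimum) but does not say how the predicted value or the thresholds are derived from them, so using the mean as the predicted value and `mean + k * std` as the data threshold are assumptions of this example. Deriving the time threshold is left out because the text gives no basis for it.

```python
import statistics
from typing import Dict, Sequence


def build_baseline(values: Sequence[float]) -> Dict[str, float]:
    """Apply the preset operations named in the text to historical values."""
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "max": max(values),
        "min": min(values),
    }


def derive_data_threshold(baseline: Dict[str, float], k: float = 3.0) -> float:
    # Assumption: the predicted value is the baseline mean, and the data
    # threshold sits k standard deviations above it.
    predicted = baseline["mean"]
    return predicted + k * baseline["std"]


baseline = build_baseline([8.0, 12.0])
threshold = derive_data_threshold(baseline, k=3.0)
```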
Illustratively, the monitoring device further comprises a configuration parameter modification module, an increment judgment module and a configuration file updating module.
The configuration parameter modification module is used for responding to a modification operation on the configuration parameters to obtain the modified configuration parameters.
The increment judgment module is used for performing increment judgment on the modified configuration parameters.
The configuration file updating module is used for updating the configuration file according to the modified configuration parameters if the result of the increment judgment indicates that the configuration file needs to be updated.
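One way to read "increment judgment" is as a diff between the current and the modified parameters, with the configuration file rewritten only when the diff is non-empty. The sketch below follows that reading; the function names and the flat dictionary representation of the configuration are hypothetical, and removed parameters are not handled.

```python
from typing import Tuple

_MISSING = object()


def changed_parameters(current: dict, modified: dict) -> dict:
    """Return only the parameters that were added or whose value changed."""
    return {
        key: value
        for key, value in modified.items()
        if current.get(key, _MISSING) != value
    }


def update_config_if_needed(config: dict, modified: dict) -> Tuple[dict, bool]:
    delta = changed_parameters(config, modified)
    if not delta:
        return config, False  # nothing changed, keep the existing file
    return {**config, **delta}, True  # new config with the changes applied
```

Skipping the update when nothing changed avoids regenerating and reloading the configuration file on every save from the user interface.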
It should be noted that, as will be clear to those skilled in the art, for convenience and brevity of description, the specific working processes of the apparatus, the modules and the units described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The methods, apparatus, and devices of the present application may be deployed in numerous general-purpose or special-purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The above-described methods and apparatuses may be implemented, for example, in the form of a computer program that can be run on a computer device as shown in fig. 4.
Referring to fig. 4, fig. 4 is a schematic block diagram of a computer device according to an embodiment of the present disclosure. The computer device may be a server or a terminal.
As shown in fig. 4, the computer device includes a processor, a memory, and a network interface connected by a system bus, wherein the memory may include a storage medium and an internal memory.
The storage medium may store an operating system and a computer program. The computer program comprises program instructions which, when executed, cause a processor to perform any of the monitoring methods.
The processor is used for providing calculation and control capability and supporting the operation of the whole computer equipment.
The internal memory provides an environment for the execution of a computer program on a storage medium, which when executed by a processor causes the processor to perform any of the monitoring methods.
The network interface is used for network communication, such as sending assigned tasks. Those skilled in the art will appreciate that the architecture shown in fig. 4 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply; a particular computing device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
It should be understood that the Processor may be a Central Processing Unit (CPU), and the Processor may be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, etc. Wherein a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
Wherein, in one embodiment, the processor is configured to execute a computer program stored in the memory to implement the steps of:
acquiring a configuration file, wherein the configuration file is generated according to configuration parameters input by a user on a user interface and comprises an index acquisition rule and an alarm rule;
acquiring index data related to the monitored object based on the index acquisition rule;
judging whether an alarm event occurs according to the index data based on the alarm rule;
if an alarm event is judged to occur, pushing alarm information related to the alarm event to an alarm component so as to obtain an alarm notice obtained by combining the alarm information by the alarm component;
and sending the alarm notification according to an alarm sending mode in the alarm rule.
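The last two steps (merging alarm information in the alarm component, then sending by the configured mode) can be sketched as below. The claims state that the index label is used to merge alarm information, so the sketch groups by one label; the label key `service`, the dictionary shapes, and the `senders` mapping are assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def merge_alarms(alarms: List[dict], group_label: str = "service") -> List[dict]:
    """Group alarm information by one label and build one notification per group."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for alarm in alarms:
        groups[alarm["labels"].get(group_label, "unknown")].append(alarm)
    return [
        {
            "group": group,
            "count": len(items),
            "summary": "; ".join(f"{a['name']}={a['value']}" for a in items),
        }
        for group, items in groups.items()
    ]


def send_notifications(
    notifications: List[dict],
    send_mode: str,
    senders: Dict[str, Callable[[dict], None]],
) -> None:
    # send_mode stands in for the alarm sending mode read from the alarm rule.
    send = senders[send_mode]
    for notification in notifications:
        send(notification)
```

Merging before sending means that several simultaneous alarms for the same labelled object reach the recipient as a single notification.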
In some embodiments, the processor, when implementing acquiring the index data related to the monitored object based on the index acquisition rule, is configured to implement:
acquiring the service data of the monitored object according to the acquisition path and the acquisition cycle in the index acquisition rule;
and performing preset indexing processing on the service data to obtain index data related to the monitored object.
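Collecting service data by acquisition path and acquisition cycle amounts to polling. The sketch below assumes the path is an opaque string handed to a caller-supplied `fetch` function and that the cycle is expressed in seconds; the function name, the `rounds` limit, and the injectable `sleep` are illustrative choices that keep the example finite and testable.

```python
import time
from typing import Callable, List


def collect_service_data(
    fetch: Callable[[str], float],
    collection_path: str,
    cycle_s: float,
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Poll the collection path once per cycle and return the values read."""
    samples: List[float] = []
    for i in range(rounds):
        samples.append(fetch(collection_path))
        if i < rounds - 1:
            sleep(cycle_s)  # wait one acquisition cycle between reads
    return samples
```

A production collector would run indefinitely and pass each value on to the indexing step instead of accumulating a list.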
In some embodiments, when the processor implements the preset indexing processing on the service data to obtain the index data related to the monitored object, the processor is configured to implement:
determining an index name and an index label of the business data according to the configuration file and the source of the business data;
and adding the index name and the index label to the service data to obtain index data.
In some embodiments, the processor, when implementing the determination of whether an alarm event occurs according to the index data based on the alarm rule, is configured to implement:
judging whether the numerical value of the service data in the index data is larger than the data threshold value corresponding to the index data in the alarm rule;
and if the numerical value of the service data in the index data is larger than the data threshold, judging that an alarm event occurs.
In some embodiments, the processor, when implementing the determination of whether an alarm event occurs according to the index data based on the alarm rule, is configured to implement:
judging whether the numerical value of the service data in the index data is larger than the data threshold value corresponding to the index data in the alarm rule;
if the numerical value of the service data in the index data is larger than the data threshold, judging whether the duration time that the numerical value of the service data in the index data is larger than the data threshold is larger than the time threshold corresponding to the index data in the alarm rule;
and if the duration time that the numerical value of the service data in the index data is greater than the data threshold value is greater than the time threshold value corresponding to the index data in the alarm rule, judging that an alarm event occurs.
In some embodiments, the processor, when implementing the monitoring method, is configured to implement:
generating a data baseline by performing a preset operation on the index data, wherein the preset operation comprises at least one of the following operations: standard deviation operation, average value operation, maximum value operation and minimum value operation;
determining an index data predicted value according to the data baseline;
and determining the data threshold value and the time threshold value according to the index data predicted value.
In some embodiments, the processor, when implementing the monitoring method, is configured to implement:
responding to the modification operation of the configuration parameters to obtain modified configuration parameters;
performing increment judgment on the modified configuration parameters;
and if the result of the increment judgment indicates that the configuration file needs to be updated, updating the configuration file according to the modified configuration parameters.
It should be noted that, as will be clear to those skilled in the art, for convenience and brevity of description, the specific working process of the monitoring method described above may refer to the corresponding process in the foregoing monitoring method embodiment, and is not described herein again.
Embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, where the computer program includes program instructions, and a method implemented when the program instructions are executed may refer to various embodiments of the monitoring method of the present application.
The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiment, for example, a hard disk or a memory of the computer device. The computer readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the computer device.
It is to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments. While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

Translated from Chinese
1. A monitoring method, characterized in that the method comprises:
acquiring a configuration file, wherein the configuration file is generated according to configuration parameters input by a user on a user interface, and the configuration file comprises an index acquisition rule and an alarm rule;
acquiring index data related to the monitored object based on the index acquisition rule;
judging, based on the alarm rule, whether an alarm event occurs according to the index data;
if it is determined that an alarm event occurs, pushing alarm information related to the alarm event to an alarm component, so as to obtain an alarm notification produced by the alarm component merging the alarm information;
sending the alarm notification according to the alarm sending mode in the alarm rule.

2. The method according to claim 1, characterized in that acquiring the index data related to the monitored object based on the index acquisition rule comprises:
acquiring service data of the monitored object according to the acquisition path and the acquisition cycle in the index acquisition rule;
performing preset indexing processing on the service data to obtain the index data related to the monitored object.

3. The method according to claim 2, characterized in that performing the preset indexing processing on the service data to obtain the index data related to the monitored object comprises:
determining an index name and an index label of the service data according to the configuration file and the source of the service data;
adding the index name and the index label to the service data to obtain the index data;
wherein the index name is used to indicate the location where an alarm event occurs when the alarm event occurs, and the index label is used to merge alarm information when the alarm event occurs.

4. The method according to any one of claims 1-3, characterized in that judging, based on the alarm rule, whether an alarm event occurs according to the index data comprises:
judging whether the numerical value of the service data in the index data is greater than the data threshold corresponding to the index data in the alarm rule;
if the numerical value of the service data in the index data is greater than the data threshold, determining that an alarm event occurs.

5. The method according to any one of claims 1-3, characterized in that judging, based on the alarm rule, whether an alarm event occurs according to the index data comprises:
judging whether the numerical value of the service data in the index data is greater than the data threshold corresponding to the index data in the alarm rule;
if the numerical value of the service data in the index data is greater than the data threshold, judging whether the duration for which the numerical value of the service data in the index data is greater than the data threshold is greater than the time threshold corresponding to the index data in the alarm rule;
if the duration for which the numerical value of the service data in the index data is greater than the data threshold is greater than the time threshold corresponding to the index data in the alarm rule, determining that an alarm event occurs.

6. The method according to claim 5, characterized in that the method further comprises:
generating a data baseline by performing a preset operation on the numerical values of the service data in the index data, wherein the preset operation comprises at least one of the following: standard deviation operation, average value operation, maximum value operation, minimum value operation;
determining an index data predicted value according to the data baseline;
determining the data threshold and the time threshold according to the index data predicted value.

7. The method according to any one of claims 1-3, characterized in that the method further comprises:
in response to a modification operation on the configuration parameters, obtaining modified configuration parameters;
performing increment judgment on the modified configuration parameters;
if the result of the increment judgment indicates that the configuration file needs to be updated, updating the configuration file according to the modified configuration parameters.

8. A monitoring device, characterized in that the monitoring device comprises:
a configuration file acquisition module, configured to acquire a configuration file, wherein the configuration file is generated according to configuration parameters input by a user on a user interface, and the configuration file comprises an index acquisition rule and an alarm rule;
an index data acquisition module, configured to acquire index data related to the monitored object based on the index acquisition rule;
an alarm event determination module, configured to judge, based on the alarm rule, whether an alarm event occurs according to the index data;
an alarm information pushing module, configured to, if it is determined that an alarm event occurs, push alarm information related to the alarm event to an alarm component, so as to obtain an alarm notification produced by the alarm component merging the alarm information;
an alarm notification sending module, configured to send the alarm notification according to the alarm sending mode in the alarm rule.

9. A computer device, characterized in that the computer device comprises a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the monitoring method according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, wherein the computer program, when executed by a processor, implements the steps of the monitoring method according to any one of claims 1 to 7.

CN202111014468.2A | 2021-08-31 | 2021-08-31 | Monitoring method, device, equipment and computer storage medium | Pending | CN113704065A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202111014468.2A, CN113704065A (en) | 2021-08-31 | 2021-08-31 | Monitoring method, device, equipment and computer storage medium

Publications (1)

Publication Number | Publication Date
CN113704065A (en) | 2021-11-26

Family

ID=78658139

Family Applications (1)

Application Number | Status | Publication
CN202111014468.2A | Pending | CN113704065A (en)

Country Status (1)

Country | Link
CN (1) | CN113704065A (en)

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
WD01 | Invention patent application deemed withdrawn after publication (application publication date: 2021-11-26)
