Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The service monitoring method of the embodiments of the application can run on a terminal device or on a server. The terminal device may be a local terminal device. When the method runs on a server, it may be implemented as cloud display.
In an alternative embodiment, cloud display refers to a way of presenting information based on cloud computing. In the cloud display operating mode, the entity that runs the information processing program is separated from the entity that presents the information picture: the service monitoring method is stored and executed on a cloud display server, while the cloud display client receives and sends data and presents the information picture. For example, the cloud display client may be a display device with a data transmission function close to the user side, such as a mobile terminal, a television, a computer, or a palmtop computer; the device that processes the information data is the cloud display server in the cloud. When browsing, the user operates the cloud display client to send an operation instruction to the cloud display server; the cloud display server renders the information according to the operation instruction, encodes and compresses the data, and returns it to the cloud display client over the network; finally, the cloud display client decodes the data and outputs the display content.
In another alternative embodiment, the terminal device may be a local terminal device. The local terminal device stores an application program and presents the application interface. It interacts with the user through a graphical user interface; that is, the application program is downloaded, installed and run on the electronic device in the conventional manner. The local terminal device may provide the graphical user interface to the user in various ways; for example, the interface may be rendered on a display screen of the terminal or provided to the user by holographic projection. For example, the local terminal device may include a display screen for presenting a graphical user interface that includes an application screen, and a processor for running the application, generating the graphical user interface, and controlling its display on the display screen.
The application provides a service monitoring method, an apparatus, an electronic device and a storage medium, which can reduce invalid alarms and thereby improve alarm accuracy, so that the user can pay attention to every alarm instead of being worn down by a continuous stream of invalid alarms.
In order to facilitate understanding of the service monitoring method provided by the embodiment of the present application, the following concepts are explained first:
Service level indicator (SLI): an indicator of service quality. An SLI is generally expressed as a ratio of two numbers, the number of good events divided by the total number of events (e.g., the number of successful Hypertext Transfer Protocol (HTTP) requests / the total number of HTTP requests, or the number of remote procedure call (RPC) calls completed within 100 ms / the total number of RPC calls).
Service level objective (SLO): the service target within an effective window along a given metric dimension, e.g., 99.9% of HTTP requests succeed over 30 days. That is, an SLO is the target value of an SLI. The effective window of the SLO is the period over which the SLO is to be met.
Error budget: 100% minus the SLO percentage, i.e., the amount of error theoretically allowed over a period of time (e.g., 30 days).
Error rate: the proportion of problem events among all events.
Burn rate: the speed at which the service consumes the error budget, relative to the SLO.
Accuracy rate (precision): of all detected events, the proportion that are actual problem events.
Recall: of all problem events that actually occur, the proportion that are detected.
Reset time: how long the alarm continues after the problem has been resolved.
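As an illustrative aid only, the definitions above can be written as small Python helpers. The function and parameter names are assumptions introduced for this sketch and do not appear in the embodiments; all percentages are expressed as fractions.

```python
def sli(good_events: int, total_events: int) -> float:
    """Service level indicator: good events divided by total events."""
    return good_events / total_events


def error_budget(slo: float) -> float:
    """Error budget: 100% minus the SLO, both expressed as fractions."""
    return 1.0 - slo


def error_rate(problem_events: int, total_events: int) -> float:
    """Error rate: problem events as a proportion of all events."""
    return problem_events / total_events


def precision(true_detections: int, all_detections: int) -> float:
    """Accuracy rate: share of detected events that are real problem events."""
    return true_detections / all_detections


def recall(true_detections: int, all_problem_events: int) -> float:
    """Recall: share of real problem events that were detected."""
    return true_detections / all_problem_events
```

For instance, an SLO of 99.9% gives an error budget of about 0.1%, and 999 successful requests out of 1000 give an SLI of 0.999.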
The service monitoring method provided by the embodiment of the application is explained in detail below.
Referring to FIG. 1, which shows a flowchart of the steps of a service monitoring method in an embodiment of the present application, the method may include the following steps 101 to 103.
Step 101: acquire service data of the target service within a time window at preset time intervals.
The preset time interval can be adjusted for different application scenarios; that is, different target services may use different time intervals. For example, the preset time interval may be 1 minute.
In addition, the target service may be an HTTP request, an RPC call, or the like.
Step 102: calculate the burn rate from the service data.
Optionally, the service data includes the total number of events and the number of problem events of the target service; calculating the burn rate from the service data includes:
calculating the ratio of the number of problem events to the total number of events as the error rate;
calculating the ratio of the error rate to a predetermined error budget as the burn rate.
Here, let the number of problem events be X, the total number of events be Y, the predetermined error budget be Z, and the error rate be C. Then the error rate C = X/Y, and the burn rate = C/Z.
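The calculation in step 102 can be sketched in a few lines of Python. This is a minimal illustration; the function name and the handling of an empty window (returning 0) are assumptions of the sketch, not something the embodiment specifies.

```python
def burn_rate(problem_events: int, total_events: int, error_budget: float) -> float:
    """Burn rate = (X / Y) / Z, using the symbols of the description."""
    if total_events == 0:
        # Assumption: a window with no traffic consumes no error budget.
        return 0.0
    error_rate = problem_events / total_events  # C = X / Y
    return error_rate / error_budget            # burn rate = C / Z
```

With 2 problem events out of 100 and an error budget of 1%, `burn_rate(2, 100, 0.01)` is approximately 2, meaning the budget is being consumed about twice as fast as the SLO allows.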
It follows that the burn rate represents the speed at which the error budget is consumed. For example, a burn rate of 1 may be defined as the consumption speed at which the error budget reaches exactly 0 at the end of the SLO's effective window. For example, if the effective window of the SLO is 30 days and the SLO is 99%, then at a burn rate of 1 it takes exactly 30 days to consume the entire error budget. The relationship between the burn rate and the time required to exhaust the error budget may be as shown in Table 1.
Table 1: Burn rate versus time required to exhaust the error budget
| Burn rate | Error rate (SLO 99%) | Time required to exhaust the error budget |
| 1 | 1% | 30 days |
| 2 | 2% | 15 days |
| 10 | 10% | 3 days |
| 1000 | 1000% | 43.2 minutes |
Here, burn rate = error rate / error budget, and the burn rate represents the speed at which the error budget is consumed. Since the error budget is fixed, when the error rate increases by a factor of N (N greater than 1), the burn rate also increases by a factor of N and the time required to exhaust the error budget is reduced to 1/N.
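The inverse relationship between burn rate and exhaustion time can be checked with a short Python sketch. The function name and the default 30-day effective window are assumptions made for illustration.

```python
from datetime import timedelta


def time_to_exhaustion(burn_rate: float,
                       slo_window: timedelta = timedelta(days=30)) -> timedelta:
    """Time needed to consume the whole error budget at a constant burn rate."""
    if burn_rate <= 0:
        raise ValueError("burn rate must be positive")
    # A burn rate of 1 uses up the budget exactly at the end of the window,
    # so a burn rate of N uses it up in 1/N of the window.
    return slo_window / burn_rate
```

For a 30-day window, burn rates of 2 and 10 give 15 days and 3 days respectively, and a burn rate of 1000 gives 43 minutes 12 seconds, i.e. 43.2 minutes.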
Step 103: execute a preset alarm operation when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold.
Different burn rate thresholds can be preset for different target services. That is, different burn rate thresholds can be set in different application scenarios to meet the service monitoring requirements of each scenario.
In addition, the alarm operation may be at least one of: sending alarm information to a predetermined electronic device (i.e., a short message alarm), placing a call to a predetermined communication number (i.e., a telephone alarm), and sending an alarm email to a predetermined mailbox address (i.e., a mail alarm).
As can be seen from steps 101 to 103, in the embodiment of the present application, service data of the target service within a time window is acquired at preset time intervals; the burn rate is calculated from the service data; and a preset alarm operation is executed when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold. The burn rate represents how fast the error budget is being consumed and reflects the occurrence of problem events of the target service more intuitively, so alarms based on the burn rate can be more accurate. Because the alarm operation is executed only when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold, accidental fluctuations of the burn rate are filtered out and frequent alarms are avoided. Therefore, the embodiment of the application can improve alarm accuracy and reduce invalid alarms, so that the user can pay attention to every alarm instead of being worn down by invalid alarms.
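Steps 101 to 103 can be pictured end to end with the following Python sketch. It is only one possible reading: the sample format, the function name `alarm_times`, and the interpretation of "reaches" as `>=` and "exceeds" as `>` are assumptions, and the alarm operation is reduced to recording a timestamp.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical sample format: (timestamp in seconds, problem events, total events)
Sample = Tuple[float, int, int]


def alarm_times(samples: Iterable[Sample], error_budget: float,
                burn_rate_threshold: float,
                duration_threshold_s: float) -> List[float]:
    """Return the timestamps at which the alarm operation would be executed."""
    fired: List[float] = []
    breach_start: Optional[float] = None
    for ts, problems, total in samples:          # step 101: one sample per interval
        rate = (problems / total) / error_budget if total else 0.0  # step 102
        if rate >= burn_rate_threshold:          # step 103: threshold reached
            if breach_start is None:
                breach_start = ts
            if ts - breach_start > duration_threshold_s:
                fired.append(ts)
        else:
            breach_start = None                  # a dip clears the duration
    return fired
```

A single healthy sample in the middle of a run of bad samples restarts the duration, which is how short fluctuations are kept from triggering an alarm.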
Optionally, the time window includes at least two time windows;
calculating the burn rate from the service data includes:
calculating the burn rate of each time window from the service data in that time window;
executing the preset alarm operation when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold includes:
executing the preset alarm operation when the duration for which the calculated burn rate of every time window reaches the burn rate threshold exceeds the duration threshold.
It can be understood that, in the embodiment of the present application, at least two time windows may be stored in advance. The burn rate of each time window is then calculated from the service data in that time window, each calculated burn rate is compared with the preset burn rate threshold, and the preset alarm operation is executed if the duration for which the burn rate threshold is reached exceeds the duration threshold.
For example, two time windows A and B are preset, and the burn rate threshold is S. At time t, the service data in time window A is acquired and the burn rate a1 is calculated, and the service data in time window B is acquired and the burn rate b1 is calculated. If a1 > S and b1 > S, it is recorded that the burn rates of both time windows A and B exceed the burn rate threshold S at time t, and t is taken as the start time at which the burn rates of both time windows exceed the burn rate threshold S.
At time t+T1, the service data in time window A is acquired and the burn rate a2 is calculated, and the service data in time window B is acquired and the burn rate b2 is calculated. If a2 > S and b2 > S, it is recorded that the burn rates of both time windows A and B exceed the burn rate threshold S at time t+T1; at this point, the duration for which the burn rates of both time windows have exceeded S is T1.
That is, if the burn rates of time window A and time window B both exceed S at time t, and both again exceed S at time t+T1, the duration for which the burn rates of the two time windows A and B exceed S is determined to be T1.
The above process is repeated every T1 until the duration for which the burn rates of both time windows A and B exceed S is greater than the duration threshold, at which point the preset alarm operation is executed. If instead a2 > S and b2 < S, the burn rate of time window A at time t+T1 exceeds S but the burn rate of time window B does not; in this case, the recorded duration for which both time windows exceed S is reset to zero.
In addition, T1 is smaller than a predetermined value; for example, T1 may be 1 minute.
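The bookkeeping in the A/B example can be sketched as a small tracker that is updated every T1. The class name, the use of a dictionary keyed by window name, and the choice of `>=` for "reaches the threshold" are assumptions of this sketch.

```python
from typing import Mapping, Optional


class MultiWindowTracker:
    """Tracks how long every window's burn rate has stayed at or above S."""

    def __init__(self, burn_rate_threshold: float, duration_threshold_s: float):
        self.burn_rate_threshold = burn_rate_threshold
        self.duration_threshold_s = duration_threshold_s
        self._start: Optional[float] = None  # time t at which all windows first breached

    def update(self, now_s: float, burn_rates: Mapping[str, float]) -> bool:
        """Feed one evaluation; return True if the alarm operation should run."""
        if all(rate >= self.burn_rate_threshold for rate in burn_rates.values()):
            if self._start is None:
                self._start = now_s
            return now_s - self._start > self.duration_threshold_s
        # At least one window is below S: clear the accumulated duration.
        self._start = None
        return False
```

Calling `update` once per minute with `{"A": a, "B": b}` reproduces the example: the duration grows by T1 while both windows stay above S and is cleared as soon as either window drops below it.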
As can be seen from the foregoing, in the embodiment of the present application, a burn rate may be calculated for each of at least two pre-stored time windows, and the alarm operation is executed only when the duration for which the calculated burn rates reach the burn rate threshold exceeds the duration threshold. This filters out accidental fluctuations of the burn rate and avoids frequent alarms.
The number of consecutive times the burn rate reaches the burn rate threshold can also be recorded, and whether to execute the preset alarm operation can be decided from that count. For example, when the number of consecutive times the burn rate reaches the burn rate threshold exceeds a preset number, the preset alarm operation is executed.
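The count-based variant can be sketched as follows. The class and attribute names are hypothetical, and "reaches" is again read as `>=`.

```python
class ConsecutiveBreachCounter:
    """Counts consecutive evaluations whose burn rate reaches the threshold."""

    def __init__(self, burn_rate_threshold: float, preset_count: int):
        self.burn_rate_threshold = burn_rate_threshold
        self.preset_count = preset_count
        self.count = 0

    def update(self, burn_rate: float) -> bool:
        """Return True when the consecutive count exceeds the preset count."""
        if burn_rate >= self.burn_rate_threshold:
            self.count += 1
        else:
            self.count = 0  # the run of breaches is broken
        return self.count > self.preset_count
```

When evaluations happen at a fixed interval, counting consecutive breaches is equivalent to measuring duration in units of that interval.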
Optionally, the time window includes a first type window and a second type window, where an absolute value of a time length difference between the first type window and the second type window is greater than a second preset time length.
Two types of time windows with a large difference in duration can be preset. Of these two types, the one with the longer duration may be referred to as the long window and the one with the shorter duration as the short window. For example, the pairing of long and short windows can be as shown in Table 2.
In addition, when a long window and a short window are paired, if the burn rate of the long window and the burn rate of the short window calculated at time t both exceed the preset burn rate threshold, and this persists for longer than the duration threshold, the alarm operation is triggered so that the relevant personnel can handle the alarm, that is, resolve the problem event of the target service. If the problem event occurring from t to t+T2 is resolved, the reset time is T2. When the burn rate is calculated after t+T2, the service data in the long window may still include the service data from the reset time, so the burn rate of the long window is relatively large and may still reach the burn rate threshold; the service data in the short window, however, may no longer include the service data from the reset time, so the burn rate of the short window is relatively small and may not exceed the burn rate threshold. Therefore, with a long window paired with a short window, the burn rates of the two windows calculated after the problem event is resolved may no longer jointly satisfy the alarm condition, so no alarm is triggered and the relevant personnel need not spend time on an already resolved problem event. Pairing long and short windows can thus reduce the reset time.
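The effect of pairing a long window with a short window can be seen in a small simulation. Everything in the following sketch is assumed for illustration: per-minute counts, a 60-minute long window, a 5-minute short window, a 99.9% SLO (error budget 0.001), a burn rate threshold of 14.4, and an incident of 10 bad minutes followed by healthy traffic.

```python
from typing import Optional, Sequence


def window_burn_rate(problems: Sequence[int], totals: Sequence[int],
                     end: int, window: int, error_budget: float) -> float:
    """Burn rate over the `window` most recent minutes ending at minute `end`."""
    start = max(0, end - window + 1)
    problem_sum = sum(problems[start:end + 1])
    total_sum = sum(totals[start:end + 1])
    if total_sum == 0:
        return 0.0
    return (problem_sum / total_sum) / error_budget


def first_clear_minute(problems: Sequence[int], totals: Sequence[int],
                       resolved_at: int, windows: Sequence[int],
                       threshold: float, error_budget: float) -> Optional[int]:
    """First minute at or after `resolved_at` where not every window reaches the threshold."""
    for minute in range(resolved_at, len(problems)):
        rates = [window_burn_rate(problems, totals, minute, w, error_budget)
                 for w in windows]
        if not all(rate >= threshold for rate in rates):
            return minute
    return None


# Illustrative data: minutes 0-9 have 50% errors, minutes 10-99 are healthy.
problems = [500] * 10 + [0] * 90
totals = [1000] * 100
long_only = first_clear_minute(problems, totals, 10, [60], 14.4, 0.001)
long_and_short = first_clear_minute(problems, totals, 10, [60, 5], 14.4, 0.001)
```

With these assumed numbers, the long window alone keeps satisfying the alarm condition until minute 68, whereas the long and short windows together stop satisfying it at minute 14, shortly after the incident ends at minute 10.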
Optionally, a plurality of sets of configurations are pre-stored, each set of configurations including at least two time windows, a burn rate threshold, a duration threshold, and an alarm operation;
executing the preset alarm operation when the duration for which the calculated burn rate of every time window reaches the burn rate threshold exceeds the duration threshold includes:
executing the alarm operation in the i-th set of configurations when the duration for which the burn rate of every time window in the i-th set of configurations reaches the burn rate threshold in the i-th set of configurations exceeds the duration threshold in the i-th set of configurations;
where i is an integer greater than 0.
Different burn rate thresholds may correspond to different alarm operations, so that the alarm mode is selected according to the severity of the problem event. For example, the severity of a service problem can be divided by burn rate threshold into a critical level and a general level, where the critical level uses short message and telephone alarms and the general level uses mail alarms. For example, in Table 2, the burn rate thresholds 14.4 and 6 may be assigned to the critical level, and the burn rate thresholds 3 and 1 to the general level.
Table 2: Pre-stored sets of configurations
For example, the pre-stored sets of configurations are shown in Table 2. That is, time windows of different durations can be combined into groups such as the first to fourth groups, and each group is associated with a burn rate threshold and an alarm operation. Which group or groups of configurations in Table 2 are used to monitor a given service can be determined according to the actual situation of that service.
For example, for monitoring HTTP requests, the configurations of the first to fourth groups may be selected. The service data of each time window shown in Table 2 is then acquired, and the burn rate of each time window is calculated from the service data of that time window. The following procedure is then performed on the burn rates of the time windows in the first group:
Determine whether the duration for which the burn rates of the time windows of the first group reach the burn rate threshold of the first group (14.4) exceeds the duration threshold of the first group; if so, execute the alarm operation of the first group (i.e., short message and telephone alarms).
For the procedure performed on the burn rates of the time windows in the second, third and fourth groups, refer to the procedure described above for the first group; it is not repeated here.
It can be seen that, according to the embodiment of the application, the time windows, burn rate threshold, duration threshold and alarm operation can be flexibly configured for monitoring different services, so that service quality can be monitored more accurately.
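One way to hold several sets of configurations and evaluate them together is sketched below. The data layout and names are assumptions; in the usage note, the burn rate thresholds and long windows follow values given in the description, while the short windows other than 5 minutes and all duration thresholds are placeholders.

```python
from dataclasses import dataclass
from typing import List, Mapping, Optional, Tuple


@dataclass
class AlertConfig:
    """One set of configurations: windows, thresholds and an alarm operation."""
    windows_s: Tuple[int, ...]        # at least two time windows, in seconds
    burn_rate_threshold: float
    duration_threshold_s: float
    action: str                       # label of the alarm operation
    breach_start: Optional[float] = None  # runtime state, not configuration


def evaluate(configs: List[AlertConfig], now_s: float,
             burn_rates: Mapping[int, float]) -> List[str]:
    """Return the alarm operations whose set of configurations is satisfied."""
    actions: List[str] = []
    for cfg in configs:
        if all(burn_rates[w] >= cfg.burn_rate_threshold for w in cfg.windows_s):
            if cfg.breach_start is None:
                cfg.breach_start = now_s
            if now_s - cfg.breach_start > cfg.duration_threshold_s:
                actions.append(cfg.action)
        else:
            cfg.breach_start = None
    return actions
```

For example, `AlertConfig((3600, 300), 14.4, 60, "sms+phone")` and `AlertConfig((86400, 7200), 3.0, 60, "mail")` could stand for a critical-level group and a general-level group; `burn_rates` maps each window length to its most recently calculated burn rate.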
For the configuration in Table 2, if the burn rate of the 3-day time window reaches 1, the 3-day time window can capture a complete problem event, in which case the recall for problem events is higher.
In addition, Table 2 is configured on the basis that the error budget is consumed in 30 days (i.e., a burn rate of 1 over 30 days). When consuming 2% of the error budget within a 1-hour time window is regarded as a problem of critical severity, the value x1 that the burn rate reaches when 2% of the error budget is consumed can be determined from the formula $x_1 = 2\% \div \frac{1}{30 \times 24}$, which gives $x_1 = 14.4$;
Similarly, when consuming 5% of the error budget within a 6-hour time window is regarded as a problem of critical severity, the value x2 that the burn rate reaches when 5% of the error budget is consumed can be determined from the formula $x_2 = 5\% \div \left(\frac{1}{30 \times 24} \times 6\right)$, which gives $x_2 = 6$;
Similarly, when consuming 10% of the error budget within a 24-hour time window is regarded as a problem of general severity, the value x3 that the burn rate reaches when 10% of the error budget is consumed can be determined from the formula $x_3 = 10\% \div \left(\frac{1}{30 \times 24} \times 24\right)$, which gives $x_3 = 3$;
Similarly, when consuming 10% of the error budget within a 3-day time window is regarded as a problem of general severity, the value x4 that the burn rate reaches when 10% of the error budget is consumed can be determined from the formula $x_4 = 10\% \div \left(\frac{1}{30} \times 3\right)$, which gives $x_4 = 1$.
It follows that the burn rate threshold of each group can be calculated from the "consumed error budget" in Table 2.
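The four threshold calculations above share one formula, which can be sketched as a single Python function. The function and parameter names are assumptions; the 30-day (720-hour) effective window is the one stated in the description.

```python
def burn_rate_threshold(budget_fraction: float, window_hours: float,
                        slo_window_hours: float = 30 * 24) -> float:
    """Burn rate at which `budget_fraction` of the error budget is consumed
    within a time window of `window_hours`."""
    # The window covers window_hours / slo_window_hours of the effective
    # window, so consuming budget_fraction in it needs this burn rate.
    return budget_fraction / (window_hours / slo_window_hours)
```

Evaluating it for (2%, 1 hour), (5%, 6 hours), (10%, 24 hours) and (10%, 72 hours) gives approximately 14.4, 6, 3 and 1, matching x1 to x4.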
Optionally, a plurality of sets of configurations are pre-stored, each set of configurations including at least two time windows, at least two burn rate thresholds, a duration threshold, and alarm operations in one-to-one correspondence with the burn rate thresholds; executing the preset alarm operation when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold includes:
acquiring, as first burn rate thresholds, the burn rate thresholds in the i-th set of configurations that are smaller than every calculated burn rate;
acquiring, as a second burn rate threshold, the first burn rate threshold whose sum of absolute differences from the calculated burn rates is smallest;
executing the alarm operation corresponding to the second burn rate threshold in the i-th set of configurations when the duration for which every calculated burn rate reaches the second burn rate threshold exceeds the duration threshold in the i-th set of configurations;
where i is an integer greater than 0.
For example, suppose one pre-stored set of configurations is as shown in Table 3, and the burn rates calculated this time are c for the 1-hour window and d for the 5-minute window. It is then necessary to find, among 14.4, 6, 3 and 1, the burn rate threshold that is smaller than both c and d and has the smallest sum of absolute differences from c and d. For example, if the burn rate thresholds 3 and 1 are both smaller than c and d, |3-c|+|3-d| is compared with |1-c|+|1-d|; if |3-c|+|3-d| < |1-c|+|1-d|, the alarm operation corresponding to the burn rate threshold 3 (i.e., mail alarm) is executed.
Therefore, according to the embodiment of the application, different burn rate thresholds and alarm operations can be flexibly matched to one set of time windows, so that service problems of different severity can be alarmed on using the same set of time windows.
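The two-stage threshold selection can be sketched as follows. The function name is hypothetical; the sketch returns `None` when no threshold is smaller than every calculated burn rate, which is an assumed way of signalling that no alarm level applies.

```python
from typing import Optional, Sequence


def select_threshold(thresholds: Sequence[float],
                     burn_rates: Sequence[float]) -> Optional[float]:
    """Pick the second burn rate threshold as described above."""
    # First burn rate thresholds: those smaller than every calculated burn rate.
    candidates = [t for t in thresholds if all(t < r for r in burn_rates)]
    if not candidates:
        return None
    # Second burn rate threshold: smallest sum of absolute differences.
    return min(candidates, key=lambda t: sum(abs(t - r) for r in burn_rates))
```

For thresholds 14.4, 6, 3 and 1 with c = 4 and d = 5, the candidates are 3 and 1, and 3 is selected because |3-4|+|3-5| = 3 is less than |1-4|+|1-5| = 7. Since every candidate lies below all calculated burn rates, the selected threshold is always the largest candidate.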
Table 3: A pre-stored set of configurations
It should be noted that, for simplicity of description, the method embodiments are shown as a series of acts, but it should be understood by those skilled in the art that the embodiments are not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts are not necessarily required by the embodiments of the application.
Referring to FIG. 2, which shows a block diagram of a service monitoring apparatus in an embodiment of the present application, the service monitoring apparatus 200 may include the following modules:
a data acquisition module 201, configured to acquire service data of a target service within a time window at preset time intervals;
a data calculation module 202, configured to calculate a burn rate from the service data;
an alarm module 203, configured to execute a preset alarm operation when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold.
Optionally, the service data includes: the total number of events and the number of problem events for the target service; the data calculation module 202 includes:
a first calculation sub-module for calculating a ratio of the number of the problem events to the total number of the events as an error rate;
a second calculation sub-module for calculating a ratio of the error rate to a predetermined error budget as the burn rate.
Optionally, the time window includes at least two windows;
the data calculation module 202 is specifically configured to:
calculate the burn rate of each time window from the service data in that time window;
the alarm module 203 is specifically configured to:
execute the preset alarm operation when the duration for which the calculated burn rate of every time window reaches the burn rate threshold exceeds the duration threshold.
Optionally, the time window includes a first type window and a second type window, where an absolute value of a time length difference between the first type window and the second type window is greater than a third preset time length.
Optionally, a plurality of sets of configurations are pre-stored, each set of configurations including at least two time windows, a burn rate threshold, a duration threshold, and an alarm operation; the alarm module 203 is specifically configured to:
execute the alarm operation in the i-th set of configurations when the duration for which the burn rate of every time window in the i-th set of configurations reaches the burn rate threshold in the i-th set of configurations exceeds the duration threshold in the i-th set of configurations;
Wherein i is an integer greater than 0.
Optionally, a plurality of sets of configurations are pre-stored, each set of configurations including at least two time windows, at least two burn rate thresholds, a duration threshold, and alarm operations in one-to-one correspondence with the burn rate thresholds;
the alarm module 203 is specifically configured to:
acquire, as first burn rate thresholds, the burn rate thresholds in the i-th set of configurations that are smaller than every calculated burn rate;
acquire, as a second burn rate threshold, the first burn rate threshold whose sum of absolute differences from the calculated burn rates is smallest;
execute the alarm operation corresponding to the second burn rate threshold in the i-th set of configurations when the duration for which every calculated burn rate reaches the second burn rate threshold exceeds the duration threshold in the i-th set of configurations;
Wherein i is an integer greater than 0.
As can be seen from the above, in the embodiment of the present application, service data of the target service within a time window is acquired at preset time intervals; the burn rate is calculated from the service data; and a preset alarm operation is executed when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold. The burn rate represents how fast the error budget is being consumed and reflects the occurrence of problem events of the target service more intuitively, so alarms based on the burn rate can be more accurate. Because the alarm operation is executed only when the duration for which the calculated burn rate reaches the burn rate threshold exceeds the duration threshold, accidental fluctuations of the burn rate are filtered out and frequent alarms are avoided. Therefore, the embodiment of the application can improve alarm accuracy and reduce invalid alarms, so that the user can pay attention to every alarm instead of being worn down by invalid alarms.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
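Purely as an illustration of how the three modules might cooperate in software, the following Python sketch wires them into one class. The class name, the injected callables and the single-window simplification are all assumptions of the sketch rather than features of the apparatus 200.

```python
from typing import Callable, Optional, Tuple


class ServiceMonitor:
    """Toy counterpart of modules 201 to 203 for a single time window."""

    def __init__(self, acquire: Callable[[], Tuple[int, int]],
                 alarm: Callable[[], None], error_budget: float,
                 burn_rate_threshold: float, duration_threshold_s: float):
        self.acquire = acquire      # stands in for data acquisition module 201
        self.alarm = alarm          # stands in for the alarm operation of module 203
        self.error_budget = error_budget
        self.burn_rate_threshold = burn_rate_threshold
        self.duration_threshold_s = duration_threshold_s
        self._breach_start: Optional[float] = None

    def tick(self, now_s: float) -> bool:
        """Run one monitoring cycle; return True if the alarm was executed."""
        problems, total = self.acquire()
        # Data calculation module 202: error rate divided by error budget.
        rate = (problems / total) / self.error_budget if total else 0.0
        if rate < self.burn_rate_threshold:
            self._breach_start = None
            return False
        if self._breach_start is None:
            self._breach_start = now_s
        if now_s - self._breach_start > self.duration_threshold_s:
            self.alarm()
            return True
        return False
```

A scheduler would call `tick` once per preset time interval, passing the current time in seconds.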
The embodiment of the application also provides electronic equipment, which comprises:
One or more processors; and one or more machine readable media having instructions stored thereon, which when executed by the one or more processors, cause the electronic device to perform the method of embodiments of the present application.
Embodiments of the application also provide one or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause the processors to perform the methods described in embodiments of the application.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.
It will be apparent to those skilled in the art that embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the application may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the application.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or terminal device that comprises the element.
The service monitoring method, apparatus, electronic device and storage medium provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help understand the method of the present application and its core idea. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.