CN116112397B - Method and device for determining delay abnormality and cloud platform - Google Patents

Method and device for determining delay abnormality and cloud platform

Info

Publication number
CN116112397B
Authority
CN
China
Prior art keywords
target
target service
service component
path information
request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211441731.0A
Other languages
Chinese (zh)
Other versions
CN116112397A (en)
Inventor
高力鹏
张志鹏
范晓辉
何兴建
张云锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Postal Savings Bank of China Ltd
Original Assignee
Postal Savings Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Postal Savings Bank of China Ltd
Priority to CN202211441731.0A
Publication of CN116112397A
Application granted
Publication of CN116112397B
Active
Anticipated expiration

Abstract

The application provides a method and a device for determining delay abnormality and a cloud platform. The determining method comprises the following steps: based on the operation log information of each service component, constructing an execution path of a user request on each target service component to obtain target path information; classifying each user request based on target path information of each user request to obtain a plurality of request category groups, and calculating compensation response time of a corresponding target service component when each user request is executed at least based on each target path information in each request category group; based on the compensation response time of each target service component in all request class groups, whether delay faults occur to the corresponding target service components is determined, so that the problem that delay anomalies of service components of a cloud platform are difficult to detect accurately in the prior art is solved, and further the high efficiency of fault processing of the cloud platform and the high reliability of operation of the cloud platform are guaranteed.

Description

Method and device for determining delay abnormality and cloud platform
Technical Field
The application relates to the field of data processing, in particular to a method and a device for determining delay abnormality and a cloud platform.
Background
Cloud computing is a novel computing mode with high flexibility, and the complexity of a cloud computing system is accompanied with the flexibility. Cloud computing integrates massive computing resources (including hardware resources and software resources) and provides services for users through networks in the form of a cloud platform.
A user-initiated request is completed through the interaction and cooperative processing of numerous service components of the cloud platform. These service components include virtualization components, security components, user management components, application extension components, and the like. The service components are deployed in a decentralized manner on different compute nodes, sometimes in different clusters and different data centers, and communicate with each other via the internet or a private network. If a service component fails or performs abnormally while a user request is being executed, the request cannot be executed effectively, i.e. processing of the user request fails. Fault identification therefore needs to be performed on the cloud platform according to the operation log information of each service component, so as to ensure normal operation of the cloud platform.
The operation log information of each service component only records whether the service component executed the user request successfully, together with the timestamps at which execution of the user request started and ended. As a result, identifying faults by inspecting and analyzing the operation log information of each service component, or by checking the resource utilization of each service component at different points in time, can only detect anomalies in which execution of a user request is interrupted. When a user request suffers a delay abnormality, such fault identification methods can only detect that a delay abnormality exists; they cannot accurately determine which service component caused it.
Therefore, a method capable of detecting delay anomalies of service components in a cloud platform is needed.
Disclosure of Invention
The application mainly aims to provide a method and a device for determining delay abnormality and a cloud platform, so as to solve the problem that delay abnormality of a service component of the cloud platform is difficult to detect accurately in the prior art.
According to an aspect of an embodiment of the present invention, there is provided a method for determining a delay abnormality, the method being applied in a cloud platform, the cloud platform including a plurality of service components, the method including: constructing an execution path of a user request on each target service component based on the operation log information of each service component to obtain target path information, wherein one user request corresponds to one target path information, and one target path information at least comprises two target service components; classifying each user request based on the target path information of each user request to obtain a plurality of request category groups, calculating compensation response time of the target service component corresponding to each user request when being executed based on at least each target path information in each request category group, wherein one request category group comprises a plurality of user requests; determining whether a delay fault occurs in the corresponding target service component based at least on the compensating response times of each of the target service components in all of the request class groups.
Optionally, classifying each user request based on the target path information of each user request to obtain a plurality of request category groups, including: dividing the target path information of each user request to obtain a plurality of path elements, wherein each path element is a path formed by two adjacent target service components on the target path information, and one target path information corresponds to at least one path element; and dividing the user requests with the same path primitives into a category to obtain a plurality of request category groups.
Optionally, the target path information further includes timestamp information of each target service component being called, the timestamp information including a start call time and a stop call time, and calculating, based at least on each target path information in each request category group, the compensation response time of the target service component corresponding to each user request when executed includes: calculating the compensation response time using $t_j(i) = |Tb_j(i) - Te_j(i)| - |Tb_{j+1}(i) - Te_{j+1}(i)|$, wherein $t_j(i)$ is the compensation response time of the $j$-th target service component in the target path information corresponding to the $i$-th user request, $Tb_j(i)$ is the start call time of the $j$-th target service component in the target path information corresponding to the $i$-th user request, $Te_j(i)$ is the stop call time of the $j$-th target service component in the target path information corresponding to the $i$-th user request, $Tb_{j+1}(i)$ is the start call time of the $(j+1)$-th target service component in the target path information corresponding to the $i$-th user request, $Te_{j+1}(i)$ is the stop call time of the $(j+1)$-th target service component in the target path information corresponding to the $i$-th user request, $i$ is less than or equal to the total number of user requests in one request category group, and $j$ is less than or equal to the total number of target service components in one target path information.
Optionally, the target path information further includes timestamp information of each target service component being called, the timestamp information including a start call time and a stop call time, and calculating, based at least on each target path information in each request category group, the compensation response time of the target service component corresponding to each user request when executed includes: determining the unidirectional links corresponding to the target path information of each user request in each request category group, wherein one target path information corresponds to at least two unidirectional links; and calculating the compensation response time of the target service component corresponding to each user request when executed at least using $t_j^k(i) = |Tb_j^k(i) - Te_j^k(i)| - |Tb_{j+1}^k(i) - Te_{j+1}^k(i)|$, wherein $t_j^k(i)$ is the compensation response time of the $j$-th target service component on the $k$-th unidirectional link in the target path information corresponding to the $i$-th user request, $Tb_j^k(i)$ is the start call time of the $j$-th target service component on the $k$-th unidirectional link, $Te_j^k(i)$ is the stop call time of the $j$-th target service component on the $k$-th unidirectional link, $Tb_{j+1}^k(i)$ is the start call time of the $(j+1)$-th target service component on the $k$-th unidirectional link, $Te_{j+1}^k(i)$ is the stop call time of the $(j+1)$-th target service component on the $k$-th unidirectional link, each in the target path information corresponding to the $i$-th user request, $i$ is less than or equal to the total number of user requests in one request category group, $j$ is less than or equal to the total number of target service components on one target path information, and $k$ is less than or equal to the total number of unidirectional links of one target path information.
Optionally, determining whether the corresponding target service component has a delay fault based at least on the compensation response time of each target service component in all the request category groups includes: normalizing at least all the compensation response times of the target service components in each request category group to obtain the delay fault rate of each target service component in the corresponding request category group; calculating a target delay fault rate of the corresponding target service component based on the delay fault rate of the target service component in each request category group; and determining whether the corresponding target service component has a delay fault based on the target delay fault rate and the calling ratio of each target service component, wherein one service component corresponds to one calling ratio, and the calling ratio is the ratio of the total number of times the corresponding target service component is called to the total number of times the cloud platform processes user requests.
Optionally, normalizing at least all the compensation response times of the target service components in each request category group to obtain the delay fault rate of each target service component in the corresponding request category group includes: in one request category group, normalizing all the compensation response times of the $z$-th target service component to obtain a plurality of normalized probabilities; and calculating the delay fault rate of the corresponding target service component by a formula over those normalized probabilities, wherein $H_z$ is the delay fault rate of the $z$-th target service component in one request category group, $P_z(l)$ is the $l$-th normalized probability of the $z$-th target service component, $m$ is the total number of normalized probabilities of the $z$-th target service component, and $z$ is less than or equal to the total number of target service components in the corresponding request category group.
Optionally, calculating a target delay fault rate of the corresponding target service component based on the delay fault rate of the target service component in each request class group includes: calculating the sum of the delay fault rates of the same target service component in all the request class groups to obtain the corresponding fault rate sum of the target service components; and calculating the quotient of the sum of the failure rates and the total number of the request category groups to obtain the target delay failure rate.
Optionally, determining whether the corresponding target service component has a delay fault based on the target delay fault rate and the calling ratio of each target service component includes: calculating the product of the target delay fault rate of each target service component and the corresponding calling ratio to obtain a plurality of target fault thresholds; determining that a delay fault occurs in the target service component corresponding to the target fault threshold under the condition that the target fault threshold is larger than a preset fault identification threshold; and under the condition that the target fault threshold is smaller than or equal to the preset fault identification threshold, determining that the target service component corresponding to the target fault threshold has no delay fault.
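The averaging and threshold steps described in the two preceding paragraphs can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `detect_delay_faults`, the input layout, and all numeric values are assumptions made for the example, and the per-group delay fault rates are taken as given inputs because their normalization formula is not reproduced here.

```python
def detect_delay_faults(rates_per_group, call_counts, total_requests, threshold):
    """Flag components whose (target delay fault rate x calling ratio) exceeds a threshold.

    rates_per_group: list of {component: delay fault rate}, one dict per
                     request category group (hypothetical input layout).
    call_counts:     {component: total number of times it was called}.
    total_requests:  total number of user requests processed by the platform.
    """
    n_groups = len(rates_per_group)
    # Sum each component's delay fault rates over all request category groups.
    sums = {}
    for group in rates_per_group:
        for comp, rate in group.items():
            sums[comp] = sums.get(comp, 0.0) + rate

    verdicts = {}
    for comp, total in sums.items():
        target_rate = total / n_groups                  # quotient of the sum and the group count
        call_ratio = call_counts[comp] / total_requests
        # A fault is flagged only when the product is strictly greater than the threshold.
        verdicts[comp] = target_rate * call_ratio > threshold
    return verdicts


# Illustrative numbers only.
verdicts = detect_delay_faults(
    [{"B": 0.8, "C": 0.2}, {"B": 0.6, "C": 0.4}],
    call_counts={"B": 100, "C": 50},
    total_requests=100,
    threshold=0.5,
)
print(verdicts)
```

Weighting by the calling ratio means a component that is rarely called needs a proportionally higher target delay fault rate before it is flagged.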
According to another aspect of the embodiment of the present invention, there is also provided a determining apparatus for latency anomaly, where the determining apparatus is applied in a cloud platform, the cloud platform includes a plurality of service components, and the determining apparatus includes: the construction unit is used for constructing an execution path of a user request on each target service component based on the running log information of each service component to obtain target path information, wherein one user request corresponds to one target path information, and one target path information at least comprises two target service components; a calculating unit, configured to classify each user request based on the target path information of each user request to obtain a plurality of request category groups, and calculate a compensation response time of the target service component corresponding to each user request when executed based on at least each target path information in each request category group, where one request category group includes a plurality of user requests; and the determining unit is used for determining whether the corresponding target service component has delay faults or not at least based on the compensation response time of each target service component in all the request category groups.
According to still another aspect of the embodiment of the present invention, there is also provided a cloud platform including a determining device configured to execute any one of the above methods for determining a delay abnormality.
In the method for determining delay abnormality of the embodiment of the present invention, first, an execution path of a user request on each target service component is constructed based on the operation log information of each service component, so as to obtain the target path information of each user request; then, the user requests are classified based on their target path information to obtain a plurality of request category groups, and the compensation response time of each target service component when executing the corresponding user request is calculated within each request category group; finally, whether a delay fault occurs in the corresponding target service component is determined based at least on the compensation response times of each target service component in all request category groups. The prior art determines whether a delay fault has occurred in the cloud platform only from the log information of the service components or the resource utilization of the cloud platform. In this scheme, by contrast, the determination is made per target service component from its compensation response times across all request category groups, so the scheme not only detects that a delay fault has occurred in the cloud platform but also identifies the service component in which it occurred. This solves the prior-art problem that delay faults of cloud platform service components are difficult to detect accurately, and in turn improves the efficiency of fault handling and the operating reliability of the cloud platform.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:
FIG. 1 illustrates a flow chart of a method of determining a delay anomaly in one embodiment of the present application;
FIG. 2 illustrates a schematic diagram of target path information for a target service component of one embodiment of the application;
FIG. 3 illustrates a schematic diagram of target path information for a target service component of one embodiment of the application;
FIG. 4 shows a schematic diagram of target path information of a target service component according to another embodiment of the application.
Wherein the above figures include the following reference numerals:
100, UI service port; 101, first target service component; 102, second target service component; 103, third target service component.
Detailed Description
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
In order that those skilled in the art will better understand the present application, a technical solution in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate in order to describe the embodiments of the application herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
As described in the background, it is difficult in the prior art to accurately detect delay anomalies of the service components of a cloud platform. To solve this problem, an exemplary embodiment of the present application provides a method and an apparatus for determining a delay abnormality.
According to the embodiment of the application, a method for determining delay abnormality is provided.
Fig. 1 is a flowchart of a method of determining a delay exception according to an embodiment of the present application. The determining method is applied to a cloud platform, wherein the cloud platform comprises a plurality of service components, as shown in fig. 1, and the determining method comprises the following steps:
Step S101, based on the operation log information of each service component, constructing an execution path of a user request on each target service component to obtain target path information, wherein one user request corresponds to one target path information, and one target path information at least comprises two target service components;
step S102, classifying each user request based on the target path information of each user request to obtain a plurality of request category groups, and calculating the compensation response time of the corresponding target service component when each user request is executed based on at least each target path information in each request category group, wherein one request category group comprises a plurality of user requests;
step S103, determining whether a delay fault occurs in the corresponding target service component based at least on the compensating response time of each target service component in all the request class groups.
In the method for determining delay abnormality described above, first, an execution path of a user request on each target service component is constructed based on the operation log information of each service component, so as to obtain the target path information of each user request; then, the user requests are classified based on their target path information to obtain a plurality of request category groups, and the compensation response time of each target service component when executing the corresponding user request is calculated within each request category group; finally, whether a delay fault occurs in the corresponding target service component is determined based at least on the compensation response times of each target service component in all request category groups. The prior art determines whether a delay fault has occurred in the cloud platform only from the log information of the service components or the resource utilization of the cloud platform. In this scheme, by contrast, the determination is made per target service component from its compensation response times across all request category groups, so the scheme not only detects that a delay fault has occurred in the cloud platform but also identifies the service component in which it occurred. This solves the prior-art problem that delay faults of cloud platform service components are difficult to detect accurately, and in turn improves the efficiency of fault handling and the operating reliability of the cloud platform.
In practice, for a cloud platform (i.e. a cloud computing service platform), a user inputs user requests at different UI service ports, and the cloud platform returns the result directly. In fact, a user request input through a UI service port is executed by a plurality of target service components in the cloud platform in sequence. Therefore, when a user request is input, the operation log information of each service component in the cloud platform records the timestamps at which execution of the request started and ended, the ID information of the executed user request, and the source component (i.e. the upstream component) of the user request, so that the path through the target service components taken by the user request can be constructed from the operation log information of the service components on the cloud platform. In a specific embodiment of the present application, assuming that the ID information corresponding to a user request is 00001, its target path information can be constructed as the directed graph shown in FIG. 2. That is, the user request is input from UI service port A (i.e. the UI service port 100) and passes through target service component B (i.e. the first target service component 101), target service component C (i.e. the second target service component 102) and target service component D (i.e. the third target service component 103), and its target path information is A->B, B->C and B->D.
In a specific embodiment of the present application, the user requests are classified based on their target path information to obtain a plurality of request category groups, so that all user requests in one request category group pass through the same target service components. This in turn ensures that whether a delay fault has occurred in a target service component can be determined accurately from the compensation response times of that component.
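A minimal sketch of how such a path could be rebuilt from log records, assuming each record carries a request ID, the executing component, its source component, and start and end timestamps. The `LogRecord` layout, the function name `build_paths`, and the timestamp values are hypothetical; the text only states which pieces of information the logs contain.

```python
from collections import defaultdict
from typing import NamedTuple


class LogRecord(NamedTuple):
    # Hypothetical record layout mirroring the fields named in the text.
    request_id: str
    component: str
    source: str      # upstream component that issued the call
    start: float     # timestamp at which execution started
    end: float       # timestamp at which execution ended


def build_paths(records):
    """Group records by request ID and return, per request, the sorted list of
    directed edges (source, component) that the request traversed."""
    edges = defaultdict(set)
    for rec in records:
        edges[rec.request_id].add((rec.source, rec.component))
    return {rid: sorted(e) for rid, e in edges.items()}


# Request 00001 from the example: A -> B, B -> C and B -> D.
records = [
    LogRecord("00001", "B", "A", 0.0, 9.0),
    LogRecord("00001", "C", "B", 1.0, 4.0),
    LogRecord("00001", "D", "B", 5.0, 8.0),
]
paths = build_paths(records)
print(paths["00001"])
```

Storing the path as a set of directed edges keeps the structure of the directed graph while making two requests with the same path directly comparable.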
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.
In order to more simply classify each user request, in one embodiment of the present application, classifying each user request based on the target path information of each user request to obtain a plurality of request class groups includes: dividing the target path information requested by each user to obtain a plurality of path elements, wherein each path element is a path formed by two adjacent target service components on the target path information, and one target path information corresponds to at least one path element; and dividing the user requests with the same path primitives into a category to obtain a plurality of request category groups.
In a specific embodiment of the present application, before the target path information corresponding to each user request is divided into path primitives, the target path information may first be graded by level. The grading may proceed as follows. When a user inputs a user request at a UI service port, the request passes through each target service component on the execution path, so the first target service component, which is directly connected to the UI service port, obtains the user request directly. The second target service component, which follows the first, executes the user request only on receiving a call command from the first target service component, so the calls to the target service components form a hierarchy of levels. The target service components can therefore be graded according to their positions on the execution path of the corresponding user request. Specifically, a user request corresponds to one execution path over the target service components, and the unidirectional links on that path can be found by following the transmission direction of the user request. The level division of the target service components on each unidirectional link is independent of that on the other links.
Thus, for a unidirectional link starting from a UI service port, the levels are defined as follows: the UI service port is the level-0 target service component; the target service component directly connected to the UI service port is the primary (level-1) target service component; a target service component that connects to the UI service port through the primary target service component is the secondary (level-2) target service component; a target service component that connects to the UI service port through the primary and secondary target service components is the tertiary (level-3) target service component; and so on.
In practice, each user request corresponds one-to-one with its target path information, and user requests with different target path information are not necessarily user requests of the same category. The target path information corresponding to each user request is therefore divided into path primitives; specifically, it may be divided into the smallest path primitives. Taking the target path information shown in FIG. 2 as an example, it can be divided into three path primitives, namely A->B, B->C and B->D. If the path primitives of another user request have the same composition as those of this user request, the two user requests can be placed in the same request category group. In this way each user request belongs to one request category group, and each target service component in a user request corresponds to a level on the transmission path of that user request.
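The level definition above can be sketched as a breadth-first walk from the UI service port. This is an illustration under the assumption that the execution path is tree-shaped, so that a component's distance from the port is the same on every unidirectional link it belongs to; the function name `component_levels` and the edge-list input are hypothetical.

```python
from collections import deque


def component_levels(edges, ui_port):
    """Return {component: level}. The UI service port is level 0 and every other
    component's level is its number of call hops from the port."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)

    levels = {ui_port: 0}
    queue = deque([ui_port])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in levels:          # first visit fixes the level
                levels[child] = levels[node] + 1
                queue.append(child)
    return levels


# Path of FIG. 2: A -> B, B -> C, B -> D.
levels = component_levels([("A", "B"), ("B", "C"), ("B", "D")], "A")
print(levels)
```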
In practice, for a user request such as the one shown in FIG. 3, the target path information has only one unidirectional path, A->B, B->C, namely from the UI service port 100 to the first target service component 101 and then to the second target service component 102. In order to calculate the compensation response time of each target service component more simply in this case, in a further embodiment of the present application, the target path information further includes timestamp information of each target service component being called, the timestamp information including a start call time and a stop call time, and calculating, based at least on each target path information in each request category group, the compensation response time of the target service component corresponding to each user request when executed includes: calculating the compensation response time using $t_j(i) = |Tb_j(i) - Te_j(i)| - |Tb_{j+1}(i) - Te_{j+1}(i)|$, wherein $t_j(i)$ is the compensation response time of the $j$-th target service component in the target path information corresponding to the $i$-th user request, $Tb_j(i)$ is the start call time of the $j$-th target service component in the target path information corresponding to the $i$-th user request, $Te_j(i)$ is the stop call time of the $j$-th target service component in the target path information corresponding to the $i$-th user request, $Tb_{j+1}(i)$ is the start call time of the $(j+1)$-th target service component in the target path information corresponding to the $i$-th user request, $Te_{j+1}(i)$ is the stop call time of the $(j+1)$-th target service component in the target path information corresponding to the $i$-th user request, $i$ is less than or equal to the total number of user requests in one request category group, and $j$ is less than or equal to the total number of target service components in one target path information.
In the actual application process, the value of i starts from 1 up to the total number of all user requests in a request class group. The value of j starts from 1 up to the total number of target service components in one target path information.
In a specific embodiment of the present application, as shown in fig. 3, on the target path information corresponding to the user request, the compensation response time for the target service component B (the first target service component 101) is the difference between the first response time and the second response time, where the first response time is the absolute value of the difference between the start call time and the stop call time of the target service component B, and the second response time is the absolute value of the difference between the start call time and the stop call time of the target service component C (i.e. the second target service component 102). Of course, the compensation response time to the target service component C may be only the absolute value of the difference between the start call time and the stop call time of the target service component C.
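A worked sketch of this calculation for a single unidirectional path: each component's compensation response time is its own call duration minus the call duration of the next component, and the last component keeps its full duration. The function name `compensation_times` and the timestamps are illustrative assumptions.

```python
def compensation_times(chain):
    """chain: list of (component, start_call_time, stop_call_time) in call order.
    Returns {component: compensation response time}."""
    # |Tb - Te| for every component on the chain.
    durations = [abs(start - end) for _, start, end in chain]
    result = {}
    for idx, (name, _, _) in enumerate(chain):
        # The last component has no downstream call to subtract.
        downstream = durations[idx + 1] if idx + 1 < len(chain) else 0.0
        result[name] = durations[idx] - downstream
    return result


# B is called from t=0 to t=9 and, while running, calls C from t=1 to t=4.
times = compensation_times([("B", 0.0, 9.0), ("C", 1.0, 4.0)])
print(times)  # B: 9 - 3 = 6, C: 3
```

Subtracting the downstream duration isolates the time spent in the component itself, so a slow component C does not inflate the figure attributed to B.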
In practice, as shown in Fig. 2, the target path information of a user request may contain two or more unidirectional paths. To simplify the calculation of the compensation response time of each target service component in this case, in another embodiment of the present application, the target path information further includes timestamp information of each target service component being invoked, the timestamp information including a start call time and a stop call time, and calculating, based at least on each piece of target path information in each request class group, the compensation response time of the target service component corresponding to each user request when executed includes: determining the unidirectional links corresponding to the target path information of each user request in each request class group, where one piece of target path information corresponds to at least two unidirectional links; and calculating the compensation response time of the target service component corresponding to each user request when executed using at least $t_j^k(i) = |Tb_j^k(i) - Te_j^k(i)| - |Tb_{j+1}^k(i) - Te_{j+1}^k(i)|$, where $t_j^k(i)$ is the compensation response time of the $j$-th target service component on the $k$-th unidirectional link in the target path information corresponding to the $i$-th user request, $Tb_j^k(i)$ is the start call time of the $j$-th target service component on the $k$-th unidirectional link in that target path information, $Te_j^k(i)$ is the stop call time of the $j$-th target service component on the $k$-th unidirectional link in that target path information, $Tb_{j+1}^k(i)$ is the start call time of the $(j+1)$-th target service component on the $k$-th unidirectional link in that target path information, $Te_{j+1}^k(i)$ is the stop call time of the $(j+1)$-th target service component on the $k$-th unidirectional link in that target path information, $i$ is less than or equal to the total number of user requests in one request class group, $j$ is less than or equal to the total number of target service components in one piece of target path information, and $k$ is less than or equal to the total number of unidirectional links of one piece of target path information.
In the actual application process, the value of i starts from 1 up to the total number of all user requests in a request class group. The value of j starts from 1 up to the total number of target service components in one target path information. The value of k starts from 1 up to the total number of the unidirectional links described above for one target path information.
In a specific embodiment of the present application, as shown in Fig. 2, the target path information corresponding to the user request has two unidirectional paths, A -> B -> C and A -> B -> D, so the formula $t_j^k(i) = |Tb_j^k(i) - Te_j^k(i)| - |Tb_{j+1}^k(i) - Te_{j+1}^k(i)|$ can be used to calculate the compensation response times of the target service component A, the target service component B, the target service component C and the target service component D respectively. Because the calculation is carried out separately on the unidirectional paths A -> B -> C and A -> B -> D, the target service component B is calculated twice; in this case, the compensation response time of the target service component B may be the average of its compensation response times on the two unidirectional paths.
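A minimal sketch of the multi-link case follows. The names `link_compensation` and `path_compensation` and the timestamps are hypothetical. The text only states that component B, which lies on both links, may take the average of its two values; applying the same averaging to every component that appears on more than one link is an assumption made here for simplicity.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

Call = Tuple[str, float, float]  # (component, start call time, stop call time)

def link_compensation(link: List[Call]) -> Dict[str, float]:
    """Compensation response times along one unidirectional link."""
    out: Dict[str, float] = {}
    for j, (name, start, stop) in enumerate(link):
        value = abs(start - stop)
        if j + 1 < len(link):
            _, nxt_start, nxt_stop = link[j + 1]
            value -= abs(nxt_start - nxt_stop)
        out[name] = value
    return out

def path_compensation(links: List[List[Call]]) -> Dict[str, float]:
    """Average each component's value over all links it appears on (assumption)."""
    samples = defaultdict(list)
    for link in links:
        for name, value in link_compensation(link).items():
            samples[name].append(value)
    return {name: mean(values) for name, values in samples.items()}

# Hypothetical timestamps for the two links of Fig. 2: A -> B -> C and A -> B -> D.
a, b = ("A", 0.0, 10.0), ("B", 1.0, 9.0)
c, d = ("C", 3.0, 6.0), ("D", 6.0, 8.0)
result = path_compensation([[a, b, c], [a, b, d]])
```

Here B evaluates to 5 on the link through C and to 6 on the link through D, so its averaged compensation response time is 5.5.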
In yet another embodiment of the present application, determining whether a delay fault occurs in a corresponding target service component based at least on the compensating response time of each of the target service components in all of the request class groups includes: normalizing at least all the compensation response times of the target service components in each request class group to obtain the corresponding delay fault rate of each target service component in the request class group; calculating a target delay fault rate of the corresponding target service component based on the delay fault rates of the target service components in the request class groups; and determining whether a delay fault occurs to the corresponding target service component based on the target delay fault rate and a calling ratio of each target service component, wherein one of the service components corresponds to one of the calling ratios, and the calling ratio is a ratio of the total number of times the corresponding target service component is called to the total number of times the cloud platform processes the user request. In this embodiment, based on the target delay fault rate and the call ratio of the target service component in all request class groups, it is possible to accurately determine whether the corresponding target service component has a delay fault.
In one embodiment of the present application, normalizing at least all the compensation response times of the target service components in each request class group to obtain the delay fault rate of each target service component in the corresponding request class group includes: in one request class group, normalizing all the compensation response times of the $z$-th target service component to obtain a plurality of normalized probabilities; and calculating the delay fault rate of the corresponding target service component from these normalized probabilities, where $H_z$ is the delay fault rate of the $z$-th target service component in one request class group, $P_z(l)$ is the $l$-th normalized probability of the $z$-th target service component, $m$ is the total number of normalized probabilities of the $z$-th target service component, and $z$ is less than or equal to the total number of target service components in the corresponding request class group. In this embodiment, all compensation response times of a target service component in a request class group are normalized to obtain a plurality of normalized probabilities, and the delay fault rate of the target service component is then calculated from those probabilities. This reduces the influence of singular sample data, keeps the resulting delay fault rate accurate, and thereby supports an accurate later determination of whether the target service component has failed.
In practical application, the compensating response time of a target service component in the same kind of user request should be concentrated around one or several fixed values, because the corresponding target service component completes the same kind of user request in a substantially identical manner. Thus, each target service component should have its compensating response time concentrated around one or several fixed values for different user requests of the same type, the more random and discrete this distribution, the more likely it is that the target service component will be in delay faults. The failure rate of each target service component is calculated based on the logic, specifically in the following manner:
For a target service component, one compensation response time is obtained for each user request, so the compensation response times of the component over all user requests in a request class group can be arranged in a one-dimensional space. A maximum value $T_{max}$ of the compensation response time is set, and the one-dimensional space is bounded by the minimum value 0 and the maximum value $T_{max}$. All obtained compensation response times of the target service component are normalized, and the normalized values are graded: 0 to 0.1 is level 1, 0.1 to 0.2 is level 2, and so on, so that the range 0 to 1 is divided into 10 levels. $P(j)$ denotes the probability that compensation response time level $j$ occurs in the one-dimensional space. The delay fault rate of the target service component is then calculated from the probabilities $P(j)$.
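The grading step above can be sketched as follows. The formula for the delay fault rate itself is not reproduced in this text, so the `dispersion` function below is only an assumption: it uses Shannon entropy (base 2) because the description says that a more random, dispersed distribution of levels indicates a more likely delay fault. The function names, the clamping of out-of-range values and the treatment of bin boundaries are likewise hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List

def level_probabilities(times: List[float], t_max: float, levels: int = 10) -> Dict[int, float]:
    """Normalize times to [0, 1] by t_max, grade them into levels 1..levels,
    and return the empirical probability P(j) of each occupied level."""
    counts: Counter = Counter()
    for t in times:
        x = min(max(t / t_max, 0.0), 1.0)            # clamp (assumption)
        level = min(int(x * levels) + 1, levels)      # 0-0.1 -> 1, 0.1-0.2 -> 2, ...
        counts[level] += 1
    total = len(times)
    return {level: n / total for level, n in counts.items()}

def dispersion(probs: Dict[int, float]) -> float:
    """Assumed measure: Shannon entropy of the level distribution."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

# Hypothetical samples: one component with stable times, one with scattered times.
stable = level_probabilities([5.5, 5.5, 5.5, 5.5], t_max=10.0)
scattered = level_probabilities([0.5, 2.5, 4.5, 6.5], t_max=10.0)
```

Under this assumed measure, the stable component occupies a single level and scores 0, while the scattered component spreads evenly over four levels and scores higher.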
In an actual application process, after obtaining a delay fault rate of a target service component in a request class group, in order to eliminate influence of singular data and further determine more accurately whether a delay fault occurs in a corresponding target service component, in another embodiment of the present application, calculating a target delay fault rate of the corresponding target service component based on the delay fault rates of the target service components in the request class groups includes: calculating the sum of the delay fault rates of the same target service component in all the request class groups to obtain the sum of the fault rates of the corresponding target service components; and calculating the quotient of the sum of the fault rates and the total number of the request category groups to obtain the target delay fault rate. That is, based on the delay fault rate of the same target service component in each request class group, the average value of the delay fault rates of the target service components is calculated to obtain the target delay fault rate of the target service components, so that the abrupt change influence of the delay fault rate in a certain request class group can be avoided, and further, whether the corresponding target service components have delay faults or not can be accurately determined.
In order to determine whether a corresponding target service component fails more simply, in another embodiment of the present application, determining whether a corresponding target service component fails based on the target delay fault rate and the call ratio of each of the target service components includes: calculating the product of the target delay fault rate and the corresponding calling ratio of each target service component to obtain a plurality of target fault thresholds; determining that a delay fault occurs in the target service component corresponding to the target fault threshold under the condition that the target fault threshold is larger than a preset fault identification threshold; and under the condition that the target fault threshold is smaller than or equal to the preset fault identification threshold, determining that the target service component corresponding to the target fault threshold has no delay fault.
In practice, suppose that, for the $n$-th target service component in the cloud platform, the ratio of the total number of times the target service component is called to the total number of times the cloud platform processes user requests is $\varphi$, and the preset fault identification threshold is $\Phi$. If the product of the component's target delay fault rate and $\varphi$ is greater than $\Phi$, it is determined that the target service component has a delay fault; if the product is less than or equal to $\Phi$, it is determined that the target service component has no delay fault.
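The averaging and threshold steps described above can be sketched together. The function names and the sample numbers are hypothetical; the logic follows the text: the target delay fault rate is the mean of the per-group delay fault rates, it is multiplied by the call ratio, and a fault is reported only when the product is strictly greater than the preset threshold.

```python
from typing import List

def target_delay_fault_rate(rates_per_group: List[float]) -> float:
    """Sum of the component's delay fault rates over all request class groups,
    divided by the number of groups."""
    return sum(rates_per_group) / len(rates_per_group)

def has_delay_fault(rates_per_group: List[float],
                    times_called: int,
                    total_requests: int,
                    threshold: float) -> bool:
    """Compare (target delay fault rate x call ratio) with the preset threshold."""
    call_ratio = times_called / total_requests
    return target_delay_fault_rate(rates_per_group) * call_ratio > threshold

# Hypothetical values: three request class groups, component called in 50 of 100 requests.
rates = [2.0, 1.0, 3.0]
faulty = has_delay_fault(rates, times_called=50, total_requests=100, threshold=0.8)
not_faulty = has_delay_fault(rates, times_called=50, total_requests=100, threshold=1.0)
```

In the second call the product equals the threshold exactly, which the text treats as no delay fault.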
The embodiment of the application also provides a device for determining the delay abnormality, and the device for determining the delay abnormality can be used for executing the method for determining the delay abnormality. The following describes a delay abnormality determination device provided in the embodiment of the present application.
Fig. 4 is a schematic structural diagram of a delay abnormality determination apparatus according to an embodiment of the present application. The determining apparatus is applied to a cloud platform, where the cloud platform includes a plurality of service components, as shown in fig. 4, and includes:
A construction unit 10, configured to construct an execution path of a user request on each target service component based on the running log information of each service component, to obtain target path information, where one of the user requests corresponds to one of the target path information, and one of the target path information includes at least two of the target service components;
A calculating unit 20 configured to classify each of the user requests based on the target path information of each of the user requests to obtain a plurality of request category groups, and calculate a compensation response time of the target service component corresponding to each of the user requests when executed based on at least each of the target path information in each of the request category groups, one of the request category groups including the plurality of user requests;
A determining unit 30, configured to determine whether a delay fault occurs in the corresponding target service component based at least on the compensating response time of each target service component in all the request class groups.
In the delay abnormality determining device, the construction unit constructs the execution path of a user request on each target service component based on the running log information of each service component, to obtain the target path information of each user request; the calculating unit classifies the user requests based on the target path information of each user request to obtain a plurality of request category groups, and calculates the compensation response time of each target service component when it executes the corresponding user request in each request category group; the determining unit determines whether a delay fault occurs in the corresponding target service component based at least on the compensation response time of each target service component in all the request category groups. In the prior art, whether a delay fault occurs in the cloud platform is determined only from the log information of the service components or from the resource utilization of the cloud platform. In this scheme, by contrast, whether a delay fault occurs in the corresponding target service component is determined based at least on the compensation response time of each target service component in all request category groups, so that delay faults in the cloud platform can be determined accurately at the level of individual target service components. This solves the problem in the prior art that delay anomalies of the service components of a cloud platform are difficult to detect accurately, and helps keep fault handling on the cloud platform efficient and its operation reliable.
In practice, for a cloud platform (i.e. a cloud computing service platform), a user inputs a user request at one of several UI service ports, and the cloud platform appears to return the result directly. In fact, a user request input through a UI service port is executed in sequence by a plurality of target service components in the cloud platform. When a user request is input, the running log information of each service component in the cloud platform records the timestamps at which execution of the request starts and ends, the ID information of the executed user request, and the source component (i.e. the upstream component) of the user request. The path through the target service components that a user request follows when executed can therefore be constructed from the running log information of each service component on the cloud platform. In a specific embodiment of the present application, assuming that the ID information corresponding to a user request is 00001, its target path information may be constructed as the directed graph shown in Fig. 2. That is, the user request is input from the UI service port A (i.e. the UI service port 100) and passes through the target service component B (the first target service component 101), the target service component C (the second target service component 102) and the target service component D (i.e. the third target service component 103), and its target path information is A -> B, B -> C and B -> D.
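As an illustration of how target path information might be assembled from such logs, the sketch below groups log records by request ID and turns each record's source component into a directed edge. The record fields (`request_id`, `component`, `source`, `start`, `end`) and the function name `build_paths` are hypothetical; the text does not specify a log format.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (source component, called component)

def build_paths(log_records: List[dict]) -> Dict[str, Set[Edge]]:
    """Build one directed edge set per user request from component log records."""
    paths: Dict[str, Set[Edge]] = defaultdict(set)
    for rec in log_records:
        if rec["source"] is not None:   # skip records with no upstream component
            paths[rec["request_id"]].add((rec["source"], rec["component"]))
    return dict(paths)

# Hypothetical log records for request 00001, matching the graph of Fig. 2.
logs = [
    {"request_id": "00001", "component": "B", "source": "A", "start": 1.0, "end": 9.0},
    {"request_id": "00001", "component": "C", "source": "B", "start": 3.0, "end": 6.0},
    {"request_id": "00001", "component": "D", "source": "B", "start": 6.0, "end": 8.0},
]
paths = build_paths(logs)
```

The start and end timestamps are carried in the records here only to mirror the description; this sketch uses just the request ID and source component to recover the edges.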
In a specific embodiment of the present application, the user requests are classified based on the target path information of each user request to obtain a plurality of request category groups, so that all user requests in one request category group pass through the same target service components. This in turn helps ensure that whether a delay fault occurs in the corresponding target service component can be determined accurately from the compensation response time of each target service component.
In order to classify each user request more simply, in one embodiment of the present application, the computing unit includes a first dividing module and a second dividing module, where the first dividing module is configured to divide the target path information of each user request to obtain a plurality of path primitives, where the path primitives are paths formed by two adjacent target service components on the target path information, and one of the target path information corresponds to at least one of the path primitives; the second dividing module is configured to divide the user requests with the same path primitives into one category, to obtain a plurality of request category groups.
In a specific embodiment of the present application, before the target path information corresponding to each user request is divided into path primitives, the target path information may first be graded into levels. The grading may proceed as follows. When a user inputs a user request from the UI service port, the request passes through each target service component on the execution path, so the first target service component, which is directly connected to the UI service port, obtains the user request directly. The second target service component, which follows the first, executes the user request only after receiving the call command sent by the first target service component, so the calls to the target service components have a notion of level. The target service components can therefore be graded according to their positions on the execution path of the corresponding user request. Specifically, a user request corresponds to an execution path over the target service components, and the unidirectional links of that execution path can be found by following the transmission direction of the user request. The grading of the target service components on each unidirectional link is independent of the other links.
Thus, for a unidirectional link from a UI service port, the level of each target service component is defined as follows: the target service component directly connected with the UI service port is a primary target service component, the service component which needs to be connected with the UI service port through the primary target service component is a secondary target service component, the target component which needs to be connected with the UI service port through the primary target service component and the secondary target service component is a tertiary target service component, the UI service port is a 0-level target service component, and the like.
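The level definition above can be sketched as a breadth-first walk from the UI service port. The function name `component_levels` is hypothetical. The text grades each unidirectional link independently; walking the whole directed graph at once, as done here, gives the same levels when every component is reached at a single depth, which is an assumption that holds for the graph of Fig. 2.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple

def component_levels(edges: Iterable[Tuple[str, str]], ui_port: str) -> Dict[str, int]:
    """Assign level 0 to the UI service port and level n to components n calls away."""
    children: Dict[str, List[str]] = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)

    levels = {ui_port: 0}
    queue = deque([ui_port])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in levels:          # first visit fixes the level
                levels[child] = levels[node] + 1
                queue.append(child)
    return levels

# Edges of the directed graph in Fig. 2.
levels = component_levels([("A", "B"), ("B", "C"), ("B", "D")], ui_port="A")
```

For Fig. 2 this yields the UI service port A at level 0, B as the primary component, and C and D as secondary components.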
In practice, each user request corresponds to exactly one piece of target path information, and user requests with different target path information are not necessarily user requests of the same category. The target path information corresponding to each user request is therefore divided into path primitives; specifically, it may be divided into the smallest path primitives. Taking the target path information shown in Fig. 2 as an example, it can be divided into three path primitives: A -> B, B -> C and B -> D. If the path primitives of another user request have the same composition as those of this user request, the two user requests can be placed in the same request category group. Each user request thus belongs to one request category group, and each target service component in a user request has a level on the transmission path of that request.
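The grouping rule above, where requests with identical path primitives share a request category group, can be sketched with a dictionary keyed by the set of primitives. The function name `classify_requests` and the request IDs other than 00001 are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set, Tuple

Primitive = Tuple[str, str]  # a path formed by two adjacent components, e.g. ("A", "B")

def classify_requests(paths: Dict[str, Set[Primitive]]) -> List[List[str]]:
    """Group request IDs whose path primitives have the same composition."""
    groups: Dict[FrozenSet[Primitive], List[str]] = defaultdict(list)
    for request_id, primitives in paths.items():
        groups[frozenset(primitives)].append(request_id)
    return [sorted(ids) for ids in groups.values()]

# Hypothetical requests: 00001 and 00003 share the Fig. 2 path; 00002 follows Fig. 3.
request_paths = {
    "00001": {("A", "B"), ("B", "C"), ("B", "D")},
    "00002": {("A", "B"), ("B", "C")},
    "00003": {("B", "D"), ("A", "B"), ("B", "C")},
}
category_groups = sorted(classify_requests(request_paths))
```

Using a `frozenset` as the key makes the comparison independent of the order in which the primitives were discovered.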
In practice, as shown in Fig. 3, the target path information of a user request may contain only one unidirectional path, A -> B and B -> C, that is, from the UI service port 100 to the first target service component 101 and then to the second target service component 102. To simplify the calculation of the compensation response time of each target service component in this case, in still another embodiment of the present application, the target path information further includes timestamp information of each target service component being invoked, the timestamp information including a start call time and a stop call time, and the calculating unit further includes a first calculating module configured to calculate the compensation response time of each target service component corresponding to each user request when executed using $t_j(i) = |Tb_j(i) - Te_j(i)| - |Tb_{j+1}(i) - Te_{j+1}(i)|$, where $t_j(i)$ is the compensation response time of the $j$-th target service component in the target path information corresponding to the $i$-th user request, $Tb_j(i)$ is the start call time of the $j$-th target service component in that target path information, $Te_j(i)$ is the stop call time of the $j$-th target service component in that target path information, $Tb_{j+1}(i)$ is the start call time of the $(j+1)$-th target service component in that target path information, $Te_{j+1}(i)$ is the stop call time of the $(j+1)$-th target service component in that target path information, $i$ is less than or equal to the total number of user requests in one request class group, and $j$ is less than or equal to the total number of target service components in one piece of target path information.
In the actual application process, the value of i starts from 1 up to the total number of all user requests in a request class group. The value of j starts from 1 up to the total number of target service components in one target path information.
In a specific embodiment of the present application, as shown in fig. 3, on the target path information corresponding to the user request, the compensation response time for the target service component B (the first target service component 101) is the difference between the first response time and the second response time, where the first response time is the absolute value of the difference between the start call time and the stop call time of the target service component B, and the second response time is the absolute value of the difference between the start call time and the stop call time of the target service component C (i.e. the second target service component 102). Of course, the compensation response time to the target service component C may be only the absolute value of the difference between the start call time and the stop call time of the target service component C.
In practice, as shown in Fig. 2, the target path information of a user request may contain two or more unidirectional paths. To simplify the calculation of the compensation response time of each target service component in this case, in another embodiment of the present application, the target path information further includes timestamp information of each target service component being invoked, the timestamp information including a start call time and a stop call time, and the calculating unit further includes a first determining module and a second calculating module. The first determining module is configured to determine the unidirectional links corresponding to the target path information of each user request in each request class group, where one piece of target path information corresponds to at least two unidirectional links. The second calculating module is configured to calculate the compensation response time of the target service component corresponding to each user request when executed using at least $t_j^k(i) = |Tb_j^k(i) - Te_j^k(i)| - |Tb_{j+1}^k(i) - Te_{j+1}^k(i)|$, where $t_j^k(i)$ is the compensation response time of the $j$-th target service component on the $k$-th unidirectional link in the target path information corresponding to the $i$-th user request, $Tb_j^k(i)$ is the start call time of the $j$-th target service component on the $k$-th unidirectional link in that target path information, $Te_j^k(i)$ is the stop call time of the $j$-th target service component on the $k$-th unidirectional link in that target path information, $Tb_{j+1}^k(i)$ is the start call time of the $(j+1)$-th target service component on the $k$-th unidirectional link in that target path information, $Te_{j+1}^k(i)$ is the stop call time of the $(j+1)$-th target service component on the $k$-th unidirectional link in that target path information, $i$ is less than or equal to the total number of user requests in one request class group, $j$ is less than or equal to the total number of target service components in one piece of target path information, and $k$ is less than or equal to the total number of unidirectional links of one piece of target path information.
In the actual application process, the value of i starts from 1 up to the total number of all user requests in a request class group. The value of j starts from 1 up to the total number of target service components in one target path information. The value of k starts from 1 up to the total number of the unidirectional links described above for one target path information.
In a specific embodiment of the present application, as shown in Fig. 2, the target path information corresponding to the user request has two unidirectional paths, A -> B -> C and A -> B -> D, so the formula $t_j^k(i) = |Tb_j^k(i) - Te_j^k(i)| - |Tb_{j+1}^k(i) - Te_{j+1}^k(i)|$ can be used to calculate the compensation response times of the target service component A, the target service component B, the target service component C and the target service component D respectively. Because the calculation is carried out separately on the unidirectional paths A -> B -> C and A -> B -> D, the target service component B is calculated twice; in this case, the compensation response time of the target service component B may be the average of its compensation response times on the two unidirectional paths.
In still another embodiment of the present application, the determining unit includes a processing module, a third calculating module, and a second determining module, where the processing module is configured to normalize at least all the compensation response times of the target service components in each request class group to obtain a delay fault rate of each target service component in the corresponding request class group; the third calculation module is configured to calculate a target delay fault rate of the corresponding target service component based on the delay fault rates of the target service components in each request class group; the second determining module is configured to determine whether a delay fault occurs in a corresponding target service component based on the target delay fault rate and a calling ratio of each target service component, where one of the service components corresponds to one of the calling ratios, and the calling ratio is a ratio of a total number of times the corresponding target service component is called to a total number of times the cloud platform processes the user request. In this embodiment, based on the target delay fault rate and the call ratio of the target service component in all request class groups, it is possible to accurately determine whether the corresponding target service component has a delay fault.
In one embodiment of the present application, the processing module includes a normalization sub-module and a first calculation sub-module. The normalization sub-module is configured to normalize, in one request class group, all the compensation response times of the $z$-th target service component to obtain a plurality of normalized probabilities. The first calculation sub-module is configured to calculate the delay fault rate of the corresponding target service component from these normalized probabilities, where $H_z$ is the delay fault rate of the $z$-th target service component in one request class group, $P_z(l)$ is the $l$-th normalized probability of the $z$-th target service component, $m$ is the total number of normalized probabilities of the $z$-th target service component, and $z$ is less than or equal to the total number of target service components in the corresponding request class group. In this embodiment, all compensation response times of a target service component in a request class group are normalized to obtain a plurality of normalized probabilities, and the delay fault rate of the target service component is then calculated from those probabilities. This reduces the influence of singular sample data, keeps the resulting delay fault rate accurate, and thereby supports an accurate later determination of whether the target service component has failed.
In practical application, the compensating response time of a target service component in the same kind of user request should be concentrated around one or several fixed values, because the corresponding target service component completes the same kind of user request in a substantially identical manner. Thus, each target service component should have its compensating response time concentrated around one or several fixed values for different user requests of the same type, the more random and discrete this distribution, the more likely it is that the target service component will be in delay faults. The failure rate of each target service component is calculated based on the logic, specifically in the following manner:
For a target service component, one compensation response time is obtained for each user request, so the compensation response times of the component over all user requests in a request class group can be arranged in a one-dimensional space. A maximum value $T_{max}$ of the compensation response time is set, and the one-dimensional space is bounded by the minimum value 0 and the maximum value $T_{max}$. All obtained compensation response times of the target service component are normalized, and the normalized values are graded: 0 to 0.1 is level 1, 0.1 to 0.2 is level 2, and so on, so that the range 0 to 1 is divided into 10 levels. $P(j)$ denotes the probability that compensation response time level $j$ occurs in the one-dimensional space. The delay fault rate of the target service component is then calculated from the probabilities $P(j)$.
In an actual application process, after obtaining a delay fault rate of a target service component in a request class group, in order to eliminate influence of singular data and further determine more accurately whether a delay fault occurs in a corresponding target service component, in another embodiment of the present application, the third calculation module includes a second calculation sub-module and a third calculation sub-module, where the second calculation sub-module is configured to calculate a sum of the delay fault rates of the same target service component in all the request class groups, and obtain a sum of fault rates of the corresponding target service components; the third calculation sub-module is configured to calculate a quotient of the sum of the failure rates and the total number of the request class groups, and obtain the target delay failure rate. That is, based on the delay fault rate of the same target service component in each request class group, the average value of the delay fault rates of the target service components is calculated to obtain the target delay fault rate of the target service components, so that the abrupt change influence of the delay fault rate in a certain request class group can be avoided, and further, whether the corresponding target service components have delay faults or not can be accurately determined.
To make it simpler to determine whether the corresponding target service component has a delay fault, in a further embodiment of the present application, the second determining module includes a fourth calculation sub-module, a first determining sub-module, and a second determining sub-module. The fourth calculation sub-module is configured to calculate the product of the target delay fault rate of each target service component and the corresponding calling ratio, obtaining a plurality of target fault thresholds. The first determining sub-module is configured to determine, when a target fault threshold is greater than a preset fault identification threshold, that the corresponding target service component has a delay fault. The second determining sub-module is configured to determine, when a target fault threshold is less than or equal to the preset fault identification threshold, that the corresponding target service component does not have a delay fault.
In practice, consider the target delay fault rate of the nth target service component in the cloud platform, the calling ratio of that component (the ratio of the total number of times the target service component is called to the total number of times the cloud platform processes user requests), and the preset fault identification threshold. If the product of the target delay fault rate and the calling ratio is greater than the preset fault identification threshold, it is determined that the target service component has a delay fault; if the product is less than or equal to the preset fault identification threshold, it is determined that the target service component does not have a delay fault.
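The decision rule above amounts to a single comparison, sketched here in Python. The function and parameter names are hypothetical, and the numeric values in the usage lines are illustrative only, not values from the application.

```python
def has_delay_fault(target_rate, total_calls, total_requests, threshold):
    """Return True when rate * calling ratio exceeds the fault identification threshold.

    total_calls: total number of times the component was called.
    total_requests: total number of user requests processed by the platform.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    call_ratio = total_calls / total_requests
    target_fault_value = target_rate * call_ratio  # the "target fault threshold" of the text
    return target_fault_value > threshold


# Illustrative values only.
faulty = has_delay_fault(target_rate=0.8, total_calls=500, total_requests=1000, threshold=0.3)
```

Weighting by the calling ratio means that a rarely called component needs a higher target delay fault rate to be flagged than a component on most request paths.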
The delay abnormality determination device comprises a processor and a memory, wherein the construction unit, the calculation unit, the determination unit and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize corresponding functions.
The processor includes a kernel, and the kernel fetches the corresponding program unit from the memory. One or more kernels may be provided, and the problem that delay abnormality of a service component of a cloud platform is difficult to detect accurately in the prior art is addressed by adjusting kernel parameters.
The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM), and/or nonvolatile memory such as read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.
An embodiment of the present invention provides a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the above-described method of determining a delay abnormality.
The embodiment of the invention provides a processor, which is used for running a program, wherein the method for determining the delay abnormality is executed when the program runs.
In an exemplary embodiment of the present application, a cloud platform is further provided, where the cloud platform includes a delay anomaly determination device, where the determination device is configured to perform any one of the delay anomaly determination methods described above.
In the above determination method, an execution path of each user request across the target service components is first constructed from the operation log information of each service component, yielding target path information for each user request. The user requests are then classified by their target path information into a plurality of request category groups, and the compensation response time of each target service component when executing the corresponding user requests in each request category group is calculated. Finally, whether a delay fault has occurred in the corresponding target service component is determined based at least on the compensation response time of each target service component in all request category groups. In the prior art, whether a delay fault has occurred in the cloud platform is determined only from the log information of the service components or from the resource utilization of the cloud platform. In this scheme, by contrast, the determination is based at least on the compensation response time of each target service component in all request category groups, so that whether a delay fault has occurred in the cloud platform can be determined and the faulty target service component can be identified accurately. This addresses the difficulty in the prior art of accurately detecting delay abnormality in service components of a cloud platform, and thereby helps ensure efficient fault handling and reliable operation of the cloud platform.
An embodiment of the invention provides a device comprising a processor, a memory, and a program stored in the memory and executable on the processor, wherein the processor implements at least the following steps when executing the program:
Step S101, based on the operation log information of each service component, constructing an execution path of a user request on each target service component to obtain target path information, wherein one user request corresponds to one piece of target path information, and one piece of target path information includes at least two target service components;
Step S102, classifying the user requests based on the target path information of each user request to obtain a plurality of request category groups, and calculating, based at least on each piece of target path information in each request category group, the compensation response time of the corresponding target service component when each user request is executed, wherein one request category group includes a plurality of user requests;
Step S103, determining whether a delay fault occurs in the corresponding target service component based at least on the compensation response time of each target service component in all the request category groups.
The device herein may be a server, PC, PAD, cell phone, etc.
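The classification in step S102 can be pictured with a short Python sketch. The classification criterion is an assumption made for illustration: two user requests are placed in the same request category group when their target path information lists the same target service components in the same order. The function name, request identifiers, and component names are hypothetical.

```python
from collections import defaultdict


def group_requests_by_path(paths):
    """Group request ids by execution path.

    paths maps a request id to the ordered list of target service components
    on its execution path (the target path information of step S101).
    Assumption: identical ordered paths define one request category group.
    """
    groups = defaultdict(list)
    for request_id, components in paths.items():
        if len(components) < 2:
            # Step S101 states that a path contains at least two components.
            raise ValueError(f"path of {request_id!r} has fewer than two components")
        groups[tuple(components)].append(request_id)
    return dict(groups)


groups = group_requests_by_path({
    "r1": ["gateway", "auth", "db"],
    "r2": ["gateway", "auth", "db"],
    "r3": ["gateway", "cache"],
})
```

Grouping by path lets response times be compared only among requests that exercise the same components, which is what the per-group statistics in step S103 rely on.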
The application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with at least the following method steps:
Step S101, based on the operation log information of each service component, constructing an execution path of a user request on each target service component to obtain target path information, wherein one user request corresponds to one piece of target path information, and one piece of target path information includes at least two target service components;
Step S102, classifying the user requests based on the target path information of each user request to obtain a plurality of request category groups, and calculating, based at least on each piece of target path information in each request category group, the compensation response time of the corresponding target service component when each user request is executed, wherein one request category group includes a plurality of user requests;
Step S103, determining whether a delay fault occurs in the corresponding target service component based at least on the compensation response time of each target service component in all the request category groups.
In the foregoing embodiments of the present invention, each embodiment is described with its own emphasis; for parts that are not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technology may be implemented in other manners. The apparatus embodiments described above are merely exemplary. For example, the division of the units may be a division by logical function, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be made through some interfaces, units, or modules, and may be electrical or in other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units described above, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing program code.
From the above description, it can be seen that the above embodiments of the present application achieve the following technical effects:
1) In the method for determining the delay abnormality, an execution path of each user request across the target service components is first constructed from the operation log information of each service component, yielding target path information for each user request. The user requests are then classified by their target path information into a plurality of request category groups, and the compensation response time of each target service component when executing the corresponding user requests in each request category group is calculated. Finally, whether a delay fault has occurred in the corresponding target service component is determined based at least on the compensation response time of each target service component in all request category groups. In the prior art, whether a delay fault has occurred in the cloud platform is determined only from the log information of the service components or from the resource utilization of the cloud platform. In this scheme, by contrast, the determination is based at least on the compensation response time of each target service component in all request category groups, so that whether a delay fault has occurred in the cloud platform can be determined and the faulty target service component can be identified accurately. This addresses the difficulty in the prior art of accurately detecting delay abnormality in service components of a cloud platform, and thereby helps ensure efficient fault handling and reliable operation of the cloud platform.
2) In the delay abnormality determination device, the construction unit is configured to construct an execution path of each user request across the target service components from the operation log information of each service component, yielding target path information for each user request. The computing unit is configured to classify the user requests by their target path information into a plurality of request category groups and to calculate the compensation response time of each target service component when executing the corresponding user requests in each request category group. The determining unit is configured to determine, based at least on the compensation response time of each target service component in all request category groups, whether a delay fault has occurred in the corresponding target service component. In the prior art, whether a delay fault has occurred in the cloud platform is determined only from the log information of the service components or from the resource utilization of the cloud platform. In this scheme, by contrast, the determination is based at least on the compensation response time of each target service component in all request category groups, so that whether a delay fault has occurred in the cloud platform can be determined and the faulty target service component can be identified accurately. This addresses the difficulty in the prior art of accurately detecting delay abnormality in service components of a cloud platform, and thereby helps ensure efficient fault handling and reliable operation of the cloud platform.
3) The cloud platform comprises a delay abnormality determination device configured to execute any one of the delay abnormality determination methods described above. In the determination method, an execution path of each user request across the target service components is first constructed from the operation log information of each service component, yielding target path information for each user request. The user requests are then classified by their target path information into a plurality of request category groups, and the compensation response time of each target service component when executing the corresponding user requests in each request category group is calculated. Finally, whether a delay fault has occurred in the corresponding target service component is determined based at least on the compensation response time of each target service component in all request category groups. In the prior art, whether a delay fault has occurred in the cloud platform is determined only from the log information of the service components or from the resource utilization of the cloud platform. In this scheme, by contrast, the determination is based at least on the compensation response time of each target service component in all request category groups, so that whether a delay fault has occurred in the cloud platform can be determined and the faulty target service component can be identified accurately. This addresses the difficulty in the prior art of accurately detecting delay abnormality in service components of a cloud platform, and thereby helps ensure efficient fault handling and reliable operation of the cloud platform.
The above description covers only the preferred embodiments of the present application and is not intended to limit the present application. Those skilled in the art may make various modifications and variations to the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application shall fall within the protection scope of the present application.

Claims (9)

Calculating the compensation response time on each target service component when each user request is executed by adopting $t_j(i) = |Tb_j(i) - Te_j(i)| - |Tb_{j+1}(i) - Te_{j+1}(i)|$, wherein $t_j(i)$ is the compensation response time on the jth target service component in the target path information corresponding to the ith user request, $Tb_j(i)$ is the start calling time of the jth target service component in the target path information corresponding to the ith user request, $Te_j(i)$ is the stop calling time of the jth target service component in the target path information corresponding to the ith user request, $Tb_{j+1}(i)$ is the start calling time of the (j+1)th target service component in the target path information corresponding to the ith user request, $Te_{j+1}(i)$ is the stop calling time of the (j+1)th target service component in the target path information corresponding to the ith user request, i is less than or equal to the total number of user requests in one request category group, and j is less than or equal to the total number of target service components in one piece of target path information.
At least adopting $t_j^k(i) = |Tb_j^k(i) - Te_j^k(i)| - |Tb_{j+1}^k(i) - Te_{j+1}^k(i)|$ to calculate the compensation response time on the target service component corresponding to each of the user requests when executed, wherein $t_j^k(i)$ is the compensation response time of the jth target service component on the kth unidirectional link in the target path information corresponding to the ith user request, $Tb_j^k(i)$ is the start calling time of the jth target service component on the kth unidirectional link in the target path information corresponding to the ith user request, $Te_j^k(i)$ is the stop calling time of the jth target service component on the kth unidirectional link in the target path information corresponding to the ith user request, $Tb_{j+1}^k(i)$ is the start calling time of the (j+1)th target service component on the kth unidirectional link in the target path information corresponding to the ith user request, $Te_{j+1}^k(i)$ is the stop calling time of the (j+1)th target service component on the kth unidirectional link in the target path information corresponding to the ith user request, i is less than or equal to the total number of user requests in one request category group, j is less than or equal to the total number of target service components in one piece of target path information, and k is less than or equal to the total number of unidirectional links of one piece of target path information.
The target path information further includes timestamp information of the call to each target service component, the timestamp information including a start calling time and a stop calling time, and the computing unit further includes a first computing module configured to use $t_j(i) = |Tb_j(i) - Te_j(i)| - |Tb_{j+1}(i) - Te_{j+1}(i)|$ to calculate the compensation response time on each target service component when each user request is executed, wherein $t_j(i)$ is the compensation response time on the jth target service component in the target path information corresponding to the ith user request, $Tb_j(i)$ is the start calling time of the jth target service component in the target path information corresponding to the ith user request, $Te_j(i)$ is the stop calling time of the jth target service component in the target path information corresponding to the ith user request, $Tb_{j+1}(i)$ is the start calling time of the (j+1)th target service component in the target path information corresponding to the ith user request, $Te_{j+1}(i)$ is the stop calling time of the (j+1)th target service component in the target path information corresponding to the ith user request, i is less than or equal to the total number of user requests in one request category group, and j is less than or equal to the total number of target service components in one piece of target path information.
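As an illustration only, the compensation response time formula recited above can be sketched in Python. The sketch assumes that the call timestamps of one user request are given as (start, stop) pairs in path order and share a single time unit. It returns a value only for components that have a successor on the path, because the formula needs the (j+1)th component. The function name is hypothetical.

```python
def compensation_response_times(calls):
    """Compute t_j = |Tb_j - Te_j| - |Tb_{j+1} - Te_{j+1}| along one execution path.

    calls: list of (start_call_time, stop_call_time) tuples, one per target
    service component, in the order the components appear on the path.
    Returns one value per component that has a successor on the path.
    """
    if len(calls) < 2:
        raise ValueError("a path needs at least two target service components")
    times = []
    for (b_j, e_j), (b_next, e_next) in zip(calls, calls[1:]):
        own_span = abs(b_j - e_j)                 # time spent in component j, including downstream calls
        downstream_span = abs(b_next - e_next)    # time spent in component j+1
        times.append(own_span - downstream_span)  # time attributable to component j itself
    return times


# Three components called in a chain; times in milliseconds (illustrative values).
t = compensation_response_times([(0, 100), (10, 70), (20, 50)])
```

Subtracting the downstream span removes the time a component spends waiting on the next component, so a slow downstream service does not inflate the figure of its caller.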
CN202211441731.0A | 2022-11-17 | 2022-11-17 | Method and device for determining delay abnormality and cloud platform | Active | CN116112397B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202211441731.0A, CN116112397B (en) | 2022-11-17 | 2022-11-17 | Method and device for determining delay abnormality and cloud platform

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202211441731.0A, CN116112397B (en) | 2022-11-17 | 2022-11-17 | Method and device for determining delay abnormality and cloud platform

Publications (2)

Publication Number | Publication Date
CN116112397A (en) | 2023-05-12
CN116112397B (en) | 2024-08-30

Family

ID=86258623

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN202211441731.0A | Active | CN116112397B (en) | 2022-11-17 | 2022-11-17 | Method and device for determining delay abnormality and cloud platform

Country Status (1)

Country | Link
CN (1) | CN116112397B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115098224A (en) * | 2022-07-15 | 2022-09-23 | 济南浪潮数据技术有限公司 | A kind of cluster service process exception processing method, device and medium thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20160099853A1 (en) * | 2014-10-01 | 2016-04-07 | Cisco Technology, Inc. | Active and passive dataplane performance monitoring of service function chaining
CN110245035A (en) * | 2019-05-20 | 2019-09-17 | 平安普惠企业管理有限公司 | A kind of link trace method and device

Also Published As

Publication number | Publication date
CN116112397A (en) | 2023-05-12

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
