CN116112397B


Info

Publication number: CN116112397B
Application number: CN202211441731.0A
Authority: CN
Inventors: 高力鹏; 张志鹏; 范晓辉; 何兴建; 张云锋
Original assignee: Postal Savings Bank of China Ltd
Current assignee: Postal Savings Bank of China Ltd
Priority date: 2022-11-17
Filing date: 2022-11-17
Publication date: 2024-08-30
Anticipated expiration: 2042-11-17
Also published as: CN116112397A

Abstract

The application provides a method and a device for determining delay abnormality, and a cloud platform. The determining method comprises the following steps: constructing, based on the operation log information of each service component, an execution path of a user request on each target service component to obtain target path information; classifying the user requests based on the target path information of each user request to obtain a plurality of request category groups, and calculating, at least based on each target path information in each request category group, the compensation response time of the corresponding target service component when each user request is executed; and determining, based on the compensation response time of each target service component in all request category groups, whether a delay fault occurs in the corresponding target service component. This solves the prior-art problem that delay anomalies of cloud platform service components are difficult to detect accurately, and thereby helps ensure efficient fault handling and reliable operation of the cloud platform.

Description

Method and device for determining delay abnormality and cloud platform

Technical Field

The application relates to the field of data processing, in particular to a method and a device for determining delay abnormality and a cloud platform.

Background

Cloud computing is a novel, highly flexible computing mode, and that flexibility brings with it considerable system complexity. Cloud computing integrates massive computing resources (including hardware resources and software resources) and provides services to users over networks in the form of a cloud platform.

A request initiated by a user is completed through the interaction and cooperative processing of numerous service components of the cloud platform. These service components include virtualization components, security components, user management components, application extension components, and the like. They are deployed in a decentralized manner on different compute nodes, possibly in different clusters and different data centers, and they communicate with each other via the internet or a private network. If a service component fails or performs abnormally while a user request is being executed, the user request cannot be executed effectively, i.e. processing of the user request fails. Fault identification therefore needs to be performed on the cloud platform according to the operation log information of each service component, so that normal operation of the cloud platform is ensured.

The operation log information of each service component contains only whether the component executed the user request successfully and the timestamps at which execution of the request started and ended. Consequently, fault identification that inspects and analyzes this log information, or that checks the resource utilization of each service component at different points in time, can only detect anomalies in which execution of a user request is interrupted. When a user request suffers a delay abnormality, such methods can detect only that a delay abnormality exists and cannot accurately identify which service component is responsible.

Therefore, a method capable of detecting delay anomalies of service components in a cloud platform is needed.

Disclosure of Invention

The application mainly aims to provide a method and a device for determining delay abnormality and a cloud platform, so as to solve the problem that delay abnormality of a service component of the cloud platform is difficult to detect accurately in the prior art.

According to an aspect of an embodiment of the present invention, there is provided a method for determining a delay abnormality, the method being applied in a cloud platform, the cloud platform including a plurality of service components, the method including: constructing an execution path of a user request on each target service component based on the operation log information of each service component to obtain target path information, wherein one user request corresponds to one target path information, and one target path information at least comprises two target service components; classifying each user request based on the target path information of each user request to obtain a plurality of request category groups, calculating compensation response time of the target service component corresponding to each user request when being executed based on at least each target path information in each request category group, wherein one request category group comprises a plurality of user requests; determining whether a delay fault occurs in the corresponding target service component based at least on the compensating response times of each of the target service components in all of the request class groups.

Optionally, classifying each user request based on the target path information of each user request to obtain a plurality of request category groups includes: dividing the target path information of each user request to obtain a plurality of path primitives, wherein each path primitive is a path formed by two adjacent target service components on the target path information, and one target path information corresponds to at least one path primitive; and placing the user requests with the same path primitives into one category to obtain a plurality of request category groups.

Optionally, determining whether the corresponding target service component has a delay fault based at least on the compensation response time of each target service component in all the request category groups includes: normalizing at least all the compensation response times of the target service components in each request category group to obtain the delay fault rate of each target service component in the corresponding request category group; calculating a target delay fault rate of the corresponding target service component based on the delay fault rate of the target service component in each request category group; and determining whether the corresponding target service component has a delay fault based on the target delay fault rate and the calling ratio of each target service component, wherein one service component corresponds to one calling ratio, and the calling ratio is the ratio of the total number of times the corresponding target service component is called to the total number of times the cloud platform processes user requests.

Optionally, calculating a target delay fault rate of the corresponding target service component based on the delay fault rate of the target service component in each request class group includes: calculating the sum of the delay fault rates of the same target service component in all the request class groups to obtain the corresponding fault rate sum of the target service components; and calculating the quotient of the sum of the failure rates and the total number of the request category groups to obtain the target delay failure rate.
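The averaging described above can be sketched as follows. This is a minimal illustration only: the function name `target_delay_fault_rates` and the nested-dictionary input layout are assumptions made for the example, and the per-group delay fault rates are taken as given inputs because the normalization that produces them is not reproduced here.

```python
def target_delay_fault_rates(rates_by_group):
    """Average each component's delay fault rate over ALL request category groups.

    rates_by_group: {group_id: {component: delay_fault_rate}} (hypothetical layout).
    The divisor is the total number of request category groups, as in the text,
    even for a component that does not appear in every group.
    """
    total_groups = len(rates_by_group)
    if total_groups == 0:
        return {}

    # Sum of the delay fault rates of the same component across all groups.
    fault_rate_sums = {}
    for rates in rates_by_group.values():
        for component, rate in rates.items():
            fault_rate_sums[component] = fault_rate_sums.get(component, 0.0) + rate

    # Quotient of the fault rate sum and the total number of groups.
    return {c: s / total_groups for c, s in fault_rate_sums.items()}


if __name__ == "__main__":
    example = {"g1": {"B": 0.25, "C": 0.5}, "g2": {"B": 0.75}}
    print(target_delay_fault_rates(example))
```

In this sketch component C appears in only one of the two groups, so its sum is still divided by two, which follows the wording "the total number of the request category groups".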

Optionally, determining whether the corresponding target service component has a delay fault based on the target delay fault rate and the calling ratio of each target service component includes: calculating the product of the target delay fault rate of each target service component and the corresponding calling ratio to obtain a plurality of target fault thresholds; determining that a delay fault occurs in the target service component corresponding to the target fault threshold under the condition that the target fault threshold is larger than a preset fault identification threshold; and under the condition that the target fault threshold is smaller than or equal to the preset fault identification threshold, determining that the target service component corresponding to the target fault threshold has no delay fault.

According to another aspect of the embodiment of the present invention, there is also provided a determining apparatus for latency anomaly, where the determining apparatus is applied in a cloud platform, the cloud platform includes a plurality of service components, and the determining apparatus includes: the construction unit is used for constructing an execution path of a user request on each target service component based on the running log information of each service component to obtain target path information, wherein one user request corresponds to one target path information, and one target path information at least comprises two target service components; a calculating unit, configured to classify each user request based on the target path information of each user request to obtain a plurality of request category groups, and calculate a compensation response time of the target service component corresponding to each user request when executed based on at least each target path information in each request category group, where one request category group includes a plurality of user requests; and the determining unit is used for determining whether the corresponding target service component has delay faults or not at least based on the compensation response time of each target service component in all the request category groups.

According to still another aspect of the embodiment of the present invention, there is also provided a cloud platform including a determining device, wherein the determining device is used for executing any one of the above methods for determining delay abnormality.

In the method for determining delay abnormality of the embodiment of the present invention, an execution path of a user request on each target service component is first constructed based on the operation log information of each service component, so as to obtain the target path information of each user request. The user requests are then classified based on their target path information to obtain a plurality of request category groups, and the compensation response time of each target service component when executing the corresponding user request is calculated within each request category group. Finally, whether a delay fault occurs in the corresponding target service component is determined based at least on the compensation response time of each target service component in all the request category groups. In the prior art, whether a delay fault has occurred in the cloud platform is determined only from the log information of the service components or from the resource utilization of the cloud platform. In this scheme, by contrast, the determination is made for each target service component, so that it is possible to determine not only whether a delay fault has occurred in the cloud platform but also which target service component has the delay fault. This solves the prior-art problem that delay faults of cloud platform service components are difficult to detect accurately, and helps ensure efficient fault handling and reliable operation of the cloud platform.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:

FIG. 1 illustrates a flow chart of a method of determining a delay anomaly in one embodiment of the present application;

FIG. 2 illustrates a schematic diagram of target path information for a target service component of one embodiment of the application;

FIG. 3 illustrates a schematic diagram of target path information for a target service component of one embodiment of the application;

FIG. 4 illustrates a schematic diagram of target path information of a target service component according to another embodiment of the application.

Wherein the above figures include the following reference numerals:

100. a UI service port; 101. a first target service component; 102. a second target service component; 103. a third target service component.

Detailed Description

It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.

In order that those skilled in the art may better understand the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the scope of the present application.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate in order to describe the embodiments of the application herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

As described in the background, delay anomalies of the service components of a cloud platform are difficult to detect accurately in the prior art. To solve this problem, an exemplary embodiment of the present application provides a method and an apparatus for determining a delay abnormality.

According to the embodiment of the application, a method for determining delay abnormality is provided.

Fig. 1 is a flowchart of a method of determining a delay exception according to an embodiment of the present application. The determining method is applied to a cloud platform, wherein the cloud platform comprises a plurality of service components, as shown in fig. 1, and the determining method comprises the following steps:

Step S101, based on the operation log information of each service component, constructing an execution path of a user request on each target service component to obtain target path information, wherein one user request corresponds to one target path information, and one target path information at least comprises two target service components;

Step S102, classifying each user request based on the target path information of each user request to obtain a plurality of request category groups, and calculating the compensation response time of the corresponding target service component when each user request is executed based on at least each target path information in each request category group, wherein one request category group comprises a plurality of user requests;

Step S103, determining whether a delay fault occurs in the corresponding target service component based at least on the compensation response time of each target service component in all the request category groups.
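As an illustration of step S101, the following minimal sketch reconstructs one path per user request from log records. The record layout (`LogRecord` with `request_id`, `component`, `start`, `end`) and the function name `build_paths` are assumptions made for this example; the actual log format is not specified here. Ordering components by start-call time is also a simplification that treats the path as a linear sequence.

```python
from collections import defaultdict
from typing import NamedTuple


class LogRecord(NamedTuple):
    """Hypothetical operation-log entry of one service component."""
    request_id: str
    component: str
    start: float  # start-call timestamp
    end: float    # stop-call timestamp


def build_paths(records):
    """Return {request_id: tuple of components ordered by start time}."""
    by_request = defaultdict(list)
    for rec in records:
        by_request[rec.request_id].append(rec)

    paths = {}
    for request_id, recs in by_request.items():
        ordered = sorted(recs, key=lambda r: r.start)
        path = tuple(r.component for r in ordered)
        # One target path information comprises at least two target service components.
        if len(path) >= 2:
            paths[request_id] = path
    return paths


if __name__ == "__main__":
    logs = [
        LogRecord("r1", "B", 1.0, 9.0),
        LogRecord("r1", "A", 0.0, 10.0),
        LogRecord("r2", "A", 0.0, 1.0),
    ]
    print(build_paths(logs))
```

Request `r2` touches a single component and is therefore left out of the result in this sketch.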

In the above method for determining delay abnormality, an execution path of a user request on each target service component is first constructed based on the operation log information of each service component, so as to obtain the target path information of each user request. The user requests are then classified based on their target path information to obtain a plurality of request category groups, and the compensation response time of each target service component when executing the corresponding user request is calculated within each request category group. Finally, whether a delay fault occurs in the corresponding target service component is determined based at least on the compensation response time of each target service component in all the request category groups. In the prior art, whether a delay fault has occurred in the cloud platform is determined only from the log information of the service components or from the resource utilization of the cloud platform. In this scheme, by contrast, the determination is made for each target service component, so that it is possible to determine not only whether a delay fault has occurred in the cloud platform but also which target service component has the delay fault. This solves the prior-art problem that delay faults of cloud platform service components are difficult to detect accurately, and helps ensure efficient fault handling and reliable operation of the cloud platform.

In a specific embodiment of the application, the user requests are classified based on the target path information of each user request to obtain a plurality of request category groups, so that all user requests in one request category group pass through the same target service components. This further ensures that whether a delay fault occurs in a target service component can be determined accurately from the compensation response time of that component.

It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.

In order to more simply classify each user request, in one embodiment of the present application, classifying each user request based on the target path information of each user request to obtain a plurality of request class groups includes: dividing the target path information requested by each user to obtain a plurality of path elements, wherein each path element is a path formed by two adjacent target service components on the target path information, and one target path information corresponds to at least one path element; and dividing the user requests with the same path primitives into a category to obtain a plurality of request category groups.

In practical application, each user request corresponds one-to-one with its target path information, so user requests with different target path information do not necessarily belong to the same category. The target path information of each user request is therefore divided into path primitives; specifically, it may be divided into the smallest path primitives. Taking the target path information shown in FIG. 2 as an example, it may be divided into three path primitives, namely A-B, B-C and B-D. If the path primitives of another user request have the same composition as those of this user request, the two user requests can be placed in the same request category group. In this way each user request corresponds to one request category group, and each target service component in a user request corresponds to one level on the transmission path of that user request.

In practical application, the index i runs from 1 to the total number of user requests in one request category group, and the index j runs from 1 to the total number of target service components in one target path information.

In a specific embodiment of the present application, as shown in FIG. 3, on the target path information corresponding to the user request, the compensation response time of target service component B (the first target service component 101) is the difference between a first response time and a second response time. The first response time is the absolute value of the difference between the start-call time and the stop-call time of target service component B, and the second response time is the absolute value of the difference between the start-call time and the stop-call time of target service component C (the second target service component 102). The compensation response time of target service component C may simply be the absolute value of the difference between its own start-call time and stop-call time.

In practical application, the index i runs from 1 to the total number of user requests in one request category group, the index j runs from 1 to the total number of target service components in one target path information, and the index k runs from 1 to the total number of the unidirectional links described above for one target path information.
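The grouping by path primitives can be sketched as below. This is an illustrative sketch only: representing a path primitive as a `(caller, callee)` tuple and the function name `group_requests` are assumptions of the example, not part of the described method.

```python
from collections import defaultdict


def group_requests(request_primitives):
    """Group user requests whose sets of path primitives are identical.

    request_primitives: {request_id: iterable of (caller, callee) tuples},
    where each tuple stands for two adjacent target service components.
    Returns {frozenset of primitives: [request_id, ...]}.
    """
    groups = defaultdict(list)
    for request_id, primitives in request_primitives.items():
        # A frozenset ignores the order in which the primitives were listed.
        groups[frozenset(primitives)].append(request_id)
    return dict(groups)


if __name__ == "__main__":
    requests = {
        "r1": [("A", "B"), ("B", "C"), ("B", "D")],
        "r2": [("B", "D"), ("A", "B"), ("B", "C")],
        "r3": [("A", "B"), ("B", "C")],
    }
    for primitives, members in group_requests(requests).items():
        print(sorted(primitives), members)
```

Here `r1` and `r2` share the same three primitives and fall into one request category group, while `r3` forms a group of its own.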
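The subtraction described for components B and C can be sketched for a simple linear call chain as follows. The function name `compensation_response_times` and the tuple layout are assumptions of this example, and the sketch covers only the case in which each component calls at most one downstream component.

```python
def compensation_response_times(chain):
    """Compute compensation response times along a linear call chain.

    chain: list of (component, start_call, stop_call), ordered caller to callee.
    A component's compensation response time is its own response time minus
    the response time of the component it calls; the last component keeps its
    own response time unchanged.
    """
    durations = [abs(stop - start) for _, start, stop in chain]
    result = {}
    for idx, (component, _, _) in enumerate(chain):
        downstream = durations[idx + 1] if idx + 1 < len(chain) else 0.0
        result[component] = durations[idx] - downstream
    return result


if __name__ == "__main__":
    # B runs from t=0 to t=10 and calls C, which runs from t=2 to t=6.
    print(compensation_response_times([("B", 0.0, 10.0), ("C", 2.0, 6.0)]))
```

In this example B's own share of the time is 10 - 4 = 6, so a slow C does not inflate the figure attributed to B.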

In practical application, the compensation response times of a target service component for user requests of the same category should be concentrated around one or several fixed values, because the component completes requests of the same category in substantially the same way. The more random and dispersed this distribution is, the more likely it is that the target service component has a delay fault. The fault rate of each target service component is calculated on this basis, specifically in the following manner:

In order to determine more simply whether the corresponding target service component has a delay fault, in another embodiment of the present application, determining whether the corresponding target service component has a delay fault based on the target delay fault rate and the calling ratio of each target service component includes: calculating the product of the target delay fault rate of each target service component and the corresponding calling ratio to obtain a plurality of target fault thresholds; determining that a delay fault occurs in the target service component corresponding to a target fault threshold when that target fault threshold is larger than a preset fault identification threshold; and determining that the target service component corresponding to a target fault threshold has no delay fault when that target fault threshold is smaller than or equal to the preset fault identification threshold.

In practical application, suppose the target delay fault rate of the n-th target service component in the cloud platform is $P_n$, the ratio of the total number of times this target service component is called to the total number of times the cloud platform processes user requests is $\varphi$, and the preset fault identification threshold is $\Phi$. If $P_n \cdot \varphi > \Phi$, it is determined that the target service component has a delay fault; if $P_n \cdot \varphi \le \Phi$, it is determined that the target service component has no delay fault.
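The threshold decision can be sketched in a few lines. The function names `call_ratio` and `detect_delay_faults` and the dictionary inputs are assumptions made for this example, and the threshold value used in the demonstration is arbitrary.

```python
def call_ratio(times_called, total_requests):
    """Ratio of a component's call count to the total number of processed user requests."""
    return times_called / total_requests


def detect_delay_faults(target_rates, call_ratios, threshold):
    """Return {component: True if a delay fault is determined, else False}.

    A component is flagged only when the product of its target delay fault
    rate and its calling ratio is strictly greater than the preset threshold.
    """
    return {
        component: target_rates[component] * call_ratios[component] > threshold
        for component in target_rates
    }


if __name__ == "__main__":
    rates = {"B": 0.5, "C": 0.25}
    ratios = {"B": call_ratio(50, 100), "C": call_ratio(50, 100)}
    print(detect_delay_faults(rates, ratios, threshold=0.125))
```

With these illustrative numbers the product for C equals the threshold exactly, so C is not flagged, which matches the "smaller than or equal to" branch of the rule.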

The embodiment of the application also provides a device for determining the delay abnormality, and the device for determining the delay abnormality can be used for executing the method for determining the delay abnormality. The following describes a delay abnormality determination device provided in the embodiment of the present application.

Fig. 4 is a schematic structural diagram of a delay abnormality determination apparatus according to an embodiment of the present application. The determining apparatus is applied to a cloud platform, where the cloud platform includes a plurality of service components, as shown in fig. 4, and includes:

A construction unit 10, configured to construct an execution path of a user request on each target service component based on the running log information of each service component, to obtain target path information, where one of the user requests corresponds to one of the target path information, and one of the target path information includes at least two of the target service components;

A calculating unit 20 configured to classify each of the user requests based on the target path information of each of the user requests to obtain a plurality of request category groups, and calculate a compensation response time of the target service component corresponding to each of the user requests when executed based on at least each of the target path information in each of the request category groups, one of the request category groups including the plurality of user requests;

A determining unit 30, configured to determine whether a delay fault occurs in the corresponding target service component based at least on the compensating response time of each target service component in all the request class groups.

In the above device for determining delay abnormality, the construction unit constructs an execution path of a user request on each target service component based on the operation log information of each service component, so as to obtain the target path information of each user request. The calculating unit classifies the user requests based on their target path information to obtain a plurality of request category groups, and calculates, within each request category group, the compensation response time of each target service component when executing the corresponding user request. The determining unit determines whether a delay fault occurs in the corresponding target service component based at least on the compensation response time of each target service component in all the request category groups. In the prior art, whether a delay fault has occurred in the cloud platform is determined only from the log information of the service components or from the resource utilization of the cloud platform. In this scheme, by contrast, the determination is made for each target service component, so that it is possible to determine not only whether a delay fault has occurred in the cloud platform but also which target service component has the delay fault. This solves the prior-art problem that delay faults of cloud platform service components are difficult to detect accurately, and helps ensure efficient fault handling and reliable operation of the cloud platform.

In order to classify each user request more simply, in one embodiment of the present application, the computing unit includes a first dividing module and a second dividing module, where the first dividing module is configured to divide the target path information of each user request to obtain a plurality of path primitives, where the path primitives are paths formed by two adjacent target service components on the target path information, and one of the target path information corresponds to at least one of the path primitives; the second dividing module is configured to divide the user requests with the same path primitives into one category, to obtain a plurality of request category groups.

In order to determine whether the corresponding target service component fails more simply, in a further embodiment of the present application, the second determining module includes a fourth calculating submodule, a first determining submodule and a second determining submodule, where the fourth calculating submodule is configured to calculate a product of the target delay failure rate of each of the target service components and the corresponding calling ratio to obtain a plurality of target failure thresholds; the first determining submodule is used for determining that the target service component corresponding to the target fault threshold has a delay fault under the condition that the target fault threshold is larger than a preset fault identification threshold; the second determining submodule is configured to determine that the target service component corresponding to the target fault threshold does not have a delay fault when the target fault threshold is less than or equal to the preset fault identification threshold.

The delay abnormality determination device comprises a processor and a memory, wherein the construction unit, the calculation unit, the determination unit and the like are stored in the memory as program units, and the processor executes the program units stored in the memory to realize corresponding functions.

The processor includes a kernel, which fetches the corresponding program unit from the memory. One or more kernels may be provided, and the prior-art problem that delay abnormality of a cloud platform service component is difficult to detect accurately is solved by adjusting kernel parameters.

The memory may include forms of computer-readable media such as volatile memory, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.

An embodiment of the present invention provides a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the above-described method of determining a delay abnormality.

The embodiment of the invention provides a processor, which is used for running a program, wherein the method for determining the delay abnormality is executed when the program runs.

In an exemplary embodiment of the present application, a cloud platform is further provided, where the cloud platform includes a delay anomaly determination device, where the determination device is configured to perform any one of the delay anomaly determination methods described above.

In the determining method executed by the determining device of the cloud platform, an execution path of a user request on each target service component is first constructed based on the operation log information of each service component, so as to obtain the target path information of each user request. The user requests are then classified based on their target path information to obtain a plurality of request category groups, and the compensation response time of each target service component when executing the corresponding user request is calculated within each request category group. Finally, whether a delay fault occurs in the corresponding target service component is determined based at least on the compensation response time of each target service component in all the request category groups. In the prior art, whether a delay fault has occurred in the cloud platform is determined only from the log information of the service components or from the resource utilization of the cloud platform. In this scheme, by contrast, the determination is made for each target service component, so that it is possible to determine not only whether a delay fault has occurred in the cloud platform but also which target service component has the delay fault. This solves the prior-art problem that delay faults of cloud platform service components are difficult to detect accurately, and helps ensure efficient fault handling and reliable operation of the cloud platform.

An embodiment of the present invention provides a device comprising a processor, a memory, and a program stored in the memory and executable on the processor, wherein the processor implements at least the following steps when executing the program:

The device herein may be a server, a PC, a tablet, a mobile phone, or the like.

The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute a program that initializes at least the following method steps:

In the foregoing embodiments of the present invention, each embodiment is described with its own emphasis; for any part not described in detail in a given embodiment, reference may be made to the related descriptions of the other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of the units may be a division by logical function, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.

The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

If implemented in the form of software functional units and sold or used as stand-alone products, the integrated units described above may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the various embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing program code.

From the above description, it can be seen that the above embodiments of the present application achieve the following technical effects:

1) In the method for determining the delay abnormality, first, an execution path of each user request on the target service components is constructed based on the operation log information of each service component, to obtain the target path information of each user request; then, the user requests are classified based on their target path information to obtain a plurality of request category groups, and the compensation response time of each target service component when executing the corresponding user requests in each request category group is calculated; finally, whether a delay fault has occurred in each target service component is determined at least based on the compensation response time of that target service component in all the request category groups. In the prior art, whether a delay fault has occurred in the cloud platform is determined only from the log information of the service components or from the resource utilization of the cloud platform. In this scheme, by contrast, whether a delay fault has occurred in each target service component is determined at least based on the compensation response time of that target service component in all request category groups; that is, the scheme can determine not only whether a delay fault has occurred in the cloud platform but also determine the delay fault accurately. This solves the prior-art problem that delay faults of the service components of a cloud platform are difficult to detect accurately, and in turn ensures higher fault-handling efficiency and higher operating reliability for the cloud platform.

2) In the delay abnormality determination device, the construction unit is used for constructing an execution path of each user request on the target service components based on the operation log information of each service component, to obtain the target path information of each user request; the calculating unit is used for classifying the user requests based on their target path information to obtain a plurality of request category groups, and for calculating the compensation response time of each target service component when executing the corresponding user requests in each request category group; the determining unit is used for determining whether a delay fault has occurred in each target service component at least based on the compensation response time of that target service component in all the request category groups. In the prior art, whether a delay fault has occurred in the cloud platform is determined only from the log information of the service components or from the resource utilization of the cloud platform. In this scheme, by contrast, whether a delay fault has occurred in each target service component is determined at least based on the compensation response time of that target service component in all request category groups; that is, the scheme can determine not only whether a delay fault has occurred in the cloud platform but also determine the delay fault accurately. This solves the prior-art problem that delay faults of the service components of a cloud platform are difficult to detect accurately, and in turn ensures higher fault-handling efficiency and higher operating reliability for the cloud platform.

3) The cloud platform comprises a delay abnormality determination device, and the determination device is used for executing any one of the delay abnormality determination methods described above. In the determination method, first, an execution path of each user request on the target service components is constructed based on the operation log information of each service component, to obtain the target path information of each user request; then, the user requests are classified based on their target path information to obtain a plurality of request category groups, and the compensation response time of each target service component when executing the corresponding user requests in each request category group is calculated; finally, whether a delay fault has occurred in each target service component is determined at least based on the compensation response time of that target service component in all the request category groups. In the prior art, whether a delay fault has occurred in the cloud platform is determined only from the log information of the service components or from the resource utilization of the cloud platform. In this scheme, by contrast, whether a delay fault has occurred in each target service component is determined at least based on the compensation response time of that target service component in all request category groups; that is, the scheme can determine not only whether a delay fault has occurred in the cloud platform but also determine the delay fault accurately. This solves the prior-art problem that delay faults of the service components of a cloud platform are difficult to detect accurately, and in turn ensures higher fault-handling efficiency and higher operating reliability for the cloud platform.
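For illustration, the compensation response time referred to in the effects above is defined in the claims, for the j-th target service component on a path, as the duration of its own call minus the duration of the call to the (j+1)-th component. A minimal Python sketch of that calculation follows. The function name, the tuple layout `(component, start, stop)`, and in particular the treatment of the last component on a path (which has no successor, so its downstream duration is taken as zero here) are assumptions of this sketch and are not stated in the application.

```python
def compensation_times(path):
    """Compute per-component compensation response times for one request.

    path: list of (component, start, stop) tuples in call order.
    Returns a dict mapping component name to
    |start_j - stop_j| - |start_{j+1} - stop_{j+1}|.
    """
    durations = [abs(start - stop) for _, start, stop in path]
    result = {}
    for j, (name, _, _) in enumerate(path):
        # Assumption: the last component has no downstream call,
        # so nothing is subtracted from its own duration.
        downstream = durations[j + 1] if j + 1 < len(path) else 0.0
        result[name] = durations[j] - downstream
    return result


# Hypothetical timestamps for one request passing through three components.
path = [("gateway", 0.0, 10.0), ("auth", 1.0, 7.0), ("storage", 2.0, 5.0)]
times = compensation_times(path)
```

With these hypothetical timestamps, the gateway call lasts 10 time units of which 6 are spent in the downstream auth call, so its compensation response time is 4; subtracting the downstream duration is what keeps a slow downstream component from being attributed to its caller.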

The above describes only preferred embodiments of the present application and is not intended to limit it; those skilled in the art may make various modifications and variations to the present application. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its scope of protection.

Claims

1. A method for determining a delay anomaly, wherein the method is applied to a cloud platform, the cloud platform comprising a plurality of service components, the method comprising:

constructing an execution path of a user request on each target service component based on the operation log information of each service component to obtain target path information, wherein one user request corresponds to one piece of target path information, and one piece of target path information comprises at least two target service components;

classifying each user request based on the target path information of each user request to obtain a plurality of request category groups, and calculating, at least based on each piece of target path information in each request category group, the compensation response time of the corresponding target service component when each user request is executed, wherein one request category group comprises a plurality of user requests;

determining whether a delay fault occurs in the corresponding target service component based at least on the compensation response time of each target service component in all the request category groups;

wherein the target path information further includes timestamp information indicating when each of the target service components is invoked, the timestamp information including a start invocation time and a stop invocation time, and calculating, at least based on each piece of target path information in each of the request category groups, the compensation response time of the corresponding target service component when each of the user requests is executed comprises:

calculating the compensation response time of the corresponding target service component when each user request is executed by using $t^{j}(i) = \left|T_b^{j}(i) - T_e^{j}(i)\right| - \left|T_b^{j+1}(i) - T_e^{j+1}(i)\right|$, wherein $t^{j}(i)$ is the compensation response time on the j-th target service component in the target path information corresponding to the i-th user request, $T_b^{j}(i)$ is the start invocation time of the j-th target service component in the target path information corresponding to the i-th user request, $T_e^{j}(i)$ is the stop invocation time of the j-th target service component in the target path information corresponding to the i-th user request, $T_b^{j+1}(i)$ is the start invocation time of the (j+1)-th target service component in the target path information corresponding to the i-th user request, $T_e^{j+1}(i)$ is the stop invocation time of the (j+1)-th target service component in the target path information corresponding to the i-th user request, i is less than or equal to the total number of user requests in one request category group, and j is less than or equal to the total number of target service components in one piece of target path information.

2. The determination method according to claim 1, wherein classifying each of the user requests based on the target path information of each of the user requests to obtain a plurality of request category groups includes:

dividing the target path information of each user request to obtain a plurality of path primitives, wherein each path primitive is a path formed by two adjacent target service components in the target path information, and one piece of target path information corresponds to at least one path primitive;

and grouping the user requests that have the same path primitives into one category to obtain the plurality of request category groups.

3. The determination method according to claim 1, wherein the target path information further includes timestamp information indicating when each of the target service components is invoked, the timestamp information including a start invocation time and a stop invocation time, and calculating, at least based on each piece of target path information in each of the request category groups, the compensation response time of the corresponding target service component when each of the user requests is executed comprises:

determining the unidirectional links corresponding to the target path information of each user request in each request category group, wherein one piece of target path information corresponds to at least two unidirectional links;

calculating the compensation response time on the corresponding target service component when each of the user requests is executed by at least using $t_k^{j}(i) = \left|T_{b,k}^{j}(i) - T_{e,k}^{j}(i)\right| - \left|T_{b,k}^{j+1}(i) - T_{e,k}^{j+1}(i)\right|$, wherein $t_k^{j}(i)$ is the compensation response time of the j-th target service component on the k-th unidirectional link in the target path information corresponding to the i-th user request, $T_{b,k}^{j}(i)$ is the start invocation time of the j-th target service component on the k-th unidirectional link in the target path information corresponding to the i-th user request, $T_{e,k}^{j}(i)$ is the stop invocation time of the j-th target service component on the k-th unidirectional link in the target path information corresponding to the i-th user request, $T_{b,k}^{j+1}(i)$ is the start invocation time of the (j+1)-th target service component on the k-th unidirectional link in the target path information corresponding to the i-th user request, $T_{e,k}^{j+1}(i)$ is the stop invocation time of the (j+1)-th target service component on the k-th unidirectional link in the target path information corresponding to the i-th user request, i is less than or equal to the total number of user requests in one request category group, j is less than or equal to the total number of target service components in one piece of target path information, and k is less than or equal to the total number of unidirectional links of one piece of target path information.

4. The determination method according to any one of claims 1 to 3, wherein determining whether a delay fault occurs in the corresponding target service component based at least on the compensation response time of each of the target service components in all of the request category groups comprises:

normalizing at least all the compensation response times of the target service components in each request category group to obtain the delay fault rate of each target service component in the corresponding request category group;

calculating a target delay fault rate of the corresponding target service component based on the delay fault rate of the target service component in each request category group;

and determining whether a delay fault occurs in the corresponding target service component based on the target delay fault rate and the call ratio of each target service component, wherein one service component corresponds to one call ratio, and the call ratio is the ratio of the total number of times the corresponding target service component is called to the total number of times the cloud platform processes user requests.

5. The determination method according to claim 4, wherein normalizing at least all the compensation response times of the target service components in each of the request category groups to obtain the delay fault rate of each of the target service components in the corresponding request category group comprises:

in one request category group, normalizing all the compensation response times of the z-th target service component to obtain a plurality of normalized probabilities;

calculating the delay fault rate of the corresponding target service component by using [formula], wherein $H_z$ is the delay fault rate of the z-th target service component in one request category group, $P_z(l)$ is the l-th normalized probability of the z-th target service component, m is the total number of normalized probabilities of the z-th target service component, and z is less than or equal to the total number of target service components in the corresponding request category group.

6. The determination method according to claim 4, wherein calculating a target delay fault rate of the corresponding target service component based on the delay fault rate of the target service component in each of the request category groups comprises:

calculating the sum of the delay fault rates of the same target service component in all the request category groups to obtain the fault rate sum of the corresponding target service component;

and calculating the quotient of the fault rate sum and the total number of the request category groups to obtain the target delay fault rate.

7. The determination method according to claim 4, wherein determining whether a delay fault occurs in the corresponding target service component based on the target delay fault rate and the call ratio of each of the target service components comprises:

calculating the product of the target delay fault rate of each target service component and the corresponding call ratio to obtain a plurality of target fault thresholds;

determining that a delay fault occurs in the target service component corresponding to a target fault threshold when that target fault threshold is greater than a preset fault identification threshold;

and determining that no delay fault occurs in the target service component corresponding to a target fault threshold when that target fault threshold is less than or equal to the preset fault identification threshold.

8. A delay anomaly determination device, wherein the determination device is applied in a cloud platform, the cloud platform comprising a plurality of service components, the determination device comprising:

a construction unit, configured to construct an execution path of a user request on each target service component based on the operation log information of each service component to obtain target path information, wherein one user request corresponds to one piece of target path information, and one piece of target path information comprises at least two target service components;

a calculating unit, configured to classify each user request based on the target path information of each user request to obtain a plurality of request category groups, and to calculate, at least based on each piece of target path information in each request category group, the compensation response time of the corresponding target service component when each user request is executed, wherein one request category group comprises a plurality of user requests;

a determining unit, configured to determine whether a delay fault occurs in the corresponding target service component based at least on the compensation response time of each target service component in all the request category groups;

wherein the target path information further includes timestamp information indicating when each target service component is invoked, the timestamp information including a start invocation time and a stop invocation time, and the calculating unit further comprises a first calculating module configured to calculate the compensation response time of the corresponding target service component when each user request is executed by using $t^{j}(i) = \left|T_b^{j}(i) - T_e^{j}(i)\right| - \left|T_b^{j+1}(i) - T_e^{j+1}(i)\right|$, wherein $t^{j}(i)$ is the compensation response time on the j-th target service component in the target path information corresponding to the i-th user request, $T_b^{j}(i)$ is the start invocation time of the j-th target service component in the target path information corresponding to the i-th user request, $T_e^{j}(i)$ is the stop invocation time of the j-th target service component in the target path information corresponding to the i-th user request, $T_b^{j+1}(i)$ is the start invocation time of the (j+1)-th target service component in the target path information corresponding to the i-th user request, $T_e^{j+1}(i)$ is the stop invocation time of the (j+1)-th target service component in the target path information corresponding to the i-th user request, i is less than or equal to the total number of user requests in one request category group, and j is less than or equal to the total number of target service components in one piece of target path information.

9. A cloud platform, comprising: a delay anomaly determination device configured to perform the delay anomaly determination method according to any one of claims 1 to 7.
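As a non-authoritative illustration of the decision steps recited in claims 6 and 7, the following Python sketch averages each component's delay fault rate over all request category groups, multiplies the result by the component's call ratio, and compares the product with a preset fault identification threshold. The function name, the data layout, and all numeric values are hypothetical. The per-group delay fault rates are taken as given inputs, because the formula that produces them is not reproduced in the text above.

```python
def detect_delay_faults(group_rates, call_counts, total_requests, threshold):
    """Decide, per component, whether a delay fault is indicated.

    group_rates: list with one dict per request category group, mapping
        component name -> delay fault rate in that group.
    call_counts: dict mapping component name -> total number of calls.
    total_requests: total number of user requests processed by the platform.
    threshold: preset fault identification threshold.
    """
    n_groups = len(group_rates)

    # Claim 6: sum the rates of the same component over all groups ...
    rate_sums = {}
    for rates in group_rates:
        for component, rate in rates.items():
            rate_sums[component] = rate_sums.get(component, 0.0) + rate

    verdicts = {}
    for component, rate_sum in rate_sums.items():
        # ... and divide by the total number of request category groups.
        target_rate = rate_sum / n_groups
        # Claim 7: weight by the call ratio and compare with the threshold.
        call_ratio = call_counts[component] / total_requests
        verdicts[component] = target_rate * call_ratio > threshold
    return verdicts


# Hypothetical inputs: two request category groups, three components.
group_rates = [
    {"gateway": 0.25, "auth": 0.75},
    {"gateway": 0.5, "storage": 0.25},
]
call_counts = {"gateway": 100, "auth": 50, "storage": 50}
verdicts = detect_delay_faults(group_rates, call_counts,
                               total_requests=100, threshold=0.25)
```

Note that, following the wording of claim 6, the divisor is the total number of request category groups, not the number of groups in which the component appears.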