CN118796597A - A method, device, equipment, storage medium and product for managing scheduled tasks - Google Patents

A method, device, equipment, storage medium and product for managing scheduled tasks

Info

Publication number
CN118796597A
CN118796597A
Authority
CN
China
Prior art keywords
scheduled task
time
task
scheduled
execution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410140757.4A
Other languages
Chinese (zh)
Inventor
郭佳豪
周广为
孔维莲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
Research Institute of China Mobile Communication Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
Research Institute of China Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, Research Institute of China Mobile Communication Co Ltd
Priority to CN202410140757.4A
Publication of CN118796597A
Legal status: Pending

Abstract

Translated from Chinese

The present application discloses a method, apparatus, device, storage medium and product for managing scheduled tasks. The method comprises: before a scheduled task in a multi-dependent task is executed, monitoring the scheduled task based on a first time period, the first time period indicating the time period within which the actual start time of the scheduled task should fall; during execution of the scheduled task, monitoring the scheduled task based on a second time period, the second time period indicating the time period within which the actual completion time of the scheduled task should fall; and after the scheduled task is executed, verifying the execution result of the scheduled task.

Description

Translated from Chinese
A method, device, equipment, storage medium and product for managing scheduled tasks

Technical Field

The present application relates to, but is not limited to, the field of computer technology, and in particular to a method, apparatus, device, storage medium, and computer program product for managing scheduled tasks.

Background Art

With the development of computer technologies such as big data, business systems are managed at an increasingly fine granularity. Scheduled tasks, as a data processing method, appear frequently in the management of various business systems. When a scheduled task executes abnormally, online business systems are adversely affected.

In the related art, scheduled tasks are managed by monitoring and raising alarms during the execution of a single task; specifically, only whether a task exits abnormally after its scheduled launch is monitored. As a result, in complex multi-dependent task scenarios, the business system cannot promptly handle anomalies that go unmonitored, which causes business losses.

Summary of the Invention

The present application provides a method, apparatus, device, storage medium and product for managing scheduled tasks. The management approach provided herein monitors the full process of a scheduled task, covering three stages: before the scheduled task is executed, during its execution, and after its execution.

The technical solution of the embodiments of the present application is implemented as follows:

A method for managing a scheduled task, the method comprising:

before a scheduled task in a multi-dependent task is executed, monitoring the scheduled task based on a first time period, the first time period indicating the time period within which the actual start time of the scheduled task should fall;

during execution of the scheduled task, monitoring the scheduled task based on a second time period, the second time period indicating the time period within which the actual completion time of the scheduled task should fall;

after the scheduled task is executed, verifying the execution result of the scheduled task.

In the above solution, the method further comprises:

acquiring a first time difference of the scheduled task, the first time difference being a historical actual start time of the scheduled task minus the corresponding historical planned start time;

obtaining, according to the first time difference, a first normal distribution result of a plurality of first time differences corresponding to the scheduled task at a plurality of historical times;

determining the first time period according to a first parameter and a second parameter in the first normal distribution result, wherein the first parameter indicates the average value of the plurality of first time differences, and the second parameter indicates the standard deviation of the plurality of first time differences.

In the above solution, the method further comprises:

acquiring a second time difference of the scheduled task, the second time difference being a historical actual completion time of the scheduled task minus the corresponding historical actual start time;

obtaining, according to the second time difference, a second normal distribution result of a plurality of second time differences corresponding to the scheduled task at a plurality of historical times;

determining the second time period according to a third parameter and a fourth parameter in the second normal distribution result, wherein the third parameter indicates the average value of the plurality of second time differences, and the fourth parameter indicates the standard deviation of the plurality of second time differences.

In the above solution, monitoring the scheduled task based on the first time period before the scheduled task in the multi-dependent task is executed comprises:

acquiring the actual start time of the scheduled task;

monitoring whether the actual start time is within the first time period to obtain a first monitoring result.

In the above solution, the method further comprises:

when the first monitoring result indicates that the actual start time is not within the first time period, outputting first alarm information, the first alarm information being used to raise an anomaly alarm for the pre-execution stage of the scheduled task.

In the above solution, monitoring the scheduled task based on the second time period during execution of the scheduled task comprises:

acquiring the actual completion time of the scheduled task;

monitoring whether the actual completion time is within the second time period to obtain a second monitoring result.

In the above solution, the method further comprises:

when the second monitoring result indicates that the actual completion time is not within the second time period, outputting second alarm information, the second alarm information being used to raise an anomaly alarm for the in-execution stage of the scheduled task.

In the above solution, verifying the execution result of the scheduled task after the scheduled task is executed comprises:

performing a first verification operation on the execution result of the scheduled task to obtain a first indicator result;

performing a second verification operation on the execution result of the scheduled task to obtain a second indicator result;

determining a verification score of the execution result according to the first indicator result and the second indicator result;

determining whether the verification score is less than a first threshold to obtain a third monitoring result.

In the above solution, the method further comprises:

when the third monitoring result indicates that the verification score is less than the first threshold, outputting third alarm information, the third alarm information being used to raise an anomaly alarm for the post-execution stage of the scheduled task.
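The post-execution check above can be sketched as follows. This is only an illustration: the text at this point does not say how the two indicator results are combined, so the weighted average, the [0, 1] indicator range, the function names and the threshold value of 0.8 are all assumptions made for the example.

```python
def verification_score(first_indicator, second_indicator, w1=0.5, w2=0.5):
    """Combine two indicator results into one score.

    Assumption: both indicators lie in [0, 1] and are combined by a
    weighted average; the source does not specify the combination rule.
    """
    return w1 * first_indicator + w2 * second_indicator


def third_monitoring_result(score, first_threshold):
    """Return True when the score is below the threshold (alarm case)."""
    return score < first_threshold


# Hypothetical indicator values for one finished task.
score = verification_score(1.0, 0.5)
needs_alarm = third_monitoring_result(score, first_threshold=0.8)
if needs_alarm:
    print(f"third alarm: verification score {score:.2f} is below the threshold")
```

Returning a boolean keeps the alarm decision separate from how the alarm is delivered, so the same check can feed any notification channel.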

In the above solution, before the scheduled task in the multi-dependent task is executed, the method further comprises:

acquiring the planned start time of the scheduled task;

updating the configured planned start time based on the first time difference.

An embodiment of the present application further provides an apparatus for managing a scheduled task, the apparatus comprising:

a processing unit, configured to monitor, before a scheduled task in a multi-dependent task is executed, the scheduled task based on a first time period, the first time period indicating the time period within which the actual start time of the scheduled task should fall;

the processing unit being further configured to monitor, during execution of the scheduled task, the scheduled task based on a second time period, the second time period indicating the time period within which the actual completion time of the scheduled task should fall;

the processing unit being further configured to verify the execution result of the scheduled task after the scheduled task is executed.

An embodiment of the present application further provides a device for managing a scheduled task, comprising: a processor and a memory for storing a computer program executable on the processor; wherein

the processor is configured to perform the steps of the above method for managing a scheduled task when running the computer program.

An embodiment of the present application further provides a storage medium on which a computer program is stored, the computer program implementing the steps of the above method for managing a scheduled task when executed by a processor.

An embodiment of the present application further provides a computer program product comprising a computer program, the computer program being executable by a processor of an electronic device to perform the steps of the above method for managing a scheduled task.

The embodiments of the present application provide a method, apparatus, device, storage medium and product for managing scheduled tasks. The method includes: before a scheduled task in a multi-dependent task is executed, monitoring the scheduled task based on a first time period, the first time period indicating the time period within which the actual start time of the scheduled task should fall; during execution of the scheduled task, monitoring the scheduled task based on a second time period, the second time period indicating the time period within which the actual completion time of the scheduled task should fall; and after the scheduled task is executed, verifying the execution result of the scheduled task. During the execution of multi-dependent scheduled tasks, the present application follows the task execution timeline and applies a corresponding monitoring measure in each of the three stages (before, during and after execution), thereby monitoring the full process of the scheduled task. This solves the problem in the related art that only abnormal exits after a scheduled launch are monitored, so that in complex multi-dependent task scenarios the business system cannot promptly handle unmonitored anomalies and suffers business losses.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow chart of a method for managing scheduled tasks according to an embodiment of the present application;

FIG. 2 is a flow chart of a full-process monitoring method for scheduled multi-dependent tasks according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a full-process monitoring system for scheduled multi-dependent tasks according to an embodiment of the present application;

FIG. 4 is a flow chart of a method for scheduling scheduled tasks in a simple scenario according to an embodiment of the present application;

FIG. 5 is a schematic diagram of a workflow of scheduled tasks in a simple scenario according to an embodiment of the present application;

FIG. 6 is a schematic diagram of a workflow of scheduled tasks in a complex scenario according to an embodiment of the present application;

FIG. 7 is a flow chart of a method for analyzing scheduled task data according to an embodiment of the present application;

FIG. 8 is a schematic diagram of a scheduled task data analysis module according to an embodiment of the present application;

FIG. 9 is an architecture diagram of data verification of a task execution result according to an embodiment of the present application;

FIG. 10 is a schematic structural diagram of an apparatus for managing scheduled tasks according to an embodiment of the present application;

FIG. 11 is a schematic structural diagram of a device for managing scheduled tasks according to an embodiment of the present application.

DETAILED DESCRIPTION

The present application is described in further detail below with reference to the accompanying drawings and embodiments.

In a multi-dependent task, the on-time execution of a scheduled task depends on the on-time execution of its upstream tasks, and its correct execution depends on their correct execution. If an upstream task runs for too long, too many scheduled tasks end up waiting to execute; when a large number of scheduled tasks are waiting at the same time, threads are invoked in bulk, which occupies and wastes system resources such as threads. If the upstream task has finished, but anomalies such as an overly long upstream execution time, network problems, a reduced data volume or program errors cause the scheduled task to start too late, the task cannot begin execution in time.

In the related art, a scheduled task is not monitored before it is executed; only whether it exits abnormally after execution is monitored. Consequently, even when a task starts too late, the delay goes undetected and no error is reported in time. The related art also does not consider that, in complex multi-dependency scenarios, a scheduled task depends on multiple upstream tasks whose execution times are uncertain, so the scheduled task may not run at the expected time. For example, suppose a data cleaning task is scheduled to clean user behavior data at 8:00 every day and depends on an upstream data transmission task. If the upstream transmission has not finished because of network or other problems, the data cleaning task stays in the waiting state until the transmission completes. Because there is no pre-execution monitoring during this period, a transmission problem of this kind that delays the data cleaning task cannot be detected and no alarm is raised in time.

Based on this, the present application provides a method for managing scheduled tasks. As shown in FIG. 1, the method includes the following steps:

Step 101: before a scheduled task in a multi-dependent task is executed, the scheduled task is monitored based on a first time period.

In practice, the first time period indicates the time period within which the actual start time of the scheduled task should fall.

In practice, a multi-dependent task is a workflow in which the start of a downstream scheduled task depends on the completion of multiple upstream scheduled tasks; it may also be called a scheduled multi-dependent task. Before a scheduled task in a multi-dependent task is executed, it is determined whether the actual start time of the scheduled task falls within the first time period, that is, the period within which the actual start time should fall, so that the scheduled task is monitored before execution.
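To make the dependency relationship concrete, the following minimal sketch computes when a downstream task can actually start if it must wait for both its planned time and the completion of every upstream task. The function name and the sample times are hypothetical and are not taken from the application.

```python
from datetime import datetime


def earliest_possible_start(planned_start, upstream_end_times):
    """A downstream task can start only once its planned time has arrived
    and every upstream task has finished."""
    return max([planned_start, *upstream_end_times])


planned = datetime(2024, 1, 1, 8, 0)
upstream_ends = [
    datetime(2024, 1, 1, 7, 50),  # hypothetical upstream task A, on time
    datetime(2024, 1, 1, 8, 12),  # hypothetical upstream task B, late
]
start = earliest_possible_start(planned, upstream_ends)
# Actual start minus planned start: the quantity the first time period bounds.
delay_seconds = (start - planned).total_seconds()
```

Here a single late upstream task pushes the downstream start back by 12 minutes, which is the kind of deviation pre-execution monitoring is meant to catch.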

Step 102: during execution of the scheduled task, the scheduled task is monitored based on a second time period.

In practice, the second time period indicates the time period within which the actual completion time of the scheduled task should fall.

In practice, after a scheduled task in a multi-dependent task has started, it is determined during execution whether the actual completion time of the scheduled task falls within the second time period, that is, the period within which the actual completion time should fall, so that the scheduled task is monitored during execution.

Step 103: after the scheduled task is executed, the execution result of the scheduled task is verified.

In practice, in complex multi-dependent scheduled task scenarios, it often happens that the execution status of some scheduled tasks is successful while the result of the overall workflow does not match the expected data. For example, a user portrait requires indicator data for each user in each dimension to be computed from offline behavior data. If part of the upstream offline data transmission is abnormal while the overall user portrait computation task still reports a successful execution status, the execution result of the task falls short of expectations despite the successful status. In the present application, once a scheduled task in a multi-dependent task has finished, its execution result is obtained promptly and checked against expectations, so that the execution result of the scheduled task is monitored. This avoids anomalies going undetected when a program runs normally but produces a result that does not meet expectations.

The method for managing scheduled tasks provided by the embodiments of the present application includes: before a scheduled task in a multi-dependent task is executed, monitoring the scheduled task based on a first time period, the first time period indicating the time period within which the actual start time of the scheduled task should fall; during execution of the scheduled task, monitoring the scheduled task based on a second time period, the second time period indicating the time period within which the actual completion time of the scheduled task should fall; and after the scheduled task is executed, verifying the execution result of the scheduled task. During the execution of multi-dependent scheduled tasks, the present application follows the task execution timeline and applies a corresponding monitoring measure in each of the three stages (before, during and after execution), thereby monitoring the full process of the scheduled task. This solves the problem in the related art that only abnormal exits after a scheduled launch are monitored, so that in complex multi-dependent task scenarios the business system cannot promptly handle unmonitored anomalies and suffers business losses.

In one embodiment, determining the first time period in step 101 includes:

acquiring a first time difference of the scheduled task, the first time difference being a historical actual start time of the scheduled task minus the corresponding historical planned start time;

obtaining, according to the first time difference, a first normal distribution result of a plurality of first time differences corresponding to the scheduled task at a plurality of historical times;

determining the first time period according to a first parameter and a second parameter in the first normal distribution result, wherein the first parameter indicates the average value of the plurality of first time differences, and the second parameter indicates the standard deviation of the plurality of first time differences.

In practice, the historical actual start times and historical planned start times of the scheduled task are obtained, and each first time difference is computed as a historical actual start time minus the corresponding historical planned start time. The first time differences at multiple historical times form a first time difference sequence, which is then processed to obtain the corresponding first normal distribution result.

In practice, in one case, the first time difference sequence is Z-standardized so that the processed data follow the standard normal distribution with mean 0 and standard deviation 1, which yields the first normal distribution result:

$$Z_1 = \frac{x_1 - \mu_1}{\sigma_1}$$

where $x_1$ is the first time difference of the scheduled task at each historical time, in seconds;

$\mu_1$ is the average value of the first time difference sequence, in seconds;

$\sigma_1$ is the standard deviation of the first time difference sequence, in seconds;

$Z_1$ is the Z-score of the observation at each historical time in the first time difference sequence, which allows different features to be compared on the same scale.
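The Z-standardization described above can be sketched as follows. The use of the population standard deviation and the handling of a zero standard deviation are assumptions made for the example, and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(diffs_seconds):
    """Return the Z-score of each historical time difference (in seconds)."""
    mu = mean(diffs_seconds)
    # Assumption: population standard deviation; the source does not say
    # whether the population or the sample form is used.
    sigma = pstdev(diffs_seconds)
    if sigma == 0:
        # Assumption: identical observations are all given a Z-score of 0.
        return [0.0 for _ in diffs_seconds]
    return [(x - mu) / sigma for x in diffs_seconds]


# Hypothetical first time differences (actual start minus planned start).
history = [10, 20, 30]
scores = z_scores(history)
```

After standardization the scores have mean 0 and standard deviation 1 regardless of the original scale.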

In practice, based on the average value and standard deviation of the first time differences in the first normal distribution result, the 3-sigma rule is used to obtain the bounds of the data distributed within $(\mu_1 - 3\sigma_1,\ \mu_1 + 3\sigma_1)$, which gives the lower and upper limits of the first time difference. Because the planned start time of the scheduled task is a preset fixed value, the lower and upper limits of the actual start time can be derived from the definition of the first time difference, which gives the first time period within which the actual start time should fall. In other words, the scheduled task should start once the time corresponding to the minimum of the first time period has been reached, and its actual start time should not be later than the time corresponding to the maximum of the first time period, so that the actual start time stays within the expected range. In this way, statistical analysis is used to compute the lower and upper thresholds of the actual start time and hence the first time period; the first time period then serves as the benchmark for judging whether the actual start time meets expectations, producing the first monitoring result and realizing pre-execution monitoring of multi-dependent scheduled tasks.
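As a rough sketch of this derivation, the first time period can be computed by adding the 3-sigma bounds of the historical start delays to the planned start time. The function name, the use of the population standard deviation and the sample data are assumptions for illustration only.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def first_period(planned_start, start_delays_seconds, k=3.0):
    """Return (lower, upper) bounds for the actual start time.

    start_delays_seconds holds historical values of
    actual start time minus planned start time, in seconds.
    """
    mu = mean(start_delays_seconds)
    sigma = pstdev(start_delays_seconds)  # assumption: population form
    lower = planned_start + timedelta(seconds=mu - k * sigma)
    upper = planned_start + timedelta(seconds=mu + k * sigma)
    return lower, upper


planned = datetime(2024, 1, 1, 8, 0, 0)
delays = [50, 60, 70]  # hypothetical historical start delays
period_start, period_end = first_period(planned, delays)
```

With these sample delays the window is centered 60 seconds after the planned start and is six standard deviations wide.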

In one embodiment, determining the second time period in step 102 includes:

acquiring a second time difference of the scheduled task, the second time difference being a historical actual completion time of the scheduled task minus the corresponding historical actual start time;

obtaining, according to the second time difference, a second normal distribution result of a plurality of second time differences corresponding to the scheduled task at a plurality of historical times;

determining the second time period according to a third parameter and a fourth parameter in the second normal distribution result, wherein the third parameter indicates the average value of the plurality of second time differences, and the fourth parameter indicates the standard deviation of the plurality of second time differences.

In practice, the historical actual start times and historical actual completion times of the scheduled task are obtained, and each second time difference is computed as a historical actual completion time minus the corresponding historical actual start time. The second time differences at multiple historical times form a second time difference sequence, which is then processed to obtain the corresponding second normal distribution result.

In practice, in one case, the second time difference sequence is Z-standardized so that the processed data follow the standard normal distribution with mean 0 and standard deviation 1, which yields the second normal distribution result:

$$Z_2 = \frac{x_2 - \mu_2}{\sigma_2}$$

where $x_2$ is the second time difference of the scheduled task at each historical time, in seconds;

$\mu_2$ is the average value of the second time difference sequence, in seconds;

$\sigma_2$ is the standard deviation of the second time difference sequence, in seconds;

$Z_2$ is the Z-score of the observation at each historical time in the second time difference sequence, which allows different features to be compared on the same scale.

In practice, based on the average value and standard deviation of the second time differences in the second normal distribution result, the 3-sigma rule is used to obtain the bounds of the data distributed within $(\mu_2 - 3\sigma_2,\ \mu_2 + 3\sigma_2)$, which gives the lower and upper limits of the second time difference. Because the planned start time of the scheduled task is a preset fixed value, the lower and upper limits of the actual end time (that is, the actual completion time) can be derived from the definitions of the first and second time differences, which gives the second time period within which the actual end time should fall. In other words, the scheduled task should finish once the time corresponding to the minimum of the second time period has been reached, and its actual end time should not be later than the time corresponding to the maximum of the second time period, so that the actual execution duration stays within the expected range. In this way, statistical analysis is used to compute the lower and upper thresholds of the actual end time and hence the second time period; the second time period then serves as the benchmark for judging whether the actual end time meets expectations, producing the second monitoring result and realizing in-execution monitoring of multi-dependent scheduled tasks.
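One possible way to turn the duration statistics into a completion window is sketched below. It anchors the window on the observed actual start time, which is a simplifying assumption: the text derives the bounds from the planned start time together with both time differences. Clamping the shortest duration at zero, the function name and the sample data are likewise assumptions.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def second_period(actual_start, durations_seconds, k=3.0):
    """Return (lower, upper) bounds for the actual completion time.

    durations_seconds holds historical values of
    actual completion time minus actual start time, in seconds.
    """
    mu = mean(durations_seconds)
    sigma = pstdev(durations_seconds)  # assumption: population form
    # Assumption: a duration cannot be negative, so clamp the lower bound.
    shortest = max(0.0, mu - k * sigma)
    longest = mu + k * sigma
    return (actual_start + timedelta(seconds=shortest),
            actual_start + timedelta(seconds=longest))


started = datetime(2024, 1, 1, 8, 1, 0)
durations = [590, 600, 610]  # hypothetical historical run times
finish_lower, finish_upper = second_period(started, durations)
```

A monitor can compare the current time against `finish_upper` while the task is still running, so an overrun is flagged without waiting for the task to finish.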

In one embodiment, monitoring the scheduled task based on the first time period before the scheduled task among the multi-dependency tasks is executed, in step 101, includes:

obtaining the actual start time of the scheduled task;

monitoring whether the actual start time is within the first time period to obtain a first monitoring result.

In practice, to monitor whether a scheduled task starts on time, monitoring begins before the task is executed: the actual start time of the monitored task is obtained and checked against the first time period. If the actual start time falls within the first time period, it is within the expected range and no pre-execution anomaly has occurred, so the first monitoring result is "normal before execution". If it does not, the actual start time is compared with the bounds of the first time period: a value below the minimum means the task started early, and a value above the maximum means it started late. In both cases the first monitoring result is "pre-execution anomaly". Checking whether the actual start time lies within the first time period thus realizes pre-execution monitoring of the scheduled task.
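The check described above can be sketched as follows. This is a minimal illustration, not the application's implementation; the names `StartStatus` and `check_start` are hypothetical, and the first time period is assumed to be a closed interval.

```python
from datetime import datetime
from enum import Enum


class StartStatus(Enum):
    """Possible first monitoring results (hypothetical labels)."""
    NORMAL = "normal before execution"
    EARLY = "pre-execution anomaly: started early"
    LATE = "pre-execution anomaly: started late"


def check_start(actual_start: datetime,
                window_min: datetime,
                window_max: datetime) -> StartStatus:
    """Classify the actual start time against the first time period.

    Assumes the period is the closed interval [window_min, window_max].
    """
    if actual_start < window_min:
        return StartStatus.EARLY
    if actual_start > window_max:
        return StartStatus.LATE
    return StartStatus.NORMAL
```

The same comparison applies to the actual end time and the second time period during in-execution monitoring.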

In one embodiment, the method further includes:

when the first monitoring result indicates that the actual start time is not within the first time period, outputting first alarm information, which is used to raise an anomaly alarm for the pre-execution stage of the scheduled task.

In practice, a first monitoring result indicating that the actual start time is below the minimum or above the maximum of the first time period means the scheduled task started early or late. So that the cause of the pre-execution anomaly can be investigated, the first alarm information is output automatically according to the first monitoring result, notifying operation and maintenance personnel to handle the anomaly promptly.

In one embodiment, monitoring the scheduled task based on the second time period during execution of the scheduled task, in step 102, includes:

obtaining the actual completion time of the scheduled task;

monitoring whether the actual completion time is within the second time period to obtain a second monitoring result.

In practice, to monitor whether a scheduled task finishes on time, the actual end time of the monitored task is obtained during task execution and checked against the second time period. If the actual end time falls within the second time period, it is within the expected range and no in-execution anomaly has occurred, so the second monitoring result is "normal during execution". If it does not, the actual end time is compared with the bounds of the second time period: a value below the minimum means the task ended early, and a value above the maximum means the task ran too long. In both cases the second monitoring result is "in-execution anomaly".

In one embodiment, the method further includes:

when the second monitoring result indicates that the actual completion time is not within the second time period, outputting second alarm information, which is used to raise an anomaly alarm for the in-execution stage of the scheduled task.

In practice, a second monitoring result indicating that the actual end time is below the minimum or above the maximum of the second time period means the scheduled task ended early or ran too long. So that the cause of the in-execution anomaly can be investigated, the second alarm information is output automatically according to the second monitoring result, notifying operation and maintenance personnel to handle the anomaly promptly.

In one embodiment, verifying the execution result of the scheduled task after the scheduled task is executed, in step 103, includes:

performing a first verification operation on the execution result of the scheduled task to obtain a first indicator result;

performing a second verification operation on the execution result of the scheduled task to obtain a second indicator result;

determining a verification score of the execution result according to the first indicator result and the second indicator result;

determining whether the verification score is less than a first threshold to obtain a third monitoring result.

In practice, to monitor whether a scheduled task executed correctly, the verification score of the execution result is determined from the first indicator result and the second indicator result together, and the score is compared with the first threshold to obtain the third monitoring result, which indicates whether the task executed correctly.

In practice, the first verification operation checks strong verification indicators. Strong verification indicators include, but are not limited to, whether data exists and whether an index exists on a specific field. Under strong verification, each first indicator result is either 0 or 1, and is also called a strong verification indicator score.

In practice, the second verification operation checks weak verification indicators. Weak verification indicators include, but are not limited to, the data fill rate and the data fluctuation rate. Under weak verification, each second indicator result is a value between 0 and 1, and is also called a weak verification indicator score.

In practice, the first threshold is a preset time-type threshold. For example, it can be derived from data in the scheduled task's historical logs, such as actual execution times and task execution durations.

In practice, the third monitoring result can be the comparison between the verification score and the first threshold itself, or an indication, derived from that comparison, of whether a post-execution anomaly has occurred. It can be set according to actual needs, and the present application does not specifically limit it.

In practice, the verification score of the execution result can be determined from the first indicator result and the second indicator result as follows:

After the strong verification indicator scores indicated by the first indicator result and the weak verification indicator scores indicated by the second indicator result are obtained, the verification score of the execution result is calculated according to formula (3), in which the strong verification indicator scores are multiplied together and the weak verification indicator scores are averaged.

In formula (3), $m$ is the number of selected strong verification indicators;

$a_i$ is the score of the $i$-th strong verification indicator;

$n$ is the number of selected weak verification indicators;

$b_j$ is the score of the $j$-th weak verification indicator;

$\mathrm{Score}$ is the verification score of the execution result.
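Formula (3) itself is not reproduced in this text. Going only by the description (the strong scores are multiplied together, the weak scores are averaged) and the symbol definitions above, one form consistent with them would be the following. Combining the two parts by multiplication is an assumption, not something the text states.

```latex
\mathrm{Score} = \left( \prod_{i=1}^{m} a_i \right) \cdot \frac{1}{n} \sum_{j=1}^{n} b_j \tag{3}
```

Under this form, a single failed strong indicator ($a_i = 0$) drives the score to 0, and when all strong indicators pass the score equals the mean of the weak indicator scores.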

In practice, if any value in the first indicator result is 0, an alarm is raised directly. If all values in the first indicator result are 1, the second indicator result and the verification score are calculated, and the verification score is compared with the preset first threshold for post-execution monitoring; a post-execution anomaly alarm is raised when the verification score is less than the first threshold. In this way, after the task is executed, the third monitoring result is obtained from the verification score and used to decide whether an anomaly alarm is needed. Because monitoring is based on data verification, a task that completes but did not execute correctly is still detected, realizing post-execution monitoring.
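The post-execution decision can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, at least one weak indicator is assumed, and the score is assumed to be the product of the strong scores times the mean of the weak scores, which the text describes only loosely.

```python
from statistics import mean
from typing import Sequence


def verification_score(strong: Sequence[int], weak: Sequence[float]) -> float:
    """Assumed score: product of strong scores (0 or 1) times mean of weak scores."""
    product = 1
    for a in strong:
        product *= a
    return product * mean(weak)  # mean() raises if weak is empty


def post_execution_alarm(strong: Sequence[int],
                         weak: Sequence[float],
                         first_threshold: float) -> bool:
    """Return True when a post-execution anomaly alarm should be raised."""
    if any(a == 0 for a in strong):
        return True  # a failed strong check alarms directly, no score needed
    return verification_score(strong, weak) < first_threshold
```

For example, with strong scores `[1, 1]`, weak scores `[0.5, 0.6]` and a threshold of `0.8`, the score is about 0.55 and an alarm would be raised.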

In one embodiment, the method further includes:

when the third monitoring result indicates that the verification score is less than the first threshold, outputting third alarm information, which is used to raise an anomaly alarm for the post-execution stage of the scheduled task.

In practice, a third monitoring result indicating that the verification score is less than the first threshold means the execution result of the scheduled task does not meet expectations and a post-execution anomaly has occurred. So that the cause of the post-execution anomaly can be investigated, the third alarm information is output automatically according to the third monitoring result, notifying operation and maintenance personnel to handle the anomaly promptly.

In one embodiment, before the scheduled task among the multi-dependency tasks is executed in step 101, the method further includes:

obtaining the planned start time of the scheduled task;

configuring and updating the planned start time based on the first time difference.

In practice, the first time difference is the historical actual start time of the scheduled task minus its historical planned start time. The planned start time is a preset time that the task uses as the reference for starting execution. In complex scenarios, a downstream scheduled task can only run normally once its upstream tasks have finished normally, and the execution time of upstream tasks is uncertain. To account for the effect of these changing upstream execution times on the downstream task, the planned start time of the scheduled task is configured and dynamically updated based on the first time difference. The planned start time is thereby adjusted flexibly and dynamically according to the upstream dependencies, which improves the accuracy of the scheduled task's execution time.

In practice, to make the configuration update more accurate, the difference between the actual start time and the planned start time of the scheduled task is collected at each historical time, giving multiple first time differences that form the first time difference sequence. This sequence reflects how strongly the upstream dependencies affected the moment at which the task actually started at each historical time; the smaller the difference, the smaller the upstream influence. Averaging the first time differences gives the average influence of the upstream dependencies on the actual start of the task over a historical period. The current preset planned start time is then adjusted by taking this average into account, which yields a more accurate recommended planned start time. The current planned start time is configured and updated so that the scheduled task runs according to the updated recommended planned start time. The planned start times of all scheduled tasks can thus be configured and updated automatically in batches, replacing manual monitoring of task execution, reducing the gap between planned and actual start times, saving maintenance labor, and improving system utilization.
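A minimal sketch of this adjustment follows. The text says only that the average first time difference is "taken into account"; shifting the planned start time by that average is an assumption made here for illustration, and `recommended_plan_start` is a hypothetical name.

```python
from datetime import datetime, timedelta
from statistics import mean
from typing import Sequence


def recommended_plan_start(plan_start: datetime,
                           first_diffs_seconds: Sequence[float]) -> datetime:
    """Shift the planned start time by the mean historical (ST - PT) delay.

    Assumption: the adjustment is a simple additive shift by the mean.
    """
    return plan_start + timedelta(seconds=mean(first_diffs_seconds))
```

For instance, a task planned for 08:00 whose historical delays were 900 s, 600 s and 300 s would, under this assumption, be recommended to start at 08:10.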

The present application is described in further detail below with reference to application examples.

In one practical scenario, referring to FIG. 2, an embodiment of the present application provides a full-process monitoring method for scheduled multi-dependency tasks. The monitoring process is divided into three stages, in which the scheduled task is monitored and anomaly alarms are raised before, during, and after task execution, as follows:

Step 201: Start.

Step 202: Before the task is executed, monitor it against the threshold for the task start time.

Step 203: Determine whether a pre-execution anomaly has occurred; if so, go to step 204, otherwise go to step 205.

Step 204: Issue alarm information for the pre-execution anomaly.

Alarm information is sent promptly when an anomaly is detected.

Step 205: During task execution, monitor the task against the threshold for the task end time.

Step 206: Determine whether an in-execution anomaly has occurred; if so, go to step 207, otherwise go to step 208.

Step 207: Issue alarm information for the in-execution anomaly.

Alarm information is sent promptly when an anomaly is detected.

Step 208: After the task is executed, perform data verification on the task execution result.

Step 209: Determine whether a post-execution anomaly has occurred; if so, go to step 210, otherwise go to step 211.

After the task is executed, its execution result is verified, and an alarm is sent if the verification result is abnormal.

Step 210: Issue alarm information for the post-execution anomaly.

Alarm information is sent promptly when an anomaly is detected.

Step 211: End.
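The three-stage flow of steps 201 to 211 can be sketched as a simple loop over stage checks. This is an illustration under assumptions: the text does not say where the flow goes after an alarm step (204, 207, 210), so the sketch assumes monitoring continues with the next stage; `monitor_full_flow` and the stage names are hypothetical.

```python
from typing import Callable, List, Tuple

# (stage name, check that returns True if an anomaly occurred in that stage)
Stage = Tuple[str, Callable[[], bool]]


def monitor_full_flow(stages: List[Stage],
                      send_alarm: Callable[[str], None]) -> List[str]:
    """Run the stage checks in order and alarm on each detected anomaly.

    Returns the names of the stages in which an anomaly was found.
    Assumption: the flow proceeds to the next stage after an alarm.
    """
    anomalies = []
    for name, has_anomaly in stages:
        if has_anomaly():
            send_alarm(f"anomaly {name} task execution")
            anomalies.append(name)
    return anomalies
```

A caller would pass three stages, for example `("before", ...)`, `("during", ...)` and `("after", ...)`, each wrapping the corresponding start-time, end-time, or data-verification check.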

In the related art, when scheduled tasks are dispatched, the monitoring center only monitors resource indicators of the master node (Master) and worker nodes (Worker), such as hardware memory and central processing unit (CPU) indicators, and sends a simple failure alarm only after those indicators become abnormal and the task has failed. As a result, when an upstream task on which a scheduled task depends keeps running past its timeout, the system does not treat this as an execution failure and therefore cannot send an execution anomaly alarm in time.

Based on this, in one practical scenario, referring to FIG. 3, an embodiment of the present application provides a full-process monitoring system 300 for scheduled multi-dependency tasks, including a user layer 301, an interaction layer 302, and a scheduling layer 303, wherein:

the user layer 301 is where data analysts, big data developers, operation and maintenance personnel, and other relevant staff input information, including scheduled task scheduling configuration information and other task settings sent to the interaction layer, and where they receive information about scheduled task execution results fed back by the interaction layer;

the interaction layer 302 feeds back to the user layer the monitoring configuration information, the recommended execution time of the scheduled task, and/or the alarm information received from the scheduling layer. This includes human-computer interaction with the data analysts, big data developers, and operation and maintenance personnel of the user layer through the interfaces of communication tools such as browsers, email, and text messages. The interaction layer also sends the scheduling configuration information entered at the user layer to the scheduling layer so that the scheduled tasks can be dispatched;

the scheduling layer 303 dispatches the scheduled tasks and includes a scheduled task scheduling module 3031, a scheduled task data analysis module 3032, and a scheduled task monitoring module 3033, wherein:

the scheduled task scheduling module 3031 applies the scheduling configuration information entered at the user layer, executes the scheduled task according to that configuration information, and generates a task execution log after the task is executed;

the scheduled task data analysis module 3032 collects execution time data from the task execution log generated by the scheduled task scheduling module 3031 and analyzes the collected data to generate the monitoring configuration information and the recommended planned start time of the scheduled task. The generated monitoring configuration information and recommended execution time can be fed back to the relevant personnel at the user layer through the interaction layer;

the scheduled task monitoring module 3033 generates monitoring tasks from the monitoring configuration information produced by the scheduled task data analysis module 3032. While the scheduled task scheduling module 3031 executes a scheduled task, it monitors the task through the full process and returns monitoring result information. It then analyzes the returned monitoring result information for anomalies and, when a monitoring result is abnormal, sends alarm information promptly, so that the monitoring alarm is fed back to the relevant personnel at the user layer through the human-computer interaction interface of the interaction layer.

In one practical scenario, referring to FIG. 4, an embodiment of the present application provides a scheduling method for scheduled multi-dependency tasks, which can be executed on the full-process monitoring system described above. Taking the workflow of a scheduled task with a simple upstream dependency as an example, the method is as follows:

The first planned start time (Plan Time, PT1) of scheduled task TASK1 and the second planned start time PT2 of scheduled task TASK2 are set in advance.

Step 401: Start.

Step 402: Execute scheduled task TASK1. After the flow starts, TASK1 is executed according to its first planned start time PT1. TASK1 starts at the first actual start time (Start Time, ST1) and finishes at the first actual completion time (End Time, ET1). When TASK1 finishes, the execution flow of TASK2 is triggered and the flow goes to step 403.

Step 403: Execute scheduled task TASK2. TASK2 is executed according to its second planned start time PT2. TASK2 starts at the second actual start time ST2 and finishes at the second actual completion time ET2, and the flow goes to step 404.

Step 404: End.

The successful start of the TASK2 execution flow depends on the successful completion of the TASK1 execution flow; if TASK1 fails, TASK2 fails as well.

A scheduled task produces three time points: the planned start time (Plan Time, PT), the actual start time (Start Time, ST), and the actual completion time (End Time, ET). For example, referring to FIG. 5, suppose the scheduling system is set to run a user-profile data processing task at 8:00 every day. Because an upstream task finishes late, the task actually starts at 8:15 and completes at 8:21. For this task, PT is 8:00, ST is 8:15, and ET is 8:21. Pre-execution monitoring is based on PT, in-execution monitoring is based on ST, and post-execution monitoring is based on ET.

In actual production, when a scheduled task runs according to the planned start time set by the developer, the execution status of its upstream dependencies must be taken into account. FIG. 6 is a schematic diagram of a workflow 6 of scheduled tasks with complex upstream dependencies. Besides the dependencies between upstream and downstream scheduled tasks 601 within a workflow, scheduled tasks 601 in different workflows also depend on one another.

In one practical scenario, referring to FIG. 7, an embodiment of the present application provides a data analysis method for scheduled multi-dependency tasks, corresponding to the data analysis module 800 shown in FIG. 8 or the scheduled task data analysis module 3032 described above. It can be executed on the full-process monitoring system described above, as follows:

Step 701: Collect execution time data from the scheduling log.

From the scheduling log of the scheduled task scheduling system, the PT, ST, and ET of the scheduled task at each historical time are obtained, giving three raw sequences: the planned start time sequence $\{PT_1, PT_2, PT_3, \ldots, PT_t\}$, the actual start time sequence $\{ST_1, ST_2, ST_3, \ldots, ST_t\}$, and the actual end time sequence $\{ET_1, ET_2, ET_3, \ldots, ET_t\}$. The raw sequences are in time format.

Step 702: Store the collection results.

The collected data is synchronized to a database for recording.

In practice, a big data log collection tool such as Flume can be used to collect these key data from the scheduling log of the scheduling system, and the collected data can be stored in a data warehouse tool such as Hive.

Step 703: Perform data analysis to generate the monitoring thresholds and the recommended planned start time.

A data analysis tool such as Spark is used to analyze the data and generate the monitoring thresholds and the recommended planned start time.

First, the numerical distribution of the collected results is analyzed.

From the obtained PT and ST, the task plan time error $T_{error}$ is calculated with formula (4). $T_{error}$ is the time the scheduled task spends waiting to execute, from the moment its planned start time arrives until it actually starts:

$$T_{error} = ST - PT \tag{4}$$

where ST is the actual start time of the scheduled task, in seconds;

PT is the planned start time of the scheduled task, in seconds;

$T_{error}$ is the task plan time error, in seconds.

From the obtained ST and ET, the task execution time $T_{execute}$ is calculated with formula (5). $T_{execute}$ is the actual execution duration of the scheduled task, from its actual start to its actual end:

$$T_{execute} = ET - ST \tag{5}$$

where ST is the actual start time of the scheduled task, in seconds;

ET is the actual end time of the scheduled task, in seconds;

$T_{execute}$ is the actual execution duration of the scheduled task, in seconds.
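As a worked instance of formulas (4) and (5), the sketch below uses the FIG. 5 example (PT 8:00, ST 8:15, ET 8:21). The calendar date and the function names are arbitrary choices made for illustration.

```python
from datetime import datetime


def plan_time_error(pt: datetime, st: datetime) -> float:
    """Formula (4): T_error = ST - PT, in seconds."""
    return (st - pt).total_seconds()


def execution_time(st: datetime, et: datetime) -> float:
    """Formula (5): T_execute = ET - ST, in seconds."""
    return (et - st).total_seconds()


# Times from the FIG. 5 example; the date itself is arbitrary.
pt = datetime(2024, 1, 1, 8, 0)
st = datetime(2024, 1, 1, 8, 15)
et = datetime(2024, 1, 1, 8, 21)

print(plan_time_error(pt, st))  # 900.0 (15 minutes of waiting)
print(execution_time(st, et))   # 360.0 (6 minutes of execution)
```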

$T_{error}$ and $T_{execute}$ are calculated for the scheduled task at each historical time, producing the plan time error sequence $\{T_{error,1}, T_{error,2}, \ldots, T_{error,t}\}$ and the task execution time sequence $\{T_{execute,1}, T_{execute,2}, \ldots, T_{execute,t}\}$.

The mean $\mu$ and standard deviation $\sigma$ of each of the two sequences are calculated, and the data are then Z-standardized according to formula (1) or formula (2). After this processing the data have mean 0 and standard deviation 1, corresponding to the standard normal distribution.
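The standardization step can be sketched as follows, assuming formulas (1) and (2) take the usual form $z = (x - \mu)/\sigma$. Using the population standard deviation is also an assumption, and `z_standardize` is a hypothetical name.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def z_standardize(values: Sequence[float]) -> List[float]:
    """Return z = (x - mu) / sigma for each value.

    Assumes the population standard deviation; a constant series has
    sigma == 0 and cannot be standardized.
    """
    mu = mean(values)
    sigma = pstdev(values, mu)
    if sigma == 0:
        raise ValueError("standard deviation is zero; cannot standardize")
    return [(x - mu) / sigma for x in values]
```

Standardizing rescales the series to mean 0 and standard deviation 1; it does not by itself make non-normal data normal, so the 3-sigma coverage figure holds only to the extent the historical values are close to normally distributed.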

第二,根据数值分布情况,生成监控阈值。Second, generate monitoring thresholds based on the numerical distribution.

监控阈值指示执行时间应该所属的时段,包含了该时段的上下限。本申请实施例利用3sigma法则计算执行时间上下限;其中,The monitoring threshold indicates the time period to which the execution time should belong, including the upper and lower limits of the time period. The embodiment of the present application uses the 3sigma rule to calculate the upper and lower limits of the execution time; wherein,

The monitoring thresholds include the first time period corresponding to the first monitoring threshold, which indicates the time period to which the actual start time ST of the scheduled task should belong. According to the 3-sigma rule, a value falls within (μ-3σ, μ+3σ) with probability 0.9973, which gives the upper and lower limits of the standardized data. The upper and lower limits of the corresponding Terror are then calculated according to formula (4). Since PT is set in advance, the upper and lower limits of the corresponding ST, STmin and STmax, can then be calculated, yielding the first time period corresponding to the first monitoring threshold: [STmin, STmax].

The monitoring thresholds also include the second time period corresponding to the second monitoring threshold, which indicates the time period to which the actual end time ET of the scheduled task should belong. According to the 3-sigma rule, a value falls within (μ-3σ, μ+3σ) with probability 0.9973, which gives the upper and lower limits of the data in the standardized sequence. The upper and lower limits of the corresponding Texecute are then derived in reverse from formula (5) and the ST limits above, and from them the upper and lower limits of the corresponding ET, ETmin and ETmax, are calculated, yielding the second time period corresponding to the second monitoring threshold: [ETmin, ETmax].

Third, generate the recommended planned start time based on the numerical distribution.
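One possible way to turn the 3-sigma rule into the two windows is sketched below. It assumes that formula (4) reads Terror = ST - PT and formula (5) reads Texecute = ET - ST, which matches the time differences defined in the claims but is not shown in this passage. It also works on the raw sequences, since mapping the standardized limits of ±3 back through μ and σ gives μ ± 3σ. Combining the ST limits with the Texecute limits to bound ET, and clamping the lower duration limit at zero, are illustrative choices; all function names are hypothetical.

```python
from statistics import fmean, pstdev


def three_sigma_bounds(samples):
    """Return (mu - 3*sigma, mu + 3*sigma) for a sequence of seconds."""
    mu = fmean(samples)
    sigma = pstdev(samples)
    return mu - 3 * sigma, mu + 3 * sigma


def monitoring_windows(pt, t_error_history, t_execute_history):
    """Return ((st_min, st_max), (et_min, et_max)) in seconds."""
    err_lo, err_hi = three_sigma_bounds(t_error_history)
    exe_lo, exe_hi = three_sigma_bounds(t_execute_history)
    # Assumed formula (4): Terror = ST - PT, so ST = PT + Terror.
    st_min, st_max = pt + err_lo, pt + err_hi
    # Assumed formula (5): Texecute = ET - ST, so ET = ST + Texecute.
    # A duration cannot be negative, so the lower limit is clamped at 0.
    et_min = st_min + max(exe_lo, 0.0)
    et_max = st_max + exe_hi
    return (st_min, st_max), (et_min, et_max)
```

For example, `monitoring_windows(1000, [10, 20, 30], [100, 100, 100])` widens the start window around `PT + 20` and shifts the end window by the constant 100-second duration.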

The average value of the planned time error is obtained from the planned time error sequence. It is the average, over the historical executions of the scheduled task, of the error between the task's planned start time and its actual start time under the influence of the actual end times of the upstream tasks on which it depends, and therefore represents the average degree of influence of the upstream dependent tasks. From formula (4), the closer the planned start time PT is to the actual start time ST, the smaller the planned time error when the scheduled task is actually executed. When the upstream tasks on which the scheduled task depends are taken into account, the effect of the historical average of the planned time error on the planned start time should therefore be considered. Accordingly, the recommended planned start time PTrec is set according to formula (6):

PTrec = PT + mean(Terror)    (6)

where PT is the planned start time of the scheduled task, in seconds;

mean(Terror) is the average value of the planned time error sequence over the historical times of the scheduled task;

PTrec is the recommended planned start time of the scheduled task under the influence of the upstream dependent tasks.

For example, suppose the upstream task on which a scheduled task depends finishes at 8:00 and the scheduled task's preset planned start time is 8:15. If the execution history shows that, because of the upstream task, the scheduled task actually starts at 8:25 on average, then the average planned time error is 10 minutes, and the recommended planned start time of the scheduled task can be adjusted to 8:25. If the planned start time were instead set to 0:00, the task would sit in the waiting-for-execution state from 0:00 and could only run after the upstream task finishes at 8:00, that is, it would wait at least 8 hours. If it were set to 18:00, the task would run at least 10 hours later than necessary.

The embodiment of the present application obtains the recommended planned start time by adding the average planned time error of the scheduled task over its historical executions to the preset planned start time. This takes into account the effect of the upstream dependent tasks on the actual execution time of the scheduled task, and the average planned time error is updated automatically as time goes on, so the recommended planned start time adjusts to actual conditions. This avoids the resource waste caused by setting the planned start time too late or too early, saving system resources and improving system utilization.
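The adjustment described above can be illustrated with a short sketch. It reads formula (6) as the preset planned start time plus the mean planned time error, which is how the surrounding paragraphs describe it; the function name and the sample history are hypothetical, and times are expressed as seconds since midnight for simplicity.

```python
from statistics import fmean


def recommended_start(pt, t_error_history):
    """Return the recommended planned start time PTrec, in seconds."""
    return pt + fmean(t_error_history)


# Planned start 8:15, with a hypothetical history averaging 10 minutes late.
pt = 8 * 3600 + 15 * 60
history = [480.0, 600.0, 720.0]
pt_rec = recommended_start(pt, history)  # 8:25 expressed in seconds
```

Recomputing `pt_rec` as new Terror samples arrive is what lets the recommendation follow changes in upstream behavior.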

Step 704: Store and display the monitoring thresholds and the recommended planned start time.

The results are stored in a MySQL database and displayed on the monitoring front-end page so that relevant personnel at the user layer can view and use them directly. When the number of tasks is small, relevant personnel can enter the execution thresholds of a scheduled task directly on the front end. When the number of tasks is large, the upper and lower limits of ST and ET are preferably set automatically from the monitoring thresholds generated above: STmin, STmax, ETmin, ETmax.

Step 705: Automatically set the planned start time of the task according to the recommended time.

When the number of tasks is small, developers and operation and maintenance personnel can manually set and adjust the planned start time of a scheduled task according to the displayed monitoring thresholds and recommended planned start time. When the number of tasks is very large, modifying PT by hand every day is labor-intensive and cannot respond flexibly to task execution conditions that may change or become abnormal at any time. The embodiment of the present application therefore calculates the recommended planned start time of each scheduled task automatically from formula (6) above and uses a program to change the PT of each scheduled task to PTrec in batches. This reduces manual maintenance costs while saving system resources and improving system utilization.

In an actual scenario, with reference to FIG. 2 and FIG. 5, an embodiment of the present application provides a scheduled task monitoring method for scheduled multi-dependent tasks. The method corresponds to the scheduled task monitoring module 3033 described above and can be executed on the full-process monitoring system for scheduled multi-dependent tasks described above. The details are as follows:

The pre-execution monitoring in FIG. 5 corresponds to step 202 in FIG. 2. A detection frequency is set, for example once every 10 minutes, to detect whether the scheduled task has started. If ST is not within the first monitoring threshold, that is, the range [STmin, STmax] of the first time period, a pre-execution abnormality alarm is raised. In particular, if the scheduled task has still not started after the time passes the upper monitoring limit STmax of the first time period, the task is starting too late, and a pre-execution abnormality alarm is sent so that relevant personnel can investigate promptly. In this way, an alarm is raised before the task executes, which makes further inspection easier for operation and maintenance personnel and developers.

The in-execution monitoring in FIG. 5 corresponds to step 205 in FIG. 2. A detection frequency is set, for example once every 10 minutes, to detect whether the scheduled task has finished. If ET is not within the second monitoring threshold, that is, the range [ETmin, ETmax] of the second time period, an in-execution abnormality alarm is raised. In particular, if the scheduled task has still not finished after the time passes the upper monitoring limit ETmax of the second time period, the task has been running too long, and an in-execution abnormality alarm is sent so that relevant personnel can investigate promptly. In this way, abnormal execution is detected while the task is running, and the task's execution duration is monitored as well. For example, in an offline data warehouse, a task that runs too fast may indicate a drop in data volume, while a task that runs too slowly may indicate an increase in data volume, a cluster failure, or similar factors. Monitoring the actual execution duration thus helps developers schedule tasks more efficiently, avoids running too many resource-intensive scheduled tasks at the same time, and improves the utilization of computing resources.
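Both the pre-execution and the in-execution checks compare a timestamp with a window, so they can be sketched with one helper that is called on every polling tick (for example every 10 minutes). This is an illustrative sketch only: the function name, the message wording, and the convention of passing `None` for an event that has not happened yet are assumptions, not part of the embodiment.

```python
def check_window(now, actual, lower, upper, phase):
    """Return an alert message, or None if nothing looks abnormal yet.

    now, actual, lower and upper are seconds on the same clock; actual is
    None while the event (task start or task finish) has not happened yet.
    """
    if actual is None:
        # Event still pending: only the upper limit can be violated so far.
        if now > upper:
            return f"{phase} alert: still pending after upper limit"
        return None
    if actual < lower:
        return f"{phase} alert: happened before lower limit"
    if actual > upper:
        return f"{phase} alert: happened after upper limit"
    return None
```

For pre-execution monitoring the call would use the actual start time with STmin and STmax; for in-execution monitoring it would use the actual end time with ETmin and ETmax.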

The post-execution monitoring in FIG. 5 corresponds to step 208 in FIG. 2. After the scheduled task is executed, as shown in FIG. 9, the execution result is generated and stored in a database (such as Hive, HBase, MySQL, or MongoDB). After the execution result data is obtained from the data storage layer, it is statistically analyzed to derive data verification indicators (such as whether data exists, whether an index exists, the data fill rate, and the data fluctuation rate). A verification score is calculated from these indicators, and whether to raise an alarm is determined from the verification score and a score threshold. This completes the data verification and thus the post-execution monitoring of the scheduled task.

Verification is divided into strong verification and weak verification, so the verification score calculated from the data verification indicators includes strong verification indicator scores and weak verification indicator scores. A strong verification indicator score covers cases of absolute abnormality that trigger an alarm directly. For example, after a feature processing task is executed, the system checks whether the day's mongo table contains data and whether the mongo table has an index on a specific field; the result returned is 0 or 1. If the result shows that there is no data and/or no index on the specific field, the scheduled task has executed abnormally and an alarm is raised directly. A weak verification indicator score covers abnormality within a certain range: indicators such as the fill rate, the data fluctuation rate, and the proportion of outliers have no clear-cut range standard, and the result returned is a value between 0 and 1.

After the strong and weak verification indicator scores of the execution result are obtained, the verification score of the execution result is calculated according to formula (3) above. If any strong verification indicator score is 0, an alarm is raised directly. In addition, a score threshold for the verification score, that is, the preset first threshold, can be set in advance. When all strong verification indicator scores are 1, post-execution monitoring is performed against this user-defined score threshold, and a post-execution abnormality alarm is raised when the verification score is less than the score threshold.
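The alarm decision can be sketched as below. Formula (3) is not reproduced in this passage, so the plain mean of the weak scores is used here purely as a stand-in for it; the function name, the stand-in scoring, and the treatment of an empty weak-score list are all assumptions made for illustration.

```python
from statistics import fmean


def post_execution_alert(strong_checks, weak_scores, score_threshold):
    """Return True if a post-execution abnormality alarm should be raised."""
    # Any strong check that returns 0 triggers an alarm immediately.
    if any(flag == 0 for flag in strong_checks):
        return True
    # Stand-in for formula (3), which is not shown here: the mean of the
    # weak scores, each in [0, 1]. With no weak scores, assume a perfect 1.0.
    score = fmean(weak_scores) if weak_scores else 1.0
    return score < score_threshold
```

The early return keeps the strong checks independent of the threshold, so a missing table or index always alarms regardless of how the weak indicators score.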

The scheduled task monitoring of the present application monitors the actual execution time, execution result, and execution status of a scheduled task in real time while the task is being executed. When an abnormality such as late execution or execution failure occurs, the alarm system promptly notifies the operation and maintenance personnel, who can then resolve the online business or network problem in time and avoid more serious losses.

By monitoring and tracking scheduled tasks in real time, the scheduled task monitoring of the present application promptly detects abnormalities such as execution failure, delay, or early execution, and raises alarms early so that operation and maintenance personnel can take corrective action. This reduces task execution errors, ensures that tasks run on time and accurately, lowers potential risks such as business losses caused by abnormal task execution, and ensures business reliability and accuracy.

To implement the method for managing scheduled tasks in the embodiments of the present application, an embodiment of the present application further provides an apparatus for managing scheduled tasks. As shown in FIG. 10, the apparatus 1000 for managing scheduled tasks includes an acquisition unit 1001 and a processing unit 1002, wherein:

the processing unit 1002 is configured to monitor a scheduled task among multi-dependent tasks based on a first time period before the scheduled task is executed, the first time period indicating the time period to which the actual start time of the scheduled task should belong;

the processing unit 1002 is configured to monitor the scheduled task based on a second time period while the scheduled task is being executed, the second time period indicating the time period to which the actual completion time of the scheduled task should belong;

the processing unit 1002 is configured to verify the execution result of the scheduled task after the scheduled task is executed.

Wherein:

In one embodiment, the acquisition unit 1001 is specifically configured to acquire a first time difference of the scheduled task, the first time difference indicating the historical actual start time of the scheduled task minus the historical planned start time.

In one embodiment, the acquisition unit 1001 is specifically configured to obtain, from the first time difference, a first normal distribution result of a plurality of first time differences corresponding to the scheduled task at a plurality of historical times.

In one embodiment, the processing unit 1002 is specifically configured to determine the first time period according to a first parameter and a second parameter in the first normal distribution result, the first parameter indicating the average value of the plurality of first time differences and the second parameter indicating the standard deviation of the plurality of first time differences.

In one embodiment, the acquisition unit 1001 is specifically configured to acquire a second time difference of the scheduled task, the second time difference indicating the historical actual completion time of the scheduled task minus the historical actual start time.

In one embodiment, the acquisition unit 1001 is specifically configured to obtain, from the second time difference, a second normal distribution result of a plurality of second time differences corresponding to the scheduled task at a plurality of historical times.

In one embodiment, the processing unit 1002 is specifically configured to determine the second time period according to a third parameter and a fourth parameter in the second normal distribution result, the third parameter indicating the average value of the plurality of second time differences and the fourth parameter indicating the standard deviation of the plurality of second time differences.

In one embodiment, the acquisition unit 1001 is specifically configured to acquire the actual start time of the scheduled task.

In one embodiment, the processing unit 1002 is specifically configured to monitor whether the actual start time is within the first time period to obtain a first monitoring result.

In one embodiment, the processing unit 1002 is specifically configured to output first alarm information when the first monitoring result indicates that the actual start time is not within the first time period, the first alarm information being used to raise an abnormality alarm before the scheduled task is executed.

In one embodiment, the acquisition unit 1001 is specifically configured to acquire the actual completion time of the scheduled task.

In one embodiment, the processing unit 1002 is specifically configured to monitor whether the actual completion time is within the second time period to obtain a second monitoring result.

In one embodiment, the processing unit 1002 is specifically configured to output second alarm information when the second monitoring result indicates that the actual completion time is not within the second time period, the second alarm information being used to raise an abnormality alarm while the scheduled task is being executed.

In one embodiment, the processing unit 1002 is specifically configured to verify the execution result of the scheduled task after the scheduled task is executed, including: performing a first verification operation on the execution result of the scheduled task to obtain a first indicator result; performing a second verification operation on the execution result of the scheduled task to obtain a second indicator result; determining a verification score of the execution result according to the first indicator result and the second indicator result; and determining whether the verification score is less than a first threshold to obtain a third monitoring result.

In one embodiment, the processing unit 1002 is specifically configured to output third alarm information when the third monitoring result indicates that the verification score is less than the first threshold, the third alarm information being used to raise an abnormality alarm after the scheduled task is executed.

In one embodiment, the acquisition unit 1001 is specifically configured to acquire the planned start time of the scheduled task before the scheduled task among the multi-dependent tasks is executed.

In one embodiment, the processing unit 1002 is specifically configured to update the planned start time based on the first time difference before the scheduled task among the multi-dependent tasks is executed.

For steps and content in this embodiment that are the same as in other embodiments, refer to the descriptions in those embodiments; they are not repeated here.

To implement the method for managing scheduled tasks in the embodiments of the present application, an embodiment of the present application further provides a device for managing scheduled tasks. As shown in FIG. 11, the device 1100 for managing scheduled tasks includes a processor 1101, a memory 1102, and a communication bus 1103, wherein:

the processor 1101 is configured to execute, when running a computer program, the method provided by one or more of the above technical solutions;

the memory 1102 stores a computer program that can run on the processor 1101;

the communication bus 1103 is used to implement the communication connection between the processor 1101 and the memory 1102.

It should be noted that the specific processing performed by the processor 1101 can be understood with reference to the method described above and is not repeated here.

In actual application, the components of the device 1100 for managing scheduled tasks are coupled together through the communication bus 1103. It can be understood that the communication bus 1103 implements the connection and communication between these components. In addition to a data bus, the communication bus 1103 includes a power bus, a control bus, and a status signal bus. For clarity, however, all of these buses are labeled as the communication bus 1103 in FIG. 8.

The memory 1102 in the embodiment of the present application is used to store various types of data to support the operation of the device 1100 for managing scheduled tasks. Examples of such data include any computer program to be run on the device 1100 for managing scheduled tasks.

The method disclosed in the above embodiments of the present application can be applied to, or implemented by, the processor 1101. The processor 1101 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above method can be completed by an integrated logic circuit of hardware in the processor 1101 or by instructions in the form of software. The processor 1101 may be a general-purpose processor, a DSP, another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 1101 can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium in the memory 1102; the processor 1101 reads the information in the memory 1102 and completes the steps of the foregoing method in combination with its hardware.

In an exemplary embodiment, the device 1100 for managing scheduled tasks may be implemented by one or more application-specific integrated circuits (Application Specific Integrated Circuit, ASIC), DSPs, programmable logic devices (Programmable Logic Device, PLD), complex programmable logic devices (Complex Programmable Logic Device, CPLD), field-programmable gate arrays (Field-Programmable Gate Array, FPGA), general-purpose processors, controllers, microcontrollers (Micro Controller Unit, MCU), microprocessors (Microprocessor), or other electronic components, for executing the foregoing method.

It can be understood that the memory of the embodiments of the present application (memory 1102) may be a volatile memory or a non-volatile memory, and may also include both. The non-volatile memory may be a read-only memory (Read Only Memory, ROM), a programmable read-only memory (Programmable Read-Only Memory, PROM), an erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), an electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), a ferromagnetic random access memory (ferromagnetic random access memory, FRAM), a flash memory (Flash Memory), a magnetic surface memory, an optical disc, or a compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM); the magnetic surface memory may be a magnetic disk memory or a magnetic tape memory. The volatile memory may be a random access memory (Random Access Memory, RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (Static Random Access Memory, SRAM), synchronous static random access memory (Synchronous Static Random Access Memory, SSRAM), dynamic random access memory (Dynamic Random Access Memory, DRAM), synchronous dynamic random access memory (Synchronous Dynamic Random Access Memory, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate Synchronous Dynamic Random Access Memory, DDRSDRAM), enhanced synchronous dynamic random access memory (Enhanced Synchronous Dynamic Random Access Memory, ESDRAM), sync link dynamic random access memory (Sync Link Dynamic Random Access Memory, SLDRAM), and direct Rambus random access memory (Direct Rambus Random Access Memory, DRRAM). The memory described in the embodiments of the present application is intended to include, without being limited to, these and any other suitable types of memory.

In an exemplary embodiment, an embodiment of the present application further provides a storage medium, namely a computer storage medium, specifically a computer-readable storage medium, for example the memory 1102 storing a computer program. The computer program in the memory 1102 can be executed by the processor 1101 of the device 1100 for managing scheduled tasks to complete the steps of the foregoing method. The computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disc, or CD-ROM.

In an exemplary embodiment, an embodiment of the present application further provides a computer program product including a computer program. The computer program can be executed by the processor 1101 of the device 1100 for managing scheduled tasks to complete the steps of the foregoing method.

It should be noted that "first", "second", and the like are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.

In addition, the technical solutions described in the embodiments of the present application may be combined arbitrarily provided that they do not conflict.

The above are only preferred embodiments of the present application and are not intended to limit the scope of protection of the present application.

Claims (14)

1. A method for managing scheduled tasks, characterized in that the method comprises:
before a scheduled task among multi-dependent tasks is executed, monitoring the scheduled task based on a first time period, the first time period indicating the time period within which the actual start time of the scheduled task should fall;
during execution of the scheduled task, monitoring the scheduled task based on a second time period, the second time period indicating the time period within which the actual completion time of the scheduled task should fall;
after the scheduled task is executed, verifying the execution result of the scheduled task.

2. The method according to claim 1, characterized in that the method further comprises:
acquiring a first time difference of the scheduled task, the first time difference indicating the historical actual start time of the scheduled task minus the historical planned start time;
obtaining, according to the first time difference, a first normal distribution result of a plurality of first time differences corresponding to the scheduled task at a plurality of historical times;
determining the first time period according to a first parameter and a second parameter in the first normal distribution result, the first parameter indicating the mean of the plurality of first time differences, and the second parameter indicating the standard deviation of the plurality of first time differences.

3. The method according to claim 1, characterized in that the method further comprises:
acquiring a second time difference of the scheduled task, the second time difference indicating the historical actual completion time of the scheduled task minus the historical actual start time;
obtaining, according to the second time difference, a second normal distribution result of a plurality of second time differences corresponding to the scheduled task at a plurality of historical times;
determining the second time period according to a third parameter and a fourth parameter in the second normal distribution result, the third parameter indicating the mean of the plurality of second time differences, and the fourth parameter indicating the standard deviation of the plurality of second time differences.

4. The method according to claim 1, characterized in that monitoring the scheduled task based on the first time period before the scheduled task among the multi-dependent tasks is executed comprises:
acquiring the actual start time of the scheduled task;
monitoring whether the actual start time is within the first time period to obtain a first monitoring result.

5. The method according to claim 4, characterized in that the method further comprises:
when the first monitoring result indicates that the actual start time is not within the first time period, outputting first alarm information, the first alarm information being used to raise an abnormality alarm for the stage before the scheduled task is executed.

6. The method according to claim 1, characterized in that monitoring the scheduled task based on the second time period during execution of the scheduled task comprises:
acquiring the actual completion time of the scheduled task;
monitoring whether the actual completion time is within the second time period to obtain a second monitoring result.

7. The method according to claim 6, characterized in that the method further comprises:
when the second monitoring result indicates that the actual completion time is not within the second time period, outputting second alarm information, the second alarm information being used to raise an abnormality alarm for the stage during execution of the scheduled task.

8. The method according to claim 1, characterized in that verifying the execution result of the scheduled task after the scheduled task is executed comprises:
performing a first verification operation on the execution result of the scheduled task to obtain a first indicator result;
performing a second verification operation on the execution result of the scheduled task to obtain a second indicator result;
determining a verification score of the execution result according to the first indicator result and the second indicator result;
determining whether the verification score is less than a first threshold to obtain a third monitoring result.

9. The method according to claim 8, characterized in that the method further comprises:
when the third monitoring result indicates that the verification score is less than the first threshold, outputting third alarm information, the third alarm information being used to raise an abnormality alarm for the stage after the scheduled task is executed.

10. The method according to claim 2, characterized in that, before the scheduled task among the multi-dependent tasks is executed, the method further comprises:
acquiring the planned start time of the scheduled task;
configuring and updating the planned start time based on the first time difference.

11. A scheduled task management device, characterized in that the scheduled task management device comprises:
a processing unit, configured to monitor a scheduled task among multi-dependent tasks based on a first time period before the scheduled task is executed, the first time period indicating the time period within which the actual start time of the scheduled task should fall;
the processing unit being further configured to monitor the scheduled task based on a second time period during execution of the scheduled task, the second time period indicating the time period within which the actual completion time of the scheduled task should fall;
the processing unit being further configured to verify the execution result of the scheduled task after the scheduled task is executed.

12. A scheduled task management equipment, characterized by comprising: a processor and a memory for storing a computer program that can run on the processor; wherein
the processor is configured to execute, when running the computer program, the steps of the method for managing scheduled tasks according to any one of claims 1 to 10.

13. A storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, the steps of the method for managing scheduled tasks according to any one of claims 1 to 10 are implemented.

14. A computer program product, comprising a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 10 are implemented.
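The three monitoring stages recited in claims 1 to 9 can be illustrated with a short Python sketch. It is not the applicant's implementation. The claims state only that each time period is determined from the mean and standard deviation of the historical differences, and that the verification score is determined from two indicator results. The interval mean ± k·σ (with k = 3), the weighted-average score, the 0.8 threshold, and every name below (`Window`, `window_from_history`, `verification_score`, `monitor`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, pstdev


@dataclass(frozen=True)
class Window:
    """Closed time interval [earliest, latest] (hypothetical helper type)."""
    earliest: datetime
    latest: datetime

    def contains(self, moment: datetime) -> bool:
        return self.earliest <= moment <= self.latest


def window_from_history(anchor: datetime, diffs_seconds: list[float],
                        k: float = 3.0) -> Window:
    """Return anchor + [mu - k*sigma, mu + k*sigma].

    mu and sigma are the mean and standard deviation of the historical
    time differences (claims 2 and 3). The k-sigma rule is an assumption.
    """
    mu = mean(diffs_seconds)
    sigma = pstdev(diffs_seconds)
    return Window(anchor + timedelta(seconds=mu - k * sigma),
                  anchor + timedelta(seconds=mu + k * sigma))


def verification_score(first_indicator: float, second_indicator: float,
                       w1: float = 0.5, w2: float = 0.5) -> float:
    """Combine two indicator results into one score (assumed weighted average)."""
    return w1 * first_indicator + w2 * second_indicator


def monitor(planned_start: datetime, actual_start: datetime,
            actual_end: datetime, start_delays: list[float],
            durations: list[float], first_indicator: float,
            second_indicator: float, threshold: float = 0.8) -> list[str]:
    """Run the three checks and return the alarms that were raised."""
    alarms = []
    # Before execution: the actual start must fall in the first time period,
    # built from historical (actual start - planned start) differences.
    first_period = window_from_history(planned_start, start_delays)
    if not first_period.contains(actual_start):
        alarms.append("first alarm: start time outside first time period")
    # During execution: the actual completion must fall in the second time
    # period, built from historical (completion - actual start) differences.
    second_period = window_from_history(actual_start, durations)
    if not second_period.contains(actual_end):
        alarms.append("second alarm: completion time outside second time period")
    # After execution: the verification score must not be below the threshold.
    if verification_score(first_indicator, second_indicator) < threshold:
        alarms.append("third alarm: verification score below first threshold")
    return alarms
```

With historical start delays of 50, 60 and 70 seconds, a task that starts 60 seconds after its planned start falls inside the assumed first time period, while one that starts 10 minutes late triggers the first alarm. A wider or narrower k trades missed anomalies against false alarms.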
CN202410140757.4A | 2024-01-31 | 2024-01-31 | A method, device, equipment, storage medium and product for managing scheduled tasks | Pending | CN118796597A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202410140757.4A / CN118796597A (en) | 2024-01-31 | 2024-01-31 | A method, device, equipment, storage medium and product for managing scheduled tasks

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202410140757.4A / CN118796597A (en) | 2024-01-31 | 2024-01-31 | A method, device, equipment, storage medium and product for managing scheduled tasks

Publications (1)

Publication Number | Publication Date
CN118796597A (en) | 2024-10-18

Family

ID=93026615

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202410140757.4A (Pending, CN118796597A (en)) | A method, device, equipment, storage medium and product for managing scheduled tasks | 2024-01-31 | 2024-01-31

Country Status (1)

Country | Link
CN (1) | CN118796597A (en)

Similar Documents

Publication | Publication Date | Title
CN110287052B (en) | | Root cause task determination method and device for abnormal task
US8141053B2 (en) | | Call stack sampling using a virtual machine
CN113946499B (en) | | A microservice link tracking and performance analysis method, system, device and application
US20100017583A1 (en) | | Call Stack Sampling for a Multi-Processor System
US9471459B2 (en) | | Information acquisition method and information acquisition apparatus
CN104933618A (en) | | Method and device for monitoring batch operation data of core banking system
CN110727556A (en) | | BMC health state monitoring method, system, terminal and storage medium
WO2019214010A1 (en) | | Method and device for monitoring for equipment failure
CN112749013A (en) | | Thread load detection method and device, electronic equipment and storage medium
US8489938B2 (en) | | Diagnostic data capture in a computing environment
CN112306567A (en) | | Cluster management system and container management and control method
CN112132544A (en) | | Inspection method and device of business system
CN115499302A (en) | | Monitoring method and device of business system, readable storage medium and electronic equipment
CN113064765B (en) | | Node exception handling method, device, electronic equipment and machine-readable storage medium
CN118034887A (en) | | Big data platform task management method and system
CN118796597A (en) | | A method, device, equipment, storage medium and product for managing scheduled tasks
CN118113508A (en) | | Network card fault risk prediction method, device, equipment and medium
CN105446289B (en) | | By the method and system of the timestamp of manufacturing execution system collection work state
CN118409751A (en) | | Automatic analysis method, system, device and equipment for AI (advanced technology attachment) acceleration card calculation errors
CN111414295A (en) | | CPU occupancy rate statistical method, device, equipment and medium
CN117407245A (en) | | Model training task anomaly detection method and system, electronic equipment and storage medium
CN110502404B (en) | | Early warning processing method based on data management platform and related equipment
CN115437961A (en) | | Data processing method and device, electronic equipment and storage medium
CN115718658A (en) | | A time efficiency optimization method and device
CN117290113B (en) | | Task processing method, device, system and storage medium

Legal Events

Date | Code | Title | Description
 | PB01 | Publication |
 | SE01 | Entry into force of request for substantive examination |
