CN120196421A - A method and device for GPU resource virtualization computing power scheduling - Google Patents

A method and device for GPU resource virtualization computing power scheduling

Info

Publication number
CN120196421A
CN120196421A
Authority
CN
China
Prior art keywords
task
gpu
resource
determining
scheduling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510678046.7A
Other languages
Chinese (zh)
Inventor
肖滨
林雄辉
王凝华
臧洋洲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ruishi Data Technology Shenzhen Co ltd
Original Assignee
Ruishi Data Technology Shenzhen Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ruishi Data Technology Shenzhen Co ltd
Priority to CN202510678046.7A
Publication of CN120196421A
Legal status: Pending

Abstract

Translated from Chinese


The present invention provides a method for scheduling GPU resource virtualization computing power, including the steps of: obtaining a task and determining the computing demand characteristics and task attributes of the task; determining the resource requirements according to the computing demand characteristics and task attributes; determining the task priority according to the task attributes; and determining the scheduling result of the GPU resources according to the task priority and the resource requirements. Allocating GPU resources according to specific task attributes keeps GPU operation efficient, balances the allocation of computing power, improves GPU utilization, balances the task load, and completes tasks more efficiently.

Description

GPU resource virtualization computing power scheduling method and device
Technical Field
The invention relates mainly to the technical field of data processing, and in particular to a method for scheduling GPU resource virtualization computing power.
Background
In fields such as artificial intelligence, deep learning, and high-performance computing, the GPU (graphics processing unit) has become the central computational workhorse. Whether training a large-scale neural network model or performing real-time inference, the GPU's highly concurrent computing capacity and floating-point efficiency play a key role. However, as model computing demands grow, per-GPU utilization and resource scheduling have become key bottlenecks limiting overall computing power efficiency. In deep learning training or inference tasks, GPU resources often sit idle or are used inefficiently: some tasks cannot fully utilize the GPU's computing power, wasting compute resources. How to schedule GPU resources effectively in a multi-user, multi-task environment while avoiding resource contention and unbalanced allocation remains an open problem. Some tasks waste resources through over-allocation, while others face a shortage. As the task mix changes, the GPU load becomes unstable: the computing demands of certain tasks spike momentarily, which can overload the system, while the GPU sits idle at other times. In a large-scale computing cluster, efficiently scheduling GPUs across different nodes, especially cross-node resource scheduling, faces great technical challenges, including network delay and data-transmission bottlenecks.
Disclosure of Invention
In view of the foregoing, the present application has been developed to provide a method and apparatus for GPU resource virtualization computing power scheduling that overcomes, or at least partially solves, the above problems, including:
a method for GPU resource virtualization computing power scheduling comprises the following steps:
acquiring a task and determining the calculation demand characteristics and task attributes of the task;
Determining resource requirements according to the computing requirement characteristics and task attributes;
Determining task priority according to the task attribute;
and determining a scheduling result of the GPU resources according to the task priority and the resource demand.
Further, the method further comprises the following steps:
monitoring the GPU resources;
And when the GPU resource load is unbalanced, starting cross-node task scheduling.
Further, the step of monitoring the GPU resource includes:
Acquiring the use condition and task execution state of the GPU;
determining the idle state of the GPU according to the use condition;
And updating the scheduling result according to the task execution state and the idle state.
Further, the step of determining the resource requirement according to the computing requirement characteristic and the task attribute includes:
Determining the calculated amount and the memory requirement according to the calculated requirement characteristics;
determining a task type according to the task attribute;
And determining the resource requirement according to the calculated amount, the memory requirement and the task type.
Further, the step of determining the task priority according to the task attribute includes:
determining task types according to the task attributes, wherein the task types comprise a real-time reasoning task, a model training task and a data preprocessing task;
and determining the task priority according to the task type.
Further, the step of scheduling GPU resources according to the task priority and the resource requirement includes:
dividing a physical GPU into a plurality of virtual instances, and establishing a virtualized GPU environment;
Determining a computing power allocation strategy of the virtual instance according to the resource requirement;
and determining a GPU dispatching result according to the task priority and the computing power distribution strategy.
Further, the method further comprises the following steps:
predicting GPU resources required by the task according to the historical resource demand data;
and pre-distributing the GPU resources according to the predicted result.
A device for GPU resource virtualized power scheduling, the device for GPU resource virtualized power scheduling implementing the steps of the method for GPU resource virtualized power scheduling according to any of the above claims, comprising:
the acquisition module is used for acquiring the task and determining the calculation demand characteristics and the task attributes of the task;
The resource demand module is used for determining resource demands according to the calculation demand characteristics and the task attributes;
the priority module is used for determining task priority according to the task attribute;
And the scheduling module is used for determining a scheduling result of the GPU resources according to the task priority and the resource demand.
An electronic device comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, which, when executed by the processor, implements the steps of the method for GPU resource virtualization computing power scheduling of any of the above.
A computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of a method of GPU resource virtualization power scheduling as described in any of the preceding claims.
The application has the following advantages:
Aiming at the defects that GPU resources cannot be distributed efficiently, calculation resources are wasted and GPU loads are unstable in the prior art, the application provides a method for dispatching GPU resource virtualization computing power, which comprises the steps of obtaining tasks and determining the calculation demand characteristics and task attributes of the tasks; determining resource requirements according to the calculation requirement characteristics and task attributes, determining task priorities according to the task attributes, and determining scheduling results of the GPU resources according to the task priorities and the resource requirements. The GPU resources of the task are distributed through specific task attributes, so that the high efficiency of GPU operation is guaranteed, the computing power resources are distributed in a balanced mode, the utilization rate of the GPU is improved, the balance of task loads is guaranteed, and the task is completed more efficiently.
Drawings
For a clearer description of the technical solutions of the present application, the drawings needed in the description are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained from them without inventive effort by a person skilled in the art.
FIG. 1 is a flowchart illustrating a method for virtualized power scheduling of GPU resources according to an embodiment of the present application;
FIG. 2 is a schematic block diagram of an apparatus for virtualized power scheduling of GPU resources according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed Description
In order that the manner in which the above recited objects, features and advantages of the present application are obtained will become more readily apparent, a more particular description of the application briefly described above will be rendered by reference to the appended drawings. It will be apparent that the described embodiments are some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The inventor has discovered through analysis of the prior art that GPU resources often sit idle or are used inefficiently in deep learning training or inference tasks. Some tasks cannot fully utilize the GPU's computing power, wasting compute resources. How to schedule GPU resources effectively in a multi-user, multi-task environment while avoiding resource contention and unbalanced allocation remains an open problem. Some tasks waste resources through over-allocation, while others face a shortage. As the task mix changes, the GPU load becomes unstable: the computing demands of certain tasks spike momentarily, which can overload the system, while the GPU sits idle at other times. In a large-scale computing cluster, efficiently scheduling GPUs across different nodes, especially cross-node resource scheduling, faces great technical challenges, including network delay and data-transmission bottlenecks.
It should be noted that, in any embodiment of the present invention, the GPU (Graphics Processing Unit) is a specialized graphics processor: a 3D display chip that concentrates three-dimensional image and special-effect processing functions on the chip itself, supplying the computing power behind the so-called "hardware acceleration" function.
Referring to fig. 1, a method for GPU resource virtualization computing power scheduling according to an embodiment of the present application is shown;
S110, acquiring a task and determining the calculation demand characteristics and task attributes of the task;
s120, determining resource requirements according to the calculation requirement characteristics and the task attributes;
s130, determining task priority according to the task attribute;
and S140, determining a scheduling result of the GPU resources according to the task priority and the resource demand.
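Steps S110 through S140 can be sketched as a minimal Python pipeline. All class names, priority values, and headroom factors below are illustrative assumptions for exposition, not the patent's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    task_type: str   # "inference", "training", or "preprocessing"
    flops: float     # estimated compute demand (GFLOPs) — assumed unit
    memory_mb: int   # estimated GPU memory footprint

# S120: derive a resource requirement from the demand characteristics.
def resource_requirement(task: Task) -> dict:
    # Training tasks get extra headroom because their demand spikes (assumed).
    headroom = 1.5 if task.task_type == "training" else 1.1
    return {"flops": task.flops * headroom,
            "memory_mb": int(task.memory_mb * headroom)}

# S130: map task type to a priority (lower value = scheduled first).
PRIORITY = {"inference": 0, "training": 1, "preprocessing": 2}

# S140: order tasks by priority, then by descending compute demand.
def schedule(tasks: list) -> list:
    return sorted(tasks, key=lambda t: (PRIORITY[t.task_type],
                                        -resource_requirement(t)["flops"]))

jobs = [Task("prep", "preprocessing", 10, 512),
        Task("infer", "inference", 5, 1024),
        Task("train", "training", 500, 8192)]
print([t.name for t in schedule(jobs)])  # inference first, preprocessing last
```

A real scheduler would feed the ordered list to a dispatcher; this sketch only shows how demand characteristics and attributes combine into a scheduling order.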
Aiming at the defects that GPU resources cannot be distributed efficiently, calculation resources are wasted and GPU loads are unstable in the prior art, the application provides a method for dispatching GPU resource virtualization computing power, which comprises the steps of obtaining tasks and determining the calculation demand characteristics and task attributes of the tasks; determining resource requirements according to the calculation requirement characteristics and task attributes, determining task priorities according to the task attributes, and determining scheduling results of the GPU resources according to the task priorities and the resource requirements. The GPU resources of the task are distributed through specific task attributes, so that the high efficiency of GPU operation is guaranteed, the computing power resources are distributed in a balanced mode, and the task is completed more efficiently.
Next, a method for GPU resource virtualization power scheduling in the present exemplary embodiment will be further described.
In one embodiment of the present invention, the method further comprises monitoring the GPU resource;
And when the GPU resource load is unbalanced, starting cross-node task scheduling.
It should be noted that a monitoring function is added for the GPU resources. Through real-time monitoring of GPU load and task progress, an intelligent scheduler can migrate or redistribute tasks among multiple GPU nodes and schedule GPU resources according to the real-time situation, ensuring that tasks complete efficiently and that no node exceeds its load threshold. When the GPU load of a node is too high, the scheduler automatically migrates part of its tasks to nodes with lower load, achieving balanced resource allocation in the global scope.
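The cross-node migration described above can be illustrated with a toy load balancer; the 0.85 overload threshold and node names are assumptions made for the example, not values from the patent:

```python
# Each node reports its utilization in [0, 1] (assumed interface).
OVERLOAD = 0.85

def rebalance(nodes: dict) -> list:
    """Return (from_node, to_node) migration pairs for overloaded nodes."""
    migrations = []
    idle = sorted(nodes, key=nodes.get)  # least-loaded node first
    for node, load in sorted(nodes.items(), key=lambda kv: -kv[1]):
        if load > OVERLOAD:
            target = idle[0]
            # Only migrate if the target itself has spare capacity.
            if nodes[target] < OVERLOAD:
                migrations.append((node, target))
    return migrations

print(rebalance({"node-a": 0.95, "node-b": 0.30, "node-c": 0.60}))
# the overloaded node-a offloads to the least-loaded node-b
```

A production scheduler would also weigh migration cost (network delay, data transfer), which this sketch omits.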
In one embodiment of the present invention, the specific process of step "monitor the GPU resources" may be further described in conjunction with the following description.
First, the usage and task execution state of the GPU are obtained.
Next, the idle state of the GPU is determined from the usage.
Finally, the scheduling result is updated according to the task execution state and the idle state.
It is noted that, if the load is balanced, the tasks continue to execute. If the load is unbalanced, cross-node task scheduling must be started and data transmission optimized (for example, by using an efficient network protocol); idle GPU resources are then scheduled to tasks whose load exceeds, or is about to reach, the threshold. This balances the scheduling, improves task efficiency, and keeps GPU scheduling even.
As described above in step S110, a task is acquired and the computing demand characteristics and task attributes of the task are determined.
It should be noted that the computing demand characteristics include the computation amount, memory requirement, special requirements, and so on needed to execute the task, while the task attributes are task information such as the task type, task number, and task source. After the computing demand characteristics of a task are obtained, GPU resources can be dynamically pre-allocated according to the characteristics of different tasks (such as deep learning training, inference, and data preprocessing). Before the task starts, the GPU resources it requires are predicted from the task scale and historical resource demand data, and a proper number of GPU instances are allocated, so that the task has enough computing power while over-allocation is avoided.
As described in the above step S120, the resource requirement is determined according to the computing requirement characteristic and the task attribute.
In one embodiment of the present invention, the specific process of "determining resource requirements according to the computing requirement characteristics and task attributes" described in step S120 may be further described in conjunction with the following description.
First, the computation amount and memory requirement are determined from the computing demand characteristics.
Next, the task type is determined from the task attributes.
Finally, the resource requirement is determined from the computation amount, the memory requirement, and the task type.
The computing demand characteristics are analyzed to determine the computation amount and memory demand of the task, and the resources the task requires are estimated, so that computing resources can later be allocated to the corresponding task in a targeted manner and the GPU resources fully utilized.
As described in step S130, the task priority is determined according to the task attribute.
In one embodiment of the present invention, the specific process of determining task priority according to the task attribute in step S130 may be further described in conjunction with the following description.
Determining task types according to the task attributes, wherein the task types comprise a real-time reasoning task, a model training task and a data preprocessing task;
and the task priority is then determined according to the task type.
It should be noted that, by designing a set of GPU scheduling algorithms based on task priority, tasks are scheduled at different levels according to their importance, urgency, and resource requirements. For example, real-time inference tasks have the highest priority, model training has a medium priority, and auxiliary tasks such as data preprocessing have the lowest priority. This scheduling mechanism can reasonably balance the overall load of the system.
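The priority-based dispatch described above can be sketched with a standard heap-based priority queue; the numeric priority values are illustrative, and only their ordering (inference before training before preprocessing) follows the text:

```python
import heapq

# Smaller number = more urgent (assumed encoding).
PRIORITY = {"inference": 0, "training": 1, "preprocessing": 2}

queue = []
for seq, (name, kind) in enumerate([("prep-1", "preprocessing"),
                                    ("train-1", "training"),
                                    ("infer-1", "inference")]):
    # seq breaks ties FIFO among tasks of equal priority.
    heapq.heappush(queue, (PRIORITY[kind], seq, name))

order = [heapq.heappop(queue)[2] for _ in range(len(queue))]
print(order)  # → ['infer-1', 'train-1', 'prep-1']
```

Tasks are popped in priority order regardless of submission order, which is exactly the behavior the scheduling mechanism above relies on.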
As described in step S140, the scheduling result of the GPU resource is determined according to the task priority and the resource requirement.
In one embodiment of the present invention, the following description may be further described with reference to step S140, "determining the scheduling result of the GPU resource according to the task priority and the resource requirement".
First, a physical GPU is divided into multiple virtual instances to establish a virtualized GPU environment.
Next, the computing power allocation strategy of each virtual instance is determined from the resource requirement.
Finally, the GPU scheduling result is determined from the task priority and the computing power allocation strategy.
Through GPU virtualization technology, a physical GPU is divided into multiple virtual instances so that several tasks can execute in parallel on the same GPU. When task demands are small, multiple small tasks can be concentrated on one GPU, making full use of its computing power.
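Packing several small tasks onto one virtualized GPU, as described above, resembles a bin-packing problem. A minimal first-fit sketch follows; the 4096 MB slice size and the task sizes are invented for illustration:

```python
# First-fit packing of small tasks onto virtual GPU slices.
def pack(tasks: list, slice_mb: int = 4096) -> list:
    """tasks: (name, memory_mb) pairs; returns task names grouped per slice."""
    slices = []  # each entry: (free_mb, [task names])
    for name, mem in tasks:
        for i, (free, names) in enumerate(slices):
            if mem <= free:                       # fits in an existing slice
                slices[i] = (free - mem, names + [name])
                break
        else:                                     # open a new virtual slice
            slices.append((slice_mb - mem, [name]))
    return [names for _, names in slices]

print(pack([("a", 1024), ("b", 2048), ("c", 1024), ("d", 3072)]))
# a, b, c share one 4096 MB slice; d needs its own
```

Real GPU partitioning (e.g., hardware instance splitting) has fixed slice geometries; this sketch only conveys the consolidation idea.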
In one embodiment of the invention, the method further comprises the steps of predicting GPU resources required by the task according to the historical resource demand data;
and pre-distributing the GPU resources according to the predicted result.
Before a task starts, the GPU resources it requires can be predicted from the task scale and historical resource demand data, and a proper number of GPU instances allocated, so that the task has enough computing power while over-allocation is avoided. GPU resources are dynamically pre-allocated according to the characteristics of different tasks (e.g., deep learning training, inference, data preprocessing). The history records are analyzed to obtain predicted data for the task, and GPU resources are pre-allocated accordingly; subsequent allocation then only needs minor adjustment rather than computation from scratch. Load balancing and dynamic scheduling cooperate with each other to achieve balanced resource allocation in the global scope, guaranteeing balanced allocation and improving allocation efficiency.
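The history-based prediction and pre-allocation described above might look like the following sketch; the moving-average window and 20% headroom factor are assumptions, not values from the patent:

```python
# Predict the next run's GPU memory need from historical usage via a
# simple moving average, with headroom to avoid under-allocation.
def predict_mb(history: list, window: int = 3, headroom: float = 1.2) -> int:
    recent = history[-window:]                    # most recent runs
    return int(sum(recent) / len(recent) * headroom)

# Pre-allocate based on the prediction rather than the worst case.
history = [4000, 4200, 4100, 4300]                # MB used in past runs
print(predict_mb(history))  # average of the last 3 runs plus 20% headroom
```

The patent also mentions learned models for this prediction; a moving average is the simplest stand-in that still demonstrates pre-allocation needing only "minor adjustment" afterwards.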
This scheme adopts an adaptive scheduling algorithm that dynamically adjusts the GPU resource allocation strategy according to the resource requirements of tasks, the current GPU load, and predictions of future tasks. The scheduling algorithm can also learn from historical data to continuously optimize allocation efficiency. In a large-scale computing cluster, an improved cross-node scheduling mechanism combined with a high-speed data transmission protocol reduces data transmission delay between GPU nodes, making cross-node scheduling more efficient. A machine learning model, combined with task historical load data, predicts future GPU load trends; this predictive capability helps the system optimize resource configuration before task execution and reduces unnecessary resource waste.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
Referring to fig. 2, an apparatus for GPU resource virtualized computing power scheduling according to an embodiment of the present application is shown;
The method specifically comprises the following steps:
an acquisition module 210, configured to acquire a task and determine a computation demand characteristic and a task attribute of the task;
a resource requirement module 220, configured to determine a resource requirement according to the computing requirement characteristic and the task attribute;
a priority module 230, configured to determine a task priority according to the task attribute;
And the scheduling module 240 is configured to determine a scheduling result of the GPU resource according to the task priority and the resource requirement.
In an embodiment of the present invention, further includes:
the monitoring module is used for monitoring the GPU resources;
And the cross-node scheduling module is used for starting cross-node task scheduling when the GPU resource load is unbalanced.
In an embodiment of the present invention, the monitoring module includes:
the GPU state acquisition sub-module is used for acquiring the use condition and task execution state of the GPU;
the GPU idle state submodule is used for determining the idle state of the GPU according to the use condition;
And the updating and scheduling sub-module is used for updating the scheduling result according to the task execution state and the idle state.
In one embodiment of the present invention, the obtaining module 210 includes:
The calculation amount determining submodule is used for determining calculation amount and memory requirements according to the calculation requirement characteristics;
The task type sub-module is used for determining the task type according to the task attribute;
And the resource requirement sub-module is used for determining the resource requirement according to the calculated amount, the memory requirement and the task type.
In one embodiment of the present invention, the priority module 230 includes:
The task type determining submodule is used for determining a task type according to the task attribute, wherein the task type comprises a real-time reasoning task, a model training task and a data preprocessing task;
And the priority determining sub-module is used for determining the task priority according to the task type.
In one embodiment of the present invention, the scheduling module 240 includes:
The GPU environment submodule is used for dividing a physical GPU into a plurality of virtual instances and establishing a virtualized GPU environment;
the allocation policy sub-module is used for determining the computing power allocation policy of the virtual instance according to the resource requirement;
and the scheduling result submodule is used for determining a GPU scheduling result according to the task priority and the computing power distribution strategy.
In an embodiment of the present invention, further includes:
The prediction module is used for predicting GPU resources required by the task according to the historical resource demand data;
and the pre-allocation module is used for pre-allocating the GPU resources according to the predicted result.
Referring to fig. 3, a computer device for implementing a method for GPU resource virtualization computing power scheduling according to the present invention is shown, which may specifically include the following:
The computer device 12 described above is in the form of a general purpose computing device and the components of the computer device 12 may include, but are not limited to, one or more processors or processing units 16, a system memory 28, and a bus 18 that connects the various system components, including the system memory 28 and the processing units 16.
Bus 18 represents one or more of several types of bus 18 structures, including a memory bus 18 or memory controller, a peripheral bus 18, an accelerated graphics port, a processor, or a local bus 18 using any of a variety of bus 18 architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus 18, Micro Channel Architecture (MCA) bus 18, Enhanced ISA bus 18, Video Electronics Standards Association (VESA) local bus 18, and Peripheral Component Interconnect (PCI) bus 18.
Computer device 12 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by computer device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. The computer device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from or write to non-removable, nonvolatile magnetic media (commonly referred to as a "hard disk drive"). Although not shown in fig. 3, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk such as a CD-ROM, DVD-ROM, or other optical media may be provided. In such cases, each drive may be coupled to bus 18 through one or more data medium interfaces. The memory may include at least one program product having a set (e.g., at least one) of program modules 42, the program modules 42 being configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored in, for example, a memory, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules 42, and program data, each or some combination of which may include an implementation of a network environment. Program modules 42 generally perform the functions and/or methods of the embodiments described herein.
The computer device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, camera, etc.), one or more devices that enable a user to interact with the computer device 12, and/or any devices (e.g., network card, modem, etc.) that enable the computer device 12 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 22. Moreover, computer device 12 may also communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet, through network adapter 20. As shown, network adapter 20 communicates with other modules of computer device 12 via bus 18. It should be appreciated that although not shown in FIG. 3, other hardware and/or software modules may be used in connection with computer device 12, including, but not limited to, microcode, device drivers, redundant processing units 16, external disk drive arrays, RAID systems, tape drives, and data backup storage system 34, among others.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, to implement a method for virtualized computing power scheduling of GPU resources provided by embodiments of the present invention.
That is, the processing unit 16 performs the above-described program by acquiring a task and determining the calculation demand characteristics and task attributes of the task;
Determining resource requirements according to the computing requirement characteristics and task attributes;
Determining task priority according to the task attribute;
and determining a scheduling result of the GPU resources according to the task priority and the resource demand.
In an embodiment of the present application, the present application further provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method for GPU resource virtualized computing power scheduling as provided in all embodiments of the present application:
that is, the program is realized when being executed by a processor by acquiring a task and determining the calculation requirement characteristic and the task attribute of the task;
Determining resource requirements according to the computing requirement characteristics and task attributes;
Determining task priority according to the task attribute;
and determining a scheduling result of the GPU resources according to the task priority and the resource demand.
Any combination of one or more computer readable media may be employed. The computer readable medium may be a computer-readable signal medium or a computer-readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Computer program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, or C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In this specification, each embodiment is described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may be referred to one another.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the application.
Finally, it should also be noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a" does not exclude the presence of other identical elements in the process, method, article, or terminal device that comprises the element.
The method and apparatus for GPU resource virtualized computing power scheduling provided by the present application have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present application, and the description of the above examples is intended only to aid understanding of the method and its core concept. Meanwhile, those skilled in the art may, according to the concept of the present application, make changes to the specific embodiments and application scope; therefore, the content of this specification should not be construed as limiting the present application.
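The prediction-and-preallocation embodiment (forecasting the GPU resources a task will need from historical resource requirement data, then reserving them in advance) could be realized, for example, as a moving-average estimate. The window size and headroom factor below are illustrative assumptions, not values from the patent:

```python
def predict_demand(history: list[float], window: int = 3) -> float:
    """Estimate the resource (e.g. GPU memory in GB) a recurring task will
    need next, as a moving average over its most recent recorded runs."""
    if not history:
        raise ValueError("no historical demand data")
    recent = history[-window:]
    return sum(recent) / len(recent)

def preallocate(history: list[float], headroom: float = 1.2) -> float:
    """Reserve the predicted demand plus a safety margin before the task
    is actually submitted, so the scheduler can pre-allocate GPU resources."""
    return predict_demand(history) * headroom
```

A real predictor might use exponential smoothing or a learned model instead; the point is only that the reservation is driven by history rather than by the task's declared maximum.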

Claims (10)

Translated from Chinese
1. A method for GPU resource virtualized computing power scheduling, characterized by comprising the steps of:
acquiring a task and determining the computing requirement characteristics and task attributes of the task;
determining resource requirements according to the computing requirement characteristics and task attributes;
determining a task priority according to the task attributes;
and determining a scheduling result of the GPU resources according to the task priority and the resource requirements.
2. The method according to claim 1, further comprising:
monitoring the GPU resources;
and when the GPU resource load is unbalanced, starting cross-node task scheduling.
3. The method according to claim 2, wherein the step of monitoring the GPU resources comprises:
acquiring GPU usage and task execution status;
determining an idle state of the GPU according to the usage;
and updating the scheduling result according to the task execution status and the idle state.
4. The method according to claim 1, wherein the step of determining resource requirements according to the computing requirement characteristics and task attributes comprises:
determining a computation amount and a memory requirement according to the computing requirement characteristics;
determining a task type according to the task attributes;
and determining the resource requirements according to the computation amount, the memory requirement, and the task type.
5. The method according to claim 1, wherein the step of determining a task priority according to the task attributes comprises:
determining a task type according to the task attributes, wherein the task type includes a real-time inference task, a model training task, and a data preprocessing task;
and determining the task priority according to the task type.
6. The method according to claim 1, wherein the step of scheduling GPU resources according to the task priority and the resource requirements comprises:
partitioning a physical GPU into a plurality of virtual instances to establish a virtualized GPU environment;
determining a computing power allocation policy for the virtual instances according to the resource requirements;
and determining a GPU scheduling result according to the task priority and the computing power allocation policy.
7. The method according to claim 1, further comprising:
predicting the GPU resources required by a task according to historical resource requirement data;
and pre-allocating the GPU resources according to the prediction result.
8. A device for GPU resource virtualized computing power scheduling, characterized in that the device implements the steps of the method for GPU resource virtualized computing power scheduling according to any one of claims 1 to 7, and comprises:
an acquisition module for acquiring a task and determining the computing requirement characteristics and task attributes of the task;
a resource requirement module for determining resource requirements according to the computing requirement characteristics and task attributes;
a priority module for determining a task priority according to the task attributes;
and a scheduling module for determining a scheduling result of the GPU resources according to the task priority and the resource requirements.
9. An electronic device, characterized by comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, wherein when the computer program is executed by the processor, the steps of the method for GPU resource virtualized computing power scheduling according to any one of claims 1 to 7 are implemented.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method for GPU resource virtualized computing power scheduling according to any one of claims 1 to 7 are implemented.
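Claim 6's partitioning of a physical GPU into virtual instances under a computing power allocation policy can be sketched as a proportional split of the GPU's streaming multiprocessors (SMs). The SM counts and the largest-remainder rounding below are illustrative assumptions, not the patented mechanism:

```python
def partition_gpu(total_sm: int, demands: list[float]) -> list[int]:
    """Split a physical GPU's SMs into virtual-instance shares proportional
    to each task's compute demand, using largest-remainder rounding so the
    integer shares always sum to the physical total."""
    total = sum(demands)
    raw = [d / total * total_sm for d in demands]
    shares = [int(r) for r in raw]
    # hand leftover SMs to the instances with the largest fractional remainders
    leftover = total_sm - sum(shares)
    order = sorted(range(len(raw)), key=lambda i: raw[i] - shares[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares
```

For example, 108 SMs split across demands of 1:1:2 yields instances of 27, 27, and 54 SMs; the scheduler of claims 1 and 6 would then map prioritized tasks onto these instances.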
CN202510678046.7A | 2025-05-26 | 2025-05-26 | A method and device for GPU resource virtualization computing power scheduling | Pending | CN120196421A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202510678046.7A | CN120196421A (en) | 2025-05-26 | 2025-05-26 | A method and device for GPU resource virtualization computing power scheduling


Publications (1)

Publication Number | Publication Date
CN120196421A (en) | 2025-06-24

Family

ID=96065898

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202510678046.7A | Pending | CN120196421A (en) | 2025-05-26 | 2025-05-26 | A method and device for GPU resource virtualization computing power scheduling

Country Status (1)

Country | Link
CN (1) | CN120196421A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN120469816A (en) * | 2025-07-14 | 2025-08-12 | 融科联创(天津)信息技术有限公司 | Dynamic segmentation method, system, computer equipment and storage medium for computing power


Patent Citations (4)

Publication number | Priority date | Publication date | Assignee | Title
WO2024245038A1 (en) * | 2023-06-02 | 2024-12-05 | 阿里云计算有限公司 | Method and apparatus for scheduling virtual cloud computing resources
CN119003149A (en) * | 2024-07-18 | 2024-11-22 | 合肥领航磐云信息科技有限公司 | GPU task queue management method, system and device
CN119668832A (en) * | 2024-10-16 | 2025-03-21 | 中国南方电网有限责任公司 | Computing power scheduling method in distributed computing environment
CN119806839A (en) * | 2024-12-31 | 2025-04-11 | 联想(北京)有限公司 | Resource allocation method and electronic device

Non-Patent Citations (2)

Title
Lin Ruijie et al., "Cloud Gaming: 5G Ushers in a New Era of Digital Entertainment", vol. 1, 31 January 2021, China Machine Press, p. 190 *
Huang Zhiwei et al., "ARM9 Embedded System Design: A Basic Tutorial", vol. 1, 31 August 2008, Beihang University Press, p. 250 *


Similar Documents

Publication | Publication Date | Title
Elmougy et al.A novel hybrid of Shortest job first and round Robin with dynamic variable quantum time task scheduling technique
CN112181613B (en)Heterogeneous resource distributed computing platform batch task scheduling method and storage medium
CN114443263A (en)Video memory management method, device, equipment and system
CN120196421A (en) A method and device for GPU resource virtualization computing power scheduling
KR20210108749A (en)Accelerator, method for operating the same and accelerator system including the same
WO2021136512A1 (en)Method and device for scheduling on basis of deep learning node computation, and storage medium
EP3989067A1 (en)Data processing method and apparatus for dynamic runtime selection of a kernel candidate implementing a layer of a neural network
US9471387B2 (en)Scheduling in job execution
CN112860396B (en)GPU scheduling method and system based on distributed deep learning
CN114896070A (en) A GPU resource allocation method for deep learning tasks
CN114546587A (en) A method for expanding and shrinking capacity of online image recognition service and related device
CN117539598A (en)Task processing method and device, electronic equipment and storage medium
CN120256056A (en) Machine learning training task communication scheduling method, device, equipment and storage medium
CN119829898A (en)Computing task scheduling method, electronic device, storage medium and product
CN118034938B (en)Job scheduling method, intelligent computing cloud operating system and computing platform
CN118963941A (en) Task allocation method and device
CN112783651A (en)Load balancing scheduling method, medium and device for vGPU of cloud platform
CN112114951A (en)Bottom-up distributed scheduling system and method
CN117539597A (en)Task processing method and device, electronic equipment and storage medium
CN112114967B (en)GPU resource reservation method based on service priority
CN112346863B (en)Method and system for processing dynamic adjustment data of computing resources
CN113407313B (en)Resource demand-aware multi-queue scheduling method, system and server
CN118069302A (en)Data processing method and device, electronic equipment and storage medium
CN116302478A (en)Multi-tenant resource allocation method, device, computer equipment and storage medium
CN120011093B (en)Accelerator-oriented multitasking method and related device

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
