
Computing resource allocation control method, device, electronic device and storage medium

Info

Publication number: CN119440808B
Application number: CN202411424947.5A
Authority: CN (China)
Prior art keywords: kernel function, amount, resource utilization, resource allocation, processing process
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN119440808A (en)
Inventors: 程伟, 邓玲, 曾楚轩, 李飞鹏, 杜量
Current assignee: China United Network Communications Corp Ltd Guangdong Branch (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original assignee: China United Network Communications Corp Ltd Guangdong Branch
Application filed by China United Network Communications Corp Ltd Guangdong Branch
Priority to CN202411424947.5A
Publication of CN119440808A and, upon grant, CN119440808B

Abstract

(Translated from Chinese)


The present invention discloses a computing power resource allocation control method, device, electronic device and storage medium. The method includes: for at least one graphics processing process associated with a graphics processor, according to the actual resource utilization rate and set resource utilization rate corresponding to the graphics processing process at the current moment, determining the resource utilization rate deviation corresponding to the current moment; obtaining the resource allocation control amount corresponding to the previous moment and the kernel function accumulation amount corresponding to the current moment; according to the control parameter prediction model, the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the resource allocation control amount and the kernel function accumulation amount, determining the resource allocation control amount at the current moment, so as to determine the execution decision corresponding to the kernel function to be executed at the next moment of the current moment based on the resource allocation control amount. This technical solution realizes the effect of optimizing the PID control algorithm based on the neural network model, so as to dynamically control the computing power resource allocation process of the graphics processor based on the optimized PID algorithm.

Description

Computing power resource allocation control method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of resource scheduling technologies, and in particular, to a method and apparatus for controlling computing power resource allocation, an electronic device, and a storage medium.
Background
Virtual graphics processing unit (Virtual Graphics Processing Unit, VGPU) computing power resource scheduling efficiently allocates the computing power resources of a physical GPU to different virtual machine or container instances in a virtualized environment, so that each instance achieves adequate GPU performance.
In the related art, conventional GPU computing power resource allocation generally adopts static allocation or dynamic allocation based on simple rules. These methods may perform well in certain scenarios, but they exhibit technical problems when facing diversified and dynamically changing loads and tasks.
First, the computing tasks and load on GPU computing power resources change dynamically with time and application scenario, and traditional static allocation struggles to adapt to this dynamism, resulting in low resource allocation efficiency. Second, under unbalanced load, tasks cannot be distributed evenly, so some GPUs are overloaded with excessive tasks while other GPUs remain relatively idle, which degrades overall performance.
Disclosure of Invention
The invention provides a computing power resource allocation control method, a computing power resource allocation control device, an electronic device, and a storage medium, which optimize a PID control algorithm based on a neural network model so that the computing power resource allocation process of a graphics processor can be dynamically controlled based on the optimized PID algorithm, thereby improving the precision and efficiency of computing power resource allocation control.
According to an aspect of the present invention, there is provided a computing power resource allocation control method, the method comprising:
determining, for at least one graphics processing process associated with a graphics processor, a resource utilization deviation corresponding to the graphics processing process at a current moment according to an actual resource utilization and a set resource utilization corresponding to the graphics processing process at the current moment, wherein the actual resource utilization is used for indicating the amount of computing power resources applied by the graphics processing process at the current moment;
obtaining a resource allocation control amount corresponding to the moment previous to the current moment of the graphics processing process and a kernel function accumulation amount corresponding to the current moment, wherein the kernel function accumulation amount is determined based on the resource allocation control amount of the previous moment and a kernel function operation amount, the kernel function operation amount is used for indicating the amount of resources required to execute a kernel function to be executed, and the kernel function to be executed is used for executing the graphics processing process based on the graphics processor;
and determining the resource allocation control amount of the graphics processing process at the current moment according to a control parameter prediction model obtained through pre-training, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function accumulation amount, so as to determine, based on the resource allocation control amount, the kernel function execution decision of the graphics processing process at the moment next to the current moment.
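The three steps above form a per-process control loop. The following Python sketch shows that data flow only; the injected callables (`predict_gains`, `compute_control`) and all variable names are illustrative placeholders standing in for the pre-trained control parameter prediction model and the PID-style computation detailed later, not interfaces defined by this disclosure.

```python
def control_cycle(actual_util, set_util, prev_control, kernel_accum,
                  predict_gains, compute_control):
    """One control cycle for a single graphics processing process (illustrative only)."""
    deviation = set_util - actual_util                   # resource utilization deviation
    # the pre-trained model maps the five observations to feedback control parameters
    gains = predict_gains(actual_util, set_util, deviation, prev_control, kernel_accum)
    control = compute_control(deviation, gains)          # resource allocation control amount
    kernel_accum += control                              # accumulation used for the execution decision
    return control, kernel_accum

# Toy invocation with stand-in callables:
u, acc = control_cycle(0.45, 0.60, prev_control=0.0, kernel_accum=0.0,
                       predict_gains=lambda *obs: (0.8, 0.2, 0.05),
                       compute_control=lambda e, g: g[0] * e)
```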
According to another aspect of the present invention, there is provided a computing power resource allocation control device, the device comprising:
a deviation determining module, configured to determine, for at least one graphics processing process associated with a graphics processor, the resource utilization deviation corresponding to the graphics processing process at the current moment according to the actual resource utilization and the set resource utilization of the graphics processing process at the current moment, wherein the actual resource utilization is used for indicating the amount of computing power resources applied by the graphics processing process at the current moment;
an accumulation amount acquisition module, configured to acquire the resource allocation control amount corresponding to the moment previous to the current moment of the graphics processing process and the kernel function accumulation amount corresponding to the current moment, wherein the kernel function accumulation amount is determined based on the resource allocation control amount of the previous moment and the kernel function operation amount;
and an allocation control amount determining module, configured to determine the resource allocation control amount of the graphics processing process at the current moment according to the control parameter prediction model obtained through pre-training, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function accumulation amount, so as to determine, based on the resource allocation control amount, the kernel function execution decision of the graphics processing process at the moment next to the current moment.
According to another aspect of the present invention, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the computing power resource allocation control method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to implement the method for controlling allocation of computing power resources according to any one of the embodiments of the present invention when executed.
According to the technical solution of the present invention, for at least one graphics processing process associated with a graphics processor, the resource utilization deviation corresponding to the current moment is determined according to the actual resource utilization and the set resource utilization corresponding to the graphics processing process at the current moment; the resource allocation control amount corresponding to the moment previous to the current moment of the graphics processing process and the kernel function accumulation amount corresponding to the current moment are obtained; and the resource allocation control amount of the graphics processing process at the current moment is determined according to the control parameter prediction model obtained through pre-training, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function accumulation amount, so that the execution decision corresponding to the kernel function to be executed at the moment next to the current moment is determined based on the resource allocation control amount. This solves the problems in the related art that the resource allocation control mode is difficult to adapt to dynamically changing computing tasks and load conditions, resulting in low resource allocation efficiency and unbalanced resource allocation. The PID control algorithm is optimized based on the neural network model, so that the computing power resource allocation process of the graphics processor is dynamically controlled by the optimized PID algorithm, which improves the precision and efficiency of computing power resource allocation control, the flexibility of the allocation control process, and the balance of computing power resource allocation.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a method for controlling computing power resource allocation according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a method for controlling computing power resource allocation according to a second embodiment of the present invention;
Fig. 3 is a flowchart of a method for controlling computing power resource allocation according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a computing power resource allocation control device according to a fourth embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device implementing the method for controlling computing power resource allocation according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a method for controlling allocation of computing resources according to a first embodiment of the present invention, where the method may be performed by a computing resource allocation control device, and the computing resource allocation control device may be implemented in hardware and/or software, and the computing resource allocation control device may be configured in a terminal and/or a server. As shown in fig. 1, the method includes:
S110, for at least one graphics processing process associated with the graphics processor, determining the resource utilization deviation corresponding to the graphics processing process at the current moment according to the actual resource utilization and the set resource utilization corresponding to the graphics processing process at the current moment.
Among them, the graphics processor (Graphics Processing Unit, GPU) is a microprocessor that specializes in image- and graphics-related processing work on personal computers, workstations, game consoles, and some mobile devices (tablet computers, smartphones, etc.). Typically, at least one process (or task) associated with graphics processing may be performed based on the resources contained in the GPU. In the case of determining at least one graphics processing process, the graphics processing process may be caused to perform graphics processing operations in accordance with the allocated GPU resources by allocating at least part of the resources contained in the GPU to the respective graphics processing process. The graphics processing process may be a process of performing a series of operations and processing on graphics data. In this embodiment, the graphics processing processes may execute respective graphics processing tasks on respective virtual graphics processing units in the graphics processor. It can be appreciated that the rendering resources and computing resources of a physical GPU may be packaged into multiple independent virtual slices through virtualization techniques. Each virtual slice is assigned to a different virtual machine, enabling it to independently access and utilize GPU resources. In this case, the virtual slice may be referred to as a virtual graphics processing unit (Virtual Graphics Processing Unit, VGPU). The actual resource utilization is used to indicate the amount of computing power resources that the graphics processing process is applying at the current moment. The actual resource utilization can be understood as the ratio between the amount of resources actually applied by the graphics processing process to perform graphics processing operations in a specific period and the total amount of resources. It should be noted that, for different graphics cards, the manner of determining the actual resource utilization of the graphics processing process differs. For example, assuming that the graphics card provided on the GPU is an NVIDIA graphics card, the actual resource utilization is typically determined by calculating the percentage of active streaming multiprocessors out of the total number of streaming multiprocessors. The set resource utilization is used to indicate the amount of computing power resources that the graphics processing process expects to apply. The set resource utilization can be understood as the ratio between the amount of resources that the graphics processing process expects to apply to graphics processing operations in a specific period and the total amount of resources.
In this embodiment, the set resource utilization may be a predetermined resource utilization allocated to the corresponding graphics processing process, so that the corresponding graphics processing process may execute the corresponding graphics processing task on the graphics processor based on the corresponding set resource utilization. Further, in the process of executing the corresponding graphics processing task by the graphics processing process, a certain deviation exists between the actual resource utilization rate of the actual application of the execution task and the corresponding set resource utilization rate. Furthermore, in order to determine the resource utilization deviation corresponding to the graphics processing process at each moment, the actual resource utilization and the set resource utilization corresponding to the graphics processing process at the current moment can be obtained. Further, a difference between the set resource utilization rate and the actual resource utilization rate may be determined, and the difference may be used as a resource utilization rate deviation corresponding to the graphics processing process at the current time. The actual resource utilization rate can be obtained in various modes, and optionally, a performance detection tool is adopted to collect the resource utilization rate of the graphic processing process in real time or periodically, and the collected resource utilization rate is used as the actual resource utilization rate.
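As one concrete way to obtain the actual resource utilization, the sketch below polls device-level GPU utilization through pynvml (the Python bindings for NVIDIA's NVML). This particular tool, the device-level granularity, and the sampling period are assumptions for illustration only: the text merely calls for some performance monitoring tool, and true per-process accounting would need additional bookkeeping.

```python
# Periodic sampling of GPU utilization with pynvml, used here only as an example
# of a performance monitoring tool; device-level utilization stands in for the
# per-process actual resource utilization as a simplification.
import time
import pynvml

def sample_deviation(set_util, device_index=0, period_s=0.1, samples=10):
    """Return a list of (actual_util, deviation) pairs for the given set utilization."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
    readings = []
    try:
        for _ in range(samples):
            actual = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu / 100.0
            readings.append((actual, set_util - actual))   # deviation = set - actual
            time.sleep(period_s)
    finally:
        pynvml.nvmlShutdown()
    return readings
```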
S120, acquiring a resource allocation control amount corresponding to the previous moment of the current moment of the graphic processing process and a kernel function accumulation amount corresponding to the current moment.
The resource allocation control quantity can be used for representing the control quantity according to which the resource utilization rate of the graphic processing process at the corresponding moment is adjusted. Under the condition of determining the actual resource utilization rate and the set resource utilization rate, the actual resource utilization rate can be adjusted according to the resource allocation control amount, so that the adjusted actual resource utilization rate approaches the set resource utilization rate. Therefore, the effect of effectively distributing the graphic processing resources and improving the utilization rate of the resources can be realized. In this embodiment, the resource allocation control amount may be understood as a computational resource allocated to the graphics processing process. The kernel function cumulative amount may be an amount of resources obtained after superposition of at least one resource allocation control amount, which may be used for executing the corresponding kernel function. The cumulative amount of the kernel function at the present time is determined based on the control amount of the resource allocation at the last time and the operation amount of the kernel function at the last time. The kernel operand may be used to indicate the amount of resources required to execute the kernel function to be executed. The core function to be executed may be used to execute a graphics processing process based on the graphics processor. In general, when a graphics processing process is executed based on a graphics processor, the corresponding graphics processing process may be executed by issuing a kernel function to be executed corresponding to the graphics processing process to the graphics processor. The kernel function operand may be determined according to the number of thread blocks carried by the kernel function to be executed when issuing and the number of threads in each thread block. In this embodiment, in the case where the resource allocation control amount at the previous time is determined, the resource allocation control amount may be input to the kernel function execution unit to superimpose the resource allocation control amount with the kernel function cumulative amount included in the kernel function execution unit. Further, the superimposed kernel cumulative amount may be compared with the kernel operand. When the cumulative amount of the superimposed kernel function is larger than the calculated amount of the kernel function, a difference between the cumulative amount of the superimposed kernel function and the calculated amount of the kernel function may be determined, and the difference is used as the cumulative amount of the kernel function at the current time.
In this embodiment, in order to determine the resource allocation control amount corresponding to the current time, the resource allocation control amount corresponding to the last time of the current time and the kernel function accumulation amount corresponding to the current time may be obtained. Further, the resource allocation control amount corresponding to the current time may be determined based on the acquired resource allocation control amount, the kernel function cumulative amount, and the determined resource utilization deviation.
S130, determining the resource allocation control quantity of the graphic processing process at the current moment according to a control parameter prediction model obtained through pre-training, an actual resource utilization rate, a set resource utilization rate, a resource utilization rate deviation, a resource allocation control quantity and a kernel function accumulation quantity, so as to determine an execution decision of the graphic processing process corresponding to the kernel function to be executed at the moment next to the current moment based on the resource allocation control quantity.
The control parameter prediction model may be a neural network model for predicting feedback control parameters. In this embodiment, the control parameter prediction model may be a neural network model that takes an actual resource utilization, a set resource utilization, a resource utilization deviation, a resource allocation control amount, and a kernel function accumulation amount as input objects to determine the feedback control parameter based on the input objects. The control parameter prediction model may be obtained by training a pre-constructed neural network model based on an actual resource utilization rate corresponding to a historical time, a set resource utilization rate, a resource utilization rate deviation, a kernel function cumulative amount, a resource allocation control amount at a time previous to the historical time, and an actual feedback control parameter corresponding to the historical time. It will be appreciated that the feedback control coefficient may be the core of a PID control algorithm, and may determine how the graphics processing process adjusts the actual resource utilization according to the resource utilization deviation, so as to reduce the difference between the actual resource utilization and the set resource utilization. Alternatively, the feedback control coefficients may include a proportional coefficient, an integral coefficient, and a differential coefficient. The execution decision may be used to characterize whether the respective kernel function is executed at the respective moment. Alternatively, executing the decision may include issuing the execution or delaying the execution.
In practical applications, the PID control algorithm can be applied to the allocation control process of VGPU computing power resources. In general, the PID control algorithm can guarantee the correctness and convergence of the control amount only when the system behaves linearly and is time-invariant. System linearity means that there should be a linear relationship between the resource allocation control amount and the resource allocation amount. Time invariance means that, whenever the same resource allocation control amount is input, the resource allocation amount output by the system should be the same, unaffected by time. However, these two properties are difficult to achieve on a GPU. First, the kernel functions and data volumes invoked by the GPU can change frequently and on a large scale, which breaks the assumptions of linearity and time invariance. Second, instantaneous changes in massively parallel computing performance are limited by parameters such as video memory, computing cores, and data transmission, so running performance differs greatly at different times and linearity cannot be achieved. Consequently, when GPU computing power resource allocation control is performed based on the PID control algorithm, the resource utilization of each graphics processing process may not be precisely controlled, which in turn affects the execution efficiency and effect of the graphics processing process.
In view of the above, in this embodiment, a control parameter prediction model may be used to predict the feedback control parameter at the current time. Further, the resource allocation control amount at the current time may be determined according to the predicted feedback control parameter to adjust the actual resource utilization based on the resource allocation control amount.
It should be noted that the control parameter prediction model may be a neural network model with any model structure, optionally a recurrent neural network (Recurrent Neural Network, RNN). The benefit of applying a recurrent neural network is that an RNN is able to learn time-varying patterns and time-dependent relationships in the data, which is particularly important for non-stationary systems whose properties vary over time, making RNNs suitable for controlling dynamic systems with time-varying behavior.
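For illustration, a minimal PyTorch sketch of such a control parameter prediction model is given below. The text only calls for a neural network model (naming an RNN as one option), so the layer sizes, the Softplus head used to keep the predicted gains non-negative, and the five-feature input layout are assumptions made here.

```python
# A minimal RNN-based control parameter prediction model (illustrative only).
import torch
import torch.nn as nn

class GainPredictorRNN(nn.Module):
    def __init__(self, hidden_size=32):
        super().__init__()
        # 5 inputs per time step: actual utilization, set utilization, deviation,
        # previous resource allocation control amount, kernel accumulation amount
        self.rnn = nn.RNN(input_size=5, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 3)        # Kp, Ki, Kd
        self.positive = nn.Softplus()                # keep the PID gains non-negative

    def forward(self, x, h=None):
        # x: (batch, time, 5) sequence of control-loop observations
        out, h = self.rnn(x, h)
        gains = self.positive(self.head(out[:, -1]))  # gains at the latest time step
        return gains, h
```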
In this embodiment, when the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the resource allocation control amount, and the kernel function accumulation amount corresponding to the current time of the graphics processing process are obtained, the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the resource allocation control amount, and the kernel function accumulation amount may be processed according to the control parameter prediction model obtained by training in advance, so as to obtain the feedback control parameter corresponding to the current time of the graphics processing process. Further, the resource allocation control amount of the graphics processing process at the time next to the current time can be determined according to the feedback control parameter. Further, an execution decision corresponding to the kernel function to be executed may be determined based on the resource allocation control amount.
According to the technical solution of this embodiment, for at least one graphics processing process associated with a graphics processor, the resource utilization deviation corresponding to the current moment is determined according to the actual resource utilization and the set resource utilization corresponding to the graphics processing process at the current moment; the resource allocation control amount corresponding to the moment previous to the current moment of the graphics processing process and the kernel function accumulation amount corresponding to the current moment are obtained; and the resource allocation control amount of the graphics processing process at the current moment is determined according to the control parameter prediction model obtained through pre-training, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function accumulation amount, so that the execution decision corresponding to the kernel function to be executed at the moment next to the current moment is determined based on the resource allocation control amount. This solves the problems in the related art that the resource allocation control mode is difficult to adapt to dynamically changing computing tasks and load conditions, resulting in low resource allocation efficiency and unbalanced resource allocation. The PID control algorithm is optimized based on the neural network model, so that the computing power resource allocation process of the graphics processor is dynamically controlled by the optimized PID algorithm, which improves the precision and efficiency of computing power resource allocation control, the flexibility of the allocation control process, and the balance of computing power resource allocation.
Example two
Fig. 2 is a flowchart of a method for controlling computing power resource allocation according to a second embodiment of the present invention. On the basis of the foregoing embodiment, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function accumulation amount are input into a control parameter prediction model obtained through pre-training to obtain the feedback control parameter corresponding to the graphics processing process at the current moment, and the resource allocation control amount of the graphics processing process at the current moment is determined based on the resource utilization deviation and the feedback control parameter. For the specific implementation, reference may be made to the technical solution of this embodiment. Technical terms identical or similar to those of the above embodiments are not repeated herein.
As shown in fig. 2, the method includes:
S210, for at least one graphics processing process associated with the graphics processor, determining the resource utilization deviation corresponding to the graphics processing process at the current moment according to the actual resource utilization and the set resource utilization corresponding to the graphics processing process at the current moment.
S220, acquiring a resource allocation control amount corresponding to the previous moment of the current moment of the graphic processing process and a kernel function accumulation amount corresponding to the current moment.
S230, inputting the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the resource allocation control quantity and the kernel function accumulation quantity into a control parameter prediction model obtained through pre-training to obtain feedback control parameters corresponding to the graphic processing process at the current moment.
The feedback control parameters refer to various parameters used for adjusting system performance and control accuracy in a feedback control system. In this embodiment, the feedback control parameter may be used to adjust the deviation between the actual resource utilization and the set resource utilization so that the actual resource utilization approaches the set resource utilization.
In this embodiment, the control parameter prediction model may be obtained by training based on an actual resource utilization rate corresponding to a historical time, a set resource utilization rate, a resource utilization rate deviation, a kernel function cumulative amount, a resource allocation control amount at a time immediately before the historical time, and an actual feedback control parameter corresponding to the historical time. Before the control parameter prediction model provided in this embodiment is applied, a pre-constructed neural network model may be trained in a supervised or unsupervised manner. Before training the neural network model, a plurality of training samples may be constructed to train the model based on the training samples. In order to improve the accuracy of the control parameter prediction model, training samples can be constructed as much and as abundant as possible. Optionally, the training process of the control parameter prediction model may include obtaining a plurality of training sample data, and training the model to be trained based on the plurality of training sample data to obtain the control parameter prediction model.
The training sample data include the actual resource utilization, the set resource utilization, the resource utilization deviation, and the kernel function accumulation amount corresponding to a historical moment of a sample graphics processing process, the resource allocation control amount at the moment previous to that historical moment, and the actual feedback control parameter corresponding to that historical moment. The sample graphics processing process may be any graphics processing process whose execution has been completed. The actual feedback control parameter may be a feedback control parameter determined according to a predetermined control parameter determination manner. In general, the actual feedback control parameter may be a feedback control parameter that enables the graphics processor to achieve a good resource allocation control effect for the sample graphics processing process when performing resource allocation control based on the feedback control algorithm. The model to be trained may be a neural network model whose model parameters are initial or default values.
In this embodiment, for a plurality of historical moments associated with the sample graphics processing process, the actual resource utilization, the set resource utilization, the resource utilization deviation, and the kernel function accumulation amount corresponding to the sample graphics processing process at the historical moment, the resource allocation control amount at the moment previous to the historical moment, and the actual feedback control parameter at the historical moment are obtained. Further, training sample data can be constructed from these quantities. In this way, a plurality of training sample data corresponding to the sample graphics processing process can be obtained. Further, the model to be trained can be trained according to the plurality of training sample data to obtain the control parameter prediction model.
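A hypothetical way to assemble such training samples from the historical records of a completed sample graphics processing process is sketched below; the record field names and the sliding-window length are assumptions, not prescribed by the text.

```python
# Build (input sequence, target gains) pairs from per-moment historical records.
# Each record is assumed to carry the five model inputs plus the actual feedback
# control parameters ("target_gains") recorded for that historical moment.
import torch

def build_training_samples(history, window=8):
    xs, ys = [], []
    for t in range(window, len(history) + 1):
        seq = [[h["actual"], h["set"], h["deviation"],
                h["prev_control"], h["kernel_accum"]]
               for h in history[t - window:t]]
        xs.append(seq)
        ys.append(history[t - 1]["target_gains"])   # (Kp, Ki, Kd) at the latest moment
    return torch.tensor(xs, dtype=torch.float32), torch.tensor(ys, dtype=torch.float32)
```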
Optionally, training the model to be trained based on the plurality of training sample data to obtain the control parameter prediction model includes: for each of the plurality of training sample data, inputting the actual resource utilization, the set resource utilization, the resource utilization deviation, the kernel function accumulation amount, and the resource allocation control amount in the training sample data into the model to be trained to obtain a predicted control parameter corresponding to the sample graphics processing process at the historical moment; determining a loss value based on the predicted control parameter and the actual feedback control parameter included in the training sample data; correcting the model parameters in the model to be trained based on the loss value; and taking convergence of the loss function in the model to be trained as the training target, to obtain the control parameter prediction model.
Wherein the loss value may be a value indicative of the degree of difference between the predicted output and the actual output. The loss function may be a function determined based on the loss value that characterizes the degree of difference between the predicted output and the actual output. The loss function may be any loss function, alternatively a mean square error loss function.
As an optional implementation of this embodiment, for each of the plurality of training sample data, the actual resource utilization, the set resource utilization, the resource utilization deviation, the kernel function accumulation amount, and the resource allocation control amount in the training sample data may be input into the model to be trained, so as to process them based on the model to be trained and obtain the predicted control parameter corresponding to the historical moment. Further, the predicted control parameter may be compared with the actual feedback control parameter in the training sample data to determine the loss value. Further, the model parameters of the model to be trained may be corrected based on the loss value, and the training error of the loss function in the model to be trained may then be used as the condition for detecting whether the loss function has converged, for example, whether the training error is smaller than a preset error, whether the error trend has stabilized, or whether the current number of iterations of the model equals a preset number. If the convergence condition is detected to be reached, for example the training error of the loss function is smaller than the preset error or the error change has stabilized, the training of the model to be trained is complete and the iterative training can be stopped. If the convergence condition is detected not to be met, further sample data can be obtained to continue training the model to be trained until the training error of the loss function falls within the preset range. When the loss function has converged, the trained model can be used as the control parameter prediction model.
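The supervised training procedure described above can be sketched as follows, for a model such as the RNN sketch given earlier that returns (gains, hidden_state). The optimizer, learning rate, and stopping thresholds are illustrative assumptions, while the mean square error loss follows the optional choice mentioned above.

```python
# Illustrative training loop: fit the model to the recorded actual feedback
# control parameters and stop once the loss is small or has stopped improving.
import torch
import torch.nn as nn

def train_gain_predictor(model, loader, epochs=50, lr=1e-3, tol=1e-4):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()                       # mean square error loss
    prev_total = float("inf")
    for _ in range(epochs):
        total = 0.0
        for inputs, target_gains in loader:      # inputs: (batch, time, 5)
            predicted_gains, _ = model(inputs)
            loss = loss_fn(predicted_gains, target_gains)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            total += loss.item()
        # convergence check: training error below threshold or no longer changing
        if total < tol or abs(prev_total - total) < tol:
            break
        prev_total = total
    return model
```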
Further, when the trained control parameter prediction model is obtained, the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the resource allocation control amount and the kernel function accumulation amount corresponding to the acquired graphics processing process at the current moment can be input into the control parameter prediction model obtained through pre-training. Furthermore, the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the resource allocation control amount and the kernel function accumulation amount can be processed based on the control parameter prediction model, and feedback control parameters corresponding to the graphic processing process at the current moment can be obtained.
And S240, determining the resource allocation control quantity of the graphic processing process at the current moment based on the resource utilization rate deviation and the feedback control parameter, so as to determine the execution decision of the graphic processing process corresponding to the kernel function to be executed at the moment next to the current moment based on the resource allocation control quantity.
In this embodiment, when the feedback control parameter corresponding to the current time is obtained, the resource allocation control amount of the graphics processing process at the current time may be determined based on the resource utilization deviation corresponding to the current time and the feedback control parameter.
Optionally, the feedback control parameters include a proportional coefficient, an integral coefficient, and a differential coefficient, and determining the resource allocation control amount of the graphics processing process at the current moment based on the resource utilization deviation and the feedback control parameters includes: determining the product between the resource utilization deviation and the total number of computing cores of the graphics processor, and taking the product as the resource utilization amount deviation; determining the product between the proportional coefficient and the resource utilization amount deviation as the proportional control amount; determining the product between the integral coefficient and the integral value of the resource utilization amount deviation as the integral control amount; determining the product between the differential coefficient and the differential value of the resource utilization amount deviation as the differential control amount; and adding the proportional control amount, the integral control amount, and the differential control amount to obtain the resource allocation control amount of the graphics processing process at the current moment.
Where a compute core refers to a core unit in a computer processor (e.g., a CPU or GPU) that is responsible for performing computing tasks. The total number of compute cores may refer to the total number of independent processor cores integrated within a computer processor.
As an alternative implementation of this embodiment, the product between the resource utilization deviation and the total number of computing cores of the graphics processor may be determined, and the product may be used as the resource utilization amount deviation corresponding to the current moment. Further, the product between the proportional coefficient and the resource utilization amount deviation at the current moment may be determined and used as the proportional control amount at the current moment. The integral value of the resource utilization amount deviation over a preset period may be determined, and the product between the integral coefficient and the integral value may be used as the integral control amount. The differential value of the resource utilization amount deviation over the preset period may be determined, and the product between the differential coefficient and the differential value may be used as the differential control amount. Further, the proportional control amount, the integral control amount, and the differential control amount may be added, and the resulting sum may be used as the resource allocation control amount of the graphics processing process at the current moment. The preset period may be a period of any length.
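Put together, the computation of S240 reduces to a standard PID law applied to the deviation scaled by the total number of compute cores. The sketch below assumes a fixed sampling period `dt` with a rectangular integral and a finite-difference derivative over that period, details which the text leaves to the implementer.

```python
# PID-style resource allocation control amount:
#   u(t) = Kp * e(t) + Ki * integral(e) + Kd * d(e)/dt,
# where e(t) is the resource utilization deviation scaled by the total compute cores.
def resource_allocation_control(deviation, gains, total_cores, state, dt=0.1):
    kp, ki, kd = gains                                  # predicted feedback control parameters
    error = deviation * total_cores                     # resource utilization *amount* deviation
    state["integral"] = state.get("integral", 0.0) + error * dt
    derivative = (error - state.get("prev_error", error)) / dt
    state["prev_error"] = error
    proportional_term = kp * error
    integral_term = ki * state["integral"]
    derivative_term = kd * derivative
    return proportional_term + integral_term + derivative_term

# Example: a 15-percentage-point deviation on a GPU with 2048 compute cores.
state = {}
u_t = resource_allocation_control(0.15, (0.8, 0.2, 0.05), total_cores=2048, state=state)
```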
According to the technical solution of this embodiment, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function accumulation amount are input into the control parameter prediction model obtained through pre-training to obtain the feedback control parameter corresponding to the graphics processing process at the current moment, and the resource allocation control amount of the graphics processing process at the current moment is determined based on the resource utilization deviation and the feedback control parameter. This achieves the effect of dynamically adjusting the feedback control parameters based on the neural network model to improve the precision of computing power resource allocation control, and of optimizing the PID control algorithm based on the neural network model so that the feedback control coefficients can be quickly regulated to convergence when the GPU computing power resources change dynamically, avoiding large oscillations of the feedback control coefficients and allowing them to recover quickly after sudden changes.
Example III
Fig. 3 is a flowchart of a method for controlling computing power resource allocation according to a third embodiment of the present invention. On the basis of the foregoing embodiment, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function accumulation amount are input into the control parameter prediction model obtained through pre-training to obtain the feedback control parameter corresponding to the graphics processing process at the current moment, and the resource allocation control amount of the graphics processing process at the current moment is determined based on the resource utilization deviation and the feedback control parameter. For the specific implementation, reference may be made to the technical solution of this embodiment. Technical terms identical or similar to those of the above embodiments are not repeated herein.
As shown in fig. 3, the method includes:
S310, for at least one graphics processing process associated with the graphics processor, determining the resource utilization deviation corresponding to the graphics processing process at the current moment according to the actual resource utilization and the set resource utilization corresponding to the graphics processing process at the current moment.
S320, acquiring a resource allocation control amount corresponding to the previous moment of the current moment of the graphic processing process and a kernel function accumulation amount corresponding to the current moment.
S330, determining the resource allocation control amount of the graphics processing process at the current moment according to the control parameter prediction model obtained through pre-training, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function accumulation amount, and adding the resource allocation control amount to the kernel function accumulation amount corresponding to the current moment to obtain the kernel function accumulation amount at the moment next to the current moment.
In this embodiment, when the resource allocation control amount at the current time is obtained, the resource allocation control amount may be input to the kernel function execution unit, and the kernel function execution unit may include the kernel function cumulative amount at the current time. Further, the resource allocation control amount corresponding to the current time may be added to the kernel function accumulation amount at the current time, and the added resource amount may be used as the kernel function accumulation amount at the next time of the current time.
S340, comparing the kernel function accumulation amount at the next moment with the predetermined kernel function operation amount at the next moment corresponding to the kernel function to be executed, and determining the execution decision corresponding to the kernel function to be executed at the next moment based on the comparison result.
In this embodiment, after the kernel function cumulative amount at the next time is obtained, the kernel function cumulative amount may be compared with the kernel function operation amount at the next time to determine whether the kernel function cumulative amount can support the operation process of the kernel function to be executed at the next time.
Optionally, determining the execution decision corresponding to the kernel function to be executed at the next moment based on the comparison result includes: when the kernel function accumulation amount is larger than the kernel function operation amount, determining that the execution decision corresponding to the kernel function to be executed is to issue it for execution, determining the difference between the kernel function accumulation amount and the kernel function operation amount, and updating the kernel function accumulation amount based on the difference; and when the kernel function accumulation amount is not larger than the kernel function operation amount, determining that the kernel function to be executed is to be executed with delay.
As an optional implementation manner of this embodiment, in the case that the cumulative amount of the kernel function is greater than the operation amount of the kernel function, it may be determined that the execution decision corresponding to the kernel function to be executed is to be issued for execution, and the kernel function to be executed is issued to the graphics processor, so as to execute the graphics processing process based on the graphics processor. And under the condition of issuing the kernel function to be executed, the required resource quantity is the resource quantity corresponding to the kernel function operation quantity. Further, a difference between the kernel cumulative amount and the kernel operand may be determined, and the kernel cumulative amount may be updated based on the difference, that is, the difference may be taken as the kernel cumulative amount at the next time. Or under the condition that the cumulative amount of the kernel function is not larger than the operation amount of the kernel function, determining the execution decision corresponding to the kernel function to be executed as delayed execution, and keeping the cumulative amount of the kernel function at the next moment unchanged.
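The behaviour of the kernel execution unit described in S330 and S340 amounts to a token-bucket style check; the sketch below is a minimal illustration with hypothetical names and a toy invocation.

```python
# Add the control amount to the kernel accumulation amount, then decide whether
# the pending kernel can be issued or must be delayed.
def kernel_execution_decision(kernel_accum, control_amount, kernel_operand):
    """Return ("issue" | "delay") together with the updated accumulation amount."""
    kernel_accum += control_amount                      # accumulation for the next moment
    if kernel_accum > kernel_operand:
        # enough accumulated resources: issue the kernel and consume its operand
        return "issue", kernel_accum - kernel_operand
    # otherwise keep the accumulation unchanged and try again later
    return "delay", kernel_accum

decision, accum = kernel_execution_decision(kernel_accum=3000.0,
                                            control_amount=1500.0,
                                            kernel_operand=4096.0)  # ("issue", 404.0)
```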
In this embodiment, before comparing the kernel cumulative amount with the kernel operand, the kernel operand corresponding to the kernel to be executed may be determined. Optionally, the determination method of the kernel function operand may be that the number of thread blocks corresponding to the kernel function to be executed and the number of threads in each thread block are obtained, and the product between the number of thread blocks and the number of threads is determined to obtain the kernel function operand corresponding to the kernel function to be executed.
Where a thread block is a packet made up of multiple threads, that is, a thread block is a collection containing multiple threads. Threads in a thread block may cooperate during execution. Thread blocks may be used to organize and manage threads to achieve efficient parallel computing. The thread is an execution unit in a process, and is the minimum unit of operation scheduling that an operating system can perform. The number of thread blocks and the number of threads in each thread block are parameters in the execution parameters associated with the kernel function to be executed, and may be obtained directly based on the execution parameters.
As an optional implementation manner of this embodiment, the number of thread blocks corresponding to the kernel function to be executed and the number of threads included in each thread block may be obtained. Further, a product between the number of thread blocks and the number of threads may be determined, and the product may be used as a kernel function operand corresponding to the kernel function to be executed.
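Since both quantities come straight from the launch configuration, the operand computation is a one-liner; the CUDA-style launch mentioned in the comment is only an illustrative example.

```python
# Kernel operand = number of thread blocks x threads per block.
def kernel_operand(num_thread_blocks: int, threads_per_block: int) -> int:
    return num_thread_blocks * threads_per_block

# e.g. a kernel launched with a grid of 256 blocks of 1024 threads each
# (<<<256, 1024>>> in CUDA terms) has an operand of 262144.
assert kernel_operand(256, 1024) == 262144
```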
According to the technical solution of this embodiment, the resource allocation control amount of the graphics processing process at the current moment is determined according to the control parameter prediction model obtained through pre-training, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function accumulation amount; the resource allocation control amount is added to the kernel function accumulation amount corresponding to the current moment to obtain the kernel function accumulation amount at the next moment; and the kernel function accumulation amount at the next moment is compared with the predetermined kernel function operation amount corresponding to the kernel function to be executed, so that the execution decision corresponding to the kernel function to be executed at the next moment is determined based on the comparison result. This achieves the effect of optimizing the PID control algorithm based on the neural network model so that the computing power resource allocation process of the graphics processor is dynamically controlled by the optimized PID algorithm, which improves the precision and efficiency of computing power resource allocation control, the flexibility of the allocation control process, the balance of computing power resource allocation, and the universality of the PID control algorithm in the computing power allocation control process.
Example IV
Fig. 4 is a schematic structural diagram of a computing power resource allocation control device according to a fourth embodiment of the present invention. As shown in fig. 4, the apparatus includes a deviation determination module 410, an accumulated amount acquisition module 420, and an allocation control amount determination module 430.
The deviation determination module 410 is configured to determine, for at least one graphics processing process associated with a graphics processor, the resource utilization deviation corresponding to the graphics processing process at the current moment according to the actual resource utilization and the set resource utilization corresponding to the graphics processing process at the current moment, wherein the actual resource utilization is used for indicating the amount of computing power resources applied by the graphics processing process at the current moment, and the set resource utilization is used for indicating the amount of computing power resources the graphics processing process expects to apply. The accumulated amount acquisition module 420 is configured to acquire the resource allocation control amount corresponding to the moment previous to the current moment of the graphics processing process and the kernel function accumulation amount corresponding to the current moment, wherein the kernel function accumulation amount is determined based on the resource allocation control amount and the kernel function operation amount of the previous moment, the kernel function operation amount is used for indicating the amount of resources required to execute the kernel function to be executed, and the kernel function to be executed is used for executing the graphics processing process based on the graphics processor. The allocation control amount determination module 430 is configured to determine the resource allocation control amount of the graphics processing process at the current moment according to the control parameter prediction model obtained through pre-training, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function accumulation amount, so as to determine, based on the resource allocation control amount, the execution decision corresponding to the kernel function to be executed at the moment next to the current moment.
According to this technical scheme, for at least one graphics processing process associated with the graphics processor, the resource utilization deviation corresponding to the current moment is determined from the actual resource utilization and the set resource utilization of the graphics processing process at the current moment; the resource allocation control amount corresponding to the previous moment and the kernel function accumulation amount corresponding to the current moment are acquired; and the resource allocation control amount of the graphics processing process at the current moment is determined from the pre-trained control parameter prediction model, the actual resource utilization, the set resource utilization, the resource utilization deviation, the previous resource allocation control amount and the kernel function accumulation amount, so that the execution decision corresponding to the kernel function to be executed at the next moment is determined based on this control amount. This solves the problems that resource allocation control modes in related technologies are difficult to adapt to dynamically changing computing tasks and load conditions, leading to low resource allocation efficiency and unbalanced resource allocation. By optimizing the PID control algorithm with a neural network model and dynamically controlling the computing power resource allocation process of the graphics processor with the optimized PID algorithm, the precision, efficiency and flexibility of computing power resource allocation control are improved, and the balance of computing power resource allocation is improved as well.
Optionally, the allocation control amount determination module 430 includes a control parameter determination unit and a control amount determination unit.
The control parameter determination unit is configured to input the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount and the kernel function accumulation amount into the pre-trained control parameter prediction model, so as to obtain the feedback control parameter corresponding to the graphics processing process at the current moment.
The control amount determination unit is configured to determine the resource allocation control amount of the graphics processing process at the current moment based on the resource utilization deviation and the feedback control parameter.
Optionally, the allocation control amount determination module 430 includes an accumulation amount determination unit and an execution decision determination unit.
The accumulation amount determination unit is configured to add the resource allocation control amount to the kernel function accumulation amount corresponding to the current moment, so as to obtain the kernel function accumulation amount at the moment next to the current moment.
The execution decision determination unit is configured to compare the kernel function accumulation amount at the next moment with the predetermined kernel function operation amount corresponding to the kernel function to be executed at the next moment, and to determine the execution decision corresponding to the kernel function to be executed at the next moment based on the comparison result.
Optionally, the execution decision determination unit includes an issuing execution decision determination subunit and a delayed execution decision determination subunit.
The issuing execution decision determination subunit is configured to, when the kernel function accumulation amount is greater than the kernel function operation amount, determine that the execution decision corresponding to the kernel function to be executed is issuing for execution, determine the difference between the kernel function accumulation amount and the kernel function operation amount, and update the kernel function accumulation amount based on the difference.
The delayed execution decision determination subunit is configured to, when the kernel function accumulation amount is not greater than the kernel function operation amount, determine that the execution decision corresponding to the kernel function to be executed is delayed execution.
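As a focused illustration of these two subunits, the sketch below (with hypothetical names, and assuming the accumulation amount and the operation amount are expressed in the same thread-level units) shows the issue/delay decision and the carry-over of the remaining accumulation:

```python
# Illustrative only: issue/delay decision with carry-over of the remainder.
def decide_kernel_launch(kernel_accumulation, kernel_operation_amount):
    if kernel_accumulation > kernel_operation_amount:
        # Sufficient accumulation: issue the kernel and keep the difference.
        return "issue", kernel_accumulation - kernel_operation_amount
    # Insufficient accumulation: delay and keep accumulating.
    return "delay", kernel_accumulation

decision, remaining = decide_kernel_launch(5000.0, 4096)
print(decision, remaining)  # issue 904.0
```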
Optionally, the device further includes a thread number acquisition module and a kernel function operation amount determination module.
The thread number acquisition module is configured to acquire the number of thread blocks corresponding to the kernel function to be executed and the number of threads in each thread block.
The kernel function operation amount determination module is configured to determine the product of the number of thread blocks and the number of threads, so as to obtain the kernel function operation amount corresponding to the kernel function to be executed.
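A one-line worked example of the operation amount described by these two modules (the grid and block sizes below are hypothetical):

```python
# Kernel operation amount = number of thread blocks x threads per block.
def kernel_operation_amount(num_blocks: int, threads_per_block: int) -> int:
    return num_blocks * threads_per_block

print(kernel_operation_amount(128, 256))  # 32768 thread units for a 128-block, 256-thread kernel
```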
Optionally, the device further comprises a model training module.
The model training module is used for training to obtain a control parameter prediction model;
the model training module comprises a sample data acquisition unit and a model training unit.
The sample data acquisition unit is configured to acquire a plurality of training sample data, wherein the training sample data include the actual resource utilization, the set resource utilization, the resource utilization deviation and the kernel function accumulation amount corresponding to a sample graphics processing process at a historical moment, the resource allocation control amount at the moment previous to the historical moment, and the actual feedback control parameter corresponding to the historical moment.
The model training unit is configured to train the model to be trained based on the plurality of training sample data to obtain the control parameter prediction model.
Optionally, the model training unit comprises a prediction control parameter determination subunit, a loss value determination subunit and a model parameter correction subunit.
The prediction control parameter determining subunit is configured to input, for a plurality of training sample data, the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the kernel function cumulative amount and the resource allocation control amount in the training sample data into a model to be trained, so as to obtain a prediction control parameter corresponding to the sample graphics processing process at the historical moment;
a loss value determination subunit configured to determine a loss value based on the predicted control parameter and the actual feedback control parameter included in the training sample data;
The model parameter correction subunit is configured to correct the model parameters of the model to be trained based on the loss value, taking convergence of the loss function of the model to be trained as the training target, so as to obtain the control parameter prediction model.
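A minimal training sketch for the control parameter prediction model is given below. It assumes a small fully connected network in PyTorch with a mean-squared-error loss; the layer sizes, optimizer and loss choice are assumptions for illustration and are not fixed by the embodiments.

```python
import torch
import torch.nn as nn

class GainPredictor(nn.Module):
    """Maps the five inputs named above to the three feedback control parameters."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(5, 32), nn.ReLU(),  # inputs: actual util, set util, deviation,
            nn.Linear(32, 3))             # kernel accumulation, previous control amount
    def forward(self, x):
        return self.net(x)                # outputs: (kp, ki, kd)

def train(model, inputs, targets, epochs=200, lr=1e-3):
    """inputs: (N, 5) tensor of historical samples; targets: (N, 3) actual feedback parameters."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)  # loss between predicted and actual parameters
        loss.backward()                         # correct the model parameters from the loss value
        optimizer.step()
    return model

model = train(GainPredictor(), torch.rand(64, 5), torch.rand(64, 3))  # random demo data
```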
Optionally, the feedback control parameter includes a proportional coefficient, an integral coefficient, and a differential coefficient, and the control amount determining unit includes a deviation amount determining subunit, a proportional control amount determining subunit, an integral control amount determining subunit, a differential control amount determining subunit, and a control amount determining subunit.
A deviation amount determining subunit, configured to determine a product between the resource utilization deviation and a total amount of computation cores corresponding to the graphics processor, and use the product as a resource utilization deviation amount;
A proportional control amount determining subunit configured to determine a product between the proportional coefficient and the resource utilization deviation amount, and take the product as a proportional control amount;
an integral control amount determining subunit configured to determine a product between the integral coefficient and an integral value of the resource utilization deviation amount, the product being taken as an integral control amount;
a differential control amount determination subunit configured to determine a product between the differential coefficient and a differential value of the resource utilization deviation amount, the product being taken as a differential control amount;
And the control quantity determining subunit is used for adding the proportional control quantity, the integral control quantity and the differential control quantity to obtain the resource allocation control quantity of the graphic processing process at the current moment.
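The per-term computation performed by these subunits can be illustrated as follows; the running-sum integral and first-difference derivative are discretization assumptions, not details prescribed above.

```python
# Illustrative per-term PID computation of the resource allocation control amount.
def pid_control_amount(kp, ki, kd, utilization_deviation, total_cores,
                       integral_sum, prev_deviation_amount):
    deviation_amount = utilization_deviation * total_cores          # resource utilization deviation amount
    proportional = kp * deviation_amount                            # proportional control amount
    integral_sum += deviation_amount
    integral = ki * integral_sum                                    # integral control amount
    differential = kd * (deviation_amount - prev_deviation_amount)  # differential control amount
    control_amount = proportional + integral + differential         # summed control amount
    return control_amount, integral_sum, deviation_amount

# 10% utilization shortfall on a GPU with 10240 compute cores, hypothetical gains
control, s, d = pid_control_amount(0.8, 0.05, 0.1, 0.10, 10240, 0.0, 0.0)
print(control)  # 972.8
```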
The computing power resource allocation control device provided by the embodiment of the invention can execute the computing power resource allocation control method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example five
Fig. 5 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic equipment may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including an input unit 16, such as a keyboard, mouse, etc., an output unit 17, such as various types of displays, speakers, etc., a storage unit 18, such as a magnetic disk, optical disk, etc., and a communication unit 19, such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the computing resource allocation control method.
In some embodiments, the computing power resource allocation control method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the computing power resource allocation control method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the computing power resource allocation control method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special or general purpose programmable processor, operable to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user, for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a Local Area Network (LAN), a Wide Area Network (WAN), a blockchain network, and the Internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (9)

1. A computing power resource allocation control method, characterized by comprising:
for at least one graphics processing process associated with a graphics processor, determining a resource utilization deviation corresponding to the graphics processing process at a current moment according to an actual resource utilization and a set resource utilization corresponding to the graphics processing process at the current moment, wherein the actual resource utilization indicates an amount of computing power resources applied by the graphics processing process at the current moment, and the set resource utilization indicates an amount of computing power resources the graphics processing process is expected to apply;
acquiring a resource allocation control amount corresponding to the graphics processing process at a moment previous to the current moment and a kernel function accumulation amount corresponding to the current moment, wherein the kernel function accumulation amount is determined based on the resource allocation control amount of the previous moment and a kernel function operation amount, the kernel function operation amount indicates an amount of resources required to execute a kernel function to be executed, and the kernel function to be executed is used to execute the graphics processing process on the graphics processor; and
determining the resource allocation control amount of the graphics processing process at the current moment according to a pre-trained control parameter prediction model, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount and the kernel function accumulation amount, so as to determine, based on the resource allocation control amount, an execution decision corresponding to the kernel function to be executed at a moment next to the current moment;
wherein determining the resource allocation control amount of the graphics processing process at the current moment according to the pre-trained control parameter prediction model, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount and the kernel function accumulation amount comprises: inputting the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount and the kernel function accumulation amount into the pre-trained control parameter prediction model to obtain a feedback control parameter corresponding to the graphics processing process at the current moment; and determining the resource allocation control amount of the graphics processing process at the current moment based on the resource utilization deviation and the feedback control parameter.
2. The computing power resource allocation control method according to claim 1, characterized in that determining, based on the resource allocation control amount, the execution decision corresponding to the kernel function to be executed at the moment next to the current moment comprises: adding the resource allocation control amount to the kernel function accumulation amount corresponding to the current moment to obtain the kernel function accumulation amount at the next moment; and comparing the kernel function accumulation amount at the next moment with the predetermined kernel function operation amount corresponding to the kernel function to be executed at the next moment, and determining the execution decision corresponding to the kernel function to be executed at the next moment based on the comparison result.
3. The computing power resource allocation control method according to claim 2, characterized in that determining the execution decision corresponding to the kernel function to be executed at the next moment based on the comparison result comprises: when the kernel function accumulation amount is greater than the kernel function operation amount, determining that the execution decision corresponding to the kernel function to be executed is issuing for execution, determining the difference between the kernel function accumulation amount and the kernel function operation amount, and updating the kernel function accumulation amount based on the difference; and when the kernel function accumulation amount is not greater than the kernel function operation amount, determining that the execution decision corresponding to the kernel function to be executed is delayed execution.
4. The computing power resource allocation control method according to claim 2, further comprising: acquiring the number of thread blocks corresponding to the kernel function to be executed and the number of threads in each thread block; and determining the product of the number of thread blocks and the number of threads to obtain the kernel function operation amount corresponding to the kernel function to be executed.
5. The computing power resource allocation control method according to claim 1, further comprising training to obtain the control parameter prediction model, which comprises: acquiring a plurality of training sample data, wherein the training sample data include the actual resource utilization, the set resource utilization, the resource utilization deviation and the kernel function accumulation amount corresponding to a sample graphics processing process at a historical moment, the resource allocation control amount at a moment previous to the historical moment, and an actual feedback control parameter corresponding to the historical moment; and training a model to be trained based on the plurality of training sample data to obtain the control parameter prediction model.
6. The computing power resource allocation control method according to claim 5, characterized in that training the model to be trained based on the plurality of training sample data to obtain the control parameter prediction model comprises: for the plurality of training sample data, inputting the actual resource utilization, the set resource utilization, the resource utilization deviation, the kernel function accumulation amount and the resource allocation control amount in the training sample data into the model to be trained to obtain a predicted control parameter corresponding to the sample graphics processing process at the historical moment; determining a loss value based on the predicted control parameter and the actual feedback control parameter included in the training sample data; and correcting model parameters of the model to be trained based on the loss value, taking convergence of the loss function of the model to be trained as the training target, to obtain the control parameter prediction model.
7. The computing power resource allocation control method according to claim 1, characterized in that the feedback control parameter comprises a proportional coefficient, an integral coefficient and a differential coefficient, and determining the resource allocation control amount of the graphics processing process at the current moment based on the resource utilization deviation and the feedback control parameter comprises: determining the product of the resource utilization deviation and the total number of computing cores corresponding to the graphics processor, and taking the product as a resource utilization deviation amount; determining the product of the proportional coefficient and the resource utilization deviation amount as a proportional control amount; determining the product of the integral coefficient and the integral value of the resource utilization deviation amount as an integral control amount; determining the product of the differential coefficient and the differential value of the resource utilization deviation amount as a differential control amount; and adding the proportional control amount, the integral control amount and the differential control amount to obtain the resource allocation control amount of the graphics processing process at the current moment.
8. A computing power resource allocation control device, characterized by comprising:
a deviation determination module, configured to determine, for at least one graphics processing process associated with a graphics processor, a resource utilization deviation corresponding to the graphics processing process at a current moment according to an actual resource utilization and a set resource utilization corresponding to the graphics processing process at the current moment, wherein the actual resource utilization indicates an amount of computing power resources applied by the graphics processing process at the current moment, and the set resource utilization indicates an amount of computing power resources the graphics processing process is expected to apply;
an accumulation amount acquisition module, configured to acquire a resource allocation control amount corresponding to the graphics processing process at a moment previous to the current moment and a kernel function accumulation amount corresponding to the current moment, wherein the kernel function accumulation amount is determined based on the resource allocation control amount of the previous moment and a kernel function operation amount, the kernel function operation amount indicates an amount of resources required to execute a kernel function to be executed, and the kernel function to be executed is used to execute the graphics processing process on the graphics processor; and
an allocation control amount determination module, configured to determine the resource allocation control amount of the graphics processing process at the current moment according to a pre-trained control parameter prediction model, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount and the kernel function accumulation amount, so as to determine, based on the resource allocation control amount, a kernel function execution decision of the graphics processing process at a moment next to the current moment;
wherein the allocation control amount determination module comprises a control parameter determination unit and a control amount determination unit; the control parameter determination unit is configured to input the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount and the kernel function accumulation amount into the pre-trained control parameter prediction model to obtain a feedback control parameter corresponding to the graphics processing process at the current moment; and the control amount determination unit is configured to determine the resource allocation control amount of the graphics processing process at the current moment based on the resource utilization deviation and the feedback control parameter.
9. An electronic device, characterized by comprising: at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory stores a computer program executable by the at least one processor, and the computer program, when executed by the at least one processor, enables the at least one processor to perform the computing power resource allocation control method according to any one of claims 1-7.
CN202411424947.5A | Priority date 2024-10-12 | Filing date 2024-10-12 | Computing resource allocation control method, device, electronic device and storage medium | Active | granted as CN119440808B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202411424947.5A (CN119440808B, en) | 2024-10-12 | 2024-10-12 | Computing resource allocation control method, device, electronic device and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202411424947.5A (CN119440808B, en) | 2024-10-12 | 2024-10-12 | Computing resource allocation control method, device, electronic device and storage medium

Publications (2)

Publication Number | Publication Date
CN119440808A (en) | 2025-02-14
CN119440808B | 2025-07-11

Family

ID=94518732

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202411424947.5A (Active, CN119440808B, en) | Computing resource allocation control method, device, electronic device and storage medium | 2024-10-12 | 2024-10-12

Country Status (1)

Country | Link
CN (1) | CN119440808B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN120123107B (en)* | 2025-05-14 | 2025-07-18 | 超讯通信股份有限公司 | Large-scale parallel computing optimization method based on GPU

Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN103324780A (en)* | 2012-12-20 | 2013-09-25 | 中国科学院近代物理研究所 | Particle flow simulation system and method
CN114168344A (en)* | 2021-12-15 | 2022-03-11 | 中山大学 | GPU resource allocation method, device, equipment and readable storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN115098257B (en)* | 2022-06-23 | 2024-12-13 | 中国电信股份有限公司 | Resource scheduling method, device, equipment and storage medium
CN116541162A (en)* | 2023-03-15 | 2023-08-04 | 北京趋动智能科技有限公司 | Calculation force control method and device, storage medium and electronic equipment
CN117201308A (en)* | 2023-08-14 | 2023-12-08 | 北京邮电大学 | Network resource allocation method, system, storage medium and electronic equipment

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN103324780A (en)* | 2012-12-20 | 2013-09-25 | 中国科学院近代物理研究所 | Particle flow simulation system and method
CN114168344A (en)* | 2021-12-15 | 2022-03-11 | 中山大学 | GPU resource allocation method, device, equipment and readable storage medium

Also Published As

Publication number | Publication date
CN119440808A (en) | 2025-02-14

Similar Documents

Publication | Publication Date | Title
CN119440808B (en) Computing resource allocation control method, device, electronic device and storage medium
CN112965903B (en) Testing method, device, electronic device and computer-readable storage medium
CN114500339B (en)Node bandwidth monitoring method and device, electronic equipment and storage medium
Singh et al. Ensemble learning for large-scale workload prediction
WO2021253851A1 (en)Cluster distributed resource scheduling method, apparatus and device, and storage medium
CN117785465A (en)Resource scheduling method, device, equipment and storage medium
CN117472471A (en)Application program configuration method, device, equipment and storage medium
CN114490160B (en)Automatic adjustment method, device, equipment and medium for data inclination optimization factors
CN119440726A (en) Resource optimization method, device, electronic device and medium for heterogeneous computing cluster
CN117112222A (en)Request processing method and device, electronic equipment and storage medium
CN115598967A (en)Parameter setting model training method, parameter determining method, device, equipment and medium
CN115438007A (en)File merging method and device, electronic equipment and medium
CN117251295B (en)Training method, device, equipment and medium of resource prediction model
CN113886746A (en) Page loading method, apparatus, device and medium
CN119311415A (en) Computing resource allocation method, device, electronic device and storage medium
US9389919B2 (en) Managing workload distribution among computer systems based on intersection of throughput and latency models
CN117971509B (en)Heterogeneous computing power cluster operation performance optimization method, heterogeneous computing power cluster operation performance optimization device, heterogeneous computing power cluster operation performance optimization equipment and medium
CN115361449B (en)Method, device, equipment and storage medium for adjusting IP resources
CN115442432B (en)Control method, device, equipment and storage medium
CN114626636B (en) Power grid load forecasting method, device, modeling method, computer equipment and medium
CN120492110A (en)Task optimization scheduling method, equipment, medium and program for power grid system
CN119697190A (en) A load balancing method, device, equipment, medium and program product
CN117687792A (en)Low-delay response method, device, equipment and medium for key process of system
CN119094472A (en) Large model resource allocation method, device, equipment and medium
CN119356857A (en) A computing task processing method, device, electronic device and storage medium

Legal Events

Date | Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
