Detailed Description
So that those skilled in the art will better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 1 is a flowchart of a computing power resource allocation control method according to a first embodiment of the present invention. The method may be performed by a computing resource allocation control device, which may be implemented in hardware and/or software and may be configured in a terminal and/or a server. As shown in Fig. 1, the method includes:
S110, for at least one graphics processing process associated with a graphics processor, determining a resource utilization deviation corresponding to the graphics processing process at the current time according to the actual resource utilization and the set resource utilization corresponding to the graphics processing process at the current time.
Here, the graphics processor (Graphics Processing Unit, GPU) is a microprocessor dedicated to image- and graphics-related processing on personal computers, workstations, game consoles, and some mobile devices (tablet computers, smartphones, etc.). Typically, at least one process (or task) associated with graphics processing may be executed based on the resources contained in the GPU. Once at least one graphics processing process is determined, at least part of the resources contained in the GPU may be allocated to the respective graphics processing process, so that the graphics processing process performs graphics processing operations according to the allocated GPU resources. A graphics processing process may be a process that performs a series of operations on graphics data. In this embodiment, the graphics processing processes may be executing respective graphics processing tasks on respective virtual graphics processing units in the graphics processor. It is appreciated that the rendering resources and computing resources of a physical GPU may be packaged into multiple independent virtual slices through virtualization techniques. Each virtual slice is assigned to a different virtual machine, enabling it to independently access and utilize GPU resources; such a virtual slice may be referred to as a virtual graphics processing unit (Virtual Graphics Processing Unit, VGPU). The actual resource utilization indicates the amount of computing resources that the graphics processing process is actually using at the current time; it can be understood as the ratio between the amount of resources actually used by the graphics processing process to perform graphics processing operations within a specific period and the total amount of resources.
It should be noted that the manner of determining the actual resource utilization of a graphics processing process differs between graphics cards. By way of example, assuming that the graphics card provided on the GPU is an NVIDIA graphics card, the actual resource utilization is typically determined by calculating the percentage of active streaming multiprocessors out of the total number of streaming multiprocessors. The set resource utilization indicates the amount of computing power resources that the graphics processing process is expected to use; it can be understood as the ratio between the amount of resources that the graphics processing process is expected to use for graphics processing operations within a specific period and the total amount of resources.
In this embodiment, the set resource utilization may be a predetermined resource utilization allocated to the corresponding graphics processing process, so that the graphics processing process executes its graphics processing task on the graphics processor based on the set resource utilization. In the course of executing the task, a certain deviation exists between the actual resource utilization of the executing task and the corresponding set resource utilization. To determine the resource utilization deviation corresponding to the graphics processing process at each time, the actual resource utilization and the set resource utilization corresponding to the graphics processing process at the current time can be obtained, the difference between the set resource utilization and the actual resource utilization can be determined, and that difference can be used as the resource utilization deviation corresponding to the graphics processing process at the current time. The actual resource utilization can be obtained in various ways; optionally, a performance monitoring tool collects the resource utilization of the graphics processing process in real time or periodically, and the collected value is used as the actual resource utilization.
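As a minimal, non-limiting sketch (the function and variable names are illustrative, not part of the embodiment), the deviation computation of S110 reduces to a subtraction of two utilization ratios:

```python
def resource_utilization_deviation(set_util: float, actual_util: float) -> float:
    """Deviation at the current time: set utilization minus actual utilization.

    Both inputs are ratios in [0, 1], e.g. as sampled periodically by a
    performance monitoring tool.
    """
    return set_util - actual_util

# A process expected to use 60% of the GPU but actually using 45%
# currently under-uses its allocation by 15 percentage points.
deviation = resource_utilization_deviation(0.60, 0.45)
```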
S120, acquiring the resource allocation control amount of the graphics processing process corresponding to the time preceding the current time, and the kernel function cumulative amount corresponding to the current time.
The resource allocation control amount represents the control quantity according to which the resource utilization of the graphics processing process is adjusted at the corresponding time. Once the actual resource utilization and the set resource utilization are determined, the actual resource utilization can be adjusted according to the resource allocation control amount so that it approaches the set resource utilization, thereby allocating graphics processing resources effectively and improving resource utilization. In this embodiment, the resource allocation control amount may be understood as the computing resources allocated to the graphics processing process. The kernel function cumulative amount may be an amount of resources obtained by superposing at least one resource allocation control amount, which may be used for executing the corresponding kernel function. The kernel function cumulative amount at the current time is determined based on the resource allocation control amount at the previous time and the kernel function operand at the previous time. The kernel function operand indicates the amount of resources required to execute the kernel function to be executed, and the kernel function to be executed is used to carry out the graphics processing process on the graphics processor. In general, when a graphics processing process is executed on a graphics processor, the kernel function to be executed that corresponds to the process is issued to the graphics processor. The kernel function operand may be determined according to the number of thread blocks carried by the kernel function to be executed when it is issued and the number of threads in each thread block.
In this embodiment, once the resource allocation control amount at the previous time is determined, it may be input to the kernel function execution unit to be superimposed on the kernel function cumulative amount held by that unit. The superimposed cumulative amount may then be compared with the kernel function operand. When the superimposed cumulative amount is larger than the kernel function operand, the difference between the two may be determined and used as the kernel function cumulative amount at the current time.
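The bookkeeping described above can be sketched as follows; this is an illustrative reading of the embodiment, and all names are ours:

```python
def update_kernel_accumulation(prev_control, prev_accum, kernel_operand):
    """Superimpose the last control amount onto the cumulative amount held by
    the kernel execution unit.  When the result exceeds the operand of the
    kernel to be executed, the kernel can be issued and the surplus is
    carried over as the cumulative amount for the current time.

    Returns (new cumulative amount, whether the kernel may be issued).
    """
    accum = prev_accum + prev_control
    if accum > kernel_operand:
        return accum - kernel_operand, True   # surplus carried over, kernel issued
    return accum, False                       # not enough yet, delay execution

# Kernel operand derived from the launch shape: e.g. 8 thread blocks
# of 128 threads each.
operand = 8 * 128                                              # 1024
accum, issued = update_kernel_accumulation(600.0, 0.0, operand)    # delayed
accum, issued = update_kernel_accumulation(600.0, accum, operand)  # issued
```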
In this embodiment, in order to determine the resource allocation control amount corresponding to the current time, the resource allocation control amount corresponding to the time preceding the current time and the kernel function cumulative amount corresponding to the current time may be obtained. The resource allocation control amount corresponding to the current time may then be determined based on the acquired resource allocation control amount, the kernel function cumulative amount, and the determined resource utilization deviation.
S130, determining the resource allocation control amount of the graphics processing process at the current time according to a control parameter prediction model obtained through pre-training, together with the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function cumulative amount, so as to determine, based on the resource allocation control amount, the execution decision of the graphics processing process for the kernel function to be executed at the time next to the current time.
The control parameter prediction model may be a neural network model for predicting feedback control parameters. In this embodiment, the model takes the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function cumulative amount as inputs, and determines the feedback control parameter from them. The model may be obtained by training a pre-constructed neural network model based on the actual resource utilization, set resource utilization, resource utilization deviation, and kernel function cumulative amount corresponding to a historical time, the resource allocation control amount at the time preceding that historical time, and the actual feedback control parameter corresponding to the historical time. It will be appreciated that the feedback control coefficients are the core of a PID control algorithm: they determine how the graphics processing process adjusts the actual resource utilization according to the resource utilization deviation, so as to reduce the difference between the actual and set resource utilization. Optionally, the feedback control coefficients include a proportional coefficient, an integral coefficient, and a differential coefficient. The execution decision characterizes whether the respective kernel function is executed at the respective time; optionally, the execution decision is either to issue the kernel function for execution or to delay its execution.
In practical applications, the PID control algorithm can be applied to the allocation control of VGPU computing power resources. In general, however, the PID control algorithm can guarantee the correctness and convergence of the control amount only if the system is linear and time-invariant. Linearity means there should be a linear relationship between the resource allocation control amount and the resource allocation amount; time invariance means that the same resource allocation control amount input at any time should produce the same resource allocation amount as output. Both properties are difficult to achieve on a GPU: first, the kernel functions and data volumes invoked on the GPU frequently change on a large scale, breaking linearity and time invariance; second, instantaneous changes in massively parallel computing performance are constrained by parameters such as video memory, compute cores, and data transfer, so the running performance differs greatly at different times and linearity cannot be achieved. Consequently, when GPU computing power resource allocation is controlled purely by a PID control algorithm, the resource utilization of each graphics processing process may not be controlled precisely, which in turn may affect the execution efficiency and effect of the graphics processing process.
In view of the above, in this embodiment, a control parameter prediction model may be used to predict the feedback control parameter at the current time. Further, the resource allocation control amount at the current time may be determined according to the predicted feedback control parameter to adjust the actual resource utilization based on the resource allocation control amount.
It should be noted that the control parameter prediction model may be a neural network model with any model structure, optionally a recurrent neural network (Recurrent Neural Network, RNN). The benefit of applying a recurrent neural network is that RNNs can learn time-varying patterns and temporal dependencies, which is particularly important for non-stationary systems whose properties vary over time, making them suitable for controlling dynamic systems with time-varying behavior.
In this embodiment, once the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function cumulative amount corresponding to the graphics processing process at the current time are obtained, they may be processed by the pre-trained control parameter prediction model to obtain the feedback control parameter of the graphics processing process at the current time. The resource allocation control amount of the graphics processing process at the current time can then be determined from the feedback control parameter, and the execution decision for the kernel function to be executed can be determined based on that resource allocation control amount.
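The flow of S130 can be sketched end to end. The stub below stands in for the trained control parameter prediction model, only the proportional PID term is shown, and every name is illustrative rather than part of the embodiment:

```python
def control_step(actual, setpoint, prev_control, kernel_accum, predict_params):
    """One S130 iteration: the model maps the five state quantities to PID
    coefficients, which then yield the resource allocation control amount."""
    deviation = setpoint - actual
    kp, ki, kd = predict_params(actual, setpoint, deviation,
                                prev_control, kernel_accum)
    # Sketch: proportional term only; the full PID form also adds the
    # integral and differential terms of the deviation.
    return kp * deviation

# Stand-in for the trained predictor: fixed coefficients for illustration.
def stub_predictor(*state):
    return 2.0, 0.1, 0.05   # (kp, ki, kd)

control = control_step(0.45, 0.60, 0.0, 0.0, stub_predictor)  # 2.0 * 0.15
```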
According to the technical solution of this embodiment, for at least one graphics processing process associated with the graphics processor, the resource utilization deviation corresponding to the graphics processing process at the current time is determined according to the actual resource utilization and the set resource utilization corresponding to the current time; the resource allocation control amount corresponding to the time preceding the current time and the kernel function cumulative amount corresponding to the current time are obtained; and the resource allocation control amount of the graphics processing process at the current time is determined according to the pre-trained control parameter prediction model, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function cumulative amount, so that the execution decision for the kernel function to be executed at the time next to the current time is determined based on the resource allocation control amount. This solves the problems in the related art that the resource allocation control mode is difficult to adapt to dynamically changing computing tasks and load conditions, resulting in low resource allocation efficiency and unbalanced resource allocation. By optimizing the PID control algorithm based on a neural network model, the effect of dynamically controlling the computing resource allocation process of the graphics processor is improved, the accuracy and flexibility of the resource allocation control process are improved, and the computing resources are allocated in a more balanced manner.
Example 2
Fig. 2 is a flowchart of a computing power resource allocation control method according to a second embodiment of the present invention. On the basis of the foregoing embodiment, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function cumulative amount are input into a control parameter prediction model obtained through pre-training to obtain the feedback control parameter of the graphics processing process at the current time, and the resource allocation control amount of the graphics processing process at the current time is determined based on the resource utilization deviation and the feedback control parameter. For the specific implementation, reference may be made to the technical solution of this embodiment. Technical terms identical or similar to those of the above embodiment are not repeated herein.
As shown in fig. 2, the method includes:
S210, for at least one graphics processing process associated with the graphics processor, determining the resource utilization deviation corresponding to the graphics processing process at the current time according to the actual resource utilization and the set resource utilization corresponding to the graphics processing process at the current time.
S220, acquiring the resource allocation control amount of the graphics processing process corresponding to the time preceding the current time, and the kernel function cumulative amount corresponding to the current time.
S230, inputting the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function cumulative amount into the control parameter prediction model obtained through pre-training, to obtain the feedback control parameter of the graphics processing process at the current time.
The feedback control parameters refer to various parameters used for adjusting system performance and control accuracy in a feedback control system. In this embodiment, the feedback control parameter may be used to adjust the deviation between the actual resource utilization and the set resource utilization so that the actual resource utilization approaches the set resource utilization.
In this embodiment, the control parameter prediction model may be obtained by training based on the actual resource utilization, set resource utilization, resource utilization deviation, and kernel function cumulative amount corresponding to a historical time, the resource allocation control amount at the time immediately before the historical time, and the actual feedback control parameter corresponding to the historical time. Before applying the control parameter prediction model provided in this embodiment, a pre-constructed neural network model may be trained in a supervised or unsupervised manner. Before training the neural network model, a plurality of training samples may be constructed so that the model can be trained on them; to improve the accuracy of the control parameter prediction model, the training samples should be as numerous and as diverse as possible. Optionally, the training process of the control parameter prediction model includes obtaining a plurality of training sample data, and training the model to be trained on these data to obtain the control parameter prediction model.
Each training sample datum comprises the actual resource utilization, set resource utilization, resource utilization deviation, and kernel function cumulative amount corresponding to a historical time of a sample graphics processing process, the resource allocation control amount at the time preceding that historical time, and the actual feedback control parameter corresponding to the historical time. The sample graphics processing process may be any graphics processing process that has been fully executed. The actual feedback control parameter may be determined according to a predetermined control-parameter determination manner; in general, it is a feedback control parameter with which the graphics processor achieves a good resource allocation control effect for the sample graphics processing process when performing resource allocation control based on the feedback control algorithm. The model to be trained may be a neural network model whose parameters are initial or default values.
In this embodiment, for each of a plurality of historical times associated with a sample graphics processing process, the actual resource utilization, set resource utilization, resource utilization deviation, and kernel function cumulative amount corresponding to the sample graphics processing process at that historical time, the resource allocation control amount at the preceding time, and the actual feedback control parameter at the historical time are obtained, and a training sample datum is constructed from them. A plurality of training sample data corresponding to the sample graphics processing process can thus be obtained, and the model to be trained can be trained on these data to obtain the control parameter prediction model.
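The sample construction above can be sketched as follows; the record fields and names are illustrative assumptions about how such a trace might be stored, not part of the embodiment:

```python
from collections import namedtuple

# One training sample datum, mirroring the quantities listed above.
TrainingSample = namedtuple("TrainingSample", [
    "actual_util", "set_util", "deviation", "kernel_accum",
    "prev_control", "actual_feedback_params"])

def build_samples(history):
    """Build one sample per historical time t >= 1 from a recorded trace of
    a fully executed sample graphics processing process (list of dicts)."""
    samples = []
    for t in range(1, len(history)):
        rec, prev = history[t], history[t - 1]
        samples.append(TrainingSample(
            actual_util=rec["actual"],
            set_util=rec["set"],
            deviation=rec["set"] - rec["actual"],
            kernel_accum=rec["accum"],
            prev_control=prev["control"],           # control amount at time t-1
            actual_feedback_params=rec["params"]))  # supervision target
    return samples
```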
Optionally, training the model to be trained based on the plurality of training sample data to obtain the control parameter prediction model includes: for each training sample datum, inputting the actual resource utilization, set resource utilization, resource utilization deviation, kernel function cumulative amount, and resource allocation control amount in the datum into the model to be trained, to obtain the predicted control parameter corresponding to the sample graphics processing process at the historical time; determining a loss value based on the predicted control parameter and the actual feedback control parameter included in the datum; and correcting the model parameters of the model to be trained based on the loss value, with convergence of the loss function in the model to be trained as the training target, to obtain the control parameter prediction model.
Wherein the loss value may be a value indicative of the degree of difference between the predicted output and the actual output. The loss function may be a function determined based on the loss value that characterizes the degree of difference between the predicted output and the actual output. The loss function may be any loss function, alternatively a mean square error loss function.
As an optional implementation of this embodiment, for each training sample datum, the actual resource utilization, set resource utilization, resource utilization deviation, kernel function cumulative amount, and resource allocation control amount in the datum may be input into the model to be trained and processed by it to obtain the predicted control parameter corresponding to the historical time. The predicted control parameter may then be compared with the actual feedback control parameter in the datum to determine the loss value, and the model parameters of the model to be trained corrected on that basis. The training error of the loss function may be used as the condition for detecting whether the loss function has converged: for example, whether the training error is smaller than a preset error, whether the error trend has stabilized, or whether the current number of iterations has reached a preset number. If the convergence condition is met, for example the training error of the loss function is smaller than the preset error or the error change has stabilized, training of the model to be trained is complete and the iterative training can be stopped. If the condition is not met, further sample data can be obtained to continue training until the training error of the loss function falls within a preset range.
When the training error of the loss function has converged, the trained model can be used as the control parameter prediction model.
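The training loop described above can be sketched as follows. To keep the sketch dependency-free, a small hand-written linear model stands in for the neural network (an RNN would be used in practice); all class and function names are illustrative:

```python
class LinearParamModel:
    """Dependency-free stand-in for the model to be trained."""
    def __init__(self, n_in, n_out):
        self.w = [[0.0] * n_in for _ in range(n_out)]
        self.b = [0.0] * n_out

    def forward(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) + b
                for row, b in zip(self.w, self.b)]

    def backward(self, x, target, lr):
        # One gradient-descent correction of the model parameters.
        pred = self.forward(x)
        for j in range(len(self.b)):
            grad = 2.0 * (pred[j] - target[j]) / len(self.b)
            self.b[j] -= lr * grad
            for i, xi in enumerate(x):
                self.w[j][i] -= lr * grad * xi

def train_control_parameter_model(samples, model, lr=0.1,
                                  max_iters=2000, tol=1e-6):
    """Train until the mean squared error converges: it falls below a preset
    threshold, or the error change stabilizes, or the iteration cap is hit."""
    prev_loss = float("inf")
    for _ in range(max_iters):
        loss = 0.0
        for inputs, target in samples:   # (model inputs, actual feedback params)
            pred = model.forward(inputs)
            loss += sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)
            model.backward(inputs, target, lr)
        loss /= len(samples)
        if loss < tol or abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
    return model
```

The mean-squared-error loss and the stopping conditions correspond to the convergence checks described above; any differentiable model with the same `forward`/`backward` interface could be substituted.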
Once the trained control parameter prediction model is obtained, the actual resource utilization, set resource utilization, resource utilization deviation, resource allocation control amount, and kernel function cumulative amount acquired for the graphics processing process at the current time can be input into it and processed to obtain the feedback control parameter of the graphics processing process at the current time.
S240, determining the resource allocation control amount of the graphics processing process at the current time based on the resource utilization deviation and the feedback control parameter, so as to determine, based on the resource allocation control amount, the execution decision of the graphics processing process for the kernel function to be executed at the time next to the current time.
In this embodiment, when the feedback control parameter corresponding to the current time is obtained, the resource allocation control amount of the graphics processing process at the current time may be determined based on the resource utilization deviation corresponding to the current time and the feedback control parameter.
Optionally, the feedback control parameters comprise a proportional coefficient, an integral coefficient, and a differential coefficient, and determining the resource allocation control amount of the graphics processing process at the current time based on the resource utilization deviation and the feedback control parameters includes: determining the product of the resource utilization deviation and the total number of compute cores of the graphics processor, and taking the product as the resource amount deviation; determining the product of the proportional coefficient and the resource amount deviation as the proportional control amount; determining the product of the integral coefficient and the integral value of the resource amount deviation as the integral control amount; determining the product of the differential coefficient and the differential value of the resource amount deviation as the differential control amount; and adding the proportional, integral, and differential control amounts to obtain the resource allocation control amount of the graphics processing process at the current time.
Where a compute core refers to a core unit in a computer processor (e.g., a CPU or GPU) that is responsible for performing computing tasks. The total number of compute cores may refer to the total number of independent processor cores integrated within a computer processor.
As an optional implementation of this embodiment, the product of the resource utilization deviation and the total number of compute cores of the graphics processor may be determined and used as the resource amount deviation corresponding to the current time. The product of the proportional coefficient and the resource amount deviation at the current time may then be determined and used as the proportional control amount at the current time. The integral value of the resource amount deviation over a preset period is determined, and the product of the integral coefficient and this integral value is used as the integral control amount. Likewise, the differential value of the resource amount deviation over the preset period is determined, and the product of the differential coefficient and this differential value is used as the differential control amount. The proportional, integral, and differential control amounts may then be added, and the sum used as the resource allocation control amount of the graphics processing process at the current time. The preset period may be a period of any length.
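The computation above can be sketched as a discrete PID update; the names, the unit time step, and the growing deviation window are illustrative assumptions:

```python
def pid_control_amount(utilization_deviation, total_cores,
                       kp, ki, kd, error_history, dt=1.0):
    """Scale the utilization-rate deviation by the total number of compute
    cores, then combine the proportional, integral, and differential terms
    over a window of past resource-amount deviations.

    Returns (control amount, updated deviation window).
    """
    e = utilization_deviation * total_cores              # resource-amount deviation
    history = error_history + [e]
    proportional = kp * e
    integral = ki * sum(history) * dt                    # integral over the window
    differential = (kd * (history[-1] - history[-2]) / dt
                    if len(history) > 1 else 0.0)        # discrete differential
    return proportional + integral + differential, history

# A 10% deviation on a GPU with 100 compute cores, no prior history:
control, hist = pid_control_amount(0.10, 100, kp=1.0, ki=0.1, kd=0.01,
                                   error_history=[])
# proportional = 10.0, integral = 1.0, differential = 0.0
```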
According to the technical solution of this embodiment, the actual resource utilization, the set resource utilization, the resource utilization deviation, the resource allocation control amount, and the kernel function cumulative amount are input into the pre-trained control parameter prediction model to obtain the feedback control parameter of the graphics processing process at the current time, and the resource allocation control amount of the graphics processing process at the current time is determined based on the resource utilization deviation and the feedback control parameter. This achieves the effect of dynamically adjusting the feedback control parameter based on a neural network model to improve the precision of computing power resource allocation control. By optimizing the PID control algorithm with a neural network model, the feedback control coefficients can be quickly regulated to convergence even while the GPU's computing power resources change dynamically, so that the feedback control coefficients do not oscillate sharply and can quickly settle again after a sudden change.
Example 3
Fig. 3 is a flowchart of a method for controlling computing power resource allocation according to a third embodiment of the present invention. On the basis of the foregoing embodiments, the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the resource allocation control amount, and the kernel function cumulative amount are input into a control parameter prediction model obtained by training in advance, so as to obtain the feedback control parameters corresponding to the graphics processing process at the current time, and the resource allocation control amount of the graphics processing process at the current time is determined based on the resource utilization rate deviation and the feedback control parameters. For a specific implementation, reference may be made to the technical solutions of the foregoing embodiments. Technical terms identical or similar to those of the foregoing embodiments are not repeated herein.
As shown in fig. 3, the method includes:
S310, for at least one graphics processing process associated with the graphics processor, determining a resource utilization rate deviation corresponding to the graphics processing process at the current time according to an actual resource utilization rate and a set resource utilization rate corresponding to the graphics processing process at the current time.
S320, acquiring a resource allocation control amount corresponding to the graphics processing process at the time previous to the current time and a kernel function cumulative amount corresponding to the current time.
S330, determining the resource allocation control amount of the graphics processing process at the current time according to a control parameter prediction model obtained by training in advance, the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the resource allocation control amount, and the kernel function cumulative amount, and adding the resource allocation control amount to the kernel function cumulative amount corresponding to the current time to obtain the kernel function cumulative amount at the time next to the current time.
In this embodiment, when the resource allocation control amount at the current time is obtained, it may be input into a kernel function execution unit, which maintains the kernel function cumulative amount at the current time. Further, the resource allocation control amount corresponding to the current time may be added to the kernel function cumulative amount at the current time, and the sum may be used as the kernel function cumulative amount at the time next to the current time.
S340, comparing the kernel function cumulative amount at the next time with a predetermined kernel function operation amount at the next time corresponding to the kernel function to be executed, and determining an execution decision corresponding to the kernel function to be executed at the next time based on the comparison result.
In this embodiment, after the kernel function cumulative amount at the next time is obtained, it may be compared with the kernel function operation amount at the next time to determine whether the cumulative amount can support the execution of the kernel function to be executed at the next time.
Optionally, determining the execution decision corresponding to the kernel function to be executed at the next time based on the comparison result includes: in the case that the kernel function cumulative amount is greater than the kernel function operation amount, determining that the execution decision corresponding to the kernel function to be executed is issuing for execution, determining a difference between the kernel function cumulative amount and the kernel function operation amount, and updating the kernel function cumulative amount based on the difference; and in the case that the kernel function cumulative amount is not greater than the kernel function operation amount, determining that the execution decision corresponding to the kernel function to be executed is delayed execution.
As an optional implementation of this embodiment, in the case that the kernel function cumulative amount is greater than the kernel function operation amount, it may be determined that the execution decision corresponding to the kernel function to be executed is issuing for execution, and the kernel function to be executed is issued to the graphics processor so as to execute the graphics processing process on the graphics processor. When the kernel function to be executed is issued, the required resource amount is the resource amount corresponding to the kernel function operation amount. Further, a difference between the kernel function cumulative amount and the kernel function operation amount may be determined, and the kernel function cumulative amount may be updated based on the difference, that is, the difference may be taken as the kernel function cumulative amount at the next time. Conversely, in the case that the kernel function cumulative amount is not greater than the kernel function operation amount, the execution decision corresponding to the kernel function to be executed is determined to be delayed execution, and the kernel function cumulative amount at the next time is kept unchanged.
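The accumulation update of S330 together with the issue/delay decision described above behaves like a token bucket: the resource allocation control amount deposits budget, and issuing a kernel function withdraws its operation amount. A minimal sketch, with hypothetical function names:

```python
def update_accumulation(control_amount, current_accumulation):
    """Kernel function cumulative amount for the next time step (cf. S330)."""
    return current_accumulation + control_amount


def execution_decision(accumulation, kernel_operation_amount):
    """Compare the cumulative amount with the amount the pending kernel needs.

    Returns the decision and the cumulative amount carried to the next step.
    """
    if accumulation > kernel_operation_amount:
        # Enough accumulated resource: issue the kernel and deduct its cost.
        return "issue", accumulation - kernel_operation_amount
    # Otherwise delay the kernel; the cumulative amount is left unchanged.
    return "delay", accumulation
```

Note that, as in the embodiment, an exactly equal cumulative amount ("not greater than") still results in delayed execution.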
In this embodiment, before the kernel function cumulative amount is compared with the kernel function operation amount, the kernel function operation amount corresponding to the kernel function to be executed may be determined. Optionally, the kernel function operation amount may be determined as follows: the number of thread blocks corresponding to the kernel function to be executed and the number of threads in each thread block are obtained, and the product of the number of thread blocks and the number of threads is determined to obtain the kernel function operation amount corresponding to the kernel function to be executed.
A thread block is a group made up of multiple threads, that is, a collection containing multiple threads. Threads in a thread block may cooperate during execution, and thread blocks may be used to organize and manage threads to achieve efficient parallel computing. A thread is an execution unit in a process and is the minimum unit of operation scheduling that an operating system can perform. The number of thread blocks and the number of threads in each thread block are parameters among the execution parameters associated with the kernel function to be executed and may be obtained directly from those execution parameters.
As an optional implementation of this embodiment, the number of thread blocks corresponding to the kernel function to be executed and the number of threads contained in each thread block may be obtained. Further, the product of the number of thread blocks and the number of threads may be determined, and the product may be used as the kernel function operation amount corresponding to the kernel function to be executed.
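A minimal sketch of this computation, with hypothetical names; the grid size of 128 blocks of 256 threads is purely illustrative:

```python
def kernel_operation_amount(num_thread_blocks, threads_per_block):
    """Operation amount of a kernel function: thread blocks times threads per block."""
    return num_thread_blocks * threads_per_block


# E.g., a kernel launched as a grid of 128 blocks with 256 threads each.
amount = kernel_operation_amount(128, 256)
```

In practice the two values would be read from the execution parameters associated with the kernel function to be executed rather than hard-coded.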
According to the technical solution of this embodiment, the resource allocation control amount of the graphics processing process at the current time is determined according to the control parameter prediction model obtained by training in advance, the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the resource allocation control amount, and the kernel function cumulative amount, and the resource allocation control amount is added to the kernel function cumulative amount corresponding to the current time to obtain the kernel function cumulative amount at the next time. Further, the kernel function cumulative amount at the next time is compared with the predetermined kernel function operation amount corresponding to the kernel function to be executed at the next time, and the execution decision corresponding to the kernel function to be executed at the next time is determined based on the comparison result. In this way, the PID control algorithm is optimized based on the neural network model, so that the computing power resource allocation process of the graphics processor is dynamically controlled based on the optimized PID algorithm, which improves the precision, efficiency, and flexibility of computing power resource allocation control, improves the balance of computing power resource allocation, and improves the universality of the PID control algorithm in the computing power allocation control process.
Example 4
Fig. 4 is a schematic structural diagram of a computing power resource allocation control device according to a fourth embodiment of the present invention. As shown in fig. 4, the apparatus includes a deviation determination module 410, an accumulated amount acquisition module 420, and an allocation control amount determination module 430.
The deviation determination module 410 is configured to determine, for at least one graphics processing process associated with a graphics processor, a resource utilization rate deviation corresponding to the graphics processing process at the current time according to an actual resource utilization rate and a set resource utilization rate corresponding to the graphics processing process at the current time, where the actual resource utilization rate is used to indicate the amount of computing power resources applied by the graphics processing process at the current time, and the set resource utilization rate is used to indicate the amount of computing power resources the graphics processing process is expected to apply. The cumulative amount acquisition module 420 is configured to acquire a resource allocation control amount corresponding to the graphics processing process at the time previous to the current time and a kernel function cumulative amount corresponding to the current time, where the kernel function cumulative amount is determined based on the resource allocation control amount and the kernel function operation amount at the previous time, the kernel function operation amount is used to indicate the resource amount required to execute the kernel function to be executed, and the kernel function to be executed is used to execute the graphics processing process on the graphics processor. The allocation control amount determination module 430 is configured to determine the resource allocation control amount of the graphics processing process at the current time according to the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the resource allocation control amount, and the kernel function cumulative amount, so as to determine, based on the resource allocation control amount, the execution decision corresponding to the kernel function to be executed at the time next to the current time.
According to the technical solution of this embodiment, for at least one graphics processing process associated with the graphics processor, the resource utilization rate deviation corresponding to the graphics processing process at the current time is determined according to the actual resource utilization rate and the set resource utilization rate corresponding to the graphics processing process at the current time; the resource allocation control amount corresponding to the time previous to the current time and the kernel function cumulative amount corresponding to the current time are acquired; the resource allocation control amount of the graphics processing process at the current time is determined according to the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the resource allocation control amount, and the kernel function cumulative amount; and the execution decision corresponding to the kernel function to be executed at the time next to the current time is determined based on the resource allocation control amount. This solves the problems in the related art that the resource allocation control method is difficult to adapt to dynamically changing computing tasks and load conditions, resulting in low resource allocation efficiency and unbalanced resource allocation. The PID control algorithm is optimized based on the neural network model, so that the computing power resource allocation process of the graphics processor is dynamically controlled based on the optimized PID algorithm, which improves the precision, efficiency, and flexibility of the computing power resource allocation control process and improves the balance of computing power resource allocation.
Alternatively, the allocation control amount determining module 430 includes a control parameter determining unit and a control amount determining unit.
A control parameter determining unit, configured to input the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the resource allocation control amount, and the kernel function cumulative amount into a control parameter prediction model obtained by training in advance, so as to obtain a feedback control parameter corresponding to the current time of the graphics processing process;
And the control quantity determining unit is used for determining the resource allocation control quantity of the graphic processing process at the current moment based on the resource utilization rate deviation and the feedback control parameter.
Alternatively, the allocation control amount determining module 430 includes an accumulation amount determining unit and an execution decision determining unit.
An accumulated amount determining unit, configured to add the resource allocation control amount to an accumulated amount of kernel functions corresponding to the current time, to obtain an accumulated amount of kernel functions at a time next to the current time;
and the execution decision determining unit is used for comparing the kernel function cumulative amount at the next time with the predetermined kernel function operation amount at the next time corresponding to the kernel function to be executed, and determining the execution decision of the kernel function at the next time based on the comparison result.
Optionally, the execution decision determining unit includes an issuing execution decision determining subunit and a delayed execution decision determining subunit.
The issuing execution decision determining subunit is used for determining the execution decision corresponding to the kernel function to be executed as issuing execution under the condition that the kernel function cumulative amount is larger than the kernel function operation amount, determining the difference value between the kernel function cumulative amount and the kernel function operation amount, and updating the kernel function cumulative amount based on the difference value;
and the delayed execution decision determining subunit is used for determining that the execution decision corresponding to the kernel function to be executed is delayed execution in the case that the kernel function cumulative amount is not greater than the kernel function operation amount.
Optionally, the device further includes a thread number acquisition module and a kernel function operation amount determination module.
The thread number acquisition module is used for acquiring the number of thread blocks corresponding to the kernel function to be executed and the number of threads in each thread block;
and the kernel function operation amount determination module is used for determining the product of the number of thread blocks and the number of threads to obtain the kernel function operation amount corresponding to the kernel function to be executed.
Optionally, the device further comprises a model training module.
The model training module is used for training to obtain a control parameter prediction model;
the model training module comprises a sample data acquisition unit and a model training unit.
The sample data acquisition unit is used for acquiring a plurality of pieces of training sample data, where the training sample data includes the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, and the kernel function cumulative amount corresponding to a sample graphics processing process at a historical time, the resource allocation control amount at the time previous to the historical time, and the actual feedback control parameters corresponding to the historical time;
and the model training unit is used for training the model to be trained based on the training sample data to obtain a control parameter prediction model.
Optionally, the model training unit comprises a prediction control parameter determination subunit, a loss value determination subunit and a model parameter correction subunit.
The prediction control parameter determining subunit is configured to input, for a plurality of training sample data, the actual resource utilization rate, the set resource utilization rate, the resource utilization rate deviation, the kernel function cumulative amount and the resource allocation control amount in the training sample data into a model to be trained, so as to obtain a prediction control parameter corresponding to the sample graphics processing process at the historical moment;
a loss value determination subunit configured to determine a loss value based on the predicted control parameter and the actual feedback control parameter included in the training sample data;
and the model parameter correction subunit is used for correcting the model parameters in the model to be trained based on the loss value, and converging a loss function in the model to be trained as a training target to obtain a control parameter prediction model.
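For illustration only, the training procedure described by these subunits may be sketched as follows; the network shape, learning rate, and optimizer are assumptions, and random arrays stand in for real historical training samples:

```python
import numpy as np

rng = np.random.default_rng(0)
# Each row: [actual util, set util, util deviation, control amount, kernel accumulation].
X = rng.random((64, 5))
# Each target row: the actual feedback control parameters [kp, ki, kd].
Y = rng.random((64, 3))

# One-hidden-layer network as a stand-in for the model to be trained.
W1 = rng.normal(0.0, 0.1, (5, 16)); b1 = np.zeros(16)
W2 = rng.normal(0.0, 0.1, (16, 3)); b2 = np.zeros(3)
lr, losses = 0.05, []

for _ in range(200):
    h = np.tanh(X @ W1 + b1)                  # hidden activations
    pred = h @ W2 + b2                        # predicted control parameters
    err = pred - Y
    losses.append(float((err ** 2).mean()))   # MSE loss to be converged
    # Gradient descent step: backpropagate the MSE loss and correct the
    # model parameters, taking loss convergence as the training target.
    g = 2.0 * err / err.size
    gW2, gb2 = h.T @ g, g.sum(axis=0)
    gh = (g @ W2.T) * (1.0 - h ** 2)
    gW1, gb1 = X.T @ gh, gh.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1
```

After training, the loss on the sample data decreases relative to its initial value; a production model would of course be trained on real historical samples and validated before use.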
Optionally, the feedback control parameter includes a proportional coefficient, an integral coefficient, and a differential coefficient, and the control amount determining unit includes a deviation amount determining subunit, a proportional control amount determining subunit, an integral control amount determining subunit, a differential control amount determining subunit, and a control amount determining subunit.
A deviation amount determining subunit, configured to determine a product between the resource utilization deviation and a total amount of computation cores corresponding to the graphics processor, and use the product as a resource utilization deviation amount;
A proportional control amount determining subunit configured to determine a product between the proportional coefficient and the resource utilization deviation amount, and take the product as a proportional control amount;
an integral control amount determining subunit configured to determine a product between the integral coefficient and an integral value of the resource utilization deviation amount, the product being taken as an integral control amount;
a differential control amount determination subunit configured to determine a product between the differential coefficient and a differential value of the resource utilization deviation amount, the product being taken as a differential control amount;
And the control quantity determining subunit is used for adding the proportional control quantity, the integral control quantity and the differential control quantity to obtain the resource allocation control quantity of the graphic processing process at the current moment.
The computing power resource allocation control device provided by the embodiment of the invention can execute the computing power resource allocation control method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Example 5
Fig. 5 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only and are not meant to limit implementations of the invention described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including an input unit 16, such as a keyboard, mouse, etc., an output unit 17, such as various types of displays, speakers, etc., a storage unit 18, such as a magnetic disk, optical disk, etc., and a communication unit 19, such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the computing power resource allocation control method.
In some embodiments, the computing power resource allocation control method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the computing power resource allocation control method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the computing power resource allocation control method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described herein above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special or general purpose programmable processor, operable to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user, for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a Local Area Network (LAN), a Wide Area Network (WAN), a blockchain network, and the Internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system and overcomes the defects of high management difficulty and weak service expansibility in traditional physical host and Virtual Private Server (VPS) services.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.